Nash Equilibrium

Reading time: 25 minutes | Coding time: 10 minutes

In game theory, the Nash equilibrium, named after American mathematician John Forbes Nash Jr., is a solution concept of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only their own strategy.

If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitutes a Nash equilibrium. The Nash equilibrium is one of the foundational concepts in game theory.

The reality of the Nash equilibrium of a game can be tested using experimental economics methods.


Note : Now imagine if both prisoners are lcoked up N times. Then what to do in such case?

Store the last move what your opponent did in a variable
myVar = 0
Check if it is the first jail occurence or not
if firstSentence:
    myVar = <whatever other prisoner chose to do>
    return 0 
Now in next occurences, return what your oponent last did
if not firstSentence:
    tmp = myVar
    myVar = <whatever oter prisoner chose to do>
    return tmp

This is the zero determinant strategy!


  • Worst case Performance : O(1)
  • Worst case space complexity : O(1)


Using Python 3.5 as our main language

  • Planning next move as a prisoner

Interactive console game

    # 0 means stay silent
    # 1 means confess
    oppLastMove = 0
    def nextMove(isFirst , oppMove = 0):
        if isFirst :
            oppLastMove = oppMove
            print (0)
        if not isFirst : 
            tmp = oppLastMove
            oppLastMove = oppMove
            print (tmp)


The prisoner setting may seem contrived, but there are in fact many examples in human interaction as well as interactions in nature that have the same payoff matrix. The prisoner's dilemma is therefore of interest to the social sciences such as economics, politics, and sociology, as well as to the biological sciences such as ethology and evolutionary biology.

Many natural processes have been abstracted into models in which living beings are engaged in endless games of prisoner's dilemma. This wide applicability of the PD gives the game its substantial importance.

Some of the out of the way examples can be taken as follows, (Courtesy of Quora) :


This has a great example of the Prisoner’s Dilemma. There are essentially two strategies that people use on Tinder:

  • Strategy 1 :
    • swipe left on people you don’t like
    • swipe right on people you do like

If you trigger a match, contact them with honest intent to dialog.

  • Strategy 2: degenerate strategy
    • swipe right on everyone.
    • If you trigger a match, then decide if you like them and want to contact them, or just ignore their messages.

The degenerate strategy has the advantage that you don’t spend time evaluating large numbers of people who will only reject you, giving you a better idea of what is available.

This creates a Prisoner’s Dilemma between two matched parties as follows:

If both players play as intended, neither will know the other’s opinion before making their own decision, and if a match results it will be genuine and should be activated as part of the intended play (mutual cooperation).

If both players play degenerate, a potentially meaningless match will be created. However, since both knew they were playing degenerate, they are aware that this match is likely meaningless and are unlikely to set much effort by it (mutual defection).

If A plays intended and B plays degenerate, B will get to make their decision with the pre-knowledge that A likes them, which is beneficial to B (temptation to defect). If B does not like them, A will get silence from someone who they believed did like them, which is a negative experience for A (sucker’s payoff).
Thus T > C > D > S and a Prisoner’s Dilemma results. Users become sick of unactivated matches in which they had investment, and so switch to the degenerate strategy. This generates more unactivated matches for others and creates a snowball effect.

There is a slight exception to this. The reward functions vary heavily based on whether B likes A or not, so if we consider them as a probabilistic averages, then it is clear that the value of the intended strategy for an individual is affected by:

the number of other people playing degenerate (the probability that they end up on one of those branches);
how “attractive” they are (ie, the probability of others liking them) - since if B like them, they don’t experience S even if B is playing degenerate.

So Tinder would work great if it was full of highly desirable people who all played by the rules.


What is the strategy adopted for iterative prisoner's dilemma

Zero Determinant
Stochastic Determinant
Zero Matrix
Sparse Forest