Mixed strategies (Ch. 2)
Game Theory: Chapter 2
An outcome is a Nash equilibrium when all players are best responding. Take the following coordination game as an example, with Row's payoff listed first for each outcome:
(b, B): 4, 1    (b, F): 0, 0
(f, B): 0, 0    (f, F): 1, 4
Each player has two strategies: Row chooses between ‘b’ and ‘f’, and Column chooses between ‘B’ and ‘F’, so the payoff matrix has four outcomes. The players can coordinate on (b, B), which Row prefers, or on (f, F), which Column prefers. Both players are better off with either of those outcomes than when they fail to coordinate: (f, B) and (b, F) each yield 0 to both players.
We can solve this game quickly by seeking best responses. Row’s best response to Column’s choice of B is to play b as then Row receives a payoff of 4 whereas if Row instead plays f, the payoff is 0.
Row’s best response to Column’s choice of F is to play f as then Row receives a payoff of 1 whereas if Row instead plays b, the payoff is 0.
We indicate these in the matrix using underlines:
Similarly for Column player: If Row selects b then Column best responds by using B, because then Column receives a payoff of 1 whereas playing F yields a payoff of 0, conditional on Row still using b.
If Row selects f then Column best responds by using F because then Column receives a payoff of 4 whereas playing B yields a payoff of 0, conditional on Row still using f.
We indicate these in the matrix using underlines and note that any cell with an underline from both players is a Nash equilibrium.
As it stands there are two Nash equilibria, (b, B) and (f, F). But this assumes that players only use pure strategies. A Nash equilibrium in pure strategies means each player is using only a single strategy. But what if a player decides to randomize? Another perfectly valid strategy is to mix between single strategies: we think of each player as choosing among their strategies according to some probability distribution.
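The underlining procedure can be sketched in a few lines of Python. This is only an illustration: the payoffs are the ones quoted in the text, while the dictionary layout and the function name `pure_nash_equilibria` are assumptions made for this sketch.

```python
# (row_payoff, column_payoff) for each outcome, as stated in the text.
payoffs = {
    ("b", "B"): (4, 1),
    ("b", "F"): (0, 0),
    ("f", "B"): (0, 0),
    ("f", "F"): (1, 4),
}
row_strategies = ["b", "f"]
col_strategies = ["B", "F"]


def pure_nash_equilibria(payoffs, row_strategies, col_strategies):
    """Return the outcomes in which both players are best responding."""
    equilibria = []
    for r in row_strategies:
        for c in col_strategies:
            row_payoff, col_payoff = payoffs[(r, c)]
            # Row's "underline": no other row does better against column c.
            row_best = all(payoffs[(r2, c)][0] <= row_payoff for r2 in row_strategies)
            # Column's "underline": no other column does better against row r.
            col_best = all(payoffs[(r, c2)][1] <= col_payoff for c2 in col_strategies)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(payoffs, row_strategies, col_strategies))
# [('b', 'B'), ('f', 'F')]
```

A cell is reported only when it would carry both underlines, which is the same test applied by hand above.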
This sort of behavior is more natural in the following game. Suppose each player has a coin with two clear faces, ‘heads’ and ‘tails’. The players simultaneously reveal one side of their coin, perhaps by placing it on a table. If the faces of both coins match, Player 1 keeps both coins. If the faces do not match, Player 2 keeps both coins. Using the convention of modeling Player 1 as ‘Row player’ and Player 2 as ‘Column player’ we can illustrate with the following matrix:
Clearly Row player has a best response to play ‘h’ when Column has played ‘H’ and to play ‘t’ when Column has played ‘T’. Likewise, Column has a best response to play ‘H’ when Row has played ‘t’ and to play ‘T’ when Row has played ‘h’. We place underlines in the matrix accordingly:
The players choose simultaneously, so they cannot react to what their co-player selects. If the players were to repeat this interaction a few times they would quickly learn that it’s a bad idea to always use the same strategy! Instead, what the players optimally do is randomize between their two strategies. In fact, looking at the payoffs we see that if both players use each of their two strategies half the time, no one will have an incentive to deviate: the expected value will be zero.
Without loss of generality, suppose Column player decides to use ‘Heads’ more frequently than half the time. Row wins when the coins match, so Row now definitely chooses to use ‘heads’ more than half the time, and Column quickly realizes it’s best to play ‘Heads’ less and ‘Tails’ more. This rebalancing process stops and stabilizes precisely where it started: each player must use both strategies exactly half the time in order to keep their co-player indifferent between their own strategies. This is a Nash equilibrium in mixed strategies.
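The indifference argument can be checked numerically with a small Python sketch. It assumes each coin is worth 1, so Row receives +1 on a match and -1 on a mismatch; those numbers and the function name `row_expected_payoffs` are illustrative assumptions, not values taken from the matrix.

```python
from fractions import Fraction


def row_expected_payoffs(q_heads):
    """Row's expected payoff from 'h' and from 't' when Column plays
    'H' with probability q_heads.

    Assumed payoffs to Row: +1 when the faces match, -1 otherwise.
    """
    payoff_h = q_heads * 1 + (1 - q_heads) * (-1)
    payoff_t = q_heads * (-1) + (1 - q_heads) * 1
    return payoff_h, payoff_t


# Column mixes 50/50: Row is indifferent, both strategies are worth 0.
print(row_expected_payoffs(Fraction(1, 2)))

# Column tilts toward Heads: Row's two strategies are no longer equally good.
print(row_expected_payoffs(Fraction(3, 5)))
```

Under these assumed payoffs, Row's two strategies are worth the same only when Column mixes exactly half and half; any tilt by Column opens a gap that Row can exploit.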
Returning to the coordination game above and applying the same logic, we can find its Nash equilibrium in mixed strategies. To find this equilibrium we need the probability distribution that each player uses to keep the co-player indifferent. We model this as follows: Row player uses b with probability p and therefore uses f with probability (1-p). Column player uses B with probability q and therefore uses F with probability (1-q). Now we just need to solve for p and q. The key, once again, is remembering that each player must choose their probability distribution in order to keep the other player indifferent between their own strategies.
Column is indifferent between using B and F when:
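Spelled out with the payoffs stated earlier (Column receives 1 at (b, B), 4 at (f, F) and 0 otherwise), the condition can be worked as follows. The notation E for expected payoff is introduced here for convenience and is an assumption of this sketch.

```latex
\begin{aligned}
E_{\text{Column}}(B) &= E_{\text{Column}}(F) \\
1 \cdot p + 0 \cdot (1-p) &= 0 \cdot p + 4 \cdot (1-p) \\
p &= 4 - 4p \\
p &= \tfrac{4}{5}
\end{aligned}
```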
If Row were to use its strategy b less than 4/5 of the time, Column is no longer indifferent between B and F and would want to use F more frequently in order to have a better chance at attaining its payoff of 4 from the outcome (f, F). Row player must play b exactly 80% of the time in order to prevent this.
Similarly, Row is indifferent between using b and f when:
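Using the payoffs stated earlier (Row receives 4 at (b, B), 1 at (f, F) and 0 otherwise), the condition can be worked as follows. As before, the notation E for expected payoff is an assumption of this sketch.

```latex
\begin{aligned}
E_{\text{Row}}(b) &= E_{\text{Row}}(f) \\
4 \cdot q + 0 \cdot (1-q) &= 0 \cdot q + 1 \cdot (1-q) \\
4q &= 1 - q \\
q &= \tfrac{1}{5}
\end{aligned}
```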
If Column were to use its strategy B more than 1/5 of the time, Row is no longer indifferent between b and f. In this case Row would want to use b more frequently as the payoff of 4 from the outcome (b, B) is already so attractive.
It is precisely when Row player plays b exactly 4/5 of the time and Column player plays B exactly 1/5 of the time that both players have no incentive to change their mixing strategy. This is what is necessary for a Nash equilibrium!
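As a cross-check, the two indifference conditions can be solved mechanically for any 2x2 game whose mixed equilibrium uses both strategies. The Python sketch below is illustrative only: the payoffs are those quoted in the text, and the function name `indifference_probabilities` is an assumption of this sketch.

```python
from fractions import Fraction

# (row_payoff, column_payoff) for each outcome, as stated in the text.
payoffs = {
    ("b", "B"): (4, 1), ("b", "F"): (0, 0),
    ("f", "B"): (0, 0), ("f", "F"): (1, 4),
}


def indifference_probabilities(payoffs):
    """Return (p, q), where p = Pr(Row plays b) and q = Pr(Column plays B).

    Assumes the denominators are nonzero, i.e. that a mixed
    equilibrium using both strategies exists.
    """
    # Column's payoffs (index 1) pin down Row's mix p.
    c_bB, c_bF = payoffs[("b", "B")][1], payoffs[("b", "F")][1]
    c_fB, c_fF = payoffs[("f", "B")][1], payoffs[("f", "F")][1]
    # Solve p*c_bB + (1-p)*c_fB == p*c_bF + (1-p)*c_fF for p.
    p = Fraction(c_fF - c_fB, c_bB - c_fB - c_bF + c_fF)

    # Row's payoffs (index 0) pin down Column's mix q.
    r_bB, r_bF = payoffs[("b", "B")][0], payoffs[("b", "F")][0]
    r_fB, r_fF = payoffs[("f", "B")][0], payoffs[("f", "F")][0]
    # Solve q*r_bB + (1-q)*r_bF == q*r_fB + (1-q)*r_fF for q.
    q = Fraction(r_fF - r_bF, r_bB - r_bF - r_fB + r_fF)
    return p, q


p, q = indifference_probabilities(payoffs)
print(p, q)  # 4/5 1/5
```

Exact fractions are used instead of floats so the result can be compared directly with the hand calculation.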
We have found the mixed strategy Nash equilibrium and can report it using the following convention:

