Nash Equilibrium and Poker

The point of this post is to give you a simple scenario where we can easily see what a Nash equilibrium is, and why it is important in thinking about games. I might write more extensively about game theory, especially as it applies to political science, later, but for now here is a quick example. Thanks to Daniel Negreanu for mentioning this in one of his YouTube videos, although I can't seem to find it at the moment.

What if we played poker with a computer?

Poker will always be the game we love, a game where you constantly have to adapt to what others do. A game in which no "secret math strategy that beats everything" exists. - Ivan Demidov

Poker is a strange game to find in a casino, mostly because you are playing against other humans, who themselves are providing the bets, and the house really only collects a rake (a percentage fee of the pot). What if poker were like slots, where you sat at a machine and were dealt two cards, the computer was dealt two, and the normal betting procedure was followed?

Assuming the computer had one strategy (i.e. it didn't change it depending on how you played), the computer would implement a basic approach built entirely around betting when its expected value is positive, folding when it is negative, et cetera. To play against this computer, you would have to run pretty much the exact same strategy against it, because deviating from it (for example, betting on a weak hand to try and get more money) will lose in the long run. Here we introduce the concept of Nash equilibrium: if you are playing this computer there is an equilibrium, where both players must maintain their strategies, because if either you or the computer changes strategy, the other will have the upper hand and win in the long run.

This style of poker is called "Game Theory Optimal Poker" or GTO poker, and it is great to play against opponents when you have no information on their playing style (i.e. strategy), because the worst-case scenario for you is an equilibrium, as if you were playing the computer, and if their strategy deviates from that, you are winning. Not every game has an optimal strategy like this, even though every finite game has at least one Nash equilibrium, and to know whether you are in an equilibrium you need to know your opponent's strategy. So, why doesn't everyone play GTO poker? Well, this is exactly why equilibria are so important: there isn't an incentive for a unilateral change in strategy, but if you know that your opponent is not playing GTO, then there is a better strategy for you to play!
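To make the trade-off concrete, here is a minimal sketch using rock-paper-scissors as a stand-in for poker (an assumption made purely for illustration, since full poker is far too large to write down here). The function names and the "rock-heavy" opponent are hypothetical. One caveat: in this toy game the equilibrium strategy only breaks even against a biased opponent, whereas the point of the sketch is the other half of the argument, namely that knowing the opponent's bias lets you do strictly better than the equilibrium strategy does.

```python
from fractions import Fraction

# Payoff to "us" in rock-paper-scissors: +1 for a win, -1 for a loss, 0 for a tie.
# Action order for both players: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # we play rock
    [1, 0, -1],   # we play paper
    [-1, 1, 0],   # we play scissors
]


def expected_payoff(mine, theirs):
    """Our average payoff per round when both sides play mixed strategies."""
    return sum(
        mine[i] * theirs[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


def best_response_value(theirs):
    """Best average payoff any single action earns against `theirs`."""
    return max(
        sum(theirs[j] * PAYOFF[i][j] for j in range(3))
        for i in range(3)
    )


third = Fraction(1, 3)
equilibrium = [third, third, third]                            # the equilibrium mix
rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # hypothetical biased opponent

# Nobody can profit by deviating against the equilibrium mix.
value_against_equilibrium = best_response_value(equilibrium)

# The equilibrium mix cannot be beaten, but here it does not profit either.
equilibrium_vs_biased = expected_payoff(equilibrium, rock_heavy)

# Knowing the bias, always playing paper earns a positive amount per round.
exploit_vs_biased = best_response_value(rock_heavy)

print(value_against_equilibrium, equilibrium_vs_biased, exploit_vs_biased)
```

The exploitative strategy (always paper) is itself easy to punish by an opponent who notices it, which is the risk you take on when you leave the equilibrium.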

Escaping Equilibria

He said, "Son, I've made a life Out of readin' people's faces Knowin' what the cards were By the way they held their eyes So if you don't mind me sayin' I can see you're out of aces" - Kenny Rogers, The Gambler

Poker is famously a game all about information, and the reason that information is so vital is that it lets you escape equilibria. Even though knowing your opponent's cards for one hand might make you a bit of money, knowing their strategy can make you much more. GTO poker is based on staying in Nash equilibria, where Nash equilibria are points in the game where a player who deviates from the strategy is going to be penalized, but that doesn't mean we should always keep playing GTO. Suppose we knew about a bug in the computer that always made it bluff. Then we know that calling the computer's bets will make us a lot of money in the long term. We have managed to escape equilibrium because we know the other player is not playing GTO. If we didn't know that, there would be a chance of being penalized for changing strategies.
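The always-bluffing bug can be put into numbers with a simple bluff-catching calculation. This is a rough sketch under strong assumptions that are not from the scenario above: we hold a hand that beats every bluff and loses to every real hand, there is a single bet left to call, and the pot and bet sizes (100 and 50) are made up for illustration. The function names are hypothetical.

```python
def call_ev(pot, bet, bluff_freq):
    """Expected value of calling a bet of size `bet` into a pot of size `pot`.

    Assumes our hand wins whenever the opponent is bluffing and loses otherwise.
    When we win we collect the pot plus the opponent's bet; when we lose we
    forfeit the amount we called.
    """
    return bluff_freq * (pot + bet) - (1 - bluff_freq) * bet


def break_even_bluff_freq(pot, bet):
    """Bluffing frequency at which calling neither wins nor loses on average."""
    return bet / (pot + 2 * bet)


pot, bet = 100, 50  # illustrative sizes only

threshold = break_even_bluff_freq(pot, bet)  # calling is profitable above this
ev_at_threshold = call_ev(pot, bet, threshold)
ev_vs_buggy_computer = call_ev(pot, bet, 1.0)  # the computer that always bluffs

print(threshold, ev_at_threshold, ev_vs_buggy_computer)
```

Against an opponent who bluffs exactly at the threshold, calling and folding are equally good, which is the sense in which there is nothing to gain by deviating. Against the buggy computer, every call is pure profit.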

Humans bluff, they read each other, they have tells, and all of this makes playing with a human different from playing a computer. Humans, in other words, change strategies. Finding out that someone is a bluffer is a great way to make money off of them, because you then have information that lets you safely escape the Nash equilibrium of GTO poker and change your strategy accordingly. See, a Nash equilibrium is an equilibrium that requires you to know your strategy and your opponent's strategy (or, if you don't know it, you can assume their best possible strategy, if that exists). If you know your opponent changes their strategy then you can safely exit the equilibrium (this is called being in an unstable equilibrium), but this requires information, which is why a good poker face is so important. However, if everyone can play GTO, why would players do anything but that? Well, it's all about the variance.

Playing non-equilibrium strategies

There are two main reasons poker players deviate from GTO: they want better returns, or they expect other players are also deviating from GTO. Increasing your returns by way of different strategies involves several different things, but on a purely mathematical level, leaving GTO is often done so that the amount that can be won on any given hand goes up, often called playing "loose aggressive" (i.e. playing lots of hands with lots of bets). This can be good in the short term, especially if mixed with more standard GTO play, since it has the possibility of winning more money. Similarly, bets aren't always about expected value, because they can be leveraged into forcing another player to risk more in order to continue playing, such as making a player go all in. This is why, especially in tournaments, people often talk about the Independent Chip Model (ICM), where you model how likely each player is to survive longer (i.e. win more money) depending only on the number of chips each player has. GTO poker is also often abandoned because a player is trying to bait another player or suspects that another player is already deviating from GTO. Since there is no perfect information in poker, we never know for certain that leaving equilibrium will be rewarded, but we can try to read our fellow humans as much as possible to get higher returns.
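For readers who want to see what an ICM calculation looks like, here is a small sketch of the commonly used Malmuth-Harville formulation. It is one way of doing it, not necessarily what any particular poker tool implements. It assumes the chance of finishing first is proportional to your stack, and it applies the same rule recursively to the remaining players for each later place. The stacks and payouts are invented for illustration, and `icm_equity` is a hypothetical name.

```python
def icm_equity(stacks, payouts):
    """Estimate each player's share of the prize pool from chip stacks alone.

    stacks:  chip counts, one per player (all assumed positive)
    payouts: prize for 1st place, 2nd place, ... (may be shorter than stacks)
    """
    equity = [0.0] * len(stacks)

    def assign_place(remaining, prob, place):
        # Stop once every paid place has been handed out.
        if place >= len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            # Chance that player i takes this place, given who is still in.
            p = prob * stacks[i] / total
            equity[i] += p * payouts[place]
            assign_place([j for j in remaining if j != i], p, place + 1)

    assign_place(list(range(len(stacks))), 1.0, 0)
    return equity


# Illustrative three-player finish: 100 chips in play, prizes of 70 and 30.
stacks = [50, 30, 20]
payouts = [70, 30]
equities = icm_equity(stacks, payouts)

for chips, eq in zip(stacks, equities):
    print(f"{chips} chips -> {eq:.2f} of the prize pool")
```

In this example the chip leader holds half the chips but is credited with less than half the prize money, while the short stack is credited with more than its chip share. That is why risking your whole stack in a tournament can be a bad idea even when the bet has positive expected value in chips.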

These non-equilibrium strategies are really important in different kinds of games, such as collaborative games. Let us take an (oversimplified) historical example: suppose the US and the USSR are playing a game where they have two options, either have nuclear weapons or not have them. Clearly, the best option (lowest risk) is for both of them to not have any. However, the worst option for either is to let the other have nuclear weapons while they do not. If they both have nuclear weapons, this is a Nash equilibrium, because neither country can unilaterally decide to disarm without hurting itself, and if one did disarm, its best strategy would be to rearm as soon as possible. However, if they could be sure that the other country would collaborate, then the best outcome would be for both to disarm simultaneously. This, however, is not an equilibrium, because if one country decides to rearm unilaterally it will get the upper hand. So, in diplomacy, like in poker, information is everything.
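The disarmament example is small enough to check by brute force. The sketch below assigns made-up payoffs that only respect the ordering described above (both disarmed is good for both, being the only disarmed side is worst, being the only armed side is the "upper hand") and then tests every pair of choices for a profitable unilateral deviation. The numbers and names are assumptions for illustration.

```python
from itertools import product

ACTIONS = ("disarm", "arm")

# Hypothetical payoffs as (US, USSR); higher is better. Only the ordering matters.
PAYOFFS = {
    ("disarm", "disarm"): (3, 3),  # best shared outcome
    ("disarm", "arm"): (0, 4),     # US is exposed, USSR has the upper hand
    ("arm", "disarm"): (4, 0),     # the reverse
    ("arm", "arm"): (1, 1),        # costly standoff
}


def is_nash_equilibrium(profile):
    """True if neither player can gain by changing only their own action."""
    for player in (0, 1):
        for alternative in ACTIONS:
            deviated = list(profile)
            deviated[player] = alternative
            if PAYOFFS[tuple(deviated)][player] > PAYOFFS[profile][player]:
                return False
    return True


equilibria = [p for p in product(ACTIONS, repeat=2) if is_nash_equilibrium(p)]
print(equilibria)
```

With these payoffs the only outcome that survives is both sides staying armed, even though both would prefer mutual disarmament. That is the same structure as the prisoner's dilemma.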
