The prisoner's dilemma (PD) is the best-known game of strategy in social science. It helps us understand what governs the balance between cooperation and competition in business, in politics, and in social settings. It is a fundamental problem in game theory that demonstrates why two people might not cooperate even when cooperation is in both their best interests. It was originally framed in 1950 by Merrill Flood, an American mathematician, and Melvin Dresher, a Polish-born American mathematician. Albert W. Tucker, a Canadian-born American mathematician, then formalized the game with prison-sentence payoffs and gave it the name "prisoner's dilemma".
How can we understand the Prisoner's Dilemma?
The prisoner's dilemma (PD) is best understood through a typical example:
Two suspects are arrested by the police. The police do not have adequate evidence for a conviction, so they separate the prisoners and visit each of them to offer the same deal.
If one testifies against the other (defects) and the other remains silent (cooperates), the betrayer goes free and the silent accomplice receives the full 10-year sentence.
If both remain silent, both prisoners are sentenced to only six months in jail for a minor charge.
If each betrays the other, each receives a five-year sentence.
Each prisoner must choose to betray the other or to remain silent.
Each one is assured that the other will not know about the betrayal before the end of the investigation.
How should the prisoners act?
If we presume that each player cares only about minimizing his or her own time in jail, then the prisoner's dilemma forms a non-zero-sum game in which two players may each either cooperate with or defect from (betray) the other player. In this game, regardless of what the opponent chooses, each player always gets a higher payoff (a shorter sentence) by betraying; that is, betraying is the strictly dominant strategy. For instance, Prisoner A can accurately say, "No matter what Prisoner B does, I personally am better off betraying than staying silent. So, for my own sake, I should betray." However, if the other player reasons the same way, then they both betray and both get a lower payoff than they would get by staying silent.
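The dominance argument can be checked mechanically. The following is a minimal sketch that encodes the sentences from the deal described above as a lookup table; the names `YEARS`, `best_reply`, and `strictly_dominates` are illustrative choices, not part of any standard library.

```python
# Years in jail for (my_move, other_move), taken from the deal described above.
# "C" = stay silent (cooperate), "D" = betray (defect).
YEARS = {
    ("C", "C"): 0.5,   # both stay silent: six months each
    ("C", "D"): 10.0,  # I stay silent, my partner betrays: full sentence
    ("D", "C"): 0.0,   # I betray, my partner stays silent: I go free
    ("D", "D"): 5.0,   # both betray: five years each
}

MOVES = ("C", "D")


def best_reply(other_move):
    """Return the move that minimizes my jail time against a fixed opponent move."""
    return min(MOVES, key=lambda my_move: YEARS[(my_move, other_move)])


def strictly_dominates(move, alternative):
    """True if `move` means strictly less jail time than `alternative`
    against every possible opponent move."""
    return all(YEARS[(move, o)] < YEARS[(alternative, o)] for o in MOVES)


if __name__ == "__main__":
    for other in MOVES:
        print("partner plays", other, "-> my best reply is", best_reply(other))
    print("betraying strictly dominates silence:", strictly_dominates("D", "C"))
```

Whichever move the partner makes, the best reply is to betray, yet mutual betrayal (five years each) is worse for both than mutual silence (six months each).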
Rational self-interested decisions result in each prisoner being worse off than if each chose to lessen the sentence of the accomplice at the cost of staying a little longer in jail himself (hence the seeming dilemma). In game theory, this demonstrates very elegantly that in a non-zero-sum game a Nash equilibrium need not be a Pareto optimum (a concept from economics). A Nash equilibrium (named after John Forbes Nash, who proposed it) is a solution concept for a game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by unilaterally changing only his own strategy.
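To make the gap between Nash equilibrium and Pareto optimum concrete, the sketch below enumerates all four outcomes of the one-shot game and tests each against both definitions. It assumes the sentences from the example deal; the helper names `is_nash` and `is_pareto_optimal` are hypothetical.

```python
from itertools import product

MOVES = ("C", "D")  # "C" = stay silent, "D" = betray

# Years in jail as (prisoner A, prisoner B) for each pair of moves (A's move, B's move).
YEARS = {
    ("C", "C"): (0.5, 0.5),
    ("C", "D"): (10.0, 0.0),
    ("D", "C"): (0.0, 10.0),
    ("D", "D"): (5.0, 5.0),
}


def is_nash(a, b):
    """True if neither prisoner can shorten his own sentence by changing only his own move."""
    a_cannot_improve = all(YEARS[(a, b)][0] <= YEARS[(alt, b)][0] for alt in MOVES)
    b_cannot_improve = all(YEARS[(a, b)][1] <= YEARS[(a, alt)][1] for alt in MOVES)
    return a_cannot_improve and b_cannot_improve


def is_pareto_optimal(a, b):
    """True if no other outcome is at least as good for both prisoners
    and strictly better for at least one of them."""
    years_a, years_b = YEARS[(a, b)]
    for other in product(MOVES, repeat=2):
        other_a, other_b = YEARS[other]
        if other_a <= years_a and other_b <= years_b and (other_a < years_a or other_b < years_b):
            return False
    return True


nash_outcomes = [p for p in product(MOVES, repeat=2) if is_nash(*p)]
pareto_outcomes = [p for p in product(MOVES, repeat=2) if is_pareto_optimal(*p)]

if __name__ == "__main__":
    print("Nash equilibria:", nash_outcomes)
    print("Pareto-optimal outcomes:", pareto_outcomes)
```

Under these sentences, mutual betrayal is the only Nash equilibrium and also the only outcome that is not Pareto optimal.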
What is the statistical response to the Prisoner's Dilemma?
An experiment based on the simple dilemma found that roughly 40% of participants played "cooperate" (i.e., stayed silent).
What is the iterated Prisoner's Dilemma?
In the iterated prisoner's dilemma, the game is played repeatedly, so each player has an opportunity to punish the other for previous non-cooperative play. If the number of rounds is known by both players in advance, economic theory says that the two players should defect in every round, no matter how many times the game is played: in the final round there is no future in which to be punished, so both defect, and the same reasoning then applies to each earlier round in turn. However, this analysis fails to predict the behavior of human players in a real iterated prisoner’s dilemma, and it also fails to predict the optimum algorithm when computer programs play in a tournament. Only when the players play an indefinite or random number of times can cooperation be an equilibrium (technically, a subgame perfect equilibrium). Even then, both players always defecting remains an equilibrium, and there are many other equilibrium outcomes. In this case, the incentive to defect can be overcome by the threat of punishment.
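One way to see how the threat of punishment can outweigh the incentive to defect is to compare expected payoffs under a specific punishment rule. The sketch below assumes a "grim trigger" rule (stay silent until betrayed, then betray forever), which is one standard textbook choice and not something the text above prescribes. It also assumes the game continues after each round with probability `delta` and measures payoffs as negative years in jail from the example deal; `delta` and the function names are illustrative.

```python
# Payoffs per round as negative years in jail (higher is better),
# taken from the example deal.
T = 0.0    # temptation: betray a silent partner and go free
R = -0.5   # reward: both stay silent
P = -5.0   # punishment: both betray


def cooperate_forever(delta):
    """Expected total payoff when both players stay silent in every round.
    The game continues after each round with probability delta (0 <= delta < 1)."""
    return R / (1 - delta)


def betray_now(delta):
    """Expected total payoff from betraying in the first round and then
    facing mutual betrayal in every later round (the grim-trigger punishment)."""
    return T + delta * P / (1 - delta)


def critical_delta():
    """Smallest continuation probability at which staying silent is at least
    as good as betraying. Derived from R / (1 - d) >= T + d * P / (1 - d)."""
    return (T - R) / (T - P)


if __name__ == "__main__":
    print("critical continuation probability:", critical_delta())
    for delta in (0.05, 0.5, 0.9):
        print(delta, cooperate_forever(delta), betray_now(delta))
```

With these particular sentences the threshold is low, because the five-year punishment is severe relative to the six months gained by betraying a silent partner; other payoff values would give a different threshold.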
What are the components of successful strategies for Prisoner's Dilemma?
Being nice: The most important condition is that the strategy must be "nice", that is, it will not defect before its opponent does (this is sometimes referred to as an "optimistic" algorithm). Almost all of the top-scoring strategies in such tournaments were nice; therefore, even a purely selfish strategy will not be the first to "cheat" on its opponent, for purely utilitarian reasons.
Retaliating: However, the successful strategy must not be a blind optimist. It must sometimes hit back. An example of a non-retaliating strategy is "always cooperate". This is a very bad choice, as "nasty" strategies will ruthlessly exploit such players.
Forgiving: Successful strategies must also be forgiving. Though players will retaliate, they will once again fall back to cooperating if the opponent does not continue to defect. This stops long runs of revenge and counter-revenge, maximizing points.
Being non-envious: The last quality is being non-envious, that is, not striving to score more than the opponent (which is in any case impossible for a "nice" strategy, i.e., a "nice" strategy can never score more than the opponent).
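The four qualities above can be seen at work in a small simulation. The sketch below pits "tit for tat" (stay silent first, then copy the opponent's previous move), a well-known strategy that is nice, retaliating, and forgiving, against two simple opponents. Tit for tat is brought in here only as an illustration; the strategy functions, the `play_match` helper, and the choice of ten rounds are all assumptions, and results are counted in years of jail from the example deal (fewer is better).

```python
# Years in jail as (player A, player B) for each pair of moves (A's move, B's move).
# "C" = stay silent (cooperate), "D" = betray (defect).
YEARS = {
    ("C", "C"): (0.5, 0.5),
    ("C", "D"): (10.0, 0.0),
    ("D", "C"): (0.0, 10.0),
    ("D", "D"): (5.0, 5.0),
}


def tit_for_tat(own_history, opponent_history):
    """Nice (never betrays first), retaliating, and forgiving:
    stay silent in the first round, then copy the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(own_history, opponent_history):
    """A 'nasty' strategy: betray in every round."""
    return "D"


def always_cooperate(own_history, opponent_history):
    """A non-retaliating strategy: stay silent in every round."""
    return "C"


def play_match(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return total years in jail for (A, B)."""
    history_a, history_b = [], []
    total_a = total_b = 0.0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print("tit for tat vs always defect:   ", play_match(tit_for_tat, always_defect))
    print("always cooperate vs always defect:", play_match(always_cooperate, always_defect))
    print("tit for tat vs tit for tat:     ", play_match(tit_for_tat, tit_for_tat))
```

In these matches tit for tat never does better than its own opponent, which is the non-envious property, but it loses far less to "always defect" than "always cooperate" does, and it settles into mutual silence with any partner that is also nice.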
What are some real-life applications of Prisoner's Dilemma?
Politics: In political science, for instance, the Prisoner’s Dilemma scenario is often used to illustrate the problem of two states engaged in an arms race. Both will reason that they have two options, either to increase military expenditure or to make an agreement to reduce weapons. Because neither state can be certain that the other will honor such an agreement, both tend toward military expansion.
Social science: In sociology or criminology, the Prisoner’s Dilemma may be applied to an actual dilemma facing two inmates. The game theorist Marek Kaminski, a former political prisoner, concluded that while the PD is the ideal game of a prosecutor, numerous factors may strongly affect the payoffs and potentially change the properties of the game.
Science: In environmental studies, the Prisoner’s Dilemma is evident in crises such as global climate change. All countries will benefit from a stable climate, but any single country is often hesitant to curb CO2 emissions.
Law: The theoretical conclusion of the Prisoner’s Dilemma is one reason why plea bargaining is forbidden in many countries. Plea bargaining is an agreement in a criminal case whereby the prosecutor offers the defendant the opportunity to plead guilty, usually to a lesser charge or to the original charge with a recommendation of a lighter-than-maximum sentence.
Steroid use: The Prisoner’s Dilemma applies to the decision whether or not to use performance-enhancing drugs in athletics. Given that the drugs have an approximately equal impact on each athlete, it is to all athletes' advantage that no athlete takes the drugs (because of the side effects). Yet any single athlete who takes them gains an edge over those who do not, so each is tempted to defect.