Puzzles with this structure were devised and discussed by Merrill Flood and Melvin Dresher in 1950, as part of the RAND Corporation’s investigations into game theory (which RAND pursued because of possible applications to global nuclear strategy). The title "prisoner’s dilemma" and the version with prison sentences as payoffs are due to Albert Tucker, who wanted to make Flood and Dresher’s ideas more accessible to an audience of Stanford psychologists.
The Prisoner's Dilemma is a short parable about two prisoners who are each offered a chance to rat on the other. The "ratter" would receive a lighter sentence and the "rattee" a harsher one. The problem is that both can play this game (that is, defect), and if both do, both do worse than they would have had they both kept silent. This peculiar parable serves as a model of cooperation between two or more individuals (or corporations or countries) in ordinary life, in that in many cases each individual would be personally better off not cooperating with (that is, defecting on) the other.
Two prisoners, let's call them Joe and Sam, are being held for trial. They are being held in separate cells with no means of communication. The prosecutor offers each of them a deal. He also tells each that the same deal has been offered to the other. The deal he offers is this:
- a) If you confess that the two of you committed the crime and the other guy denies it, we will let you go free and send him up for five years.
- b) If you both deny the crime, we have enough circumstantial evidence to put both of you away for two years.
- c) If both of you confess to the crime, then you'll both get four-year sentences.
Put yourself in Joe's position. If Sam stays mum and you sing, you get zero years. If he stays mum and you stay mum, you will each get 2 years. On the other hand, if both of you confess, you both get 4 years. Finally, if he confesses and you don't, you will get 5 years. Whatever Sam does, it is to your advantage to confess: if he stays mum, confessing gets you 0 years instead of 2, and if he confesses, it gets you 4 years instead of 5. Of course, Sam is also a rational person and he will, therefore, come to the same conclusion. So you both end up confessing, which nets a total of 8 man-years in the pokey. The paradox is that if you had both denied the crime, a total of only 4 man-years would be spent behind bars.
Wait a minute! Can it really be that rationality leads to an inferior result? Let's look at this one more time. We will use a payoff matrix, a common tool of game theorists. The payoff matrix is usually presented in the following form:
| Joe's action | Sam's action | Joe's payoff | Sam's payoff |
| --- | --- | --- | --- |
| Cooperate | Cooperate | -2 (R) | -2 (R) |
| Cooperate | Defect | -5 (S) | 0 (T) |
| Defect | Cooperate | 0 (T) | -5 (S) |
| Defect | Defect | -4 (P) | -4 (P) |
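Here "cooperate" means staying mum and "defect" means confessing; payoffs are years in prison, written as negative numbers. The codes are the standard names for the four payoffs:
- R: Reward for mutual cooperation
- S: Sucker's payoff
- T: Temptation to defect
- P: Punishment for mutual defection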
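The reasoning from Joe's point of view can be checked mechanically. The following is a minimal sketch, not a standard implementation: the payoffs are the ones in the table, while the names `PAYOFFS` and `best_reply_for_joe` are made up for this illustration.

```python
# Payoffs are years in prison, written as negative numbers (taken from the table).
# Keys are (Joe's action, Sam's action); values are (Joe's payoff, Sam's payoff).
# "C" = cooperate (stay mum), "D" = defect (confess).
PAYOFFS = {
    ("C", "C"): (-2, -2),
    ("C", "D"): (-5, 0),
    ("D", "C"): (0, -5),
    ("D", "D"): (-4, -4),
}


def best_reply_for_joe(sam_action):
    """Return the action that gives Joe the higher payoff against sam_action."""
    return max("CD", key=lambda joe_action: PAYOFFS[(joe_action, sam_action)][0])


for sam_action in "CD":
    reply = best_reply_for_joe(sam_action)
    print(f"If Sam plays {sam_action}, Joe's best reply is {reply}")

# Total man-years served under each joint outcome.
years_if_both_defect = -sum(PAYOFFS[("D", "D")])
years_if_both_cooperate = -sum(PAYOFFS[("C", "C")])
print("Both defect:", years_if_both_defect, "man-years")
print("Both cooperate:", years_if_both_cooperate, "man-years")
```

Defecting comes out as Joe's best reply to either of Sam's actions, yet mutual defection costs 8 man-years against 4 for mutual cooperation, which is exactly the paradox described above.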
The general form of the Prisoner's Dilemma model requires that the preference ranking of the four payoffs be, from best to worst, T, R, P, S (that is, T > R > P > S) and that R be greater than the average of T and S (that is, R > (T + S)/2). Any situation that meets these two conditions is a "Prisoner's Dilemma".
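The two conditions are easy to test for any set of four payoffs. Here is a small sketch; the function name `is_prisoners_dilemma` is invented for this example, and the second set of numbers is a hypothetical counterexample, not taken from any source.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two conditions: T > R > P > S, and R above the average of T and S."""
    ranking_holds = T > R > P > S
    # R > (T + S) / 2, written without division so integer payoffs stay exact.
    average_holds = 2 * R > T + S
    return ranking_holds and average_holds


# Joe and Sam's payoffs (years in prison as negative numbers).
print(is_prisoners_dilemma(T=0, R=-2, P=-4, S=-5))

# Hypothetical payoffs where the ranking holds but R is below the average of T and S.
print(is_prisoners_dilemma(T=10, R=3, P=1, S=0))
```

The first call reports that the Joe and Sam payoffs qualify; the second fails the average condition. One way to read that condition is that two players who take turns exploiting each other must not do better, on average, than two players who simply cooperate.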
In summary, the Prisoner's Dilemma model postulates a condition in which the rational action of each individual is to not cooperate (that is, to defect), yet, if both parties act rationally, each party's reward is less than it would have been if both acted irrationally and cooperated!
The model can be applied to many real world situations, from genetics to business transactions to international politics.
Another addition to the game that makes it more realistic is to assume that each player interacts with a multitude of other players. Additionally, it can be assumed that each player remembers the past history of the interactions with each of the other players and that past history is the only information he has.
The Iterated "Prisoner's Dilemma" has been the subject of much study and computer simulation (see references). An interesting and possibly useful result of these studies is that one of the most successful strategies in this "game" is "Tit for Tat", with the additional proviso that the player be initially cooperative. That is, "I'll start off being nice but from that point on, whatever you do to me, I will do to you on the next interaction". This strategy has been shown to be clearly more productive than "The Golden Rule"!
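To make "Tit for Tat" concrete, here is a toy simulation of the iterated game. It is only a sketch under stated assumptions: it reuses Joe and Sam's payoffs, the 10-round length is arbitrary, and the function names (`tit_for_tat`, `always_defect`, `play`) are invented for this example rather than taken from any published tournament code. Each strategy sees nothing but the history of past moves.

```python
# (my move, other's move) -> (my payoff, other's payoff), from the payoff matrix.
PAYOFFS = {
    ("C", "C"): (-2, -2),
    ("C", "D"): (-5, 0),
    ("D", "C"): (0, -5),
    ("D", "D"): (-4, -4),
}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the other player's previous move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    """Defect on every round, whatever has happened before."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play the two strategies against each other and return their total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print("Tit for Tat vs Tit for Tat:    ", play(tit_for_tat, tit_for_tat))
print("Tit for Tat vs Always Defect:  ", play(tit_for_tat, always_defect))
print("Always Defect vs Always Defect:", play(always_defect, always_defect))
```

With these numbers, two Tit for Tat players each finish at -20, two constant defectors each finish at -40, and Tit for Tat loses only slightly (-41 to -36) when paired with a constant defector. Tit for Tat never beats its partner in a single pairing; its strength in this toy setup comes from the good scores it collects with other cooperative players.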
Note that we are discussing multiple participants whose interactions take place between pairs of "actors". There is yet another, more complex situation in which an individual interacts with ALL of the other participants at once. This situation, which is more common in the real world, is called the "Many-person dilemma" or, in my terminology, the "Voter's Paradox". See the companion essay, "Voter's Paradox", at this and other sites.
A new book that provides much greater insight into the Prisoner's Dilemma and the possibility of cooperation is The Origins of Virtue by Matt Ridley (see the references). Matt presents the idea that cooperation between humans may have evolved in spite of the "rationality" of defection. That is, we may be more like ants than we would like to admit!
For further study on the Prisoner's Dilemma, consult the references given below.
Author: Leon Felkins
Email: leonf@perspicuity.net