T.R | Title | User | Personal Name | Date | Lines |
---|
518.1 | Who needs rational decisions? | MODEL::YARBROUGH | | Tue Jun 17 1986 15:03 | 15 |
| The amusing thing about this game is that the best strategy is not
(entirely) rational. If it were entirely rational, then both players
would use the same strategy and make the same moves, since the game
is completely symmetrical. But if both are going to make the same
moves, then they should both cooperate. And cooperation is always
the worst choice, as defecting will always be more profitable! So
some irrational strategy must evolve from the dilemma.
A random strategy can also be demonstrated to be bad, as any random
selection to cooperate will be punished by the other player's choice
of defecting.
Typical game-theory considerations for this game break down because
the game is non-zero-sum, i.e. it is not the case that your gain
is the other player's loss.
|
518.2 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Jun 18 1986 10:15 | 12 |
| Re .1:
> A random strategy can also be demonstrated to be bad, as any random
> selection to cooperate will be punished by the other player's choice
> of defecting.
I'm not sure that's correct. A selection to cooperate will not
always be punished by the other player's choice of defecting, since
the other player will not always choose that (assuming a good player).
-- edp
|
518.3 | one-shot vs serial | MODEL::YARBROUGH | | Wed Jun 18 1986 11:33 | 8 |
| Re .-1; What do you mean by 'always'? The problem was originally
stated as a one-shot situation. Your remark makes sense if you assume
the game might be played many times, but then we are into the area
of comparing dynamic strategies, which is fun, but for which random
strategies tend not to work well in comparison with others.
I'm a little sorry I got wound up on this. The whole game is a great
time sink.
|
518.4 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Jun 18 1986 13:50 | 21 |
| Re .3:
I said "always" because we must consider all the possible variations
on the play before concluding what would happen. There is no
difference in what you must decide to do if you are playing the
game only once or are going to play it many times.
Consider that zero-sum games have the same problem: Selecting a
certain strategy will often be disadvantageous if the opponent selects
some strategy. That doesn't affect the fact that a random choice
is the best way to go in games without a minimax point.
Also, this game has a minimax point. Consider that your opponent
will either cooperate or double-cross you. If your opponent cooperates
and you double-cross, you get off free instead of receiving 30 days.
If your opponent double-crosses you and you double-cross, you get
eight years instead of ten years. In both cases, double-crossing
is the better choice.
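The dominance argument can be checked mechanically. A minimal sketch (the
dictionary layout is mine; payoffs are the sentences quoted above, with the
standard 10-year assumption for the lone cooperator):

```python
# Prisoner's Dilemma payoffs as years served (lower is better).
# The 30-day, 8-year, and 10-year figures come from the note above.
DAYS = 1 / 365.0
payoff = {
    # (my move, opponent's move) -> years I serve
    ("cooperate", "cooperate"): 30 * DAYS,
    ("cooperate", "defect"):    10.0,
    ("defect",    "cooperate"): 0.0,
    ("defect",    "defect"):    8.0,
}

# Defecting dominates: whatever the opponent does, I serve less time.
for theirs in ("cooperate", "defect"):
    assert payoff[("defect", theirs)] < payoff[("cooperate", theirs)]
    print("opponent %s: defect is better" % theirs)
```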
-- edp
|
518.5 | Nope, multi-games ARE different | TLE::BRETT | | Wed Jun 18 1986 14:46 | 33 |
| > There is no difference in what you must decide to do if you are
> playing the game once or are going to play it many times.
Totally wrong. If you are going to play the game repeatedly, you
can use your successive plays to, say, code messages to the other
person. If the game is such that cooperation benefits you both
then such messages can be VERY useful.
For example: Punishment - if you screw me in this game I will ALWAYS
screw you in the next.
Such a technique cannot be applied to "play once" games. In fact,
the zero-sum, one-shot games are the ones where play is most purely
probabilistic.
Consider this game instead...
                        Me
                     0        1
                 --------------------
       You   0 |    1/0    -2/1000     Your points/my points
             1 |   -1/0    -1/1
Now, no matter what you do, I do best by playing a 1, so in a one-round
game, I should play 1, in which case you should play a 1 also. If
you repeat this analysis for 10 games the total score is -10/10
In a multi-round game I could play a 0 repeatedly until you switched
to playing 0's also, and then every tenth game I could play a 1.
Assuming you cooperate from the start, that is nine games at 1/0 and
one at -2/1000, so in 10 games the total score would be 7/1000 - we
are both better off!
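Both lines of play can be checked with a short simulation (a minimal
sketch; the helper names are mine). With nine rounds of mutual cooperation
and one defection by Me, Your total works out to 9 - 2 = 7:

```python
# Payoff matrix from above: (your move, my move) -> (your points, my points)
PAYOFF = {
    (0, 0): (1, 0),
    (0, 1): (-2, 1000),
    (1, 0): (-1, 0),
    (1, 1): (-1, 1),
}

def play(your_moves, my_moves):
    """Total both players' scores over a sequence of rounds."""
    you = me = 0
    for y, m in zip(your_moves, my_moves):
        dy, dm = PAYOFF[(y, m)]
        you += dy
        me += dm
    return you, me

# One-shot logic repeated ten times: both play 1 every round.
print(play([1] * 10, [1] * 10))        # (-10, 10)

# Multi-round: nine rounds of mutual 0's, then I defect once.
print(play([0] * 10, [0] * 9 + [1]))   # (7, 1000)
```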
/Bevin
|
518.6 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Jun 18 1986 18:32 | 26 |
| Re .5:
The n-th case is still the same as the first case, assuming you
know your partner/opponent is intelligent. You don't know what the other
player will do, and you must decide what to do now. Unless there
is a pure strategy, as in the original example, you must make a
random selection, and the question is reduced to asking the
probabilities you should assign to each choice.
If the probability you (labeled "Me" in your diagram) choose 0 is p and
the probability I choose 1 is q, it is simple to write our points as
functions of p and q. If p and q were independent, p would be 0 or 1,
depending on the value of q. But they are not independent, because if
you made p 0 or 1, I might adjust q accordingly. Finding the
appropriate values of p and q then becomes a problem (for me) of
selecting a function f such that q = f(p) and my points are maximized.
You have the task of selecting a function g such that p = g(q) and your
points are maximized. This again brings us to a game, that of
selecting functions. The new game involves an infinite number of
choices, but the strategy will be pure (because a random selection
between two functions assigning probabilities will be equivalent to
some single function assigning probabilities, and that function
exists as a play in the new game). How do we find those functions?
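Written out concretely (a minimal sketch; the function and variable names
are mine), the expected scores as functions of p and q, with p the
probability Me plays 0 and q the probability You plays 1:

```python
# Matrix from .5: (your move, my move) -> (your points, my points)
PAYOFF = {
    (0, 0): (1, 0),
    (0, 1): (-2, 1000),
    (1, 0): (-1, 0),
    (1, 1): (-1, 1),
}

def expected(p, q):
    """Expected (your points, my points) under independent mixed play."""
    prob_you = {0: 1 - q, 1: q}      # q = P(You plays 1)
    prob_me = {0: p, 1: 1 - p}       # p = P(Me plays 0)
    you = me = 0.0
    for (y, m), (vy, vm) in PAYOFF.items():
        w = prob_you[y] * prob_me[m]
        you += w * vy
        me += w * vm
    return you, me

# Pure strategies recover the matrix entries:
print(expected(p=1.0, q=0.0))   # both play 0 -> (1.0, 0.0)
print(expected(p=0.0, q=1.0))   # both play 1 -> (-1.0, 1.0)
```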
-- edp
|
518.7 | There is a better theoretical basis | MODEL::YARBROUGH | | Thu Jun 19 1986 11:43 | 17 |
| The arguments of .6 appear to me to be based on theory developed
for two-person, zero-sum games, and appear to be correct for that
basis; unfortunately, the game in question is two-person NON-zero-sum,
and the theory is not wholly applicable. In particular, the Prisoner's
Dilemma game has a saddle point in its matrix, but there is another
set of strategies for which the sum is significantly different,
so there is motivation to find a way to influence the other player
to choose a strategy with the largest SUM, not necessarily the largest
immediate payoff.
Another way of looking at the game is as a THREE-person zero-sum
game in which only one of the players (the State) is allowed to
offer coalitions with another player. In this interpretation, each
of the other players (prisoners) can choose to accept the offered
coalition, but if there is some way to encode by play selection
an offer of the prisoner-prisoner coalition, the State loses and
the prisoners each win more than with the State-prisoner coalition.
|
518.8 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Thu Jun 19 1986 13:37 | 6 |
| Can somebody demonstrate a game with a scheme for cooperation such
that the scheme produces better results than any assignment of
probabilities to the plays?
-- edp
|
518.9 | The coalition game | MODEL::YARBROUGH | | Fri Jun 20 1986 09:59 | 6 |
| Well, there's the "Coalition" game: three-person, zero-sum. At each
turn the majority players win, the minority loses. The only strategy
in the game is to form a coalition with one other player. A secondary
issue is whether or not you can pay one of the other players enough
to break up an established (and very profitable) coalition. I don't
see any way in which randomness is relevant; negotiation is everything.
|
518.10 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Fri Jun 27 1986 13:34 | 10 |
| Okay, I concede. Now, in games where cooperation is a benefit,
what happens? If there is no simple solution, such as in the first
game where both players remain silent, I don't think we can do much
analysis without more specification. For example, if the "best" plays
give one player 8 and the other player 10, or vice versa, which will
be chosen? In reality, it depends on the wills of the players and
other factors.
-- edp
|
518.11 | tit for tat | LATOUR::JMUNZER | | Thu Jul 10 1986 18:31 | 20 |
| From the article of .0:
The basic prisoner's dilemma model is more than three decades old,
invented by two scientists at the Rand Corporation. The modern era of
prisoner's dilemma research, however, began in 1978, when Dr. Axelrod decided
to test different strategies experimentally, by pitting them against each
other in a series of computer tournaments. Contestants submitted tiny programs
that competed with one another in a computerized round-robin prisoner's
dilemma universe.
The most fertile legacy of Dr. Axelrod's original tournaments was
the discovery of what is, in almost all circumstances, the best strategy
for playing prisoner's dilemma. It is called "Tit for Tat," and it can
be summed up this way: Do unto your opponent as he has just done unto you.
...Tit for Tat did better than strategies that always betrayed,
or betrayed at random, or used sophisticated statistical strategies....
Those strategies never lost when playing Tit for Tat -- they couldn't.
But when they were playing each other, they tended to do badly, because
they defected too much....
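The tournament described above can be sketched in a few lines (my own
minimal version, not Axelrod's actual setup; the standard 3/3, 5/0, 1/1
point scheme is an assumption, not taken from the article):

```python
# A tiny round-robin match in the spirit of Axelrod's tournaments.
C, D = "C", "D"
SCORE = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(opponent_history):
    """Do unto your opponent as he has just done unto you."""
    return opponent_history[-1] if opponent_history else C

def always_defect(opponent_history):
    return D

def match(a, b, rounds=200):
    """Play two strategies against each other; return their totals."""
    hist_a, hist_b = [], []          # moves played so far by a and by b
    score_a = score_b = 0
    for _ in range(rounds):
        ma = a(hist_b)               # each strategy sees the other's history
        mb = b(hist_a)
        da, db = SCORE[(ma, mb)]
        score_a += da
        score_b += db
        hist_a.append(ma)
        hist_b.append(mb)
    return score_a, score_b

# Tit for Tat never beats its opponent, but mutual cooperation pays off:
print(match(tit_for_tat, tit_for_tat))     # (600, 600)
print(match(tit_for_tat, always_defect))   # (199, 204)
```

Note that Always Defect "wins" its match against Tit for Tat by five
points, yet scores far below the 600 that two cooperating Tit for Tats
earn, which is exactly the effect the article describes.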
John
|
518.12 | | ERIS::CALLAS | Jon Callas | Fri Jul 11 1986 13:02 | 10 |
| No, tit for tat is not the best strategy. It is simply the best
strategy that participated in Axelrod's tournaments. In "The Evolution
of Cooperation," Axelrod's book on the subject, he mentions a couple
times that there were strategies that would have done better than tit
for tat. And it is certainly not proved that tit for tat is optimal.
However, the strategies that do well are all related to tit for tat,
and tit for tat will always do well. Its biggest advantage is that it
is so simple.
Jon
|