21. Repeated games: cooperation vs. the end game

Professor Ben Polak: Okay, let's make a start. So I hope everyone had a good break. We're going to spend this week looking at repeated interaction. We already saw last time, before the break, that once we repeat games--once games go on for a while--we can sustain behavior that we couldn't sustain in one-shot games. So, for example, before the break, we saw that we could sustain fighting by players that were rational in a war of attrition. Another thing we learned before the break is that when we're analyzing these potentially very long games, it sometimes helps to break the analysis up into what we might call "stage games," each period of the game, and to break the payoffs up into: the payoffs associated with that stage; the payoffs associated with the past (they're sunk, so they don't really matter); and the payoffs that are going to come in the future from future equilibrium play.

So those are some ideas we're going to pick up today, but for the most part what we do today will be new. Now, whereas last time we focused on fighting, for the whole of today I want to focus on the issue of cooperation. In fact, for the whole of this week, I want to focus on the issue of cooperation. The question behind everything this week is going to be: can repeated interaction among players both induce and sustain cooperative behavior, or if you like "good behavior"? Our canonical example is going to be the Prisoners' Dilemma. Way back in the very first class, we talked about the Prisoners' Dilemma, and we mentioned that playing the game repeatedly might be able to get us out of the dilemma. It might enable us to sustain cooperation. And what's going to be good about that is not just sustained cooperation, but sustained cooperation without the use of outside payments such as contracts or the mafia or whatever.

So why does this matter? Well, one reason it matters is that most interactions in society either don't or perhaps even can't rely on contracts. Most relationships are not contractual. However, many relationships are repeated. So this is going to be of more importance, perhaps, in general life--though perhaps less so in business--than thinking about contracts. So let's think about some obvious examples; think about your own friendships. I don't know if you have any friendships--I assume you do--but for those of you who do, your friendships are typically not contractual. You don't have a contract that says: if you're nice to me, I'll be nice to you. Similarly, think about interactions among nations. Interactions among nations typically cannot be contractual because there's no court to enforce those would-be contracts, although you can have treaties, I suppose. But most interactions among nations--cooperation among nations--is sustained by the fact that those relationships are going to go on forever. Even in business, even where we have contracts, and even in a very litigious society like the U.S., which is probably the most litigious society in the world, we can't really rely on contracts for everyday business relationships. So, in some sense, we need a way to model how to sustain cooperation and good behavior: something that forms, if you like, the social fabric of our society and prevents us always going to court about everything.

Now, why might repeated interaction work? Why did we think, way back in day one of the class, that repeated interaction might enable us to behave well, even in situations like Prisoner's Dilemmas, or situations involving moral hazard, where bad behavior is going to occur in one-shot games? So the lesson that's going to be sort of underlying things today and all week is this one. In ongoing relationships, the promise of future rewards and the threat of future punishments may--let's be careful--may sometimes provide incentives for good behavior today. Just leave a gap here in your notes, because we're going to come back to this.

So this is a very general idea. And the idea is that future behavior in the relationship can generate the possibility of future rewards and/or future punishments, and those promises or threats may sometimes provide incentives for people to behave well today. The reason I want to leave a gap here is that part of the purpose of this week's lectures is to try and get beyond this. This is kind of, almost, a platitude, right? Most of you knew this already. So I want to get beyond this. I want to see: when is this going to work? When is it not going to work? How is it going to work? So I don't want people to leave this week of classes, or leave the course, thinking: oh well, we're going to interact more than once, so everything's fine. That's not true. I want to make sure we understand when things work, how they work, and, more importantly, when they don't work and how they don't work. So we're going to try and fill in the gap that we just left on that board as we go on today.

Now, we do have this very strong intuition that repeated interaction will get us, as it were, out of the Prisoner's Dilemma. So why don't we start with the Prisoner's Dilemma. I'll put this up out of the way and we'll come back to it. Let's just remind ourselves what the Prisoner's Dilemma is, because you guys are all full of turkey and cranberry sauce and you've probably forgotten what Game Theory is entirely. Let's name these strategies: rather than alpha and beta, let's call them cooperate and defect. And that will be our convention this week. We'll call them cooperate and defect. This is Player A and this is Player B, and the payoffs are something like this: (2,2), (-1,3), (3,-1) and (0,0). It doesn't have to be exactly this, but this will do. This is the game we're going to play. And, to try and see if we get cooperation out of it by having repeated interaction, we're going to play it more than once.
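As an aside, the one-shot game just written on the board is small enough to check mechanically. Here is a minimal Python sketch (the labels `C` and `D` and the dictionary layout are my own encoding of the board's payoffs, not anything from the lecture):

```python
# Stage-game payoffs from the board. Entry (my_move, their_move) is MY payoff:
# (C,C) -> 2 each, (C,D) -> -1 for me, (D,C) -> 3 for me, (D,D) -> 0 each.
PAYOFF = {
    ("C", "C"): 2,
    ("C", "D"): -1,
    ("D", "C"): 3,
    ("D", "D"): 0,
}

def best_response(their_move):
    """Return my payoff-maximizing move against a fixed opponent move."""
    return max("CD", key=lambda mine: PAYOFF[(mine, their_move)])

# Defect is a strict best response to either move, i.e. strictly dominant.
assert best_response("C") == "D"   # 3 > 2
assert best_response("D") == "D"   # 0 > -1
```

Because defect is a best response to both of the opponent's moves, it is dominant in the one-shot game, which is exactly what the class play below turns on.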
So let me go and find some players to play here. This should be a familiar game to everybody here. So why don't I pick some people who are kind of close to the front row. So what's your name again?

Student: Brooke.

Professor Ben Polak: Brooke. Okay, so Brooke is going to be Player B. And I've forgotten your name--by this stage I should know it. Patrick, you're going to be Player A. And the difference between playing this game now and playing it earlier on in the class is that we're going to play not once, but twice. We're going to play it twice. So write down what you're going to do the first time and show it to your neighbor. Don't show it to each other. And let's find out what they did the first time. So is it written down? Something's written down? So, Brooke.

Student: I cooperated.

Professor Ben Polak: You cooperated.

Student: I defected.

Professor Ben Polak: Patrick defected. Okay, let's play it a second time. So write down what you're going to do the second time. Brooke?

Student: This time I'm going to defect.

Student: Me too.

Professor Ben Polak: All right, so we had the play this time--let's just put it up here--so when we played it this time, we had A and B. And the first time we had (defect, cooperate) and the second time we had (defect, defect). Let's try another pair. We'll just play this a couple of times and we'll talk about it. So that's fair enough. Why don't we go to your neighbors. That's fair enough. It's easy. So you are?

Student: Ben.

Professor Ben Polak: That's a good name, very good, okay. You are?

Student: Edwina.

Professor Ben Polak: Edwina. Edwina and Ben. So we're going to make Ben Player B and Edwina Player A. And why don't you write down what you're going to do for the first time. Again, we're going to play it twice. Actually, why don't we mix it up: we can play it three times. We'll play it three times this time, okay. We'll play it three times. Both people are happy with their decisions. Okay, so the first time, Edwina, what did you choose?

Student: Defect.

Student: Cooperate.

Professor Ben Polak: All right, so we had--let's put down this time--so we've got Edwina will play A and Ben B. And we had (cooperate, defect). Second time please, Edwina?

Student: Cooperate.

Student: Defect.

Professor Ben Polak: Okay, so we're going to and fro now. So this was cooperate and defect, and one more time: write it down. Both players written down?

Student: Cooperate.

Student: Defect.

Professor Ben Polak: Okay, so we flipped round again, okay. Okay, so we're seeing some pretty odd behavior here. Who did what that time? Edwina, what did you do?

Student: I cooperated.

Professor Ben Polak: So we had this. Is that right? So keep the microphones for a minute, and we'll just talk about it a second. All right, so first of all let's start with Ben here. Ben, you were cooperating on the first go. So why did you choose to cooperate the first turn?

Student: I felt that if I established a reputation for cooperating we could end up in the cooperate, cooperate.

Professor Ben Polak: All right, so you thought that by playing cooperate early you could establish some kind of reputation. And what about later on, when you played defect thereafter--what were you thinking there?

Student: I realized that she established a reputation for defecting a second time.

Professor Ben Polak: All right, so you switched strategies mid-course. Edwina, you started off by defecting. Why did you start off by defecting? Shout it out so people can hear you.

Student: Because his friend defected, so I thought he might defect.

Professor Ben Polak: Okay, his friend defected. Okay, so he's been tainted by his friend there. There's a shortage of space in the class. They could have just been sitting next to each other. Thereafter you cooperated. Why was that?

Student: Because I thought he cooperated. Maybe he was going to keep cooperating.

Professor Ben Polak: All right, so in fact your reputation worked in some sense. By cooperating early you convinced Edwina you would cooperate. And then you went on cooperating even after he defected, so what were you doing in the third round? Shout out.

Student: I thought he might cooperate because I cooperated.

Professor Ben Polak: All right, he might come back. Let's talk about it with your neighbors. So Brooke, shout out why you cooperated in the first round.

Student: Because I was hopeful that he would cooperate.

Professor Ben Polak: You were hoping he would cooperate, all right. And why did you defect thereafter?

Student: Because I thought he would continue to defect after he defected.

Professor Ben Polak: Because he defected, he would continue to defect. Patrick, you're the person who just defected throughout here. Grab the mic that's next to you. Why did you just defect?

Student: It's such a short game that it makes sense to defect in the last period, so the second-to-last period, and so the first period.

Professor Ben Polak: All right, that's an interesting idea. So Patrick's saying: actually, if we look at the last period of this game--if we look at this last period of the game--what does the game look like in the last period?

Student: It's a single-period game.

Professor Ben Polak: In the last period, this actually is the game. If I drew out the game with two periods, it would be kind of a hard thing to draw; it would be an annoying diagram to draw. But in the last period of the game, whatever happened in the first period is what? It's sunk, is that right? Everything that happened in the first period is sunk. So in the last period of the game these are the only relevant payoffs, is that right? Since these are the only relevant payoffs looking forward, in the last period of the game we know that there's actually a dominant strategy. And what is that dominant strategy in the last period of the game? To do what? In Prisoner's Dilemma, what's the dominant strategy? Shout it out. Defect, okay. So what we should see in this game--we didn't actually, because we had some kindness over here from Edwina, but what we should see in general--is that in the last period of the game, in period two, we're going to get both people defecting. The reason we're going to get both people defecting is that the last period of the game is just a one-shot game. There's nothing particularly exciting about it. There is no tomorrow, and so people are going to defect.

But now let's go back and revisit some of the arguments that Edwina, Brooke, and--I've forgotten what your neighbor is called again; Ben, I should remember that--and Ben said earlier. They gave quite elaborate reasons for cooperating: cooperating to establish reputation; cooperating because the other person might cooperate; and so on. But most of these behaviors were designed to either induce or promise cooperation in period two, is that right? What we've just argued is that in period two everyone's going to defect. Period two is just a trivial one-stage Prisoner's Dilemma. We actually analyzed it in the very first week of the class. And provided we believe these payoffs, we're done. In period two people are going to defect. Since they're going to defect in period two, nothing I can do in period one is going to affect that behavior, and therefore I should defect in period one also.

To belabor this point, we can actually draw up what the matrix looks like in period one--so let's do that--using the style we used last week--two weeks ago--before we went away. So here once again is the matrix we had before, and I want to analyze the first stage game. In the first stage game, what I'm going to do is add in the payoffs I'm going to get from tomorrow. The payoffs I'm going to get from tomorrow are from tomorrow's equilibrium. Well, this isn't going to do very much for me, as we'll see, because we know I'm playing defect tomorrow: I'll get 2 + 0 tomorrow, 2 + 0 tomorrow, -1 + 0 tomorrow, 3 + 0 tomorrow, 3 + 0 tomorrow, -1 + 0 tomorrow, and 0 + 0 tomorrow, 0 + 0 tomorrow. So just as we did with the war of attrition game two weeks ago, we can put in the payoffs from tomorrow; we can roll back those equilibrium payoffs to today. It's just that in this particular exercise it's rather a boring thing, because we're just adding 0 to everything. When I add 0 to everything, I then just cancel out the zeros and I'm back where I started, and of course I should defect.

So what I'm going to see is this: because I'm going to defect anyway tomorrow, today is just like a one-shot game as well. And I'm going to get defect again. Now, here we played the game twice and got defect, defect. What about if we played the game three times? It's the same thing. We didn't play it three times just now, but we did play the game three times between Edwina and Ben. There we know we're going to defect in the third round. Therefore we may as well defect in the second-to-last round. Therefore we may as well defect in the first round. If we played it five times, we know we're all going to defect in the fifth round. Therefore we may as well defect in the fourth round. Therefore we may as well defect in the third, and so on. If we played it 500 times--we wouldn't have time in the class, but if we played it 500 times--we know that in that 500th period it's a one-shot game and people are going to defect. And therefore, in the 499th period people are going to defect. And therefore in the 498th period people are going to defect, and so on.
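This rollback argument can be sketched as a tiny computation. This is my own hedged illustration of the unraveling logic, not anything from the lecture: the key point is that the equilibrium continuation payoff is the same constant whatever I do today, so it cancels and every stage reduces to the one-shot game.

```python
# Roll back a T-times-repeated Prisoner's Dilemma from the last period.
PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}

def backward_induction(T):
    """Return the equilibrium move at each stage, working from stage T back."""
    plan = []
    future = 0  # continuation payoff from later (already-solved) stages
    for stage in range(T, 0, -1):
        # Best response when the opponent plays her equilibrium move D;
        # `future` is added to both options, so it never changes the choice.
        move = max("CD", key=lambda m: PAYOFF[(m, "D")] + future)
        plan.append((stage, move))
        future += PAYOFF[("D", "D")]  # both defect in equilibrium: add 0
    return plan

# Whether we repeat the game 2, 3, or 500 times, every stage unravels to D.
assert all(move == "D" for _, move in backward_induction(500))
```

The loop makes the unraveling visible: nothing about stage 499 differs from stage 500 once the stage-500 play is pinned down, and so on all the way back to stage one.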
So the problem here is that we get unraveling, something we've seen before in this class; we get unraveling from the back. I have a worry that there might only be one L in "unraveling" in America, is that right? How many L's do we put in "unraveling" in America? One. I've just come back from England and my spelling is somewhere in the mid-Atlantic right now. I'll leave it as one. All right: unraveling from the back. Essentially this is a backward induction argument, only instead of using backward induction we're really using sub-game perfection. We're looking at the equilibria in the last games and, as we roll back up the game, we get unraveling.

So here's bad news. The bad news is: we'd hoped that by having repeated interaction in the Prisoners' Dilemma, we would be able to sustain cooperation. That's been our hope since day one of the class. In fact, we stated it kind of confidently on the first day of the class, and we kind of intuitively believe it. But what we're discovering is that even if you played this game 500 times and then stopped, you wouldn't be able to sustain cooperation in equilibrium, because we're going to get unraveling in the last stage and so on and so forth. So it seems like our big hope that repeated interaction would induce cooperation in society is going down the plug hole. That's bad.

So let's come back and modify our lesson a little bit. What went wrong here was that in the last period of the game there were no incentives generated by the future--there was no promise of future rewards or future punishments--and therefore cooperation broke down, and then we had unraveling. So the lesson here is what? The lesson is: for this to work, it helps to have a future. This whole idea of repeated interaction was that the future was going to create incentives for the present. But if the game comes to an end, there's going to be some point when there isn't a future anymore, and then we get unraveling.

Now, this is not just a formal technical point to be made in the ivory tower of Yale. This is a true idea. So for example, if we think about CEOs, or presidents, or managers of sports teams, there's a term we use--at least in the States--to describe such leaders when they're getting towards the end of their term and everyone knows it. What's the expression we use? "Lame duck." So we have this lame duck effect. The lame duck effect at the end of somebody's term undermines their ability to cooperate--their ability to provide incentives for people to cooperate with them--and causes a problem. So this lame duck effect affects presidents, but it also affects CEOs of companies. But it's not just leaders who run into this problem. So if you have an employee--if you're employing somebody--you may have a contract with the person you're employing, but basically you're sustaining cooperation with this person because you interact with them often. You know you're always going to interact with them. But then this employee approaches retirement. Everyone knows that in April or something they're going to retire; then the future can't provide incentives anymore. And you have to switch over from the implicit incentive of knowing you're going to be interacting in the future, to an explicit incentive of putting incentive clauses in the contract. So retirement can cause, if you like, a lame duck effect.

This is even true in personal relationships. In your personal relationships with your friends, if you think that those friendships are going to go on for a long time--be they with your significant other or just with the people you hang out with--you're likely to get a lot of cooperation. But if, as with perhaps most economics majors, most of your significant others are only going to last for a day at most, you're not going to get great cooperation. You're going to get cheating. No one's rising to that one, but I guess it's true. So what do we call these: "economics majors' relationships."

These are kind of "end effects." All of these things are caused by the fact that the relationship is coming to an end. And once the relationship is coming to an end, all those threats and promises of future behavior, implicit or otherwise, are basically going to go away. So at this point we might think the following. You might conclude that if a relationship has a known end--if everyone knows the relationship's going to end at a certain time--then we're done, and we basically can't sustain cooperation through repeated interaction. That's kind of what the example we looked at seems to suggest. However, that's not quite true. So let's look at another example where a relationship is going to have a known end, but nevertheless we are able to sustain some cooperation. And we'll see how. So again, I'm being careful here: I've said it helps to have a future. I haven't said it's necessary to have a future. So that's good news for the economics majors again.

So let's do this example, to illustrate that even a finite interaction--even an interaction that's going to end, and everyone knows it's going to end--might still have some hope for cooperation. So look at this slightly more complicated game here. This game has three strategies, and we'll call them A, B, and C for each player. The payoffs are as follows: (4,4), (0,5), (0,0); down here we'll do (0,0), (0,0) and (3,3); and in the middle row (5,0), (1,1), (0,0). We're going to assume that this game--just like we did the first time we did Prisoner's Dilemma--is going to be played twice. It's going to be repeated: it's going to be played twice, repeated once.

So let's just make sure we understand what the point of this game is. In this game--in the one-shot game--I hope it's clear that (A,A) is kind of the cooperative thing to do. We'd like to sustain play of (A,A), because then both players get 4, and that looks pretty good for everybody. However, in the one-shot game (A,A) is not a Nash equilibrium. Why is (A,A) not a Nash equilibrium? Let me grab those mikes again. Why is (A,A) not a Nash equilibrium? Anybody? I'm even getting to know the names at this stage of the term. This is Katie, right? So shout out.

Student: The best response to the other guy playing A is playing B.

Professor Ben Polak: Good, so if I think the other person's going to play A, I'm going to want to defect and play B and obtain a gain--a gain of 1. So basically I'll get 5 rather than 4. I'll defect to playing B and get 5 rather than 4, for a gain of 1, is that right? So (A,A) is not a Nash equilibrium in the one-shot game--as we're sometimes going to call it--in the one-shot game.
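We can verify Katie's point, and find the pure-strategy Nash equilibria of this 3x3 stage game, with a short script. This is a sketch; the dictionary encoding of the board is mine:

```python
# The 3x3 stage game from the board: (row move, column move) -> payoff pair.
S = "ABC"
P = {
    ("A", "A"): (4, 4), ("A", "B"): (0, 5), ("A", "C"): (0, 0),
    ("B", "A"): (5, 0), ("B", "B"): (1, 1), ("B", "C"): (0, 0),
    ("C", "A"): (0, 0), ("C", "B"): (0, 0), ("C", "C"): (3, 3),
}

def pure_nash():
    """A profile is Nash if neither player gains by deviating unilaterally."""
    return [
        (r, c) for (r, c) in P
        if P[(r, c)][0] == max(P[(r2, c)][0] for r2 in S)
        and P[(r, c)][1] == max(P[(r, c2)][1] for c2 in S)
    ]

# The best response to the opponent playing A is B (5 beats 4), so (A,A) fails.
assert max(S, key=lambda r: P[(r, "A")][0]) == "B"
assert ("A", "A") not in pure_nash()
assert set(pure_nash()) == {("B", "B"), ("C", "C")}
```

The two pure equilibria the script finds, (B,B) and (C,C), are exactly the ingredients the two-stage argument below relies on: one equilibrium to use as a reward and one as a punishment.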
  452. So now imagine we play this
    game twice.
  453. Instead of just playing once,
    we're going to play this game
  454. two times.
    I'll come back to that.
  455. Before I do that what are the
    pure strategy Nash equilibria in
  456. this game?
  457. So the Nash equilibria in this
    one shot game are (B,
  458. B) and (C, C).
    There's some mixed ones as well
  459. but this will do.
    So (B, B) and (C,
  460. C) are the pure strategy Nash
  461. Now consider playing this game
  462. And last time we looked at a
    game played twice,
  463. it was Prisoner's Dilemma,
    and we noticed that we couldn't
  464. sustain cooperation because in
    the last stage people weren't
  465. going to cooperate and hence in
    the first stage people weren't
  466. going to cooperate.
    But let's look what happens
  467. here.
    If this game is played twice is
  468. there any hope of sustaining
    cooperation, i.e.
  469. A, in both stages?
    Could we have people play A in
  470. the first stage and then play A
    again in the second stage?
  471. So Patrick's shaking his head.
    So that's right to shake his
  472. head.
    So let me grab the other mike.
  473. So why is that not going to
  474. Why can't we get people to
    cooperate and play A in both
  475. periods?
    Shout out.
  476. Student: In the second
    period you're still going to
  477. defect and play B.
    Professor Ben Polak:
  478. Good, so in the second
    period, exactly the argument
  479. that Katie produced just now in
    the one shot game applies
  480. because the second period game
    is a one shot game.
  481. So we've got no hope of
    sustaining cooperation in both
  482. periods.
    Let's call this cooperation.
  483. We can't sustain (A,
    A) in period two,
  484. in the second period.
    However, I claim that we may be
  485. able to get people to cooperate
    in the first period of the game.
  486. Now how are we going to do that?
    So to see that let's consider
  487. the following strategy.
    But consider the strategy--the
  488. strategy's going to be--play A
    and then play C if (A,
  489. A) was played;
    and play B otherwise.
  490. So this strategy is an
    instruction telling the player
  491. how to play.
    Now, before we consider whether
  492. this is an equilibrium or not,
    let's just check that this
  493. actually is a strategy.
    So what does a strategy have to
  494. do?
    It has to tell me what I should
  495. do--it should give me an
    instruction--at each of my
  496. information sets.
    In this two period game,
  497. each of us, each of the players
    in the game has two information
  498. sets.
    They have an information set at
  499. the beginning of the game and
    they have another information
  500. set at the beginning of period
  501. Is that right?
    So it has to tell you what to
  502. do at the first information set
    and at the second information
  503. set and it does.
    It says play A at the first
  504. one, and then at the beginning
    of period two--I said there's
  505. only one information set there,
    but actually there's nine
  506. possible information sets
    depending on what happened in
  507. the first period.
    So each thing that happened in
  508. the first period is associated
    with a different information
  509. set.
    I always know what happened in
  510. the first period.
    And at each of those nine
  511. information sets it tells me
    what to do at the beginning of
  512. period two.
    In particular,
  513. it says if it turns out that
    (A, A) was played then play C
  514. now.
    And otherwise,
  515. all the other eight possible
    information sets I could find
  516. myself in, play B.
    So this is a strategy.
  517. Now of course the big question
    is, is this strategy an
  518. equilibrium, and in particular
    is it a sub-game perfect
  519. equilibrium?
  520. Let me a bit more precise:
    if both players were playing
  521. this strategy would that be a
    sub-game perfect equilibrium?
  522. Well let's have a look.
  523. (Of course I don't see it now
    so let's pull both these boards
  524. down.) So to check whether this
    is a sub-game perfect
  525. equilibrium,
    we're going to have to check
  526. what?
    We're going to have to check
  527. that it induces Nash behavior in
    each sub-game.
  528. (I think the battery is going
    on that.
  529. Shall I get rid of that.
    Okay, I'm going to shout.
  530. Can people still hear me?
    People in the balcony can they
  531. hear me?
    Yup, okay.) So we're going to
  532. have to see if we can sustain
    Nash behavior in each sub-game.
  533. So let's start with the
    sub-games associated with the
  534. second period.
  535. there are nine such sub-games
    depending on what happened in
  536. the past, depending on what
    happened in the first period.
  537. There's a sub-game following
  538. There's a sub-game following
  539. There's a sub-game following
    (A,C), and so on.
  540. So for each activity in the
    first period,
  541. for each profile in the first
    period there's a sub-game.
  542. However, it doesn't really
    matter to distinguish all of
  543. these sub-games particularly
    carefully here,
  544. since the costs from the past,
    what happened in the past is
  545. sunk, so we'll just look at them
    as a whole.
  546. So in period two after
    (A,A)--so you have one in
  547. particular of those nine
    sub-games--after (A,A),
548. this strategy induces (C,C).
  549. If people play A,
    if both people play A in the
  550. first period,
    then in the sub-game following
people are supposed to play (C,C).
  552. Is that a Nash equilibrium of
    the sub-game?
553. Well was (C,C) a Nash
equilibrium?
554. Yeah, it's one of our Nash
equilibria.
  555. Let's just look up there,
    we've got it listed.
  556. There it is.
    So we're playing this Nash
  557. equilibrium.
    So that is a Nash equilibrium,
  558. so we're okay.
    After the other choices in
  559. period one, this strategy
    induces (B, B).
  560. That's good news too because
    (B, B), we already agreed,
  561. was a Nash equilibrium in the
    one shot game.
  562. So in all of those nine
    sub-games, the one after (A,A),
  563. and the eight after everything
    else, we're playing Nash
  564. behavior, so that's good.
    What about in the whole game?
  565. In the whole game,
    starting from period one,
  566. we have to ask do you do better
    to play the strategy as
  567. designated,
    in particular to choose A,
or would you do better to defect?
  569. Well let's have a look.
    So if I choose A--remember the
  570. other person is playing this
    strategy--so if I choose A then
  571. my payoff in this period comes
    from (A,A) and is 4.
  572. If I choose A then we're both
    playing A in this period and I
  573. get 4.
    Tomorrow, according to this
  574. strategy--tomorrow since we both
    played A--both of us will now
  575. play C.
    Since we're both playing C,
  576. I'll get an additional payoff
    of 3.
  577. So tomorrow (C,
    C) will occur and I'll get 3
  578. for a total of 7.
    What about if I defect?
  579. We could consider lots of
    possible defections,
  580. but let's just consider the
    obvious defection.
You can check the other ones at
home.
  582. So if I defect and choose B
    now, then, in this period,
  583. I will be playing B and my
    opponent or my pair will be
  584. playing A.
    So in this period I will get 5.
  585. And tomorrow,
    since (A,A) did not occur,
  586. both of us will play B and get
    a continuation payoff of 1.
  587. So the continuation payoff this
    time will be following from (B,
  588. B), and I'll get 1.
    Why don't I do what I've been
  589. doing before in this class and
    put boxes around the
  590. continuation payoffs,
    just to indicate that they are,
  591. in fact, continuation payoffs.
    So if I play A,
  592. I get 4 now and a continuation
    payoff of 3, for a total of 7.
  593. If I play B now,
    I gain something now,
  594. I get 5 now,
    but tomorrow I'll only get 1
  595. for a total of 6.
  596. So in fact, 7 is bigger than 6,
    so I'm okay and I won't want to
  597. do this defection.
  598. I just want to write this one
    other way because it's going to
  599. be useful for later.
    So one other way to write this,
  600. I think we've convinced
    ourselves that this is an
  601. equilibrium,
    but one other way to write this
  602. is, and it's a more general way
    in repeated games,
  603. is to write it explicitly
    comparing the temptations to
  604. cheat today with the rewards and
    punishments from tomorrow.
  605. So what we want to do is,
    in general, we can just rewrite
  606. this as checking that the
    temptation to cheat or defect
  607. today is smaller than the value
    of the reward minus the value of
  608. the punishment.
    But the key words here are:
  609. defecting occurs today;
    rewards and punishments occur
  610. tomorrow.
    If we just rewrite it this way,
  611. we'll see exactly the same
    thing, just rearranging
  612. slightly.
    The temptation to defect today
  613. is I get 5 rather than 4,
    or if you like a gain of 1.
  614. And the value of the reward
    tomorrow--the reward was to play
  615. (C, C) tomorrow and get 3.
    The value of the punishment
  616. tomorrow was to play (B,
    B) tomorrow and get 1,
  617. and that difference is 2.
    So here the fact that the
  618. temptation is outweighed by the
    difference between the value of
  619. the reward and the value of the
    punishment is what enabled us to
  620. sustain cooperation.
    I'm just writing that in a more
  621. general way because this is the
    way that we can apply in games
  622. from here on.
    We're going to compare
  623. temptations to cheat with
    tomorrow's promises.
  624. Patrick, let me get you a mike.
    Student: I don't
  625. understand why it's reasonable
    to think you would play (B,
  626. B) in the second period though.
    In the second period you have a
  627. temptation to play C,
    C even if the person defected
  628. on you.
    Professor Ben Polak:
629. Good, that's a very good
question.
  630. So what Patrick's saying is,
    it's all very well to say we're
  631. sustaining cooperation in the
    first period here,
  632. but the way in which we
    sustained cooperation was by
  633. going along with,
    as it were,
  634. the punishment tomorrow.
    It required me,
  635. tomorrow, to go along with the
    strategy of choosing B if I
  636. cheated in the first period.
    I want to answer this twice,
  637. once disagreeing with him and
    once agreeing with him.
  638. So let me just disagree with
    him first.
  639. So notice tomorrow,
    if the other person,
  640. the other player is going to
    play B then I'm going to want to
  641. play B.
    So the key idea here is,
  642. as always in Nash equilibrium,
    if I take the other person's
  643. play as given and just look at
    my own behavior--if I think the
  644. other person is playing this
    strategy and hence he's going to
  645. play B tomorrow after I've
cheated--then I want to play
  646. B myself.
    So that check is just our
  647. standard check,
    and actually that's the check
  648. that makes sure that it really
    is a sub-game perfect
  649. equilibrium.
    We're not putting some
  650. punishments down the tree that
    are arising out of equilibrium.
  651. It has to be I want to do
    tomorrow what I'm told to do
  652. tomorrow.
    So that idea seems right and
  653. I'm glad Patrick raised it
    because that was the next thing
  654. in my notes.
    I want to go along with this
  655. punishment because if the other
    person's playing B I want to
  656. play B myself.
  657. I think Patrick's onto
    something and let me come back
  658. to it in a minute.
    What I want to do before I do
  659. that is just draw out a general
    lesson from this game.
  660. The general lesson is we can
    sustain cooperation even in a
  661. finitely repeated game,
    but to do so we need there to
  662. be more than one Nash
    equilibrium in the stage game.
  663. What we need there to be is
    several Nash equilibria,
  664. one at least of which we can
    use as a reward and another one
which we can use as a
punishment.
  666. So even if a game is only
    played a finite number of times,
  667. if there are several equilibria
    in the stage game,
  668. both (B,B) and (C,C),
    we can use one of them as a
reward and the other one as a
punishment,
  670. and use that difference to try
    and get people to resist
  671. temptations today.
    So that's the general idea
672. here, and let's just write that
down.
  673. Patrick don't let me get away
    with not coming back to your
  674. point, I want to come back to it
    in a second.
  675. So the lesson here is,
    if a stage game--a stage game
  676. is the game that's going to be
    repeated--if a stage game has
  677. more than one Nash equilibrium
    in it,
  678. then we may be able to use the
    prospect of playing different
  679. equilibria tomorrow to provide
    incentives--and we could think
  680. of these incentives as rewards
    and punishments--for cooperation
  681. today.
    In the game we just saw,
  682. there were exactly two pure
    strategy Nash equilibria in the
  683. sub-game.
    We used one of them as a reward
  684. and the other one as a
    punishment, and we were able to
  685. sustain cooperation in a
    sub-game perfect equilibrium.
  686. Now, a question arises here,
    and I think it's behind
  687. Patrick's question,
    and that is how plausible is
  688. this?
    How plausible is this?
  689. Formally, if we write down the
    game and do the math,
  690. this comes out.
    But how plausible is this as a
model of what's going on in
the world?
  692. I think the worry--I'm guessing
    this is worry that was behind
  693. Patrick's question--is this.
    Suppose I'm playing this game
  694. with Patrick and suppose Patrick
    cheats on me the first period,
  695. so Patrick chooses B while I
    wanted him to choose A in the
  696. first period.
    Now in the second period,
  697. according to the equilibrium
  698. we're supposed to play (B,
    B) and get payoffs of 1 rather
  699. than (C,C) and get payoffs of 3.
    So let's make that visible
  700. again.
  701. But suppose Patrick comes to me
    in the meantime.
  702. So between period one and
    period two, Patrick shows up at
  703. my office hours and he says:
  704. I know I cheated on you
    yesterday, but why should we
  705. punish ourselves today?
    Why should we,
  706. both of us, lose today by
    playing the (B,B) equilibrium?
  707. Why don't we both switch to the
    (C,C) equilibrium?
  708. After all, that's better for
    both of us.
  709. Patrick's saying to me,
    it's true that I cheated you
yesterday, but "let bygones be
bygones,"
  711. or "why cry over spilt milk,"
    or he'll use some other saying
plucked out of the book of
proverbs,
  713. and say to me:
    well why go along with the
  714. punishment.
    Let's just play the good
  715. equilibrium now.
    And, if I look at things and I
  716. say well, actually,
    it's true I got nothing in the
  717. first period because Patrick
    kind of cheated me in the first
  718. period--so it's true I got
    nothing yesterday--and it's true
  719. it was Patrick who caused me to
    get nothing yesterday,
  720. but nevertheless that's a sunk
    cost and I'm comparing getting 1
  721. now with getting 3 now.
    Why don't we just go along and
  722. get 3?
    Moreover, I'm not in danger of
  723. being cheated again because if
    Patrick believes I'm going to
play C, he's going to play C
too.
  725. So that kind of argument
    involves what?
  726. It involves some kind of
    communication between stages,
  727. but it sounds like that's going
    to be a problem.
  728. Why?
    Well, suppose it's the case
  729. that we are going to get
    communication between periods
  730. and suppose it's the case that
    someone with the gift of the
  731. gab,
    someone on his way to law
  732. school like Patrick,
    is going to be able to persuade
  733. me to go back to the good
    equilibrium for everybody in
  734. period two,
    then we know we're going to
  735. play the good equilibrium in
    period two and now we've lost
  736. any incentive to cooperate in
    period one.
  737. The only reason I was willing
    to cooperate in period one was
  738. because the temptation to defect
    was outweighed by the difference
  739. between the value of the reward
    and the value of the punishment.
  740. If we're going to get the
    reward anyway,
  741. I'll go ahead and defect today.
    So the problem here is this
  742. notion of "renegotiation," this
    notion of communicating between
  743. periods can undermine this kind
    of equilibrium.
  744. There's a problem that arises
    if we have renegotiation.
745. So there may be a problem of
renegotiation.
746. Now, this problem may not be
    such a big problem.
  747. For example,
    it may be, say,
  748. I'll be so angry at Patrick
    because he screwed me over in
  749. period one that I won't go along
    with the renegotiation.
  750. It may also be the case,
    and we'll see some examples of
  751. this on the homework assignment,
    that the many equilibria in the
  752. second stage of the game are not
    such that a punishment for
Patrick is also a punishment for
me.
  754. What really caused the problem
    here was, in trying to punish
  755. Patrick, I had to punish myself.
    But you could imagine games,
  756. or see some concrete examples
    on the next homework assignment,
  757. in which punishing Patrick is
    rather fun for me,
  758. and punishing me is rather fun
    for Patrick,
  759. and that's going to be much
    harder to renegotiate our way
  760. out of.
    There was a question,
  761. let me get a mike out to the
  762. Yeah?
    Student: If we're ruling
  763. out renegotiation,
    can't we devise a strategy for
  764. Prisoner's Dilemma as well even
    though it doesn't have multiple
  765. Nash equilibriums?
    Professor Ben Polak:
  766. Yeah, okay good,
    so the issue there is,
  767. in Prisoner's Dilemma,
    we established in the first
  768. week that if we're not allowed
    to make side payments,
  769. we're not allowed to bring in
    outside contracts,
  770. then no amount of communication
    is going to help us.
  771. So you're right if we can rely
    on the courts or the mafia to
  772. enforce the contracts that would
    be fine and then communication
  773. would have bite.
    But you remember way back in
  774. the first week when we tried to
    talk our way out of bad behavior
  775. in the Prisoner's Dilemma it
    didn't help precisely because
  776. it's a dominant strategy.
    Whereas, here,
  777. Patrick's conversation,
    Patrick's verbal agreement to
  778. play the other equilibrium is an
    agreement to play a Nash
  779. equilibrium.
    That's what is getting us into
  780. trouble.
    So what may help us here,
  781. what may avoid renegotiation is
    simply I'm not going to go along
  782. with that renegotiation--I'm too
    angry about having been cheated
on--or it may actually be, for
other reasons, that
  784. I enjoy the punishment.
785. But this is a real problem in
    society and I don't think we
  786. should pretend that this problem
    isn't there.
  787. So a good example is in
    bankruptcy, which is one of
  788. those words I can never spell.
    It seems I have too many
  789. consonants in it,
    is that right?
790. It's approximately right
anyway.
  791. So bankruptcy law in the U.S.
    for the last 200 odd years has
  792. gone through cycles.
    One way to view these cycles
  793. is, they're cycles of relaxing
    the law and making life "easier
  794. for borrowers" and then
    tightening up again.
  795. This is not a recent
    phenomenon, this is not only a
  796. recent phenomenon,
    this occurred throughout the
  797. nineteenth century.
    So what typically happened was
  798. there was either explicit
    renegotiation between parties or
  799. renegotiation through act of
    Congress or sometimes through
  800. the acts of the states,
    in which bankrupt debtors were
  801. basically let off or given
    easier terms.
802. The argument was always the
same.
  803. These people are not going to
    pay back now.
  804. It's clear from the nineteenth
    century, often if you were
  805. bankrupt you were in jail,
    actually worse than that.
  806. Sometimes in the nineteenth
    century in England not only if
  807. you were bankrupt were you in
    jail but your creditors were
  808. having to pay the fees to feed
    you in jail.
  809. So there you were sitting in
    jail, you weren't paying that
  810. money back to your creditor,
    and you're actually costing
  811. money to your creditor by being
    in jail.
  812. This seems like a situation
    that you want to renegotiate
  813. your way out of.
    You say, hey let's let these
  814. guys out of jail.
    Let them be productive again,
  815. and then they'll pay back part
    of the loans.
  816. So you had these waves of
    bankruptcy reform in which the
  817. debtors' prisons were closed
    down, people were let out,
  818. people were relieved of debt.
    What's the problem with doing
  819. that?
    That seems like a good idea
  820. right.
    After all, you don't want all
  821. these people bankrupt,
    in debt, not paying money back
  822. to their creditors anyway.
    That doesn't seem like a good
  823. situation in society.
    It seems like a renegotiation
  824. that's a win-win situation:
    it's better for everybody.
  825. What's the problem with it
  826. Let's get a mike down here.
    What's the problem with this?
827. Student: It incentivizes
defaulting.
  828. Professor Ben Polak:
    Right, it creates an
  829. incentive for people not to
    repay in the first place.
  830. It creates an incentive for
    people to take big risks now,
  831. and hence, it makes bankruptcy,
    if you like,
  832. or makes non-repayment of debt
    more likely.
  833. So this has been going on for a
    while, but you see it very much
  834. today if you read the financial
    pages of the papers in the last
  835. few weeks.
    There's a big worry in the U.S.
  836. right now about people failing
    to repay what kind of debt?
  837. What kind of debt is the big
    worry about?
  838. Mortgage debt,
    right, so those people who are
  839. house owners failing to pay back
    mortgage debt and,
  840. equally worrying,
    financial institutions that
have lent a lot of,
    for example,
  842. sub-prime debt now find
    themselves in financial trouble.
  843. You're going to read a lot in
    the papers about not letting
844. people lightly out of those
situations of being in debt,
845. or not letting people lightly
out of bankruptcy.
  846. The term you're going to hear
    is "bail out."
  847. So bail out--the argument
    you're going to read is,
  848. you don't want the government
    or the central bank bailing out
  849. those financial institutions who
    have apparently taken too large
850. risks on sub-prime mortgage
lending,
  851. even though we all agree it's
    better right now for those
financial institutions not to go
under.
  853. Why are we not going to--Even
    though it's better for everybody
  854. for it not to go under,
    why are we not going to bail
  855. them out?
    Because it undermines the
  856. incentives for them not to make
    bad loans to start with.
  857. To a lesser extent you're going
    to hear that on the debtor side
  858. as well.
    You're going to hear some
  859. people say we shouldn't be
    bailing out people who took on
  860. bad loans,
    took on bad mortgages to
  861. finance their houses,
    again for bail out reasons.
862. So this is an important
trade-off.
  863. If you go on to law school,
    you're going to see a lot about
  864. this kind of discussion,
    and this is the discussion of
  865. trading off ex-ante efficiency
    and ex-post efficiency.
  866. Sometimes, as Patrick has
    pointed out in the game just
  867. now, the ex-post efficient thing
    to do is to go back to the good
  868. equilibrium,
    or if you like to bail out
869. these firms who've made bad
loans.
  870. However, from an ex-ante point
    of view, it creates bad
  871. incentives for people to make
    those loans in the first place;
and, from the ex-ante point of
    view, it created the incentive
  873. for people to defect in the
    first period of that game.
  874. So this theme of ex-ante versus
    ex-post efficiency is not one
  875. we're going to go into anymore
    in this class,
  876. but it should be there in the
    back of your minds when you all
  877. end up in law school in a few
    years time.
878. Okay, so, so far what have we
learned?
  879. We've been looking at repeated
    interaction and seeing if it can
  880. sustain cooperation.
    The first thing we learned was
  881. that if the repeated interaction
    is a finite interaction,
  882. if we know when it's going to
    end--we know when the
  883. interaction's going to end--then
    sustaining cooperation is going
  884. to be hard because in the last
    period there will be an
  885. incentive to defect.
    We saw we could get around that
  886. to some extent if games have
    multiple equilibria,
  887. but in a game like Prisoner's
    Dilemma, we're really in
  888. trouble.
    Things will unravel from the
  889. back.
    So now let's mix things up a
  890. little bit by looking at a more
    complicated variety of repeated
  891. interactions.
    Rather than just play the game
  892. once or twice,
    or three times,
  893. let's play the game under the
    following rules.
  894. We'll go back to our same
    players, how many mikes are
  895. still out here?
    I took them both back,
  896. is that right?
    I'm taking both the green and
  897. the blue mike,
    and I'm giving them back to our
  898. players.
    So this is to Brooke and this
  899. is to Patrick.
    And we're going to have Brooke
  900. and Patrick play Prisoner's
    Dilemma again.
  901. I'm hoping I haven't deleted it.
    Maybe I did.
902. It doesn't matter, we know the
payoffs.
  903. We're going to have them play
    Prisoner's Dilemma again,
  904. but this time,
    in between every play of the
  905. game, I'm going to toss a coin.
    Actually I'll toss the coin
  906. twice and if that coin comes up
    heads both times then the game
  907. will end, but otherwise they'll
    play again.
  908. So everyone understand what
    we're going to do?
909. We're going to play Prisoner's
Dilemma.
  910. At the end of every period I'll
    toss a coin twice.
  911. I might get Jake to toss it.
    Jake will toss a coin twice.
  912. If it comes up heads both times
    the game's over but otherwise
  913. the game continues.
    So both Brooke and Patrick
  914. should get ready to play,
    and the payoffs of this game
  915. are just what we had before.
    So let's just remind ourselves
what the payoffs of that game
were.
  917. So we've got cooperate,
    defect, cooperate,
  918. defect, (2,2),
    (-1,3), (3, -1) and (0,0).
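Under the coin rule, the game ends only on heads-heads, so it continues with probability 1 - (1/2)^2 = 3/4 after each round. As a rough preview of the analysis (the lecture returns to it next time; the variable names are mine), this continuation probability acts like a discount factor:

```python
# The game ends only if both tosses come up heads: probability 1/4.
p_continue = 1 - 0.5 ** 2          # = 0.75

# Value of mutual cooperation forever under a grim-trigger pairing,
# with the stage payoffs above: 2 each round, weighted by p_continue.
# Geometric series: 2 * (1 + p + p^2 + ...) = 2 / (1 - p).
value_cooperate = 2 / (1 - p_continue)

# One-shot deviation against grim trigger: 3 today, then (0,0) forever.
value_defect = 3 + p_continue * 0

# Cooperation looks worthwhile whenever value_cooperate exceeds value_defect.
```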
  919. And we'll keep score here:
    so this Brooke and Patrick.
  920. So, putting pressure on these
    guys, let's write down what
you're going to do the first
period.
  922. Brooke? Student: Defect.
    Professor Ben Polak:
  923. Patrick?
    Student: Cooperate.
  924. Professor Ben Polak:
    All right.
  925. I think we're getting some
    payback from earlier,
  926. right.
    Round two.
  927. Student: Are you going
    to toss the coin?
  928. Professor Ben Polak:
    Oh I have to toss the coin,
  929. you're absolutely right,
    thank you.
  930. Now I have to find a coin.
    Look at that, thank you Ale.
  931. Twice: toss it twice.
  932. Heads, heads again,
    so the game is over.
  933. That didn't last long.
    Just for the sake of the class,
  934. let's pretend that it came up
  935. Okay we'll cheat a little bit.
    Okay, so we're playing a second
936. time--just with a little bit of
cheating.
  937. I need someone else,
    someone less honest to toss the
  938. coin.
    Brooke what do you choose?
939. Student: Oh I'm
defecting.
  940. Professor Ben Polak:
    Defecting again,
  941. Patrick?
    Student: Cooperate.
  942. Professor Ben Polak:
  943. Patrick seems very trusting
    here, all right let's toss the
  944. coin a third time.
    All right, Brooke?
  945. Student: I'm going to
    defect again.
  946. Student: Defect.
    Professor Ben Polak:
  947. All right,
    heads, heads,
  948. so this time we'll end it.
    So what happened this time,
  949. let's just talk about it a bit.
    So Brooke and Patrick were
  950. playing, Patrick cooperated a
    bit in the beginning,
  951. Brooke's defected throughout.
    Brooke why did you defect?
952. Shout out so everyone can hear
you.
  953. Why did you defect right from
    the start of the game?
  954. Student: Because last
    time it didn't work so well
  955. cooperating.
    Professor Ben Polak:
  956. Last time it didn't work
    so well, okay.
  957. Fair enough but even after
    Patrick was sort of cooperating
  958. you went on defecting.
    So why then?
  959. Student: Because I
    wanted to get the higher payoff,
  960. I thought either he would
    continue cooperating and I could
  961. defect,
    Professor Ben Polak:
  962. All right,
    if he had gone on cooperating,
  963. which in fact he did.
    Patrick why were you
  964. cooperating early on here?
    Shout out so people can hear
  965. you.
    Student: So with a two
  966. head rule, like you have a 75%
chance of having another game.
  967. So with those payoffs,
    even one period the payoff of
  968. cooperating twice is the same as
    defecting once,
  969. so it's better if you can
    continue cooperating,
  970. and the percentage is high
    enough that it would make sense
  971. to do so.
    Professor Ben Polak:
  972. All right,
    if you figure there's a good
  973. enough chance of getting--even
    after Brooke's defected the
974. first period you went on
cooperating,
  975. but then after the second
    period you gave up and started
  976. defecting.
    If it had gone on to the fourth
  977. period what would you have done?
    Student: Defected.
  978. Professor Ben Polak:
    You would have defected
  979. again, all right.
    Fifth period?
  980. Student: Well if she
    kept defecting,
  981. I would keep defecting.
    Professor Ben Polak:
  982. All right,
    so what Patrick's saying is he
  983. started off cooperating but once
    he saw that Brooke was
  984. defecting,
    he was going to switch to
  985. defect.
    And basically as long as she
  986. went on defecting,
    he was going to stick with
  987. defecting.
    Let's try a different pair.
  988. So why don't we switch it over
    to your partners there.
  989. So Ben here and Edwina.
  990. So why don't you stand up.
    I want everybody to see these
  991. people.
    So stand up a second.
  992. So these are our players,
    I want people at the back to
  993. essentially know who are
    playing, this is Edwina and this
  994. is Ben.
    So Edwina--sit down so you can
  995. actually write things down.
    So Edwina and Ben,
  996. Edwina, have you both written
    down a strategy?
997. Ben, have you written down a
strategy?
  998. Edwina what did you choose?
    Student: Cooperate.
  999. Professor Ben Polak:
    So Edwina's cooperating,
  1000. Ben?
    Student: Cooperate.
  1001. Professor Ben Polak:
    Okay, let's toss the coin.
  1002. So we're okay,
    so we're still playing.
  1003. Edwina?
    Student: Cooperate.
  1004. Professor Ben Polak:
1005. Student: I chose
cooperate.
  1006. Professor Ben Polak:
    All right,
  1007. so they're cooperating.
    Tails again,
  1008. so you're still playing.
    Student: Cooperate.
  1009. Student: Cooperate.
    Professor Ben Polak:
  1010. All right,
    so they're still cooperating.
1011. Some pain in the voice this
time.
  1012. Heads and then tails,
    write down what you're going to
  1013. do.
    Edwina? Student: Defect.
  1014. Professor Ben Polak:
  1015. Student: Cooperate.
    Professor Ben Polak:
1016. Things were going so nicely.
1017. We had such a nice class going
on there.
1018. All right, so we're still
playing.
  1019. Edwina? Student: Defect.
    Professor Ben Polak:
  1020. Ben?
    Student: Defect.
  1021. Professor Ben Polak:
    All right,
  1022. Jake?
    Tails, tails, we're still going.
  1023. Student: Defect.
    Student: Defect.
  1024. Professor Ben Polak:
    All right,
  1025. let me stop it there,
    we'll pretend that we had two
  1026. heads.
    So let's talk about this.
  1027. We had some cooperation going
    on here, both people started
  1028. cooperating.
    So Ben, why did you cooperate
  1029. at the beginning?
    Student: Well,
  1030. going along with Patrick's
    reasoning I felt that if we
  1031. could have the cooperate,
    cooperate in the long term with
  1032. the 75% chance of continuing
    playing, that it would be a
  1033. worthwhile investment.
    Professor Ben Polak:
  1034. All right.
    Student: Until I
1035. realized that Edwina had started
defecting.
  1036. Professor Ben Polak:
    Let's come back a second.
  1037. Let's get you guys to stand up
    so people can hear you.
When you stand up you shout
out.
  1039. So stand up again.
    Edwina, so you also started
cooperating, why did you start
cooperating?
1041. Student: For the same
reason.
  1042. Professor Ben Polak:
    Same reason,
  1043. okay.
    So the key thing here is why
  1044. did you start defecting?
    You heard the big sigh in the
  1045. class.
    Why did you start defecting at
  1046. this stage?
    Student: Because we'd
  1047. had so many, I mean the coin
    toss had to come to heads,
  1048. heads sometime,
    so I started thinking that
  1049. maybe- Professor Ben Polak:
    The reversion to the mean
  1050. of the coin.
    Student: Yeah,
  1051. I just thought that it.
    I thought, I mean, I don't know.
  1052. Professor Ben Polak:
    So what did I say about the
relationships of Economics majors
    that are in the class?
  1054. Anyway, all right,
    so Edwina defected and then Ben
  1055. you switched after,
    why did you switch?
  1056. Student: Because once
    Edwina started defecting I felt
  1057. that we'd revert back to the
    defect, defect equilibrium.
  1058. Professor Ben Polak:
    All right,
  1059. so thank you guys.
    So there's another good
  1060. strategy here.
    People started off cooperating
  1061. and I claim that at least
    Ben--Ben can contradict me in a
  1062. second--but I think Ben's
    strategy here was something like
  1063. this.
    I'm going to cooperate and I'm
  1064. going to go on cooperating as
    long as we're cooperating.
  1065. But if at some point Edwina
    defects--or for that matter I
  1066. defect--then this relationship's
    over and we're going to play
  1067. defect forever.
    Is that right?
  1068. That's kind of a rough
    description of your strategy?
  1069. Edwina was more or less playing
    the same thing.
  1070. In fact it was her who
    defected, but once she defected
  1071. she realized that it was over
    and she went on defecting.
  1072. So this strategy has a name.
    Let's just be clear what the
  1073. strategy is.
    This strategy says play C which
  1074. is cooperate,
    and then play C if no one has
  1075. played D;
    and play D otherwise.
  1076. So start off by cooperating.
    Keep cooperating as long as
  1077. nobody's cheated.
    But if somebody cheats,
  1078. this relationship's over:
    we're just going to defect
  1079. forever.
    Now this strategy is a famous
  1080. strategy.
    It has a name.
  1081. Anyone know what the name is?
    This is called the "Grim
  1082. Trigger Strategy."
    So this strategy again,
  1083. it says we're going to
    cooperate, but if that
  1084. cooperation breaks down ever,
    even if it's me who breaks it
  1085. down, then I'm just going to
    defect forever.
  1086. Now, we're going to come back
    next time to see if this is an
  1087. equilibrium, but there's a few
    things to do first.
  1088. First let's just check that it
    actually is a strategy.
  1089. What does it mean to be a
    strategy again?
  1090. It has to tell us what to do at
    every information set I could
  1091. find myself at.
    And this game is potentially
  1092. infinite, so potentially there's
    an infinite number of
  1093. information sets I could reach.
    So you might think that writing
  1094. down a strategy that gives me an
    instruction at every single
  1095. information set is going to be
    incredibly complicated once we
go to games that are potentially
infinite,
  1097. because there needs to be an
    infinite number of instructions.
  1098. But it turns out,
    actually it's possible to write
  1099. down such strategies rather
    simply, at least if they're
  1100. simple strategies.
    This example is one.
  1101. This tells me what to do at the
    first information set,
  1102. it says play C.
    It then tells me for every
  1103. information set I find myself
    at, in which only cooperation
  1104. has ever occurred in the history
    of the game,
  1105. I'm going to go on cooperating:
    play C.
  1106. And it says for all other
    histories, for all other
  1107. information sets I might find
    myself at, play D.
  1108. So it really is a strategy.
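As a sketch (not from the lecture itself; the function name and the "C"/"D" encoding are illustrative choices), the Grim Trigger Strategy just described can be written down as a single rule mapping any history of play to an action, which is exactly what makes it a well-defined strategy even in a potentially infinite game:

```python
def grim_trigger(history):
    """Return this period's action given the full history of play.

    `history` is a list of (my_action, opponent_action) pairs, one per
    past period.  Grim Trigger cooperates until anyone -- including
    this player -- has ever defected, and then defects forever.
    """
    for my_action, their_action in history:
        if my_action == "D" or their_action == "D":
            return "D"  # cooperation has broken down: punish forever
    return "C"          # no one has ever defected: keep cooperating
```

Note that one short rule covers every information set: the empty history (play C), any all-cooperation history (play C), and every other history (play D).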
    Now this is very different
  1109. behavior--we played with the
    same players--this kind of
  1110. behavior is very different,
    in both games actually,
  1111. is very different than the
    behavior we saw in the game that
  1112. ended,
    the game with two periods or
  1113. three periods.
    What is it essentially that
  1114. made this different?
    What's different about this way
  1115. of playing Prisoner's Dilemma,
    where we had Jake toss the coin
  1116. versus the way we played before
    and we just played for five
  1117. periods and then stopped?
    What's different about it?
  1118. Somebody?
    Let's talk to our players,
  1119. Patrick why is this different?
    Student: We don't know when
  1120. the game is going to end or if
    it's going to end,
  1121. so there's no last period.
    Professor Ben Polak:
  1122. Good, so our analysis of
    the game before,
  1123. the analysis of the Prisoner's
    Dilemma when we knew it was
  1124. going to end after two periods,
    after five periods,
  1125. whatever it was,
    was we all knew it was going to
  1126. end.
    There was a clearly defined
  1127. last period.
    When people are going to
  1128. retire, we know the month in
    which they're going to retire.
  1129. When Presidents are going to
    step down, we know they're going
  1130. to step down that period.
    When CEOs are going to go,
  1131. we know they're going to go--or
    actually we don't always know
  1132. they're going to go but let's
    just pretend we do.
  1133. So what's different about this
    game is, every time we play the
  1134. game, there is a probability,
    in this case a .75 probability
  1135. that the game is going to
    continue to the next period.
  1136. Every time we play the game,
    with probability of .75 there's
  1137. going to be a future.
    There's no obvious last period
  1138. from which we can unravel the
    game in the way we did before.
  1139. Just to remind ourselves,
    the way in which
  1140. cooperation--our analysis of
    cooperation--broke down in the
  1141. finitely repeated Prisoner's
    Dilemma
  1142. was when we looked at the last
    period, we know people are going
  1143. to defect.
    And once that thread is loose
  1144. we can unravel it all the way
    back to the beginning.
  1145. But here, since there is no
    last period that unraveling
  1146. argument never gets to hold.
    Now instead we're able to see
  1147. strategies emerge like the Grim
    Trigger Strategy,
  1148. and notice that the Grim
    Trigger Strategy has a pretty
  1149. good chance of actually
    sustaining cooperation.
  1150. So in particular,
    as long as people play this
  1151. strategy they are cooperating.
    It turns out that Edwina
  1152. eventually gave up that
    strategy, but had she gone on
  1153. playing it, they would have gone
    on cooperating forever.
  1154. But of course there's a
    question here,
  1155. and the question is:
    is this in fact an equilibrium?
  1156. We know that if people play
    this way, we get cooperation,
  1157. but the question--the thousand
    dollar question or whatever--is:
  1158. is this an equilibrium?
    So what do we have to do to check
  1159. whether this is an equilibrium
    or not?
  1160. We have to mimic the argument
    we had before.
  1161. We have to compare the
    temptation to defect today and
  1162. compare that with the value of
    the reward (to cooperating) and
  1163. the value of the punishment
    (from defecting) tomorrow.
  1164. So this basic idea is going to
    recur.
  1165. Having said that,
    let me now delete it so I have
  1166. some room.
  1167. To show this is an equilibrium,
    we need to show that the
  1168. temptation to defect--the
    temptation to cheat in the short
  1169. term--is outweighed by the
    difference between the value of
    the reward and the value of the
    punishment.
  1171. All right, so let's set that up.
    Let's put the temptation here.
  1173. So the temptation in Prisoner's
    Dilemma, the temptation to cheat
  1174. today is what?
    I'll get 3 rather than 2,
  1175. is that right?
    So if I defect--when Edwina
  1176. defected: here's Edwina
    defecting in this period--she
  1177. got a payoff of 3 rather than
    the payoff of 2 she would have
  1178. got from cooperating.
    So the temptation here is just
  1179. 3 - 2 and let's be clear,
    this is a temptation today and
  1180. we want to compare this with the
    value of the reward minus the
  1181. value of the punishment,
    but the key observation is that
  1182. these occur tomorrow.
    So since they occur tomorrow we
  1183. have to weight them a little bit
  1184. So in general,
    the way in which we're going to
  1185. weight them tomorrow is we're
    going to discount them just like
  1186. we did in our bargaining game.
    We're going to weight
  1187. tomorrow's payments by δ,
    where δ
  1188. < 1.
    Now why is δ < 1?
  1189. Why are we weighting payments tomorrow
    less than payments today?
  1190. Why are payments tomorrow worth
    less than payments today?
    Because tomorrow might not
    happen.
  1192. There are other reasons why,
    by the way.
  1193. It might be that we are
    impatient to get the money
  1194. today.
    Edwina just wanted the payoff
  1195. in a hurry, or it might be that
    she wanted to take the payment
  1196. today and put it in the bank and
    earn interest.
  1197. There are other reasons why
    money today might be more
  1198. valuable than money tomorrow,
    but, in games,
  1199. the most important reason is:
    tomorrow may not happen.
  1200. By tomorrow you might be dead,
    or, if not dead,
  1201. at least Jake's thrown two
    heads on the coins.
  1202. So δ is less than 1
    because the game may end.
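A minimal numerical sketch of this point (the function name and numbers here are illustrative, not from the lecture): when the game survives each period with probability δ = 0.75, a payment k periods ahead is only received if the game lasts that long, so its expected value today is the payment weighted by δ to the power k:

```python
DELTA = 0.75  # probability Jake's coins let the game continue each period

def expected_value(payment, periods_ahead, delta=DELTA):
    """Expected value today of `payment` received `periods_ahead`
    periods from now, counting it only if the game survives that long
    (probability delta each period, so delta ** periods_ahead overall)."""
    return payment * delta ** periods_ahead

# A payoff of 2 next period is worth 2 * 0.75 = 1.5 in expectation today.
```

So δ < 1 here purely because the game may end; the same weighting would also capture impatience or interest, as noted above.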
  1203. Now, what's the value of the
    reward?
  1204. The value of the reward is
    going to be the value of C "for
  1205. ever," but you want to be
    careful about "for ever."
  1206. It's C for ever,
    but of course it isn't really
  1207. for ever because the game may
    end.
  1208. So by "for ever" I mean until
    the game ends.
  1209. Let me be a bit more careful
    actually, it's (C,
  1210. C) isn't it?
    The value of (C,
  1211. C)--(cooperate,
    cooperate)--for ever.
  1212. Here we're going to have the
    value of (D, D) for ever.
  1213. And once again,
    the for ever here means until
  1214. the game ends.
    So this is the calculation
  1215. we're going to have to do.
    We're going to have to compare
  1216. the temptation,
    that was easy,
  1217. that was just 1 with the
    discounted difference between
  1218. the value of cooperation and the
    value of defecting.
  1219. Let's do the easy bits now,
    and then we'll leave you in
  1220. suspense until Wednesday.
    So let's do all the easy bits.
  1221. So what's this δ
    in this case?
  1222. In this particular game what
    was the probability that the
  1223. game was going to continue?
    What was the probability that
  1224. the game was going to end?
    The probability of it ending
  1225. was .25, so δ
    here was .75,
  1226. that's easy.
    The second bit that's
  1227. relatively easy is what's the
    value of playing (D,
  1228. D) until the game ends?
    Once people have cheated you're
  1229. going to play D for ever--here
    we are: Edwina's cheating here.
  1230. You're going to get (D,
    D) in this period,
  1231. (D, D) in this period,
    and so on and so forth until
  1232. the game ends.
    In each of those periods you're
  1233. going to earn 0,
    so this is just 0.
  1234. Which leaves us with a messy
    bit: what's the value of
  1235. cooperating forever?
    Let's try and do it.
  1236. We've got one minute.
    Let's do it.
  1237. So in every period in which we
    both cooperate what do we earn?
  1238. Throughout the beginning of the
    game: we cooperated in the first
  1239. period;
    now in the second period,
  1240. we cooperate again.
    What payoff do we get from
  1241. cooperating again?
    We get 2 and then Jake tosses
  1242. his coin and with probability
    δ we continue and we're
  1243. going to cooperate again.
    So with probability δ
  1244. we cooperate again and get what
    payoff the next period?
  1245. 2 again, and then Jake tosses
    the coin again,
    so now he's tossed the coin
    twice,
  1247. so with probability
    δ² we're still playing
  1248. and we get 2,
    and then Jake tosses the coin
  1249. again and it comes up other than
  1250. heads again,
    that's with probability
  1251. δ³ we get 2 and so on.
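The series just described -- 2 today, 2 with probability δ tomorrow, 2 with probability δ² the period after, and so on -- can be explored numerically before Wednesday. This sketch (names are my own) only computes partial sums; finding the closed-form value of the whole series is the exercise:

```python
DELTA = 0.75   # probability the game continues each period
PAYOFF = 2     # per-period payoff from (C, C)

def partial_sum(n_periods, payoff=PAYOFF, delta=DELTA):
    """Sum of the first n_periods terms of
    payoff + payoff*delta + payoff*delta**2 + ..."""
    return sum(payoff * delta ** t for t in range(n_periods))

# partial_sum(1) is 2, partial_sum(2) is 2 + 2*0.75 = 3.5, and the
# partial sums settle down toward a limit as n_periods grows.
```

Comparing that limit (times δ, since the reward starts tomorrow) against the temptation of 1 is exactly the equilibrium check set up above.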
  1252. So your exercise between now
    and Wednesday is to figure out
  1253. what the value of cooperation
    forever is: figure out this
  1254. equation and find whether in
    fact it was an equilibrium for
  1255. people to cooperate.
    We'll pick it up on Wednesday.