
Title:
22. Repeated games: cheating, punishment, and outsourcing

Description:
Game Theory (ECON 159)
In business or personal relationships, promises and threats of good and bad behavior tomorrow may provide good incentives for good behavior today, but, to work, these promises and threats must be credible. In particular, they must come from equilibrium behavior tomorrow, and hence form part of a subgame perfect equilibrium today. We find that the grim strategy forms such an equilibrium provided that we are patient and the game has a high probability of continuing. We discuss what this means for the personal relationships of seniors in the class. Then we discuss less draconian punishments, and find there is a trade-off between the severity of punishments and the required probability that relationships will endure. We apply this idea to a moral-hazard problem that arises with outsourcing, and find that the high wage premiums found in foreign sectors of emerging markets may be reduced as these relationships become more stable.
00:00  Chapter 1. Repeated Interaction: The Grim Trigger Strategy in the Prisoner's Dilemma (Continued)
29:21  Chapter 2. The Grim Trigger Strategy: Generalization and Real World Examples
37:56  Chapter 3. Cooperation in Repeated Interactions: The "One Period Punishment" Strategy
53:09  Chapter 4. Cooperation in Repeated Interactions: Repeated Moral Hazard
01:13:53  Chapter 5. Cooperation in Repeated Interactions: Conclusions
Complete course materials are available at the Open Yale Courses website: http://open.yale.edu/courses
This course was recorded in Fall 2007.

Professor Ben Polak:
So last time we were

focusing on repeated interaction
and that's what we're going to

continue with today.
There's lots of things we could

study under repeated interaction
but the emphasis of this week is

can we attain, can we
achieve, cooperation in business

or personal relationships
without contracts,

by use of the fact that these
relationships go on over time?

Our central intuition,
where we started from last

time, was perhaps the future of
a relationship can provide

incentives for good behavior
today,

can provide incentives for
people not to cheat.

So specifically let's just
think of an example.

We'll go back to where we were
last time.

Specifically suppose I have a
business relationship,

an ongoing business
relationship with Jake.

And each period I'm supposed to
supply Jake with some inputs for

his business,
let's say some fruit.

And each period he's supposed
to provide me with some input

for my business,
namely vegetables.

Clearly there are opportunities
here, in each period,

for us to cheat.
We could cheat both on the

quality of the fruit that I
provide or the quantity of the

fruit that I provide to Jake,
and he can cheat on the

quantity or quality of the
vegetables that he provides to

me.
Our central intuition is:

perhaps what can give us good
incentives is the idea that if

Jake cooperates today,
then I might cooperate

tomorrow, I might not cheat
tomorrow.

Conversely, if he cheats and
provides me with lousy

vegetables today I'm going to
provide him with lousy fruit

tomorrow.
Similarly for me,

if I provide Jake with lousy
fruit today he can provide me

with lousy vegetables tomorrow.
So what do we need?

We need the difference in the
value of the promise of good

behavior tomorrow and the threat
of bad behavior tomorrow to

outweigh the temptation to cheat
today.

I'm going to gain by providing
him with the bad fruit or fewer

fruit today, bad fruit because
those I would otherwise have to

throw away.
So that temptation to cheat has

to be outweighed by the promise
of getting good vegetables in

the future from Jake and vice
versa.

So here's that idea on the
board.

What we need is the gain if I
cheat today to be outweighed by

the difference between the value
of my relationship with Jake

after cooperating and the value
of my relationship with Jake

after cheating tomorrow.
Now what we discovered last

time (this was an idea I think
we kind of knew,

we have kind of known it since
the first week), but we

discovered last time,
somewhat surprisingly,

that life is not quite so
simple.

In particular,
what we discovered was we need

these to be credible,
so there's a problem here of

credibility.
So in particular,

if we think of the value of the
relationship after cooperating

tomorrow as being a promise,
and the value of the

relationship after cheating as
being a threat,

we need these promises and
threats to be credible.

We need to actually believe
that they're going to happen.

And one very simple area where
we saw that ran immediately into

problems was if this repeated
relationship,

although repeated,
had a known end.

Why did known ends cause
problems for us?

Because in the last period,
in the last period of the game

we know that whatever we promise
to do or whatever we threaten to

do,
in the last period,

once we reached that last
period, in that subgame we're

going to play a Nash
equilibrium.

What we do has to be consistent
with our incentives in the last

period.
So in particular,

if there's only one Nash
equilibrium in that last period,

then we know in that last
period that's what we're going

to do.
So if we look at the second to

last period we might hope that
we could promise to cooperate,

if you cooperate today,
tomorrow.

Or you could promise to punish
tomorrow if you cheat today,

but those threats won't be
credible because we know that

tomorrow you're just going to
play whatever that Nash

equilibrium is.
That lack of credibility means

there's no scope to provide
incentives today for us to

cooperate and we saw things
unravel backwards.

So the way in which we ensure
that we're really focusing on

credible promises and credible
threats here is by focusing on

subgame perfect equilibrium,
the idea that we introduced

just before the Thanksgiving
break.

We know that subgame perfect
equilibria have the property

that they have Nash behavior in
every subgame,

so in particular in the last
period of the game and so on.

So what we want to be able to
do here, is try to find scope

for cooperation in relationships
without contracts,

without side payments,
by focusing on subgame perfect

equilibria of these repeated
games.

Right at the end last time,
we said okay,

let's move away from the
setting where we know our game

is going to end,
and let's look at a game which

continues, or at least might
continue.

So in particular,
we looked at the problem of the

Prisoner's Dilemma which was
repeated with the probability

that we called δ
each period,

with the probability δ
of continuing.


So every period we're going to
play Prisoner's Dilemma.

However, with probability
1 − δ the game might just end

every period.
We already noticed last time

some things about this.
The first thing we noticed was

that we can immediately get away
from this unraveling argument

because there's no known end to
the game.

We don't have to worry about
that thread coming loose and

unraveling all the way back.
So at least there's some hope

here to be able to establish
credible promises and credible

threats later on in the game
that will induce good behavior

earlier on in the game.
So that's where we were last

time. And here is the Prisoner's
Dilemma we saw last time,

and we actually focused on a
particular strategy.

But before I come back to this
strategy that we focused on last

time let's just see some things
that won't work,

just to sort of reinforce the
idea.

So here's a possible strategy
in the Prisoner's Dilemma.

A possible strategy in the
Prisoner's Dilemma would be

cooperate now and go on
cooperating regardless of what

anyone does.
So let's just cooperate forever

regardless of the history of the
game.

Now if two players,
if Jake and I are involved in

this business relationship,
which has the structure of a

Prisoner's Dilemma and both of
us play this strategy of

cooperate now and cooperate
forever no matter what,

clearly that will induce
cooperation.

That's the good news.
The problem is that isn't an

equilibrium, that's not even a
Nash equilibrium,

let alone a subgame perfect
equilibrium.

Why is it not a subgame
perfect equilibrium?

Because in particular,
if Jake is smart (and he is),

Jake will look at this
equilibrium and say:

Ben is going to cooperate no
matter what I do,

so I may as well cheat,
and in fact,

I may as well go on cheating.
So Jake has a very good

deviation there which is simply
to cheat forever.

So the strategy cooperate now
and go on cooperating no matter

what doesn't contain incentives
to support itself as an

equilibrium.
And we need to focus on

strategies that contain subtle
behavior that generates promises

of rewards and threats of
punishment that induce people to

actually stick to that
equilibrium behavior.

So is everyone clear that
cooperating no matter what, it

sounds good, but it isn't going
to work.

People aren't going to stick
with that.

So instead what we focused on
last time, and actually we had

some players who seemed to
actually, they've moved now, but

they seemed actually to be
playing this strategy.

We focused on what we called
the grim trigger strategy.

And the grim trigger strategy
is what?

It says in the first period
cooperate and then go on playing

cooperate as long as nobody has
ever defected,

nobody has ever cheated.
But if anybody ever plays D,

anybody ever plays the defect
strategy, then we just play D

forever.
So this is a strategy,

it tells us what to do at every
possible information set.

It also, if two players are
playing the strategy,

has the property that they will
cooperate forever;

that's good news.
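
A minimal sketch of the grim trigger as code, for readers who like to see a strategy as a rule from histories to moves. The names `grim_trigger`, `always_defect`, and `play` are illustrative and not from the lecture; the history is assumed to be a list of (my move, their move) pairs.

```python
from typing import Callable, List, Tuple

# One (my_move, their_move) pair per past period, seen from one player's side.
History = List[Tuple[str, str]]
Strategy = Callable[[History], str]


def grim_trigger(history: History) -> str:
    """Cooperate until anyone has ever played D, then play D for ever."""
    for my_move, their_move in history:
        if my_move == "D" or their_move == "D":
            return "D"
    return "C"


def always_defect(history: History) -> str:
    """A hypothetical opponent who cheats in every period."""
    return "D"


def play(strategy_a: Strategy, strategy_b: Strategy, periods: int) -> History:
    """Play two strategies against each other; return the history as player A sees it."""
    history_a: History = []
    history_b: History = []
    for _ in range(periods):
        a = strategy_a(history_a)
        b = strategy_b(history_b)
        history_a.append((a, b))
        history_b.append((b, a))
    return history_a


if __name__ == "__main__":
    print(play(grim_trigger, grim_trigger, 5))   # cooperation in every period
    print(play(grim_trigger, always_defect, 3))  # cheated once, then D for ever
```

Two grim-trigger players cooperate in every period, while against a player who always defects the grim trigger is cheated exactly once and then switches to D.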
And what we left ourselves last

time was checking that this
actually is an equilibrium,

or more generally,
under what conditions is this

actually an equilibrium.
So we got halfway through that

calculation last time.
So what we need to do is we

need to make sure that the
temptation of cheating today is

less than the value of the
promise minus the value of the

threat tomorrow.
We did parts of this already,

let's just do the easy parts.
So the temptation today is:

if I cheat today I get 3,
whereas if I went on

cooperating today I get 2.
So the temptation is just 1.

What's the threat?
The threat is playing D

forever, so this is actually the
value of (D, D) forever.

You've got to be careful about
for ever: when I say for ever,

I mean until the game ends
because eventually the game is

going to end,
but let's use the code for ever

to mean until the game ends.
What's the promise?

The promise is the value of
continuing cooperation,

so the value of (C,C) for ever.
That's what this bracket is,

and it's still tomorrow.


So let's go on working on this.
So the value of cooperating for

ever is actually, let's be a bit
more detailed, this is the value

of getting 2 in every period,
so it's value of 2 for ever;

and this is the value of 0
forever.


So the value of 0 forever,
that's pretty easy to work out:

I get 0 tomorrow,
I get 0 the day after tomorrow,

I get 0 the day after the day
after tomorrow.

Or more accurately:
I get 0 tomorrow,

I get 0 the day after tomorrow
if we're still playing,

I get 0 the day after the day
after tomorrow if we're still

playing and so on.
But that isn't a very hard

calculation, this thing is going
to equal 0.

So this object here is just 0.
This object here is 3 − 2,

I can do that one in my head,
that's 1.

So I'm left with the value of
getting 2 for ever,

and that requires a little bit
more thought.

But let's do that one bit of
algebra because it's going to be

useful throughout today.
So this thing here,

the value of 2 for ever is
what?

Well I get 2,
that's tomorrow,

and then, assuming I'm still
playing the day after

tomorrow, so I need to discount
it: with probability δ

I'm still playing the day after
tomorrow, and I get 2 again.

And the day after the day after
tomorrow I'm still playing with

the probability that the game
didn't end tomorrow and didn't

end the next day so that's with
probability δ²

and again I get 2.
And then the day after,

what is it?
This is tomorrow,

the day after tomorrow,
the day after the day after

tomorrow: this is the day after
the day after the day after

tomorrow which is δ³
2 and so on.

Everyone happy with that?
So starting from tomorrow,

if we play (C,
C) for ever,

I'll get 2 tomorrow,
2 the day after tomorrow,

2 the day after the day after
tomorrow, and so on.

And I just need to take an
account of the fact that the

game may end between tomorrow
and the next day,

the game may end between the
day after tomorrow and the day

after the day after tomorrow and
so on.

Everyone happy with that?
So what is the value,

what is this thing?
Let's call this X for a second.

So we've done this once before
in the class but let's do it

again anyway.
This is the geometric sum,

some of you may even remember
from high school how to do a

geometric sum,
but let's do it slowly.

So to work out what X is what
I'm going to do is I'm going to

multiply X by δ,
so what's δX?

So this 2 here will become a
2δ, and this δ2 here

will become a δ²2,
and this δ²2 will

become a δ³2,
and this δ³2 will

become a δ⁴2,
and so on.

Now what I'm going to do is I'm
going to subtract the second of

those lines from the first of
those lines.

So what I'm going to do is,
I'm going to take

X − δX.
So I'm going to subtract the

second line from the first line.
And when I do that I'm going to

notice I hope that this 2δ
is going to cancel with this

2δ,
and this δ²2 is going

to cancel with this
δ²2,

and this δ³2 is going
to cancel with this

δ³2 and so on.
So what I'm going to get left

with is what?
Everything's going to cancel

except for what?
Except for that first 2 there,

so this is just equal to 2.
Now this is a calculation I can

do.
So I've got X[1 − δ] = 2, or X = 2 / [1 − δ].

So just to summarize the
algebra, getting 2 forever,

that means 2 + δ2 +
δ²2 + δ³2

etc.
The value of that object is

2/[1 − δ].
So we can put that in here as

well.
This object here 2/[1 − δ]

is the value of 2 forever.
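
A quick numerical check of that formula, as a sketch only. The function names and the choice to cut the sum off after 10,000 periods are assumptions made for illustration; the check compares the term-by-term sum 2 + δ2 + δ²2 + ... with the closed form 2/[1 − δ].

```python
def value_of_stream(payoff: float, delta: float, periods: int = 10_000) -> float:
    """Add up payoff + delta*payoff + delta^2*payoff + ... for a finite number of periods."""
    total = 0.0
    weight = 1.0  # probability that the game is still going in this period
    for _ in range(periods):
        total += weight * payoff
        weight *= delta
    return total


def closed_form(payoff: float, delta: float) -> float:
    """The geometric-sum shortcut: payoff / (1 - delta), valid for 0 <= delta < 1."""
    return payoff / (1.0 - delta)


if __name__ == "__main__":
    for delta in (0.25, 0.5, 0.9):
        print(delta, value_of_stream(2, delta), closed_form(2, delta))
```

For any δ strictly below 1 the two numbers agree up to rounding, which is all the algebra on the board is claiming.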
Now before I go onto a new

board I want to do one other
thing.

On the left hand side I've got
my temptation,

that was 1, I've got the value
of cooperating forever starting

from tomorrow which is
2/[1 − δ]

and I've got the value of
defecting forever starting from

tomorrow which is 0.
However, all of these objects

on the right hand side,
they start tomorrow,

whereas, the temptation today
is today.

Temptation today happens today.
These differences in value

start tomorrow.
Since they start tomorrow I

need to discount them because we
don't know that tomorrow is

going to happen.
The world may end,

or more importantly the
relationship may end,

between today and tomorrow.
So how much do I have to weight

them by?
By δ, I need to multiply

all of these lines by δ
and so on.

Now this is now a mess so let's
go to a new board.


Now let's summarize what we now
have. What we're doing here is

asking is it the case that if
people play the grim trigger

strategy that that is in fact an
equilibrium?

That is a way of sustaining
cooperation.

The answer is we need 1,
that's our temptation,

to be less than 2/[1 − δ],
that's the value of cooperating

for ever starting from tomorrow,
minus 0, that's the value of

defecting forever starting
tomorrow,

and this whole thing is
multiplied by δ

because tomorrow may not
happen.

Everyone happy with that so far?
I'm just kind of collecting up

the terms that we did slowly
just now.

So now what I want to do
is (question mark here because

we don't know whether it is) I'm
going to solve this for δ.

So when I solve this for δ
I'll probably get it wrong,

but let's be careful.
So this is equivalent to saying

1 − δ < 2δ, and it's also

equivalent to saying therefore
that δ > 1/3.
Everyone happy with that?
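
The same inequality as a small check in code, offered as a sketch. The function name and the sample values of δ are assumptions; the numbers 3, 2 and 0 are the payoffs used on the board.

```python
def grim_trigger_deters_cheating(delta: float) -> bool:
    """Is the temptation today smaller than the discounted (promise - threat) tomorrow?"""
    temptation = 3 - 2            # cheat today (3) instead of cooperating (2)
    promise = 2 / (1 - delta)     # value of (C, C) for ever, starting tomorrow
    threat = 0.0                  # value of (D, D) for ever, starting tomorrow
    return temptation < delta * (promise - threat)


if __name__ == "__main__":
    for delta in (0.1, 0.3, 0.4, 0.9):
        print(delta, grim_trigger_deters_cheating(delta))
```

Values of δ below 1/3 fail the test and values above 1/3 pass it, in line with the algebra.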

Let me just turn my own page.
So what have we shown so far?

We've shown that if we're
playing the grim trigger

strategy, and we want to deter
people from doing what?

From defecting from this
strategy in the very first

period, then we're okay provided
δ is bigger than 1/3.

But at this point some of you
could say, yeah but that's just

one of the possible ways I could
defect from this strategy.

After all, the defection we
just considered,

the move away from equilibrium
we just considered was what?

We considered my cheating
today, but thereafter,

I reversed it back to doing
what I was supposed to do:

I went along with playing D
thereafter.

So the particular defection we
looked at just now was in Period

1, I'm going to defect,
but thereafter,

I'm actually going to do what
the equilibrium strategy tells

me to do.
I'm going to go along with the

punishment and play my part of
(D,D) forever.

So you might want to ask,
why would I do that?

Why would I go along?
I cheated the first time but

now I'm doing what the strategy
tells me to do.

It tells me to play D.
Why am I going along with that?

You could consider going away
from the equilibrium by

defecting, for example in Period
1,

and then in Period 2 do
something completely different

like cooperating.
So we might want to worry,

how about playing D now and
then C in the next period,

and then D forever.
That's just some other way of

defecting.
So far we've said I'm going to

defect by playing D and then
playing D forever,

but now I'm saying let's play D
now and then play a period of C

and then D forever.
Is that going to be a

profitable deviation?
Well let's see what I'd get if

I do that particular deviation.
What play is that going to

induce?
Remember the other player is

playing equilibrium,
so that player is going to

induce, in the first period,
I'm playing D and Jake's

playing C.
In the second period Jake's

going to start punishing me,
so he's going to play D and

according to this deviation I'm
going to play C.

So in the second period I'll
play C and Jake will play D,

and in the third period and
thereafter, we'll just play D,

D, D, D, D, D.
So this is just some other

deviation other than the one we
looked at.

So what payoff do I get from
this?

Okay, I get three in the first
period, just as I did for my

original defection,
that's good news.

But now in the second period
discounted, I actually get −1,

I'm actually doing even worse
in the second period because I'm

cooperating while Jake's
defecting,

and then in the third period I
get 0 and in the fourth period I

get 0 and so on.
So the total payoff to this

defection is 3 − δ.
Now, that's even worse than the

defection we considered to start
with.

The defection we considered to
start with, I got 3 in the first

period and thereafter I got 0.
Now I got 3 in the first

period, −1 in the second period,
and then 0 thereafter.

So this defection in which I
defect, this move away from

equilibrium, in which I cheat in
the first period and then don't

go along with the punishment,
I don't in fact play D forever

is even worse.
Is that right? It's even worse.
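
A sketch that compares the two deviations with staying on the equilibrium path. It assumes the payoffs used in this lecture (3 for cheating on a cooperator, 2 for mutual cooperation, 0 for mutual defection, and −1 for cooperating against a defector); the names `PAYOFF` and `discounted_payoff` are illustrative.

```python
from typing import List, Tuple

# My payoff for each (my_move, their_move) pair; assumed from the lecture's numbers.
PAYOFF = {("C", "C"): 2, ("D", "C"): 3, ("C", "D"): -1, ("D", "D"): 0}


def discounted_payoff(head: List[Tuple[str, str]],
                      tail: Tuple[str, str],
                      delta: float) -> float:
    """Value today of playing the pairs in `head`, then `tail` in every later period."""
    total = 0.0
    weight = 1.0
    for pair in head:
        total += weight * PAYOFF[pair]
        weight *= delta
    # The tail is a constant payoff for ever, so it uses the geometric sum.
    total += weight * PAYOFF[tail] / (1 - delta)
    return total


if __name__ == "__main__":
    delta = 0.5
    cooperate = discounted_payoff([], ("C", "C"), delta)
    cheat_then_d = discounted_payoff([("D", "C")], ("D", "D"), delta)
    cheat_then_c = discounted_payoff([("D", "C"), ("C", "D")], ("D", "D"), delta)
    print(cooperate, cheat_then_d, cheat_then_c)
```

With δ = 0.5, cooperating for ever is worth 4, cheating and then accepting the punishment is worth 3, and cheating and then cooperating for a period is worth 3 − δ = 2.5, which is the worst of the three.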

So what's the lesson here?
The lesson here is the reason

that I'm prepared to go along
with my own punishment and play

D forever after a defection is
what?

It's if Jake is going to play D
forever I may as well play D

forever.
Is that right?

So another way of saying this
is the only way which I could

possibly hope to have a
profitable deviation,

given that Jake's going to
revert to playing D forever is

for me to defect on Jake once
and then go along with playing D

forever.
There's no point once he's

playing D, there's no point me
doing anything else,

so this is worse,
this is even worse.

This defection is even worse.
More generally,

the reason this is even worse
is because the punishment we

looked at before,
which was (D,

D) for ever,
the punishment (D,D) forever is

itself an equilibrium.
It's credible because it's

itself an equilibrium.


So unlike in the finitely
repeated games we did last time,

unlike in the two period or the
five period repeated games,

here the punishment really is a
credible punishment,

because what I'm doing in the
punishment phase is playing an

equilibrium.
There's no point considering

any other deviation other than
playing D once and then just

going on playing D.
So that's one other possible

deviation, but there are others
you might want to consider.

So far all we've considered is
what?

We've considered the deviation
where I, in the very first

period, I cheat on Jake and then
I just play D forever.

But what about the second
period?

Another thing I could do is how
about cheating not in the first

period of the game but in the
second.


So according to this strategy
what am I going to do.

The first period of the game
I'll go along with Jake and

cooperate, but in the second
period I'll cheat on him.

Now how am I going to check
whether that's a good deviation

or not?
How do I know that's not going

to be a good deviation?
Well we already know that I'm

not going to want to cheat in
the first period of the game.

I want to argue that exactly
the same analysis tells me I'm

not going to want to cheat in
the second period of the game.

Why?
Because once we reach the

second period of the game,
it is the first period

of the game.
Once we reach the second period

of the game, looking from period
two onwards,

it's exactly the same as it was
when we looked from period one

initially.
So to say it again,

what we argued before was, on
the board that I've now covered

up, what we argued before was,
I'm not going to want to cheat

in the very first period of the
game provided δ

> 1/3.
I want to claim that that same

argument tells me I'm not going
to want to cheat in the second

period of the game provided
δ > 1/3.

I'm not going to want to cheat
in the fifth period of the game

provided δ
> 1/3.

Because this game from the
fifth period on,

or the five hundredth period
on,

or the thousandth period on
looks exactly the same as it

does from the beginning.
So what's neat about this

argument is the same analysis
says, this is not profitable if

δ > 1/3.
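
A sketch of why the period of the cheat does not matter. The function names are illustrative, periods are counted from 1 as in the lecture, and the payoffs 3, 2 and 0 are the ones used above.

```python
def value_if_cheat_at(period: int, delta: float) -> float:
    """Cooperate before `period`, cheat once in that period, then get 0 under (D, D) for ever."""
    before = sum(2 * delta ** s for s in range(period - 1))  # 2 in each earlier period
    return before + 3 * delta ** (period - 1)                # 3 in the period of the cheat


def value_if_never_cheat(delta: float) -> float:
    """Value of (C, C) for ever, measured from the first period."""
    return 2 / (1 - delta)


def cheating_pays(period: int, delta: float) -> bool:
    return value_if_cheat_at(period, delta) > value_if_never_cheat(delta)


if __name__ == "__main__":
    for delta in (0.2, 0.5):
        print(delta, [cheating_pays(t, delta) for t in range(1, 6)])
```

For a given δ the answer is the same in every period: the gap between the two values is δ raised to a power times (3 − 2/[1 − δ]), so only its sign matters, and that sign depends on whether δ is above or below 1/3.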


So what have we learned here?
I want to show you some nerdy

lessons and then some actual
sort of real world lessons.

Let's start with the nerdy
lessons.

The nerdy lesson is this grim
strategy works because

both, let's put it up again so
we can actually see it, this

grim strategy,
it works because both the play

that it suggests if we both
cooperate and the play that it

suggests if we both defect are
themselves equilibria.

These are credible threats and
credible promises because what

you end up doing both in the
promise and in the threat is

itself equilibrium behavior.
That's good.

The second thing we've learned,
however, is for this to work we

need δ >
1/3, we need the probability

of continuation to be bigger than
1/3.

So leaving aside the nerdy
stuff for a second (you have

more practice on the nerdy stuff
on the homework assignment), the

lesson is we can get cooperation
in the Prisoner's Dilemma using

the grim trigger.
Remember the grim trigger

strategy is cooperate until
someone defects and then defect

forever.
So you get cooperation in the

Prisoner's Dilemma using the
grim trigger as a subgame

perfect equilibrium.
So this is an equilibrium

strategy, that's good news,
provided the probability of

continuation is bigger than 1/3.


Let's try and generalize that
lesson away from the Prisoner's

Dilemma.
So last time our lesson was

about what in general could we
hope for in ongoing

relationships?
So let's put down a more

general lesson that refines what
we learned last time.

So the more general lesson is,
in an ongoing relationship, let

me mimic exactly the words I
used last time, so for an

ongoing relationship to provide
incentives for good behavior

today,
it helps, what we wrote last

time was, it helps for that
relationship to have a future.

But now we can refine this,
it helps for there to be a high

probability that the
relationship will continue.

So the specific lesson for
Prisoner's Dilemma and the grim

trigger strategy is we need
δ, the probability

of continuation,
to be bigger than 1/3.

But the more general intuition
is, if we want my ongoing

business relationship with
Jake to generate good

behavior (so I'm going to
provide him with good fruit and

he's going to provide me with
good vegetables), we need the

probability that that
relationship will continue to be

reasonably high.
I claim this is a very natural

intuition.
Why?

Because the probability that
the relationship will continue

is the weight that you put on
the future.

The probability that the
relationship will continue,

this thing, this is the weight
you put on the future.


The more weight I put on the
future, the easier it is for the

future to give me incentives to
behave well today,

the easier it is for those to
overcome the temptations to

cheat today.
That seems like a much more

general lesson than just the
Prisoner's Dilemma example.

Let's try to push this to some
examples and see if it rings

true.
So the lesson we've got here is

to get cooperation in these
relationships we need there to

be a high probability,
a reasonably high probability

that they're going to continue.
We know exactly what that is

for Prisoner's Dilemma but the
lesson seems more general.

So here's two examples.
How many of you are seniors?

One or two, quite a few are
seniors.

Keep your hands up a second.
All of those of you who are

seniors, we can pan these guys.
Let's have a look at them.

Actually, why don't we get all
the seniors to stand up:

make you work a bit here.
Now the tricky question,

the tricky personal question.
How many of you who are seniors

are currently involved in
personal relationships,

you know: have a significant
other?

Stay standing up if you have a
significant other.

Look at this, it's pathetic.
What have I been saying about

economics majors?
All right, so let's just think

about, stay standing a second,
let's get these guys to think

about it a second.
So seniors who are involved in

ongoing relationships with
significant others,

what do we have to worry about
those seniors?

Well these seniors are about to
depart from the beautiful

confines of New Haven and
they're going to take jobs in

different parts of the world.
And the problem is some of them

are going to take jobs in New
York while their significant

other takes a job in San
Francisco or Baghdad or

whatever,
let's hope not Baghdad,

London shall we say.
Now if it's the case that you

are going to take a job in New
York next year and your

significant other is going to
take a job in Baghdad or London,

or anyway far away,
in reality, being cynical a

little bit, what does that do to
the probability that your

relationship is going to last?
It makes it go down.

It lowers the probability that
your relationship's going to

continue.
So what is the

prediction? Let's be mean here.
These are the people with

significant others who are
seniors, how many of you are

going to be separated by a long
distance from your significant

others next period?
Well one of them at the back,

okay one guy,
at the back,

two guys, honesty here,
three, four of you right?

So what's our prediction here?
What does this model predict as

a social science experiment?
What does it predict?

It predicts that for those of
you who just raised your hands,

those seniors who just raised
their hands who are about to be

separated by large distances,
those relationships,

each player in that
relationship is going to have a

lower value on the future.
So during the rest of your

senior year, during the spring
of your senior year what's the

prediction of this model?
They're going to cheat.

So we could actually do a
controlled experiment,

what we should do here is we
should keep track of the people

here,
the seniors who are going to be

separated. You can sit down now,
I'm sorry to embarrass you all.

We could keep track of those
seniors who are about to be

separated and go into
long-distance relationships,

and those that are not.
The people who are not are our

control group.
And we should see if during the

spring semester the people who
are going to be separated cheat

more often than the others.
So it's a very clear prediction

of the model that's relevant to
some of your lives.

Let me give you another example
that's less exciting perhaps,

but same sort of thing.
Consider the relationship that

I have with my garage mechanic.
I should stress this is not a

significant other relationship.
So I have a garage mechanic in

New Haven, and that garage
mechanic fixes my car.

And we have an ongoing business
relationship.

He knows that whenever my car
needs fixing,

even if it's just a small thing
like an oil change,

I'm going to go to him and have
him fix it, even though it might

be cheaper for me to go to Jiffy
Lube or something.

So I'm going to take my car to
him to be fixed,

and he's going to make some
money off me on even the easy

things.
What do I want in return for

that?
I want him to be honest and if

all I need is an oil change I
want him to tell me that,

and if what I actually need is
a new engine,

he tells me I need a new engine.
So my cooperating with him,

is always going to him,
even if it's something simple;

and his cooperating with me,
is his not cheating on fixing

the car.
He knows more about the car

than I do.
But now what happens if he

knows either that I'm about to
leave town (which is the example

we just did),
or, more realistically,

he kind of knows that my car is
a lemon and I'm about to get rid

of it anyway.
Once I get a new car I'm not

going to go to him anymore
because I have to go to the

dealer to keep the warranty
intact.

So he knows that my car is
about to break down anyway,

and he knows that I know that
the car is about to break

anyway,
so my lemon of a car is about

to be passed on, probably to one
of my graduate students, then

what's going to happen?
So I'm going to have an

incentive to cheat because I'm
going to start taking my useless

car to Jiffy Lube for the oil
changes.

And he's going to have an
incentive to cheat.

He's going to start telling me
you know you really need a new

engine or a new clutch (it's a
manual so I have a clutch:

it's a real car), so I'm going
to need a new clutch rather than

just tightening up a bolt.
So once again the probability

of the continuation of the
relationship,

as it changes,
it leads to incentives to

cheat.
It leads to that relationship

breaking down.
That's the content,

that's the real world content
of the math we just did.

Let's try and push this a
little further.

Now what we've shown is that
the grim trigger works provided

δ > 1/3,
and δ being bigger than

1/3 doesn't seem like a very
large continuation probability.

So just having a probability of
1/3 that the relationship

continues allows the grim
trigger to work,

so that seems good news for the
grim trigger.

However, in reality,
in the real world,

the grim trigger might have
some disadvantages.

So let's just think about what
the grim trigger is telling us

in the real world.
It's telling us that if even

one of us cheats just a little
bit, I just provide one item of

rotten fruit to Jake or he gives
me one too few branches of

asparagus in his provisions to
me, then we never do business

with each other again ever.
It's completely the end.

We just never cooperate again.
That seems a little bit drastic.

It's a little bit draconian if
you like.

So in particular,
in the real world,

there's a complication here,
in the real world every now and

then one of us is going "to cheat"
by accident.

That day that I didn't have my
glasses on and I put in a rotten

apple in the apples I supplied
to Jake, in the fruit.

Or he was counting out the

asparagus and he lost count at
1,405 and he gave me one too

few.
So we might want to worry about

the fact that the grim trigger,
it's triggered by any amount of

cheating and it's very drastic:
it says we never do business

again.
The grim trigger is the analog

of the death penalty.
It's the business analog of the

death penalty.
It's not that I'm going to kill

Jake if he gives me one too few
branches of asparagus,

but I'm going to kill the
relationship.

For you seniors or otherwise,
who are involved in personal

relationships,
it's the equivalent of saying,

if you even see your partner
looking at someone else,

let alone sitting next to them
in the class,

the relationship is over.
It seems drastic.

So we might be interested
because mistakes happen,

because misperceptions happen,
we might be interested in using

punishments that are less
draconian than the grim trigger,

less draconian than the death
penalty.

Is that right?
So what I want to do is I want

to consider a different
strategy, a strategy other than

the grim trigger strategy,
and see if that could work.

So where shall I start?
Let's start here,

so again what I'm going to
revert to is the math and the

nerdiness of our analysis of the
Prisoner's Dilemma but I want

you to have in mind business
relationships,

your own personal
relationships,

your friendships and so on.
More or less everything you do

in life involves repeated
interaction, so have that in the

back of your mind,
but let's be nerdy now.

So what I want to consider is a
one period punishment.

So how are we going to write
down a strategy that has

cooperation but a one period
punishment.

So here's the strategy.
It says (it's kind of a weird

thing, but it works): play C to
start, and then (this

is going to seem weird, but trust
me for a second) play C if

either (C, C) or (D, D) were
played last.

So, if in the previous period
either both people cooperated or

both people defected,
then we'll play cooperation

this period.
And play D otherwise:

play D if either (C,
D) or (D, C) were played last.

Let's just think about this
strategy for a second.

What does that strategy mean?
So provided people start off

cooperating and they go on
cooperating (if both Jake and I

play this strategy), in fact,
we'll cooperate forever.

Is that right?
So I claim this is a one period

punishment strategy.
Let's just see how that works.

So suppose Jake and I are
playing this strategy.

We're supposed to play C every
period.

And suppose deliberately or
otherwise, I play D.

So now in that period in which
I play D, the strategies played

were D by me and C by Jake.
So next period what does this

strategy tell us both to play?
So it was D by me and C by

Jake, so this strategy tells us
to play D.

So next period both of us will
play D.

So both of us will be
uncooperative precisely for that

period, that next period.
Now, what about the period

after that?
The period after that,

Jake will have played D,
I will have played D.

So this is what will have
happened: we both played D,

and now it tells us to
cooperate again.

Everyone happy with that?
So this strategy I've written

down seems kind of
cumbersome, but what it actually

induces is exactly a one period
punishment.

If Jake is the only cheat then
we both defect for one period

and go back to cooperation.
If I'm the only person who

cheats then we both defect for
one period and go back to

cooperation.
It's a one period punishment

strategy.
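
The walk-through above can be traced mechanically. What follows is a minimal Python sketch, not something from the lecture: the function names and the `deviations` argument are made up for illustration, and a "deviation" simply forces the first player to play D in the listed periods while Jake sticks to the strategy.

```python
# Stage actions: "C" = cooperate, "D" = defect.

def one_period_punishment(last_profile):
    """Action this period, given last period's (own, other) actions.

    last_profile is None in the very first period.
    """
    if last_profile is None:
        return "C"                       # play C to start
    own, other = last_profile
    # C after (C, C) or (D, D); D after (C, D) or (D, C).
    return "C" if own == other else "D"


def simulate(periods, deviations):
    """Both players follow the strategy, except that player 1 is forced
    to play D in the periods listed in `deviations` (hypothetical input)."""
    history = []
    last = None
    for t in range(periods):
        me = "D" if t in deviations else one_period_punishment(last)
        jakes_view = None if last is None else (last[1], last[0])
        jake = one_period_punishment(jakes_view)
        last = (me, jake)
        history.append(last)
    return history


# Player 1 cheats once, in period 1.
print(simulate(5, deviations={1}))
# [('C', 'C'), ('D', 'C'), ('D', 'D'), ('C', 'C'), ('C', 'C')]
```

The printed path shows one period of (D, D) after the lone defection and then a return to (C, C), which is the one period punishment described above.
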
Of course the question is,

the question you should be
asking is, is this going to

work?
Is this an equilibrium?

So let's just check.
Is this an SPE?

Is it an equilibrium?
So what do we need to check?

We need to check,
as usual, that the temptation

is less than or equal to the
value of the promise (the value

of the promise of continuing in
cooperation), that is, the value of the

promise minus the value of the
threat.

And once again we have to be
careful, because the temptation

occurs today and this difference
between values occurs tomorrow.

Is that right?
So this is nothing new,

this is what we've always
written down,

this is what we have to check.
So the temptation for me to

cheat today, that's the same as
it was before,

it's 3 – 2.
The fact that it's tomorrow is

going to give me a δ
here.

Here's our square bracket.
So what's the value of the

promise?
So provided we both go on

cooperating, we're going to go
on cooperating forever,

in which case we're going to
get 2 for ever.

Is that right?
So this is going to be the

value of 2 forever starting
tomorrow (and again for ever

means until the game ends).
The value of the threat is what?

Be a bit careful now.
It's the value of... so what's

going to happen?
If I cheat then tomorrow we're

both going to cheat,
so tomorrow,

what am I going to get
tomorrow?

0.
So it's the value of 0

tomorrow: we're both going to
cheat, we're both going to play

D.
And then the next period what's

going to happen?
We're going to play C again,

and from there on we're going to
go on playing C.

So it's going to be the value of 0
tomorrow and then 2 forever

starting the next day.
That's what we have to evaluate.

So 3 – 2, I can do that one
again, that's 1.

So what's the value of 2
forever, well we did that

already today,
what was it?

It's in your notes.
Actually it's on the board,

it's the X up there,
what is it?

Here it is, 2 for ever:
we figured out the value of it

before and it was
2/[1–δ].

So the value of 2 forever is
going to be 2/[1–δ].

How about the value of 0?
So starting from tomorrow I'm

going to get 0 and then with one
period delay I'm going to get 2

for ever.
Well 2 forever,

we know what the value of that
is, it's 2/[1–δ],

but now I get it with one
period delay,

so what do I have to multiply
it by?

By δ, good.
So the value of 0 tomorrow and

then 2 forever starting the next
day is δ

× 2/[1–δ].
And here's the δ

coming from here which just
takes into account that all this

analysis is starting tomorrow.
So to summarize,

this is my temptation today.
This is what I'll get starting

tomorrow if I'm a good boy and
cooperate.

And this is the value of what
I'll get if I cheat today.

Starting tomorrow I'll get
nothing, and then I'll revert

back to cooperation.
And since all of these values

in this square bracket start
tomorrow I've discounted them by

δ.
Now this requires some math so

bear with me while I probably
get some algebra wrong, and

please can I get the T.A.'s to
stare at me a second because

I'll probably get this wrong.
Okay so what I'm going to do

is, I'm going to look at my
notes, I'm going to cheat,

that's what I'm going to do.
Okay, so what I'm going to do

is I'm going to have 1 is less
than or equal to,

I'm going to take a common
factor of 2 / [1–δ]

and δ, so I'm going to
have 2δ/[1–δ],

and that's going to leave
inside the square brackets:

this is a 1 and this is a
–δ.


So this δ
here was that δ

there, and then I took out a
common factor of

2/[1–δ]
from this bracket.

Everyone okay with the algebra?
Just algebra,

nothing fancy going on there.
So that's good because now the

1–δ cancels,
this cancels with this,

so this tells us we're okay
provided 1/2 <= δ:

it went up.
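
Written out, the board work for this check appears to be the following. This is a reconstruction from the spoken steps, using the payoffs of this example (3 for cheating, 2 for cooperating, 0 in the punishment period), so treat the layout as an assumption rather than a copy of the board:

```latex
\begin{align*}
\underbrace{3 - 2}_{\text{temptation today}}
  &\le \delta\left[\frac{2}{1-\delta} - \delta\cdot\frac{2}{1-\delta}\right] \\
1 &\le \frac{2\delta}{1-\delta}\,\bigl[\,1-\delta\,\bigr] = 2\delta \\
\tfrac{1}{2} &\le \delta
\end{align*}
```
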
So don't worry too much about

the algebra, trust me on the
algebra a second,

let's just worry about the
conclusion.

What's the conclusion?
The conclusion is that this one

period punishment is an SPE,
it will be enough,

one period of punishment will
be enough to sustain cooperation

in my Prisoner's Dilemma
repeated business relationship

with Jake,
or in the seniors'

relationships with their
significant others,

provided δ
> 1/2.

What did δ
need to be for the grim

strategy?
1/3, so what have we learned

here?
We learned, nerdily, what we

learned was that for the grim
strategy we needed δ

> 1/3.
For the one period punishment

we needed δ
> 1/2, but what's the more

general lesson?
The more general lesson is,

if you use a softer punishment,
a less draconian punishment,

for that to work we're going to
need a higher δ.

Is that right?
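
The two thresholds can be checked numerically. Here is a rough Python sketch, assuming the stage payoffs used in this example (3 for cheating on a cooperator, 2 for mutual cooperation, 0 for mutual defection); the helper names are hypothetical and the three values of δ are picked only to land on either side of 1/3 and 1/2.

```python
def value_forever(payoff, delta):
    """Expected value of getting `payoff` every period, starting now,
    when each further period arrives with probability delta."""
    return payoff / (1 - delta)


def grim_ok(delta):
    """Temptation <= delta * (promise - threat) under the grim trigger."""
    temptation = 3 - 2
    promise = value_forever(2, delta)        # 2 forever
    threat = value_forever(0, delta)         # 0 forever
    return temptation <= delta * (promise - threat)


def one_period_ok(delta):
    """Same check when the punishment lasts a single period."""
    temptation = 3 - 2
    promise = value_forever(2, delta)               # 2 forever
    threat = 0 + delta * value_forever(2, delta)    # 0 once, then 2 forever
    return temptation <= delta * (promise - threat)


for delta in (0.25, 0.4, 0.6):
    print(delta, grim_ok(delta), one_period_ok(delta))
# 0.25 False False
# 0.4 True False
# 0.6 True True
```

At δ = 0.4 the grim trigger sustains cooperation but the one period punishment does not, which is the trade off in a single line.
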
So what we're learning here is

there's a trade off,
there's a trade off in

incentives.
And the trade off is if you use

a shorter punishment,
a less draconian

punishment (instead of cutting
people's hands off or killing

them,
or never dealing with them

again, you just don't deal with
them for one period), that's okay

provided there's a slightly
higher probability of the

relationship continuing.
So shorter punishments are okay

but they need (the implication
sign isn't really necessary

there) more weight
δ on the future.

I claim that's very intuitive.
What it's saying is,

we're always trading things off
in the incentives.

We're trading off the ability
to cheat and get some cookies

today versus waiting and,
we hope, getting cookies

tomorrow.
So if, in fact,

the difference between the
reward and the punishment isn't

such a big deal,
isn't so big (the punishment is

just, I'm going to give you one
fewer cookie tomorrow), then you

better be pretty patient not to
go for the cookies today.

I was about to say,
those of you who have children.

I'm probably the only person in
the room with children.

That cookie example will
resonate for the rest of

you (wait until you get
there): you'll discover that,

in fact, cookies are the right
example.

So shorter punishments,
less draconian punishments,

less reduction in your kid's
cookie rations tomorrow is only

going to work,
is only going to sustain good

behavior provided those kids put
a high weight on tomorrow.

In that case,
it isn't that the kids will

worry about the relationship
breaking down,

you're stuck with your kids,
it's just that they're

impatient.
Okay, so we've been doing a lot

of formal stuff here and I want
to go on doing formal stuff,

but what I want to do now is
spend the rest of today looking

at an application.
An application is,

I hope, going to convince you
that repeated interaction really

matters.
So this is assuming that the

one about the seniors and their
boyfriends and girlfriends

wasn't enough.
Okay, so the application is

going to take us back a little
bit because what I want to talk

about is repeated moral hazard.


Moral hazard is something we
discussed the first class after

the midterm.
So what I want you to imagine

is that you are running a
business in the U.S.

and you are considering making
an investment in an emerging

market, and again,
so as not to offend anybody who

watches this on the video,
let's just call that emerging

market Freedonia,
rather than give it a name like

Kazakhstan, a name like
something other than Freedonia.

So Freedonia,
for those of you who don't

know, is a republic in a Marx
Brothers film.

So you're thinking of
outsourcing some production of

part of what your business is to
Freedonia.

The reason you're thinking of
doing this outsourcing,

what makes it attractive is
that wages are low in Freedonia.

So you get this outsourced in
Freedonia.

You think you're going to get
it done cheaply.

The down side is because
Freedonia is an emerging market,

the court system,
it doesn't operate very well.

And in particular,
it's going to be pretty hard to

enforce contracts and to jail
people and so on in Freedonia.

So you're considering
outsourcing.

The plus is,
from your point of view,

the plus is wages are cheap
where you're going to get this

production done.
The down side is it's going to

be hard to enforce contracts
because this is an emerging

market.
So what you're considering

doing is employing an agent and
you're going to pay that agent

W, so W is the wage if you
employ them.

I'll put this up in a tree in a
second.

Let's assume that the "going
wage" in Freedonia is 1:

we'll just normalize it.
So the going wage in Freedonia

is 1, and let's assume that to
get this outsourcing to work

you're going to have to send
some resources to your agent,

your employee in Freedonia.
And let's assume that the

amount you're going to have to
send over there is equivalent to

another 1.
So the going wage in Freedonia

is 1 and the amount you're going
to have to invest in giving this

agent materials or machinery is
another 1.

Let's assume that this project
is a pretty profitable project.

So if the project succeeds,
if the project goes ahead and

succeeds, it's going to generate
a gross revenue of 4.

Of course you have to invest 1
so that's a net revenue of 3 for

you, but nonetheless there's a
big potential return here.

The bad news is that your agent
in Freedonia can cheat on you.

In particular,
what he can do is he can simply

take the 1 that you've sent to
him,

sell those materials on the
market and then go away and just

work in his normal job anyway.
So he can get his normal wage

of 1 for just going and doing
his normal job,

whatever that was,
and he can steal the resources

from you.
So let's put this up as a kind

of tree.
This is a slight cheat,

this tree, but we'll see why in
a second.

So your decision is to invest
and set W.

So if you invest in Freedonia,
you'll invest and set W,

set the wage you're going to
pay him.

The going wage is 1 but you can
set a different wage or you

could just not invest.
If you don't invest you get

nothing and your agent in
Freedonia just gets the going

wage of 1.
If you do invest in Freedonia

and set a wage of W,
then your agent has a choice.

Either he can be honest or he
can cheat.

If he cheats,
what's going to happen to you?

You had to invest 1 in sending
it over there,

you're going to get nothing
back, so you'll get –1.

And he will go away and work
his normal job and get 1,

and, in addition,
he'll sell your materials so

he'll get a total of 1 + 1 is?
2, thank you.

So he'll get a total of 2.
On the other hand,

if he's honest,
then you're going to get a

return of 4 minus the 1 you had
to invest minus whatever wage

you paid to him.
So your return will be 3 minus

the wage you pay him.
You're only going to pay him

once the job's done,
3 – W, and he's going to get W.

He's done his job (he hasn't
exercised his outside option,

he hasn't sold your
materials), so he'll just get W.

Now, I'm slightly cheating here
because this isn't really the

way the tree looks because I
could choose different levels of

W.
So this upper branch where I

invest and set W is actually a
continuum of such branches,

one for each possible W
I could set.

But for the purpose of today
this is enough.

This gives us what we needed to
see.

So let's imagine that this is a
one shot investment.

What I want to learn is in this
one shot investment,

I invest in Freedonia.
I hire my agent once,

what I want to learn is how
much do I have to pay that agent

to actually get the job done?
Remember the starting position.

The starting position is it
looks very attractive.

It looks very attractive
because the returns on this

project are 4, or 4 – 1,
so that the surplus available

on this project is 3 minus the
wage, and the going wage was

just 1.
So it looks like there's lots

of profit around to make this
outsourcing profitable.

I mumbled that so let me try it
again.

So the reason this looks
attractive is the going wage is

just 1, so if I just pay him 1
and he does the project then

I'll get a gross return of 4
minus the 1 I invested minus the

1 that I had to pay him for a
net return of 2.

It seems like that's a 100%
profitable project,

so it looks very attractive.
What's the problem?

The problem is if I only
set (this is going to give us

backward induction), if I set the
wage equal to the going wage,

so if I set W = 1 what will my
agent do?

He's going to cheat.
The problem is if I set W = 1,

which is the going wage,
the going wage in Freedonia,

the agent will cheat.
If he cheats I just lose my

investment.
So how much do I have to set

the W to?
Let's look at this.

So we have to set W.
What I need is I need his wage

to be big enough so that being
honest and going on with my

project outweighs his incentive
to cheat.

I need W to be bigger than 2.
Is that right?

I need W to be at least as big
as 2.

So in setting the wage,
in equilibrium,

what are we going to do?
I'm going to set a wage,

let's call it W* = 2 (plus a
penny), is that right?

So this is an exercise which we
visited the first day after the

midterm.
This is about incentive design.

In this one shot game,
which we can easily solve by

backward induction,
I'm going to need to set a wage

equal to 2, and then he'll work.
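
The backward induction for the one shot game can be sketched in a few lines of Python. This is only an illustration under the numbers given above (going wage 1, materials worth 1, gross revenue 4); the function names are hypothetical, and breaking the agent's tie at W = 2 toward honesty stands in for the "plus a penny".

```python
def agent_choice(wage, going_wage=1.0, materials=1.0):
    """Agent's best reply once the firm has invested and offered `wage`.
    Ties are broken toward honesty (an assumption of this sketch)."""
    cheat_payoff = going_wage + materials   # sell the materials, work the normal job
    honest_payoff = wage                    # paid only once the job is done
    return "honest" if honest_payoff >= cheat_payoff else "cheat"


def firm_payoff(wage, revenue=4.0, materials=1.0):
    """Firm's payoff from investing at `wage`, anticipating the agent's reply."""
    if agent_choice(wage, materials=materials) == "honest":
        return revenue - materials - wage
    return -materials                       # the investment is lost


for w in (1.0, 1.5, 2.0, 2.5):
    print(w, agent_choice(w), firm_payoff(w))
# 1.0 cheat -1.0
# 1.5 cheat -1.0
# 2.0 honest 1.0
# 2.5 honest 0.5
```

Any wage below 2 loses the investment, and any wage above 2 only gives away surplus, so the firm's best choice in this sketch is W = 2, where it still earns 1 rather than the 0 from not investing.
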


So in a minute,
we're going to look at the

repeated version of this,
but before we do let's just sum

up where we are so far.
What is this telling us?

It's telling us that when you
invest in an emerging market,

where the courts don't work so
they aren't going to be able to

force this guy to work
well (in particular,

he can run off with your
investment), even though wages

are low, so it seems very
attractive to do outsourcing,

if you worry about getting
incentives right you're going to

have to pay an enormous wage
premium to get the guy to work.

So the going wage in Freedonia
was 1, but you had to set a wage

equal to 2, a 100% wage premium,
to get the guy to work.

So the wage premium in this
emerging market is 100%,

you're paying 2 even though the
going wage is 1.

By the way, this is not an
unreasonable prediction.

If you look at the wages paid
by European and American

companies in some of these
emerging markets,

which have very,
very low going wages,

and if you look at the wages
that are actually being paid by

the companies that are doing
outsourcing you see enormous

wage premiums.
You see enormous premiums over

and above the going wage.
Now what I want to do is I want

to revisit exactly the same
situation, but now we're going

to introduce the wrinkle of the
day.

What's the wrinkle of the day?
The wrinkle of the day is

you're not only going to invest
in Freedonia today,

but if things go well you'll
invest tomorrow,

and if things go well again
you'll invest the day after at

least with some significant
probability.

So the wage premium we just
calculated was the one shot wage

premium.
It was getting this job (this

single one shot job) outsourced
to Freedonia.

Now I want to consider how much
you're going to have to pay,

what are wages going to be in
Freedonia in the foreign

investment sector,
if instead of just having a one

shot, one job investment,
you're investing for the long

term.
You're going to be in Freedonia

for a while.
So consider repeated

interaction with probability
δ of continuing.

So we don't know that you're
going to go on in Freedonia.

Things might break down in
Freedonia because there's a

coup.
It might break down in

Freedonia because the American
administration says you're not

allowed to do outsourcing
anymore.

All sorts of things might
happen, but with some

probability δ
the relationship is going to

continue.
So repeated interaction with

probability of δ.
Let's redo the exercise we did

before to see what wage you'll
have to pay.

Our question is what
wage (let's call it W**), what

wage will you pay?


The way we're going to solve
this, is exactly using the

methods we've learned in this
class.

So what we're going to compare
is the temptation to cheat

today, and we'd better make sure
that that's less than δ

times the value of continuing
the relationship minus the value

of ending the relationship.
Let's call this tomorrow.

So what's happening now is,
once again, I'm employing my

agent in Freedonia,
and provided he does a good

job, I'll employ him again
tomorrow, at least with

probability δ.
But if he doesn't do a good

job, if he runs off with my
investment and doesn't do my

job, what am I going to do?
What would you do?

You'd fire him.
So the punishment (it's clear

what the punishment's going to
be here): the punishment is,

if he doesn't do a good job,
you fire him.

The value of ending the
relationship.

This is firing and this is
continuing.


So let's just work out what
these things are.

So his temptation to cheat
today: if he cheats today,

he doesn't get my wage.
But he does run off with my

cash, and he does go and do his
job at the going wage.

So if he cheats today he gets
2, he stole all my cash,

and he's going off and working
at the going wage,

but he doesn't get what I would
have paid him W** if the job was

well done.
We need this to be less than

the value of continuing the
relationship.

Let's do the easy bit first.
What does he get if we end the

relationship?
He's been fired,

so he'll just work at the going
wage for ever.

So this is the value of 1 for
ever, or at least until the end

of the world.
This is the value of what?

As long as he stayed employed
by me what's he going to get

paid every period?
What's he going to get paid?

W**.
So the value of W** for ever.

Let me cheat a little bit and
assume that the probability of

some coup happening that ends
our relationship exogenously is

the same probability of the coup
happening and ending his ongoing

wage exogenously,
so we can use the same δ.

So let's just do some math
here, what's the value of W**

forever?
So remember the value of 2

forever was what?
2/[1–δ].

So what's the value of W**
forever?

So this is going to be
W**/[1–δ].

What's the value of 1 forever?
1/[1–δ].

The whole thing is multiplied
by δ and this is 2 – W**.

Now I need to do some algebra
to solve for W**.

So let's try and do that.
So I claim that this is the

same as [1–δ]2 – [1–δ]W** <= W**δ – δ1.

Everyone okay with that?
One more line:

let me just sort out some terms
here.

So taking these on the other
side, I have [1–δ]2 + δ1 <= W**δ + [1–δ]W** = W**.
So someone should just check my

algebra at home,
but I think that's right.

So the last two steps were just
algebra, nothing fancy.
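
For anyone checking the algebra at home, the steps can be laid out as follows. This is a reconstruction from the spoken derivation, assuming the cheating payoff of 2, the going wage of 1, and the same δ for both streams, as stated above:

```latex
\begin{align*}
2 - W^{**}
  &\le \delta\left[\frac{W^{**}}{1-\delta} - \frac{1}{1-\delta}\right] \\
(1-\delta)\,2 - (1-\delta)\,W^{**}
  &\le \delta\,W^{**} - \delta\cdot 1 \\
(1-\delta)\,2 + \delta\cdot 1
  &\le \delta\,W^{**} + (1-\delta)\,W^{**} = W^{**}
\end{align*}
```

So the lowest wage that works is a weighted average of the one shot wage 2 and the going wage 1, with weight δ on the going wage.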

What have we learned?
We have learned that the wage I

have to pay this guy,
the wage I have to pay him lies

somewhere between 2 and 1,
but we can do a bit better than

that.


Let's just delete everything
here.


So in particular,
if δ = 0,

what's W**?
If δ = 0,

W** is equal to what?
Somebody?

Equal to 2 and that's what we
had before.

In the one shot game,
there it is up there,

where there was no possibility
of continuing the relationship

tomorrow,
I had to pay him a wage of 2,

or if you like,
a wage premium of 100%.

If there's no probability, if
there's no chance of continuing

this relationship,
if δ = 0, we find again

that I'm paying 100% wage
premium.

Let's take the other extreme.
If δ = 1,

so I just know this
relationship's going to

continue (if δ
= 1,

so there's no probability of
the world ending or there being

a coup), then what's W**?
It's equal to 1.

What's that?
What's 1?

It's the going wage.
So this is the going wage.

If I know for sure we're going
to continue forever I can get

away with paying the guy the
going wage, at least in the

limit.
If we know we're not going to

continue then I have to pay the
one shot wage.

But let's look at a more
interesting intermediate case.

Suppose δ = ½.
There's just a 1/2

probability (that's pretty
low), there's a 1/2 probability

that your company,
American Widgets,

is going to stay in Freedonia:
with probability 1/2 it's going

to be done next period,
with probability 1/2 it's going

to stay.
What does that do to the wage?

What happens to the wage in
this case in which there's a

probability of 1/2 of American
Widgets staying in Freedonia?

It's halfway between 2 and 1,
which is therefore one and a

half, 1½.
Or another way of saying that

is, the wage premium is now only
50%.
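
The three cases can be reproduced with a short Python sketch. It assumes the bound W** >= (1–δ)2 + δ1 derived in this example holds with equality at the lowest workable wage; the constant and function names are illustrative only.

```python
ONE_SHOT_WAGE = 2.0   # wage needed when there is no tomorrow
GOING_WAGE = 1.0      # normalized going wage in Freedonia


def repeated_wage(delta):
    """Lowest wage satisfying (1 - delta) * 2 + delta * 1 <= W**."""
    return (1 - delta) * ONE_SHOT_WAGE + delta * GOING_WAGE


def wage_premium(delta):
    """Premium over the going wage, as a fraction (0.5 means 50%)."""
    return repeated_wage(delta) / GOING_WAGE - 1


for delta in (0.0, 0.5, 1.0):
    print(delta, repeated_wage(delta), wage_premium(delta))
# 0.0 2.0 1.0
# 0.5 1.5 0.5
# 1.0 1.0 0.0
```
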

What have we learned from this
example?

Just an example of using
repeated games.

Well the first thing we've
learned is it's going to be

easy, once we get used to it,
it's easy to use this

technology of comparing
temptations to cheat,

with values of continuing in a
cooperative relationship versus

the value of the punishment,
which in this case was just

firing the guy.
But more specifically in this

example we've learned that even
a relatively small probability

of this relationship
continuing (so this is good news

for those of you who are seniors
and are about to move to San

Francisco and your significant
other is going to London), even a

small probability of the
relationship continuing

drastically reduces the wage
premium.

The amount you have to "pay"
your significant other not to

cheat on you as they go off to
London or San Francisco is

drastically lower if there's
some probability,

in this case just a ½,
of continuing.

Before you leave,
one more thought okay.

So how did this all work?
Just to summarize,

to get good behavior in these
continuing relationships there

has to be some reward tomorrow.
That reward needs to be higher,

if the weight you put on
tomorrow, if the probability of

continuing tomorrow,
is lower.

The less likely tomorrow is to
occur the bigger that reward has

to be tomorrow.
We're going to have to pay

wage premiums to employ people in
Freedonia but those premiums

will come down once we realize
that we're in established

relationships in Freedonia, once
the American firms are

established and not fly-by-night
operations in Freedonia.

Whether that's good news or bad
news for Freedonia we'll leave

there.
On Monday, totally new topic.
