-
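The path from cause to effect is dark and dangerous,
-
but the weapons of econometrics are strong.
-
When nature blesses you with fortuitous random assignment,
-
attack with fierce and flexible
instrumental variables.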
-
Randomized trials are
the surest path
-
to ceteris paribus comparisons,
-
but we often can't use
this powerful tool.
-
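Sometimes, though, randomization happens by accident.
-
That's when we turn to instrumental variables,
-
or IV for short.
-
Instrumental variables.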
-
Today's lesson is the first of two on IV.
-
Our first IV lesson
begins with a story about schools.
-
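Charter schools are public schools
-
freed from day-to-day district oversight
and teachers' union contracts.
-
Whether charter schools raise achievement
-
is one of the most important questions
-
in the history of American education reform.
-
The most popular charter schools
have far more applicants than seats,
-
so the luck of the lottery draw decides
whose children get a seat.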
-
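A lot is at stake for students vying for their chance,
-
as depicted in the award-winning
-
documentary "Waiting for Superman."
-
Waiting for the results stirs up a lot of emotions.
-
Don't cry, you'll make Mommy cry.
OK?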
-
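Do charter schools really provide a better education?
-
Critics would certainly say no.
-
They would argue that charter schools
are able to enroll better,
-
smarter, or more motivated students,
so differences in later outcomes
-
reflect selection bias.
-
Wait a minute, this one seems easy.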
-
In a lottery, winners
are chosen randomly,
-
so just compare winners and losers.
- [Student] Obviously.
-
- On the right track, Kamal,
-
but charter lotteries
don't force kids
-
into or out
of a particular school --
-
they randomize offers
of a charter seat.
-
Some kids get lucky.
-
Some kids don't.
-
If we just wanted
to know the effect
-
of charter school offers,
-
we could treat this
as a randomized trial.
-
But we're interested
in the effects
-
of charter school attendance,
-
not offers.
-
And not everyone
who is offered, accepts.
-
IV turns the effect of being offered
a charter seat into the effect
-
of actually attending
a charter school.
-
- [Student] Cool.
- Oh, nice.
-
- Let's look at an example,
a charter school from
-
the Knowledge Is Power Program,
or KIPP for short.
-
This KIPP school is in Lynn --
-
a faded industrial town
on the coast of Massachusetts.
-
The school has
more applicants than seats
-
and therefore picks its students
using a lottery.
-
From 2005 to 2008,
371 fourth and fifth graders
-
put their names
in the KIPP Lynn lottery.
-
253 students won a seat at KIPP,
-
118 students lost.
-
A year later, lottery winners
had much higher math scores
-
than lottery losers.
-
But remember,
we're not trying to figure out
-
whether winning a lottery
makes you better at math.
-
We want to know if attending KIPP
makes you better at math.
-
Of the 253 lottery winners,
only 199 actually went to KIPP.
-
The others chose
a traditional public school.
-
Similarly, of the 118 lottery losers,
a few actually ended up at KIPP.
-
They got an offer later.
-
So what was the effect
on test scores
-
of actually attending KIPP?
-
- [Kamal] Why can't we just measure
their math scores?
-
- [Instructor] Great question.
-
Who would you compare them to?
-
- [Kamal] Those who didn't attend.
-
- [Instructor] Is attendance random?
-
- [Camilla] No.
- Selection bias.
-
- [Instructor] Correct.
- [Otto] What?
-
- [Instructor] The KIPP offers
are random so we can be confident
-
of ceteris paribus,
but attendance is not random.
-
The choice to accept the offer
might be due to characteristics
-
that are related
to math performance --
-
say, for example,
that dedicated parents
-
are more likely
to accept the offer.
-
Their kids are also more likely
to do better in math,
-
regardless of school.
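
A minimal simulation can make the instructor's point concrete. This is only a sketch under invented assumptions: the share of "dedicated" parents, the acceptance probabilities, the score bonus, and the names `simulate_student` and `TRUE_EFFECT` are all hypothetical and do not come from the KIPP data. The school's true effect is set to zero, yet the naive comparison of attenders with non-attenders still shows a large gap.

```python
import random
from statistics import mean

random.seed(0)

TRUE_EFFECT = 0.0  # assumption for this sketch: attending has no effect at all


def simulate_student():
    """Return (attends, score) for one hypothetical student."""
    dedicated = random.random() < 0.5  # hypothetical: half of parents are "dedicated"
    # Dedicated parents are assumed more likely to choose the school...
    attends = random.random() < (0.8 if dedicated else 0.2)
    # ...and their kids are assumed to score higher regardless of school.
    baseline = 0.5 if dedicated else -0.5
    score = baseline + TRUE_EFFECT * attends + random.gauss(0, 1)
    return attends, score


students = [simulate_student() for _ in range(20000)]
attender_scores = [score for attends, score in students if attends]
other_scores = [score for attends, score in students if not attends]

# Naive comparison: attenders minus non-attenders.
naive_gap = mean(attender_scores) - mean(other_scores)
print(f"naive gap: {naive_gap:.2f} sigma (true effect: {TRUE_EFFECT})")
```

Under these made-up numbers the gap comes out well above zero even though attendance does nothing, which is selection bias in its purest form.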
-
- [Student] Right.
-
- [Instructor] IV converts
the offer effect
-
into the effect of KIPP attendance,
-
adjusting for the fact
that some winners go elsewhere
-
and some losers manage
to attend KIPP anyway.
-
Essentially, IV takes
an incomplete randomization
-
and makes the appropriate
adjustments.
-
How? IV describes a chain reaction.
-
Why do offers affect achievement?
-
Probably because they affect
charter attendance,
-
and charter attendance
improves math scores.
-
The first link in the chain,
called the first stage,
-
is the effect of the lottery
on charter attendance.
-
The second stage is the link
between attending a charter
-
and an outcome variable --
-
in this case, math scores.
-
The instrumental variable,
or "instrument" for short,
-
is the variable that initiates
the chain reaction.
-
The effect of the instrument
on the outcome
-
is called the reduced form.
-
This chain reaction can be
represented mathematically.
-
We multiply the first stage,
-
the effect of winning
on attendance,
-
by the second stage,
-
the effect of attendance on scores.
-
And we get the reduced form,
-
the effect of winning
the lottery on scores.
-
The reduced form and first stage
are observable and easy to compute.
-
However, the effect of attendance
on achievement
-
is not directly observed.
-
This is the causal effect
we're trying to determine.
-
Given some important assumptions
we'll discuss shortly,
-
we can find the effect
of KIPP attendance
-
by dividing the reduced form
by the first stage.
-
This will become more clear
as we work through an example.
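
The chain reaction can be written compactly. The notation here is an assumption added for illustration and is not used in the lesson itself: $Z$ is the lottery offer (1 for winners), $D$ is KIPP attendance, $Y$ is the math score, and $\lambda$ is the effect of attendance on scores.

```latex
\[
\underbrace{E[Y \mid Z=1] - E[Y \mid Z=0]}_{\text{reduced form}}
=
\underbrace{\bigl(E[D \mid Z=1] - E[D \mid Z=0]\bigr)}_{\text{first stage}}
\times
\underbrace{\lambda}_{\text{effect of attendance}}
\]
\[
\lambda
=
\frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[D \mid Z=1] - E[D \mid Z=0]}
\]
```

The second line is simply the first line solved for $\lambda$, which is why a substantial first stage matters: it sits in the denominator.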
-
- [Student] Let's do this.
-
- A quick note on measurement.
-
We measure achievement
using standard deviations,
-
often denoted
by the Greek letter sigma (σ).
-
One σ is a huge move
from around the bottom 15%
-
to the middle of most
achievement distributions.
-
Even a ¼ or ½ σ difference is big.
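
To get a feel for these units, here is a small sketch that converts σ into percentiles. It assumes scores follow a normal distribution, which is an approximation and not something the lesson states; the helper name `percentile_at` is hypothetical.

```python
from statistics import NormalDist

# Assumption: achievement is roughly normally distributed.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def percentile_at(sigma_units):
    """Share of students scoring below a point this many sigmas from the mean."""
    return standard_normal.cdf(sigma_units)


for z in (-1.0, -0.5, -0.25, 0.0):
    print(f"{z:+.2f} sigma -> percentile {100 * percentile_at(z):.0f}")
```

Under normality, one σ below the mean sits at roughly the 16th percentile, which matches the "bottom 15%" figure in the lesson.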
-
- [Instructor] Now we're ready
to plug some numbers
-
into the equation
we introduced earlier.
-
First up, what's the effect
-
of winning the lottery
on math scores?
-
KIPP applicants' math scores
are a third of a standard deviation
-
below the state average
-
in the year before
they apply to KIPP.
-
But a year later, lottery winners
score right at the state average,
-
while the lottery losers
are still well behind
-
with an average score
around -0.36 σ.
-
The effect of winning the lottery
on scores is the difference
-
between the winners' scores
and the losers' scores.
-
Take the winners'
average math scores,
-
subtract the losers'
average math scores,
-
and you will have 0.36 σ.
-
Next up: what's the effect
of winning the lottery on attendance?
-
In other words,
if you win the lottery,
-
how much more likely
are you to attend KIPP
-
than if you lose?
-
First, what percentage
of lottery winners attend KIPP?
-
Divide the number of winners
who attended KIPP
-
by the total number
of lottery winners -- that's 78%.
-
To find the percentage
of lottery losers who attended KIPP,
-
we divide the number of losers
who attended KIPP
-
by the total number
of lottery losers -- that's 4%.
-
Subtract 4 from 78, and we find
that winning the lottery
-
makes you 74 percentage points
more likely to attend KIPP.
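
The first-stage arithmetic can be sketched in a few lines. The counts and rates are the ones quoted in the lesson; the number of lottery losers who attended is not given, so the sketch uses the quoted 4% rate directly. Variable names are illustrative only.

```python
# Figures quoted in the lesson.
winners_total = 253
winners_attending = 199
quoted_winner_rate = 0.78      # the lesson's figure for winners
loser_attendance_rate = 0.04   # quoted rate; the underlying count is not given

# Raw share from the counts: about 0.787, i.e. 78% when truncated to a whole percent.
winner_rate_from_counts = winners_attending / winners_total
print(f"winners attending (from counts): {winner_rate_from_counts:.3f}")

# The first stage is a difference in attendance rates, so its natural unit
# is percentage points, not a relative "percent more likely".
first_stage = quoted_winner_rate - loser_attendance_rate
print(f"first stage: {100 * first_stage:.0f} percentage points")

# For contrast, the relative comparison is a very different number.
relative = quoted_winner_rate / loser_attendance_rate
print(f"winners are about {relative:.1f} times as likely to attend")
```

The IV calculation uses the difference in rates, not the ratio.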
-
Now we can find
what we're really after --
-
the effect of attendance on scores,
by dividing 0.36 by 0.74.
-
Attending KIPP raises math scores
-
by 0.48 standard deviations
on average.
-
That's an awesome achievement gain,
-
equal to moving
from about the bottom third
-
to the middle
of the achievement distribution.
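
Putting the pieces together, a short sketch of the division, using only the rounded figures quoted in the lesson. The percentile step assumes normally distributed scores, which is an added assumption.

```python
from statistics import NormalDist

# Rounded figures quoted in the lesson, in sigma units.
winners_mean = 0.0    # "right at the state average"
losers_mean = -0.36
first_stage = 0.74    # difference in attendance rates, winners minus losers

reduced_form = winners_mean - losers_mean   # effect of winning on scores
iv_estimate = reduced_form / first_stage    # effect of attending on scores

print(f"reduced form: {reduced_form:.2f} sigma")
print(f"IV estimate:  {iv_estimate:.3f} sigma")

# Where does a student 0.48 sigma below the mean sit, assuming normality?
start_percentile = NormalDist().cdf(-0.48)
print(f"starting percentile: {100 * start_percentile:.0f}")
```

With these rounded inputs the ratio comes out slightly above the quoted 0.48; the small gap presumably reflects rounding in the figures read aloud. A student starting 0.48 σ below the mean sits near the 32nd percentile under normality, consistent with "about the bottom third."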
-
- [Student] Whoa, half a sig.
-
- [Instructor] These estimates
are for kids opting in
-
to the KIPP lottery,
-
whose enrollment status
is changed by winning.
-
That's not necessarily
a random sample
-
of all children in Lynn.
-
So we can't assume
we'd see the same effect
-
for other types of students.
- [Student] Huh.
-
- But this effect
on keen-for-KIPP kids
-
is likely to be a good indicator
of the consequences
-
of adding additional charter seats.
-
- [Student] Cool.
- [Student] Got it.
-
- IV eliminates selection bias,
but like all of our tools,
-
the solution builds on a set
of assumptions
-
not to be taken for granted.
-
First, there must be
a substantial first stage --
-
that is, the instrumental variable,
winning or losing the lottery,
-
must really change the variable
whose effect we're interested in --
-
here, KIPP attendance.
-
In this case, the first stage
is not really in doubt.
-
Winning the lottery makes
KIPP attendance much more likely.
-
Not all IV stories are like that.
-
Second, the instrument
must be as good
-
as randomly assigned,
-
meaning lottery winners and losers
have similar characteristics.
-
This is the independence assumption.
-
Of course, KIPP lottery wins
really are randomly assigned.
-
Still, we should check for balance
and confirm that winners and losers
-
have similar family backgrounds,
-
similar aptitudes and so on.
-
In essence, we're checking
to ensure KIPP lotteries are fair
-
with no group of applicants
suspiciously likely to win.
-
Finally, we require
the instrument change outcomes
-
solely through
the variable of interest --
-
in this case, attending KIPP.
-
This assumption is called
the exclusion restriction.
-
- IV only works if you can satisfy
these three assumptions.
-
- I don't understand
the exclusion restriction.
-
How could winning the lottery
affect math scores
-
other than by attending KIPP?
-
- [Student] Yeah.
- [Instructor] Great question.
-
Suppose lottery winners
are just thrilled to win,
-
and this happiness motivates them
to study more and learn more math,
-
regardless of where
they go to school.
-
This would violate
the exclusion restriction
-
because the motivational effect
of winning is a second channel
-
whereby lotteries
might affect test scores.
-
While it's hard
to rule this out entirely,
-
there's no evidence
of any alternative channels
-
in the KIPP study.
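
A short derivation shows why a second channel would matter. The notation is an assumption added for illustration: $\lambda$ is the effect of attendance, $\pi$ is the first stage, and $\gamma$ is a hypothetical direct effect of winning on scores (the motivational channel), with all effects treated as constant across students.

```latex
\[
\text{reduced form} = \pi \lambda + \gamma
\]
\[
\frac{\text{reduced form}}{\text{first stage}}
= \frac{\pi \lambda + \gamma}{\pi}
= \lambda + \frac{\gamma}{\pi}
\]
```

When $\gamma = 0$ the ratio recovers $\lambda$. Otherwise the IV estimate is off by $\gamma / \pi$, and because $\pi$ is less than one, any direct effect is magnified rather than shrunk.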
-
- IV solves the problem
of selection bias
-
in scenarios like the KIPP lottery
where treatment offers are random
-
but some of those offered opt out.
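
The whole argument can be checked on simulated data. This is a sketch with invented numbers, not a reconstruction of the KIPP study: the take-up probabilities, the "dedicated" trait, the constant `TRUE_EFFECT`, and the helper names are all hypothetical. Offers are random, take-up is not, and the effect of attendance is the same for everyone.

```python
import random

random.seed(1)

TRUE_EFFECT = 0.5  # hypothetical effect of attendance, in sigma units


def simulate_applicant():
    """Return (wins, attends, score) for one hypothetical lottery applicant."""
    wins = random.random() < 0.5       # the lottery: a fair coin flip
    dedicated = random.random() < 0.5  # unobserved family trait
    if wins:
        # Some winners opt out; dedicated families accept more often.
        attends = random.random() < (0.9 if dedicated else 0.6)
    else:
        # A few losers get in later, all from dedicated families here.
        attends = random.random() < (0.1 if dedicated else 0.0)
    baseline = 0.5 if dedicated else -0.5
    score = baseline + TRUE_EFFECT * attends + random.gauss(0, 1)
    return wins, attends, score


def average(values):
    return sum(values) / len(values)


applicants = [simulate_applicant() for _ in range(40000)]
winners = [(a, s) for w, a, s in applicants if w]
losers = [(a, s) for w, a, s in applicants if not w]

# Reduced form: effect of winning on scores.
reduced_form = average([s for _, s in winners]) - average([s for _, s in losers])
# First stage: effect of winning on attendance.
first_stage = average([a for a, _ in winners]) - average([a for a, _ in losers])
wald = reduced_form / first_stage

# Naive comparison of attenders with non-attenders, ignoring the lottery.
naive = (average([s for _, a, s in applicants if a])
         - average([s for _, a, s in applicants if not a]))

print(f"first stage: {first_stage:.2f}")
print(f"naive gap:   {naive:.2f}")
print(f"IV estimate: {wald:.2f}  (true effect: {TRUE_EFFECT})")
```

In this setup the naive gap overstates the effect because dedicated families are over-represented among attenders, while the ratio of reduced form to first stage lands close to the true value.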
-
This sort of intentional
yet incomplete random assignment
-
is surprisingly common.
-
Even randomized clinical trials
have this feature.
-
IV solves the problem
of non-random take-up
-
in lotteries or clinical research.
-
But lotteries are not the only source
of compelling instruments.
-
Many causal questions
can be addressed
-
by naturally occurring,
-
as-good-as-randomly
assigned variation.
-
Here's a causal question for you:
-
Do women who have children
early in their careers
-
suffer a substantial earnings penalty
-
as a result?
-
After all, women earn less than men.
-
We could, of course, simply compare
the earnings of women
-
with more and fewer children.
-
But such comparisons are fraught
with selection bias.
-
If only we could
randomly assign babies
-
to different households.
-
Yeah, right,
sounds pretty fanciful.
-
Our next IV story -- fantastic
and not fanciful --
-
illustrates an amazing,
naturally occurring instrument
-
for family size.
-
♪ [music] ♪
-
- [Instructor] You're on your way
to mastering econometrics.
-
You can also check out
MRU's website for more courses,
-
teacher resources, and more.
-
♪ [music] ♪