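0:00:00.299,0:00:04.644 The path of causal inference is dark and dangerous, 0:00:05.139,0:00:08.015 but econometrics is a powerful weapon. 0:00:08.480,0:00:11.734 When nature hands you fortuitous random assignment, 0:00:11.734,0:00:15.803 attack with fierce and flexible[br]instrumental variables. 0:00:19.393,0:00:21.094 [] 0:00:23.653,0:00:26.362 Randomized trials are the most reliable way 0:00:26.362,0:00:28.704 to make ceteris paribus comparisons. 0:00:28.704,0:00:32.640 But we often can't use[br]this powerful tool. 0:00:33.224,0:00:36.940 Sometimes, though, randomization happens by accident. 0:00:36.940,0:00:40.592 That's when we turn to instrumental variables, 0:00:40.592,0:00:41.938 IV for short. 0:00:41.938,0:00:44.508 Instrumental variables. 0:00:44.508,0:00:48.186 Today's lesson is the first of two on IV. 0:00:48.958,0:00:52.801 Our first IV lesson[br]begins with a story about schools. 0:00:52.801,0:00:54.348 [] 0:00:54.348,0:00:56.138 Charter schools are public schools 0:00:56.138,0:01:00.112 free from day-to-day district oversight[br]and teacher union contracts. 0:01:00.895,0:01:03.511 Whether charter schools boost achievement 0:01:03.511,0:01:05.161 is one of the most important questions 0:01:05.161,0:01:07.761 in the history of American education reform. 0:01:08.145,0:01:12.562 The most popular charter schools have[br]far more applicants than seats, 0:01:12.562,0:01:16.462 so the luck of a lottery draw decides[br]whose children are offered a seat. 0:01:16.870,0:01:20.695 A lot is at stake as students compete for a spot, 0:01:20.695,0:01:25.003 as the award-winning documentary "Waiting for Superman" 0:01:25.003,0:01:27.832 portrays. 0:01:27.832,0:01:29.699 Waiting for the results stirs up a lot of emotion. 0:01:30.258,0:01:32.916 Don't cry, you'll make Mommy cry,[br]okay? 0:01:37.498,0:01:40.618 Do charter schools really provide a better education? 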
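0:01:40.948,0:01:43.183 Critics would certainly say no. 0:01:43.413,0:01:46.586 They argue that charter schools[br]enroll better, 0:01:46.586,0:01:50.164 smarter, or more motivated students,[br]so differences in later outcomes 0:01:50.164,0:01:52.061 reflect selection bias. 0:01:52.595,0:01:54.729 Wait, this one seems easy. 0:01:55.139,0:01:57.639 In a lottery,[br]winners are chosen at random, 0:01:57.639,0:02:00.083 so just compare winners and losers.[br]- Obviously. 0:02:00.083,0:02:01.784 You're on the right track, Kamal, 0:02:01.784,0:02:04.375 but charter school lotteries 0:02:04.375,0:02:07.560 don't force kids into[br]or out of a particular school. 0:02:07.749,0:02:10.667 They randomize offers of a charter seat. 0:02:11.650,0:02:13.449 Some kids get lucky, 0:02:13.449,0:02:14.966 some don't. 0:02:14.966,0:02:17.235 If we only wanted to know the effect 0:02:17.235,0:02:19.202 of a charter school offer, 0:02:19.202,0:02:22.417 we could treat this as a randomized trial. 0:02:22.717,0:02:24.684 But we're interested in the effect 0:02:24.684,0:02:27.042 of charter school attendance, 0:02:27.042,0:02:28.283 not of offers. 0:02:28.568,0:02:32.039 Not every student offered a seat[br]accepts it. 0:02:32.039,0:02:37.234 IV converts the effect of being offered a charter seat 0:02:37.234,0:02:40.367 into the effect of actually attending a charter school. 0:02:40.367,0:02:42.344 - Cool.[br]- Oh, great. 0:02:45.925,0:02:48.871 Let's look at an example. 0:02:48.871,0:02:52.353 This is a charter school in the Knowledge Is Power Program,[br]or KIPP for short. 0:02:52.736,0:02:54.937 This KIPP charter school is in Lynn, 0:02:54.937,0:02:58.837 a faded industrial town[br]on the Massachusetts coast. 0:02:59.104,0:03:01.886 The school has more applicants than seats, 0:03:01.886,0:03:05.620 so it picks its students by lottery. 0:03:05.834,0:03:11.854 From 2005 to 2008,[br]371 fourth and fifth graders 0:03:11.854,0:03:15.350 entered the KIPP Lynn lottery. 0:03:15.350,0:03:18.805 253 of them won an offer of a seat at KIPP, 0:03:18.805,0:03:21.651 and 118 did not. 0:03:21.967,0:03:26.001 A year later, lottery winners had higher math scores 0:03:26.001,0:03:27.852 than lottery losers. 0:03:27.852,0:03:30.466 But we're not trying to figure out 0:03:30.466,0:03:33.803 whether winning an offer[br]improves your math skills. 0:03:34.070,0:03:38.471 We want to know whether attending KIPP[br]improves your math scores. 0:03:39.041,0:03:45.750 Of the 253 lottery winners,[br]only 199 actually attended KIPP. 0:03:46.139,0:03:48.804 The others chose a traditional public school. 0:03:49.563,0:03:55.536 Likewise, of the 118 lottery losers,[br]a few did end up attending KIPP. 0:03:55.536,0:03:57.452 They were offered a seat later. 0:03:57.452,0:04:00.044 So what is the effect of actually attending KIPP 0:04:00.044,0:04:02.377 on test scores? 0:04:03.109,0:04:05.426 Why can't we just measure[br]their math scores? 0:04:05.894,0:04:07.235 That's a good question. 0:04:07.235,0:04:09.302 Who would you compare them to? 0:04:09.302,0:04:11.111 The students who didn't attend. 0:04:11.111,0:04:12.944 Is attendance random? 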
0:04:14.161,0:04:16.177 - No.[br]- Selection bias. 0:04:16.177,0:04:17.909 - Right.[br]- What? 0:04:17.909,0:04:21.826 The KIPP offers are random, so we can be confident 0:04:21.826,0:04:26.409 of ceteris paribus,[br]but attendance is not random. 0:04:26.635,0:04:30.626 The choice to accept the offer[br]might be due to characteristics 0:04:30.626,0:04:32.984 that are related[br]to math performance -- 0:04:33.251,0:04:36.157 say, for example,[br]that dedicated parents 0:04:36.157,0:04:38.957 are more likely[br]to accept the offer. 0:04:38.957,0:04:42.646 Their kids are also more likely[br]to do better in math, 0:04:42.646,0:04:44.090 regardless of school. 0:04:44.090,0:04:45.114 - [Student] Right. 0:04:45.114,0:04:47.725 - [Instructor] IV converts[br]the offer effect 0:04:47.725,0:04:50.567 into the effect of KIPP attendance, 0:04:50.573,0:04:53.371 adjusting for the fact[br]that some winners go elsewhere 0:04:53.371,0:04:56.573 and some losers manage[br]to attend KIPP anyway. 0:04:56.950,0:05:00.517 Essentially, IV takes[br]an incomplete randomization 0:05:00.517,0:05:03.007 and makes the appropriate[br]adjustments. 0:05:03.684,0:05:07.107 How? IV describes a chain reaction. 0:05:07.641,0:05:10.343 Why do offers affect achievement? 0:05:10.343,0:05:13.256 Probably because they affect[br]charter attendance, 0:05:13.256,0:05:16.643 and charter attendance[br]improves math scores. 0:05:16.643,0:05:20.645 The first link in the chain,[br]called the first stage, 0:05:20.645,0:05:24.478 is the effect of the lottery[br]on charter attendance. 0:05:24.478,0:05:28.452 The second stage is the link[br]between attending a charter 0:05:28.452,0:05:30.153 and an outcome variable -- 0:05:30.153,0:05:32.261 in this case, math scores. 0:05:32.732,0:05:36.441 The instrumental variable,[br]or "instrument" for short, 0:05:36.441,0:05:40.246 is the variable that initiates[br]the chain reaction. 0:05:40.979,0:05:43.993 The effect of the instrument[br]on the outcome 0:05:43.993,0:05:46.631 is called the reduced form. 
0:05:48.143,0:05:51.869 This chain reaction can be[br]represented mathematically. 0:05:51.869,0:05:54.241 We multiply the first stage, 0:05:54.241,0:05:56.349 the effect of winning[br]on attendance, 0:05:56.349,0:05:57.960 by the second stage, 0:05:57.960,0:06:00.538 the effect of attendance on scores. 0:06:00.538,0:06:02.713 And we get the reduced form, 0:06:02.713,0:06:05.680 the effect of winning[br]the lottery on scores. 0:06:06.780,0:06:11.566 The reduced form and first stage[br]are observable and easy to compute. 0:06:11.752,0:06:14.876 However, the effect of attendance[br]on achievement 0:06:14.876,0:06:17.093 is not directly observed. 0:06:17.093,0:06:20.360 This is the causal effect[br]we're trying to determine. 0:06:21.043,0:06:23.827 Given some important assumptions[br]we'll discuss shortly, 0:06:23.827,0:06:25.977 we can find the effect[br]of KIPP attendance 0:06:25.977,0:06:29.265 by dividing the reduced form[br]by the first stage. 0:06:29.265,0:06:32.910 This will become more clear[br]as we work through an example. 0:06:32.910,0:06:34.207 - [Student] Let's do this. 0:06:37.161,0:06:38.728 - A quick note on measurement. 0:06:38.728,0:06:41.745 We measure achievement[br]using standard deviations, 0:06:41.745,0:06:44.728 often denoted[br]by the Greek letter sigma (σ). 0:06:44.728,0:06:48.862 One σ is a huge move[br]from around the bottom 15% 0:06:48.862,0:06:51.634 to the middle of most[br]achievement distributions. 0:06:51.634,0:06:55.412 Even a ¼ or ½ σ difference is big. 0:06:56.262,0:06:58.389 - [Instructor] Now we're ready[br]to plug some numbers 0:06:58.389,0:07:01.655 into the equation[br]we introduced earlier. 0:07:01.655,0:07:03.231 First up, what's the effect 0:07:03.231,0:07:06.076 of winning the lottery[br]on math scores? 0:07:06.354,0:07:10.421 KIPP applicants' math scores[br]are a third of a standard deviation 0:07:10.421,0:07:11.835 below the state average 0:07:11.835,0:07:14.386 in the year before[br]they apply to KIPP. 
0:07:14.386,0:07:18.320 But a year later, lottery winners[br]score right at the state average, 0:07:18.320,0:07:21.482 while the lottery losers[br]are still well behind 0:07:21.482,0:07:25.499 with an average score[br]around -0.36 σ. 0:07:25.834,0:07:29.619 The effect of winning the lottery[br]on scores is the difference 0:07:29.619,0:07:32.819 between the winners' scores[br]and the losers' scores. 0:07:33.403,0:07:35.784 Take the winners'[br]average math scores, 0:07:35.784,0:07:38.269 subtract the losers'[br]average math scores, 0:07:38.269,0:07:41.502 and you will have 0.36 σ. 0:07:41.908,0:07:46.880 Next up: what's the effect[br]of winning the lottery on attendance? 0:07:46.880,0:07:49.193 In other words,[br]if you win the lottery, 0:07:49.193,0:07:52.257 how much more likely[br]are you to attend KIPP 0:07:52.257,0:07:53.456 than if you lose? 0:07:53.671,0:07:57.798 First, what percentage[br]of lottery winners attend KIPP? 0:07:57.798,0:08:00.774 Divide the number of winners[br]who attended KIPP 0:08:00.774,0:08:05.490 by the total number[br]of lottery winners -- that's 78%. 0:08:05.810,0:08:09.331 To find the percentage[br]of lottery losers who attended KIPP, 0:08:09.331,0:08:12.333 we divide the number of losers[br]who attended KIPP 0:08:12.333,0:08:16.865 by the total number[br]of lottery losers -- that's 4%. 0:08:17.377,0:08:21.597 Subtract 4 from 78, and we find[br]that winning the lottery 0:08:21.597,0:08:25.600 makes you 74%[br]more likely to attend KIPP. 0:08:25.946,0:08:28.532 Now we can find[br]what we're really after -- 0:08:28.532,0:08:34.551 the effect of attendance on scores,[br]by dividing 0.36 by 0.74. 0:08:34.789,0:08:37.585 Attending KIPP raises math scores 0:08:37.585,0:08:41.606 by 0.48 standard deviations[br]on average. 0:08:42.126,0:08:44.503 That's an awesome achievement gain, 0:08:44.503,0:08:47.380 equal to moving[br]from about the bottom third 0:08:47.380,0:08:49.925 to the middle[br]of the achievement distribution. 
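The arithmetic just described can be sketched in a few lines of Python. This is only an illustration built from the rounded figures quoted in the lesson. The function name `wald_estimate` and the variable names are assumptions made for this sketch, not anything from the study. The first stage of 0.74 is a difference in attendance shares, that is, 74 percentage points. With these rounded inputs the ratio comes out slightly under 0.49; the 0.48 quoted in the lesson presumably reflects unrounded inputs.

```python
def wald_estimate(reduced_form, first_stage):
    """Divide the reduced form by the first stage (hypothetical helper).

    reduced_form: effect of winning the lottery on scores (in sigma)
    first_stage:  effect of winning the lottery on the attendance share
    """
    if abs(first_stage) < 1e-12:
        # With no first stage the instrument does not move attendance,
        # so the ratio is undefined.
        raise ValueError("first stage is zero")
    return reduced_form / first_stage


# Rounded figures quoted in the lesson.
winners_math = 0.0      # winners score right at the state average (sigma)
losers_math = -0.36     # losers' average score (sigma)
winners_attend = 0.78   # share of lottery winners who attended KIPP
losers_attend = 0.04    # share of lottery losers who attended KIPP

reduced_form = winners_math - losers_math       # winners minus losers, in sigma
first_stage = winners_attend - losers_attend    # difference in attendance shares
attendance_effect = wald_estimate(reduced_form, first_stage)

print(f"reduced form:      {reduced_form:.2f} sigma")
print(f"first stage:       {first_stage:.2f}")
print(f"attendance effect: {attendance_effect:.3f} sigma")
```

The guard on a zero first stage mirrors the requirement that the instrument must actually change attendance; a very small first stage would make the ratio unstable even when it is not exactly zero.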
0:08:49.925,0:08:51.085 - [Student] Whoa, half a sig. 0:08:51.085,0:08:53.507 - [Instructor] These estimates[br]are for kids opting in 0:08:53.507,0:08:54.781 to the KIPP lottery, 0:08:54.781,0:08:57.762 whose enrollment status[br]is changed by winning. 0:08:57.985,0:09:00.617 That's not necessarily[br]a random sample 0:09:00.617,0:09:02.283 of all children in Lynn. 0:09:02.536,0:09:05.035 So we can't assume[br]we'd see the same effect 0:09:05.035,0:09:07.327 for other types of students.[br]- [Student] Huh. 0:09:07.327,0:09:10.218 - But this effect[br]on keen for KIPP kids 0:09:10.218,0:09:13.367 is likely to be a good indicator[br]of the consequences 0:09:13.367,0:09:15.767 of adding additional charter seats. 0:09:15.767,0:09:17.216 - [Student] Cool.[br]- [Student] Got it. 0:09:19.628,0:09:23.352 - IV eliminates selection bias,[br]but like all of our tools, 0:09:23.352,0:09:25.624 the solution builds on a set[br]of assumptions 0:09:25.624,0:09:27.540 not to be taken for granted. 0:09:28.098,0:09:31.463 First, there must be[br]a substantial first stage -- 0:09:31.463,0:09:35.565 that is the instrumental variable,[br]winning or losing the lottery, 0:09:35.565,0:09:39.065 must really change the variable[br]whose effect we're interested in -- 0:09:39.065,0:09:41.031 here, KIPP attendance. 0:09:41.298,0:09:44.594 In this case, the first stage[br]is not really in doubt. 0:09:44.594,0:09:47.894 Winning the lottery makes[br]KIPP attendance much more likely. 0:09:48.386,0:09:50.631 Not all IV stories are like that. 0:09:51.321,0:09:53.698 Second, the instrument[br]must be as good 0:09:53.698,0:09:54.931 as randomly assigned, 0:09:54.931,0:09:58.716 meaning lottery winners and losers[br]have similar characteristics. 0:09:58.893,0:10:01.559 This is the independence assumption. 0:10:01.977,0:10:05.716 Of course, KIPP lottery wins[br]really are randomly assigned. 
0:10:05.716,0:10:09.656 Still, we should check for balance[br]and confirm that winners and losers 0:10:09.656,0:10:11.493 have similar family backgrounds, 0:10:11.493,0:10:13.590 similar aptitudes and so on. 0:10:13.590,0:10:16.969 In essence, we're checking[br]to ensure KIPP lotteries are fair 0:10:16.969,0:10:20.055 with no group of applicants[br]suspiciously likely to win. 0:10:21.373,0:10:24.373 Finally, we require[br]the instrument change outcomes 0:10:24.373,0:10:26.092 solely through[br]the variable of interest -- 0:10:26.092,0:10:28.100 in this case, attending KIPP. 0:10:28.299,0:10:31.367 This assumption is called[br]the exclusion restriction. 0:10:32.951,0:10:37.500 - IV only works if you can satisfy[br]these three assumptions. 0:10:38.033,0:10:40.418 - I don't understand[br]the exclusion restriction. 0:10:40.917,0:10:43.599 How could winning the lottery[br]affect math scores 0:10:43.599,0:10:45.244 other than by attending KIPP? 0:10:45.244,0:10:47.230 - [Student] Yeah.[br]- [Instructor] Great question. 0:10:47.230,0:10:50.536 Suppose lottery winners[br]are just thrilled to win, 0:10:50.536,0:10:55.045 and this happiness motivates them[br]to study more and learn more math, 0:10:55.045,0:10:57.285 regardless of where[br]they go to school. 0:10:57.285,0:10:59.901 This would violate[br]the exclusion restriction 0:10:59.901,0:11:03.787 because the motivational effect[br]of winning is a second channel 0:11:03.787,0:11:06.569 whereby lotteries[br]might affect test scores. 0:11:06.865,0:11:09.546 While it's hard[br]to rule this out entirely, 0:11:09.546,0:11:12.650 there's no evidence[br]of any alternative channels 0:11:12.650,0:11:14.108 in the KIPP study. 0:11:17.817,0:11:20.700 - IV solves the problem[br]of selection bias 0:11:20.700,0:11:25.051 in scenarios like the KIPP lottery[br]where treatment offers are random 0:11:25.051,0:11:27.083 but some of those offered opt out. 
0:11:28.451,0:11:31.700 This sort of intentional[br]yet incomplete random assignment 0:11:31.700,0:11:33.367 is surprisingly common. 0:11:33.367,0:11:36.318 Even randomized clinical trials[br]have this feature. 0:11:37.134,0:11:40.053 IV solves the problem[br]of non-random take-up 0:11:40.053,0:11:42.534 in lotteries or clinical research. 0:11:43.054,0:11:46.725 But lotteries are not the only source[br]of compelling instruments. 0:11:46.915,0:11:49.124 Many causal questions[br]can be addressed 0:11:49.124,0:11:50.758 by naturally occurring 0:11:50.758,0:11:53.831 as good as randomly[br]assigned variation. 0:11:54.731,0:11:56.915 Here's a causal question for you: 0:11:56.915,0:11:59.450 Do women who have children[br]early in their careers 0:11:59.450,0:12:01.647 suffer a substantial earnings penalty 0:12:01.647,0:12:02.648 as a result? 0:12:02.648,0:12:04.970 After all, women earn less than men. 0:12:05.573,0:12:08.506 We could, of course, simply compare[br]the earnings of women 0:12:08.506,0:12:10.891 with more and fewer children. 0:12:10.891,0:12:14.190 But such comparisons are fraught[br]with selection bias. 0:12:14.806,0:12:17.401 If only we could[br]randomly assign babies 0:12:17.401,0:12:19.089 to different households. 0:12:19.089,0:12:22.131 Yeah, right,[br]sounds pretty fanciful. 0:12:22.470,0:12:26.714 Our next IV story -- fantastic[br]and not fanciful -- 0:12:26.714,0:12:30.234 illustrates an amazing,[br]naturally occurring instrument 0:12:30.234,0:12:31.918 for family size. 0:12:33.317,0:12:34.551 ♪ [music] ♪ 0:12:34.551,0:12:38.202 - [Instructor] You're on your way[br]to mastering econometrics. 0:12:38.202,0:12:40.170 Make sure this video sticks 0:12:40.170,0:12:42.636 by taking a few[br]quick practice questions. 0:12:42.886,0:12:46.336 Or, if you're ready,[br]click for the next video. 0:12:46.529,0:12:50.204 You can also check out[br]MRU's website for more courses, 0:12:50.204,0:12:52.027 teacher resources, and more. 0:12:52.289,0:12:53.772 ♪ [music] ♪