WEBVTT 00:00:00.299 --> 00:00:04.644 - [Instructor] The path from cause to effect is dark and dangerous. 00:00:05.041 --> 00:00:08.015 But the weapons of Econometrics are strong. 00:00:08.480 --> 00:00:11.734 Attack with fierce and flexible instrumental variables 00:00:11.734 --> 00:00:15.803 when nature blesses you with fortuitous random assignment. 00:00:19.393 --> 00:00:21.094 [gong rings] 00:00:23.653 --> 00:00:28.704 Randomized trials are the surest path to ceteris paribus comparisons. 00:00:28.704 --> 00:00:32.640 Alas, this powerful tool is often unavailable. 00:00:33.224 --> 00:00:36.940 But sometimes, randomization happens by accident. 00:00:36.940 --> 00:00:40.592 That's when we turn to instrumental variables -- 00:00:40.592 --> 00:00:41.938 IV for short. 00:00:41.938 --> 00:00:44.508 - [Voice whispers] Instrumental variables. 00:00:44.508 --> 00:00:48.186 - [Instructor] Today's lesson is the first of two on IV. 00:00:48.958 --> 00:00:52.951 Our first IV lesson begins with a story of schools. 00:00:52.951 --> 00:00:54.348 [school bell rings] 00:00:54.348 --> 00:00:56.138 - [Josh] Charter schools are public schools 00:00:56.138 --> 00:01:00.112 freed from daily district oversight and teacher union contracts. 00:01:00.895 --> 00:01:03.511 The question of whether charters boost achievement 00:01:03.511 --> 00:01:05.161 is one of the most important 00:01:05.161 --> 00:01:07.761 in the history of American education reform. 00:01:08.145 --> 00:01:12.562 - The most popular charter schools have more applicants than seats 00:01:12.562 --> 00:01:16.462 so the luck of a lottery draw decides who's offered a seat. 00:01:16.870 --> 00:01:20.503 A lot is at stake for the students vying for their chance, 00:01:20.503 --> 00:01:25.003 and waiting for the lottery results brings up lots of emotions 00:01:25.003 --> 00:01:27.832 as was captured in the award-winning documentary 00:01:27.832 --> 00:01:29.699 "Waiting For Superman." 
00:01:30.258 --> 00:01:32.916 - [Mother] Don't cry. You're gonna make Mommy cry. Okay? 00:01:37.498 --> 00:01:40.618 - Do charters really provide a better education? 00:01:40.948 --> 00:01:43.183 Critics most definitely say no, 00:01:43.413 --> 00:01:46.479 arguing that charters enroll better students to begin with, 00:01:46.479 --> 00:01:50.164 smarter or more motivated, so differences in later outcomes 00:01:50.164 --> 00:01:52.061 reflect selection bias. 00:01:52.595 --> 00:01:54.729 - [Kamal] Wait, this one seems easy. 00:01:55.139 --> 00:01:57.444 In a lottery, winners are chosen randomly, 00:01:57.498 --> 00:02:00.083 so just compare winners and losers. - [Student] Obviously. 00:02:00.083 --> 00:02:01.698 - On the right track, Kamal, 00:02:01.698 --> 00:02:04.747 but charter lotteries don't force kids into 00:02:04.747 --> 00:02:07.560 or out of a particular school. 00:02:07.749 --> 00:02:10.667 They randomize offers of a charter seat. 00:02:11.650 --> 00:02:13.449 Some kids get lucky. 00:02:13.449 --> 00:02:14.966 Some kids don't. 00:02:14.966 --> 00:02:19.118 If we just wanted to know the effect of charter school offers, 00:02:19.118 --> 00:02:22.417 we could treat this as a randomized trial. 00:02:22.717 --> 00:02:24.684 But we're interested in the effects 00:02:24.684 --> 00:02:28.283 of charter school attendance, not offers. 00:02:28.568 --> 00:02:31.917 And not everyone who is offered, accepts. 00:02:31.917 --> 00:02:37.234 IV turns the effect of being offered a charter seat into the effect 00:02:37.234 --> 00:02:40.367 of actually attending a charter school. 00:02:40.367 --> 00:02:42.344 - [Student] Cool. - Oh nice. 00:02:45.925 --> 00:02:48.871 - Let's look at an example, a charter school from 00:02:48.871 --> 00:02:52.353 the Knowledge Is Power Program, or KIPP for short. 00:02:52.736 --> 00:02:54.937 This KIPP school is in Lynn, 00:02:54.937 --> 00:02:58.837 a faded industrial town on the coast of Massachusetts. 
00:02:59.104 --> 00:03:01.886 The school has more applicants than seats 00:03:01.886 --> 00:03:05.620 and therefore picks its students using a lottery. 00:03:05.834 --> 00:03:11.854 From 2005 to 2008, 371 fourth and fifth graders 00:03:11.854 --> 00:03:15.320 put their names in the KIPP Lynn lottery. 00:03:15.382 --> 00:03:18.754 253 students won a seat at KIPP, 00:03:18.754 --> 00:03:21.651 118 students lost. 00:03:21.967 --> 00:03:26.001 A year later, lottery winners had much higher math scores 00:03:26.001 --> 00:03:27.719 than lottery losers. 00:03:27.802 --> 00:03:30.370 But remember, we're not trying to figure out 00:03:30.370 --> 00:03:33.803 whether winning a lottery makes you better at math. 00:03:34.070 --> 00:03:38.471 We want to know if attending KIPP makes you better at math. 00:03:38.788 --> 00:03:45.671 Of the 253 lottery winners, only 199 actually went to KIPP. 00:03:46.139 --> 00:03:48.804 The others chose a traditional public school. 00:03:49.563 --> 00:03:55.370 Similarly, of the 118 lottery losers, a few actually ended up at KIPP. 00:03:55.509 --> 00:03:57.452 They got an offer later. 00:03:57.452 --> 00:04:02.377 So what was the effect on test scores of actually attending KIPP? 00:04:03.109 --> 00:04:05.426 - [Kamal] Why can't we just measure their math scores? 00:04:05.426 --> 00:04:07.096 - [Instructor] Great question. 00:04:07.096 --> 00:04:09.302 Who would you compare them to? 00:04:09.302 --> 00:04:11.111 - [Kamal] Those who didn't attend. 00:04:11.111 --> 00:04:12.944 - [Instructor] Is attendance random? 00:04:13.937 --> 00:04:15.057 - [Camilla] No. 00:04:15.057 --> 00:04:16.177 - Selection bias. 00:04:16.177 --> 00:04:17.909 - [Instructor] Correct. - [Otto] What? 00:04:17.909 --> 00:04:21.826 - [Instructor] The KIPP offers are random so we can be confident 00:04:21.826 --> 00:04:26.409 of ceteris paribus, but attendance is not random. 
00:04:26.635 --> 00:04:30.601 The choice to accept the offer might be due to characteristics 00:04:30.601 --> 00:04:32.984 that are related to math performance -- 00:04:33.251 --> 00:04:36.157 say, for example, that dedicated parents 00:04:36.157 --> 00:04:38.941 are more likely to accept the offer. 00:04:38.941 --> 00:04:42.646 Their kids are also more likely to do better in math, 00:04:42.646 --> 00:04:44.090 regardless of school. 00:04:44.090 --> 00:04:45.114 - [Student] Right. 00:04:45.114 --> 00:04:47.613 - [Instructor] IV converts the offer effect 00:04:47.613 --> 00:04:50.567 into the effect of KIPP attendance, 00:04:50.573 --> 00:04:53.371 adjusting for the fact that some winners go elsewhere 00:04:53.371 --> 00:04:56.573 and some losers manage to attend KIPP anyway. 00:04:56.950 --> 00:05:00.517 Essentially, IV takes an incomplete randomization 00:05:00.517 --> 00:05:03.007 and makes the appropriate adjustments. 00:05:03.684 --> 00:05:07.107 How? IV describes a chain reaction. 00:05:07.426 --> 00:05:10.343 Why do offers affect achievement? 00:05:10.343 --> 00:05:13.175 Probably because they affect charter attendance 00:05:13.175 --> 00:05:16.643 and charter attendance improves math scores. 00:05:16.643 --> 00:05:20.442 The first link in the chain, called the first stage, 00:05:20.442 --> 00:05:24.341 is the effect of the lottery on charter attendance. 00:05:24.446 --> 00:05:28.361 The second stage is the link between attending a charter 00:05:28.361 --> 00:05:30.153 and an outcome variable, 00:05:30.153 --> 00:05:32.261 in this case, math scores. 00:05:32.940 --> 00:05:36.441 The instrumental variable, or instrument for short, 00:05:36.441 --> 00:05:40.246 is the variable that initiates the chain reaction. 00:05:40.899 --> 00:05:44.833 The effect of the instrument on the outcome is called 00:05:44.833 --> 00:05:46.631 the reduced form. 00:05:48.143 --> 00:05:51.615 This chain reaction can be represented mathematically. 
00:05:51.615 --> 00:05:55.266 We multiply the first stage, the effect of winning 00:05:55.266 --> 00:05:57.866 on attendance, by the second stage, 00:05:57.866 --> 00:06:00.567 the effect of attendance on scores. 00:06:00.630 --> 00:06:02.713 And we get the reduced form, 00:06:02.713 --> 00:06:05.680 the effect of winning the lottery on scores. 00:06:06.780 --> 00:06:11.566 The reduced form and first stage are observable and easy to compute. 00:06:11.752 --> 00:06:14.876 However, the effect of attendance on achievement 00:06:14.876 --> 00:06:16.993 is not directly observed. 00:06:16.993 --> 00:06:20.360 This is the causal effect we're trying to determine. 00:06:21.043 --> 00:06:23.827 Given some important assumptions we'll discuss shortly, 00:06:23.827 --> 00:06:25.977 we can find the effect of KIPP attendance 00:06:25.977 --> 00:06:29.183 by dividing the reduced form by the first stage. 00:06:29.225 --> 00:06:32.774 This will become more clear as we work through an example. 00:06:32.774 --> 00:06:34.207 - [Student] Let's do this. 00:06:37.161 --> 00:06:38.728 - A quick note on measurement. 00:06:38.728 --> 00:06:41.678 We measure achievement using standard deviations, 00:06:41.678 --> 00:06:44.728 often denoted by the Greek letter sigma (σ). 00:06:44.728 --> 00:06:48.862 One σ is a huge move from around the bottom 15% 00:06:48.862 --> 00:06:51.634 to the middle of most achievement distributions. 00:06:51.634 --> 00:06:55.412 Even a ¼ or ½ σ difference is big. 00:06:56.262 --> 00:06:58.389 - [Instructor] Now we're ready to plug some numbers 00:06:58.389 --> 00:07:01.382 into the equation we introduced earlier. 00:07:01.557 --> 00:07:03.231 First up, what's the effect 00:07:03.231 --> 00:07:06.076 of winning the lottery on math scores? 00:07:06.354 --> 00:07:10.437 KIPP applicants' math scores are a third of a standard deviation 00:07:10.504 --> 00:07:14.386 below the state average in the year before they apply to KIPP. 
00:07:14.386 --> 00:07:18.120 But a year later, lottery winners score right at the state average 00:07:18.215 --> 00:07:21.482 while the lottery losers are still well behind 00:07:21.482 --> 00:07:25.499 with an average score around -0.36 σ. 00:07:25.834 --> 00:07:29.619 The effect of winning the lottery on scores is the difference 00:07:29.619 --> 00:07:32.819 between the winners' scores and the losers' scores. 00:07:33.403 --> 00:07:35.784 Take the winners' average math scores, 00:07:35.784 --> 00:07:38.269 subtract the losers' average math scores, 00:07:38.269 --> 00:07:41.502 and you will have 0.36 σ. 00:07:41.908 --> 00:07:46.659 Next up: what's the effect of winning the lottery on attendance? 00:07:46.809 --> 00:07:49.193 In other words, if you win the lottery, 00:07:49.193 --> 00:07:53.293 how much more likely are you to attend KIPP than if you lose? 00:07:53.643 --> 00:07:57.610 First, what percentage of lottery winners attend KIPP? 00:07:57.610 --> 00:08:00.626 Divide the number of winners who attended KIPP 00:08:00.626 --> 00:08:05.361 by the total number of lottery winners -- that's 78%. 00:08:05.810 --> 00:08:09.143 To find the percentage of lottery losers who attended KIPP, 00:08:09.143 --> 00:08:12.293 we divide the number of losers who attended KIPP 00:08:12.293 --> 00:08:16.760 by the total number of lottery losers -- that's 4%. 00:08:17.377 --> 00:08:21.393 Subtract 4 from 78, and we find that winning the lottery 00:08:21.393 --> 00:08:25.512 makes you 74 percentage points more likely to attend KIPP. 00:08:25.946 --> 00:08:28.226 Now we can find what we're really after, 00:08:28.383 --> 00:08:34.551 the effect of attendance on scores, by dividing 0.36 by 0.74. 00:08:34.789 --> 00:08:37.585 Attending KIPP raises math scores 00:08:37.585 --> 00:08:41.518 by 0.48 standard deviations on average. 
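NOTE The division just described can be sketched in a few lines of Python. This is only an illustrative sketch using the rounded figures quoted in the lesson (0 σ and -0.36 σ for average scores, 78% and 4% for attendance shares); the function and variable names are hypothetical and not part of the original study. With these rounded inputs the ratio comes out slightly above the 0.48 σ quoted in the lesson, which presumably rests on unrounded data.

```python
# Ratio form of the IV estimate: reduced form divided by first stage.
# All inputs are the rounded figures quoted in the lesson, not raw data.

def wald_iv_estimate(reduced_form: float, first_stage: float) -> float:
    """Return the IV estimate as reduced form / first stage."""
    if first_stage == 0:
        # With no first stage the instrument does not move attendance,
        # so the ratio is undefined.
        raise ValueError("first stage must be non-zero")
    return reduced_form / first_stage

# Average math scores one year later, in standard deviations (sigma).
winners_mean_score = 0.0     # winners score at the state average
losers_mean_score = -0.36    # losers remain below the state average

# Share of each group that actually attended KIPP.
share_winners_attending = 0.78
share_losers_attending = 0.04

# Reduced form: effect of winning the lottery on scores.
reduced_form = winners_mean_score - losers_mean_score

# First stage: effect of winning the lottery on attendance.
first_stage = share_winners_attending - share_losers_attending

# Effect of attending KIPP on scores.
attendance_effect = wald_iv_estimate(reduced_form, first_stage)

print(f"reduced form:      {reduced_form:.2f} sigma")
print(f"first stage:       {first_stage:.2f}")
print(f"attendance effect: {attendance_effect:.3f} sigma")
```

The guard against a zero first stage reflects the fact that the ratio is only meaningful when winning the lottery really changes attendance.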
00:08:42.269 --> 00:08:44.503 That's an awesome achievement gain, 00:08:44.503 --> 00:08:47.236 equal to moving from about the bottom third 00:08:47.236 --> 00:08:49.955 to the middle of the achievement distribution. 00:08:49.955 --> 00:08:51.238 - [Student] Whoa, half a sig. 00:08:51.238 --> 00:08:53.507 - [Instructor] These estimates are for kids opting in 00:08:53.507 --> 00:08:56.047 to the KIPP lottery, whose enrollment status 00:08:56.047 --> 00:08:57.762 is changed by winning. 00:08:57.985 --> 00:09:00.617 That's not necessarily a random sample 00:09:00.617 --> 00:09:02.283 of all children in Lynn. 00:09:02.536 --> 00:09:05.035 So we can't assume we'd see the same effect 00:09:05.035 --> 00:09:07.327 for other types of students. - [Student] Huh. 00:09:07.327 --> 00:09:10.218 - But this effect on keen-for-KIPP kids 00:09:10.218 --> 00:09:13.367 is likely to be a good indicator of the consequences 00:09:13.367 --> 00:09:15.767 of adding additional charter seats. 00:09:15.767 --> 00:09:17.216 - [Student] Cool. - [Student] Got it. 00:09:19.628 --> 00:09:23.145 - IV eliminates selection bias, but like all of our tools, 00:09:23.145 --> 00:09:25.624 the solution builds on a set of assumptions 00:09:25.624 --> 00:09:27.540 not to be taken for granted. 00:09:28.098 --> 00:09:31.347 First, there must be a substantial first stage -- 00:09:31.347 --> 00:09:35.465 that is, the instrumental variable, winning or losing the lottery, 00:09:35.465 --> 00:09:38.915 must really change the variable whose effect we're interested in -- 00:09:38.915 --> 00:09:41.031 here, KIPP attendance. 00:09:41.298 --> 00:09:44.415 In this case, the first stage is not really in doubt. 00:09:44.415 --> 00:09:47.894 Winning the lottery makes KIPP attendance much more likely. 00:09:48.386 --> 00:09:50.631 Not all IV stories are like that. 
00:09:51.321 --> 00:09:53.698 Second, the instrument must be as good 00:09:53.698 --> 00:09:56.731 as randomly assigned, meaning lottery winners and losers 00:09:56.731 --> 00:09:58.716 have similar characteristics. 00:09:58.893 --> 00:10:01.559 This is the independence assumption. 00:10:01.977 --> 00:10:05.627 Of course, KIPP lottery wins really are randomly assigned. 00:10:05.627 --> 00:10:09.293 Still, we should check for balance and confirm that winners and losers 00:10:09.293 --> 00:10:11.493 have similar family backgrounds, 00:10:11.493 --> 00:10:13.394 similar aptitudes and so on. 00:10:13.543 --> 00:10:16.969 In essence, we're checking to ensure KIPP lotteries are fair 00:10:16.969 --> 00:10:20.017 with no group of applicants suspiciously likely to win. 00:10:21.373 --> 00:10:24.373 Finally, we require the instrument change outcomes 00:10:24.373 --> 00:10:26.252 solely through the variable of interest, 00:10:26.252 --> 00:10:28.100 in this case, attending KIPP. 00:10:28.299 --> 00:10:31.367 This assumption is called the exclusion restriction. 00:10:32.951 --> 00:10:37.500 - IV only works if you can satisfy these three assumptions. 00:10:37.783 --> 00:10:40.418 - I don't understand the exclusion restriction. 00:10:40.917 --> 00:10:43.599 How could winning the lottery affect math scores 00:10:43.599 --> 00:10:45.244 other than by attending KIPP? 00:10:45.244 --> 00:10:47.230 - [Student] Yeah. - [Instructor] Great question. 00:10:47.230 --> 00:10:50.536 Suppose lottery winners are just thrilled to win, 00:10:50.536 --> 00:10:55.045 and this happiness motivates them to study more and learn more math, 00:10:55.045 --> 00:10:57.144 regardless of where they go to school. 00:10:57.231 --> 00:10:59.901 This would violate the exclusion restriction 00:10:59.901 --> 00:11:03.787 because the motivational effect of winning is a second channel 00:11:03.787 --> 00:11:06.569 whereby lotteries might affect test scores. 
00:11:06.865 --> 00:11:09.546 While it's hard to rule this out entirely, 00:11:09.546 --> 00:11:12.650 there's no evidence of any alternative channels 00:11:12.650 --> 00:11:14.499 in the KIPP study. 00:11:17.817 --> 00:11:20.700 - IV solves the problem of selection bias 00:11:20.700 --> 00:11:24.850 in scenarios like the KIPP lottery where treatment offers are random 00:11:24.850 --> 00:11:27.083 but some of those offered opt out. 00:11:28.451 --> 00:11:31.700 This sort of intentional yet incomplete random assignment 00:11:31.700 --> 00:11:33.367 is surprisingly common. 00:11:33.367 --> 00:11:36.318 Even randomized clinical trials have this feature. 00:11:37.134 --> 00:11:40.053 IV solves the problem of non-random take-up 00:11:40.053 --> 00:11:42.534 in lotteries or clinical research. 00:11:43.054 --> 00:11:46.725 But lotteries are not the only source of compelling instruments. 00:11:46.915 --> 00:11:50.397 Many causal questions can be addressed by naturally occurring, 00:11:50.397 --> 00:11:53.831 as-good-as randomly assigned variation. 00:11:54.731 --> 00:11:56.915 Here's a causal question for you -- 00:11:56.915 --> 00:11:59.955 do women who have children early in their careers suffer 00:11:59.955 --> 00:12:02.648 a substantial earnings penalty as a result? 00:12:02.648 --> 00:12:04.970 After all, women earn less than men. 00:12:05.573 --> 00:12:08.506 We could, of course, simply compare the earnings of women 00:12:08.506 --> 00:12:10.891 with more and fewer children. 00:12:10.891 --> 00:12:14.190 But such comparisons are fraught with selection bias. 00:12:14.806 --> 00:12:19.089 If only we could randomly assign babies to different households. 00:12:19.089 --> 00:12:22.131 Yeah, right, sounds pretty fanciful. 00:12:22.470 --> 00:12:26.601 Our next IV story -- fantastic and not fanciful -- 00:12:26.601 --> 00:12:30.234 illustrates an amazing, naturally occurring instrument 00:12:30.234 --> 00:12:31.918 for family size. 
00:12:33.317 --> 00:12:34.317 ♪ [music] ♪ 00:12:34.551 --> 00:12:37.985 - [Instructor] You're on your way to mastering Econometrics. 00:12:38.153 --> 00:12:40.170 Make sure this video sticks 00:12:40.170 --> 00:12:42.636 by taking a few quick practice questions. 00:12:42.886 --> 00:12:46.336 Or, if you're ready, click for the next video. 00:12:46.529 --> 00:12:50.278 You can also check out MRU's website for more courses, 00:12:50.278 --> 00:12:52.027 teacher resources, and more. 00:12:52.289 --> 00:12:53.772 ♪ [music] ♪