1 00:00:00,299 --> 00:00:04,644 - [Instructor] The path from cause to effect is dark and dangerous. 2 00:00:05,041 --> 00:00:08,015 But the weapons of Econometrics are strong. 3 00:00:08,480 --> 00:00:11,734 Attack with fierce and flexible instrumental variables 4 00:00:11,734 --> 00:00:15,803 when nature blesses you with fortuitous random assignment. 5 00:00:19,393 --> 00:00:21,094 [gong rings] 6 00:00:23,653 --> 00:00:28,704 Randomized trials are the surest path to ceteris paribus comparisons. 7 00:00:28,704 --> 00:00:32,640 Alas, this powerful tool is often unavailable. 8 00:00:33,224 --> 00:00:36,940 But sometimes, randomization happens by accident. 9 00:00:36,940 --> 00:00:40,592 That's when we turn to instrumental variables -- 10 00:00:40,592 --> 00:00:41,938 IV for short. 11 00:00:41,938 --> 00:00:44,508 - [Voice whispers] Instrumental variables. 12 00:00:44,508 --> 00:00:48,186 - [Instructor] Today's lesson is the first of two on IV. 13 00:00:48,958 --> 00:00:52,951 Our first IV lesson begins with a story of schools. 14 00:00:52,951 --> 00:00:54,348 [school bell rings] 15 00:00:54,348 --> 00:00:56,138 - [Josh] Charter schools are public schools 16 00:00:56,138 --> 00:01:00,112 freed from daily district oversight and teacher union contracts. 17 00:01:00,895 --> 00:01:03,511 The question of whether charters boost achievement 18 00:01:03,511 --> 00:01:05,161 is one of the most important 19 00:01:05,161 --> 00:01:07,761 in the history of American education reform. 20 00:01:08,145 --> 00:01:12,562 - The most popular charter schools have more applicants than seats 21 00:01:12,562 --> 00:01:16,462 so the luck of a lottery draw decides who's offered a seat. 
22 00:01:16,870 --> 00:01:20,503 A lot is at stake for the students vying for their chance, 23 00:01:20,503 --> 00:01:25,003 and waiting for the lottery results brings up lots of emotions 24 00:01:25,003 --> 00:01:27,832 as was captured in the award-winning documentary 25 00:01:27,832 --> 00:01:29,699 "Waiting For Superman." 26 00:01:30,258 --> 00:01:32,916 - [Mother] Don't cry. You're gonna make Mommy cry. Okay? 27 00:01:37,498 --> 00:01:40,618 - Do charters really provide a better education? 28 00:01:40,948 --> 00:01:43,183 Critics most definitely say no, 29 00:01:43,413 --> 00:01:46,479 arguing that charters enroll better students to begin with, 30 00:01:46,479 --> 00:01:50,164 smarter or more motivated, so differences in later outcomes 31 00:01:50,164 --> 00:01:52,061 reflect selection bias. 32 00:01:52,595 --> 00:01:54,729 - [Kamal] Wait, this one seems easy. 33 00:01:55,139 --> 00:01:57,444 In a lottery, winners are chosen randomly, 34 00:01:57,498 --> 00:02:00,083 so just compare winners and losers. - [Student] Obviously. 35 00:02:00,083 --> 00:02:01,698 - On the right track, Kamal, 36 00:02:01,698 --> 00:02:04,747 but charter lotteries don't force kids into 37 00:02:04,747 --> 00:02:07,560 or out of a particular school. 38 00:02:07,749 --> 00:02:10,667 They randomize offers of a charter seat. 39 00:02:11,650 --> 00:02:13,449 Some kids get lucky. 40 00:02:13,449 --> 00:02:14,966 Some kids don't. 41 00:02:14,966 --> 00:02:19,118 If we just wanted to know the effect of charter school offers, 42 00:02:19,118 --> 00:02:22,417 we could treat this as a randomized trial. 43 00:02:22,717 --> 00:02:24,684 But we're interested in the effects 44 00:02:24,684 --> 00:02:28,283 of charter school attendance, not offers. 45 00:02:28,568 --> 00:02:31,917 And not everyone who is offered, accepts. 46 00:02:31,917 --> 00:02:37,234 IV turns the effect of being offered a charter seat into the effect 47 00:02:37,234 --> 00:02:40,367 of actually attending a charter school. 
48 00:02:40,367 --> 00:02:42,344 - [Student] Cool. - Oh nice. 49 00:02:45,925 --> 00:02:48,871 - Let's look at an example, a charter school from 50 00:02:48,871 --> 00:02:52,353 the Knowledge Is Power Program, or KIPP for short. 51 00:02:52,736 --> 00:02:54,937 This KIPP school is in Lynn, 52 00:02:54,937 --> 00:02:58,837 a faded industrial town on the coast of Massachusetts. 53 00:02:59,104 --> 00:03:01,886 The school has more applicants than seats 54 00:03:01,886 --> 00:03:05,620 and therefore picks its students using a lottery. 55 00:03:05,834 --> 00:03:11,854 From 2005 to 2008, 371 fourth and fifth graders 56 00:03:11,854 --> 00:03:15,320 put their names in the KIPP Lynn lottery, 57 00:03:15,382 --> 00:03:18,754 253 students won a seat at KIPP, 58 00:03:18,754 --> 00:03:21,651 118 students lost. 59 00:03:21,967 --> 00:03:26,001 A year later, lottery winners had much higher math scores 60 00:03:26,001 --> 00:03:27,719 than lottery losers. 61 00:03:27,802 --> 00:03:30,370 But remember, we're not trying to figure out 62 00:03:30,370 --> 00:03:33,803 whether winning a lottery makes you better at math. 63 00:03:34,070 --> 00:03:38,471 We want to know if attending KIPP makes you better at math. 64 00:03:38,788 --> 00:03:45,671 Of the 253 lottery winners, only 199 actually went to KIPP. 65 00:03:46,139 --> 00:03:48,804 The others chose a traditional public school. 66 00:03:49,563 --> 00:03:55,370 Similarly of the 118 lottery losers, a few actually ended up at KIPP. 67 00:03:55,509 --> 00:03:57,452 They got an offer later. 68 00:03:57,452 --> 00:04:02,377 So what was the effect on test scores of actually attending KIPP? 69 00:04:03,109 --> 00:04:05,426 - [Kamal] Why can't we just measure their math scores? 70 00:04:05,426 --> 00:04:07,096 - [Instructor] Great question. 71 00:04:07,096 --> 00:04:09,302 Who would you compare them to? 72 00:04:09,302 --> 00:04:11,111 - [Kamal] Those who didn't attend. 73 00:04:11,111 --> 00:04:12,944 - [Instructor] Is attendance random? 
74 00:04:13,937 --> 00:04:15,057 - [Camilla] No. 75 00:04:15,057 --> 00:04:16,177 - Selection bias. 76 00:04:16,177 --> 00:04:17,909 - [Instructor] Correct. - [Otto] What? 77 00:04:17,909 --> 00:04:21,826 - [Instructor] The KIPP offers are random so we can be confident 78 00:04:21,826 --> 00:04:26,409 of ceteris paribus, but attendance is not random. 79 00:04:26,635 --> 00:04:30,601 The choice to accept the offer might be due to characteristics 80 00:04:30,601 --> 00:04:32,984 that are related to math performance -- 81 00:04:33,251 --> 00:04:36,157 say, for example, that dedicated parents 82 00:04:36,157 --> 00:04:38,941 are more likely to accept the offer. 83 00:04:38,941 --> 00:04:42,646 Their kids are also more likely to do better in math, 84 00:04:42,646 --> 00:04:44,090 regardless of school. 85 00:04:44,090 --> 00:04:45,114 - [Student] Right. 86 00:04:45,114 --> 00:04:47,613 - [Instructor] IV converts the offer effect 87 00:04:47,613 --> 00:04:50,567 into the effect of KIPP attendance, 88 00:04:50,573 --> 00:04:53,371 adjusting for the fact that some winners go elsewhere 89 00:04:53,371 --> 00:04:56,573 and some losers manage to attend KIPP anyway. 90 00:04:56,950 --> 00:05:00,517 Essentially, IV takes an incomplete randomization 91 00:05:00,517 --> 00:05:03,007 and makes the appropriate adjustments. 92 00:05:03,684 --> 00:05:07,107 How? IV describes a chain reaction. 93 00:05:07,426 --> 00:05:10,343 Why do offers affect achievement? 94 00:05:10,343 --> 00:05:13,175 Probably because they affect charter attendance 95 00:05:13,175 --> 00:05:16,643 and charter attendance improves math scores. 96 00:05:16,643 --> 00:05:20,442 The first link in the chain, called the first stage, 97 00:05:20,442 --> 00:05:24,341 is the effect of the lottery on charter attendance. 
98 00:05:24,446 --> 00:05:28,361 The second stage is the link between attending a charter 99 00:05:28,361 --> 00:05:30,153 and an outcome variable, 100 00:05:30,153 --> 00:05:32,261 in this case, math scores. 101 00:05:32,940 --> 00:05:36,441 The instrumental variable, or instrument for short, 102 00:05:36,441 --> 00:05:40,246 is the variable that initiates the chain reaction. 103 00:05:40,899 --> 00:05:44,833 The effect of the instrument on the outcome is called 104 00:05:44,833 --> 00:05:46,631 the reduced form. 105 00:05:48,143 --> 00:05:51,615 This chain reaction can be represented mathematically. 106 00:05:51,615 --> 00:05:55,266 We multiply the first stage, the effect of winning 107 00:05:55,266 --> 00:05:57,866 on attendance, by the second stage, 108 00:05:57,866 --> 00:06:00,567 the effect of attendance on scores. 109 00:06:00,630 --> 00:06:02,713 And we get the reduced form, 110 00:06:02,713 --> 00:06:05,680 the effect of winning the lottery on scores. 111 00:06:06,780 --> 00:06:11,566 The reduced form and first stage are observable and easy to compute. 112 00:06:11,752 --> 00:06:14,876 However, the effect of attendance on achievement 113 00:06:14,876 --> 00:06:16,993 is not directly observed. 114 00:06:16,993 --> 00:06:20,360 This is the causal effect we're trying to determine. 115 00:06:21,043 --> 00:06:23,827 Given some important assumptions we'll discuss shortly, 116 00:06:23,827 --> 00:06:25,977 we can find the effect of KIPP attendance 117 00:06:25,977 --> 00:06:29,183 by dividing the reduced form by the first stage. 118 00:06:29,225 --> 00:06:32,774 This will become more clear as we work through an example. 119 00:06:32,774 --> 00:06:34,207 - [Student] Let's do this. 120 00:06:37,161 --> 00:06:38,728 - A quick note on measurement. 121 00:06:38,728 --> 00:06:41,678 We measure achievement using standard deviations, 122 00:06:41,678 --> 00:06:44,728 often denoted by the Greek letter sigma (σ). 
123 00:06:44,728 --> 00:06:48,862 One σ is a huge move from around the bottom 15% 124 00:06:48,862 --> 00:06:51,634 to the middle of most achievement distributions. 125 00:06:51,634 --> 00:06:55,412 Even a ¼ or ½ σ difference is big. 126 00:06:56,262 --> 00:06:58,389 - [Instructor] Now we're ready to plug some numbers 127 00:06:58,389 --> 00:07:01,382 into the equation we introduced earlier. 128 00:07:01,557 --> 00:07:03,231 First up, what's the effect 129 00:07:03,231 --> 00:07:06,076 of winning the lottery on math scores? 130 00:07:06,354 --> 00:07:10,437 KIPP applicants' math scores are a third of a standard deviation 131 00:07:10,504 --> 00:07:14,386 below the state average in the year before they apply to KIPP. 132 00:07:14,386 --> 00:07:18,120 But a year later, lottery winners score right at the state average 133 00:07:18,215 --> 00:07:21,482 while the lottery losers are still well behind 134 00:07:21,482 --> 00:07:25,499 with an average score around -0.36 σ. 135 00:07:25,834 --> 00:07:29,619 The effect of winning the lottery on scores is the difference 136 00:07:29,619 --> 00:07:32,819 between the winners' scores and the losers' scores. 137 00:07:33,403 --> 00:07:35,784 Take the winners' average math scores, 138 00:07:35,784 --> 00:07:38,269 subtract the losers' average math scores, 139 00:07:38,269 --> 00:07:41,502 and you will have 0.36 σ. 140 00:07:41,908 --> 00:07:46,659 Next up: what's the effect of winning the lottery on attendance? 141 00:07:46,809 --> 00:07:49,193 In other words, if you win the lottery, 142 00:07:49,193 --> 00:07:53,293 how much more likely are you to attend KIPP than if you lose? 143 00:07:53,643 --> 00:07:57,610 First, what percentage of lottery winners attend KIPP? 144 00:07:57,610 --> 00:08:00,626 Divide the number of winners who attended KIPP 145 00:08:00,626 --> 00:08:05,361 by the total number of lottery winners -- that's 78%. 
146 00:08:05,810 --> 00:08:09,143 To find the percentage of lottery losers who attended KIPP, 147 00:08:09,143 --> 00:08:12,293 we divide the number of losers who attended KIPP 148 00:08:12,293 --> 00:08:16,760 by the total number of lottery losers -- that's 4%. 149 00:08:17,377 --> 00:08:21,393 Subtract 4 from 78, and we find that winning the lottery 150 00:08:21,393 --> 00:08:25,512 makes you 74 percentage points more likely to attend KIPP. 151 00:08:25,946 --> 00:08:28,226 Now we can find what we're really after, 152 00:08:28,383 --> 00:08:34,551 the effect of attendance on scores, by dividing 0.36 by 0.74. 153 00:08:34,789 --> 00:08:37,585 Attending KIPP raises math scores 154 00:08:37,585 --> 00:08:41,518 by 0.48 standard deviations on average. 155 00:08:42,269 --> 00:08:44,503 That's an awesome achievement gain, 156 00:08:44,503 --> 00:08:47,236 equal to moving from about the bottom third 157 00:08:47,236 --> 00:08:49,955 to the middle of the achievement distribution. 158 00:08:49,955 --> 00:08:51,238 - [Student] Whoa, half a sig. 159 00:08:51,238 --> 00:08:53,507 - [Instructor] These estimates are for kids opting in 160 00:08:53,507 --> 00:08:56,047 to the KIPP lottery, whose enrollment status 161 00:08:56,047 --> 00:08:57,762 is changed by winning. 162 00:08:57,985 --> 00:09:00,617 That's not necessarily a random sample 163 00:09:00,617 --> 00:09:02,283 of all children in Lynn. 164 00:09:02,536 --> 00:09:05,035 So we can't assume we'd see the same effect 165 00:09:05,035 --> 00:09:07,327 for other types of students. - [Student] Huh. 166 00:09:07,327 --> 00:09:10,218 - But this effect on keen-for-KIPP kids 167 00:09:10,218 --> 00:09:13,367 is likely to be a good indicator of the consequences 168 00:09:13,367 --> 00:09:15,767 of adding additional charter seats. 169 00:09:15,767 --> 00:09:17,216 - [Student] Cool. - [Student] Got it. 
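[Editor's sketch] The arithmetic walked through above can be reproduced in a few lines of Python. This is only an illustration using the figures quoted in the lesson; the function name `wald_iv_estimate` and the variable names are hypothetical, and the losers' attendance rate is taken as the rounded 4% because the lesson does not give the underlying count.

```python
# Figures quoted in the lesson for the KIPP Lynn lottery (2005 to 2008).
WINNERS = 253                  # applicants offered a seat
WINNERS_ATTENDING = 199        # winners who actually enrolled at KIPP
LOSER_ATTENDANCE_RATE = 0.04   # rounded rate from the lesson (count not given)
WINNER_MEAN_SCORE = 0.0        # "right at the state average", in sigma units
LOSER_MEAN_SCORE = -0.36       # in sigma units


def wald_iv_estimate(reduced_form: float, first_stage: float) -> float:
    """Divide the reduced form by the first stage (hypothetical helper)."""
    if first_stage == 0:
        raise ValueError("first stage is zero: the instrument does not move attendance")
    return reduced_form / first_stage


# First stage: effect of winning the lottery on the attendance rate.
winner_attendance_rate = WINNERS_ATTENDING / WINNERS
first_stage = winner_attendance_rate - LOSER_ATTENDANCE_RATE

# Reduced form: effect of winning the lottery on math scores.
reduced_form = WINNER_MEAN_SCORE - LOSER_MEAN_SCORE

# IV estimate: effect of attending KIPP on math scores, in sigma units.
iv_estimate = wald_iv_estimate(reduced_form, first_stage)

print(f"first stage  = {first_stage:.3f}")
print(f"reduced form = {reduced_form:.2f} sigma")
print(f"IV estimate  = {iv_estimate:.2f} sigma")
```

Using the unrounded winners' attendance rate (199 of 253) gives an estimate that rounds to the 0.48 σ quoted in the lesson; dividing the rounded figures 0.36 by 0.74 gives a slightly larger number, so small differences here are a rounding matter.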
170 00:09:19,628 --> 00:09:23,145 - IV eliminates selection bias, but like all of our tools, 171 00:09:23,145 --> 00:09:25,624 the solution builds on a set of assumptions 172 00:09:25,624 --> 00:09:27,540 not to be taken for granted. 173 00:09:28,098 --> 00:09:31,347 First, there must be a substantial first stage -- 174 00:09:31,347 --> 00:09:35,465 that is, the instrumental variable, winning or losing the lottery, 175 00:09:35,465 --> 00:09:38,915 must really change the variable whose effect we're interested in -- 176 00:09:38,915 --> 00:09:41,031 here, KIPP attendance. 177 00:09:41,298 --> 00:09:44,415 In this case, the first stage is not really in doubt. 178 00:09:44,415 --> 00:09:47,894 Winning the lottery makes KIPP attendance much more likely. 179 00:09:48,386 --> 00:09:50,631 Not all IV stories are like that. 180 00:09:51,321 --> 00:09:53,698 Second, the instrument must be as good 181 00:09:53,698 --> 00:09:56,731 as randomly assigned, meaning lottery winners and losers 182 00:09:56,731 --> 00:09:58,716 have similar characteristics. 183 00:09:58,893 --> 00:10:01,559 This is the independence assumption. 184 00:10:01,977 --> 00:10:05,627 Of course, KIPP lottery wins really are randomly assigned. 185 00:10:05,627 --> 00:10:09,293 Still, we should check for balance and confirm that winners and losers 186 00:10:09,293 --> 00:10:11,493 have similar family backgrounds, 187 00:10:11,493 --> 00:10:13,394 similar aptitudes and so on. 188 00:10:13,543 --> 00:10:16,969 In essence, we're checking to ensure KIPP lotteries are fair 189 00:10:16,969 --> 00:10:20,017 with no group of applicants suspiciously likely to win. 190 00:10:21,373 --> 00:10:24,373 Finally, we require the instrument change outcomes 191 00:10:24,373 --> 00:10:26,252 solely through the variable of interest, 192 00:10:26,252 --> 00:10:28,100 in this case, attending KIPP. 193 00:10:28,299 --> 00:10:31,367 This assumption is called the exclusion restriction. 
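[Editor's sketch] A small simulation can make the three assumptions more concrete. Everything below is synthetic and hypothetical: the take-up probabilities, the size of the "dedicated family" effect, the true attendance effect of 0.5 σ, and the names `simulate_lottery` and `naive_and_iv` are illustrative assumptions, not figures or methods from the KIPP study. Offers are drawn at random (independence), offers strongly raise attendance (first stage), and by default offers affect scores only through attendance (exclusion restriction).

```python
import random


def simulate_lottery(n=50_000, true_effect=0.5, direct_offer_effect=0.0, seed=0):
    """Generate synthetic (offer, attend, score) rows; all parameters are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        dedicated = rng.random() < 0.5    # unobserved family trait
        offer = rng.random() < 0.5        # lottery offer, assigned at random
        accept_prob = 0.95 if dedicated else 0.40
        attend = offer and rng.random() < accept_prob   # non-random take-up
        score = (1.0 * dedicated                 # dedication helps in any school
                 + true_effect * attend          # causal effect of attending
                 + direct_offer_effect * offer   # nonzero breaks the exclusion restriction
                 + rng.gauss(0.0, 1.0))          # everything else
        rows.append((offer, attend, score))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_and_iv(rows):
    """Return (attenders minus non-attenders, reduced form divided by first stage)."""
    naive = (mean(s for _, a, s in rows if a)
             - mean(s for _, a, s in rows if not a))
    reduced_form = (mean(s for o, _, s in rows if o)
                    - mean(s for o, _, s in rows if not o))
    first_stage = (mean(a for o, a, _ in rows if o)
                   - mean(a for o, a, _ in rows if not o))
    return naive, reduced_form / first_stage


naive, iv = naive_and_iv(simulate_lottery())
_, iv_violated = naive_and_iv(simulate_lottery(direct_offer_effect=0.2))
print(f"naive comparison:               {naive:.2f}")
print(f"IV, exclusion restriction holds: {iv:.2f}")
print(f"IV, exclusion restriction fails: {iv_violated:.2f}")
```

In this made-up setup the naive comparison should land well above the true effect because dedicated families are over-represented among attenders, while the IV ratio should sit close to 0.5. Setting `direct_offer_effect` above zero gives offers a second channel to scores, and the IV ratio then drifts away from the true effect.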
194 00:10:32,951 --> 00:10:37,500 - IV only works if you can satisfy these three assumptions. 195 00:10:37,783 --> 00:10:40,418 - I don't understand the exclusion restriction. 196 00:10:40,917 --> 00:10:43,599 How could winning the lottery affect math scores 197 00:10:43,599 --> 00:10:45,244 other than by attending KIPP? 198 00:10:45,244 --> 00:10:47,230 - [Student] Yeah. - [Instructor] Great question. 199 00:10:47,230 --> 00:10:50,536 Suppose lottery winners are just thrilled to win, 200 00:10:50,536 --> 00:10:55,045 and this happiness motivates them to study more and learn more math, 201 00:10:55,045 --> 00:10:57,144 regardless of where they go to school. 202 00:10:57,231 --> 00:10:59,901 This would violate the exclusion restriction 203 00:10:59,901 --> 00:11:03,787 because the motivational effect of winning is a second channel 204 00:11:03,787 --> 00:11:06,569 whereby lotteries might affect test scores. 205 00:11:06,865 --> 00:11:09,546 While it's hard to rule this out entirely, 206 00:11:09,546 --> 00:11:12,650 there's no evidence of any alternative channels 207 00:11:12,650 --> 00:11:14,499 in the KIPP study. 208 00:11:17,817 --> 00:11:20,700 - IV solves the problem of selection bias 209 00:11:20,700 --> 00:11:24,850 in scenarios like the KIPP lottery where treatment offers are random 210 00:11:24,850 --> 00:11:27,083 but some of those offered opt out. 211 00:11:28,451 --> 00:11:31,700 This sort of intentional yet incomplete random assignment 212 00:11:31,700 --> 00:11:33,367 is surprisingly common. 213 00:11:33,367 --> 00:11:36,318 Even randomized clinical trials have this feature. 214 00:11:37,134 --> 00:11:40,053 IV solves the problem of non-random take up 215 00:11:40,053 --> 00:11:42,534 in lotteries or clinical research. 216 00:11:43,054 --> 00:11:46,725 But lotteries are not the only source of compelling instruments. 
217 00:11:46,915 --> 00:11:50,397 Many causal questions can be addressed by naturally occurring, 218 00:11:50,397 --> 00:11:53,831 as-good-as-randomly-assigned variation. 219 00:11:54,731 --> 00:11:56,915 Here's a causal question for you -- 220 00:11:56,915 --> 00:11:59,955 do women who have children early in their careers suffer 221 00:11:59,955 --> 00:12:02,648 a substantial earnings penalty as a result? 222 00:12:02,648 --> 00:12:04,970 After all, women earn less than men. 223 00:12:05,573 --> 00:12:08,506 We could, of course, simply compare the earnings of women 224 00:12:08,506 --> 00:12:10,891 with more and fewer children. 225 00:12:10,891 --> 00:12:14,190 But such comparisons are fraught with selection bias. 226 00:12:14,806 --> 00:12:19,089 If only we could randomly assign babies to different households. 227 00:12:19,089 --> 00:12:22,131 Yeah, right, sounds pretty fanciful. 228 00:12:22,470 --> 00:12:26,601 Our next IV story -- fantastic and not fanciful -- 229 00:12:26,601 --> 00:12:30,234 illustrates an amazing, naturally occurring instrument 230 00:12:30,234 --> 00:12:31,918 for family size. 231 00:12:33,317 --> 00:12:34,317 ♪ [music] ♪ 232 00:12:34,551 --> 00:12:37,985 - [Instructor] You're on your way to mastering Econometrics. 233 00:12:38,153 --> 00:12:40,170 Make sure this video sticks 234 00:12:40,170 --> 00:12:42,636 by taking a few quick practice questions. 235 00:12:42,886 --> 00:12:46,336 Or, if you're ready, click for the next video. 236 00:12:46,529 --> 00:12:50,278 You can also check out MRU's website for more courses, 237 00:12:50,278 --> 00:12:52,027 teacher resources, and more. 238 00:12:52,289 --> 00:12:53,772 ♪ [music] ♪