♪ [music] ♪

Narrator: Welcome to Nobel Conversations. In this episode, Josh Angrist and Guido Imbens sit down with Isaiah Andrews to discuss, and disagree, over the role of machine learning in applied econometrics.

Isaiah: So, of course, there are a lot of topics where you guys largely agree, but I'd like to turn to one where maybe you have some differences of opinion. I'd love to hear some of your thoughts about machine learning and the role that it's playing, and is going to play, in economics.

Guido: I've looked at some data like this. It's proprietary, so there's no published paper there, but there was an experiment that was done on some search algorithm. The question was about ranking things and changing the ranking, and it was clear that there was going to be a lot of heterogeneity there. If you look for, say, a picture of Britney Spears, it doesn't really matter where you rank it, whether you put it in the first or second or third position of the ranking, because you're going to figure out what you're looking for. But if you're looking for the best econometrics book, then whether you put your book first or your book tenth is going to make a big difference to how often people are going to click on it.

Josh: And so, there you go. Why do I need machine learning to discover that? It seems like I could discover it simply.

Guido: In general, you want to think about there being lots of characteristics of the items, and you want to understand what drives the heterogeneity in the effect of the ranking.

Josh: But in some sense you're solving a marketing problem there. The effect is causal, but it has no scientific content.

Guido: Think about similar things in medical settings. If you do an experiment, you may actually be very interested in whether the treatment works for some groups or not. You have a lot of individual characteristics, and you want to search over them systematically.
Josh: Yeah, I'm skeptical about that, about the idea that there's this personal causal effect that I should care about, and that machine learning can discover it in some way that's useful. Think about an example: I've done a lot of work on schools, say on going to a charter school, a publicly funded private school, effectively, that's free to structure its own curriculum, for context there. Some types of charter schools generate spectacular achievement gains, and in the data set that produces that result I have a lot of covariates: baseline scores, family background, the education of the parents, the sex of the child, the race of the child. As soon as I put half a dozen of those together, I have a very high-dimensional space. I'm definitely interested in coarse features of that treatment effect, like whether it's better for people who come from lower-income families. But I have a hard time believing that there's an application for the very high-dimensional version of that, where I discover that the school works for non-white children who have high family incomes but baseline scores in the third quartile, who only went to public school in the third grade but not the sixth grade. That's what that high-dimensional analysis produces: this very elaborate conditional statement. There are two things wrong with that, in my view. First, I just can't imagine why it's actionable; I don't know why you'd want to act on it. And second, I know that there's some alternative model that fits almost as well but flips everything, right? Because machine learning doesn't tell me that this is really the predictor that matters. It just tells me that this is a good predictor.

Guido: I think there is something different about the social science context here. The social science applications you're talking about are ones where I think there's not a huge amount of heterogeneity in the effects.

Josh: Well, there might be, if you allow me to fill that space.

Guido: No, not even then. For a lot of those interventions, you would expect that the effect is the same sign for everybody.
Guido: There may be small differences in the magnitude, but a lot of these education interventions are good for everybody. It's not that they're bad for some people and good for others, with maybe some very small pockets where they're bad. There may be some variation in the magnitude, but you would need very, very big data sets to find it, and in those cases it probably wouldn't be very actionable anyway. But I think there are a lot of other settings where there is much more heterogeneity.

Josh: Well, I'm open to that possibility, but the example you gave is essentially a marketing example.

Guido: It may have implications for the organization of the market, for whether you actually need to worry about market power.

Josh: Well, I'd want to see that paper.

Isaiah: The sense I'm getting is that you still disagree on something; you haven't converged on everything.

Josh: I'm getting that sense too. Actually, we've diverged on this, because this wasn't around to argue about before.

Guido: Is it getting a little warm here?

Isaiah: Yeah, warmed up. Warmed up is good. The sense I'm getting, Josh, is that you're not saying you're confident that there's no application where this stuff is useful. You're saying you're unconvinced by the existing applications to date. Is that fair?

Josh: Of that, I'm very confident.
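To make the disagreement concrete, here is a minimal simulated sketch of the kind of subgroup search Josh is describing. Everything in it is invented (the data, the covariate names, the tuning choices); it is not his charter-school study. The true treatment effect is 0.3 for everyone, yet a flexible learner still emits exactly the sort of elaborate conditional statement he warns about.

```python
# Hedged illustration: ML-style search for effect heterogeneity when none exists.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 6))                        # six baseline covariates
d = rng.integers(0, 2, size=n)                     # randomized "charter offer"
y = 0.5 * X[:, 0] + 0.3 * d + rng.normal(size=n)   # the effect is 0.3 for EVERYONE

# T-learner style "discovery": fit an outcome model per arm, then difference them.
m1 = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X[d == 1], y[d == 1])
m0 = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X[d == 0], y[d == 0])
cate_hat = m1.predict(X) - m0.predict(X)

# Summarize the "found" heterogeneity as a rule: an elaborate conditional
# statement appears even though the truth is one constant number.
names = ["baseline_score", "family_income", "parent_educ",    # invented labels
         "female", "nonwhite", "grade3_public"]
rule = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, cate_hat)
print(export_text(rule, feature_names=names))
print("estimated effect range:", cate_hat.min().round(2), "to", cate_hat.max().round(2))
```

Rerunning with a different seed reshuffles the splits, which is Josh's point about an alternative model that fits almost as well but says something different.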
Guido: In this case, I think Josh does have a point. Even in the prediction cases, where a lot of the machine learning methods really shine is where there's just a lot of heterogeneity and you don't really care much about the details. There isn't a policy angle; it's things like recognizing handwritten digits, and machine learning does much better there than building some complicated model. But in a lot of the social science applications, a lot of the economic applications, we actually know a huge amount about the relationships between various variables, and a lot of those relationships are strictly monotone. Education is going to increase people's earnings, irrespective of the demographic, irrespective of the level of education you already have.

Josh: Until they get to a PhD.

Guido: Yeah, there is graduate school. But over a reasonable range, it's not going to go down very much. Whereas in a lot of the settings where these machine learning methods shine, there's a lot of non-monotonicity, a kind of multi-modality, in the relationships, and there they're going to be very powerful. So I still stand by this: these methods have a huge amount to offer for economists, and they're going to be a big part of the future.

Isaiah: It feels like there's something interesting to be said about machine learning here. Could you give some more examples of the sorts of applications you're thinking about?

Guido: At the moment, the areas where, instead of looking for average causal effects, we're looking for individualized estimates and predictions of causal effects. There the machine learning algorithms have been very effective. Previously we would have done these things using kernel methods, and theoretically they work great; there are even arguments that formally you can't do any better. But in practice they don't work very well. Random causal forest type things, the stuff Stefan Wager and Susan Athey have been working on, are used very widely and have been very effective in these settings at actually getting causal effects that vary by covariates. I think this is still just the beginning of these methods, but in many cases the algorithms are very effective at searching over big spaces and finding the functions that fit very well, in ways that we couldn't really do beforehand.
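As a sketch of what Guido describes, here is a T-learner built from off-the-shelf random forests on invented data. It is a simpler cousin of the Wager-Athey causal forest, which adds honest sample splitting and supports valid confidence intervals. When the effect genuinely varies with a covariate, the forest-based fit recovers the pattern without the analyst specifying a functional form.

```python
# Hedged sketch: individualized treatment effects via a forest-based T-learner.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n, k = 10000, 10
X = rng.uniform(-1, 1, size=(n, k))
d = rng.integers(0, 2, size=n)                  # randomized treatment
tau = np.where(X[:, 0] > 0, 1.0, 0.2)           # genuine heterogeneity in the effect
y = X[:, 1] + tau * d + rng.normal(size=n)

m1 = RandomForestRegressor(min_samples_leaf=20, random_state=0).fit(X[d == 1], y[d == 1])
m0 = RandomForestRegressor(min_samples_leaf=20, random_state=0).fit(X[d == 0], y[d == 0])
cate_hat = m1.predict(X) - m0.predict(X)

print("corr(estimated, true effect):", np.corrcoef(cate_hat, tau)[0, 1].round(2))
print("estimated effect, X0 > 0 vs X0 <= 0:",
      cate_hat[X[:, 0] > 0].mean().round(2), "vs", cate_hat[X[:, 0] <= 0].mean().round(2))
```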
Josh: I don't know of an example where machine learning has generated insights about a causal effect that I'm interested in, and I do know of examples where it's potentially very misleading. I've done some work with Brigham Frandsen using, for example, random forests to model covariate effects in an instrumental variables problem, where you need to condition on covariates and you don't particularly have strong feelings about the functional form for that. So maybe you should be open to flexible curve fitting. That leads you down a path where there are a lot of nonlinearities in the model, and that's very dangerous with IV, because any sort of excluded nonlinearity potentially generates a spurious causal effect. Brigham and I showed that very powerfully, I think, in the case of two instruments that come from a paper of mine with Bill Evans: if you replace the first stage of a traditional two-stage least squares estimator with some kind of random forest, you get very precisely estimated nonsense. That's a big caution, and in view of those findings, in an example I care about, where the instruments are very simple and I believe that they're valid, I would be skeptical of that. Nonlinearity and IV don't mix very comfortably.

Guido: In some sense that's already a more complicated setting.

Josh: Well, it's IV. But we worked on that and found this out.
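A stylized toy in the spirit of that caution (it is not the Angrist-Frandsen or Angrist-Evans design; every number is invented): the instrument z is valid, the true effect of the treatment is zero, but a control w enters the outcome nonlinearly while the second stage controls for it linearly. Swapping the linear first stage for a random forest smuggles excluded nonlinear functions of w into the instrument, and the estimator returns precisely estimated nonsense.

```python
# Hedged simulation of ML-in-the-first-stage bias under an excluded nonlinearity.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 20000
w = rng.normal(size=n)                                # exogenous control
z = rng.integers(0, 2, size=n).astype(float)          # valid randomized instrument
v = rng.normal(size=n)
d = (0.7 * z + np.sin(3 * w) + v > 0).astype(float)   # endogenous treatment
y = 2 * np.sin(3 * w) + 0.5 * v + rng.normal(size=n)  # true effect of d is ZERO

def iv_tau(y, X, Z):
    # Just-identified IV: solve Z'X b = Z'y and return the treatment coefficient.
    return np.linalg.solve(Z.T @ X, Z.T @ y)[1]

X = np.column_stack([np.ones(n), d, w])   # second stage is linear in w (misspecified)

# (a) Conventional 2SLS, instrumenting with z itself: close to the true zero.
print("2SLS, z as instrument:", iv_tau(y, X, np.column_stack([np.ones(n), z, w])).round(3))

# (b) Random-forest first stage: the fitted values absorb sin(3w), which the
# linear-in-w second stage cannot, so the estimate is confidently wrong.
rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=50, random_state=0)
dhat = rf.fit(np.column_stack([z, w]), d).predict(np.column_stack([z, w]))
print("2SLS, forest fit as instrument:",
      iv_tau(y, X, np.column_stack([np.ones(n), dhat, w])).round(3))
```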
Guido: As an editor, a lot of these papers crossed my desk, and the motivation in them was often really lacking; they were what you might call semiparametric foundational papers. That's a big problem. A related problem is that we have this tradition in econometrics of being very focused on formal asymptotic results. We just have a lot of papers where people propose a method and then establish its asymptotic properties in a very standardized way.

Isaiah: Is that bad?

Guido: Well, I think it sort of closed the door on a lot of work that doesn't fit into that mold. In the machine learning literature, a lot of things are more algorithmic. People had algorithms for coming up with predictions that turned out to actually work much better than, say, nonparametric kernel regression. For the longest time, when we were doing nonparametrics in econometrics, we did it using kernel regression, and it was great for proving theorems. You could get confidence intervals and consistency and asymptotic normality, and it was all great, but it wasn't very useful. The things they did in machine learning were just way, way better, but they didn't have the theorems.

Josh: That's not my beef with machine learning, though.

Guido: No, I know. I'm saying that for the prediction part, it does much better.

Josh: Yeah, it's better curve fitting.

Guido: But it did so in a way that would not have made those papers initially easy to get into the econometrics journals, because they weren't proving the type of things we were used to. When Breiman was doing his regression trees, that just didn't fit in, and I think he would have had a very hard time publishing those things in the econometrics journals. So I think we limited ourselves too much, and that closed us off from a lot of these machine learning methods that are actually very useful. In general, the computer scientists have proposed a huge number of these algorithms that are very useful and that are affecting the way we're going to be doing empirical work. But we've not fully internalized that, because we're still very focused on getting point estimates and getting standard errors and getting p-values, in a way that we need to move beyond in order to fully harness the benefits from the machine learning literature.
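A toy version of that comparison, with an invented target function and a textbook rule-of-thumb bandwidth: classic Nadaraya-Watson kernel regression against an off-the-shelf boosting learner for a ten-dimensional conditional mean. It is not a general verdict, just the flavor of "great for theorems, less useful in practice."

```python
# Hedged comparison: kernel regression vs. gradient boosting for prediction.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n, k = 4000, 10
X = rng.uniform(-1, 1, size=(n, k))
Xt = rng.uniform(-1, 1, size=(1000, k))        # held-out evaluation points

def f(A):                                      # invented sparse signal
    return np.sin(3 * A[:, 0]) * (A[:, 1] > 0) + A[:, 2] ** 2

y = f(X) + rng.normal(size=n)

# Nadaraya-Watson with a Gaussian product kernel and a rule-of-thumb bandwidth.
h = 1.06 * n ** (-1 / (4 + k))                 # classic rate; about 0.6 here
wts = np.exp(-cdist(Xt, X, "sqeuclidean") / (2 * h ** 2))
nw_pred = (wts @ y) / wts.sum(axis=1)

gb_pred = GradientBoostingRegressor(random_state=0).fit(X, y).predict(Xt)

def mse(pred):
    return ((pred - f(Xt)) ** 2).mean().round(3)

print("kernel regression MSE:", mse(nw_pred), "| boosting MSE:", mse(gb_pred))
```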
Isaiah: On the one hand, I very much take your point that the traditional econometrics framework of "propose a method, prove a limit theorem under some asymptotic story, publish a paper" is constraining, and that by thinking more broadly about what a methods paper could look like, we may gain something. Certainly the machine learning literature has found a bunch of things that seem to work quite well for a number of problems and are now having substantial influence in economics. I guess the question I'm interested in is how you think about the role of theory here. Do you think there's no value in the theory part of it? It's a question I often have when seeing the output from a machine learning tool, and actually a number of the methods you've talked about do have inferential results developed for them. Something I always wonder about is uncertainty quantification: I have my prior, I come into the world with my view, I see the result of this thing, and how should I update based on it? In a world where things are normally distributed, I know how to do that. Here I don't, so I'm interested to hear how you think about it.

Guido: I don't see this as closing that door or saying those results are not interesting. But there are going to be a lot of cases where it's going to be incredibly hard to get those results, where we may not be able to get there, and we may need to do it in stages. First someone says, "Hey, I have this interesting algorithm for doing something, and it works well by some criterion on this particular data set," and puts it out there. Then maybe later someone figures out a way that you can actually still do inference, under some conditions, and maybe those are not particularly realistic conditions, and then we go further. But I think we've been constraining things too much by saying, "This is the type of thing you need to do." In some sense, that goes back to the way Josh and I thought about things for the local average treatment effect. That wasn't quite the way people were thinking about these problems before. There was a sense among some people that the way you need to do these things is to first say what you're interested in estimating, and then do the best job you can at estimating it.
Guido: What you guys were doing was doing it backwards. You say, "Here, I have an estimator, and now I'm going to figure out what it's estimating," and then, ex post, you say why you think that's interesting, or maybe why it's not interesting. And to them that was not okay; you were not allowed to do it that way. I think we should just be a little more flexible in thinking about how to look at problems, because we've missed some things by not doing that.

Josh: So, you've heard our views, Isaiah. You've seen that we have some points of disagreement. Why don't you referee this dispute for us?

Isaiah: Oh, so nice of you to ask me a small question. For one, I very much agree with something that Guido said earlier: where the case for machine learning seems relatively clear is in settings where we're interested in some version of a nonparametric prediction problem. I'm interested in estimating a conditional expectation or a conditional probability, and in the past maybe I would have run a kernel regression or a series regression, or something along those lines. At this point, it seems we have a fairly good sense that, in a fairly wide range of applications, machine learning methods do better than the more traditional nonparametric methods studied in econometrics and statistics for estimating conditional mean functions, conditional probabilities, and various other nonparametric objects, especially in high-dimensional settings.

Guido: You're thinking of maybe the propensity score, something like that?

Isaiah: Exactly: nuisance functions, things like propensity scores, or even objects of more direct interest, like conditional average treatment effects, which are the difference of two conditional expectation functions. Potentially things like that.
Isaiah: Of course, even there, the theory for inference, for how to interpret these things and make large-sample statements about them, is less well developed, depending on the machine learning estimator used. So I think one thing that's tricky is that we can have these methods that seem to work a lot better for some purposes, but that we need to be a bit careful in how we plug in and in how we interpret the resulting statements. But of course, that's a very active area right now, where people are doing tons of great work, so I fully expect, and hope, to see much more going forward.

One issue with machine learning that always seems a danger, or that is sometimes a danger and has sometimes led to applications that made less sense, is when folks start with a method they're very excited about rather than with a question. Starting with a question, by contrast, seems very sensible: here's the object I'm interested in, here's the parameter of interest, let me think about how I would identify that thing and how I would recover it if I had a ton of data. Oh, here's a conditional expectation function; let me plug in a machine learning estimator for it. Whereas if I regress quantity on price and say that I used a machine learning method, maybe I'm satisfied that that solves the endogeneity problem we're usually worried about there, and maybe I'm not. But again, the way to address that seems relatively clear: define your object of interest and think about...

Guido: Is that just bringing in the economics?

Isaiah: Exactly. Think about identification the way we always have, but harness the power of the machine learning methods for some of the components. The question of interest is the same as it has always been; we just now have better methods for estimating some of the pieces.
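A minimal sketch of that plug-in pattern, in the spirit of cross-fitted double/debiased machine learning in its AIPW form: the estimand, an average treatment effect, is fixed first, and machine learning is used only for the nuisance pieces, namely the propensity score and the two outcome regressions. The data and tuning choices below are invented, and a real application needs more care about overlap and model choice.

```python
# Hedged sketch: cross-fitted AIPW for the ATE with ML nuisance estimates.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(4)
n = 8000
X = rng.normal(size=(n, 5))
p = 1 / (1 + np.exp(-X[:, 0]))                 # true propensity score
d = rng.binomial(1, p)
y = X[:, 0] + 2 * d + rng.normal(size=n)       # true ATE is 2

psi = np.zeros(n)                              # AIPW influence-function values
for train, test in KFold(5, shuffle=True, random_state=0).split(X):
    ps = RandomForestClassifier(min_samples_leaf=50, random_state=0).fit(X[train], d[train])
    e = ps.predict_proba(X[test])[:, 1].clip(0.01, 0.99)
    mu = {}
    for a in (0, 1):                           # outcome regression in each arm
        idx = train[d[train] == a]
        mu[a] = RandomForestRegressor(min_samples_leaf=20,
                                      random_state=0).fit(X[idx], y[idx]).predict(X[test])
    dt, yt = d[test], y[test]
    psi[test] = (mu[1] - mu[0]
                 + dt * (yt - mu[1]) / e
                 - (1 - dt) * (yt - mu[0]) / (1 - e))

print("ATE estimate:", psi.mean().round(2),
      "+/-", (1.96 * psi.std() / np.sqrt(n)).round(2))
```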
Isaiah: The place that seems harder to forecast is this: obviously there's a huge amount going on in the machine learning literature, and the somewhat limited ways of plugging it in that I've referenced so far are a limited piece of that. So I think there are all sorts of other interesting questions about where this interaction goes and what else we can learn, and that's something where I think there's a ton going on that seems very promising, and I have no idea what the answer is.

Guido: No, I totally agree with that, but that's what makes it very exciting. I think there's just a lot of work to be done there.

Josh: All right. So Isaiah agrees with me there, I'd say.

Narrator: If you'd like to watch more Nobel Conversations, click here. Or, if you'd like to learn more about econometrics, check out Josh's Mastering Econometrics series. If you'd like to learn more about Guido, Josh, and Isaiah, check out the links in the description.