WEBVTT 00:00:00.100 --> 00:00:02.350 ♪ [music] ♪ 00:00:03.700 --> 00:00:05.700 - [narrator] Welcome to Nobel Conversations. 00:00:07.000 --> 00:00:10.128 In this episode, Josh Angrist and Guido Imbens 00:00:10.128 --> 00:00:13.700 sit down with Isaiah Andrews to discuss and disagree 00:00:13.700 --> 00:00:16.580 over the role of machine learning in applied econometrics. 00:00:18.300 --> 00:00:19.769 - [Isaiah] So, of course, there are a lot of topics 00:00:19.769 --> 00:00:21.087 where you guys largely agree, 00:00:21.087 --> 00:00:22.313 but I'd like to turn to one 00:00:22.313 --> 00:00:24.240 where maybe you have some differences of opinion. 00:00:24.240 --> 00:00:25.728 So I'd love to hear some of your thoughts 00:00:25.728 --> 00:00:26.883 about machine learning 00:00:26.883 --> 00:00:29.900 and the role that it's playing and is going to play in economics. 00:00:30.200 --> 00:00:33.352 - [Guido] I've looked at some data that's proprietary, 00:00:33.352 --> 00:00:35.100 so there's no published paper there. 00:00:36.719 --> 00:00:38.159 There was an experiment that was done 00:00:38.159 --> 00:00:39.500 on some search algorithm. 00:00:39.700 --> 00:00:41.497 And the question was... 00:00:42.901 --> 00:00:45.600 it was about ranking things and changing the ranking. 00:00:45.900 --> 00:00:47.500 It was sort of clear... 00:00:48.400 --> 00:00:50.600 that there was going to be a lot of heterogeneity there. 00:00:50.600 --> 00:00:51.700 Mmm, 00:00:51.700 --> 00:00:58.120 you know, if you look for, say, 00:00:58.300 --> 00:01:00.350 a picture of Britney Spears, 00:01:00.350 --> 00:01:02.400 it doesn't really matter where you rank it 00:01:02.400 --> 00:01:05.500 because you're going to figure out what you're looking for, 00:01:06.200 --> 00:01:07.867 whether you put it in the first or second 00:01:07.867 --> 00:01:09.800 or third position of the ranking. 
00:01:10.100 --> 00:01:12.500 But if you're looking for the best econometrics book, 00:01:13.300 --> 00:01:16.500 if you put your book first or your book tenth, 00:01:16.500 --> 00:01:18.100 that's going to make a big difference 00:01:18.600 --> 00:01:21.829 in how often people are going to click on it. 00:01:21.829 --> 00:01:23.417 And so there you go -- 00:01:23.417 --> 00:01:27.218 - [Josh] Why do I need machine learning to discover that? 00:01:27.218 --> 00:01:29.100 It seems like I can discover it simply. 00:01:29.300 --> 00:01:31.800 - [Guido] So in general, there are lots of possible... 00:01:32.100 --> 00:01:36.300 You want to think about there being lots of characteristics of the 00:01:36.400 --> 00:01:42.000 items, and you want to understand what drives the heterogeneity 00:01:42.300 --> 00:01:45.600 in the effect of... - [Josh] But you're just predicting. You know, in some sense, 00:01:45.600 --> 00:01:47.700 you're solving a marketing problem. 00:01:48.400 --> 00:01:51.800 - [Guido] It's a causal effect. - [Josh] It's causal, but it has no scientific content. 00:01:51.800 --> 00:01:53.300 Think about... 00:01:54.100 --> 00:01:57.300 - [Guido] But there are similar things in medical settings. 00:01:58.000 --> 00:02:01.200 If you do an experiment, you may actually be very interested 00:02:01.300 --> 00:02:03.800 in whether the treatment works for some groups or not. 00:02:03.900 --> 00:02:06.500 And you have a lot of individual characteristics and you want 00:02:06.500 --> 00:02:09.500 to systematically search... - [Josh] Yeah. I'm skeptical about that, 00:02:09.500 --> 00:02:13.900 that sort of idea that there's this personal causal effect that I should care about, 00:02:14.000 --> 00:02:18.200 and that machine learning can discover it in some way that's useful. 
So think about... 00:02:18.300 --> 00:02:21.400 I've done a lot of work on schools, going to, say, 00:02:21.400 --> 00:02:26.500 a charter school, a publicly funded private school effectively, you know, 00:02:26.500 --> 00:02:29.300 that's free to structure its own curriculum, for context there. 00:02:29.300 --> 00:02:32.700 Some types of charter schools generate spectacular 00:02:32.700 --> 00:02:36.400 achievement gains, and in the data set that produces that result, 00:02:36.400 --> 00:02:37.800 I have a lot of covariates. 00:02:37.800 --> 00:02:41.200 So I have baseline scores, and I have family background, 00:02:41.200 --> 00:02:45.800 the education of the parents, the sex of the child, the race of the child. 00:02:45.800 --> 00:02:48.300 And, well, as soon as I put 00:02:48.400 --> 00:02:51.900 half a dozen of those together, I have a very high-dimensional space. 00:02:52.300 --> 00:02:54.900 I'm definitely interested in sort of coarse 00:02:54.900 --> 00:02:59.400 features of that treatment effect, like whether it's better for people who 00:02:59.900 --> 00:03:02.100 come from lower-income families. 00:03:02.600 --> 00:03:06.000 I have a hard time believing that there's an application, 00:03:06.400 --> 00:03:10.300 you know, for the very high-dimensional version of that, where 00:03:10.500 --> 00:03:13.200 I discovered that for non-white children who have 00:03:13.800 --> 00:03:17.800 high family incomes but baseline scores in the third quartile, 00:03:18.300 --> 00:03:23.000 and only went to public school in the third grade but not the sixth grade... 00:03:23.000 --> 00:03:25.500 So that's what that high-dimensional analysis produces: 00:03:25.800 --> 00:03:28.100 this very elaborate conditional statement. 00:03:28.300 --> 00:03:31.000 There are two things that are wrong with that, in my view. First, 00:03:31.000 --> 00:03:34.000 I just can't imagine why it's actionable. 00:03:34.600 --> 00:03:36.600 I don't know why you'd want to act on it. 
00:03:36.600 --> 00:03:41.200 And I know also that there's some alternative model that fits almost as well 00:03:41.800 --> 00:03:43.000 that flips everything, 00:03:43.200 --> 00:03:47.500 right? Because machine learning doesn't tell me that this is really the predictor 00:03:47.900 --> 00:03:48.100 that 00:03:48.400 --> 00:03:52.300 matters. It just tells me that this is a good predictor. And so, 00:03:52.800 --> 00:03:55.900 you know, I think there is something different about the 00:03:56.000 --> 00:03:58.400 social science context. - [Guido] So I think 00:03:58.500 --> 00:04:02.600 the social science applications you're talking about are ones where 00:04:03.400 --> 00:04:08.100 I think there's not a huge amount of heterogeneity in the effects. 00:04:08.400 --> 00:04:14.000 - [Josh] Well, there might be if you allow me to fill that space. - [Guido] No, 00:04:14.600 --> 00:04:18.100 not even then. I think for a lot of those 00:04:18.300 --> 00:04:22.000 interventions, you would expect that the effect is the same sign 00:04:22.100 --> 00:04:22.900 for everybody. 00:04:23.400 --> 00:04:27.600 There may be small differences in the magnitude, but it's not... 00:04:28.200 --> 00:04:31.700 For a lot of these education interventions, they're good for everybody. 00:04:31.800 --> 00:04:32.300 They're... 00:04:32.900 --> 00:04:37.600 It's not that they're bad for some people and good for other people, and 00:04:37.600 --> 00:04:40.800 that there are kind of very small pockets where they're bad. 00:04:40.900 --> 00:04:43.900 There may be some variation in the magnitude, 00:04:44.000 --> 00:04:48.200 but you would need very, very big data sets to find those, and I 00:04:48.400 --> 00:04:51.400 agree that in those cases, they probably wouldn't be very actionable anyway. 00:04:51.700 --> 00:04:53.800 But I think there's a lot of other settings 00:04:54.100 --> 00:04:56.600 where there is much more heterogeneity. 
00:04:57.400 --> 00:05:01.600 - [Josh] Well, I'm open to that possibility, and I think the example you gave is 00:05:01.900 --> 00:05:05.000 essentially a marketing example. 00:05:06.400 --> 00:05:08.400 - [Guido] Now, that may be, but they 00:05:08.500 --> 00:05:10.700 have implications for industrial organization, 00:05:10.700 --> 00:05:13.900 how you actually... whether you need to worry about 00:05:14.000 --> 00:05:17.900 the, well, you know, market power. - [Josh] I'd have to see that paper. 00:05:18.400 --> 00:05:21.200 - [Isaiah] So the sense I'm getting is that 00:05:21.500 --> 00:05:23.500 we still disagree on something. - Yes. 00:05:24.100 --> 00:05:26.700 - We haven't converged on everything. - [Isaiah] I'm getting that sense. 00:05:27.200 --> 00:05:31.000 - Actually, we've diverged on this, because this wasn't around to argue about. 00:05:33.200 --> 00:05:38.000 - Is it getting a little warm here? - Yeah, warmed up. Warmed up is good. 00:05:38.100 --> 00:05:40.800 - [Isaiah] The sense I'm getting is, Josh, sort of, you're not 00:05:40.900 --> 00:05:43.400 saying that you're confident that there is no way 00:05:43.400 --> 00:05:45.400 that there is an application where this stuff is useful. 00:05:45.400 --> 00:05:48.200 You are saying you're unconvinced by the existing 00:05:48.300 --> 00:05:52.200 applications to date. - [Josh] Fair. That, I'm very confident about. Yeah. 00:05:54.200 --> 00:05:55.000 - [Guido] In this case, 00:05:55.300 --> 00:05:57.500 I think Josh does have a point that, today, 00:05:58.000 --> 00:06:02.100 even in the prediction cases, where 00:06:02.300 --> 00:06:05.000 a lot of the machine learning methods really shine is 00:06:05.000 --> 00:06:06.600 where there's just a lot of heterogeneity. 00:06:07.300 --> 00:06:10.600 You don't really care much about the details there, right? 00:06:10.900 --> 00:06:15.000 - Yes. - [Guido] It doesn't have a policy angle or something. 00:06:15.200 --> 00:06:18.100 Kind of recognizing handwritten digits and stuff. 
00:06:18.300 --> 00:06:24.000 It does much better there than building some complicated model. 00:06:24.400 --> 00:06:28.100 But in a lot of the social science, a lot of the economic applications, 00:06:28.300 --> 00:06:32.100 we actually know a huge amount about the relationship between various variables. 00:06:32.100 --> 00:06:34.600 A lot of the relationships are strictly monotone. 00:06:35.400 --> 00:06:39.400 Education is going to increase people's earnings, 00:06:39.800 --> 00:06:44.100 irrespective of the demographic, irrespective of the level of education 00:06:44.100 --> 00:06:47.800 you already have. - Until they get to a PhD. - Yeah, there is graduate school. 00:06:49.500 --> 00:06:50.700 - [Guido] Over a reasonable range, 00:06:51.600 --> 00:06:55.900 it's not going to go down very much, whereas 00:06:56.100 --> 00:06:59.700 in a lot of the settings where these machine learning methods shine, 00:06:59.700 --> 00:07:01.900 there's a lot of non-monotonicity, 00:07:02.100 --> 00:07:04.900 kind of multimodality in these relationships, 00:07:05.300 --> 00:07:11.500 and they're going to be very powerful. But I still stand by that. 00:07:11.700 --> 00:07:16.100 These methods just have a huge amount to offer 00:07:16.400 --> 00:07:18.100 for economists, and they're going 00:07:18.200 --> 00:07:21.700 to be a big part of the future. 00:07:23.400 --> 00:07:25.800 - [Isaiah] Feels like there's something interesting to be said about machine learning here. 00:07:25.800 --> 00:07:27.700 So, Guido, I was wondering, could you give some more, 00:07:28.000 --> 00:07:29.000 maybe some examples 00:07:29.000 --> 00:07:32.500 of the sorts of examples you're thinking about, with applications at the moment? 
00:07:32.500 --> 00:07:34.100 - [Guido] So one area is where, 00:07:34.700 --> 00:07:36.400 instead of looking for average 00:07:36.500 --> 00:07:42.200 causal effects, we're looking for individualized estimates, predictions 00:07:42.400 --> 00:07:47.500 of causal effects, and there the machine learning algorithms have been very effective. 00:07:48.000 --> 00:07:48.100 Traditionally, 00:07:48.300 --> 00:07:51.500 we would have done these things using kernel methods. 00:07:51.600 --> 00:07:54.500 And theoretically they work great, and 00:07:54.600 --> 00:07:57.400 there are some arguments that, formally, you can't do any better. 00:07:57.600 --> 00:08:00.500 But in practice, they don't work very well, and 00:08:00.900 --> 00:08:05.400 random forest, random causal forest-type things that Stefan Wager and Susan 00:08:05.400 --> 00:08:09.500 Athey have been working on are used very widely. 00:08:09.600 --> 00:08:12.200 They've been very effective in these settings 00:08:12.400 --> 00:08:18.100 to actually get causal effects that vary by 00:08:18.200 --> 00:08:19.900 covariates, and this kind of... 00:08:20.700 --> 00:08:25.700 I think this is still just the beginning of these methods. But in many cases, 00:08:26.400 --> 00:08:31.600 these algorithms are very effective at searching over big spaces 00:08:31.800 --> 00:08:35.600 and finding the functions that fit 00:08:35.900 --> 00:08:41.100 very well in ways that we couldn't really do beforehand. 00:08:41.500 --> 00:08:45.300 - [Josh] I don't know of an example where machine learning has generated insights 00:08:45.300 --> 00:08:48.100 about a causal effect that I'm interested in. And I 00:08:48.300 --> 00:08:51.300 do know of examples where it's potentially very misleading. 
00:08:51.300 --> 00:08:53.700 So I've done some work with Brigham Frandsen 00:08:54.100 --> 00:08:55.100 using, for example, 00:08:55.100 --> 00:08:59.900 random forest to model covariate effects in an instrumental variables problem 00:09:00.200 --> 00:09:01.200 where you need, 00:09:01.600 --> 00:09:03.500 you need to condition on covariates, 00:09:04.400 --> 00:09:08.200 and you don't particularly have strong feelings about the functional form for that. 00:09:08.200 --> 00:09:10.000 So maybe you should, 00:09:10.500 --> 00:09:10.900 I think, 00:09:10.900 --> 00:09:14.500 be open to flexible curve fitting. And that leads you down a path 00:09:14.500 --> 00:09:18.000 where there's a lot of nonlinearities in the model, and 00:09:18.200 --> 00:09:23.000 that's very dangerous with IV, because any sort of excluded nonlinearity 00:09:23.300 --> 00:09:27.600 potentially generates a spurious causal effect. And Brigham and I showed that 00:09:27.900 --> 00:09:32.200 very powerfully, I think, in the case of two instruments 00:09:32.700 --> 00:09:36.000 that come from a paper of mine with Bill Evans, where if you, 00:09:36.500 --> 00:09:37.600 you know, replace 00:09:38.100 --> 00:09:42.600 a traditional two-stage least squares estimator with some kind of random forest, 00:09:42.900 --> 00:09:48.000 you get very precisely estimated nonsense estimates. And, 00:09:49.000 --> 00:09:51.100 you know, I think that's a big caution. 00:09:51.100 --> 00:09:53.400 And, you know, in view of those findings, 00:09:53.700 --> 00:09:57.100 in an example I care about, where the instruments are very simple 00:09:57.400 --> 00:09:59.100 and I believe that they're valid, 00:09:59.300 --> 00:10:01.600 you know, I would be skeptical of that. So 00:10:02.900 --> 00:10:06.800 nonlinearity and IV don't mix very comfortably. - [Guido] Now, I'd say, 00:10:07.200 --> 00:10:11.400 you know, in some sense that's already a more complicated... - [Josh] Well, it's IV. 
00:10:11.600 --> 00:10:11.900 - Yeah, 00:10:12.500 --> 00:10:16.700 - but then we work on that. - Fair enough. 00:10:18.600 --> 00:10:22.300 - [Guido] As editor of Econometrica, a lot of these papers cross my desk, 00:10:22.700 --> 00:10:29.500 but the motivation is not clear and, in fact, really lacking. 00:10:29.800 --> 00:10:35.100 They're not old-type semiparametric foundational papers. 00:10:35.400 --> 00:10:37.100 So that's a big problem, 00:10:38.000 --> 00:10:42.400 and a kind of related problem is that we have this tradition in econometrics 00:10:42.600 --> 00:10:47.500 of being very focused on these formal asymptotic results. 00:10:48.800 --> 00:10:52.600 We just have a lot of papers where people propose 00:10:52.800 --> 00:10:55.700 a method and then establish the asymptotic properties 00:10:56.300 --> 00:11:01.900 in a very kind of standardized way. - Is that bad? 00:11:02.900 --> 00:11:07.200 - [Guido] Well, I think it's sort of closed the door for a lot of work 00:11:07.200 --> 00:11:11.600 that doesn't fit into that, whereas in the machine learning literature, 00:11:11.900 --> 00:11:14.300 a lot of things are more algorithmic. People 00:11:15.700 --> 00:11:18.500 had algorithms for coming up with predictions 00:11:18.800 --> 00:11:23.600 that turn out to actually work much better than, say, nonparametric kernel regression. 00:11:24.000 --> 00:11:26.800 For a long time, we were doing all the nonparametrics in econometrics, 00:11:26.800 --> 00:11:31.100 we did it using kernel regression, and that was great for proving theorems. 00:11:31.300 --> 00:11:34.800 You could get confidence intervals and consistency and asymptotic normality, 00:11:34.800 --> 00:11:37.000 and it was all great, but it wasn't very useful. 00:11:37.300 --> 00:11:40.900 And the things they did in machine learning are just way, way better, 00:11:41.000 --> 00:11:45.100 but they didn't have the proper... 
- [Josh] That's not my beef with machine learning theory. 00:11:45.300 --> 00:11:51.200 - [Guido] No, I mean, I'm saying there, for the prediction part, 00:11:51.400 --> 00:11:54.500 it does much better. - [Josh] Yeah, it's a better curve-fitting tool. 00:11:54.900 --> 00:11:56.500 - [Guido] But it did so 00:11:57.100 --> 00:12:02.700 in a way that would not have made those papers initially easy to get into 00:12:03.000 --> 00:12:06.300 the econometrics journals, because it wasn't proving the type of things... 00:12:06.400 --> 00:12:11.200 You know, when Breiman was doing his regression trees, that just didn't fit in, 00:12:11.800 --> 00:12:15.100 and I think he would have had a very hard time 00:12:15.200 --> 00:12:18.400 publishing these things in econometrics journals. 00:12:18.900 --> 00:12:24.400 So I think we limited ourselves too much, and 00:12:24.700 --> 00:12:27.900 that left us closing things off 00:12:28.000 --> 00:12:30.800 for a lot of these machine learning methods that are actually very useful. 00:12:30.900 --> 00:12:34.000 Hmm. I mean, I think, in general, 00:12:34.900 --> 00:12:36.200 that literature, the computer 00:12:36.200 --> 00:12:39.300 scientists, have brought a huge number of these algorithms, 00:12:39.600 --> 00:12:43.900 have proposed a huge number of these algorithms that are actually very useful 00:12:44.000 --> 00:12:44.700 and that are 00:12:45.500 --> 00:12:49.100 affecting the way we're going to be doing empirical work, 00:12:49.800 --> 00:12:55.100 but we've not fully internalized that, because we're still very focused on getting 00:12:55.300 --> 00:12:57.500 point estimates and getting standard errors 00:12:58.600 --> 00:13:01.200 and getting p-values in a way that 00:13:01.700 --> 00:13:03.100 we need to move beyond 00:13:03.300 --> 00:13:04.300 to fully harness 00:13:04.300 --> 00:13:10.700 the force, the benefits from the machine learning literature. 00:13:10.900 --> 00:13:15.100 - [Isaiah] Hmm. On the one hand, 
I guess I very much take your point that sort of the 00:13:15.200 --> 00:13:18.600 traditional econometrics framework of sort of propose a method, 00:13:18.600 --> 00:13:22.600 prove a limit theorem under some asymptotic story, story, story, 00:13:22.600 --> 00:13:26.900 story, story, publish a paper, is constraining, 00:13:26.900 --> 00:13:29.700 and that in some sense, by thinking more 00:13:29.700 --> 00:13:33.200 broadly about what a methods paper could look like, we may... right, in some sense, 00:13:33.200 --> 00:13:35.900 certainly the machine learning literature has found a bunch of things 00:13:35.900 --> 00:13:38.300 which seem to work quite well for a number of problems 00:13:38.300 --> 00:13:42.400 and are now having substantial influence in economics. I guess a question 00:13:42.400 --> 00:13:44.800 I'm interested in is, how do you think about 00:13:45.200 --> 00:13:47.600 the role of theory? 00:13:47.900 --> 00:13:51.200 Sort of, do you think there's no value in the theory part of it? 00:13:51.600 --> 00:13:54.800 Because I guess it's sort of a question that I often have, sort of seeing 00:13:54.800 --> 00:13:56.900 the output from a machine learning tool, 00:13:56.900 --> 00:13:59.400 and actually a number of the methods that you talked about 00:13:59.400 --> 00:14:01.800 actually do have inferential results developed for them. 00:14:02.600 --> 00:14:06.400 Something that I always wonder about is sort of uncertainty quantification, and just, 00:14:06.500 --> 00:14:08.000 you know, I have my prior, 00:14:08.000 --> 00:14:11.000 I come into the world with my view, I see the result of this thing. 00:14:11.000 --> 00:14:14.500 How should I update based on it? And in some sense, if I'm in a world where 00:14:14.600 --> 00:14:15.100 things are 00:14:15.200 --> 00:14:18.200 normally distributed, I know how to do it. Here, I don't. 00:14:18.200 --> 00:14:21.400 And so I'm interested to hear how you think about that. 
- [Guido] So 00:14:21.500 --> 00:14:24.300 I don't see this as sort of closing it off, saying, well, 00:14:24.400 --> 00:14:26.500 these results are not interesting, 00:14:26.600 --> 00:14:27.700 but there are going to be a lot of cases 00:14:28.000 --> 00:14:31.200 where it's going to be incredibly hard to get those results, and we may not be able 00:14:31.200 --> 00:14:33.200 to get there, and 00:14:33.400 --> 00:14:37.700 we may need to do it in stages, where first someone says, "Hey, I have this 00:14:39.600 --> 00:14:44.800 interesting algorithm for doing something, and it works well by some 00:14:45.600 --> 00:14:49.900 criterion on this particular data set, 00:14:51.000 --> 00:14:53.400 and I'm going to put it out there." And then 00:14:53.700 --> 00:14:58.000 maybe someone will figure out a way that you can later actually still do inference 00:14:58.000 --> 00:14:59.100 under some conditions. 00:14:59.100 --> 00:15:02.100 And maybe those are not particularly realistic conditions; 00:15:02.100 --> 00:15:05.500 then we kind of go further. But I think we've been 00:15:06.700 --> 00:15:11.400 constraining things too much, where we said, you know, "This is the type of things 00:15:12.100 --> 00:15:14.400 that we need to do." And in some sense, 00:15:15.700 --> 00:15:18.200 that goes back to kind of the way Josh and I 00:15:19.700 --> 00:15:21.900 thought about things for the local average treatment effect. 00:15:21.900 --> 00:15:24.600 That wasn't quite the way people were thinking about these problems 00:15:24.600 --> 00:15:29.200 before. There was a sense that some of the people said, you know, 00:15:29.500 --> 00:15:31.900 the way you need to do these things is you first say 00:15:32.200 --> 00:15:36.300 what you're interested in estimating, and then you do the best job you can 
00:15:36.500 --> 00:15:37.700 in estimating that, 00:15:38.100 --> 00:15:44.200 and what you guys are doing is you're doing it backwards. 00:15:44.300 --> 00:15:46.700 You're going to say, "Here, I have an estimator, 00:15:47.300 --> 00:15:49.600 and now I'm going to figure out what 00:15:49.800 --> 00:15:51.400 it's estimating," and then ex post 00:15:51.400 --> 00:15:53.900 you're going to say why you think that's interesting 00:15:53.900 --> 00:15:56.600 or maybe why it's not interesting, and that's not okay. 00:15:56.600 --> 00:15:58.600 You're not allowed to do it that way. 00:15:59.000 --> 00:16:04.100 And I think we should just be a little bit more flexible in thinking about 00:16:04.300 --> 00:16:06.300 how to look at 00:16:06.400 --> 00:16:11.300 problems, because I think we've missed some things by not doing that. 00:16:13.000 --> 00:16:16.600 - So you've heard our views, Isaiah. You've seen that we have 00:16:17.000 --> 00:16:20.400 some points of disagreement. Why don't you referee this dispute for us? 00:16:22.500 --> 00:16:28.100 - [Isaiah] Oh, so nice of you to ask me a small question. So I guess, for one, 00:16:28.200 --> 00:16:33.200 I very much agree with something that Guido said earlier of... 00:16:36.000 --> 00:16:36.300 So, 00:16:36.500 --> 00:16:37.900 where it seems... where 00:16:37.900 --> 00:16:41.400 the case for machine learning seems relatively clear is in settings where, 00:16:41.500 --> 00:16:45.100 you know, we're interested in some version of a nonparametric prediction problem. 00:16:45.100 --> 00:16:49.700 So I'm interested in estimating a conditional expectation or conditional probability, 00:16:50.000 --> 00:16:52.100 and in the past, maybe I would have run a kernel... 00:16:52.100 --> 00:16:55.800 I would have run a kernel regression or I would have run a series regression or 00:16:56.100 --> 00:16:57.400 something along those lines. 
00:16:57.700 --> 00:16:58.000 Sort of, 00:16:58.000 --> 00:16:58.700 it seems like 00:16:58.700 --> 00:17:02.000 at this point we have a fairly good sense that in a fairly wide range 00:17:02.000 --> 00:17:06.300 of applications, machine learning methods seem to do better for, 00:17:06.400 --> 00:17:06.800 you know, 00:17:06.800 --> 00:17:08.800 estimating conditional mean functions 00:17:08.800 --> 00:17:12.000 or conditional probabilities or various other nonparametric objects 00:17:12.400 --> 00:17:16.600 than more traditional nonparametric methods that were studied in econometrics 00:17:16.600 --> 00:17:19.100 and statistics, especially in high-dimensional settings. 00:17:19.500 --> 00:17:23.100 - So you're thinking of maybe the propensity score or something like that? 00:17:23.100 --> 00:17:25.300 - [Isaiah] Exactly, so nuisance functions, yeah. 00:17:25.300 --> 00:17:28.900 So things like propensity scores, or, I mean, even objects 00:17:28.900 --> 00:17:30.100 of more direct 00:17:30.200 --> 00:17:32.400 interest, like conditional average treatment effects, right? 00:17:32.400 --> 00:17:35.100 Which are the difference of two conditional expectation functions, 00:17:35.100 --> 00:17:36.300 potentially. Things like that. 00:17:36.500 --> 00:17:40.400 Of course, even there, right, the theory 00:17:40.500 --> 00:17:43.700 for inference, or the theory for sort of how to interpret, 00:17:43.700 --> 00:17:45.900 how to make large-sample statements about some of these things, is 00:17:46.000 --> 00:17:50.100 less well-developed, depending on the machine learning estimator used. 00:17:50.100 --> 00:17:53.800 And so I think something that is tricky is that we 00:17:53.900 --> 00:17:55.700 can have these methods, which work a lot, 00:17:55.700 --> 00:17:58.000 which seem to work a lot better for some purposes. 
00:17:58.000 --> 00:18:01.600 But which we need to be a bit careful about in how we plug them in or how 00:18:01.600 --> 00:18:03.300 we interpret the resulting statements. 00:18:03.600 --> 00:18:06.200 But of course, that's a very, very active area right now, where 00:18:06.400 --> 00:18:10.400 people are doing tons of great work. And so I fully expect and hope 00:18:10.400 --> 00:18:12.800 to see much more going forward there. 00:18:13.000 --> 00:18:17.300 So one issue with machine learning that always seems a danger, or 00:18:17.400 --> 00:18:20.300 that is sometimes a danger and has sometimes led to 00:18:20.500 --> 00:18:22.600 applications that have made less sense, is 00:18:22.800 --> 00:18:25.100 when folks start with a method that... 00:18:25.300 --> 00:18:28.500 start with a method that they're very excited about rather than a question, 00:18:28.900 --> 00:18:32.100 right? So sort of starting with a question, where here's the 00:18:32.500 --> 00:18:36.200 object I'm interested in, here is the parameter of interest, let me, 00:18:36.700 --> 00:18:37.100 you know, 00:18:37.300 --> 00:18:39.500 think about how I would identify that thing, 00:18:39.500 --> 00:18:41.800 how I would recover that thing if I had a ton of data. 00:18:41.900 --> 00:18:44.000 Oh, here's a conditional expectation function. 00:18:44.000 --> 00:18:47.100 Let me plug in a machine learning estimator for that. 00:18:47.200 --> 00:18:48.800 That seems very, very sensible. 00:18:49.000 --> 00:18:53.100 Whereas, you know, if I regress quantity on price 00:18:53.700 --> 00:18:56.000 and say that I used a machine learning method, 00:18:56.300 --> 00:18:58.900 maybe I'm satisfied that that solves the endogeneity problem 00:18:58.900 --> 00:19:01.200 we're usually worried about there. Maybe I'm not. 00:19:01.500 --> 00:19:03.200 But again, that's something where 00:19:03.400 --> 00:19:06.300 the way to address it seems relatively clear, right? 
00:19:06.500 --> 00:19:09.000 It's to find your object of interest and 00:19:09.200 --> 00:19:11.600 think about... - Is that just bringing the economics? 00:19:11.700 --> 00:19:12.200 - [Isaiah] Exactly. 00:19:12.200 --> 00:19:15.400 - And can I think about identification, but harness 00:19:15.400 --> 00:19:18.300 the power of the machine learning methods 00:19:18.500 --> 00:19:22.800 for some of the components? - [Isaiah] Precisely. Exactly. So sort of, you know, 00:19:22.900 --> 00:19:25.600 the question of interest is the same as the question of interest has always been, 00:19:25.600 --> 00:19:29.500 but we now have better methods for estimating some pieces of this, right? 00:19:29.900 --> 00:19:31.600 The place that seems harder to, uh, 00:19:31.900 --> 00:19:33.400 harder to forecast is, right, 00:19:33.400 --> 00:19:36.300 obviously, there's a huge amount going on in the machine 00:19:36.400 --> 00:19:37.400 learning literature, 00:19:37.500 --> 00:19:39.700 and the sort of limited ways 00:19:39.700 --> 00:19:42.900 of plugging it in that I've referenced so far are a limited piece of that. 00:19:43.000 --> 00:19:46.100 And so I think there are all sorts of other interesting questions about where, 00:19:46.300 --> 00:19:46.900 right, sort of, 00:19:47.100 --> 00:19:49.300 where does this interaction go? What else can we learn? 00:19:49.300 --> 00:19:52.000 And that's something where, you know, I think there's 00:19:52.200 --> 00:19:56.400 a ton going on which seems very promising, and I have no idea what the answer is. 00:19:57.000 --> 00:20:01.200 - No, no. I totally agree with that, but 00:20:01.800 --> 00:20:03.500 that makes it very exciting. 00:20:03.800 --> 00:20:06.100 And I think there's just a lot of work to be done there. 00:20:06.600 --> 00:20:11.400 - All right. So Isaiah agrees with me there. - [Isaiah] I didn't say that per se. 
00:20:14.500 --> 00:20:17.700 - [narrator] If you'd like to watch more Nobel Conversations, click here, 00:20:18.000 --> 00:20:20.400 or if you'd like to learn more about econometrics, 00:20:20.500 --> 00:20:23.100 check out Josh's Mastering Econometrics series. 00:20:23.600 --> 00:20:26.500 If you'd like to learn more about Guido, Josh, and Isaiah, 00:20:26.700 --> 00:20:28.200 check out the links in the description.