WEBVTT

00:00:00.100 --> 00:00:02.350
♪ [music] ♪

00:00:03.700 --> 00:00:05.700
- [narrator] Welcome
to Nobel Conversations.

00:00:07.000 --> 00:00:10.128
In this episode, Josh Angrist
and Guido Imbens

00:00:10.128 --> 00:00:13.700
sit down with Isaiah Andrews
to discuss and disagree

00:00:13.700 --> 00:00:16.580
over the role of machine learning
in applied econometrics.

00:00:18.300 --> 00:00:19.769
- [Isaiah] So, of course,
there are a lot of topics

00:00:19.769 --> 00:00:21.087
where you guys largely agree,

00:00:21.087 --> 00:00:22.313
but I'd like to turn to one

00:00:22.313 --> 00:00:24.240
where maybe you have
some differences of opinion.

00:00:24.240 --> 00:00:25.728
So I'd love to hear
some of your thoughts

00:00:25.728 --> 00:00:26.883
about machine learning

00:00:26.883 --> 00:00:29.900
and the role that it's playing
and is going to play in economics.

00:00:30.200 --> 00:00:33.352
- [Guido] I've looked at some data
like the proprietary

00:00:33.352 --> 00:00:35.100
so that there's
no published paper there.

00:00:36.719 --> 00:00:38.159
There was an experiment
that was done

00:00:38.159 --> 00:00:39.500
on some search algorithm.

00:00:39.700 --> 00:00:41.497
And the question was...

00:00:42.901 --> 00:00:45.600
it was about ranking things
and changing the ranking.

00:00:45.900 --> 00:00:47.500
That was sort of clear...

00:00:48.400 --> 00:00:50.600
that there was going to be
a lot of heterogeneity there.

00:00:50.600 --> 00:00:51.700
Mmm,

00:00:51.700 --> 00:00:58.120
You know, if you look for say,

00:00:58.300 --> 00:01:00.350
a picture of Britney Spears

00:01:00.350 --> 00:01:02.400
that it doesn't really matter
where you rank it

00:01:02.400 --> 00:01:05.500
because you're going to figure out
what you're looking for,

00:01:06.200 --> 00:01:07.867
whether you put it
in the first or second

00:01:07.867 --> 00:01:09.800
or third position of the ranking.

00:01:10.100 --> 00:01:12.500
But if you're looking
for the best econometrics book,

00:01:13.300 --> 00:01:16.500
if you put your book
first or your book tenth,

00:01:16.500 --> 00:01:18.100
that's going to make
a big difference

00:01:18.600 --> 00:01:21.829
in how often people
are going to click on it.

00:01:21.829 --> 00:01:23.417
And so there you go --

00:01:23.417 --> 00:01:27.218
- [Josh] Why do I need
machine learning to discover that?

00:01:27.218 --> 00:01:29.100
It seems like
I can discover it simply.

00:01:29.300 --> 00:01:31.800
- [Guido] So in general, there
are lots of possible...

00:01:32.100 --> 00:01:36.300
You want to think about there
being lots of characteristics of

00:01:36.400 --> 00:01:42.000
the items, and you want to understand
what drives the heterogeneity

00:01:42.300 --> 00:01:45.600
in the effect of...
- [Josh] But you're just predicting. In some sense,

00:01:45.600 --> 00:01:47.700
you're solving a marketing problem.

00:01:48.400 --> 00:01:51.800
- [Guido] No, it's a causal effect.
- [Josh] It's causal, but it has no scientific content.

00:01:51.800 --> 00:01:53.300
Think about...

00:01:54.100 --> 00:01:57.300
- [Guido] No, but there are similar things
in medical settings.

00:01:58.000 --> 00:02:01.200
If you do an experiment, you
may actually be very interested

00:02:01.300 --> 00:02:03.800
in whether the treatment
works for some groups or not.

00:02:03.900 --> 00:02:06.500
And you have a lot of individual
characteristics and you want

00:02:06.500 --> 00:02:09.500
to systematically search...
- [Josh] Yeah. I'm skeptical about that,

00:02:09.500 --> 00:02:13.900
that sort of idea that there's this personal
causal effect that I should care about,

00:02:14.000 --> 00:02:18.200
and that machine learning can discover it
in some way that's useful. So think about...

00:02:18.300 --> 00:02:21.400
I've done a lot of work
on schools, going to, say,

00:02:21.400 --> 00:02:26.500
a charter school, a publicly funded
private school, effectively, you know,

00:02:26.500 --> 00:02:29.300
that's free to structure its own
curriculum, for context there.

00:02:29.300 --> 00:02:32.700
Some types of charter schools
generate spectacular

00:02:32.700 --> 00:02:36.400
achievement gains, and in the data
set that produces that result,

00:02:36.400 --> 00:02:37.800
I have a lot of covariates.

00:02:37.800 --> 00:02:41.200
So I have baseline scores,
and I have family background,

00:02:41.200 --> 00:02:45.800
the education of the parents, the sex
of the child, the race of the child.

00:02:45.800 --> 00:02:48.300
And, well, as soon as I put

00:02:48.400 --> 00:02:51.900
half a dozen of those together, I
have a very high-dimensional space.

00:02:52.300 --> 00:02:54.900
I'm definitely interested
in sort of coarse

00:02:54.900 --> 00:02:59.400
features of that treatment effect,
like whether it's better for people who

00:02:59.900 --> 00:03:02.100
come from lower income families.

00:03:02.600 --> 00:03:06.000
I have a hard time believing
that there's an application,

00:03:06.400 --> 00:03:10.300
you know, for the very
high-dimensional version of that, where

00:03:10.500 --> 00:03:13.200
I discover that for
non-white children who have

00:03:13.800 --> 00:03:17.800
high family incomes but baseline
scores in the third quartile,

00:03:18.300 --> 00:03:23.000
and only went to public school in the
third grade but not the sixth grade...

00:03:23.000 --> 00:03:25.500
So that's what that
high-dimensional analysis produces:

00:03:25.800 --> 00:03:28.100
this very elaborate conditional statement.

00:03:28.300 --> 00:03:31.000
There are two things that are wrong
with that, in my view. First,

00:03:31.000 --> 00:03:34.000
I don't see it as... I just can't
imagine why it's actionable.

00:03:34.600 --> 00:03:36.600
I don't know why you'd want to act on it.

00:03:36.600 --> 00:03:41.200
And I know also that there's some
alternative model that fits almost as well,

00:03:41.800 --> 00:03:43.000
that flips everything,

00:03:43.200 --> 00:03:47.500
right? Because machine learning doesn't
tell me that this is really the predictor

00:03:47.900 --> 00:03:48.100
that

00:03:48.400 --> 00:03:52.300
matters; it just tells me that this
is a good predictor. And so,

00:03:52.800 --> 00:03:55.900
you know, I think there is
something different about the

00:03:56.000 --> 00:03:58.400
social science context.
- [Guido] So I think

00:03:58.500 --> 00:04:02.600
the social science applications
you're talking about are ones where

00:04:03.400 --> 00:04:08.100
I think there's not a huge amount
of heterogeneity in the effects.

00:04:08.400 --> 00:04:14.000
- [Josh] Well, there might be, if you
allow me to fill that space.
- [Guido] No,

00:04:14.600 --> 00:04:18.100
not even then. I think for
a lot of those

00:04:18.300 --> 00:04:22.000
interventions, you would expect
that the effect is the same sign

00:04:22.100 --> 00:04:22.900
for everybody.

00:04:23.400 --> 00:04:27.600
There may be small differences
in the magnitude, but it's not...

00:04:28.200 --> 00:04:31.700
For a lot of these education
interventions, they're good for everybody.

00:04:31.800 --> 00:04:32.300
They're...

00:04:32.900 --> 00:04:37.600
It's not that they're bad for some
people and good for other people, and

00:04:37.600 --> 00:04:40.800
that there are kind of very small
pockets where they're bad.

00:04:40.900 --> 00:04:43.900
There may be some
variation in the magnitude,

00:04:44.000 --> 00:04:48.200
but you would need very, very big
data sets to find those, and I agree

00:04:48.400 --> 00:04:51.400
that in those cases, they probably
wouldn't be very actionable anyway.

00:04:51.700 --> 00:04:53.800
But I think there are
a lot of other settings

00:04:54.100 --> 00:04:56.600
where there is much more heterogeneity.

00:04:57.400 --> 00:05:01.600
- [Josh] Well, I'm open to that possibility,
and I think the example you gave...

00:05:01.900 --> 00:05:05.000
it's essentially a marketing example.

00:05:06.400 --> 00:05:08.400
- [Guido] No, that may...
those

00:05:08.500 --> 00:05:10.700
have implications for
industrial organization,

00:05:10.700 --> 00:05:13.900
how you actually...
whether you need to worry about

00:05:14.000 --> 00:05:17.900
the market power...
- [Josh] Well, I'd have to see that paper.

00:05:18.400 --> 00:05:21.200
- [Isaiah] So the
sense I'm getting is that

00:05:21.500 --> 00:05:23.500
we still disagree on something.
- Yes.

00:05:24.100 --> 00:05:26.700
- We haven't converged on everything.
- [Isaiah] I'm getting that sense.

00:05:27.200 --> 00:05:31.000
- Actually, we've diverged on this because
this wasn't around to argue about.

00:05:33.200 --> 00:05:38.000
- Is it getting a little warm here?
- Yeah. Warmed up. Warmed up is good.

00:05:38.100 --> 00:05:40.800
- [Isaiah] The sense I'm getting is, Josh,
sort of, you're not

00:05:40.900 --> 00:05:43.400
saying that you're confident
that there is no way

00:05:43.400 --> 00:05:45.400
that there is an application
where this stuff is useful.

00:05:45.400 --> 00:05:48.200
You are saying you're
unconvinced by the existing

00:05:48.300 --> 00:05:52.200
applications to date.
- [Josh] Fair. That I'm very confident of. Yeah.

00:05:54.200 --> 00:05:55.000
- [Guido] In this case,

00:05:55.300 --> 00:05:57.500
I think Josh does have a point that today

00:05:58.000 --> 00:06:02.100
even in the prediction cases, where

00:06:02.300 --> 00:06:05.000
a lot of the machine learning
methods really shine is

00:06:05.000 --> 00:06:06.600
where there's just a lot of heterogeneity.

00:06:07.300 --> 00:06:10.600
You don't really care much
about the details there, right?

00:06:10.900 --> 00:06:15.000
- Yes.
- [Guido] It doesn't have
a policy angle or something.

00:06:15.200 --> 00:06:18.100
Kind of recognizing
handwritten digits and stuff,

00:06:18.300 --> 00:06:24.000
it does much better there than
building some complicated model.

00:06:24.400 --> 00:06:28.100
But in a lot of the social science, a
lot of the economic applications,

00:06:28.300 --> 00:06:32.100
we actually know a huge amount about the
relationship between various variables.

00:06:32.100 --> 00:06:34.600
A lot of the relationships
are strictly monotone.

00:06:35.400 --> 00:06:39.400
Education is going
to increase people's earnings,

00:06:39.800 --> 00:06:44.100
irrespective of the demographic,
irrespective of the level of education

00:06:44.100 --> 00:06:47.800
you already have.
- [Josh] Until they get to a PhD.
- [Guido] Yeah, there is graduate school...

00:06:49.500 --> 00:06:50.700
Over a reasonable range,

00:06:51.600 --> 00:06:55.900
it's not going to
go down very much. Whereas

00:06:56.100 --> 00:06:59.700
in a lot of the settings where these
machine learning methods shine,

00:06:59.700 --> 00:07:01.900
there's a lot
of non-monotonicity,

00:07:02.100 --> 00:07:04.900
kind of multimodality,
in these relationships,

00:07:05.300 --> 00:07:11.500
and they're going to be very
powerful. But I still stand by that:

00:07:11.700 --> 00:07:16.100
these methods just
have a huge amount to offer

00:07:16.400 --> 00:07:18.100
for economists, and they're going

00:07:18.200 --> 00:07:21.700
to be a big part of the future.

00:07:23.400 --> 00:07:25.800
- [Isaiah] It feels like there's something interesting
to be said about machine learning here.

00:07:25.800 --> 00:07:27.700
So, Guido, I was wondering,
could you give some more,

00:07:28.000 --> 00:07:29.000
maybe some examples

00:07:29.000 --> 00:07:32.500
of the sorts of examples you're thinking
about, with applications at the moment?

00:07:32.500 --> 00:07:34.100
- [Guido] So one area is where,

00:07:34.700 --> 00:07:36.400
instead of looking for average

00:07:36.500 --> 00:07:42.200
causal effects, we're looking for
individualized estimates, predictions

00:07:42.400 --> 00:07:47.500
of causal effects, and there machine
learning algorithms have been very effective.

00:07:48.000 --> 00:07:48.100
Traditionally,

00:07:48.300 --> 00:07:51.500
we would have done
these things using kernel methods.

00:07:51.600 --> 00:07:54.500
And theoretically they work great, and

00:07:54.600 --> 00:07:57.400
there are some arguments that,
formally, you can't do any better.

00:07:57.600 --> 00:08:00.500
But in practice, they
don't work very well and

00:08:00.900 --> 00:08:05.400
random forest, random causal forest-type
things that Stefan Wager and Susan

00:08:05.400 --> 00:08:09.500
Athey have been working
on are used very widely.

00:08:09.600 --> 00:08:12.200
They've been very effective,
kind of, in these settings

00:08:12.400 --> 00:08:18.100
to actually get causal effects
that vary by

00:08:18.200 --> 00:08:19.900
covariates, and this kind of...

00:08:20.700 --> 00:08:25.700
I think this is still just the beginning
of these methods. But in many cases,

00:08:26.400 --> 00:08:31.600
these algorithms are very
effective at searching over big spaces

00:08:31.800 --> 00:08:35.600
and finding the functions that fit

00:08:35.900 --> 00:08:41.100
very well, in ways that we
couldn't really do beforehand.

00:08:41.500 --> 00:08:45.300
- [Josh] I don't know of an example where
machine learning has generated insights

00:08:45.300 --> 00:08:48.100
about a causal effect that
I'm interested in. And I

00:08:48.300 --> 00:08:51.300
do know of examples where it's
potentially very misleading.

00:08:51.300 --> 00:08:53.700
So I've done some work
with Brigham Frandsen

00:08:54.100 --> 00:08:55.100
using, for example,

00:08:55.100 --> 00:08:59.900
random forest to model covariate effects
in an instrumental variables problem

00:09:00.200 --> 00:09:01.200
where you need,

00:09:01.600 --> 00:09:03.500
you need to condition on covariates,

00:09:04.400 --> 00:09:08.200
and you don't particularly have strong
feelings about the functional form for that.

00:09:08.200 --> 00:09:10.000
So maybe you should,

00:09:10.500 --> 00:09:10.900
I think,

00:09:10.900 --> 00:09:14.500
be open to flexible curve fitting,
and that leads you down a path

00:09:14.500 --> 00:09:18.000
where there's a lot of
nonlinearities in the model and

00:09:18.200 --> 00:09:23.000
that's very dangerous with IV because
any sort of excluded non-linearity

00:09:23.300 --> 00:09:27.600
potentially generates a spurious causal
effect, and Brigham and I showed that

00:09:27.900 --> 00:09:32.200
very powerfully, I think, in
the case of two instruments

00:09:32.700 --> 00:09:36.000
that come from a paper of mine
with Bill Evans, where if you,

00:09:36.500 --> 00:09:37.600
you know, replace

00:09:38.100 --> 00:09:42.600
a traditional two-stage least squares
estimator with some kind of random forest,

00:09:42.900 --> 00:09:48.000
you get very precisely
estimated nonsense estimates. And,

00:09:49.000 --> 00:09:51.100
you know, I think that's
a big caution.

00:09:51.100 --> 00:09:53.400
And, you know, in view of those findings

00:09:53.700 --> 00:09:57.100
in an example I care about, where
the instruments are very simple

00:09:57.400 --> 00:09:59.100
and I believe that they're valid,

00:09:59.300 --> 00:10:01.600
you know, I would be skeptical of that. So

00:10:02.900 --> 00:10:06.800
non-linearity and IV don't mix
very comfortably.
- [Guido] No, I'd say,

00:10:07.200 --> 00:10:11.400
you know, in some sense that's already
a more complicated...
- [Josh] Well, it's IV.

00:10:11.600 --> 00:10:11.900
- [Guido] Yeah.

00:10:12.500 --> 00:10:16.700
- [Josh] But then we work on that.
- [Guido] Fair enough.

00:10:18.600 --> 00:10:22.300
- [Guido] As editor of Econometrica, actually, a lot
of these papers cross my desk,

00:10:22.700 --> 00:10:29.500
but the motivation is not
clear and, in fact, really lacking.

00:10:29.800 --> 00:10:35.100
They're not the old-type
semiparametric foundational papers.

00:10:35.400 --> 00:10:37.100
So that's a big problem,

00:10:38.000 --> 00:10:42.400
and a kind of related problem is that
we have this tradition in econometrics

00:10:42.600 --> 00:10:47.500
of being very focused on these formal
asymptotic results.

00:10:48.800 --> 00:10:52.600
We just have a lot of papers
where people propose

00:10:52.800 --> 00:10:55.700
a method and then establish
the asymptotic properties

00:10:56.300 --> 00:11:01.900
in a very kind of
standardized way.
- [Josh] Is that bad?

00:11:02.900 --> 00:11:07.200
- [Guido] Well, I think it's sort of closed
the door for a lot of work

00:11:07.200 --> 00:11:11.600
that doesn't fit into that, whereas
in the machine learning literature,

00:11:11.900 --> 00:11:14.300
a lot of things are
more algorithmic. People

00:11:15.700 --> 00:11:18.500
had algorithms for coming
up with predictions

00:11:18.800 --> 00:11:23.600
that turn out to actually work much better
than, say, nonparametric kernel regression.

00:11:24.000 --> 00:11:26.800
For the longest time, we were doing all
the nonparametrics in econometrics,

00:11:26.800 --> 00:11:31.100
and we did it using kernel regression, and
that was great for proving theorems.

00:11:31.300 --> 00:11:34.800
You could get confidence intervals and
consistency and asymptotic normality,

00:11:34.800 --> 00:11:37.000
and it was all great, but
it wasn't very useful.

00:11:37.300 --> 00:11:40.900
And the things they did in machine
learning are just way, way better,

00:11:41.000 --> 00:11:45.100
but they didn't have the...
- [Josh] That's not my beef with machine learning,

00:11:45.300 --> 00:11:51.200
that the theory is weak.
- [Guido] No, no, I'm saying
there, for the prediction part,

00:11:51.400 --> 00:11:54.500
it does much better.
- [Josh] Yeah, it's a better curve-fitting tool.

00:11:54.900 --> 00:11:56.500
- [Guido] But it did so

00:11:57.100 --> 00:12:02.700
in a way that would not have made
those papers initially easy to get into

00:12:03.000 --> 00:12:06.300
the econometrics journals because it
wasn't proving the type of things...

00:12:06.400 --> 00:12:11.200
You know, when Breiman was doing his
regression trees, that just didn't fit in,

00:12:11.800 --> 00:12:15.100
and I think he would have
had a very hard time

00:12:15.200 --> 00:12:18.400
publishing these things
in econometrics journals.

00:12:18.900 --> 00:12:24.400
So I think we limited
ourselves too much, and

00:12:24.700 --> 00:12:27.900
that left us closed off

00:12:28.000 --> 00:12:30.800
from a lot of these machine learning
methods that are actually very useful.

00:12:30.900 --> 00:12:34.000
Hmm. I mean, I think, in general,

00:12:34.900 --> 00:12:36.200
in that literature, the computer

00:12:36.200 --> 00:12:39.300
scientists have brought a huge
number of these algorithms,

00:12:39.600 --> 00:12:43.900
have proposed a huge number of these
algorithms that are actually very useful

00:12:44.000 --> 00:12:44.700
and that are

00:12:45.500 --> 00:12:49.100
affecting the way we're going
to be doing empirical work,

00:12:49.800 --> 00:12:55.100
but we've not fully internalized that
because we're still very focused on getting

00:12:55.300 --> 00:12:57.500
point estimates and
getting standard errors

00:12:58.600 --> 00:13:01.200
and getting p-values in a way that

00:13:01.700 --> 00:13:03.100
we need to move beyond

00:13:03.300 --> 00:13:04.300
to fully harness

00:13:04.300 --> 00:13:10.700
the force, the benefits
from the machine learning literature.

00:13:10.900 --> 00:13:15.100
- [Isaiah] Hmm. On the one hand, I guess I very
much take your point that sort of the

00:13:15.200 --> 00:13:18.600
traditional econometrics framework
of sort of propose a method,

00:13:18.600 --> 00:13:22.600
prove a limit theorem under some
asymptotic story, story, story,

00:13:22.600 --> 00:13:26.900
story, story, publish a
paper, is constraining,

00:13:26.900 --> 00:13:29.700
and that in some sense, by thinking more

00:13:29.700 --> 00:13:33.200
broadly about what a methods paper could
look like, we may... right, in some sense,

00:13:33.200 --> 00:13:35.900
certainly the machine learning
literature has found a bunch of things

00:13:35.900 --> 00:13:38.300
which seem to work quite
well for a number of problems

00:13:38.300 --> 00:13:42.400
and are now having substantial influence
in economics. I guess a question

00:13:42.400 --> 00:13:44.800
I'm interested in is, how do you think

00:13:45.200 --> 00:13:47.600
about the role of theory?

00:13:47.900 --> 00:13:51.200
Sort of, do you think there's
no value in the theory part of it?

00:13:51.600 --> 00:13:54.800
Because I guess it's sort of a question
that I often have, sort of seeing

00:13:54.800 --> 00:13:56.900
the output from a machine learning tool,

00:13:56.900 --> 00:13:59.400
and actually a number of the
methods that you talked about

00:13:59.400 --> 00:14:01.800
actually do have inferential
results developed for them.

00:14:02.600 --> 00:14:06.400
Something that I always wonder about is sort
of uncertainty quantification and just,

00:14:06.500 --> 00:14:08.000
you know, I have my prior,

00:14:08.000 --> 00:14:11.000
I come into the world with my view,
I see the result of this thing.

00:14:11.000 --> 00:14:14.500
How should I update based on it? And
in some sense, if I'm in a world where

00:14:14.600 --> 00:14:15.100
things are

00:14:15.200 --> 00:14:18.200
normally distributed, I know
how to do it. Here I don't.

00:14:18.200 --> 00:14:21.400
And so I'm interested to hear
how you think about that.
- [Guido] So

00:14:21.500 --> 00:14:24.300
I don't see this as sort
of closing it off, saying, well,

00:14:24.400 --> 00:14:26.500
these results
are not interesting,

00:14:26.600 --> 00:14:27.700
but there are going to be a lot of cases

00:14:28.000 --> 00:14:31.200
where it's going to be incredibly hard to
get those results, and we may not be able

00:14:31.200 --> 00:14:33.200
to get there, and

00:14:33.400 --> 00:14:37.700
we may need to do it in stages, where
first someone says, "Hey, I have this

00:14:39.600 --> 00:14:44.800
interesting algorithm for doing
something, and it works well by some

00:14:45.600 --> 00:14:49.900
criterion on
this particular data set,

00:14:51.000 --> 00:14:53.400
and I'm going to put it
out there," and we should...

00:14:53.700 --> 00:14:58.000
maybe someone will figure out a way that
you can later actually still do inference

00:14:58.000 --> 00:14:59.100
under some conditions.

00:14:59.100 --> 00:15:02.100
And maybe those are not
particularly realistic conditions,

00:15:02.100 --> 00:15:05.500
then we kind of go further.
But I think we've been

00:15:06.700 --> 00:15:11.400
constraining things too much, where we
said, you know, "This is the type of things

00:15:12.100 --> 00:15:14.400
that we need to do." And in some sense,

00:15:15.700 --> 00:15:18.200
that goes back to kind of
the way Josh and I

00:15:19.700 --> 00:15:21.900
thought about things for the
local average treatment effect.

00:15:21.900 --> 00:15:24.600
That wasn't quite the way people
were thinking about these problems

00:15:24.600 --> 00:15:29.200
before. There was a sense
that some of the people said, you know,

00:15:29.500 --> 00:15:31.900
the way you need to do these
things is you first say

00:15:32.200 --> 00:15:36.300
what you're interested in estimating,
and then you do the best job you can

00:15:36.500 --> 00:15:37.700
in estimating that,

00:15:38.100 --> 00:15:44.200
and what you guys are doing
is you're doing it backwards.

00:15:44.300 --> 00:15:46.700
You're going to say,
"Here, I have an estimator,

00:15:47.300 --> 00:15:49.600
and now I'm going to figure out what

00:15:49.800 --> 00:15:51.400
it's estimating," and then, ex post,

00:15:51.400 --> 00:15:53.900
you're going to say why you
think that's interesting

00:15:53.900 --> 00:15:56.600
or maybe why it's not interesting,
and that's not okay.

00:15:56.600 --> 00:15:58.600
You're not allowed to do it that way.

00:15:59.000 --> 00:16:04.100
And I think we should just be a little
bit more flexible in thinking about

00:16:04.300 --> 00:16:06.300
how to look at

00:16:06.400 --> 00:16:11.300
problems because I think we've missed
some things by not doing that.

00:16:13.000 --> 00:16:16.600
- So you've heard our views,
Isaiah. You've seen that we have

00:16:17.000 --> 00:16:20.400
some points of disagreement. Why
don't you referee this dispute for us?

00:16:22.500 --> 00:16:28.100
- [Isaiah] Oh, so nice of you to ask me
a small question. So I guess, for one,

00:16:28.200 --> 00:16:33.200
I very much agree with something
that Guido said earlier of...

00:16:36.000 --> 00:16:36.300
So...

00:16:36.500 --> 00:16:37.900
where it seems, where

00:16:37.900 --> 00:16:41.400
the case for machine learning seems
relatively clear is in settings where,

00:16:41.500 --> 00:16:45.100
you know, we're interested in some version
of a nonparametric prediction problem.

00:16:45.100 --> 00:16:49.700
So I'm interested in estimating a conditional
expectation or conditional probability,

00:16:50.000 --> 00:16:52.100
and in the past, maybe I
would have run a kernel...

00:16:52.100 --> 00:16:55.800
I would have run a kernel regression, or
I would have run a series regression, or

00:16:56.100 --> 00:16:57.400
something along those lines.

00:16:57.700 --> 00:16:58.000
Sort of,

00:16:58.000 --> 00:16:58.700
it seems like,

00:16:58.700 --> 00:17:02.000
at this point, we have a fairly good
sense that in a fairly wide range

00:17:02.000 --> 00:17:06.300
of applications, machine learning
methods seem to do better for,

00:17:06.400 --> 00:17:06.800
you know,

00:17:06.800 --> 00:17:08.800
estimating conditional mean functions

00:17:08.800 --> 00:17:12.000
or conditional probabilities or
various other nonparametric objects

00:17:12.400 --> 00:17:16.600
than more traditional nonparametric
methods that were studied in econometrics

00:17:16.600 --> 00:17:19.100
and statistics, especially
in high dimensional settings.

00:17:19.500 --> 00:17:23.100
- So you're thinking of maybe the propensity
score or something like that?

00:17:23.100 --> 00:17:25.300
- [Isaiah] Exactly, so nuisance functions. Yeah.

00:17:25.300 --> 00:17:28.900
So things like propensity scores,
or, I mean, even objects

00:17:28.900 --> 00:17:30.100
of more direct inference

00:17:30.200 --> 00:17:32.400
interest, like conditional
average treatment effects, right?

00:17:32.400 --> 00:17:35.100
which are the difference of two
conditional expectation functions,

00:17:35.100 --> 00:17:36.300
potentially things like that.

00:17:36.500 --> 00:17:40.400
Of course, even there,
right, the theory

00:17:40.500 --> 00:17:43.700
for inference, or the theory for
sort of how to interpret,

00:17:43.700 --> 00:17:45.900
how to make large-sample statements
about some of these things, is

00:17:46.000 --> 00:17:50.100
less well developed, depending on the
machine learning estimator used.

00:17:50.100 --> 00:17:53.800
And so, I think something
that is tricky is that we

00:17:53.900 --> 00:17:55.700
can have these methods, which work a lot,

00:17:55.700 --> 00:17:58.000
which seem to work a lot
better for some purposes,

00:17:58.000 --> 00:18:01.600
but which we need to be a bit
careful in how we plug them in or how

00:18:01.600 --> 00:18:03.300
we interpret the resulting statements.

00:18:03.600 --> 00:18:06.200
But of course, that's a very,
very active area right now, where

00:18:06.400 --> 00:18:10.400
people are doing tons of great work.
And so I fully expect and hope

00:18:10.400 --> 00:18:12.800
to see much more going forward there.

00:18:13.000 --> 00:18:17.300
So one issue with machine learning
that always seems a danger, or

00:18:17.400 --> 00:18:20.300
that is sometimes a danger
and has sometimes led to

00:18:20.500 --> 00:18:22.600
applications that have
made less sense, is

00:18:22.800 --> 00:18:25.100
when folks start with a method...

00:18:25.300 --> 00:18:28.500
start with a method that they're very
excited about rather than a question,

00:18:28.900 --> 00:18:32.100
right? So sort of starting with
a question where, here's the

00:18:32.500 --> 00:18:36.200
object I'm interested in, here is
the parameter of interest, let me,

00:18:36.700 --> 00:18:37.100
you know,

00:18:37.300 --> 00:18:39.500
think about how I would
identify that thing,

00:18:39.500 --> 00:18:41.800
how I would recover that
thing if I had a ton of data.

00:18:41.900 --> 00:18:44.000
Oh, here's a conditional
expectation function.

00:18:44.000 --> 00:18:47.100
Let me plug in a machine
learning estimator for that.

00:18:47.200 --> 00:18:48.800
That seems very, very sensible.

00:18:49.000 --> 00:18:53.100
Whereas, you know, if I
regress quantity on price

00:18:53.700 --> 00:18:56.000
and say that I used a
machine learning method,

00:18:56.300 --> 00:18:58.900
maybe I'm satisfied that that
solves the endogeneity problem

00:18:58.900 --> 00:19:01.200
we're usually worried
about there, maybe I'm not,

00:19:01.500 --> 00:19:03.200
but again, that's something where

00:19:03.400 --> 00:19:06.300
the way to address it seems
relatively clear, right?

00:19:06.500 --> 00:19:09.000
It's to find your object of interest and

00:19:09.200 --> 00:19:11.600
think about...
- Is that just bringing in the economics?

00:19:11.700 --> 00:19:12.200
- [Isaiah] Exactly.

00:19:12.200 --> 00:19:15.400
- And kind of think about
the identification, but harness

00:19:15.400 --> 00:19:18.300
the power of the machine
learning methods

00:19:18.500 --> 00:19:22.800
for some of the components.
- [Isaiah] Precisely. Exactly. So sort of, you know,

00:19:22.900 --> 00:19:25.600
the question of interest is the same as
the question of interest has always been,

00:19:25.600 --> 00:19:29.500
but we now have better methods for estimating
some pieces of this, right?

00:19:29.900 --> 00:19:31.600
The place that seems harder to, uh,

00:19:31.900 --> 00:19:33.400
harder to forecast is, right,

00:19:33.400 --> 00:19:36.300
obviously, there's a huge amount
going on in the machine

00:19:36.400 --> 00:19:37.400
learning literature,

00:19:37.500 --> 00:19:39.700
and the sort of limited ways

00:19:39.700 --> 00:19:42.900
of plugging it in that I've referenced
so far are a limited piece of that.

00:19:43.000 --> 00:19:46.100
And so I think there are all sorts of
other interesting questions about where,

00:19:46.300 --> 00:19:46.900
right sort of

00:19:47.100 --> 00:19:49.300
where does this interaction
go? What else can we learn?

00:19:49.300 --> 00:19:52.000
And that's something where,
you know, I think there's

00:19:52.200 --> 00:19:56.400
a ton going on which seems very promising
and I have no idea what the answer is.

00:19:57.000 --> 00:20:01.200
- No, no, no, I totally
agree with that, but, you know,

00:20:01.800 --> 00:20:03.500
that makes it very exciting.

00:20:03.800 --> 00:20:06.100
And I think there's just a
little work to be done there.

00:20:06.600 --> 00:20:11.400
All right. So Isaiah agrees
with me there.
- [Isaiah] I didn't say that per se.

00:20:14.500 --> 00:20:17.700
- [narrator] If you'd like to watch more
Nobel Conversations, click here,

00:20:18.000 --> 00:20:20.400
or if you'd like to learn
more about econometrics,

00:20:20.500 --> 00:20:23.100
check out Josh's Mastering
Econometrics series.

00:20:23.600 --> 00:20:26.500
If you'd like to learn more
about Guido, Josh, and Isaiah,

00:20:26.700 --> 00:20:28.200
check out the links in the description.