♪ [music] ♪

- [narrator] Welcome to Nobel Conversations.

In this episode, Josh Angrist and Guido Imbens sit down with Isaiah Andrews to discuss, and disagree, over the role of machine learning in applied econometrics.
- [Isaiah] So, of course, there are a lot of topics where you guys largely agree, but I'd like to turn to one where maybe you have some differences of opinion. I'd love to hear some of your thoughts about machine learning and the role that it's playing, and is going to play, in economics.
- [Guido] I've looked at some data, proprietary data, so there's no published paper there. There was an experiment that was done on some search algorithm, and the question was about ranking things and changing the ranking. It was clear there was going to be a lot of heterogeneity there. If you look for, say, a picture of Britney Spears, it doesn't really matter where you rank it, because you're going to figure out what you're looking for whether you put it in the first or second or third position of the ranking. But if you're looking for the best econometrics book, whether you put your book first or your book tenth is going to make a big difference in how often people are going to click on it. And so there you go --
- [Josh] Why do I need machine learning to discover that? It seems like I can discover it simply.

- [Guido] So in general, there are lots of possible rankings. You want to think about there being lots of characteristics of the items, and you want to understand what drives the heterogeneity in the effect of the ranking.

- [Josh] In some sense, you're solving a marketing problem there. It's a causal effect, but it has no scientific content.
- [Guido] But think about similar things in medical settings. If you do an experiment, you may actually be very interested in whether the treatment works for some groups or not. You have a lot of individual characteristics, and you want to search over them systematically.
- [Josh] Yeah, I'm skeptical about that, about the idea that there's this personal causal effect that I should care about and that machine learning can discover it in some way that's useful. So think about an example: I've done a lot of work on schools, say on charter schools, which are publicly funded private schools, effectively, that are free to structure their own curriculum, for context. Some types of charter schools generate spectacular achievement gains, and in the data set that produces that result, I have a lot of covariates. I have baseline scores, and I have family background, the education of the parents, the sex of the child, the race of the child. And as soon as I put half a dozen of those together, I have a very high dimensional space. I'm definitely interested in coarse features of that treatment effect, like whether it's better for people who come from lower income families. But I have a hard time believing that there's an application for the very high dimensional version of that, where I discover that for non-white children who have high family incomes, but baseline scores in the third quartile, and who only went to public school in the third grade but not the sixth grade, the effect is such-and-such. That's what that high dimensional analysis produces: a very elaborate conditional statement.
There are two things wrong with that, in my view. First, I just can't imagine why it's actionable; I don't know why you'd want to act on it. And second, I know that there's some alternative model that fits almost as well and that flips everything, right? Because machine learning doesn't tell me that this is really the predictor; it just tells me that this is a good predictor. And so, you know, I think there is something different about the social science context.
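(A minimal illustration of that last point, on simulated data rather than anything from the conversation: two trees can fit a sample almost equally well while "discovering" different subgroup rules. All variable names and parameters here are invented for the sketch.)

```python
# Two bootstrap fits with near-identical predictive fit but different
# "discovered" subgroups: machine learning flags a good predictor,
# not necessarily the predictor.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(1)
n, p = 2000, 6
X = rng.normal(size=(n, p))              # stand-ins for baseline score, income, ...
y = 0.1 * X[:, 0] + rng.normal(size=n)   # weak, simple heterogeneity

for seed in (1, 2):
    idx = rng.integers(0, n, size=n)     # bootstrap resample
    tree = DecisionTreeRegressor(max_depth=3, random_state=seed).fit(X[idx], y[idx])
    print(f"R^2 on the full sample: {tree.score(X, y):.3f}")
    print(export_text(tree, max_depth=1))  # the top split: the "discovered" subgroup
```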
- [Guido] So I think the social science applications you're talking about are ones where I think there's not a huge amount of heterogeneity in the effects.

- [Josh] So there might be a few?

- [Guido] No, not even then. I think for a lot of those interventions, you would expect that the effect is the same sign for everybody. There may be small differences in the magnitude, but a lot of these education interventions are good for everybody. It's not that they're bad for some people and good for other people, with maybe some very small pockets where they're bad. There may be some variation in the magnitude, but you would need very, very big data sets to find it, and in those cases it probably wouldn't be very actionable anyway. But I think there are a lot of other settings where there is much more heterogeneity.
- [Josh] Well, I'm open to that possibility, but the example you gave is essentially a marketing example. Maybe it has implications for industrial organization, for whether you need to worry about market power; I'd want to see that paper.
- [Isaiah] So the sense I'm getting is that we still disagree on something.

- [Josh] Yes, we haven't converged on everything.

- [Isaiah] I'm getting that sense.

- [Guido] Actually, we've diverged on this, because this wasn't around to argue about before.

- [Josh] Is it getting a little warm here? Warmed up. Warmed up is good.

- [Isaiah] The sense I'm getting, Josh, is that you're not saying you're confident that there is no application where this stuff is useful. You're saying you're unconvinced by the existing applications to date.

- [Josh] Fair. Of that, I'm very confident.
- [Guido] In this case, I think Josh does have a point that, today, even in the prediction cases, where a lot of the machine learning methods really shine is where there's just a lot of heterogeneity and you don't really care much about the details, right? It doesn't have a policy angle or something; it's things like recognizing handwritten digits, where these methods do much better than building some complicated model. But in a lot of the social science applications, a lot of the economic applications, we actually know a huge amount about the relationships between the various variables. A lot of the relationships are strictly monotone. Education is going to increase people's earnings, irrespective of the demographic, irrespective of the level of education you already have.

- [Josh] Until they get to a PhD.

- [Guido] Yeah, there is graduate school. But over a reasonable range, it's not going to go down very much. Whereas in a lot of the settings where these machine learning methods shine, there's a lot of nonmonotonicity, a kind of multimodality, in these relationships, and there they're going to be very powerful. But I still stand by it: these methods just have a huge amount to offer for economists, and they're going to be a big part of the future.
- [Isaiah] It feels like there's something interesting to be said about machine learning here. So could you give some more examples, maybe, of the sorts of applications you're thinking about at the moment?
- [Guido] So one area is where, instead of looking for average causal effects, we're looking for individualized estimates and predictions of causal effects, and there the machine learning algorithms have been very effective. Previously, we would have done these things using kernel methods, and theoretically they work great; there are even arguments that formally you can't do any better. But in practice they don't work very well, and random forest, random causal forest type things of the kind Stefan Wager and Susan Athey have been working on are used very widely. They've been very effective in these settings at actually getting causal effects that vary by covariates. I think this is still just the beginning of these methods, but in many cases these algorithms are very effective at searching over big spaces and finding the functions that fit very well, in ways that we couldn't really do before.
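(As a concrete sketch of the kind of exercise Guido describes, the snippet below uses two random forests, a simple "T-learner," as a stand-in for the Wager-Athey causal forests. The data and every parameter choice are simulated and invented for illustration, not taken from any of the papers mentioned.)

```python
# Individualized treatment-effect estimates as the difference of two
# machine-learned conditional expectation functions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, p = 5000, 10
X = rng.normal(size=(n, p))            # covariates
T = rng.integers(0, 2, size=n)         # randomized binary treatment
tau = 0.5 * X[:, 0]                    # true effect varies with one covariate
Y = X[:, 1] + tau * T + rng.normal(size=n)

# Separate outcome models for treated and control units.
m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 1], Y[T == 1])
m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 0], Y[T == 0])

tau_hat = m1.predict(X) - m0.predict(X)   # estimated individualized effects
print(np.corrcoef(tau_hat, tau)[0, 1])    # how well the heterogeneity is recovered
```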
- [Josh] I don't know of an example where machine learning has generated insights about a causal effect that I'm interested in, and I do know of examples where it's potentially very misleading. I've done some work with Brigham Frandsen, using, for example, random forests to model covariate effects in an instrumental variables problem, where you need to condition on covariates and you don't particularly have strong feelings about the functional form for that. So maybe you should be open to flexible curve fitting. That leads you down a path where there's a lot of nonlinearity in the model, and that's very dangerous with IV, because any sort of excluded nonlinearity potentially generates a spurious causal effect. Brigham and I showed that very powerfully, I think, in the case of two instruments that come from a paper of mine with Bill Evans: if you replace the traditional two-stage least squares estimator with some kind of random forest, you get very precisely estimated nonsense estimates. I think that's a big caution. And in view of those findings, in an example I care about, where the instruments are very simple and I believe that they're valid, I would be skeptical of machine learning there. Nonlinearity and IV don't mix very comfortably. Now, granted, in some sense that's already a more complicated setting; it's IV.
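(A minimal simulation in the spirit of that caution, written for this transcript rather than taken from the Angrist-Frandsen work: with a valid instrument and a true treatment effect of zero, a linear first stage recovers roughly zero, while plugging random-forest fitted values into two-stage least squares lets excluded nonlinearities and overfitting produce a spuriously nonzero estimate. The data-generating process is invented for illustration.)

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
u = rng.normal(size=n)                        # unobserved confounder
z = rng.integers(0, 2, size=n).astype(float)  # valid, randomly assigned instrument
t = 0.5 * z + np.sin(2 * x) + 0.5 * u + rng.normal(size=n)
y = np.sin(2 * x) + u                         # true effect of t is ZERO

def tsls(instrument, t, y, x):
    """2SLS of y on t, controlling linearly for x, with the given instrument."""
    W = np.column_stack([np.ones(len(x)), x])
    resid = lambda v: v - W @ np.linalg.lstsq(W, v, rcond=None)[0]
    zt, tt, yt = resid(instrument), resid(t), resid(y)
    return (zt @ yt) / (zt @ tt)

print("linear first stage:   ", tsls(z, t, y, x))      # close to 0, as it should be
forest = RandomForestRegressor(n_estimators=100, random_state=0)
t_hat = forest.fit(np.column_stack([z, x]), t).predict(np.column_stack([z, x]))
print("forest fitted values: ", tsls(t_hat, t, y, x))  # spuriously far from 0
```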
- [Guido] But people are working on that front. Actually, as editor of Econometrica, I see a lot of these papers cross my desk, and often the motivation is not clear, or is really lacking; they're not the old-style semiparametric foundational papers. So that's a big problem. And a related problem is that we have this tradition in econometrics of being very focused on formal asymptotic results. We just have a lot of papers where people propose a method and then establish its asymptotic properties, in a very standardized way.

- [Josh] Is that bad?
- [Guido] Well, I think it sort of closed the door for a lot of work that doesn't fit that mold, whereas in the machine learning literature, a lot of things are more algorithmic. People had algorithms for coming up with predictions that turn out to actually work much better than, say, nonparametric kernel regression. For a long time, we did all the nonparametrics in econometrics using kernel regression, and it was great for proving theorems. You could get confidence intervals and consistency and asymptotic normality, and it was all great, but it wasn't very useful.
And the things they did in machine learning are just way, way better, but they didn't have the proofs.

- [Josh] That's not my beef with machine learning theory, though, as you know. I'm saying that for the prediction part, it does much better. It's better curve fitting.
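(A small illustration of the contrast being drawn here, on simulated data: a textbook Nadaraya-Watson kernel regression against an off-the-shelf random forest on a ten-dimensional prediction problem. The bandwidth and the data-generating process are arbitrary choices made for this sketch.)

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, p = 1000, 10
f = lambda X: np.maximum(X[:, 0], 0) * (X[:, 1] > 0) + X[:, 2]  # true regression function
Xtr, Xte = rng.normal(size=(n, p)), rng.normal(size=(n, p))
ytr = f(Xtr) + rng.normal(size=n)

def nadaraya_watson(Xtr, ytr, Xte, h=1.0):
    # Gaussian product kernel: easy to prove theorems about, but it
    # suffers badly from the curse of dimensionality.
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * h ** 2))
    return (K @ ytr) / K.sum(1)

mse = lambda pred: ((pred - f(Xte)) ** 2).mean()
print("kernel regression MSE:", mse(nadaraya_watson(Xtr, ytr, Xte)))
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr, ytr)
print("random forest MSE:    ", mse(forest.predict(Xte)))
```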
- [Guido] But machine learning did so in a way that would not have made those papers initially easy to get into the econometrics journals, because it wasn't proving the type of things we expected. When Breiman was doing his regression trees, that just didn't fit in, and I think he would have had a very hard time publishing those things in econometrics journals. So I think we limited ourselves too much, and that closed things off for a lot of these machine learning methods that are actually very useful.
In general, I think the computer scientists in that literature have proposed a huge number of algorithms that are actually very useful and that are affecting the way we're going to be doing empirical work. But we've not fully internalized that, because we're still very focused on getting point estimates and getting standard errors and getting p-values, in a way that we need to move beyond in order to fully harness the benefits of the machine learning literature.
- [Isaiah] On the one hand, I very much take your point that the traditional econometrics framework, where you propose a method, prove a limit theorem under some asymptotic story, and publish a paper, is constraining, and that in some sense we may gain by thinking more broadly about what a methods paper could look like. Certainly the machine learning literature has found a bunch of things which seem to work quite well for a number of problems and which are now having substantial influence in economics. I guess the question I'm interested in is, how do you think about the role of theory? Do you think there's no value in the theory part of it? It's a question I often have when seeing the output from a machine learning tool, and actually a number of the methods you talked about do have inferential results developed for them. Something I always wonder about is uncertainty quantification: I have my prior, I come into the world with my view, and I see the result of this thing. How should I update based on it? In some sense, if I'm in a world where things are normally distributed, I know how to do that; here, I don't. So I'm interested to hear your thoughts.
- [Guido] So I don't see this as closing the door, as saying those results are not interesting. But there are going to be a lot of cases where it's going to be incredibly hard to get those results, and we may not be able to get there. We may need to do it in stages, where first someone says, "Hey, I have this interesting algorithm for doing something, and it works well by some criterion on this particular data set," and we put it out there, and maybe later someone will figure out a way that you can actually still do inference, under some conditions. And maybe those are not particularly realistic conditions, and then we go a bit further. But I think we've been constraining things too much, where we said, "This is the type of thing that we need to do."

In some sense, that goes back to the way Josh and I thought about things for the local average treatment effect. That wasn't quite the way people were thinking about these problems before. There was a sense among some people that the way you need to do these things is to first say what you're interested in estimating, and then do the best job you can in estimating that, and that what you guys are doing is doing it backwards: you have an estimator, and then you figure out what it's estimating, and then ex post you say why you think that's interesting, or maybe why it's not interesting, and that's not okay, you're not allowed to do it that way. I think we should just be a little bit more flexible in thinking about how to look at problems, because I think we've missed some things by not doing that.
- [Josh] So you've heard our views, Isaiah. You've seen that we have some points of disagreement. Why don't you referee this dispute for us?
- [Isaiah] Oh, it's so nice of you to ask me such a small question. So I guess, for one, I very much agree with something that Guido said earlier: the case for machine learning seems relatively clear in settings where we're interested in some version of a nonparametric prediction problem. So I'm interested in estimating a conditional expectation or a conditional probability, and in the past maybe I would have run a kernel regression or a series regression, or something along those lines. It seems like, at this point, we have a fairly good sense that in a fairly wide range of applications, machine learning methods seem to do better than the more traditional nonparametric methods studied in econometrics and statistics for estimating conditional mean functions or conditional probabilities or various other nonparametric objects, especially in high-dimensional settings.
- [Guido] So are you thinking of, maybe, the propensity score, or something like that?

- [Isaiah] Exactly: nuisance functions, things like propensity scores, but also objects of more direct interest, like conditional average treatment effects, which are the difference of two conditional expectation functions. Potentially things like that.
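(A sketch of the plug-in idea being described, assuming simulated data: estimate nuisance functions such as the propensity score with a machine learning method, cross-fit them, and plug them into a standard doubly robust estimator of the average treatment effect. The choice of gradient boosting, and every number here, is an arbitrary assumption for the sketch.)

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 4000, 10
X = rng.normal(size=(n, p))
e = 1 / (1 + np.exp(-X[:, 0]))              # true propensity score
T = rng.binomial(1, e)
Y = X[:, 0] + 2.0 * T + rng.normal(size=n)  # true average treatment effect = 2

psi = np.zeros(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    # Cross-fitting: nuisance functions are fit on one fold, used on the other.
    ps = GradientBoostingClassifier().fit(X[train], T[train])
    m1 = GradientBoostingRegressor().fit(X[train][T[train] == 1], Y[train][T[train] == 1])
    m0 = GradientBoostingRegressor().fit(X[train][T[train] == 0], Y[train][T[train] == 0])
    e_hat = ps.predict_proba(X[test])[:, 1].clip(0.01, 0.99)
    mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
    psi[test] = (mu1 - mu0
                 + T[test] * (Y[test] - mu1) / e_hat
                 - (1 - T[test]) * (Y[test] - mu0) / (1 - e_hat))
print("doubly robust ATE estimate:", psi.mean())  # should land close to 2
```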
- [Isaiah] Of course, even there, the theory for inference, for how to interpret these estimates and make large-sample statements about them, is less well developed, depending on the machine learning estimator used. So I think what's tricky is that we can have these methods, which seem to work a lot better for some purposes, but which we need to be a bit careful about in how we plug them in and how we interpret the resulting statements. Of course, that's a very, very active area right now, where people are doing tons of great work, and so I fully expect, and hope, to see much more going forward there.
One issue with machine learning that always seems a danger, or that is sometimes a danger and has sometimes led to applications that make less sense, is when folks start with a method they're very excited about rather than with a question. Starting with a question, where here's the object I'm interested in, here's the parameter of interest, let me think about how I would identify that thing and how I would recover it if I had a ton of data... oh, here's a conditional expectation function, let me plug in a machine learning estimator for it... that seems very, very sensible. Whereas if I regress quantity on price and say that I used a machine learning method, maybe I'm satisfied that that solves the endogeneity problems we're usually worried about there, and maybe I'm not. But again, that's something where the way to address it seems relatively clear: define your object of interest first, and...
- [Josh] Is that just bringing in the economics?

- [Isaiah] Exactly, and then think about how to identify it, but harness the power of the machine learning methods for some of the components.

- [Josh] Precisely.

- [Isaiah] Exactly. So the question of interest is the same as the question of interest has always been, but we now have better methods for estimating some pieces of it.
The place that seems harder to forecast is this: obviously, there's a huge amount going on in the machine learning literature, and the somewhat limited ways of plugging it in that I've referenced so far are a limited piece of that. So I think there are all sorts of other interesting questions about where this interaction goes and what else we can learn, and that's something where I think there's a ton going on that seems very promising, and I have no idea what the answer is.
- [Guido] No, I totally agree with that, but that makes it very exciting, and I think there's just a lot of work to be done there.

- [Josh] All right, so Isaiah agrees with me there.
- [narrator] If you'd like to watch more Nobel Conversations, click here. Or, if you'd like to learn more about econometrics, check out Josh's Mastering Econometrics series. If you'd like to learn more about Guido, Josh, and Isaiah, check out the links in the description.