How Will Machine Learning Impact Economics?

  • 0:00 - 0:02
    ♪ [music] ♪
  • 0:04 - 0:06
    - [Narrator] Welcome
    to Nobel Conversations.
  • 0:07 - 0:10
    In this episode, Josh Angrist
    and Guido Imbens
  • 0:10 - 0:14
    sit down with Isaiah Andrews
    to discuss and disagree
  • 0:14 - 0:17
    over the role of machine learning
    in applied econometrics.
  • 0:18 - 0:20
    - [Isaiah] So, of course,
    there are a lot of topics
  • 0:20 - 0:21
    where you guys largely agree,
  • 0:21 - 0:22
    but I'd like to turn to one
  • 0:22 - 0:24
    where maybe you have
    some differences of opinion.
  • 0:24 - 0:26
    I'd love to hear
    some of your thoughts
  • 0:26 - 0:27
    about machine learning
  • 0:27 - 0:30
    and the role that it's playing
    and is going to play in economics.
  • 0:30 - 0:33
    - [Guido] I've looked at some data,
    like proprietary data,
  • 0:33 - 0:35
    so there's
    no published paper there.
  • 0:36 - 0:38
    There was an experiment
    that was done
  • 0:38 - 0:40
    on some search algorithm,
  • 0:40 - 0:41
    and the question was...
  • 0:43 - 0:46
    it was about ranking things
    and changing the ranking.
  • 0:46 - 0:47
    And it was sort of clear...
  • 0:48 - 0:51
    that there was going to be
    a lot of heterogeneity there.
  • 0:52 - 0:58
    If you look for, say,
  • 0:58 - 1:01
    a picture of Britney Spears,
  • 1:01 - 1:02
    it doesn't really matter
    where you rank it
  • 1:02 - 1:06
    because you're going to figure out
    what you're looking for,
  • 1:06 - 1:08
    whether you put it
    in the first or second
  • 1:08 - 1:10
    or third position of the ranking.
  • 1:10 - 1:12
    But if you're looking
    for the best econometrics book,
  • 1:13 - 1:16
    if you put your book first
    or your book tenth --
  • 1:16 - 1:18
    that's going to make
    a big difference
  • 1:19 - 1:21
    how often people
    are going to click on it.
  • 1:22 - 1:23
    And so there you --
  • 1:23 - 1:27
    - [Josh] Why do I need
    machine learning to discover that?
  • 1:27 - 1:29
    It seems like I could
    discover it simply.
  • 1:29 - 1:30
    - [Guido] So in general--
  • 1:30 - 1:32
    - [Josh] There were lots
    of possible...
  • 1:32 - 1:35
    - What you want to think about is
    there being lots of characteristics
  • 1:35 - 1:38
    of the items,
  • 1:38 - 1:42
    and you want to understand
    what drives the heterogeneity
  • 1:42 - 1:43
    in the effect of--
  • 1:43 - 1:46
    - But you're just predicting
  • 1:46 - 1:48
    In some sense, you're solving
    a marketing problem.
  • 1:48 - 1:50
    - [inaudible] it's a causal effect.
  • 1:50 - 1:52
    - It's causal, but it has
    no scientific content.
  • 1:52 - 1:53
    Think about...
  • 1:54 - 1:57
    - No, but there are similar things
    in medical settings.
  • 1:58 - 2:01
    If you do an experiment,
    you may actually be very interested
  • 2:01 - 2:04
    in whether the treatment
    works for some groups or not.
  • 2:04 - 2:06
    And you have a lot of individual
    characteristics,
  • 2:06 - 2:08
    and you want
    to systematically search.
  • 2:08 - 2:10
    - Yeah. I'm skeptical about that --
  • 2:10 - 2:13
    that sort of idea that there's
    this personal causal effect
  • 2:13 - 2:14
    that I should care about,
  • 2:14 - 2:16
    and that machine learning
    can discover it
  • 2:16 - 2:18
    in some way that's useful.
  • 2:18 - 2:21
    So think about -- I've done
    a lot of work on schools,
  • 2:21 - 2:24
    going to, say, a charter school,
  • 2:24 - 2:25
    a publicly funded private school,
  • 2:25 - 2:26
    effectively, you know,
    that's free to structure
  • 2:26 - 2:29
    its own curriculum
    for context there.
  • 2:29 - 2:31
    Some types of charter schools
  • 2:31 - 2:33
    generate spectacular
    achievement gains,
  • 2:33 - 2:36
    and in the data set
    that produces that result,
  • 2:36 - 2:38
    I have a lot of covariates.
  • 2:38 - 2:41
    So I have baseline scores,
    and I have family background,
  • 2:41 - 2:44
    the education of the parents,
  • 2:44 - 2:46
    the sex of the child,
    the race of the child.
  • 2:46 - 2:48
    And, well, as soon as I put
    half a dozen of those together,
  • 2:48 - 2:52
    I have a very high dimensional space.
  • 2:52 - 2:54
    I'm definitely interested
    in sort of coarse features
  • 2:54 - 2:55
    of that treatment effect,
  • 2:55 - 2:57
    like whether it's better for people
  • 2:57 - 2:59
    who come from
    lower income families.
  • 3:03 - 3:06
    I have a hard time believing
    that there's an application,
  • 3:06 - 3:10
    for the very high dimensional
    version of that,
  • 3:10 - 3:12
    where I discovered
    that for non-white children
  • 3:12 - 3:13
    who have high family incomes
  • 3:14 - 3:18
    but baseline scores
    in the third quartile
  • 3:18 - 3:21
    and only went to public school
    in the third grade
  • 3:21 - 3:23
    but not the sixth grade.
  • 3:23 - 3:26
    So that's what that high
    dimensional analysis produces.
  • 3:26 - 3:28
    This very elaborate
    conditional statement.
  • 3:28 - 3:31
    There's two things that are wrong
    with that in my view.
  • 3:31 - 3:32
    First, I don't see it as...
  • 3:32 - 3:34
    I just can't imagine
    why it's actionable.
  • 3:35 - 3:37
    I don't know why
    you'd want to act on it.
  • 3:37 - 3:39
    And I know also
    that there's some alternative model
  • 3:39 - 3:41
    that fits almost as well,
  • 3:42 - 3:43
    that flips everything,
  • 3:43 - 3:45
    Because machine learning
    doesn't tell me
  • 3:45 - 3:48
    that this is really
    the predictor that matters.
  • 3:48 - 3:52
    It just tells me that
    this is a good predictor.
  • 3:53 - 3:54
    And so, I think
    there is something different
  • 3:54 - 3:56
    about the social science context.
  • 3:58 - 4:00
    - [Guido] I think
    the social science applications
  • 4:00 - 4:01
    you're talking about,
  • 4:01 - 4:03
    ones where...
  • 4:03 - 4:08
    I think there's not a huge amount
    of heterogeneity in the effects.
  • 4:08 - 4:11
    - [Josh] There might be
  • 4:11 - 4:14
    if you allow me
    to fill that space.
  • 4:15 - 4:16
    - No... not even then.
  • 4:16 - 4:18
    I think for a lot
    of those interventions,
  • 4:18 - 4:22
    you would expect that the effect
    is the same sign for everybody.
  • 4:23 - 4:28
    There may be small differences
    in the magnitude, but it's not...
  • 4:28 - 4:32
    For a lot of these education
    interventions -- they're good for everybody.
  • 4:33 - 4:35
    It's not that they're bad
    for some people
  • 4:35 - 4:38
    and good for other people,
  • 4:38 - 4:39
    other than maybe some
    very small pockets
  • 4:39 - 4:41
    where they're bad.
  • 4:41 - 4:44
    But there may be some variation
    in the magnitude,
  • 4:44 - 4:48
    but you would need very,
    very big data sets to find those.
  • 4:48 - 4:50
    I agree that in those cases,
  • 4:50 - 4:51
    they probably wouldn't be
    very actionable anyway.
  • 4:52 - 4:54
    But I think there's a lot
    of other settings
  • 4:54 - 4:57
    where there is
    much more heterogeneity.
  • 4:57 - 5:00
    - Well, I'm open
    to that possibility,
  • 5:00 - 5:06
    and I think the example you gave
    is essentially a marketing example.
  • 5:06 - 5:11
    - No, those have implications
    for the organization,
  • 5:11 - 5:14
    whether you need
    to worry about the...
  • 5:14 - 5:18
    - Well, I need to see that paper.
  • 5:18 - 5:21
    - So the sense I'm getting...
  • 5:22 - 5:23
    - We still disagree on something.
    - Yes.
  • 5:23 - 5:24
    [laughter]
  • 5:24 - 5:25
    - We haven't converged
    on everything.
  • 5:25 - 5:26
    - I'm getting that sense.
  • 5:26 - 5:27
    [laughter]
  • 5:27 - 5:29
    - Actually, we've diverged on this
  • 5:29 - 5:30
    because this wasn't around
    to argue about.
  • 5:30 - 5:31
    [laughter]
  • 5:33 - 5:36
    - Is it getting a little warm here?
  • 5:36 - 5:38
    - Warmed up. Warmed up is good.
  • 5:38 - 5:41
    The sense I'm getting is, Josh,
    you're not saying
  • 5:41 - 5:43
    that you're confident
    that there is no way
  • 5:43 - 5:45
    that there is an application
    where this stuff is useful.
  • 5:45 - 5:47
    You're saying
  • 5:47 - 5:48
    you are unconvinced by
    the existing applications to date.
  • 5:48 - 5:51
    Fair enough.
  • 5:51 - 5:53
    - I'm very confident.
  • 5:53 - 5:54
    [laughter]
  • 5:54 - 5:55
    - In this case.
  • 5:55 - 5:58
    - I think Josh does have a point
  • 5:58 - 6:02
    that even in the prediction cases
  • 6:02 - 6:05
    where a lot of the machine learning
    methods really shine
  • 6:05 - 6:07
    is where there's just a lot
    of heterogeneity.
  • 6:07 - 6:11
    - You don't really care much
    about the details there, right?
  • 6:11 - 6:15
    It doesn't have
    a policy angle or something.
  • 6:15 - 6:18
    - Like recognizing
    handwritten digits and stuff.
  • 6:18 - 6:21
    It does much better there
  • 6:21 - 6:24
    than building
    some complicated model.
  • 6:24 - 6:28
    But a lot of the social science,
    a lot of the economic applications,
  • 6:28 - 6:30
    we actually know a huge amount
    about the relationship
  • 6:30 - 6:32
    between the variables.
  • 6:32 - 6:35
    A lot of the relationships
    are strictly monotone.
  • 6:35 - 6:39
    Education is going to increase
    people's earnings,
  • 6:40 - 6:42
    irrespective of the demographic,
  • 6:42 - 6:44
    irrespective of the level
    of education you already have.
  • 6:44 - 6:46
    - Until they get to a Ph.D.
  • 6:46 - 6:48
    - Yeah, there is a graduate school...
  • 6:48 - 6:49
    [laughter]
  • 6:50 - 6:51
    but over a reasonable range,
  • 6:52 - 6:56
    it's not going
    to go down very much.
  • 6:56 - 6:58
    In a lot of the settings
  • 6:58 - 7:00
    where these machine learning
    methods shine,
  • 7:00 - 7:02
    there's a lot of [inaudible]
  • 7:02 - 7:05
    kind of multimodality
    in these relationships,
  • 7:05 - 7:08
    and they're going to be
    very powerful.
  • 7:08 - 7:12
    But I still stand by that.
  • 7:12 - 7:16
    These methods just have
    a huge amount to offer
  • 7:16 - 7:18
    for economists,
  • 7:18 - 7:22
    and they're going to be
    a big part of the future.
  • 7:23 - 7:25
    - [Isaiah] Feels like
    there's something interesting
  • 7:25 - 7:26
    to be said about
    machine learning here.
  • 7:26 - 7:28
    So, Guido, I was wondering,
    could you give some more...
  • 7:28 - 7:29
    maybe some examples
    of the sorts of settings
  • 7:29 - 7:32
    you're thinking about
    with applications [inaudible] at the moment?
  • 7:32 - 7:34
    - So, in areas where
  • 7:35 - 7:36
    instead of looking
    for average causal effects,
  • 7:36 - 7:39
    we're looking for
    individualized estimates,
  • 7:39 - 7:42
    predictions of causal effects,
  • 7:42 - 7:45
    the machine learning algorithms
    have been very effective.
  • 7:48 - 7:52
    Traditionally, we would have done
    these things using kernel methods.
  • 7:52 - 7:54
    And theoretically they work great,
  • 7:55 - 7:56
    and there's some arguments
  • 7:56 - 7:57
    that, formally,
    you can't do any better.
  • 7:58 - 8:00
    But in practice,
    they don't work very well.
  • 8:01 - 8:03
    Random causal forest-type things
  • 8:03 - 8:05
    that Stefan Wager and Susan Athey
    have been working on
  • 8:05 - 8:10
    have been used very widely.
  • 8:10 - 8:12
    They've been very effective
    in these settings
  • 8:12 - 8:18
    to actually get causal effects
    that vary by [inaudible].
  • 8:21 - 8:23
    I think this is still just the beginning
    of these methods.
  • 8:23 - 8:26
    But in many cases,
  • 8:26 - 8:32
    these algorithms are very effective
    at searching over big spaces
  • 8:32 - 8:36
    and finding the functions that fit very well
  • 8:36 - 8:41
    in ways that we couldn't
    really do beforehand.
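
A minimal sketch of the kind of individualized effect estimation described here, on simulated data. The estimator Guido refers to is the causal forest of Stefan Wager and Susan Athey (the grf package in R, or CausalForestDML in Python's econml); the snippet below is only a simplified forest-based "T-learner" built from scikit-learn, with hypothetical variable names, not their honest-splitting estimator.

    # A simplified sketch (not the Wager-Athey causal forest): fit separate
    # random forests for treated and control outcomes and take the difference
    # of the fitted conditional means as a rough estimate of the conditional
    # average treatment effect. Data and variable names are simulated.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n, p = 5000, 10
    X = rng.normal(size=(n, p))                  # covariates
    T = rng.integers(0, 2, size=n)               # randomized binary treatment
    tau = 1.0 + 0.5 * X[:, 0]                    # effect varies with X[:, 0]
    Y = X[:, 1] + tau * T + rng.normal(size=n)   # outcome

    # Outcome model for each treatment arm.
    m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 1], Y[T == 1])
    m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[T == 0], Y[T == 0])

    # Individualized (conditional) treatment effect estimates.
    cate_hat = m1.predict(X) - m0.predict(X)
    print("average effect estimate:", cate_hat.mean())
    print("corr. with true effect: ", np.corrcoef(cate_hat, tau)[0, 1])

The causal forest itself adds sample splitting ("honesty") and targeted split criteria so that confidence intervals for these individualized effects are valid, which a plain T-learner does not provide.
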
  • 8:42 - 8:43
    - I don't know of an example
  • 8:43 - 8:45
    where machine learning
    has generated insights
  • 8:45 - 8:48
    about a causal effect
    that I'm interested in.
  • 8:48 - 8:50
    And I do know of examples
  • 8:50 - 8:51
    where it's potentially
    very misleading.
  • 8:51 - 8:54
    So I've done some work
    with Brigham Frandsen,
  • 8:54 - 8:55
    using, for example, random forest
    to model covariate effects
  • 8:55 - 9:00
    in an instrumental
    variables problem
  • 9:00 - 9:01
    where you need
    to condition on covariates.
  • 9:04 - 9:06
    And you don't particularly
    have strong feelings
  • 9:06 - 9:08
    about the functional form for that,
  • 9:08 - 9:10
    so maybe you should curve...
  • 9:11 - 9:13
    be open to flexible curve fitting,
  • 9:13 - 9:14
    and that leads you down a path
  • 9:14 - 9:18
    where there's a lot
    of nonlinearities in the model,
  • 9:18 - 9:21
    and that's very dangerous with IV
  • 9:21 - 9:23
    because any sort
    of excluded non-linearity
  • 9:23 - 9:25
    potentially generates
    a spurious causal effect
  • 9:25 - 9:28
    and Brigham and I
    showed that very powerfully.
  • 9:28 - 9:32
    I think in the case
    of two instruments
  • 9:33 - 9:36
    that come from a paper of mine
    with Bill Evans,
  • 9:36 - 9:38
    where if you replace
  • 9:38 - 9:40
    a traditional two-stage
    least squares estimator
  • 9:40 - 9:43
    with some kind of random forest,
  • 9:43 - 9:48
    you get very precisely
    estimated nonsense estimates.
  • 9:49 - 9:51
    I think that's a big caution.
  • 9:51 - 9:53
    In view of those findings
    in an example I care about
  • 9:54 - 9:57
    where the instruments
    are very simple
  • 9:57 - 9:59
    and I believe that they're valid,
  • 9:59 - 10:02
    I would be skeptical of that.
  • 10:03 - 10:07
    So non-linearity and IV
    don't mix very comfortably.
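
A hedged sketch of the mechanics behind this warning, on simulated data with hypothetical variable names (it is not the Angrist-Evans application or the Angrist-Frandsen exercise). The true effect of x on y is zero and the instrument z is valid, but a flexible first stage fit on (z, w) picks up an excluded nonlinearity in the covariate w that also sits in the structural error, so plugging its fitted values into a linear second stage (the "forbidden regression") can manufacture a spurious effect; a linear first stage does not.

    # Simulated illustration: textbook 2SLS with a linear first stage versus a
    # "forbidden regression" where random-forest fitted values replace the
    # endogenous regressor in the second stage.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    n = 20000
    z = rng.integers(0, 2, size=n).astype(float)   # valid binary instrument
    w = rng.normal(size=n)                         # observed covariate
    u = rng.normal(size=n)                         # unobserved confounder
    x = 0.5 * z + w**2 + u + rng.normal(size=n)    # endogenous regressor
    y = 0.0 * x + w**2 + u + rng.normal(size=n)    # true effect of x is zero

    def second_stage(y, x_hat, w):
        """OLS of y on (1, x_hat, w); return the coefficient on x_hat."""
        X2 = np.column_stack([np.ones_like(x_hat), x_hat, w])
        return np.linalg.lstsq(X2, y, rcond=None)[0][1]

    # Linear first stage in (1, z, w): standard 2SLS.
    Z = np.column_stack([np.ones_like(z), z, w])
    x_hat_linear = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

    # Random-forest first stage in (z, w): the forest absorbs w**2,
    # which is excluded from the linear second stage.
    forest = RandomForestRegressor(n_estimators=100, min_samples_leaf=50, random_state=0)
    ZW = np.column_stack([z, w])
    x_hat_forest = forest.fit(ZW, x).predict(ZW)

    print("linear 2SLS coefficient:       ", second_stage(y, x_hat_linear, w))
    print("forest first-stage coefficient:", second_stage(y, x_hat_forest, w))
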
  • 10:07 - 10:10
    - No, it sounds like that's already
    a more complicated...
  • 10:10 - 10:11
    - Well, it's IV....
    - Yeah.
  • 10:12 - 10:17
    - ...and we work on that.
  • 10:17 - 10:18
    [laughter]
  • 10:18 - 10:19
    - Fair enough.
  • 10:19 - 10:20
    - As Editor of Econometrica,
  • 10:20 - 10:22
    a lot of these papers
    cross my desk,
  • 10:23 - 10:26
    but the motivation is not clear
  • 10:26 - 10:30
    and, in fact, really lacking.
  • 10:30 - 10:35
    They're not what we would call
    semi-parametric foundational papers.
  • 10:35 - 10:37
    So that's a big problem.
  • 10:38 - 10:42
    A related problem is that we have
    this tradition in econometrics
  • 10:43 - 10:48
    of being very focused
    on these formal asymptotic results.
  • 10:49 - 10:53
    We just have a lot of papers
    where people propose a method
  • 10:53 - 10:56
    and then establish
    the asymptotic properties
  • 10:56 - 10:59
    in a very kind of standardized way.
  • 10:59 - 11:02
    - Is that bad?
  • 11:03 - 11:07
    - Well, I think it's sort
    of closed the door
  • 11:07 - 11:09
    for a lot of work
    that doesn't fit into that,
  • 11:09 - 11:12
    whereas in the machine
    learning literature,
  • 11:12 - 11:14
    a lot of things
    are more algorithmic.
  • 11:14 - 11:18
    People had algorithms
    for coming up with predictions
  • 11:19 - 11:21
    that turn out
    to actually work much better
  • 11:21 - 11:24
    than, say, nonparametric
    kernel regression.
  • 11:24 - 11:27
    For a long time, we were doing all
    the nonparametrics in econometrics,
  • 11:27 - 11:29
    we were using kernel regression,
  • 11:29 - 11:31
    and it was great for proving theorems.
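
For reference, the estimator being described is Nadaraya-Watson kernel regression: a locally weighted average of the outcome, with weights that decay with distance from the evaluation point. A minimal sketch on toy data:

    # Minimal Nadaraya-Watson kernel regression (Gaussian kernel) on toy data.
    import numpy as np

    def kernel_regression(x_grid, x, y, bandwidth):
        """Estimate E[y | x] at each point of x_grid by a weighted average."""
        w = np.exp(-0.5 * ((x_grid[:, None] - x[None, :]) / bandwidth) ** 2)
        return (w @ y) / w.sum(axis=1)

    rng = np.random.default_rng(2)
    x = rng.uniform(0.0, 2.0 * np.pi, size=500)
    y = np.sin(x) + 0.3 * rng.normal(size=500)
    grid = np.linspace(0.0, 2.0 * np.pi, 50)
    print(kernel_regression(grid, x, y, bandwidth=0.3)[:5])

The estimator works well in one or two dimensions but degrades quickly as covariates are added (the curse of dimensionality), which is the practical gap that the more algorithmic machine learning methods ended up filling.
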
  • 11:31 - 11:33
    You could get confidence intervals
  • 11:33 - 11:35
    and consistency,
    and asymptotic normality,
  • 11:35 - 11:36
    and it was all great,
  • 11:36 - 11:37
    but it wasn't very useful.
  • 11:37 - 11:39
    And the things they did
    in machine learning
  • 11:39 - 11:41
    are just way, way better.
  • 11:41 - 11:43
    But they didn't have the problem--
  • 11:43 - 11:44
    - That's not my beef
    with machine learning theory.
  • 11:44 - 11:45
    [laughter]
  • 11:45 - 11:51
    No, but I'm saying there,
    for the prediction part,
  • 11:51 - 11:53
    it does much better.
  • 11:53 - 11:54
    - Yeah, it's better
    curve fitting.
  • 11:55 - 11:56
    - But it did so in a way
  • 11:57 - 11:58
    that would not have made
    those papers
  • 11:58 - 12:00
    initially easy to get into
    the econometrics journals,
  • 12:05 - 12:06
    because it wasn't proving
    the type of things.
  • 12:06 - 12:09
    When Breiman was doing
    his regression trees,
  • 12:09 - 12:11
    that just didn't fit in.
  • 12:12 - 12:15
    I think he would have had
    a very hard time
  • 12:15 - 12:18
    publishing these things
    in econometric journals.
  • 12:19 - 12:24
    I think we've limited
    ourselves too much,
  • 12:25 - 12:28
    and that's led us to close things off
  • 12:28 - 12:29
    for a lot of these
    machine learning methods
  • 12:29 - 12:31
    that are actually very useful.
  • 12:31 - 12:34
    I mean, I think, in general,
  • 12:35 - 12:36
    that literature,
    the computer scientists,
  • 12:36 - 12:38
    have proposed a huge number
    of these algorithms
  • 12:38 - 12:39
    that actually are very useful,
  • 12:46 - 12:47
    and that are affecting
  • 12:47 - 12:49
    the way we're going
    to be doing empirical work.
  • 12:50 - 12:52
    But we've not fully internalized that
  • 12:52 - 12:55
    because we're still very focused
  • 12:55 - 12:58
    on getting point estimates
    and getting standard errors
  • 12:59 - 13:01
    and getting P values
  • 13:02 - 13:03
    in a way that we need to move beyond
  • 13:03 - 13:04
    to fully harness the force,
  • 13:04 - 13:11
    the benefits
    from the machine learning literature.
  • 13:11 - 13:13
    - On the one hand, I guess I very
    much take your point
  • 13:13 - 13:15
    that sort of the traditional
    econometrics framework
  • 13:15 - 13:19
    of sort of propose a method,
    prove a limit theorem
  • 13:19 - 13:23
    under some asymptotic story,
  • 13:23 - 13:27
    publish a paper, is constraining.
  • 13:27 - 13:30
    And that, in some sense,
  • 13:30 - 13:31
    by thinking more broadly
  • 13:31 - 13:31
    about what a methods paper
    could look like,
  • 13:31 - 13:33
    we may [write] in some sense.
  • 13:33 - 13:36
    Certainly the machine learning
    literature has found a bunch of things,
  • 13:36 - 13:38
    which seem to work quite well
    for a number of problems
  • 13:38 - 13:40
    and are now having
    substantial influence in economics.
  • 13:40 - 13:42
    I guess a question I'm interested in
  • 13:42 - 13:45
    is how do you think
    about the role of...
  • 13:48 - 13:51
    sort of -- do you think there is
    no value in the theory part of it?
  • 13:52 - 13:55
    Because I guess a question
    that I often have
  • 13:55 - 13:57
    is, sort of, seeing the output
    from a machine learning tool,
  • 13:57 - 13:59
    that actually a number of the
    methods that you talked about
  • 13:59 - 14:02
    actually do have inferential results
    developed for them,
  • 14:03 - 14:04
    something that
    I always wonder about
  • 14:04 - 14:06
    is uncertainty quantification
    and just...
  • 14:06 - 14:08
    I have my prior,
  • 14:08 - 14:11
    I come into the world with my view.
    I see the result of this thing.
  • 14:11 - 14:13
    How should I update based on it?
  • 14:13 - 14:14
    And in some sense,
    if I'm in a world
  • 14:15 - 14:15
    where things are normally distributed,
  • 14:15 - 14:17
    I know how to do it --
  • 14:17 - 14:18
    here I don't.
  • 14:18 - 14:21
    And so I'm interested to hear
    what you think about that.
  • 14:22 - 14:24
    - I don't see this as sort
    of saying, well,
  • 14:24 - 14:26
    these results are not interesting,
  • 14:27 - 14:28
    but there are going to be a lot of cases
  • 14:28 - 14:30
    where it's going
    to be incredibly hard
  • 14:30 - 14:31
    to get those results
  • 14:31 - 14:33
    and we may not be able to get there
  • 14:33 - 14:36
    and we may need to do it in stages
  • 14:36 - 14:38
    where first someone says,
  • 14:40 - 14:41
    "Hey, I have
    this interesting algorithm
  • 14:41 - 14:42
    for doing something
  • 14:42 - 14:45
    and it works well by some of the criteria
  • 14:46 - 14:50
    on this particular data set,
  • 14:51 - 14:53
    and I'm just going to put it out there,
  • 14:54 - 14:56
    and maybe someone will figure out a way
  • 14:56 - 14:58
    that you can later actually
    still do inference
  • 14:58 - 14:59
    under some conditions,
  • 14:59 - 15:02
    and maybe those are not
    particularly realistic conditions,
  • 15:02 - 15:04
    then we kind of go further.
  • 15:04 - 15:06
    But I think we've been
    constraining things too much
  • 15:07 - 15:09
    where we said,
  • 15:09 - 15:11
    "This is the type of things
    that we need to do."
  • 15:12 - 15:14
    And in some sense,
  • 15:16 - 15:18
    that goes back
    to the way Josh and I
  • 15:20 - 15:22
    thought about things for the
    local average treatment effect.
  • 15:22 - 15:23
    That wasn't quite the way
  • 15:23 - 15:25
    people were thinking
    about these problems before.
  • 15:25 - 15:29
    There was a sense
    that some of the people said
  • 15:30 - 15:32
    the way you need to do
    these things is you first say,
  • 15:32 - 15:34
    what you're interested
    in estimating,
  • 15:34 - 15:36
    and then you do the best job
    you can in estimating that.
  • 15:38 - 15:44
    and what you guys are doing
    is you're doing it backwards.
  • 15:44 - 15:47
    You kind of say,
    "Here, I have an estimator,
  • 15:47 - 15:50
    and now I'm going to figure out
    what it's estimating,
  • 15:51 - 15:54
    and I suppose you're going to say
    why you think that's interesting
  • 15:54 - 15:57
    or maybe why it's not interesting,
    and that's not okay.
  • 15:57 - 15:59
    You're not allowed
    to do that that way.
  • 15:59 - 16:04
    And I think we should
    just be a little bit more flexible
  • 16:04 - 16:06
    in thinking about
    how to look at problems
  • 16:06 - 16:09
    because I think
    we've missed some things
  • 16:09 - 16:11
    by not doing that.
  • 16:13 - 16:15
    - [Josh] So you've heard
    our views, Isaiah.
  • 16:15 - 16:17
    You've seen that we have
    some points of disagreement.
  • 16:17 - 16:20
    Why don't you referee
    this dispute for us?
  • 16:21 - 16:22
    [laughter]
  • 16:22 - 16:25
    - Oh, it's so nice of you
    to ask me a small question.
  • 16:25 - 16:28
    So I guess for one,
  • 16:28 - 16:33
    I very much agree with something
    that Guido said earlier of...
  • 16:34 - 16:35
    [laughter]
  • 16:36 - 16:38
    - So one thing where
  • 16:38 - 16:40
    where the case for machine learning
    seems relatively clear
  • 16:40 - 16:41
    is in settings where
    we're interested in some version
  • 16:42 - 16:45
    of a nonparametric
    prediction problem.
  • 16:45 - 16:47
    So I'm interested in estimating
  • 16:47 - 16:50
    a conditional expectation
    or conditional probability,
  • 16:50 - 16:52
    and in the past, maybe
    I would have run a kernel...
  • 16:52 - 16:54
    I would have run
    a kernel regression
  • 16:54 - 16:56
    or I would have run
    a series regression,
  • 16:56 - 16:57
    or something along those lines.
  • 16:59 - 17:00
    It seems like, at this point,
    we have a fairly good sense
  • 17:00 - 17:02
    that in a fairly wide range
    of applications,
  • 17:02 - 17:06
    machine learning methods
    seem to do better
  • 17:07 - 17:09
    for estimating conditional
    mean functions
  • 17:09 - 17:10
    or conditional probabilities
  • 17:10 - 17:12
    or various other
    nonparametric objects
  • 17:12 - 17:14
    than more traditional
    nonparametric methods
  • 17:14 - 17:17
    that were studied
    in econometrics and statistics,
  • 17:17 - 17:19
    especially
    in high dimensional settings.
  • 17:20 - 17:21
    - So you're thinking of maybe
    the propensity score
  • 17:21 - 17:23
    or something like that?
  • 17:23 - 17:24
    - Yeah, exactly,
  • 17:24 - 17:25
    - Nuisance functions.
  • 17:25 - 17:27
    Yeah, so things
    like propensity scores,
  • 17:28 - 17:30
    even objects of more direct
  • 17:30 - 17:32
    interest, like conditional
    average treatment effects,
  • 17:32 - 17:35
    which are the difference of two
    conditional expectation functions,
  • 17:35 - 17:36
    potentially things like that.
  • 17:36 - 17:40
    Of course, even there, the theory...
  • 17:40 - 17:44
    inference of the theory
    for how to interpret,
  • 17:44 - 17:46
    how to make large-sample statements
    about some of these things
  • 17:46 - 17:48
    are less well-developed
    depending on
  • 17:48 - 17:50
    the machine learning
    estimator used.
  • 17:50 - 17:54
    And so I think
    something that is tricky
  • 17:54 - 17:56
    is that we can have these methods,
  • 17:56 - 17:58
    which seem to work
    a lot better for some purposes,
  • 17:58 - 18:02
    but which we need to be a bit
    careful in how we plug them in
  • 18:02 - 18:03
    or how we interpret
    the resulting statements.
  • 18:04 - 18:06
    But of course, that's a very,
    very active area right now
  • 18:06 - 18:08
    where people are doing
    tons of great work.
  • 18:08 - 18:10
    And so I fully expect
    and hope to see
  • 18:10 - 18:13
    much more going forward there.
  • 18:13 - 18:17
    So one issue with machine learning
    that always seems a danger
  • 18:17 - 18:20
    or that is sometimes a danger
  • 18:20 - 18:22
    and has sometimes
    led to applications
  • 18:22 - 18:23
    that have made less sense
  • 18:23 - 18:25
    is when folks start with a method
    that they're very excited about
  • 18:25 - 18:28
    rather than a question.
  • 18:29 - 18:32
    So sort of starting with a question
  • 18:32 - 18:34
    where here's the object I'm interested in,
  • 18:34 - 18:36
    here is the parameter of interest.
  • 18:37 - 18:40
    let me think about how I would
    identify that thing,
  • 18:40 - 18:42
    how I would recover that thing
    if I had a ton of data.
  • 18:42 - 18:44
    Oh, here's a conditional
    expectation function.
  • 18:44 - 18:47
    Let me plug in the machine
    learning estimator for that.
  • 18:47 - 18:49
    That seems very, very sensible.
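
One common version of this "parameter first, then plug in the machine learning fit" workflow, sketched on simulated data with hypothetical names: a doubly robust (AIPW) estimate of an average treatment effect, where a gradient-boosting classifier estimates the propensity score, gradient-boosting regressors estimate the two conditional expectation functions, and simple two-fold cross-fitting keeps each observation's nuisance fits out-of-sample.

    # Doubly robust (AIPW) average treatment effect with machine-learned
    # nuisance functions and two-fold cross-fitting. Simulated data; true ATE = 2.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

    rng = np.random.default_rng(3)
    n, p = 4000, 5
    X = rng.normal(size=(n, p))
    e_true = 1.0 / (1.0 + np.exp(-X[:, 0]))               # treatment depends on X
    T = rng.binomial(1, e_true)
    Y = 2.0 * T + X[:, 0] + X[:, 1] ** 2 + rng.normal(size=n)

    def aipw_scores(X_tr, T_tr, Y_tr, X_ev, T_ev, Y_ev):
        """Fit nuisances on one fold, return AIPW scores on the other fold."""
        e = GradientBoostingClassifier().fit(X_tr, T_tr).predict_proba(X_ev)[:, 1]
        e = np.clip(e, 0.01, 0.99)                         # avoid extreme weights
        m1 = GradientBoostingRegressor().fit(X_tr[T_tr == 1], Y_tr[T_tr == 1]).predict(X_ev)
        m0 = GradientBoostingRegressor().fit(X_tr[T_tr == 0], Y_tr[T_tr == 0]).predict(X_ev)
        return m1 - m0 + T_ev * (Y_ev - m1) / e - (1 - T_ev) * (Y_ev - m0) / (1.0 - e)

    a, b = np.arange(n // 2), np.arange(n // 2, n)
    scores = np.concatenate([
        aipw_scores(X[a], T[a], Y[a], X[b], T[b], Y[b]),
        aipw_scores(X[b], T[b], Y[b], X[a], T[a], Y[a]),
    ])
    print("ATE estimate:", scores.mean(),
          "std. error:", scores.std() / np.sqrt(n))

The object of interest (the ATE) is fixed in advance; the machine learning enters only through the nuisance functions, which is the pattern being endorsed here.
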
  • 18:49 - 18:53
    Whereas, you know,
    if I regress quantity on price
  • 18:54 - 18:56
    and say that I used
    a machine learning method,
  • 18:56 - 18:59
    maybe I'm satisfied that
    that solves the endogeneity problem
  • 18:59 - 19:01
    we're usually worried
    about there... maybe I'm not.
  • 19:02 - 19:03
    But again, that's something
  • 19:03 - 19:06
    where the way to address it
    seems relatively clear.
  • 19:06 - 19:09
    It's to find your object of interest
  • 19:09 - 19:10
    and think about--
  • 19:10 - 19:12
    - Just bring in the economics.
  • 19:12 - 19:12
    - Exactly.
  • 19:12 - 19:15
    - And can I think about heterogeneity,
  • 19:15 - 19:18
    but harness the power
    of the machine learning methods
  • 19:18 - 19:21
    for some of the components.
  • 19:21 - 19:23
    - Precisely. Exactly.
  • 19:23 - 19:24
    So the question of interest
  • 19:24 - 19:26
    is the same as the question
    of interest has always been,
  • 19:26 - 19:30
    but we now have better methods
    for estimating some pieces of this.
  • 19:30 - 19:32
    The place that seems
    harder to forecast
  • 19:33 - 19:35
    is obviously, there's
    a huge amount going on
  • 19:35 - 19:36
    in the machine learning literature
  • 19:38 - 19:39
    and the limited ways
    of plugging it in
  • 19:39 - 19:40
    that I've referenced so far
  • 19:40 - 19:43
    are a limited piece of that.
  • 19:43 - 19:45
    And so I think there are all sorts
    of other interesting questions
  • 19:45 - 19:46
    about where...
  • 19:47 - 19:49
    where does this interaction go?
    What else can we learn?
  • 19:49 - 19:52
    And that's something where
    I think there's a ton going on
  • 19:52 - 19:54
    which seems very promising,
  • 19:54 - 19:56
    and I have no idea
    what the answer is.
  • 19:57 - 19:59
    - No, I totally agree with that,
  • 19:59 - 20:01
    but that makes it very exciting.
  • 20:04 - 20:06
    And I think there's just
    a little work to be done there.
  • 20:07 - 20:09
    - Alright. So I'd say he agrees
    with me there.
  • 20:09 - 20:11
    [laughter]
  • 20:12 - 20:13
    - I didn't say that per se.
  • 20:14 - 20:16
    - [Narrator] If you'd like to watch
    more Nobel Conversations,
  • 20:16 - 20:18
    click here.
  • 20:18 - 20:20
    Or if you'd like to learn
    more about econometrics,
  • 20:20 - 20:23
    check out Josh's
    Mastering Econometrics series.
  • 20:24 - 20:26
    If you'd like to learn more
    about Guido, Josh, and Isaiah,
  • 20:27 - 20:28
    check out the links
    in the description.
  • 20:29 - 20:31
    ♪ [music] ♪