R Squared in SKlearn - Intro to Machine Learning

  • 0:00 - 0:03
    Now that I've explained r-squared to you, a question you might be
  • 0:03 - 0:07
    asking is, "This is all well and good, Katie, but how do I get this information?
  • 0:07 - 0:10
    You haven't given me an equation for it or anything like that."
  • 0:10 - 0:14
    And instead of giving you a big mathematical equation,
  • 0:14 - 0:17
    which I don't find that interesting and which you could look up on your own,
  • 0:17 - 0:20
    I want to show you how to get this information out of scikit-learn.
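
For reference, the equation being skipped here is the standard coefficient of determination. The notation is an assumption, not taken from the video: y_i are the observed net worths, \hat{y}_i are the regression's predictions, and \bar{y} is the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

A perfect fit makes the numerator zero, so R^2 = 1. A model that does no better than always predicting the mean gives R^2 = 0.
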
  • 0:20 - 0:23
    This is the code we were looking at a few videos ago when we were building our
  • 0:23 - 0:24
    net worth predictor.
  • 0:24 - 0:29
    Now, I filled in these lines that are importing the linear regression and
  • 0:29 - 0:30
    making some predictions.
  • 0:30 - 0:34
    Another thing that happened was I printed some information to the screen,
  • 0:34 - 0:35
    you may remember.
  • 0:35 - 0:36
    Two of these things I explained to you already.
  • 0:36 - 0:38
    The slope and the intercept.
  • 0:38 - 0:41
    I access that information by looking for the coefficients and
  • 0:41 - 0:42
    the intercept of the regression.
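
A minimal sketch of reading the slope and intercept off a fitted regression. The data here is synthetic and only stands in for the course's age and net worth data, so the printed numbers will not match the video. `coef_` and `intercept_` are the scikit-learn attribute names.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (an assumption, not the course's dataset).
rng = np.random.default_rng(42)
ages = rng.uniform(20, 65, size=100).reshape(-1, 1)  # features, shape (n_samples, 1)
net_worths = 6.25 * ages.ravel() + rng.normal(0.0, 20.0, size=100)  # targets, shape (n_samples,)

reg = LinearRegression()
reg.fit(ages, net_worths)

# With 1-D targets, coef_ holds one entry per feature and intercept_ is a scalar.
slope = reg.coef_[0]
intercept = reg.intercept_
print("slope:", slope)
print("intercept:", intercept)
```
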
  • 0:42 - 0:46
    These are just lines of code that I found in an example online.
  • 0:46 - 0:48
    But one thing I did promise you we would come back to,
  • 0:48 - 0:51
    and now we are, is this r-squared score that I was printing out.
  • 0:51 - 0:56
    And the way I access that is through the reg.score method.
  • 0:56 - 1:00
    This is kind of similar to how we computed the accuracy in
  • 1:00 - 1:01
    our supervised classifier.
  • 1:01 - 1:06
    So what we do is we pass the ages, which are the features in this case,
  • 1:06 - 1:07
    the input, and
  • 1:07 - 1:10
    the net_worths, which are the outputs, the things we're trying to predict.
  • 1:10 - 1:14
    And then since the regression has already been fit, up here,
  • 1:14 - 1:17
    it knows what it thinks the relationship between these two quantities is.
  • 1:17 - 1:21
    So this is all the information that it needs to compute an r-squared score.
  • 1:21 - 1:22
    And then, I can just print it out.
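
The call described above can be sketched as follows. The data is synthetic, which is an assumption standing in for the course's dataset, so the score will differ from the one in the video. As a cross-check, the same number can be obtained from `sklearn.metrics.r2_score` applied to the model's predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic stand-in data (an assumption, not the course's dataset).
rng = np.random.default_rng(0)
ages = rng.uniform(20, 65, size=100).reshape(-1, 1)  # features (inputs)
net_worths = 6.25 * ages.ravel() + rng.normal(0.0, 20.0, size=100)  # outputs to predict

# Fit first; score() relies on the already-fitted relationship.
reg = LinearRegression().fit(ages, net_worths)

# score(features, targets) returns r-squared for a regressor,
# just as it returns accuracy for a classifier.
r2_from_score = reg.score(ages, net_worths)

# Equivalent route: predict, then compare predictions to the true targets.
r2_from_metric = r2_score(net_worths, reg.predict(ages))

print("r-squared score:", r2_from_score)
```
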
  • 1:22 - 1:26
    So let me take you over here and show you again what that looks like.
  • 1:26 - 1:28
    I have the same output as I had before,
  • 1:28 - 1:31
    this might look a little bit familiar. I'm predicting my own net worth.
  • 1:31 - 1:33
    I have my slope, my intercept.
  • 1:33 - 1:35
    But now you understand the importance of the r-squared score.
  • 1:35 - 1:40
    So my r-squared score is about 0.86, which is actually really good.
  • 1:40 - 1:45
    I'm doing about 86% of the best I could be doing.
  • 1:45 - 1:48
    I would say 86% is close to one.
  • 1:48 - 1:54
    It can be a little bit of an art to translate between an r-squared numerically,
  • 1:54 - 1:55
    and saying whether it's a good fit or not.
  • 1:55 - 1:58
    And this is something you'll get some intuition for
  • 1:58 - 2:01
    over time, as you play with things.
  • 2:01 - 2:07
    I would certainly say that 0.857 is a good r-squared.
  • 2:07 - 2:11
    We're doing a good job of capturing the relationship between the age and
  • 2:11 - 2:13
    the net worth of people here.
  • 2:13 - 2:16
    I've also seen higher r-squareds in my life.
  • 2:16 - 2:19
    So it's possible that there are still variables out there,
  • 2:19 - 2:23
    additional features that, if we were able to incorporate their information,
  • 2:23 - 2:27
    would let us better predict a person's net worth.
  • 2:27 - 2:30
    So in other words, if we were able to use more than one feature,
  • 2:30 - 2:33
    sometimes we can push up this r-squared even further.
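
A small sketch of that idea on synthetic data. The second feature, `years_of_school`, is hypothetical and is built into the fake net worths on purpose, so the improvement is guaranteed here. On real data the gain depends on whether the extra feature carries information.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption): net worth depends on age AND a second feature.
rng = np.random.default_rng(1)
n = 200
ages = rng.uniform(20, 65, size=n)
years_of_school = rng.uniform(10, 22, size=n)  # hypothetical second feature
net_worths = 6.25 * ages + 15.0 * years_of_school + rng.normal(0.0, 20.0, size=n)

X_one = ages.reshape(-1, 1)                         # age only
X_two = np.column_stack([ages, years_of_school])    # age plus the second feature

r2_one = LinearRegression().fit(X_one, net_worths).score(X_one, net_worths)
r2_two = LinearRegression().fit(X_two, net_worths).score(X_two, net_worths)

print("r-squared, age only:     ", r2_one)
print("r-squared, both features:", r2_two)
```

Note that r-squared measured on the same data used for fitting never goes down when a feature is added to an ordinary least squares fit, so scoring on held-out data is a fairer way to judge whether a new feature really helps.
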
  • 2:33 - 2:36
    On the other hand, there are sometimes really complicated problems where it's
  • 2:36 - 2:40
    almost impossible to get an r-squared that would be anywhere near this high.
  • 2:40 - 2:43
    So sometimes, in political science for example, they're trying to
  • 2:43 - 2:47
    run a regression that will predict whether a country will go to war.
Video Language:
English
Team:
Udacity
Project:
ud120 - Intro to Machine Learning
Duration:
02:47