1 00:00:00,909 --> 00:00:02,250 - [Instructor] Adrianna gathered data 2 00:00:02,250 --> 00:00:04,249 on different schools' winning percentages 3 00:00:04,249 --> 00:00:07,968 and the average yearly salary of their head coaches 4 00:00:07,968 --> 00:00:12,225 in millions of dollars in the years 2000 to 2011. 5 00:00:12,225 --> 00:00:15,619 She then created the following scatter plot and trend line. 6 00:00:15,619 --> 00:00:17,604 So this is salary in millions of dollars 7 00:00:17,604 --> 00:00:19,936 and the winning percentage. 8 00:00:19,936 --> 00:00:24,056 And so here, we have a coach who made over $4 million, 9 00:00:24,056 --> 00:00:28,413 and looks like they won over 80% of their games. 10 00:00:28,413 --> 00:00:29,831 But you have this coach over here 11 00:00:29,831 --> 00:00:31,313 who has a salary of a little over 12 00:00:31,313 --> 00:00:32,823 a million and a half dollars, 13 00:00:32,823 --> 00:00:35,451 and they are winning over 85%, 14 00:00:35,451 --> 00:00:38,119 and so each one of these data points 15 00:00:38,119 --> 00:00:39,576 is a coach, 16 00:00:39,576 --> 00:00:41,701 and is plotting their salary 17 00:00:41,701 --> 00:00:44,770 or their winning percentage against their salary.
18 00:00:44,770 --> 00:00:48,326 Assuming the line correctly shows the trend in the data, 19 00:00:48,326 --> 00:00:49,625 and it's a bit of an assumption, 20 00:00:49,625 --> 00:00:51,070 there are some outliers here 21 00:00:51,070 --> 00:00:53,508 that are well away from the model, 22 00:00:53,508 --> 00:00:54,679 and this isn't a, 23 00:00:54,679 --> 00:00:55,911 it looks like there's a linear, 24 00:00:55,911 --> 00:00:58,050 a positive linear correlation here, 25 00:00:58,050 --> 00:00:59,582 but it's not super tight 26 00:00:59,582 --> 00:01:01,121 and there's a bunch of coaches right over here, 27 00:01:01,121 --> 00:01:03,781 in the lower salary area, 28 00:01:03,781 --> 00:01:05,875 going all the way from 20 something percent 29 00:01:05,875 --> 00:01:07,483 to over 60 percent. 30 00:01:07,483 --> 00:01:10,209 Assuming the line correctly shows the trend in the data, 31 00:01:10,209 --> 00:01:15,169 what does it mean that the line's y intercept is 39? 32 00:01:15,169 --> 00:01:16,786 Well, if you believe the model, 33 00:01:16,786 --> 00:01:19,784 then the y intercept being 39 34 00:01:19,784 --> 00:01:22,357 would be the model is saying 35 00:01:22,357 --> 00:01:24,437 that if someone makes no money, 36 00:01:24,437 --> 00:01:26,583 that they could, zero dollars, 37 00:01:26,583 --> 00:01:28,054 that they could win, 38 00:01:28,054 --> 00:01:31,666 that the model would expect them to win 39% of their games, 39 00:01:31,666 --> 00:01:33,230 which seems a little unrealistic, 40 00:01:33,230 --> 00:01:36,072 because you would expect most coaches to get paid something. 41 00:01:36,072 --> 00:01:37,760 But anyway, let's see which of these choices 42 00:01:37,760 --> 00:01:40,323 actually describes that. 43 00:01:40,323 --> 00:01:42,984 So let me look at the choices. 44 00:01:42,984 --> 00:01:46,124 The average salary was 39 million dollars, nope. 45 00:01:46,124 --> 00:01:48,108 No one on our chart made 39 million.
46 00:01:48,108 --> 00:01:50,350 On average, each million dollar increase in salary 47 00:01:50,350 --> 00:01:52,765 was associated with a 39% increase in winning percentage. 48 00:01:52,765 --> 00:01:56,686 That would be something related to the slope 49 00:01:56,686 --> 00:01:58,735 and the slope was definitely not 39. 50 00:01:58,735 --> 00:02:01,353 The average winning percentage was 39%, 51 00:02:01,353 --> 00:02:02,692 we know that wasn't the case either. 52 00:02:02,692 --> 00:02:05,248 The model indicates that teams with coaches 53 00:02:05,248 --> 00:02:06,781 who had a salary of zero million dollars 54 00:02:06,781 --> 00:02:09,971 will average a winning percentage of approximately 39%. 55 00:02:09,971 --> 00:02:12,717 Yeah, this is the closest statement 56 00:02:12,717 --> 00:02:14,264 to what we just said, 57 00:02:14,264 --> 00:02:15,740 that if you believe that model, 58 00:02:15,740 --> 00:02:17,285 and that's a big if, 59 00:02:17,285 --> 00:02:19,452 if you believe this model, 60 00:02:20,916 --> 00:02:23,375 then this model says someone making zero dollars 61 00:02:23,375 --> 00:02:24,966 will get 39%, 62 00:02:24,966 --> 00:02:27,446 and this is frankly why you have to be skeptical of models. 63 00:02:27,446 --> 00:02:28,688 They're not going to be perfect, 64 00:02:28,688 --> 00:02:31,249 especially in extreme cases oftentimes, 65 00:02:31,249 --> 00:02:32,247 but who knows. 66 00:02:32,247 --> 00:02:34,873 Anyway, hopefully you found that useful.
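
To make the intercept reading above concrete, here is a minimal sketch of a trend line written as a function. Only the y-intercept of 39 comes from the problem. The slope is never stated in the video, so ASSUMED_SLOPE is a made-up placeholder, and predicted_win_pct is a hypothetical helper name used purely for illustration.

```python
# Trend line: winning % = slope * (salary in millions of dollars) + intercept.
INTERCEPT = 39.0      # given in the problem: the line's y-intercept
ASSUMED_SLOPE = 8.0   # hypothetical value; the actual slope is not given


def predicted_win_pct(salary_millions: float) -> float:
    """Return the winning percentage the line predicts for a given salary."""
    return ASSUMED_SLOPE * salary_millions + INTERCEPT


# At a salary of zero the slope term vanishes, so the prediction is exactly
# the y-intercept, whatever the slope happens to be.
print(predicted_win_pct(0.0))

# The slope, not the intercept, is the change per extra million dollars.
print(predicted_win_pct(1.0) - predicted_win_pct(0.0))
```

Whatever slope you plug in, the prediction at a salary of zero stays at 39, which is why the intercept answers the "zero million dollars" choice and not the "per million dollar increase" one.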