1 99:59:59,999 --> 99:59:59,999 ♪ [music] ♪ 2 99:59:59,999 --> 99:59:59,999 - [Thomas Stratmann] Hi! 3 99:59:59,999 --> 99:59:59,999 In the upcoming series of videos 4 99:59:59,999 --> 99:59:59,999 we're going to give you a shiny new tool 5 99:59:59,999 --> 99:59:59,999 to put into your Understanding Data toolbox: 6 99:59:59,999 --> 99:59:59,999 linear regression. 7 99:59:59,999 --> 99:59:59,999 Say you've got this theory. 8 99:59:59,999 --> 99:59:59,999 You've witnessed how good-looking people 9 99:59:59,999 --> 99:59:59,999 seem to get special perks. 10 99:59:59,999 --> 99:59:59,999 You're wondering -- "Where else might we see this phenomenon?" 11 99:59:59,999 --> 99:59:59,999 What about full professors? 12 99:59:59,999 --> 99:59:59,999 Is it possible good-looking professors 13 99:59:59,999 --> 99:59:59,999 might get special perks too? 14 99:59:59,999 --> 99:59:59,999 Is it possible students treat them better 15 99:59:59,999 --> 99:59:59,999 by showering them with better student evaluations? 16 99:59:59,999 --> 99:59:59,999 If so, is the effect of looks 17 99:59:59,999 --> 99:59:59,999 on evaluation score big or [inaudible]? 18 99:59:59,999 --> 99:59:59,999 And say there is a new professor starting at the university. 19 99:59:59,999 --> 99:59:59,999 What can we predict about his evaluation 20 99:59:59,999 --> 99:59:59,999 simply by his looks? 21 99:59:59,999 --> 99:59:59,999 Given that these evaluations can determine pay raises, 22 99:59:59,999 --> 99:59:59,999 if this theory were true we might see professors resort 23 99:59:59,999 --> 99:59:59,999 to some surprising tactics to boost their scores. 24 99:59:59,999 --> 99:59:59,999 Suppose you wanted to find out 25 99:59:59,999 --> 99:59:59,999 if evaluations really improve with better looks. 26 99:59:59,999 --> 99:59:59,999 How would you go about testing this hypothesis? 27 99:59:59,999 --> 99:59:59,999 You could collect data. 
28 99:59:59,999 --> 99:59:59,999 First you would have students rate on a scale from 1 to 10 29 99:59:59,999 --> 99:59:59,999 how good looking a professor was, 30 99:59:59,999 --> 99:59:59,999 which gives you an average beauty score. 31 99:59:59,999 --> 99:59:59,999 Then you could retrieve the teacher's teaching evaluations 32 99:59:59,999 --> 99:59:59,999 from 25 students. 33 99:59:59,999 --> 99:59:59,999 Let's look at these two variables at the same time 34 99:59:59,999 --> 99:59:59,999 by using a scatterplot. 35 99:59:59,999 --> 99:59:59,999 We'll put beauty on the horizontal axis, 36 99:59:59,999 --> 99:59:59,999 and teacher evaluations on the vertical axis. 37 99:59:59,999 --> 99:59:59,999 For example, this dot represents Professor Peate, 38 99:59:59,999 --> 99:59:59,999 who received a beauty score of 3 39 99:59:59,999 --> 99:59:59,999 and an evaluation of 8.425. 40 99:59:59,999 --> 99:59:59,999 This one way out here is Professor Helmchen. 41 99:59:59,999 --> 99:59:59,999 - [Professor Helmchen] Ridiculously good-looking! 42 99:59:59,999 --> 99:59:59,999 - [Thomas] Who got a very high beauty score, 43 99:59:59,999 --> 99:59:59,999 but not such a good evaluation. 44 99:59:59,999 --> 99:59:59,999 Can you see a trend? 45 99:59:59,999 --> 99:59:59,999 As we move from left to right on the horizontal axis, 46 99:59:59,999 --> 99:59:59,999 from the ugly to the gorgeous, 47 99:59:59,999 --> 99:59:59,999 we see a trend upwards in evaluation scores. 48 99:59:59,999 --> 99:59:59,999 By the way, the data we're exploring in this series 49 99:59:59,999 --> 99:59:59,999 is not made up -- it comes from a real study 50 99:59:59,999 --> 99:59:59,999 done at the University of Texas. 51 99:59:59,999 --> 99:59:59,999 If you're wondering, "pulchritude" is just the fancy academic way 52 99:59:59,999 --> 99:59:59,999 of saying beauty. 
53 99:59:59,999 --> 99:59:59,999 With scatterplots it can sometimes be hard 54 99:59:59,999 --> 99:59:59,999 to make out the exact relationship between two variables -- 55 99:59:59,999 --> 99:59:59,999 especially when the variables bounce around quite a bit 56 99:59:59,999 --> 99:59:59,999 as we go from left to right. 57 99:59:59,999 --> 99:59:59,999 One way to cut through this bounciness 58 99:59:59,999 --> 99:59:59,999 is to draw a straight line through the data cloud 59 99:59:59,999 --> 99:59:59,999 in such a way that this line summarizes the data 60 99:59:59,999 --> 99:59:59,999 as closely as possible. 61 99:59:59,999 --> 99:59:59,999 The technical term for this is "linear regression." 62 99:59:59,999 --> 99:59:59,999 Later on we'll talk about how this line is created, 63 99:59:59,999 --> 99:59:59,999 but for now we can assume that the line fits the data 64 99:59:59,999 --> 99:59:59,999 as closely as possible. 65 99:59:59,999 --> 99:59:59,999 So, what can this line tell us? 66 99:59:59,999 --> 99:59:59,999 First, we immediately see 67 99:59:59,999 --> 99:59:59,999 if the line is sloping upward or downward. 68 99:59:59,999 --> 99:59:59,999 In our data set we see the [fitted] line slopes upward. 69 99:59:59,999 --> 99:59:59,999 It thus confirms what we have conjectured earlier 70 99:59:59,999 --> 99:59:59,999 by just looking at the scatterplot. 71 99:59:59,999 --> 99:59:59,999 The upward slope means that there is a positive association 72 99:59:59,999 --> 99:59:59,999 between looks and evaluation scores. 73 99:59:59,999 --> 99:59:59,999 In other words, on average, 74 99:59:59,999 --> 99:59:59,999 better-looking professors are getting better evaluations. 75 99:59:59,999 --> 99:59:59,999 For other data sets we might see a stronger positive association. 76 99:59:59,999 --> 99:59:59,999 Or, you might see a negative association. 77 99:59:59,999 --> 99:59:59,999 Or perhaps no association at all. 78 99:59:59,999 --> 99:59:59,999 And our lines don't have to be straight. 
79 99:59:59,999 --> 99:59:59,999 They can curve to fit the data when necessary. 80 99:59:59,999 --> 99:59:59,999 This line also gives us a way to predict outcomes. 81 99:59:59,999 --> 99:59:59,999 We can simply take a beauty score and read off the line 82 99:59:59,999 --> 99:59:59,999 what the predicted evaluation score would be. 83 99:59:59,999 --> 99:59:59,999 So, back to our new professor. 84 99:59:59,999 --> 99:59:59,999 - [Professor Lloyd] Look familiar? 85 99:59:59,999 --> 99:59:59,999 - [Thomas] We can precisely predict his evaluation score. 86 99:59:59,999 --> 99:59:59,999 "But wait! Wait!" you might say. 87 99:59:59,999 --> 99:59:59,999 "Can we trust this prediction?" 88 99:59:59,999 --> 99:59:59,999 How well does this one beauty variable 89 99:59:59,999 --> 99:59:59,999 really predict evaluations? 90 99:59:59,999 --> 99:59:59,999 Linear regression gives us some useful measures 91 99:59:59,999 --> 99:59:59,999 to answer those questions 92 99:59:59,999 --> 99:59:59,999 which we'll cover in a future video. 93 99:59:59,999 --> 99:59:59,999 We also have to be aware of other pitfalls 94 99:59:59,999 --> 99:59:59,999 before we draw any definite conclusions. 95 99:59:59,999 --> 99:59:59,999 You could imagine a scenario 96 99:59:59,999 --> 99:59:59,999 where what is driving the association 97 99:59:59,999 --> 99:59:59,999 we see is really a third variable that we have left out. 98 99:59:59,999 --> 99:59:59,999 For example, the difficulty of the course 99 99:59:59,999 --> 99:59:59,999 might be behind the positive association 100 99:59:59,999 --> 99:59:59,999 between beauty ratings and evaluation scores. 101 99:59:59,999 --> 99:59:59,999 Easy intro. courses get good evaluations. 102 99:59:59,999 --> 99:59:59,999 Harder, more advanced courses get bad evaluations. 103 99:59:59,999 --> 99:59:59,999 And younger professors might get assigned to intro. courses. 
104 99:59:59,999 --> 99:59:59,999 Then, if students judge younger professors more attractive, 105 99:59:59,999 --> 99:59:59,999 you will find a positive association 106 99:59:59,999 --> 99:59:59,999 between beauty ratings and evaluation scores. 107 99:59:59,999 --> 99:59:59,999 But it's really the difficulty of the course, 108 99:59:59,999 --> 99:59:59,999 the variable that we've left out, not beauty, 109 99:59:59,999 --> 99:59:59,999 that is driving evaluation scores. 110 99:59:59,999 --> 99:59:59,999 In that case, all the primping would be for naught -- 111 99:59:59,999 --> 99:59:59,999 a case of mistaking correlation for causation, 112 99:59:59,999 --> 99:59:59,999 something we'll talk about further in a later video. 113 99:59:59,999 --> 99:59:59,999 And what if there were other important variables 114 99:59:59,999 --> 99:59:59,999 that affect both beauty ratings and evaluation scores? 115 99:59:59,999 --> 99:59:59,999 You might want to add considerations like skill, 116 99:59:59,999 --> 99:59:59,999 race, sex, and whether English is the teacher's native language 117 99:59:59,999 --> 99:59:59,999 to isolate more cleanly the effect of beauty on evaluations. 118 99:59:59,999 --> 99:59:59,999 When we get into multiple regression 119 99:59:59,999 --> 99:59:59,999 we will be able to measure the impact of beauty 120 99:59:59,999 --> 99:59:59,999 on teacher evaluations 121 99:59:59,999 --> 99:59:59,999 while accounting for other variables 122 99:59:59,999 --> 99:59:59,999 that might confound this association. 123 99:59:59,999 --> 99:59:59,999 Next up, we'll get our hands dirty by playing with this data 124 99:59:59,999 --> 99:59:59,999 to gain a better understanding of what this line can tell us. 125 99:59:59,999 --> 99:59:59,999 - [Narrator] Congratulations! 126 99:59:59,999 --> 99:59:59,999 You're one step closer to being a data ninja!
127 99:59:59,999 --> 99:59:59,999 However, to master this you'll need to strengthen your skills 128 99:59:59,999 --> 99:59:59,999 with some practice questions. 129 99:59:59,999 --> 99:59:59,999 Ready for your next mission? Click "Next Video." 130 99:59:59,999 --> 99:59:59,999 Still here? 131 99:59:59,999 --> 99:59:59,999 Move from understanding data to understanding your world 132 99:59:59,999 --> 99:59:59,999 by checking out MRU's other popular economics videos. 133 99:59:59,999 --> 99:59:59,999 ♪ [music] ♪