-
♪ [music] ♪
-
- [Thomas Stratmann] Hi!
-
In the upcoming series of videos,
-
we're going to give you
a shiny new tool
-
to put into your
Understanding Data toolbox:
-
linear regression.
-
Say you've got this theory.
-
You've witnessed
how good-looking people
-
seem to get special perks.
-
You're wondering,
-
"Where else might we see
this phenomenon?"
-
What about for professors?
-
Is it possible
good-looking professors
-
might get special perks too?
-
Is it possible
students treat them better
-
by showering them
with better student evaluations?
-
If so, is the effect of looks
-
on evaluations really big
or really small?
-
And say there is a new professor
starting at a university.
-
- [man] G'day, mate.
-
- What can we predict
about his evaluation
-
simply by his looks?
-
Given that these evaluations
can determine pay raises,
-
if this theory were true,
we might see professors resort
-
to some surprising tactics
to boost their scores.
-
- [Lloyd Christmas] Yeah!
-
- Suppose you wanted to find out
-
if evaluations really improve
with better looks.
-
How would you go about
testing this hypothesis?
-
You could collect data.
-
First you would have students rate
on a scale from 1 to 10
-
how good-looking a professor was,
-
which gives you
an average beauty score.
-
Then you could retrieve
the teacher's teaching evaluations
-
from twenty-five students.
-
Let's look at these two variables
at the same time
-
by using a scatterplot.
-
We'll put beauty
on the horizontal axis,
-
and teacher evaluations
on the vertical axis.
-
For example, this dot
represents Professor Peate,
-
- [Bib Fortuna] De wana wanga.
-
- who received a beauty score of 3
-
and an evaluation of 8.425.
-
This one way out here
is Professor Helmchen.
-
- [Ben Stiller, "Zoolander"]
Ridiculously good-looking!
-
- who got
a very high beauty score,
-
but not such a good evaluation.
-
Can you see a trend?
-
As we move from left to right
on the horizontal axis,
-
from the ugly to the gorgeous,
-
we see a trend upwards
in evaluation scores.
-
By the way, the data
we're exploring in this series
-
is not made up --
it comes from a real study
-
done at the University of Texas.
-
If you're wondering, "pulchritude"
is just the fancy academic way
-
of saying beauty.
-
With scatterplots,
it can sometimes be hard
-
to make out the exact relationship
between two variables --
-
especially when the values
bounce around quite a bit
-
as we go from left to right.
-
One way to cut through
this bounciness
-
is to draw a straight line
through the data cloud
-
in such a way that this line
summarizes the data
-
as closely as possible.
-
The technical term for this
is "linear regression."
-
Later on we'll talk about
how this line is created,
-
but for now we can assume
that the line fits the data
-
as closely as possible.
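[Editor's sketch] One common way to get a line that "fits the data as closely as possible" is ordinary least squares, which picks the slope and intercept that minimize the sum of squared vertical distances from the dots to the line. The video has not yet said how its line is built, so treat least squares here as an assumption. The data points below are made up for illustration and are not the values from the University of Texas study, and fit_line is a hypothetical helper name.

```python
# Illustrative, made-up data: NOT the values from the study in the video.
beauty = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
evaluation = [6.9, 7.4, 7.2, 7.8, 7.7, 8.3, 8.1]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together, and how spread out x is.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(beauty, evaluation)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A positive slope from this fit corresponds to a line that rises from left to right across the scatterplot.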
-
So, what can this line tell us?
-
First, we immediately see
-
if the line is sloping
upward or downward.
-
In our data set we see
the fitted line slopes upward.
-
It thus confirms what
we conjectured earlier
-
by just looking at the scatterplot.
-
The upward slope means
that there is a positive association
-
between looks
and evaluation scores.
-
In other words, on average,
-
better-looking professors
are getting better evaluations.
-
For other data sets, we might see
a stronger positive association.
-
Or, you might see
a negative association.
-
Or perhaps no association at all.
-
And our lines
don't have to be straight.
-
They can curve to fit the data
when necessary.
-
This line also gives us
a way to predict outcomes.
-
We can simply take a beauty score
and read off the line
-
what the predicted
evaluation score would be.
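[Editor's sketch] "Reading off the line" amounts to plugging a beauty score into the line's equation: predicted evaluation = intercept + slope x beauty. The slope and intercept below are placeholder numbers chosen for illustration, not the study's estimates, and predict_evaluation is a hypothetical helper name.

```python
# Placeholder coefficients for illustration only; not estimates from the study.
SLOPE = 0.2
INTERCEPT = 6.5


def predict_evaluation(beauty_score, slope=SLOPE, intercept=INTERCEPT):
    """Predicted evaluation for a given beauty score, read off the fitted line."""
    return intercept + slope * beauty_score


for score in (3, 5, 8):
    print(f"beauty {score} -> predicted evaluation {predict_evaluation(score):.2f}")
```

The result is the height of the line at that beauty score. How far an actual evaluation may land from it is a separate question.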
-
So, back to our new professor.
-
- [Lloyd] Look familiar?
-
- We can precisely predict
his evaluation score.
-
"But wait! Wait!" you might say.
-
"Can we trust this prediction?"
-
How well does
this one beauty variable
-
really predict evaluations?
-
Linear regression gives us
some useful measures
-
to answer those questions
-
which we'll cover
in a future video.
-
We also have to be aware
of other pitfalls
-
before we draw
any definite conclusions.
-
You could imagine a scenario
-
where what is driving
the association we see
-
is really a third variable
that we have left out.
-
For example,
the difficulty of the course
-
might be behind
the positive association
-
between beauty ratings
and evaluation scores.
-
Easy intro courses
get good evaluations.
-
Harder, more advanced courses
get bad evaluations.
-
And younger professors might
get assigned to intro courses.
-
Then, if students judge
younger professors more attractive,
-
you will find
a positive association
-
between beauty ratings
and evaluation scores.
-
But it's really
the difficulty of the course,
-
the variable that we've left out,
not beauty,
-
that is driving evaluation scores.
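[Editor's sketch] The left-out-variable story can be reproduced with a tiny made-up data set. Here evaluations depend only on course type (intro vs. advanced), while the professors teaching intro courses happen to have higher beauty scores. All numbers are invented for illustration, and ols_slope is a hypothetical helper that assumes a least squares line.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up data: intro courses get high evaluations regardless of beauty,
# and they are taught by professors who were rated as more attractive.
intro_beauty = [6, 7, 8, 9]
intro_eval = [8.1, 7.9, 7.9, 8.1]

# Made-up data: advanced courses get low evaluations regardless of beauty.
advanced_beauty = [2, 3, 4, 5]
advanced_eval = [6.1, 5.9, 5.9, 6.1]

# Ignoring course difficulty: pool everyone into one scatterplot.
pooled_slope = ols_slope(intro_beauty + advanced_beauty,
                         intro_eval + advanced_eval)

# Holding course difficulty fixed: fit a line within each course type.
intro_slope = ols_slope(intro_beauty, intro_eval)
advanced_slope = ols_slope(advanced_beauty, advanced_eval)

print(f"pooled slope:   {pooled_slope:.3f}")
print(f"intro only:     {intro_slope:.3f}")
print(f"advanced only:  {advanced_slope:.3f}")
```

In this toy example the pooled line slopes upward, but within each course type the slope is essentially zero, so the upward trend comes entirely from the left-out variable.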
-
In that case, all the primping
would be for naught --
-
a case of mistaking correlation
for causation --
-
- [Lloyd] Wait a minute.
-
- Something we'll talk about further
in a later video.
-
And what if there were
other important variables
-
that affect both beauty ratings
and evaluation scores?
-
You might want to add
considerations like skill,
-
race, sex, and whether English
is the teacher's native language
-
to isolate more cleanly the effect
of beauty on evaluations.
-
When we get
into multiple regression,
-
we will be able to measure
the impact of beauty
-
on teacher evaluations
-
while accounting
for other variables
-
that might confound
this association.
-
Next up, we'll get our hands dirty
by playing with this data
-
to gain a better understanding
of what this line can tell us.
-
- [Narrator] Congratulations!
-
You're one step closer
to being a data ninja!
-
However, to master this
-
you'll need
to strengthen your skills
-
with some practice questions.
-
Ready for your next mission?
Click "Next Video."
-
Still here?
-
Move from understanding data
to understanding your world
-
by checking out MRU's
other popular economics videos.
-
♪ [music] ♪