♪ [music] ♪
- [Thomas Stratmann] Hi!
In the upcoming series of videos
we're going to give you
a shiny new tool
to put into your
Understanding Data toolbox:
linear regression.
Say you've got this theory.
You've witnessed
how good-looking people
seem to get special perks.
You're wondering -- "Where else
might we see this phenomenon?"
What about for professors?
Is it possible
good-looking professors
might get special perks too?
Is it possible
students treat them better
by showering them
with better student evaluations?
If so, is the effect of looks
on evaluation score
big or small?
And say there is a new professor
starting at the university.
What can we predict
about his evaluation
simply by his looks?
Given that these evaluations
can determine pay raises,
if this theory were true
we might see professors resort
to some surprising tactics
to boost their scores.
Suppose you wanted to find out
if evaluations really improve
with better looks.
How would you go about
testing this hypothesis?
You could collect data.
First you would have students rate
on a scale from 1 to 10
how good looking a professor was,
which gives you
an average beauty score.
Then you could retrieve
the teacher's teaching evaluations
from 25 students.
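
Here is a rough sketch of that data-collection step in Python. Everything in it is made up for illustration: the professor names, the ratings, and the choice to average the evaluations are assumptions, not details from the study.

```python
from statistics import mean

# Hypothetical 1-to-10 beauty ratings given by a handful of student raters.
beauty_ratings = {
    "Professor A": [2, 3, 4, 3],
    "Professor B": [7, 8, 6, 7],
}

# Hypothetical teaching evaluations (a real sample would hold 25 per professor).
evaluation_scores = {
    "Professor A": [8.0, 8.5, 9.0],
    "Professor B": [9.0, 9.5, 8.5],
}


def summarize(ratings, evaluations):
    """Return {professor: (average beauty score, average evaluation)}."""
    return {
        name: (mean(ratings[name]), mean(evaluations[name]))
        for name in ratings
    }


# One (beauty, evaluation) pair per professor: one dot on a scatterplot.
data_points = summarize(beauty_ratings, evaluation_scores)
for name, (beauty, evaluation) in data_points.items():
    print(f"{name}: beauty={beauty:.2f}, evaluation={evaluation:.2f}")
```

Each professor ends up as a single pair of numbers, which is the form the rest of the analysis works with.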
Let's look at these two variables
at the same time
by using a scatterplot.
We'll put beauty
on the horizontal axis,
and teacher evaluations
on the vertical axis.
For example, this dot
represents Professor Peate,
who received a beauty score of 3
and an evaluation of 8.425.
This one way out here
is Professor Helmchen.
- [Professor Helmchen]
Ridiculously good-looking!
- [Thomas] Who got
a very high beauty score,
but not such a good evaluation.
Can you see a trend?
As we move from left to right
on the horizontal axis,
from the ugly to the gorgeous,
we see a trend upwards
in evaluation scores.
By the way, the data
we're exploring in this series
is not made up --
it comes from a real study
done at the University of Texas.
If you're wondering, "pulchritude"
is just the fancy academic way
of saying beauty.
With scatterplots
it can sometimes be hard
to make out the exact relationship
between two variables --
especially when the variables
bounce around quite a bit
as we go from left to right.
One way to cut through
this bounciness
is to draw a straight line
through the data cloud
in such a way that this line
summarizes the data
as closely as possible.
The technical term for this
is "linear regression."
Later on we'll talk about
how this line is created,
but for now we can assume
that the line fits the data
as closely as possible.
So, what can this line tell us?
First, we immediately see
if the line is sloping
upward or downward.
In our data set we see
the [fitted] line slopes upward.
It thus confirms what
we have conjectured earlier
by just looking at the scatterplot.
The upward slope means
that there is a positive association
between looks
and evaluation scores.
In other words, on average,
better-looking professors
are getting better evaluations.
For other data sets we might see
a stronger positive association.
Or, you might see
a negative association.
Or perhaps no association at all.
And our lines
don't have to be straight.
They can curve to fit the data
when necessary.
This line also gives us
a way to predict outcomes.
We can simply take a beauty score
and read off the line
what the predicted
evaluation score would be.
So, back to our new professor.
- [Professor Lloyd] Look familiar?
- [Thomas] We can precisely
predict his evaluation score.
"But wait! Wait!" you might say.
"Can we trust this prediction?"
How well does
this one beauty variable
really predict evaluations?
Linear regression gives us
some useful measures
to answer those questions
which we'll cover
in a future video.
We also have to be aware
of other pitfalls
before we draw
any definite conclusions.
You could imagine a scenario
where what is driving
the association
we see is really a third variable
that we have left out.
For example,
the difficulty of the course
might be behind
the positive association
between beauty ratings
and evaluation scores.
Easy intro. courses
get good evaluations.
Harder, more advanced courses
get bad evaluations.
And younger professors might
get assigned to intro. courses.
Then, if students judge
younger professors more attractive,
you will find
a positive association
between beauty ratings
and evaluation scores.
But it's really
the difficulty of the course,
the variable that we've left out,
not beauty,
that is driving evaluation scores.
In that case, all the primping
would be for naught --
a case of mistaking correlation
for causation,
something we'll talk about further
in a later video.
And what if there were
other important variables
that affect both beauty ratings
and evaluation scores?
You might want to add
considerations like skill,
race, sex, and whether English
is the teacher's native language
to isolate more cleanly the effect
of beauty on evaluations.
When we get
into multiple regression
we will be able to measure
the impact of beauty
on teacher evaluations
while accounting
for other variables
that might confound
this association.
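
As a taste of what that might look like, here is a hedged sketch of a regression with two explanatory variables, solved with the standard least-squares normal equations in plain Python. The data are invented, the single control variable (a "hard course" indicator) is an assumption standing in for the longer list above, and `solve` and `multiple_regression` are helper names made up for this example.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def multiple_regression(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via (X'X) b = X'y."""
    rows = [[1.0, *values] for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical toy data: evaluations depend on course difficulty, not beauty.
beauty = [6.0, 7.0, 8.0, 2.0, 3.0, 4.0]
hard_course = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # 1 = advanced course
evaluation = [9.0, 8.5, 9.0, 6.0, 6.5, 6.0]

intercept, beauty_coef, hard_coef = multiple_regression(
    [beauty, hard_course], evaluation
)
print(f"beauty coefficient, accounting for difficulty: {beauty_coef:.3f}")
print(f"hard-course coefficient: {hard_coef:.3f}")
```

In this rigged example the beauty coefficient drops to about zero once course difficulty is in the regression, while the hard-course coefficient picks up the gap in evaluations.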
Next up, we'll get our hands dirty
by playing with this data
to gain a better understanding
of what this line can tell us.
- [Narrator] Congratulations!
You're one step closer
to being a data ninja!
However, to master this you'll need
to strengthen your skills
with some practice questions.
Ready for your next mission?
Click "Next Video."
Still here?
Move from understanding data
to understanding your world
by checking out MRU's
other popular economics videos.
♪ [music] ♪