-
-
Today, we'll cover section
13.2 on measures of dispersion.
-
In the previous section, we
studied the mean, median,
-
and mode.
-
Those are measures
of central tendency--
-
in other words, locations--
-
around which the data
tends to cluster.
-
For example, suppose we
plot a small data set--
-
say it looks like that--
-
and then a second one.
-
It looks the same, but it's
in a different location--
-
and yet a third one.
-
If you plot them
all together, you
-
can see that the data sets
basically look the same.
-
They've just been shifted.
-
-
In other words, they have
different means, medians,
-
and modes.
-
But each is equally
spread out or dispersed.
-
That's what we
mean by dispersion.
-
Each of those three data
sets is equally dispersed.
-
-
That's what leads us to
the study of measures
-
of variation or dispersion
of data as a contrast
-
to just a measure of
location, which we studied
-
in the previous section.
-
These measures will
tell us, in some sense,
-
just how much the data is either
spread out or packed in
-
together.
-
-
The first and simplest measure
of variation is called the range.
-
The range of a
data set is simply
-
the difference between the
largest and smallest data
-
points.
-
So the range is just the maximum
value minus the minimum value.
-
It's a relatively crude
measure of the spread
-
of a data set because it doesn't
say anything about the values
-
in between the minimum
and maximum values.
-
Here's an example
that's somewhat dated.
-
It's been in my slides
for quite a long time,
-
so the information
may be out of date,
-
but the point is the same.
-
This is a list of
the heights in inches
-
of four different people--
-
at least at the time,
the tallest person
-
in the world, the shortest
person, and two other people,
-
LeBron James
and LaDainian Tomlinson.
-
In any case, the
range of that data set
-
would simply be the largest
number, the maximum,
-
minus the smallest
number, the minimum.
-
In this case, that
would be 102 minus 26,
-
which happens to be 76 inches.
-
And in case you're interested,
that's about 6 feet, 4 inches.
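As a rough sketch in Python (the variable names are my own, not from the slides), the range calculation above might look like this. Only the tallest and shortest heights are stated in the lecture, and those are the only two values the range depends on.

```python
# Heights in inches as stated in the lecture; the other two heights
# are not given in the transcript and do not affect the range.
tallest = 102
shortest = 26

# Range = maximum value minus minimum value.
height_range = tallest - shortest

# Convert inches to feet and leftover inches (12 inches per foot).
feet, inches = divmod(height_range, 12)

print(height_range)   # 76
print(feet, inches)   # 6 4
```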
-
-
Another measure of
variation of a data set
-
is the standard deviation.
-
Now, I'll say here, I'm going
to flash up the formula.
-
And in fact, I'm even
going to show you
-
how to calculate it
a step at a time.
-
However, just so it
doesn't scare you too much,
-
we will find that there is a
keystroke on the calculator
-
that will do this for us.
-
I basically want to show you
how much value that keystroke is
-
going to have for
us by showing you
-
what you would do if you
did not have that keystroke,
-
so bear with me.
-
Let's take a data set--
-
16, 14, 12, 21, 22--
-
and calculate the
standard deviation
-
without the assumption
that there's
-
a button on the calculator
that can do it for us.
-
-
If you're still
having difficulty
-
calculating the mean, I'll
refer you back to the video
-
from the previous
section on the mean.
-
But at this point, I'll assume
that you can calculate the mean
-
without me mentioning it again.
-
It turns out to be
17 for that data set.
-
But to calculate the
standard deviation
-
from this sample of
five numbers,
-
here's what you would do,
as I would say, by hand.
-
You're actually using a
calculator in any case,
-
probably, but you
could do this by hand.
-
The first thing you do
is you make a column
-
for the deviations
from the mean.
-
If you look at that
formula, there's
-
one point where you
have x minus x bar.
-
So just calculate the
difference of each data
-
point from the mean.
-
So you take 17 away
from 16, get minus 1.
-
You take 17 away from
14, you get minus 3.
-
From 12, you get minus 5.
-
You take 17 away
from 21 and get 4.
-
You take 17 away
from 22 and get 5.
-
So there are your
deviations from the mean.
-
Then, if you look
back at the formula,
-
you'll notice they get squared.
-
So you need another column
for that squared deviation.
-
So you square minus 1.
-
You square minus 3, you square
minus 5, you square 4,
-
and you square 5.
-
And then you see, in that
formula, there's that sigma.
-
Sigma means sum, so
you need to add up
-
all those squared deviations.
-
And if you add them up, it
turns out you'll get 76.
-
So the numerator inside that
square root symbol is 76.
-
The denominator
is n minus 1 where
-
n is how many values there are.
-
Well, we said there were
five, so n minus 1 would be 4.
-
So you'd get 76 divided by
4 under the square root.
-
That's the square root of 19.
-
And if you take that calculator,
and do the square root,
-
and round it to two decimal
places, you get about 4.36.
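The formula itself is not reproduced in this transcript. Assuming it is the usual sample standard deviation formula, which matches the steps just described (x minus x bar, squared, summed, divided by n minus 1, then square-rooted), the formula and the worked numbers can be written as:

```latex
s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}
  = \sqrt{\frac{(-1)^2 + (-3)^2 + (-5)^2 + 4^2 + 5^2}{5 - 1}}
  = \sqrt{\frac{76}{4}} = \sqrt{19} \approx 4.36
```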
-
This is how you would
do the calculation
-
if we didn't have a
Standard Deviation
-
button on the calculator.
-
Luckily for us, we do.
-
So these calculations
are strictly educational,
-
just to show you
what you would do
-
if you didn't have the button.
-
Lucky for us, we
do have a button,
-
and we will not have
to do it this way.
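For anyone who would rather see the by-hand steps as code, here is a minimal Python sketch of the same column-by-column process. The variable names are my own and Python is not part of the course; it is only meant to mirror the table.

```python
import math

data = [16, 14, 12, 21, 22]

# Mean (x bar): sum of the values divided by how many there are.
n = len(data)
mean = sum(data) / n

# Column 1: deviations from the mean, x - x bar.
deviations = [x - mean for x in data]

# Column 2: squared deviations.
squared = [d ** 2 for d in deviations]

# Sigma: add up the squared deviations.
total = sum(squared)

# Divide by n - 1 (sample), then take the square root.
sample_sd = math.sqrt(total / (n - 1))

print(mean)                 # 17.0
print(total)                # 76.0
print(round(sample_sd, 2))  # 4.36
```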
-
-
Let's show the same
process given the knowledge
-
of a set of keystrokes that will
give us the standard deviation
-
from this sample.
-
And it's just a
procedure exactly
-
like calculating
the mean, except you
-
do Shift-9 instead of Shift-7.
-
Otherwise, it's identical.
-
So if I want to
calculate the standard deviation,
-
I don't need any of
those other columns.
-
I just need the data
values themselves.
-
I enter those data values
exactly as I would enter them
-
if I were calculating the mean.
-
So you press On to
clear out any old data.
-
You do Mode-period, which
puts it in stats mode.
-
And then you key in each number
and press M-plus afterwards.
-
This is exactly
what you did when
-
you were calculating the mean.
-
This is nothing new.
-
But when you're doing a
standard deviation of a sample,
-
after all the data
is in, you press
-
Shift-9 instead of Shift-7.
-
That's the only difference.
-
I'm going to throw up
the other useful tips
-
just so that list is complete.
-
But calculating the standard
deviation of a sample--
-
it's exactly the same as
calculating the mean except you
-
do Shift-9 instead of Shift-7.
-
Just remember that.
-
In this particular
case, the keystrokes
-
will be On to clear the old data
and to wake the calculator up,
-
Mode-period to put
it in stats mode,
-
and then you enter each data
point followed by M-plus.
-
Don't forget M-plus
after the 22.
-
And then do Shift-9.
-
And if you look in your
calculator display,
-
you'll see 4.358898944.
-
And if you round that to two
decimal places, you get 4.36.
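Most software has a comparable one-step shortcut. As an illustration only (this is not the calculator procedure from the course), Python's standard library function `statistics.stdev` plays roughly the role of Shift-9:

```python
import statistics

data = [16, 14, 12, 21, 22]

# statistics.stdev uses the n - 1 (sample) formula.
s = statistics.stdev(data)

print(round(s, 2))  # 4.36
```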
-
This also might be a
good time to introduce
-
a new term called the variance.
-
The variance is nothing but
the standard deviation squared.
-
So once you get the
standard deviation,
-
if you want the
variance, you just
-
square the standard deviation.
-
For example, in this problem,
the standard deviation
-
was 4.358898944.
-
If you square that,
you get the variance.
-
And if you are going to round
it to two decimal places,
-
you'd get 19.00.
-
But do not use the
rounded standard deviation
-
to calculate the variance.
-
Take the unrounded value, square
it, and then round it back.
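To see why the order matters, here is a small Python sketch comparing the two approaches. The names `good` and `bad` are just labels I picked for illustration.

```python
import math

s = math.sqrt(19)  # unrounded sample standard deviation from the example

# Square the unrounded value first, then round.
good = round(s ** 2, 2)

# Round the standard deviation first, then square: the error gets squared too.
bad = round(round(s, 2) ** 2, 2)

print(good)  # 19.0
print(bad)   # 19.01
```

Squaring 4.36 gives 19.0096, which rounds to 19.01 instead of 19.00.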
-
-
How about this example?
-
A company hired six interns.
-
After four months,
their work records
-
show the number of work days
missed for each worker--
-
0, 2, 1, 4, 2, 3.
-
Find the mean, sample standard
deviation, and sample variance
-
of this data set.
-
And then round your final
answer to two decimal places.
-
First, calculate the mean.
-
This is the material from
the previous section--
-
should be straightforward.
-
Press On.
-
Press Mode-dot, Mode-period,
whatever you call that.
-
Then enter the data
followed by M-pluses.
-
Then hit Shift-7.
-
That's the mean.
-
It turns out to be 2.
-
-
The data's already
been entered now.
-
So when you're doing something
beyond just the mean,
-
you do not have to
re-enter the data.
-
You could clear
everything and start over.
-
But if the data has
not been cleared,
-
you do not have to put it
back in in order to calculate
-
the standard deviation.
-
So all you need to do
now is press Shift-9.
-
And when you do, you'll see that
the sample standard deviation
-
is 1.414213562.
-
So you do not have to clear
the data and start over.
-
If the data is already
in there, and you
-
want to use that same data set
to calculate something else,
-
you do not have to start over.
-
And if you round that
to two decimal places,
-
you'd get a standard
deviation of 1.41.
-
If I ask for the
variance, you simply
-
square the standard deviation.
-
But remember, you're going to
square the number before you
-
round it, and then round.
-
So you're going to square
1.414213562, and then round it.
-
And it turns out to be
2.00 to two decimal places.
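The same three answers can be checked in software. This is a sketch using Python's standard `statistics` module, offered as a cross-check and not as the method used in the course:

```python
import statistics

days_missed = [0, 2, 1, 4, 2, 3]

mean = statistics.mean(days_missed)            # plays the role of Shift-7
sample_sd = statistics.stdev(days_missed)      # plays the role of Shift-9
sample_var = statistics.variance(days_missed)  # computed from unrounded values

print(f"{mean:.2f}")        # 2.00
print(f"{sample_sd:.2f}")   # 1.41
print(f"{sample_var:.2f}")  # 2.00
```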
-
So far, we've been doing our
calculations with samples
-
without really explaining why.
-
And without going
into a lot of detail,
-
when we work with samples,
we're making the assumption
-
that the sample data is drawn
from some larger population.
-
For example, we might take a
sample of 10 student grades
-
from a class of 96 students.
-
This set of 10
grades is the sample,
-
while the 96 student
grades are the population.
-
It turns out that, when we
calculate the sample mean,
-
it's done the same way as
for a population mean.
-
It doesn't matter whether
it's from a sample
-
or from the population.
-
Turns out, though, that there is
a slight difference when you're
-
doing standard deviation.
-
There's a slight change
in the standard deviation
-
formula when you're
calculating the population
-
standard deviation.
-
So whereas the distinction
between a sample and a
-
population never mattered
for means,
-
it does matter if we're talking
about the standard deviation.
-
Just to illustrate,
suppose we take the data
-
from the previous example--
-
the 0, 2, 1, 4, 2, 3.
-
We found that sample
standard deviation
-
to be about 1.414213562.
-
That was from a sample.
-
-
If I told you that was
the entire population,
-
you have to do
something differently,
-
but it amounts to
just one keystroke.
-
If you look at it,
every single keystroke
-
is the same until you
get to the very end,
-
and it amounts to making
one press differently.
-
If you're doing a sample
standard deviation,
-
you do Shift-9.
-
If you're doing a population
standard deviation,
-
you do Shift-8.
-
So it's strictly
one press different.
-
But you do have to
be watchful to see
-
if you're being asked for a
population standard deviation
-
or a sample standard deviation.
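The one-keystroke difference has a direct analog in Python's standard library, where the sample and population versions are separate functions. A minimal sketch on the intern data (the population value below is computed here, it is not quoted in the lecture):

```python
import statistics

data = [0, 2, 1, 4, 2, 3]

sample_sd = statistics.stdev(data)       # divides by n - 1, like Shift-9
population_sd = statistics.pstdev(data)  # divides by n, like Shift-8

print(f"{sample_sd:.4f}")      # 1.4142
print(f"{population_sd:.4f}")  # 1.2910
```

Dividing by n instead of n minus 1 always gives the smaller of the two values for the same data.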
-
I do want to say,
when you're looking
-
at this on your calculator, if
you'll notice, above the nine
-
there is a sigma n minus 1.
-
That's what we're calling small
s, which is the sample standard
-
deviation.
-
If you look above the eight
on your calculator key,
-
you see sigma sub n.
-
And that stands for the
population standard deviation.
-
So you do need to notice
that on the calculator.
-
What we're calling s and
sigma, your calculator
-
calls sigma sub n minus
1 and sigma sub n.
-
I don't really want to
get into why that is.
-
There is a reason.
-
If we were studying this as
a whole course in statistics,
-
I would talk about
it more in depth.
-
But I do want to point
out that the sample standard
-
deviation that's associated
with the nine keystroke
-
is going to be labeled
sigma sub n minus 1.
-
The population
standard deviation,
-
which is associated with
the keystroke eight,
-
is going to be sigma sub n.
-
So just look at that and
get it in your head--
-
sigma sub n for population
standard deviation, sigma sub
-
n minus 1 for sample
standard deviation.
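Written out, and assuming the standard textbook definitions (the formulas are not displayed in this transcript), the two quantities differ only in the denominator:

```latex
s = \sigma_{n-1} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}
\qquad \text{(sample, Shift-9)}

\sigma = \sigma_{n} = \sqrt{\frac{\sum (x - \mu)^2}{n}}
\qquad \text{(population, Shift-8)}
```

Here the population mean, written as mu, is calculated the same way as x bar.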
-
-
Just take a minute,
and look at that.
-
And once you've got
it, you've got it.
-
-
Let's look at this example.
-
Consider the following--
12, 21, 13, 20, 27.
-
Compute the population standard
deviation of the numbers.
-
Round your answer to
one decimal place.
-
Remember, this is a
population standard deviation.
-
So when you put the numbers
in, it's exactly the same
-
as if you're doing the sample
standard deviation except,
-
at the end, you press
Shift-8 instead of Shift-9.
-
-
And you get
approximately 5.535341.
-
If you round it to one decimal
place, that's about 5.5.
-
Here's an additional question.
-
Add a non-zero constant c to
each of your original numbers
-
and compute the
standard deviation
-
of this new population.
-
Again, it's a population.
-
And again, we want to round our
answer to one decimal place.
-
So they're asking us to add
a constant c that's not zero.
-
So they're saying, pick
any number you like,
-
and add it to each of those data
points, and then recalculate.
-
So I'll pick two.
-
You can pick five.
-
It doesn't matter.
-
Pick some number that's not
zero and add it to each of them.
-
So if I add 2 to 12, 2 to 21,
2 to 13, 2 to 20, 2 to 27,
-
I get 14, 23, 15, 22, 29.
-
Now calculate the population
standard deviation again.
-
If you do that, you'll
find out it doesn't change.
-
If you round it, you still
get 5.5, approximately,
-
to one decimal place.
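Here is a short Python sketch of that experiment, using the constant 2 picked above. The function `statistics.pstdev` is the standard library's population standard deviation, and the variable names are my own.

```python
import statistics

original = [12, 21, 13, 20, 27]
c = 2  # any non-zero constant works; 2 is the value picked in the lecture
shifted = [x + c for x in original]

sd_original = statistics.pstdev(original)
sd_shifted = statistics.pstdev(shifted)

print(shifted)                # [14, 23, 15, 22, 29]
print(round(sd_original, 1))  # 5.5
print(round(sd_shifted, 1))   # 5.5
```

Changing `c` to 5, or to any other non-zero number, leaves both printed standard deviations the same.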
-
-
The B part asks you to look
at those two calculations
-
and make an inference.
-
It says, use the results from
part A and inductive reasoning
-
to state what happens to
the standard deviation
-
of a population when a non-zero
constant is added to each data
-
item.
-
Now, an inference is
not necessarily a proof.
-
But it looks like
it doesn't change.
-
And that's the
choice I would make
-
based on what I just did.
-
I did the calculation both ways.
-
The answers came out the same.
-
So the inference would
be the standard deviation
-
remains the same.
-
It turns out that
is, in fact, true--
-
that when you add a single
number to each data point,
-
the calculation for
the standard deviation
-
does not change at all.
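A short argument for why this holds in general, sketched in my own notation (y for the shifted values) and not taken from the slides: adding c to every data point also adds c to the mean, so each deviation from the mean is unchanged.

```latex
y_i = x_i + c
\quad\Longrightarrow\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} (x_i + c) = \bar{x} + c

y_i - \bar{y} = (x_i + c) - (\bar{x} + c) = x_i - \bar{x}

\sigma_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n}}
         = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} = \sigma_x
```

The same cancellation works with n minus 1 in the denominator, so the sample standard deviation is unchanged too.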
-
-
Now, this has nothing
to do with your ability
-
to do these problems,
but I do feel
-
that I should try to explain
a little bit about what
-
the standard
deviation really is.
-
And I'm going to give it a shot.
-
But understanding what I'm going
to say from here out really
-
doesn't affect your
ability to do the problems.
-
I'm just trying to add a little
bit more to your understanding.
-
So let's give it a shot.
-
Suppose the length that I
put up here on the screen
-
represents the difference
between two values.
-
And what if each of the line
segments I've displayed down
-
here in the lower left quadrant
is one standard deviation
-
in length?
-
If I take one of those
things, and then another,
-
and then another,
and then another,
-
and stretch them along the
length of that wider segment,
-
it turns out it
takes four of them
-
to equal the length
of that one wider bar.
-
So what I could say is, that
blue bar-- that wider bar--
-
is four standard
deviations long.
-
-
But what if the bars for
the standard deviations
-
were really longer?
-
What if, on the
right quadrant-- what
-
if those bars represented one
standard deviation of length?
-
What if I stretched
those bars out?
-
Well, it takes only
three of those.
-
-
So the wider bar is
three standard deviations
-
long if the standard
deviation is the length of one
-
of the bars on the right.
-
But the wider bar
is four standard deviations
-
long if the standard
deviation is
-
the length of the bars on
the lower left quadrant.
-
-
So in effect, when we calculate
the standard deviation,
-
we're computing the length
of our measuring stick.
-
If the standard deviation
is really small,
-
it will take several
standard deviations
-
to span a certain distance.
-
If the standard deviation is
larger, it takes fewer of them.
-
Now, I don't know if
that helps or not.
-
It has no bearing on whether
you can do the calculations.
-
But I do hope that it
gives you a little
-
more understanding of
what a standard deviation is.
-
It's sort of the length
of our measuring stick.
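The measuring-stick idea comes down to one division. In this sketch the function name and the numbers are hypothetical, chosen only to mirror the picture of four short bars versus three longer ones:

```python
def lengths_in_sd(distance, sd):
    """Return how many standard deviations fit into a distance."""
    return distance / sd

# Hypothetical numbers: one fixed distance, two different standard deviations.
distance = 12

print(lengths_in_sd(distance, 3))  # 4.0  (smaller stick, more of them)
print(lengths_in_sd(distance, 4))  # 3.0  (larger stick, fewer of them)
```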
-
-
One more reminder--
there is a summary sheet
-
that you can download if you
bring this presentation up
-
in PowerPoint.
-
The link will bring
you to a summary sheet
-
that you can actually
print out and keep.