-
-
In the last video, we came up
with a 95% confidence interval
-
for the mean weight loss between
the low-fat group and
-
the control group.
-
In this video, I actually want
to do a hypothesis test,
-
really to test if this data
makes us believe that the
-
low-fat diet actually does
anything at all.
-
And to do that let's set up
our null and alternative
-
hypotheses.
-
So our null hypothesis
should be that this
-
low-fat diet does nothing.
-
And if the low-fat diet does
nothing, that means that the
-
population mean on our low-fat
diet minus the population mean
-
on our control should
be equal to zero.
-
And this is a completely
equivalent statement to saying
-
that the mean of the sampling
distribution of our low-fat
-
diet minus the mean of the
sampling distribution of our
-
control should be
equal to zero.
-
And that's because we've seen
this multiple times.
-
The mean of your sampling
distribution is going to be
-
the same thing as your
population mean.
-
So this is the same
thing as that.
-
That is the same
thing as that.
-
Or, another way of saying it is,
if we think about the mean
-
of the distribution of the
difference of the sample
-
means, and we focused on this
in the last video, that that
-
should be equal to zero.
-
Because this thing right over
here is the same thing as that
-
right over there.
-
So that is our null
hypothesis.
-
And our alternative hypothesis,
-
I'll write over here.
-
It's just that it actually
does do something.
-
-
And let's say that it actually
has an improvement.
-
So that would mean that we
have more weight loss.
-
So if we have the mean of Group
One, the population mean
-
of Group One minus the
population mean of Group Two
-
should be greater than zero.
-
So this is going to be a
one-tailed test.
-
Or another way we can view it,
is that the mean of the
-
distribution of the difference of
the sample means, x1 minus x2, is
-
going to be greater than zero.
-
These are equivalent
statements.
-
Because we know that this is the
same thing as this, which
-
is the same thing as this,
which is what I
-
wrote right over here.
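
Written out, the two hypotheses described above look roughly like this. The symbols are my own notation, not something shown in the transcript: I'm assuming \mu_1 is the population mean weight loss on the low-fat diet, \mu_2 is the population mean for the control group, and \bar{x}_1 - \bar{x}_2 is the difference of the sample means.

```latex
\begin{aligned}
H_0 &: \mu_1 - \mu_2 = 0
  &&\Longleftrightarrow\quad \mu_{\bar{x}_1 - \bar{x}_2} = 0 \\
H_1 &: \mu_1 - \mu_2 > 0
  &&\Longleftrightarrow\quad \mu_{\bar{x}_1 - \bar{x}_2} > 0
\end{aligned}
```

The equivalences hold because the mean of each sampling distribution equals the corresponding population mean.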
-
Now, to do any type of
hypothesis test, we have to
-
decide on a level
of significance.
-
-
What we're going to do is, we're
going to assume that our
-
null hypothesis is correct.
-
And then with that assumption
that the null hypothesis is
-
correct, we're going to see
what is the probability of
-
getting this sample data
right over here.
-
And if that probability is below
some threshold, we will
-
reject the null hypothesis in
favor of the alternative
-
hypothesis.
-
Now, that probability threshold,
and we've seen this
-
before, is called the
significance level, sometimes
-
called alpha.
-
And here, we're going to decide
for a significance
-
level of 95%.
-
Or another way to think about
it, assuming that the null
-
hypothesis is correct, we want
there to be no more than a 5%
-
chance of getting this
result here.
-
Or no more than a 5% chance of
incorrectly rejecting the null
-
hypothesis when it
is actually true.
-
That would be a
Type I error.
-
So if there's less than a 5%
probability of this happening,
-
we're going to reject
the null hypothesis.
-
Less than a 5% probability given
the null hypothesis is
-
true, then we're going to reject
the null hypothesis in
-
favor of the alternative.
-
So let's think about this.
-
So we have the null
hypothesis.
-
Let me draw a distribution
over here.
-
The null hypothesis says that
the mean of the distribution of
-
the difference of the sample means
should be equal to zero.
-
Now, in that situation, what
is going to be our critical
-
region here?
-
Well, we need a result, so
we're going to need some
-
critical value here.
-
Because this isn't a
normalized normal
-
distribution.
-
But there's some critical
value here.
-
-
The hardest thing in statistics
is getting the
-
wording right.
-
There's some critical value here
that the probability of
-
getting a sample from this
distribution above that value
-
is only 5%.
-
-
So we just need to figure out
what this critical value is.
-
And if our value is larger than
that critical value, then
-
we can reject the
null hypothesis.
-
Because that means the
probability of getting this is
-
less than 5%.
-
We could reject the null
hypothesis and go with the
-
alternative hypothesis.
-
Remember, once again, we can
use Z-scores, and we can
-
assume this is a normal
distribution because our
-
sample size is large for either
of those samples.
-
We have a sample size of 100.
-
And to figure that out, the
first step, if we just look at
-
a normalized normal distribution
like this, what
-
is your critical Z value?
-
-
The one where getting a result
above that Z value
-
only has a 5% chance.
-
So this is actually
cumulative.
-
So this whole area right
over here is
-
going to be 95% chance.
-
We can just look
at the Z table.
-
We're looking for 95%.
-
We're looking at the
one tailed case.
-
So let's look for 95%.
-
This is the closest thing.
-
We want to err on the side of
being a little bit maybe to
-
the right of this.
-
So let's say 0.9505
is pretty good.
-
So that's 1.65.
-
So this critical Z value
is equal to 1.65.
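
If you don't have a Z table handy, the same lookup can be sketched in Python. This is just an illustration, not how it's done in the video: it assumes the standard library's `statistics.NormalDist`, and the variable names are my own. The exact value comes out a little under 1.65, which is why the table entry for 0.9505 was a reasonable choice.

```python
from statistics import NormalDist

# Significance level: the upper-tail probability we are willing to accept.
alpha = 0.05

# Inverse CDF of the standard normal: the Z value with 95% of the area below it.
z_critical = NormalDist().inv_cdf(1 - alpha)

print(round(z_critical, 3))  # about 1.645; the table lookup rounds this to 1.65
```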
-
Or another way to view it is,
this distance right here is
-
going to be 1.65 standard
deviations.
-
-
I know my writing
is really small.
-
I'm just saying the standard
deviation of that
-
distribution.
-
So what is the standard
deviation of that
-
distribution?
-
We actually calculated it in
the last video, and I'll
-
recalculate it here.
-
The standard deviation of our
distribution of the difference
-
of the sample means is going to
be equal to the square root
-
of the variance of our
first population.
-
Now, the variance of our first
population, we don't know it.
-
But we could estimate it with
our sample standard deviation.
-
If you take your sample standard
deviation, 4.67 and
-
you square it, you get
your sample variance.
-
And so this is the variance.
-
This is our best estimate
of the variance of the
-
population.
-
And we want to divide that
by the sample size.
-
And then plus our best estimate
of the variance of
-
the population of group two,
which is 4.04 squared.
-
The sample standard deviation
of group two squared.
-
That gives us variance
divided by 100.
-
I did this before in the last video. Maybe
it's still sitting on my
-
calculator.
-
Yes, it's still sitting
on the calculator.
-
It's this quantity
right up here.
-
The square root of 4.67 squared
divided by 100 plus 4.04
-
squared divided by 100.
-
So it's 0.617.
-
So this right here is
going to be 0.617.
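
Here is a small Python sketch of that calculator step, in case you want to reproduce it. The numbers (4.67, 4.04, and a sample size of 100 for each group) are the ones used above; the variable names are just my own labels.

```python
import math

# Sample standard deviations and sample sizes for the two groups.
s1, n1 = 4.67, 100  # low-fat group
s2, n2 = 4.04, 100  # control group

# Estimated standard deviation of the distribution of the difference
# of the sample means: sqrt(s1^2/n1 + s2^2/n2).
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)

print(round(standard_error, 3))  # 0.617
```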
-
So this distance right
here, is going to
-
be 1.65 times 0.617.
-
So let's figure out
what that is.
-
So let's take 0.617
times 1.65.
-
So it's 1.02.
-
This distance right
here is 1.02.
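
The same multiplication as a quick Python sketch, using the rounded values from above (the variable names are mine):

```python
# Critical Z value from the table and the estimated standard deviation
# of the difference of the sample means.
z_critical = 1.65
standard_error = 0.617

# Distance from the mean (zero under the null hypothesis) to the
# start of the critical region, in the original units.
critical_difference = z_critical * standard_error

print(round(critical_difference, 2))  # 1.02
```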
-
So what this tells us is, if
we assume that the diet
-
actually does nothing, there's
only a 5% chance of the
-
difference between the means of
these two samples being
-
more than 1.02.
-
There's only a 5%
chance of that.
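
One way to convince yourself of that 5% figure is a quick simulation. This is my own illustration, not something done in the video: it assumes the null hypothesis is true, so the difference of the sample means is drawn from a normal distribution with mean 0 and standard deviation 0.617, and it counts how often that difference lands above 1.02. The seed and the number of trials are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

standard_error = 0.617      # standard deviation of the difference of sample means
critical_difference = 1.02  # start of the critical region
trials = 100_000

# Under the null hypothesis the difference of sample means is centered at 0.
exceed = sum(
    random.gauss(0.0, standard_error) > critical_difference
    for _ in range(trials)
)
fraction = exceed / trials

print(fraction)  # should come out close to 0.05
```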
-
Well, the difference that we
actually got is 1.91.
-
So that's sitting out
here someplace.
-
So it definitely falls in
this critical region.
-
The probability of getting this,
assuming that the null
-
hypothesis is correct,
is less than 5%.
-
So it's a smaller probability than
our significance level.
-
Actually, let me
be very clear.
-
The significance level,
this alpha right
-
here, needs to be 5%.
-
Not the 95%.
-
I think I might have
said here.
-
But I wrote down the
wrong number there.
-
I subtracted it from
one by accident.
-
Probably in my head.
-
But anyway, the significance
level is 5%.
-
The probability given that the
null hypothesis is true, the
-
probability of getting the
result that we got, the
-
probability of getting that
difference, is less than our
-
significance level.
-
It is less than 5%.
-
So based on the rules that we
set out for ourselves of
-
having a significance level of
5%, we will reject the null
-
hypothesis in favor of the
alternative that the diet
-
actually does make you
lose more weight.
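
To pull the whole test together, here is a rough Python sketch. The video compares the observed difference against a critical value; the sketch below takes the equivalent route of turning the observed difference into a Z-score and a one-tailed p-value, which is an addition of mine and not something computed in the video. It assumes the standard library's `statistics.NormalDist`, and all the variable names are my own.

```python
import math
from statistics import NormalDist

# Values from the example: observed difference of sample means,
# sample standard deviations, and sample sizes.
observed_difference = 1.91
s1, n1 = 4.67, 100  # low-fat group
s2, n2 = 4.04, 100  # control group
alpha = 0.05        # significance level

# Estimated standard deviation of the difference of the sample means.
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Under the null hypothesis the mean difference is 0, so the Z-score
# is just the observed difference in units of the standard error.
z = observed_difference / standard_error

# One-tailed p-value: probability of a difference at least this large
# if the null hypothesis were true.
p_value = 1 - NormalDist().cdf(z)

reject_null = p_value < alpha
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The Z-score comes out a bit above 3, well past the critical value of 1.65, so both routes lead to the same decision: reject the null hypothesis.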
-