Hello. In this video I want to talk about the standard error, which really extends our understanding of sampling distributions and the central limit theorem. So, let's talk about what a standard error is.

First of all, we'll go back to the penguin example, and you've seen this distribution before: it's a uniform distribution of data. Like any distribution, it has descriptive statistics. It has a population mean: the average penguin is 5.04 meters from the edge of the ice sheet. You can calculate a standard deviation for it, which is 2.88, a measure of the spread. And there were 5,000 penguins floating on this ice sheet; that's N, the population size.

We then discussed how, if you were to randomly select five penguins at a time, or 50 penguins at a time (let's stick with n = 5 for now), you could calculate the average distance from the edge of the ice sheet for each of those samples of five penguins. If you were to do that over and over and over again, and in this histogram we did it 1,000 times, you would be able to generate what's called the sampling distribution.

Specifically, it's the sampling distribution of the sample means. I told you that we could calculate the average of those sample means across the 1,000 samples; that's this value, and the notation we use for it is mu with a subscript x-bar, the mean of the sample means. It approximates the population mean very closely, so I've written approximately 5.04.

You can do that for any sample size. That was sample size five; let's look at sample size 50. Again, we have the mean of the sample means, and it's also going to be very close to 5.04. It might be a little closer, because our sample size is larger.
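For anyone who wants to recreate the experiment described so far, here is a minimal sketch in Python. The actual data set isn't provided with the video, so the sketch assumes a population of 5,000 distances drawn uniformly between 0 and 10 meters, which gives a mean near 5 and a standard deviation near 2.89, close to the quoted 5.04 and 2.88; the names `population` and `sample_means` are just illustrative.

```python
# Minimal sketch of the sampling experiment: 1,000 random samples of n = 5
# penguins, then the mean of the resulting sample means.
# Assumption: distances are uniform on 0-10 m (mean ~5, SD ~2.89), since the
# real data set from the video isn't available here.
import numpy as np

rng = np.random.default_rng(42)
population = rng.uniform(0, 10, size=5_000)   # distances from the ice-sheet edge

def sample_means(pop, n, repeats=1_000):
    """Draw `repeats` random samples of size n and return their means."""
    return np.array([rng.choice(pop, size=n, replace=False).mean()
                     for _ in range(repeats)])

means_n5 = sample_means(population, n=5)
print("population mean:     ", round(float(population.mean()), 2))
print("mean of sample means:", round(float(means_n5.mean()), 2))
```

Running it, the mean of the 1,000 sample means lands very close to the population mean, which is the point being made above.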
There are two other things to notice about these sampling distributions. The first is that they are approximately normally distributed, despite the fact that the original population distribution of penguins was a uniform distribution.

The second is that the sample size doesn't really affect the value of the mean of the sample means, but it does affect the standard deviation of the sample means. If both of these distributions approximate normal distributions, then it's clear that this distance here (let's assume it's one standard deviation and that I've drawn it in the right place) is greater than the corresponding standard deviation over here. So, as the sample size gets larger, the spread of the sample means gets smaller; in other words, the standard deviation of the sample means gets smaller.

Now, does this standard deviation have any relationship at all to the standard deviation of the original population? The original standard deviation was 2.88, so I'll just note that here: the population standard deviation was 2.88. Is there any relationship at all between these two standard deviations? It's not like the mean of the sample means, which is pretty much the same regardless of the sample size; it does get better with larger samples, but it's close either way, especially if you take enough samples. So what is the relationship between these standard deviations? It's clear that when you change n, this value changes, so is there a relationship? It turns out that there is, and we're going to look into it.
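Before the relationship is revealed, here is a quick empirical check of the pattern just described, using the same assumed uniform(0, 10) population as the earlier sketch: as n grows, the mean of the sample means barely moves while their standard deviation shrinks.

```python
# Empirical check: the spread of the sample means shrinks as n grows, while
# their mean stays put. Same assumed uniform(0, 10) population as before.
import numpy as np

rng = np.random.default_rng(42)
population = rng.uniform(0, 10, size=5_000)

for n in (1, 2, 5, 50):
    means = np.array([rng.choice(population, size=n, replace=False).mean()
                      for _ in range(1_000)])
    print(f"n = {n:>2}:  mean of sample means = {means.mean():.2f},  "
          f"SD of sample means = {means.std(ddof=1):.2f}")
# The SD column drops sharply from n = 1 to n = 50; the exact relationship to
# the population standard deviation is what the video works out next.
```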
This graph shows that the approximation to a normal distribution becomes better and better the larger the sample size. It's a little tricky to see, but I want to point out one or two things here. The red line is a sample size of one: if I were to pick just one penguin at a time, this is the distribution of the sample means, and it looks like the original population. So for a sample size of one, you don't get a normal distribution of sample means; you get whatever the original population was. The orange one, this one here, is n = 2, and again it's not really a normal distribution.

Now let's skip to 50. This is the 50 line here, and you can see for yourself that it's very close to normal. Then we have 25 in blue here, and so on; this is ten, and this is five. I wanted to show you this graph because even with very, very small sample sizes, like five, we already get very close to a normal distribution. It's only with ridiculously small sample sizes like one or two that we don't do a very good job. So even with small sample sizes, we get close to a normal distribution of the sample means.

So, back to the question I posed a moment ago. This is the standard deviation of our original population, and whenever we take samples (again, this is just the sample size of five), this is the distribution of the sample means. Its mean is going to approximate the population mean, but what is the relationship of its standard deviation to that of the original population? It must also be related to the sample size, because it changes with the sample size.

It's just a formula, and we're not going to talk much at all about how it's derived, but this formula very neatly tells us the relationship. What we have here is the standard deviation of the sampling distribution of the sample means, which we write as sigma with a subscript x-bar. Just to reiterate what we're looking at: this is the distribution of sample means, and we're asking, what is its standard deviation? Technically, that's the notation for it. What we do is take the population standard deviation from the original population and divide it by the square root of n, and that gives us this standard deviation. Its technical name is the standard deviation of the sampling distribution of the sample means, which is an awful mouthful, so we just call it the standard error of the mean.
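In symbols, with sigma-sub-x-bar for the standard deviation of the sampling distribution of the sample means, sigma for the population standard deviation, and n for the sample size, the relationship just described is:

```latex
% Standard error of the mean, in the notation used in the video.
\[
  \sigma_{\bar{x}} \;=\; \frac{\sigma}{\sqrt{n}}
\]
```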
This graph illustrates how the standard error of the mean changes with sample size. If I go back to this slide for a moment, we were asking how this value for a sample size of 50 compares to this value for a sample size of five. So let me write the formula here: the standard error of the mean, that is, the standard deviation of the sampling distribution of the sample means, is equal to the original population standard deviation divided by the square root of n.

When we had a sample size of five, which is this one up here, the original standard deviation was 2.88, and we divide by the square root of five, which gives about 1.3. So the standard deviation here, the standard error, is 1.3. That means this value here is 1.3 above the mean of the sample means, which was 5.04, so this value is going to be about 6.34, one standard error above the mean of the sample means.

But if we have a sample size of 50, the calculation becomes the original population standard deviation divided by the square root of 50, which (I've written this down so I can check) is about 0.4. So back on this graph, this value is 0.4 and this value is 1.3. The standard error gets smaller the bigger the sample size.

The graph I showed a moment ago is actually showing how the standard error changes with sample size. We just had a sample size of 50, which is approximately here; if we go across to this axis, it tells us the standard error is about 0.4. And if we had a sample size of five, which is approximately here (I'm not drawing this line very well), it comes out at about 1.3. There's nothing too much for you to take home from this graph other than that, as the sample size increases, whatever population standard deviation we have, the standard error gets much smaller very rapidly. A sample size of five is still quite high up on this curve, but once you come down to sample sizes of 20 or 30 or more, you get a very, very small standard error.
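A few lines of Python reproduce those figures and trace out the curve just described; only the particular sample sizes in the loop are my own choice of points to print.

```python
# Reproducing the worked numbers above and tracing the standard-error curve
# for the penguin example's population standard deviation of 2.88.
import math

sigma = 2.88  # population standard deviation

def standard_error(sigma, n):
    return sigma / math.sqrt(n)

for n in (1, 2, 5, 10, 25, 50, 100):
    print(f"n = {n:>3}:  SE = {standard_error(sigma, n):.2f}")

# n = 5 gives SE ~ 1.29 (the video's ~1.3); one SE above the mean of the sample
# means is 5.04 + 1.29 ~ 6.33, quoted as 6.34 after rounding the SE to 1.3.
# n = 50 gives SE ~ 0.41 (the video's ~0.4).
```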
This slide just reiterates that point, so you can see where those values sit on the graph.

So let's put together what we've just learned about the standard error with what we learned previously about the central limit theorem. What we have just been discussing is that if we have an original population, and it could be any distribution (here it's our uniform distribution), and we take many samples from it, we get a sampling distribution, in this case of the sample means, that is normally distributed, or approximately normally distributed. We know that the sampling distribution has a mean that is approximately equal to the population mean, and we've just learned that the standard deviation of this approximately normal distribution is the standard error; I'll write "standard error" here.

We can actually write this in notation form. We say that this sampling distribution is approximately normal (that's what this tilde squiggle means), and it has a mean equal to the population mean, so I'll just write here that the mean is the population mean. And the standard deviation of that distribution, and we're talking about this distribution down here, is what we call the standard error, and it's approximately equal to the standard deviation of the original population divided by the square root of the sample size n.
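Written out, the notation described above looks roughly like this; the dot over the tilde is one common way of writing "approximately distributed as".

```latex
% The sample mean is approximately normally distributed, centred on the
% population mean, with standard deviation equal to the standard error.
\[
  \bar{x} \;\dot{\sim}\; \mathcal{N}\!\left(\mu_{\bar{x}} = \mu,\;\;
      \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\right)
\]
```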
So this is a key thing that we know. If we have a population of any shape (I'll just write "uniform" in here, but it could be bimodal, it could be uniform, it could be skewed), we know that if we were to take thousands and thousands of samples, or just one thousand, or just a few hundred, the sample means we get from all those samples are going to approximate a normal distribution, and if our sample size is larger, they will approximate a normal distribution even more closely. And we can already determine what the shape of that distribution is going to be, because we know that the mean of the sample means is approximately equal to the population mean, and we know that its standard deviation is the standard error. Okay, so we can work that out.

But what you're probably already thinking is, "Why do I care?" And you may not care, and that's fine; there's no particular reason you have to. But it can be very, very helpful. I'm just going to float this idea, and we'll return to it in future videos.

Hopefully it has crossed your mind to wonder why this strange person keeps taking thousands of samples. You're not going to go to this penguin ice sheet and keep randomly picking five penguins 1,000 times. In science, and in other settings where we collect data, it doesn't work like that; we pretty much always collect just one sample of data.

So suppose we collect one sample. Here I've got the sampling distribution for n = 5 penguins, which is what we got when we did do it 1,000 times. But let's say we did it just once and got a value of around about 7 meters; that was our sample, our one sample. With just one sample, we don't really know how certain or uncertain we should be that it truly reflects the population mean. We knew that if we did this many, many times, the average of all the sample means would converge on the true population mean, and that's our ultimate goal: normally we don't know the population mean, and we're trying to estimate it. So in our one sample we got this value of, say, 7. How confident are we that that is the population mean?

What we're able to do, by knowing that this value of 7 comes, in theory, from a sampling distribution that exists, and that in theory this sampling distribution has a standard deviation we call the standard error, is understand how far this value of 7 (or whatever value we collected; our one sample happened to be 7 meters) is from the mean, in units of standard deviations, or technically, for a sampling distribution, standard errors.

So we're going to come back to this topic, but really the value of the standard error is that it enables us, when we collect one sample, to work out how far away it is from the population mean, and how confident we are that it is a true representation of the population mean. We're going to come back to this in future videos.
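As a preview of that idea, here is a small sketch using the illustrative numbers from this example: a single sample of n = 5 penguins with a sample mean of 7 meters, and the population values of 5.04 and 2.88, which, as noted above, we would not normally know and would have to estimate.

```python
# Sketch of the idea floated above: how many standard errors is our single
# sample mean of 7 m away from the population mean? Uses the example's
# illustrative numbers; in practice the population values would be unknown.
import math

pop_mean, pop_sd, n = 5.04, 2.88, 5
sample_mean = 7.0

se = pop_sd / math.sqrt(n)                  # standard error, ~1.29
distance = (sample_mean - pop_mean) / se    # distance in standard errors

print(f"standard error:         {se:.2f}")
print(f"distance from the mean: {distance:.2f} standard errors")
# About 1.5 standard errors above the population mean: the kind of statement
# the standard error lets us make from a single sample.
```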