https:/.../2020-02-21_psy317l_standard_error.mp4

0:01 - 0:05

Hello, in this video, I want to talk about
the standard error and this is really
0:05 - 0:10

extending our understanding of sampling
distributions and the central limit theorem.
0:10 - 0:13

So, let's talk about what
a standard error is.
0:15 - 0:18

First of all, we'll go back to
this penguin example and
0:18 - 0:22

you've seen this distribution before
as a uniform distribution of data.
0:23 - 0:27

It has, like any distribution, it has--
there's descriptive statistics.
0:27 - 0:29

So, it has a population mean.
0:29 - 0:30

The average is 5.04.
0:30 - 0:34

The average penguin is 5.04 meters
from the edge of the ice sheet.
0:34 - 0:37

You can calculate a standard
deviation for this.
0:37 - 0:40

So, the standard deviation is 2.88.
0:40 - 0:42

So, that's the, you know,
a measure of the spread.
0:42 - 0:48

And there were 5,000 penguins floating
on this ice sheet, that's the n,
0:48 - 0:49

the population size.
0:50 - 0:55

We then discussed how, if you were
just to sample either just randomly select
0:55 - 1:00

five penguins at a time or 50 penguins
at a time, that each of those samples
1:00 - 1:04

of, let's pick the n equals five for now,
each of those five penguins,
1:04 - 1:08

you could calculate what the average
distance from the edge of the ice sheet
1:08 - 1:13

was for each of those individual penguins,
sample of five penguins and if you were
1:13 - 1:17

to do that over and over and over again
and in this histogram, we did it
1:17 - 1:23

1,000 times, we would be able to generate
what's called the sampling distribution.
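
The resampling procedure just described can be sketched in a few lines of Python. This is only an illustration, not the code behind the lecture's histogram: the population below is a hypothetical stand-in (5,000 distances drawn uniformly between 0 and 10 meters), and the names `population` and `sample_means` are made up for this sketch.

```python
import random
import statistics

random.seed(317)  # arbitrary seed so the run is repeatable

# Hypothetical stand-in for the population: 5,000 penguins whose distance
# from the edge of the ice sheet is uniform between 0 and 10 meters.
population = [random.uniform(0, 10) for _ in range(5000)]

def sample_means(pop, n, reps=1000):
    """Draw `reps` random samples of size n and return each sample's mean."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(reps)]

# 1,000 samples of five penguins each, one mean per sample.
means_n5 = sample_means(population, n=5)

print("population mean:     ", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means_n5), 2))
```

A histogram of `means_n5` would be the sampling distribution of the sample means for n = 5.
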
1:25 - 1:30

And it's the sampling distribution of
the sample means, that's what it is
1:30 - 1:35

and I told you that we could calculate
from that what the average of
1:35 - 1:38

those sample means across
the 1,000 samples was and
1:38 - 1:44

that's this value and the notation that
we use for that is this mu and then
1:44 - 1:53

subscript x bar and that's the mean
of the sample means and I've forgotten
1:53 - 1:57

what it was, the exact value, but it's
pretty much going to approximate
1:57 - 1:58

very, very close.
1:58 - 2:04

So, I just put approximately equal to
5.05, just go back, it's 5.04.
2:04 - 2:11

So, it was-- it's going to approximate
the population average and you can
2:11 - 2:13

do that for any sample size.
2:13 - 2:17

So, that was sample size five,
let's look at the sample size 50.
2:17 - 2:24

Again, we have the mean of the sampling
distribution-- sorry, the mean of
2:24 - 2:29

the sample means and that is also going
to be very close to 5.04, it might be
2:29 - 2:32

a little bit closer because
our sample size is larger.
2:33 - 2:36

Two other things to notice about
these distributions, number one
2:36 - 2:39

they're normally distributed, or rather,
they approximate normal
2:39 - 2:43

distributions, despite the fact that
the original distribution of penguins,
2:43 - 2:46

the population distribution, was
a uniform distribution.
2:46 - 2:51

Second thing to notice, the sample size
doesn't really affect the value
2:51 - 2:56

of the mean of the sample means, but it does
affect the standard deviation of
2:56 - 2:57

the sample means.
2:57 - 3:00

So, if this is a normal distribution,
or we believe it to approximate,
3:00 - 3:08

and then also this approximates
a normal distribution, then, it's clear
3:08 - 3:15

that the distance here, let's just assume
that's a standard deviation and I put it
3:15 - 3:17

in the right place.
3:17 - 3:21

This standard deviation, it's greater than
whatever the corresponding value is
3:21 - 3:25

over here, if that's also
the standard deviation.
3:25 - 3:29

So, as the sample size gets larger,
the spread of the sample means
3:29 - 3:33

gets smaller, so, we can say
the standard deviation gets smaller.
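
As a rough check on this claim, the sketch below repeats the resampling for n = 5 and n = 50 and compares the spread of the two sets of sample means. The uniform 0 to 10 meter population and the helper name `sd_of_sample_means` are assumptions for illustration, not the lecture's data.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

# Assumed stand-in population: 5,000 distances, uniform on 0 to 10 meters.
population = [random.uniform(0, 10) for _ in range(5000)]

def sd_of_sample_means(pop, n, reps=1000):
    """Standard deviation of the means of `reps` random samples of size n."""
    means = [statistics.mean(random.sample(pop, n)) for _ in range(reps)]
    return statistics.stdev(means)

sd_n5 = sd_of_sample_means(population, 5)
sd_n50 = sd_of_sample_means(population, 50)

print("SD of sample means, n = 5: ", round(sd_n5, 2))
print("SD of sample means, n = 50:", round(sd_n50, 2))
```
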
3:33 - 3:38

Now, does this standard deviation have any
relationship at all to the original
3:38 - 3:41

standard deviation of
the original population?
3:41 - 3:45

The original standard deviation was 2.88,
so, I'll just say population of
3:45 - 3:49

the original-- standard deviation was 2.88
3:49 - 3:52

Is there any relationship at all between
these two standard deviations?
3:52 - 3:56

Because it's not like the mean of
the sample means, which is pretty
3:56 - 4:01

much the same, regardless of the sample
size, I mean it does get better with
4:01 - 4:04

larger samples but it approximates,
it's close, especially if you have
4:04 - 4:06

enough of these samples.
4:06 - 4:09

What's the relationship of these standard
deviations because it's clear that when
4:09 - 4:15

you change n, this value is going to
change, so is there a relationship?
4:15 - 4:17

And it turns out that there is
a relationship and we're going to
4:17 - 4:18

look into that.
4:18 - 4:24

This graph here just shows you that
the normal approximation becomes
4:24 - 4:27

better and better the larger
the sample size, so, it's a little
4:27 - 4:30

tricky to see but let me, I just want to
really point out one or two things here.
4:30 - 4:34

I'm going to pick a color
that represents that.
4:34 - 4:39

So, this value here, actually in red, so,
if I was just to pick one penguin at
4:39 - 4:44

a time, a sample size of one, this is
my estimate of the sample-- I'm going
4:44 - 4:45

for the red line here.
4:45 - 4:50

That's my estimate of the sample--
sorry, let's say that again.
4:50 - 4:52

That's the distribution of
the sample means.
4:52 - 4:54

It looks like the original population.
4:54 - 4:57

So, for a sample size of one, you don't
get a normal distribution of the sample
4:57 - 5:01

means, you get whatever
the original population was.
5:01 - 5:07

Let's look at two and I've got to find it
on here, so, it's the orange one and
5:07 - 5:12

I believe it's this one here.
5:12 - 5:13

It is this one here.
5:13 - 5:15

This is what it looks like.
5:15 - 5:17

This is the n is two.
5:17 - 5:19

So, again, not a really
normal distribution.
5:19 - 5:22

Now, let's skip to 50.
5:22 - 5:27

This is 50 here and you can see it
really, you don't need me to help
5:27 - 5:28

you too much.
5:28 - 5:31

This is the 50 value, it's very normal.
5:31 - 5:36

And then, we got blue at ten--
sorry, 25 here.
5:36 - 5:38

This is the 25 one and so on.
5:38 - 5:40

This is the ten.
5:40 - 5:42

This is the five.
5:42 - 5:44

I wanted to just show you this graph
because I wanted to show you that
5:44 - 5:49

even with very, very, very small
sample sizes of like five, we already
5:49 - 5:51

get very close to a normal distribution.
5:51 - 5:55

It's only with ridiculously small
sample sizes, like one or two, that
5:55 - 5:57

we don't do a very good job.
5:57 - 6:00

So, even with small sample sizes,
we get close to a normal distribution
6:00 - 6:03

for the sampling distribution of
the sample means.
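
One hedged way to put a number on how normal each curve looks: in a normal distribution about 68% of values fall within one standard deviation of the mean, whereas for a uniform distribution it is about 58%. The sketch below computes that share for the sample means at n = 1, 2, 5 and 50. The uniform 0 to 10 meter population, the 5,000 repetitions, and the name `share_within_one_sd` are all assumptions for illustration; this is not how the lecture's graph was produced.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

# Assumed stand-in population: 5,000 distances, uniform on 0 to 10 meters.
population = [random.uniform(0, 10) for _ in range(5000)]

def share_within_one_sd(pop, n, reps=5000):
    """Share of sample means lying within one SD of their own mean."""
    means = [sum(random.sample(pop, n)) / n for _ in range(reps)]
    center = statistics.mean(means)
    spread = statistics.stdev(means)
    inside = sum(1 for m in means if abs(m - center) <= spread)
    return inside / reps

shares = {n: share_within_one_sd(population, n) for n in (1, 2, 5, 50)}
for n, share in shares.items():
    print(f"n = {n:>2}: {share:.1%} of sample means within one SD")
```
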
6:03 - 6:07

So, back to the problem
I just posed a moment ago.
6:07 - 6:15

This is our original standard deviation
of a population, this is our population
6:15 - 6:17

and whenever we get a sample,
and again, this is just the sample
6:17 - 6:18

size of five.
6:18 - 6:19

This is the distribution of sample means.
6:19 - 6:27

The mean is going to approximate the mean
here but what is the relationship of
6:27 - 6:31

the standard deviation to
this original population?
6:31 - 6:33

What is the relationship?
6:33 - 6:39

It must be also related to the sample size
because it changes with its sample size.
6:39 - 6:43

And it's just a formula and we're not
going to talk too much about--
6:43 - 6:47

we're not going to talk much really at all
about how it's derived but this formula
6:47 - 6:52

here, very neatly, just tells us
about their relationship and
6:52 - 6:56

so, what we have here is this is
our standard deviation of
6:56 - 7:00

the sampling distribution of
the sample means.
7:00 - 7:03

So, we call that sigma subscript x bar,
7:03 - 7:04

sigma x bar.
7:04 - 7:08

The standard deviation, so just to really
reiterate what we're looking at, this is
7:08 - 7:13

the distribution of sample means,
this is-- we're looking for this value
7:13 - 7:16

what's this standard deviation?
7:16 - 7:22

And actually, technically, that's
the notation, what is that standard
7:22 - 7:23

deviation?
7:23 - 7:25

So, what we do is, we just take
the original population.
7:25 - 7:30

This is the population standard deviation
from the original population and we're
7:30 - 7:35

going to divide it by the square root of n
and that gives us that this value,
7:35 - 7:37

this standard deviation.
7:37 - 7:41

Its technical name is the standard
deviation of the sampling distribution
7:41 - 7:44

of the sample means, which is an awful
mouthful, so we just call
7:44 - 7:45

it the standard error of
7:45 - 7:49

the mean. That's what we call it:
the standard error of the mean.
7:49 - 7:55

So, this graph illustrates how
the standard error of the mean
7:55 - 7:57

changes by sample size.
7:57 - 8:07

So, if I just go back to-- maybe,
I'll just go back to this slide here
8:07 - 8:11

and we were asking the question of,
you know, what's this value over
8:11 - 8:15

sample size 50 compared to this
value of a sample size of five?
8:15 - 8:19

So, that was the question and I'm going to
plot-- maybe here I'll plot it or write it
8:19 - 8:20

sorry.
8:20 - 8:25

So, this is the formula, the standard
error of the mean or the standard
8:25 - 8:28

deviation of the sampling distribution
of the sample means is equal to
8:28 - 8:32

the original population standard deviation
divided by the square root of n.
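
Written out with the notation used in this video (sigma subscript x bar for the standard error of the mean, sigma for the population standard deviation, n for the sample size), the formula reads:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```
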
8:32 - 8:37

So, when we had that sample size of five,
which is this one up here, what we're
8:37 - 8:42

really looking at is this, the original
standard deviation was 2.88 and
8:42 - 8:46

we're going to divide by the square root
of the sample size which is five, so that
8:46 - 8:48

equals 1.3.
8:48 - 8:53

So, the standard deviation here is 1.3 and
that standard error we call that is 1.3.
8:53 - 8:59

So, what this is saying is this value here
is 1.3 higher than, what was it, I forget.
8:59 - 9:04

I think it was 5.04 was the mean of
the sample means and so this value here
9:04 - 9:10

is going to be a 6.5-- nope, nope, not five.
9:10 - 9:15

It's going to be at 6.34.
9:15 - 9:22

This is one standard deviation above
the mean of the sample means, but if we have
9:22 - 9:27

a sample size of fifty, then
the calculation becomes this.
9:27 - 9:30

Becomes the original standard deviation
of the population divided by the square
9:30 - 9:33

root of 50, which is equal to and I've
9:33 - 9:36

written this down so I can check, 0.4.
9:36 - 9:40

So, back to this graph,
this value is 0.4,
9:40 - 9:43

and this value is 1.3.
9:43 - 9:47

And so, it gets smaller the bigger the
sample size.
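
The two calculations above can be checked with a short Python sketch. The values 2.88 and 5.04 are the ones quoted in the video; the variable names are just for illustration. Carried to two decimals, the results are 1.29 and 0.41, which the video rounds to 1.3 and 0.4, so one standard error above the mean comes out at 6.33 rather than the 6.34 obtained from the rounded 1.3.

```python
import math

sigma = 2.88  # population standard deviation quoted in the video
mu = 5.04     # population mean quoted in the video

se_n5 = sigma / math.sqrt(5)    # standard error for samples of 5
se_n50 = sigma / math.sqrt(50)  # standard error for samples of 50

print(round(se_n5, 2))       # 1.29
print(round(se_n50, 2))      # 0.41
print(round(mu + se_n5, 2))  # 6.33, one standard error above the mean
```
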
9:47 - 9:52

This graph here that I got to previously
is actually showing us
9:52 - 9:56

how the standard error changes by
the sample size.
9:56 - 10:00

So we just had a sample size of 50,
which is approximately here.
10:00 - 10:07

If we go across to this value on this
axis, it tells us that's about 0.4,
10:07 - 10:12

sample size of 50, and if we had
a sample size of 5,
10:12 - 10:16

which is approximately here --
I'm doing a line, not very well,
10:16 - 10:20

but it goes to about there.
This was about 1.3.
10:20 - 10:24

And I just want you to -- there's nothing
really too much for you to take home
10:24 - 10:27

from this graph other than showing you
that as the sample size increases,
10:27 - 10:32

that, for any population
standard deviation that we have,
10:32 - 10:37

the standard error is going to get
much smaller very rapidly.
10:37 - 10:41

A sample size of 5 is still quite high up
on this curve,
10:41 - 10:44

but once you come down to sample sizes
of 20 or 30 or more,
10:44 - 10:49

then we get a very, very small
standard error.
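
To see the shape of that curve in numbers, here is a minimal sketch that tabulates the standard error for a handful of sample sizes. It uses the population standard deviation of 2.88 from the video; the particular sample sizes listed are an arbitrary choice for illustration.

```python
import math

sigma = 2.88  # population standard deviation quoted in the video

sample_sizes = [1, 2, 5, 10, 20, 30, 50, 100]  # arbitrary illustrative values
standard_errors = [sigma / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:>3}: standard error = {se:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, which is why the curve drops steeply at first and then flattens out.
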
10:51 - 10:57

This is just to reiterate that point so
you can see what these are on this graph.
10:57 - 11:00

So let's put together what we've
just learned about the standard error
11:00 - 11:05

with what we have learned previously about
the Central Limit Theorem.
11:05 - 11:09

So what we have just been discussing is
that we just know that we have
11:09 - 11:11

an original population,
it could be any distribution,
11:11 - 11:13

here's our uniform distribution.
11:13 - 11:17

If we take many samples from it,
we get our sampling distribution.
11:17 - 11:25

In this case, the distribution of the sample means
is normally distributed
11:25 - 11:27

or approximately normally distributed.
11:27 - 11:35

And we know that the sampling distribution
has a mean that is approximately equal to
11:35 - 11:40

the population mean and we've just learned
that we just know now that
11:40 - 11:44

the standard deviation of this
approximately normal distribution,
11:44 - 11:47

this is the standard error.
11:47 - 11:50

I'll write here, "standard error."
11:50 - 11:54

So we can actually write this in
notation form,
11:54 - 11:57

and we say that this sampling distribution
is approximately normal,
11:57 - 12:00

this is what this tilde squiggle means,
is approximately normal,
12:00 - 12:08

approximately normal and it has a mean
of the population mean,
12:08 - 12:11

so I'll just write here,
the mean is the population mean.
12:11 - 12:13

And the standard deviation of that
distribution,
12:13 - 12:16

and we're talking about this distribution
down here,
12:16 - 12:19

the standard deviation of that
distribution is the standard error,
12:19 - 12:21

that's what we call it.
12:21 - 12:23

And it's approximately equal to the
standard deviation of the
12:23 - 12:27

original population divided by the
square root of the sample size n.
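
In symbols, the statement just described can be written as follows, reading the tilde as "is approximately normally distributed with" and, as in the video, giving the second value as a standard deviation. Many textbooks instead give the variance, sigma squared over n, in that position, so check which convention is in use.

```latex
\bar{x} \sim N\!\left(\mu_{\bar{x}} = \mu,\ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\right)
```
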
12:27 - 12:35

So, this is a key thing that we know.
If we have a population of any --
12:35 - 12:38

I'll just write "uniform" in here,
of any type, it could be bimodal,
12:38 - 12:40

it could be uniform, it could be skewed,
we know that if we were to take
12:40 - 12:44

thousands and thousands of samples
or just one thousand -- or just a few
12:44 - 12:47

hundred samples, the sample means that
we get from all those samples
12:47 - 12:50

are going to approximate
a normal distribution.
12:50 - 12:53

If our sample size is larger,
it's going to approximate
12:53 - 12:57

a normal distribution even more.
And we can already determine what the
12:57 - 13:01

shape of that distribution is going to be
because we know that the population mean
13:01 - 13:04

is approximately equal to the mean
of the sample means,
13:04 - 13:09

and we know that the standard deviation,
this is the standard error,
13:09 - 13:15

we know that that, the standard error,
is the standard deviation of the
13:15 - 13:17

sampling distribution.
13:17 - 13:20

Okay, so we can work that out.
13:20 - 13:23

But the thing is, what you're probably
already thinking is,
13:23 - 13:26

"why do you care?" And you may not care,
and that's fine.
13:26 - 13:29

There's no reason to particularly.
13:29 - 13:34

But, it can be very, very helpful.
I'm just going to just float this idea
13:34 - 13:38

and we'll return to it in future videos.
13:38 - 13:43

Hopefully it's gone through your head
that why is this strange person
13:43 - 13:46

taking thousands of samples all the time?
13:46 - 13:47

You know, you're not going to go to this
penguin ice sheet and just keep
13:47 - 13:51

randomly picking 5 penguins at random
1,000 times.
13:51 - 13:54

Science and other types of time --
when we collect data,
13:54 - 13:57

it doesn't work like that.
We pretty much usually only just collect
13:57 - 13:59

one sample of data.
13:59 - 14:03

And so, when we collect one sample of data
and this here -- I've got
14:03 - 14:08

sampling distribution of n = 5 penguins.
14:08 - 14:10

This is when we did do it 1,000 times.
14:10 - 14:14

But let's just say that we did it one time
and we got a value around about here,
14:14 - 14:18

around about 7 meters,
that was our sample.
14:18 - 14:20

We just got one sample.
14:20 - 14:25

If we just got one sample,
we don't know anything really about that
14:25 - 14:30

in terms of how certain or how uncertain
are we that this truly is the population mean.
14:30 - 14:34

We knew if we did this many, many times
the average of all the sample means
14:34 - 14:37

would converge on the true
population mean.
14:37 - 14:38

And that's our ultimate goal,
we're trying to est--
14:38 - 14:42

normally we don't know the population mean
we're trying to estimate it.
14:42 - 14:46

So in our one sample, we just got this
value of 7, say.
14:46 - 14:51

How confident are we that that is
the population mean?
14:51 - 14:56

And so, what we're able to do is rely on
this belief: we know
14:56 - 15:01

that this value of 7 does come from,
in theory,
15:01 - 15:04

a sampling distribution that exists.
15:04 - 15:08

And in theory, this sampling distribution
exists with a standard deviation
15:08 - 15:11

that we call the standard error.
We're able to understand how far
15:11 - 15:16

this value of 7, or any value that
we collected, it could be some other value
15:16 - 15:20

but our one sample was 7 meters,
we get a sense of how far away
15:20 - 15:24

from the mean that is in the units
of standard deviations
15:24 - 15:26

or technically,
with a sampling distribution,
15:26 - 15:27

standard errors.
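
As a rough illustration of measuring that distance in standard errors, the sketch below uses the numbers from this video: a sample mean of 7 meters from one sample of n = 5, a population mean of 5.04 and a population standard deviation of 2.88. In practice the population values are usually unknown, so treating them as known here is an assumption made purely for the example.

```python
import math

mu = 5.04          # population mean quoted in the video (normally unknown)
sigma = 2.88       # population standard deviation quoted in the video
n = 5              # size of the single sample
sample_mean = 7.0  # the one sample mean used as an example

standard_error = sigma / math.sqrt(n)
distance_in_se = (sample_mean - mu) / standard_error

print(round(distance_in_se, 2))  # 1.52 standard errors above the mean
```
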
15:27 - 15:30

So we're going to come back to this topic,
but really the value of the standard error
15:30 - 15:34

is that it enables us to determine,
when we collect one sample,
15:34 - 15:40

we're able to work out how far away
or how confident we are in our value,
15:40 - 15:42

that is, how far away it is from the
population mean,
15:42 - 15:46

how confident we are that this is a true
representation of the population mean.
15:46 - 15:48

We're going to come back to this
in future videos.

Title:: https:/.../2020-02-21_psy317l_standard_error.mp4
Video Language:: English
Duration:: 16:05

