
Central Limit Theorem

  • 0:01 - 0:03
    In this video, I want to
    talk about what is easily
  • 0:03 - 0:07
    one of the most fundamental and
    profound concepts in statistics
  • 0:07 - 0:09
    and maybe in all of mathematics.
  • 0:09 - 0:10
    And that's the
    central limit theorem.
  • 0:17 - 0:18
    And what it tells us
    is we can start off
  • 0:18 - 0:21
    with any distribution that
    has a well-defined mean and
  • 0:21 - 0:23
    variance-- and if it has
    a well-defined variance,
  • 0:23 - 0:25
    it has a well-defined
    standard deviation.
  • 0:25 - 0:28
    And it could be a continuous
    distribution or a discrete one.
  • 0:28 - 0:30
    I'll draw a discrete one,
    just because it's easier
  • 0:30 - 0:33
    to imagine, at least for
    the purposes of this video.
  • 0:33 - 0:36
    So let's say I have a discrete
    probability distribution
  • 0:36 - 0:37
    function.
  • 0:37 - 0:39
    And I want to be
    very careful not
  • 0:39 - 0:41
    to make it look anything close
    to a normal distribution.
  • 0:41 - 0:44
    Because I want to show you
    the power of the central limit
  • 0:44 - 0:44
    theorem.
  • 0:44 - 0:46
    So let's say I have
    a distribution.
  • 0:46 - 0:48
    Let's say it could take
    on values 1 through 6.
  • 0:48 - 0:51
    1, 2, 3, 4, 5, 6.
  • 0:51 - 0:53
    It's some kind of crazy dice.
  • 0:53 - 0:54
    It's very likely to get a one.
  • 0:54 - 0:56
    Let's say it's
    impossible-- well,
  • 0:56 - 0:57
    let me make that
    a straight line.
  • 0:57 - 0:59
    You have a very high
    likelihood of getting a 1.
  • 0:59 - 1:01
    Let's say it's
    impossible to get a 2.
  • 1:01 - 1:03
    Let's say it's an OK likelihood
    of getting a 3 or a 4.
  • 1:03 - 1:05
    Let's say it's
    impossible to get a 5.
  • 1:05 - 1:08
    And let's say it's very
    likely to get a 6 like that.
  • 1:08 - 1:10
    So that's my probability
    distribution function.
  • 1:10 - 1:12
    If I were to draw a
    mean-- this is symmetric,
  • 1:12 - 1:15
    so maybe the mean would
    be something like that.
  • 1:15 - 1:16
    The mean would be halfway.
  • 1:16 - 1:18
    So that would be my
    mean right there.
  • 1:18 - 1:19
    The standard
    deviation maybe would
  • 1:19 - 1:21
    look-- it would be
    that far and that
  • 1:21 - 1:23
    far above and below the mean.
  • 1:23 - 1:26
    But that's my discrete
    probability distribution
  • 1:26 - 1:26
    function.
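The video describes the shape of this distribution but never states exact probabilities, so here is a minimal sketch with assumed values (1 and 6 very likely, 2 and 5 impossible, 3 and 4 moderately likely) showing that it has a well-defined mean, variance, and standard deviation:

```python
import numpy as np

# Hypothetical probabilities for the "crazy dice" described in the video:
# 1 and 6 are very likely, 2 and 5 are impossible, 3 and 4 are OK likelihoods.
values = np.array([1, 2, 3, 4, 5, 6])
probs = np.array([0.35, 0.0, 0.15, 0.15, 0.0, 0.35])  # sums to 1

mean = np.sum(values * probs)                # 3.5 -- halfway, since it's symmetric
var = np.sum((values - mean) ** 2 * probs)   # well-defined variance
sd = np.sqrt(var)                            # well-defined standard deviation
print(mean, var, sd)
```

Because the assumed distribution is symmetric around 3.5, the mean lands exactly halfway between 1 and 6, matching the sketch in the video.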
  • 1:26 - 1:29
    Now what I'm going to do
    here, instead of just taking
  • 1:29 - 1:31
    samples of this
    random variable that's
  • 1:31 - 1:34
    described by this probability
    distribution function,
  • 1:34 - 1:36
    I'm going to take samples of it.
  • 1:36 - 1:38
    But I'm going to
    average the samples
  • 1:38 - 1:39
    and then look at
    those samples and see
  • 1:39 - 1:42
    the frequency of the
    averages that I get.
  • 1:42 - 1:44
    And when I say average,
    I mean the mean.
  • 1:44 - 1:45
    Let me define something.
  • 1:45 - 1:48
    Let's say my sample size-- and
    I could put any number here.
  • 1:48 - 1:58
    But let's say first off we try a
    sample size of n is equal to 4.
  • 1:58 - 2:00
    And what that means is I'm going
    to take four samples from this.
  • 2:00 - 2:03
    So let's say the first
    time I take four samples--
  • 2:03 - 2:06
    so my sample size is
    four-- let's say I get a 1.
  • 2:06 - 2:08
    Let's say I get another 1.
  • 2:08 - 2:09
    And let's say I get a 3.
  • 2:09 - 2:11
    And I get a 6.
  • 2:11 - 2:15
    So that right there is my
    first sample of sample size 4.
  • 2:15 - 2:16
    I know the terminology
    can get confusing.
  • 2:16 - 2:20
    Because this is the sample
    that's made up of four samples.
  • 2:20 - 2:23
    But then when we talk about the
    sample mean and the sampling
  • 2:23 - 2:25
    distribution of the
    sample mean, which we're
  • 2:25 - 2:28
    going to talk more and more
    about over the next few videos,
  • 2:28 - 2:32
    normally the sample refers
    to the set of samples
  • 2:32 - 2:33
    from your distribution.
  • 2:33 - 2:36
    And the sample size tells
    you how many you actually
  • 2:36 - 2:37
    took from your distribution.
  • 2:37 - 2:39
    But the terminology
    can be very confusing,
  • 2:39 - 2:42
    because you could easily view
    one of these as a sample.
  • 2:42 - 2:44
    But we're taking four
    samples from here.
  • 2:44 - 2:46
    We have a sample size of four.
  • 2:46 - 2:48
    And what I'm going to do is
    I'm going to average them.
  • 2:48 - 2:51
    So let's say the mean-- I
    want to be very careful when
  • 2:51 - 2:51
    I say average.
  • 2:51 - 2:55
    The mean of this first
    sample of size 4 is what?
  • 2:55 - 2:56
    1 plus 1 is 2.
  • 2:56 - 2:58
    2 plus 3 is 5.
  • 2:58 - 3:00
    5 plus 6 is 11.
  • 3:00 - 3:06
    11 divided by 4 is 2.75.
  • 3:06 - 3:11
    That is my first sample mean
    for my first sample of size 4.
  • 3:11 - 3:12
    Let me do another one.
  • 3:12 - 3:19
    My second sample of size 4,
    let's say that I get a 3, a 4.
  • 3:19 - 3:21
    Let's say I get another 3.
  • 3:21 - 3:22
    And let's say I get a 1.
  • 3:22 - 3:24
    I just didn't happen
    to get a 6 that time.
  • 3:24 - 3:26
    And notice I can't
    get a 2 or a 5.
  • 3:26 - 3:27
    It's impossible for
    this distribution.
  • 3:27 - 3:29
    The chance of getting
    a 2 or 5 is 0.
  • 3:29 - 3:31
    So I can't have any
    2s or 5s over here.
  • 3:31 - 3:38
    So for the second
    sample of sample size 4,
  • 3:38 - 3:42
    my second sample mean is
    going to be 3 plus 4 is 7.
  • 3:42 - 3:46
    7 plus 3 is 10 plus 1 is 11.
  • 3:46 - 3:50
    11 divided by 4,
    once again, is 2.75.
  • 3:50 - 3:51
    Let me do one more,
    because I really
  • 3:51 - 3:53
    want to make it clear
    what we're doing here.
  • 3:53 - 3:54
    So I do one more.
  • 3:54 - 3:55
    Actually, we're going
    to do a gazillion more.
  • 3:55 - 3:57
    But let me just do
    one more in detail.
  • 3:57 - 4:01
    So let's say my third
    sample of sample size 4--
  • 4:01 - 4:04
    so I'm going to
    literally take 4 samples.
  • 4:04 - 4:06
    So my sample is
    made up of 4 samples
  • 4:06 - 4:08
    from this original
    crazy distribution.
  • 4:08 - 4:13
    Let's say I get a 1,
    a 1, and a 6 and a 6.
  • 4:13 - 4:19
    And so my third sample mean
    is going to be 1 plus 1 is 2.
  • 4:19 - 4:20
    2 plus 6 is 8.
  • 4:20 - 4:21
    8 plus 6 is 14.
  • 4:21 - 4:30
    14 divided by 4 is 3 and 1/2.
  • 4:30 - 4:32
    And as I find each
    of these sample
  • 4:32 - 4:35
    means-- so for each of my
    samples of sample size 4,
  • 4:35 - 4:37
    I figure out a mean.
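The three worked examples above can be checked with a few lines of arithmetic; each sample of size 4 is averaged to produce one sample mean:

```python
# The three samples of size 4 taken in the video.
samples = [
    [1, 1, 3, 6],  # first sample  -> (1+1+3+6)/4 = 2.75
    [3, 4, 3, 1],  # second sample -> (3+4+3+1)/4 = 2.75
    [1, 1, 6, 6],  # third sample  -> (1+1+6+6)/4 = 3.5
]
sample_means = [sum(s) / len(s) for s in samples]
print(sample_means)  # [2.75, 2.75, 3.5]
```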
  • 4:37 - 4:38
    And as I do each
    of them, I'm going
  • 4:38 - 4:41
    to plot it on a
    frequency distribution.
  • 4:41 - 4:44
    And this is all going to
    amaze you in a few seconds.
  • 4:44 - 4:47
    So I plot this all on a
    frequency distribution.
  • 4:47 - 4:49
    So I say, OK, on
    my first sample,
  • 4:49 - 4:52
    my first sample mean was 2.75.
  • 4:52 - 4:55
    So I'm plotting the actual
    frequency of the sample
  • 4:55 - 4:56
    means I get for each sample.
  • 4:56 - 4:59
    So 2.75, I got it one time.
  • 4:59 - 5:00
    So I'll put a little plot there.
  • 5:00 - 5:02
    So that's from that
    one right there.
  • 5:02 - 5:05
    And the next time,
    I also got a 2.75.
  • 5:05 - 5:06
    That's a 2.75 there.
  • 5:06 - 5:08
    So I got it twice.
  • 5:08 - 5:10
    So I'll plot the
    frequency right there.
  • 5:10 - 5:11
    Then I got a 3 and 1/2.
  • 5:11 - 5:14
    So all the possible values,
    I could have a 3,
  • 5:14 - 5:17
    I could have a 3.25, I
    could have a 3 and 1/2.
  • 5:17 - 5:19
    So then I have the 3 and 1/2,
    so I'll plot it right there.
  • 5:19 - 5:21
    And what I'm going
    to do is I'm going
  • 5:21 - 5:23
    to keep taking these samples.
  • 5:23 - 5:25
    Maybe I'll take 10,000 of them.
  • 5:25 - 5:27
    So I'm going to keep
    taking these samples.
  • 5:27 - 5:30
    So I go all the way to S 10,000.
  • 5:30 - 5:31
    I just do a bunch of these.
  • 5:31 - 5:34
    And what it's going to look like
    over time is each of these--
  • 5:34 - 5:36
    I'm going to make it
    a dot, because I'm
  • 5:36 - 5:37
    going to have to zoom out.
  • 5:37 - 5:41
    So if I look at it like
    this, over time-- it still
  • 5:41 - 5:43
    has all the values that it
    might be able to take on,
  • 5:43 - 5:46
    2.75 might be here.
  • 5:46 - 5:48
    So this first dot is
    going to be-- this one
  • 5:48 - 5:50
    right here is going
    to be right there.
  • 5:50 - 5:53
    And that second one is
    going to be right there.
  • 5:53 - 5:56
    Then that one at 3.5 is
    going to look right there.
  • 5:56 - 5:58
    But I'm going to
    do it 10,000 times.
  • 5:58 - 5:59
    Because I'm going
    to have 10,000 dots.
  • 5:59 - 6:02
    And let's say as I do it, I'm
    going to just keep plotting them.
  • 6:02 - 6:04
    I'm just going to keep
    plotting the frequencies.
  • 6:04 - 6:06
    I'm just going to
    keep plotting them
  • 6:06 - 6:08
    over and over and over again.
  • 6:08 - 6:10
    And what you're going
    to see is, as I take
  • 6:10 - 6:12
    many, many samples
    of size 4, I'm
  • 6:12 - 6:14
    going to have
    something that's going
  • 6:14 - 6:18
    to start kind of approximating
    a normal distribution.
  • 6:18 - 6:23
    So each of these dots represent
    an incidence of a sample mean.
  • 6:23 - 6:25
    So as I keep adding on
    this column right here,
  • 6:25 - 6:28
    that means I kept getting
    the sample mean 2.75.
  • 6:28 - 6:29
    So over time,
  • 6:29 - 6:30
    I'm going to have
    something that's
  • 6:30 - 6:33
    starting to approximate
    a normal distribution.
  • 6:33 - 6:36
    And that is a neat thing about
    the central limit theorem.
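The 10,000-sample experiment described above is easy to simulate. This sketch assumes the same hypothetical probabilities as before (the video never states them); it draws 10,000 samples of size 4 and collects the sample means, whose frequency plot starts to approximate a normal distribution centered near the original mean of 3.5:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed probabilities for the crazy distribution (not stated in the video).
values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]

n = 4                 # sample size
num_samples = 10_000  # how many sample means we collect

draws = rng.choice(values, size=(num_samples, n), p=probs)
sample_means = draws.mean(axis=1)

# A histogram of sample_means would look roughly bell-shaped,
# centered near the original distribution's mean of 3.5.
print(sample_means.mean())
```

Plotting `sample_means` with any histogram tool reproduces the dot-stacking frequency plot built up in the video.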
  • 6:39 - 6:42
    So in orange, that's the
    case for n is equal to 4.
  • 6:42 - 6:45
    This was a sample size of 4.
  • 6:45 - 6:48
    Now, if I did the same thing
    with a sample size of maybe
  • 6:48 - 6:52
    20-- so in this case, instead
    of just taking 4 samples
  • 6:52 - 6:55
    from my original crazy
    distribution, every sample
  • 6:55 - 6:58
    I take 20 instances
    of my random variable,
  • 6:58 - 7:00
    and I average those 20.
  • 7:00 - 7:03
    And then I plot the
    sample mean on here.
  • 7:03 - 7:04
    So in that case,
    I'm going to have
  • 7:04 - 7:07
    a distribution that
    looks like this.
  • 7:07 - 7:09
    And we'll discuss
    this in more videos.
  • 7:09 - 7:12
    But it turns out if I were
    to plot 10,000 of the sample
  • 7:12 - 7:14
    means here, I'm going
    to have something
  • 7:14 - 7:18
    that, two things-- it's going
    to even more closely approximate
  • 7:18 - 7:19
    a normal distribution.
  • 7:19 - 7:20
    And we're going to
    see in future videos,
  • 7:20 - 7:22
    it's actually going to
    have a smaller-- well,
  • 7:22 - 7:23
    let me be clear.
  • 7:23 - 7:26
    It's going to have
    the same mean.
  • 7:26 - 7:27
    So that's the mean.
  • 7:27 - 7:29
    This is going to
    have the same mean.
  • 7:29 - 7:32
    So it's going to have a
    smaller standard deviation.
  • 7:32 - 7:34
    Well, I should plot
    these from the bottom
  • 7:34 - 7:35
    because you kind of stack it.
  • 7:35 - 7:37
    Once you get one, then another
    instance and another instance.
  • 7:37 - 7:39
    But this is going to
    more and more approach
  • 7:39 - 7:40
    a normal distribution.
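The two claims just made -- same mean, smaller standard deviation as n grows -- can be checked empirically. A sketch under the same assumed probabilities, comparing the spread of sample means for n = 4 versus n = 20:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed probabilities for the crazy distribution (not stated in the video).
values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]

def sd_of_sample_means(n, num_samples=10_000):
    """Standard deviation of num_samples sample means, each of size n."""
    draws = rng.choice(values, size=(num_samples, n), p=probs)
    return draws.mean(axis=1).std()

print(sd_of_sample_means(4))   # wider spread around the mean of 3.5
print(sd_of_sample_means(20))  # noticeably tighter spread, same center
```

Both distributions of sample means center on 3.5; the n = 20 one is tighter, consistent with the standard deviation of the sample mean shrinking as the sample size grows.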
  • 7:40 - 7:45
    So this is what's super
    cool about the central limit
  • 7:45 - 7:46
    theorem.
  • 7:46 - 7:53
    As your sample size
    becomes larger--
  • 7:53 - 7:55
    or you could even say as
    it approaches infinity.
  • 7:55 - 7:56
    But you really don't
    have to get that close
  • 7:56 - 7:59
    to infinity to really get
    close to a normal distribution.
  • 7:59 - 8:01
    Even if you have a
    sample size of 10 or 20,
  • 8:01 - 8:04
    you're already getting very
    close to a normal distribution,
  • 8:04 - 8:06
    in fact about as
    good an approximation
  • 8:06 - 8:08
    as we see in our everyday life.
  • 8:08 - 8:12
    But what's cool is we can start
    with some crazy distribution.
  • 8:12 - 8:15
    This has nothing to do
    with a normal distribution.
  • 8:15 - 8:17
    This was n equals 4, but if
    we have a sample size of n
  • 8:17 - 8:20
    equals 10 or n
    equals 100, and we
  • 8:20 - 8:22
    were to take 100 of these,
    instead of four here,
  • 8:22 - 8:24
    and average them and
    then plot that average,
  • 8:24 - 8:27
    the frequency of it, then we
    take 100 again, average them,
  • 8:27 - 8:29
    take the mean, plot
    that again, and if we
  • 8:29 - 8:30
    do that a bunch
    of times, in fact,
  • 8:30 - 8:32
    if we were to do that
    an infinite number of times,
  • 8:32 - 8:33
    we would find that, especially
  • 8:33 - 8:35
    if we had an
    infinite sample size,
  • 8:35 - 8:38
    we would find a perfect
    normal distribution.
  • 8:38 - 8:39
    That's the crazy thing.
  • 8:39 - 8:42
    And it doesn't apply just
    to taking the sample mean.
  • 8:42 - 8:44
    Here we took the
    sample mean every time.
  • 8:44 - 8:46
    But you could have also
    taken the sample sum.
  • 8:46 - 8:49
    The central limit theorem
    would have still applied.
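The same experiment works with sample sums instead of sample means, since the sum is just the mean scaled by n. A sketch, again with the assumed probabilities:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed probabilities for the crazy distribution (not stated in the video).
values = [1, 2, 3, 4, 5, 6]
probs = [0.35, 0.0, 0.15, 0.15, 0.0, 0.35]

# Take the sample SUM instead of the sample mean; the central limit
# theorem still applies, and the histogram of sums is also bell-shaped.
n = 20
sums = rng.choice(values, size=(10_000, n), p=probs).sum(axis=1)
print(sums.mean())  # near n * 3.5 = 70
```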
  • 8:49 - 8:51
    But that's what's so
    super useful about it.
  • 8:51 - 8:54
    Because in life, there's all
    sorts of processes out there,
  • 8:54 - 8:57
    proteins bumping into
    each other, people doing
  • 8:57 - 9:01
    crazy things, humans
    interacting in weird ways.
  • 9:01 - 9:03
    And you don't know the
    probability distribution
  • 9:03 - 9:04
    functions for any
    of those things.
  • 9:04 - 9:06
    But what the central
    limit theorem
  • 9:06 - 9:09
    tells us is if we add a
    bunch of those actions
  • 9:09 - 9:11
    together, assuming that they
    all have the same distribution,
  • 9:11 - 9:14
    or if we were to take the
    mean of all of those actions
  • 9:14 - 9:17
    together, and if we were to plot
    the frequency of those means,
  • 9:17 - 9:19
    we do get a normal distribution.
  • 9:19 - 9:22
    And that's frankly why the
    normal distribution shows up
  • 9:22 - 9:26
    so much in statistics
    and why, frankly, it's
  • 9:26 - 9:28
    a very good
    approximation for the sum
  • 9:28 - 9:31
    or the means of a
    lot of processes.
  • 9:31 - 9:34
    Normal distribution.
  • 9:34 - 9:36
    What I'm going to show you in
    the next video is I'm actually
  • 9:36 - 9:39
    going to show you that this is
    a reality, that as you increase
  • 9:39 - 9:41
    your sample size, as
    you increase your n,
  • 9:41 - 9:43
    and as you take a
    lot of sample means,
  • 9:43 - 9:46
    you're going to have a frequency
    plot that looks very, very
  • 9:46 - 9:48
    close to a normal distribution.
Video Language: English
Team: Khan Academy
Duration: 09:49