Statistics: Sample vs. Population Mean

  • 0:01 - 0:03
    Before, at the end of the last
    video, I actually said that
  • 0:03 - 0:06
    we'd talk about measures of
    dispersion or how things are
  • 0:06 - 0:08
    distributed, but before I go
    into that, I realize that I
  • 0:08 - 0:11
    have more to talk about,
    especially the mean.
  • 0:11 - 0:16
    And before I do that, I want
    to differentiate between a
  • 0:16 - 0:17
    sample and a population.
  • 0:23 - 0:26
    I touched on this a little
    bit in the last video.
  • 0:26 - 0:29
    Let's say I wanted to
    know-- I don't know.
  • 0:29 - 0:32
    Let's say I wanted to know
    the average height of all
  • 0:32 - 0:34
    men in America, right?
  • 0:34 - 0:38
    So let me make the set
    of all men in America.
  • 0:38 - 0:39
    So that's all men in America.
  • 0:39 - 0:43
    I know there's 300 million
    people in the U.S., and half of
  • 0:43 - 0:45
    them maybe roughly are men,
    so this would be 150
  • 0:45 - 0:49
    million men, right?
  • 0:49 - 0:53
    And it would be nearly
    impossible, even if I was
  • 0:53 - 0:57
    intent on doing it, to actually
    measure the average height
  • 0:57 - 0:58
    of every man in America.
  • 0:58 - 1:01
    Frankly, you know, every few
    seconds, one of these men is
  • 1:01 - 1:02
    being born and one of
    these men is dying.
  • 1:02 - 1:05
    So you know by the time I'm
    done measuring everything,
  • 1:05 - 1:07
    someone would have died, and
    some new men would have
  • 1:07 - 1:10
    been born, so it would
    almost be impossible.
  • 1:10 - 1:16
    And if not impossible, it would
    be very tiresome to measure the
  • 1:16 - 1:19
    average, or the mean, or the
    median, or the mode of this
  • 1:19 - 1:21
    entire population, right?
  • 1:21 - 1:24
    So the best way I can get a
    sense of this, because I'm
  • 1:24 - 1:28
    interested in what the average
    of the population is, maybe I
  • 1:28 - 1:30
    can take the average
    of a sample.
  • 1:30 - 1:33
    So what I could do is I can go
    up to, you know-- and I'd try
  • 1:33 - 1:34
    to be pretty random about it.
  • 1:34 - 1:37
    I don't want to like-- you
    know, hopefully, my sample
  • 1:37 - 1:41
    wouldn't be my college's
    basketball team because that
  • 1:41 - 1:44
    would be a skewed sample, but
    I'd try to find random people
  • 1:44 - 1:47
    and random situations where it
    wouldn't be skewed
  • 1:47 - 1:48
    based on height.
  • 1:48 - 1:51
    And I'd maybe collect 10
    heights, and I'd get, well,
  • 1:51 - 1:53
    maybe-- you know, the more
    people I get the more
  • 1:53 - 1:55
    indicative it is, but if I got
    10 heights, and those 10
  • 1:55 - 1:58
    heights were-- I don't know.
  • 1:58 - 2:03
    I'll do it in, you know, 5
    feet, 6 feet, 5 and a half
  • 2:03 - 2:09
    feet, 5.75 feet, and, well,
    let's say I only do 6,
  • 2:09 - 2:11
    or let's say 6 and
    a half feet, right?
  • 2:11 - 2:14
    Those are the five people that
    I'd sample, and we could talk
  • 2:14 - 2:17
    more about what's a good way to
    generate a random sample from a
  • 2:17 - 2:20
    population so it's not skewed
    one way or the other.
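
One common way to keep a sample from being skewed is a simple random sample, where every member of the population has the same chance of being picked. A minimal Python sketch of that idea, assuming a tiny made-up list stands in for the population (the names `population_heights` and `draw_sample`, and the heights themselves, are hypothetical and not from the video):

```python
import random

# Hypothetical stand-in for a population: heights in feet (made-up values).
population_heights = [5.0, 5.25, 5.5, 5.75, 6.0, 6.25, 6.5, 5.4, 5.9, 6.1]


def draw_sample(population, size, seed=None):
    """Return a simple random sample of `size` members, without replacement."""
    rng = random.Random(seed)  # passing a seed makes the draw repeatable
    return rng.sample(population, size)


sample_heights = draw_sample(population_heights, 5, seed=0)
print(sample_heights)
```

Because the draw ignores height entirely, tall and short members are equally likely to land in the sample, which is the opposite of picking the basketball team.
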
  • 2:20 - 2:22
    But anyway, if I wanted to get
    a sense of it and if I was kind
  • 2:22 - 2:24
    of lazy, so I only took
    five measurements, this
  • 2:24 - 2:25
    is the way I would do it.
  • 2:25 - 2:28
    This would be a sample.
  • 2:28 - 2:30
    This would be a sample
    of the population.
  • 2:30 - 2:34
    So instead of taking the mean--
    let's say I wanted to
  • 2:34 - 2:36
    calculate the average by
    taking the arithmetic mean.
  • 2:36 - 2:39
    Instead of taking the
    arithmetic mean of this entire
  • 2:39 - 2:42
    group of 150 million people, I
    might just be happy taking the
  • 2:42 - 2:44
    mean of this sample, and
    that'll be called
  • 2:44 - 2:46
    the sample mean.
  • 2:46 - 2:49
    And I want to introduce you to
    some notation, even though it's
  • 2:49 - 2:54
    kind of-- so in statistics
    speak, the mean, this mu, it's
  • 2:54 - 2:57
    a Greek letter, essentially the
    Greek letter that later turns
  • 2:57 - 3:03
    into m, but mu is the
    population mean, and this is
  • 3:03 - 3:07
    just a convention:
    population mean.
  • 3:10 - 3:15
    And x with a line over it, that
    is equal to a sample mean.
  • 3:18 - 3:20
    And these are just notations
    that people might see, and you
  • 3:20 - 3:22
    might have been confused
    because sometimes you see
  • 3:22 - 3:24
    something-- people talk about
    means, and you see this mu, and
  • 3:24 - 3:28
    sometimes you see this x with a
    line over it, and it's nice
  • 3:28 - 3:29
    to know the distinction.
  • 3:29 - 3:31
    Here they're talking about
    the mean of a sample of the
  • 3:31 - 3:35
    population, and here they're
    talking about the mean of
  • 3:35 - 3:38
    the population as a whole.
  • 3:38 - 3:41
    Now, the way you calculate
    them is essentially the same.
  • 3:41 - 3:43
    If you want to figure out the
    population mean, you'd go to
  • 3:43 - 3:48
    all 150 million people at one
    moment and add up all their
  • 3:48 - 3:51
    heights, and divide by 150
    million to get the
  • 3:51 - 3:52
    population mean.
  • 3:52 - 3:55
    The sample mean, you just add
    up the numbers in your sample
  • 3:55 - 3:58
    and divide by the number
    of data points you have.
  • 3:58 - 4:00
    And I want to show
    you the formulas.
  • 4:00 - 4:03
    I think you know how to
    calculate averages.
  • 4:03 - 4:06
    It's a fairly straightforward
    operation, and I want to show
  • 4:06 - 4:08
    you how it's often written in
    statistics books, so that
  • 4:08 - 4:10
    you're not intimidated
    when you see it.
  • 4:10 - 4:12
    The population mean, they'll
    write it as-- so just to do,
  • 4:12 - 4:14
    you know, the convention.
  • 4:14 - 4:17
    Each member of a-- well, let
    me do the sample first.
  • 4:17 - 4:20
    Each member of a sample, say
    this is the first data point.
  • 4:20 - 4:22
    They'll call that x sub 1.
  • 4:22 - 4:25
    They'll call this x sub 2.
  • 4:25 - 4:30
    They'll call this one x
    sub 3, x sub 4, and this
  • 4:30 - 4:32
    one x sub 5, right?
  • 4:32 - 4:33
    And this is just a way
    of referring to each
  • 4:33 - 4:34
    of the data points.
  • 4:34 - 4:36
    So in a sample mean, they'll
    say, do you know what you do?
  • 4:36 - 4:38
    You take the sum
    of these numbers.
  • 4:38 - 4:41
    And you know how to do that,
    but the fancy way of writing
  • 4:41 - 4:43
    it is to say, let's
    write a capital Sigma.
  • 4:43 - 4:46
    That means the sum.
  • 4:46 - 4:49
    Sum of every x sub n, right?
  • 4:49 - 4:52
    Take the sum of each of
    these numbers, right?
  • 4:52 - 4:58
    This is x sub 1, x sub 2, where
    n goes from 1 to-- I mean, you
  • 4:58 - 5:01
    could say to the size
    of the sample.
  • 5:01 - 5:05
    You know, sometimes-- you know,
    in this case it would be 5, or
  • 5:05 - 5:08
    sometimes they'd write a big--
    they'd write an n like that.
  • 5:08 - 5:11
    And you'd divide it by the
    number of members there are
  • 5:11 - 5:15
    in that sample,
    so divided by n.
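
Written out, the sample mean formula described here would look roughly like the following. The index letter i is an assumption made for readability; the video reuses n for both the index and the sample size.

```latex
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}
```

Here n is the number of data points in the sample, which is 5 in this example.
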
  • 5:15 - 5:17
    You know, when you see this in
    a book, you're like, wow, this
  • 5:17 - 5:18
    is advanced mathematics.
  • 5:18 - 5:21
    But essentially, they're saying
    take the sum of all the data
  • 5:21 - 5:23
    points, just sum up these
    numbers, and divide by the
  • 5:23 - 5:24
    number of numbers there are.
  • 5:24 - 5:27
    So this would just be 5
    plus 6 plus 5.5 plus 5.75
  • 5:27 - 5:30
    plus 6.5 divided by 5.
  • 5:30 - 5:32
    That's all this is telling you.
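
As a quick check of that arithmetic, here is a small Python sketch using the five heights listed earlier (5, 6, 5.5, 5.75, and 6.5 feet). The variable names are just illustrative.

```python
# The five heights (in feet) from the sample in the video.
sample_heights = [5, 6, 5.5, 5.75, 6.5]

# Sum the data points, then divide by how many there are.
total = sum(sample_heights)   # 28.75
n = len(sample_heights)       # 5
sample_mean = total / n

print(sample_mean)  # 5.75
```
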
  • 5:32 - 5:34
    For the population mean,
    it's the same thing.
  • 5:34 - 5:36
    They just use a slightly
    different notation.
  • 5:36 - 5:42
    They'll say that's equal to the
    sum from n is equal to 1 to a
  • 5:42 - 5:46
    big N-- and I'll explain why
    they write a big N-- of each
  • 5:46 - 5:50
    data point in the population,
    not just the sample, all
  • 5:50 - 5:53
    that divided by big N.
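
In symbols, the population mean described here would be written roughly as below. As before, the index letter i is an assumption for readability rather than the video's own notation.

```latex
\mu = \frac{\sum_{i=1}^{N} x_i}{N}
```

The only differences from the sample mean are the symbol on the left and the capital N, which counts every member of the population.
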
  • 5:53 - 5:55
    And this is just a way of,
    when they write big N,
  • 5:55 - 5:57
    they mean 150 million.
  • 5:57 - 6:00
    They mean, you know, we want
    you to get every data point
  • 6:00 - 6:01
    in the entire population.
  • 6:01 - 6:04
    So that's what they mean by--
    and then divide by the number
  • 6:04 - 6:05
    of the entire population.
  • 6:05 - 6:08
    While the small n, they're kind
    of-- it's just the convention,
  • 6:08 - 6:11
    the notation, that they say,
    hey, we just want you to get
  • 6:11 - 6:14
    some smaller number, not
    the entire population.
  • 6:14 - 6:16
    But the way you calculate
    them is, you know, they're
  • 6:16 - 6:19
    essentially equivalent.
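
To make the "essentially equivalent" point concrete, here is a hedged Python sketch in which one function computes both means. The ten-value population is made up and tiny (a real one would have around 150 million entries), and the names `mean`, `population`, `mu`, and `x_bar` are illustrative assumptions.

```python
import random


def mean(values):
    """Arithmetic mean: sum of the data points divided by their count."""
    return sum(values) / len(values)


# Hypothetical population of N heights in feet (made-up values, tiny N).
population = [5.0, 5.25, 5.5, 5.75, 6.0, 6.25, 6.5, 5.4, 5.9, 6.1]
N = len(population)

# A sample of n members drawn at random from that population.
n = 5
sample = random.Random(1).sample(population, n)

mu = mean(population)   # population mean: uses all N data points
x_bar = mean(sample)    # sample mean: uses only the n sampled points

print(f"mu    = {mu:.3f} over N = {N}")
print(f"x_bar = {x_bar:.3f} over n = {n}")
```

The calculation is identical in both cases; only the set of data points passed in changes, which is why the sample mean generally lands near, but not exactly on, the population mean.
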
  • 6:19 - 6:22
    Anyway, I wanted to leave you
    with that just because this is
  • 6:22 - 6:25
    something that if you don't get
    it clarified early on-- it's a
  • 6:25 - 6:28
    fairly simple concept-- later
    on, it becomes very confusing
  • 6:28 - 6:29
    when people want to
    differentiate between the
  • 6:29 - 6:31
    population and the sample mean.
  • 6:31 - 6:33
    And you see these formulas
    written slightly differently.
  • 6:33 - 6:37
    Sometimes you'll see a mu, and
    sometimes you'll see an x with
  • 6:37 - 6:39
    a line over it for
    the sample mean.
  • 6:39 - 6:41
    Anyway, I'll see you in
    the next video.
Video Language:
English
Team:
Khan Academy
Duration:
06:42