Normal Distribution Excel Exercise

  • 0:02 - 0:05
    In this video we're going to
    cover what is arguably, the
  • 0:05 - 0:10
    single most important concept
    in all of statistics.
  • 0:10 - 0:14
    Well if you go into almost any
    scientific field you might even
  • 0:14 - 0:16
    argue that it's the single
    most important concept.
  • 0:16 - 0:19
    I've actually told people that
    it's kind of sad they don't
  • 0:19 - 0:22
    cover this in the
    core curriculum.
  • 0:22 - 0:24
    Everyone should know about it
    because it touches on every
  • 0:24 - 0:28
    single aspect of our lives and
    that's the normal distribution
  • 0:28 - 0:31
    or the Gaussian distribution
    or the bell curve.
  • 0:31 - 0:35
    And just to kind of give you a
    preview of what it is, my
  • 0:35 - 0:40
    preview will actually make it
    seem pretty strange but as we
  • 0:40 - 0:42
    go through this video hopefully
    you'll get a little bit more
  • 0:42 - 0:45
    intuition of what
    it's all about.
  • 0:45 - 0:48
    The Gaussian distribution or
    the normal distribution,
  • 0:48 - 0:50
    they're two words
    for the same thing.
  • 0:50 - 0:53
    It was actually Gauss
    who came up with it.
  • 0:53 - 0:56
    I think he was studying
    astronomical phenomena
  • 0:56 - 0:57
    when he did.
  • 0:57 - 1:00
    But it's a probability density
    function just like we studied with
  • 1:00 - 1:01
    the Poisson distribution.
  • 1:01 - 1:02
    It's just like that.
  • 1:02 - 1:05
    And just to give you the
    preview it looks like this.
  • 1:05 - 1:09
    The probability of getting
    any x, and it's a
  • 1:09 - 1:12
    class of probability
    distribution functions.
  • 1:12 - 1:15
    Just like the binomial
    distribution is and the Poisson
  • 1:15 - 1:19
    distribution, it's based on
    a bunch of parameters.
  • 1:19 - 1:21
    This is how you would
    traditionally see it written in
  • 1:21 - 1:23
    a lot of textbooks and if we
    have time, I'd like to
  • 1:23 - 1:26
    rearrange the algebra just to
    get a little bit more intuition
  • 1:26 - 1:27
    of how it all works out.
  • 1:27 - 1:29
    Or maybe we could get
    some insights on where
  • 1:29 - 1:30
    it all came from.
  • 1:30 - 1:32
    I'm not going to prove it in
    this video, that's a little
  • 1:32 - 1:33
    bit beyond our scope.
  • 1:33 - 1:35
    Although, I do want to do it
    and there's actually some
  • 1:35 - 1:39
    really neat mathematics
    that might show up.
  • 1:39 - 1:41
    If you're a math lead there's
    something called Stirling's
  • 1:41 - 1:43
    formula that you might want
    to do a Wikipedia search on,
  • 1:43 - 1:44
    which is really fascinating.
  • 1:44 - 1:48
    It approximates factorials
    with essentially a
  • 1:48 - 1:49
    continuous function.
  • 1:49 - 1:50
    But I won't go into
    that right now.
  • 1:53 - 1:57
    The normal distribution is 1
    over -- this is how it's
  • 1:57 - 2:00
    normally written -- the
    standard deviation times the
  • 2:00 - 2:08
    square root of 2 pi times
    e to the minus 1/2.
  • 2:08 - 2:12
    Well, I like to write it this
    way, it's easier to remember,
  • 2:12 - 2:17
    times whatever value you're
    trying to get minus the mean of
  • 2:17 - 2:21
    our distribution divided by the
    standard deviation of our
  • 2:21 - 2:25
    distribution squared.
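
Written out, the formula just described can be sketched as follows. The symbols are an assumption chosen here for illustration: \mu stands for the mean of the distribution, \sigma for its standard deviation, and x for the value being evaluated.

```latex
p(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
```
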
  • 2:25 - 2:27
    And so if you think about it,
    actually this is a good thing
  • 2:27 - 2:28
    to just notice right now.
  • 2:28 - 2:31
    This is how far I am from the
    mean and we're dividing that
  • 2:31 - 2:33
    by the standard deviation of
    whatever our distribution is.
  • 2:33 - 2:35
    This is a preview of actually a
    normal distribution that I've
  • 2:35 - 2:38
    plotted, the purple line here
    is a normal distribution.
  • 2:38 - 2:41
    Initially the whole exercise --
    I know I jump around a little
  • 2:41 - 2:44
    bit -- is to show you that the
    normal distribution is a good
  • 2:44 - 2:49
    approximation for the binomial
    distribution and vice versa.
  • 2:49 - 2:52
    If you take enough trials in
    your binomial distribution, and
  • 2:52 - 2:55
    we'll touch on that in a second.
  • 2:55 - 2:57
    The intuition of this term
    right here I think is
  • 2:57 - 3:01
    interesting because we're
    saying, how far are we away
  • 3:01 - 3:03
    from the mean, we're dividing
    by the standard deviation.
  • 3:03 - 3:08
    So this whole term right here
    is how many standard deviations
  • 3:08 - 3:10
    we are away from the mean.
  • 3:10 - 3:12
    This is actually called
    the standard z score.
  • 3:12 - 3:16
    One thing I found in statistics
    is there's a lot of words a lot
  • 3:16 - 3:18
    of definitions and they all
    sound very fancy, the
  • 3:18 - 3:20
    standard z score.
  • 3:20 - 3:25
    But the underlying concept
    is pretty straightforward.
  • 3:25 - 3:30
    Let's say I had a probability
    distribution and I get some x
  • 3:30 - 3:32
    value that's out here and it's
    3 and a half standard
  • 3:32 - 3:35
    deviations away from the
    mean, then its standard
  • 3:35 - 3:36
    z score is 3 and a half.
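
A minimal sketch of the z score idea in Python. The function name and the sample numbers (a value of 8.5, a mean of 5, a standard deviation of 1) are hypothetical and chosen only so that the result is 3 and a half; they do not come from the spreadsheet.

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies away from the mean."""
    return (x - mean) / sd

# Hypothetical numbers for illustration only.
print(z_score(8.5, 5.0, 1.0))  # 3.5
```
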
  • 3:36 - 3:39
    Anyway let's focus on the
    purpose of this video.
  • 3:39 - 3:43
    So that's what the normal
    distribution, I guess
  • 3:43 - 3:45
    the probability density
    function for the normal
  • 3:45 - 3:46
    distribution looks like.
  • 3:46 - 3:49
    But how did it get there?
  • 3:49 - 3:51
    By the end of this video you
    should at least feel
  • 3:51 - 3:55
    comfortable that this is a good
    approximation for the binomial
  • 3:55 - 3:58
    distribution if you're
    taking enough trials.
  • 3:58 - 4:01
    And that's what's fascinating
    about the normal distribution
  • 4:01 - 4:04
    is that if you have the sum --
    and I'll do a whole other video
  • 4:04 - 4:08
    on the central limit theorem --
    but if you have the sum of many
  • 4:08 - 4:12
    independent trials approaching
    infinity, that the distribution
  • 4:12 - 4:14
    of those, even though the
    distribution of each of those
  • 4:14 - 4:18
    trials might have been
    non-normal but the distribution
  • 4:18 - 4:21
    of the sum of all those trials
    approaches the normal
  • 4:21 - 4:21
    distribution.
  • 4:21 - 4:23
    I'll talk more
    about that later.
  • 4:23 - 4:27
    But that's why it's such a good
    distribution to kind of assume
  • 4:27 - 4:29
    for a lot of underlying
    phenomena.
  • 4:29 - 4:31
    If you're kind of modeling
    weather patterns or drug
  • 4:31 - 4:35
    interactions, and we'll talk
    about where it might work well
  • 4:35 - 4:36
    and where it might
    not work so well.
  • 4:36 - 4:39
    Like sometimes people might
    assume things like a normal
  • 4:39 - 4:42
    distribution in finance and
    we've seen the financial crisis
  • 4:42 - 4:44
    that's led to a lot of
    things blowing up.
  • 4:44 - 4:46
    Anyway, let's go back to this.
  • 4:46 - 4:47
    This is a spreadsheet
    right here.
  • 4:47 - 4:52
    I just made a black background
    and you can download it at
  • 4:52 - 5:02
    khanacademy.org/downloads
    Actually, if you just do that
  • 5:02 - 5:03
    you'll see all of
    the downloads.
  • 5:03 - 5:05
    I haven't put it there yet, I'm
    going to do it right after I
  • 5:05 - 5:06
    record the video:
    downloads/normal
  • 5:06 - 5:06
    distribution.xls.
  • 5:18 - 5:22
    If you just go up to
    khanacademy.org/download/
  • 5:22 - 5:23
    you'll see all the things
    there and you'll see
  • 5:23 - 5:24
    this spreadsheet.
  • 5:24 - 5:27
    I encourage you to play with it
    and maybe do other spreadsheets
  • 5:27 - 5:29
    where you experiment with it.
  • 5:29 - 5:33
    So in this spreadsheet, what we
    do is play a game. Let's
  • 5:33 - 5:36
    say I'm standing on a street
    and I flip a coin, I flip
  • 5:36 - 5:37
    a completely fair coin.
  • 5:37 - 5:45
    If I get heads, this is heads,
    I take a step backwards or
  • 5:45 - 5:46
    let's say a step to the left.
  • 5:46 - 5:51
    And if I get a tails I
    take a step to the right.
  • 5:51 - 5:54
    So in general I always have a
    -- this is a completely fair
  • 5:54 - 5:56
    coin -- I have a 50 percent
    chance of taking a step to the
  • 5:56 - 5:59
    left and I have a 50 percent
    chance of taking a
  • 5:59 - 6:00
    step to the right.
  • 6:00 - 6:03
    So your intuition there is if
    I told you I took a
  • 6:03 - 6:07
    thousand flips of the coin
    you're going to keep
  • 6:07 - 6:08
    going left and right.
  • 6:08 - 6:10
    If by chance you get a bunch
    of heads, you might end up
  • 6:10 - 6:13
    really kind of moving
    over to the left.
  • 6:13 - 6:16
    If you get a bunch of tails you
    might move over to the right.
  • 6:16 - 6:21
    And we learned already the odds
    of getting a bunch of tails or
  • 6:21 - 6:25
    many more tails than heads is a
    lot lower than things kind of
  • 6:25 - 6:29
    being equal or close to equal.
  • 6:29 - 6:37
    Right here what I've done --
    let me scroll down a little
  • 6:37 - 6:49
    bit, I don't want to lose the
    whole thing -- is I have this
  • 6:49 - 6:51
    little assumption here and I
    encourage you to fill that out
  • 6:51 - 6:52
    and change it as you like.
  • 6:52 - 6:55
    This is the number
    of steps I take.
  • 6:55 - 6:59
    This is the mean number of left
    steps and all I did is I got
  • 6:59 - 7:01
    the probability and we
    figured out the mean of the
  • 7:01 - 7:03
    binomial distribution.
  • 7:03 - 7:07
    The mean of the binomial
    distribution is essentially the
  • 7:07 - 7:09
    probability of taking a left
    step times the total
  • 7:09 - 7:11
    number of trials.
  • 7:11 - 7:14
    So that's equal to 5, that's
    where that number comes from.
  • 7:14 - 7:17
    And then the variance -- and
    I'm not sure if I went over
  • 7:17 - 7:19
    this, and I need to prove this to
    you if I haven't, and I'll make a
  • 7:19 - 7:22
    whole other video on the
    variance of the binomial
  • 7:22 - 7:27
    distribution -- is essentially
    equal to the number of trials,
  • 7:27 - 7:33
    10 times the probability of
    taking the left step or kind of
  • 7:33 - 7:36
    a successful trial -- I'm
    defining left as a successful
  • 7:36 - 7:41
    trial, that could be right as
    well -- times the probability
  • 7:41 - 7:44
    of 1 minus the successful trial
    or non successful trial.
  • 7:44 - 7:46
    In this case they're equally
    probable and that's where
  • 7:46 - 7:49
    I got the 2.5 from.
  • 7:49 - 7:50
    And that's all on
    the spreadsheet.
  • 7:50 - 7:52
    If you actually click on
    the cell and look at the
  • 7:52 - 7:53
    actual formula I did that.
  • 7:53 - 7:55
    Although sometimes when you
    see it in Excel it's a
  • 7:55 - 7:56
    little bit confusing.
  • 7:56 - 7:57
    And this is just the square
    root of that number.
  • 7:57 - 7:59
    The standard deviation
    is just the square
  • 7:59 - 8:01
    root of the variance.
  • 8:01 - 8:04
    That's just the
    square root of 2.5.
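
The three assumption cells described above can be reproduced in a few lines of Python. This is a sketch under the numbers stated in the video (10 steps, a fair coin); the variable names are illustrative and are not taken from the spreadsheet.

```python
import math

n = 10    # total number of steps, as in the example
p = 0.5   # probability of a left step with a fair coin

mean = n * p                # mean number of left steps
variance = n * p * (1 - p)  # variance of the binomial distribution
sd = math.sqrt(variance)    # standard deviation

print(mean, variance, round(sd, 3))  # 5.0 2.5 1.581
```
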
  • 8:04 - 8:09
    And so if you look here
    this says, OK what is
  • 8:09 - 8:11
    the probability that
    I take 0 steps?
  • 8:11 - 8:14
    So I take a total of 10 steps
    -- just to understand this
  • 8:14 - 8:18
    spreadsheet -- what is the
    probability that I
  • 8:18 - 8:20
    take 0 left steps?
  • 8:20 - 8:23
    And just to make clear, if I
    take 0 left steps that means I
  • 8:23 - 8:25
    must have taken 10 right steps.
  • 8:25 - 8:27
    And I calculate this
    probability -- I should have
  • 8:27 - 8:32
    drawn maybe a line here -- I
    calculate this using the
  • 8:32 - 8:34
    binomial distribution.
  • 8:34 - 8:35
    And how do I do that?
  • 8:41 - 8:45
    Let me actually switch
    colors just to make
  • 8:45 - 8:46
    things interesting.
  • 8:46 - 8:48
    Do they have a purple here?
  • 8:48 - 8:51
    I'll do a blue.
  • 8:51 - 8:54
    So blue for binomial.
  • 8:54 - 8:59
    So what I have here is
    how many total steps?
  • 8:59 - 9:00
    There's a total of 10 steps.
  • 9:00 - 9:04
    So 10 factorial, that's kind of
    the number of trials I have.
  • 9:04 - 9:09
    Of that I'm choosing
    0 to go left.
  • 9:09 - 9:14
    So over 0 factorial times
    10 minus 0 factorial.
  • 9:14 - 9:16
    This is 10 choose 0.
  • 9:16 - 9:20
    I'm choosing 0 left steps of
    the total 10 steps I'm taking
  • 9:20 - 9:24
    times the probability of 0 left
    steps so, it's the probability
  • 9:24 - 9:28
    of a left step, I'm only taking
    0 of them times the probability
  • 9:28 - 9:32
    of a right step, and I'm
    taking 10 of those.
  • 9:32 - 9:35
    So that's where this number
    came from, this .001.
  • 9:35 - 9:38
    That's what the binomial
    distribution tells us.
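
As a rough check on the .001 figure, the binomial calculation can be sketched in Python. The function name `binomial_pmf` is an assumption made for this example; `math.comb` computes "n choose k" directly, which is equivalent to the factorial expression described above.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# 0 left steps out of 10 with a fair coin
print(round(binomial_pmf(0, 10, 0.5), 3))  # 0.001
```
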
  • 9:38 - 9:45
    And then this one similarly, is
    10 factorial over 1 factorial
  • 9:45 - 9:47
    over 10 minus 1 factorial.
  • 9:47 - 9:48
    That's how I get that one.
  • 9:48 - 9:51
    And once again, if you click
    on the actual cell you'll
  • 9:51 - 9:52
    see that explained.
  • 9:52 - 9:53
    We've done this multiple times.
  • 9:53 - 9:54
    This is just a
    binomial calculation.
  • 9:54 - 9:59
    Then right here, after
    this line right here, you
  • 9:59 - 10:01
    can almost ignore it.
  • 10:01 - 10:03
    I did that so that I can do a
    bunch of different scenarios.
  • 10:03 - 10:09
    For example, if I were to go to
    my spreadsheet, and instead of
  • 10:09 - 10:18
    doing 10 I wanted to do 20
    steps then everything changes.
  • 10:18 - 10:23
    And that's why down here after
    you get to a certain point the
  • 10:23 - 10:26
    whole thing just
    kind of repeats.
  • 10:26 - 10:28
    I'll let you think
    about why I do that.
  • 10:28 - 10:30
    Maybe I should have made
    a cleaner spreadsheet.
  • 10:30 - 10:33
    But it doesn't affect the
    scatter plot chart that I did.
  • 10:33 - 10:38
    And so this plot in blue, and
    you can't see it because the
  • 10:38 - 10:40
    purple is almost right over it.
  • 10:40 - 10:44
    Actually let me make it
    smaller so that you can see.
  • 10:44 - 10:48
    Let's say I only took 6 steps.
  • 10:48 - 10:51
    Well it's still hard to see the
    difference between the two.
  • 10:51 - 10:55
    Once again the whole point
    of this is to see that the
  • 10:55 - 10:57
    normal distribution is
    a good approximation.
  • 10:57 - 10:59
    But they're so close that
    you can't even see the
  • 10:59 - 10:59
    difference on mine.
  • 10:59 - 11:02
    If you only took four steps,
    OK, I think you can see here.
  • 11:05 - 11:06
    Let me get my screen draw on.
  • 11:10 - 11:13
    The blue curve is
    right around there.
  • 11:13 - 11:15
    This is the binomial.
  • 11:15 - 11:17
    There's only a few points
    here, the points
  • 11:17 - 11:19
    only go up to here.
  • 11:19 - 11:22
    This is if I take 0 steps left,
    1 step left, 2 steps left,
  • 11:22 - 11:23
    3 steps left, 4 steps left.
  • 11:23 - 11:26
    And then I plot it and then I
    say what's the probability
  • 11:26 - 11:28
    using the binomial
    distribution?
  • 11:28 - 11:30
    And this is my final
    position right?
  • 11:30 - 11:33
    If I take 0 steps to the left
    then I take 4 steps to the
  • 11:33 - 11:36
    right so my final position
    is at 4, so that's the
  • 11:36 - 11:38
    scenario right here.
  • 11:38 - 11:40
    Let me switch my color back to
    yellow, it's easier to see.
  • 11:44 - 11:50
    If I take 4 steps to the left,
    I take 0 steps to the right and
  • 11:50 - 11:53
    so my final position is
    going to be at minus 4.
  • 11:53 - 11:54
    It's going to be here.
  • 11:54 - 11:59
    If I take an equal amount of
    both, that's this scenario,
  • 11:59 - 12:00
    then I'm neutral.
  • 12:00 - 12:03
    I'm just stuck in the
    middle right here.
  • 12:03 - 12:05
    I take 2 steps to the right and
    then I take 2 steps to the left
  • 12:05 - 12:08
    or vice versa, I take 2 steps
    to the left and then I take
  • 12:08 - 12:10
    2 steps right and I
    end up right there.
  • 12:10 - 12:13
    Hopefully that makes
    a little sense.
  • 12:13 - 12:14
    My phone is ringing.
  • 12:14 - 12:17
    I'll ignore that because
    the normal distribution
  • 12:17 - 12:18
    is so important.
  • 12:18 - 12:21
    Actually, my 9 week old son is
    watching so this is the first
  • 12:21 - 12:23
    time I have a live audience.
  • 12:23 - 12:27
    He might pick up something
    about the normal distribution.
  • 12:27 - 12:31
    So the blue line right here --
    I'll trace it maybe in yellow
  • 12:31 - 12:35
    so you can see it -- is the
    plot of the binomial
  • 12:35 - 12:36
    distribution.
  • 12:36 - 12:41
    I connected the lines but you
    see the binomial distribution
  • 12:41 - 12:43
    looks something more like this.
  • 12:43 - 12:47
    This is the probability
    of getting to minus 4.
  • 12:47 - 12:52
    This is the probability
    of going to minus 2.
  • 12:52 - 12:55
    This right here is
    the probability of
  • 12:55 - 12:57
    ending up nowhere.
  • 12:57 - 13:06
    Then this is the probability of
    ending up 2 to the right and
  • 13:06 - 13:10
    this is the probability of
    ending up 4 to the right.
  • 13:10 - 13:12
    This is the binomial
    distribution, I just plotted
  • 13:12 - 13:14
    these points right here.
  • 13:14 - 13:14
    This is 0.375.
  • 13:14 - 13:17
    This is 0.375.
  • 13:17 - 13:18
    That's the height of that.
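
The five points of the 4-step plot can be tabulated with a short Python sketch. It assumes, as in the video, that each left step moves one unit left and each right step one unit right, so the final position is right steps minus left steps. The function name is hypothetical.

```python
import math

def walk_outcomes(total_steps, p_left=0.5):
    """List (left_steps, final_position, probability) for every outcome."""
    outcomes = []
    for left in range(total_steps + 1):
        right = total_steps - left
        position = right - left  # positive means net steps to the right
        prob = math.comb(total_steps, left) * p_left**left * (1 - p_left)**right
        outcomes.append((left, position, prob))
    return outcomes

for left, position, prob in walk_outcomes(4):
    print(left, position, prob)
```

With 4 steps, the middle row (2 left steps, final position 0) carries the 0.375 probability read off the chart.
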
  • 13:18 - 13:21
    Now what I wanted to show
    you is that the normal
  • 13:21 - 13:25
    distribution approximates
    the binomial distribution.
  • 13:25 - 13:30
    So this right here, I wanted
    to say what does the normal
  • 13:30 - 13:35
    distribution tell me is the
    probability of ending up
  • 13:35 - 13:38
    with exactly 0 left steps?
  • 13:38 - 13:42
    This is a little bit tricky.
  • 13:42 - 13:45
    The binomial distribution
    is a discrete probability
  • 13:45 - 13:46
    distribution.
  • 13:46 - 13:48
    You can just look at this chart
    or look here and you say, what
  • 13:48 - 13:56
    is the probability of having
    exactly 1 left step and 3 right
  • 13:56 - 13:58
    steps which puts me right here?
  • 13:58 - 14:01
    Well you just look at this
    chart and you say oh, that
  • 14:01 - 14:02
    puts me right there.
  • 14:02 - 14:05
    I just read that probability,
    it's actually .25.
  • 14:05 - 14:07
    And I say oh, I have a 25
    percent chance of ending
  • 14:07 - 14:12
    up 2 steps to the right.
  • 14:12 - 14:14
    There's a 25 percent chance.
  • 14:14 - 14:17
    The normal distribution
    function is a continuous
  • 14:17 - 14:20
    probability distribution so
    it's a continuous curve.
  • 14:20 - 14:22
    It looks like that, it's a
    bell curve and it goes off
  • 14:22 - 14:26
    to infinity and starts
    approaching 0 on both sides.
  • 14:26 - 14:28
    It looks something like that.
  • 14:28 - 14:30
    This is a continuous
    probability distribution.
  • 14:30 - 14:32
    You can't just take a point
    here and say, what's the
  • 14:32 - 14:35
    probability that I end
    up 2 feet to the right?
  • 14:35 - 14:37
    Because if you just say that,
    the actual
  • 14:37 - 14:40
    probability of being exactly --
    and you should watch my
  • 14:40 - 14:42
    video on probability density
    functions -- but the
  • 14:42 - 14:45
    probability of being exactly 2
    feet to the right, exactly, I
  • 14:45 - 14:48
    mean I'm talking to the
    atoms, is close to 0.
  • 14:48 - 14:51
    You actually have to specify
    a range around this.
  • 14:51 - 14:57
    What I assume in this is
    within half a foot
  • 14:57 - 14:58
    in either direction.
  • 14:58 - 14:59
    Right?
  • 14:59 - 15:00
    If we're talking about feet.
  • 15:00 - 15:04
    To figure that out what I did
    here is I took the value of
  • 15:04 - 15:07
    the probability density
    function there.
  • 15:07 - 15:10
    And I'll show you how
    I evaluated that.
  • 15:10 - 15:12
    And then I multiply that by 1.
  • 15:12 - 15:15
    So it gives me this area.
  • 15:15 - 15:18
    And I use that as an
    approximation for this area.
  • 15:18 - 15:20
    If you really want to be
    particular about it what you
  • 15:20 - 15:23
    would do is you would take the
    integral of this curve between
  • 15:23 - 15:27
    this point and this point
    as a better approximation.
  • 15:27 - 15:28
    We'll do that in the future.
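
To get a feel for how close the rectangle is to the integral, here is a minimal Python sketch. It assumes the 4-step case (mean of 2 left steps, standard deviation of 1) and measures x in left steps with a band of half a unit on either side. The helper names are made up for this example, and the integral is computed from the normal cumulative distribution via `math.erf`, which is not how the spreadsheet does it.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

mean, sd = 2.0, 1.0  # 4 fair-coin steps: mean 2 left steps, sd 1
x = 1                # exactly 1 left step

rectangle = normal_pdf(x, mean, sd) * 1.0  # height times a base of 1
integral = normal_cdf(x + 0.5, mean, sd) - normal_cdf(x - 0.5, mean, sd)

print(rectangle, integral)
```

Both numbers come out near 0.242, against the exact binomial value of 0.25.
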
  • 15:28 - 15:30
    But right now I just want to
    give you the intuition that the
  • 15:30 - 15:32
    binomial distribution really
    does converge to the
  • 15:32 - 15:33
    normal distribution.
  • 15:33 - 15:36
    So how did I get this
    number right here?
  • 15:36 - 15:45
    Well I said, what is the
    probability that I take 1 left
  • 15:45 - 15:53
    step -- I kind of used left
    steps as success -- of one?
  • 15:53 - 15:59
    And that equaled 1 over
    the standard deviation.
  • 15:59 - 16:02
    When I only took 4 steps the
    standard deviation was 1.
  • 16:02 - 16:04
    So 1 over 1.
  • 16:04 - 16:05
    Actually let me change this.
  • 16:08 - 16:10
    Let me change it to
    a higher number.
  • 16:14 - 16:16
    We'll go back to the
    example where I'm at 10.
  • 16:19 - 16:21
    So if this is at 10.
  • 16:21 - 16:23
    Let me go back to
    my drawing tool.
  • 16:29 - 16:31
    Let me do this calculation.
  • 16:31 - 16:35
    Actually, even better let
    me do this calculation.
  • 16:35 - 16:37
    So what's the probability
    that I have 2 left steps?
  • 16:37 - 16:40
    If I have 2 left steps I took a
    total of 10 steps so I'm going
  • 16:40 - 16:43
    to have 8 right steps and
    that puts me 6 to the right.
  • 16:43 - 16:46
    So that's this
    point right here.
  • 16:46 - 16:47
    So what's that probability?
  • 16:47 - 16:49
    How do I figure this out
    using the probability
  • 16:49 - 16:50
    density function?
  • 16:50 - 16:51
    How do I figure this height?
  • 16:51 - 16:56
    Well I say the probability of
    taking 2 left steps -- that's
  • 16:56 - 16:58
    how I calculate it, if you
    actually click on the cell
  • 16:58 - 17:04
    you'll see that -- is equal to
    1 over the standard deviation,
  • 17:04 - 17:11
    1.581 -- and I just directly
    reference the cell there --
  • 17:11 - 17:13
    times the square root of 2 pi.
  • 17:15 - 17:19
    I'm always in awe of the whole
    notion of e to the i pi is
  • 17:19 - 17:20
    equal to negative 1
    and all of that.
  • 17:20 - 17:21
    But there's another
    amazing thing.
  • 17:21 - 17:26
    That all of a sudden as we take
    many trials we have this
  • 17:26 - 17:29
    formula that has e and pi in it
    and square roots but once again
  • 17:29 - 17:30
    these two numbers just
    keep showing up.
  • 17:30 - 17:33
    It tells you something about
    the order of the universe
  • 17:33 - 17:34
    with a capital o.
  • 17:34 - 17:43
    But let's see, times e to
    the minus 1/2 times x.
  • 17:43 - 17:46
    Well x is what we're trying
    to calculate, two successes.
  • 17:46 - 17:55
    To have exactly 2 left,
    so it's 2 minus the mean.
  • 17:55 - 18:01
    So the mean is five, 2 minus
    five divided by the standard
  • 18:01 - 18:10
    deviation, divided by 1.581,
    all of that squared.
  • 18:10 - 18:12
    That's where this
    calculation came from.
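
The calculation just walked through can be sketched in Python with the numbers from the video (10 steps, mean 5, standard deviation about 1.581, x equal to 2 left steps). Variable names are illustrative; the binomial value is included only for comparison.

```python
import math

n, p = 10, 0.5
mean = n * p                     # 5
sd = math.sqrt(n * p * (1 - p))  # about 1.581
x = 2                            # exactly 2 left steps

z = (x - mean) / sd
normal_height = math.exp(-0.5 * z**2) / (sd * math.sqrt(2 * math.pi))
binomial_prob = math.comb(n, x) * p**x * (1 - p)**(n - x)

print(round(normal_height, 4))  # 0.0417
print(round(binomial_prob, 4))  # 0.0439
```
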
  • 18:15 - 18:19
    So I told you in the last one
    this right here just tells
  • 18:19 - 18:23
    me this value up here.
  • 18:23 - 18:25
    If I want to know this
    exact probability,
  • 18:25 - 18:27
    it's the area of this.
  • 18:27 - 18:29
    And if I just take one
    line the area is 0.
  • 18:33 - 18:36
    Remember, in this case you can
    only be 2 feet away because
  • 18:36 - 18:38
    we're taking very exact steps.
  • 18:38 - 18:40
    But what the normal
    distribution is it's the
  • 18:40 - 18:44
    continuous probability density
    function so it can tell us
  • 18:44 - 18:49
    what's the probability of
    being 2.183 feet away?
  • 18:49 - 18:52
    Which obviously can only happen
    if we're taking infinitely
  • 18:52 - 18:53
    small steps every time.
  • 18:53 - 18:54
    But that's what its use is.
  • 18:54 - 18:56
    It happens when you start
    taking an infinite
  • 18:56 - 18:57
    number of steps.
  • 18:57 - 18:59
    But it can approximate
    the discrete.
  • 18:59 - 19:01
    And the way I approximate it is
    I say oh, what's the probability of
  • 19:01 - 19:03
    being within a foot of that.
  • 19:03 - 19:05
    And so I multiply
    this height, which I
  • 19:05 - 19:09
    calculate here, times 1.
  • 19:09 - 19:13
    So let's say this has a base of
    1, to calculate this area which
  • 19:13 - 19:15
    I use as an approximation.
  • 19:15 - 19:19
    So you just multiply that times
    1 and that's what you get here.
  • 19:19 - 19:20
    And I just want to show you.
  • 19:20 - 19:26
    Even with just 10 trials, the
    curves, the normal distribution
  • 19:26 - 19:29
    here is in purple and the
    binomial distribution
  • 19:29 - 19:29
    is in blue.
  • 19:29 - 19:31
    So they're almost right
    on top of each other.
  • 19:36 - 19:40
    As you take many more steps
    they almost converge right on
  • 19:40 - 19:41
    top of each other and I
    encourage you to play
  • 19:41 - 19:43
    with the spreadsheets.
  • 19:43 - 19:46
    Actually, let me show
    you that they converge.
  • 19:46 - 19:49
    There's a convergence worksheet
    on this spreadsheet as well if
  • 19:49 - 19:52
    you click on the bottom
    tab on convergence.
  • 19:52 - 19:54
    This is the same thing but I
    just want to show you what
  • 19:54 - 19:59
    happens at any given point.
  • 19:59 - 20:03
    Let me explain this
    spreadsheet to you.
  • 20:03 - 20:04
    So this is what's
    the probability of
  • 20:04 - 20:07
    moving left, right?
  • 20:07 - 20:09
    So this is just saying, I'm
    just fixing a point what's
  • 20:09 - 20:11
    the probability -- and you
    can change this -- of my
  • 20:11 - 20:13
    final position being 10.
  • 20:13 - 20:19
    And this essentially tells you
    that if I take 10 moves, for my
  • 20:19 - 20:21
    final position to be 10 to the
    right, I have to take 10 right
  • 20:21 - 20:23
    moves and 0 left moves.
  • 20:23 - 20:26
    That's a typo right there, it
    should be moves not movest.
  • 20:26 - 20:32
    If I take 20 moves to end up 10
    moves to the right then I have
  • 20:32 - 20:35
    to make 15 right moves
    and 5 left moves.
  • 20:35 - 20:38
    Likewise if I take a total of
    80 moves, if I take 80 flips
  • 20:38 - 20:42
    of my coin to make me go left
    or right, in order to end up 10 to
  • 20:42 - 20:46
    the right, I have to take 45 right
    moves and 35 left moves in any
  • 20:46 - 20:49
    order and it will end up
    with 10 to the right.
  • 20:49 - 20:53
    So what I want to figure out
    is, as I start taking a bunch
  • 20:53 - 20:59
    of total moves -- here I max it
    out at 170 -- if I start
  • 20:59 - 21:01
    flipping this coin an infinite
    number of times, I want to
  • 21:01 - 21:03
    figure out what's the
    probability that my final
  • 21:03 - 21:05
    position is 10 to the right.
  • 21:05 - 21:09
    And I want to show you that as
    you take more and more moves
  • 21:09 - 21:12
    the normal distribution becomes
    a better and better
  • 21:12 - 21:15
    approximation for the
    binomial distribution.
  • 21:15 - 21:19
    So right here, this calculates
    the binomial probability, just
  • 21:19 - 21:21
    the way we did before and you
    can look at the cell
  • 21:21 - 21:22
    to figure it out.
  • 21:25 - 21:26
    I used left moves as a success.
  • 21:30 - 21:33
    So this is 10 choose 0 and
    we know what that is.
  • 21:33 - 21:37
    It's 10 factorial over 0
    factorial over 10 minus 0
  • 21:37 - 21:43
    factorial times 0.5 to the
    0 times 0.5 to the 10th.
  • 21:43 - 21:45
    That's where that
    number comes from.
  • 21:45 - 21:53
    If I go to this one right
    here is calculated.
  • 21:53 - 21:54
    Actually let me write
    it out because I think
  • 21:54 - 21:54
    it's interesting.
  • 21:54 - 21:55
    I have a total of 60
    moves, so it's 60 factorial
  • 21:55 - 22:03
    over, I have to have 25 left
    moves so 25 factorial.
  • 22:03 - 22:10
    And that's over 60 minus 25
    factorial times the probability
  • 22:10 - 22:13
    of a left move, and I have 25 of them, times the
  • 22:13 - 22:18
    probability of a right move
    and I have 35 of those.
  • 22:18 - 22:21
    So that's just what the
    binomial probability
  • 22:21 - 22:23
    distribution will tell us.
  • 22:23 - 22:25
    And then it figures out the
    mean and the variance for each
  • 22:25 - 22:27
    of those circumstances and you
    could look at the formula.
  • 22:27 - 22:30
    But the mean is just the
    probability of having
  • 22:30 - 22:33
    a left move times the
    total number of moves.
  • 22:33 - 22:36
    The variance is probability of
    left times probability of right
  • 22:36 - 22:38
    times total number of moves.
  • 22:38 - 22:41
    And then the normal
    probability, once again, I just
  • 22:41 - 22:44
    use the normal probability.
  • 22:44 - 22:45
    So I approximate
    it the same way.
  • 22:49 - 22:52
    And Excel has a normal
    distribution function but I
  • 22:52 - 22:54
    actually typed in the formula
    because I wanted you kind of
  • 22:54 - 22:58
    see what was under the covers
    for that function that
  • 22:58 - 22:59
    Excel actually has.
  • 22:59 - 23:05
    So I actually say what's the
    probability of 25 left moves?
  • 23:05 - 23:07
    No, 45 left moves.
  • 23:07 - 23:15
    So I say the probability of
    45 left moves is equal to 1
  • 23:15 - 23:17
    over the standard deviation.
  • 23:17 - 23:20
    So in this situation the
    standard deviation is
  • 23:20 - 23:22
    the square root of 25.
  • 23:22 - 23:33
    So it's five times the square root of 2 pi times e
    to the minus 1/2 times 45 minus
  • 23:33 - 23:38
    the mean, minus 50 over the
    standard deviation, which we
  • 23:38 - 23:41
    figured out was 5, squared.
  • 23:41 - 23:45
    So that tells me the value of
    what the normal distribution
  • 23:45 - 23:48
    tells me for this situation
    with this standard deviation
  • 23:48 - 23:51
    and this mean and then I
    multiply that by 1 -- you don't
  • 23:51 - 23:53
    see that in the formula, I
    don't actually write times 1 --
  • 23:53 - 23:55
    to figure out the area
    under the curve.
  • 23:55 - 23:57
    Because remember it's a
    continuous distribution
  • 23:57 - 23:58
    function.
  • 23:58 - 24:02
    This right here just gives me
    the value but to figure out the
  • 24:02 - 24:04
    probability of being within a
    foot of it I have
  • 24:04 - 24:05
    to multiply by 1.
  • 24:05 - 24:06
    I'm approximating really.
  • 24:06 - 24:08
    I really should take the
    integral from there to there
  • 24:08 - 24:12
    but this little rectangle is
    a pretty good approximation.
  • 24:12 - 24:17
    In this chart I show you that
    as the total number of moves
  • 24:17 - 24:21
    gets larger and larger the
    difference between what the
  • 24:21 - 24:24
    normal probability distribution
    tells us and the binomial
  • 24:24 - 24:27
    probability distribution
    tells us, gets smaller and
  • 24:27 - 24:31
    smaller in terms of the
    probability of you ending up
  • 24:31 - 24:32
    10 moves to the right.
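
The comparison behind this chart can be sketched in Python. This is an illustration under the video's assumptions (fair coin, left moves counted as successes, rectangle of width 1), not a copy of the spreadsheet formulas. The function names are hypothetical. Note that the total number of moves minus the final position must be even, otherwise that position cannot be reached.

```python
import math

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k left moves in n moves."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, sd):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def gap(total_moves, final_position=10, p=0.5):
    """Absolute difference between the binomial and normal values."""
    if (total_moves - final_position) % 2 or total_moves < abs(final_position):
        raise ValueError("final position not reachable with this many moves")
    left = (total_moves - final_position) // 2  # left moves needed
    mean = total_moves * p
    sd = math.sqrt(total_moves * p * (1 - p))
    return abs(binomial_pmf(left, total_moves, p) - normal_pdf(left, mean, sd))

for n in (10, 20, 60, 100, 170):
    print(n, gap(n))
```

The gap is much smaller at 100 or 170 moves than at 10, although in this sketch it does not shrink at every single step along the way.
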
  • 24:32 - 24:35
    And you can change
    this number here.
  • 24:35 - 24:37
    Let me change it
    just to show you.
  • 24:37 - 24:38
    You could say what's the
    probability of being
  • 24:38 - 24:44
    15 moves to the right?
  • 25:03 - 25:04
    I think that something is
    happening with the floating
  • 25:04 - 25:06
    point error because when you
    get to large factorials I
  • 25:06 - 25:09
    think something weird
    happens out here.
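
One floating point limit that is easy to check independently: 170 factorial is the largest factorial that fits in a double-precision number, and 171 factorial overflows. The Python sketch below only demonstrates that limit. Whether it explains the pattern seen in the chart is not established here.

```python
import math

# 170! still fits in a double-precision float.
print(float(math.factorial(170)))

# 171! is too large to convert.
try:
    float(math.factorial(171))
    overflowed = False
except OverflowError as exc:
    overflowed = True
    print("171! does not fit in a double:", exc)
```

Python's `math.comb` works with exact integers, so a binomial coefficient computed that way avoids this limit until the final multiplication by the probabilities.
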
  • 25:19 - 25:21
    You may just have to
    get even further out.
  • 25:21 - 25:24
    For 10 you can see clearly that
    it converges and I'll try to
  • 25:24 - 25:26
    figure out why I was getting
    those weird saw tooth patterns.
  • 25:32 - 25:35
    Maybe while I do screen capture
    something weird is happening.
  • 25:35 - 25:38
    The whole point of this was to
    show you that if you want to
  • 25:38 - 25:41
    figure out the probability of
    being 10 moves to the right, as
  • 25:41 - 25:45
    you take more and more flips
    of your coin the normal
  • 25:45 - 25:50
    distribution becomes a much
    better approximation for the
  • 25:50 - 25:52
    actual binomial distribution.
  • 25:52 - 25:54
    And as you approach infinity
    they actually converge
  • 25:54 - 25:55
    to each other.
  • 25:55 - 25:58
    Anyway, that's all
    for this video.
  • 25:58 - 25:59
    I'll actually do several
    more videos on the normal
  • 25:59 - 26:02
    distributions because it is
    such an important concept.
  • 26:02 - 26:02
    See you soon.
Title:
Normal Distribution Excel Exercise
Description:

(Long-26 minutes) Presentation on spreadsheet to show that the normal distribution approximates the binomial distribution for a large number of trials.

Video Language:
English
Duration:
26:04

English subtitles
