Constructing a box-and-whisker plot

  • 0:01 - 0:04
    The owner of a restaurant
    wants to find out more about
  • 0:04 - 0:06
    where his patrons
    are coming from.
  • 0:06 - 0:08
    One day, he decided
    to gather data
  • 0:08 - 0:11
    about the distance
    in miles that people
  • 0:11 - 0:13
    commuted to get
    to his restaurant.
  • 0:13 - 0:16
    People reported the
    following distances traveled.
  • 0:16 - 0:18
    So here are all the
    distances traveled.
  • 0:18 - 0:20
    He wants to create
    a graph that helps
  • 0:20 - 0:23
    him understand the
    spread of the distances--
  • 0:23 - 0:26
    this is a key word--
    the spread of distances
  • 0:26 - 0:33
    and the median distance
    that people traveled
  • 0:33 - 0:34
    or that people travel.
  • 0:34 - 0:37
    What kind of graph
    should he create?
  • 0:37 - 0:40
    So the answer of what kind
    of graph he should create,
  • 0:40 - 0:42
    that might be a little
    bit more straightforward
  • 0:42 - 0:45
    than the actual creation of the
    graph, which we will also do.
  • 0:45 - 0:50
    But he's trying to visualize
    the spread of information.
  • 0:50 - 0:52
    And at the same time,
    he wants the median.
  • 0:52 - 0:56
    So what graph captures
    both pieces of information?
  • 0:56 - 0:58
    Well, a box and whisker plot.
  • 0:58 - 1:02
    So let's actually try to
    draw a box and whisker plot.
  • 1:02 - 1:04
    And to do that, we need to
    come up with the median.
  • 1:04 - 1:07
    And we'll also see the median
    of the two halves of the data
  • 1:07 - 1:07
    as well.
  • 1:07 - 1:10
    And whenever we're trying to
    take the median of something,
  • 1:10 - 1:12
    it's really helpful
    to order our data.
  • 1:12 - 1:16
    So let's start off by
    attempting to order our data.
  • 1:16 - 1:19
    So what is the
    smallest number here?
  • 1:19 - 1:20
    Well, let's see.
  • 1:20 - 1:21
    There's one 2.
  • 1:21 - 1:22
    So let me mark it off.
  • 1:22 - 1:26
    And then we have another 2.
  • 1:26 - 1:27
    So we've got all the 2's.
  • 1:27 - 1:30
    And then we have this 3.
  • 1:30 - 1:32
    Then we have this 3.
  • 1:32 - 1:34
    I think we've got all the 3's.
  • 1:34 - 1:37
    Then we have that 4.
  • 1:37 - 1:41
    Then we have this 4.
  • 1:41 - 1:42
    Do we have any 5's?
  • 1:42 - 1:43
    No.
  • 1:43 - 1:43
    Do we have any 6's?
  • 1:43 - 1:44
    Yep.
  • 1:44 - 1:45
    We have that 6.
  • 1:45 - 1:48
    And that looks like the only 6.
  • 1:48 - 1:49
    Any 7's?
  • 1:49 - 1:50
    Yep.
  • 1:50 - 1:52
    We have this 7 right over here.
  • 1:52 - 1:54
    And I just realized
    that I missed this 1.
  • 1:54 - 1:57
    So let me put the 1 at
    the beginning of our set.
  • 1:57 - 1:58
    So I got that 1
    right over there.
  • 1:58 - 2:00
    Actually, there were two 1's.
  • 2:00 - 2:01
    I missed both of them.
  • 2:01 - 2:04
    So both of those 1's
    are right over there.
  • 2:04 - 2:07
    So I have the 1's,
    2's, 3's, 4's, no 5's.
  • 2:07 - 2:09
    This is one 6.
  • 2:09 - 2:10
    There was one 7.
  • 2:10 - 2:13
    There's one 8 right over here.
  • 2:13 - 2:15
    And then, let's see, any 9's?
  • 2:15 - 2:16
    No 9's.
  • 2:16 - 2:17
    Any 10s?
  • 2:17 - 2:17
    Yep.
  • 2:17 - 2:19
    There's a 10.
  • 2:19 - 2:20
    Any 11s?
  • 2:20 - 2:22
    We have an 11 right over there.
  • 2:22 - 2:23
    Any 12s?
  • 2:23 - 2:24
    Nope.
  • 2:24 - 2:27
    13? No. There is a 14.
  • 2:27 - 2:31
    Then we have a 15.
  • 2:31 - 2:35
    And then we have a
    20 and then a 22.
  • 2:35 - 2:36
    So we've ordered all our data.
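The ordering step above can be sketched in Python. The values below are reconstructed from the numbers called out in the narration; the on-screen list is not part of the transcript, so the unsorted order of `distances` is an assumption.

```python
# Distances in miles, reconstructed from the narration.
# Only the values come from the transcript; the unsorted order is assumed.
distances = [14, 6, 3, 2, 4, 15, 11, 8, 1, 7, 2, 1, 3, 4, 10, 22, 20]

# sorted() returns a new list in ascending order and keeps duplicates.
ordered = sorted(distances)
print(ordered)
```

Sorting with the built-in avoids the kind of slip made in the narration, where the two 1's were missed on the first pass.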
  • 2:36 - 2:38
    Now it should be relatively
    straightforward to find
  • 2:38 - 2:41
    the middle of our
    data, the median.
  • 2:41 - 2:43
    So how many data
    points do we have?
  • 2:43 - 2:50
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
    11, 12, 13, 14, 15, 16, 17.
  • 2:50 - 2:52
    So the middle number
    is going to be
  • 2:52 - 2:54
    a number that has 8
    numbers larger than it
  • 2:54 - 2:56
    and 8 numbers smaller than it.
  • 2:56 - 2:57
    So let's think about it.
  • 2:57 - 3:00
    1, 2, 3, 4, 5, 6, 7, 8.
  • 3:00 - 3:04
    So the number 6 here is
    larger than 8 of the values.
  • 3:04 - 3:06
    And if I did the
    calculations right,
  • 3:06 - 3:08
    it should be smaller
    than 8 of the values.
  • 3:08 - 3:12
    1, 2, 3, 4, 5, 6, 7, 8.
  • 3:12 - 3:17
    So it is, indeed, the median.
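The counting argument above can be checked with a short Python sketch. The list is the ordered data as read out in the narration; the variable names are illustrative only.

```python
# Ordered distances in miles, as read out in the narration.
ordered = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]

n = len(ordered)           # 17 data points
middle = n // 2            # index 8, the 9th value, valid because n is odd
median = ordered[middle]

below = ordered[:middle]       # values that come before the median
above = ordered[middle + 1:]   # values that come after the median
print(median, len(below), len(above))  # 6 8 8
```

With an even number of values there is no single middle element, and the median would instead be the mean of the two middle values.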
  • 3:17 - 3:21
    Now, when we're trying to
    construct a box and whisker
  • 3:21 - 3:25
    plot, the convention is,
    OK, we have our median.
  • 3:25 - 3:27
    And it's essentially dividing
    our data into two sets.
  • 3:27 - 3:31
    Now, let's take the median
    of each of those sets.
  • 3:31 - 3:33
    And the convention is to
    take our median out and have
  • 3:33 - 3:34
    the sets that are left over.
  • 3:34 - 3:36
    Sometimes people leave it in.
  • 3:36 - 3:38
    But the standard convention
    is to take this median out.
  • 3:38 - 3:39
    And now, look
    separately at this set
  • 3:39 - 3:42
    and look separately at this set.
  • 3:42 - 3:45
    So if we look at this first
    bottom half of our numbers
  • 3:45 - 3:49
    essentially, what's the
    median of these numbers?
  • 3:49 - 3:55
    Well, we have 1, 2, 3, 4,
    5, 6, 7, 8 data points.
  • 3:55 - 3:57
    So we're actually going to
    have two middle numbers.
  • 3:57 - 4:01
    So the two middle numbers
    are this 2 and this 3,
  • 4:01 - 4:02
    with three numbers less
    than these two and
  • 4:02 - 4:04
    three numbers greater than them.
  • 4:04 - 4:05
    And so when we're looking
    for a median and we have
  • 4:05 - 4:07
    two middle numbers,
  • 4:07 - 4:08
    we take the mean of
    these two numbers.
  • 4:08 - 4:13
    So halfway in between
    2 and 3 is 2.5.
  • 4:13 - 4:17
    Or you can say 2 plus 3 is 5,
    and 5 divided by 2 is 2.5.
  • 4:17 - 4:22
    So here we have a median
    of this bottom half of 2.5.
  • 4:22 - 4:25
    And then the middle
    of the top half,
  • 4:25 - 4:27
    once again, we
    have 8 data points.
  • 4:27 - 4:30
    So our middle two
    numbers are going
  • 4:30 - 4:34
    to be this 11 and this 14.
  • 4:34 - 4:36
    And so if we want to take the
    mean of these two numbers,
  • 4:36 - 4:39
    11 plus 14 is 25.
  • 4:39 - 4:43
    Halfway in between
    the two is 12.5.
  • 4:43 - 4:47
    So 12.5 is exactly
    halfway between 11 and 14.
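The two half-medians worked out above can be reproduced with a minimal Python sketch. It follows the convention described in the video of taking the overall median out before splitting, and it assumes an odd number of values, as in this data set.

```python
from statistics import median

# Ordered distances in miles, as read out in the narration.
ordered = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]

mid = len(ordered) // 2          # index of the overall median (odd length assumed)
lower_half = ordered[:mid]       # 8 values below the median, median excluded
upper_half = ordered[mid + 1:]   # 8 values above the median, median excluded

# Each half has an even count, so median() averages the two middle values.
q1 = median(lower_half)   # mean of 2 and 3
q3 = median(upper_half)   # mean of 11 and 14
print(q1, q3)  # 2.5 12.5
```

Other conventions, such as leaving the median in both halves or interpolating percentiles, can give different values for the same data.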
  • 4:47 - 4:49
    And now, we've figured
    out all of the information
  • 4:49 - 4:52
    we need to actually
    plot or actually
  • 4:52 - 4:55
    create or actually draw
    our box and whisker plot.
  • 4:55 - 5:03
    So let me draw a number line,
    so my best attempt at a number
  • 5:03 - 5:05
    line.
  • 5:05 - 5:07
    So that's my number line.
  • 5:07 - 5:10
    And let's say that this
    right over here is a 0.
  • 5:10 - 5:14
    I need to make sure I get all
    the way up to 22 or beyond 22.
  • 5:14 - 5:15
    So let's say that's 0.
  • 5:15 - 5:17
    Let's say this is 5.
  • 5:17 - 5:18
    This is 10.
  • 5:18 - 5:21
    That could be 15.
  • 5:21 - 5:23
    And that could be 20.
  • 5:23 - 5:25
    This could be 25.
  • 5:25 - 5:30
    We could keep
    going-- 30, maybe 35.
  • 5:30 - 5:33
    So the first thing we might
    want to think about, and there are
  • 5:33 - 5:34
    several ways to draw it,
  • 5:34 - 5:37
    is that the
    box part of the box
  • 5:37 - 5:39
    and whisker
    essentially represents
  • 5:39 - 5:41
    the middle half of our data.
  • 5:41 - 5:46
    So it's essentially trying to
    represent this data right over
  • 5:46 - 5:52
    here, so the data between the
    medians of the two halves.
  • 5:52 - 5:54
    So this is a part
    that we would attempt
  • 5:54 - 5:55
    to represent with the box.
  • 5:55 - 6:00
    So we would start right
    over here at this 2.5.
  • 6:00 - 6:02
    This is essentially
    separating the first quartile
  • 6:02 - 6:05
    from the second quartile, the
    first quarter of our numbers
  • 6:05 - 6:07
    from the second
    quarter of our numbers.
  • 6:07 - 6:08
    So let's put it right over here.
  • 6:08 - 6:10
    So this is 2.5.
  • 6:10 - 6:13
    2.5 is halfway between 0 and 5.
  • 6:13 - 6:15
    So that's 2.5.
  • 6:15 - 6:17
    And then up here, we have 12.5.
  • 6:17 - 6:22
    And 12.5 is right
    over-- let's see.
  • 6:22 - 6:25
    This is 10.
  • 6:25 - 6:29
    So this right over here would be
    halfway between, well, halfway
  • 6:29 - 6:32
    between 10 and 15 is 12.5.
  • 6:32 - 6:33
    So let me do this.
  • 6:33 - 6:38
    So this is 12.5 right over here.
  • 6:38 - 6:40
    So that separates
    the third quartile
  • 6:40 - 6:41
    from the fourth quartile.
  • 6:41 - 6:44
    And then our box is
    everything in between,
  • 6:44 - 6:46
    so this is literally the
    middle half of our numbers.
  • 6:48 - 6:50
    And we'd want to show
    where the actual median is.
  • 6:50 - 6:52
    And that was actually
    one of the things
  • 6:52 - 6:54
    we wanted to be able
    to think about when
  • 6:54 - 6:55
    the owner of the
    restaurant wanted
  • 6:55 - 6:58
    to think about how far
    people are traveling from.
  • 6:58 - 7:00
    So the median is 6.
  • 7:00 - 7:02
    So we can plot it
    right over here.
  • 7:02 - 7:06
    So this right here is about 6.
  • 7:06 - 7:08
    Let me do that same pink color.
  • 7:08 - 7:12
    So this right over here is 6.
  • 7:12 - 7:15
    And then the whiskers of
    the box and whisker plot
  • 7:15 - 7:17
    essentially show us
    the range of our data.
  • 7:17 - 7:21
    And I can do this in a different
    color that I haven't used yet.
  • 7:21 - 7:22
    I'll do this in orange.
  • 7:22 - 7:24
    So essentially, if
    we want to see, look,
  • 7:24 - 7:26
    the numbers go all
    the way up to 22.
  • 7:26 - 7:27
    So they go all the
    way up to-- so let's
  • 7:27 - 7:30
    say that this is
    22 right over here.
  • 7:30 - 7:32
    Our numbers go all
    the way up to 22.
  • 7:37 - 7:39
    And they go as low as 1.
  • 7:39 - 7:43
    So 1 is right about here.
  • 7:43 - 7:44
    Let me label that.
  • 7:44 - 7:45
    So that's 1.
  • 7:45 - 7:48
    And they go as low as 1.
  • 7:48 - 7:48
    So there you have it.
  • 7:48 - 7:50
    We have our box
    and whisker plot.
  • 7:50 - 7:52
    And you can see if you
    have a plot like this,
  • 7:52 - 7:54
    just visually, you
    can immediately
  • 7:54 - 7:55
    see, OK, what is the median?
  • 7:55 - 7:58
    It's the line inside
    the box, essentially.
  • 7:58 - 7:59
    The box shows you the middle half.
  • 7:59 - 8:00
    So it shows you how
    far they're spread
  • 8:00 - 8:02
    or where the meat
    of the spread is.
  • 8:02 - 8:05
    And then the whiskers show, beyond
    that, the range, or how
  • 8:05 - 8:10
    wide the total spread of our data
  • 8:10 - 8:11
    is.
  • 8:11 - 8:14
    So this gives a pretty good
    sense of both the median
  • 8:14 - 8:17
    and the spread of our data.
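
The five values used to draw the plot (minimum, lower half-median, median, upper half-median, maximum) can be gathered in one place. The following is a rough Python sketch, not a plotting library's method: `five_number_summary` and `ascii_box` are hypothetical helper names, the half-medians use the take-the-median-out convention from the video, and the text rendering assumes non-negative values.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values.

    For an odd count the overall median is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]          # bottom half
    upper = data[(n + 1) // 2:]     # top half
    return data[0], median(lower), median(data), median(upper), data[-1]


def ascii_box(lo, q1, med, q3, hi, scale=2):
    """Rough one-line picture: '|' whisker ends, '=' box, 'M' median."""
    def col(value):
        return round(value * scale)   # `scale` characters per mile

    line = [" "] * (col(hi) + 1)
    for i in range(col(lo), col(hi) + 1):
        line[i] = "-"                 # whiskers span the full range
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="                 # box spans the middle half
    line[col(lo)] = line[col(hi)] = "|"
    line[col(med)] = "M"
    return "".join(line)


# Distances in miles, as read out in the narration.
distances = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]

summary = five_number_summary(distances)
iqr = summary[3] - summary[1]         # width of the box
picture = ascii_box(*summary)
print(summary)   # (1, 2.5, 6, 12.5, 22)
print(iqr)       # 10.0
print(picture)
```

In the rendering the median mark sits left of the box's center, since 6 is closer to 2.5 than to 12.5, which is why the median is read from the line inside the box rather than from the box's midpoint.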
Video Language:
English
Team:
Khan Academy
Duration:
08:18
