[Script Info]
Title:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.15,0:00:01.94,Default,,0000,0000,0000,,- [Voiceover] What I wanna do with this video is look
Dialogue: 0,0:00:01.94,0:00:04.91,Default,,0000,0000,0000,,at some examples of data represented in different ways,
Dialogue: 0,0:00:04.91,0:00:07.67,Default,,0000,0000,0000,,and think about which representation is the best,
Dialogue: 0,0:00:07.67,0:00:10.92,Default,,0000,0000,0000,,or can help us answer different questions.
Dialogue: 0,0:00:10.92,0:00:12.78,Default,,0000,0000,0000,,So we see this first example.
Dialogue: 0,0:00:12.78,0:00:14.98,Default,,0000,0000,0000,,A statistician recorded the length of each
Dialogue: 0,0:00:14.98,0:00:17.62,Default,,0000,0000,0000,,of Pixar's first 14 films.
Dialogue: 0,0:00:17.62,0:00:21.76,Default,,0000,0000,0000,,The statistician made a dot plot, each dot is a film,
Dialogue: 0,0:00:21.76,0:00:23.96,Default,,0000,0000,0000,,a histogram, and a box plot
Dialogue: 0,0:00:23.96,0:00:26.42,Default,,0000,0000,0000,,to display the running time data.
Dialogue: 0,0:00:26.42,0:00:30.25,Default,,0000,0000,0000,,Which display could be used to find the median?
Dialogue: 0,0:00:30.25,0:00:31.53,Default,,0000,0000,0000,,To find the median.
Dialogue: 0,0:00:31.53,0:00:34.57,Default,,0000,0000,0000,,All right, so let's look at these displays.
Dialogue: 0,0:00:34.57,0:00:38.17,Default,,0000,0000,0000,,So over here we see, this is the dot plot.
Dialogue: 0,0:00:38.17,0:00:40.40,Default,,0000,0000,0000,,We have a dot for each of the 14 films.
Dialogue: 0,0:00:40.40,0:00:44.00,Default,,0000,0000,0000,,So one film had a running time of 81 minutes.
Dialogue: 0,0:00:44.00,0:00:44.88,Default,,0000,0000,0000,,We see that there.
Dialogue: 0,0:00:44.88,0:00:47.06,Default,,0000,0000,0000,,One film had a running time of 92.
Dialogue: 0,0:00:47.06,0:00:50.08,Default,,0000,0000,0000,,One had a running time of 93.
Dialogue: 0,0:00:50.08,0:00:52.54,Default,,0000,0000,0000,,We see one had a running time of 95.
Dialogue: 0,0:00:52.54,0:00:55.84,Default,,0000,0000,0000,,We see two had running times of 96 minutes,
Dialogue: 0,0:00:55.84,0:00:57.93,Default,,0000,0000,0000,,and so on and so forth.
Dialogue: 0,0:00:57.93,0:01:01.04,Default,,0000,0000,0000,,So I claim that I could use this to figure out the median,
Dialogue: 0,0:01:01.04,0:01:03.88,Default,,0000,0000,0000,,because I could make a list of all of the running times
Dialogue: 0,0:01:03.88,0:01:05.94,Default,,0000,0000,0000,,of the films, I could order them,
Dialogue: 0,0:01:05.94,0:01:07.50,Default,,0000,0000,0000,,and then I could find the middle value.
Dialogue: 0,0:01:07.50,0:01:08.56,Default,,0000,0000,0000,,I could literally make a list.
Dialogue: 0,0:01:08.56,0:01:12.23,Default,,0000,0000,0000,,I could write down 81, and then write down 92,
Dialogue: 0,0:01:12.23,0:01:14.93,Default,,0000,0000,0000,,then write down 93, then write down 95,
Dialogue: 0,0:01:14.93,0:01:16.95,Default,,0000,0000,0000,,then I could write down 96 twice,
Dialogue: 0,0:01:16.95,0:01:19.30,Default,,0000,0000,0000,,and then I could write down 98,
Dialogue: 0,0:01:19.30,0:01:20.43,Default,,0000,0000,0000,,then I could write down 100.
Dialogue: 0,0:01:20.43,0:01:22.68,Default,,0000,0000,0000,,I think you see where this is going.
Dialogue: 0,0:01:22.68,0:01:24.54,Default,,0000,0000,0000,,I could write out the entire list,
Dialogue: 0,0:01:24.54,0:01:26.54,Default,,0000,0000,0000,,and then I could find the middle values.
Dialogue: 0,0:01:26.54,0:01:31.13,Default,,0000,0000,0000,,So the dot plot, I could definitely use to find the median.
Dialogue: 0,0:01:31.13,0:01:32.57,Default,,0000,0000,0000,,Now, what about the histogram?
Dialogue: 0,0:01:32.57,0:01:35.31,Default,,0000,0000,0000,,This is the histogram right over here.
Dialogue: 0,0:01:35.31,0:01:38.15,Default,,0000,0000,0000,,And the key here is, for a median, to figure out a median,
Dialogue: 0,0:01:38.15,0:01:40.42,Default,,0000,0000,0000,,I just need to figure out a list of numbers.
Dialogue: 0,0:01:40.42,0:01:41.72,Default,,0000,0000,0000,,I need to figure out a list of numbers.
Dialogue: 0,0:01:41.72,0:01:45.09,Default,,0000,0000,0000,,So here, I don't know, they say I have one film
Dialogue: 0,0:01:45.09,0:01:47.41,Default,,0000,0000,0000,,that's between 80 and 85,
Dialogue: 0,0:01:47.41,0:01:49.34,Default,,0000,0000,0000,,but I don't know its exact running time.
Dialogue: 0,0:01:49.34,0:01:52.10,Default,,0000,0000,0000,,Its running time might have been 81 minutes,
Dialogue: 0,0:01:52.10,0:01:54.61,Default,,0000,0000,0000,,its running time might have been 84 minutes.
Dialogue: 0,0:01:54.61,0:01:58.30,Default,,0000,0000,0000,,So I don't know here, and so I can't really make a list
Dialogue: 0,0:01:58.30,0:02:00.11,Default,,0000,0000,0000,,of the running times of the films
Dialogue: 0,0:02:00.11,0:02:01.50,Default,,0000,0000,0000,,and find the middle values,
Dialogue: 0,0:02:01.50,0:02:02.68,Default,,0000,0000,0000,,so I don't think I'm gonna be able
Dialogue: 0,0:02:02.68,0:02:04.61,Default,,0000,0000,0000,,to do it using the histogram.
Dialogue: 0,0:02:04.61,0:02:08.59,Default,,0000,0000,0000,,Now, with the box plot right over here,
Dialogue: 0,0:02:08.59,0:02:10.16,Default,,0000,0000,0000,,so I'm not gonna click histogram.
Dialogue: 0,0:02:10.16,0:02:11.72,Default,,0000,0000,0000,,With the box plot over here,
Dialogue: 0,0:02:11.72,0:02:14.36,Default,,0000,0000,0000,,I might not be able to make a list of all the values,
Dialogue: 0,0:02:14.36,0:02:17.78,Default,,0000,0000,0000,,but the box plot explicitly tells us what the median is.
Dialogue: 0,0:02:17.78,0:02:20.43,Default,,0000,0000,0000,,This middle line in the middle of the box,
Dialogue: 0,0:02:20.43,0:02:23.17,Default,,0000,0000,0000,,that tells us the median is, what is this,
Dialogue: 0,0:02:23.17,0:02:26.74,Default,,0000,0000,0000,,this median is, if this is 100, this is 99.
Dialogue: 0,0:02:26.74,0:02:29.83,Default,,0000,0000,0000,,So this is 95, 96, 97, 98, 99.
Dialogue: 0,0:02:29.83,0:02:32.06,Default,,0000,0000,0000,,It explicitly tells us the median is 99.
Dialogue: 0,0:02:32.06,0:02:34.89,Default,,0000,0000,0000,,This is actually the easiest for calculating the median.
Dialogue: 0,0:02:34.89,0:02:36.29,Default,,0000,0000,0000,,So I'll go with the box plot.
Dialogue: 0,0:02:36.29,0:02:38.35,Default,,0000,0000,0000,,So the histogram is of no use to me
Dialogue: 0,0:02:38.35,0:02:40.00,Default,,0000,0000,0000,,if I wanna calculate the median.
Dialogue: 0,0:02:40.00,0:02:41.93,Default,,0000,0000,0000,,Let's do a couple more of these.
Dialogue: 0,0:02:41.93,0:02:44.81,Default,,0000,0000,0000,,Nam owns a used car lot.
Dialogue: 0,0:02:44.81,0:02:46.60,Default,,0000,0000,0000,,He checked the odometers of the cars
Dialogue: 0,0:02:46.60,0:02:49.13,Default,,0000,0000,0000,,and recorded how far they had driven.
Dialogue: 0,0:02:49.13,0:02:51.94,Default,,0000,0000,0000,,He then created both a histogram and a box plot
Dialogue: 0,0:02:51.94,0:02:55.27,Default,,0000,0000,0000,,to display the same data, both diagrams are shown below.
Dialogue: 0,0:02:56.00,0:02:58.19,Default,,0000,0000,0000,,Which display can be used
Dialogue: 0,0:02:58.19,0:03:00.89,Default,,0000,0000,0000,,to find how many vehicles had driven
Dialogue: 0,0:03:00.89,0:03:04.14,Default,,0000,0000,0000,,more than 200,000 kilometers?
Dialogue: 0,0:03:04.14,0:03:06.16,Default,,0000,0000,0000,,So how many vehicles had driven
Dialogue: 0,0:03:06.16,0:03:09.52,Default,,0000,0000,0000,,more than 200,000 kilometers?
Dialogue: 0,0:03:09.52,0:03:13.26,Default,,0000,0000,0000,,So it looks like here in this histogram,
Dialogue: 0,0:03:13.26,0:03:16.70,Default,,0000,0000,0000,,I have three vehicles that were between 200 and 250,
Dialogue: 0,0:03:16.70,0:03:20.42,Default,,0000,0000,0000,,and then I have two vehicles that are between 250 and 300.
Dialogue: 0,0:03:20.42,0:03:22.16,Default,,0000,0000,0000,,So it looks pretty clear that I have five vehicles,
Dialogue: 0,0:03:22.16,0:03:26.10,Default,,0000,0000,0000,,three that had a mileage between 200,000 and 250,000,
Dialogue: 0,0:03:26.10,0:03:27.94,Default,,0000,0000,0000,,and then I had two that had mileage
Dialogue: 0,0:03:27.94,0:03:30.12,Default,,0000,0000,0000,,between 250,000 and 300,000.
Dialogue: 0,0:03:30.12,0:03:31.51,Default,,0000,0000,0000,,So I may be able to answer the question.
Dialogue: 0,0:03:31.51,0:03:36.37,Default,,0000,0000,0000,,Five vehicles had a mileage more than 200,000,
Dialogue: 0,0:03:36.37,0:03:40.04,Default,,0000,0000,0000,,and so I would say that the histogram is pretty useful.
Dialogue: 0,0:03:40.04,0:03:42.92,Default,,0000,0000,0000,,But let's verify that the box plot isn't so useful.
Dialogue: 0,0:03:42.92,0:03:45.03,Default,,0000,0000,0000,,So I wanna know how many vehicles had a mileage
Dialogue: 0,0:03:45.03,0:03:47.14,Default,,0000,0000,0000,,more than 200,000.
Dialogue: 0,0:03:47.14,0:03:50.88,Default,,0000,0000,0000,,Well, I know that if I have a mileage more than 200,000,
Dialogue: 0,0:03:50.88,0:03:54.94,Default,,0000,0000,0000,,I'm going to be in the fourth quartile,
Dialogue: 0,0:03:54.94,0:03:58.19,Default,,0000,0000,0000,,but I don't know how many values I have sitting there
Dialogue: 0,0:03:58.19,0:04:01.58,Default,,0000,0000,0000,,in the fourth quartile just looking at this data over here,
Dialogue: 0,0:04:01.58,0:04:04.67,Default,,0000,0000,0000,,so that's not gonna be useful for answering that question.
Dialogue: 0,0:04:04.67,0:04:06.23,Default,,0000,0000,0000,,Let's look at the second question.
Dialogue: 0,0:04:06.23,0:04:07.69,Default,,0000,0000,0000,,Which display can be used
Dialogue: 0,0:04:07.69,0:04:09.52,Default,,0000,0000,0000,,to find that the median distance,
Dialogue: 0,0:04:09.52,0:04:11.48,Default,,0000,0000,0000,,which display can be used to find
Dialogue: 0,0:04:11.48,0:04:12.48,Default,,0000,0000,0000,,that the median distance
Dialogue: 0,0:04:12.48,0:04:15.42,Default,,0000,0000,0000,,was approximately 140,000 kilometers?
Dialogue: 0,0:04:15.42,0:04:16.74,Default,,0000,0000,0000,,Well, to calculate the median,
Dialogue: 0,0:04:16.74,0:04:18.72,Default,,0000,0000,0000,,you essentially wanna be able to list all of the numbers
Dialogue: 0,0:04:18.72,0:04:20.25,Default,,0000,0000,0000,,and then find the middle number.
Dialogue: 0,0:04:20.25,0:04:23.32,Default,,0000,0000,0000,,And over here, I can't list all of the numbers.
Dialogue: 0,0:04:23.32,0:04:25.40,Default,,0000,0000,0000,,I know that there's three values that are
Dialogue: 0,0:04:25.40,0:04:27.54,Default,,0000,0000,0000,,between zero and 50,000 kilometers,
Dialogue: 0,0:04:27.54,0:04:28.58,Default,,0000,0000,0000,,but I don't know what they are.
Dialogue: 0,0:04:28.58,0:04:30.79,Default,,0000,0000,0000,,Could be 10,000, 10,000, 10,000.
Dialogue: 0,0:04:30.79,0:04:34.48,Default,,0000,0000,0000,,It could be 10,000, 15,000, and 40,000.
Dialogue: 0,0:04:34.48,0:04:37.02,Default,,0000,0000,0000,,I don't know what they are, and so if I can't list all
Dialogue: 0,0:04:37.02,0:04:39.14,Default,,0000,0000,0000,,of these things and put them in order,
Dialogue: 0,0:04:39.15,0:04:42.23,Default,,0000,0000,0000,,I really am going to have trouble finding the middle value.
Dialogue: 0,0:04:42.23,0:04:45.39,Default,,0000,0000,0000,,The middle value, it's going to be
Dialogue: 0,0:04:45.39,0:04:47.78,Default,,0000,0000,0000,,in this range right around here,
Dialogue: 0,0:04:47.78,0:04:49.69,Default,,0000,0000,0000,,but I don't know exactly what it's going to be.
Dialogue: 0,0:04:49.69,0:04:50.96,Default,,0000,0000,0000,,The histogram is not useful,
Dialogue: 0,0:04:50.96,0:04:53.92,Default,,0000,0000,0000,,because it's throwing all the values into these buckets.
Dialogue: 0,0:04:53.92,0:04:56.26,Default,,0000,0000,0000,,While on the box plot, it explicitly,
Dialogue: 0,0:04:56.26,0:04:58.12,Default,,0000,0000,0000,,it directly tells me the median value.
Dialogue: 0,0:04:58.12,0:05:00.51,Default,,0000,0000,0000,,This line right over here, the middle of the box,
Dialogue: 0,0:05:00.51,0:05:02.74,Default,,0000,0000,0000,,this tells us the median value,
Dialogue: 0,0:05:02.74,0:05:05.03,Default,,0000,0000,0000,,and we see that the median value here,
Dialogue: 0,0:05:05.03,0:05:08.42,Default,,0000,0000,0000,,this is 140,000 kilometers.
Dialogue: 0,0:05:08.42,0:05:11.34,Default,,0000,0000,0000,,Right, this is 100, 110, 120, 130,
Dialogue: 0,0:05:11.34,0:05:16.01,Default,,0000,0000,0000,,140,000 kilometers is the median mileage for the cars.
Dialogue: 0,0:05:16.01,0:05:18.81,Default,,0000,0000,0000,,And so the box plot clearly...
Dialogue: 0,0:05:20.84,0:05:23.24,Default,,0000,0000,0000,,clearly gives us that data.