[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:00.55,0:00:02.76,Default,,0000,0000,0000,,- [Narrator] So we have\Nnine students who recently Dialogue: 0,0:00:02.76,0:00:07.76,Default,,0000,0000,0000,,graduated from a small school\Nthat has a class size of nine, Dialogue: 0,0:00:07.76,0:00:10.84,Default,,0000,0000,0000,,and they wanna figure out\Nwhat is the central tendency Dialogue: 0,0:00:10.84,0:00:14.30,Default,,0000,0000,0000,,for salaries one year after graduation? Dialogue: 0,0:00:14.30,0:00:16.85,Default,,0000,0000,0000,,And they also wanna have a\Nsense of the spread around Dialogue: 0,0:00:16.85,0:00:20.25,Default,,0000,0000,0000,,that central tendency one\Nyear after graduation. Dialogue: 0,0:00:20.25,0:00:23.88,Default,,0000,0000,0000,,So they all agree to put in\Ntheir salaries into a computer, Dialogue: 0,0:00:23.88,0:00:25.91,Default,,0000,0000,0000,,and so these are their salaries. Dialogue: 0,0:00:25.91,0:00:27.34,Default,,0000,0000,0000,,They're measured in thousands. Dialogue: 0,0:00:27.34,0:00:30.79,Default,,0000,0000,0000,,So one makes 35,000, 50,000,\N50,000, 50,000, 56,000, Dialogue: 0,0:00:30.79,0:00:34.60,Default,,0000,0000,0000,,two make 60,000, one makes\N75,000, and one makes 250,000. Dialogue: 0,0:00:34.60,0:00:37.05,Default,,0000,0000,0000,,So she's doing very well for herself, Dialogue: 0,0:00:37.05,0:00:40.58,Default,,0000,0000,0000,,and the computer it spits\Nout a bunch of parameters Dialogue: 0,0:00:40.58,0:00:42.58,Default,,0000,0000,0000,,based on this data here. Dialogue: 0,0:00:43.44,0:00:47.23,Default,,0000,0000,0000,,So it spits out two typical\Nmeasures of central tendency. Dialogue: 0,0:00:47.23,0:00:50.14,Default,,0000,0000,0000,,The mean is roughly 76.2. 
Dialogue: 0,0:00:50.14,0:00:53.01,Default,,0000,0000,0000,,The computer would calculate\Nit by adding up all of these Dialogue: 0,0:00:53.01,0:00:55.85,Default,,0000,0000,0000,,numbers, these nine numbers,\Nand then dividing by nine, Dialogue: 0,0:00:55.85,0:00:59.65,Default,,0000,0000,0000,,and the median is 56, and median\Nis quite easy to calculate. Dialogue: 0,0:00:59.65,0:01:01.99,Default,,0000,0000,0000,,You just order the numbers and you take Dialogue: 0,0:01:01.99,0:01:04.62,Default,,0000,0000,0000,,the middle number here which is 56. Dialogue: 0,0:01:04.62,0:01:07.63,Default,,0000,0000,0000,,Now what I want you to\Ndo is pause this video Dialogue: 0,0:01:07.63,0:01:10.08,Default,,0000,0000,0000,,and think about for this data set, Dialogue: 0,0:01:10.08,0:01:14.28,Default,,0000,0000,0000,,for this population of\Nsalaries, which measure, Dialogue: 0,0:01:14.28,0:01:19.24,Default,,0000,0000,0000,,which measure of central\Ntendency is a better measure? Dialogue: 0,0:01:19.24,0:01:21.17,Default,,0000,0000,0000,,All right, so let's think\Nabout this a little bit. Dialogue: 0,0:01:21.17,0:01:23.84,Default,,0000,0000,0000,,I'm gonna plot it on a line here. Dialogue: 0,0:01:23.84,0:01:26.05,Default,,0000,0000,0000,,I'm gonna plot my data\Nso we get a better sense Dialogue: 0,0:01:26.05,0:01:28.41,Default,,0000,0000,0000,,and we just don't see them,\Nso we just don't see things Dialogue: 0,0:01:28.41,0:01:31.14,Default,,0000,0000,0000,,as numbers, but we see\Nwhere those numbers sit Dialogue: 0,0:01:31.14,0:01:32.63,Default,,0000,0000,0000,,relative to each other. Dialogue: 0,0:01:32.63,0:01:34.82,Default,,0000,0000,0000,,So let's say this is zero. Dialogue: 0,0:01:34.82,0:01:38.98,Default,,0000,0000,0000,,Let's say this is, let's see,\None, two, three, four, five. Dialogue: 0,0:01:41.68,0:01:45.85,Default,,0000,0000,0000,,So this would be 250, this\Nis 50, 100, 150, 200, 200, Dialogue: 0,0:01:51.56,0:01:52.53,Default,,0000,0000,0000,,and let's see. 
Dialogue: 0,0:01:52.53,0:01:56.37,Default,,0000,0000,0000,,Let's say if this is 50\Nthen this would be roughly Dialogue: 0,0:01:56.37,0:01:58.80,Default,,0000,0000,0000,,40 right here, and I just wanna get rough. Dialogue: 0,0:01:58.80,0:02:03.66,Default,,0000,0000,0000,,So this would be about 60,\N70, 80, 90, close enough. Dialogue: 0,0:02:03.66,0:02:05.59,Default,,0000,0000,0000,,I'm, I could draw this\Na little bit neater, Dialogue: 0,0:02:05.59,0:02:07.26,Default,,0000,0000,0000,,but, 60, 70, 80, 90. Dialogue: 0,0:02:08.95,0:02:12.44,Default,,0000,0000,0000,,Actually, let me just clean\Nthis up a little bit more too. Dialogue: 0,0:02:12.44,0:02:14.02,Default,,0000,0000,0000,,This one right over here would be Dialogue: 0,0:02:14.02,0:02:16.69,Default,,0000,0000,0000,,a little bit closer to this one. Dialogue: 0,0:02:18.42,0:02:22.05,Default,,0000,0000,0000,,Let me just put it right around here. Dialogue: 0,0:02:22.05,0:02:26.05,Default,,0000,0000,0000,,So that's 40, and then\Nthis would be 30, 20, 10. Dialogue: 0,0:02:27.23,0:02:28.69,Default,,0000,0000,0000,,Okay, that's pretty good. Dialogue: 0,0:02:28.69,0:02:30.10,Default,,0000,0000,0000,,So let's plot this data. Dialogue: 0,0:02:30.10,0:02:34.26,Default,,0000,0000,0000,,So, one student makes 35,000,\Nso that is right over there. Dialogue: 0,0:02:35.57,0:02:38.41,Default,,0000,0000,0000,,Two make 50,000, or three make 50,000, Dialogue: 0,0:02:38.41,0:02:40.33,Default,,0000,0000,0000,,so one, two, and three. Dialogue: 0,0:02:42.28,0:02:43.86,Default,,0000,0000,0000,,I'll put it like that. Dialogue: 0,0:02:43.86,0:02:48.03,Default,,0000,0000,0000,,One makes 56,000 which would\Nput them right over here. Dialogue: 0,0:02:49.90,0:02:53.29,Default,,0000,0000,0000,,One makes 60,000, or\Nactually, two make 60,000, Dialogue: 0,0:02:53.29,0:02:54.80,Default,,0000,0000,0000,,so it's like that. Dialogue: 0,0:02:54.80,0:02:58.39,Default,,0000,0000,0000,,One makes 75,000, so\Nthat's 60, 70, 75,000. 
Dialogue: 0,0:03:00.24,0:03:02.11,Default,,0000,0000,0000,,So it's gonna be right around there, Dialogue: 0,0:03:02.11,0:03:04.17,Default,,0000,0000,0000,,and then one makes 250,000. Dialogue: 0,0:03:04.17,0:03:07.67,Default,,0000,0000,0000,,So one's salary is all\Nthe way around there, Dialogue: 0,0:03:07.67,0:03:11.26,Default,,0000,0000,0000,,and then when we\Ncalculate the mean as 76.2 Dialogue: 0,0:03:11.26,0:03:13.33,Default,,0000,0000,0000,,as our measure of central tendency, Dialogue: 0,0:03:13.33,0:03:15.41,Default,,0000,0000,0000,,76.2 is right over there. Dialogue: 0,0:03:16.65,0:03:21.14,Default,,0000,0000,0000,,So is this a good measure\Nof central tendency? Dialogue: 0,0:03:21.14,0:03:23.24,Default,,0000,0000,0000,,Well to me it doesn't feel that good, Dialogue: 0,0:03:23.24,0:03:26.14,Default,,0000,0000,0000,,because our measure of central\Ntendency is higher than all Dialogue: 0,0:03:26.14,0:03:29.86,Default,,0000,0000,0000,,of the data points except for\None, and the reason is is that Dialogue: 0,0:03:29.86,0:03:33.56,Default,,0000,0000,0000,,you have this one that the,\Nthat our, our data is skewed Dialogue: 0,0:03:33.56,0:03:37.31,Default,,0000,0000,0000,,significantly by this\Ndata point at $250,000. Dialogue: 0,0:03:38.51,0:03:41.29,Default,,0000,0000,0000,,It is so far from the\Nrest of the distribution Dialogue: 0,0:03:41.29,0:03:44.59,Default,,0000,0000,0000,,from the rest of the data\Nthat it has skewed the mean, Dialogue: 0,0:03:44.59,0:03:46.89,Default,,0000,0000,0000,,and this is something\Nthat you see in general. 
Dialogue: 0,0:03:46.89,0:03:49.87,Default,,0000,0000,0000,,If you have data that is skewed,\Nand especially things like Dialogue: 0,0:03:49.87,0:03:52.74,Default,,0000,0000,0000,,salary data where someone might\Nmake, most people are making Dialogue: 0,0:03:52.74,0:03:56.10,Default,,0000,0000,0000,,50, 60, $70,000, but someone\Nmight make two million dollars, Dialogue: 0,0:03:56.10,0:03:59.66,Default,,0000,0000,0000,,and so that will skew the\Naverage or skew the mean I should Dialogue: 0,0:03:59.66,0:04:01.98,Default,,0000,0000,0000,,say, when you add them all\Nup and divide by the number Dialogue: 0,0:04:01.98,0:04:03.25,Default,,0000,0000,0000,,of data points you have. Dialogue: 0,0:04:03.25,0:04:06.40,Default,,0000,0000,0000,,In this case, especially when\Nyou have data points that Dialogue: 0,0:04:06.40,0:04:09.78,Default,,0000,0000,0000,,would skew the mean,\Nmedian is much more robust. Dialogue: 0,0:04:09.78,0:04:14.16,Default,,0000,0000,0000,,The median at 56 sits right\Nover here, which seems to be Dialogue: 0,0:04:14.16,0:04:17.46,Default,,0000,0000,0000,,much more indicative for central tendency. Dialogue: 0,0:04:17.46,0:04:18.51,Default,,0000,0000,0000,,And think about it. Dialogue: 0,0:04:18.51,0:04:21.58,Default,,0000,0000,0000,,Even if you made this instead of 250,000 Dialogue: 0,0:04:21.58,0:04:25.80,Default,,0000,0000,0000,,if you made this 250,000\Nthousand, which would be 250 Dialogue: 0,0:04:25.80,0:04:29.14,Default,,0000,0000,0000,,million dollars, which is\Na ginormous amount of money Dialogue: 0,0:04:29.14,0:04:32.72,Default,,0000,0000,0000,,to make, it wouldn't, it would\Nskew the mean incredibly, Dialogue: 0,0:04:32.72,0:04:35.53,Default,,0000,0000,0000,,but it actually would not\Neven change the median, Dialogue: 0,0:04:35.53,0:04:37.34,Default,,0000,0000,0000,,because the median, it doesn't matter Dialogue: 0,0:04:37.34,0:04:38.55,Default,,0000,0000,0000,,how high this number gets. 
Dialogue: 0,0:04:38.55,0:04:39.93,Default,,0000,0000,0000,,This could be a trillion dollars. Dialogue: 0,0:04:39.93,0:04:41.69,Default,,0000,0000,0000,,This could be a quadrillion dollars. Dialogue: 0,0:04:41.69,0:04:43.94,Default,,0000,0000,0000,,The median is going to stay the same. Dialogue: 0,0:04:43.94,0:04:45.83,Default,,0000,0000,0000,,So the median is much more robust Dialogue: 0,0:04:45.83,0:04:48.15,Default,,0000,0000,0000,,if you have a skewed data set. Dialogue: 0,0:04:48.15,0:04:51.52,Default,,0000,0000,0000,,Mean makes a little bit more\Nsense if you have a symmetric Dialogue: 0,0:04:51.52,0:04:54.56,Default,,0000,0000,0000,,data set or if you have things\Nthat are, you know, where, Dialogue: 0,0:04:54.56,0:04:56.56,Default,,0000,0000,0000,,where things are roughly\Nabove and below the mean, Dialogue: 0,0:04:56.56,0:04:59.61,Default,,0000,0000,0000,,or things aren't skewed\Nincredibly in one direction, Dialogue: 0,0:04:59.61,0:05:01.24,Default,,0000,0000,0000,,especially by a handful of data Dialogue: 0,0:05:01.24,0:05:03.56,Default,,0000,0000,0000,,points like we have right over here. Dialogue: 0,0:05:03.56,0:05:06.60,Default,,0000,0000,0000,,So in this example, the median is a much Dialogue: 0,0:05:06.60,0:05:09.81,Default,,0000,0000,0000,,better measure of central tendency. Dialogue: 0,0:05:09.81,0:05:11.40,Default,,0000,0000,0000,,And so what about spread? Dialogue: 0,0:05:11.40,0:05:13.85,Default,,0000,0000,0000,,Well you might say, well,\NSal you already told us Dialogue: 0,0:05:13.85,0:05:15.68,Default,,0000,0000,0000,,that the mean is not so good Dialogue: 0,0:05:15.68,0:05:18.50,Default,,0000,0000,0000,,and the standard deviation\Nis based on the mean. 
Dialogue: 0,0:05:18.50,0:05:22.19,Default,,0000,0000,0000,,You take each of these data\Npoints, find their distance Dialogue: 0,0:05:22.19,0:05:24.99,Default,,0000,0000,0000,,from the mean, square that\Nnumber, add up those squared Dialogue: 0,0:05:24.99,0:05:27.78,Default,,0000,0000,0000,,distances, divide by the\Nnumber of data points if we're Dialogue: 0,0:05:27.78,0:05:31.13,Default,,0000,0000,0000,,taking the population standard\Ndeviation, and then you, Dialogue: 0,0:05:31.13,0:05:34.56,Default,,0000,0000,0000,,and then you, you take the\Nsquare root of the whole thing. Dialogue: 0,0:05:34.56,0:05:37.83,Default,,0000,0000,0000,,And so since this is based on\Nthe mean, which isn't a good Dialogue: 0,0:05:37.83,0:05:41.40,Default,,0000,0000,0000,,measure of central tendency\Nin this situation, and this, Dialogue: 0,0:05:41.40,0:05:44.96,Default,,0000,0000,0000,,this is also going to skew\Nthat standard deviation. Dialogue: 0,0:05:44.96,0:05:47.94,Default,,0000,0000,0000,,This is going to be, this is a lot larger Dialogue: 0,0:05:47.94,0:05:50.47,Default,,0000,0000,0000,,than if you look at the, the actual, Dialogue: 0,0:05:50.47,0:05:53.45,Default,,0000,0000,0000,,if you wanted an indication of the spread. Dialogue: 0,0:05:53.45,0:05:56.65,Default,,0000,0000,0000,,Yes, you have this one data\Npoint that's way far away Dialogue: 0,0:05:56.65,0:05:59.62,Default,,0000,0000,0000,,from either the mean or\Nthe median depending on how Dialogue: 0,0:05:59.62,0:06:02.50,Default,,0000,0000,0000,,you wanna think about it, but\Nmost of the data points seem Dialogue: 0,0:06:02.50,0:06:04.94,Default,,0000,0000,0000,,much closer, and so for that situation, Dialogue: 0,0:06:04.94,0:06:07.11,Default,,0000,0000,0000,,not only are we using the median, Dialogue: 0,0:06:07.11,0:06:10.78,Default,,0000,0000,0000,,but the interquartile range\Nis once again more robust. Dialogue: 0,0:06:10.78,0:06:13.06,Default,,0000,0000,0000,,How do we calculate the\Ninterquartile range? 
Dialogue: 0,0:06:13.06,0:06:15.32,Default,,0000,0000,0000,,Well, you take the median\Nand then you take the bottom Dialogue: 0,0:06:15.32,0:06:18.98,Default,,0000,0000,0000,,group of numbers and\Ncalculate the median of those. Dialogue: 0,0:06:18.98,0:06:21.95,Default,,0000,0000,0000,,So that's 50 right over here\Nand then you take the top Dialogue: 0,0:06:21.95,0:06:24.88,Default,,0000,0000,0000,,group of numbers, the\Nupper group of numbers, Dialogue: 0,0:06:24.88,0:06:28.93,Default,,0000,0000,0000,,and the median there is\N60 and 75, it's 67.5. Dialogue: 0,0:06:28.93,0:06:30.91,Default,,0000,0000,0000,,If this looks unfamiliar\Nwe have many videos Dialogue: 0,0:06:30.91,0:06:32.83,Default,,0000,0000,0000,,on interquartile range and calculating Dialogue: 0,0:06:32.83,0:06:34.51,Default,,0000,0000,0000,,standard deviation and median and mean. Dialogue: 0,0:06:34.51,0:06:35.89,Default,,0000,0000,0000,,This is just a little bit of a review, Dialogue: 0,0:06:35.89,0:06:39.21,Default,,0000,0000,0000,,and then the difference\Nbetween these two is 17.5, Dialogue: 0,0:06:39.21,0:06:43.18,Default,,0000,0000,0000,,and notice, this distance\Nbetween these two, this 17.5, Dialogue: 0,0:06:43.18,0:06:44.91,Default,,0000,0000,0000,,this isn't going to change, Dialogue: 0,0:06:44.91,0:06:48.20,Default,,0000,0000,0000,,even if this is 250 billion dollars. Dialogue: 0,0:06:48.20,0:06:51.97,Default,,0000,0000,0000,,So once again, it is both of\Nthese measures are more robust Dialogue: 0,0:06:51.97,0:06:54.64,Default,,0000,0000,0000,,when you have a skewed data set. 
Dialogue: 0,0:06:56.06,0:06:59.41,Default,,0000,0000,0000,,So the big takeaway here is\Nmean and standard deviation, Dialogue: 0,0:06:59.41,0:07:02.23,Default,,0000,0000,0000,,they're not bad if you have\Na roughly symmetric data set, Dialogue: 0,0:07:02.23,0:07:05.05,Default,,0000,0000,0000,,if you don't have any\Nsignificant outliers, Dialogue: 0,0:07:05.05,0:07:07.19,Default,,0000,0000,0000,,things that really skew the data set, Dialogue: 0,0:07:07.19,0:07:10.28,Default,,0000,0000,0000,,mean and standard deviation\Ncan be quite solid. Dialogue: 0,0:07:10.28,0:07:12.58,Default,,0000,0000,0000,,But if you're looking at\Nsomething that could get really Dialogue: 0,0:07:12.58,0:07:15.83,Default,,0000,0000,0000,,skewed by a handful of data\Npoints median might be, Dialogue: 0,0:07:15.83,0:07:19.09,Default,,0000,0000,0000,,median and interquartile range,\Nmedian for central tendency, Dialogue: 0,0:07:19.09,0:07:23.49,Default,,0000,0000,0000,,interquartile range for spread\Naround that central tendency, Dialogue: 0,0:07:23.49,0:07:26.31,Default,,0000,0000,0000,,and that's why you'll see when\Npeople talk about salaries Dialogue: 0,0:07:26.31,0:07:28.28,Default,,0000,0000,0000,,they'll often talk about\Nmedian, because you can have Dialogue: 0,0:07:28.28,0:07:30.26,Default,,0000,0000,0000,,some skewed salaries,\Nespecially on the up side. 
Dialogue: 0,0:07:30.26,0:07:32.41,Default,,0000,0000,0000,,When we talk about things\Nlike home prices you'll see Dialogue: 0,0:07:32.41,0:07:35.47,Default,,0000,0000,0000,,median often measured\Nmore typically than mean, Dialogue: 0,0:07:35.47,0:07:39.00,Default,,0000,0000,0000,,because home prices in a\Nneighborhood, a lot of, Dialogue: 0,0:07:39.00,0:07:42.26,Default,,0000,0000,0000,,or in a city, a lot of the\Nhouses might be in the 200,000, Dialogue: 0,0:07:42.26,0:07:45.63,Default,,0000,0000,0000,,$300,000 range, but maybe\Nthere's one ginormous mansion Dialogue: 0,0:07:45.63,0:07:48.86,Default,,0000,0000,0000,,that is 100 million dollars,\Nand if you calculated mean Dialogue: 0,0:07:48.86,0:07:51.85,Default,,0000,0000,0000,,that would skew and give a\Nfalse impression of the average Dialogue: 0,0:07:51.85,0:07:55.77,Default,,0000,0000,0000,,or the central tendency\Nof prices in that city.
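
The comparison walked through in the transcript can be checked with a short Python sketch. It is an illustration, not part of the video: the function names `iqr_by_halves` and `summarize` are made up here, and the interquartile range follows the method described above (median of the lower half minus median of the upper half, leaving the middle value out when the count is odd). Other tools may use a different quartile convention and report a different IQR.

```python
import statistics

# Salaries from the video, in thousands of dollars.
salaries = [35, 50, 50, 50, 56, 60, 60, 75, 250]


def iqr_by_halves(values):
    """Median of the upper half minus median of the lower half.

    For an odd count the middle value is left out of both halves,
    as in the video. Needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return statistics.median(upper) - statistics.median(lower)


def summarize(values):
    """Collect the four summary statistics discussed in the video."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "pstdev": statistics.pstdev(values),  # population standard deviation
        "iqr": iqr_by_halves(values),
    }


original = summarize(salaries)

# Replace the 250 (thousand) salary with 250,000 (thousand), i.e. 250 million
# dollars, to see which statistics move.
inflated = summarize(salaries[:-1] + [250_000])

for name in ("mean", "median", "pstdev", "iqr"):
    print(f"{name:>7}: {original[name]:>12.1f} -> {inflated[name]:>12.1f}")
```

With the original data the mean rounds to 76.2, the median is 56, and the interquartile range is 17.5, matching the numbers in the video. After inflating the top salary, the mean and standard deviation grow enormously while the median and interquartile range stay where they were.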