9:59:59.000,9:59:59.000 1[br]00:00:04,650 --> 00:00:05,470[br]Hello, in this video, 9:59:59.000,9:59:59.000 2[br]00:00:05,470 --> 00:00:14,610[br]I want to talk with you just a little bit about statistics that we can measure and how we want to think about differences between them. 9:59:59.000,9:59:59.000 3[br]00:00:14,610 --> 00:00:21,220[br]In the previous video or an earlier video, I talked about how bar charts emphasize relative differences. 9:59:59.000,9:59:59.000 4[br]00:00:21,220 --> 00:00:24,450[br]So in this one, I'm going to talk just a little bit about what that means. 9:59:59.000,9:59:59.000 5[br]00:00:24,450 --> 00:00:30,040[br]So our learning outcomes for this video are to review a little bit of the statistics or the metrics that we've been talking about, 9:59:59.000,9:59:59.000 6[br]00:00:30,040 --> 00:00:37,990[br]and for you to be able to compute absolute and relative differences and to interpret a relative difference between two quantities. 9:59:59.000,9:59:59.000 7[br]00:00:37,990 --> 00:00:43,870[br]So as we've talked about, we can compute various statistics over our data: means, medians, modes, success rates. 9:59:59.000,9:59:59.000 8[br]00:00:43,870 --> 00:00:46,720[br]We can compute percentiles, counts, and many more. 9:59:59.000,9:59:59.000 9[br]00:00:46,720 --> 00:00:57,530[br]Basically, anything you can think of that you can compute over a set of numbers we can use as some kind of a statistic or a metric. 9:59:59.000,9:59:59.000 10[br]00:00:57,530 --> 00:01:04,210[br]And this often serves as the metric for analysis or evaluation. If we're trying to evaluate a program or trying to evaluate a technology, 9:59:59.000,9:59:59.000 11[br]00:01:04,210 --> 00:01:08,920[br]we have some metric that is measuring its effectiveness. 9:59:59.000,9:59:59.000 12[br]00:01:08,920 --> 00:01:15,310[br]And we want to see whether it's improved or changed somehow. 
9:59:59.000,9:59:59.000 13[br]00:01:15,310 --> 00:01:19,090[br]So when we compare two values, though, there are a few different ways to do it. 9:59:59.000,9:59:59.000 14[br]00:01:19,090 --> 00:01:24,040[br]So let's take the 2018 population estimates of Boise and Salt Lake City. 9:59:59.000,9:59:59.000 15[br]00:01:24,040 --> 00:01:27,230[br]And there are a few different ways that we can compare them. 9:59:59.000,9:59:59.000 16[br]00:01:27,230 --> 00:01:33,460[br]The first is the absolute difference. Boise has twenty-eight thousand more people than Salt Lake City. 9:59:59.000,9:59:59.000 17[br]00:01:33,460 --> 00:01:42,180[br]The other is the relative difference. Boise has fourteen point five percent, or fourteen percent, more people than Salt Lake City. 9:59:59.000,9:59:59.000 18[br]00:01:42,180 --> 00:01:49,020[br]So the absolute difference between two values is actually the absolute value of the difference between the two values. 9:59:59.000,9:59:59.000 19[br]00:01:49,020 --> 00:01:54,750[br]We can also talk about a signed difference, or a real difference, where we don't take the absolute value, if we need the direction of the difference. 9:59:59.000,9:59:59.000 20[br]00:01:54,750 --> 00:02:01,440[br]That becomes useful. But what we're talking about here is the actual difference in the underlying units. 9:59:59.000,9:59:59.000 21[br]00:02:01,440 --> 00:02:11,940[br]So in our example case, number of people. But another way we can do it is to talk about the relative difference. 9:59:59.000,9:59:59.000 22[br]00:02:11,940 --> 00:02:18,960[br]So this is the difference normalized by 9:59:59.000,9:59:59.000 23[br]00:02:18,960 --> 00:02:22,440[br]the reference quantity, and we have to be clear on which one is 9:59:59.000,9:59:59.000 24[br]00:02:22,440 --> 00:02:29,700[br]the reference quantity. The reference quantity is the one we're starting from when we're computing the relative difference. 
9:59:59.000,9:59:59.000 25[br]00:02:29,700 --> 00:02:39,990[br]So, for example, 50 is 25 percent more than 40, because 25 percent of 40 is 10; add that and you get 50. 9:59:59.000,9:59:59.000 26[br]00:02:39,990 --> 00:02:46,410[br]But 40 is only 20 percent less than 50, because you take 50; 9:59:59.000,9:59:59.000 27[br]00:02:46,410 --> 00:02:50,670[br]10 is 20 percent of 50, whereas it's 25 percent of 40. 9:59:59.000,9:59:59.000 28[br]00:02:50,670 --> 00:02:56,750[br]You subtract it, and you get 40. 9:59:59.000,9:59:59.000 29[br]00:02:56,750 --> 00:03:01,010[br]So this order difference is really, really important. 9:59:59.000,9:59:59.000 30[br]00:03:01,010 --> 00:03:09,480[br]So for the 50, what we've got is we have 50 minus 40, over 40. 9:59:59.000,9:59:59.000 31[br]00:03:09,480 --> 00:03:20,010[br]That's this one. And we have 40 minus 50, over 50: negative 10 over 50. 9:59:59.000,9:59:59.000 32[br]00:03:20,010 --> 00:03:25,080[br]That's 20 percent. Ten over 40 is 25 percent. 9:59:59.000,9:59:59.000 33[br]00:03:25,080 --> 00:03:33,330[br]So we need to be really, really careful about the order. Another way we can compare is with the ratio: we just divide one quantity by the other. 9:59:59.000,9:59:59.000 34[br]00:03:33,330 --> 00:03:40,470[br]So this is when we say something like, this year's sales of 20 million are twice as much as last year's 10 million. 9:59:59.000,9:59:59.000 35[br]00:03:40,470 --> 00:03:47,010[br]It's an absolute change of 10 million and it's a relative change of one hundred percent. 9:59:59.000,9:59:59.000 36[br]00:03:47,010 --> 00:03:51,030[br]Twenty million is twice as much as 10 million. 9:59:59.000,9:59:59.000 37[br]00:03:51,030 --> 00:03:54,090[br]And it is one hundred percent higher than 10 million. 9:59:59.000,9:59:59.000 38[br]00:03:54,090 --> 00:03:59,940[br]Now, one thing to think about: suppose I say this year's returns are two times larger than last year's. 
9:59:59.000,9:59:59.000 39[br]00:03:59,940 --> 00:04:03,750[br]What does that mean? Does it mean it's two times as large? 9:59:59.000,9:59:59.000 40[br]00:04:03,750 --> 00:04:08,100[br]Does it mean it's 200 percent more, which would be three times? 9:59:59.000,9:59:59.000 41[br]00:04:08,100 --> 00:04:13,350[br]Is it clear? I would submit that this way of framing it is ambiguous. 9:59:59.000,9:59:59.000 42[br]00:04:13,350 --> 00:04:20,400[br]And so we should avoid it. The appropriate comparison really depends on the context and the problem. 9:59:59.000,9:59:59.000 43[br]00:04:20,400 --> 00:04:25,320[br]There's not a hard and fast rule for when you need one or another. 9:59:59.000,9:59:59.000 44[br]00:04:25,320 --> 00:04:31,740[br]Relative comparisons are quite common because they can be compared across a variety of contexts. 9:59:59.000,9:59:59.000 45[br]00:04:31,740 --> 00:04:36,570[br]But we still also need to pay attention to the underlying absolute difference and what 9:59:59.000,9:59:59.000 46[br]00:04:36,570 --> 00:04:41,190[br]the change being described by this relative change actually is. 9:59:59.000,9:59:59.000 47[br]00:04:41,190 --> 00:04:43,850[br]One example of a high-profile relative change 9:59:59.000,9:59:59.000 48[br]00:04:43,850 --> 00:04:53,940[br]is the Netflix Prize, which was run by Netflix a number of years ago. They paid a million dollars to the team that was able to beat 9:59:59.000,9:59:59.000 49[br]00:04:53,940 --> 00:04:58,740[br]their internal movie recommender on the metric that they chose by 10 percent. 9:59:59.000,9:59:59.000 50[br]00:04:58,740 --> 00:05:06,120[br]They wanted a 10 percent improvement. And for this metric, lower is better, so 9:59:59.000,9:59:59.000 51[br]00:05:06,120 --> 00:05:10,380[br]they wanted a decrease in the metric by 10 percent. 
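The arithmetic from the 50-versus-40 example, the sales example, and the ambiguous "two times larger" phrasing can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example and do not come from the lecture.

```python
def absolute_difference(value, reference):
    """Unsigned difference, in the underlying units."""
    return abs(value - reference)


def signed_difference(value, reference):
    """Difference that keeps its direction (positive: value is larger)."""
    return value - reference


def relative_difference(value, reference):
    """Signed difference normalized by the reference quantity."""
    return (value - reference) / reference


def ratio(value, reference):
    """One quantity divided by the other."""
    return value / reference


# The reference quantity matters: swapping it changes the answer.
print(relative_difference(50, 40))  # 0.25 (50 is 25 percent more than 40)
print(relative_difference(40, 50))  # -0.2 (40 is 20 percent less than 50)

# Sales example: 20 million this year against 10 million last year.
print(absolute_difference(20_000_000, 10_000_000))  # 10000000
print(relative_difference(20_000_000, 10_000_000))  # 1.0 (one hundred percent higher)
print(ratio(20_000_000, 10_000_000))                # 2.0 (twice as much)

# "Two times larger" can be read two ways for a reference of 10.
reference = 10
read_as_ratio = 2 * reference                 # 20: "two times as large"
read_as_increase = reference + 2 * reference  # 30: "200 percent more"
print(read_as_ratio, read_as_increase)        # 20 30
```

Passing the reference quantity as an explicit second argument is one way to keep the order of the comparison visible at every call.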
9:59:59.000,9:59:59.000 52[br]00:05:10,380 --> 00:05:17,340[br]We can also talk about a difference between differences, because if we compute a difference, that difference itself is just another value. 9:59:59.000,9:59:59.000 53[br]00:05:17,340 --> 00:05:22,290[br]So we could say sales grew 10 percent more this year than last year. 9:59:59.000,9:59:59.000 54[br]00:05:22,290 --> 00:05:32,200[br]So if we define growth as one year's sales minus the previous year's sales, then we can look at the growth of this year 9:59:59.000,9:59:59.000 55[br]00:05:32,200 --> 00:05:37,000[br]and the growth of last year, and we can compute the difference in differences. 9:59:59.000,9:59:59.000 56[br]00:05:37,000 --> 00:05:44,080[br]And so we can have a 10 percent increase in growth. Differences in differences come up a lot in various contexts. 9:59:59.000,9:59:59.000 57[br]00:05:44,080 --> 00:05:50,020[br]And so it's important to be able to reason about those as well. And again, be clear both in writing and understanding. 9:59:59.000,9:59:59.000 58[br]00:05:50,020 --> 00:05:58,120[br]So we don't want to... For visual comparison, bar charts emphasize relative difference because the height of the bar is right there. 9:59:59.000,9:59:59.000 59[br]00:05:58,120 --> 00:06:04,120[br]And the eye very naturally compares the difference between bars to the height of the bar itself. 9:59:59.000,9:59:59.000 60[br]00:06:04,120 --> 00:06:10,840[br]Point plots emphasize absolute difference because you don't have the reference point of the size of the bar. 9:59:59.000,9:59:59.000 61[br]00:06:10,840 --> 00:06:17,170[br]Both of them make it pretty clear to see, and also to compare, the differences, 9:59:59.000,9:59:59.000 62[br]00:06:17,170 --> 00:06:23,590[br]to see how different the differences are. Those are evident both in bar charts and point plots. 
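The idea of a difference between differences can be sketched as follows. The three sales figures are hypothetical numbers chosen only so the arithmetic is easy to follow, and the helper names are assumptions made for this example rather than anything from the lecture.

```python
def growth(current_sales, previous_sales):
    """Growth defined as one year's sales minus the previous year's sales."""
    return current_sales - previous_sales


def relative_difference(value, reference):
    """Signed difference normalized by the reference quantity."""
    return (value - reference) / reference


# Hypothetical sales for three consecutive years (illustrative only).
sales_two_years_ago = 100
sales_last_year = 110
sales_this_year = 121

growth_last_year = growth(sales_last_year, sales_two_years_ago)  # 10
growth_this_year = growth(sales_this_year, sales_last_year)      # 11

# The growth values are themselves just values, so we can compare them.
difference_in_growth = growth_this_year - growth_last_year
relative_change_in_growth = relative_difference(growth_this_year, growth_last_year)

print(difference_in_growth)       # 1
print(relative_change_in_growth)  # 0.1 (growth was 10 percent larger this year)
```

With these made-up numbers, "sales grew 10 percent more this year than last year" describes the growth (11 against 10), which is a different statement from a relative change in sales themselves.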
9:59:59.000,9:59:59.000 63[br]00:06:23,590 --> 00:06:28,690[br]So to wrap up, there are three primary ways to compare statistics: absolute, relative, and ratio. 9:59:59.000,9:59:59.000 64[br]00:06:28,690 --> 00:06:32,470[br]You need to be very clear and unambiguous when you're writing the results of a 9:59:59.000,9:59:59.000 65[br]00:06:32,470 --> 00:06:36,370[br]comparison and also when you're trying to understand what others have written. 9:59:59.000,9:59:59.000 66[br]00:06:36,370 --> 00:06:39,340[br]Seek to accurately understand it. And if you're providing feedback, 9:59:59.000,9:59:59.000 67[br]00:06:39,340 --> 00:06:50,900[br]if you're in a context where you're providing feedback and it's not clear, that clarity is something you want to ask for in revision. 9:59:59.000,9:59:59.000