1 00:00:00,520 --> 00:00:02,820 Let's say I've got a set of numbers. 2 00:00:02,820 --> 00:00:08,710 2, say I've got three 3's, I've got a couple of 4's, and 3 00:00:08,710 --> 00:00:10,620 I've got a 10 there. 4 00:00:10,620 --> 00:00:14,210 And what we want to do is find the middle of these numbers. 5 00:00:14,210 --> 00:00:17,820 We want to represent these numbers with the center of the 6 00:00:17,820 --> 00:00:19,900 numbers, or the middle of the numbers, just so we have a 7 00:00:19,900 --> 00:00:24,350 sense of where these numbers roughly are. 8 00:00:24,350 --> 00:00:27,360 And this central tendency that we're going to try to get out 9 00:00:27,360 --> 00:00:31,020 of these numbers, we're going to call the average. 10 00:00:31,020 --> 00:00:34,990 The average of this set of numbers. 11 00:00:34,990 --> 00:00:38,840 And you've, I'm sure, heard the word average before, but 12 00:00:38,840 --> 00:00:41,110 we're going to get a little bit more detailed on the 13 00:00:41,110 --> 00:00:44,270 different types of averages in this video. 14 00:00:44,270 --> 00:00:46,890 The one you're probably most familiar with, although you 15 00:00:46,890 --> 00:00:49,950 might have not seen it referred to in this way, is 16 00:00:49,950 --> 00:00:58,290 the arithmetic mean, which literally says, look, I, the 17 00:00:58,290 --> 00:01:01,270 arithmetic mean of this set of numbers, is literally the sum 18 00:01:01,270 --> 00:01:03,420 of all of these numbers divided by the number of 19 00:01:03,420 --> 00:01:04,209 numbers there are. 20 00:01:04,209 --> 00:01:07,440 So the arithmetic mean for this set right here is going 21 00:01:07,440 --> 00:01:17,070 to be 2 plus 3 plus 3 plus 3 plus 4 plus 4 plus 10, all of 22 00:01:17,070 --> 00:01:19,230 that over, how many numbers do I have? 23 00:01:19,230 --> 00:01:22,870 1, 2, 3, 4, 5, 6, 7. 24 00:01:22,870 --> 00:01:24,720 All of that over 7. 25 00:01:24,720 --> 00:01:25,660 And what is this equal to? 
26 00:01:25,660 --> 00:01:34,560 This is 2 plus 9, which is 11, plus 8, which is 19, plus 10, 27 00:01:34,560 --> 00:01:35,520 which is 29. 28 00:01:35,520 --> 00:01:41,290 So this is going to be equal to 29/7, or you could say it's 29 00:01:41,290 --> 00:01:43,540 equal to 4 and 1/7. 30 00:01:43,540 --> 00:01:45,240 If I got my calculator out, we could figure out 31 00:01:45,240 --> 00:01:46,480 the decimal of this. 32 00:01:46,480 --> 00:01:50,990 But this is a representation of the central tendency, or 33 00:01:50,990 --> 00:01:52,450 the middle of these numbers. 34 00:01:52,450 --> 00:01:53,670 And it kind of makes sense. 35 00:01:53,670 --> 00:01:57,670 4 and 1/7, it's a little bit higher than 4. 36 00:01:57,670 --> 00:02:01,070 We're kind of close to the middle of our number range 37 00:02:01,070 --> 00:02:01,810 right there. 38 00:02:01,810 --> 00:02:03,590 And you might say, well, it's a little skewed to the right 39 00:02:03,590 --> 00:02:04,700 and what caused that? 40 00:02:04,700 --> 00:02:06,950 And well, gee, 10 is a little bit larger than all of the 41 00:02:06,950 --> 00:02:07,840 other numbers. 42 00:02:07,840 --> 00:02:09,320 It's kind of an outlier. 43 00:02:09,320 --> 00:02:13,370 Maybe that skewed this average up, the arithmetic mean. 44 00:02:13,370 --> 00:02:16,680 So there are other types of averages, although this is the 45 00:02:16,680 --> 00:02:19,200 one that, if people just say, hey, let's take the average of 46 00:02:19,200 --> 00:02:21,660 these numbers, and they don't really tell you more, they're 47 00:02:21,660 --> 00:02:23,940 probably talking about the arithmetic mean. 48 00:02:23,940 --> 00:02:30,270 The other forms of average, though, are the median, and 49 00:02:30,270 --> 00:02:32,290 this literally is the middle number. 50 00:02:35,970 --> 00:02:38,360 If there are two middle numbers, you actually take the 51 00:02:38,360 --> 00:02:40,730 arithmetic mean of those two middle numbers. 
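A minimal sketch of the arithmetic mean worked out above, assuming Python. The function name arithmetic_mean is made up for illustration and does not come from the video. It keeps the answer as an exact fraction so the 29/7 stays visible, then converts it to a decimal the way a calculator would.

```python
from fractions import Fraction

# The data set from the video: one 2, three 3's, two 4's, and a 10.
numbers = [2, 3, 3, 3, 4, 4, 10]


def arithmetic_mean(values):
    """Sum of all the values divided by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return Fraction(sum(values), len(values))


mean = arithmetic_mean(numbers)
print(mean)         # the exact fraction 29/7, i.e. 4 and 1/7
print(float(mean))  # the decimal form, a little above 4.14
```

Using Fraction here is only a design choice to mirror the hand calculation; plain division with sum(values) / len(values) gives the same value as a decimal.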
52 00:02:40,730 --> 00:02:43,310 You actually find the number halfway in between those two 53 00:02:43,310 --> 00:02:44,330 middle numbers. 54 00:02:44,330 --> 00:02:47,260 So the median of this set right here-- let me just 55 00:02:47,260 --> 00:02:48,350 rewrite them. 56 00:02:48,350 --> 00:02:57,060 So I have a 2, a 3, 3, 3, 4, 4, 10. 57 00:02:57,060 --> 00:02:59,690 So, let's see, we have seven numbers right here. 58 00:02:59,690 --> 00:03:02,800 The middle number, if I go 1, 2, 3, to the 59 00:03:02,800 --> 00:03:04,140 right, we're there. 60 00:03:04,140 --> 00:03:06,470 If we go 1, 2, 3 to the left, we're there. 61 00:03:06,470 --> 00:03:09,850 The middle number is that 3 right there. 62 00:03:09,850 --> 00:03:13,360 I just listed them in order, and I said, well, look, 3, you 63 00:03:13,360 --> 00:03:16,640 could think of it as the fourth number from the right, 64 00:03:16,640 --> 00:03:19,550 and it's also the fourth number from the left. 65 00:03:19,550 --> 00:03:21,360 3 is the middle number. 66 00:03:21,360 --> 00:03:23,940 And in this case, it is the median. 67 00:03:23,940 --> 00:03:29,280 So in this case, 3, if you use the median, is our average. 68 00:03:29,280 --> 00:03:30,360 And that also makes sense. 69 00:03:30,360 --> 00:03:32,030 I mean, it's literally the middle number, and if you look 70 00:03:32,030 --> 00:03:35,670 at this set of numbers, it kind of does represent the 71 00:03:35,670 --> 00:03:37,790 central tendency of this set. 72 00:03:37,790 --> 00:03:40,880 Now just to be clear, it was very clear what the middle 73 00:03:40,880 --> 00:03:45,350 number was, because I had an odd number of numbers. 74 00:03:45,350 --> 00:03:48,560 I had three on each side of the three, so it was very easy 75 00:03:48,560 --> 00:03:50,790 to figure out the median, the middle number. 76 00:03:50,790 --> 00:03:53,800 But if I had a situation-- let's say I have the situation 77 00:03:53,800 --> 00:03:57,850 where I have 2, 3, 4, and 5.
78 00:03:57,850 --> 00:04:01,510 Let's say that's my set of numbers. 79 00:04:01,510 --> 00:04:04,350 Well, here, there is no one middle number. 80 00:04:04,350 --> 00:04:07,460 The 3 is closer to the left than it is to the right. 81 00:04:07,460 --> 00:04:10,040 The 4 is closer to the right than it is to the left. 82 00:04:10,040 --> 00:04:12,220 There are actually two middle numbers here. 83 00:04:12,220 --> 00:04:17,380 The two middle numbers here are the 3 and the 4. 84 00:04:17,380 --> 00:04:20,640 And here, when you have two middle numbers, which occurs 85 00:04:20,640 --> 00:04:24,430 when you have an even number of numbers in your data set, there the 86 00:04:24,430 --> 00:04:27,630 median is halfway in between these two numbers. 87 00:04:27,630 --> 00:04:31,070 So in this situation, the median is going to be 3 plus 4 88 00:04:31,070 --> 00:04:35,670 over 2, which is equal to 3.5. 89 00:04:35,670 --> 00:04:37,610 And if you look at this data set, that's not what our 90 00:04:37,610 --> 00:04:44,430 original problem was, but if you look at this data set 91 00:04:44,430 --> 00:04:47,770 right there, you're actually going to find that the 92 00:04:47,770 --> 00:04:52,070 arithmetic mean and the median here are the exact same thing. 93 00:04:52,070 --> 00:04:53,230 Let's calculate it. 94 00:04:53,230 --> 00:04:55,470 What's the arithmetic mean over here? 95 00:04:55,470 --> 00:05:00,240 It's going to be 2 plus 3 plus 4 plus 5, which is what? 96 00:05:00,240 --> 00:05:09,490 5 plus 9, which is equal to 14, over 4. 97 00:05:09,490 --> 00:05:10,910 And what's this equal to? 98 00:05:10,910 --> 00:05:16,740 14/4 is 3 and 2/4, or 3 and 1/2, the exact same thing. 99 00:05:16,740 --> 00:05:19,310 So for this data set, they were the same thing. 100 00:05:19,310 --> 00:05:24,020 For the original data set, our median is a little bit lower. 101 00:05:24,020 --> 00:05:28,090 It's 3, while our arithmetic mean is 4 and 1/7.
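Here is a small sketch of the median rule described above, assuming Python. The function name median is chosen for illustration and is not from the video. It sorts the numbers, takes the single middle number when the count is odd, and takes the number halfway between the two middle numbers when the count is even.

```python
def median(values):
    """Middle number of the sorted values, or the halfway point of the two middle ones."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        # Odd count: exactly one middle number.
        return ordered[middle]
    # Even count: arithmetic mean of the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([2, 3, 3, 3, 4, 4, 10]))  # 3, the fourth of seven numbers
print(median([2, 3, 4, 5]))            # 3.5, halfway between 3 and 4
```

Python's standard library already has statistics.median, which follows the same rule; writing it out by hand just makes the odd and even cases visible.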
102 00:05:28,090 --> 00:05:30,080 And I really want you to think about why that is. 103 00:05:30,080 --> 00:05:32,840 And it has a lot to do with this 10 that sits out there. 104 00:05:32,840 --> 00:05:37,190 All of these other numbers are pretty close to whichever 105 00:05:37,190 --> 00:05:39,590 average you want to pick, whether it's the arithmetic 106 00:05:39,590 --> 00:05:42,560 mean or it's the median. 107 00:05:42,560 --> 00:05:47,860 But this 10 is kind of an outlier, or it 108 00:05:47,860 --> 00:05:50,650 skews the data set. 109 00:05:50,650 --> 00:05:53,790 Maybe it's so much larger than the other numbers, that it 110 00:05:53,790 --> 00:05:57,680 makes the arithmetic mean seem larger than maybe is 111 00:05:57,680 --> 00:05:59,670 representative of this data set. 112 00:05:59,670 --> 00:06:01,500 And that's something important to think about. 113 00:06:01,500 --> 00:06:07,570 When you're finding the average for something, most 114 00:06:07,570 --> 00:06:09,740 people will immediately go to the arithmetic mean. 115 00:06:09,740 --> 00:06:13,520 But in a lot of cases, median will make a lot more sense, if 116 00:06:13,520 --> 00:06:16,970 you have these really large or really small numbers that 117 00:06:16,970 --> 00:06:18,460 could skew the data set. 118 00:06:18,460 --> 00:06:21,530 I mean, you can imagine, if this wasn't a 10-- or let's 119 00:06:21,530 --> 00:06:22,790 imagine adding another number here. 120 00:06:22,790 --> 00:06:27,810 If I added the number 1 million, if I added 1 million 121 00:06:27,810 --> 00:06:30,390 to this data set, if that was the eighth number, the 122 00:06:30,390 --> 00:06:32,260 arithmetic mean is going to be this huge number. 123 00:06:32,260 --> 00:06:36,400 It's going to be much larger than what is representative of 124 00:06:36,400 --> 00:06:38,470 most of the numbers in this data set. 125 00:06:38,470 --> 00:06:40,240 But the median is still going to work. 
126 00:06:40,240 --> 00:06:43,730 The median is still going to be 3 and a half, right? 127 00:06:43,730 --> 00:06:47,750 If you had 1 million here, it would be 1, 2, 3, 4. 128 00:06:47,750 --> 00:06:49,200 The middle two numbers would be that. 129 00:06:49,200 --> 00:06:50,580 It would be 3 and 1/2. 130 00:06:50,580 --> 00:06:53,900 So the median is less sensitive to one or two 131 00:06:53,900 --> 00:06:58,430 numbers at the extremes that otherwise would skew the mean. 132 00:06:58,430 --> 00:07:01,130 Now, the last form of average I want to talk 133 00:07:01,130 --> 00:07:02,840 about is the mode. 134 00:07:05,400 --> 00:07:08,000 It has nothing to do with ice cream. 135 00:07:08,000 --> 00:07:10,910 The mode is literally the most frequent number. 136 00:07:16,800 --> 00:07:19,880 And in this data set, it's pretty clear what the most 137 00:07:19,880 --> 00:07:21,130 frequent number is. 138 00:07:21,130 --> 00:07:25,820 I only have one 2, I have three 3's, I have two 4's, I 139 00:07:25,820 --> 00:07:28,445 have one 10, and even if I want to include the million, I only 140 00:07:28,445 --> 00:07:29,710 have one million there. 141 00:07:29,710 --> 00:07:31,990 So here, the number that occurs most 142 00:07:31,990 --> 00:07:35,360 frequently is the 3. 143 00:07:35,360 --> 00:07:38,330 So, once again, the mode seems like a pretty good measure of 144 00:07:38,330 --> 00:07:41,260 central tendency or a pretty good average 145 00:07:41,260 --> 00:07:43,300 for this data set. 146 00:07:43,300 --> 00:07:45,950 Now the mode, it's a little tricky to deal with, and you 147 00:07:45,950 --> 00:07:49,070 won't see it used that often, because it becomes a little 148 00:07:49,070 --> 00:07:52,920 ambiguous when-- you know, look at this data set: 149 00:07:52,920 --> 00:07:55,270 2, 3, 4, and 5. 150 00:07:55,270 --> 00:07:56,370 What is the mode there? 151 00:07:56,370 --> 00:07:59,200 All of these numbers are equally frequent.
152 00:07:59,200 --> 00:08:01,360 So if you have a situation like this, then you might 153 00:08:01,360 --> 00:08:03,930 just-- the mode really loses its meaning. 154 00:08:03,930 --> 00:08:06,350 It might force you anyway to take the median or the 155 00:08:06,350 --> 00:08:07,530 mean in some form. 156 00:08:07,530 --> 00:08:10,700 But if you really do have numbers that one shows up a 157 00:08:10,700 --> 00:08:14,640 lot more than the other, then the mode starts to make sense. 158 00:08:14,640 --> 00:08:18,220 So, hopefully, this has given you a pretty good overview of 159 00:08:18,220 --> 00:08:25,400 how to represent the central tendency of a data set. 160 00:08:25,400 --> 00:08:26,430 Very fancy word. 161 00:08:26,430 --> 00:08:28,180 But it's just saying, look, we're trying to represent with 162 00:08:28,180 --> 00:08:30,200 one number all of this data. 163 00:08:30,200 --> 00:08:31,810 And you might say, hey, why do we even worry about that? 164 00:08:31,810 --> 00:08:34,059 It only has seven numbers here or eight numbers here. 165 00:08:34,059 --> 00:08:36,770 But you can imagine if you had 7 million numbers or 7 billion 166 00:08:36,770 --> 00:08:39,440 numbers, and you don't want to show someone all of that data. 167 00:08:39,440 --> 00:08:42,210 You just want to give someone a sense of what those numbers 168 00:08:42,210 --> 00:08:45,345 are on average. 169 00:08:45,345 --> 00:08:49,420 And as we said, the arithmetic mean is what I see being used 170 00:08:49,420 --> 00:08:53,170 the most. But in situations where you might have numbers 171 00:08:53,170 --> 00:08:55,940 that would skew the arithmetic mean, because they're so large 172 00:08:55,940 --> 00:08:58,670 or they're so small, the median might 173 00:08:58,670 --> 00:09:00,480 make a lot of sense.
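To tie the mode and the outlier discussion together, here is a rough sketch, assuming Python. The helper name modes is hypothetical and not from the video. It returns every value that ties for most frequent, which shows why the mode loses its meaning on 2, 3, 4, 5. The last lines add 1 million as an eighth number and compare the mean with the median using the standard library's statistics module.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """All values that tie for the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


original = [2, 3, 3, 3, 4, 4, 10]
with_outlier = original + [1_000_000]

print(modes(original))       # [3]: the 3 shows up three times
print(modes([2, 3, 4, 5]))   # [2, 3, 4, 5]: every number ties, so no useful mode

# The single huge value drags the mean far away from most of the data,
# while the median only moves from 3 to 3.5.
print(mean(with_outlier))    # 125003.625
print(median(with_outlier))  # 3.5
```

Returning a list of tied values is just one way to handle the ambiguity; other conventions pick a single value or report that there is no mode.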