1 00:00:00,000 --> 00:00:00,550 2 00:00:00,550 --> 00:00:02,900 We've seen in the last several videos you start off with 3 00:00:02,900 --> 00:00:05,410 any crazy distribution. 4 00:00:05,410 --> 00:00:06,940 It doesn't have to be crazy, it could be a nice 5 00:00:06,940 --> 00:00:07,830 normal distribution. 6 00:00:07,830 --> 00:00:10,380 But to really make the point that you don't have to have 7 00:00:10,380 --> 00:00:12,170 a normal distribution I like to use crazy ones. 8 00:00:12,170 --> 00:00:14,790 So let's say you have some kind of crazy distribution that 9 00:00:14,790 --> 00:00:15,790 looks something like that. 10 00:00:15,790 --> 00:00:17,080 It could look like anything. 11 00:00:17,080 --> 00:00:19,360 So we've seen multiple times you take samples from 12 00:00:19,360 --> 00:00:20,530 this crazy distribution. 13 00:00:20,530 --> 00:00:27,890 So let's say you were to take samples of n is equal to 10. 14 00:00:27,890 --> 00:00:32,750 So we take 10 instances of this random variable, average them 15 00:00:32,750 --> 00:00:34,750 out, and then plot our average. 16 00:00:34,750 --> 00:00:36,160 We plot our average. 17 00:00:36,160 --> 00:00:37,590 We get 1 instance there. 18 00:00:37,590 --> 00:00:38,680 We keep doing that. 19 00:00:38,680 --> 00:00:39,280 We do that again. 20 00:00:39,280 --> 00:00:42,650 We take 10 samples from this random variable, average 21 00:00:42,650 --> 00:00:43,430 them, plot them again. 22 00:00:43,430 --> 00:00:47,360 You plot again and eventually you do this a gazillion times-- 23 00:00:47,360 --> 00:00:50,230 in theory an infinite number of times-- and you're going to 24 00:00:50,230 --> 00:00:53,240 approach the sampling distribution of the sample 25 00:00:53,240 --> 00:00:56,290 mean. n equal 10 is not going to be a perfect normal 26 00:00:56,290 --> 00:00:58,900 distribution but it's going to be close. 27 00:00:58,900 --> 00:01:00,610 It'd be perfect only if n was infinity. 
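The procedure just described (take n values, average them, record the average, repeat many times) can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation app used in the video: the exponential distribution with mean 5, the 5,000 trials, and the names `sample_means` and `skewed_draw` are all assumptions chosen for the example.

```python
import random
import statistics

def sample_means(draw, n, trials, rng):
    """Return `trials` sample means, each the average of n independent draws."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

def skewed_draw(rng):
    # Assumed stand-in for a "crazy" distribution: exponential with mean 5.
    return rng.expovariate(1 / 5)

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (10, 20, 100):
    means = sample_means(skewed_draw, n, 5000, rng)
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    print(f"n={n:>3}: mean of sample means = {results[n][0]:.2f}, "
          f"standard deviation = {results[n][1]:.2f}")
```

In a run like this the mean of the sample means should stay near 5 for every n, while the standard deviation of the sample means gets smaller as n grows.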
28 00:01:00,610 --> 00:01:05,680 But let's say we eventually-- all of our samples we get a lot 29 00:01:05,680 --> 00:01:08,180 of averages that are there that stacks up, that stacks up 30 00:01:08,180 --> 00:01:10,290 there, and eventually will approach something that 31 00:01:10,290 --> 00:01:13,060 looks something like that. 32 00:01:13,060 --> 00:01:16,190 And we've seen from the last video that one-- if let's say 33 00:01:16,190 --> 00:01:19,130 we were to do it again and this time let's say that n is equal 34 00:01:19,130 --> 00:01:23,200 to 20-- one, the distribution that we get is going 35 00:01:23,200 --> 00:01:24,990 to be more normal. 36 00:01:24,990 --> 00:01:27,440 And maybe in future videos we'll delve even deeper into 37 00:01:27,440 --> 00:01:29,510 things like kurtosis and skew. 38 00:01:29,510 --> 00:01:30,780 But it's going to be more normal. 39 00:01:30,780 --> 00:01:33,570 But even more important here or I guess even more obviously 40 00:01:33,570 --> 00:01:35,740 to us, we saw that in the experiment it's going to have 41 00:01:35,740 --> 00:01:37,420 a lower standard deviation. 42 00:01:37,420 --> 00:01:38,680 So they're all going to have the same mean. 43 00:01:38,680 --> 00:01:41,800 Let's say the mean here is, I don't know, let's say 44 00:01:41,800 --> 00:01:43,410 the mean here is 5. 45 00:01:43,410 --> 00:01:45,350 Then the mean here is also going to be 5. 46 00:01:45,350 --> 00:01:47,600 The mean of our sampling distribution of the sample 47 00:01:47,600 --> 00:01:48,960 mean is going to be 5. 48 00:01:48,960 --> 00:01:50,260 It doesn't matter what our n is. 49 00:01:50,260 --> 00:01:52,250 If our n is 20 it's still going to be 5. 50 00:01:52,250 --> 00:01:54,140 But our standard deviation is going to be less than 51 00:01:54,140 --> 00:01:55,670 either of these scenarios. 52 00:01:55,670 --> 00:01:57,260 And we saw that just by experimenting. 53 00:01:57,260 --> 00:01:58,015 It might look like this. 
54 00:01:58,015 --> 00:01:59,520 It's going to be more normal but it's going to have a 55 00:01:59,520 --> 00:02:01,350 tighter standard deviation. 56 00:02:01,350 --> 00:02:03,160 So maybe it'll look like that. 57 00:02:03,160 --> 00:02:06,810 And if we did it with an even larger sample size-- let me do 58 00:02:06,810 --> 00:02:09,770 that in a different color-- if we did that with an even larger 59 00:02:09,770 --> 00:02:13,430 sample size, n is equal to 100, what we're going to get is 60 00:02:13,430 --> 00:02:17,010 something that fits the normal distribution even better. 61 00:02:17,010 --> 00:02:19,820 We take a hundred instances of this random variable, 62 00:02:19,820 --> 00:02:21,230 average them, plot it. 63 00:02:21,230 --> 00:02:22,730 A hundred instances of this random variable, 64 00:02:22,730 --> 00:02:23,600 average them, plot it. 65 00:02:23,600 --> 00:02:25,230 And we just keep doing that. 66 00:02:25,230 --> 00:02:27,510 If we keep doing that, what we're going to have is 67 00:02:27,510 --> 00:02:29,950 something that's even more normal than either of these. 68 00:02:29,950 --> 00:02:32,130 So it's going to be a much closer fit to a true 69 00:02:32,130 --> 00:02:33,480 normal distribution. 70 00:02:33,480 --> 00:02:35,890 But even more obvious to the human, it's going 71 00:02:35,890 --> 00:02:37,650 to be even tighter. 72 00:02:37,650 --> 00:02:40,640 So it's going to be a very low standard deviation. 73 00:02:40,640 --> 00:02:41,840 It's going to look something like that. 74 00:02:41,840 --> 00:02:46,760 And I'll show you on the simulation app in the next or 75 00:02:46,760 --> 00:02:48,710 probably later in this video. 76 00:02:48,710 --> 00:02:49,680 So two things happen. 77 00:02:49,680 --> 00:02:51,830 As you increase your sample size for every time you 78 00:02:51,830 --> 00:02:53,560 do the average, two things are happening. 
79 00:02:53,560 --> 00:02:57,150 You're becoming more normal and your standard deviation 80 00:02:57,150 --> 00:02:57,970 is getting smaller. 81 00:02:57,970 --> 00:03:00,810 So the question might arise: is there a formula? 82 00:03:00,810 --> 00:03:04,680 So if I know the standard deviation-- so this is my 83 00:03:04,680 --> 00:03:07,990 standard deviation of just my original probability density 84 00:03:07,990 --> 00:03:10,910 function, this is the mean of my original probability 85 00:03:10,910 --> 00:03:11,800 density function. 86 00:03:11,800 --> 00:03:15,320 So if I know the standard deviation and I know n-- n is 87 00:03:15,320 --> 00:03:17,330 going to change depending on how many samples I'm taking 88 00:03:17,330 --> 00:03:21,450 every time I do a sample mean-- if I know my standard 89 00:03:21,450 --> 00:03:23,870 deviation, or maybe if I know my variance, right? 90 00:03:23,870 --> 00:03:26,300 The variance is just the standard deviation squared. 91 00:03:26,300 --> 00:03:27,950 If you don't remember that you might want to 92 00:03:27,950 --> 00:03:29,640 review those videos. 93 00:03:29,640 --> 00:03:34,450 But if I know the variance of my original distribution and if 94 00:03:34,450 --> 00:03:38,860 I know what my n is-- how many samples I'm going to take every 95 00:03:38,860 --> 00:03:41,620 time before I average them in order to plot one thing in my 96 00:03:41,620 --> 00:03:46,910 sampling distribution of my sample mean-- is there a way to 97 00:03:46,910 --> 00:03:50,700 predict what the mean of these distributions is? 98 00:03:50,700 --> 00:03:52,870 And so-- I'm sorry, the standard deviation of 99 00:03:52,870 --> 00:03:54,050 these distributions. 100 00:03:54,050 --> 00:03:56,300 And so you don't get confused between that and that, 101 00:03:56,300 --> 00:03:57,190 let me say the variance. 102 00:03:57,190 --> 00:03:58,560 If you know the variance you can figure out the 103 00:03:58,560 --> 00:03:59,570 standard deviation. 
104 00:03:59,570 --> 00:04:01,220 One is just the square root of the other. 105 00:04:01,220 --> 00:04:06,170 So this is the variance of our original distribution. 106 00:04:06,170 --> 00:04:09,300 Now to show that this is the variance of our sampling 107 00:04:09,300 --> 00:04:11,620 distribution of our sample mean we'll write it right here. 108 00:04:11,620 --> 00:04:16,430 This is the variance of our sample mean. 109 00:04:16,430 --> 00:04:19,470 Remember the sample-- our true mean is this. 110 00:04:19,470 --> 00:04:22,100 The Greek letter Mu is our true mean. 111 00:04:22,100 --> 00:04:27,080 This is equal to the mean, while an x with a line over 112 00:04:27,080 --> 00:04:28,350 it means sample mean. 113 00:04:28,350 --> 00:04:31,300 114 00:04:31,300 --> 00:04:34,110 So here what we're saying is this is the variance of our 115 00:04:34,110 --> 00:04:36,880 sample mean, that this is going to be a true distribution. 116 00:04:36,880 --> 00:04:38,320 This isn't an estimate. 117 00:04:38,320 --> 00:04:42,590 There's some-- you know, if we magically knew the distribution-- 118 00:04:42,590 --> 00:04:44,920 there's some true variance here. 119 00:04:44,920 --> 00:04:48,590 And of course the mean-- so this has a mean-- this right 120 00:04:48,590 --> 00:04:51,420 here, we can just get our notation right, this is the 121 00:04:51,420 --> 00:04:54,750 mean of the sampling distribution of the 122 00:04:54,750 --> 00:04:55,710 sample mean. 123 00:04:55,710 --> 00:04:57,870 So this is the mean of our means. 124 00:04:57,870 --> 00:04:59,830 It just happens to be the same thing. 125 00:04:59,830 --> 00:05:03,180 This is the mean of our sample means. 126 00:05:03,180 --> 00:05:05,410 It's going to be the same thing as that, especially if we do 127 00:05:05,410 --> 00:05:07,400 the trial over and over again. 
128 00:05:07,400 --> 00:05:09,490 But anyway, the point of this video, is there any way to 129 00:05:09,490 --> 00:05:13,600 figure out this variance given the variance of the original 130 00:05:13,600 --> 00:05:15,520 distribution and your n? 131 00:05:15,520 --> 00:05:16,720 And it turns out there is. 132 00:05:16,720 --> 00:05:18,230 And I'm not going to do a proof here. 133 00:05:18,230 --> 00:05:19,910 I really want to give you the intuition of it. 134 00:05:19,910 --> 00:05:22,830 I think you already do have the sense that every trial you 135 00:05:22,830 --> 00:05:26,010 take-- if you take a hundred, you're much more likely when 136 00:05:26,010 --> 00:05:29,140 you average those out, to get close to the true mean than if 137 00:05:29,140 --> 00:05:31,410 you took an n of 2 or an n of 5. 138 00:05:31,410 --> 00:05:34,240 You're just very unlikely to be far away, right, if you took 139 00:05:34,240 --> 00:05:36,640 100 trials as opposed to taking 5. 140 00:05:36,640 --> 00:05:38,590 So I think you know that in some way it should be 141 00:05:38,590 --> 00:05:40,740 inversely proportional to n. 142 00:05:40,740 --> 00:05:43,680 The larger your n the smaller the standard deviation. 143 00:05:43,680 --> 00:05:45,740 And actually it turns out it's about as simple as possible. 144 00:05:45,740 --> 00:05:47,740 It's one of those magical things about mathematics. 145 00:05:47,740 --> 00:05:50,110 And I'll prove it to you one day. 146 00:05:50,110 --> 00:05:51,690 I want to give you working knowledge first. 
147 00:05:51,690 --> 00:05:54,390 In statistics, I'm always struggling whether I should be 148 00:05:54,390 --> 00:05:57,410 formal in giving you rigorous proofs but I've kind of come to 149 00:05:57,410 --> 00:05:59,220 the conclusion that it's more important to get the working 150 00:05:59,220 --> 00:06:02,010 knowledge first in statistics and then later, once you've 151 00:06:02,010 --> 00:06:05,010 gotten all of that down, we can get into the real deep math 152 00:06:05,010 --> 00:06:06,270 of it and prove it to you. 153 00:06:06,270 --> 00:06:09,460 But I think experimental proofs are kind of all you need for 154 00:06:09,460 --> 00:06:11,110 right now, using those simulations to show that 155 00:06:11,110 --> 00:06:12,050 they're really true. 156 00:06:12,050 --> 00:06:14,860 So it turns out that the variance of your sampling 157 00:06:14,860 --> 00:06:18,230 distribution of your sample mean is equal to the 158 00:06:18,230 --> 00:06:20,880 variance of your original distribution-- that guy 159 00:06:20,880 --> 00:06:23,310 right there-- divided by n. 160 00:06:23,310 --> 00:06:24,300 That's all it is. 161 00:06:24,300 --> 00:06:29,940 So if this up here has a variance of-- let's say this up 162 00:06:29,940 --> 00:06:33,640 here has a variance of 20-- I'm just making that number up-- 163 00:06:33,640 --> 00:06:36,330 then let's say your n is 20. 164 00:06:36,330 --> 00:06:38,850 Then the variance of your sampling distribution of your 165 00:06:38,850 --> 00:06:41,360 sample mean for an n of 20, well you're just going to take 166 00:06:41,360 --> 00:06:44,340 that, the variance up here-- your variance is 20-- 167 00:06:44,340 --> 00:06:45,840 divided by your n, 20. 168 00:06:45,840 --> 00:06:49,780 So here your variance is going to be 20 divided by 169 00:06:49,780 --> 00:06:51,410 20 which is equal to 1. 170 00:06:51,410 --> 00:06:53,230 This is the variance of your original probability 171 00:06:53,230 --> 00:06:55,860 distribution and this is your n. 
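Written out in symbols, the rule and the made-up example look like the following. The notation is an assumption for this note (the transcript only names Mu and x-bar): sigma squared stands for the variance of the original distribution and sigma-sub-x-bar squared for the variance of the sampling distribution of the sample mean, with the n draws in each sample taken independently.

```latex
\sigma_{\bar{x}}^{2} = \frac{\sigma^{2}}{n}
\qquad\text{for example}\qquad
\sigma^{2} = 20,\quad n = 20
\;\Longrightarrow\;
\sigma_{\bar{x}}^{2} = \frac{20}{20} = 1
```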
172 00:06:55,860 --> 00:06:57,460 What's your standard deviation going to be? 173 00:06:57,460 --> 00:06:59,540 What's going to be the square root of that, right? 174 00:06:59,540 --> 00:07:00,870 Standard deviation is going to be square root of 1. 175 00:07:00,870 --> 00:07:02,410 Well that's also going to be 1. 176 00:07:02,410 --> 00:07:04,090 So we could also write this. 177 00:07:04,090 --> 00:07:07,040 We could take the square root of both sides of this and say 178 00:07:07,040 --> 00:07:10,830 the standard deviation of the sampling distribution 179 00:07:10,830 --> 00:07:13,800 standard-- the standard deviation of the sampling 180 00:07:13,800 --> 00:07:17,180 distribution of the sample mean is often called the standard 181 00:07:17,180 --> 00:07:18,550 deviation of the mean. 182 00:07:18,550 --> 00:07:20,470 And it's also called-- I'm going to write this down-- the 183 00:07:20,470 --> 00:07:22,050 standard error of the mean. 184 00:07:22,050 --> 00:07:27,810 185 00:07:27,810 --> 00:07:30,230 All of these things that I just mentioned, they all just mean 186 00:07:30,230 --> 00:07:33,010 the standard deviation of the sampling distribution 187 00:07:33,010 --> 00:07:33,860 of the sample mean. 188 00:07:33,860 --> 00:07:36,770 That's why this is confusing because you use the word mean 189 00:07:36,770 --> 00:07:38,080 and sample over and over again. 190 00:07:38,080 --> 00:07:39,800 And if it confuses you let me know. 191 00:07:39,800 --> 00:07:42,030 I'll do another video or pause and repeat or whatever. 
192 00:07:42,030 --> 00:07:44,000 But if we just take the square root of both sides, the 193 00:07:44,000 --> 00:07:46,770 standard error of the mean or the standard deviation of the 194 00:07:46,770 --> 00:07:50,300 sampling distribution of the sample mean is equal to the 195 00:07:50,300 --> 00:07:54,310 standard deviation of your original function-- of your 196 00:07:54,310 --> 00:07:56,580 original probability density function-- which could be very 197 00:07:56,580 --> 00:07:59,850 non-normal, divided by the square root of n. 198 00:07:59,850 --> 00:08:03,110 I just took the square root of both sides of this equation. 199 00:08:03,110 --> 00:08:07,090 I personally like to remember this: that the variance is just 200 00:08:07,090 --> 00:08:08,820 inversely proportional to n. 201 00:08:08,820 --> 00:08:10,300 And then I like to go back to this. 202 00:08:10,300 --> 00:08:11,840 Because this is very simple in my head. 203 00:08:11,840 --> 00:08:13,790 You just take the variance, divide it by n. 204 00:08:13,790 --> 00:08:15,790 Oh and if I want the standard deviation, I just take the 205 00:08:15,790 --> 00:08:18,430 square root of both sides and I get this formula. 206 00:08:18,430 --> 00:08:22,460 So here the standard deviation-- when n is 20-- the 207 00:08:22,460 --> 00:08:25,890 standard deviation of the sampling distribution of the 208 00:08:25,890 --> 00:08:27,210 sample mean is going to be 1. 209 00:08:27,210 --> 00:08:31,530 Here, let's find our variance when 210 00:08:31,530 --> 00:08:32,500 n is equal to 100. 211 00:08:32,500 --> 00:08:35,230 So our variance of the sampling distribution of the sample mean 212 00:08:35,230 --> 00:08:37,580 or our variance of the mean-- of the sample mean, we 213 00:08:37,580 --> 00:08:40,340 could say-- is going to be equal to 20-- this guy's 214 00:08:40,340 --> 00:08:43,290 variance-- divided by n. 215 00:08:43,290 --> 00:08:46,950 So it equals-- n is 100-- so it equals 1/5. 
216 00:08:46,950 --> 00:08:50,730 Now this guy's standard deviation or the standard 217 00:08:50,730 --> 00:08:54,270 deviation of the sampling distribution of the sample mean 218 00:08:54,270 --> 00:08:55,760 or the standard error of the mean is going to be the 219 00:08:55,760 --> 00:08:56,480 square root of that. 220 00:08:56,480 --> 00:08:58,780 So 1 over the square root of 5. 221 00:08:58,780 --> 00:09:03,390 And so this guy's will be a little bit under 1/2 the 222 00:09:03,390 --> 00:09:05,060 standard deviation while this guy had a standard 223 00:09:05,060 --> 00:09:05,930 deviation of 1. 224 00:09:05,930 --> 00:09:07,450 So you see, it's definitely thinner. 225 00:09:07,450 --> 00:09:08,140 Now I know what you're saying. 226 00:09:08,140 --> 00:09:09,590 Well, Sal, you just gave a formula, I don't 227 00:09:09,590 --> 00:09:11,140 necessarily believe you. 228 00:09:11,140 --> 00:09:13,610 Well let's see if we can prove it to ourselves 229 00:09:13,610 --> 00:09:15,960 using the simulation. 230 00:09:15,960 --> 00:09:20,240 So just for fun let me make a-- I'll just mess with this 231 00:09:20,240 --> 00:09:21,670 distribution a little bit. 232 00:09:21,670 --> 00:09:23,440 So that's my new distribution. 233 00:09:23,440 --> 00:09:25,370 And let me take an n of-- let me take two things that's easy 234 00:09:25,370 --> 00:09:27,550 to take the square root of because we're looking at 235 00:09:27,550 --> 00:09:28,330 standard deviations. 236 00:09:28,330 --> 00:09:33,830 So we take an n of 16 and an n of 25. 237 00:09:33,830 --> 00:09:35,470 Let's do 10,000 trials. 238 00:09:35,470 --> 00:09:37,490 So in this case every one of the trials we're going to take 239 00:09:37,490 --> 00:09:40,320 16 samples from here, average them, plot it here, and 240 00:09:40,320 --> 00:09:41,600 then do a frequency plot. 241 00:09:41,600 --> 00:09:45,020 Here we're going to do 25 at a time and then average them. 
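The experiment set up here (10,000 trials, averaging 16 values at a time and then 25 at a time) can be imitated with a short script. This is a hedged sketch rather than the app from the video: the lumpy distribution below, given by `values` and `weights`, is invented for illustration, so its standard deviation will not match the number read off the screen.

```python
import math
import random
import statistics

# Hypothetical lumpy distribution (values and weights are made up for illustration).
values = [1, 2, 3, 15, 28, 30, 31]
weights = [5, 3, 1, 1, 2, 4, 6]

# Exact mean and standard deviation of that original distribution.
total = sum(weights)
mu = sum(v * w for v, w in zip(values, weights)) / total
sigma = math.sqrt(sum(w * (v - mu) ** 2 for v, w in zip(values, weights)) / total)

rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 10_000
comparison = {}
for n in (16, 25):
    # Each trial: draw n values, average them, record the average.
    means = [statistics.fmean(rng.choices(values, weights=weights, k=n))
             for _ in range(trials)]
    observed = statistics.pstdev(means)   # spread of the recorded averages
    predicted = sigma / math.sqrt(n)      # standard deviation divided by sqrt(n)
    comparison[n] = (observed, predicted)
    print(f"n={n}: observed {observed:.3f}, predicted {predicted:.3f}")
```

The two columns should land close to each other, and the n = 25 row should show the smaller spread.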
242 00:09:45,020 --> 00:09:46,830 I'll do it once animated just to remember. 243 00:09:46,830 --> 00:09:50,620 So I'm taking 16 samples, plot it there. 244 00:09:50,620 --> 00:09:53,500 I take 16 samples as described by this probability density 245 00:09:53,500 --> 00:09:56,960 function-- or 25 now, plot it down here. 246 00:09:56,960 --> 00:10:03,420 Now if I do that 10,000 times, what do I get? 247 00:10:03,420 --> 00:10:06,710 All right, so here, just visually you can tell just when 248 00:10:06,710 --> 00:10:08,810 n was larger, the standard deviation here is smaller. 249 00:10:08,810 --> 00:10:10,020 This is more squeezed together. 250 00:10:10,020 --> 00:10:12,490 But actually let's write this stuff down. 251 00:10:12,490 --> 00:10:14,420 Let's see if I can remember it here. 252 00:10:14,420 --> 00:10:17,370 So in this random distribution I made my standard 253 00:10:17,370 --> 00:10:19,160 deviation was 9.3. 254 00:10:19,160 --> 00:10:20,460 I'm going to remember these. 255 00:10:20,460 --> 00:10:24,470 Our standard deviation for the original thing was 9.3. 256 00:10:24,470 --> 00:10:27,980 And so standard deviation here was 2.3 and the standard 257 00:10:27,980 --> 00:10:29,590 deviation here is 1.87. 258 00:10:29,590 --> 00:10:33,290 Let's see if it conforms to our formula. 259 00:10:33,290 --> 00:10:35,240 So I'm going to take this off screen for a second and I'm 260 00:10:35,240 --> 00:10:38,900 going to go back and do some mathematics. 261 00:10:38,900 --> 00:10:41,030 So I have this on my other screen so I can 262 00:10:41,030 --> 00:10:42,750 remember those numbers. 263 00:10:42,750 --> 00:10:47,220 So in the trial we just did, my wacky distribution had a 264 00:10:47,220 --> 00:10:52,650 standard deviation of 9.3. 
265 00:10:52,650 --> 00:10:57,680 When n is equal to-- let me do this in another color-- when n 266 00:10:57,680 --> 00:11:01,900 was equal to 16, just doing the experiment, doing a bunch of 267 00:11:01,900 --> 00:11:04,460 trials and averaging and doing all the things, we got the 268 00:11:04,460 --> 00:11:08,070 standard deviation of the sampling distribution of the 269 00:11:08,070 --> 00:11:10,380 sample mean or the standard error of the mean, we 270 00:11:10,380 --> 00:11:15,570 experimentally determined it to be 2.33. 271 00:11:15,570 --> 00:11:21,500 And then when n is equal to 25 we got the standard error of 272 00:11:21,500 --> 00:11:24,900 the mean being equal to 1.87. 273 00:11:24,900 --> 00:11:28,330 Let's see if it conforms to our formulas. 274 00:11:28,330 --> 00:11:32,680 So we know that the variance or we could almost say the 275 00:11:32,680 --> 00:11:36,010 variance of the mean or the standard error-- the variance 276 00:11:36,010 --> 00:11:39,230 of the sampling distribution of the sample mean is equal to the 277 00:11:39,230 --> 00:11:42,050 variance of our original distribution divided by n, take 278 00:11:42,050 --> 00:11:45,490 the square roots of both sides, and then you get the standard 279 00:11:45,490 --> 00:11:48,770 error of the mean is equal to the standard deviation of your 280 00:11:48,770 --> 00:11:51,800 original distribution divided by the square root of n. 281 00:11:51,800 --> 00:11:54,350 So let's see if this works out for these two things. 282 00:11:54,350 --> 00:11:58,670 So if I were to take 9.3-- so let me do this case. 283 00:11:58,670 --> 00:12:03,870 So 9.3 divided by the square root of 16, right? 284 00:12:03,870 --> 00:12:05,150 N is 16. 285 00:12:05,150 --> 00:12:07,050 So divided by the square root of 16, which is 286 00:12:07,050 --> 00:12:09,300 4, what do I get? 287 00:12:09,300 --> 00:12:11,566 So 9.3 divided by 4. 288 00:12:11,566 --> 00:12:14,840 Let me get a little calculator out here. 
289 00:12:14,840 --> 00:12:15,550 Let's see. 290 00:12:15,550 --> 00:12:18,720 We have-- let me clear it out-- we want to divide 291 00:12:18,720 --> 00:12:21,000 9.3 by 4. 292 00:12:21,000 --> 00:12:24,930 9.3 divided by our square root of n. n was 16. 293 00:12:24,930 --> 00:12:32,060 So divided by 4 is equal to 2.32. 294 00:12:32,060 --> 00:12:41,540 So this is equal to 2.32 which is pretty darn close to 2.33. 295 00:12:41,540 --> 00:12:43,160 This was after 10,000 trials. 296 00:12:43,160 --> 00:12:45,780 Maybe right after this I'll see what happens if we did 20,000 297 00:12:45,780 --> 00:12:48,960 or 30,000 trials where we take samples of 16 and average them. 298 00:12:48,960 --> 00:12:50,340 Now let's look at this. 299 00:12:50,340 --> 00:12:55,400 Here we would take 9.3-- so let me draw a little line here. 300 00:12:55,400 --> 00:12:57,300 Let me scroll over, that might be better. 301 00:12:57,300 --> 00:13:00,340 So we take our standard deviation of our 302 00:13:00,340 --> 00:13:01,790 original distribution. 303 00:13:01,790 --> 00:13:05,280 So just that formula that we've derived right here would tell 304 00:13:05,280 --> 00:13:09,180 us that our standard error should be equal to the standard 305 00:13:09,180 --> 00:13:13,350 deviation of our original distribution, 9.3, divided by 306 00:13:13,350 --> 00:13:15,400 the square root of n, divided by the square root 307 00:13:15,400 --> 00:13:16,380 of 25, right? 308 00:13:16,380 --> 00:13:18,410 4 was just the square root of 16. 309 00:13:18,410 --> 00:13:21,860 So this is equal to 9.3 divided by 5. 310 00:13:21,860 --> 00:13:23,820 And let's see if it's 1.87. 311 00:13:23,820 --> 00:13:28,320 So let me get my calculator back. 312 00:13:28,320 --> 00:13:36,410 So if I take 9.3 divided by 5, what do I get? 313 00:13:36,410 --> 00:13:41,720 1.86 which is very close to 1.87. 314 00:13:41,720 --> 00:13:49,480 So we got in this case 1.86. 
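The calculator steps can be reproduced in a few lines. The numbers 9.3, 2.33 and 1.87 are the ones quoted in the video; the variable names are just labels chosen for this sketch.

```python
import math

sigma = 9.3                      # standard deviation of the original distribution
observed = {16: 2.33, 25: 1.87}  # standard errors read off the simulation

# Standard error predicted by the formula: sigma divided by the square root of n.
predicted = {n: sigma / math.sqrt(n) for n in observed}

for n in observed:
    gap = abs(observed[n] - predicted[n])
    print(f"n={n}: predicted {predicted[n]:.3f}, observed {observed[n]:.2f}, gap {gap:.3f}")
```

Both gaps come out at about a hundredth or less, which is the "pretty darn close" agreement described above.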
315 00:13:49,480 --> 00:13:53,150 So as you can see what we got experimentally was almost 316 00:13:53,150 --> 00:13:56,010 exactly-- and this was after 10,000 trials-- of what 317 00:13:56,010 --> 00:13:56,600 you would expect. 318 00:13:56,600 --> 00:13:58,690 Let's do another 10,000. 319 00:13:58,690 --> 00:14:00,220 So you've got another 10,000 trials. 320 00:14:00,220 --> 00:14:01,550 Well we're still in the ballpark. 321 00:14:01,550 --> 00:14:04,920 We're not going to-- maybe I can't hope to get the exact 322 00:14:04,920 --> 00:14:07,350 number rounded or whatever. 323 00:14:07,350 --> 00:14:10,690 But as you can see, hopefully that'll be pretty satisfying to 324 00:14:10,690 --> 00:14:14,410 you, that the variance of the sampling distribution of the 325 00:14:14,410 --> 00:14:21,500 sample mean is just going to be equal to the variance of your 326 00:14:21,500 --> 00:14:23,790 original distribution, no matter how wacky that 327 00:14:23,790 --> 00:14:27,370 distribution might be, divided by your sample size-- by the 328 00:14:27,370 --> 00:14:33,530 number of samples you take for every basket that you average I 329 00:14:33,530 --> 00:14:35,220 guess is the best way to think about it. 330 00:14:35,220 --> 00:14:37,710 You know, sometimes this can get confusing because you are 331 00:14:37,710 --> 00:14:40,275 taking samples of averages based on samples. 332 00:14:40,275 --> 00:14:43,300 So when someone says sample size, you're like, is sample 333 00:14:43,300 --> 00:14:46,510 size the number of times I took averages or the number 334 00:14:46,510 --> 00:14:48,780 of things I'm taking averages of each time? 335 00:14:48,780 --> 00:14:51,160 And you know, it doesn't hurt to clarify that. 336 00:14:51,160 --> 00:14:52,790 Normally when they talk about sample size 337 00:14:52,790 --> 00:14:54,220 they're talking about n. 
338 00:14:54,220 --> 00:14:57,570 And, at least in my head, when I think of the trials as you 339 00:14:57,570 --> 00:15:00,700 take a sample size of 16, you average it, that's the one 340 00:15:00,700 --> 00:15:01,990 trial, and then you plot it. 341 00:15:01,990 --> 00:15:03,680 Then you do it again and you do another trial. 342 00:15:03,680 --> 00:15:04,980 And you do it over and over again. 343 00:15:04,980 --> 00:15:06,930 But anyway, hopefully this makes everything clear and then 344 00:15:06,930 --> 00:15:11,340 you now also understand how to get to the standard 345 00:15:11,340 --> 00:15:13,750 error of the mean. 346 00:15:13,750 --> 00:15:14,848
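The difference between the sample size n and the number of trials can also be checked numerically. This is a rough sketch under assumed choices: a uniform distribution on [0, 1), trial counts of 1,000 and 20,000, and a helper name `standard_error_estimate` that is made up for this example.

```python
import math
import random
import statistics

def standard_error_estimate(n, trials, rng):
    """Standard deviation of `trials` sample means, each averaging n uniform draws on [0, 1)."""
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    return statistics.pstdev(means)

rng = random.Random(2)      # fixed seed so the run is repeatable
sigma = math.sqrt(1 / 12)   # standard deviation of a uniform distribution on [0, 1)

# Same n, more trials: the estimate gets steadier but the spread does not shrink.
few_trials = standard_error_estimate(16, 1_000, rng)
many_trials = standard_error_estimate(16, 20_000, rng)

# Same number of trials, larger n: the spread itself shrinks.
bigger_n = standard_error_estimate(64, 20_000, rng)

print(f"n=16, 1,000 trials:  {few_trials:.4f}")
print(f"n=16, 20,000 trials: {many_trials:.4f} (formula: {sigma / math.sqrt(16):.4f})")
print(f"n=64, 20,000 trials: {bigger_n:.4f} (formula: {sigma / math.sqrt(64):.4f})")
```

Only n appears in the formula. Running more trials just gives a cleaner picture of the same sampling distribution.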