[Script Info]
Title:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:02.98,0:00:09.40,Default,,0000,0000,0000,,Coins and dice provide a nice simple model\Nof how to calculate probabilities, but
Dialogue: 0,0:00:09.99,0:00:14.54,Default,,0000,0000,0000,,everyday life is a lot more complicated\Nand it's not taken up with gambling.
Dialogue: 0,0:00:14.54,0:00:17.45,Default,,0000,0000,0000,,At least, I hope your life is not taken up\Nwith gambling.
Dialogue: 0,0:00:18.40,0:00:22.23,Default,,0000,0000,0000,,So in order to make probabilities more\Napplicable to everyday life,
Dialogue: 0,0:00:22.23,0:00:25.99,Default,,0000,0000,0000,,we need to look at slightly more\Ncomplicated methods.
Dialogue: 0,0:00:26.82,0:00:30.13,Default,,0000,0000,0000,,Now, because these methods \Nare more complicated,
Dialogue: 0,0:00:30.13,0:00:34.23,Default,,0000,0000,0000,,this lecture is going to be \Nan honors lecture: it's optional.
Dialogue: 0,0:00:34.23,0:00:35.93,Default,,0000,0000,0000,,It will not be on the quiz,
Dialogue: 0,0:00:35.93,0:00:37.74,Default,,0000,0000,0000,,so don't get worried about that.
Dialogue: 0,0:00:38.49,0:00:41.77,Default,,0000,0000,0000,,But it is still useful, and it's fascinating,
Dialogue: 0,0:00:41.77,0:00:44.43,Default,,0000,0000,0000,,and it'll help you avoid some mistakes
Dialogue: 0,0:00:44.43,0:00:47.56,Default,,0000,0000,0000,,that a lot of people make\Nand that create a lot of problems.
Dialogue: 0,0:00:48.21,0:00:52.62,Default,,0000,0000,0000,,And so I hope you'll stick with it and listen to this lecture.
Dialogue: 0,0:00:52.62,0:00:56.58,Default,,0000,0000,0000,,And there will be exercises \Nto help you figure out
Dialogue: 0,0:00:56.58,0:00:58.50,Default,,0000,0000,0000,,whether you understand \Nthe material or not.
Dialogue: 0,0:00:58.50,0:01:02.65,Default,,0000,0000,0000,,But don't get too worried, because \Nit's not going to be on the quiz.
Dialogue: 0,0:01:05.46,0:01:08.08,Default,,0000,0000,0000,,The real problem \Nthat we'll be facing in this lecture
Dialogue: 0,0:01:08.74,0:01:10.77,Default,,0000,0000,0000,,is the problem of tests.
Dialogue: 0,0:01:10.77,0:01:13.76,Default,,0000,0000,0000,,We use tests all the time: \Nwe use tests to figure out
Dialogue: 0,0:01:13.76,0:01:17.06,Default,,0000,0000,0000,,whether you have \Na certain medical condition.
Dialogue: 0,0:01:17.06,0:01:22.20,Default,,0000,0000,0000,,We use tests to predict the weather \Nor to predict people's future behavior.
Dialogue: 0,0:01:22.20,0:01:25.09,Default,,0000,0000,0000,,We have certain indicators \Nof how they're going to act,
Dialogue: 0,0:01:25.74,0:01:28.30,Default,,0000,0000,0000,,either commit a crime \Nor not commit a crime,
Dialogue: 0,0:01:28.30,0:01:30.34,Default,,0000,0000,0000,,but also whether they're going to pass,
Dialogue: 0,0:01:30.34,0:01:32.40,Default,,0000,0000,0000,,do well in school or fail.
Dialogue: 0,0:01:33.49,0:01:37.66,Default,,0000,0000,0000,,We always use these tests \Nwhen we don't know for certain,
Dialogue: 0,0:01:38.23,0:01:41.34,Default,,0000,0000,0000,,but we want some kind of evidence, \Nor some kind of indicator.
Dialogue: 0,0:01:41.90,0:01:45.14,Default,,0000,0000,0000,,The problem is none of these tests\Nare perfect.
Dialogue: 0,0:01:45.45,0:01:48.50,Default,,0000,0000,0000,,They always contain errors \Nof various sorts.
Dialogue: 0,0:01:48.90,0:01:51.91,Default,,0000,0000,0000,,And what we're going to have to do is to\Nsee how to take
Dialogue: 0,0:01:51.91,0:01:57.96,Default,,0000,0000,0000,,those errors of different sorts \Nand build them together into a method
Dialogue: 0,0:01:57.96,0:02:03.44,Default,,0000,0000,0000,,and then a formula for calculating \Nhow reliable the method is
Dialogue: 0,0:02:03.44,0:02:06.26,Default,,0000,0000,0000,,for detecting the thing that we want to detect.
Dialogue: 0,0:02:07.26,0:02:10.38,Default,,0000,0000,0000,,This problem is a lot like the problem \Nwe faced earlier
Dialogue: 0,0:02:10.38,0:02:14.98,Default,,0000,0000,0000,,when we were talking about applying \Ngeneralizations to particular cases
Dialogue: 0,0:02:14.98,0:02:18.26,Default,,0000,0000,0000,,because here we're going to be applying \Nprobabilities to particular cases.
Dialogue: 0,0:02:18.94,0:02:22.00,Default,,0000,0000,0000,,So it'll seem familiar to you in certain parts,
Dialogue: 0,0:02:22.00,0:02:25.28,Default,,0000,0000,0000,,but you'll see that this case \Nis a little trickier.
Dialogue: 0,0:02:25.89,0:02:28.16,Default,,0000,0000,0000,,The best examples occur in medicine.
Dialogue: 0,0:02:28.51,0:02:32.46,Default,,0000,0000,0000,,So just imagine that you go to your doctor\Nfor a regular checkup.
Dialogue: 0,0:02:32.82,0:02:34.39,Default,,0000,0000,0000,,You don't have any special symptoms,
Dialogue: 0,0:02:35.14,0:02:37.58,Default,,0000,0000,0000,,but he decides to do \Na few screening tests.
Dialogue: 0,0:02:38.92,0:02:44.38,Default,,0000,0000,0000,,And unfortunately, and very worryingly, \Nit turns out that you test positive
Dialogue: 0,0:02:45.04,0:02:51.25,Default,,0000,0000,0000,,on one test for a particular form of cancer,\Na certain kind of medical condition.
Dialogue: 0,0:02:52.47,0:02:56.10,Default,,0000,0000,0000,,Well, what that means is that you might\Nhave cancer.
Dialogue: 0,0:02:56.87,0:02:58.25,Default,,0000,0000,0000,,Might, great.
Dialogue: 0,0:02:58.25,0:03:00.38,Default,,0000,0000,0000,,You want to know whether you do have\Ncancer.
Dialogue: 0,0:03:01.01,0:03:04.09,Default,,0000,0000,0000,,But of course, finding out for sure\Nwhether or not you have cancer
Dialogue: 0,0:03:04.11,0:03:06.29,Default,,0000,0000,0000,,is going to take further tests.
Dialogue: 0,0:03:06.29,0:03:10.67,Default,,0000,0000,0000,,And those tests might be expensive, \Nthey might be dangerous,
Dialogue: 0,0:03:10.67,0:03:13.06,Default,,0000,0000,0000,,they're going to be invasive \Nin various ways.
Dialogue: 0,0:03:13.65,0:03:16.52,Default,,0000,0000,0000,,So you really want to know what's the\Nprobability,
Dialogue: 0,0:03:17.21,0:03:20.51,Default,,0000,0000,0000,,given that you've tested positive \Non this one test,
Dialogue: 0,0:03:21.14,0:03:22.69,Default,,0000,0000,0000,,that you really have cancer.
Dialogue: 0,0:03:23.93,0:03:27.95,Default,,0000,0000,0000,,Now clearly that probability is going \Nto depend on a number of facts
Dialogue: 0,0:03:27.95,0:03:31.24,Default,,0000,0000,0000,,about this type of cancer, \Nabout the type of test and so on.
Dialogue: 0,0:03:31.52,0:03:33.52,Default,,0000,0000,0000,,And I am not a doctor.
Dialogue: 0,0:03:33.94,0:03:36.27,Default,,0000,0000,0000,,I am not giving you medical advice.
Dialogue: 0,0:03:36.59,0:03:41.21,Default,,0000,0000,0000,,If you test positive on a test, \Ngo talk to your doctor,
Dialogue: 0,0:03:41.21,0:03:44.04,Default,,0000,0000,0000,,don't trust me, because I'm just \Nmaking up numbers here.
Dialogue: 0,0:03:44.35,0:03:48.27,Default,,0000,0000,0000,,But let's do make up a few numbers \Nand figure out
Dialogue: 0,0:03:48.27,0:03:53.29,Default,,0000,0000,0000,,what the likelihood is of having cancer,\Ngiven that you tested positive.
Dialogue: 0,0:03:53.29,0:03:58.89,Default,,0000,0000,0000,,So let's imagine that the base rate \Nof this particular type of cancer
Dialogue: 0,0:03:58.89,0:04:06.24,Default,,0000,0000,0000,,in the population is 0.3%, that is, \N3 out of 1,000, or 0.003.
Dialogue: 0,0:04:06.24,0:04:08.43,Default,,0000,0000,0000,,And they say that's the base rate,
Dialogue: 0,0:04:08.43,0:04:12.58,Default,,0000,0000,0000,,or it's sometimes called the prevalence\Nof the condition in the population.
Dialogue: 0,0:04:12.97,0:04:16.82,Default,,0000,0000,0000,,That's simply to say that out of 1,000\Npeople chosen randomly
Dialogue: 0,0:04:16.82,0:04:19.95,Default,,0000,0000,0000,,in the population, you'd get about 3 \Nthat have this condition.
Dialogue: 0,0:04:21.64,0:04:24.88,Default,,0000,0000,0000,,It's just a percentage \Nof the general population.
Dialogue: 0,0:04:25.91,0:04:28.41,Default,,0000,0000,0000,,So that's the condition, what about the\Ntest?
Dialogue: 0,0:04:28.79,0:04:32.27,Default,,0000,0000,0000,,Well the first thing we want to know \Nis the sensitivity of the test.
Dialogue: 0,0:04:32.93,0:04:37.46,Default,,0000,0000,0000,,The sensitivity of the test we're going to\Nassume is 0.99.
Dialogue: 0,0:04:38.62,0:04:46.04,Default,,0000,0000,0000,,And what that means is that out of \N100 people who have this condition,
Dialogue: 0,0:04:46.50,0:04:49.01,Default,,0000,0000,0000,,99 of them will test positive.
Dialogue: 0,0:04:49.01,0:04:53.57,Default,,0000,0000,0000,,So this test is pretty good at figuring\Nout,
Dialogue: 0,0:04:53.57,0:04:57.40,Default,,0000,0000,0000,,from among the people \Nwho have the condition, which ones do.
Dialogue: 0,0:04:57.40,0:05:03.02,Default,,0000,0000,0000,,99 of those 100 people who have the\Ncondition will test positive.
Dialogue: 0,0:05:03.21,0:05:08.00,Default,,0000,0000,0000,,The other feature is specificity, and what\Nthat means is
Dialogue: 0,0:05:08.00,0:05:13.41,Default,,0000,0000,0000,,the percentage of the people who don't\Nhave the condition who will test negative.
Dialogue: 0,0:05:14.39,0:05:17.50,Default,,0000,0000,0000,,The point here is you're not going \Nto get a positive result
Dialogue: 0,0:05:17.50,0:05:20.15,Default,,0000,0000,0000,,for people who don't have the condition,\Nright?
Dialogue: 0,0:05:20.18,0:05:23.66,Default,,0000,0000,0000,,Because you want it to be specific \Nto this particular condition
Dialogue: 0,0:05:23.66,0:05:27.60,Default,,0000,0000,0000,,and not get a bunch of positives for \Npeople who have other types of conditions
Dialogue: 0,0:05:27.60,0:05:29.48,Default,,0000,0000,0000,,or no medical condition at all.
Dialogue: 0,0:05:30.24,0:05:32.44,Default,,0000,0000,0000,,So the specificity we're going to assume,
Dialogue: 0,0:05:32.44,0:05:37.11,Default,,0000,0000,0000,,in this particular case we're talking about, is also 99%.
Dialogue: 0,0:05:39.23,0:05:46.49,Default,,0000,0000,0000,,Now, what we want to know is the probability\Nthat you have a cancer, a condition,
Dialogue: 0,0:05:47.12,0:05:50.63,Default,,0000,0000,0000,,given that you tested positive on the test;
Dialogue: 0,0:05:50.63,0:05:55.46,Default,,0000,0000,0000,,but notice that the sensitivity \Ntells you the probability
Dialogue: 0,0:05:55.46,0:05:59.04,Default,,0000,0000,0000,,that you will test positive \Ngiven that you have the condition.
Dialogue: 0,0:05:59.34,0:06:01.76,Default,,0000,0000,0000,,We want to know the opposite of that,
Dialogue: 0,0:06:01.76,0:06:04.74,Default,,0000,0000,0000,,the probability \Nthat you have the condition
Dialogue: 0,0:06:04.74,0:06:07.32,Default,,0000,0000,0000,,given that you tested positive.
Dialogue: 0,0:06:08.18,0:06:10.84,Default,,0000,0000,0000,,And that's what we have to do \Na little calculation to figure out.
Dialogue: 0,0:06:10.84,0:06:15.32,Default,,0000,0000,0000,,But before we do that calculation, \NI want you to think about these figures
Dialogue: 0,0:06:15.32,0:06:18.31,Default,,0000,0000,0000,,that I've given you:\Nthe prevalence in the population,
Dialogue: 0,0:06:18.31,0:06:22.06,Default,,0000,0000,0000,,the sensitivity of the test,\Nthe specificity of the test,
Dialogue: 0,0:06:22.06,0:06:23.49,Default,,0000,0000,0000,,and just make a guess.
Dialogue: 0,0:06:23.90,0:06:26.51,Default,,0000,0000,0000,,Just start out by writing down \Non a piece of paper
Dialogue: 0,0:06:26.51,0:06:32.43,Default,,0000,0000,0000,,what you think the probability is \Nthat you would have the cancer
Dialogue: 0,0:06:32.43,0:06:35.85,Default,,0000,0000,0000,,given that you tested positive \Non the test.
Dialogue: 0,0:06:36.99,0:06:40.23,Default,,0000,0000,0000,,Take a minute and think about it \Nand write it down.
Dialogue: 0,0:06:41.02,0:06:45.08,Default,,0000,0000,0000,,But we don't want to just guess \Nabout medical conditions,
Dialogue: 0,0:06:45.08,0:06:48.34,Default,,0000,0000,0000,,about probabilities that really matter \Nas much as this will do.
Dialogue: 0,0:06:48.91,0:06:53.10,Default,,0000,0000,0000,,Instead, we want to calculate what the\Nprobability really is.
Dialogue: 0,0:06:53.68,0:06:58.69,Default,,0000,0000,0000,,So, let's go through it carefully and\Nshow you how to use
Dialogue: 0,0:06:58.69,0:07:04.10,Default,,0000,0000,0000,,what I'll call the box method in order \Nto calculate the real likelihood
Dialogue: 0,0:07:04.10,0:07:08.58,Default,,0000,0000,0000,,that you have the condition, given that \Nyou got a positive test result.
Dialogue: 0,0:07:09.32,0:07:15.55,Default,,0000,0000,0000,,What we need to do is to divide the\Npopulation into four different groups:
Dialogue: 0,0:07:15.98,0:07:19.86,Default,,0000,0000,0000,,the group that has the condition \Nand tested positive,
Dialogue: 0,0:07:19.86,0:07:22.60,Default,,0000,0000,0000,,the group that has the condition \Nand tested negative,
Dialogue: 0,0:07:22.90,0:07:25.65,Default,,0000,0000,0000,,the group that doesn't have the condition\Nand tested positive,
Dialogue: 0,0:07:25.65,0:07:28.66,Default,,0000,0000,0000,,and the group that doesn't have\Nthe condition and tested negative.
Dialogue: 0,0:07:29.49,0:07:34.15,Default,,0000,0000,0000,,And this chart will show you a nice, \Nsimple way of organizing
Dialogue: 0,0:07:34.15,0:07:35.55,Default,,0000,0000,0000,,all of that information.
Dialogue: 0,0:07:35.87,0:07:43.59,Default,,0000,0000,0000,,Because this row, the top row, tells \Nyou all the people who tested positive.
Dialogue: 0,0:07:44.48,0:07:49.46,Default,,0000,0000,0000,,The bottom row tells you the people \Nwho tested negative.
Dialogue: 0,0:07:50.03,0:07:55.88,Default,,0000,0000,0000,,Then, the left column gives you the \Npeople who do have the medical condition,
Dialogue: 0,0:07:55.88,0:07:57.74,Default,,0000,0000,0000,,in this case, some kind of cancer.
Dialogue: 0,0:07:58.52,0:08:02.89,Default,,0000,0000,0000,,And the right column tells you the people\Nwho do not have that condition.
Dialogue: 0,0:08:03.69,0:08:07.95,Default,,0000,0000,0000,,Now what we need to do is to start\Nfilling it out with numbers.
Dialogue: 0,0:08:08.94,0:08:12.74,Default,,0000,0000,0000,,Now the first thing we need to specify is\Nthe population.
Dialogue: 0,0:08:13.24,0:08:16.40,Default,,0000,0000,0000,,In this case we want to start with a big\Nenough population
Dialogue: 0,0:08:16.40,0:08:19.38,Default,,0000,0000,0000,,that we're not going to have a lot \Nof fractions in the other boxes.
Dialogue: 0,0:08:19.38,0:08:22.64,Default,,0000,0000,0000,,So, let's just imagine that the population\Nis 100,000.
Dialogue: 0,0:08:23.00,0:08:25.49,Default,,0000,0000,0000,,Make it a million or 10 million,\Nit doesn't matter
Dialogue: 0,0:08:25.49,0:08:28.75,Default,,0000,0000,0000,,because we're going to be interested \Nin the ratios with the different groups.
Dialogue: 0,0:08:30.53,0:08:33.50,Default,,0000,0000,0000,,We can use that 100,000 to fill out the\Nother boxes,
Dialogue: 0,0:08:33.50,0:08:36.38,Default,,0000,0000,0000,,if we know the prevalence, or the\Nbase rate,
Dialogue: 0,0:08:37.10,0:08:40.24,Default,,0000,0000,0000,,because the base rate tells you what\Npercentage of that 100,000
Dialogue: 0,0:08:40.24,0:08:43.70,Default,,0000,0000,0000,,actually do have the condition and\Ndon't have the condition.
Dialogue: 0,0:08:44.70,0:08:47.66,Default,,0000,0000,0000,,We imagined -- remember we're just \Nmaking up numbers here --
Dialogue: 0,0:08:47.66,0:08:52.18,Default,,0000,0000,0000,,but we imagined that the prevalence \Nof this condition is 0.3%.
Dialogue: 0,0:08:52.18,0:08:56.20,Default,,0000,0000,0000,,And that means out of 100,000 people,\Nthere will be 300
Dialogue: 0,0:08:56.20,0:08:59.51,Default,,0000,0000,0000,,who do have the medical condition.
Dialogue: 0,0:09:01.01,0:09:04.10,Default,,0000,0000,0000,,Well, if there are 300 who have it and\Nthere are 100,000 total,
Dialogue: 0,0:09:04.10,0:09:07.53,Default,,0000,0000,0000,,we can figure out how many don't have the\Nmedical condition by just subtracting.
Dialogue: 0,0:09:07.72,0:09:11.88,Default,,0000,0000,0000,,Which means 99,700 \Ndo not have the medical condition.
Dialogue: 0,0:09:12.65,0:09:13.59,Default,,0000,0000,0000,,Okay?
Dialogue: 0,0:09:13.59,0:09:17.82,Default,,0000,0000,0000,,Now, we've divided the population into our\Ntwo columns:
Dialogue: 0,0:09:17.82,0:09:20.72,Default,,0000,0000,0000,,the ones that do and the ones that don't\Nhave the medical condition.
Dialogue: 0,0:09:21.42,0:09:25.73,Default,,0000,0000,0000,,The next step is to figure out how many\Nare going to test positive
Dialogue: 0,0:09:25.73,0:09:30.15,Default,,0000,0000,0000,,and how many are going to test negative\Nout of each of these groups.
Dialogue: 0,0:09:30.63,0:09:33.73,Default,,0000,0000,0000,,For that, we first need the sensitivity.
Dialogue: 0,0:09:34.20,0:09:38.50,Default,,0000,0000,0000,,The sensitivity tells us the percentage \Nof the cases that have the condition
Dialogue: 0,0:09:38.50,0:09:40.07,Default,,0000,0000,0000,,who will test positive.
Dialogue: 0,0:09:41.37,0:09:45.26,Default,,0000,0000,0000,,So the people who have the condition are\Nthe 300.
Dialogue: 0,0:09:45.69,0:09:50.15,Default,,0000,0000,0000,,The ones who test positive are going \Nto go up in this area
Dialogue: 0,0:09:50.88,0:09:56.84,Default,,0000,0000,0000,,and we know from the sensitivity being 0.99 or 99%
Dialogue: 0,0:09:57.11,0:10:03.38,Default,,0000,0000,0000,,that the number in that area should be 99%\Nof 300, or 297.
Dialogue: 0,0:10:04.86,0:10:08.35,Default,,0000,0000,0000,,And of course, if that's the number \Nthat test positive,
Dialogue: 0,0:10:08.69,0:10:12.22,Default,,0000,0000,0000,,then the remainder \Nare going to test negative
Dialogue: 0,0:10:12.22,0:10:13.87,Default,,0000,0000,0000,,and that means that we'll have three.
Dialogue: 0,0:10:14.41,0:10:18.68,Default,,0000,0000,0000,,Which shouldn't surprise you because if\N99% of the cases that have it
Dialogue: 0,0:10:18.68,0:10:23.66,Default,,0000,0000,0000,,test positive, then 1% will test negative,\Nand 1% of 300 is 3.
Dialogue: 0,0:10:24.33,0:10:26.07,Default,,0000,0000,0000,,Good: so we got the first column done.
Dialogue: 0,0:10:26.76,0:10:31.19,Default,,0000,0000,0000,,Now, the next question is going to be the\Nspecificity.
Dialogue: 0,0:10:31.46,0:10:37.30,Default,,0000,0000,0000,,We can use the specificity to figure out\Nwhat goes in that next column.
Dialogue: 0,0:10:38.09,0:10:43.62,Default,,0000,0000,0000,,If the specificity is 99% and we know
Dialogue: 0,0:10:43.62,0:10:50.80,Default,,0000,0000,0000,,that 99,700 people do not have the\Ncondition out of our sample of 100,000,
Dialogue: 0,0:10:51.76,0:10:58.94,Default,,0000,0000,0000,,well, that means that 99% of 99,700 are\Ngoing to test negative
Dialogue: 0,0:10:58.94,0:11:02.94,Default,,0000,0000,0000,,because the specificity is the \Npercentage of cases without the condition
Dialogue: 0,0:11:02.94,0:11:04.60,Default,,0000,0000,0000,,that test negative.
Dialogue: 0,0:11:04.60,0:11:10.86,Default,,0000,0000,0000,,And that means that we'll have \N98,703 among the people
Dialogue: 0,0:11:10.86,0:11:13.90,Default,,0000,0000,0000,,who do not have the condition \Nwho test negative.
Dialogue: 0,0:11:14.82,0:11:18.16,Default,,0000,0000,0000,,How many are going to test positive?\NThe rest of them.
Dialogue: 0,0:11:18.47,0:11:27.45,Default,,0000,0000,0000,,So 99,700 minus 98,703 \Nis going to be 997.
Dialogue: 0,0:11:27.98,0:11:35.10,Default,,0000,0000,0000,,And of course, that shouldn't be surprising \Nagain, because 1% of 99,700 is 997.
Dialogue: 0,0:11:36.38,0:11:38.61,Default,,0000,0000,0000,,We only got two boxes left to fill out.
Dialogue: 0,0:11:39.19,0:11:40.50,Default,,0000,0000,0000,,How do you fill out those?
Dialogue: 0,0:11:41.02,0:11:46.99,Default,,0000,0000,0000,,Well, this box in the upper right, \Nis the total number of people
Dialogue: 0,0:11:46.99,0:11:50.55,Default,,0000,0000,0000,,in this population of 100,000 \Nwho test positive.
Dialogue: 0,0:11:51.24,0:11:56.44,Default,,0000,0000,0000,,And so, we can get that by adding the ones\Nthat do have the condition and test positive
Dialogue: 0,0:11:56.44,0:11:59.75,Default,,0000,0000,0000,,and the ones that don't have \Nthe condition and test positive.
Dialogue: 0,0:11:59.75,0:12:05.59,Default,,0000,0000,0000,,Just add them together, and you get 1,294.
Dialogue: 0,0:12:06.27,0:12:13.44,Default,,0000,0000,0000,,And you do the same on the next row, \Nbecause that blank is the area
Dialogue: 0,0:12:13.44,0:12:16.19,Default,,0000,0000,0000,,that has all the people \Nwho test negative,
Dialogue: 0,0:12:16.19,0:12:20.17,Default,,0000,0000,0000,,and 3 people who have the condition \Ntest negative,
Dialogue: 0,0:12:20.62,0:12:25.33,Default,,0000,0000,0000,,98,703 people who do not have the\Ncondition test negative,
Dialogue: 0,0:12:25.63,0:12:30.49,Default,,0000,0000,0000,,so the total is going to be 98,706.
Dialogue: 0,0:12:30.50,0:12:35.06,Default,,0000,0000,0000,,And we can check to make sure that\Nwe got it right,
Dialogue: 0,0:12:35.08,0:12:43.58,Default,,0000,0000,0000,,by just adding them together:\N1,294 plus 98,706 is equal to 100,000.
Dialogue: 0,0:12:44.94,0:12:46.53,Default,,0000,0000,0000,,Phew, we got it right.
Dialogue: 0,0:12:46.53,0:12:52.23,Default,,0000,0000,0000,,Okay, so now we've divided the population \Ninto those people who have the condition,
Dialogue: 0,0:12:52.93,0:12:54.77,Default,,0000,0000,0000,,those people who don't have the\Ncondition,
Dialogue: 0,0:12:55.07,0:12:59.35,Default,,0000,0000,0000,,and we know how many of each \Nof those groups test positive,
Dialogue: 0,0:12:59.35,0:13:03.19,Default,,0000,0000,0000,,and how many of each of those groups \Ntest negative.
Dialogue: 0,0:13:04.00,0:13:08.16,Default,,0000,0000,0000,,The real question is \Nwhat's the probability
Dialogue: 0,0:13:08.16,0:13:12.27,Default,,0000,0000,0000,,that I have cancer or the medical \Ncondition, given that I tested positive?
Dialogue: 0,0:13:12.49,0:13:13.94,Default,,0000,0000,0000,,How do we figure that out?
Dialogue: 0,0:13:14.37,0:13:19.96,Default,,0000,0000,0000,,Well, the total number \Nof positive tests was 1,294
Dialogue: 0,0:13:20.82,0:13:27.33,Default,,0000,0000,0000,,and the people who tested positive\Nwho really had the condition was 297.
Dialogue: 0,0:13:27.77,0:13:34.13,Default,,0000,0000,0000,,So it looks like the probability of\Nactually having the condition,
Dialogue: 0,0:13:34.75,0:13:43.67,Default,,0000,0000,0000,,given that you tested positive, \Nis 297 out of 1,294 or 0.23.
Dialogue: 0,0:13:44.09,0:13:47.24,Default,,0000,0000,0000,,That's 23%, less than one in four.
Dialogue: 0,0:13:47.79,0:13:49.09,Default,,0000,0000,0000,,Is that what you guessed?
Dialogue: 0,0:13:49.61,0:13:54.92,Default,,0000,0000,0000,,Most people, including most doctors, when\Nthey hear that the test is
Dialogue: 0,0:13:54.92,0:14:01.22,Default,,0000,0000,0000,,99% sensitive and 99% specific, will\Nguess a lot higher than one in four.
Dialogue: 0,0:14:01.87,0:14:03.10,Default,,0000,0000,0000,,>> Oh my gosh!
Dialogue: 0,0:14:03.10,0:14:06.44,Default,,0000,0000,0000,,I'm a doctor, and I never would have\Nthought that!
Dialogue: 0,0:14:07.04,0:14:07.92,Default,,0000,0000,0000,,>> Now, don't worry:
Dialogue: 0,0:14:07.92,0:14:11.02,Default,,0000,0000,0000,,she's not a physician;\Nshe's a metaphysician.
Dialogue: 0,0:14:12.39,0:14:16.18,Default,,0000,0000,0000,,>> But in this case, the probability \Nreally is just one in four
Dialogue: 0,0:14:16.18,0:14:17.91,Default,,0000,0000,0000,,that you had that medical condition.
Dialogue: 0,0:14:17.92,0:14:19.12,Default,,0000,0000,0000,,Now how did that happen?
Dialogue: 0,0:14:19.63,0:14:23.26,Default,,0000,0000,0000,,The reason was that the prevalence or the\Nbase rate was so low
Dialogue: 0,0:14:23.84,0:14:27.59,Default,,0000,0000,0000,,that even a small rate \Nof false positives,
Dialogue: 0,0:14:28.21,0:14:33.10,Default,,0000,0000,0000,,given the massive numbers of people who\Ndon't have the condition,
Dialogue: 0,0:14:33.64,0:14:37.15,Default,,0000,0000,0000,,will mean that there are more false positives,\N3 times as many,
Dialogue: 0,0:14:37.64,0:14:39.33,Default,,0000,0000,0000,,as there are true positives.
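The box method walked through above can be sketched in a few lines of Python. This is only an illustration using the lecture's made-up numbers; the function name `box_method` and the dictionary keys are hypothetical labels chosen here, and counts are rounded to whole people as an assumption.

```python
def box_method(population, base_rate, sensitivity, specificity):
    """Fill in the boxes of the chart for a screening test.

    Counts are rounded to whole people (an assumption for this sketch).
    """
    have = round(population * base_rate)       # left column total
    have_not = population - have               # right column total
    true_pos = round(have * sensitivity)       # has condition, tests positive
    false_neg = have - true_pos                # has condition, tests negative
    true_neg = round(have_not * specificity)   # no condition, tests negative
    false_pos = have_not - true_neg            # no condition, tests positive
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "total_pos": true_pos + false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "total_neg": false_neg + true_neg,
    }


# The lecture's numbers: 100,000 people, 0.3% base rate, 99% and 99%.
boxes = box_method(100_000, 0.003, 0.99, 0.99)

# Probability of having the condition, given a positive test.
posterior = boxes["true_pos"] / boxes["total_pos"]
print(boxes)
print(round(posterior, 2))
```

Changing the population to a million or ten million leaves the final ratio essentially unchanged, which is the point made in the lecture about only the ratios mattering.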
Dialogue: 0,0:14:39.64,0:14:42.74,Default,,0000,0000,0000,,And that's why the probability \Nis just one in four,
Dialogue: 0,0:14:42.74,0:14:44.64,Default,,0000,0000,0000,,actually a little less than one in four,
Dialogue: 0,0:14:44.79,0:14:48.60,Default,,0000,0000,0000,,that you have the medical condition even\Nwhen you tested positive.
Dialogue: 0,0:14:49.15,0:14:53.33,Default,,0000,0000,0000,,I want to add a quick caveat here, in\Norder to avoid misinterpretation.
Dialogue: 0,0:14:53.78,0:14:58.63,Default,,0000,0000,0000,,Because the point here is that, if you \Nhave a screening test for a condition
Dialogue: 0,0:14:58.63,0:15:03.60,Default,,0000,0000,0000,,with a very low base rate or prevalence,\Nand you don't have any symptoms
Dialogue: 0,0:15:03.60,0:15:09.78,Default,,0000,0000,0000,,that put you in a special category, \Nthen, you need to get another test
Dialogue: 0,0:15:10.41,0:15:14.87,Default,,0000,0000,0000,,before you jump to any conclusions\Nabout having the medical condition.
Dialogue: 0,0:15:15.89,0:15:19.89,Default,,0000,0000,0000,,Because, if you have that other test, \Nthen the fact that you tested positive
Dialogue: 0,0:15:19.89,0:15:22.54,Default,,0000,0000,0000,,on the first test puts you in a smaller class,
Dialogue: 0,0:15:22.54,0:15:25.00,Default,,0000,0000,0000,,with a much higher base rate, or prevalence.
Dialogue: 0,0:15:25.00,0:15:27.87,Default,,0000,0000,0000,,And now, the probability's going to go up.
Dialogue: 0,0:15:28.89,0:15:32.45,Default,,0000,0000,0000,,Most doctors know that, and that's why,\Nafter the first test,
Dialogue: 0,0:15:32.45,0:15:35.34,Default,,0000,0000,0000,,they don't jump to conclusions, and they\Norder another test,
Dialogue: 0,0:15:35.34,0:15:39.38,Default,,0000,0000,0000,,but many patients don't realize that and\Nthey get extremely worried
Dialogue: 0,0:15:39.38,0:15:42.26,Default,,0000,0000,0000,,after a single test even when they don't\Nhave any symptoms.
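The claim that a second test raises the probability can be illustrated with a small sketch. It assumes, purely for illustration, that the second test is independent of the first and has the same 99% sensitivity and specificity; the lecture does not say this, and the function name `posterior` is a hypothetical label.

```python
def posterior(prior, sensitivity, specificity):
    """Probability of the condition given a positive result, for a given prior."""
    true_pos = prior * sensitivity               # share with condition who test positive
    false_pos = (1 - prior) * (1 - specificity)  # share without it who test positive
    return true_pos / (true_pos + false_pos)


# First test: the prior is the base rate in the general population.
after_first = posterior(0.003, 0.99, 0.99)

# Second test (assumed independent, same accuracy): the prior is now the
# much higher rate among people who already tested positive once.
after_second = posterior(after_first, 0.99, 0.99)

print(round(after_first, 2))
print(after_second > after_first)
```

Real follow-up tests are rarely independent of the screening test, so the size of the jump here should not be read as a medical figure.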
Dialogue: 0,0:15:43.50,0:15:45.91,Default,,0000,0000,0000,,So that's the mistake \Nthat we're trying to avoid here
Dialogue: 0,0:15:45.91,0:15:53.17,Default,,0000,0000,0000,,and that's surprising, but it actually\Napplies to many different areas of life.
Dialogue: 0,0:15:54.85,0:15:59.26,Default,,0000,0000,0000,,It applies, for example, to medical tests\Nwith all kinds of other diseases.
Dialogue: 0,0:15:59.91,0:16:04.42,Default,,0000,0000,0000,,Not just cancer or colon cancer, but\Npretty much every disease
Dialogue: 0,0:16:04.42,0:16:06.33,Default,,0000,0000,0000,,where the prevalence is extremely low.
Dialogue: 0,0:16:07.40,0:16:09.96,Default,,0000,0000,0000,,It applies also to drug tests.
Dialogue: 0,0:16:10.67,0:16:12.52,Default,,0000,0000,0000,,If somebody gets a positive drug test,
Dialogue: 0,0:16:12.52,0:16:14.55,Default,,0000,0000,0000,,does that mean they really \Nwere using drugs?
Dialogue: 0,0:16:14.98,0:16:20.32,Default,,0000,0000,0000,,Well, if it's a population where the \Nbase rate or prevalence of drug use
Dialogue: 0,0:16:20.32,0:16:23.13,Default,,0000,0000,0000,,is quite low, then it might not.
Dialogue: 0,0:16:24.44,0:16:28.29,Default,,0000,0000,0000,,Of course, if you assume that the \Nprevalence or base rate is quite high,
Dialogue: 0,0:16:28.29,0:16:30.44,Default,,0000,0000,0000,,then you're going to believe \Nthat drug test.
Dialogue: 0,0:16:30.84,0:16:35.10,Default,,0000,0000,0000,,But you need to know the facts about what \Nthe prevalence or base rate really is
Dialogue: 0,0:16:35.10,0:16:39.15,Default,,0000,0000,0000,,in order to calculate \Naccurately the probability
Dialogue: 0,0:16:39.15,0:16:41.93,Default,,0000,0000,0000,,that this person really was using drugs.
Dialogue: 0,0:16:43.40,0:16:47.84,Default,,0000,0000,0000,,Same applies to evidence in legal trials:\Ntake eyewitnesses for example,
Dialogue: 0,0:16:47.84,0:16:55.11,Default,,0000,0000,0000,,it's very tricky, someone's trying to use \Ntheir eyes as a test for what they see.
Dialogue: 0,0:16:55.11,0:16:57.74,Default,,0000,0000,0000,,They might identify a friend, \Nor they might just say
Dialogue: 0,0:16:58.23,0:17:02.31,Default,,0000,0000,0000,,that car that did the hit-and-run accident\Nwas a Porsche.
Dialogue: 0,0:17:03.42,0:17:07.72,Default,,0000,0000,0000,,Well, how good are they at identifying\NPorsches?
Dialogue: 0,0:17:09.58,0:17:12.93,Default,,0000,0000,0000,,If they get it right most of the time, \Nbut not always,
Dialogue: 0,0:17:12.93,0:17:17.55,Default,,0000,0000,0000,,and sometimes they don't get it right \Nwhen it is a Porsche,
Dialogue: 0,0:17:17.55,0:17:22.03,Default,,0000,0000,0000,,then we've got the sensitivity and \Nspecificity of what they identify.
Dialogue: 0,0:17:22.84,0:17:26.01,Default,,0000,0000,0000,,And we can use that to calculate \Nhow likely it is
Dialogue: 0,0:17:26.45,0:17:30.35,Default,,0000,0000,0000,,that their evidence in the trial \Nreally is reliable or not.
Dialogue: 0,0:17:31.09,0:17:34.15,Default,,0000,0000,0000,,Another example is the prediction of\Nfuture behavior.
Dialogue: 0,0:17:34.95,0:17:37.12,Default,,0000,0000,0000,,We might have some kind of marker
Dialogue: 0,0:17:37.74,0:17:40.55,Default,,0000,0000,0000,,that a certain group of people \Nwith that marker
Dialogue: 0,0:17:41.03,0:17:43.60,Default,,0000,0000,0000,,have a certain likelihood of\Ncommitting crimes.
Dialogue: 0,0:17:44.15,0:17:49.00,Default,,0000,0000,0000,,But if crimes are very rare \Nin that community and every other,
Dialogue: 0,0:17:49.00,0:17:55.12,Default,,0000,0000,0000,,then a test which has a pretty good \Nsensitivity and specificity
Dialogue: 0,0:17:55.12,0:17:59.84,Default,,0000,0000,0000,,still might not be good enough when \Nwe're talking about something like crime
Dialogue: 0,0:18:00.16,0:18:04.80,Default,,0000,0000,0000,,that's actually very rare and has \Na very low prevalence or base rate
Dialogue: 0,0:18:04.80,0:18:06.23,Default,,0000,0000,0000,,in most communities.
Dialogue: 0,0:18:06.67,0:18:09.00,Default,,0000,0000,0000,,And the same applies \Nto failing out of school.
Dialogue: 0,0:18:10.70,0:18:14.07,Default,,0000,0000,0000,,Are SAT scores or GRE scores \Ngoing to be
Dialogue: 0,0:18:14.07,0:18:16.96,Default,,0000,0000,0000,,good predictors of \Nwho's going to fail out of school?
Dialogue: 0,0:18:18.32,0:18:21.45,Default,,0000,0000,0000,,Well, if very few people fail out of\Nschool,
Dialogue: 0,0:18:21.45,0:18:24.56,Default,,0000,0000,0000,,so that the prevalence and base rate \Nis very low,
Dialogue: 0,0:18:24.56,0:18:27.71,Default,,0000,0000,0000,,then, even if they're \Npretty sensitive and specific,
Dialogue: 0,0:18:27.71,0:18:29.43,Default,,0000,0000,0000,,they might not be good predictors.
Dialogue: 0,0:18:29.94,0:18:35.30,Default,,0000,0000,0000,,So this same type of problem arises \Nin a lot of different areas.
Dialogue: 0,0:18:35.84,0:18:38.33,Default,,0000,0000,0000,,And I'm not going to go through \Nmore examples right now,
Dialogue: 0,0:18:38.33,0:18:41.87,Default,,0000,0000,0000,,but we'll have plenty of examples in the \Nexercises at the end of this chapter.
Dialogue: 0,0:18:43.69,0:18:45.97,Default,,0000,0000,0000,,I want to end, though,\Nby saying a few things
Dialogue: 0,0:18:45.97,0:18:49.26,Default,,0000,0000,0000,,that are a bit more technical \Nabout this method.
Dialogue: 0,0:18:49.78,0:18:52.49,Default,,0000,0000,0000,,First, there's a lot of terminology to\Nlearn,
Dialogue: 0,0:18:53.04,0:18:57.67,Default,,0000,0000,0000,,because when you read about using \Nthis method in other areas,
Dialogue: 0,0:18:57.67,0:19:01.21,Default,,0000,0000,0000,,for other types of topics, \Nthen you'll run into these terms,
Dialogue: 0,0:19:01.21,0:19:02.69,Default,,0000,0000,0000,,and it's a good idea to know them.
Dialogue: 0,0:19:04.04,0:19:13.86,Default,,0000,0000,0000,,So first, the cases where the person does \Nhave the condition and also tests positive
Dialogue: 0,0:19:13.86,0:19:17.06,Default,,0000,0000,0000,,are called hits, or true positives.
Dialogue: 0,0:19:17.06,0:19:18.74,Default,,0000,0000,0000,,Different people use different terms.
Dialogue: 0,0:19:21.54,0:19:27.62,Default,,0000,0000,0000,,The cases where the person tests positive,\Nbut they don't have the condition,
Dialogue: 0,0:19:27.62,0:19:31.39,Default,,0000,0000,0000,,are called false positives \Nor false alarms.
Dialogue: 0,0:19:33.83,0:19:40.52,Default,,0000,0000,0000,,The cases where a person really does have\Nthe condition, but tests negative
Dialogue: 0,0:19:41.12,0:19:44.29,Default,,0000,0000,0000,,are called misses or false negatives.
Dialogue: 0,0:19:47.36,0:19:51.06,Default,,0000,0000,0000,,And the cases where the person \Ndoes not have the condition
Dialogue: 0,0:19:51.59,0:19:54.83,Default,,0000,0000,0000,,and the test comes out negative \Nare called true negatives,
Dialogue: 0,0:19:54.83,0:19:57.68,Default,,0000,0000,0000,,because they're negative and it's true\Nthat they don't have the condition.
Dialogue: 0,0:19:59.62,0:20:03.38,Default,,0000,0000,0000,,If we put together the false negatives\Nand the true negatives,
Dialogue: 0,0:20:04.25,0:20:06.43,Default,,0000,0000,0000,,we get the total set of negatives.
Dialogue: 0,0:20:07.35,0:20:11.67,Default,,0000,0000,0000,,And if we put together the true positives \Nand the false positives
Dialogue: 0,0:20:12.11,0:20:14.68,Default,,0000,0000,0000,,we get the total set of positives.
Dialogue: 0,0:20:16.41,0:20:18.97,Default,,0000,0000,0000,,And of course, we have the general\Npopulation.
Dialogue: 0,0:20:19.35,0:20:23.32,Default,,0000,0000,0000,,Within that population, \Na percentage that have the condition
Dialogue: 0,0:20:23.32,0:20:25.53,Default,,0000,0000,0000,,and a percentage \Nthat don't have the condition.
Dialogue: 0,0:20:27.02,0:20:29.41,Default,,0000,0000,0000,,Now, what's the base rate? Dialogue: 0,0:20:29.75,0:20:35.25,Default,,0000,0000,0000,,The base rate in this population is simply\Nthe set that have the condition, Dialogue: 0,0:20:35.51,0:20:41.57,Default,,0000,0000,0000,,divided by the total population,\Nwhich is Box 7 divided by Box 9. Dialogue: 0,0:20:41.94,0:20:44.82,Default,,0000,0000,0000,,If we use e for the evidence Dialogue: 0,0:20:45.09,0:20:50.19,Default,,0000,0000,0000,,and h for the hypothesis being true that\Nthe condition really does exist, Dialogue: 0,0:20:50.19,0:20:52.85,Default,,0000,0000,0000,,then that's the probability of h, Dialogue: 0,0:20:54.38,0:21:02.07,Default,,0000,0000,0000,,and the sensitivity is going to be \Nthe total number of true positives Dialogue: 0,0:21:02.07,0:21:06.23,Default,,0000,0000,0000,,divided by the total number of people \Nwith the condition, Dialogue: 0,0:21:06.23,0:21:11.73,Default,,0000,0000,0000,,because it's the percentage of people with\Nthe condition who test positive. Dialogue: 0,0:21:12.76,0:21:16.10,Default,,0000,0000,0000,,OK? So that's the probability of e given h, Dialogue: 0,0:21:16.10,0:21:20.48,Default,,0000,0000,0000,,and it's Box 1 divided by Box 7. Dialogue: 0,0:21:21.04,0:21:26.87,Default,,0000,0000,0000,,The specificity, in contrast, is the ratio\Nof the number of true negatives Dialogue: 0,0:21:26.87,0:21:31.71,Default,,0000,0000,0000,,to the total number of people \Nwho do not have the condition, that is, Dialogue: 0,0:21:31.71,0:21:35.09,Default,,0000,0000,0000,,the probability of not e, that is, Dialogue: 0,0:21:35.09,0:21:38.95,Default,,0000,0000,0000,,not having the evidence \Nof a positive test result, Dialogue: 0,0:21:38.95,0:21:43.18,Default,,0000,0000,0000,,given not h, \Ngiven that you're in the second column, Dialogue: 0,0:21:43.18,0:21:47.17,Default,,0000,0000,0000,,where the hypothesis is false, \Nbecause you don't have the condition. 
Dialogue: 0,0:21:47.17,0:21:52.06,Default,,0000,0000,0000,,So that's Box 5 divided by Box 8. Dialogue: 0,0:21:53.73,0:21:55.81,Default,,0000,0000,0000,,That's the specificity. Dialogue: 0,0:21:56.00,0:22:00.35,Default,,0000,0000,0000,,So we can define all of these \Nin terms of each other. Dialogue: 0,0:22:00.76,0:22:06.82,Default,,0000,0000,0000,,The hits divided by the total with the\Ncondition is going to be the sensitivity. Dialogue: 0,0:22:06.82,0:22:11.13,Default,,0000,0000,0000,,And you can use this terminology to guide\Nyour way through this box. Dialogue: 0,0:22:11.14,0:22:15.05,Default,,0000,0000,0000,,And the big question is again going to be\Nwhat's the solution? Dialogue: 0,0:22:15.05,0:22:21.66,Default,,0000,0000,0000,,What's the probability of the hypothesis\Nof having the condition, given the evidence, Dialogue: 0,0:22:21.66,0:22:27.71,Default,,0000,0000,0000,,that is, a positive test result? \NThat's going to be Box 1 divided by Box 3. Dialogue: 0,0:22:28.54,0:22:31.89,Default,,0000,0000,0000,,And as we saw in the case that we just\Nwent through, Dialogue: 0,0:22:31.89,0:22:37.07,Default,,0000,0000,0000,,that gives you the probability of having\Nthe medical condition, or colon cancer, Dialogue: 0,0:22:37.07,0:22:39.23,Default,,0000,0000,0000,,given a positive test result. Dialogue: 0,0:22:39.48,0:22:44.13,Default,,0000,0000,0000,,That's called the posterior probability,\Nor in symbols, Dialogue: 0,0:22:44.13,0:22:47.55,Default,,0000,0000,0000,,the probability of the hypothesis, \Ngiven the evidence. Dialogue: 0,0:22:48.21,0:22:53.17,Default,,0000,0000,0000,,So I hope this terminology helps you\Nunderstand some of the discussions of this, Dialogue: 0,0:22:53.17,0:22:55.86,Default,,0000,0000,0000,,if you go on and read about it \Nin the literature. 
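The box terminology above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the counts are hypothetical, and the numbering of Boxes 2, 4 and 6 (false positives, false negatives, total negatives) is inferred from the boxes the lecture does name (1, 3, 5, 7, 8 and 9).

```python
def box_table(true_pos, false_pos, false_neg, true_neg):
    """Fill in the nine boxes from the four inner counts and derive the rates."""
    box1, box2 = true_pos, false_pos   # test positive: has condition / does not
    box3 = box1 + box2                 # total set of positives
    box4, box5 = false_neg, true_neg   # test negative: has condition / does not
    box6 = box4 + box5                 # total set of negatives (shown for completeness)
    box7 = box1 + box4                 # everyone with the condition
    box8 = box2 + box5                 # everyone without the condition
    box9 = box7 + box8                 # general population
    return {
        "base_rate": box7 / box9,      # P(h)
        "sensitivity": box1 / box7,    # P(e | h)
        "specificity": box5 / box8,    # P(not e | not h)
        "posterior": box1 / box3,      # P(h | e)
    }


# Hypothetical population of 1,000 people, 10 of whom have the condition.
rates = box_table(true_pos=9, false_pos=99, false_neg=1, true_neg=891)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test is 90% sensitive and 90% specific, yet the posterior probability is only 9 out of 108, because the base rate is so low.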
Dialogue: 0,0:22:56.38,0:23:01.21,Default,,0000,0000,0000,,This procedure that we've been discussing \Nis actually just an application Dialogue: 0,0:23:01.73,0:23:06.57,Default,,0000,0000,0000,,of a famous theorem called Bayes' Theorem\Nafter Thomas Bayes, Dialogue: 0,0:23:06.57,0:23:11.27,Default,,0000,0000,0000,,an 18th-century English clergyman, \Nwho was also a mathematician Dialogue: 0,0:23:11.27,0:23:15.69,Default,,0000,0000,0000,,and proved this extremely important \Ntheorem in probability theory. Dialogue: 0,0:23:16.70,0:23:22.52,Default,,0000,0000,0000,,Now some of you out there will use the\Nboxes, and it'll make sense to you. Dialogue: 0,0:23:22.52,0:23:26.29,Default,,0000,0000,0000,,But some Courserians, I assume, \Nare mathematicians, Dialogue: 0,0:23:26.29,0:23:28.34,Default,,0000,0000,0000,,and they want to see \Nthe mathematics behind it. Dialogue: 0,0:23:28.80,0:23:32.85,Default,,0000,0000,0000,,So now, I want to show you how to derive\NBayes' theorem Dialogue: 0,0:23:32.85,0:23:36.68,Default,,0000,0000,0000,,from the rules of probability \Nthat we learned in earlier lectures. Dialogue: 0,0:23:37.16,0:23:40.34,Default,,0000,0000,0000,,So for all you math nerds out there, \Nhere goes. Dialogue: 0,0:23:41.23,0:23:43.55,Default,,0000,0000,0000,,You start with rule 2G, Dialogue: 0,0:23:45.18,0:23:50.86,Default,,0000,0000,0000,,apply it to the probability that the\Nevidence and the hypothesis are both true. Dialogue: 0,0:23:51.33,0:23:56.50,Default,,0000,0000,0000,,And by the rule, that probability is \Nequal to the probability of the evidence, Dialogue: 0,0:23:56.50,0:24:00.99,Default,,0000,0000,0000,,times the probability of the hypothesis,\Ngiven the evidence. Dialogue: 0,0:24:02.32,0:24:04.51,Default,,0000,0000,0000,,You have to have \Nthat conditional probability Dialogue: 0,0:24:04.51,0:24:07.38,Default,,0000,0000,0000,,because they're not independent. 
Dialogue: 0,0:24:08.80,0:24:14.31,Default,,0000,0000,0000,,Then you simply divide both sides of that \Nby the probability of the evidence: Dialogue: 0,0:24:14.31,0:24:15.79,Default,,0000,0000,0000,,a little simple algebra. Dialogue: 0,0:24:15.79,0:24:20.22,Default,,0000,0000,0000,,And you end up with the probability \Nof the hypothesis, given the evidence, Dialogue: 0,0:24:20.22,0:24:24.79,Default,,0000,0000,0000,,is equal to the probability \Nof the evidence and the hypothesis, Dialogue: 0,0:24:24.79,0:24:28.10,Default,,0000,0000,0000,,divided by the probability \Nof the evidence. Dialogue: 0,0:24:30.83,0:24:34.53,Default,,0000,0000,0000,,Now we can do a little trick.\NThis was ingenious. Dialogue: 0,0:24:35.30,0:24:39.16,Default,,0000,0000,0000,,Substitute for e, something \Nthat's logically equivalent to e, Dialogue: 0,0:24:39.16,0:24:45.46,Default,,0000,0000,0000,,namely, the evidence AND the hypothesis\Nor the evidence AND NOT the hypothesis. Dialogue: 0,0:24:45.88,0:24:48.32,Default,,0000,0000,0000,,Now if you think about it, you'll see\Nthat those are equivalent, Dialogue: 0,0:24:48.32,0:24:51.26,Default,,0000,0000,0000,,because either the hypothesis \Nhas to be true Dialogue: 0,0:24:51.26,0:24:54.27,Default,,0000,0000,0000,,or NOT the hypothesis is true. Dialogue: 0,0:24:54.27,0:24:55.92,Default,,0000,0000,0000,,One or the other has to be true. Dialogue: 0,0:24:56.56,0:25:00.03,Default,,0000,0000,0000,,And that means that the evidence \NAND the hypothesis Dialogue: 0,0:25:00.03,0:25:04.77,Default,,0000,0000,0000,,or the evidence AND NOT the hypothesis \Nis going to be equivalent to e. Dialogue: 0,0:25:05.02,0:25:07.96,Default,,0000,0000,0000,,So this is equivalent to this. Dialogue: 0,0:25:08.44,0:25:11.36,Default,,0000,0000,0000,,And because they're equivalent,\Nwe can substitute them Dialogue: 0,0:25:11.36,0:25:15.38,Default,,0000,0000,0000,,within the formula for probability \Nwithout affecting the truth values. 
Dialogue: 0,0:25:15.72,0:25:23.40,Default,,0000,0000,0000,,So we just substitute this formula in \Nhere for the e up there. Dialogue: 0,0:25:23.84,0:25:27.91,Default,,0000,0000,0000,,And we end up with the probability of the\Nhypothesis, given the evidence, Dialogue: 0,0:25:27.91,0:25:32.24,Default,,0000,0000,0000,,is equal to the probability of the\Nevidence AND the hypothesis, divided by Dialogue: 0,0:25:32.24,0:25:35.22,Default,,0000,0000,0000,,the probability of the evidence \NAND the hypothesis Dialogue: 0,0:25:35.22,0:25:37.51,Default,,0000,0000,0000,,or the evidence AND NOT the hypothesis. Dialogue: 0,0:25:37.95,0:25:41.43,Default,,0000,0000,0000,,Now, that's not supposed to make much\Nsense, but it helps with the derivation. Dialogue: 0,0:25:43.28,0:25:47.60,Default,,0000,0000,0000,,The next step is to apply rule 3, because\Nwe have a disjunction. Dialogue: 0,0:25:47.60,0:25:51.08,Default,,0000,0000,0000,,And notice the disjuncts are mutually\Nexclusive. Dialogue: 0,0:25:51.54,0:25:56.23,Default,,0000,0000,0000,,It cannot be true, both, that the evidence\NAND the hypothesis is true, Dialogue: 0,0:25:56.23,0:26:00.20,Default,,0000,0000,0000,,and also that the evidence \NAND NOT the hypothesis is true, Dialogue: 0,0:26:00.20,0:26:04.25,Default,,0000,0000,0000,,because it can't be both h and not h. Dialogue: 0,0:26:05.16,0:26:08.07,Default,,0000,0000,0000,,So we can apply the simple version \Nof rule 3. Dialogue: 0,0:26:08.58,0:26:14.32,Default,,0000,0000,0000,,And that means that the probability of\N(e&h) or (e&~h) Dialogue: 0,0:26:14.32,0:26:21.07,Default,,0000,0000,0000,,is equal to the probability of (e&h)\N+ the probability of (e&~h). Dialogue: 0,0:26:21.51,0:26:23.92,Default,,0000,0000,0000,,We're just applying \Nthat rule 3 for disjunction Dialogue: 0,0:26:23.92,0:26:26.26,Default,,0000,0000,0000,,that we learned a few lectures ago. 
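For readers who prefer symbols, the steps spoken so far can be written out roughly as follows. This is a sketch in LaTeX, assuming the amsmath package; the rule names (2G and 3) are the lecture's, and the notation P(h | e) for "the probability of the hypothesis, given the evidence" is a convention chosen here for illustration.

```latex
\begin{align*}
P(e \mathbin{\&} h) &= P(e)\,P(h \mid e)
  && \text{rule 2G} \\
P(h \mid e) &= \frac{P(e \mathbin{\&} h)}{P(e)}
  && \text{divide both sides by } P(e) \\
e &\equiv (e \mathbin{\&} h) \lor (e \mathbin{\&} \lnot h)
  && \text{either } h \text{ or } \lnot h \text{ must be true} \\
P(h \mid e) &= \frac{P(e \mathbin{\&} h)}{P\bigl((e \mathbin{\&} h) \lor (e \mathbin{\&} \lnot h)\bigr)}
  && \text{substitute the equivalent for } e \\
            &= \frac{P(e \mathbin{\&} h)}{P(e \mathbin{\&} h) + P(e \mathbin{\&} \lnot h)}
  && \text{rule 3, mutually exclusive disjuncts}
\end{align*}
```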
Dialogue: 0,0:26:27.15,0:26:29.58,Default,,0000,0000,0000,,Now we apply rule 2G again, Dialogue: 0,0:26:29.58,0:26:35.38,Default,,0000,0000,0000,,because we have the probability \Nof a conjunction up in the top. Dialogue: 0,0:26:36.81,0:26:41.70,Default,,0000,0000,0000,,And, since these are not independent of\Neach other Dialogue: 0,0:26:41.88,0:26:44.63,Default,,0000,0000,0000,,-- we hope not, if it's a hypothesis \Nand the evidence for it -- Dialogue: 0,0:26:45.34,0:26:48.32,Default,,0000,0000,0000,,then we have to use \Nthe conditional probability. Dialogue: 0,0:26:48.83,0:26:53.01,Default,,0000,0000,0000,,And using rule 2G, we find that \Nthe probability of the hypothesis, Dialogue: 0,0:26:53.01,0:26:55.11,Default,,0000,0000,0000,,given the evidence, is equal to Dialogue: 0,0:26:55.11,0:26:59.87,Default,,0000,0000,0000,,the probability of the hypothesis, times\Nthe probability of the evidence, Dialogue: 0,0:26:59.87,0:27:05.13,Default,,0000,0000,0000,,given the hypothesis, divided by \Nthe probability of the hypothesis, Dialogue: 0,0:27:05.13,0:27:08.59,Default,,0000,0000,0000,,times the probability of the evidence, \Ngiven the hypothesis, Dialogue: 0,0:27:08.83,0:27:12.65,Default,,0000,0000,0000,,plus the probability \Nof the hypothesis being false, Dialogue: 0,0:27:12.65,0:27:16.85,Default,,0000,0000,0000,,that is the probability of NOT h, \Ntimes the probability of the evidence, Dialogue: 0,0:27:16.85,0:27:21.13,Default,,0000,0000,0000,,given NOT h, or the hypothesis being false. Dialogue: 0,0:27:22.01,0:27:23.17,Default,,0000,0000,0000,,And that's a mouthful Dialogue: 0,0:27:23.17,0:27:27.07,Default,,0000,0000,0000,,and it's a long formula, \Nbut that's the mathematical formula Dialogue: 0,0:27:27.07,0:27:33.27,Default,,0000,0000,0000,,that Bayes proved in the 18th century\Nand it provides the mathematical basis Dialogue: 0,0:27:33.50,0:27:36.28,Default,,0000,0000,0000,,for that whole system of boxes \Nthat we talked about before. 
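The long formula just stated can be checked numerically with a short Python sketch. The function name and the input values are hypothetical, chosen only for illustration. Note that the probability of the evidence given NOT h is the false positive rate, which is one minus the specificity.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem in the form stated in the lecture:

    P(h|e) = P(h) * P(e|h) / (P(h) * P(e|h) + P(~h) * P(e|~h))
    """
    numerator = prior * p_e_given_h
    # P(~h) is 1 - P(h), because either h or not-h must be true.
    denominator = numerator + (1 - prior) * p_e_given_not_h
    return numerator / denominator


# Hypothetical test: base rate 1%, sensitivity 90%, specificity 90%
# (so the probability of a positive result without the condition is 10%).
print(round(posterior(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.1), 4))
```

With these assumed inputs the posterior works out to 0.009 divided by 0.108, which is the same answer the boxes would give for a population with those rates.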
Dialogue: 0,0:27:37.47,0:27:42.78,Default,,0000,0000,0000,,But if you don't like the mathematical \Nproof and that's too confusing for you, Dialogue: 0,0:27:42.78,0:27:44.09,Default,,0000,0000,0000,,then use the boxes. Dialogue: 0,0:27:44.40,0:27:47.40,Default,,0000,0000,0000,,And if you don't like the boxes, \Nuse the mathematical proof. Dialogue: 0,0:27:47.86,0:27:50.71,Default,,0000,0000,0000,,They're both going to work:\Njust pick the one that works for you. Dialogue: 0,0:27:50.99,0:27:53.22,Default,,0000,0000,0000,,In fact, you don't have to pick \Neither of them, Dialogue: 0,0:27:53.22,0:27:57.12,Default,,0000,0000,0000,,because remember, this is an honors\Nlecture, it's optional, Dialogue: 0,0:27:57.99,0:28:00.00,Default,,0000,0000,0000,,and it won't be on the quiz. Dialogue: 0,0:28:00.59,0:28:04.27,Default,,0000,0000,0000,,But if you do want to try this method, \Nand make sure that you understand it, Dialogue: 0,0:28:05.08,0:28:08.44,Default,,0000,0000,0000,,we'll have a bunch of exercises for you, \Nwhere you can test your skills.