1 00:00:00,420 --> 00:00:04,330 Okay, so now what I've done is I've gone to the Gaussian Naive Bayes 2 00:00:04,330 --> 00:00:05,385 documentation page. 3 00:00:05,385 --> 00:00:08,100 sklearn.naive_bayes.GaussianNB. 4 00:00:08,100 --> 00:00:11,450 This was that algorithm that I set out to find and 5 00:00:11,450 --> 00:00:15,650 now that I've, now I've found the sklearn documentation page. 6 00:00:15,650 --> 00:00:18,410 So the first thing that I see right here, actually this is one of the things I 7 00:00:18,410 --> 00:00:22,960 love about the SK Learn documentation, is it's full of examples. 8 00:00:22,960 --> 00:00:25,640 When I was actually developing the code for this class, this was one of 9 00:00:25,640 --> 00:00:28,550 the first things that I would always do is I would come find the example code 10 00:00:28,550 --> 00:00:31,030 and I would try to just run in my Python interpreter, 11 00:00:31,030 --> 00:00:32,549 see if I could get it working. 12 00:00:32,549 --> 00:00:36,110 And almost invariably it works right out of the box. 13 00:00:36,110 --> 00:00:37,570 So here's something that's just very simple. 14 00:00:37,570 --> 00:00:39,620 There's only a few lines here that are really important. 15 00:00:39,620 --> 00:00:41,110 So let me point them out to you and 16 00:00:41,110 --> 00:00:45,640 then I'll show you the code I've actually written for the example we just saw, 17 00:00:45,640 --> 00:00:48,250 and you'll start to recognize some of these lines. 18 00:00:48,250 --> 00:00:49,180 But first let's introduce them. 19 00:00:49,180 --> 00:00:52,430 So the first one that's really important is this one right here. 20 00:00:53,550 --> 00:00:56,950 Above this it's just creating some, some training points that we can use, 21 00:00:56,950 --> 00:00:57,790 it's not that important. 22 00:00:57,790 --> 00:01:01,790 This is where the real meat starts, is with this import statement and if you've 23 00:01:01,790 --> 00:01:04,540 programmed in Python before, you're well acquainted with import statements. 24 00:01:04,540 --> 00:01:08,480 This is the way that you bring in external modules into the code that you're 25 00:01:08,480 --> 00:01:12,360 writing so that you don't have to completely re-implement everything every time, 26 00:01:12,360 --> 00:01:15,470 you can use code that someone else has already written. 27 00:01:15,470 --> 00:01:19,380 So we say from sklearn.naive_bayes going to import GaussianNB. 28 00:01:19,380 --> 00:01:20,350 Very good. 29 00:01:20,350 --> 00:01:23,020 The next thing that we're going to do is we're going to use that to 30 00:01:23,020 --> 00:01:24,210 create a classifier. 31 00:01:24,210 --> 00:01:27,040 So classifier equals GaussianNB. 32 00:01:27,040 --> 00:01:28,790 If you miss your import statement. 33 00:01:28,790 --> 00:01:29,840 If you forget this line for 34 00:01:29,840 --> 00:01:32,480 some reason, then this line is going to throw an error. 35 00:01:32,480 --> 00:01:36,030 So if you end up seeing some kind of error that says that it doesn't recognize 36 00:01:36,030 --> 00:01:37,570 this function. 37 00:01:37,570 --> 00:01:40,350 It's probably a problem with your import statement. 38 00:01:40,350 --> 00:01:42,510 So, okay, we've created our classifier. 39 00:01:42,510 --> 00:01:45,290 So now the code is all sort of ready to go. 40 00:01:45,290 --> 00:01:47,360 The next thing that we need to do is we need to fit it. 41 00:01:48,550 --> 00:01:51,710 And we've been using the word train interchangeably with fit. 42 00:01:51,710 --> 00:01:54,790 So this is where we actually give it the training data, 43 00:01:54,790 --> 00:01:57,070 and it learns the patterns. 44 00:01:57,070 --> 00:02:00,030 So we have the classifier that we just created. 45 00:02:00,030 --> 00:02:03,870 We're calling the fit function on it, and then the two arguments that we pass to 46 00:02:03,870 --> 00:02:10,000 fit are x, which in this case are the features and y which are the labels. 47 00:02:10,000 --> 00:02:13,100 This is always going to be true in supervised classification. 48 00:02:13,100 --> 00:02:14,740 Is that it's going to call this fit function and 49 00:02:14,740 --> 00:02:16,400 then it's going to have the features. 50 00:02:16,400 --> 00:02:17,190 And then the labels. 51 00:02:18,600 --> 00:02:22,060 And then the last thing that we do is we ask the classifier that 52 00:02:22,060 --> 00:02:24,300 we've just trained for some predictions. 53 00:02:24,300 --> 00:02:25,470 So we give it a new point. 54 00:02:25,470 --> 00:02:29,320 In this case the point is negative 0.8, negative 1. 55 00:02:29,320 --> 00:02:32,880 And we ask for this what do you think the label is for this particular point? 56 00:02:32,880 --> 00:02:35,370 What's the, what class does it belong to? 57 00:02:35,370 --> 00:02:38,390 So in this particular case it says it belongs to class number one. 58 00:02:38,390 --> 00:02:42,480 Or you could imagine for some other point it might say class number two. 59 00:02:42,480 --> 00:02:47,600 So of course you have to have already fit the classifier before you 60 00:02:47,600 --> 00:02:48,850 can call predict on it. 61 00:02:48,850 --> 00:02:50,500 Because when it's fitting the data that's where it's 62 00:02:50,500 --> 00:02:51,590 actually learning the patterns. 63 00:02:51,590 --> 00:02:55,140 Then here is where it's using those patterns to make a prediction. 64 00:02:55,140 --> 00:02:56,530 So, that's it. 65 00:02:56,530 --> 00:02:59,380 That's kind of, now you know most all there is to 66 00:02:59,380 --> 00:03:02,200 know to get this working in the first example that I've done.