1 00:00:00,130 --> 00:00:03,040 So talk about confusion matrices in different context, you might 2 00:00:03,040 --> 00:00:08,109 remember the work that Katie showed you on principle component analysis, on PCA. 3 00:00:08,109 --> 00:00:12,800 Were we looked at seven different, I guess white male politicians from 4 00:00:12,800 --> 00:00:17,100 George Bush to Gerhard Schroeder, and we ran an eigenface analysis. 5 00:00:17,100 --> 00:00:21,585 Extracting the principle components of this data set, and then re-use 6 00:00:21,585 --> 00:00:28,040 the eigenfaces to re-map new faces to names in order to identify people. 7 00:00:28,040 --> 00:00:30,910 So what I'm going to do now, I won't drag you 8 00:00:30,910 --> 00:00:34,490 through the same PCA example again, so let's do away with those faces. 9 00:00:34,490 --> 00:00:38,410 But instead what I'll do is, I give you a typical output and 10 00:00:38,410 --> 00:00:42,110 we're going to study the output using confusion matrices. 11 00:00:42,110 --> 00:00:48,750 So Katie was so nice to run a PCA on the faces of those politicians and 12 00:00:48,750 --> 00:00:52,530 take the resulting features, put it into a support vector machine and 13 00:00:52,530 --> 00:00:54,440 then go through the data and 14 00:00:54,440 --> 00:00:59,980 count how often any of those people were predicted correctly or misclassified. 15 00:00:59,980 --> 00:01:02,530 And just to confuse everybody, in this example, 16 00:01:02,530 --> 00:01:06,120 we follow the convention of putting the true names, the true class labels, 17 00:01:06,120 --> 00:01:08,180 on the left, and the predicted ones on top. 18 00:01:08,180 --> 00:01:12,990 So, for example, this number one over here was truly Donald Rumsfeld, but 19 00:01:12,990 --> 00:01:15,320 was mistaken to be Colin Powell. 20 00:01:15,320 --> 00:01:19,130 And the way I know it's Colin Powell was the same names over here, 21 00:01:19,130 --> 00:01:22,980 from Ariel Sharon to Tony Blair, apply to the columns over here. 22 00:01:22,980 --> 00:01:26,030 Ariel Sharon's on the left and Tony Blair's on the right. 23 00:01:26,030 --> 00:01:27,760 So we ask you a few questions now. 24 00:01:27,760 --> 00:01:29,250 First, a simple one. 25 00:01:29,250 --> 00:01:33,390 Which of those seven politicians was most frequent in our dataset?