WEBVTT

00:00:00.259 --> 00:00:03.421
What we hope to do with this meetup

00:00:03.851 --> 00:00:10.620
is have something, given the spread of the questionnaire results

00:00:10.620 --> 00:00:12.681
we hope to do something which is kind of

00:00:12.681 --> 00:00:15.721
for people who don't know what deep learning is

00:00:15.721 --> 00:00:17.992
and want an introduction to TensorFlow

00:00:17.992 --> 00:00:20.322
but also something which is more of a

00:00:20.322 --> 00:00:24.021
crowd-pleaser, something which is more cutting edge

00:00:24.021 --> 00:00:27.081
I am not going to say that this thing is particularly cutting edge

00:00:27.081 --> 00:00:31.553
because once we saw the responses, we dialed things down a bit

00:00:31.553 --> 00:00:37.803
But there will be more cutting-edge stuff

00:00:37.803 --> 00:00:42.811
and maybe we will start to do other meetup events in other formats

00:00:42.811 --> 00:00:48.824
So it could be that we have an experts' paper meeting

00:00:48.824 --> 00:00:52.864
or we could split it, now that we can see the size of the crowd

00:00:52.864 --> 00:00:57.824
Anyway, let me talk a little bit about going deeper with transfer learning

00:00:57.824 --> 00:01:00.164
Unfortunately, this is something some of you

00:01:00.164 --> 00:01:02.503
will have seen me do before

00:01:02.503 --> 00:01:04.883
This is the first time I have done it in TensorFlow

00:01:05.122 --> 00:01:07.272
and let me just explain that

00:01:07.272 --> 00:01:10.162
Before, I had been programming this stuff

00:01:10.162 --> 00:01:13.382
in Theano with the Lasagne layers library on top

00:01:13.382 --> 00:01:19.431
and Theano is a research-oriented deep learning framework out of Montreal

00:01:19.431 --> 00:01:22.683
but what I have concluded since last summer

00:01:22.683 --> 00:01:26.643
is that TensorFlow is probably the winner of this framework race

00:01:26.643 --> 00:01:29.434
at least for the foreseeable future

00:01:29.434 --> 00:01:31.861
with all this nice industrial stuff

00:01:31.891 --> 00:01:35.154
so I should be retooling into TensorFlow

00:01:35.518 --> 00:01:37.483
That's what I am taking the opportunity to do for this talk

00:01:40.951 --> 00:01:43.067
So, about me, sorry, here we go

00:01:43.067 --> 00:01:45.678
I have come up through finance, startups and stuff

00:01:45.678 --> 00:01:49.649
I took a year out, basically, in 2014 just for fun

00:01:49.649 --> 00:01:53.559
and I have been doing serious natural language processing since then

00:02:00.909 --> 00:02:04.629
Basically, the overview for this "something more challenging" talk,

00:02:04.629 --> 00:02:08.669
which will probably be 20 or 30 minutes depending on how it goes,

00:02:08.669 --> 00:02:13.889
is: I want to take a state-of-the-art TensorFlow model

00:02:13.889 --> 00:02:16.769
and I want to solve a problem that it wasn't trained for

00:02:16.769 --> 00:02:20.928
And I am going to be using deep learning as a component

00:02:20.928 --> 00:02:25.960
of my solution rather than the primary focus of what I am trying to build

00:02:25.960 --> 00:02:32.900
So this is, in a way, more of an industrial or commercial kind of application

00:02:32.912 --> 00:02:35.190
of what's going on here

00:02:35.190 --> 00:02:38.510
So the goal for this kind of problem is:

00:02:38.510 --> 00:02:42.530
I want to distinguish pictures of classic and modern sports cars

00:02:42.530 --> 00:02:47.051
you will see some pictures of classic and modern cars a bit later

00:02:48.433 --> 00:02:51.722
It's not that easy to say what the difference is

00:02:51.722 --> 00:02:55.211
obviously, it could be different types of images

00:02:55.211 --> 00:02:57.454
and it could be lots of different classes

00:02:57.454 --> 00:03:00.992
I am just doing a very simple two-class thing

00:03:00.992 --> 00:03:03.145
but these are complicated images

00:03:03.145 --> 00:03:04.824
what I want to do is

00:03:04.824 --> 00:03:06.114
I want to have a very small training time

00:03:06.114 --> 00:03:08.381
so I don't want to be retraining some huge network

00:03:08.381 --> 00:03:12.895
In particular, I have only got, in this case, 20 training examples

00:03:12.895 --> 00:03:18.195
So I am not gonna do any fantastic million-image training

00:03:18.195 --> 00:03:20.863
I have got 20 images with me

00:03:20.863 --> 00:03:24.705
and I also want to be able to put this in production

00:03:24.705 --> 00:03:30.118
so I can just run it as a component of something else

00:03:30.118 --> 00:03:36.395
Basically, one of the things that is carrying the deep learning world forward

00:03:36.395 --> 00:03:40.196
is an image classification task called ImageNet

00:03:40.196 --> 00:03:42.406
this has been a competition where

00:03:42.406 --> 00:03:47.407
they have 15 million labeled images from 22,000 categories

00:03:47.407 --> 00:03:49.858
and you can see some of them here

00:03:49.858 --> 00:03:55.817
if we go for this one, this is a picture of a hotdog in a bun

00:03:55.817 --> 00:03:57.786
and here are some of the categories,

00:03:57.786 --> 00:04:02.538
which will be some food, I don't know

00:04:02.538 --> 00:04:06.107
these are hotdogs, lots of different pictures of hotdogs

00:04:06.107 --> 00:04:09.058
lots of different pictures of cheeseburgers

00:04:09.058 --> 00:04:11.848
lots of different pictures of plates

00:04:11.848 --> 00:04:15.338
so the task for ImageNet is to classify,

00:04:15.338 --> 00:04:18.267
for any one of these images,

00:04:18.267 --> 00:04:20.447
which of a thousand different categories it is from

00:04:20.447 --> 00:04:25.328
and it used to be that people could score adequately well

00:04:25.328 --> 00:04:28.558
and were making incremental improvements in

00:04:28.558 --> 00:04:30.558
how well they could do this

00:04:30.558 --> 00:04:32.998
but the deep learning people came along

00:04:32.998 --> 00:04:35.488
and kind of tore this to shreds

00:04:35.488 --> 00:04:40.149
and Google came up with GoogLeNet,

00:04:40.149 --> 00:04:43.909
which is what we are actually going to use here, back in 2014

00:04:43.909 --> 00:04:49.649
suddenly, this stuff is now being done, with further iterations

00:04:49.649 --> 00:04:52.808
of this kind of thing, better than humans can do it

00:04:52.808 --> 00:04:56.795
The way you can measure whether something is better than humans

00:04:56.795 --> 00:04:59.069
is you take a human and see whether the model beats them

00:04:59.069 --> 00:05:01.560
the question there is whether there are labeling errors

00:05:01.560 --> 00:05:03.720
so you need a committee of humans

00:05:03.720 --> 00:05:06.250
the way they label these things is

00:05:06.250 --> 00:05:08.740
by running it on Mechanical Turk and

00:05:08.740 --> 00:05:12.490
asking people which category this cheeseburger is in

00:05:14.820 --> 00:05:16.380
The network we are going to use here

00:05:16.380 --> 00:05:23.421
is the 2014 state-of-the-art GoogLeNet, also called Inception version 1

00:05:23.421 --> 00:05:25.690
The nice thing about this is that

00:05:25.690 --> 00:05:30.942
there is an existing model already trained for this task

00:05:30.942 --> 00:05:33.772
and it's available for download, it's all free

00:05:33.772 --> 00:05:38.952
and there are lots of different models out there

00:05:38.952 --> 00:05:41.362
there's a model zoo for TensorFlow

00:05:41.362 --> 00:05:44.351
So, what I have on my machine,

00:05:44.351 --> 00:05:48.531
and this is a small model, it's a 20-megabyte kind of model

00:05:48.531 --> 00:05:50.276
so it is not a very big model

00:05:50.276 --> 00:05:57.291
Inception v4 is a 200 MB kind of model, which is a bit heavy

00:05:57.291 --> 00:05:59.423
I am working here on my laptop

00:05:59.423 --> 00:06:01.212
you are gonna see it working in real time

00:06:01.212 --> 00:06:07.254
and the trick here is, instead of a softmax layer at the end,

00:06:07.254 --> 00:06:12.984
I will show you the diagram, it should be clear to anyone who's following along,

00:06:12.984 --> 00:06:19.082
instead of using the logits to get me the probabilities

00:06:19.082 --> 00:06:21.133
I am going to strip that away

00:06:21.133 --> 00:06:23.074
and I am going to train a support vector machine

00:06:23.074 --> 00:06:24.884
to distinguish these classes

00:06:24.884 --> 00:06:29.854
I am not going to retrain the Inception network at all

00:06:29.854 --> 00:06:32.474
I am going to just use it as a component,

00:06:32.474 --> 00:06:34.913
strip off the top classification piece

00:06:34.913 --> 00:06:38.234
and replace it with an SVM

00:06:38.234 --> 00:06:40.384
Now, SVMs are pretty well understood

00:06:40.384 --> 00:06:44.624
here I am just using Inception as a featurizer for images

00:06:44.624 --> 00:06:47.285
So here's a network picture

00:06:47.285 --> 00:06:52.015
Basically, this is what the ImageNet network is designed for

00:06:52.015 --> 00:06:54.334
you put in an image at the bottom

00:06:54.334 --> 00:06:57.445
there is this black box which is the Inception network,

00:06:57.445 --> 00:07:00.745
which is a bunch of CNNs, or convolutional neural networks,

00:07:00.745 --> 00:07:02.596
followed by a dense network

00:07:02.596 --> 00:07:04.846
followed by these logits

00:07:04.846 --> 00:07:07.976
and this logits layer is essentially the same as the 0 to 9

00:07:07.976 --> 00:07:17.037
that Sam had for his digits, except 1 to 1000 for the different ImageNet classes

00:07:17.037 --> 00:07:20.418
To actually get the ImageNet output

00:07:20.418 --> 00:07:27.387
it uses a softmax function and then chooses the highest one of these

00:07:27.387 --> 00:07:28.908
to say: this is the class that this image is in

00:07:28.908 --> 00:07:32.167
What I am going to do is I am going to ignore this

00:07:32.167 --> 00:07:35.337
neat piece of classification technology that they have got

00:07:35.337 --> 00:07:44.148
and use these outputs as inputs to an SVM, just treating them as features

00:07:44.148 --> 00:07:46.567
Now if we pick out one of these,

00:07:46.567 --> 00:07:50.698
this class could be cheeseburger and this class could be parrot

00:07:50.698 --> 00:07:54.067
and this other class could be Husky dog

00:07:54.067 --> 00:07:57.178
there are all sorts of classes in here

00:07:57.178 --> 00:07:59.709
but basically what I will be doing is

00:07:59.709 --> 00:08:02.248
extracting the features of these photos,

00:08:02.248 --> 00:08:04.948
saying how much of this photo is like a parrot,

00:08:04.948 --> 00:08:08.938
how much of this is like a Husky dog

00:08:08.938 --> 00:08:13.229
Now it turns out that modern cars and classic cars can be distinguished that way

00:08:13.229 --> 00:08:18.659
Let me go to some code

00:08:18.659 --> 00:08:20.600
OK, this code is all up on GitHub

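NOTE
As a rough sketch of the recipe just described: featurize(img) below is a
hypothetical helper returning the ~1000 Inception logits for one image (the
actual forward pass is sketched later); the classifier side is plain
scikit-learn. A linear SVM is a reasonable choice here because there are very
few examples (20) and a long feature vector.
  import numpy as np
  from sklearn.svm import LinearSVC
  # featurize the 10 classic + 10 modern training photos with the frozen network
  X = np.vstack([featurize(img) for img in classic_imgs + modern_imgs])
  y = np.array([0] * len(classic_imgs) + [1] * len(modern_imgs))  # 0=classic, 1=modern
  clf = LinearSVC()
  clf.fit(X, y)                  # 20 rows x ~1000 columns: trains near-instantly
  print(clf.predict(featurize(test_img).reshape(1, -1)))
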
00:08:30.950 --> 00:08:34.300
Can everyone see this well enough?

00:08:38.380 --> 00:08:42.230
So basically, I am pulling in TensorFlow

00:08:45.400 --> 00:08:49.251
I pull in this model

00:08:49.251 --> 00:08:52.780
Here is what the Inception architecture is

00:08:52.780 --> 00:08:56.971
It feeds forward this way: here you put in your image

00:08:56.971 --> 00:08:59.901
it goes through lots and lots of convolutional layers

00:08:59.901 --> 00:09:03.490
all the way up to the softmax and the output at the end

00:09:03.490 --> 00:09:06.922
So having done that, what I will do is,

00:09:06.922 --> 00:09:09.741
actually, I have a download for the checkpoint

00:09:09.741 --> 00:09:16.562
this is the checkpoint here, which is a tar file; I have it stored locally

00:09:16.562 --> 00:09:18.500
It doesn't download it now

00:09:18.500 --> 00:09:25.262
but it is all there, even the big models are up there from Google

00:09:25.262 --> 00:09:27.762
so they have pre-trained these

00:09:27.762 --> 00:09:30.483
the Inception thing takes about a week

00:09:30.483 --> 00:09:33.792
to train on a bunch of GPUs, it could be 64 GPUs

00:09:33.792 --> 00:09:36.864
so you don't really want to be training this thing on your own

00:09:36.864 --> 00:09:40.793
you also need the ImageNet training set,

00:09:40.793 --> 00:09:48.384
which is a 140 GB file that is no fun to download

00:09:50.824 --> 00:09:57.185
what I am doing here is, basically, there is also an Inception library

00:09:57.185 --> 00:10:04.043
which is part of TF-Slim; this thing is designed such that

00:10:04.043 --> 00:10:08.264
it already knows the network and can preload it

00:10:08.264 --> 00:10:12.290
this has loaded it, and I can get some labels

00:10:12.290 --> 00:10:17.184
This is loading up the ImageNet labels

00:10:17.184 --> 00:10:25.565
I need to know which position corresponds to which class, like with the digits

00:10:31.285 --> 00:10:33.305
Here we are going through basically the same steps

00:10:33.305 --> 00:10:39.068
as the MNIST example, in that we reset the default graph

00:10:39.068 --> 00:10:44.586
we create a placeholder, which is where my images are going to go

00:10:44.586 --> 00:10:47.575
this is the input, but from this image input

00:10:47.575 --> 00:10:49.904
I am then going to do some TensorFlow steps

00:10:49.904 --> 00:10:52.286
because TensorFlow has various preprocessing

00:10:52.286 --> 00:10:55.767
or graphics-handling commands

00:10:55.767 --> 00:10:57.747
because a lot of this stuff works with images

00:10:57.747 --> 00:11:02.547
so there's all sorts of cropping and rotating stuff

00:11:02.547 --> 00:11:04.778
so it can preprocess these images

00:11:04.778 --> 00:11:08.485
I am also going to pull out a numpy image

00:11:08.485 --> 00:11:10.828
so I can see what it is actually looking at

00:11:10.828 --> 00:11:14.850
here, with this Inception version 1,

00:11:14.850 --> 00:11:20.906
I am going to pull in the entire Inception version 1 model

00:11:23.356 --> 00:11:26.568
My net function, rather than just picking random weights,

00:11:26.568 --> 00:11:29.978
is gonna be assigned values from this checkpoint

00:11:29.978 --> 00:11:34.418
so when I run the init step for my graph,

00:11:34.418 --> 00:11:37.478
in my session, it won't initialize everything randomly

00:11:37.478 --> 00:11:39.479
it will initialize everything from disk

00:11:39.479 --> 00:11:42.028
so this will define the model

00:11:42.028 --> 00:11:45.358
and now let's proceed

00:11:45.358 --> 00:11:51.609
one of the issues with having this on a nice TensorFlow graph

00:11:51.609 --> 00:11:56.658
is that it just says input, Inception 1, output

00:11:56.658 --> 00:11:59.939
so there's a big block there; you can delve into it if you want

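NOTE
A minimal sketch of the loading steps described above, assuming TensorFlow 1.x,
the TF-Slim image-model library (the nets package from tensorflow/models) on
the import path, and the Inception v1 checkpoint unpacked locally; names like
'inception_v1.ckpt' are assumptions, not the talk's exact notebook.
  import tensorflow as tf
  import tensorflow.contrib.slim as slim
  from nets import inception
  tf.reset_default_graph()
  # placeholder for a batch of 224x224 RGB images, values scaled to [-1, 1]
  images = tf.placeholder(tf.float32, [None, 224, 224, 3])
  with slim.arg_scope(inception.inception_v1_arg_scope()):
      # 1001 classes: ImageNet's 1000 plus a background class
      logits, end_points = inception.inception_v1(
          images, num_classes=1001, is_training=False)
  # an init function that restores weights from disk instead of random values
  init_fn = slim.assign_from_checkpoint_fn(
      'inception_v1.ckpt', slim.get_model_variables('InceptionV1'))
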
00:11:59.939 --> 00:12:05.790
let me just show you, let's go back a bit

00:12:08.320 --> 00:12:11.300
So this is the code behind the Inception v1 model

00:12:11.300 --> 00:12:16.060
this is actually smaller than the Inception v2 and v3 models

00:12:16.060 --> 00:12:22.331
basically, we have a kind of base Inception piece, just this,

00:12:22.331 --> 00:12:24.971
and these are combined together

00:12:24.971 --> 00:12:33.441
and this is a detailed model put together by many smart people in 2014

00:12:33.441 --> 00:12:35.472
it's got much more complicated since then

00:12:35.472 --> 00:12:38.912
fortunately, they have written the code and we don't have to

00:12:43.422 --> 00:12:46.321
So here what I am gonna do is load an example image

00:12:46.321 --> 00:12:50.581
just to show you; one of the things here is that

00:12:50.581 --> 00:12:56.396
TensorFlow, in order to be efficient, wants to do the loading itself

00:12:56.396 --> 00:13:01.344
So in order to keep pumping information through,

00:13:01.344 --> 00:13:03.633
it wants you to set up queues of images

00:13:03.633 --> 00:13:10.263
it will then handle the whole ingestion process itself

00:13:10.263 --> 00:13:14.153
the problem with that is it's kind of complicated to do

00:13:14.153 --> 00:13:16.023
in a Jupyter notebook right here

00:13:16.023 --> 00:13:19.133
so here I am going to do the very simplest thing,

00:13:19.133 --> 00:13:22.393
which is load a numpy image and stuff the numpy image in

00:13:22.393 --> 00:13:24.883
but what TensorFlow would love me to do

00:13:24.883 --> 00:13:29.413
is create, as you see in this one,

00:13:29.413 --> 00:13:34.024
a filename queue, and it will

00:13:34.024 --> 00:13:35.314
then run the queue, do the batching

00:13:35.314 --> 00:13:36.674
and do all of this stuff itself

00:13:36.674 --> 00:13:41.093
because then it can lay it out across a potentially distributed cluster

00:13:41.093 --> 00:13:43.414
and do everything just right

00:13:43.414 --> 00:13:50.254
here I do the simple thing and just read the image

00:13:50.254 --> 00:13:59.507
so this image is a tensor which is 224 by 224 by RGB

00:13:59.507 --> 00:14:03.478
this is a kind of sanity check on what numbers I have got in the corner

00:14:03.478 --> 00:14:05.667
and then what I am gonna do is

00:14:05.667 --> 00:14:08.016
I am going to crop out the middle section of it

00:14:08.016 --> 00:14:10.761
this happens to be the right size already

00:14:10.761 --> 00:14:13.495
basically, if you have got odd shapes

00:14:13.495 --> 00:14:15.136
you need to think about how you are gonna handle it

00:14:15.136 --> 00:14:18.956
are you going to pad it, what do you do?

00:14:18.956 --> 00:14:21.947
because in order to make this efficient

00:14:21.947 --> 00:14:29.056
TensorFlow wants to lay it out without all this variability in image size

00:14:29.056 --> 00:14:34.475
one set of parameters, and it's then going to blast it across your GPU

00:14:34.475 --> 00:14:37.865
so let's just run this thing

00:14:37.865 --> 00:14:39.697
so now we have defined the network

00:14:39.697 --> 00:14:45.767
here I am going to create a session, and I am going to init the session

00:14:45.767 --> 00:14:47.839
it loads the data, and then I am going

00:14:47.839 --> 00:14:52.037
to pick up the numpy image and the probabilities from the top layer

00:14:52.037 --> 00:14:54.677
I am just gonna show it

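NOTE
Continuing the sketch: the simplest route, feeding one numpy image straight in
(the queue route would use tf.train.string_input_producer and queue runners
instead). Here img is assumed to be an already-cropped 224x224x3 uint8 array
and imagenet_labels the index-to-name mapping loaded earlier.
  import numpy as np
  probabilities = end_points['Predictions']    # softmax over the 1001 outputs
  with tf.Session() as sess:
      init_fn(sess)                            # initialize from the checkpoint
      x = (img.astype(np.float32) / 255.0 - 0.5) * 2.0  # scale [0, 255] to [-1, 1]
      probs = sess.run(probabilities, feed_dict={images: x[np.newaxis]})[0]
      for i in probs.argsort()[-5:][::-1]:     # the five most likely classes
          print(imagenet_labels[i], probs[i])
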
00:14:57.507 --> 00:15:01.366
here is the image, this is the image I pulled off the disk

00:15:01.366 --> 00:15:06.327
you can see here the probabilities; the highest probability is tabby cat

00:15:06.327 --> 00:15:10.487
which is good; it's also interesting that

00:15:10.487 --> 00:15:15.263
the next in line are tiger cat, Egyptian cat, lynx

00:15:15.263 --> 00:15:21.037
so it's got a fair idea that this is a cat, and in particular it is getting it right

00:15:21.037 --> 00:15:26.169
OK, so this is the same diagram we had before

00:15:26.169 --> 00:15:32.729
what you have seen is the image going into this black box, coming out and telling us

00:15:32.729 --> 00:15:35.868
the probabilities here; so what we are now gonna do is

00:15:35.868 --> 00:15:41.910
go from the image through the black box and just grab a bunch of features

00:15:50.030 --> 00:15:52.720
let me just show you this on disk

00:16:11.300 --> 00:16:13.304
so I have a cars directory here

00:16:13.957 --> 00:16:17.848
and inside this thing,

00:16:24.238 --> 00:16:25.788
I have surprisingly little data

00:16:36.648 --> 00:16:39.863
In this directory, I just have a bunch of car images

00:16:39.863 --> 00:16:42.189
and I have two sets of images,

00:16:42.189 --> 00:16:47.659
one of which is called classic and the other is called modern

00:16:47.659 --> 00:16:52.010
so basically I picked some photos off Flickr

00:16:52.010 --> 00:16:54.439
I put these into two separate directories

00:16:54.439 --> 00:16:56.309
and I am going to use those directory names

00:16:56.309 --> 00:17:00.431
as the class labels for these images

00:17:00.431 --> 00:17:05.160
In the directory above, I have got a bunch of test images

00:17:05.160 --> 00:17:06.830
which I don't know the labels for

00:17:12.610 --> 00:17:17.261
this picks out the list of classes; there is a classic and a modern directory

00:17:17.261 --> 00:17:21.990
I am gonna go through every file in this directory

00:17:21.990 --> 00:17:28.470
I am gonna crop it, I am gonna find the logits level, which is

00:17:28.470 --> 00:17:33.441
all the classes, and then I am just gonna add these to my features

00:17:33.441 --> 00:17:36.601
So basically I am gonna do something like a scikit-learn model

00:17:36.601 --> 00:17:38.311
I am gonna fit an SVM

00:17:38.311 --> 00:17:42.111
so basically, this is featurizing all these pictures

00:17:47.911 --> 00:17:49.961
so here we go with the training data

00:17:55.571 --> 00:17:56.972
here's some training

00:18:02.272 --> 00:18:05.622
classic cars: it went through the classic directory

00:18:05.622 --> 00:18:08.782
modern cars: it went through the modern directory

00:18:15.292 --> 00:18:16.752
it's thinking hard

00:18:18.392 --> 00:18:25.284
what I am gonna do now is build an SVM over those features

00:18:31.016 --> 00:18:40.180
[recording jumps to 21:36]

00:21:35.478 --> 00:21:43.839
I restarted this thing

00:21:43.839 --> 00:21:49.619
the actual training for this SVM doesn't take that long

00:21:49.619 --> 00:21:58.018
it is very quick: essentially 20 images' worth of a thousand features

00:21:58.018 --> 00:22:01.840
so there was no big training loop to do

00:22:01.840 --> 00:22:09.070
then I can run this on the actual models in the directory, in the test set

00:22:09.070 --> 00:22:12.680
so these are images that it has never seen before

00:22:12.680 --> 00:22:16.440
it thinks that this is a modern car

00:22:16.440 --> 00:22:19.020
this one it thinks is a classic car, and this one is classified as modern

00:22:19.020 --> 00:22:26.301
so this is actually doing quite a good job from just 10 examples of each

00:22:26.301 --> 00:22:32.770
it actually thinks this one is modern; it's not a sports car, but anyway

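NOTE
A sketch of the featurize-and-fit loop just shown, under the same assumptions
as the earlier sketches; load_and_crop() is a hypothetical helper that reads
one file and returns a 224x224x3 array already scaled to [-1, 1].
  import os
  import numpy as np
  from sklearn.svm import LinearSVC
  feats, labels = [], []
  with tf.Session() as sess:
      init_fn(sess)
      for label, klass in enumerate(['classic', 'modern']):
          d = os.path.join('cars', klass)
          for fname in sorted(os.listdir(d)):
              x = load_and_crop(os.path.join(d, fname))
              # the full vector of logits becomes this image's feature vector
              feats.append(sess.run(logits, feed_dict={images: x[np.newaxis]})[0])
              labels.append(label)
  clf = LinearSVC()
  clf.fit(np.array(feats), np.array(labels))
  # the unlabeled test images then get featurized and classified the same way
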
00:22:32.770 --> 00:22:38.939
so this is showing that the SVM we trained

00:22:38.939 --> 00:22:42.901
can classify based on the features that Inception is producing, because

00:22:42.901 --> 00:22:47.231
Inception "understands" what images are about

00:22:47.231 --> 00:22:50.801
so if I go back to here, the code is on GitHub

00:22:50.801 --> 00:22:53.992
Conclusions: OK, this thing really works

00:22:53.992 --> 00:22:58.402
we didn't have to train a deep neural network

00:22:58.402 --> 00:23:01.876
we could plug this TensorFlow model into an existing pipeline

00:23:01.876 --> 00:23:04.760
and this is actually something where

00:23:04.760 --> 00:23:08.532
the TensorFlow Summit has something to say about these pipelines

00:23:08.532 --> 00:23:11.013
because not only are they talking about deep learning

00:23:11.013 --> 00:23:14.753
they are talking about the whole cloud-based learning story

00:23:14.753 --> 00:23:19.453
and setting up proper processes

00:23:19.453 --> 00:23:23.965
I guess it's time for questions, quickly

00:23:23.965 --> 00:23:29.142
and we can then do the TensorFlow Summit wrap-up

00:23:33.212 --> 00:23:37.144
"I am assuming that there is no backpropagation here"

00:23:37.144 --> 00:23:40.034
Right, this involves no backpropagation

00:23:40.034 --> 00:23:42.504
"The end result is a feature"

00:23:45.884 --> 00:23:53.135
Yes, I am just assuming that Inception... you can imagine, if the ImageNet thing

00:23:53.135 --> 00:23:56.265
had focused more on products, it could be even better

00:23:56.265 --> 00:23:58.914
if it had focused on man-made things

00:23:58.914 --> 00:24:04.915
The ImageNet training set has an awful lot of dogs in it, not that many cats

00:24:04.915 --> 00:24:09.426
On the other hand, it may be that it has quite a lot of flowers

00:24:09.426 --> 00:24:13.826
or maybe it is saying, I like this car as a modern car

00:24:13.826 --> 00:24:16.046
because it's got petals for wheels,

00:24:16.046 --> 00:24:20.385
whereas the other ones, the classic cars, tend to have round things for wheels

00:24:20.385 --> 00:24:25.146
So it is doing this abstractly

00:24:25.146 --> 00:24:29.918
It doesn't know about sports cars or what they look like

00:24:29.918 --> 00:24:31.587
But it does know about curves

00:24:34.607 --> 00:24:37.527
"So for the SVM, you don't use TensorFlow anymore?"

00:24:37.527 --> 00:24:43.157
No; basically, I have used TensorFlow to create some features

00:24:43.157 --> 00:24:45.308
Now, I don't want to throw it away

00:24:45.308 --> 00:24:47.687
because hopefully I have got a streaming process where

00:24:47.687 --> 00:24:52.177
more and more images are chugging through this thing

00:24:52.177 --> 00:25:04.528
[could not hear the question properly]

00:25:07.058 --> 00:25:10.068
There is some example code called TensorFlow for Poets

00:25:10.068 --> 00:25:13.296
where they actually say: let's load up one of these networks

00:25:13.296 --> 00:25:15.369
and then we will do some fine-tuning

00:25:15.369 --> 00:25:21.977
there you get involved in tuning these neurons with some gradient descent

00:25:21.977 --> 00:25:24.819
and you are taking gradient steps and all this kind of thing

00:25:24.819 --> 00:25:28.328
maybe you are making broad changes across the whole network,

00:25:28.328 --> 00:25:32.819
which could be good if you have got tons of data and tons of time

00:25:32.819 --> 00:25:36.948
but this is a very simple way of just tricking it into getting the job done

00:25:36.948 --> 00:25:47.382
[could not hear the comment properly]

00:25:47.382 --> 00:25:54.033
it would be a very small network, because an SVM is essentially fairly shallow

00:25:54.033 --> 00:26:06.532
[could not hear the question]

00:26:06.532 --> 00:26:13.752
Even though TensorFlow has imported this large Inception network,

00:26:13.752 --> 00:26:20.572
as far as I am concerned, I am using it as f(x) = y and that's it

00:26:20.572 --> 00:26:25.062
but you can inquire what it would say at a particular level

00:26:25.062 --> 00:26:30.473
and there are these bunches of layers with various endpoints along the way

00:26:30.473 --> 00:26:33.654
I could take features out of other levels

00:26:33.654 --> 00:26:35.783
I haven't tried it, to have a look

00:26:35.783 --> 00:26:40.083
There you get more picture-like features rather than

00:26:40.083 --> 00:26:43.094
this string of a thousand numbers

00:26:43.094 --> 00:26:48.884
each intermediate level will be picture-shaped, with CNN kinds of features

00:26:48.884 --> 00:26:53.544
On the other hand, if you want to play around with this thing,

00:26:53.544 --> 00:26:57.654
there's this nice stuff, the DeepDream kind of things,

00:26:57.654 --> 00:27:02.559
where they try and optimize images to be interesting images

00:27:02.559 --> 00:27:06.454
there you do the featurizing looking at different levels:

00:27:06.454 --> 00:27:12.415
the highest level says it is a cat, but I want all the local features to be as fishy as possible

00:27:12.415 --> 00:27:15.561
and then you get, like, a fish-faced cat

00:27:15.561 --> 00:27:20.010
that's the kind of thing you can do with these kinds of features in these models

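NOTE
On taking features out of other levels: in the TF-Slim version of Inception v1
the end_points dict exposes the intermediate activations by name, so a sketch
(the exact keys, like 'Mixed_4b', depend on the library version; x, images and
init_fn are as in the earlier sketches) would be:
  mixed = end_points['Mixed_4b']    # a feature map, roughly (batch, 14, 14, 512)
  with tf.Session() as sess:
      init_fn(sess)
      fmap = sess.run(mixed, feed_dict={images: x[np.newaxis]})
      print(fmap.shape)  # picture-shaped CNN features, not a flat 1000-vector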