WEBVTT 00:00:00.000 --> 00:00:09.550 34c3 preroll music 00:00:15.565 --> 00:00:18.230 Herald: ...and I will let Katharine take the stage now. 00:00:18.589 --> 00:00:21.430 Katharine Jarmul, kjam: Awesome! Well, thank you so much for the introduction and 00:00:21.430 --> 00:00:25.310 thank you so much for being here, taking your time. I know that Congress is really 00:00:25.310 --> 00:00:29.800 exciting, so I really appreciate you spending some time with me today. It's my 00:00:29.800 --> 00:00:34.470 first ever Congress, so I'm also really excited and I want to meet new people. So 00:00:34.470 --> 00:00:39.930 if you wanna come say hi to me later, I'm somewhat friendly, so we can maybe be 00:00:39.930 --> 00:00:44.680 friends later. Today what we're going to talk about is deep learning blind spots or 00:00:44.680 --> 00:00:49.890 how to fool "artificial intelligence". I like to put "artificial intelligence" in 00:00:49.890 --> 00:00:55.270 quotes, because.. yeah, we'll talk about that, but I think it should be in quotes. 00:00:55.270 --> 00:00:59.570 And today we're going to talk a little bit about deep learning, how it works and how 00:00:59.570 --> 00:01:07.640 you can maybe fool it. So I ask us: Is AI becoming more intelligent? 00:01:07.640 --> 00:01:11.078 And I ask this because when I open a browser and, of course, often it's Chrome, 00:01:11.078 --> 00:01:16.979 Google is already prompting me for what I should look at 00:01:16.979 --> 00:01:20.260 and it knows that I work with machine learning, right? 00:01:20.260 --> 00:01:23.830 And these are the headlines that I see every day: 00:01:23.830 --> 00:01:29.399 "Are Computers Already Smarter Than Humans?" 00:01:29.399 --> 00:01:32.289 If so, I think we could just pack up and go home, right? 00:01:32.289 --> 00:01:36.140 Like, we fixed computers, right? If a computer is smarter than me, 00:01:36.140 --> 00:01:39.780 then I already fixed it, we can go home, there's no need to talk about computers 00:01:39.780 --> 00:01:47.750 anymore, let's just move on with life. But that's not true, right? We know, because 00:01:47.750 --> 00:01:51.010 we work with computers and we know how stupid computers are sometimes. They're 00:01:51.010 --> 00:01:55.890 pretty bad. Computers do only what we tell them to do, generally, so I don't think a 00:01:55.890 --> 00:02:01.090 computer can think and be smarter than me. So alongside the same types of headlines that 00:02:01.090 --> 00:02:11.690 you see, you also see this: And yeah, so Apple recently released their 00:02:11.690 --> 00:02:17.500 Face ID and this unlocks your phone with your face and it seems like a great idea, 00:02:17.500 --> 00:02:22.451 right? You have a unique face, you have a face, nobody else can take your face. But 00:02:22.451 --> 00:02:28.300 unfortunately what we find out about computers is that they're awful sometimes, 00:02:28.300 --> 00:02:32.480 and for this Chinese woman who owned an iPhone, 00:02:32.480 --> 00:02:35.960 her coworker was able to unlock her phone. 00:02:35.964 --> 00:02:39.320 And I think Hendrik and Karen talked about this, if you were here for the 00:02:39.320 --> 00:02:41.590 last talk ("Beeinflussung durch künstliche Intelligenz"). We have a lot of problems 00:02:41.590 --> 00:02:46.379 in machine learning and one of them is stereotypes and prejudice that are within 00:02:46.379 --> 00:02:52.340 our training data or within our minds that leak into our models.
And perhaps they 00:02:52.340 --> 00:02:57.739 didn't have adequate training data for determining different facial features of Chinese 00:02:57.739 --> 00:03:03.160 folks. And perhaps it's other problems with their model or their training data or 00:03:03.160 --> 00:03:07.500 whatever they're trying to do. But they clearly have some issues, right? So when 00:03:07.500 --> 00:03:12.050 somebody asks me: "Is AI gonna take over the world and is there a super robot 00:03:12.050 --> 00:03:17.300 that's gonna come and be my new, you know, leader, so to speak?" I tell them we 00:03:17.300 --> 00:03:21.710 can't even figure out the stuff that we already have in production. So if we can't 00:03:21.710 --> 00:03:25.690 even figure out the stuff we already have in production, I'm a little bit less 00:03:25.690 --> 00:03:33.209 worried about the super robot coming to kill me. That said, unfortunately the powers 00:03:33.209 --> 00:03:38.190 that be, a lot of times they believe in this and they believe 00:03:38.190 --> 00:03:44.540 strongly in "artificial intelligence" and machine learning. They're collecting data 00:03:44.540 --> 00:03:50.800 every day about you and me and everyone else. And they're gonna use this data to 00:03:50.800 --> 00:03:56.349 build even better models. This is because the revolution that we're seeing now in 00:03:56.349 --> 00:04:02.080 machine learning has really not much to do with new algorithms or architectures. It 00:04:02.080 --> 00:04:09.630 has a lot more to do with heavy compute and with massive, massive data sets. And 00:04:09.630 --> 00:04:15.740 the more that we have training data of petabytes per 24 hours or even less, the 00:04:15.740 --> 00:04:22.690 more we're able to essentially fix up the parts that don't work so well. The 00:04:22.690 --> 00:04:25.979 companies that we see here are companies that are investing heavily in machine 00:04:25.979 --> 00:04:30.979 learning and AI. Part of how they're investing heavily is, they're collecting 00:04:30.979 --> 00:04:37.999 more and more data about you and me and everyone else. Google and Facebook: more 00:04:37.999 --> 00:04:42.789 than 1 billion active users. I was surprised to learn that in Germany the 00:04:42.789 --> 00:04:48.159 desktop search traffic for Google is higher than in most of the rest of the world. 00:04:48.159 --> 00:04:53.259 And Baidu is growing at the speed that broadband becomes available. And so, 00:04:53.259 --> 00:04:56.970 what we see is, these people are collecting this data and they also are 00:04:56.970 --> 00:05:02.779 using new technologies like GPUs and TPUs in new ways to parallelize workflows 00:05:02.779 --> 00:05:09.449 and with this they're able to mess up less, right? They're still messing up, but 00:05:09.449 --> 00:05:14.960 they mess up slightly less. And they're not going to lose interest in this 00:05:14.960 --> 00:05:20.550 topic, so we need to kind of start to prepare how we respond to this type of 00:05:20.550 --> 00:05:25.860 behavior. One of the things that has been a big area of research, actually also for 00:05:25.860 --> 00:05:30.080 a lot of these companies, is what we'll talk about today and that's adversarial 00:05:30.080 --> 00:05:36.800 machine learning. But the first thing that we'll start with is what is behind what we 00:05:36.800 --> 00:05:44.009 call AI.
So most of the time when you think of AI or something like Siri and so 00:05:44.009 --> 00:05:48.979 forth, you are actually potentially talking about an old-school rule-based 00:05:48.979 --> 00:05:53.930 system. This is a rule, like you say a particular thing and then Siri is like: 00:05:53.930 --> 00:05:58.129 "Yes, I know how to respond to this". And we even hard-program these types of things 00:05:58.129 --> 00:06:02.880 in, right? That is one version of AI: essentially, it's been pre-programmed to 00:06:02.880 --> 00:06:08.839 do and understand certain things. Another form is used, for example, by the 00:06:08.839 --> 00:06:12.619 people that are trying to build AI robots and the people that are trying to build 00:06:12.619 --> 00:06:17.110 what we call "general AI", so this is something that can maybe learn like a 00:06:17.110 --> 00:06:20.190 human: they'll use reinforcement learning. 00:06:20.190 --> 00:06:22.200 I don't specialize in reinforcement learning. 00:06:22.200 --> 00:06:26.401 But what it does is it essentially tries to reward you for 00:06:26.401 --> 00:06:32.429 behaviour that you're expected to do. So if you complete a task, you get a 00:06:32.429 --> 00:06:36.099 cookie. You complete two other tasks, you get two or three more cookies depending on 00:06:36.099 --> 00:06:41.759 how important the task is. And this will help you learn how to behave to get more 00:06:41.759 --> 00:06:45.990 points and it's used a lot in robots and gaming and so forth. And I'm not really 00:06:45.990 --> 00:06:49.340 going to talk about that today because most of that is still not really something 00:06:49.340 --> 00:06:54.880 that you or I interact with. Well, what I am gonna talk about today is neural 00:06:54.880 --> 00:06:59.680 networks, or as some people like to call them "deep learning", right? So deep 00:06:59.680 --> 00:07:04.119 learning won the neural network versus deep learning battle a while ago. So here's an 00:07:04.119 --> 00:07:09.949 example neural network: we have an input layer and that's where we essentially make 00:07:09.949 --> 00:07:14.550 a quantitative version of whatever our data is. So we need to make it into 00:07:14.550 --> 00:07:19.890 numbers. Then we have a hidden layer and we might have multiple hidden layers. And 00:07:19.890 --> 00:07:23.759 depending on how deep our network is (or a network inside a network, right, which is 00:07:23.759 --> 00:07:28.179 possible), we might have many different layers there and they may even 00:07:28.179 --> 00:07:33.539 act in cyclical ways. And then that's where all the weights and the variables 00:07:33.539 --> 00:07:39.259 and the learning happen. So that holds a lot of information and data that 00:07:39.259 --> 00:07:43.979 we eventually want to train there. And finally we have an output layer. And 00:07:43.979 --> 00:07:47.529 depending on the network and what we're trying to do, the output layer can vary 00:07:47.529 --> 00:07:51.539 between something that looks like the input, like for example if we want to 00:07:51.539 --> 00:07:55.719 machine translate, then I want the output to look like the input, right, I want it 00:07:55.719 --> 00:07:59.909 to just be in a different language, or the output could be a different class. It can 00:07:59.909 --> 00:08:05.749 be, you know, this is a car or this is a train and so forth. So it really depends 00:08:05.749 --> 00:08:10.610 what you're trying to solve, but the output layer gives us the answer.
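A minimal sketch of the input / hidden / output structure just described, in Keras, the Python library that comes up later in this talk. The layer sizes, activations and the ten-class output here are illustrative assumptions, not a model from the talk:

```python
# A minimal sketch of the input -> hidden -> output structure described
# above. Layer sizes and the ten-class output are illustrative assumptions.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# Input layer: our data turned into numbers, here a 784-dimensional vector
# (for example a flattened 28x28 image).
model.add(Dense(128, activation='relu', input_shape=(784,)))
# Hidden layer: this is where the weights, and the learning, live.
model.add(Dense(64, activation='relu'))
# Output layer: one probability per class ("this is a car, this is a train").
model.add(Dense(10, activation='softmax'))

# Training uses backpropagation with stochastic gradient descent, which the
# talk explains next.
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
```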
And how 00:08:10.610 --> 00:08:17.159 we train this is, we use backpropagation. Backpropagation is nothing new and neither 00:08:17.159 --> 00:08:21.139 is one of the most popular methods to do so, which is called stochastic gradient 00:08:21.139 --> 00:08:26.459 descent. What we do when we go through that part of the training is, we go from 00:08:26.459 --> 00:08:29.759 the output layer and we go backwards through the network. That's why it's 00:08:29.759 --> 00:08:34.828 called backpropagation, right? And as we go backwards through the network, in the 00:08:34.828 --> 00:08:39.139 simplest way, we upvote and downvote what's working and what's not working. So 00:08:39.139 --> 00:08:42.729 we say: "oh you got it right, you get a little bit more importance", or "you got 00:08:42.729 --> 00:08:46.040 it wrong, you get a little bit less importance". And eventually we hope, 00:08:46.040 --> 00:08:50.481 over time, that they essentially correct each other's errors enough that we get a 00:08:50.481 --> 00:08:57.550 right answer. So that's a very general overview of how it works and the cool 00:08:57.550 --> 00:09:02.720 thing is: Because it works that way, we can fool it. And people have been 00:09:02.720 --> 00:09:08.269 researching ways to fool it for quite some time. So I'll give you a brief overview of 00:09:08.269 --> 00:09:13.290 the history of this field, so we can kind of know where we're working from and maybe 00:09:13.290 --> 00:09:19.220 hopefully then where we're going. In 2005, one of the first important 00:09:19.220 --> 00:09:24.740 papers to approach adversarial learning was written by a group of 00:09:24.740 --> 00:09:29.630 researchers who wanted to see if they could act as an informed attacker and 00:09:29.630 --> 00:09:34.440 attack a linear classifier. So this is just a spam filter and they're like: can I 00:09:34.440 --> 00:09:37.850 send spam to my friend? I don't know why they would want to do this, but: "Can I 00:09:37.850 --> 00:09:43.209 send spam to my friend, if I try testing out a few ideas?" And what they were able 00:09:43.209 --> 00:09:47.639 to show is: Yes, rather than just, you know, trial and error which anybody can do, 00:09:47.639 --> 00:09:52.120 or a brute-force attack of just like send a thousand emails and see what happens, 00:09:52.120 --> 00:09:56.370 they were able to craft a few algorithms that they could use to try and find 00:09:56.370 --> 00:10:03.240 important words to change, to make it go through the spam filter. In 2007, NIPS, 00:10:03.240 --> 00:10:08.019 which is a very popular machine learning conference, had one of their first all-day 00:10:08.019 --> 00:10:12.930 workshops on computer security. And when they did so, they had a bunch of different 00:10:12.930 --> 00:10:16.780 people that were working on machine learning in computer security: from 00:10:16.780 --> 00:10:21.430 malware detection, to network intrusion detection, to, of course, spam. And they 00:10:21.430 --> 00:10:25.190 also had a few talks on this type of adversarial learning. So how do you act as 00:10:25.190 --> 00:10:29.980 an adversary to your own model? And then how do you learn how to counter that 00:10:29.980 --> 00:10:35.650 adversary? In 2013 there was a really great paper that got a lot of people's 00:10:35.650 --> 00:10:40.001 attention called "Poisoning Attacks against Support Vector Machines".
Now 00:10:40.001 --> 00:10:45.290 support vector machines are usually a linear classifier and we use 00:10:45.290 --> 00:10:50.121 them a lot to say "this is a member of this class, that one, or another", when it 00:10:50.121 --> 00:10:54.940 pertains to text. So I have a text and I want to know what the text is about, or I 00:10:54.940 --> 00:10:58.610 want to know if it's a positive or negative sentiment, a lot of times I'll 00:10:58.610 --> 00:11:05.160 use a support vector machine. We call them SVMs as well. Battista Biggio was the 00:11:05.160 --> 00:11:08.319 main researcher and he has actually written quite a lot about these poisoning 00:11:08.319 --> 00:11:15.569 attacks, and he poisoned the training data. So for a lot of these systems, sometimes 00:11:15.569 --> 00:11:20.820 they have active learning. This means, you or I, when we classify our emails as spam, 00:11:20.820 --> 00:11:26.290 we're helping train the network. So he poisoned the training data and was able to 00:11:26.290 --> 00:11:32.360 show that by poisoning it in a particular way, he was able to then send spam 00:11:32.360 --> 00:11:37.810 email because he knew what words were then benign, essentially. He went on to study a 00:11:37.810 --> 00:11:43.220 few other things about biometric data, if you're interested in biometrics. But then 00:11:43.220 --> 00:11:49.329 in 2014 Christian Szegedy, Ian Goodfellow, and a few other main researchers at Google 00:11:49.329 --> 00:11:55.350 Brain released "Intriguing Properties of Neural Networks." That really became the 00:11:55.350 --> 00:12:00.040 explosion of what we're seeing today in adversarial learning. And what they were 00:12:00.040 --> 00:12:04.629 able to do is say: "We believe there are linear properties of these 00:12:04.629 --> 00:12:08.790 neural networks, even if they're not necessarily linear networks. 00:12:08.790 --> 00:12:15.560 And we believe we can exploit them to fool them". And that work then introduced the 00:12:15.560 --> 00:12:23.189 fast gradient sign method, which we'll talk about later today. So how does it 00:12:23.189 --> 00:12:28.830 work? First I want us to get a little bit of an intuition around how this works. 00:12:28.830 --> 00:12:35.310 Here's a graphic of gradient descent. And in gradient descent, this vertical 00:12:35.310 --> 00:12:40.339 axis is our cost function. And what we're trying to do is: We're trying to minimize 00:12:40.339 --> 00:12:47.400 cost, we want to minimize the error. And so when we start out, we just choose random 00:12:47.400 --> 00:12:51.790 weights and variables, so all of our hidden layers, they just have maybe random 00:12:51.790 --> 00:12:57.339 weights or a random distribution. And then we want to get to a place where the 00:12:57.339 --> 00:13:01.740 weights have meaning, right? We want our network to know something, even if it's 00:13:01.740 --> 00:13:08.740 just a mathematical pattern, right? So we start in the high area of the graph, 00:13:08.740 --> 00:13:13.819 the reddish area, and that's where we start out: we have high error there. And 00:13:13.819 --> 00:13:21.209 then we try to get to the lowest area of the graph, or here the dark blue that is 00:13:21.209 --> 00:13:26.889 right about here. But sometimes what happens: As we learn, as we go through 00:13:26.889 --> 00:13:33.300 epochs and training, we're moving slowly down and hopefully we're optimizing.
But 00:13:33.300 --> 00:13:37.370 instead of this global minimum, we might end up in the 00:13:37.370 --> 00:13:43.800 local minimum, which is the other trail. And that's fine, because it's still low 00:13:43.800 --> 00:13:49.889 error, right? So we're still probably going to be able to succeed, but we might 00:13:49.889 --> 00:13:56.139 not get the best answer all the time. What adversarial learning tries to do, in the most basic 00:13:56.139 --> 00:14:01.980 of ways, is essentially push the error rate back up the hill for as many 00:14:01.980 --> 00:14:07.709 units as it can. So it essentially tries to increase the error slowly through 00:14:07.709 --> 00:14:14.600 perturbations. And by disrupting, let's say, the weakest links, like the one that 00:14:14.600 --> 00:14:19.060 did not find the global minimum but instead found a local minimum, we can 00:14:19.060 --> 00:14:23.069 hopefully fool the network, because we're finding those weak spots and we're 00:14:23.069 --> 00:14:25.629 capitalizing on them, essentially. 00:14:31.252 --> 00:14:34.140 So what does an adversarial example actually look like? 00:14:34.140 --> 00:14:37.430 You may have already seen this because it's very popular on the 00:14:37.430 --> 00:14:45.221 Twittersphere and a few other places, but this was a group of researchers at MIT. It 00:14:45.221 --> 00:14:51.059 was debated whether you could do adversarial learning in the real world. A 00:14:51.059 --> 00:14:57.339 lot of the research has just been on still images. And what they were able to show: 00:14:57.339 --> 00:15:03.079 They created a 3D-printed turtle. I mean it looks like a turtle to you as well, 00:15:03.079 --> 00:15:09.910 correct? And this 3D-printed turtle, according to the Inception network, which is a very popular 00:15:09.910 --> 00:15:16.790 computer vision network, is a rifle, and it is a rifle from every angle that you can 00:15:16.790 --> 00:15:21.959 see. And the way they were able to do this, and maybe the next time it goes 00:15:21.959 --> 00:15:25.910 around you can see it, and it's a little bit easier in the video, which I'll 00:15:25.910 --> 00:15:29.790 share at the end: you can see perhaps that there's a slight 00:15:29.790 --> 00:15:35.529 discoloration of the shell. They messed with the texture. By messing with this 00:15:35.529 --> 00:15:39.910 texture and the colors they were able to fool the neural network, they were able to 00:15:39.910 --> 00:15:45.259 activate different neurons that were not supposed to be activated. Units, I should 00:15:45.259 --> 00:15:51.129 say. So what we see here is, yeah, it can be done in the real world, and when I saw 00:15:51.129 --> 00:15:56.339 this I started getting really excited. Because video surveillance is a real 00:15:56.339 --> 00:16:02.529 thing, right? So if we can start fooling 3D objects, we can perhaps start fooling 00:16:02.529 --> 00:16:08.040 other things in the real world that we would like to fool. 00:16:08.040 --> 00:16:12.440 applause 00:16:12.440 --> 00:16:19.149 kjam: So why do adversarial examples exist? We're going to talk a little bit 00:16:19.149 --> 00:16:23.879 about some things that are approximations of what's actually happening, so please 00:16:23.879 --> 00:16:27.610 forgive me for not always being exact, but I would rather us all have a general 00:16:27.610 --> 00:16:33.660 understanding of what's happening.
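To make the "push the error back up the hill" intuition concrete, here is a toy sketch in plain NumPy with a one-weight model; the cost function and all the numbers are made up purely for illustration:

```python
# Toy illustration: training walks the weight downhill on the cost surface,
# while an FGSM-style perturbation nudges the *input* back uphill.
import numpy as np

def cost(w, x, y):
    return (w * x - y) ** 2        # toy model: prediction = w * x

def grad_w(w, x, y):
    return 2 * (w * x - y) * x     # d cost / d weight (used for training)

def grad_x(w, x, y):
    return 2 * (w * x - y) * w     # d cost / d input (used for the attack)

w, x, y, lr = 0.0, 1.0, 2.0, 0.1
for _ in range(25):                # gradient descent: the error goes down
    w -= lr * grad_w(w, x, y)
print(cost(w, x, y))               # near zero: we found a minimum

eps = 0.25                         # adversarial step: push the error back up
x_adv = x + eps * np.sign(grad_x(w, x, y))
print(cost(w, x_adv, y))           # noticeably higher error on the same model
```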
Across 00:16:33.660 --> 00:16:39.480 the top row we have an input layer, and these images to the left, we can see, are the source images, and this source image is 00:16:39.480 --> 00:16:43.380 like a piece of farming equipment or something. And on the right we have our 00:16:43.380 --> 00:16:48.800 guide image. This is what we're trying to get the network to see: we want it to 00:16:48.800 --> 00:16:55.070 misclassify this farm equipment as a pink bird. So what these researchers did is 00:16:55.070 --> 00:16:59.019 they targeted different layers of the network. And they said: "Okay, we're going 00:16:59.019 --> 00:17:02.410 to use this method to target this particular layer and we'll see what 00:17:02.410 --> 00:17:07.569 happens". And so as they targeted these different layers you can see what's 00:17:07.569 --> 00:17:12.109 happening in the internal visualization. Now neural networks can't see, right? 00:17:12.109 --> 00:17:17.939 They're looking at matrices of numbers, but what we can do is use those 00:17:17.939 --> 00:17:26.559 internal values to try and see with our human eyes what they are learning. And we 00:17:26.559 --> 00:17:31.370 can see here clearly inside the network, we no longer see the farming equipment, 00:17:31.370 --> 00:17:39.550 right? We see a pink bird. And this is not visible to our human eyes. Now if you 00:17:39.550 --> 00:17:43.570 really study it and if you enlarge the image, you can start to see, okay, there's a little 00:17:43.570 --> 00:17:48.190 bit of pink here or some greens, I don't know what's happening, but we still see the farm equipment, 00:17:48.190 --> 00:17:56.510 while the neural network has been tricked. Now people don't exactly know yet why these 00:17:56.510 --> 00:18:03.159 blind spots exist. So it's still an area of active research exactly why we can fool 00:18:03.159 --> 00:18:09.429 neural networks so easily. There are some prominent researchers that believe that 00:18:09.429 --> 00:18:14.450 neural networks are essentially very linear and that we can use this simple 00:18:14.450 --> 00:18:20.840 linearity to misclassify, to jump into another area. But there are others that 00:18:20.840 --> 00:18:24.820 believe that there are these pockets or blind spots and that we can then find 00:18:24.820 --> 00:18:28.500 these blind spots where these neurons really are the weakest links, and they 00:18:28.500 --> 00:18:33.160 maybe even haven't learned anything, and if we change their activation then we can 00:18:33.160 --> 00:18:37.580 fool the network easily. So this is still an area of active research, and let's say 00:18:37.580 --> 00:18:44.320 you're looking for a thesis topic, this would be a pretty neat thing to work on. So 00:18:44.320 --> 00:18:49.399 we'll get into just a brief overview of some of the math behind the most popular 00:18:49.399 --> 00:18:55.571 methods. First we have the fast gradient sign method, and that is what was used in the 00:18:55.571 --> 00:18:59.950 initial paper, and now there have been many iterations on it. And what we do is we 00:18:59.950 --> 00:19:05.120 have our same cost function, so this is the same way that we're trying to train 00:19:05.120 --> 00:19:13.110 our network and it's trying to learn.
And we take the gradient sign of that. And 00:19:13.110 --> 00:19:16.330 it's okay if you're not used to doing vector calculus, and 00:19:16.330 --> 00:19:20.250 especially not without a pen and paper in front of you, but what we're 00:19:20.250 --> 00:19:24.140 doing is essentially trying to calculate some approximation of a 00:19:24.140 --> 00:19:29.700 derivative of the function. And this can kind of tell us where it's going. And if 00:19:29.700 --> 00:19:37.299 we know where it's going, we can maybe anticipate that and change. And then to 00:19:37.299 --> 00:19:41.480 create the adversarial images, we take the original input plus a small 00:19:41.480 --> 00:19:48.770 number epsilon times that gradient sign, so x_adv = x + ε · sign(∇_x J(θ, x, y)). For the Jacobian Saliency Map, this is a 00:19:48.770 --> 00:19:55.010 newer method and it's a little bit more effective, but it takes a little bit more 00:19:55.010 --> 00:20:02.250 compute. This Jacobian Saliency Map uses a Jacobian matrix, and if you remember, 00:20:02.250 --> 00:20:07.649 and it's okay if you don't, a Jacobian matrix looks at the full derivative of 00:20:07.649 --> 00:20:12.049 a function, so you take the full derivative of the cost function 00:20:12.049 --> 00:20:18.269 at that vector, and it gives you a matrix that is a pointwise approximation, 00:20:18.269 --> 00:20:22.550 if the function is differentiable at that input vector. Don't 00:20:22.550 --> 00:20:28.320 worry, you can review this later too. We then use the Jacobian matrix to create 00:20:28.320 --> 00:20:33.059 this saliency map the same way, where we're essentially trying some sort of linear 00:20:33.059 --> 00:20:38.830 approximation, or pointwise approximation, and we then want to find the two pixels that 00:20:38.830 --> 00:20:43.860 we can perturb that cause the most disruption. And then we continue to the 00:20:43.860 --> 00:20:48.970 next. Unfortunately this is currently an O(n²) problem, but there are a few people 00:20:48.970 --> 00:20:53.910 that are trying to essentially find ways that we can approximate this and make it 00:20:53.910 --> 00:21:01.320 faster. So maybe now you want to fool a network too, and I hope you do, because 00:21:01.320 --> 00:21:06.580 that's what we're going to talk about. First you need to pick a problem or a 00:21:06.580 --> 00:21:13.460 network type you may already know. But you may want to investigate what perhaps 00:21:13.460 --> 00:21:19.019 this company is using, what perhaps this method is using, and do a little bit of 00:21:19.019 --> 00:21:23.730 research, because that's going to help you. Then you want to research state-of-the-art 00:21:23.730 --> 00:21:28.610 methods, and this is like a typical research statement that you have a new 00:21:28.610 --> 00:21:32.360 state-of-the-art method, but the good news is that the state of the art two to 00:21:32.360 --> 00:21:38.179 three years ago is most likely in production or in systems today. So once 00:21:38.179 --> 00:21:44.480 they find ways to speed it up, some approximation of that is deployed. And a 00:21:44.480 --> 00:21:48.279 lot of times these are then publicly available models, so a lot of times, if 00:21:48.279 --> 00:21:51.480 you're already working with a deep learning framework, it'll come 00:21:51.480 --> 00:21:56.450 prepackaged with a few of the different popular models, so you can even use those. 00:21:56.450 --> 00:22:00.691 If you're already building neural networks, of course you can build your own.
An 00:22:00.691 --> 00:22:05.510 optional step, but one that might be recommended, is to fine-tune your model, 00:22:05.510 --> 00:22:10.750 and what this means is to essentially take a new training data set, maybe data that 00:22:10.750 --> 00:22:15.490 you think this company is using or that you think this network is using, and 00:22:15.490 --> 00:22:19.300 you're going to remove the last few layers of the neural network and you're going to 00:22:19.300 --> 00:22:24.809 retrain it. So you essentially are nicely piggybacking on the work of the pre-trained 00:22:24.809 --> 00:22:30.650 model and you're using the final layers to create finesse. This essentially 00:22:30.650 --> 00:22:37.169 makes your model better at the task that you have for it. Finally then you use a 00:22:37.169 --> 00:22:40.260 library, and we'll go through a few of them, but some of the ones that I have 00:22:40.260 --> 00:22:46.450 used myself are cleverhans, DeepFool and deep-pwning, and these all come with nice 00:22:46.450 --> 00:22:51.580 built-in features for you to use for, let's say, the fast gradient sign method, the 00:22:51.580 --> 00:22:56.740 Jacobian saliency map and a few other methods that are available. Finally, it's 00:22:56.740 --> 00:23:01.550 not always going to work, so depending on your source and your target, you won't 00:23:01.550 --> 00:23:05.840 always necessarily find a match. What researchers have shown is it's a lot 00:23:05.840 --> 00:23:10.950 easier to fool a network that a cat is a dog than it is to fool a network that a 00:23:10.950 --> 00:23:16.030 cat is an airplane. And we can make intuitive sense of this, so you might 00:23:16.030 --> 00:23:21.830 want to pick an input that's not super dissimilar from where you want to go, but 00:23:21.830 --> 00:23:28.260 is dissimilar enough. And you want to test it locally and then finally test the ones 00:23:28.260 --> 00:23:38.149 with the highest misclassification rates on the target network. And you might say 00:23:38.149 --> 00:23:44.230 Katharine, or you can call me kjam, that's okay. You might say: "I don't know what 00:23:44.230 --> 00:23:50.049 the person is using", "I don't know what the company is using" and I will say "it's 00:23:50.049 --> 00:23:56.750 okay", because what's been proven: You can attack a black-box model, you do not have 00:23:56.750 --> 00:24:01.950 to know what they're using, you do not have to know exactly how it works, you 00:24:01.950 --> 00:24:06.760 don't even have to know their training data, because what you can do is, if it 00:24:06.760 --> 00:24:12.710 has... okay, addendum: it has to have some API you can interface with. But if it has 00:24:12.710 --> 00:24:18.130 an API you can interface with, or even any API you can interact with that uses the 00:24:18.130 --> 00:24:24.840 same type of learning, you can collect training data by querying the API. And 00:24:24.840 --> 00:24:28.700 then you're training your local model on that data that you're collecting. So 00:24:28.700 --> 00:24:32.890 you're collecting the data, you're training your local model, and as your 00:24:32.890 --> 00:24:37.299 local model gets more accurate and more similar to the deployed black box whose 00:24:37.299 --> 00:24:43.409 workings you don't know, you are then still able to fool it.
And what this paper 00:24:43.409 --> 00:24:49.730 proved, Nicolas Papernot and a few other great researchers, is that with usually 00:24:49.730 --> 00:24:56.527 less than six thousand queries they were able to fool the network with between 84% and 97% certainty. 00:24:59.301 --> 00:25:03.419 And what the same group of researchers also studied is the ability 00:25:03.419 --> 00:25:09.241 to transfer the ability to fool one network into another network, and they 00:25:09.241 --> 00:25:14.910 called that transferability. So I can take a certain type of network and I can 00:25:14.910 --> 00:25:19.320 use adversarial examples against this network to fool a different type of 00:25:19.320 --> 00:25:26.269 machine learning technique. Here we have their matrix, their heat map, that shows 00:25:26.269 --> 00:25:32.730 us exactly what they were able to fool. So we have across the left-hand side here the 00:25:32.730 --> 00:25:37.740 source machine learning technique, we have deep learning, logistic regression, SVMs 00:25:37.740 --> 00:25:43.380 like we talked about, decision trees and K-nearest-neighbors. And across the bottom 00:25:43.380 --> 00:25:47.340 we have the target machine learning, so what they were targeting. They created the 00:25:47.340 --> 00:25:51.470 adversaries with the left-hand side and they targeted across the bottom. We 00:25:51.470 --> 00:25:56.700 finally have an ensemble model at the end. And what they were able to show is, 00:25:56.700 --> 00:26:03.130 for example, SVMs and decision trees are quite easy to fool, logistic 00:26:03.130 --> 00:26:08.480 regression a little bit less so, but still strongly. For deep learning and K-nearest-neighbors, 00:26:08.480 --> 00:26:13.460 if you train a deep learning model or a K-nearest-neighbor model, then 00:26:13.460 --> 00:26:18.179 that performs fairly well against itself. And so what they were able to show is that 00:26:18.179 --> 00:26:23.320 you don't necessarily need to know the target machine, and you don't even have to 00:26:23.320 --> 00:26:28.050 get it right; even if you do know, you can use a different type of machine learning 00:26:28.050 --> 00:26:30.437 technique to target the network. 00:26:34.314 --> 00:26:39.204 So we'll look at six lines of Python here, and in 00:26:39.204 --> 00:26:44.559 these six lines of Python I'm using the cleverhans library, and in six lines of 00:26:44.559 --> 00:26:52.419 Python I can both generate my adversarial input and even predict on it. So if 00:26:52.419 --> 00:27:02.350 you don't code Python, it's pretty easy to learn and pick up. And for example here we 00:27:02.350 --> 00:27:06.830 have Keras, and Keras is a very popular deep learning library in Python, it 00:27:06.830 --> 00:27:12.070 usually works with a Theano or a TensorFlow backend, and we can just wrap 00:27:12.070 --> 00:27:19.250 our model, pass it to the FastGradientMethod class, and then set up some 00:27:19.250 --> 00:27:24.630 parameters, so here's our epsilon and a few extra parameters, this is to tune our 00:27:24.630 --> 00:27:30.860 adversary, and finally we can generate our adversarial examples and then predict on 00:27:30.860 --> 00:27:39.865 them. So in a very small amount of Python we're able to target and trick a network. 00:27:40.710 --> 00:27:45.791 If you're already using TensorFlow or Keras, it already works with those libraries.
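For reference, roughly what those six lines look like, written against the v2-era cleverhans API of the time (module paths and signatures have changed in later releases); `model` is assumed to be a compiled Keras classifier with inputs scaled to [0, 1] and the input shape is an assumption:

```python
# A sketch of the six lines described above, v2-era cleverhans API.
import tensorflow as tf
import keras
from cleverhans.utils_keras import KerasModelWrapper
from cleverhans.attacks import FastGradientMethod

sess = keras.backend.get_session()            # TensorFlow session Keras uses
x = tf.placeholder(tf.float32, shape=(None, 28, 28, 1))  # assumed input shape

wrap = KerasModelWrapper(model)               # wrap our Keras model
fgsm = FastGradientMethod(wrap, sess=sess)    # the fast gradient sign method
fgsm_params = {'eps': 0.3,                    # epsilon tunes our adversary
               'clip_min': 0., 'clip_max': 1.}
adv_x = fgsm.generate(x, **fgsm_params)       # generate adversarial examples
preds_adv = model(adv_x)                      # ... and predict on them
```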
00:27:48.828 --> 00:27:52.610 Deep-pwning is one of the first libraries that I heard about in this space, 00:27:52.610 --> 00:27:58.200 and it was presented at Def Con in 2016, and what it comes with is a bunch of 00:27:58.200 --> 00:28:03.320 TensorFlow built-in code. It even comes with a way that you can train the model 00:28:03.320 --> 00:28:06.730 yourself, so it has a few different models, a few different convolutional 00:28:06.730 --> 00:28:12.130 neural networks, and these are predominantly used in computer vision. 00:28:12.130 --> 00:28:18.090 It also however has a semantic model, and I normally work in NLP and I was pretty 00:28:18.090 --> 00:28:24.240 excited to try it out. What it comes built with is the Rotten Tomatoes sentiment task, so 00:28:24.240 --> 00:28:29.900 this is Rotten Tomatoes movie reviews where you try to learn whether a review is positive or negative. 00:28:30.470 --> 00:28:35.269 So the original text that I put in, when I was generating my adversarial examples, 00:28:35.269 --> 00:28:41.500 was "more trifle than triumph", which is a real review, and the adversarial text that 00:28:41.500 --> 00:28:46.080 it gave me was "jonah refreshing haunting leaky" 00:28:49.470 --> 00:28:52.660 ...Yeah.. so I was able to fool my network, 00:28:52.660 --> 00:28:57.559 but I lost any type of meaning, and this is really the problem when we think 00:28:57.559 --> 00:29:03.539 about how we apply adversarial learning to different tasks: it's easy for an image, 00:29:03.539 --> 00:29:08.960 if we make a few changes, to still look the same, right? It's many, many pixels, 00:29:08.960 --> 00:29:14.139 but when we start going into language, if we change one word and then another word 00:29:14.139 --> 00:29:18.950 and another word, or maybe we changed all of the words, we no longer understand as 00:29:18.950 --> 00:29:23.120 humans. And I would say this is garbage in, garbage out, this is not actual 00:29:23.120 --> 00:29:28.759 adversarial learning. So we have a long way to go when it comes to language tasks 00:29:28.759 --> 00:29:32.740 and being able to do adversarial learning, and there is some research in this, but 00:29:32.740 --> 00:29:37.279 it's not really advanced yet. So hopefully this is something that we can continue to 00:29:37.279 --> 00:29:42.429 work on and advance further, and if so we need to support a few different types of 00:29:42.429 --> 00:29:47.426 networks that are more common in NLP than they are in computer vision. 00:29:50.331 --> 00:29:54.759 There are some other notable open-source libraries that are available to you and I'll cover just a 00:29:54.759 --> 00:29:59.610 few here. There's a "Vanderbilt computational economics research lab" that 00:29:59.610 --> 00:30:03.679 has adlib, and this allows you to do poisoning attacks. So if you want to 00:30:03.679 --> 00:30:09.429 target training data and poison it, then you can do so with that and use scikit-learn. 00:30:09.429 --> 00:30:16.590 DeepFool is similar to the fast gradient sign method, but it tries to make 00:30:16.590 --> 00:30:21.590 smaller perturbations, it tries to be less detectable to us humans. 00:30:23.171 --> 00:30:28.284 It's based on Theano, which is another Python deep learning library. 00:30:29.669 --> 00:30:34.049 "FoolBox" is kind of neat because I only heard about it last week, but it collects 00:30:34.049 --> 00:30:39.309 a bunch of different techniques all in one library and you could use it with one 00:30:39.309 --> 00:30:43.160 interface.
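A sketch of that single interface, based on the foolbox 1.x API that was current around the time of this talk (later versions changed the interface); `kmodel`, `image` and `label` are assumed to already exist:

```python
# FoolBox's "one interface" idea: different techniques, called the same way.
import foolbox

fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255))

# Swapping techniques is just swapping the attack class:
for Attack in (foolbox.attacks.FGSM,
               foolbox.attacks.SaliencyMapAttack,
               foolbox.attacks.DeepFoolAttack):
    attack = Attack(fmodel)
    adversarial = attack(image, label)  # perturbed image, or None on failure
    print(Attack.__name__, adversarial is not None)
```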
So if you want to experiment with a few different ones at once, I would 00:30:43.160 --> 00:30:47.460 recommend taking a look at that, and finally for something that we'll talk 00:30:47.460 --> 00:30:53.600 about briefly in a short period of time we have "Evolving AI Lab", which released a 00:30:53.600 --> 00:30:59.710 fooling library, and this fooling library is able to generate images where you or I 00:30:59.710 --> 00:31:04.573 can't tell what they are, but the neural network is convinced it is something. 00:31:05.298 --> 00:31:09.940 We'll talk about maybe some applications of this in a moment, but they 00:31:09.940 --> 00:31:13.559 also open-sourced all of their code, and researchers who open-source 00:31:13.559 --> 00:31:19.649 their code, that is always very exciting. As you may have noticed from some of the 00:31:19.649 --> 00:31:25.500 research I already cited, most of the studies and the research in this area has 00:31:25.500 --> 00:31:29.830 been on malicious attacks. So there are very few people trying to figure out how to do 00:31:29.830 --> 00:31:33.769 this for what I would call benevolent purposes. Most of them are trying to act 00:31:33.769 --> 00:31:39.539 as an adversary in the traditional computer security sense. They're perhaps 00:31:39.539 --> 00:31:43.889 studying spam filters and how spammers can get by them. They're perhaps looking at 00:31:43.889 --> 00:31:48.669 network intrusion or botnet attacks and so forth. They're perhaps looking at self-driving 00:31:48.669 --> 00:31:53.390 cars, and I know that was referenced earlier as well in Hendrik and 00:31:53.390 --> 00:31:57.889 Karen's talk: they're perhaps trying to make a yield sign look like a stop sign, or 00:31:57.889 --> 00:32:02.760 a stop sign look like a yield sign or a speed limit sign, and so forth, and scarily 00:32:02.760 --> 00:32:07.669 they are quite successful at this. Or perhaps they're looking at data poisoning: 00:32:07.669 --> 00:32:12.441 how do we poison the model so we render it useless in a particular context, so we 00:32:12.441 --> 00:32:17.990 can utilize that? And finally for malware. So what a few researchers were able to 00:32:17.990 --> 00:32:22.669 show is, by just changing a few things in the malware they were able to upload their 00:32:22.669 --> 00:32:26.270 malware to Google Mail and send it to someone, and this was still fully 00:32:26.270 --> 00:32:31.580 functional malware. In that same sense there's the malGAN project, which uses a 00:32:31.580 --> 00:32:38.549 generative adversarial network to create malware that works, I guess. So there's a 00:32:38.549 --> 00:32:43.326 lot of research on these kinds of malicious attacks within adversarial learning. 00:32:44.984 --> 00:32:51.929 But what I wonder is how might we use this for good. And I put "good" in quotation marks, 00:32:51.929 --> 00:32:56.179 because we all have different ethical and moral systems we use. And what you 00:32:56.179 --> 00:33:00.289 decide is ethical for you might be different. But I think as a community, 00:33:00.289 --> 00:33:05.450 especially at a conference like this, hopefully we can converge on some ethical, 00:33:05.450 --> 00:33:10.183 privacy-concerned version of using these networks. 00:33:13.237 --> 00:33:20.990 So I've composed a few ideas and I hope that this is just a starting list of a longer conversation. 00:33:22.889 --> 00:33:30.010 One idea is that we can perhaps use this type of adversarial learning to fool surveillance.
00:33:30.830 --> 00:33:36.470 As surveillance affects you and me, it even disproportionately affects people that 00:33:36.470 --> 00:33:41.870 most likely can't be here. So whether or not we're personally affected, we can care 00:33:41.870 --> 00:33:46.419 about the many lives that are affected by this type of surveillance. And we can try 00:33:46.419 --> 00:33:49.667 and build ways to fool surveillance systems. 00:33:50.937 --> 00:33:52.120 Steganography: 00:33:52.120 --> 00:33:55.223 So potentially, in a world where more and more people 00:33:55.223 --> 00:33:58.780 have less of a private way of sending messages to one another, 00:33:58.780 --> 00:34:03.080 we can perhaps use adversarial learning to send private messages. 00:34:03.830 --> 00:34:08.310 Adware fooling: So again, where I might have quite a lot of 00:34:08.310 --> 00:34:13.859 privilege and I don't actually see ads that are predatory on me as much, there are 00:34:13.859 --> 00:34:19.449 a lot of people in the world that face predatory advertising. And so how can we 00:34:19.449 --> 00:34:23.604 help with those problems by developing adversarial techniques? 00:34:24.638 --> 00:34:26.520 Poisoning your own private data: 00:34:27.386 --> 00:34:30.600 This depends on whether you actually need to use the service and 00:34:30.600 --> 00:34:34.590 whether you like how the service is helping you with the machine learning, but 00:34:34.590 --> 00:34:40.110 if you don't care, or if you essentially need to have a burn box of your data, 00:34:40.110 --> 00:34:45.760 then potentially you could poison your own private data. Finally, I want us to use it 00:34:45.760 --> 00:34:51.139 to investigate deployed models. So even if we don't actually have a use for 00:34:51.139 --> 00:34:56.010 fooling this particular network, the more we know about what's deployed and how we 00:34:56.010 --> 00:35:00.350 can fool it, the more we're able to keep up with this technology as it continues to 00:35:00.350 --> 00:35:04.630 evolve. So the more that we're practicing, the more that we're ready for whatever 00:35:04.630 --> 00:35:09.800 might happen next. And finally I really want to hear your ideas as well. So I'll 00:35:09.800 --> 00:35:13.940 be here throughout the whole Congress and of course you can share during the Q&A 00:35:13.940 --> 00:35:17.073 time. If you have great ideas, I really want to hear them. 00:35:20.635 --> 00:35:26.085 So I decided to play around a little bit with some of my ideas. 00:35:26.810 --> 00:35:32.720 And I was convinced perhaps that I could make Facebook think I was a cat. 00:35:33.305 --> 00:35:36.499 This is my goal. Can Facebook think I'm a cat? 00:35:37.816 --> 00:35:40.704 Because nobody really likes Facebook. I mean let's be honest, right? 00:35:41.549 --> 00:35:44.166 But I have to be on it because my mom messages me there 00:35:44.166 --> 00:35:46.020 and she doesn't use email anymore. 00:35:46.020 --> 00:35:47.890 So I'm on Facebook. Anyways. 00:35:48.479 --> 00:35:55.151 So I used a pre-trained Inception model and Keras and I fine-tuned the layers. 00:35:55.151 --> 00:35:57.190 And I'm not a computer vision person really. But it 00:35:57.190 --> 00:36:01.770 took me like a day to figure out how computer vision people transform their data 00:36:01.770 --> 00:36:06.350 into something I can put inside of a network, and once I figured that out I was able to 00:36:06.350 --> 00:36:12.040 quickly train a model, and the model could only distinguish between people and cats.
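A rough sketch of that fine-tuning recipe in Keras: take a pre-trained Inception model, cut off the final layers, and train a small new head on two classes. The layer sizes and optimizer here are assumptions for illustration, not the exact code used for this experiment:

```python
# Fine-tuning a pre-trained Inception model for a person-vs-cat classifier.
from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

base = InceptionV3(weights='imagenet', include_top=False)

x = GlobalAveragePooling2D()(base.output)   # pool the convolutional features
x = Dense(256, activation='relu')(x)        # new final layers: the "finesse"
out = Dense(1, activation='sigmoid')(x)     # person (0) vs. cat (1)
model = Model(inputs=base.input, outputs=out)

for layer in base.layers:                   # piggyback on the pre-trained
    layer.trainable = False                 # weights: train only the new head
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
# model.fit(...) with your people-and-cats images goes here.
```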
00:36:12.040 --> 00:36:15.140 That's all the model knew how to do. I give it a picture, it says it's a person or 00:36:15.140 --> 00:36:19.630 it's a cat. I actually didn't try just giving it an image of something else, it 00:36:19.630 --> 00:36:25.380 would probably guess it's a person or a cat maybe, 50/50, who knows. What I did 00:36:25.380 --> 00:36:31.930 was, I used an image of myself, and eventually I had my fast gradient sign 00:36:31.930 --> 00:36:37.700 method, I used cleverhans, and I was able to slowly increase the epsilon, and when the 00:36:37.700 --> 00:36:44.100 epsilon is low, you and I can't see the perturbations, but the network also 00:36:44.100 --> 00:36:48.920 can't see the perturbations. So we need to increase it, and of course as we increase 00:36:48.920 --> 00:36:53.300 it, when we're using a technique like FGSM, we are also increasing the noise 00:36:53.300 --> 00:37:00.830 that we see. And on the way up to 2.21 epsilon, I kept uploading it to Facebook and 00:37:00.830 --> 00:37:02.350 Facebook kept saying: "Yeah, do you want to tag yourself?" and I'm like: 00:37:02.370 --> 00:37:04.222 "no I don't, I'm just testing". 00:37:05.123 --> 00:37:11.379 Finally I got to an epsilon where Facebook no longer knew I was a face. 00:37:11.379 --> 00:37:15.323 So I was just a book, I was a cat book, maybe. 00:37:15.340 --> 00:37:19.590 applause 00:37:21.311 --> 00:37:24.740 kjam: So, unfortunately, as we see, I didn't actually become a cat, because that 00:37:24.740 --> 00:37:30.630 would be pretty neat. But I was able to fool it. I spoke with a computer vision 00:37:30.630 --> 00:37:34.760 specialist that I know, and she actually works in this, and I was like: "What 00:37:34.760 --> 00:37:39.020 methods do you think Facebook was using? Did I really fool the neural network or 00:37:39.020 --> 00:37:43.140 what did I do?" And she's convinced that most likely they're actually using a 00:37:43.140 --> 00:37:47.580 statistical method called Viola-Jones, which takes a look at the statistical 00:37:47.580 --> 00:37:53.280 distribution of your face and tries to guess if there's really a face there. But 00:37:53.280 --> 00:37:58.800 what I was able to show: transferability. That is, I can use my neural network even 00:37:58.800 --> 00:38:05.380 to fool this statistical model, so now I have a very noisy but happy photo on FB. 00:38:08.548 --> 00:38:14.140 Another use case potentially is adversarial steganography, and I was really 00:38:14.140 --> 00:38:18.590 excited reading this paper. What this paper covered, and they actually released 00:38:18.590 --> 00:38:22.860 the library, as I mentioned, is the ability of a neural network to be 00:38:22.860 --> 00:38:26.309 convinced that something's there that's not actually there. 00:38:27.149 --> 00:38:30.177 And what they used, they used the MNIST training set. 00:38:30.240 --> 00:38:33.420 I'm sorry, if that's like a trigger word, 00:38:33.420 --> 00:38:38.410 if you've used MNIST a million times, then I'm sorry for this, but what they use is 00:38:38.410 --> 00:38:43.290 MNIST, which is the digits zero through nine, and what they were able to show 00:38:43.290 --> 00:38:48.790 using evolutionary algorithms is they were able to generate things that to us look 00:38:48.790 --> 00:38:53.280 maybe like art, and they actually used it on the CIFAR data set too, which has 00:38:53.280 --> 00:38:57.320 colors, and it was quite beautiful. Some of what they created in fact they showed 00:38:57.320 --> 00:39:04.340 in a gallery.
And what the network sees here is the digits across the top. They 00:39:04.340 --> 00:39:12.170 see that digit, they are more than 99% convinced that that digit is there, and 00:39:12.170 --> 00:39:15.476 what we see is pretty patterns or just noise. 00:39:16.778 --> 00:39:19.698 When I was reading this paper I was thinking, 00:39:19.698 --> 00:39:23.620 how can we use this to send messages to each other that nobody else 00:39:23.620 --> 00:39:28.511 will know are there? I'm just sending really nice... I'm an artist and this is 00:39:28.511 --> 00:39:35.200 my art and I'm sharing it with my friend. And in a world where I'm afraid to go home 00:39:35.200 --> 00:39:42.360 because there's a crazy person in charge and I'm afraid that they might look at my 00:39:42.360 --> 00:39:47.040 phone, my computer, and a million other things, and I just want to make sure that 00:39:47.040 --> 00:39:51.650 my friend has my PIN number or this or that or whatever. I see a use case for my 00:39:51.650 --> 00:39:56.120 life, but again I live a fairly privileged life; there are other people 00:39:56.120 --> 00:40:01.690 whose actual life and livelihood and security might depend on using a technique 00:40:01.690 --> 00:40:06.150 like this. And I think we could use adversarial learning to create a new form 00:40:06.150 --> 00:40:07.359 of steganography. 00:40:11.289 --> 00:40:17.070 Finally, I cannot stress enough that the more information we have 00:40:17.070 --> 00:40:20.620 about the systems that we interact with every day, our machine learning 00:40:20.620 --> 00:40:24.850 systems, our AI systems, or whatever you want to call them, our deep 00:40:24.850 --> 00:40:29.701 networks, the more information we have, the better we can fight them, right. We 00:40:29.701 --> 00:40:33.920 don't need perfect knowledge, but the more knowledge that we have, the better an 00:40:33.920 --> 00:40:41.360 adversary we can be. I thankfully now live in Germany, and if you are also a European 00:40:41.360 --> 00:40:46.770 resident: We have GDPR, which is the General Data Protection Regulation, and it 00:40:46.770 --> 00:40:55.650 goes into effect in May of 2018. We can use GDPR to make requests about our data, 00:40:55.650 --> 00:41:00.450 we can use GDPR to make requests about machine learning systems that we interact 00:41:00.450 --> 00:41:07.840 with, this is a right that we have. And in recital 71 of the GDPR it states: "The 00:41:07.840 --> 00:41:12.550 data subject should have the right not to be subject to a decision, which may 00:41:12.550 --> 00:41:17.730 include a measure, evaluating personal aspects relating to him or her which is 00:41:17.730 --> 00:41:22.880 based solely on automated processing and which produces legal effects concerning 00:41:22.880 --> 00:41:28.010 him or her or similarly significantly affects him or her, such as automatic 00:41:28.010 --> 00:41:33.620 refusal of an online credit application or e-recruiting practices without any human 00:41:33.620 --> 00:41:39.270 intervention." And I'm not a lawyer and I don't know how this will be implemented, 00:41:39.270 --> 00:41:43.990 and it's a recital, so we don't even know if it will be enforced the same way, but 00:41:43.990 --> 00:41:50.720 the good news is: Pieces of this same sentiment are in the actual articles, and 00:41:50.720 --> 00:41:55.580 if they're in the articles, then we can legally use them.
And what it also says 00:41:55.580 --> 00:41:59.920 is, we can ask companies to port our data to other places, we can ask companies to 00:41:59.920 --> 00:42:03.890 delete our data, we can ask for information about how our data is 00:42:03.890 --> 00:42:09.010 processed, we can ask for information about what different automated decisions 00:42:09.010 --> 00:42:15.750 are being made, and the more we all here ask for that data, the more we can also 00:42:15.750 --> 00:42:20.530 share that same information with people worldwide. Because the systems that we 00:42:20.530 --> 00:42:25.091 interact with, they're not special to us, they're the same types of systems that are 00:42:25.091 --> 00:42:30.610 being deployed everywhere in the world. So we can help our fellow humans outside of 00:42:30.610 --> 00:42:36.400 Europe by being good caretakers and using our rights to make more information 00:42:36.400 --> 00:42:41.960 available to the entire world, and to use this information to find ways to use 00:42:41.960 --> 00:42:46.242 adversarial learning to fool these types of systems. 00:42:47.512 --> 00:42:56.500 applause 00:42:56.662 --> 00:43:03.360 So how else might we be able to harness this for good? I cannot stress enough 00:43:03.360 --> 00:43:08.260 GDPR and our right to collect more information about the information they're 00:43:08.260 --> 00:43:14.110 already collecting about us and everyone else. So use it, let's find ways to share 00:43:14.110 --> 00:43:17.740 the information we gain from it. So I don't want it to just be that one person 00:43:17.740 --> 00:43:21.020 requests it and they learn something. We have to find ways to share this 00:43:21.020 --> 00:43:28.080 information with one another. Test low-tech ways. I'm so excited about the maker 00:43:28.080 --> 00:43:32.850 space here and maker culture and other low-tech or human-crafted ways to fool 00:43:32.850 --> 00:43:37.890 networks. We can use adversarial learning perhaps to get good ideas on how to fool 00:43:37.890 --> 00:43:43.350 networks, to find lower-tech ways. What if I painted red pixels all over my face? 00:43:43.350 --> 00:43:48.600 Would I still be recognized? Would I not? Let's experiment with things that we learn 00:43:48.600 --> 00:43:53.570 from adversarial learning and try to find other lower-tech solutions to the same problem. 00:43:55.428 --> 00:43:59.930 Finally, or nearly finally, we need to increase the research beyond just 00:43:59.930 --> 00:44:04.010 computer vision. Quite a lot of adversarial learning has been only in 00:44:04.010 --> 00:44:08.220 computer vision, and while I think that's important and it's also been very 00:44:08.220 --> 00:44:12.030 practical, because we can start to see how we can fool something, we need to figure 00:44:12.030 --> 00:44:15.920 out natural language processing, we need to figure out other ways that machine 00:44:15.920 --> 00:44:19.933 learning systems are being used, and we need to come up with clever ways to fool them. 00:44:21.797 --> 00:44:26.000 Finally, spread the word! So I don't want the conversation to end here, I don't 00:44:26.000 --> 00:44:30.950 want the conversation to end at Congress, I want you to go back to your hacker 00:44:30.950 --> 00:44:36.530 collective, your local CCC, the people that you talk with, your co-workers, and I 00:44:36.530 --> 00:44:41.340 want you to spread the word.
I want you to do workshops on adversarial learning, I 00:44:41.340 --> 00:44:47.930 want more people to not treat this AI as something mystical and powerful, because 00:44:47.930 --> 00:44:52.340 unfortunately it is powerful, but it's not mystical! So we need to demystify this 00:44:52.340 --> 00:44:57.040 space, we need to experiment, we need to hack on it, and we need to find ways to 00:44:57.040 --> 00:45:02.310 play with it and spread the word to other people. Finally, I really want to hear 00:45:02.310 --> 00:45:10.480 your other ideas, and before I leave today I have to say a little bit about why I 00:45:10.480 --> 00:45:15.820 decided to join the resiliency track this year. I read about the resiliency track 00:45:15.820 --> 00:45:21.910 and I was really excited. It spoke to me. And I said I want to live in a world 00:45:21.910 --> 00:45:27.230 where, even if there's an entire burning trash fire around me, I know that there 00:45:27.230 --> 00:45:32.010 are other people that I care about, that I can count on, that I can work with to try 00:45:32.010 --> 00:45:37.840 and at least protect portions of our world. To try and protect ourselves, to 00:45:37.840 --> 00:45:43.940 try and protect people that do not have as much privilege. So, what I want to be a 00:45:43.940 --> 00:45:49.240 part of is something that can use maybe the skills I have and the skills you have 00:45:49.240 --> 00:45:56.590 to do something with that. And your data is a big source of value for everyone. 00:45:56.590 --> 00:46:02.820 Any free service you use, they are selling your data. OK, I don't know that for a 00:46:02.820 --> 00:46:08.420 fact, but I feel very certain that they're most 00:46:08.420 --> 00:46:12.560 likely selling your data. And if they're selling your data, they might also be 00:46:12.560 --> 00:46:17.730 buying your data. And there is a whole market, that's legal, that's freely 00:46:17.730 --> 00:46:22.670 available, to buy and sell your data. And they make money off of that, and they mine 00:46:22.670 --> 00:46:28.910 more information, and make more money off of that, and so forth. So, I will read a 00:46:28.910 --> 00:46:35.410 little bit of my opinions that I put forth on this. Determine who you share your data 00:46:35.410 --> 00:46:41.910 with and for what reasons. GDPR and data portability give us European residents 00:46:41.910 --> 00:46:44.410 stronger rights than most of the world. 00:46:44.920 --> 00:46:47.940 Let's use them. Let's choose privacy-concerned 00:46:47.940 --> 00:46:52.800 ethical data companies over corporations that are entirely built on 00:46:52.800 --> 00:46:58.260 selling ads. Let's build start-ups, organizations, open-source tools and 00:46:58.260 --> 00:47:05.691 systems that we can be truly proud of. And let's port our data to those. 00:47:05.910 --> 00:47:15.310 Applause 00:47:15.409 --> 00:47:18.940 Herald: Amazing. We have, we have time for a few questions. 00:47:18.940 --> 00:47:21.860 K.J.: I'm not done yet, sorry, it's fine. Herald: I'm so sorry. 00:47:21.860 --> 00:47:24.750 K.J.: Laughs It's cool. No big deal. 00:47:24.750 --> 00:47:31.520 So, machine learning. Closing remarks, a brief round-up. Closing remarks: machine 00:47:31.520 --> 00:47:35.250 learning is not very intelligent. I think artificial 00:47:35.250 --> 00:47:39.330 intelligence is a misnomer in a lot of ways, but this doesn't mean that people 00:47:39.330 --> 00:47:43.830 are going to stop using it.
In fact there are very smart, powerful, and rich 00:47:43.830 --> 00:47:49.850 people that are investing more than ever in it. So it's not going anywhere. And 00:47:49.850 --> 00:47:53.620 it's going to be something that potentially becomes more dangerous over 00:47:53.620 --> 00:47:58.570 time. Because as we hand over more and more to these systems, they could 00:47:58.570 --> 00:48:04.240 potentially control more and more of our lives. We can use, however, adversarial 00:48:04.240 --> 00:48:09.320 machine learning techniques to find ways to fool "black box" networks. So we can 00:48:09.320 --> 00:48:14.400 use these, and we know we don't have to have perfect knowledge. However, 00:48:14.400 --> 00:48:18.930 information is powerful. And the more information that we do have, the more we're 00:48:18.930 --> 00:48:25.860 able to become a good GDPR-based adversary. So please use GDPR and let's 00:48:25.860 --> 00:48:31.230 discuss ways where we can share information. Finally, please support 00:48:31.230 --> 00:48:35.590 open-source tools and research in this space, because we need to keep up with where the 00:48:35.590 --> 00:48:41.790 state of the art is. So we need to keep ourselves moving and open in that way. And 00:48:41.790 --> 00:48:46.670 please, support ethical data companies. Or start one. If you come to me and you say 00:48:46.670 --> 00:48:50.240 "Katharine, I'm going to charge you this much money, but I will never sell your 00:48:50.240 --> 00:48:56.520 data. And I will never buy your data." I would much rather you handle my data. So I 00:48:56.520 --> 00:49:03.390 want us, especially those within the EU, to start a new economy around trust, and 00:49:03.390 --> 00:49:12.740 privacy, and ethical data use. Applause 00:49:12.740 --> 00:49:15.830 Thank you very much. Thank you. 00:49:15.830 --> 00:49:18.050 Herald: OK. We still have time for a few questions. 00:49:18.050 --> 00:49:20.390 K.J.: No, no, no. No worries, no worries. Herald: Less than the last time I walked 00:49:20.390 --> 00:49:23.870 up here, but we do. K.J.: Yeah, now I'm really done. 00:49:23.870 --> 00:49:27.730 Herald: Come up to one of the mics in the front section and raise your hand. Can we 00:49:27.730 --> 00:49:31.584 take a question from mic one? Question: Thank you very much for the very 00:49:31.584 --> 00:49:37.860 interesting talk. One impression that I got during the talk was: with the 00:49:37.860 --> 00:49:42.420 adversarial learning approach, aren't we just doing pen testing and quality 00:49:42.420 --> 00:49:47.920 assurance for the AI companies? They're just going to build better machines. 00:49:47.920 --> 00:49:52.910 Answer: That's a very good question, and of course most of this research right now is 00:49:52.910 --> 00:49:56.780 coming from those companies, because they're worried about this. What they've shown, however, 00:49:56.780 --> 00:50:02.290 is that they don't really have a good way to keep these networks from being fooled. 00:50:02.290 --> 00:50:08.710 Most likely they will need to use a different type of network, eventually. So 00:50:08.710 --> 00:50:13.440 probably, whether it's the blind spots or the linearity of these networks, they are 00:50:13.440 --> 00:50:18.000 easy to fool and they will have to come up with a different method for generating 00:50:18.000 --> 00:50:24.520 something that is robust enough to not be tricked. So, to some degree yes, it's a 00:50:24.520 --> 00:50:28.520 cat-and-mouse game, right.
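A minimal sketch of how such "black box" fooling can work without perfect knowledge: train a local substitute model on the remote model's own answers, then craft adversarial examples against the substitute and rely on their tendency to transfer. The query_remote_api function and the training details here are illustrative assumptions, not anything named in the talk:

    import torch
    import torch.nn.functional as F

    def train_substitute(substitute, images, query_remote_api, epochs=20):
        # Label our own images by querying the black-box classifier.
        labels = query_remote_api(images)
        optimizer = torch.optim.Adam(substitute.parameters())
        for _ in range(epochs):
            optimizer.zero_grad()
            # Teach the local substitute to imitate the remote decisions.
            loss = F.cross_entropy(substitute(images), labels)
            loss.backward()
            optimizer.step()
        return substitute

Adversarial examples crafted against the substitute, for instance with the FGSM sketch above, often fool the remote model as well, which is why no knowledge of its internals is required.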
But that's why I want the research and the open source to 00:50:28.520 --> 00:50:33.410 continue as well. And I would be highly suspicious if they all of a sudden figured out 00:50:33.410 --> 00:50:38.170 a way to make a neural network, which has proven linear relationships that we can 00:50:38.170 --> 00:50:42.560 exploit, nonlinear. And if so, it's usually a different type of network that's 00:50:42.560 --> 00:50:47.430 a lot more expensive to train and that doesn't actually generalize well. So we're 00:50:47.430 --> 00:50:51.280 going to really hit them in a way where they're going to have to be more specific, 00:50:51.280 --> 00:50:59.620 try harder, and I would rather do that than just kind of give up. 00:50:59.620 --> 00:51:02.560 Herald: Next one. Mic 2 00:51:02.560 --> 00:51:07.840 Q: Hello. Thank you for the nice talk. I wanted to ask, have you ever tried looking 00:51:07.840 --> 00:51:14.720 at it from the other direction? Like, just trying to feed the companies falsely 00:51:14.720 --> 00:51:21.560 classified data. And just do it with such massive amounts of data that they 00:51:21.560 --> 00:51:25.380 learn from it at a certain point. A: Yes, those are poisoning attacks. So 00:51:25.380 --> 00:51:30.020 when we talk about poisoning attacks, we are essentially feeding bad training data and 00:51:30.020 --> 00:51:35.120 we're trying to get them to learn bad things. Or I wouldn't say bad things, but 00:51:35.120 --> 00:51:37.540 we're trying to get them to learn false information. 00:51:37.540 --> 00:51:42.781 And that already happens by accident all the time. So if 00:51:42.781 --> 00:51:46.491 we share information and they have a publicly available API, where they're 00:51:46.491 --> 00:51:49.970 actually actively learning from our information, then yes, I would say 00:51:49.970 --> 00:51:55.180 poisoning is a great attack vector. And we can also share information on maybe how 00:51:55.180 --> 00:51:58.360 that works. So I would be especially intrigued whether we 00:51:58.360 --> 00:52:02.330 can do poisoning for adware and malicious ad targeting. 00:52:02.330 --> 00:52:07.300 Mic 2: OK, thank you. Herald: One more question from the 00:52:07.300 --> 00:52:12.300 internet and then we run out of time. K.J.: Oh no, sorry. 00:52:12.300 --> 00:52:14.290 Herald: So you can find Katharine after. Signal-Angel: Thank you. One question from 00:52:14.290 --> 00:52:18.210 the internet. What exactly can I do to harden my model against adversarial 00:52:18.210 --> 00:52:21.210 samples? K.J.: Sorry? 00:52:21.210 --> 00:52:27.080 Signal: What exactly can I do to harden my model against adversarial samples? 00:52:27.080 --> 00:52:33.340 K.J.: Not much. What they have shown is that if you train on a mixture of real 00:52:33.340 --> 00:52:39.300 training data and adversarial data, it's a little bit harder to fool, but that just 00:52:39.300 --> 00:52:44.720 means that you have to try more iterations of adversarial input. So right now, the 00:52:44.720 --> 00:52:51.520 recommendation is to train on a mixture of adversarial and real training data and to 00:52:51.520 --> 00:52:56.330 continue to do that over time. And I would argue that you need to maybe do data 00:52:56.330 --> 00:53:00.400 validation on input. And if you do data validation on input, maybe you can 00:53:00.400 --> 00:53:05.100 recognize abnormalities.
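A minimal sketch of the hardening recipe just described, mixing real and adversarially perturbed batches during training; it reuses the illustrative fgsm_attack helper from the earlier sketch, and all of the names are assumptions rather than anything demonstrated in the talk:

    import torch
    import torch.nn.functional as F

    def adversarial_training_step(model, optimizer, images, labels,
                                  epsilon=0.03):
        # Craft adversarial versions of this batch against the current model.
        adv_images = fgsm_attack(model, images, labels, epsilon)
        optimizer.zero_grad()
        # Train on a 50/50 mixture of real and adversarial examples.
        loss = 0.5 * (F.cross_entropy(model(images), labels)
                      + F.cross_entropy(model(adv_images), labels))
        loss.backward()
        optimizer.step()
        return loss.item()

As the answer notes, this only raises the attacker's cost; pairing it with input validation that rejects obviously abnormal inputs is the complementary, production-side defense.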
But that's because I come mainly from production 00:53:05.100 --> 00:53:09.220 settings, not theoretical ones, and I think maybe you should just test things, and if inputs 00:53:09.220 --> 00:53:15.210 look weird you should maybe not take them into the system. 00:53:15.210 --> 00:53:19.340 Herald: And that's all for the questions. I wish we had more time, but we just don't. 00:53:19.340 --> 00:53:21.660 Please give it up for Katharine Jarmul! 00:53:21.660 --> 00:53:26.200 Applause 00:53:26.200 --> 00:53:31.050 34c3 postroll music 00:53:31.050 --> 00:53:47.950 subtitles created by c3subtitles.de in the year 2019. Join, and help us!