1
00:00:00,000 --> 00:00:09,550
34c3 preroll music
2
00:00:15,565 --> 00:00:18,230
Herald: ...and I will let Katharine take
the stage now.
3
00:00:18,589 --> 00:00:21,430
Katharine Jarmul, kjam: Awesome! Well,
thank you so much for the introduction and
4
00:00:21,430 --> 00:00:25,310
thank you so much for being here, taking
your time. I know that Congress is really
5
00:00:25,310 --> 00:00:29,800
exciting, so I really appreciate you
spending some time with me today. It's my
6
00:00:29,800 --> 00:00:34,470
first ever Congress, so I'm also really
excited and I want to meet new people. So
7
00:00:34,470 --> 00:00:39,930
if you wanna come say hi to me later, I'm
somewhat friendly, so we can maybe be
8
00:00:39,930 --> 00:00:44,680
friends later. Today what we're going to
talk about is deep learning blind spots or
9
00:00:44,680 --> 00:00:49,890
how to fool "artificial intelligence". I
like to put "artificial intelligence" in
10
00:00:49,890 --> 00:00:55,270
quotes, because.. yeah, we'll talk about
that, but I think it should be in quotes.
11
00:00:55,270 --> 00:00:59,570
And today we're going to talk a little bit
about deep learning, how it works and how
12
00:00:59,570 --> 00:01:07,640
you can maybe fool it. So I ask us: Is AI
becoming more intelligent?
13
00:01:07,640 --> 00:01:11,078
And I ask this because when I open a
browser and, of course, often it's Chrome
14
00:01:11,078 --> 00:01:16,979
and Google is already prompting me
for what I should look at
15
00:01:16,979 --> 00:01:20,260
and it knows that I work with machine
learning, right?
16
00:01:20,260 --> 00:01:23,830
And these are the headlines
that I see every day:
17
00:01:23,830 --> 00:01:29,399
"Are Computers Already Smarter Than
Humans?"
18
00:01:29,399 --> 00:01:32,289
If so, I think we could just pack up and
go home, right?
19
00:01:32,289 --> 00:01:36,140
Like, we fixed computers,
right? If a computer is smarter than me,
20
00:01:36,140 --> 00:01:39,780
then I already fixed it, we can go home,
there's no need to talk about computers
21
00:01:39,780 --> 00:01:47,750
anymore, let's just move on with life. But
that's not true, right? We know, because
22
00:01:47,750 --> 00:01:51,010
we work with computers and we know how
stupid computers are sometimes. They're
23
00:01:51,010 --> 00:01:55,890
pretty bad. Computers do only what we tell
them to do, generally, so I don't think a
24
00:01:55,890 --> 00:02:01,090
computer can think and be smarter than me.
So with the same types of headlines that
25
00:02:01,090 --> 00:02:11,690
you see this, then you also see this: And
yeah, so Apple recently released their
26
00:02:11,690 --> 00:02:17,500
Face ID and this unlocks your phone with
your face and it seems like a great idea,
27
00:02:17,500 --> 00:02:22,451
right? You have a unique face, you have a
face, nobody else can take your face. But
28
00:02:22,451 --> 00:02:28,300
unfortunately what we find out about
computers is that they're awful sometimes,
29
00:02:28,300 --> 00:02:32,480
and for these women.. for this Chinese
woman that owned an iPhone,
30
00:02:32,480 --> 00:02:35,960
her coworker was able to unlock her phone.
31
00:02:35,964 --> 00:02:39,320
And I think Hendrik and Karen talked
about this, if you were here for the
32
00:02:39,320 --> 00:02:41,590
last talk ("Beeinflussung durch künstliche
Intelligenz"). We have a lot of problems
33
00:02:41,590 --> 00:02:46,379
in machine learning and one of them is
stereotypes and prejudice that are within
34
00:02:46,379 --> 00:02:52,340
our training data or within our minds that
leak into our models. And perhaps they
35
00:02:52,340 --> 00:02:57,739
didn't have adequate training data for
determining different features of Chinese
36
00:02:57,739 --> 00:03:03,160
folks. And perhaps there are other problems
with their model or their training data or
37
00:03:03,160 --> 00:03:07,500
whatever they're trying to do. But they
clearly have some issues, right? So when
38
00:03:07,500 --> 00:03:12,050
somebody asked me: "Is AI gonna take over
the world and is there a super robot
39
00:03:12,050 --> 00:03:17,300
that's gonna come and be my new, you know,
leader or so to speak?" I tell them we
40
00:03:17,300 --> 00:03:21,710
can't even figure out the stuff that we
already have in production. So if we can't
41
00:03:21,710 --> 00:03:25,690
even figure out the stuff we already have
in production, I'm a little bit less
42
00:03:25,690 --> 00:03:33,209
worried of the super robot coming to kill
me. That said, unfortunately the powers
43
00:03:33,209 --> 00:03:38,190
that be, a lot of times
they believe in this and they believe
44
00:03:38,190 --> 00:03:44,540
strongly in "artificial intelligence" and
machine learning. They're collecting data
45
00:03:44,540 --> 00:03:50,800
every day about you and me and everyone
else. And they're gonna use this data to
46
00:03:50,800 --> 00:03:56,349
build even better models. This is because
the revolution that we're seeing now in
47
00:03:56,349 --> 00:04:02,080
machine learning has really not much to do
with new algorithms or architectures. It
48
00:04:02,080 --> 00:04:09,630
has a lot more to do with heavy compute
and with massive, massive data sets. And
49
00:04:09,630 --> 00:04:15,740
the more that we have training data of
petabytes per 24 hours or even less, the
50
00:04:15,740 --> 00:04:22,690
more we're able to essentially fix up the
parts that don't work so well. The
51
00:04:22,690 --> 00:04:25,979
companies that we see here are companies
that are investing heavily in machine
52
00:04:25,979 --> 00:04:30,979
learning and AI. Part of how they're
investing heavily is, they're collecting
53
00:04:30,979 --> 00:04:37,999
more and more data about you and me and
everyone else. Google and Facebook, more
54
00:04:37,999 --> 00:04:42,789
than 1 billion active users. I was
surprised to know that in Germany the
55
00:04:42,789 --> 00:04:48,159
desktop search traffic for Google is
higher than most of the rest of the world.
56
00:04:48,159 --> 00:04:53,259
And for Baidu they're growing with the
speed that broadband is available. And so,
57
00:04:53,259 --> 00:04:56,970
what we see is, these people are
collecting this data and they also are
58
00:04:56,970 --> 00:05:02,779
using new technologies like GPUs and TPUs
in new ways to parallelize workflows
59
00:05:02,779 --> 00:05:09,449
and with this they're able to mess up
less, right? They're still messing up, but
60
00:05:09,449 --> 00:05:14,960
they mess up slightly less. And they're
not going to lose interest in this
61
00:05:14,960 --> 00:05:20,550
topic, so we need to kind of start to
prepare how we respond to this type of
62
00:05:20,550 --> 00:05:25,860
behavior. One of the things that has been
a big area of research, actually also for
63
00:05:25,860 --> 00:05:30,080
a lot of these companies, is what we'll
talk about today and that's adversarial
64
00:05:30,080 --> 00:05:36,800
machine learning. But the first thing that
we'll start with is what is behind what we
65
00:05:36,800 --> 00:05:44,009
call AI. So most of the time when you
think of AI or something like Siri and so
66
00:05:44,009 --> 00:05:48,979
forth, you are actually potentially
talking about an old-school rule-based
67
00:05:48,979 --> 00:05:53,930
system. This is a rule, like you say a
particular thing and then Siri is like:
68
00:05:53,930 --> 00:05:58,129
"Yes, I know how to respond to this". And
we even hard program these types of things
69
00:05:58,129 --> 00:06:02,880
in, right? That is one version of AI:
essentially, it's been pre-programmed to
70
00:06:02,880 --> 00:06:08,839
do and understand certain things. Another
form that usually, for example for the
71
00:06:08,839 --> 00:06:12,619
people that are trying to build AI robots
and the people that are trying to build
72
00:06:12,619 --> 00:06:17,110
what we call "general AI", so this is
something that can maybe learn like a
73
00:06:17,110 --> 00:06:20,190
human, they'll use reinforcement learning.
74
00:06:20,190 --> 00:06:22,200
I don't specialize in reinforcement
learning.
75
00:06:22,200 --> 00:06:26,401
But what it does is it essentially
tries to reward you for
76
00:06:26,401 --> 00:06:32,429
behaviour that you're expected to do. So
if you complete a task, you get a
77
00:06:32,429 --> 00:06:36,099
cookie. You complete two other tasks, you
get two or three more cookies depending on
78
00:06:36,099 --> 00:06:41,759
how important the task is. And this will
help you learn how to behave to get more
79
00:06:41,759 --> 00:06:45,990
points and it's used a lot in robots and
gaming and so forth. And I'm not really
80
00:06:45,990 --> 00:06:49,340
going to talk about that today because
most of that is still not really something
81
00:06:49,340 --> 00:06:54,880
that you or I interact with. Well, what I
am gonna talk about today is neural
82
00:06:54,880 --> 00:06:59,680
networks, or as some people like to call
them "deep learning", right? So deep
83
00:06:59,680 --> 00:07:04,119
learning won the neural network versus deep
learning battle a while ago. So here's an
84
00:07:04,119 --> 00:07:09,949
example neural network: we have an input
layer and that's where we essentially make
85
00:07:09,949 --> 00:07:14,550
a quantitative version of whatever our
data is. So we need to make it into
86
00:07:14,550 --> 00:07:19,890
numbers. Then we have a hidden layer and
we might have multiple hidden layers. And
87
00:07:19,890 --> 00:07:23,759
depending on how deep our network is, or a
network inside a network, right, which is
88
00:07:23,759 --> 00:07:28,179
possible. We might have many
different layers there and they may even
89
00:07:28,179 --> 00:07:33,539
act in cyclical ways. And then that's
where all the weights and the variables
90
00:07:33,539 --> 00:07:39,259
and the learning happens. So that
holds a lot of information and data that
91
00:07:39,259 --> 00:07:43,979
we eventually want to train there. And
finally we have an output layer. And
92
00:07:43,979 --> 00:07:47,529
depending on the network and what we're
trying to do the output layer can vary
93
00:07:47,529 --> 00:07:51,539
between something that looks like the
input, like for example if we want to
94
00:07:51,539 --> 00:07:55,719
machine translate, then I want the output
to look like the input, right, I want it
95
00:07:55,719 --> 00:07:59,909
to just be in a different language, or the
output could be a different class. It can
96
00:07:59,909 --> 00:08:05,749
be, you know, this is a car or this is a
train and so forth. So it really depends
97
00:08:05,749 --> 00:08:10,610
what you're trying to solve, but the
output layer gives us the answer. And how
98
00:08:10,610 --> 00:08:17,159
we train this is, we use backpropagation.
Backpropagation is nothing new and neither
99
00:08:17,159 --> 00:08:21,139
is one of the most popular methods to do
so, which is called stochastic gradient
100
00:08:21,139 --> 00:08:26,459
descent. What we do when we go through
that part of the training, is we go from
101
00:08:26,459 --> 00:08:29,759
the output layer and we go backwards
through the network. That's why it's
102
00:08:29,759 --> 00:08:34,828
called backpropagation, right? And as we
go backwards through the network, in the
103
00:08:34,828 --> 00:08:39,139
most simple way, we upvote and downvote
what's working and what's not working. So
104
00:08:39,139 --> 00:08:42,729
we say: "oh you got it right, you get a
little bit more importance", or "you got
105
00:08:42,729 --> 00:08:46,040
it wrong, you get a little bit less
importance". And eventually we hope
106
00:08:46,040 --> 00:08:50,481
over time, that they essentially correct
each other's errors enough that we get a
107
00:08:50,481 --> 00:08:57,550
right answer. So that's a very general
overview of how it works and the cool
108
00:08:57,550 --> 00:09:02,720
thing is: Because it works that way, we
can fool it. And people have been
109
00:09:02,720 --> 00:09:08,269
researching ways to fool it for quite some
time. So I give you a brief overview of
110
00:09:08,269 --> 00:09:13,290
the history of this field, so we can kind
of know where we're working from and maybe
111
00:09:13,290 --> 00:09:19,220
hopefully then where we're going to. In
2005 came one of the first most important
112
00:09:19,220 --> 00:09:24,740
papers to approach adversarial learning
and it was written by a group of
113
00:09:24,740 --> 00:09:29,630
researchers and they wanted to see, if
they could act as an informed attacker and
114
00:09:29,630 --> 00:09:34,440
attack a linear classifier. So this is
just a spam filter and they're like can I
115
00:09:34,440 --> 00:09:37,850
send spam to my friend? I don't know why
they would want to do this, but: "Can I
116
00:09:37,850 --> 00:09:43,209
send spam to my friend, if I tried testing
out a few ideas?" And what they were able
117
00:09:43,209 --> 00:09:47,639
to show is: Yes, rather than just, you
know, trial and error which anybody can do
118
00:09:47,639 --> 00:09:52,120
or a brute force attack of just like send
a thousand emails and see what happens,
119
00:09:52,120 --> 00:09:56,370
they were able to craft a few algorithms
that they could use to try and find
120
00:09:56,370 --> 00:10:03,240
important words to change, to make it go
through the spam filter. In 2007 NIPS,
121
00:10:03,240 --> 00:10:08,019
which is a very popular machine learning
conference, had one of their first all-day
122
00:10:08,019 --> 00:10:12,930
workshops on computer security. And when
they did so, they had a bunch of different
123
00:10:12,930 --> 00:10:16,780
people that were working on machine
learning in computer security: From
124
00:10:16,780 --> 00:10:21,430
malware detection, to network intrusion
detection, to of course spam. And they
125
00:10:21,430 --> 00:10:25,190
also had a few talks on this type of
adversarial learning. So how do you act as
126
00:10:25,190 --> 00:10:29,980
an adversary to your own model? And then
how do you learn how to counter that
127
00:10:29,980 --> 00:10:35,650
adversary? In 2013 there was a really
great paper that got a lot of people's
128
00:10:35,650 --> 00:10:40,001
attention called "Poisoning Attacks
against Support Vector Machines". Now
129
00:10:40,001 --> 00:10:45,290
support vector machines are essentially
usually a linear classifier and we use
130
00:10:45,290 --> 00:10:50,121
them a lot to say, "this is a member of
this class, that, or another", when we
131
00:10:50,121 --> 00:10:54,940
are working with text. So I have a text and I
want to know what the text is about or I
132
00:10:54,940 --> 00:10:58,610
want to know if it's a positive or
negative sentiment, a lot of times I'll
133
00:10:58,610 --> 00:11:05,160
use a support vector machine. We call them
SVMs as well. Battista Biggio was the
134
00:11:05,160 --> 00:11:08,319
main researcher and he has actually
written quite a lot about these poisoning
135
00:11:08,319 --> 00:11:15,569
attacks and he poisoned the training data.
So for a lot of these systems, sometimes
136
00:11:15,569 --> 00:11:20,820
they have active learning. This means, you
or I, when we classify our emails as spam,
137
00:11:20,820 --> 00:11:26,290
we're helping train the network. So he
poisoned the training data and was able to
138
00:11:26,290 --> 00:11:32,360
show that by poisoning it in a particular
way, that he was able to then send spam
139
00:11:32,360 --> 00:11:37,810
email because he knew what words were then
benign, essentially. He went on to study a
140
00:11:37,810 --> 00:11:43,220
few other things about biometric data if
you're interested in biometrics. But then
141
00:11:43,220 --> 00:11:49,329
in 2014 Christian Szegedy, Ian Goodfellow,
and a few other main researchers at Google
142
00:11:49,329 --> 00:11:55,350
Brain released "Intriguing Properties of
Neural Networks." That really became the
143
00:11:55,350 --> 00:12:00,040
explosion of what we're seeing today in
adversarial learning. And what they were
144
00:12:00,040 --> 00:12:04,629
able to do, is they were able to say "We
believe there's linear properties of these
145
00:12:04,629 --> 00:12:08,790
neural networks, even if they're not
necessarily linear networks.
146
00:12:08,790 --> 00:12:15,560
And we believe we can exploit them to fool
them". And they first introduced then the
147
00:12:15,560 --> 00:12:23,189
fast gradient sign method, which we'll
talk about later today. So how does it
148
00:12:23,189 --> 00:12:28,830
work? First I want us to get a little bit
of an intuition around how this works.
149
00:12:28,830 --> 00:12:35,310
Here's a graphic of gradient descent. And
in gradient descent we have this vertical
150
00:12:35,310 --> 00:12:40,339
axis, which is our cost function. And what we're
trying to do is: We're trying to minimize
151
00:12:40,339 --> 00:12:47,400
cost, we want to minimize the error. And
so when we start out, we just choose random
152
00:12:47,400 --> 00:12:51,790
weights and variables, so all of our
hidden layers, they just have maybe random
153
00:12:51,790 --> 00:12:57,339
weights or random distribution. And then
we want to get to a place where the
154
00:12:57,339 --> 00:13:01,740
weights have meaning, right? We want our
network to know something, even if it's
155
00:13:01,740 --> 00:13:08,740
just a mathematical pattern, right? So we
start in the high area of the graph, or
156
00:13:08,740 --> 00:13:13,819
the reddish area, and that's where we
started, we have high error there. And
157
00:13:13,819 --> 00:13:21,209
then we try to get to the lowest area of
the graph, or here the dark blue that is
158
00:13:21,209 --> 00:13:26,889
right about here. But sometimes what
happens: As we learn, as we go through
159
00:13:26,889 --> 00:13:33,300
epochs and training, we're moving slowly
down and hopefully we're optimizing. But
160
00:13:33,300 --> 00:13:37,370
what we might end up in instead of this
global minimum, we might end up in the
161
00:13:37,370 --> 00:13:43,800
local minimum, which is the other valley.
And that's fine, because the error is still
162
00:13:43,800 --> 00:13:49,889
low, right? So we're still probably
going to be able to succeed, but we might
163
00:13:49,889 --> 00:13:56,139
not get the best answer all the time. What
adversarial learning tries to do in the most basic
164
00:13:56,139 --> 00:14:01,980
of ways, it essentially tries to push the
error rate back up the hill for as many
165
00:14:01,980 --> 00:14:07,709
units as it can. So it essentially tries
to increase the error slowly through
166
00:14:07,709 --> 00:14:14,600
perturbations. And by disrupting, let's
say, the weakest links like the one that
167
00:14:14,600 --> 00:14:19,060
did not find the global minimum but
instead found a local minimum, we can
168
00:14:19,060 --> 00:14:23,069
hopefully fool the network, because we're
finding those weak spots and we're
169
00:14:23,069 --> 00:14:25,629
capitalizing on them, essentially.
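To make that intuition a bit more concrete, here is a minimal numpy sketch with made-up toy data: ordinary gradient descent walks the weights downhill on the cost, and an adversarial perturbation then nudges the input in the direction that pushes the error back up the hill, an FGSM-style step. The data and step sizes are only illustrative.

    import numpy as np

    # Toy model: one linear unit with a squared-error cost (illustrative data only).
    rng = np.random.RandomState(0)
    X = rng.randn(100, 2)
    y = X @ np.array([2.0, -3.0]) + 0.1 * rng.randn(100)

    w = rng.randn(2)                     # random starting weights: high error
    for step in range(200):              # gradient descent: walk downhill on the cost
        grad_w = X.T @ (X @ w - y) / len(y)
        w -= 0.1 * grad_w                # move the weights against the gradient

    # Adversarial intuition: keep the model fixed and perturb the INPUT in the
    # direction that increases the cost, i.e. push the error back up the hill.
    x, target = X[0], y[0]
    grad_x = (x @ w - target) * w        # gradient of the cost w.r.t. the input
    x_adv = x + 0.5 * np.sign(grad_x)    # small step uphill, FGSM-style
    print("error before:", (x @ w - target) ** 2)
    print("error after: ", (x_adv @ w - target) ** 2)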
170
00:14:31,252 --> 00:14:34,140
So what does an adversarial example
actually look like?
171
00:14:34,140 --> 00:14:37,430
You may have already seen this
because it's very popular on the
172
00:14:37,430 --> 00:14:45,221
Twittersphere and a few other places, but
this was a series of researchers at MIT. It
173
00:14:45,221 --> 00:14:51,059
was debated whether you could do adverse..
adversarial learning in the real world. A
174
00:14:51,059 --> 00:14:57,339
lot of the research has just been a still
image. And what they were able to show:
175
00:14:57,339 --> 00:15:03,079
They created a 3D-printed turtle. I mean
it looks like a turtle to you as well,
176
00:15:03,079 --> 00:15:09,910
correct? And this 3D-printed turtle by the
Inception Network, which is a very popular
177
00:15:09,910 --> 00:15:16,790
computer vision network, is a rifle and it
is a rifle in every angle that you can
178
00:15:16,790 --> 00:15:21,959
see. And the way they were able to do this
and, I don't know the next time it goes
179
00:15:21,959 --> 00:15:25,910
around you can see perhaps, and it's a
little bit easier on the video which I'll
180
00:15:25,910 --> 00:15:29,790
have posted, I'll share at the end, you
can see perhaps that there's a slight
181
00:15:29,790 --> 00:15:35,529
discoloration of the shell. They messed
with the texture. By messing with this
182
00:15:35,529 --> 00:15:39,910
texture and the colors they were able to
fool the neural network, they were able to
183
00:15:39,910 --> 00:15:45,259
activate different neurons that were not
supposed to be activated. Units, I should
184
00:15:45,259 --> 00:15:51,129
say. So what we see here is, yeah, it can
be done in the real world, and when I saw
185
00:15:51,129 --> 00:15:56,339
this I started getting really excited.
Because, video surveillance is a real
186
00:15:56,339 --> 00:16:02,529
thing, right? So if we can start fooling
3D objects, we can perhaps start fooling
187
00:16:02,529 --> 00:16:08,040
other things in the real world that we
would like to fool.
188
00:16:08,040 --> 00:16:12,440
applause
189
00:16:12,440 --> 00:16:19,149
kjam: So why do adversarial examples
exist? We're going to talk a little bit
190
00:16:19,149 --> 00:16:23,879
about some things that are approximations
of what's actually happening, so please
191
00:16:23,879 --> 00:16:27,610
forgive me for not being always exact, but
I would rather us all have a general
192
00:16:27,610 --> 00:16:33,660
understanding of what's happening. Across
the top row we have an input layer and
193
00:16:33,660 --> 00:16:39,480
these images to the left, we can see, are
the source images and this source image is
194
00:16:39,480 --> 00:16:43,380
like a piece of farming equipment or
something. And on the right we have our
195
00:16:43,380 --> 00:16:48,800
guide image. This is what we're trying to
get the network to see: we want it to
196
00:16:48,800 --> 00:16:55,070
misclassify this farm equipment as a pink
bird. So what these researchers did is
197
00:16:55,070 --> 00:16:59,019
they targeted different layers of the
network. And they said: "Okay, we're going
198
00:16:59,019 --> 00:17:02,410
to use this method to target this
particular layer and we'll see what
199
00:17:02,410 --> 00:17:07,569
happens". And so as they targeted these
different layers you can see what's
200
00:17:07,569 --> 00:17:12,109
happening on the internal visualization.
Now neural networks can't see, right?
201
00:17:12,109 --> 00:17:17,939
They're looking at matrices of numbers but
what we can do is we can use those
202
00:17:17,939 --> 00:17:26,559
internal values to try and see with our
human eyes what they are learning. And we
203
00:17:26,559 --> 00:17:31,370
can see here clearly inside the network,
we no longer see the farming equipment,
204
00:17:31,370 --> 00:17:39,550
right? We see a pink bird. And this is not
visible to our human eyes. Now if you
205
00:17:39,550 --> 00:17:43,570
really study and if you enlarge the image
you can start to see okay there's a little
206
00:17:43,570 --> 00:17:48,190
bit of pink here or greens, I don't know
what's happening, but we can still see it
207
00:17:48,190 --> 00:17:56,510
in the neural network we have tricked. Now
people don't exactly know yet why these
208
00:17:56,510 --> 00:18:03,159
blind spots exist. So it's still an area
of active research exactly why we can fool
209
00:18:03,159 --> 00:18:09,429
neural networks so easily. There are some
prominent researchers that believe that
210
00:18:09,429 --> 00:18:14,450
neural networks are essentially very
linear and that we can use this simple
211
00:18:14,450 --> 00:18:20,840
linearity to misclassify, to jump into
another area. But there are others that
212
00:18:20,840 --> 00:18:24,820
believe that there's these pockets or
blind spots and that we can then find
213
00:18:24,820 --> 00:18:28,500
these blind spots where these neurons
really are the weakest links and they
214
00:18:28,500 --> 00:18:33,160
maybe even haven't learned anything and if
we change their activation then we can
215
00:18:33,160 --> 00:18:37,580
fool the network easily. So this is still
an area of active research and let's say
216
00:18:37,580 --> 00:18:44,320
you're looking for your thesis, this would
be a pretty neat thing to work on. So
217
00:18:44,320 --> 00:18:49,399
we'll get into just a brief overview of
some of the math behind the most popular
218
00:18:49,399 --> 00:18:55,571
methods. First we have the fast gradient
sign method and that was used in the
219
00:18:55,571 --> 00:18:59,950
initial paper and now there's been many
iterations on it. And what we do is we
220
00:18:59,950 --> 00:19:05,120
have our same cost function, so this is
the same way that we're trying to train
221
00:19:05,120 --> 00:19:13,110
our network and it's trying to learn. And
we take the gradient sign of that. And
222
00:19:13,110 --> 00:19:16,330
it's okay if you're not
used to doing vector calculus,
223
00:19:16,330 --> 00:19:20,250
especially not without a pen and paper in
front of you, but you can think of what we're
224
00:19:20,250 --> 00:19:24,140
doing as essentially trying to
calculate some approximation of a
225
00:19:24,140 --> 00:19:29,700
derivative of the function. And this can
kind of tell us, where is it going. And if
226
00:19:29,700 --> 00:19:37,299
we know where it's going, we can maybe
anticipate that and change. And then to
227
00:19:37,299 --> 00:19:41,480
create the adversarial images, we then
take the original input plus a small
228
00:19:41,480 --> 00:19:48,770
number epsilon times that gradient's sign.
For the Jacobian Saliency Map, this is a
229
00:19:48,770 --> 00:19:55,010
newer method and it's a little bit more
effective, but it takes a little bit more
230
00:19:55,010 --> 00:20:02,250
compute. This Jacobian Saliency Map uses a
Jacobian matrix and if you remember also,
231
00:20:02,250 --> 00:20:07,649
and it's okay if you don't, a Jacobian
matrix will look at the full derivative of
232
00:20:07,649 --> 00:20:12,049
a function, so you take the full
derivative of a cost function
233
00:20:12,049 --> 00:20:18,269
at that vector, and it gives you a matrix
that is a pointwise approximation,
234
00:20:18,269 --> 00:20:22,550
if the function is differentiable
at that input vector. Don't
235
00:20:22,550 --> 00:20:28,320
worry you can review this later too. But
the Jacobian matrix then we use to create
236
00:20:28,320 --> 00:20:33,059
this saliency map the same way where we're
essentially trying some sort of linear
237
00:20:33,059 --> 00:20:38,830
approximation, or pointwise approximation,
and we then want to find two pixels that
238
00:20:38,830 --> 00:20:43,860
we can perturb that cause the most
disruption. And then we continue to the
239
00:20:43,860 --> 00:20:48,970
next. Unfortunately this is currently an
O(n²) problem, but there are a few people
240
00:20:48,970 --> 00:20:53,910
that are trying to essentially find ways
that we can approximate this and make it
241
00:20:53,910 --> 00:21:01,320
faster. So maybe now you want to fool a
network too and I hope you do, because
242
00:21:01,320 --> 00:21:06,580
that's what we're going to talk about.
First you need to pick a problem or a
243
00:21:06,580 --> 00:21:13,460
network type you may already know. But you
may want to investigate what perhaps is
244
00:21:13,460 --> 00:21:19,019
this company using, what perhaps is this
method using and do a little bit of
245
00:21:19,019 --> 00:21:23,730
research, because that's going to help
you. Then you want to research state-of-
246
00:21:23,730 --> 00:21:28,610
the-art methods and this is like a typical
research statement that you have a new
247
00:21:28,610 --> 00:21:32,360
state-of-the-art method, but the good news
is that the state-of-the-art from two to
248
00:21:32,360 --> 00:21:38,179
three years ago is most likely in
production or in systems today. So once
249
00:21:38,179 --> 00:21:44,480
they find ways to speed it up, some
approximation of that is deployed. And a
250
00:21:44,480 --> 00:21:48,279
lot of times these are then publicly
available models, so a lot of times, if
251
00:21:48,279 --> 00:21:51,480
you're already working with the deep
learning framework they'll come
252
00:21:51,480 --> 00:21:56,450
prepackaged with a few of the different
popular models, so you can even use that.
253
00:21:56,450 --> 00:22:00,691
If you're already building neural networks
of course you can build your own. An
254
00:22:00,691 --> 00:22:05,510
optional step, but one that might be
recommended, is to fine-tune your model
255
00:22:05,510 --> 00:22:10,750
and what this means is to essentially take
a new training data set, maybe data that
256
00:22:10,750 --> 00:22:15,490
you think this company is using or that
you think this network is using, and
257
00:22:15,490 --> 00:22:19,300
you're going to remove the last few layers
of the neural network and you're going to
258
00:22:19,300 --> 00:22:24,809
retrain it. So you essentially are nicely
piggybacking on the work of the pre
259
00:22:24,809 --> 00:22:30,650
trained model and you're using the final
layers to create finesse. This essentially
260
00:22:30,650 --> 00:22:37,169
makes your model better at the task that
you have for it. Finally then you use a
261
00:22:37,169 --> 00:22:40,260
library, and we'll go through a few of
them, but some of the ones that I have
262
00:22:40,260 --> 00:22:46,450
used myself are cleverhans, DeepFool and
deep-pwning, and these all come with nice
263
00:22:46,450 --> 00:22:51,580
built-in features for you to use for let's
say the fast gradient sign method, the
264
00:22:51,580 --> 00:22:56,740
Jacobian saliency map and a few other
methods that are available. Finally it's
265
00:22:56,740 --> 00:23:01,550
not going to always work so depending on
your source and your target, you won't
266
00:23:01,550 --> 00:23:05,840
always necessarily find a match. What
researchers have shown is it's a lot
267
00:23:05,840 --> 00:23:10,950
easier to fool a network that a cat is a
dog than it is to fool a network that a
268
00:23:10,950 --> 00:23:16,030
cat is an airplane. And this is something
we can make intuitive, so you might
269
00:23:16,030 --> 00:23:21,830
want to pick an input that's not super
dissimilar from where you want to go, but
270
00:23:21,830 --> 00:23:28,260
is dissimilar enough. And you want to test
it locally and then finally test the one
271
00:23:28,260 --> 00:23:38,149
for the highest misclassification rates on
the target network. And you might say
272
00:23:38,149 --> 00:23:44,230
Katharine, or you can call me kjam, that's
okay. You might say: "I don't know what
273
00:23:44,230 --> 00:23:50,049
the person is using", "I don't know what
the company is using" and I will say "it's
274
00:23:50,049 --> 00:23:56,750
okay", because what's been proven: You can
attack a blackbox model, you do not have
275
00:23:56,750 --> 00:24:01,950
to know what they're using, you do not
have to know exactly how it works, you
276
00:24:01,950 --> 00:24:06,760
don't even have to know their training
data, because what you can do is if it
277
00:24:06,760 --> 00:24:12,710
has.. okay, addendum it has to have some
API you can interface with. But if it has
278
00:24:12,710 --> 00:24:18,130
an API you can interface with or even any
API you can interact with, that uses the
279
00:24:18,130 --> 00:24:24,840
same type of learning, you can collect
training data by querying the API. And
280
00:24:24,840 --> 00:24:28,700
then you're training your local model on
that data that you're collecting. So
281
00:24:28,700 --> 00:24:32,890
you're collecting the data, you're
training your local model, and as your
282
00:24:32,890 --> 00:24:37,299
local model gets more accurate and more
similar to the deployed black box that you
283
00:24:37,299 --> 00:24:43,409
don't know how it works, you are then
still able to fool it. And what this paper
284
00:24:43,409 --> 00:24:49,730
proved, Nicolas Papernot and a few other
great researchers, is that with usually
285
00:24:49,730 --> 00:24:56,527
less than six thousand queries they were
able to fool the network with between 84% and 97% certainty.
286
00:24:59,301 --> 00:25:03,419
And what the same group
of researchers also studied is the ability
287
00:25:03,419 --> 00:25:09,241
to transfer the ability to fool one
network into another network and they
288
00:25:09,241 --> 00:25:14,910
called that transferability. So I can
take a certain type of network and I can
289
00:25:14,910 --> 00:25:19,320
use adversarial examples against this
network to fool a different type of
290
00:25:19,320 --> 00:25:26,269
machine learning technique. Here we have
their matrix, their heat map, that shows
291
00:25:26,269 --> 00:25:32,730
us exactly what they were able to fool. So
we have across the left-hand side here the
292
00:25:32,730 --> 00:25:37,740
source machine learning technique, we have
deep learning, logistic regression, SVMs
293
00:25:37,740 --> 00:25:43,380
like we talked about, decision trees and
K-nearest-neighbors. And across the bottom
294
00:25:43,380 --> 00:25:47,340
we have the target machine learning, so
what were they targeting. They created the
295
00:25:47,340 --> 00:25:51,470
adversaries with the left hand side and
they targeted across the bottom. We
296
00:25:51,470 --> 00:25:56,700
finally have an ensemble model at the end.
And what they were able to show is like,
297
00:25:56,700 --> 00:26:03,130
for example, SVMs and decision trees are
quite easy to fool, but logistic
298
00:26:03,130 --> 00:26:08,480
regression a little bit less so, but still
strong, for deep learning and K-nearest-
299
00:26:08,480 --> 00:26:13,460
neighbors, if you train a deep learning
model or a K-nearest-neighbor model, then
300
00:26:13,460 --> 00:26:18,179
that performs fairly well against itself.
And so what they're able to show is that
301
00:26:18,179 --> 00:26:23,320
you don't necessarily need to know the
target machine and you don't even have to
302
00:26:23,320 --> 00:26:28,050
get it right, even if you do know, you can
use a different type of machine learning
303
00:26:28,050 --> 00:26:30,437
technique to target the network.
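A schematic sketch of that black-box, substitute-model idea (not the exact procedure from the paper): label your own inputs by querying the remote model, train a local substitute on those labels, and craft adversarial examples against the substitute, relying on transferability. The black_box_predict function below is a hypothetical stand-in for the real API call, and the attack step is a crude gradient-sign perturbation against a linear substitute.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical stand-in for the remote model; in reality this is an API call.
    secret_w = np.random.RandomState(1).randn(784)
    def black_box_predict(x):
        return int((x - 0.5) @ secret_w > 0)   # decision rule hidden from the attacker

    # 1. Query the black box on inputs we choose, collecting (input, label) pairs.
    rng = np.random.RandomState(0)
    X_local = rng.rand(2000, 784)
    y_local = np.array([black_box_predict(x) for x in X_local])

    # 2. Train a local substitute model on the black box's own answers.
    substitute = LogisticRegression(max_iter=1000).fit(X_local, y_local)

    # 3. Craft adversarial examples against the substitute: push each input a
    #    small step toward the opposite class along the substitute's weights.
    w = substitute.coef_[0]
    direction = np.where(y_local[:, None] == 1, -1.0, 1.0) * np.sign(w)
    X_adv = np.clip(X_local + 0.1 * direction, 0.0, 1.0)

    # 4. Transferability: check how often the *black box* now changes its answer.
    flipped = np.mean(np.array([black_box_predict(x) for x in X_adv]) != y_local)
    print("fraction of queries the black box now misclassifies:", flipped)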
304
00:26:34,314 --> 00:26:39,204
So we'll
look at six lines of Python here and in
305
00:26:39,204 --> 00:26:44,559
these six lines of Python I'm using the
cleverhans library and in six lines of
306
00:26:44,559 --> 00:26:52,419
Python I can both generate my adversarial
input and I can even predict on it. So if
307
00:26:52,419 --> 00:27:02,350
you don't code Python, it's pretty easy to
learn and pick up. And for example here we
308
00:27:02,350 --> 00:27:06,830
have Keras and Keras is a very popular
deep learning library in Python, it
309
00:27:06,830 --> 00:27:12,070
usually works with a Theano or a
TensorFlow backend and we can just wrap
310
00:27:12,070 --> 00:27:19,250
our model, pass it to the fast gradient
method class and then set up some
311
00:27:19,250 --> 00:27:24,630
parameters, so here's our epsilon and a
few extra parameters, this is to tune our
312
00:27:24,630 --> 00:27:30,860
adversary, and finally we can generate our
adversarial examples and then predict on
313
00:27:30,860 --> 00:27:39,865
them. So in a very small amount of Python
we're able to target and trick a network.
314
00:27:40,710 --> 00:27:45,791
If you're already using TensorFlow or
Keras, it already works with those libraries.
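Roughly what those six lines look like, as a minimal sketch assuming the cleverhans 2.x-era API with a TensorFlow 1.x session and standalone Keras; the tiny model and random test images here are placeholders for whatever you are actually attacking, and the attack is the fast gradient sign update x_adv = x + epsilon * sign(gradient of the cost w.r.t. x) from earlier.

    import numpy as np
    import tensorflow as tf
    import keras
    from keras.models import Sequential
    from keras.layers import Dense, Flatten, Activation
    from cleverhans.attacks import FastGradientMethod
    from cleverhans.utils_keras import KerasModelWrapper

    sess = tf.Session()
    keras.backend.set_session(sess)

    # Stand-in Keras classifier; in practice this is your trained model.
    model = Sequential([Flatten(input_shape=(28, 28, 1)),
                        Dense(64, activation='relu'),
                        Dense(10),
                        Activation('softmax')])
    model.compile('sgd', 'categorical_crossentropy')
    x_test = np.random.rand(16, 28, 28, 1).astype('float32')  # placeholder images

    wrap = KerasModelWrapper(model)                  # wrap the Keras model
    fgsm = FastGradientMethod(wrap, sess=sess)       # hand it to the FGSM attack
    fgsm_params = {'eps': 0.3, 'clip_min': 0., 'clip_max': 1.}  # tune the adversary
    x_adv = fgsm.generate_np(x_test, **fgsm_params)  # generate adversarial inputs
    preds = model.predict(x_adv)                     # and predict on them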
315
00:27:48,828 --> 00:27:52,610
Deep-pwning is one of the first
libraries that I heard about in this space
316
00:27:52,610 --> 00:27:58,200
and it was presented at Def Con in 2016
and what it comes with is a bunch of
317
00:27:58,200 --> 00:28:03,320
tensorflow built-in code. It even comes
with a way that you can train the model
318
00:28:03,320 --> 00:28:06,730
yourself, so it has a few different
models, a few different convolutional
319
00:28:06,730 --> 00:28:12,130
neural networks and these are
predominantly used in computer vision.
320
00:28:12,130 --> 00:28:18,090
It also however has a semantic model and I
normally work in NLP and I was pretty
321
00:28:18,090 --> 00:28:24,240
excited to try it out. What it comes built
with is the Rotten Tomatoes sentiment, so
322
00:28:24,240 --> 00:28:29,900
this is Rotten Tomatoes movie reviews, and
it tries to learn whether a review is positive or negative.
323
00:28:30,470 --> 00:28:35,269
So the original text that I input in, when
I was generating my adversarial examples,
324
00:28:35,269 --> 00:28:41,500
was "more trifle than triumph", which is a
real review and the adversarial text that
325
00:28:41,500 --> 00:28:46,080
it gave me was "jonah refreshing haunting
leaky"
326
00:28:49,470 --> 00:28:52,660
...Yeah.. so I was able to fool my network
327
00:28:52,660 --> 00:28:57,559
but I lost any type of meaning and
this is really the problem when we think
328
00:28:57,559 --> 00:29:03,539
about how we apply adversarial learning to
different tasks: it's easy for an image
329
00:29:03,539 --> 00:29:08,960
if we make a few changes, to still retain
its content, right? It's many, many pixels,
330
00:29:08,960 --> 00:29:14,139
but when we start going into language, if
we change one word and then another word
331
00:29:14,139 --> 00:29:18,950
and another word or maybe we changed all
of the words, we no longer understand as
332
00:29:18,950 --> 00:29:23,120
humans. And I would say this is garbage
in, garbage out, this is not actual
333
00:29:23,120 --> 00:29:28,759
adversarial learning. So we have a long
way to go when it comes to language tasks
334
00:29:28,759 --> 00:29:32,740
and being able to do adversarial learning
and there is some research in this, but
335
00:29:32,740 --> 00:29:37,279
it's not really advanced yet. So hopefully
this is something that we can continue to
336
00:29:37,279 --> 00:29:42,429
work on and advance further and if so we
need to support a few different types of
337
00:29:42,429 --> 00:29:47,426
networks that are more common in NLP than
they are in computer vision.
338
00:29:50,331 --> 00:29:54,759
There are some other notable open-source libraries that
are available to you and I'll cover just a
339
00:29:54,759 --> 00:29:59,610
few here. There's a "Vanderbilt
computational economics research lab" that
340
00:29:59,610 --> 00:30:03,679
has adlib and this allows you to do
poisoning attacks. So if you want to
341
00:30:03,679 --> 00:30:09,429
target training data and poison it, then
you can do so with that and use scikit-
342
00:30:09,429 --> 00:30:16,590
learn. DeepFool is similar in spirit to the fast
gradient sign method, but it tries to do
343
00:30:16,590 --> 00:30:21,590
smaller perturbations, it tries to be less
detectable to us humans.
344
00:30:23,171 --> 00:30:28,284
It's based on Theano, which is another library that I believe uses Lua as well as Python.
345
00:30:29,669 --> 00:30:34,049
"FoolBox" is kind of neat because I only
heard about it last week, but it collects
346
00:30:34,049 --> 00:30:39,309
a bunch of different techniques all in one
library and you could use it with one
347
00:30:39,309 --> 00:30:43,160
interface. So if you want to experiment
with a few different ones at once, I would
348
00:30:43,160 --> 00:30:47,460
recommend taking a look at that and
finally for something that we'll talk
349
00:30:47,460 --> 00:30:53,600
about briefly in a short period of time we
have "Evolving AI Lab", which release a
350
00:30:53,600 --> 00:30:59,710
fooling library and this fooling library
is able to generate images that you or I
351
00:30:59,710 --> 00:31:04,573
can't tell what they are, but that the neural
network is convinced it is something.
352
00:31:05,298 --> 00:31:09,940
So we'll talk about maybe some
applications of this in a moment, but they
353
00:31:09,940 --> 00:31:13,559
also open sourced all of their code and
they're researchers, who open sourced
354
00:31:13,559 --> 00:31:19,649
their code, which is always very exciting.
As you may have noticed from some of the
355
00:31:19,649 --> 00:31:25,500
research I already cited, most of the
studies and the research in this area has
356
00:31:25,500 --> 00:31:29,830
been on malicious attacks. So there are very
few people trying to figure out how to do
357
00:31:29,830 --> 00:31:33,769
this for what I would call benevolent
purposes. Most of them are trying to act
358
00:31:33,769 --> 00:31:39,539
as an adversary in the traditional
computer security sense. They're perhaps
359
00:31:39,539 --> 00:31:43,889
studying spam filters and how spammers can
get by them. They're perhaps looking at
360
00:31:43,889 --> 00:31:48,669
network intrusion or botnet-attacks and so
forth. They're perhaps looking at self-
361
00:31:48,669 --> 00:31:53,390
driving cars, and I know that was
referenced earlier as well in Hendrik and
362
00:31:53,390 --> 00:31:57,889
Karen's talk, they're perhaps trying to
make a yield sign look like a stop sign or
363
00:31:57,889 --> 00:32:02,760
a stop sign look like a yield sign or a
speed limit, and so forth, and scarily
364
00:32:02,760 --> 00:32:07,669
they are quite successful at this. Or
perhaps they're looking at data poisoning,
365
00:32:07,669 --> 00:32:12,441
so how do we poison the model to render
it useless in a particular context, so we
366
00:32:12,441 --> 00:32:17,990
can utilize that? And finally, malware.
So what a few researchers were able to
367
00:32:17,990 --> 00:32:22,669
show is, by just changing a few things in
the malware they were able to upload their
368
00:32:22,669 --> 00:32:26,270
malware to Google Mail and send it to
someone and this was still fully
369
00:32:26,270 --> 00:32:31,580
functional malware. In that same sense
there's the malGAN project, which uses a
370
00:32:31,580 --> 00:32:38,549
generative adversarial network to create
malware that works, I guess. So there's a
371
00:32:38,549 --> 00:32:43,326
lot of research on these kinds of malicious
attacks within adversarial learning.
372
00:32:44,984 --> 00:32:51,929
But what I wonder is how might we use this for
good. And I put "good" in quotation marks,
373
00:32:51,929 --> 00:32:56,179
because we all have different ethical and
moral systems we use. And what you may
374
00:32:56,179 --> 00:33:00,289
decide is ethical for you might be
different. But I think as a community,
375
00:33:00,289 --> 00:33:05,450
especially at a conference like this,
hopefully we can converge on some ethical
376
00:33:05,450 --> 00:33:10,183
privacy concerned version of using these
networks.
377
00:33:13,237 --> 00:33:20,990
So I've composed a few ideas and I hope that this is just a starting list of a longer conversation.
378
00:33:22,889 --> 00:33:30,010
One idea is that we can perhaps use this type of adversarial learning to fool surveillance.
379
00:33:30,830 --> 00:33:36,470
Surveillance affects you and me, and it even
disproportionately affects people that
380
00:33:36,470 --> 00:33:41,870
most likely can't be here. So whether or
not we're personally affected, we can care
381
00:33:41,870 --> 00:33:46,419
about the many lives that are affected by
this type of surveillance. And we can try
382
00:33:46,419 --> 00:33:49,667
and build ways to fool surveillance
systems.
383
00:33:50,937 --> 00:33:52,120
Steganography:
384
00:33:52,120 --> 00:33:55,223
So we could potentially, in a world where more and more people
385
00:33:55,223 --> 00:33:58,780
have less of a private way of sending messages to one another,
386
00:33:58,780 --> 00:34:03,080
we can perhaps use adversarial learning to send private messages.
387
00:34:03,830 --> 00:34:08,310
Adware fooling: So
again, where I might have quite a lot of
388
00:34:08,310 --> 00:34:13,859
privilege and I don't actually see ads
that are predatory on me as much, there is
389
00:34:13,859 --> 00:34:19,449
a lot of people in the world that face
predatory advertising. And so how can we
390
00:34:19,449 --> 00:34:23,604
help with those problems by developing
adversarial techniques?
391
00:34:24,638 --> 00:34:26,520
Poisoning your own private data:
392
00:34:27,386 --> 00:34:30,600
This depends on whether you
actually need to use the service and
393
00:34:30,600 --> 00:34:34,590
whether you like how the service is
helping you with the machine learning, but
394
00:34:34,590 --> 00:34:40,110
if you don't care or if you need to
essentially have a burn box of your data,
395
00:34:40,110 --> 00:34:45,760
then potentially you could poison your own
private data. Finally, I want us to use it
396
00:34:45,760 --> 00:34:51,139
to investigate deployed models. So even
if we don't actually need a use for
397
00:34:51,139 --> 00:34:56,010
fooling this particular network, the more
we know about what's deployed and how we
398
00:34:56,010 --> 00:35:00,350
can fool it, the more we're able to keep
up with this technology as it continues to
399
00:35:00,350 --> 00:35:04,630
evolve. So the more that we're practicing,
the more that we're ready for whatever
400
00:35:04,630 --> 00:35:09,800
might happen next. And finally I really
want to hear your ideas as well. So I'll
401
00:35:09,800 --> 00:35:13,940
be here throughout the whole Congress and
of course you can share during the Q&A
402
00:35:13,940 --> 00:35:17,073
time. If you have great ideas, I really
want to hear them.
403
00:35:20,635 --> 00:35:26,085
So I decided to play around a little bit with some of my ideas.
404
00:35:26,810 --> 00:35:32,720
And I was convinced perhaps that I could make Facebook think I was a cat.
405
00:35:33,305 --> 00:35:36,499
This is my goal. Can Facebook think I'm a cat?
406
00:35:37,816 --> 00:35:40,704
Because nobody really likes Facebook. I
mean let's be honest, right?
407
00:35:41,549 --> 00:35:44,166
But I have to be on it because my mom messages me there
408
00:35:44,166 --> 00:35:46,020
and she doesn't use the email anymore.
409
00:35:46,020 --> 00:35:47,890
So I'm on Facebook. Anyways.
410
00:35:48,479 --> 00:35:55,151
So I used a pre-trained Inception model and Keras and I fine-tuned the layers.
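A minimal sketch of that fine-tuning step in Keras, under the assumption of a two-class person-vs-cat image folder (the data/train path and layer sizes are hypothetical): keep the pre-trained Inception feature layers frozen, drop the ImageNet head, and retrain only a small new head.

    from keras.applications.inception_v3 import InceptionV3, preprocess_input
    from keras.models import Model
    from keras.layers import GlobalAveragePooling2D, Dense
    from keras.preprocessing.image import ImageDataGenerator

    # Pre-trained Inception without its ImageNet classification head.
    base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
    for layer in base.layers:
        layer.trainable = False          # piggyback on the pre-trained features

    # New final layers: a small person-vs-cat head that we actually train.
    x = GlobalAveragePooling2D()(base.output)
    x = Dense(128, activation='relu')(x)
    out = Dense(2, activation='softmax')(x)
    model = Model(inputs=base.input, outputs=out)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Hypothetical folder layout: data/train/person/*.jpg and data/train/cat/*.jpg
    gen = ImageDataGenerator(preprocessing_function=preprocess_input)
    train = gen.flow_from_directory('data/train', target_size=(299, 299), batch_size=32)
    model.fit_generator(train, epochs=5)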
411
00:35:55,151 --> 00:35:57,190
And I'm not a
computer vision person really. But it
412
00:35:57,190 --> 00:36:01,770
took me like a day of figuring out how
computer vision people transfer their data
413
00:36:01,770 --> 00:36:06,350
into something I can put inside of a
network. I figured that out and I was able to
414
00:36:06,350 --> 00:36:12,040
quickly train a model and the model could
only distinguish between people and cats.
415
00:36:12,040 --> 00:36:15,140
That's all the model knew how to do. I
give it a picture, it says it's a person or
416
00:36:15,140 --> 00:36:19,630
it's a cat. I actually didn't try just
giving it an image of something else, it
417
00:36:19,630 --> 00:36:25,380
would probably guess it's a person or a
cat maybe, 50/50, who knows. What I did
418
00:36:25,380 --> 00:36:31,930
was, I used an image of myself and
eventually I had my fast gradient sign
419
00:36:31,930 --> 00:36:37,700
method, I used cleverhans, and I was able
to slowly increase the epsilon. When the
420
00:36:37,700 --> 00:36:44,100
epsilon is low, you and I can't see
the perturbations, but also the network
421
00:36:44,100 --> 00:36:48,920
can't see the perturbations. So we need to
increase it, and of course as we increase
422
00:36:48,920 --> 00:36:53,300
it, when we're using a technique like
FGSM, we are also increasing the noise
423
00:36:53,300 --> 00:37:00,830
that we see. And when I got to 2.21 epsilon,
I kept uploading it to Facebook and
424
00:37:00,830 --> 00:37:02,350
Facebook kept saying: "Yeah, do you want
to tag yourself?" and I'm like:
425
00:37:02,370 --> 00:37:04,222
"no Idon't, I'm just testing".
426
00:37:05,123 --> 00:37:11,379
Finally I got to an epsilon where Facebook no longer knew I had a face.
427
00:37:11,379 --> 00:37:15,323
So I was just a
book, I was a cat book, maybe.
428
00:37:15,340 --> 00:37:19,590
applause
429
00:37:21,311 --> 00:37:24,740
kjam: So, unfortunately, as we see, I
didn't actually become a cat, because that
430
00:37:24,740 --> 00:37:30,630
would be pretty neat. But I was able to
fool it. I spoke with a computer vision
431
00:37:30,630 --> 00:37:34,760
specialist that I know, and she actually
works in this and I was like: "What
432
00:37:34,760 --> 00:37:39,020
methods do you think Facebook was using?
Did I really fool the neural network or
433
00:37:39,020 --> 00:37:43,140
what did I do?" And she's convinced most
likely that they're actually using a
434
00:37:43,140 --> 00:37:47,580
statistical method called Viola-Jones,
which takes a look at the statistical
435
00:37:47,580 --> 00:37:53,280
distribution of your face and tries to
guess if there's really a face there. But
436
00:37:53,280 --> 00:37:58,800
what I was able to show: transferability.
That is, I can use my neural network even
437
00:37:58,800 --> 00:38:05,380
to fool this statistical model, so now I
have a very noisy but happy photo on FB
438
00:38:08,548 --> 00:38:14,140
Another use case potentially is
adversarial steganography and I was really
439
00:38:14,140 --> 00:38:18,590
excited reading this paper. What this
paper covered, and they actually released
440
00:38:18,590 --> 00:38:22,860
the library as I mentioned: they studied
the ability of a neural network to be
441
00:38:22,860 --> 00:38:26,309
convinced that something's there that's
not actually there.
442
00:38:27,149 --> 00:38:30,177
And what they used, they used the MNIST training set.
443
00:38:30,240 --> 00:38:33,420
I'm sorry, if that's like a trigger word
444
00:38:33,420 --> 00:38:38,410
if you've used MNIST a million times, then
I'm sorry for this, but what they use is
445
00:38:38,410 --> 00:38:43,290
MNIST, which is the digits zero through
nine, and what they were able to show
446
00:38:43,290 --> 00:38:48,790
using evolutionary algorithms is they were
able to generate things that to us look
447
00:38:48,790 --> 00:38:53,280
maybe like art and they actually used it
on the CIFAR data set too, which has
448
00:38:53,280 --> 00:38:57,320
colors, and it was quite beautiful. Some
of what they created in fact they showed
449
00:38:57,320 --> 00:39:04,340
in a gallery. And what the network sees
here is the digits across the top. They
450
00:39:04,340 --> 00:39:12,170
see that digit, they are more than 99%
convinced that that digit is there and
451
00:39:12,170 --> 00:39:15,476
what we see is pretty patterns or just
noise.
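The evolutionary search used in that work is more elaborate, but a much simpler random hill-climbing sketch shows the basic idea: start from noise and keep any single-pixel mutation that makes the classifier more confident a target digit is there, until it is more than 99% convinced. The classifier here is assumed to be a Keras-style MNIST model with a predict method.

    import numpy as np

    def target_confidence(image, digit, classifier):
        # Probability the (assumed Keras-style) MNIST classifier gives the target digit.
        return classifier.predict(image.reshape(1, 28, 28, 1))[0][digit]

    def grow_fooling_image(classifier, digit, steps=20000, seed=0):
        rng = np.random.RandomState(seed)
        img = rng.rand(28, 28).astype('float32')          # start from pure noise
        best = target_confidence(img, digit, classifier)
        for _ in range(steps):
            candidate = img.copy()
            i, j = rng.randint(28, size=2)
            candidate[i, j] = rng.rand()                  # mutate one pixel
            score = target_confidence(candidate, digit, classifier)
            if score > best:                              # keep only improvements
                img, best = candidate, score
            if best > 0.99:                               # "more than 99% convinced"
                break
        return img, best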
452
00:39:16,778 --> 00:39:19,698
When I was reading this paper I was thinking,
453
00:39:19,698 --> 00:39:23,620
how can we use this to send
messages to each other that nobody else
454
00:39:23,620 --> 00:39:28,511
will know is there? I'm just sending
really nice.., I'm an artist and this is
455
00:39:28,511 --> 00:39:35,200
my art and I'm sharing it with my friend.
And in a world where I'm afraid to go home
456
00:39:35,200 --> 00:39:42,360
because there's a crazy person in charge
and I'm afraid that they might look at my
457
00:39:42,360 --> 00:39:47,040
phone, in my computer, and a million other
things and I just want to make sure that
458
00:39:47,040 --> 00:39:51,650
my friend has my pin number or this or
that or whatever. I see a use case for my
459
00:39:51,650 --> 00:39:56,120
life, but again I live a fairly
privileged life, there are other people
460
00:39:56,120 --> 00:40:01,690
where their actual life and livelihood and
security might depend on using a technique
461
00:40:01,690 --> 00:40:06,150
like this. And I think we could use
adversarial learning to create a new form
462
00:40:06,150 --> 00:40:07,359
of steganography.
463
00:40:11,289 --> 00:40:17,070
Finally, I cannot stress
enough that the more information we have
464
00:40:17,070 --> 00:40:20,620
about the systems that we interact with
every day, our machine learning
465
00:40:20,620 --> 00:40:24,850
systems, our AI systems, or whatever
you want to call them, our deep
466
00:40:24,850 --> 00:40:29,701
networks, the more information we have,
the better we can fight them, right. We
467
00:40:29,701 --> 00:40:33,920
don't need perfect knowledge, but the more
knowledge that we have, the better an
468
00:40:33,920 --> 00:40:41,360
adversary we can be. I thankfully now live
in Germany and if you are also a European
469
00:40:41,360 --> 00:40:46,770
resident: We have GDPR, which is the
general data protection regulation and it
470
00:40:46,770 --> 00:40:55,650
goes into effect in May of 2018. We can
use GDPR to make requests about our data,
471
00:40:55,650 --> 00:41:00,450
we can use GDPR to make requests about
machine learning systems that we interact
472
00:41:00,450 --> 00:41:07,840
with, this is a right that we have. And in
recital 71 of the GDPR it states: "The
473
00:41:07,840 --> 00:41:12,550
data subject should have the right to not
be subject to a decision, which may
474
00:41:12,550 --> 00:41:17,730
include a measure, evaluating personal
aspects relating to him or her which is
475
00:41:17,730 --> 00:41:22,880
based solely on automated processing and
which produces legal effects concerning
476
00:41:22,880 --> 00:41:28,010
him or her or similarly significantly
affects him or her, such as automatic
477
00:41:28,010 --> 00:41:33,620
refusal of an online credit application or
e-recruiting practices without any human
478
00:41:33,620 --> 00:41:39,270
intervention." And I'm not a lawyer and I
don't know how this will be implemented
479
00:41:39,270 --> 00:41:43,990
and it's a recital, so we don't even know,
if it will be enforced the same way, but
480
00:41:43,990 --> 00:41:50,720
the good news is: Pieces of this same
sentiment are in the actual amendments and
481
00:41:50,720 --> 00:41:55,580
if they're in the amendments, then we can
legally use them. And what it also says
482
00:41:55,580 --> 00:41:59,920
is, we can ask companies to port our data
other places, we can ask companies to
483
00:41:59,920 --> 00:42:03,890
delete our data, we can ask for
information about how our data is
484
00:42:03,890 --> 00:42:09,010
processed, we can ask for information
about what different automated decisions
485
00:42:09,010 --> 00:42:15,750
are being made, and the more we all here
ask for that data, the more we can also
486
00:42:15,750 --> 00:42:20,530
share that same information with people
worldwide. Because the systems that we
487
00:42:20,530 --> 00:42:25,091
interact with, they're not special to us,
they're the same types of systems that are
488
00:42:25,091 --> 00:42:30,610
being deployed everywhere in the world. So
we can help our fellow humans outside of
489
00:42:30,610 --> 00:42:36,400
Europe by being good caretakers and using
our rights to make more information
490
00:42:36,400 --> 00:42:41,960
available to the entire world and to use
this information, to find ways to use
491
00:42:41,960 --> 00:42:46,242
adversarial learning to fool these types
of systems.
492
00:42:47,512 --> 00:42:56,500
applause
493
00:42:56,662 --> 00:43:03,360
So how else might we be able to harness
this for good? I cannot stress enough the value of
494
00:43:03,360 --> 00:43:08,260
GDPR and our right to collect more
information about the information they're
495
00:43:08,260 --> 00:43:14,110
already collecting about us and everyone
else. So use it, let's find ways to share
496
00:43:14,110 --> 00:43:17,740
the information we gain from it. So I
don't want it to just be that one person
497
00:43:17,740 --> 00:43:21,020
requests it and they learn something. We
have to find ways to share this
498
00:43:21,020 --> 00:43:28,080
information with one another. Test low-
tech ways. I'm so excited about the maker
499
00:43:28,080 --> 00:43:32,850
space here and maker culture and other
low-tech or human-crafted ways to fool
500
00:43:32,850 --> 00:43:37,890
networks. We can use adversarial learning
perhaps to get good ideas on how to fool
501
00:43:37,890 --> 00:43:43,350
networks, to get ideas for lower-tech ways. What if
I painted red pixels all over my face?
502
00:43:43,350 --> 00:43:48,600
Would I still be recognized? Would I not?
Let's experiment with things that we learn
503
00:43:48,600 --> 00:43:53,570
from adversarial learning and try to find
other lower-tech solutions to the same problem
504
00:43:55,428 --> 00:43:59,930
Finally, or nearly finally, we
need to increase the research beyond just
505
00:43:59,930 --> 00:44:04,010
computer vision. Quite a lot of
adversarial learning has been only in
506
00:44:04,010 --> 00:44:08,220
computer vision and while I think that's
important and it's also been very
507
00:44:08,220 --> 00:44:12,030
practical, because we can start to see how
we can fool something, we need to figure
508
00:44:12,030 --> 00:44:15,920
out natural language processing, we need
to figure out other ways that machine
509
00:44:15,920 --> 00:44:19,933
learning systems are being used, and we
need to come up with clever ways to fool them.
510
00:44:21,797 --> 00:44:26,000
Finally, spread the word! So I don't
want the conversation to end here, I don't
511
00:44:26,000 --> 00:44:30,950
want the conversation to end at Congress,
I want you to go back to your hacker
512
00:44:30,950 --> 00:44:36,530
collective, your local CCC, the people
that you talk with, your co-workers and I
513
00:44:36,530 --> 00:44:41,340
want you to spread the word. I want you to
do workshops on adversarial learning, I
514
00:44:41,340 --> 00:44:47,930
want more people to not treat this AI as
something mystical and powerful, because
515
00:44:47,930 --> 00:44:52,340
unfortunately it is powerful, but it's not
mystical! So we need to demystify this
516
00:44:52,340 --> 00:44:57,040
space, we need to experiment, we need to
hack on it and we need to find ways to
517
00:44:57,040 --> 00:45:02,310
play with it and spread the word to other
people. Finally, I really want to hear
518
00:45:02,310 --> 00:45:10,480
your other ideas, and before I leave today
I have to say a little bit about why I
519
00:45:10,480 --> 00:45:15,820
decided to join the resiliency track this
year. I read about the resiliency track
520
00:45:15,820 --> 00:45:21,910
and I was really excited. It spoke to me.
And I said I want to live in a world
521
00:45:21,910 --> 00:45:27,230
where, even if there's an entire burning
trash fire around me, I know that there
522
00:45:27,230 --> 00:45:32,010
are other people that I care about, that I
can count on, that I can work with to try
523
00:45:32,010 --> 00:45:37,840
and at least protect portions of our
world. To try and protect ourselves, to
524
00:45:37,840 --> 00:45:43,940
try and protect people that do not have as
much privilege. So, what I want to be a
525
00:45:43,940 --> 00:45:49,240
part of, is something that can use maybe
the skills I have and the skills you have
526
00:45:49,240 --> 00:45:56,590
to do something with that. And your data
is a big source of value for everyone.
527
00:45:56,590 --> 00:46:02,820
Any free service you use, they are selling
your data. OK, I don't know that for a
528
00:46:02,820 --> 00:46:08,420
fact, but I feel very certain that they're most
529
00:46:08,420 --> 00:46:12,560
likely selling your data. And if they're
selling your data, they might also be
530
00:46:12,560 --> 00:46:17,730
buying your data. And there is a whole
market, that's legal, that's freely
531
00:46:17,730 --> 00:46:22,670
available, to buy and sell your data. And
they make money off of that, and they mine
532
00:46:22,670 --> 00:46:28,910
more information, and make more money off
of that, and so forth. So, I will read a
533
00:46:28,910 --> 00:46:35,410
little bit of my opinions that I put forth
on this. Determine who you share your data
534
00:46:35,410 --> 00:46:41,910
with and for what reasons. GDPR and data
portability give us European residents
535
00:46:41,910 --> 00:46:44,410
stronger rights than most of the world.
536
00:46:44,920 --> 00:46:47,940
Let's use them. Let's choose privacy
537
00:46:47,940 --> 00:46:52,800
concerned ethical data companies over
corporations that are entirely built on
538
00:46:52,800 --> 00:46:58,260
selling ads. Let's build start-ups,
organizations, open-source tools and
539
00:46:58,260 --> 00:47:05,691
systems that we can be truly proud of. And
let's port our data to those.
540
00:47:05,910 --> 00:47:15,310
Applause
541
00:47:15,409 --> 00:47:18,940
Herald: Amazing. We have,
we have time for a few questions.
542
00:47:18,940 --> 00:47:21,860
K.J.: I'm not done yet, sorry, it's fine.
Herald: I'm so sorry.
543
00:47:21,860 --> 00:47:24,750
K.J.: Laughs It's cool.
No big deal.
544
00:47:24,750 --> 00:47:31,520
So, machine learning. Closing remarks, a
brief round-up. Closing remarks: the thing is
545
00:47:31,520 --> 00:47:35,250
that machine learning is not very
intelligent. I think artificial
546
00:47:35,250 --> 00:47:39,330
intelligence is a misnomer in a lot of
ways, but this doesn't mean that people
547
00:47:39,330 --> 00:47:43,830
are going to stop using it. In fact
there are very smart, powerful, and rich
548
00:47:43,830 --> 00:47:49,850
people that are investing more than ever
in it. So it's not going anywhere. And
549
00:47:49,850 --> 00:47:53,620
it's going to be something that
potentially becomes more dangerous over
550
00:47:53,620 --> 00:47:58,570
time. Because as we hand over more to
these systems, they could
551
00:47:58,570 --> 00:48:04,240
potentially control more and more of our
lives. We can use, however, adversarial
552
00:48:04,240 --> 00:48:09,320
machine learning techniques to find ways
to fool "black box" networks. So we can
553
00:48:09,320 --> 00:48:14,400
use these and we know we don't have to
have perfect knowledge. However,
554
00:48:14,400 --> 00:48:18,930
information is powerful. And the more
information that we do have, the more we're
555
00:48:18,930 --> 00:48:25,860
able to become a good GDPR-based
adversary. So please use GDPR and let's
556
00:48:25,860 --> 00:48:31,230
discuss ways where we can share
information. Finally, please support open-
557
00:48:31,230 --> 00:48:35,590
source tools and research in this space,
because we need to keep up with where the
558
00:48:35,590 --> 00:48:41,790
state of the art is. So we need to keep
ourselves moving and open in that way. And
559
00:48:41,790 --> 00:48:46,670
please, support ethical data companies. Or
start one. If you come to me and you say
560
00:48:46,670 --> 00:48:50,240
"Katharine, I'm going to charge you this
much money, but I will never sell your
561
00:48:50,240 --> 00:48:56,520
data. And I will never buy your data." I
would much rather you handle my data. So I
562
00:48:56,520 --> 00:49:03,390
want us, especially those within the EU,
to start a new economy around trust, and
563
00:49:03,390 --> 00:49:12,740
privacy, and ethical data use.
Applause
564
00:49:12,740 --> 00:49:15,830
Thank you very much.
Thank you.
565
00:49:15,830 --> 00:49:18,050
Herald: OK. We still have time for a few
questions.
566
00:49:18,050 --> 00:49:20,390
K.J.: No, no, no. No worries, no worries.
Herald: Less than the last time I walked
567
00:49:20,390 --> 00:49:23,870
up here, but we do.
K.J.: Yeah, now I'm really done.
568
00:49:23,870 --> 00:49:27,730
Herald: Come up to one of the mics in the
front section and raise your hand. Can we
569
00:49:27,730 --> 00:49:31,584
take a question from mic one.
Question: Thank you very much for the very
570
00:49:31,584 --> 00:49:37,860
interesting talk. One impression that I
got during the talk was, with the
571
00:49:37,860 --> 00:49:42,420
adversarial learning approach, aren't we
just doing pen testing and quality
572
00:49:42,420 --> 00:49:47,920
assurance for the AI companies? They're
just going to build better machines.
573
00:49:47,920 --> 00:49:52,910
Answer: That's a very good question and of
course most of this research right now is
574
00:49:52,910 --> 00:49:56,780
coming from those companies, because
they're worried about this. What, however,
575
00:49:56,780 --> 00:50:02,290
they've shown is, they don't really have a
good way to fool, to learn how to fool
576
00:50:02,290 --> 00:50:08,710
this. Most likely they will need to use a
different type of network, eventually. So
577
00:50:08,710 --> 00:50:13,440
probably, whether it's the blind spots or
the linearity of these networks, they are
578
00:50:13,440 --> 00:50:18,000
easy to fool and they will have to come up
with a different method for generating
579
00:50:18,000 --> 00:50:24,520
something that is robust enough to not be
tricked. So, to some degree yes, it's a
580
00:50:24,520 --> 00:50:28,520
cat-and-mouse game, right. But that's why
I want the research and the open source to
581
00:50:28,520 --> 00:50:33,410
continue as well. And I would be highly
suspicious if they all of a sudden figured out
582
00:50:33,410 --> 00:50:38,170
a way to make a neural network which has
proven linear relationships, that we can
583
00:50:38,170 --> 00:50:42,560
exploit, nonlinear. And if so, it's
usually a different type of network that's
584
00:50:42,560 --> 00:50:47,430
a lot more expensive to train and that
doesn't actually generalize well. So we're
585
00:50:47,430 --> 00:50:51,280
going to really hit them in a way where
they're going to have to be more specific,
586
00:50:51,280 --> 00:50:59,620
try harder, and I would rather do that
than just kind of give up.
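The linearity mentioned in this answer is exactly what the fast gradient sign method exploits. As a minimal, hedged sketch (a tiny untrained torch model stands in for the network under attack; this is not code from the talk), the attack nudges every input feature in the direction that increases the loss:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Tiny untrained classifier as a stand-in for the network under attack.
    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 3))
    model.eval()

    x = torch.rand(1, 20, requires_grad=True)  # placeholder input
    true_label = torch.tensor([0])

    # Fast gradient sign method: because the network behaves roughly linearly
    # around x, a small step along the sign of the gradient can change the output.
    loss = F.cross_entropy(model(x), true_label)
    loss.backward()
    epsilon = 0.1
    x_adv = (x + epsilon * x.grad.sign()).detach()

    with torch.no_grad():
        print("clean prediction:      ", model(x).argmax(dim=1).item())
        print("adversarial prediction:", model(x_adv).argmax(dim=1).item())

With a trained model and a suitable epsilon, the adversarial prediction frequently differs from the clean one even though the two inputs look nearly identical.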
587
00:50:59,620 --> 00:51:02,560
Herald: Next one.
Mic 2
588
00:51:02,560 --> 00:51:07,840
Q: Hello. Thank you for the nice talk. I
wanted to ask, have you ever tried looking
589
00:51:07,840 --> 00:51:14,720
at it from the other direction? Like, just
trying to feed the companies falsely
590
00:51:14,720 --> 00:51:21,560
classified data. And just do it with such
massive amounts of data that they
591
00:51:21,560 --> 00:51:25,380
learn from it at a certain point.
A: Yes, that's these poisoning attacks. So
592
00:51:25,380 --> 00:51:30,020
when we talk about poisoning attacks, we are
essentially feeding bad training data and
593
00:51:30,020 --> 00:51:35,120
we're trying to get them to learn bad
things. Or I wouldn't say bad things, but
594
00:51:35,120 --> 00:51:37,540
we're trying to get them to learn false
information.
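A minimal sketch of such a poisoning attack, assuming a hypothetical service that retrains on whatever labelled data users submit; a toy nearest-centroid learner stands in for the real model:

    import numpy as np

    rng = np.random.default_rng(0)

    # Clean data: two well-separated classes around (0, 0) and (4, 4).
    clean_x = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
    clean_y = np.array([0] * 100 + [1] * 100)

    # Poisoned submissions: points from class 1's region, deliberately labelled 0.
    poison_x = rng.normal(4, 1, (80, 2))
    poison_y = np.zeros(80, dtype=int)

    def train_centroid_classifier(x, y):
        """Toy learner: remember each class's centroid."""
        return {c: x[y == c].mean(axis=0) for c in np.unique(y)}

    def predict(centroids, point):
        return min(centroids, key=lambda c: np.linalg.norm(point - centroids[c]))

    clean_model = train_centroid_classifier(clean_x, clean_y)
    poisoned_model = train_centroid_classifier(
        np.vstack([clean_x, poison_x]), np.concatenate([clean_y, poison_y])
    )

    test_point = np.array([2.5, 2.5])  # borderline point the clean model assigns to class 1
    print("clean model says:   ", predict(clean_model, test_point))
    print("poisoned model says:", predict(poisoned_model, test_point))

The poisoned submissions drag the class-0 centroid toward class 1's region, so the retrained model starts misclassifying borderline points; real poisoning attacks against APIs that learn from user input follow the same logic, just with more care.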
595
00:51:37,540 --> 00:51:42,781
And that already happens by accident all
the time, so I think the more we can, if
596
00:51:42,781 --> 00:51:46,491
we share information and they have a
publicly available API, where they're
597
00:51:46,491 --> 00:51:49,970
actually actively learning from our
information, then yes I would say
598
00:51:49,970 --> 00:51:55,180
poisoning is a great way to attack. And we
can also share information of maybe how
599
00:51:55,180 --> 00:51:58,360
that works.
So especially I would be intrigued if we
600
00:51:58,360 --> 00:52:02,330
can do poisoning for adware and malicious
ad targeting.
601
00:52:02,330 --> 00:52:07,300
Mic 2: OK, thank you.
Herald: One more question from the
602
00:52:07,300 --> 00:52:12,300
internet and then we run out of time.
K.J.: Oh no, sorry.
603
00:52:12,300 --> 00:52:14,290
Herald: So you can find Katharine after.
Signal-Angel: Thank you. One question from
604
00:52:14,290 --> 00:52:18,210
the internet. What exactly can I do to
harden my model against adversarial
605
00:52:18,210 --> 00:52:21,210
samples?
K.J.: Sorry?
606
00:52:21,210 --> 00:52:27,080
Signal: What exactly can I do to harden my
model against adversarial samples?
607
00:52:27,080 --> 00:52:33,340
K.J.: Not much. What they have shown is
that if you train on a mixture of real
608
00:52:33,340 --> 00:52:39,300
training data and adversarial data it's a
little bit harder to fool, but that just
609
00:52:39,300 --> 00:52:44,720
means that you have to try more iterations
of adversarial input. So right now, the
610
00:52:44,720 --> 00:52:51,520
recommendation is to train on a mixture of
adversarial and real training data and to
611
00:52:51,520 --> 00:52:56,330
continue to do that over time. And I would
argue that you need to maybe do data
612
00:52:56,330 --> 00:53:00,400
validation on input. And if you do data
validation on input maybe you can
613
00:53:00,400 --> 00:53:05,100
recognize abnormalities. But that's
because I come from mainly like production
614
00:53:05,100 --> 00:53:09,220
levels, not theory, and I think maybe
you should just test things, and if they
615
00:53:09,220 --> 00:53:15,210
look weird you should maybe not take them
into the system.
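A minimal sketch of that recommendation, assuming a hypothetical torch model: at each step, craft adversarial examples from the current model, train on the clean/adversarial mixture, and add a naive input-validation check that flags samples far outside the training range:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Stand-in model and data; in practice use your real network and dataset.
    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
    epsilon = 0.1

    x_clean = torch.rand(64, 20)
    y = torch.randint(0, 2, (64,))

    for step in range(100):
        # Craft FGSM-style adversarial versions of the current batch.
        x = x_clean.clone().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        x_adv = (x + epsilon * x.grad.sign()).detach()

        # Train on the mixture of clean and adversarial data.
        optimizer.zero_grad()
        mixed_x = torch.cat([x_clean, x_adv])
        mixed_y = torch.cat([y, y])
        F.cross_entropy(model(mixed_x), mixed_y).backward()
        optimizer.step()

    def looks_abnormal(sample, reference, threshold=3.0):
        """Naive input validation: flag samples far outside the training distribution."""
        z = (sample - reference.mean(0)) / (reference.std(0) + 1e-8)
        return bool(z.abs().max() > threshold)

    print("suspicious input?", looks_abnormal(torch.full((20,), 10.0), x_clean))

As she says, this only raises the cost of an attack: the defender has to keep generating fresh adversarial data and re-checking inputs over time.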
616
00:53:15,210 --> 00:53:19,340
Herald: And that's all for the questions.
I wish we had more time but we just don't.
617
00:53:19,340 --> 00:53:21,660
Please give it up for Katharine Jarmul
618
00:53:21,660 --> 00:53:26,200
Applause
619
00:53:26,200 --> 00:53:31,050
34c3 postroll music
620
00:53:31,050 --> 00:53:47,950
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!