34c3 preroll music
Herald: ...and I will let Katharine take
the stage now.
Katharine Jarmul, kjam: Awesome! Well,
thank you so much for the introduction and
thank you so much for being here, taking
your time. I know that Congress is really
exciting, so I really appreciate you
spending some time with me today. It's my
first ever Congress, so I'm also really
excited and I want to meet new people. So
if you wanna come say hi to me later, I'm
somewhat friendly, so we can maybe be
friends later. Today what we're going to
talk about is deep learning blind spots or
how to fool "artificial intelligence". I
like to put "artificial intelligence" in
quotes, because.. yeah, we'll talk about
that, but I think it should be in quotes.
And today we're going to talk a little bit
about deep learning, how it works and how
you can maybe fool it. So I ask us: Is AI
becoming more intelligent?
And I ask this because when I open a
browser and, of course, often it's Chrome
and Google is already prompting me
for what I should look at
and it knows that I work with machine
learning, right?
And these are the headlines
that I see every day:
"Are Computers Already Smarter Than
Humans?"
If so, I think we could just pack up and
go home, right?
Like, we fixed computers,
right? If a computer is smarter than me,
then I already fixed it, we can go home,
there's no need to talk about computers
anymore, let's just move on with life. But
that's not true, right? We know, because
we work with computers and we know how
stupid computers are sometimes. They're
pretty bad. Computers do only what we tell
them to do, generally, so I don't think a
computer can think and be smarter than me.
So alongside those same types of headlines, you also see this: And
yeah, so Apple recently released their
face ID and this unlocks your phone with
your face and it seems like a great idea,
right? You have a unique face, you have a
face, nobody else can take your face. But
unfortunately what we find out about
computers is that they're awful sometimes,
and for these women.. for this Chinese
woman that owned an iPhone,
her coworker was able to unlock her phone.
And I think Hendrik and Karen talked about this, if you were here for the last talk ("Beeinflussung durch künstliche Intelligenz"). We have a lot of problems
in machine learning and one of them is
stereotypes and prejudice that are within
our training data or within our minds that
leak into our models. And perhaps they
didn't have adequate training data for distinguishing the facial features of Chinese people. And perhaps it's other problems
with their model or their training data or
whatever they're trying to do. But they
clearly have some issues, right? So when
somebody asked me: "Is AI gonna take over
the world and is there a super robot
that's gonna come and be my new, you know,
leader or so to speak?" I tell them we
can't even figure out the stuff that we
already have in production. So if we can't
even figure out the stuff we already have
in production, I'm a little bit less worried about the super robot coming to kill me. That said, unfortunately, the powers that be often believe in this, and they believe strongly in "artificial intelligence" and machine learning. They're collecting data
every day about you and me and everyone
else. And they're gonna use this data to
build even better models. This is because
the revolution that we're seeing now in
machine learning has really not much to do
with new algorithms or architectures. It
has a lot more to do with heavy compute
and with massive, massive data sets. And
the more that we have training data of
petabytes per 24 hours or even less, the
more we're able to essentially fix up the
parts that don't work so well. The
companies that we see here are companies
that are investing heavily in machine
learning and AI. Part of how they're
investing heavily is, they're collecting
more and more data about you and me and
everyone else. Google and Facebook, more
than 1 billion active users. I was
surprised to know that in Germany the
desktop search traffic for Google is
higher than most of the rest of the world.
And Baidu is growing as fast as broadband becomes available. And so,
what we see is, these people are
collecting this data and they also are
using new technologies like GPUs and TPUs
in new ways to parallelize workflows
and with this they're able to mess up
less, right? They're still messing up, but
they mess up slightly less. And they're not going to lose interest in this topic, so we need to kind of start to
prepare how we respond to this type of
behavior. One of the things that has been
a big area of research, actually also for
a lot of these companies, is what we'll
talk about today and that's adversarial
machine learning. But the first thing that
we'll start with is what is behind what we
call AI. So most of the time when you
think of AI or something like Siri and so
forth, you are actually potentially
talking about an old-school rule-based
system. This is a rule, like you say a
particular thing and then Siri is like:
"Yes, I know how to respond to this". And
we even hard program these types of things
in, right? That is one version of AI: essentially, it's been pre-programmed to do and understand certain things. Another
form is used, for example, by the people trying to build AI robots and what we call "general AI", something that can maybe learn like a human: they'll use reinforcement learning.
I don't specialize in reinforcement
learning.
But what it does is it essentially
tries to reward you for
behaviour that you're expected to do. So
if you complete a task, you get a
cookie. You complete two other tasks, you
get two or three more cookies depending on
how important the task is. And this will
help you learn how to behave to get more
points and it's used a lot in robots and
gaming and so forth. And I'm not really
going to talk about that today because
most of that is still not really something
that you or I interact with. Well, what I
am gonna talk about today is neural
networks, or as some people like to call
them "deep learning", right? So deep
learning 1: The neural network versus deep
learning battle awhile ago. So here's an
example neural network: we have an input
layer and that's where we essentially make
a quantitative version of whatever our
data is. So we need to make it into
numbers. Then we have a hidden layer and
we might have multiple hidden layers. Depending on how deep our network is (we can even have a network inside a network), we might have very different layers there, and they may even act in cyclical ways. That's where all the weights and variables and the learning happen, so it holds a lot of the information and data that we eventually want to train there. And
finally we have an output layer. Depending on the network and what we're trying to do, the output layer can vary between something that looks like the input, for example if we want to machine translate, then I want the output to look like the input, just in a different language, or the output could be a different class: it can be, you know, "this is a car" or "this is a train" and so forth. So it really depends what you're trying to solve, but the output layer gives us the answer.
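To make the picture concrete, here is a minimal sketch of such a network in Keras (the Python library that comes up again later in the talk); the layer sizes and the ten-class output are illustrative assumptions, not anything from the slide:

```python
# A minimal sketch, assuming Keras with a TensorFlow backend.
# Input layer: 784 numbers (e.g. a flattened 28x28 image),
# one hidden layer holding the learned weights,
# and an output layer giving one score per class.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,)))  # hidden layer
model.add(Dense(10, activation='softmax'))                    # output layer: 10 classes

# Training uses backpropagation with stochastic gradient descent,
# which is described next.
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=5)  # x_train, y_train: your numeric data
```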
And how do we train this? We use backpropagation. Backpropagation is nothing new, and neither is one of the most popular methods for doing it, called stochastic gradient descent. What we do when we go through
that part of the training, is we go from
the output layer and we go backwards
through the network. That's why it's
called backpropagation, right? And as we
go backwards through the network, in the
most simple way, we upvote and downvote
what's working and what's not working. So
we say: "oh you got it right, you get a
little bit more importance", or "you got
it wrong, you get a little bit less
importance". And eventually we hope
over time, that they essentially correct
each other's errors enough that we get a
right answer. So that's a very general
overview of how it works and the cool
thing is: Because it works that way, we
can fool it. And people have been
researching ways to fool it for quite some
time. So I give you a brief overview of
the history of this field, so we can kind
of know where we're working from and maybe
hopefully then where we're going to. In 2005, one of the first important papers approaching adversarial learning was written by a group of researchers who wanted to see if they could act as an informed attacker and attack a linear classifier. So this is
just a spam filter and they're like can I
send spam to my friend? I don't know why
they would want to do this, but: "Can I
send spam to my friend, if I tried testing
out a few ideas?" And what they were able
to show is: Yes, rather than just, you
know, trial and error which anybody can do
or a brute force attack of just like send
a thousand emails and see what happens,
they were able to craft a few algorithms
that they could use to try and find
important words to change, to make it go
through the spam filter. In 2007 NIPS,
which is a very popular machine learning
conference, had one of their first all-day
workshops on computer security. And when
they did so, they had a bunch of different
people that were working on machine
learning in computer security: From
malware detection, to network intrusion
detection, to of course spam. And they
also had a few talks on this type of
adversarial learning. So how do you act as
an adversary to your own model? And then
how do you learn how to counter that
adversary? In 2013 there was a really
great paper that got a lot of people's
attention called "Poisoning Attacks
against Support Vector Machines". Now
support vector machines are essentially
usually a linear classifier and we use
them a lot to say, "this is a member of
this class, that, or another", when we
pertain to text. So I have a text and I
want to know what the text is about or I
want to know if it's a positive or
negative sentiment, a lot of times I'll
use a support vector machine; we also call them SVMs. Battista Biggio was the
main researcher and he has actually
written quite a lot about these poisoning
attacks and he poisoned the training data.
So for a lot of these systems, sometimes
they have active learning. This means, you
or I, when we classify our emails as spam,
we're helping train the network. So he
poisoned the training data and was able to
show that, by poisoning it in a particular way, he could then send spam email, because he knew which words were now considered benign. He went on to study a few other things about biometric data, if you're interested in biometrics.
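Biggio's attack crafts its poison points carefully with gradient methods; as a much simpler toy illustration of the general idea (not his method), the sketch below just flips a fraction of the training labels for a scikit-learn SVM and measures how much the classifier degrades:

```python
# A toy poisoning sketch, assuming scikit-learn. This is plain label
# flipping for illustration; Biggio et al. craft poison points far more
# carefully using gradients of the SVM's objective.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clean = LinearSVC().fit(X_tr, y_tr)

# Poison 20% of the training labels by flipping them.
rng = np.random.RandomState(0)
flip = rng.choice(len(y_tr), size=len(y_tr) // 5, replace=False)
y_poisoned = y_tr.copy()
y_poisoned[flip] = 1 - y_poisoned[flip]

poisoned = LinearSVC().fit(X_tr, y_poisoned)

print("clean accuracy:   ", clean.score(X_te, y_te))
print("poisoned accuracy:", poisoned.score(X_te, y_te))
```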
But then in 2014, Christian Szegedy, Ian Goodfellow,
and a few other main researchers at Google
Brain released "Intriguing Properties of
Neural Networks." That really became the
explosion of what we're seeing today in
adversarial learning. And what they were
able to do, is they were able to say "We
believe there's linear properties of these
neural networks, even if they're not
necessarily linear networks.
And we believe we can exploit them to fool
them". And they first introduced then the
fast gradient sign method, which we'll
talk about later today. So how does it
work? First I want us to get a little bit
of an intuition around how this works.
Here's a graphic of gradient descent. And
in gradient descent, the vertical axis is our cost function. What we're trying to do is minimize cost, we want to minimize the error. When we start out, we just choose random weights and variables, so all of our hidden layers maybe just have random weights or a random distribution. And then
we want to get to a place where the
weights have meaning, right? We want our
network to know something, even if it's
just a mathematical pattern, right? So we
start in the high area of the graph, or
the reddish area, and that's where we
started, we have high error there. And
then we try to get to the lowest area of
the graph, or here the dark blue that is
right about here. But sometimes what happens is: as we learn, as we go through epochs of training, we're moving slowly down and hopefully we're optimizing. But instead of the global minimum, we might end up in a local minimum, which is the other trail. And that's fine, because it's still a low error, right? We're still probably going to be able to succeed, but we might not get the best answer all the time.
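As a formula (standard notation, not from the slide): each training step moves the weights a little way downhill along the gradient of the cost J, with a small learning rate eta; the adversarial trick described next pushes in the opposite direction, with respect to the input instead of the weights.

```latex
% One step of (stochastic) gradient descent on the cost J
% with weights \theta and learning rate \eta:
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} J(\theta_t)
```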
What adversarial machine learning tries to do, in the most basic of ways, is essentially push the error rate back up the hill for as many units as it can. It tries to increase the error slowly through perturbations. And by disrupting, let's
say, the weakest links like the one that
did not find the global minimum but
instead found a local minimum, we can
hopefully fool the network, because we're
finding those weak spots and we're
capitalizing on them, essentially.
So what does an adversarial example
actually look like?
You may have already seen this
because it's very popular on the
Twittersphere and a few other places, but
this was a group of researchers at MIT. It was debated whether you could do adversarial learning in the real world. A
lot of the research has just been a still
image. And what they were able to show:
They created a 3D-printed turtle. I mean
it looks like a turtle to you as well,
correct? And this 3D-printed turtle by the
Inception Network, which is a very popular
computer vision network, is a rifle and it
is a rifle in every angle that you can
see. And the way they were able to do this (perhaps you can see it the next time it comes around, and it's a little bit easier to see in the video, which I'll share at the end) is that there's a slight discoloration of the shell. They messed
with the texture. By messing with this
texture and the colors they were able to
fool the neural network, they were able to
activate different neurons that were not
supposed to be activated. Units, I should
say. So what we see here is, yeah, it can
be done in the real world, and when I saw
this I started getting really excited.
Because, video surveillance is a real
thing, right? So if we can start fooling
3D objects, we can perhaps start fooling
other things in the real world that we
would like to fool.
applause
kjam: So why do adversarial examples
exist? We're going to talk a little bit
about some things that are approximations
of what's actually happening, so please
forgive me for not being always exact, but
I would rather us all have a general
understanding of what's happening. Across
the top row we have an input layer and
these images to the left, we can see, are
the source images and this source image is
like a piece of farming equipment or
something. And on the right we have our
guide image. This is what we're trying to
get the network to see we want it to
missclassify this farm equipment as a pink
bird. So what these researchers did is
they targeted different layers of the
network. And they said: "Okay, we're going
to use this method to target this
particular layer and we'll see what
happens". And so as they targeted these
different layers you can see what's
happening on the internal visualization.
Now neural networks can't see, right?
They're looking at matrices of numbers but
what we can do is we can use those
internal values to try and see with our
human eyes what they are learning. And we
can see here clearly inside the network,
we no longer see the farming equipment,
right? We see a pink bird. And this is not
visible to our human eyes. Now if you really study it and enlarge the image, you can start to see, okay, there's a little bit of pink here or some greens, I don't know what's happening, but the neural network we have tricked can still see it. Now
people don't exactly know yet why these
blind spots exist. So it's still an area
of active research exactly why we can fool
neural networks so easily. There are some
prominent researchers that believe that
neural networks are essentially very
linear and that we can use this simple
linearity to misclassify to jump into
another area. But there are others that
believe that there's these pockets or
blind spots and that we can then find
these blind spots where these neurons
really are the weakest links and they
maybe even haven't learned anything and if
we change their activation then we can
fool the network easily. So this is still
an area of active research and let's say
you're looking for your thesis, this would
be a pretty neat thing to work on. So
we'll get into just a brief overview of
some of the math behind the most popular
methods. First we have the fast gradient
sign method, and that was used in the initial paper; now there have been many
iterations on it. And what we do is we
have our same cost function, so this is
the same way that we're trying to train
our network and it's trying to learn. And
we take the sign of the gradient of that. And it's okay if you're not used to doing vector calculus, especially not without a pen and paper in front of you, but think of it as essentially calculating some approximation of a derivative of the function. This can kind of tell us where it's going, and if we know where it's going, we can maybe anticipate that and change the input. And then to create the adversarial images, we take the original input plus a small number epsilon times the sign of that gradient.
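Written out as a formula, that is the fast gradient sign method: x is the original input, y its label, theta the model weights, J the cost function, and epsilon the small step size.

```latex
% Fast gradient sign method:
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\!\left( \nabla_{x} J(\theta, x, y) \right)
```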
For the Jacobian saliency map, this is a newer method and it's a little bit more effective, but it takes a little bit more compute. It uses a Jacobian matrix, and if you remember (and it's okay if you don't), a Jacobian matrix looks at the full derivative of a function: you take the full derivative of the function at that input vector, and it gives you a matrix that is a pointwise approximation, if the function is differentiable at that input vector. Don't worry, you can review this later too. We then use the Jacobian matrix to create a saliency map in the same way, essentially as some sort of linear or pointwise approximation, and we want to find two pixels that we can perturb that cause the most disruption. Then we continue to the next pair. Unfortunately this is currently an O(n²) problem, but a few people are trying to find ways to approximate it and make it faster.
So maybe now you want to fool a network too, and I hope you do, because that's what we're going to talk about.
First you need to pick a problem or a
network type you may already know. But you
may want to investigate what perhaps is
this company using, what perhaps is this
method using and do a little bit of
research, because that's going to help
you. Then you want to research state-of-the-art methods. A typical research claim is that you have a new state-of-the-art method, but the good news is that the state of the art from two to three years ago is most likely in
production or in systems today. So once
they find ways to speed it up, some
approximation of that is deployed. And a
lot of times these are then publicly
available models, so a lot of times, if
you're already working with the deep
learning framework they'll come
prepackaged with a few of the different
popular models, so you can even use that.
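As a quick illustration (a couple of lines, assuming Keras is installed), one of those prepackaged, pre-trained models can be loaded like this:

```python
# Loading one of the pre-packaged, pre-trained models that ship with Keras.
from keras.applications.inception_v3 import InceptionV3

model = InceptionV3(weights='imagenet')  # downloads the ImageNet weights on first use
```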
If you're already building neural networks
of course you can build your own. An
optional step, but one that might be recommended, is to fine-tune your model. What this means is to essentially take a new training data set, maybe data that you think this company or this network is using, remove the last few layers of the neural network, and retrain it. You're essentially nicely piggybacking on the work of the pre-trained model and using the final layers to add finesse; this makes your model better at the task you have for it.
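Here is a hedged sketch of that fine-tuning step in Keras, along the lines of the pre-trained Inception approach mentioned later in the talk; the class count, layer sizes, and optimizer are illustrative assumptions, not anything from the talk:

```python
# Fine-tuning sketch, assuming Keras: reuse a pre-trained Inception base,
# chop off its final layers, and train a small new "head" for our own task.
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base = InceptionV3(weights='imagenet', include_top=False)  # drop the last layers

# Freeze the pre-trained layers so we piggyback on what they already learned.
for layer in base.layers:
    layer.trainable = False

# New final layers ("the finesse") for our own classes, e.g. person vs. cat.
x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=5)  # x_train, y_train: your fine-tuning data
```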
Finally, you then use a library, and we'll go through a few of
them, but some of the ones that I have used myself are cleverhans, DeepFool and
deep-pwning, and these all come with nice
built-in features for you to use for let's
say the fast gradient sign method, the
Jacobian saliency map and a few other
methods that are available. Finally it's
not going to always work so depending on
your source and your target, you won't
always necessarily find a match. What
researchers have shown is it's a lot
easier to fool a network that a cat is a
dog than it is to fool a network that a cat is an airplane. We can make this intuitive: you might want to pick an input that's not super
dissimilar from where you want to go, but
is dissimilar enough. And you want to test
it locally and then finally test the one
for the highest misclassification rates on
the target network. And you might say
Katharine, or you can call me kjam, that's
okay. You might say: "I don't know what
the person is using", "I don't know what
the company is using" and I will say "it's
okay", because what's been proven: You can
attack a blackbox model, you do not have
to know what they're using, you do not
have to know exactly how it works, you
don't even have to know their training
data, because what you can do, okay, one addendum: it has to have some API you can interact with. But if it has an API that uses the same type of learning, you can collect training data by querying the API. And
then you're training your local model on
that data that you're collecting. So
you're collecting the data, you're
training your local model, and as your
local model gets more accurate and more
similar to the deployed black box that you
don't know how it works, you are then
still able to fool it. And what this paper proved, by Nicolas Papernot and a few other great researchers, is that with usually fewer than six thousand queries they were able to fool the target network with between 84% and 97% certainty.
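To make the loop concrete, here is a heavily simplified sketch of the substitute-model idea; it is not the Papernot code, `query_target_api` is a hypothetical placeholder for whatever black-box endpoint you can query, and the data augmentation here is a crude stand-in for the paper's Jacobian-based augmentation:

```python
# Black-box sketch: train a local substitute model purely on labels returned
# by the target's API, then craft adversarial examples against the substitute
# and hope they transfer to the black box.
import numpy as np
from sklearn.linear_model import LogisticRegression

def query_target_api(samples):
    """Placeholder: send samples to the black-box API and return its labels."""
    raise NotImplementedError("replace with real API calls")

def train_substitute(x_seed, n_rounds=3, step=0.1):
    x = x_seed
    y = query_target_api(x)                    # labels come from the target
    substitute = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        substitute.fit(x, y)
        # Crude augmentation: jitter the points and query the target again.
        x_new = x + step * np.sign(np.random.randn(*x.shape))
        x = np.concatenate([x, x_new])
        y = query_target_api(x)
    substitute.fit(x, y)
    return substitute  # craft adversarial inputs (e.g. FGSM) against this local model
```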
And what the same group of researchers also studied is the ability to transfer the capacity to fool one network onto another network; they called that transferability. So I can
take a certain type of network and I can
use adversarial examples against this
network to fool a different type of
machine learning technique. Here we have
their matrix, their heat map, that shows
us exactly what they were able to fool. So
we have across the left-hand side here the
source machine learning technique, we have
deep learning, logistic regression, SVMs
like we talked about, decision trees and
K-nearest-neighbors. And across the bottom
we have the target machine learning, so
what were they targeting. They created the
adversaries with the left hand side and
they targeted across the bottom. We
finally have an ensemble model at the end.
And what they were able to show is like,
for example, SVMs and decision trees are quite easy to fool, and logistic regression a little bit less so, but still strongly. For deep learning and K-nearest-neighbors: if you train a deep learning model or a K-nearest-neighbor model, that performs fairly well against itself.
And so what they're able to show is that
you don't necessarily need to know the
target machine and you don't even have to
get it right, even if you do know, you can
use a different type of machine learning
technique to target the network.
So we'll
look at six lines of Python here and in
these six lines of Python I'm using the
cleverhans library and in six lines of
Python I can both generate my adversarial
input and I can even predict on it. So if
you don't code Python, it's pretty easy to
learn and pick up. And for example here we
have Keras and Keras is a very popular
deep learning library in Python, it
usually works with a Theano or a TensorFlow backend. We can just wrap our model, pass it to the FastGradientMethod class, and then set up some parameters (here's our epsilon and a few extra parameters to tune our adversary), and finally we can generate our adversarial examples and then predict on
them. So in a very small amount of Python
we're able to target and trick a network.
If you're already using TensorFlow or Keras, it already works with those libraries.
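The slide itself isn't in the transcript, but reconstructed from the cleverhans 2.x API of that era it looks roughly like this; `model`, `sess`, and `x_test` are assumed to already exist, and the epsilon and clipping values are illustrative:

```python
# Roughly the six lines described above, assuming the cleverhans 2.x API,
# a trained Keras model, and a TensorFlow session already in scope.
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils_keras import KerasModelWrapper

wrap = KerasModelWrapper(model)                             # wrap our Keras model
fgsm = FastGradientMethod(wrap, sess=sess)                  # the attack class
fgsm_params = {'eps': 0.3, 'clip_min': 0., 'clip_max': 1.}  # tune the adversary
adv_x = fgsm.generate_np(x_test, **fgsm_params)             # generate adversarial examples
preds = model.predict(adv_x)                                # and predict on them
```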
Deep-pwning is one of the first
libraries that I heard about in this space
and it was presented at Def Con in 2016
and what it comes with is a bunch of
tensorflow built-in code. It even comes
with a way that you can train the model
yourself, so it has a few different
models, a few different convolutional
neural networks and these are
predominantly used in computer vision.
It also however has a semantic model and I
normally work in NLP and I was pretty
excited to try it out. What it comes built
with is the Rotten Tomatoes sentiment, so
this is Rotten Tomatoes movie reviews that
try to learn is it positive or negative.
So the original text that I input in, when
I was generating my adversarial examples
was "more trifle than triumph", which is a
real review and the adversarial text that
it gave me was "jonah refreshing haunting
leaky"
...Yeah.. so I was able to fool my network
but I lost any type of meaning and
this is really the problem when we think
about how we apply adversarial learning to
different tasks is, it's easy for an image
if we make a few changes for it to retain
its image, right? It's many, many pixels,
but when we start going into language, if
we change one word and then another word
and another word or maybe we changed all
of the words, we no longer understand as
humans. And I would say this is garbage
in, garbage out, this is not actual
adversarial learning. So we have a long
way to go when it comes to language tasks
and being able to do adversarial learning
and there is some research in this, but
it's not really advanced yet. So hopefully
this is something that we can continue to
work on and advance further and if so we
need to support a few different types of
networks that are more common in NLP than
they are in computer vision.
There are some other notable open-source libraries that
are available to you and I'll cover just a
few here. There's a "Vanderbilt
computational economics research lab" that
has adlib and this allows you to do
poisoning attacks. So if you want to
target training data and poison it, then
you can do so with that and use scikit-
learn. DeepFool is similar in spirit to the fast gradient sign method, but it tries to make smaller perturbations, to be less detectable to us humans.
It's based on Theano, which is another library that I believe uses Lua as well as Python.
"FoolBox" is kind of neat because I only
heard about it last week, but it collects
a bunch of different techniques all in one
library and you could use it with one
interface. So if you want to experiment
with a few different ones at once, I would
recommend taking a look at that and
finally for something that we'll talk
about briefly in a short period of time we
have "Evolving AI Lab", which release a
fooling library and this fooling library
is able to generate images that you or I
can't tell what it is, but that the neural
network is convinced it is something.
So this we'll talk about maybe some
applications of this in a moment, but they
also open-sourced all of their code, and researchers who open-source their code are always very exciting.
As you may have noticed from some of the
research I already cited, most of the
studies and the research in this area has
been on malicious attacks. So there's very
few people trying to figure out how to do
this for what I would call benevolent
purposes. Most of them are trying to act
as an adversary in the traditional
computer security sense. They're perhaps
studying spam filters and how spammers can
get by them. They're perhaps looking at
network intrusion or botnet-attacks and so
forth. They're perhaps looking at self-
driving cars, and I know that was referenced earlier as well in Hendrik and Karen's talk: they're perhaps trying to
make a yield sign look like a stop sign or
a stop sign look like a yield sign or a
speed limit, and so forth, and scarily
they are quite successful at this. Or
perhaps they're looking at data poisoning: how do we poison the model so we render it useless in a particular context that we can then utilize? And finally, malware.
So what a few researchers were able to
show is, by just changing a few things in
the malware they were able to upload their
malware to Google Mail and send it to
someone and this was still fully
functional malware. In that same sense
there's the malGAN project, which uses a
generative adversarial network to create
malware that works, I guess. So there's a
lot of research of these kind of malicious
attacks within adversarial learning.
But what I wonder is how might we use this for
good. And I put "good" in quotation marks,
because we all have different ethical and
moral systems we use. And what you may
decide is ethical for you might be
different. But I think as a community,
especially at a conference like this,
hopefully we can converge on some ethical
privacy concerned version of using these
networks.
So I've composed a few ideas and I hope that this is just a starting list of a longer conversation.
One idea is that we can perhaps use this type of adversarial learning to fool surveillance.
Surveillance affects you and me, and it even disproportionately affects people who most likely can't be here. So whether or
not we're personally affected, we can care
about the many lives that are affected by
this type of surveillance. And we can try
and build ways to fool surveillance
systems.
Steganography: in a world where more and more people have less of a private way of sending messages to one another, we could perhaps use adversarial learning to send private messages.
Adware fooling: So
again, where I might have quite a lot of
privilege and I don't actually see ads
that are predatory on me as much, there is
a lot of people in the world that face
predatory advertising. And so how can we
help those problems by developing
adversarial techniques?
Poisoning your own private data:
This depends on whether you
actually need to use the service and
whether you like how the service is
helping you with the machine learning; but if you don't care, or if you essentially need a burn box of your data, then potentially you could poison your own private data. Finally, I want us to use it
to investigate deployed models. So even
if we don't actually need a use for
fooling this particular network, the more
we know about what's deployed and how we
can fool it, the more we're able to keep
up with this technology as it continues to
evolve. So the more that we're practicing,
the more that we're ready for whatever
might happen next. And finally I really
want to hear your ideas as well. So I'll
be here throughout the whole Congress and
of course you can share during the Q&A
time. If you have great ideas, I really
want to hear them.
So I decided to play around a little bit with some of my ideas.
And I was convinced perhaps that I could make Facebook think I was a cat.
This is my goal. Can Facebook think I'm a cat?
Because nobody really likes Facebook. I
mean let's be honest, right?
But I have to be on it because my mom messages me there
and she doesn't use the email anymore.
So I'm on Facebook. Anyways.
So I used a pre-trained Inception model and Keras and I fine-tuned the layers.
And I'm not a
computer vision person, really, but it took me about a day to figure out how computer vision people transform their data into something you can put inside a network. Once I figured that out, I was able to quickly train a model, and the model could only distinguish between people and cats.
That's all the model knew how to do. I
give it a picture it says it's a person or
it's a cat. I actually didn't try just
giving it an image of something else, it
would probably guess it's a person or a
cat maybe, 50/50, who knows. What I did
was, I used an image of myself with the fast gradient sign method (I used cleverhans), and I was able to slowly increase the epsilon. When the epsilon is low, you and I can't see the perturbations, but the network can't really see them either. So we need to
increase it, and of course as we increase
it, when we're using a technique like
FGSM, we are also increasing the noise
that we see. And when I got to an epsilon of 2.21, I kept uploading it to Facebook, and Facebook kept saying: "Yeah, do you want to tag yourself?" and I'm like: "no, I don't, I'm just testing". Finally I got to an epsilon where Facebook no longer knew I was a face. So I was just a book, a cat book, maybe.
applause
kjam: So, unfortunately, as we see, I
didn't actually become a cat, because that
would be pretty neat. But I was able to
fool it. I spoke with a computer vision specialist that I know, and she actually works in this, and I was like: "What
methods do you think Facebook was using?
Did I really fool the neural network or
what did I do?" And she's convinced most
likely that they're actually using a
statistical method called Viola-Jones,
which takes a look at the statistical
distribution of your face and tries to
guess if there's really a face there. But
what I was able to show: transferability.
That is, I can use my neural network even
to fool this statistical model, so now I
have a very noisy but happy photo on Facebook.
Another potential use case is adversarial steganography, and I was really
excited reading this paper. What this paper covered (and they actually released the library, as I mentioned) is the ability of a neural network to be convinced that something is there that's not actually there.
And what they used, they used the MNIST training set.
I'm sorry, if that's like a trigger word
if you've used MNIST a million times, then
I'm sorry for this, but what they use is
MNIST, which is the digits zero through nine, and what they were able to show, using evolutionary algorithms, is that they could generate things that to us maybe look like art, and they actually used it
on the CIFAR data set too, which has
colors, and it was quite beautiful. Some
of what they created in fact they showed
in a gallery. And what the network sees
here is the digits across the top. They
see that digit, they are more than 99%
convinced that that digit is there and
what we see is pretty patterns or just
noise.
When I was reading this paper I was thinking,
how can we use this to send
messages to each other that nobody else
will know is there? I'm just sending
really nice.., I'm an artist and this is
my art and I'm sharing it with my friend.
And in a world where I'm afraid to go home
because there's a crazy person in charge
and I'm afraid that they might look at my
phone, in my computer, and a million other
things and I just want to make sure that
my friend has my pin number or this or
that or whatever. I see a use case for my
life, but again I live a fairly privileged life; there are other people
where their actual life and livelihood and
security might depend on using a technique
like this. And I think we could use
adversarial learning to create a new form
of steganography.
Finally, I cannot stress enough that the more information we have
about the systems that we interact with
every day, that our machine learning
systems, that our AI systems, or whatever
you want to call it, that our deep
networks, the more information we have,
the better we can fight them, right. We
don't need perfect knowledge, but the more
knowledge that we have, the better an
adversary we can be. I thankfully now live
in Germany and if you are also a European
resident: We have GDPR, which is the
General Data Protection Regulation, and it
goes into effect in May of 2018. We can
use GDPR to make requests about our data,
we can use GDPR to make requests about
machine learning systems that we interact
with, this is a right that we have. And in
recital 71 of the GDPR it states: "The
data subject should have the right to not
be subject to a decision, which may
include a measure, evaluating personal
aspects relating to him or her which is
based solely on automated processing and
which produces legal effects concerning
him or her or similarly significantly
affects him or her, such as automatic
refusal of an online credit application or
e-recruiting practices without any human
intervention." And I'm not a lawyer and I
don't know how this will be implemented
and it's a recital, so we don't even know if it will be enforced the same way, but
the good news is: Pieces of this same
sentiment are in the actual amendments and
if they're in the amendments, then we can
legally use them. And what it also says
is, we can ask companies to port our data
other places, we can ask companies to
delete our data, we can ask for
information about how our data is
processed, we can ask for information
about what different automated decisions
are being made, and the more we all here
ask for that data, the more we can also
share that same information with people
worldwide. Because the systems that we
interact with, they're not special to us,
they're the same types of systems that are
being deployed everywhere in the world. So
we can help our fellow humans outside of
Europe by being good caretakers and using
our rights to make more information
available to the entire world and to use
this information, to find ways to use
adversarial learning to fool these types
of systems.
applause
So how else might we be able to harness
this for good? I cannot stress enough the importance of GDPR and our right to collect more
information about the information they're
already collecting about us and everyone
else. So use it, let's find ways to share
the information we gain from it. So I
don't want it to just be that one person
requests it and they learn something. We
have to find ways to share this
information with one another. Test low-
tech ways. I'm so excited about the maker
space here and maker culture and other
low-tech or human-crafted ways to fool
networks. We can use adversarial learning
perhaps to get good ideas on how to fool
networks, and translate them into lower-tech ways. What if
I painted red pixels all over my face?
Would I still be recognized? Would I not?
Let's experiment with things that we learn
from adversarial learning and try to find
other lower-tech solutions to the same problem. Finally, or nearly finally, we
need to increase the research beyond just
computer vision. Quite a lot of
adversarial learning has been only in
computer vision and while I think that's
important and it's also been very
practical, because we can start to see how
we can fool something, we need to figure
out natural language processing, we need
to figure out other ways that machine
learning systems are being used, and we
need to come up with clever ways to fool them.
Finally, spread the word! So I don't
want the conversation to end here, I don't
want the conversation to end at Congress,
I want you to go back to your hacker
collective, your local CCC, the people
that you talk with, your co-workers and I
want you to spread the word. I want you to
do workshops on adversarial learning, I
want more people to not treat this AI as
something mystical and powerful, because
unfortunately it is powerful, but it's not
mystical! So we need to demystify this
space, we need to experiment, we need to
hack on it and we need to find ways to
play with it and spread the word to other
people. Finally, I really want to hear
your other ideas, and before I leave today I have to say a little bit about why I
decided to join the resiliency track this
year. I read about the resiliency track
and I was really excited. It spoke to me.
And I said I want to live in a world
where, even if there's an entire burning
trash fire around me, I know that there
are other people that I care about, that I
can count on, that I can work with to try
and at least protect portions of our
world. To try and protect ourselves, to
try and protect people that do not have as
much privilege. So, what I want to be a
part of, is something that can use maybe
the skills I have and the skills you have
to do something with that. And your data
is a big source of value for everyone.
Any free service you use, they are selling
your data. OK, I don't know that for a fact, but I feel very certain that they're most likely selling your data. And if they're
selling your data, they might also be
buying your data. And there is a whole
market, that's legal, that's freely
available, to buy and sell your data. And
they make money off of that, and they mine
more information, and make more money off
of that, and so forth. So, I will read a
little bit of my opinions that I put forth
on this. Determine who you share your data
with and for what reasons. GDPR and data
portability give us European residents
stronger rights than most of the world.
Let's use them. Let's choose privacy
concerned ethical data companies over
corporations that are entirely built on
selling ads. Let's build start-ups,
organizations, open-source tools and
systems that we can be truly proud of. And
let's port our data to those.
Applause
Herald: Amazing. We have,
we have time for a few questions.
K.J.: I'm not done yet, sorry, it's fine.
Herald: I'm so sorry.
K.J.: Laughs It's cool.
No big deal.
So, machine learning. Closing remarks, a brief round-up. Closing remarks: machine learning is not very intelligent. I think artificial
intelligence is a misnomer in a lot of
ways, but this doesn't mean that people
are going to stop using it. In fact
there's very smart, powerful, and rich
people that are investing more than ever
in it. So it's not going anywhere. And
it's going to be something that
potentially becomes more dangerous over
time. Because as we hand over more to these systems, they could potentially control more and more of our
lives. We can use, however, adversarial
machine learning techniques to find ways
to fool "black box" networks. So we can
use these and we know we don't have to
have perfect knowledge. However,
information is powerful. And the more
information that we do have, the more we're able to become a good GDPR-based
adversary. So please use GDPR and let's
discuss ways where we can share
information. Finally, please support open-
source tools and research in this space,
because we need to keep up with where the
state of the art is. So we need to keep
ourselves moving and open in that way. And
please, support ethical data companies. Or
start one. If you come to me and you say
"Katharine, I'm going to charge you this
much money, but I will never sell your
data. And I will never buy your data." I
would much rather you handle my data. So I
want us, especially those within the EU,
to start a new economy around trust, and
privacy, and ethical data use.
Applause
Thank you very much.
Thank you.
Herald: OK. We still have time for a few
questions.
K.J.: No, no, no. No worries, no worries.
Herald: Less than the last time I walked
up here, but we do.
K.J.: Yeah, now I'm really done.
Herald: Come up to one of the mics in the
front section and raise your hand. Can we
take a question from mic one.
Question: Thank you very much for the very
interesting talk. One impression that I
got during the talk was: with the adversarial learning approach, aren't we just doing pen testing and quality assurance for the AI companies? They're just going to build better machines.
Answer: That's a very good question and of
course most of this research right now is
coming from those companies, because
they're worried about this. What they've shown, however, is that they don't really have a good way to defend against this. Most likely they will need to use a
different type of network, eventually. So
probably, whether it's the blind spots or
the linearity of these networks, they are
easy to fool and they will have to come up
with a different method for generating
something that is robust enough to not be
tricked. So, to some degree yes, it's a
cat-and-mouse game, right. But that's why
I want the research and the open source to
continue as well. And I would be highly skeptical if they all of a sudden figured out a way to make a neural network, which has proven linear relationships that we can exploit, nonlinear. And if so, it's
usually a different type of network that's
a lot more expensive to train and that
doesn't actually generalize well. So we're
going to really hit them in a way where
they're going to have to be more specific,
try harder, and I would rather do that
than just kind of give up.
Herald: Next one.
Mic 2
Q: Hello. Thank you for the nice talk. I
wanted to ask, have you ever tried looking
at from the other direction? Like, just
trying to feed the companies falsely
classified data. And just do it with so
massive amounts of data, so that they
learn from it at a certain point.
A: Yes, those are the poisoning attacks. So when we talk about poisoning attacks, we are
essentially feeding bad training data and
we're trying to get them to learn bad
things. Or I wouldn't say bad things, but
we're trying to get them to learn false
information.
And that already happens by accident all the time. So I think, if we share information and they have a
publicly available API, where they're
actually actively learning from our
information, then yes I would say
poisoning is a great attack way. And we
can also share information of maybe how
that works.
So especially I would be intrigued if we
can do poisoning for adware and malicious
ad targeting.
Mic 2: OK, thank you.
Herald: One more question from the
internet and then we run out of time.
K.J. Oh no, sorry
Herald: So you can find Katharine after.
Signal-Angel: Thank you. One question from
the internet. What exactly can I do to
harden my model against adversarial
samples?
K.J.: Sorry?
Signal: What exactly can I do to harden my
model against adversarial samples?
K.J.: Not much. What they have shown is,
that if you train on a mixture of real
training data and adversarial data it's a
little bit harder to fool, but that just
means that you have to try more iterations
of adversarial input. So right now, the
recommendation is to train on a mixture of
adversarial and real training data and to
continue to do that over time. And I would
argue that you need to maybe do data
validation on input. And if you do data
validation on input maybe you can
recognize abnormalities. But that's because I come mainly from a production level, not a theoretical one, and I think maybe you should just test things, and if inputs look weird, maybe not take them into the system.
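A minimal sketch of that "train on a mixture" recommendation (adversarial training), assuming a compiled Keras classifier and an FGSM attack object like the cleverhans one shown earlier; the number of rounds and the epsilon are illustrative:

```python
# Adversarial training sketch: mix real and adversarial examples each round.
# Assumes `model` is a compiled Keras classifier and `fgsm` is an attack
# object like the cleverhans FastGradientMethod wrapper shown earlier.
import numpy as np

def adversarial_training(model, fgsm, x_train, y_train, rounds=5, eps=0.1):
    for _ in range(rounds):
        # Craft adversarial versions of the current training data.
        x_adv = fgsm.generate_np(x_train, eps=eps, clip_min=0., clip_max=1.)
        # Train on a mixture of clean and adversarial inputs with the
        # original labels, so the model learns to resist the perturbations.
        x_mix = np.concatenate([x_train, x_adv])
        y_mix = np.concatenate([y_train, y_train])
        model.fit(x_mix, y_mix, epochs=1, shuffle=True)
    return model
```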
Herald: And that's all for the questions.
I wish we had more time but we just don't.
Please give it up for Katharine Jarmul
Applause
34c3 postroll music
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!