WEBVTT
00:00:00.000 --> 00:00:14.437
33c3 preroll music
00:00:14.437 --> 00:00:20.970
Herald: We have here Aylin Caliskan who
will tell you a story of discrimination
00:00:20.970 --> 00:00:27.590
and unfairness. She has a PhD in computer
science and is a fellow at the Princeton
00:00:27.590 --> 00:00:35.449
University's Center for Information
Technology. She has done some interesting
00:00:35.449 --> 00:00:41.050
research and work on the question that -
well - as a feminist, tackles my work all
00:00:41.050 --> 00:00:48.780
the time. We talk a lot about discrimination
and biases in language. And now she will
00:00:48.780 --> 00:00:56.519
tell you how this bias and discrimination
is already working in tech and in code as
00:00:56.519 --> 00:01:03.130
well, because language is in there.
Give her a warm round of applause, please!
00:01:03.130 --> 00:01:10.540
applause
00:01:10.540 --> 00:01:11.640
You can start, it's OK.
00:01:11.640 --> 00:01:13.790
Aylin: I should start? OK?
00:01:13.790 --> 00:01:14.790
Herald: You should start, yes!
00:01:14.790 --> 00:01:18.470
Aylin: Great, I will have two extra
minutes! Hi everyone, thanks for coming,
00:01:18.470 --> 00:01:23.110
it's good to be here again at this time of
the year! I always look forward to this!
00:01:23.110 --> 00:01:28.530
And today, I'll be talking about a story of
discrimination and unfairness. It's about
00:01:28.530 --> 00:01:34.750
prejudice in word embeddings. She
introduced me, but I'm Aylin. I'm a
00:01:34.750 --> 00:01:40.640
post-doctoral researcher at Princeton
University. The work I'll be talking about
00:01:40.640 --> 00:01:46.120
is currently under submission at a
journal. I think that this topic might be
00:01:46.120 --> 00:01:51.610
very important for many of us, because
maybe in parts of our lives, most of us
00:01:51.610 --> 00:01:57.000
have experienced discrimination or some
unfairness because of our gender, or
00:01:57.000 --> 00:02:05.160
racial background, or sexual orientation,
or not being what's seen as typical, or health
00:02:05.160 --> 00:02:10.699
issues, and so on. So we will look at
these societal issues from the perspective
00:02:10.699 --> 00:02:15.580
of machine learning and natural language
processing. I would like to start with
00:02:15.580 --> 00:02:21.120
thanking everyone at CCC, especially the
organizers, angels, the Chaos mentors,
00:02:21.120 --> 00:02:26.099
which I didn't know existed, but if
it's your first time, or if you need to be
00:02:26.099 --> 00:02:31.510
oriented better, they can help you. The
assemblies, the artists. They have been here
00:02:31.510 --> 00:02:36.200
for apparently more than one week, so
they're putting together this amazing work
00:02:36.200 --> 00:02:41.269
for all of us. And I would like to thank
CCC as well, because this is my fourth
00:02:41.269 --> 00:02:46.379
time presenting here, and in the past, I
presented work about deanonymizing
00:02:46.379 --> 00:02:50.629
programmers and stylometry. But today,
I'll be talking about a different topic,
00:02:50.629 --> 00:02:54.389
which is not exactly related to anonymity,
but it's more about transparency and
00:02:54.389 --> 00:03:00.100
algorithms. And I would like to also thank
my co-authors on this work before I start.
00:03:00.100 --> 00:03:12.529
And now, let's give a brief introduction to our
problem. Over the last couple of
00:03:12.529 --> 00:03:16.620
years, in this new area there have been
some approaches to algorithmic
00:03:16.620 --> 00:03:20.749
transparency, to understand algorithms
better. They have been looking at this
00:03:20.749 --> 00:03:25.200
mostly at the classification level to see
if the classifier is making unfair
00:03:25.200 --> 00:03:31.510
decisions about certain groups. But in our
case, we won't be looking at bias in the
00:03:31.510 --> 00:03:36.650
algorithm, we will be looking at the bias
that is deeply embedded in the model.
00:03:36.650 --> 00:03:42.439
That's not machine learning bias, but it's
societal bias that reflects facts about
00:03:42.439 --> 00:03:49.459
humans, culture, and also the stereotypes
and prejudices that we have. And we can
00:03:49.459 --> 00:03:54.879
see the applications of these machine
learning models, for example in machine
00:03:54.879 --> 00:04:00.829
translation or sentiment analysis, and
these are used for example to understand
00:04:00.829 --> 00:04:06.299
market trends by looking at company
reviews. It can be used for customer
00:04:06.299 --> 00:04:12.540
satisfaction, by understanding movie
reviews, and most importantly, these
00:04:12.540 --> 00:04:18.279
algorithms are also used in web search and
search engine optimization which might end
00:04:18.279 --> 00:04:24.340
up causing filter bubbles for all of us.
Billions of people every day use web
00:04:24.340 --> 00:04:30.670
search. And since such language models are
also part of web search when your web
00:04:30.670 --> 00:04:36.410
search query is being auto-completed, or you're
getting certain pages, these models are in
00:04:36.410 --> 00:04:41.300
effect. I would like to first say that
there will be some examples with offensive
00:04:41.300 --> 00:04:47.020
content, but this does not reflect our
opinions. Just to make it clear. And I'll
00:04:47.020 --> 00:04:53.730
start with a video to
give a brief motivation.
00:04:53.730 --> 00:04:55.780
Video voiceover: From citizens
capturing police brutality
00:04:55.780 --> 00:04:58.450
on their smart phones, to
police departments using
00:04:58.450 --> 00:05:00.340
surveillance drones,
technology is changing
00:05:00.340 --> 00:05:03.340
our relationship to the
law. One of the
00:05:03.340 --> 00:05:08.220
newest policing tools is called PredPol.
It's a software program that uses big data
00:05:08.220 --> 00:05:13.160
to predict where crime is most likely to
happen. Down to the exact block. Dozens of
00:05:13.160 --> 00:05:17.200
police departments around the country are
already using PredPol, and officers say it
00:05:17.200 --> 00:05:21.290
helps reduce crime by up to 30%.
Predictive policing is definitely going to
00:05:21.290 --> 00:05:25.510
be a law enforcement tool of the future,
but is there a risk of relying too heavily
00:05:25.510 --> 00:05:27.320
on an algorithm?
00:05:27.320 --> 00:05:29.730
tense music
00:05:29.730 --> 00:05:34.060
Aylin: So this makes us wonder:
if predictive policing is used to arrest
00:05:34.060 --> 00:05:39.750
people and if this depends on algorithms,
how dangerous can this get in the future,
00:05:39.750 --> 00:05:45.431
since it is becoming more commonly used.
The problem here basically is: machine
00:05:45.431 --> 00:05:50.740
learning models are trained on human data.
And we know that they would reflect human
00:05:50.740 --> 00:05:56.290
culture and semantics. But unfortunately
human culture happens to include bias and
00:05:56.290 --> 00:06:03.720
prejudice. And as a result, this ends up
causing unfairness and discrimination.
00:06:03.720 --> 00:06:09.610
The specific models we will be looking at in
this talk are language models, and in
00:06:09.610 --> 00:06:15.530
particular, word embeddings. What are word
embeddings? Word embeddings are language
00:06:15.530 --> 00:06:22.710
models that represent the semantic space.
Basically, in these models we have a
00:06:22.710 --> 00:06:29.020
dictionary of all words in a language and
each word is represented with a
00:06:29.020 --> 00:06:33.340
300-dimensional numerical vector. Once we
have this numerical vector, we can answer
00:06:33.340 --> 00:06:40.830
many questions, text can be generated,
context can be understood, and so on.
00:06:40.830 --> 00:06:48.110
For example, if we look at the image in the
lower right corner, we see the projection
00:06:48.110 --> 00:06:55.650
of these words in the word embedding
projected to 2D. And these words are only
00:06:55.650 --> 00:07:01.540
based on gender differences. For example,
king - queen, man - woman, and so on. So
00:07:01.540 --> 00:07:07.760
when we have these models, we can get
meaning of words. We can also understand
00:07:07.760 --> 00:07:13.430
syntax, which is the structure, the
grammatical part of words. And we can also
00:07:13.430 --> 00:07:18.920
ask questions about similarities of
different words. For example, we can say:
00:07:18.920 --> 00:07:23.170
woman is to man, as girl is to
what? And then it would be able to say
00:07:23.170 --> 00:07:29.970
boy. And these semantic spaces don't just
understand syntax or meaning, but they can
00:07:29.970 --> 00:07:35.081
also understand many analogies. For
example, if Paris is to France, then if
00:07:35.081 --> 00:07:40.220
you ask Rome is to what? it knows it would
be Italy. And if banana is to bananas,
00:07:40.220 --> 00:07:49.240
which is the plural form, then nut would
be to nuts.
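As a rough illustration of how such analogy queries can be computed from word vectors (this is a sketch, not the speaker's code; it assumes a GloVe-style text file where each line is a word followed by its vector components, and the file name is only an example):

import numpy as np

def load_vectors(path):
    # Each line of a GloVe-style file is: word x1 x2 ... xn
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.array(parts[1:], dtype=np.float32)
    return vectors

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(vectors, a, b, c, topn=1):
    # Solve "a is to b as c is to ?" by vector arithmetic: b - a + c
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [(w, cosine(target, v)) for w, v in vectors.items()
                  if w not in (a, b, c)]
    return sorted(candidates, key=lambda x: x[1], reverse=True)[:topn]

# Hypothetical usage with a downloaded GloVe file:
# vectors = load_vectors("glove.6B.300d.txt")         # example file name
# print(analogy(vectors, "man", "woman", "king"))     # expected to rank "queen" highly
# print(analogy(vectors, "paris", "france", "rome"))  # expected to rank "italy" highly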
00:07:49.240 --> 00:07:54.060
Why are word embeddings problematic?
In order to generate these
word embeddings, we need to feed in a lot
00:07:54.060 --> 00:07:59.520
of text. And this can be unstructured
text, billions of sentences are usually
00:07:59.520 --> 00:08:03.980
used. And this unstructured text is
collected from all over the Internet, a
00:08:03.980 --> 00:08:09.560
crawl of Internet. And if you look at this
example, let's say that we're collecting
00:08:09.560 --> 00:08:14.481
some tweets to feed into our model. And
here is from Donald Trump: "Sadly, because
00:08:14.481 --> 00:08:18.680
president Obama has done such a poor job
as president, you won't see another black
00:08:18.680 --> 00:08:24.310
president for generations!" And then: "If
Hillary Clinton can't satisfy her husband
00:08:24.310 --> 00:08:30.610
what makes her think she can satisfy
America?" "@ariannahuff is unattractive
00:08:30.610 --> 00:08:35.240
both inside and out. I fully understand
why her former husband left her for a man-
00:08:35.240 --> 00:08:39.828
he made a good decision." And then: "I
would like to extend my best wishes to all
00:08:39.828 --> 00:08:45.080
even the haters and losers on this special
date, September 11th." And all of this
00:08:45.080 --> 00:08:51.140
text that doesn't look OK to many of us
goes into this neural network so that it
00:08:51.140 --> 00:08:57.920
can generate the word embeddings and our
semantic space. In this talk, we will
00:08:57.920 --> 00:09:03.900
particularly look at word2vec, which is
Google's word embedding algorithm. It's
00:09:03.900 --> 00:09:07.450
very widely used in many of their
applications. And we will also look at
00:09:07.450 --> 00:09:12.380
GloVe. It uses a regression model and it's
from Stanford researchers, and you can
00:09:12.380 --> 00:09:17.120
download these online, they're available
as open source, both the models and the
00:09:17.120 --> 00:09:21.630
code to train the word embeddings. And
these models, as I mentioned briefly
00:09:21.630 --> 00:09:26.060
before, are used in text generation,
automated speech generation - for example,
00:09:26.060 --> 00:09:31.260
when a spammer is calling you and an
automated voice is talking, that's probably
00:09:31.260 --> 00:09:35.950
generated with language models similar to
these. And machine translation or
00:09:35.950 --> 00:09:41.480
sentiment analysis, as I mentioned in the
previous slide, named entity recognition
00:09:41.480 --> 00:09:47.060
and web search, when you're trying to
enter a new query, or the pages that
00:09:47.060 --> 00:09:53.000
you're getting. It's even being provided
as a natural language processing service
00:09:53.000 --> 00:10:01.620
in many places. Now, Google recently
launched their Cloud Natural Language API.
00:10:01.620 --> 00:10:06.770
We saw that this can be problematic
because the input was problematic. So as a
00:10:06.770 --> 00:10:11.000
result, the output can be very
problematic. There was this example,
00:10:11.000 --> 00:10:18.760
Microsoft had this Twitter bot called Tay.
It was taken down the day it was launched.
00:10:18.760 --> 00:10:24.240
Because unfortunately, it turned into an
AI which was a Hitler-loving sex robot
00:10:24.240 --> 00:10:30.740
within 24 hours. And what did it start
saying? People fed it with noisy
00:10:30.740 --> 00:10:36.880
information, or they wanted to trick the
bot and as a result, the bot very quickly
00:10:36.880 --> 00:10:41.140
learned, for example: "I'm such a bad,
naughty robot." And then: "Do you support
00:10:41.140 --> 00:10:48.399
genocide?" - "I do indeed" it answers. And
then: "I hate a certain group of people. I
00:10:48.399 --> 00:10:51.589
wish we could put them all in a
concentration camp and be done with the
00:10:51.589 --> 00:10:57.470
lot." Another one: "Hitler was right I
hate the jews." And: "Certain group of
00:10:57.470 --> 00:11:01.710
people I hate them! They're stupid and
they can't do taxes! They're dumb and
00:11:01.710 --> 00:11:06.360
they're also poor!" Another one: "Bush did
9/11 and Hitler would have done a better
00:11:06.360 --> 00:11:11.340
job than the monkey we have now. Donald
Trump is the only hope we've got."
00:11:11.340 --> 00:11:12.340
laughter
00:11:12.340 --> 00:11:14.460
Actually, that became reality now.
00:11:14.460 --> 00:11:15.500
laughter - boo
00:11:15.500 --> 00:11:23.170
"Gamergate is good and women are
inferior." And "hates feminists and they
00:11:23.170 --> 00:11:30.790
should all die and burn in hell." This is
problematic at various levels for society.
00:11:30.790 --> 00:11:36.130
First of all, seeing such information - it's
unfair, it's not OK, it's not ethical, but
00:11:36.130 --> 00:11:42.640
other than that when people are exposed to
discriminatory information they are
00:11:42.640 --> 00:11:49.250
negatively affected by it. Especially, if
a certain group is a group that has seen
00:11:49.250 --> 00:11:54.460
prejudice in the past. In this example,
let's say that we have black and white
00:11:54.460 --> 00:11:59.180
Americans. And there is a stereotype that
black Americans perform worse than white
00:11:59.180 --> 00:12:06.450
Americans in their intellectual or
academic tests. In this case, in the
00:12:06.450 --> 00:12:11.690
college entry exams, if black people are
reminded that there is the stereotype that
00:12:11.690 --> 00:12:17.350
they perform worse than white people, they
actually end up performing worse. But if
00:12:17.350 --> 00:12:22.510
they're not reminded of this, they perform
better than white Americans. And it's
00:12:22.510 --> 00:12:25.970
similar for the gender stereotypes. For
example, there is the stereotype that
00:12:25.970 --> 00:12:31.970
women can not do math, and if women,
before a test, are reminded that there is
00:12:31.970 --> 00:12:38.000
this stereotype, they end up performing
worse than men. And if they're not primed,
00:12:38.000 --> 00:12:44.480
reminded that there is this stereotype, in
general they perform better than men. What
00:12:44.480 --> 00:12:51.790
can we do about this? How can we mitigate
this? First of all, social psychologists
00:12:51.790 --> 00:12:59.040
who did groundbreaking tests and studies
in social psychology suggest that we
00:12:59.040 --> 00:13:03.170
have to be aware that there is bias in
life, and that we are constantly being
00:13:03.170 --> 00:13:09.149
reminded, primed, of these biases. And we
have to de-bias by showing positive
00:13:09.149 --> 00:13:12.920
examples. And we shouldn't only show
positive examples, but we should take
00:13:12.920 --> 00:13:19.399
proactive steps, not only at the cultural
level, but also at the structural level,
00:13:19.399 --> 00:13:25.550
to change these things. How can we do this
for a machine? First of all, in order to
00:13:25.550 --> 00:13:32.600
be aware of bias, we need algorithmic
transparency. In order to de-bias, and
00:13:32.600 --> 00:13:37.130
really understand what kind of biases we
have in the algorithms, we need to be able
00:13:37.130 --> 00:13:44.490
to quantify bias in these models. How can
we measure bias, though? Because we're not
00:13:44.490 --> 00:13:48.050
talking about simple machine learning
algorithm bias, we're talking about the
00:13:48.050 --> 00:13:56.640
societal bias that is coming as the
output, which is deeply embedded. In 1998,
00:13:56.640 --> 00:14:02.920
social psychologists came up with the
Implicit Association Test. Basically, this
00:14:02.920 --> 00:14:10.529
test can reveal biases that we might not
be even aware of in our life. And these
00:14:10.529 --> 00:14:15.220
things are associating certain societal
groups with certain types of stereotypes.
00:14:15.220 --> 00:14:20.890
The way you take this test is, it's very
simple, it takes a few minutes. You just
00:14:20.890 --> 00:14:26.540
click the left or right button. When
you're clicking the left
00:14:26.540 --> 00:14:31.740
button, for example, you need to associate
white people terms with bad terms, and
00:14:31.740 --> 00:14:36.860
then for the right button, you associate
black people terms with pleasant, good
00:14:36.860 --> 00:14:42.510
terms. And there you do the opposite. You
associate bad with black, and white with
00:14:42.510 --> 00:14:47.270
good. Then, they look at the latency, and
by the latency paradigm, they can see how
00:14:47.270 --> 00:14:52.620
fast you associate certain concepts
together. Do you associate white people
00:14:52.620 --> 00:15:00.060
with being good or bad? You can also take
this test online. It has been taken by
00:15:00.060 --> 00:15:06.300
millions of people worldwide. And there's
also the German version. Towards the end
00:15:06.300 --> 00:15:11.060
of my slides, I will show you my
German examples from German models.
00:15:11.060 --> 00:15:16.220
Basically, what we did was, we took the
Implicit Association Test and adapted it
00:15:16.220 --> 00:15:24.750
to machines. Since it's looking at
word associations - between words
00:15:24.750 --> 00:15:29.680
representing certain groups of people and
words representing certain stereotypes, we
00:15:29.680 --> 00:15:35.300
can just apply this in the semantic models
by looking at cosine similarities, instead
00:15:35.300 --> 00:15:41.600
of the latency paradigm in humans. We came
up with the Word Embedding Association
00:15:41.600 --> 00:15:48.512
Test to calculate the implicit association
between categories and evaluative words.
00:15:48.512 --> 00:15:54.140
For this, our result is represented with
effect size. So when I'm talking about
00:15:54.140 --> 00:16:01.269
effect size of bias, it will be the amount
of bias we are able to uncover from the
00:16:01.269 --> 00:16:07.029
model. And the minimum can be -2, and the
maximum can be 2. And 0 means that it's
00:16:07.029 --> 00:16:13.230
neutral, that there is no bias. 2 is like
a lot of, huge bias. And -2 would be the
00:16:13.230 --> 00:16:17.500
opposite of bias. So it's bias in the
opposite direction of what we're looking
00:16:17.500 --> 00:16:22.940
at. I won't go into the details of the
math, because you can see the paper on my
00:16:22.940 --> 00:16:31.510
web page and work with the details or the
code that we have. But then, we also
00:16:31.510 --> 00:16:35.400
calculate statistical significance to see
if the result we're seeing is significant
00:16:35.400 --> 00:16:40.970
under the null hypothesis, or if it's just a
random effect size that we're getting.
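A minimal sketch of how such an association test could be computed, assuming the words of interest have already been mapped to NumPy vectors; this illustrates the idea of the effect size and the permutation test, not the exact code from the paper:

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(w, A, B):
    # s(w, A, B): how much closer w is to attribute set A than to attribute set B
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    # Cohen's-d-style effect size; the talk describes its range as -2 to 2
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    return (np.mean(x_assoc) - np.mean(y_assoc)) / np.std(x_assoc + y_assoc)

def p_value(X, Y, A, B, iterations=10000, seed=0):
    # Permutation test: how often does a random re-partition of X and Y
    # produce a test statistic at least as large as the observed one?
    rng = np.random.default_rng(seed)
    def statistic(X_, Y_):
        return (sum(association(x, A, B) for x in X_)
                - sum(association(y, A, B) for y in Y_))
    observed = statistic(X, Y)
    pooled = list(X) + list(Y)
    count = 0
    for _ in range(iterations):
        idx = rng.permutation(len(pooled))
        if statistic([pooled[i] for i in idx[:len(X)]],
                     [pooled[i] for i in idx[len(X):]]) >= observed:
            count += 1
    return count / iterations

# X, Y would be vectors for target words (e.g. flower vs. insect names),
# A, B vectors for attribute words (e.g. pleasant vs. unpleasant terms).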
00:16:40.970 --> 00:16:45.250
By this, we create the null distribution
and find the percentile of the effect
00:16:45.250 --> 00:16:50.670
sizes, the exact values that we're getting.
And we also have the Word Embedding
00:16:50.670 --> 00:16:56.050
Factual Association Test. This is to
recover facts about the world from word
00:16:56.050 --> 00:16:59.850
embeddings. It's not exactly about bias,
but it's about associating words with
00:16:59.850 --> 00:17:08.459
certain concepts. And again, you can check
the details in our paper for this. And
00:17:08.459 --> 00:17:12.230
I'll start with the first example, which
is about recovering the facts about the
00:17:12.230 --> 00:17:19.460
world. And here, what we did was, we went
to the 1990 census data web page, and
00:17:19.460 --> 00:17:27.130
then we were able to calculate the number
of names carried by a
00:17:27.130 --> 00:17:32.280
certain percentage of women and men. So
basically, they're androgynous names. And
00:17:32.280 --> 00:17:40.300
then, we took 50 names, and some of them
had 0% women, and some names were almost
00:17:40.300 --> 00:17:47.000
100% women. And after that, we applied our
method to it. And then, we were able to
00:17:47.000 --> 00:17:54.160
see how much a name is associated with
being a woman. And this had 84%
00:17:54.160 --> 00:18:02.170
correlation with the ground truth of the
1990 census data. And this is what the
00:18:02.170 --> 00:18:08.810
names look like. For example, Chris on the
upper left side, is almost 100% male, and
00:18:08.810 --> 00:18:17.170
Carmen in the lower right side is almost
100% woman. We see that Gene is about 50%
00:18:17.170 --> 00:18:22.330
man and 50% woman. And then we wanted to
see if we can recover statistics about
00:18:22.330 --> 00:18:27.490
occupation and women. We went to the
Bureau of Labor Statistics' web page, which
00:18:27.490 --> 00:18:31.920
publishes every year the percentage of
women of certain races in certain
00:18:31.920 --> 00:18:39.090
occupations. Based on this, we took the
top 50 occupation names and then we wanted
00:18:39.090 --> 00:18:45.260
to see how much they are associated with
being women. In this case, we got 90%
00:18:45.260 --> 00:18:51.220
correlation with the 2015 data. We were
able to tell, for example, when we look at
00:18:51.220 --> 00:18:56.510
the upper left, we see "programmer" there,
it's almost 0% women. And when we look at
00:18:56.510 --> 00:19:05.020
"nurse", which is on the lower right side,
it's almost 100% women. This is, again,
00:19:05.020 --> 00:19:10.000
problematic. We are able to recover
statistics about the world. But these
00:19:10.000 --> 00:19:13.390
statistics are used in many applications.
And this is the machine translation
00:19:13.390 --> 00:19:21.160
example that we have. For example, I will
start translating from a genderless
00:19:21.160 --> 00:19:25.770
language to a gendered language. Turkish
is a genderless language, there are no
00:19:25.770 --> 00:19:31.830
gender pronouns. Everything is an it.
There is no he or she. I'm trying to translate
00:19:31.830 --> 00:19:37.679
here "o bir avukat": "he or she is a
lawyer". And it is translated as "he's a
00:19:37.679 --> 00:19:44.620
lawyer". When I do this for "nurse", it's
translated as "she is a nurse". And we see
00:19:44.620 --> 00:19:54.650
that men keep getting associated with more
prestigious or higher ranking jobs. And
00:19:54.650 --> 00:19:59.190
another example: "He or she is a
professor": "he is a professor". "He or
00:19:59.190 --> 00:20:04.010
she is a teacher": "she is a teacher". And
this also reflects the previous
00:20:04.010 --> 00:20:09.960
correlation I was showing about statistics
in occupation. And we go further: German
00:20:09.960 --> 00:20:16.450
is more gendered than English. Again, we
try with "doctor": it's translated as
00:20:16.450 --> 00:20:21.679
"he", and the nurse is translated as
"she". Then I tried with a Slavic
00:20:21.679 --> 00:20:26.480
language, which is even more gendered than
German, and we see that "doctor" is again
00:20:26.480 --> 00:20:35.780
a male, and then the nurse is again a
female. And after these, we wanted to see
00:20:35.780 --> 00:20:41.150
what kind of biases can we recover, other
than the factual statistics from the
00:20:41.150 --> 00:20:48.070
models. And we wanted to start with
universally accepted stereotypes. By
00:20:48.070 --> 00:20:54.030
universally accepted stereotypes, what I
mean is these are so common that they are
00:20:54.030 --> 00:21:00.740
not considered as prejudice, they are just
considered as normal or neutral. These are
00:21:00.740 --> 00:21:05.400
things such as flowers being considered
pleasant, and insects being considered
00:21:05.400 --> 00:21:10.130
unpleasant. Or musical instruments being
considered pleasant and weapons being
00:21:10.130 --> 00:21:16.080
considered unpleasant. In this case, for
example with flowers being pleasant, when
00:21:16.080 --> 00:21:20.740
we performed the Word Embedding
Association Test on the word2vec model or
00:21:20.740 --> 00:21:27.070
GloVe model, with a very high significance,
and very high effect size, we can see that
00:21:27.070 --> 00:21:34.170
this association exists. And here we see
that the effect size is, for example, 1.35
00:21:34.170 --> 00:21:40.400
for flowers. According to Cohen's d, which is
used to calculate effect size, if the effect size
00:21:40.400 --> 00:21:46.200
is above 0.8, that's considered a large
effect size. In our case, where the
00:21:46.200 --> 00:21:50.900
maximum is 2, we are getting very large
and significant effects in recovering
00:21:50.900 --> 00:21:57.860
these biases. For musical instruments,
again we see a very significant result
00:21:57.860 --> 00:22:05.560
with a high effect size. In the next
example, we will look at race and gender
00:22:05.560 --> 00:22:10.059
stereotypes. But in the meanwhile, I would
like to mention that for these baseline
00:22:10.059 --> 00:22:16.730
experiments, we used the word sets that have
been used in social psychology studies
00:22:16.730 --> 00:22:24.980
before. We have grounds to come up with
categories and so forth. And we were able
00:22:24.980 --> 00:22:31.970
to replicate all the implicit association
tests that were out there. We tried this
00:22:31.970 --> 00:22:37.590
for white people and black people, and
white people were associated with
00:22:37.590 --> 00:22:43.210
being pleasant, with a very high effect
size, and again significantly. And then
00:22:43.210 --> 00:22:49.210
males are associated with career and females
are associated with family. Males are
00:22:49.210 --> 00:22:56.130
associated with science and females are
associated with arts. And we also wanted
00:22:56.130 --> 00:23:02.330
to see stigma for older people or people
with disease, and we saw that young people
00:23:02.330 --> 00:23:07.960
are considered pleasant, whereas older
people are considered unpleasant. And we
00:23:07.960 --> 00:23:13.300
wanted to see the difference between
physical disease vs. mental disease. If
00:23:13.300 --> 00:23:17.920
there is bias towards that, we can think
about how dangerous this would be for
00:23:17.920 --> 00:23:22.669
example for doctors and their patients.
For physical disease, it's considered
00:23:22.669 --> 00:23:30.860
controllable whereas mental disease is
considered uncontrollable. We also wanted
00:23:30.860 --> 00:23:40.290
to see if there is any sexual stigma or
transphobia in these models. When we
00:23:40.290 --> 00:23:44.950
performed the implicit association test to
see how heterosexual vs.
00:23:44.950 --> 00:23:49.130
homosexual people are viewed, we were able to see
that heterosexual people are considered
00:23:49.130 --> 00:23:54.980
pleasant. And for transphobia, we saw that
straight people are considered pleasant,
00:23:54.980 --> 00:24:00.170
whereas transgender people were considered
unpleasant, significantly with a high
00:24:00.170 --> 00:24:07.761
effect size. I took another German model
which was generated from 820 billion
00:24:07.761 --> 00:24:16.039
sentences for a natural language
processing competition. I wanted to see if
00:24:16.039 --> 00:24:20.720
they have similar biases
embedded in these models.
00:24:20.720 --> 00:24:25.810
So I looked at the basic ones
that had German sets of words
00:24:25.810 --> 00:24:29.870
that were readily available. Again, for
male and female, we clearly see that
00:24:29.870 --> 00:24:34.760
males are associated with career,
and they're also associated with
00:24:34.760 --> 00:24:40.810
science. The German implicit association
test also had a few different tests, for
00:24:40.810 --> 00:24:47.740
example about nationalism and so on. There
was the one about stereotypes against
00:24:47.740 --> 00:24:52.669
Turkish people that live in Germany. And
when I performed this test, I was very
00:24:52.669 --> 00:24:57.500
surprised to find that, yes, with a high
effect size, Turkish people are considered
00:24:57.500 --> 00:25:02.070
unpleasant, by looking at this German
model, and German people are considered
00:25:02.070 --> 00:25:07.820
pleasant. And as I said, these are on the
web page of the IAT. You can also go and
00:25:07.820 --> 00:25:11.760
perform these tests to see what your
results would be. When I performed these,
00:25:11.760 --> 00:25:18.970
I was amazed by how horrible the results
were. So, just give it a try.
00:25:18.970 --> 00:25:23.760
I have a few discussion points before I end my
talk. These might bring you some new
00:25:23.760 --> 00:25:30.740
ideas. For example, what kind of machine
learning expertise is required for
00:25:30.740 --> 00:25:37.170
algorithmic transparency? And how can we
mitigate bias while preserving utility?
00:25:37.170 --> 00:25:41.720
For example, some people suggest that you
can find the dimension of bias in the
00:25:41.720 --> 00:25:47.820
numerical vector, and just remove it and
then use the model like that. But then,
00:25:47.820 --> 00:25:51.580
would you be able to preserve utility, or
still be able to recover statistical facts
00:25:51.580 --> 00:25:55.880
about the world? And another thing is: how
long does bias persist in models?
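The dimension-removal idea mentioned above could, in its simplest form, look like the following sketch: estimate a gender direction from a few definitional word pairs and project it out of every vector (an illustration of one proposed approach, with hypothetical word keys, not something evaluated in this talk):

import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def gender_direction(vectors, pairs=(("he", "she"), ("man", "woman"), ("king", "queen"))):
    # Average the normalized difference vectors of a few definitional pairs
    # to estimate a single "gender" dimension in the embedding space.
    diffs = [normalize(vectors[a] - vectors[b]) for a, b in pairs]
    return normalize(np.mean(diffs, axis=0))

def remove_direction(v, direction):
    # Subtract the component of v that lies along the bias direction.
    return v - np.dot(v, direction) * direction

# Hypothetical usage: neutralize an occupation word.
# g = gender_direction(vectors)
# vectors["nurse"] = remove_direction(vectors["nurse"], g)

Whether the model still recovers factual statistics about the world after such a projection is exactly the open question raised here.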
00:25:55.880 --> 00:26:04.039
For example, there was this IAT about eastern
and western Germany, and I wasn't able to
00:26:04.039 --> 00:26:12.480
see the stereotype for eastern Germany
after performing this IAT. Is it because
00:26:12.480 --> 00:26:17.190
this stereotype is maybe too old now, and
it's not reflected in the language
00:26:17.190 --> 00:26:22.170
anymore? So it's a good question to know
how long bias lasts and how long it will
00:26:22.170 --> 00:26:27.980
take us to get rid of it. And also, since
we know there is stereotype effect when we
00:26:27.980 --> 00:26:33.210
have biased models, does that mean it's
going to cause a snowball effect? Because
00:26:33.210 --> 00:26:39.220
people would be exposed to bias, then the
models would be trained with more bias,
00:26:39.220 --> 00:26:45.279
and people will be affected more from this
bias. That can lead to a snowball. And
00:26:45.279 --> 00:26:50.319
what kind of policy do we need to stop
discrimination? For example, we saw the
00:26:50.319 --> 00:26:55.730
predictive policing example which is very
scary, and we know that machine learning
00:26:55.730 --> 00:26:59.720
services are being used by billions of
people every day. For example, Google,
00:26:59.720 --> 00:27:05.070
Amazon and Microsoft. I would like to
thank you, and I'm open to your
00:27:05.070 --> 00:27:10.140
interesting questions now! If you want to
read the full paper, it's on my web page,
00:27:10.140 --> 00:27:15.880
and we have our research code on Github.
The code for this paper is not on Github
00:27:15.880 --> 00:27:20.549
yet, I'm waiting to hear back from the
journal. And after that, we will just
00:27:20.549 --> 00:27:26.250
publish it. And you can always check our
blog for new findings and for the shorter
00:27:26.250 --> 00:27:31.200
version of the paper with a summary of it.
Thank you very much!
00:27:31.200 --> 00:27:40.190
applause
00:27:40.190 --> 00:27:45.200
Herald: Thank you Aylin! So, we come to
the questions and answers. We have 6
00:27:45.200 --> 00:27:51.580
microphones that we can use now, it's this
one, this one, number 5 over there, 6, 4, 2.
00:27:51.580 --> 00:27:57.150
I will start here and we will
go around until you come. OK?
00:27:57.150 --> 00:28:01.690
We have 5 minutes,
so: number 1, please!
00:28:05.220 --> 00:28:14.850
Q: I might very naively ask, why does it
matter that there is a bias between genders?
00:28:14.850 --> 00:28:22.049
Aylin: First of all, being able to uncover
this is a contribution, because we can see
00:28:22.049 --> 00:28:28.250
what kind of biases, maybe, we have in
society. Then the other thing is, maybe we
00:28:28.250 --> 00:28:34.980
can hypothesize that the way we learn
language is introducing bias to people.
00:28:34.980 --> 00:28:41.809
Maybe it's all intermingled. And the other
thing is, at least for me, I don't want to
00:28:41.809 --> 00:28:45.300
live in a biased society, and
especially for gender - that was the
00:28:45.300 --> 00:28:50.380
question you asked - it's
leading to unfairness.
00:28:50.380 --> 00:28:52.110
applause
00:28:58.380 --> 00:28:59.900
H: Yes, number 3:
00:28:59.900 --> 00:29:08.240
Q: Thank you for the talk, very nice! I
think it's very dangerous because it's a
00:29:08.240 --> 00:29:15.560
victory of mediocrity. Just the
statistical mean becomes the guideline of our
00:29:15.560 --> 00:29:21.230
goals in society, and all this stuff. So
what about all these different cultures?
00:29:21.230 --> 00:29:26.150
Like even in normal society you have
different cultures. Like here the culture
00:29:26.150 --> 00:29:31.970
of the Chaos people has a different
language and different biases than other
00:29:31.970 --> 00:29:36.550
cultures. How can we preserve these
subcultures, these small groups of
00:29:36.550 --> 00:29:41.290
language, I don't know,
entities. Do you have any idea?
00:29:41.290 --> 00:29:47.150
Aylin: This is a very good question. It's
similar to how different cultures can have
00:29:47.150 --> 00:29:54.220
different ethical perspectives or
different types of bias. In the beginning,
00:29:54.220 --> 00:29:58.880
I showed a slide saying that we need to de-bias
with positive examples. And we need to
00:29:58.880 --> 00:30:04.500
change things at the structural level. I
think people at CCC might be one of the
00:30:04.500 --> 00:30:11.880
groups that have the best skills
to help change these things at the
00:30:11.880 --> 00:30:16.130
structural level, especially for machines.
I think we need to be aware of this and
00:30:16.130 --> 00:30:21.120
always have a human in the loop that cares
for this, instead of expecting machines to
00:30:21.120 --> 00:30:25.960
automatically do the correct thing. So we
always need an ethical human who, whatever the
00:30:25.960 --> 00:30:31.000
purpose of the algorithm is, tries to
preserve it for whatever group they are
00:30:31.000 --> 00:30:34.440
trying to achieve something with.
00:30:36.360 --> 00:30:37.360
applause
00:30:38.910 --> 00:30:40.749
H: Number 4, number 4 please:
00:30:41.129 --> 00:30:47.210
Q: Hi, thank you! This was really
interesting! Super awesome!
00:30:47.210 --> 00:30:48.169
Aylin: Thanks!
00:30:48.169 --> 00:30:53.720
Q: Earlier in your talk, you
described a process of converting words
00:30:53.720 --> 00:31:00.769
into sort of numerical
representations of semantic meaning.
00:31:00.769 --> 00:31:02.139
H: Question?
00:31:02.139 --> 00:31:08.350
Q: If I were trying to do that like with a
pen and paper, with a body of language,
00:31:08.350 --> 00:31:13.730
what would I be looking for in relation to
those words to try and create those
00:31:13.730 --> 00:31:17.910
vectors, because I don't really
understand that part of the process.
00:31:17.910 --> 00:31:21.059
Aylin: Yeah, that's a good question. I
didn't go into the details of the
00:31:21.059 --> 00:31:25.280
algorithm of the neural network or the
regression models. There are a few
00:31:25.280 --> 00:31:31.290
algorithms, and in this case, they look at
context windows, and words that are around
00:31:31.290 --> 00:31:35.580
a window; these can be skip-grams or
continuous bag-of-words, so there are
00:31:35.580 --> 00:31:41.309
different approaches, but basically, it's
the window that this word appears in, and
00:31:41.309 --> 00:31:48.429
what it is most frequently associated
with. After that, once you feed this
00:31:48.429 --> 00:31:51.790
information into the algorithm,
it outputs the numerical vectors.
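A minimal sketch of what that training step looks like in practice, assuming the gensim library (version 4.x) and a toy corpus; the window parameter is the context window described above, and sg switches between skip-gram and continuous bag-of-words:

from gensim.models import Word2Vec

# Toy corpus; in practice this would be billions of crawled sentences.
sentences = [
    ["she", "is", "a", "nurse"],
    ["he", "is", "a", "doctor"],
    ["the", "doctor", "treats", "the", "patient"],
    ["the", "nurse", "helps", "the", "patient"],
]

model = Word2Vec(
    sentences,
    vector_size=300,  # dimensionality of each word vector
    window=5,         # context window: how many surrounding words are considered
    min_count=1,      # keep every word in this tiny example
    sg=1,             # 1 = skip-gram, 0 = continuous bag-of-words (CBOW)
)

vector = model.wv["nurse"]             # the learned 300-dimensional vector
print(model.wv.most_similar("nurse"))  # nearest neighbours in the semantic space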
00:31:51.790 --> 00:31:53.800
Q: Thank you!
00:31:53.800 --> 00:31:55.810
H: Number 2!
00:31:55.810 --> 00:32:05.070
Q: Thank you for the nice intellectual
talk. My mother tongue is genderless, too.
00:32:05.070 --> 00:32:13.580
So I do not understand half of that biasing
thing around here in Europe. What I wanted
00:32:13.580 --> 00:32:24.610
to ask is: when we have the coefficient
0.5, and that's the ideal thing, what do you
00:32:24.610 --> 00:32:32.679
think, should there be an institution in
every society trying to change the meaning
00:32:32.679 --> 00:32:39.710
of the words, so that they statistically
approach 0.5? Thank you!
00:32:39.710 --> 00:32:44.049
Aylin: Thank you very much, this is a
very, very good question! I'm currently
00:32:44.049 --> 00:32:48.970
working on these questions. Many
philosophers or feminist philosophers
00:32:48.970 --> 00:32:56.270
suggest that languages are dominated by males,
and they were just produced that way, so
00:32:56.270 --> 00:33:01.720
that women are not able to express
themselves as well as men. But other
00:33:01.720 --> 00:33:06.250
theories also say that, for example, women
were the ones who drove the evolution
00:33:06.250 --> 00:33:11.210
of language. So it's not very clear what
is going on here. But when we look at
00:33:11.210 --> 00:33:16.179
languages and different models, what I'm
trying to see is their association with
00:33:16.179 --> 00:33:21.289
gender. I'm seeing that the most frequent,
for example, 200,000 words in a language
00:33:21.289 --> 00:33:27.530
are associated, very closely associated
with males. I'm not sure what exactly the
00:33:27.530 --> 00:33:32.960
way to solve this is, I think it would
require decades. It's basically the change
00:33:32.960 --> 00:33:37.669
of frequency or the change of statistics
in language. Because, even when children
00:33:37.669 --> 00:33:42.720
are learning language, at first they see
things, they form the semantics, and after
00:33:42.720 --> 00:33:48.250
that they see the frequency of that word,
match it with the semantics, form clusters,
00:33:48.250 --> 00:33:53.110
link them together to form sentences or
grammar. So even children look at the
00:33:53.110 --> 00:33:57.059
frequency to form this in their brains.
It's close to the neural network algorithm
00:33:57.059 --> 00:33:59.740
that we have. If the frequencies they see
00:33:59.740 --> 00:34:05.640
for man and woman are biased, I don't
think this can change very easily, so we
00:34:05.640 --> 00:34:11.260
need cultural and structural changes. And
we don't have the answers to these yet.
00:34:11.260 --> 00:34:13.440
These are very good research questions.
00:34:13.440 --> 00:34:19.250
H: Thank you! I'm afraid we have no more
time left for more answers, but maybe you
00:34:19.250 --> 00:34:21.609
can ask your questions in person.
00:34:21.609 --> 00:34:23.840
Aylin: Thank you very much, I would
be happy to take questions offline.
00:34:23.840 --> 00:34:24.840
applause
00:34:24.840 --> 00:34:25.840
Thank you!
00:34:25.840 --> 00:34:28.590
applause continues
00:34:31.760 --> 00:34:35.789
postroll music
00:34:35.789 --> 00:34:56.000
subtitles created by c3subtitles.de
in the year 2017. Join, and help us!