WEBVTT
00:00:00.000 --> 00:00:19.030
36C3 preroll music
00:00:19.030 --> 00:00:26.500
Herald: OK. So inside the fake like
factories. I'm going to date myself. I
00:00:26.500 --> 00:00:32.980
remember it was the Congress around
1990, 1991 or so, where I was sitting
00:00:32.980 --> 00:00:38.550
together with some people who came over to
the states to visit the CCC Congress. And
00:00:38.550 --> 00:00:43.230
we were kind of riffing on how great the
internet is gonna make the world, you
00:00:43.230 --> 00:00:46.970
know, how it's gonna bring world peace
and truth will rule and everything like
00:00:46.970 --> 00:00:57.259
that. Boy, were we naive, boy, were we
totally wrong. And today I'm going to be
00:00:57.259 --> 00:01:03.470
schooled in how wrong I actually was
because we have Svea, Dennis and Philip to
00:01:03.470 --> 00:01:08.980
tell us all about the fake like factories
around the world. And with that, could you
00:01:08.980 --> 00:01:17.670
please help me in welcoming them onto the
stage? Svea, Dennis and Philip.
00:01:17.670 --> 00:01:28.810
Philip: Thank you very much. Welcome to
our talk "Inside the Fake Like Factories".
00:01:28.810 --> 00:01:35.899
My name is Philip. I'm an Internet
activist against disinformation and I'm
00:01:35.899 --> 00:01:38.719
also a student of the University of
Bamberg.
00:01:38.719 --> 00:01:45.039
Svea: Hi. Thank you for listening to us
tonight. My name is Svea. I'm an
00:01:45.039 --> 00:01:50.219
investigative journalist, freelance mostly
for the NDR and ARD. It's a public
00:01:50.219 --> 00:01:55.759
broadcaster in Germany. And I focus on
tech issues. And I had the pleasure to
00:01:55.759 --> 00:02:01.280
work with these two guys on, for me, a
journalistic project and for them on a
00:02:01.280 --> 00:02:04.289
scientific project.
Dennis: Yeah. Hi, everyone. My name is
00:02:04.289 --> 00:02:09.009
Dennis. I'm a PhD student from Ruhr
University Bochum. I'm working as a
00:02:09.009 --> 00:02:16.160
research assistant for the chair for
System Security. My research focuses on
00:02:16.160 --> 00:02:21.349
network security topics and Internet
measurements. And as Svea said, Philip and
00:02:21.349 --> 00:02:26.660
myself, we are here for the scientific
part and Svea is for the journalistic part
00:02:26.660 --> 00:02:31.790
here.
Philip: So here's our outline for today.
00:02:31.790 --> 00:02:38.550
So first, I'm going to briefly talk about
our motivation for our descent into the
00:02:38.550 --> 00:02:45.160
fake like factories and then we are going
to show you how we got our hands on ninety
00:02:45.160 --> 00:02:50.780
thousand fake like campaigns of a major
crowd working platform. And we are also
00:02:50.780 --> 00:02:56.080
going to show you why we think that there
are 10 billion registered Facebook users
00:02:56.080 --> 00:03:04.360
today. So first, I'm going to talk about
the like button. The like button is the
00:03:04.360 --> 00:03:12.150
ultimate indicator for popularity on
social media. It shows you how trustworthy
00:03:12.150 --> 00:03:18.620
someone is. It shows how popular
someone is. It is an indicator
00:03:18.620 --> 00:03:26.520
for economic success of brands and it also
influences the Facebook algorithm. And as
00:03:26.520 --> 00:03:31.710
we are going to show now, these kinds of
likes can be easily forged and
00:03:31.710 --> 00:03:38.580
manipulated. But the problem is that many
users will still prefer this bad info on
00:03:38.580 --> 00:03:45.960
Facebook about the popularity of a product
to no info at all. And so this is a real
00:03:45.960 --> 00:03:53.780
problem. And there is no real solution to
this. So first, we are going to talk about
00:03:53.780 --> 00:03:58.990
the factories and the workers in the fake
like factories.
00:03:58.990 --> 00:04:04.210
Svea: That there are fake likes and that
you can buy likes everywhere, it's well
00:04:04.210 --> 00:04:09.660
known. So if you Google "buying fake
likes" or even "fake comments" for
00:04:09.660 --> 00:04:15.100
Instagram or for Facebook, then you will
get hundreds of results and you can
00:04:15.100 --> 00:04:19.989
buy them very cheap and very expensive. It
doesn't matter, you can buy them from
00:04:19.989 --> 00:04:27.790
every country. But when you think of these
bought likes, then you may think of this.
00:04:27.790 --> 00:04:34.960
So you may think of somebody sitting in
China, Pakistan or India, and you think of
00:04:34.960 --> 00:04:40.240
computers and machines doing all this and
that they are, yeah, that they are fake
00:04:40.240 --> 00:04:47.630
and also that they can easily be detected
and that maybe they are not a big problem.
00:04:47.630 --> 00:04:54.880
But it's not always like this. It also can
be like this. So, I want you to meet
00:04:54.880 --> 00:05:03.120
Maria, I met her in Berlin. And Harald, he
lives near Mönchengladbach. So Maria, she
00:05:03.120 --> 00:05:11.750
is a retiree, a former police
officer. And as money is always short, she
00:05:11.750 --> 00:05:19.670
is clicking Facebook likes for money. She
earns between 2 cents and 6 cents per like.
00:05:19.670 --> 00:05:28.720
And Harald, he was a baker once, is now
getting social aid and he is also clicking
00:05:28.720 --> 00:05:34.480
and liking and commenting the whole day.
We met them during our research project
00:05:34.480 --> 00:05:40.930
and did some interviews about their likes.
And one platform they are clicking and
00:05:40.930 --> 00:05:46.750
working for is PaidLikes. It's only one
platform out of a universe, out of a
00:05:46.750 --> 00:05:52.070
cosmos. PaidLikes, they are sitting just a
couple of minutes from here in Magdeburg
00:05:52.070 --> 00:05:56.990
and they are offering that you can earn
money with liking on different platforms.
00:05:56.990 --> 00:06:02.410
And it looks like this: when you log into
the platform with your Facebook account,
00:06:02.410 --> 00:06:07.300
then in the morning, in the
afternoon and in the evening you get what we
00:06:07.300 --> 00:06:13.260
call campaigns. These are pages,
Facebook fan pages or Instagram pages, or
00:06:13.260 --> 00:06:18.240
posts, or comments. You can, you know, you
can work your way through them and click
00:06:18.240 --> 00:06:22.930
them. And you see the blue bars here:
I blurred them because we don't want
00:06:22.930 --> 00:06:29.800
to get sued by all these companies,
which you can see there. To take you a
00:06:29.800 --> 00:06:37.310
little bit with me on the journey. Harald,
he was okay with us coming by for
00:06:37.310 --> 00:06:44.280
television and he was okay that we did a
long interview with him, and I want to
00:06:44.280 --> 00:06:50.080
show you a very small piece out of his
daily life sitting there doing the
00:06:50.080 --> 00:06:53.540
household, the washing and the cleaning,
and clicking.
00:07:26.760 --> 00:07:36.020
Come on. It could be like that. You click
and you earn some money. How did we meet
00:07:36.020 --> 00:07:41.150
him and all the others? Of course, because
Philip and Dennis, they have a more
00:07:41.150 --> 00:07:45.169
scientific approach. So it was also
important not only to talk to one or two,
00:07:45.169 --> 00:07:50.120
but to talk to many. So we created a
Facebook fan page, which we call "Eine
00:07:50.120 --> 00:07:54.210
Linie unterm Strich" (a line under a line)
because I thought, okay, nobody will like
00:07:54.210 --> 00:08:01.080
this freely. And then we did a post. This
post, and we bought likes, and you won't
00:08:01.080 --> 00:08:10.310
believe it, it worked so well; 222 people,
all the people I paid for liked this. And
00:08:10.310 --> 00:08:18.259
then we wrote all of them and we talked to
many of them. Some of them only in
00:08:18.259 --> 00:08:23.410
writing; some of them we called
or had a phone chat with. But they gave us a
00:08:23.410 --> 00:08:29.949
lot of information about their life as a
click worker, which I will sum up.
00:08:29.949 --> 00:08:36.169
PaidLikes itself says that
they have 30000 registered users, and it's
00:08:36.169 --> 00:08:41.070
really interesting because you might think
that they are all registered with 10 or 15
00:08:41.070 --> 00:08:45.620
accounts, but most of them, they are not.
They are clicking with their real account,
00:08:45.620 --> 00:08:57.529
which makes it really hard to detect them.
So they even scan their I.D. so that the
00:08:57.529 --> 00:09:03.210
company knows that they are real. Then
they earn their money. And we met men,
00:09:03.210 --> 00:09:09.760
women, stay-at-home moms, low-income
earners, retirees, people who are getting
00:09:09.760 --> 00:09:17.850
social care. So, basically, anybody. There
was no kind of bias. And many of them are
00:09:17.850 --> 00:09:24.890
clicking for two or more platforms.
I didn't meet anybody who's only
00:09:24.890 --> 00:09:29.370
clicking for one platform. They all have a
variety of platforms where they are
00:09:29.370 --> 00:09:34.610
writing comments or clicking likes. And
you can make - this is what they told us -
00:09:34.610 --> 00:09:41.580
between 15 euros and 450 euros monthly, if
you are a so-called power clicker and you
00:09:41.580 --> 00:09:48.410
do this somewhat professionally. But
these are only the workers, and maybe you
00:09:48.410 --> 00:09:52.740
are more interested in who are the buyers?
Who benefits?
00:09:52.740 --> 00:09:59.631
Dennis: Yeah. Let's come to step two. Who
benefits from the campaigns? So I think
00:09:59.631 --> 00:10:06.089
you all remember this page. This is the
screen if you log into PaidLikes and,
00:10:06.089 --> 00:10:14.490
you'll see the campaigns which you have to
click in order to get a little bit of
00:10:14.490 --> 00:10:25.370
money. And by luck we noticed that if
you hover over a link, we see in the bottom
00:10:25.370 --> 00:10:31.980
left corner of the browser a URL
redirecting to the campaign you have to
00:10:31.980 --> 00:10:40.700
click. And you see that every campaign is
using a unique ID. It is just a simple
00:10:40.700 --> 00:10:49.640
integer, and the good thing is, it is just
incremented. So now maybe some of you guys
00:10:49.640 --> 00:10:56.570
notice what we can do with that. And yeah,
it is really easy with these constructed
00:10:56.570 --> 00:11:02.670
URLs to implement a crawler for data
gathering, and our crawler simply
00:11:02.670 --> 00:11:11.931
requested all campaign IDs between 0 and
90000. Maybe some of you ask why 90000? As
00:11:11.931 --> 00:11:17.110
I already said, we were also registered as
click workers and we saw that the
00:11:17.110 --> 00:11:24.779
highest campaign ID used was about 88000.
So we thought OK, 90000 is a good value
00:11:24.779 --> 00:11:30.540
and we checked for each of
these 90000 requests if it got resolved or
00:11:30.540 --> 00:11:36.030
not, and if it got resolved, we were redirected
to the URL representing the resource that
00:11:36.030 --> 00:11:42.431
should be liked or followed. And we did
not save the page sources from the
00:11:42.431 --> 00:11:50.750
resolved URLs, we only saved the resolved
URLs in the list of campaigns, and this
00:11:50.750 --> 00:11:58.700
list was then the basis for further
analysis. And here you see our list.
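NOTE
Editorial sketch, not the speakers' code: the enumeration described above (request every integer campaign ID from 0 to 90000 and keep only the redirect targets that resolve) could look roughly like this. The URL pattern and the `resolve` callable are assumptions for illustration; the talk does not show the real PaidLikes URL format or the crawler's source.

```python
from typing import Callable, Dict, Optional

# Hypothetical URL pattern; the real campaign URL format is not given in the talk.
BASE_URL = "https://crowdworking.example/campaign?id={cid}"


def campaign_url(cid: int) -> str:
    """Build the URL for one incrementing campaign ID."""
    return BASE_URL.format(cid=cid)


def crawl_campaigns(
    resolve: Callable[[str], Optional[str]], max_id: int = 90000
) -> Dict[int, str]:
    """Try every ID from 0 to max_id and keep only the resolved target URLs.

    `resolve` stands in for an HTTP request that follows the redirect and
    returns the final URL, or None if the campaign ID does not resolve.
    Page sources are deliberately not stored, only the resolved URLs.
    """
    found: Dict[int, str] = {}
    for cid in range(max_id + 1):
        target = resolve(campaign_url(cid))
        if target is not None:
            found[cid] = target
    return found


def demo_resolver(url: str) -> Optional[str]:
    """Offline stand-in: pretend that only campaigns 3 and 7 exist."""
    cid = int(url.rsplit("=", 1)[1])
    return f"https://social.example/page/{cid}" if cid in (3, 7) else None


if __name__ == "__main__":
    print(crawl_campaigns(demo_resolver, max_id=10))
```

Passing the resolver in as a parameter keeps the sketch runnable offline; a real crawler would replace `demo_resolver` with an HTTP client and add rate limiting.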
00:11:58.700 --> 00:12:05.740
Svea: Yes. This was the point when Dennis
and Philip came to us and said,
00:12:05.740 --> 00:12:12.000
hey, we have a list. So what can you find?
And of course, AfD was one of
00:12:12.000 --> 00:12:20.940
the first search queries. And yeah, of
course, AfD is also in that list. Maybe
00:12:20.940 --> 00:12:31.149
not so surprisingly for some. And when you
look, it is AfD Gelsenkirchen and its fan
00:12:31.149 --> 00:12:39.589
page. And we asked AfD Gelsenkirchen, did
you buy likes? And they said, we don't
00:12:39.589 --> 00:12:48.240
know how we got on that list. However,
we do not rule out an anonymous donation.
00:12:48.240 --> 00:12:55.410
But now you would think, Ok, they found
AfD; this is to be expected. But no, all
00:12:55.410 --> 00:13:00.930
political parties – mostly local and
regional entities - showed up on that
00:13:00.930 --> 00:13:09.250
list. So we have CDU/CSU. We have FDP,
SPD, AfD, Die Grünen and Die Linke. But
00:13:09.250 --> 00:13:15.390
don't think that Angela Merkel or some
very big Facebook fan pages just showed
00:13:15.390 --> 00:13:23.800
up. No, no. Very small entities with a
couple of hundred or maybe 10000 or 15000
00:13:23.800 --> 00:13:28.390
followers. And I think this makes
perfect sense, because somebody who has
00:13:28.390 --> 00:13:35.370
already very, very many fans
probably would not buy them there at
00:13:35.370 --> 00:13:46.311
PaidLikes. And we asked many of them, and
mostly they could not explain it. They
00:13:46.311 --> 00:13:52.040
would never do something like that. Yeah,
they were completely stumped. But you
00:13:52.040 --> 00:13:56.690
have to keep in mind that we only saw the
campaigns, that is, their Facebook
00:13:56.690 --> 00:14:03.110
fan pages; we could not see who bought the
likes. And as you can imagine, everybody
00:14:03.110 --> 00:14:08.740
could have done it: the mother, the
brother, the fan, you know, the dog. So
00:14:08.740 --> 00:14:15.160
this was a case we would have needed a lot
of luck to call anybody out of the blue
00:14:15.160 --> 00:14:20.260
and then he would say, oh, yes, I did
this. But there were
00:14:20.260 --> 00:14:25.810
some politicians who admitted it. And one
of them also did so publicly and gave
00:14:25.810 --> 00:14:35.339
us an interview. It's Tanja Kühne. She is
a regional politician from Walsrode,
00:14:35.339 --> 00:14:40.260
Niedersachsen. It
was after an election
00:14:40.260 --> 00:14:44.360
and she was not very happy with her fan
page. That is what she told us. She was
00:14:44.360 --> 00:14:49.220
very unhappy and she wanted, you know, to
push herself and to boost it a little bit,
00:14:49.220 --> 00:14:55.510
and get more friends and followers and
reach. And then she bought 500 followers.
00:14:55.510 --> 00:15:02.870
And then we had a nice interview with her
about that. I'll show you a small piece.
00:15:53.829 --> 00:15:59.760
Okay, so you see, her answers are pretty
interesting. And I think she was
00:15:59.760 --> 00:16:05.180
very courageous to speak out to us. Many
others did too, but only on the phone.
00:16:05.180 --> 00:16:09.180
And they didn't want to go on the record.
But she's not the only one who answered
00:16:09.180 --> 00:16:14.110
like this. Because, of course, if you call
through a list of potential fake like
00:16:14.110 --> 00:16:21.120
buyers, of course they answer like, no,
it's not a scam. And I also think from a
00:16:21.120 --> 00:16:26.180
legal point of view, it's also very
hard to show that this is fraud and a
00:16:26.180 --> 00:16:33.209
scam. And it's more an ethical problem
that you can see here, that
00:16:33.209 --> 00:16:40.170
it's manipulative if you buy likes. We
also found a guy from the FDP in the
00:16:40.170 --> 00:16:45.269
Bundestag. But yeah, he ran away and
didn't want to get interviewed, so I
00:16:45.269 --> 00:16:52.700
can't show him to you. He appeared about
40 times in our
00:16:52.700 --> 00:16:59.100
list for various Facebook posts and videos
and also for his Instagram account. But we
00:16:59.100 --> 00:17:06.730
could not get him
on record. So what did others say? We, of
00:17:06.730 --> 00:17:10.970
course, confronted Facebook, Instagram and
YouTube with this small research. And they
00:17:10.970 --> 00:17:18.079
said, no, we don't want fake likes on our
platform. PaidLikes has been active since 2012,
00:17:18.079 --> 00:17:25.370
you know. So they waited seven years. But
after our report, at least, Facebook
00:17:25.370 --> 00:17:32.549
temporarily blocked PaidLikes. And of
course, we asked them too, and spoke to
00:17:32.549 --> 00:17:35.781
them and corresponded with PaidLikes in
Magdeburg. And they said, of course, it's
00:17:35.781 --> 00:17:41.620
not a scam because the click workers
are freely clicking on pages. So, yeah,
00:17:41.620 --> 00:17:47.640
kind of nobody cares. But PaidLikes, this
is only the tip of the iceberg.
00:17:47.640 --> 00:17:58.520
Philip: So we also wanted to dive a little
bit into this fake like universe outside
00:17:58.520 --> 00:18:05.780
of PaidLikes and to see what else is out
there. And so we did an analysis of
00:18:05.780 --> 00:18:12.780
account creation on Facebook. So what
Facebook is saying about account creation
00:18:12.780 --> 00:18:19.299
is that they are very effective against
fake accounts. So they say they remove
00:18:19.299 --> 00:18:26.330
billions of accounts each year, and that
most of these accounts never reach any
00:18:26.330 --> 00:18:33.000
real users and they remove them before
they get reported. So what Facebook
00:18:33.000 --> 00:18:39.080
basically wants to tell you is that they
have it under control. However, there are
00:18:39.080 --> 00:18:45.700
a number of reports that suggest
otherwise. For example, recently a NATO
00:18:45.700 --> 00:18:53.630
StratCom task force released a report where
they actually bought 54000 likes, 54000
00:18:53.630 --> 00:19:02.220
social media interactions for just 300
Euros. So this is a very low price. And I
00:19:02.220 --> 00:19:07.169
think you wouldn't expect such a low price
if it were hard to get that many
00:19:07.169 --> 00:19:15.880
interactions. They bought 3500 comments,
25000 likes, 20000 views and 5100
00:19:15.880 --> 00:19:22.991
followers. Everything for just 300 Euros.
So, you know, the thing they have in
00:19:22.991 --> 00:19:32.050
common, they are cheap, the fake likes and
the fake interactions. And
00:19:32.050 --> 00:19:38.470
there was also another report from Vice
Germany recently. And they reported on
00:19:38.470 --> 00:19:46.410
some interesting facts about automated
fake accounts. They reported on findings
00:19:46.410 --> 00:19:50.980
that suggest that people actually use
hacked Internet of Things
00:19:50.980 --> 00:19:59.150
devices to create these
fake accounts and to manage them. And so
00:19:59.150 --> 00:20:04.590
it's actually kind of interesting to think
about it this way: OK, maybe in the next
00:20:04.590 --> 00:20:11.020
election your fridge is actually going to
support the other candidate on Facebook.
00:20:11.020 --> 00:20:16.970
And so we also wanted to look into this
and we wanted to go a step further and to
00:20:16.970 --> 00:20:24.660
look at who these people are. Who are
they, and what are they doing on
00:20:24.660 --> 00:20:32.200
Facebook? And so we actually examined the
profiles of purchased likes. For this we
00:20:32.200 --> 00:20:38.390
created four comments under arbitrary
posts, and then we bought likes for these
00:20:38.390 --> 00:20:46.500
comments, and then we examined the
resulting profiles of the fake likes. So
00:20:46.500 --> 00:20:51.050
it was pretty cheap to buy these likes.
Comment likes are always a little bit more
00:20:51.050 --> 00:20:59.520
expensive than other likes. And we found
all these offerings on Google and we paid
00:20:59.520 --> 00:21:08.169
with PayPal. So we actually used a pretty
neat trick to estimate the age of these
00:21:08.169 --> 00:21:16.490
fake accounts. So as you can see here, the
Facebook user ID is incremented. So
00:21:16.490 --> 00:21:24.250
Facebook started in 2009 to use
incrementing Facebook IDs, and they use this
00:21:24.250 --> 00:21:31.780
pattern of 1 0 0 0 and then the
incremented number. And as you can see, in
00:21:31.780 --> 00:21:40.200
2009 this incremented number was very
close to zero. And then today it is close
00:21:40.200 --> 00:21:49.559
to 40 billion. And in this time period,
you can see that you can kind of get a
00:21:49.559 --> 00:21:56.770
rather fitting line through all these
points. And you can see that the
00:21:56.770 --> 00:22:02.710
account IDs are in fact
incremented over time. So we
00:22:02.710 --> 00:22:08.670
can use this fact in reverse to estimate
the creation date of an account where we
00:22:08.670 --> 00:22:15.340
know the Facebook ID. And that's exactly
what we did with these fake likes. So we
00:22:15.340 --> 00:22:22.090
estimated the account creation dates. And
as you can see, we get kind of different
00:22:22.090 --> 00:22:28.929
results from different services. For
example, PaidLikes, they had rather old
00:22:28.929 --> 00:22:35.750
accounts. So this means they use very
authentic accounts. And we already know
00:22:35.750 --> 00:22:41.370
that because we talked to them. So these
are very authentic accounts. Also like
00:22:41.370 --> 00:22:46.660
service A over here uses very, very
authentic accounts. But on the other hand,
00:22:46.660 --> 00:22:52.160
like service B uses very new accounts,
they were all created in the last three
00:22:52.160 --> 00:22:58.280
years. So if you look at the accounts and
also from these numbers, we think that
00:22:58.280 --> 00:23:06.510
these accounts were bots and on service C
it's kind of not clear: are
00:23:06.510 --> 00:23:10.870
these accounts bots or are these
clickworkers? Maybe it's a mixture of
00:23:10.870 --> 00:23:17.820
both, we don't know exactly for sure. But
this is an interesting metric to measure
00:23:17.820 --> 00:23:23.390
the age of the accounts to determine if
some of them might be bots. And that's
00:23:23.390 --> 00:23:29.340
exactly what we did on this page. So this
is actually a page for garden furniture
00:23:29.340 --> 00:23:36.750
and we found it in our list that we got
from PaidLikes. So obviously
00:23:36.750 --> 00:23:43.970
they were on this list for bought likes on
Facebook, on PaidLikes. And they caught
00:23:43.970 --> 00:23:51.000
our eye because they had one million
likes. And that's rather unusual for a
00:23:51.000 --> 00:24:01.260
shop for garden furniture in Germany. And
so we looked at this page further and we
00:24:01.260 --> 00:24:07.390
noticed other interesting things. For
example, their posts, all the time,
00:24:07.390 --> 00:24:13.820
got thousands of likes. And
that's also kind of unusual for a garden
00:24:13.820 --> 00:24:19.590
furniture shop. And so we looked into the
likes and as you can see, they all look
00:24:19.590 --> 00:24:26.790
like they come from Southeast Asia and
they don't look very authentic. And we
00:24:26.790 --> 00:24:32.460
were actually able to estimate the
creation dates of these accounts. And we
00:24:32.460 --> 00:24:36.700
found that most of these accounts that
were used for liking these posts on this
00:24:36.700 --> 00:24:44.130
page were actually created in the last
three years. So this is a page where
00:24:44.130 --> 00:24:49.540
everything, from the number of people who
like the page to the number of people who
00:24:49.540 --> 00:24:55.559
like the posts, is complete fraud. So
nothing about this is real. And it's
00:24:55.559 --> 00:25:02.380
obvious that this can happen on Facebook
and that this is a really, really big
00:25:02.380 --> 00:25:08.309
problem. I mean, this is a shop
for garden furniture. They
00:25:08.309 --> 00:25:14.580
probably don't have such huge sums of
money. So it was probably very cheap to
00:25:14.580 --> 00:25:22.170
buy this number of fake accounts. And it
is really shocking to see how big
00:25:22.170 --> 00:25:31.179
the scale is of these kinds of
operations. And so what we have to say is,
00:25:31.179 --> 00:25:39.970
OK, when Facebook says they have it under
control, we have to doubt that. So now we
00:25:39.970 --> 00:25:46.320
can look at the bigger picture. And what
we are going to do here is we are going to
00:25:46.320 --> 00:25:52.700
use this same graph that we used before to
estimate the creation dates, but in a
00:25:52.700 --> 00:25:59.080
different way. So we can actually see
the lowest and the highest points of
00:25:59.080 --> 00:26:05.090
Facebook IDs in this graph. So we know the
newest Facebook ID by creating a new
00:26:05.090 --> 00:26:13.200
account. And we know the lowest ID because
it's zero. And then we know that there are
00:26:13.200 --> 00:26:20.780
40 billion Facebook IDs. Now, in the next
step, we took a sample, a random sample
00:26:20.780 --> 00:26:27.610
from these 40 billion Facebook IDs. And
inside of the sample, we checked if these
00:26:27.610 --> 00:26:33.740
accounts exist, if this ID corresponds to
an existing account. And we do that because
00:26:33.740 --> 00:26:39.360
we obviously cannot check 40 billion
accounts and 40 billion IDs, but we can
00:26:39.360 --> 00:26:45.720
check a small sample of
these IDs and then estimate the number
00:26:45.720 --> 00:26:54.470
of existing accounts on Facebook in
total. So for this, we repeatedly accessed
00:26:54.470 --> 00:27:02.770
the same sample of one million random IDs
over the course of one year. And we also
00:27:02.770 --> 00:27:10.100
pulled a sample of 10 million random IDs
for closer analysis this July. And now
00:27:10.100 --> 00:27:15.950
Dennis is going to tell you how we did it.
Dennis: Yeah. Well, pretty interesting,
00:27:15.950 --> 00:27:21.160
pretty interesting results so far, right?
So we again implemented a crawler, this
00:27:21.160 --> 00:27:26.530
second time for gathering public Facebook
information, the public Facebook account
00:27:26.530 --> 00:27:35.730
data. And, yeah, this was not as easy as
in the first case. It's not
00:27:35.730 --> 00:27:45.059
surprising that Facebook is using a lot of
measures to try to block the automated
00:27:45.059 --> 00:27:52.460
crawling of the Facebook page, for example
with IP blocking or CAPTCHAs. But
00:27:52.460 --> 00:27:59.929
we could
solve this problem pretty easily by using
00:27:59.929 --> 00:28:06.980
the Tor Anonymity Network. So every time
our IP got blocked while crawling the data,
00:28:06.980 --> 00:28:14.480
we just made a new Tor connection and
changed the IP. The same worked for the
00:28:14.480 --> 00:28:21.440
CAPTCHAs. And with this easy method, we
were able to crawl
00:28:21.440 --> 00:28:26.020
all the public Facebook data. And
let's have a look at two examples. The
00:28:26.020 --> 00:28:36.890
first example is facebook.com/4. So a
very, very small Facebook ID. Yeah, in
00:28:36.890 --> 00:28:41.790
this case, we are redirected,
check the response and find a valid
00:28:41.790 --> 00:28:50.070
account page. And does anyone know which
account this is? Mark Zuckerberg? Yeah,
00:28:50.070 --> 00:28:55.360
that's correct. This is a public
account for Mark Zuckerberg. Number four,
00:28:55.360 --> 00:29:01.679
as we already saw, the other
IDs are really high. But he got the number
00:29:01.679 --> 00:29:10.690
four. Second example was facebook.com/3.
In this case, we are not redirected. And
00:29:10.690 --> 00:29:17.760
this means that it is an invalid account.
And that was really easy to confirm with a
00:29:17.760 --> 00:29:23.740
quick Google search. And it was a test
account from the beginning of Facebook. So
00:29:23.740 --> 00:29:31.059
we did not get redirected. And it's just
the login page from Facebook. And with
00:29:31.059 --> 00:29:38.500
these examples, we did a
lot more experiments. And at the end, we
00:29:38.500 --> 00:29:46.970
were able to build this tree. And, yeah,
this tree represents the high level
00:29:46.970 --> 00:29:53.059
approach of our scraper. So in the,
What's that?
00:29:53.059 --> 00:29:56.340
Svea: Okay. Sleeping.
Laughing
00:29:56.340 --> 00:30:07.090
Dennis: Yeah. We have still time. Right.
So what? Okay, so everyone is waking up
00:30:07.090 --> 00:30:16.680
again. Oh, yeah. In the first step we request
the URL www.facebook.com/FID. If we
00:30:16.680 --> 00:30:24.650
get redirected in this case, then we check
if the page is an account page. If
00:30:24.650 --> 00:30:31.270
it's an account page, then it's a public
account like the example 4 and we were
00:30:31.270 --> 00:30:39.890
able to save the raw data, the raw HTML
source. If it's not an account page,
00:30:39.890 --> 00:30:45.070
then it is
not a public account and we are not able
00:30:45.070 --> 00:30:52.580
to save any data. And if we
do not get redirected in the
00:30:52.580 --> 00:31:01.630
first step, then we request the second
URL, facebook.com/profile.php?id=FID
00:31:01.630 --> 00:31:09.289
with a mobile user agent. And if we get
redirected then, then again, it is a
00:31:09.289 --> 00:31:14.990
nonpublic profile and we cannot save
anything. And if we do not get
00:31:14.990 --> 00:31:22.710
redirected, it is an invalid profile and
it is most often a deleted account. Yeah.
00:31:22.710 --> 00:31:29.390
And yeah, that's the high level overview
of our scraper. And Philip will now give
00:31:29.390 --> 00:31:32.340
some more information on interesting
results.
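NOTE
Editorial sketch, not the speakers' scraper: the decision tree described above can be written as a small pure function. The three boolean inputs stand for observations the scraper makes (was the first request redirected, is the resulting page an account page, was the second request with a mobile user agent redirected). The enum and function names are hypothetical, and how "is an account page" was detected is not stated in the talk.

```python
from enum import Enum


class ProfileStatus(Enum):
    PUBLIC = "public account"
    NON_PUBLIC = "non-public account"
    INVALID = "invalid, most often deleted"


def classify(
    first_redirected: bool,
    is_account_page: bool = False,
    second_redirected: bool = False,
) -> ProfileStatus:
    """Classify one Facebook ID from the scraper's observations.

    Step 1: request www.facebook.com/FID.
      redirected and an account page      : public account, page source saved
      redirected but not an account page  : non-public, nothing saved
    Step 2 (only if step 1 did not redirect): request
      facebook.com/profile.php?id=FID with a mobile user agent.
      redirected      : non-public profile, nothing saved
      not redirected  : invalid profile, most often a deleted account
    """
    if first_redirected:
        return ProfileStatus.PUBLIC if is_account_page else ProfileStatus.NON_PUBLIC
    return ProfileStatus.NON_PUBLIC if second_redirected else ProfileStatus.INVALID


if __name__ == "__main__":
    # The two examples from the talk: ID 4 is a public account, ID 3 is invalid.
    print(classify(first_redirected=True, is_account_page=True))
    print(classify(first_redirected=False, second_redirected=False))
```

Keeping the classification separate from the network code makes each branch of the tree easy to check without contacting Facebook.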
00:31:32.340 --> 00:31:38.820
Philip: So the most interesting result of
this scraping of the sample of Facebook
00:31:38.820 --> 00:31:47.070
IDs was that one in four Facebook IDs
corresponds to a valid account. And you
00:31:47.070 --> 00:31:53.559
can do the math. There are 40 billion
Facebook IDs, so there must be 10 billion
00:31:53.559 --> 00:32:00.170
registered users on Facebook. And this
means that there are more registered users
00:32:00.170 --> 00:32:08.140
on Facebook than there are humans on
Earth. And also, it means that it's even
00:32:08.140 --> 00:32:12.460
worse than that because not everybody on
Earth can have a Facebook account because
00:32:12.460 --> 00:32:17.370
you need a smartphone for
that. And many people don't have those. So
00:32:17.370 --> 00:32:22.270
this is actually a pretty high number and
it's very unexpected. So in July 2019,
00:32:22.270 --> 00:32:29.059
there were more than ten billion Facebook
accounts. Also, we did further research on
00:32:29.059 --> 00:32:36.429
the timeframe between October 2018 and
today, or this month. And we found that in
00:32:36.429 --> 00:32:43.140
this timeframe there were 2 billion new
registered Facebook accounts. So this is
00:32:43.140 --> 00:32:48.679
like the timeframe of one year, more or
less. And in a similar timeframe, the
00:32:48.679 --> 00:32:58.899
monthly active user base rose by only 187
million. Facebook deleted 150 million
00:32:58.899 --> 00:33:05.419
older accounts between October 2018 and
July 2019. And we know that because we
00:33:05.419 --> 00:33:11.460
pulled the same sample over a longer
period of time. And then we watched for
00:33:11.460 --> 00:33:16.230
accounts that got deleted in the sample.
And that enables us to estimate this
00:33:16.230 --> 00:33:23.400
number of 150 million accounts that got
deleted that are basically older than our
00:33:23.400 --> 00:33:31.890
sample. So I made some nice graphs for
your viewing pleasure. So, again, of the
00:33:31.890 --> 00:33:40.919
older accounts, just 150 million were
deleted since October 2018. These are
00:33:40.919 --> 00:33:46.350
accounts that are older than last year.
And Facebook claims that since then, about
00:33:46.350 --> 00:33:52.789
7 billion accounts got deleted from their
platform, which is vastly more than these
00:33:52.789 --> 00:33:58.370
older accounts. And that's why we
think that Facebook mostly deleted these
00:33:58.370 --> 00:34:06.770
newer accounts. And if an account is older
than a certain age, then it is very
00:34:06.770 --> 00:34:13.069
unlikely that it gets deleted. And also, I
think you can see the scales here. So, of
00:34:13.069 --> 00:34:17.960
course, the registered users are not the
same thing as active users, but you can
00:34:17.960 --> 00:34:23.290
still see that there are many more
registrations of new users than there
00:34:23.290 --> 00:34:30.139
are new active
users during the last year. So what does
00:34:30.139 --> 00:34:37.909
this all mean? Does it mean that Facebook
gets flooded by fake accounts? We don't
00:34:37.909 --> 00:34:42.980
really know. We only know these numbers.
What Facebook is telling us is that they
00:34:42.980 --> 00:34:50.409
only count and publish active users. As I
already said, there is a disconnect
00:34:50.409 --> 00:34:56.759
between registered users and
active users and Facebook only reports on
00:34:56.759 --> 00:35:04.289
the active users. Also, they say that
users register accounts, but they don't
00:35:04.289 --> 00:35:10.519
verify them or they don't use them, and
that's how this number gets so high. But I
00:35:10.519 --> 00:35:19.319
think that's not really explaining
these high numbers, because they are
00:35:19.319 --> 00:35:26.469
orders of magnitude larger than
anything that this could cause. Also, they
00:35:26.469 --> 00:35:31.819
say that they regularly delete fake
accounts. But we have seen that these are
00:35:31.819 --> 00:35:37.519
mostly accounts that get deleted directly
after their creation. And if they survive
00:35:37.519 --> 00:35:46.170
long enough, then they are getting
through. So what does this all mean?
00:35:46.170 --> 00:35:55.390
Svea: Okay, so you got the full load,
which I had like over two or three months.
00:35:55.390 --> 00:36:02.869
And for me, one very big
conclusion was that we have some kind of
00:36:02.869 --> 00:36:08.530
broken metric here, that all the likes and
all the hearts on Instagram and the
00:36:08.530 --> 00:36:13.650
followers can so easily be
manipulated. And then it's so hard to
00:36:13.650 --> 00:36:19.029
tell in some cases, it's so hard to tell
if they are real or not real. And this
00:36:19.029 --> 00:36:26.160
opens the gate for manipulation and yes,
untrueness. And for economic losses, if
00:36:26.160 --> 00:36:33.109
you think as somebody who is investing
money, or as an advertiser, for
00:36:33.109 --> 00:36:40.170
example. And in the very end, it is a case
of eroding trust, which means that we
00:36:40.170 --> 00:36:45.739
cannot trust these numbers anymore. These
numbers are, you know, they are so easily
00:36:45.739 --> 00:36:53.799
manipulated. And why should we trust this?
And this has a severe consequence for all
00:36:53.799 --> 00:36:59.420
the social networks. If you are still in
them. So what can be a solution? And
00:36:59.420 --> 00:37:05.150
Philip, you thought about that.
Philip: So basically we have two
00:37:05.150 --> 00:37:11.410
problems. One is click workers and one is
fakes. Click workers are basically just
00:37:11.410 --> 00:37:18.420
hyperactive users, and they are selling
their hyperactivity. And so what social
00:37:18.420 --> 00:37:23.660
networks could do is just make
interactions scarce, so just lower the
00:37:23.660 --> 00:37:29.180
value of more interactions. If you are a
hyperactive user, then your interactions
00:37:29.180 --> 00:37:34.240
should count less than the interactions of
a less active user.
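A minimal sketch of the "make interactions scarce" idea described here. It assumes a hypothetical rule in which every user has a budget of one vote per day that is split evenly across all of their interactions. The function names and the weighting rule are illustrative assumptions, not something any social network is known to use.

```python
from collections import Counter

def interaction_weights(interactions):
    """Return a per-interaction weight for each user.

    `interactions` is a list of (user, page) pairs for one day.
    Hypothetical rule: each user's interactions share a total budget of 1.0,
    so a user with n interactions gets weight 1/n per interaction.
    """
    counts = Counter(user for user, _page in interactions)
    return {user: 1.0 / n for user, n in counts.items()}

def weighted_likes(interactions, page):
    """Sum the weights of all interactions that target `page`."""
    weights = interaction_weights(interactions)
    return sum(weights[user] for user, target in interactions if target == page)

# One ordinary user likes a single page; one click worker likes 100 pages.
day = [("alice", "shop")]
day += [("worker", f"page{i}") for i in range(99)]
day += [("worker", "shop")]

# alice counts fully, the click worker only with 1/100 of a like
print(weighted_likes(day, "shop"))
```

With a rule like this, selling hyperactivity pays less and less per like. It does nothing against one person running many accounts, which is the authenticity problem discussed next.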
00:37:34.240 --> 00:37:39.229
Mumbling
That's kind of solvable, I think. The real
00:37:39.229 --> 00:37:46.890
problem is authenticity. So if
you get stopped from posting or liking
00:37:46.890 --> 00:37:52.640
hundreds of pages a day, then maybe you
just create multiple accounts and operate
00:37:52.640 --> 00:37:58.599
them simultaneously. And this can only be
solved by authenticity. So this can only
00:37:58.599 --> 00:38:04.990
be solved if you know that the person who
is operating the account is just one
00:38:04.990 --> 00:38:10.569
person operating one account. And this
is really hard to do, because Facebook
00:38:10.569 --> 00:38:14.940
doesn't know who is clicking. Is it a bot?
Is it a clickworker, or is it one
00:38:14.940 --> 00:38:20.410
clickworker for ten accounts? How does
this work? And so this is really hard for
00:38:20.410 --> 00:38:27.609
the social media companies to do.
And you could say, OK, let's send in the
00:38:27.609 --> 00:38:32.359
passport or something like that to prove
authenticity. But that's actually not a
00:38:32.359 --> 00:38:37.109
good idea because nobody wants to send
their passport to Facebook. And so this is
00:38:37.109 --> 00:38:42.359
really a hard problem that has to be
solved if we want to use social
00:38:42.359 --> 00:38:49.750
media in a meaningful way. And so this is
what companies could do. And now...
00:38:49.750 --> 00:38:53.200
Svea: But what you
could do. Okay. Of course, you can delete
00:38:53.200 --> 00:38:56.469
your Facebook account or your Instagram
account and stop.
00:38:56.469 --> 00:39:01.299
Slight Applause, Laughing
Svea: Yeah! Stay away from social media.
00:39:01.299 --> 00:39:08.959
But this is maybe not a solution for all
of us. So I think: be aware, of course.
00:39:08.959 --> 00:39:17.499
Spread the word, tell others. And if
you like, and you get more
00:39:17.499 --> 00:39:24.019
intelligence about that, we are really
happy to dig deeper into these networks. And
00:39:24.019 --> 00:39:30.180
we will go on investigating. And so,
last but not least, it's time to say thank you
00:39:30.180 --> 00:39:33.349
to you guys. Thank you very much for
listening.
00:39:33.349 --> 00:39:40.089
Applause
Svea: And we did not do this alone. We are
00:39:40.089 --> 00:39:44.849
not three people. There are many more
standing behind and doing this, this
00:39:44.849 --> 00:39:50.709
beautiful research. And we are opening now
for questions, please.
00:39:50.719 --> 00:39:55.429
Herald: Yes. Please, thank Svea, Phil and
Dennis again.
00:39:55.429 --> 00:40:05.519
Applause
And we have microphones out
00:40:05.519 --> 00:40:09.680
here in the room, about nine of them,
actually. If you line up behind them to
00:40:09.680 --> 00:40:15.780
ask a question, remember that a question
is a sentence with a question mark behind
00:40:15.780 --> 00:40:20.500
it. And I think I see somebody at number
three. So let's start with that.
00:40:20.500 --> 00:40:25.979
Question: Hi. I just have a little
question. Wouldn't a dislike button, the
00:40:25.979 --> 00:40:30.749
concept of a dislike button, wouldn't that
be a solution to all the problems?
00:40:30.749 --> 00:40:38.039
Philip: So we thought about recommending
that Facebook ditches the like button
00:40:38.039 --> 00:40:42.299
altogether. I think that would be a better
solution than a dislike button, because a
00:40:42.299 --> 00:40:47.079
dislike button could also be manipulated
and it would be even worse because you
00:40:47.079 --> 00:40:54.119
could actually manipulate the network into
down ranking posts or kind of not showing
00:40:54.119 --> 00:41:00.670
posts to somebody. And that, I think would
be even worse. I imagine what dictators
00:41:00.670 --> 00:41:08.209
would do with that. And so I think the
best option would be to actually not show
00:41:08.209 --> 00:41:18.029
like counts anymore, and in this way
to actually make people not invest in
00:41:18.029 --> 00:41:25.199
these counts once they become meaningless.
Herald: I think I see microphone 7, up
00:41:25.199 --> 00:41:28.109
there.
Question: Hello. So one question I had is
00:41:28.109 --> 00:41:37.210
you assigned creation dates to IDs. How
did you do this?
00:41:37.210 --> 00:41:52.489
Philip: So, we actually knew the creation
date of some accounts. And then we kind of
00:41:52.489 --> 00:41:58.210
interpolated between the creation dates
and the IDs. So you see this black line
00:41:58.210 --> 00:42:04.109
there. That's actually our, our
interpolation. And with this black line,
00:42:04.109 --> 00:42:10.910
we can then estimate the creation dates
for IDs that we do not yet know, because
00:42:10.910 --> 00:42:17.430
it kind of fills in the gaps.
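A minimal sketch of the interpolation described in this answer. It assumes that account IDs grow roughly in step with creation time and that a handful of (ID, creation date) pairs are known. The function name and the sample values are hypothetical, and plain piecewise-linear interpolation stands in for whatever fitting the researchers actually used.

```python
from bisect import bisect_left
from datetime import datetime

def estimate_creation_date(account_id, known):
    """Estimate a creation date by linear interpolation between known points.

    `known` is a list of (id, datetime) pairs sorted by strictly increasing id.
    IDs outside the known range are rejected instead of extrapolated.
    """
    ids = [i for i, _ in known]
    if not ids[0] <= account_id <= ids[-1]:
        raise ValueError("ID outside the known range")
    pos = bisect_left(ids, account_id)
    if ids[pos] == account_id:
        return known[pos][1]  # exact match, no interpolation needed
    (id0, date0), (id1, date1) = known[pos - 1], known[pos]
    fraction = (account_id - id0) / (id1 - id0)
    return date0 + (date1 - date0) * fraction

# Hypothetical reference points, not real Facebook data.
known_points = [
    (1000, datetime(2018, 1, 1)),
    (2000, datetime(2018, 1, 3)),
    (6000, datetime(2018, 1, 5)),
]
print(estimate_creation_date(1500, known_points))
print(estimate_creation_date(4000, known_points))
```

The denser the known points, the less a sudden burst of registrations between two of them distorts the estimate.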
Q: Follow up question, do you know why
00:42:17.430 --> 00:42:20.310
there are some points outside of this
graph?
00:42:20.310 --> 00:42:23.999
Philip: No.
Q: No? Thank you.
00:42:23.999 --> 00:42:26.400
Herald: So there was a question from the
Internet.
00:42:26.400 --> 00:42:33.723
Question: Did you report your findings to
Facebook? And did they do anything?
00:42:33.723 --> 00:42:41.509
Svea: Because this research is very new,
we, we just recently approached them and
00:42:41.509 --> 00:42:47.190
showed them the research and we got an
answer. But I think we also already showed
00:42:47.190 --> 00:42:54.480
the answer. It was, I think, that
they only count and publish active users.
00:42:54.480 --> 00:42:59.680
They did not want to tell us
how many registered users they have, and
00:42:59.680 --> 00:43:03.859
they say, oh, sometimes users register
accounts, but don't use them or verify
00:43:03.859 --> 00:43:08.930
them. And that they regularly delete fake
accounts. But we hope that we get into a
00:43:08.930 --> 00:43:12.469
closer discussion with them soon about
this.
00:43:12.469 --> 00:43:19.469
Herald: Microphone two.
Question: When hunting down the buyers of
00:43:19.469 --> 00:43:26.740
the campaigns, did you dig out your own
campaign?
Svea: No,
00:43:26.740 --> 00:43:34.039
because they stopped scraping in August.
You stopped scraping in August. And
00:43:34.039 --> 00:43:39.449
then I started, you know, the whole
project started with them coming to us
00:43:39.449 --> 00:43:44.599
with the list. And then we thought, oh,
this is very interesting. And then the
00:43:44.599 --> 00:43:50.729
whole journalistic research started.
But I think if we did
00:43:50.729 --> 00:43:56.200
it again, of course, I think we would find
ourselves. We also found there was another
00:43:56.200 --> 00:44:01.650
magazine, and they also did a paid
test a couple of years ago. And we found
00:44:01.650 --> 00:44:04.920
their campaign.
Philip: So we actually did another
00:44:04.920 --> 00:44:11.480
test. And for the other test, I noted we
also got like this ID, I think. And it
00:44:11.480 --> 00:44:20.329
worked to plug it into the URL and then we
also got redirected to our own page. So
00:44:20.329 --> 00:44:22.569
that worked.
Q: Thank you.
00:44:22.569 --> 00:44:26.379
Herald: Microphone three.
Question: Hi. I'm Farhan, I'm a Pakistani
00:44:26.379 --> 00:44:30.759
journalist. And first of all, I would like
to say that you were right when you said
00:44:30.759 --> 00:44:34.910
that there might be people sitting in
Pakistan clicking on the likes. That does
00:44:34.910 --> 00:44:41.329
happen. But my question would be that
Facebook does have its own ad program that
00:44:41.329 --> 00:44:47.470
it aggressively pushes. And in that ad
program, there are also options whereby
00:44:47.470 --> 00:44:53.701
people can buy likes and comments and
impressions and reactions. Would
00:44:53.701 --> 00:44:59.670
you also consider those as fake? I mean,
they're not fake, per se, but they're
00:44:59.670 --> 00:45:05.799
still bought likes. So what's your view on
those? Thank you.
00:45:05.799 --> 00:45:14.349
Philip: So, when you buy ads on Facebook,
then what you actually want
00:45:14.349 --> 00:45:19.489
to have is fans for your page that are
actually interested in your page. So
00:45:19.489 --> 00:45:25.460
that's kind of the difference, I think, to
the paid likes system, where the
00:45:25.460 --> 00:45:30.119
people themselves, they get paid for
liking stuff that they wouldn't normally
00:45:30.119 --> 00:45:35.599
like. So I think that's the fundamental
difference between the two programs. And
00:45:35.599 --> 00:45:40.529
that's why I think that one is unethical.
And one is not really that unethical.
00:45:40.529 --> 00:45:47.749
Svea: The real problem is, if you
buy these click workers, then you have
00:45:47.749 --> 00:45:52.789
many people on your fan page. They are not
interested in you. They don't care about
00:45:52.789 --> 00:45:57.410
you. They don't look at your products.
They don't look at your political party.
00:45:57.410 --> 00:46:03.539
And then often the people
additionally run Facebook ads, and
00:46:03.539 --> 00:46:08.229
these ads are shown, again, to the
click workers, and they don't look at them.
00:46:08.229 --> 00:46:13.410
So, you know, people, they are burning
money and money and money with this whole
00:46:13.410 --> 00:46:18.069
corrupt system.
Herald: So, microphone two.
00:46:18.069 --> 00:46:22.039
Question: Hi. Thanks. Thanks for the talk
and thanks for the effort of going through
00:46:22.039 --> 00:46:27.709
all of this project. From my
understanding, this whole finding
00:46:27.709 --> 00:46:35.209
basically undermines the trust in
Facebook's likes in general, per se. So I
00:46:35.209 --> 00:46:42.369
would expect now the price of likes to
drop and the pay for click workers to drop
00:46:42.369 --> 00:46:49.250
as well. Do you have any metrics on that?
Svea: The research just went public, I
00:46:49.250 --> 00:46:56.180
think one week ago. So what we have
seen as an effect is that Facebook
00:46:56.180 --> 00:47:02.940
excluded paid likes for a moment. So,
yes, of course, one platform is down. But
00:47:02.940 --> 00:47:08.010
I think there are so many outside. There
are so many. So I think...
00:47:08.010 --> 00:47:14.229
Q: I meant the phenomenon of paid likes,
not the company itself. Like the value of
00:47:14.229 --> 00:47:19.319
a like as a measure of credibility...
Philip: We didn't...
00:47:19.319 --> 00:47:22.829
Q: ...is declining now. That's my, that's
my...
00:47:22.829 --> 00:47:27.869
Svea: Yes. That's why many people are
buying Instagram hearts now. So, so, yes,
00:47:27.869 --> 00:47:32.900
that's true. The like is not the fancy hot
shit anymore. Yes. And we also saw in the
00:47:32.900 --> 00:47:40.670
data that the likes for the fan pages,
they rapidly went down and the likes for
00:47:40.670 --> 00:47:45.229
the posts and the comments, they went up.
So I think, yes, there is a shift. And
00:47:45.229 --> 00:47:51.809
what we also saw in that data was that the
Facebook likes, they went down from
00:47:51.809 --> 00:47:57.839
2016. They are rapidly down. And what is
growing and rising is YouTube and
00:47:57.839 --> 00:48:01.609
Instagram. Now, everything is about,
today, everything is about Instagram.
00:48:01.609 --> 00:48:05.270
Q: Thanks.
Herald: So let's go to number one.
00:48:05.270 --> 00:48:09.630
Question: Hello and thank you very much
for this fascinating talk, because I've
00:48:09.630 --> 00:48:15.400
been following this whole topic for a
while. And I was wondering if you were
00:48:15.400 --> 00:48:20.849
looking also into the demographics, in
terms of age groups and social class, not
00:48:20.849 --> 00:48:25.619
of the people who were doing the actual
liking, but actually, you know, buying
00:48:25.619 --> 00:48:31.249
these likes. Because I think that what is
changing is an entire social discourse on
00:48:31.249 --> 00:48:36.709
social capital, in the Bourdieusian sense of the
term, because it can now be quantified. As
00:48:36.709 --> 00:48:43.650
a teacher, I hear of kids who buy likes to
be more popular than their other
00:48:43.650 --> 00:48:47.880
schoolmates. So I'm wondering if you're
looking into that, because I think that's
00:48:47.880 --> 00:48:52.559
a fascinating area to actually
come up with numbers about it.
00:48:52.559 --> 00:48:59.229
Svea: It definitely is. And we were all so
fascinated by this data set of 90,000 data
00:48:59.229 --> 00:49:05.479
points. And what we did, and this was
very hard, was that we tried, first
00:49:05.479 --> 00:49:11.869
of all, to look at who is buying likes, like
automotives, you know,
00:49:11.869 --> 00:49:18.910
what kind of branches? Who
is in that? And so this was
00:49:18.910 --> 00:49:24.769
doable. But to get more into demographics,
you would have had to crawl, to
00:49:24.769 --> 00:49:33.699
click every page. And so we did not do
this. What we did was, of course, that we
00:49:33.699 --> 00:49:38.489
were a team of three to ten people
manually looking into it. And what we,
00:49:38.489 --> 00:49:43.739
of course, saw was that on Instagram and on
YouTube, you have many of these very young
00:49:43.739 --> 00:49:47.219
people. Some of them, I actually called
them and they were like, Yes, I bought
00:49:47.219 --> 00:49:54.089
likes. Very bad idea. So I think yes, I
think there is a demographic shift away
00:49:54.089 --> 00:49:59.890
from the companies and the automotive and
industries buying Facebook fan page likes
00:49:59.890 --> 00:50:04.390
to Instagram and YouTube wannabe-
influencers.
00:50:04.390 --> 00:50:06.430
Q: Influencers, influencer culture is
obviously...
00:50:06.430 --> 00:50:12.670
Svea: Yes. And I have to admit here we, we
showed you the political side, but we have
00:50:12.670 --> 00:50:19.849
to admit that the political likes, they
were like this small in the numbers. And
00:50:19.849 --> 00:50:25.640
the very, very vast majority of this data
set, it's about wedding planners,
00:50:25.640 --> 00:50:31.440
photography, tattoo studios and
influencers, influencers, influencers and
00:50:31.440 --> 00:50:34.479
YouTubers, of course.
Q: Yes. Thank you so much.
00:50:34.479 --> 00:50:37.439
Herald: So we have a lot of questions in
the room. I'm going to get to you as soon
00:50:37.439 --> 00:50:40.009
as we can. I'd like to go to the Internet
first.
00:50:40.009 --> 00:50:44.680
Signal Angel: Do you think this will get
better or worse if people move to more
00:50:44.680 --> 00:50:48.319
decentralized platforms?
Philip: To more what?
00:50:48.319 --> 00:50:54.910
Svea: If it gets better or worse.
Dennis: Can you repeat that, please?
00:50:54.910 --> 00:50:58.880
Herald: Would this issue get better or
worse if people move to a more
00:50:58.880 --> 00:51:01.239
decentralized platform?
Philip: Decentralized. Decentralized,
00:51:01.239 --> 00:51:12.160
okay. So, I mean, we can look at
this slide, I think, and think about
00:51:12.160 --> 00:51:18.249
whether decentralized platforms would
change either of these two
00:51:18.249 --> 00:51:25.999
points here. And I fear, I don't think so,
because they cannot solve the interactions
00:51:25.999 --> 00:51:30.210
problem that people can be hyperactive.
Actually, that's kind of a normal thing
00:51:30.210 --> 00:51:34.299
with social media. A small portion of
social media users is much more active
00:51:34.299 --> 00:51:39.880
than everybody else. That's kind of. You
have that without paying for it. So
00:51:39.880 --> 00:51:44.720
without even having paid likes, you will
have to consider if social media is really
00:51:44.720 --> 00:51:51.189
kind of representative of the society.
But, and the other thing is authenticity.
00:51:51.189 --> 00:51:57.170
And also in a decentralized platform, you
could have multiple accounts run by the
00:51:57.170 --> 00:52:01.199
same person.
Herald: So, microphone seven, all the way
00:52:01.199 --> 00:52:06.779
back there.
Question: Hi. Do you know if Facebook even
00:52:06.779 --> 00:52:10.220
removes the likes when they delete fake
accounts?
00:52:10.220 --> 00:52:17.319
Svea: Do you know that?
Philip: No, we don't know that. No, we
00:52:17.319 --> 00:52:21.259
don't. We don't know. We know they delete
fake accounts, but we don't know if they
00:52:21.259 --> 00:52:27.619
also delete the likes. I know from our
research that the people we approached,
00:52:27.619 --> 00:52:31.329
they did not delete the click workers.
They get...
00:52:31.329 --> 00:52:35.839
Herald: Microphone two.
Question: Yeah. Hi. So I have a question
00:52:35.839 --> 00:52:41.359
with respect to this, one out of four
Facebook accounts are active in your, in
00:52:41.359 --> 00:52:46.949
your test. Did you see any difference with
respect to age of the accounts? So is it
00:52:46.949 --> 00:52:52.489
always one out of four across the entire
sample? Or does it maybe change
00:52:52.489 --> 00:52:57.730
going from a zero ID to,
well, 10 billion or 40 billion?
00:52:57.730 --> 00:53:02.189
Philip: So you're talking about the
density of accounts in our ID?
00:53:02.189 --> 00:53:05.989
Q: Kind of.
Philip: So there are changes over
00:53:05.989 --> 00:53:12.150
time. Yeah. So I think now it's
less than it was before. So now it is
00:53:12.150 --> 00:53:19.089
less, and before it was more.
Yeah. I don't know.
00:53:19.089 --> 00:53:23.660
Q: But you don't see anything specific
that now, only in the new accounts, only
00:53:23.660 --> 00:53:28.229
one out of 10 is active or valid and
before it was one out of two or something
00:53:28.229 --> 00:53:31.259
like that.
Philip: It's not that extreme. So it's
00:53:31.259 --> 00:53:34.859
less than that. It's kind of...
Dennis: We have to say we did not check
00:53:34.859 --> 00:53:41.239
this, but there were no special cases.
Philip: But it changed over time. So
00:53:41.239 --> 00:53:47.200
before it was more
and now it is less. And so what we checked
00:53:47.200 --> 00:53:54.710
was whether an ID actually corresponds to
an account. And so this metric, yeah. And
00:53:54.710 --> 00:53:57.299
it changed a little bit over time, but not
much.
00:53:57.299 --> 00:54:02.239
Herald: So, so number three, please.
Question: Yeah. Thank you for a very
00:54:02.239 --> 00:54:06.989
interesting talk. At the end, you gave
some recommendations, how to fix the
00:54:06.989 --> 00:54:11.769
metrics, right? And it's always nice to
have some metrics because then, well, we
00:54:11.769 --> 00:54:15.220
are the people who deal with the numbers.
So we want the metrics. But I want to
00:54:15.220 --> 00:54:20.309
raise the issue of whether a quantitative
measure is actually the right thing to do.
00:54:20.309 --> 00:54:26.449
So would you buy your furniture from store
A with 300 likes against store B with 200
00:54:26.449 --> 00:54:32.049
likes? Or would it not be better to have a
more qualitative thing? And to what extent
00:54:32.049 --> 00:54:38.259
is a quantitative measure maybe also the
source of a lot of bad developments we see
00:54:38.259 --> 00:54:43.390
in social media to begin with, even
without bot farms or anything, just
00:54:43.390 --> 00:54:48.339
people who go for the quick like and say
Hooray for Trump and then get, whatever,
00:54:48.339 --> 00:54:52.479
all the Trumpists liking that, and the
others say Fuck Trump and get all the
00:54:52.479 --> 00:54:57.229
non-Trumpists liking that, and you get all
the polarization, right? So, Instagram, I
00:54:57.229 --> 00:55:02.650
think they just don't display their
like equivalent anymore in order to
00:55:02.650 --> 00:55:04.929
prevent that, so could you maybe comment
on that?
00:55:04.929 --> 00:55:12.299
Svea: I think this is a good idea, to, to
hide the likes. Yes. But, you know, we
00:55:12.299 --> 00:55:17.799
talked to many clickworkers and they do a
lot of stuff. And what they also do is
00:55:17.799 --> 00:55:23.309
taking comments and copy-pasting them into
comment sections or Amazon reviews.
00:55:23.309 --> 00:55:29.789
So, you know, I think it's really hard to
get them out of the system because maybe
00:55:29.789 --> 00:55:34.390
if the likes are not shown and
the comments are what counts, then you will
00:55:34.390 --> 00:55:41.069
have people who are copy pasting comments
in the comments section. So I really think
00:55:41.069 --> 00:55:44.519
that the networks really have
an issue here.
00:55:44.519 --> 00:55:49.829
Herald: So let's try to squeeze in the last
three questions now. First, number seven,
00:55:49.829 --> 00:55:52.950
really quick.
Question: Very quick. Thank you for the
00:55:52.950 --> 00:55:58.799
nice insights. And I have a question about
the location of the users. So you made
00:55:58.799 --> 00:56:03.289
your point that you can analyze by the
metadata when the account was
00:56:03.289 --> 00:56:08.650
made. But how about the location of the
followers? Is there any way to analyze
00:56:08.650 --> 00:56:12.339
that as well?
Philip: So we can only analyze that if
00:56:12.339 --> 00:56:21.049
the users agreed to share it publicly and
not all of them do that. I think a
00:56:21.049 --> 00:56:26.460
name check is often a very good way to
check where somebody is from. For these
00:56:26.460 --> 00:56:32.190
fake likes, for example. But as I said, it
always depends on what the user himself is
00:56:32.190 --> 00:56:36.130
willing to share.
Herald: Internet?
00:56:36.130 --> 00:56:41.039
Signal Angel: Isn't this just the western
version of the Chinese social credit
00:56:41.039 --> 00:56:43.999
system? Where do we go from here? What is
the future of all this?
00:56:43.999 --> 00:56:54.089
Svea: Yeah, it's dystopian, right? Oh,
yeah, I don't, after this research, you
00:56:54.089 --> 00:57:01.109
know, for me, I deleted my Facebook
account like one or two years ago. So this
00:57:01.109 --> 00:57:07.279
did not, you know, matter to me
so much. But I stayed on Instagram and
00:57:07.279 --> 00:57:13.359
when I saw all these bought likes and
subscribers and followers, and also on YouTube,
00:57:13.359 --> 00:57:16.999
all these views, because the click
workers, they also watch YouTube videos.
00:57:16.999 --> 00:57:20.859
They have to stay on them like 40 seconds,
it's really funny because they hate
00:57:20.859 --> 00:57:27.239
hearing like techno music, rap music, all
40 seconds and then they go on. But when I
00:57:27.239 --> 00:57:34.589
sat next to Herald for two, three
hours, I was so disillusioned about all
00:57:34.589 --> 00:57:40.960
the social network things. And I
thought, OK, don't count on anything. Just
00:57:40.960 --> 00:57:46.119
if you like the content, follow them and
look at them. But don't believe anything.
00:57:46.119 --> 00:57:50.479
That was my personal take away from this
research.
00:57:50.479 --> 00:57:53.970
Herald: So very last question, microphone
two.
00:57:53.970 --> 00:57:59.150
Question: A couple of days ago, The
Independent reported that Facebook, the
00:57:59.150 --> 00:58:06.839
Facebook App was activating the camera
when reading a news feed. Could this be in
00:58:06.839 --> 00:58:10.779
use in the context of detecting fake
accounts?
00:58:10.779 --> 00:58:18.400
Svea: I don't know.
Philip: So, I think that in this
00:58:18.400 --> 00:58:26.799
particular instance it was probably a
bug. So, I don't know, but I mean, the
00:58:26.799 --> 00:58:30.679
people who work at Facebook, not all
of them are like crooks or anything who
00:58:30.679 --> 00:58:35.130
would deliberately program this kind
of stuff. So they said that it was kind of
00:58:35.130 --> 00:58:41.189
a bug from an update that they did.
And the question is whether we can
00:58:41.189 --> 00:58:49.430
actually detect fake accounts with the
camera. And the problem is, I
00:58:49.430 --> 00:58:57.469
don't think that current face recognition
technology is enough to detect that you
00:58:57.469 --> 00:59:02.940
are a unique person. There are so many
people on the planet that there is probably another
00:59:02.940 --> 00:59:08.959
person who has the same face. And I think
the new iPhone, they also have this much
00:59:08.959 --> 00:59:14.579
more sophisticated version of this
technology. And even they say, OK, there's
00:59:14.579 --> 00:59:19.079
a chance of one in, I don't know, that
there is somebody who can unlock your
00:59:19.079 --> 00:59:23.829
phone. So I think it's really hard to do
that with recording
00:59:23.829 --> 00:59:29.299
technology, to actually prove that
somebody is just one person.
00:59:29.299 --> 00:59:38.059
Herald: So with that, would you please
help me thank Svea, Dennis and Philip
00:59:38.059 --> 00:59:41.160
one more time for this fantastic
presentation! Very interesting and very,
00:59:41.160 --> 00:59:48.099
very disturbing. Thank you very much.
Applause
00:59:48.099 --> 00:59:52.099
postroll music
00:59:52.099 --> 01:00:16.000
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!