1
00:00:00,000 --> 00:00:19,030
36C3 preroll music
2
00:00:19,030 --> 00:00:26,500
Herald: OK. So inside the fake like
factories. I'm going to date myself. I
3
00:00:26,500 --> 00:00:32,980
remember it was the Congress around
1990, 1991 or so, where I was sitting
4
00:00:32,980 --> 00:00:38,550
together with some people who came over to
the states to visit the CCC Congress. And
5
00:00:38,550 --> 00:00:43,230
we were kind of riffing on how great the
internet is gonna make the world, you
6
00:00:43,230 --> 00:00:46,970
know, how how it's gonna bring world peace
and truth will rule and everything like
7
00:00:46,970 --> 00:00:57,259
that. Boy, were we naive, boy, were we
totally wrong. And today I'm going to be
8
00:00:57,259 --> 00:01:03,470
schooled in how wrong I actually was
because we have Svea, Dennis and Philip to
9
00:01:03,470 --> 00:01:08,980
tell us all about the fake like factories
around the world. And with that, could you
10
00:01:08,980 --> 00:01:17,670
please help me in welcoming them onto the
stage? Svea, Dennis and Philip.
11
00:01:17,670 --> 00:01:28,810
Philip: Thank you very much. Welcome to
our talk "Inside the Fake Like Factories
12
00:01:28,810 --> 00:01:35,899
". My name is Philip. I'm an Internet
activist against disinformation and I'm
13
00:01:35,899 --> 00:01:38,719
also a student at the University of
Bamberg.
14
00:01:38,719 --> 00:01:45,039
Svea: Hi. Thank you for listening to us
tonight. My name is Svea. I'm an
15
00:01:45,039 --> 00:01:50,219
investigative journalist, freelance mostly
for NDR and ARD. They are public
16
00:01:50,219 --> 00:01:55,759
broadcasters in Germany. And I focus on
tech issues. And I had the pleasure to
17
00:01:55,759 --> 00:02:01,280
work with these two guys on, for me, a
journalistic project and for them on a
18
00:02:01,280 --> 00:02:04,289
scientific project.
Dennis: Yeah. Hi, everyone. My name is
19
00:02:04,289 --> 00:02:09,009
Dennis. I'm a PhD student from Ruhr
University Bochum. I'm working as a
20
00:02:09,009 --> 00:02:16,160
research assistant for the chair for
System Security. My research focuses on
21
00:02:16,160 --> 00:02:21,349
network security topics and Internet
measurements. And as Svea said, Philip and
22
00:02:21,349 --> 00:02:26,660
myself, we are here for the scientific
part and Svea is here for the journalistic
23
00:02:26,660 --> 00:02:31,790
part.
Philip: So here's our outline for today.
24
00:02:31,790 --> 00:02:38,550
So first, I'm going to briefly talk about
our motivation for our descent into the
25
00:02:38,550 --> 00:02:45,160
fake like factories and then we are going
to show you how we got our hands on ninety
26
00:02:45,160 --> 00:02:50,780
thousand fake like campaigns of a major
crowdworking platform. And we are also
27
00:02:50,780 --> 00:02:56,080
going to show you why we think that there
are 10 billion registered Facebook users
28
00:02:56,080 --> 00:03:04,360
today. So first, I'm going to talk about
the like button. The like button is the
29
00:03:04,360 --> 00:03:12,150
ultimate indicator for popularity on
social media. It shows you how trustworthy
30
00:03:12,150 --> 00:03:18,620
someone is. It shows how popular
someone is. It is an indicator
31
00:03:18,620 --> 00:03:26,520
for economic success of brands and it also
influences the Facebook algorithm. And as
32
00:03:26,520 --> 00:03:31,710
we are going to show now, these kinds of
likes can be easily forged and
33
00:03:31,710 --> 00:03:38,580
manipulated. But the problem is that many
users will still prefer this bad info on
34
00:03:38,580 --> 00:03:45,960
Facebook about the popularity of a product
to no info at all. And so this is a real
35
00:03:45,960 --> 00:03:53,780
problem. And there is no real solution to
this. So first, we are going to talk about
36
00:03:53,780 --> 00:03:58,990
the factories and the workers in the fake
like factories.
37
00:03:58,990 --> 00:04:04,210
Svea: That there are fake likes and that
you can buy likes everywhere is well
38
00:04:04,210 --> 00:04:09,660
known. So if you Google "buying fake
likes" or even "fake comments" for
39
00:04:09,660 --> 00:04:15,100
Instagram or for Facebook, then you will
get hundreds of results and you can
40
00:04:15,100 --> 00:04:19,989
buy them very cheap or very expensive, it
doesn't matter, you can buy them from
41
00:04:19,989 --> 00:04:27,790
every country. But when you think of these
bought likes, then you may think of this.
42
00:04:27,790 --> 00:04:34,960
So you may think of somebody sitting in
China, Pakistan or India, and you think of
43
00:04:34,960 --> 00:04:40,240
computers and machines doing all this and
that they are, yeah, that they are fake
44
00:04:40,240 --> 00:04:47,630
and also that they can easily be detected
and that maybe they are not a big problem.
45
00:04:47,630 --> 00:04:54,880
But it's not always like this. It also can
be like this. So, I want you to meet
46
00:04:54,880 --> 00:05:03,120
Maria, I met her in Berlin. And Harald, he
lives near Mönchen-Gladbach. So Maria, she
47
00:05:03,120 --> 00:05:11,750
is a retiree, a former police
officer. And as money is always short, she
48
00:05:11,750 --> 00:05:19,670
is clicking Facebook likes for money. She
earns between 2 and 6 cents per like.
49
00:05:19,670 --> 00:05:28,720
And Harald, he was a baker once, is now
getting social aid and he is also clicking
50
00:05:28,720 --> 00:05:34,480
and liking and commenting the whole day.
We met them during our research project
51
00:05:34,480 --> 00:05:40,930
and did some interviews about their likes.
And one platform they are clicking and
52
00:05:40,930 --> 00:05:46,750
working for is PaidLikes. It's only one
platform out of a universe, out of a
53
00:05:46,750 --> 00:05:52,070
cosmos. PaidLikes, they are sitting just a
couple of minutes from here in Magdeburg
54
00:05:52,070 --> 00:05:56,990
and they are offering that you can earn
money with liking on different platforms.
55
00:05:56,990 --> 00:06:02,410
And it looks like this: when you log into
the platform with your Facebook account,
56
00:06:02,410 --> 00:06:07,300
then in the morning, in the
afternoon, in the evening, you get what
57
00:06:07,300 --> 00:06:13,260
we call campaigns. These are pages,
Facebook fan pages or Instagram pages, or
58
00:06:13,260 --> 00:06:18,240
posts, or comments. You can, you know, you
can work your way through them and click
59
00:06:18,240 --> 00:06:22,930
them. And I blurred, you see here the blue
bar; I blurred them because we don't want
60
00:06:22,930 --> 00:06:29,800
to get sued by all these companies,
which you can see there. To take you a
61
00:06:29,800 --> 00:06:37,310
little bit with me on the journey. Harald,
he was okay with us coming by for
62
00:06:37,310 --> 00:06:44,280
television and he was okay that we did a
long interview with him, and I want to
63
00:06:44,280 --> 00:06:50,080
show you a very small piece out of his
daily life sitting there doing the
64
00:06:50,080 --> 00:06:53,540
household, the washing and the cleaning,
and clicking.
65
00:07:26,760 --> 00:07:36,020
Come on. It could be like that. You click
and you earn some money. How did we meet
66
00:07:36,020 --> 00:07:41,150
him and all the others? Of course, because
Philip and Dennis, they have a more
67
00:07:41,150 --> 00:07:45,169
scientific approach. So it was also
important not only to talk to one or two,
68
00:07:45,169 --> 00:07:50,120
but to talk to many. So we created a
Facebook fan page, which we call "Eine
69
00:07:50,120 --> 00:07:54,210
Linie unterm Strich" (a line under a line)
because I thought, okay, nobody will like
70
00:07:54,210 --> 00:08:01,080
this freely. And then we did a post. This
post, and we bought likes, and you won't
71
00:08:01,080 --> 00:08:10,310
believe it, it worked so well; 222 people,
all the people I paid for liked this. And
72
00:08:10,310 --> 00:08:18,259
then we wrote to all of them and talked to
many of them. Some of them only in
73
00:08:18,259 --> 00:08:23,410
writing, some of them we just called
or had a phone chat. But they gave us a
74
00:08:23,410 --> 00:08:29,949
lot of information about their life as a
click worker, which I will sum up. So what
75
00:08:29,949 --> 00:08:36,169
PaidLikes itself says: they say that
they have 30000 registered users, and it's
76
00:08:36,169 --> 00:08:41,070
really interesting because you might think
that they are all registered with 10 or 15
77
00:08:41,070 --> 00:08:45,620
accounts, but most of them, they are not.
They are clicking with their real account,
78
00:08:45,620 --> 00:08:57,529
which makes it really hard to detect them.
They even scan their ID so that the
79
00:08:57,529 --> 00:09:03,210
company knows that they are real. Then
they earn their money. And we met men,
80
00:09:03,210 --> 00:09:09,760
women, stay-at-home moms, low-income
earners, retirees, people who are getting
81
00:09:09,760 --> 00:09:17,850
social care. So, basically, anybody. There
was no kind of bias. And many of them are
82
00:09:17,850 --> 00:09:24,890
clicking for two or more platforms. In
fact, I didn't meet anybody who's only
83
00:09:24,890 --> 00:09:29,370
clicking for one platform. They all have a
variety of platforms where they are
84
00:09:29,370 --> 00:09:34,610
writing comments or clicking likes. And
you can make - this is what they told us -
85
00:09:34,610 --> 00:09:41,580
between 15 and 450 euros monthly, if
you are a so-called power clicker and you
86
00:09:41,580 --> 00:09:48,410
do this kind of professionally. But
these are only the workers, and maybe you
87
00:09:48,410 --> 00:09:52,740
are more interested in who are the buyers?
Who benefits?
88
00:09:52,740 --> 00:09:59,631
Dennis: Yeah. Let's come to step two. Who
benefits from the campaigns? So I think
89
00:09:59,631 --> 00:10:06,089
you all remember this page. This is the
screen when you log into PaidLikes, and
90
00:10:06,089 --> 00:10:14,490
you'll see the campaigns you have to
click in order to get a little bit of
91
00:10:14,490 --> 00:10:25,370
money. And by luck we noticed that if
you hover over a campaign, you see in the
92
00:10:25,370 --> 00:10:31,980
bottom left of the browser a URL
redirecting to the campaign you have to
93
00:10:31,980 --> 00:10:40,700
click, and you see that every campaign is
using a unique ID. It is just a simple
94
00:10:40,700 --> 00:10:49,640
integer, and the good thing is, it is just
incremented. So now maybe some of you guys
95
00:10:49,640 --> 00:10:56,570
notice what we can do with that. And yeah,
it is really easy with these constructed
96
00:10:56,570 --> 00:11:02,670
URLs to implement a crawler for data
gathering, and our crawler simply
97
00:11:02,670 --> 00:11:11,931
requested all campaign IDs between 0 and
90000. Maybe some of you ask why 90000? As
98
00:11:11,931 --> 00:11:17,110
I already said, we were also registered as
click workers, and we saw that the
99
00:11:17,110 --> 00:11:24,779
highest campaign ID used was about 88000.
So we thought OK, 90000 is a good value
100
00:11:24,779 --> 00:11:30,540
and we checked for each of
these 90000 requests if it got resolved or
101
00:11:30,540 --> 00:11:36,030
not, and if it got resolved, the redirect
URL presented us the source that
102
00:11:36,030 --> 00:11:42,431
should be liked or followed. And we did
not save the page sources from the
103
00:11:42,431 --> 00:11:50,750
resolved URLs, we only saved the resolved
URLs in the list of campaigns, and this
104
00:11:50,750 --> 00:11:58,700
list was then the basis for further
analysis. And here you see our list.
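What follows is a minimal sketch, in Python, of the enumeration idea described here; the endpoint URL is a hypothetical placeholder, not the platform's real address, and the real crawler was surely more careful.

import csv
import requests

# Hypothetical placeholder for the platform's redirect endpoint.
BASE = "https://clickwork-panel.example/campaign?id={}"

def crawl_campaigns(max_id=90000, out_path="campaigns.csv"):
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["campaign_id", "resolved_url"])
        for cid in range(max_id + 1):
            url = BASE.format(cid)
            try:
                # Follow the redirect, but keep only the final URL of
                # the page that should be liked or followed; the page
                # source itself is not saved.
                r = requests.get(url, timeout=10, allow_redirects=True)
            except requests.RequestException:
                continue  # network error: treat this ID as unresolved
            if r.ok and r.url != url:  # the ID resolved to a campaign
                writer.writerow([cid, r.url])

crawl_campaigns()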
105
00:11:58,700 --> 00:12:05,740
Svea: Yes. This was the point when Dennis
and Philip came to us and said,
106
00:12:05,740 --> 00:12:12,000
hey, we have a list. So what can you find?
And of course we searched for AfD; it was one of
107
00:12:12,000 --> 00:12:20,940
the first search queries. And yeah, of
course, AfD is also in that list. Maybe
108
00:12:20,940 --> 00:12:31,149
not so surprisingly for some. And when you
look, it is AfD Gelsenkirchen and their fan
109
00:12:31,149 --> 00:12:39,589
page. And we asked AfD Gelsenkirchen: did
you buy likes? And they said, we don't
110
00:12:39,589 --> 00:12:48,240
know how we got on that list. However,
we do not rule out an anonymous donation.
111
00:12:48,240 --> 00:12:55,410
But now you would think, Ok, they found
AfD; this was to be expected. But no, all
112
00:12:55,410 --> 00:13:00,930
political parties – mostly local and
regional entities - showed up on that
113
00:13:00,930 --> 00:13:09,250
list. So we had CDU/CSU, FDP,
SPD, AfD, Die Grünen and Die Linke. But
114
00:13:09,250 --> 00:13:15,390
don't think that Angela Merkel or some
very big Facebook fan pages just showed
115
00:13:15,390 --> 00:13:23,800
up. No, no. Very small entities with a
couple of hundred or maybe 10000 or 15000
116
00:13:23,800 --> 00:13:28,390
followers. And I think this makes
perfect sense, because somebody who has
117
00:13:28,390 --> 00:13:35,370
already very, very many fans
probably would not buy them there at
118
00:13:35,370 --> 00:13:46,311
PaidLikes. And we asked many of them, and
mostly they could not explain it. They
119
00:13:46,311 --> 00:13:52,040
would never do something like that. Yeah,
they were completely at a loss. But you
120
00:13:52,040 --> 00:13:56,690
have to remember that we only saw the
campaigns, meaning their Facebook
121
00:13:56,690 --> 00:14:03,110
fan pages; we could not see who bought the
likes. And as you can imagine, everybody
122
00:14:03,110 --> 00:14:08,740
could have done it: the mother, the
brother, the fan, you know, the dog. So
123
00:14:08,740 --> 00:14:15,160
this was a case where we would have needed a lot
of luck to call anybody out of the blue
124
00:14:15,160 --> 00:14:20,260
and then he would say, oh, yes, I did
this. And there was one, or there were
125
00:14:20,260 --> 00:14:25,810
some politicians who admitted it. And one
of them, she did it also publicly and gave
126
00:14:25,810 --> 00:14:35,339
us an interview. It's Tanja Kühne. She is
a regional politician from Walsrode,
127
00:14:35,339 --> 00:14:40,260
Niedersachsen. And in her case, it
was after an election
128
00:14:40,260 --> 00:14:44,360
and she was not very happy with her fan
page. That is what she told us. She was
129
00:14:44,360 --> 00:14:49,220
very unhappy and she wanted, you know, to
push herself and to boost it a little bit,
130
00:14:49,220 --> 00:14:55,510
and get more friends and followers and
reach. And then she bought 500 followers.
131
00:14:55,510 --> 00:15:02,870
And then we had a nice interview with her
about that. We'll show you a small piece.
132
00:15:53,829 --> 00:15:59,760
Okay, so you see – answers are pretty
interesting. And she, I think she was
133
00:15:59,760 --> 00:16:05,180
courageous enough to speak out to us. Many
others did too, but only on the phone.
134
00:16:05,180 --> 00:16:09,180
And they didn't want to go on the record.
But she's not the only one who answered
135
00:16:09,180 --> 00:16:14,110
like this. Because, of course, if you call
through a list of potential fake like
136
00:16:14,110 --> 00:16:21,120
buyers, of course they answer like, no,
it's not a scam. And I also think from a
137
00:16:21,120 --> 00:16:26,180
legal standpoint, it's also very
hard to show that this is fraud or a
138
00:16:26,180 --> 00:16:33,209
scam. And it's more an ethical problem
that you can see here, that
139
00:16:33,209 --> 00:16:40,170
it's manipulative if you buy likes. We
also found a guy from the FDP in the
140
00:16:40,170 --> 00:16:45,269
Bundestag. But yeah, he ran away and
didn't want to get interviewed, so I
141
00:16:45,269 --> 00:16:52,700
couldn't show him to you. He bought, or,
well, probably... He was like 40 times in our
142
00:16:52,700 --> 00:16:59,100
list for various Facebook posts and videos
and also for his Instagram account. But we
143
00:16:59,100 --> 00:17:06,730
could not get him
on record. So what did others say? We, of
144
00:17:06,730 --> 00:17:10,970
course, confronted Facebook, Instagram and
YouTube with this small research. And they
145
00:17:10,970 --> 00:17:18,079
said, no, we don't want fake likes on our
platform. PaidLikes has been active since 2012,
146
00:17:18,079 --> 00:17:25,370
you know. So they waited seven years. But
after our report, at least, Facebook
147
00:17:25,370 --> 00:17:32,549
temporarily blocked PaidLikes. And of
course, we asked them too, and spoke to
148
00:17:32,549 --> 00:17:35,781
them and corresponded with PaidLikes in
Magdeburg. And they said, of course, it's
149
00:17:35,781 --> 00:17:41,620
not a scam because the click workers
are freely clicking on pages. So, yeah,
150
00:17:41,620 --> 00:17:47,640
kind of nobody cares. But PaidLikes, this
is only the tip of the iceberg.
151
00:17:47,640 --> 00:17:58,520
Philip: So we also wanted to dive a little
bit into this fake like universe outside
152
00:17:58,520 --> 00:18:05,780
of PaidLikes and to see what else is out
there. And so we did an analysis of
153
00:18:05,780 --> 00:18:12,780
account creation on Facebook. So what
Facebook is saying about account creation
154
00:18:12,780 --> 00:18:19,299
is that they are very effective against
fake accounts. So they say they remove
155
00:18:19,299 --> 00:18:26,330
billions of accounts each year, and that
most of these accounts never reach any
156
00:18:26,330 --> 00:18:33,000
real users and they remove them before
they get reported. So what Facebook
157
00:18:33,000 --> 00:18:39,080
basically wants to tell you is that they
have it under control. However, there are
158
00:18:39,080 --> 00:18:45,700
a number of reports that suggest
otherwise. For example, recently the NATO
159
00:18:45,700 --> 00:18:53,630
StratCom task force released a report where
they actually bought 54000
160
00:18:53,630 --> 00:19:02,220
social media interactions for just 300
Euros. So this is a very low price. And I
161
00:19:02,220 --> 00:19:07,169
think you wouldn't expect such a low price
if it were hard to get that many
162
00:19:07,169 --> 00:19:15,880
interactions. They bought 3500 comments,
25000 likes, 20000 views and 5100
163
00:19:15,880 --> 00:19:22,991
followers. Everything for just 300 Euros.
So, you know, the thing they have in
164
00:19:22,991 --> 00:19:32,050
common, they are cheap, the fake likes and
the fake interactions. And
165
00:19:32,050 --> 00:19:38,470
there was also another report from Vice
Germany recently. And they reported on
166
00:19:38,470 --> 00:19:46,410
some interesting facts about automated
fake accounts. They reported on findings
167
00:19:46,410 --> 00:19:50,980
that suggest that people use
hacked Internet of Things
168
00:19:50,980 --> 00:19:59,150
devices to create these
fake accounts and to manage them. And so
169
00:19:59,150 --> 00:20:04,590
it's actually kind of interesting to think
about it this way: OK, maybe next
170
00:20:04,590 --> 00:20:11,020
election your fridge is actually going to
support the other candidate on Facebook.
171
00:20:11,020 --> 00:20:16,970
And so we also wanted to look into this
and we wanted to go a step further and to
172
00:20:16,970 --> 00:20:24,660
look at who these people are. Who are
they, and what are they doing on
173
00:20:24,660 --> 00:20:32,200
Facebook? And so we actually examined the
profiles of purchased likes. For this we
174
00:20:32,200 --> 00:20:38,390
created four comments under arbitrary
posts, and then we bought likes for these
175
00:20:38,390 --> 00:20:46,500
comments, and then we examined the
resulting profiles of the fake likes. So
176
00:20:46,500 --> 00:20:51,050
it was pretty cheap to buy these likes.
Comment likes are always a little bit more
177
00:20:51,050 --> 00:20:59,520
expensive than other likes. And we found
all these offerings on Google and we paid
178
00:20:59,520 --> 00:21:08,169
with PayPal. So we actually used a pretty
neat trick to estimate the age of these
179
00:21:08,169 --> 00:21:16,490
fake accounts. So as you can see here, the
Facebook user ID is incremented. So
180
00:21:16,490 --> 00:21:24,250
Facebook started in 2009 to use
incremented Facebook IDs, and they use this
181
00:21:24,250 --> 00:21:31,780
pattern of 1 0 0 0 and then the
incremented number. And as you can see, in
182
00:21:31,780 --> 00:21:40,200
2009 this incremented number was very
close to zero. And then today it is close
183
00:21:40,200 --> 00:21:49,559
to 40 billion. And in this time period,
you can see that you can kind of get a
184
00:21:49,559 --> 00:21:56,770
rather fitting line through all these
points. And you can see that the likes are
185
00:21:56,770 --> 00:22:02,710
in fact incremented, ... the account IDs
are in fact incremented over time. So we
186
00:22:02,710 --> 00:22:08,670
can use this fact in reverse to estimate
the creation date of an account where we
187
00:22:08,670 --> 00:22:15,340
know the Facebook ID. And that's exactly
what we did with these fake likes.
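As a rough illustration, here is a sketch in Python of this reverse estimate; the anchor points below are made up for the example, they are not the measured values behind the black line on the slide.

from bisect import bisect_left
from datetime import date, timedelta

# (incremented part of the Facebook ID, known creation date);
# made-up anchor points, sorted by ID, standing in for accounts
# whose real creation dates are known.
ANCHORS = [
    (1_000_000, date(2009, 9, 1)),
    (10_000_000_000, date(2014, 1, 1)),
    (40_000_000_000, date(2019, 7, 1)),
]

def estimate_creation_date(fid):
    # Linearly interpolate between the two nearest known anchors.
    ids = [i for i, _ in ANCHORS]
    k = min(max(bisect_left(ids, fid), 1), len(ANCHORS) - 1)
    (id0, d0), (id1, d1) = ANCHORS[k - 1], ANCHORS[k]
    frac = (fid - id0) / (id1 - id0)
    return d0 + timedelta(days=frac * (d1 - d0).days)

# Roughly September 2016 under these made-up anchors.
print(estimate_creation_date(25_000_000_000))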
188
00:22:15,340 --> 00:22:22,090
So we estimated the account creation dates. And
as you can see, we get kind of different
189
00:22:22,090 --> 00:22:28,929
results from different services. For
example, PaidLikes, they had rather old
190
00:22:28,929 --> 00:22:35,750
accounts. So this means they use very
authentic accounts. And we already know
191
00:22:35,750 --> 00:22:41,370
that because we talked to them. So these
are very authentic accounts. Also,
192
00:22:41,370 --> 00:22:46,660
Service A over here uses very, very
authentic accounts. But on the other hand,
193
00:22:46,660 --> 00:22:52,160
Service B uses very new accounts,
they were all created in the last three
194
00:22:52,160 --> 00:22:58,280
years. So if you look at the accounts and
also from these numbers, we think that
195
00:22:58,280 --> 00:23:06,510
these accounts were bots. And for Service C
it's kind of not clear: are
196
00:23:06,510 --> 00:23:10,870
these accounts bots or are these
clickworkers? Maybe it's a mixture of
197
00:23:10,870 --> 00:23:17,820
both; we don't know for sure. But
this is an interesting metric to measure
198
00:23:17,820 --> 00:23:23,390
the age of the accounts to determine if
some of them might be bots. And that's
199
00:23:23,390 --> 00:23:29,340
exactly what we did on this page. So this
is actually a page for garden furniture
200
00:23:29,340 --> 00:23:36,750
and we found it in our list that we got
from PaidLikes. So, obviously,
201
00:23:36,750 --> 00:23:43,970
they were on this list for bought likes on
Facebook, on PaidLikes. And they caught
202
00:23:43,970 --> 00:23:51,000
our eye because they had one million
likes. And that's rather unusual for a
203
00:23:51,000 --> 00:24:01,260
shop for garden furniture in Germany. And
so we looked at this page further and we
204
00:24:01,260 --> 00:24:07,390
noticed other interesting things. For
example, there are posts, all the time,
205
00:24:07,390 --> 00:24:13,820
that got like thousands of likes. And
that's also kind of unusual for a garden
206
00:24:13,820 --> 00:24:19,590
furniture shop. And so we looked into the
likes and as you can see, they all look
207
00:24:19,590 --> 00:24:26,790
like they come from Southeast Asia and
they don't look very authentic. And we
208
00:24:26,790 --> 00:24:32,460
were actually able to estimate the
creation dates of these accounts. And we
209
00:24:32,460 --> 00:24:36,700
found that most of these accounts that
were used for liking these posts on this
210
00:24:36,700 --> 00:24:44,130
page were actually created in the last
three years. So this is a page where
211
00:24:44,130 --> 00:24:49,540
everything, from the number of people who
like the page to the number of people who
212
00:24:49,540 --> 00:24:55,559
like the posts, is complete fraud. So
nothing about this is real. And it's
213
00:24:55,559 --> 00:25:02,380
obvious that this can happen on Facebook
and that this is a really, really big
214
00:25:02,380 --> 00:25:08,309
problem. I mean, this is a, this is a shop
for garden furniture. Obviously, they
215
00:25:08,309 --> 00:25:14,580
probably don't have such huge sums of
money. So it was probably very cheap to
216
00:25:14,580 --> 00:25:22,170
buy this amount of fake accounts. And it
is really shocking to see how big
217
00:25:22,170 --> 00:25:31,179
the scale is of this kind of
operation. And so what we have to say is,
218
00:25:31,179 --> 00:25:39,970
OK, when Facebook says they have it under
control, we have to doubt that. So now we
219
00:25:39,970 --> 00:25:46,320
can look at the bigger picture. And what
we are going to do here is we are going to
220
00:25:46,320 --> 00:25:52,700
use this same graph that we used before to
estimate the creation dates, but in a
221
00:25:52,700 --> 00:25:59,080
different way. We can actually see
the lowest and the highest points of
222
00:25:59,080 --> 00:26:05,090
Facebook IDs in this graph. So we know the
newest Facebook ID by creating a new
223
00:26:05,090 --> 00:26:13,200
account. And we know the lowest ID because
it's zero. And then we know that there are
224
00:26:13,200 --> 00:26:20,780
40 billion Facebook IDs. Now, in the next
step, we took a sample, a random sample
225
00:26:20,780 --> 00:26:27,610
from these 40 billion Facebook IDs. And
inside of the sample, we checked if these
226
00:26:27,610 --> 00:26:33,740
accounts exist, if this ID corresponds to
an existing account. And we do that because
227
00:26:33,740 --> 00:26:39,360
we obviously cannot check 40 billion
accounts and 40 billion IDs, but we can
228
00:26:39,360 --> 00:26:45,720
check a small sample of
these IDs and then estimate the number
229
00:26:45,720 --> 00:26:54,470
of existing accounts on Facebook in
total. So for this, we repeatedly accessed
230
00:26:54,470 --> 00:27:02,770
the same sample of one million random IDs
over the course of one year. And we also
231
00:27:02,770 --> 00:27:10,100
pulled a sample of 10 million random IDs
for closer analysis this July. And now
232
00:27:10,100 --> 00:27:15,950
Dennis is going to tell you how we did it.
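A sketch of that estimate in Python; is_valid_account() stands in for the redirect test Dennis describes next, so it is left as a placeholder here.

import random

ID_SPACE = 40_000_000_000  # roughly the highest observed ID

def is_valid_account(fid):
    # Placeholder for the redirect test against facebook.com/<fid>
    # that Dennis explains below.
    raise NotImplementedError

def estimate_total_accounts(sample_size=1_000_000):
    sample = random.sample(range(ID_SPACE), sample_size)
    hits = sum(1 for fid in sample if is_valid_account(fid))
    hit_rate = hits / sample_size  # observed: about one in four
    # Scale the hit rate up to the whole ID space: with a rate of
    # 0.25 this yields about 10 billion registered accounts.
    return hit_rate * ID_SPACE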
Dennis: Yeah. Well, pretty interesting,
233
00:27:15,950 --> 00:27:21,160
pretty interesting results so far, right?
So we implemented a crawler again, a
234
00:27:21,160 --> 00:27:26,530
second time, for gathering public Facebook
information, the public Facebook account
235
00:27:26,530 --> 00:27:35,730
data. And, yeah, this was not as easy as
in the first case. It's not
236
00:27:35,730 --> 00:27:45,059
surprising that Facebook is using a lot of
measures to try to block the automated
237
00:27:45,059 --> 00:27:52,460
crawling of the Facebook page, for example
with IP blocking or CAPTCHAs. But,
238
00:27:52,460 --> 00:27:59,929
uh, we could
solve this problem pretty easily by using
239
00:27:59,929 --> 00:28:06,980
the Tor Anonymity Network. So every time
our IP got blocked while crawling the data,
240
00:28:06,980 --> 00:28:14,480
we just made a new Tor connection and
changed the IP. And the same with the
241
00:28:14,480 --> 00:28:21,440
CAPTCHAs. And with this easy method, we
were able to crawl all the public Facebook data.
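A sketch of that rotation in Python, assuming a local Tor daemon with the stock SOCKS port 9050 and control port 9051; this illustrates the idea, it is not the authors' actual tooling, and detecting a block purely by status code is a simplification.

import requests
from stem import Signal
from stem.control import Controller

PROXIES = {"http": "socks5h://127.0.0.1:9050",
           "https": "socks5h://127.0.0.1:9050"}

def new_tor_identity():
    # Ask the Tor daemon for a fresh circuit, i.e. a new exit IP.
    with Controller.from_port(port=9051) as ctrl:
        ctrl.authenticate()
        ctrl.signal(Signal.NEWNYM)

def fetch(url, max_retries=5):
    for _ in range(max_retries):
        r = requests.get(url, proxies=PROXIES, timeout=30)
        if r.status_code not in (403, 429):  # not blocked or throttled
            return r.text
        new_tor_identity()  # blocked or CAPTCHA wall: rotate the IP
    raise RuntimeError("still blocked after rotating identities")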
242
00:28:21,440 --> 00:28:26,020
Let's have a look at two examples. The
243
00:28:26,020 --> 00:28:36,890
first example is facebook.com/4. So a
very, very small Facebook ID. Yeah, in
244
00:28:36,890 --> 00:28:41,790
this case, we are redirected and we
check the response and find a valid
245
00:28:41,790 --> 00:28:50,070
account page. And does anyone know which
account this is? Mark Zuckerberg? Yeah,
246
00:28:50,070 --> 00:28:55,360
that's correct. This is the public
account for Mark Zuckerberg. Number four,
247
00:28:55,360 --> 00:29:01,679
as we already saw, the other
IDs are really high, but he got the number
248
00:29:01,679 --> 00:29:10,690
four. The second example was facebook.com/3.
In this case, we are not forwarded. And
249
00:29:10,690 --> 00:29:17,760
this means that it is an invalid account.
And that was really easy to confirm with a
250
00:29:17,760 --> 00:29:23,740
quick Google search: it was a test
account from the beginning of Facebook. So
251
00:29:23,740 --> 00:29:31,059
we did not get redirected. And it's just
the login page from Facebook. And with
252
00:29:31,059 --> 00:29:38,500
these examples, we did a
lot more experiments. And at the end, we
253
00:29:38,500 --> 00:29:46,970
were able to build this tree. And, yeah,
this tree represents the high level
254
00:29:46,970 --> 00:29:53,059
approach from our scraper. So in the,
What's that?
255
00:29:53,059 --> 00:29:56,340
Svea: Okay. Sleeping.
Laughing
256
00:29:56,340 --> 00:30:07,090
Dennis: Yeah. We still have time. Right.
So what? Okay, so everyone is waking up
257
00:30:07,090 --> 00:30:16,680
again. Oh, yeah. In the first step we call
the domain www.facebook.com/FID. If we
258
00:30:16,680 --> 00:30:24,650
get redirected in this case, then we check
if the page is an account page. If
259
00:30:24,650 --> 00:30:31,270
it's an account page, then it's a public
account, like the example 4, and we were
260
00:30:31,270 --> 00:30:39,890
able to save the raw data, the raw HTTP
source. If it's not an account page,
261
00:30:39,890 --> 00:30:45,070
then it's
not a public account and we are not able
262
00:30:45,070 --> 00:30:52,580
to save any data. And if we
do not get redirected in the
263
00:30:52,580 --> 00:31:01,630
first step, then we call the second
domain, facebook.com/profile.php?id=FID
264
00:31:01,630 --> 00:31:09,289
with the mobile user agent. And if we get
redirected, then again, it is a
265
00:31:09,289 --> 00:31:14,990
nonpublic profile and we cannot save
anything. But if we do not get
266
00:31:14,990 --> 00:31:22,710
redirected, it is an invalid profile and
it is most often a deleted account. Yeah.
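The same tree as a Python sketch; the account-page test is a naive placeholder, since deciding "is this an account page?" from the HTML is the part the real scraper does more carefully.

import requests

MOBILE_UA = {"User-Agent": "Mozilla/5.0 (Android 10; Mobile)"}

def classify(fid):
    # Step 1: call www.facebook.com/<FID> and check for a redirect.
    url = "https://www.facebook.com/{}".format(fid)
    r = requests.get(url, allow_redirects=True, timeout=30)
    if r.url != url:
        # Redirected: decide whether the target is an account page.
        if "profile" in r.text.lower():  # naive placeholder test
            return "public"      # raw page source can be saved here
        return "nonpublic"
    # Step 2: no redirect, so retry the mobile profile endpoint.
    url2 = "https://www.facebook.com/profile.php?id={}".format(fid)
    r2 = requests.get(url2, headers=MOBILE_UA, allow_redirects=True,
                      timeout=30)
    if r2.url != url2:
        return "nonpublic"
    return "invalid"             # most often a deleted account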
267
00:31:22,710 --> 00:31:29,390
And yeah, that's the high level overview
of our scraper. And Philip will now give
268
00:31:29,390 --> 00:31:32,340
some more information on interesting
results.
269
00:31:32,340 --> 00:31:38,820
Philip: So the most interesting result of
this scraping of the sample of Facebook
270
00:31:38,820 --> 00:31:47,070
IDs was that one in four Facebook IDs
corresponds to a valid account. And you
271
00:31:47,070 --> 00:31:53,559
can do the math. There are 40 billion
Facebook IDs, so there must be 10 billion
272
00:31:53,559 --> 00:32:00,170
registered users on Facebook. And this
means that there are more registered users
273
00:32:00,170 --> 00:32:08,140
on Facebook than there are humans on
Earth. And also, it means that it's even
274
00:32:08,140 --> 00:32:12,460
worse than that because not everybody on
Earth can have a Facebook account because
275
00:32:12,460 --> 00:32:17,370
you need a smartphone for
that. And many people don't have those. So
276
00:32:17,370 --> 00:32:22,270
this is actually a pretty high number and
it's very unexpected. So in July 2019,
277
00:32:22,270 --> 00:32:29,059
there were more than ten billion Facebook
accounts. Also, we did more research on
278
00:32:29,059 --> 00:32:36,429
the timeframe between October 2018 and
today, or this month. And we found that in
279
00:32:36,429 --> 00:32:43,140
this timeframe there were 2 billion new
registered Facebook accounts. So this is
280
00:32:43,140 --> 00:32:48,679
like the timeframe of one year, more or
less. And in a similar timeframe, the
281
00:32:48,679 --> 00:32:58,899
monthly active user base rose by only 187
million. Facebook deleted 150 million
282
00:32:58,899 --> 00:33:05,419
older accounts between October 2018 and
July 2019. And we know that because we
283
00:33:05,419 --> 00:33:11,460
pulled the same sample over a longer
period of time. And then we watched for
284
00:33:11,460 --> 00:33:16,230
accounts that got deleted in the sample.
And that enables us to estimate this
285
00:33:16,230 --> 00:33:23,400
number of 150 million deleted accounts
that are basically older than our
286
00:33:23,400 --> 00:33:31,890
sample. So I made some nice graphs for
your viewing pleasure. So, again, the
287
00:33:31,890 --> 00:33:40,919
older accounts: just 150 million were
deleted since October 2018. These are
288
00:33:40,919 --> 00:33:46,350
accounts that are older than last year.
And Facebook claims that since then, about
289
00:33:46,350 --> 00:33:52,789
7 billion accounts got deleted from their
platform, which is vastly more than these
290
00:33:52,789 --> 00:33:58,370
older accounts. And that's why we
think that Facebook mostly deleted these
291
00:33:58,370 --> 00:34:06,770
newer accounts. And if an account is older
than a certain age, then it is very
292
00:34:06,770 --> 00:34:13,069
unlikely that it gets deleted. And also, I
think you can see the scales here. So, of
293
00:34:13,069 --> 00:34:17,960
course, the registered users are not the
same thing as active users, but you can
294
00:34:17,960 --> 00:34:23,290
still see that there are much more
registrations of new users than there
295
00:34:23,290 --> 00:34:30,139
are new active
users during the last year. So what does
296
00:34:30,139 --> 00:34:37,909
this all mean? Does it mean that Facebook
gets flooded by fake accounts? We don't
297
00:34:37,909 --> 00:34:42,980
really know. We only know these numbers.
What Facebook is telling us is that they
298
00:34:42,980 --> 00:34:50,409
only count and publish active users, as I
already said. There is a disconnect
299
00:34:50,409 --> 00:34:56,759
between registered users and
active users and Facebook only reports on
300
00:34:56,759 --> 00:35:04,289
the active users. Also, they say that
users register accounts, but they don't
301
00:35:04,289 --> 00:35:10,519
verify them or they don't use them, and
that's how this number gets so high. But I
302
00:35:10,519 --> 00:35:19,319
think that's not really explaining
these high numbers, because they are just
303
00:35:19,319 --> 00:35:26,469
by orders of magnitude larger than
anything that this could cause. Also, they
304
00:35:26,469 --> 00:35:31,819
say that they regularly delete fake
accounts. But we have seen that these are
305
00:35:31,819 --> 00:35:37,519
mostly accounts that get deleted directly
after their creation. And if they survive
306
00:35:37,519 --> 00:35:46,170
long enough, then they are getting
through. So what does this all mean?
307
00:35:46,170 --> 00:35:55,390
Svea: Okay, so you got the full load,
which I had like over two or three months.
308
00:35:55,390 --> 00:36:02,869
And for me, one very big
conclusion was that we have some kind of
309
00:36:02,869 --> 00:36:08,530
broken metric here, that all the likes and
all the hearts on Instagram and the
310
00:36:08,530 --> 00:36:13,650
followers, that they can so easily be
manipulated. And then it's so hard to
311
00:36:13,650 --> 00:36:19,029
tell, in some cases,
if they are real or not. And this
312
00:36:19,029 --> 00:36:26,160
opens the gate for manipulation and yes,
untruth. And for economic losses, if
313
00:36:26,160 --> 00:36:33,109
you think as somebody who is investing
money, or as an advertiser, for
314
00:36:33,109 --> 00:36:40,170
example. And in the very end, it is a case
of eroding trust, which means that we
315
00:36:40,170 --> 00:36:45,739
cannot trust these numbers anymore. These
numbers are, you know, they are so easily
316
00:36:45,739 --> 00:36:53,799
manipulated. And why should we trust this?
And this has a severe consequence for all
317
00:36:53,799 --> 00:36:59,420
the social networks, if you are still on
them. So what can be a solution? And
318
00:36:59,420 --> 00:37:05,150
Philip, you thought about that.
Philip: So basically we have two
319
00:37:05,150 --> 00:37:11,410
problems. One is click workers and one is
fakes. Click workers are basically just
320
00:37:11,410 --> 00:37:18,420
hyperactive users and they are selling
their hyperactivity. And so what social
321
00:37:18,420 --> 00:37:23,660
networks could do is just make
interactions scarce, so just lower the
322
00:37:23,660 --> 00:37:29,180
value of many interactions. If you are a
hyperactive user, then your interactions
323
00:37:29,180 --> 00:37:34,240
should count less than the interactions of
a less active user.
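Purely as an illustration of "interactions should count less for hyperactive users", here is one possible weighting in Python; the logarithmic formula is our assumption, not any platform's actual algorithm.

import math

def weighted_like_count(likes_per_user, daily_activity):
    total = 0.0
    for user, likes in likes_per_user.items():
        activity = max(daily_activity.get(user, 1), 1)
        # Someone doing hundreds of interactions a day contributes
        # far less per like than an ordinary user.
        total += likes / math.log2(1 + activity)
    return total

# A casual user's like outweighs a power clicker's like by a lot:
print(weighted_like_count({"casual": 1, "clicker": 1},
                          {"casual": 3, "clicker": 500}))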
324
00:37:34,240 --> 00:37:39,229
Mumbling
That's kind of solvable, I think. The real
325
00:37:39,229 --> 00:37:46,890
problem is authenticity. So if
you get stopped from posting or liking
326
00:37:46,890 --> 00:37:52,640
hundreds of pages a day, then maybe you
just create multiple accounts and operate
327
00:37:52,640 --> 00:37:58,599
them simultaneously. And this can only be
solved by authenticity. So this can only
328
00:37:58,599 --> 00:38:04,990
be solved if you know that the person who
is operating the account is just one
329
00:38:04,990 --> 00:38:10,569
person operating one account. And this
is really hard to do, because Facebook
330
00:38:10,569 --> 00:38:14,940
doesn't know who is clicking. Is it a bot?
Is it a clickworker, or is it one
331
00:38:14,940 --> 00:38:20,410
clickworker for ten accounts? How does
this work? And so this is really hard for
332
00:38:20,410 --> 00:38:27,609
the, for the social media companies to do.
And you could say, OK, let's send in the
333
00:38:27,609 --> 00:38:32,359
passport or something like that to prove
authenticity. But that's actually not a
334
00:38:32,359 --> 00:38:37,109
good idea because nobody wants to send
their passport to Facebook. And so this is
335
00:38:37,109 --> 00:38:42,359
really a hard problem that has to be
solved if we want to use social
336
00:38:42,359 --> 00:38:49,750
media in a meaningful way. And so this is
what companies could do. And now...
337
00:38:49,750 --> 00:38:53,200
Svea: But what can you
do? Okay. Of course, you can delete
338
00:38:53,200 --> 00:38:56,469
your Facebook account or your Instagram
account and stop.
339
00:38:56,469 --> 00:39:01,299
Slight applause, laughing
Svea: Yeah! Stay away from social media.
340
00:39:01,299 --> 00:39:08,959
But maybe this is not a solution for all
of us. So I think: be aware, of course.
341
00:39:08,959 --> 00:39:17,499
Spread the word, tell others. And if
you like, and you get more
342
00:39:17,499 --> 00:39:24,019
intelligence about that, we are really
happy to dig deeper into these networks.
343
00:39:24,019 --> 00:39:30,180
And we will go on investigating. So, last
but not least, we want to say thank you
344
00:39:30,180 --> 00:39:33,349
to you guys. Thank you very much for
listening.
345
00:39:33,349 --> 00:39:40,089
Applause
Svea: And we did not do this alone. We are
346
00:39:40,089 --> 00:39:44,849
not three people. There are many more
standing behind us and doing this
347
00:39:44,849 --> 00:39:50,709
beautiful research. And we are opening now
for questions, please.
348
00:39:50,719 --> 00:39:55,429
Herald: Yes. Please, thank Svea, Phil and
Dennis again.
349
00:39:55,429 --> 00:40:05,519
Applause
And we have microphones out
350
00:40:05,519 --> 00:40:09,680
here in the room, about nine of them,
actually. If you line up behind them to
351
00:40:09,680 --> 00:40:15,780
ask a question, remember that a question
is a sentence with a question mark behind
352
00:40:15,780 --> 00:40:20,500
it. And I think I see somebody at number
three. So let's start with that.
353
00:40:20,500 --> 00:40:25,979
Question: Hi. I, I just have a little
question. Wouldn't a dislike button, the
354
00:40:25,979 --> 00:40:30,749
concept of a dislike button, wouldn't that
be a solution to all the problems?
355
00:40:30,749 --> 00:40:38,039
Philip: So we thought about recommending
that Facebook ditches the like button
356
00:40:38,039 --> 00:40:42,299
altogether. I think that would be a better
solution than a dislike button, because a
357
00:40:42,299 --> 00:40:47,079
dislike button could also be manipulated
and it would be even worse because you
358
00:40:47,079 --> 00:40:54,119
could actually manipulate the network into
down ranking posts or kind of not showing
359
00:40:54,119 --> 00:41:00,670
posts to somebody. And that, I think would
be even worse. Imagine what dictators
360
00:41:00,670 --> 00:41:08,209
would do with that. And so I think the
best option would be to actually not show
361
00:41:08,209 --> 00:41:18,029
like counts anymore, and this way make
people not invest into
362
00:41:18,029 --> 00:41:25,199
these counts if they become meaningless.
Herald: I think I see a microphone 7, up
363
00:41:25,199 --> 00:41:28,109
there.
Question: Hello. So one question I had is
364
00:41:28,109 --> 00:41:37,210
you assigned creation dates to IDs. How
did you do this?
365
00:41:37,210 --> 00:41:52,489
Philip: So, we actually knew the creation
date of some accounts. And then we kind of
366
00:41:52,489 --> 00:41:58,210
interpolated between the creation dates
and the IDs. So you see this black line
367
00:41:58,210 --> 00:42:04,109
there. That's actually our
interpolation. And with this black line,
368
00:42:04,109 --> 00:42:10,910
we can then estimate the creation dates
for IDs that we do not yet know because
369
00:42:10,910 --> 00:42:17,430
it, kind of, fills in the gaps.
Q: Follow up question, do you know why
370
00:42:17,430 --> 00:42:20,310
there are some points outside of this
graph?
371
00:42:20,310 --> 00:42:23,999
Philip: No.
Q: No? Thank you.
372
00:42:23,999 --> 00:42:26,400
Herald: So there was a question from the
Internet.
373
00:42:26,400 --> 00:42:33,723
Question: Did you report your findings to
Facebook? And did they do anything?
374
00:42:33,723 --> 00:42:41,509
Svea: Because this research is very new,
we just recently approached them and
375
00:42:41,509 --> 00:42:47,190
showed them the research and we got an
answer. But I think we also already showed
376
00:42:47,190 --> 00:42:54,480
the answer. It was, I think, that
they only count and publish active users.
377
00:42:54,480 --> 00:42:59,680
They did not want to tell us
how many registered users they have. And
378
00:42:59,680 --> 00:43:03,859
they say, oh, sometimes users register
accounts, but don't use them or verify
379
00:43:03,859 --> 00:43:08,930
them. And that they regularly delete fake
accounts. But we hope that we get into a
380
00:43:08,930 --> 00:43:12,469
closer discussion with them soon about
this.
381
00:43:12,469 --> 00:43:19,469
Herald: Microphone two.
Question: When hunting down the bias of
382
00:43:19,469 --> 00:43:26,740
the campaigns, did you dig out your own
campaign, "Line below the line"?
Svea: No,
383
00:43:26,740 --> 00:43:34,039
because they stopped scraping in August.
You stopped scraping in August. And
384
00:43:34,039 --> 00:43:39,449
then I started, you know, the whole
project started with them coming to us
385
00:43:39,449 --> 00:43:44,599
with the list. And then we thought, oh,
this is very interesting. And then the
386
00:43:44,599 --> 00:43:50,729
whole journalistic research started. And,
but I think if we, I think if we would do
387
00:43:50,729 --> 00:43:56,200
it again, of course, I think we would find
us. We all also found there was another
388
00:43:56,200 --> 00:44:01,650
magazine, and they did, also a test, paid
test a couple of years ago. And we found
389
00:44:01,650 --> 00:44:04,920
their campaign.
Philip: So, we actually did another
390
00:44:04,920 --> 00:44:11,480
test. And for the other test, I noted we
also got this ID, I think. And it
391
00:44:11,480 --> 00:44:20,329
worked to plug it into the URL and then we
also got redirected to our own page. So
392
00:44:20,329 --> 00:44:22,569
that worked.
Q: Thank you.
393
00:44:22,569 --> 00:44:26,379
Herald: Microphone three.
Question: Hi. I'm Farhan, I'm a Pakistani
394
00:44:26,379 --> 00:44:30,759
journalist. And first of all, I would like
to say that you were right when you said
395
00:44:30,759 --> 00:44:34,910
that there might be people sitting in
Pakistan clicking on the likes. That does
396
00:44:34,910 --> 00:44:41,329
happen. But my question would be that
Facebook does have its own ad program that
397
00:44:41,329 --> 00:44:47,470
it aggressively pushes. And in that ad
program, there are also options whereby
398
00:44:47,470 --> 00:44:53,701
people can buy likes and comments and
impressions and reactions. Did you, would
399
00:44:53,701 --> 00:44:59,670
you also consider those as fake? I mean,
that they're not fake, per se, but they're
400
00:44:59,670 --> 00:45:05,799
still bought likes. So what's your view on
those? Thank you.
401
00:45:05,799 --> 00:45:14,349
Philip: So, when you buy ads on Facebook,
then what you actually want
402
00:45:14,349 --> 00:45:19,489
to have is fans for your page that are
actually interested in your page. So
403
00:45:19,489 --> 00:45:25,460
that's kind of the difference, I think, to
the paid likes system where the
404
00:45:25,460 --> 00:45:30,119
people themselves get paid for
liking stuff that they wouldn't normally
405
00:45:30,119 --> 00:45:35,599
like. So I think that's the fundamental
difference between the two programs. And
406
00:45:35,599 --> 00:45:40,529
that's why I think that one is unethical.
And one is not really that unethical.
407
00:45:40,529 --> 00:45:47,749
Svea: The real problem is, if you
buy these click workers, then you have
408
00:45:47,749 --> 00:45:52,789
many people in your fan page. They are not
interested in you. They don't care about
409
00:45:52,789 --> 00:45:57,410
you. They don't look at your products.
They don't look at your political party.
410
00:45:57,410 --> 00:46:03,539
And then often the people
additionally make Facebook ads, and
411
00:46:03,539 --> 00:46:08,229
these ads are shown, again, to the
click workers and they don't look at them.
412
00:46:08,229 --> 00:46:13,410
So, you know, people are burning
money and money and money with this whole
413
00:46:13,410 --> 00:46:18,069
corrupt system.
Herald: So, microphone two.
414
00:46:18,069 --> 00:46:22,039
Question: Hi. Thanks. Thanks for the talk
and thanks for the effort of going through
415
00:46:22,039 --> 00:46:27,709
all of this project. From my
understanding, this whole finding
416
00:46:27,709 --> 00:46:35,209
basically undermines the trust in
Facebook's likes in general, per se. So I
417
00:46:35,209 --> 00:46:42,369
would expect now the price of likes to
drop and the pay for click workers to drop
418
00:46:42,369 --> 00:46:49,250
as well. Do you have any metrics on that?
Svea: The research just went public. I
419
00:46:49,250 --> 00:46:56,180
think one week ago. So what we have
seen as an effect is that Facebook
420
00:46:56,180 --> 00:47:02,940
excluded PaidLikes for a moment. So,
yes, of course, one platform is down. But
421
00:47:02,940 --> 00:47:08,010
I think there are so many outside. There
are so many. So I think...
422
00:47:08,010 --> 00:47:14,229
Q: I meant the phenomenon of paid likes,
not the company itself. Like the value of
423
00:47:14,229 --> 00:47:19,319
a like as a measure of credibility...
Philip: We didn't...
424
00:47:19,319 --> 00:47:22,829
Q: ...is declining now. That's my, that's
my...
425
00:47:22,829 --> 00:47:27,869
Svea: Yes. That's why many people are
buying Instagram hearts now. So, yes,
426
00:47:27,869 --> 00:47:32,900
that's true. The like is not the fancy hot
shit anymore. Yes. And we also saw in the
427
00:47:32,900 --> 00:47:40,670
data that the likes for the fan pages,
they rapidly went down and the likes for
428
00:47:40,670 --> 00:47:45,229
the posts and the comments, they went up.
So I think, yes, there is a shift. And
429
00:47:45,229 --> 00:47:51,809
what we also saw in that data was that the
Facebook likes, they went down from
430
00:47:51,809 --> 00:47:57,839
2016. They went rapidly down. And what is
growing and rising is YouTube and
431
00:47:57,839 --> 00:48:01,609
Instagram. Now, everything is about,
today, everything is about Instagram.
432
00:48:01,609 --> 00:48:05,270
Q: Thanks.
Herald: So let's go to number one.
433
00:48:05,270 --> 00:48:09,630
Question: Hello and thank you very much
for this fascinating talk, because I've
434
00:48:09,630 --> 00:48:15,400
been following this whole topic for a
while. And I was wondering if you were
435
00:48:15,400 --> 00:48:20,849
looking also into the demographics, in
terms of age groups and social class, not
436
00:48:20,849 --> 00:48:25,619
of the people who were doing the actual
liking, but actually, you know, buying
437
00:48:25,619 --> 00:48:31,249
these likes. Because I think that what is
changing is an entire social discourse on
438
00:48:31,249 --> 00:48:36,709
social capital, to use the Bourdieu kind of
term, because it can now be quantified. As
439
00:48:36,709 --> 00:48:43,650
a teacher, I hear of kids who buy likes to
be more popular than their other
440
00:48:43,650 --> 00:48:47,880
schoolmates. So I'm wondering if you're
looking into that, because I think that's
441
00:48:47,880 --> 00:48:52,559
fascinating, fascinating area to actually
come up with numbers about it.
442
00:48:52,559 --> 00:48:59,229
Svea: It definitely is. And we were all so
fascinated by this data set of 90,000 data
443
00:48:59,229 --> 00:49:05,479
points. And what we did, and this was
very hard, was that we tried, first
444
00:49:05,479 --> 00:49:11,869
of all, to look at who is buying likes,
like automotive companies,
445
00:49:11,869 --> 00:49:18,910
you know, what kind of industries? Who
is in that? And so this was
446
00:49:18,910 --> 00:49:24,769
doable. But to get more into demographics,
you would have had to crawl, to
447
00:49:24,769 --> 00:49:33,699
click every page. And so we did not do
this. What we did, of course, was that we
448
00:49:33,699 --> 00:49:38,489
were a team of three to ten people
manually looking into it. And what we,
449
00:49:38,489 --> 00:49:43,739
of course, saw was that on Instagram and on
YouTube, you have many of these very young
450
00:49:43,739 --> 00:49:47,219
people. Some of them, I actually called
them and they were like, Yes, I bought
451
00:49:47,219 --> 00:49:54,089
likes. Very bad idea. So I think yes, I
think there is a demographic shift away
452
00:49:54,089 --> 00:49:59,890
from the companies and the automotive
industry buying Facebook fan page likes
453
00:49:59,890 --> 00:50:04,390
to Instagram and YouTube
wannabe-influencers.
454
00:50:04,390 --> 00:50:06,430
Q: Influencers, influencer culture is
obviously...
455
00:50:06,430 --> 00:50:12,670
Svea: Yes. And I have to admit, here we
showed you the political side, but we have
456
00:50:12,670 --> 00:50:19,849
to admit that the political likes, they
were like this small in the numbers. And
457
00:50:19,849 --> 00:50:25,640
the very, very vast majority of this data
set, it's about wedding planners,
458
00:50:25,640 --> 00:50:31,440
photography, tattoo studios and
influencers, influencers, influencers and
459
00:50:31,440 --> 00:50:34,479
YouTubers, of course.
Q: Yes. Thank you so much.
460
00:50:34,479 --> 00:50:37,439
Herald: So we have a lot of questions in
the room. I'm going to get to you as soon
461
00:50:37,439 --> 00:50:40,009
as we can. I'd like to go to the Internet
first.
462
00:50:40,009 --> 00:50:44,680
Signal Angel: Do you think this will get
better or worse if people move to more
463
00:50:44,680 --> 00:50:48,319
decentralized platforms?
Philip: To more what?
464
00:50:48,319 --> 00:50:54,910
Svea: If it get better or worse.
Dennis: Can you repeat that, please?
465
00:50:54,910 --> 00:50:58,880
Herald: Would this issue get better or
worse if people move to a more
466
00:50:58,880 --> 00:51:01,239
decentralized platform?
Philip: Decentralized, decentralized,
467
00:51:01,239 --> 00:51:12,160
okay. So, I mean, we can look at
this slide, I think, and think about
468
00:51:12,160 --> 00:51:18,249
whether decentralized platforms would
change any of these two
469
00:51:18,249 --> 00:51:25,999
points here. And I fear, I don't think so,
because they cannot solve the interactions
470
00:51:25,999 --> 00:51:30,210
problem that people can be hyperactive.
Actually, that's kind of a normal thing
471
00:51:30,210 --> 00:51:34,299
with social media. A small portion of
social media users is much more active
472
00:51:34,299 --> 00:51:39,880
than everybody else. That's kind of normal. You
have that without paying for it. So
473
00:51:39,880 --> 00:51:44,720
without even having paid likes, you will
have to consider if social media is really
474
00:51:44,720 --> 00:51:51,189
kind of representative of the society.
But, and the other thing is authenticity.
475
00:51:51,189 --> 00:51:57,170
And also in a decentralized platform, you
could have multiple accounts run by the
476
00:51:57,170 --> 00:52:01,199
same person.
Herald: So, microphone seven, all the way
477
00:52:01,199 --> 00:52:06,779
back there.
Question: Hi. Do you know if Facebook even
478
00:52:06,779 --> 00:52:10,220
removes the likes when they delete fake
accounts?
479
00:52:10,220 --> 00:52:17,319
Svea: Do you know that?
Philip: No, we don't know that. No, we
480
00:52:17,319 --> 00:52:21,259
don't. We don't know. We know they delete
fake accounts, but we don't know if they
481
00:52:21,259 --> 00:52:27,619
also delete the likes. I know from our
research that the people we approached,
482
00:52:27,619 --> 00:52:31,329
they did not delete the click workers.
They get...
483
00:52:31,329 --> 00:52:35,839
Herald: Microphone two.
Question: Yeah. Hi. So I have a question
484
00:52:35,839 --> 00:52:41,359
with respect to this, one out of four
Facebook accounts are active in your, in
485
00:52:41,359 --> 00:52:46,949
your test. Did you see any difference with
respect to age of the accounts? So is it
486
00:52:46,949 --> 00:52:52,489
always one out the four to the entire
sample? Or does it maybe change over the,
487
00:52:52,489 --> 00:52:57,730
over the like going from a zero ID to,
well, 10 billion or 40 billion?
488
00:52:57,730 --> 00:53:02,189
Philip: So you're talking about the
density of accounts in the ID space?
489
00:53:02,189 --> 00:53:05,989
Q: Kind of.
Philip: So, there are changes over
490
00:53:05,989 --> 00:53:12,150
time. Yeah. So I guess now it's
less than it was before. So now there are
491
00:53:12,150 --> 00:53:19,089
fewer than before, and earlier it was
more, I think. Yeah. I don't know.
492
00:53:19,089 --> 00:53:23,660
Q: But you don't see anything specific
that now, only in the new accounts, only
493
00:53:23,660 --> 00:53:28,229
one out of 10 is active or valid and
before it was one out of two or something
494
00:53:28,229 --> 00:53:31,259
like that.
Philip: It's not that extreme. So it's
495
00:53:31,259 --> 00:53:34,859
less than that. It's kind of...
Dennis: We have to say we did not check
496
00:53:34,859 --> 00:53:41,239
this, but there were no special cases.
Philip: But it changed over time. So
497
00:53:41,239 --> 00:53:47,200
before it was more
and now it is less. And so what we checked
498
00:53:47,200 --> 00:53:54,710
was whether an ID actually corresponds to
an account. And so this metric, yeah. And
499
00:53:54,710 --> 00:53:57,299
it changed a little bit over time, but not
much.
500
00:53:57,299 --> 00:54:02,239
Herald: So, so number three, please.
Question: Yeah. Thank you for a very
501
00:54:02,239 --> 00:54:06,989
interesting talk. At the end, you gave
some recommendations, how to fix the
502
00:54:06,989 --> 00:54:11,769
metrics, right? And it's always nice to
have some metrics because then, well, we
503
00:54:11,769 --> 00:54:15,220
are the people who deal with the numbers.
So we want the metrics. But I want to
504
00:54:15,220 --> 00:54:20,309
raise the issue whether quantitative
measure is actually the right thing to do.
505
00:54:20,309 --> 00:54:26,449
So would you buy your furniture from store
A with 300 likes against store B with 200
506
00:54:26,449 --> 00:54:32,049
likes? Or would it not be better to have a
more qualitative thing? And to what extent
507
00:54:32,049 --> 00:54:38,259
is a quantitative measure maybe also the
source of a lot of bad developments we see
508
00:54:38,259 --> 00:54:43,390
in social media to begin with, even not
with bot firms and anything, but just
509
00:54:43,390 --> 00:54:48,339
people who go for the quick like and say
Hooray for Trump and then get, whatever,
510
00:54:48,339 --> 00:54:52,479
all the Trumpists liking that, and the
others say Fuck Trump and you get all the
511
00:54:52,479 --> 00:54:57,229
non-Trumpists liking that, and you get all
the polarization, right? So, Instagram, I
512
00:54:57,229 --> 00:55:02,650
think they just don't display their
like equivalent anymore in order to
513
00:55:02,650 --> 00:55:04,929
prevent that, so could you maybe comment
on that?
514
00:55:04,929 --> 00:55:12,299
Svea: I think this is a good idea, to
hide the likes. Yes. But, you know, we
515
00:55:12,299 --> 00:55:17,799
talked to many clickworkers and they do a
lot of stuff. And what they also do is
516
00:55:17,799 --> 00:55:23,309
taking comments and doing copy-paste for
comment sections or for Amazon reviews.
517
00:55:23,309 --> 00:55:29,789
So, you know, I think it's really hard to
get them out of the system because maybe
518
00:55:29,789 --> 00:55:34,390
if the likes are not shown, and when
the comments count instead, then you will
519
00:55:34,390 --> 00:55:41,069
have people copy-pasting comments
in the comments section. So I really think
520
00:55:41,069 --> 00:55:44,519
that the networks really have
an issue here.
521
00:55:44,519 --> 00:55:49,829
Herald: So let's try to squeeze the last
three questions now. First, number seven,
522
00:55:49,829 --> 00:55:52,950
really quick.
Question: Very quick. Thank you for the
523
00:55:52,950 --> 00:55:58,799
nice insights. And I have a question about
the location of the users. So you made
524
00:55:58,799 --> 00:56:03,289
your point that you can analyze by the
metadata when the account was
525
00:56:03,289 --> 00:56:08,650
made. But how about the location of the
followers? Is there any way to analyze
526
00:56:08,650 --> 00:56:12,339
that as well?
Philip: So we can only analyze that if
527
00:56:12,339 --> 00:56:21,049
the users agreed to share it publicly and
not all of them do that. I think a
528
00:56:21,049 --> 00:56:26,460
name check is often a very good way to
check where somebody is from. For these
529
00:56:26,460 --> 00:56:32,190
fake likes, for example. But as I said, it
always depends on what the user himself is
530
00:56:32,190 --> 00:56:36,130
willing to share.
Herald: Internet?
531
00:56:36,130 --> 00:56:41,039
Signal Angel: Isn't this just the western
version of the Chinese social credit
532
00:56:41,039 --> 00:56:43,999
system? Where do we go from here? What is
the future of all this?
533
00:56:43,999 --> 00:56:54,089
Svea: Yeah, it's dystopian, right? Oh,
yeah, I don't, after this research, you
534
00:56:54,089 --> 00:57:01,109
know, for me, I deleted my Facebook
account like one or two years ago. So this
535
00:57:01,109 --> 00:57:07,279
did not matter, you know, to me
so much. But I stayed on Instagram and
536
00:57:07,279 --> 00:57:13,359
when I saw all these bought likes and
subscribers and followers and also YouTube,
537
00:57:13,359 --> 00:57:16,999
all these views, because the click
workers, they also watch YouTube videos.
538
00:57:16,999 --> 00:57:20,859
They have to stay on them like 40 seconds,
it's really funny because they hate
539
00:57:20,859 --> 00:57:27,239
hearing like techno music, rap music, for
40 seconds each, and then they go on. But when I
540
00:57:27,239 --> 00:57:34,589
sat next to Harald for two hours, three
hours, I was so disillusioned about all
541
00:57:34,589 --> 00:57:40,960
the social network things. And I
thought, OK, don't count on anything. Just
542
00:57:40,960 --> 00:57:46,119
if you like the content, follow them and
look at them. But don't believe anything.
543
00:57:46,119 --> 00:57:50,479
That was my personal take away from this
research.
544
00:57:50,479 --> 00:57:53,970
Herald: So very last question, microphone
two.
545
00:57:53,970 --> 00:57:59,150
Question: A couple of days ago, The
Independent reported that Facebook, the
546
00:57:59,150 --> 00:58:06,839
Facebook App was activating the camera
when reading a news feed. Could this be in
547
00:58:06,839 --> 00:58:10,779
use in the context of detecting fake
accounts?
548
00:58:10,779 --> 00:58:18,400
Svea: I don't know.
Philip: So, I think that in this
549
00:58:18,400 --> 00:58:26,799
particular instance it was probably a
bug. So, I don't know, but I mean, the
550
00:58:26,799 --> 00:58:30,679
people who work at Facebook, not all
of them are like crooks or anything that
551
00:58:30,679 --> 00:58:35,130
they will deliberately program this kind
of stuff. So they said that it was kind of
552
00:58:35,130 --> 00:58:41,189
a bug from an update that they did.
And the question is whether we can
553
00:58:41,189 --> 00:58:49,430
actually detect fake accounts with the
camera. And the problem is, I
554
00:58:49,430 --> 00:58:57,469
don't think that current face recognition
technology is enough to detect that you
555
00:58:57,469 --> 00:59:02,940
are a unique person. So there are so many
people on the planet that there is probably another
556
00:59:02,940 --> 00:59:08,959
person who has the same face. And I think
the new iPhone, they also have this much
557
00:59:08,959 --> 00:59:14,579
more sophisticated version of this
technology. And even they say, OK, there's
558
00:59:14,579 --> 00:59:19,079
a chance of one in, I don't know, that
there is somebody who can unlock your
559
00:59:19,079 --> 00:59:23,829
phone. So I think it's really hard to do
that with recording
560
00:59:23,829 --> 00:59:29,299
technology, to actually prove that
somebody is just one person.
561
00:59:29,299 --> 00:59:38,059
Herald: So with that, would you please
help me thank Svea, Dennis and Philip
562
00:59:38,059 --> 00:59:41,160
one more time for this fantastic
presentation! Very interesting and very,
563
00:59:41,160 --> 00:59:48,099
very disturbing. Thank you very much.
Applause
564
00:59:48,099 --> 00:59:52,099
postroll music
565
00:59:52,099 --> 01:00:16,000
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!