WEBVTT 00:00:00.000 --> 00:00:19.030 36C3 preroll music 00:00:19.030 --> 00:00:26.500 Herald: OK. So inside the fake like factories. I'm going to date myself. I 00:00:26.500 --> 00:00:32.980 remember it was the Congress around 1990, 1991 or so, where I was sitting 00:00:32.980 --> 00:00:38.550 together with some people who came over to the states to visit the CCC Congress. And 00:00:38.550 --> 00:00:43.230 we were kind of riffing on how great the internet is gonna make the world, you 00:00:43.230 --> 00:00:46.970 know, how it's gonna bring world peace and truth will rule and everything like 00:00:46.970 --> 00:00:57.259 that. Boy, were we naive, boy, were we totally wrong. And today I'm going to be 00:00:57.259 --> 00:01:03.470 schooled in how wrong I actually was because we have Svea, Dennis and Philip to 00:01:03.470 --> 00:01:08.980 tell us all about the fake like factories around the world. And with that, could you 00:01:08.980 --> 00:01:17.670 please help me in welcoming them onto the stage? Svea, Dennis and Philip. 00:01:17.670 --> 00:01:28.810 Philip: Thank you very much. Welcome to our talk "Inside the Fake Like Factories". 00:01:28.810 --> 00:01:35.899 My name is Philip. I'm an Internet activist against disinformation and I'm 00:01:35.899 --> 00:01:38.719 also a student of the University of Bamberg. 00:01:38.719 --> 00:01:45.039 Svea: Hi. Thank you for listening to us tonight. My name is Svea. I'm an 00:01:45.039 --> 00:01:50.219 investigative journalist, freelance mostly for the NDR and ARD. It's a public 00:01:50.219 --> 00:01:55.759 broadcaster in Germany. And I focus on tech issues. And I had the pleasure to 00:01:55.759 --> 00:02:01.280 work with these two guys on, for me, a journalistic project and for them on a 00:02:01.280 --> 00:02:04.289 scientific project. Dennis: Yeah. Hi, everyone. My name is 00:02:04.289 --> 00:02:09.009 Dennis. I'm a PhD student from Ruhr University Bochum. 
I'm working as a 00:02:09.009 --> 00:02:16.160 research assistant for the chair for System Security. My research focuses on 00:02:16.160 --> 00:02:21.349 network security topics and Internet measurements. And as Svea said, Philip and 00:02:21.349 --> 00:02:26.660 myself, we are here for the scientific part and Svea is for the journalistic part 00:02:26.660 --> 00:02:31.790 here. Philip: So here's our outline for today. 00:02:31.790 --> 00:02:38.550 So first, I'm going to briefly talk about our motivation for our descent into the 00:02:38.550 --> 00:02:45.160 fake like factories and then we are going to show you how we got our hands on ninety 00:02:45.160 --> 00:02:50.780 thousand fake like campaigns of a major crowd working platform. And we are also 00:02:50.780 --> 00:02:56.080 going to show you why we think that there are 10 billion registered Facebook users 00:02:56.080 --> 00:03:04.360 today. So first, I'm going to talk about the like button. The like button is the 00:03:04.360 --> 00:03:12.150 ultimate indicator for popularity on social media. It shows you how trustworthy 00:03:12.150 --> 00:03:18.620 someone is. It shows how how popular someone is. It shows, it is an indicator 00:03:18.620 --> 00:03:26.520 for economic success of brands and it also influences the Facebook algorithm. And as 00:03:26.520 --> 00:03:31.710 we are going to show now, these kind of likes can be easily forged and 00:03:31.710 --> 00:03:38.580 manipulated. But the problem is that many users will still prefer this bad info on 00:03:38.580 --> 00:03:45.960 Facebook about the popularity of a product to no info at all. And so this is a real 00:03:45.960 --> 00:03:53.780 problem. And there is no real solution to this. So first, we are going to talk about 00:03:53.780 --> 00:03:58.990 the factories and the workers in the fake like factories. 00:03:58.990 --> 00:04:04.210 Svea: That there are fake likes and that you can buy likes everywhere, it's well 00:04:04.210 --> 00:04:09.660 known. 
So if you Google "buying fake likes" or even "fake comments" for 00:04:09.660 --> 00:04:15.100 Instagram or for Facebook, then you will get like hundreds of results and you can 00:04:15.100 --> 00:04:19.989 buy them very cheap and very expensive. It doesn't matter, you can buy them from 00:04:19.989 --> 00:04:27.790 every country. But when you think of these bought likes, then you may think of this. 00:04:27.790 --> 00:04:34.960 So you may think of somebody sitting in China, Pakistan or India, and you think of 00:04:34.960 --> 00:04:40.240 computers and machines doing all this and that they are, yeah, that they are fake 00:04:40.240 --> 00:04:47.630 and also that they can easily be detected and that maybe they are not a big problem. 00:04:47.630 --> 00:04:54.880 But it's not always like this. It also can be like this. So, I want you to meet 00:04:54.880 --> 00:05:03.120 Maria, I met her in Berlin. And Harald, he lives near Mönchengladbach. So Maria, she 00:05:03.120 --> 00:05:11.750 is a retiree. She was a former police officer. And as money is always short, she 00:05:11.750 --> 00:05:19.670 is clicking Facebook likes for money. She earns between 2 cents and 6 cents per like. 00:05:19.670 --> 00:05:28.720 And Harald, he was a baker once, is now getting social aid and he is also clicking 00:05:28.720 --> 00:05:34.480 and liking and commenting the whole day. We met them during our research project 00:05:34.480 --> 00:05:40.930 and did some interviews about their likes. And one platform they are clicking and 00:05:40.930 --> 00:05:46.750 working for is PaidLikes. It's only one platform out of a universe, out of a 00:05:46.750 --> 00:05:52.070 cosmos. PaidLikes, they are sitting just a couple of minutes from here in Magdeburg 00:05:52.070 --> 00:05:56.990 and they are offering that you can earn money with liking on different platforms. 
00:05:56.990 --> 00:06:02.410 And it looks like this when you log into the platform with your Facebook account 00:06:02.410 --> 00:06:07.300 then you get in the morning, in the afternoon, in the evening, you get, we 00:06:07.300 --> 00:06:13.260 call it campaigns. But these are pages, Facebook fan pages or Instagram pages, or 00:06:13.260 --> 00:06:18.240 posts, or comments. You can, you know, you can work your way through them and click 00:06:18.240 --> 00:06:22.930 them. And I blurred you see here the blue bar; I blurred them because we don't want 00:06:22.930 --> 00:06:29.800 to get sued from all these companies, which you can see there. To take you a 00:06:29.800 --> 00:06:37.310 little bit with me on the journey. Harald, he was okay with us coming by for 00:06:37.310 --> 00:06:44.280 television and he was okay that we did a long interview with him, and I want to 00:06:44.280 --> 00:06:50.080 show you a very small piece out of his daily life sitting there doing the 00:06:50.080 --> 00:06:53.540 household, the washing and the cleaning, and clicking. 00:07:26.760 --> 00:07:36.020 Come on. It could be like that. You click and you earn some money. How did we meet 00:07:36.020 --> 00:07:41.150 him and all the others? Of course, because Philip and Dennis, they have a more 00:07:41.150 --> 00:07:45.169 scientific approach. So it was also important not only to talk to one or two, 00:07:45.169 --> 00:07:50.120 but to talk to many. So we created a Facebook fan page, which we call "Eine 00:07:50.120 --> 00:07:54.210 Linie unterm Strich" (a line under a line) because I thought, okay, nobody will like 00:07:54.210 --> 00:08:01.080 this freely. And then we did a post. This post, and we bought likes, and you won't 00:08:01.080 --> 00:08:10.310 believe it, it worked so well; 222 people, all the people I paid for liked this. And 00:08:10.310 --> 00:08:18.259 then we wrote all of them and we talked to many of them. 
Some of them only in 00:08:18.259 --> 00:08:23.410 writing, some of them only we just called or had a phone chat. But they gave us a 00:08:23.410 --> 00:08:29.949 lot of information about their life as a click worker, which I will sum up. So what 00:08:29.949 --> 00:08:36.169 PaidLikes by itself says, they say that they have 30000 registered users, and it's 00:08:36.169 --> 00:08:41.070 really interesting because you might think that they are all registered with 10 or 15 00:08:41.070 --> 00:08:45.620 accounts, but most of them, they are not. They are clicking with their real account, 00:08:45.620 --> 00:08:57.529 which makes it really hard to detect them. So they even scan their I.D. so that the 00:08:57.529 --> 00:09:03.210 company knows that they are real. Then they earn their money. And we met men, 00:09:03.210 --> 00:09:09.760 women, stay-at-home moms, low-income earners, retirees, people who are getting 00:09:09.760 --> 00:09:17.850 social care. So, basically, anybody. There was no kind of bias. And many of them are 00:09:17.850 --> 00:09:24.890 clicking for two and more platforms. That was, I didn't meet anybody who's only 00:09:24.890 --> 00:09:29.370 clicking for one platform. They all have a variety of platforms where they are 00:09:29.370 --> 00:09:34.610 writing comments or clicking likes. And you can make - this is what they told us - 00:09:34.610 --> 00:09:41.580 between 15 euro and 450 euro monthly, if you are a so-called power clicker and you 00:09:41.580 --> 00:09:48.410 do this some kind of professional. But this are only the workers, and maybe you 00:09:48.410 --> 00:09:52.740 are more interested in who are the buyers? Who benefits? 00:09:52.740 --> 00:09:59.631 Dennis: Yeah. Let's come to step two. Who benefits from the campaigns? So I think 00:09:59.631 --> 00:10:06.089 you all remember this page. 
This is the screen if you log into PaidLikes and, 00:10:06.089 --> 00:10:14.490 you'll see the campaigns with, you have to click in order to get a little bit of 00:10:14.490 --> 00:10:25.370 money. And by luck we've noticed that if you go over a URL, we see in the left 00:10:25.370 --> 00:10:31.980 bottom side of the browser, a URL redirecting to the campaign. You have to 00:10:31.980 --> 00:10:40.700 click and you see that every campaign is using a unique ID. It is just a simple 00:10:40.700 --> 00:10:49.640 integer, and the good thing is, it is just incremented. So now maybe some of you guys 00:10:49.640 --> 00:10:56.570 notice what we can do with that. And yeah, it is really easy with these constructed 00:10:56.570 --> 00:11:02.670 URLs to implement a crawler for data gathering, and our crawler simply 00:11:02.670 --> 00:11:11.931 requested all campaign IDs between 0 and 90000. Maybe some of you ask why 90000? As 00:11:11.931 --> 00:11:17.110 I already said, we were also registered as click workers and we see, we saw that the 00:11:17.110 --> 00:11:24.779 highest ID campaign used is about 88000. So we thought OK, 90000 is a good value 00:11:24.779 --> 00:11:30.540 and we check for every request between these 90000 requests if it got resolved or 00:11:30.540 --> 00:11:36.030 not, and if it got resolved, we redirected the URL we present this source. That 00:11:36.030 --> 00:11:42.431 should be liked or followed. And we did not save the page sources from the 00:11:42.431 --> 00:11:50.750 resolved URLs, we only save the resolved URLs in the list of campaigns, and this 00:11:50.750 --> 00:11:58.700 list was then the basis for further analysis. And here you see our list. 00:11:58.700 --> 00:12:05.740 Svea: Yes. This was the point when Dennis and Philip, when they came to us and said, 00:12:05.740 --> 00:12:12.000 hey, we have a list. So what can you find? And of course we searched AfD, was one of 00:12:12.000 --> 00:12:20.940 the first search queries. 
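The ID enumeration Dennis describes can be sketched roughly as follows. This is a minimal illustration, not the team's actual crawler: the function name `crawl_campaigns`, the `resolve` callback and the stand-in redirect table are all assumptions made for the example. In a real run, `resolve` would request the campaign URL for one ID and return the address it redirects to, or `None` if the ID does not resolve. As in the talk, only the resolved URLs are kept, not the page sources.

```python
from typing import Callable, Dict, Optional


def crawl_campaigns(resolve: Callable[[int], Optional[str]], max_id: int) -> Dict[int, str]:
    """Try every campaign ID from 0 to max_id and keep only the resolved target URLs."""
    campaigns: Dict[int, str] = {}
    for campaign_id in range(max_id + 1):
        target = resolve(campaign_id)  # None means the ID did not resolve
        if target is not None:
            campaigns[campaign_id] = target  # store the URL only, not the page source
    return campaigns


# Hypothetical stand-in for the network lookup, so the sketch runs offline.
FAKE_REDIRECTS = {
    3: "https://www.facebook.com/example-page",
    7: "https://www.instagram.com/example-account",
}


def fake_resolve(campaign_id: int) -> Optional[str]:
    return FAKE_REDIRECTS.get(campaign_id)


campaign_list = crawl_campaigns(fake_resolve, max_id=10)
print(campaign_list)
```

With an upper bound of 90000, as chosen in the talk, the same loop would walk the whole ID range the team observed.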
And yeah, of course, AfD is also in that list. Maybe 00:12:20.940 --> 00:12:31.149 not so surprisingly for some. And when you look, it is AfD Gelsenkirchen. And the fan 00:12:31.149 --> 00:12:39.589 page. And we asked AfD Gelsenkirchen, did you buy likes? And they said, we don't 00:12:39.589 --> 00:12:48.240 know how we got on that list. But however, we do not rule out an anonymous donation. 00:12:48.240 --> 00:12:55.410 But now you would think, Ok, they found AfD; this is very predictable. But no, all 00:12:55.410 --> 00:13:00.930 political parties - mostly local and regional entities - showed up on that 00:13:00.930 --> 00:13:09.250 list. So we have CDU/CSU. We have FDP, SPD, AfD, Die Grünen and Die Linke. But 00:13:09.250 --> 00:13:15.390 don't think that Angela Merkel or some very big Facebook fan pages just showed 00:13:15.390 --> 00:13:23.800 up. No, no. Very small entities with a couple of hundred or maybe 10000 or 15000 00:13:23.800 --> 00:13:28.390 followers. And I think this makes perfect sense, because somebody who has 00:13:28.390 --> 00:13:35.370 already very, very many fans probably would not buy them there at 00:13:35.370 --> 00:13:46.311 PaidLikes. And we asked many of them, and mostly they could not explain it. They 00:13:46.311 --> 00:13:52.040 would never do something like that. Yeah, they were completely stumped. But you 00:13:52.040 --> 00:13:56.690 have to think that we only saw the campaign. The campaigns, their Facebook 00:13:56.690 --> 00:14:03.110 fan pages, we could not see who bought the likes. And as you can imagine, everybody 00:14:03.110 --> 00:14:08.740 could have done it like the mother, the brother, the fan, you know, the dog. So 00:14:08.740 --> 00:14:15.160 this was a case we would have needed a lot of luck to call anybody out of the blue 00:14:15.160 --> 00:14:20.260 and then he would say, oh, yes, I did this. And there was one, or there were 00:14:20.260 --> 00:14:25.810 some politicians who admitted it. 
And one of them, she did it also publicly and gave 00:14:25.810 --> 00:14:35.339 us an interview. It's Tanja Kühne. She is a regional politician from Walsrode, 00:14:35.339 --> 00:14:40.260 Niedersachsen. And she was in the..., it was the case that it was after an election 00:14:40.260 --> 00:14:44.360 and she was not very happy with her fan page. That is what she told us. She was 00:14:44.360 --> 00:14:49.220 very unhappy and she wanted, you know, to push herself and to boost it a little bit, 00:14:49.220 --> 00:14:55.510 and get more friends and followers and reach. And then she bought 500 followers. 00:14:55.510 --> 00:15:02.870 And then we had a nice interview with her about that. I'll show you a small piece. 00:15:53.829 --> 00:15:59.760 Okay, so you see, the answers are pretty interesting. And she... I think she was 00:15:59.760 --> 00:16:05.180 that courageous to speak out to us. Many others did too, but only on the phone. 00:16:05.180 --> 00:16:09.180 And they didn't want to go on the record. But she's not the only one who answered 00:16:09.180 --> 00:16:14.110 like this. Because, of course, if you call through a list of potential fake like 00:16:14.110 --> 00:16:21.120 buyers, of course they answer like, no, it's not a scam. And I also think from a 00:16:21.120 --> 00:16:26.180 legal point of view, it's also very hard to show that this is fraud and a 00:16:26.180 --> 00:16:33.209 scam. And it's more an ethical problem that you can see here, that 00:16:33.209 --> 00:16:40.170 it's manipulative if you buy likes. We also found a guy from FDP from the 00:16:40.170 --> 00:16:45.269 Bundestag. But yeah, he ran away and didn't want to get interviewed, so I 00:16:45.269 --> 00:16:52.700 couldn't show you. So bought, or no probably... He was like 40 times in our 00:16:52.700 --> 00:16:59.100 list for various Facebook posts and videos and also for his Instagram account. 
But we 00:16:59.100 --> 00:17:06.730 could not get him on, we could not get him on record. So what did others say? We, of 00:17:06.730 --> 00:17:10.970 course, confronted Facebook, Instagram and YouTube with this small research. And they 00:17:10.970 --> 00:17:18.079 said, no, we don't want fake likes on our platform. PaidLikes is active since 2012, 00:17:18.079 --> 00:17:25.370 you know. So they waited seven years. But after our report, at least, Facebook 00:17:25.370 --> 00:17:32.549 temporarily blocked PaidLikes. And of course, we asked them too, and spoke to 00:17:32.549 --> 00:17:35.781 them and wrote with PaidLikes in Magdeburg. And they said, of course, it's 00:17:35.781 --> 00:17:41.620 not a scam because the click workers they are freely clicking on pages. So, yeah, 00:17:41.620 --> 00:17:47.640 kind of nobody cares. But PaidLikes, this is only the tip of the iceberg. 00:17:47.640 --> 00:17:58.520 Philip: So we also wanted to dive a little bit into this fake like universe outside 00:17:58.520 --> 00:18:05.780 of PaidLikes and to see what else is out there. And so we did an analysis of 00:18:05.780 --> 00:18:12.780 account creation on Facebook. So what Facebook is saying about account creation 00:18:12.780 --> 00:18:19.299 is that they are very effective against fake accounts. So they say they remove 00:18:19.299 --> 00:18:26.330 billions of accounts each year, and that most of these accounts never reach any 00:18:26.330 --> 00:18:33.000 real users and they remove them before they get reported. So what Facebook 00:18:33.000 --> 00:18:39.080 basically wants to tell you is that they have it under control. However, there are 00:18:39.080 --> 00:18:45.700 a number of reports that suggest otherwise. For example, recently at NATO- 00:18:45.700 --> 00:18:53.630 Stratcom Taskforce released a report where they actually bought 54000 likes, 54000 00:18:53.630 --> 00:19:02.220 social media interactions for just 300 Euros. So this is a very low price. 
And I 00:19:02.220 --> 00:19:07.169 think you wouldn't expect such a low price if it were hard to get that many 00:19:07.169 --> 00:19:15.880 interactions. They bought 3500 comments, 25000 likes, 20000 views and 5100 00:19:15.880 --> 00:19:22.991 followers. Everything for just 300 Euros. So, you know, the thing they have in 00:19:22.991 --> 00:19:32.050 common, they are cheap, the fake likes and the fake interactions. So we also have, 00:19:32.050 --> 00:19:38.470 there was also another report from Vice Germany recently. And they reported on 00:19:38.470 --> 00:19:46.410 some interesting facts about automated fake accounts. They reported on findings 00:19:46.410 --> 00:19:50.980 that suggest that actually people use hacked Internet of Things 00:19:50.980 --> 00:19:59.150 devices to create these fake accounts and to manage them. And so 00:19:59.150 --> 00:20:04.590 it's actually kind of interesting to think about it this way. To say, OK, maybe next 00:20:04.590 --> 00:20:11.020 election your fridge is actually going to support the other candidate on Facebook. 00:20:11.020 --> 00:20:16.970 And so we also wanted to look into this and we wanted to go a step further and to 00:20:16.970 --> 00:20:24.660 look at who these people are. Who are they, and what are they doing on 00:20:24.660 --> 00:20:32.200 Facebook? And so we actually examined the profiles of purchased likes. For this we 00:20:32.200 --> 00:20:38.390 created four comments under arbitrary posts, and then we bought likes for these 00:20:38.390 --> 00:20:46.500 comments, and then we examined the resulting profiles of the fake likes. So 00:20:46.500 --> 00:20:51.050 it was pretty cheap to buy these likes. Comment likes are always a little bit more 00:20:51.050 --> 00:20:59.520 expensive than other likes. And we found all these offerings on Google and we paid 00:20:59.520 --> 00:21:08.169 with PayPal. 
So we actually used a pretty neat trick to estimate the age of these 00:21:08.169 --> 00:21:16.490 fake accounts. So as you can see here, the Facebook user ID is incremented. So 00:21:16.490 --> 00:21:24.250 Facebook started in 2009 to use incremented Facebook ID, and they use this 00:21:24.250 --> 00:21:31.780 pattern of 1 0 0 0 and then the incremented number. And as you can see, in 00:21:31.780 --> 00:21:40.200 2009 this incremented number was very close to zero. And then today it is close 00:21:40.200 --> 00:21:49.559 to 40 billion. And in this time period, you can see that you can kind of get a 00:21:49.559 --> 00:21:56.770 rather fitting line through all these points. And you can see that the likes are 00:21:56.770 --> 00:22:02.710 in fact incremented, ... the account IDs are in fact incremented over time. So we 00:22:02.710 --> 00:22:08.670 can use this fact in reverse to estimate the creation date of an account where we 00:22:08.670 --> 00:22:15.340 know the Facebook ID. And that's exactly what we did with these fake likes. So we 00:22:15.340 --> 00:22:22.090 estimated the account creation dates. And as you can see, we get kind of different 00:22:22.090 --> 00:22:28.929 results from different services. For example, PaidLikes, they had rather old 00:22:28.929 --> 00:22:35.750 accounts. So this means they use very authentic accounts. And we already know 00:22:35.750 --> 00:22:41.370 that because we talked to them. So these are very authentic accounts. Also like 00:22:41.370 --> 00:22:46.660 Service A over here also uses very, very authentic accounts. But on the other hand, 00:22:46.660 --> 00:22:52.160 like service B uses very new accounts, they were all created in the last three 00:22:52.160 --> 00:22:58.280 years. 
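The trick of reading an account's age off its incremented ID can be sketched as a straight-line interpolation. This is only an illustration under stated assumptions: the two anchor dates below are placeholders (the talk says only that the counter was near zero in 2009 and near 40 billion "today"), the input is assumed to be the incremented part of the ID with the leading 1000 pattern already removed, and the team's real fit went through many observed points rather than just two.

```python
from datetime import date, timedelta

# Assumed anchor points for the line; the exact dates are placeholders.
START_DATE = date(2009, 1, 1)    # counter close to zero (assumption)
END_DATE = date(2019, 12, 1)     # counter close to 40 billion (assumption)
END_COUNTER = 40_000_000_000


def estimate_creation_date(counter: int) -> date:
    """Linearly interpolate a creation date from the incremented part of an ID."""
    if not 0 <= counter <= END_COUNTER:
        raise ValueError("counter outside the calibrated range")
    span_days = (END_DATE - START_DATE).days
    fraction = counter / END_COUNTER
    return START_DATE + timedelta(days=round(fraction * span_days))


# A counter near the top of the range maps to a recent date,
# a small counter maps to a date close to the 2009 anchor.
print(estimate_creation_date(38_000_000_000))
print(estimate_creation_date(500_000_000))
```

Applied to the profiles behind a batch of purchased likes, a cluster of very recent estimated dates is the pattern the talk associates with bot accounts, while a spread of older dates looks more like real click workers.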
So if you look at the accounts and also from these numbers, we think that 00:22:58.280 --> 00:23:06.510 these accounts were bots and on service C it's kind of not clear, are 00:23:06.510 --> 00:23:10.870 these accounts bots or are these clickworkers? Maybe it's a mixture of 00:23:10.870 --> 00:23:17.820 both, we don't know exactly for sure. But this is an interesting metric to measure 00:23:17.820 --> 00:23:23.390 the age of the accounts to determine if some of them might be bots. And that's 00:23:23.390 --> 00:23:29.340 exactly what we did on this page. So this is actually a page for garden furniture 00:23:29.340 --> 00:23:36.750 and we found it in our list that we got from PaidLikes. So they bought, obviously 00:23:36.750 --> 00:23:43.970 they were on this list for bought likes on Facebook, on PaidLikes. And they caught 00:23:43.970 --> 00:23:51.000 our eye because they had one million likes. And that's rather unusual for a 00:23:51.000 --> 00:24:01.260 shop for garden furniture in Germany. And so we looked at this page further and we 00:24:01.260 --> 00:24:07.390 noticed other interesting things. For example, their posts, all the time, 00:24:07.390 --> 00:24:13.820 they got like thousands of likes. And that's also kind of unusual for a garden 00:24:13.820 --> 00:24:19.590 furniture shop. And so we looked into the likes and as you can see, they all look 00:24:19.590 --> 00:24:26.790 like they come from Southeast Asia and they don't look very authentic. And we 00:24:26.790 --> 00:24:32.460 were actually able to estimate the creation dates of these accounts. And we 00:24:32.460 --> 00:24:36.700 found that most of these accounts that were used for liking these posts on this 00:24:36.700 --> 00:24:44.130 page were actually created in the last three years. So this is a page where 00:24:44.130 --> 00:24:49.540 everything, from the number of people who like the page to the number of people who 00:24:49.540 --> 00:24:55.559 like the posts is complete fraud. 
So nothing about this is real. And it's 00:24:55.559 --> 00:25:02.380 obvious that this can happen on Facebook and that this is a really, really big 00:25:02.380 --> 00:25:08.309 problem. I mean, this is a, this is a shop for garden furniture. Obviously, they 00:25:08.309 --> 00:25:14.580 probably don't have such huge sums of money. So it was probably very cheap to 00:25:14.580 --> 00:25:22.170 buy this amount of fake accounts. And it is really shocking to see how, how big, 00:25:22.170 --> 00:25:31.179 how big the scale is of these kinds of operations. And so what we have to say is, 00:25:31.179 --> 00:25:39.970 OK, when Facebook says they have it under control, we have to doubt that. So now we 00:25:39.970 --> 00:25:46.320 can look at the bigger picture. And what we are going to do here is we are going to 00:25:46.320 --> 00:25:52.700 use this same graph that we used before to estimate the creation dates, but in a 00:25:52.700 --> 00:25:59.080 different way. So we can actually see the lowest and the highest points of 00:25:59.080 --> 00:26:05.090 Facebook IDs in this graph. So we know the newest Facebook ID by creating a new 00:26:05.090 --> 00:26:13.200 account. And we know the lowest ID because it's zero. And then we know that there are 00:26:13.200 --> 00:26:20.780 40 billion Facebook IDs. Now, in the next step, we took a sample, a random sample 00:26:20.780 --> 00:26:27.610 from these 40 billion Facebook IDs. And inside of the sample, we checked if these 00:26:27.610 --> 00:26:33.740 accounts exist, if this ID corresponds to an existing account. And we do that because 00:26:33.740 --> 00:26:39.360 we obviously cannot check 40 billion accounts and 40 billion IDs, but we can 00:26:39.360 --> 00:26:45.720 check a small sample of these accounts, of these IDs, and estimate, then, the number 00:26:45.720 --> 00:26:54.470 of existing accounts on Facebook in total. 
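The sampling idea described here, checking a random subset of IDs and scaling the hit rate up to the whole ID space, can be sketched like this. It is a minimal illustration, not the team's analysis code: the function name is made up, the hit count in the example is a hypothetical number, and the 95% interval uses a plain normal approximation that the talk does not mention.

```python
import math
from typing import Tuple


def estimate_existing_accounts(
    sample_size: int,
    existing_in_sample: int,
    id_space_size: int,
    z: float = 1.96,  # roughly a 95% interval under the normal approximation
) -> Tuple[float, float, float]:
    """Scale the share of existing accounts in a random sample up to the full ID space."""
    share = existing_in_sample / sample_size
    std_error = math.sqrt(share * (1 - share) / sample_size)
    estimate = share * id_space_size
    low = (share - z * std_error) * id_space_size
    high = (share + z * std_error) * id_space_size
    return estimate, low, high


# Hypothetical counts: 250,000 hits in a sample of one million IDs,
# scaled to an ID space of 40 billion.
estimate, low, high = estimate_existing_accounts(
    sample_size=1_000_000,
    existing_in_sample=250_000,
    id_space_size=40_000_000_000,
)
print(f"{estimate:.3e} accounts (between {low:.3e} and {high:.3e})")
```

The point of the interval is that with a sample of a million IDs the uncertainty from sampling alone is small compared with the estimate, provided the sample really is drawn uniformly from the ID range.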
So for this, we repeatedly access 00:26:54.470 --> 00:27:02.770 the same sample of one million random IDs over the course of one year. And we also 00:27:02.770 --> 00:27:10.100 pulled a sample of 10 million random IDs for closer analysis this July. And now 00:27:10.100 --> 00:27:15.950 Dennis is going to tell you how we did it. Dennis: Yeah. Well, pretty interesting, 00:27:15.950 --> 00:27:21.160 pretty interesting results so far, right? So we again implemented the crawler, the 00:27:21.160 --> 00:27:26.530 second time for gathering public Facebook information, the public Facebook account 00:27:26.530 --> 00:27:35.730 data. And, yeah, this was not so easy as in the first case. Um, yeah. As. It's not 00:27:35.730 --> 00:27:45.059 surprising that Facebook is using a lot of measures to try to block the automated 00:27:45.059 --> 00:27:52.460 crawling of the Facebook page, for example with IP blocking or CAPTCHA solving. But, 00:27:52.460 --> 00:27:59.929 uh, we were pretty easy... Yeah, we could pretty easy solve this problem by using 00:27:59.929 --> 00:28:06.980 the Tor Anonymity Network. So every time our IP got blocked by crawling the data, 00:28:06.980 --> 00:28:14.480 we just made a new Tor connection and change the IP. And this also with the 00:28:14.480 --> 00:28:21.440 CAPTCHAs. And with this easy method, we were able to to crawl all the Facebook, 00:28:21.440 --> 00:28:26.020 and all the public Facebook data. And let's have a look at two examples. The 00:28:26.020 --> 00:28:36.890 first example is facebook.com/4. So the, very, very small Facebook ID. Yeah, in 00:28:36.890 --> 00:28:41.790 this case, we are, we are redirected and check the response and find a valid 00:28:41.790 --> 00:28:50.070 account page. And does anyone know which account this is? Mark Zuckerberg? Yeah, 00:28:50.070 --> 00:28:55.360 that's correct. This is this is a public account for Mark Zuckerberg. 
Number four, 00:28:55.360 --> 00:29:01.679 as we see, as we already saw, the other IDs are really high. But he got the number 00:29:01.679 --> 00:29:10.690 four. Second example was facebook.com/3. In this case, we are not forwarded. And 00:29:10.690 --> 00:29:17.760 this means that it is an invalid account. And that was really easy to confirm with a 00:29:17.760 --> 00:29:23.740 quick Google search. And it was a test account from the beginning of Facebook. So 00:29:23.740 --> 00:29:31.059 we did not get redirected. And it's just the login page from Facebook. And with 00:29:31.059 --> 00:29:38.500 these examples, we did, we did a lot of, a lot more experiments. And at the end, we 00:29:38.500 --> 00:29:46.970 were able to build this tree. And, yeah, this tree represents the high level 00:29:46.970 --> 00:29:53.059 approach from our scraper. So in the, What's that? 00:29:53.059 --> 00:29:56.340 Svea: Okay. Sleeping. Laughing 00:29:56.340 --> 00:30:07.090 Dennis: Yeah. We have still time. Right. So what? Okay, so everyone is waking up 00:30:07.090 --> 00:30:16.680 again. Oh, yeah. In the first step we call the URL www.facebook.com/FID. If we 00:30:16.680 --> 00:30:24.650 get redirected in this case, then we check if the page is an account page. If 00:30:24.650 --> 00:30:31.270 it's an account page, then it's a public account like the example 4 and we were 00:30:31.270 --> 00:30:39.890 able to save the raw data, the raw HTTP source. If it's not an account page, 00:30:39.890 --> 00:30:45.070 then it's not a public account and we are not able 00:30:45.070 --> 00:30:52.580 to save any data. And if we do not get redirected in the 00:30:52.580 --> 00:31:01.630 first step, then we call the second URL, facebook.com/profile.php?id=FID 00:31:01.630 --> 00:31:09.289 with the mobile user agent. 
And if we get redirected then, then again, it is a 00:31:09.289 --> 00:31:14.990 nonpublic profile and we cannot save anything. But if we do not get 00:31:14.990 --> 00:31:22.710 redirected, it is an invalid profile and it is most often a deleted account. Yeah. 00:31:22.710 --> 00:31:29.390 And yeah, that's the high level overview of our scraper. And Philip will now give 00:31:29.390 --> 00:31:32.340 some more information on interesting results. 00:31:32.340 --> 00:31:38.820 Philip: So the most interesting result of this scraping of the sample of Facebook 00:31:38.820 --> 00:31:47.070 IDs was that one in four Facebook IDs corresponds to a valid account. And you 00:31:47.070 --> 00:31:53.559 can do the math. There are 40 billion Facebook IDs, so there must be 10 billion 00:31:53.559 --> 00:32:00.170 registered users on Facebook. And this means that there are more registered users 00:32:00.170 --> 00:32:08.140 on Facebook than there are humans on Earth. And also, it means that it's even 00:32:08.140 --> 00:32:12.460 worse than that because not everybody on Earth can have a Facebook account because 00:32:12.460 --> 00:32:17.370 not everybody, you need a smartphone for that. And many people don't have those. So 00:32:17.370 --> 00:32:22.270 this is actually a pretty high number and it's very unexpected. So in July 2019, 00:32:22.270 --> 00:32:29.059 there were more than ten billion Facebook accounts. Also, we did another research on 00:32:29.059 --> 00:32:36.429 the timeframe between October 2018 and today, or this month. And we found that in 00:32:36.429 --> 00:32:43.140 this timeframe there were 2 billion new registered Facebook accounts. So this is 00:32:43.140 --> 00:32:48.679 like the timeframe of one year, more or less. And in a similar timeframe, the 00:32:48.679 --> 00:32:58.899 monthly active user base rose by only 187 million. Facebook deleted 150 million 00:32:58.899 --> 00:33:05.419 older accounts between October 2018 and July 2019. 
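Going back to the scraper Dennis described: its decision tree can be written down as a small pure function. This is a sketch of the logic as told in the talk, not the team's code. The names `ProfileStatus` and `classify_profile` and the three boolean inputs are assumptions made for the example; in a real scraper they would come from the two HTTP requests (the plain ID URL, then the profile.php URL with a mobile user agent).

```python
from enum import Enum


class ProfileStatus(Enum):
    PUBLIC = "public account, page source can be saved"
    NOT_PUBLIC = "account exists but is not public"
    INVALID = "invalid ID, most often a deleted account"


def classify_profile(
    first_redirected: bool,   # did the plain ID URL redirect?
    is_account_page: bool,    # only meaningful if first_redirected is True
    second_redirected: bool,  # did the profile.php URL redirect? (only asked if the first did not)
) -> ProfileStatus:
    """Follow the two-step decision tree from the talk."""
    if first_redirected:
        return ProfileStatus.PUBLIC if is_account_page else ProfileStatus.NOT_PUBLIC
    return ProfileStatus.NOT_PUBLIC if second_redirected else ProfileStatus.INVALID


# The ID 4 example: redirected to a valid account page.
print(classify_profile(True, True, False))
# The ID 3 example, assuming the second request did not redirect either.
print(classify_profile(False, False, False))
```

Keeping the classification separate from the network code makes it easy to check every branch of the tree without sending a single request.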
And we know that because we 00:33:05.419 --> 00:33:11.460 pulled the same sample over a longer period of time. And then we watched for 00:33:11.460 --> 00:33:16.230 accounts that got deleted in the sample. And that enables us to estimate this 00:33:16.230 --> 00:33:23.400 number of 150 million accounts that got deleted that are basically older than our 00:33:23.400 --> 00:33:31.890 sample. So I made some nice graphs for your viewing pleasure. So, again, the 00:33:31.890 --> 00:33:40.919 older accounts were, just 150 million were deleted since October 2018. These are 00:33:40.919 --> 00:33:46.350 accounts that are older than last year. And Facebook claims that since then, about 00:33:46.350 --> 00:33:52.789 7 billion accounts got deleted from their platform, which is vastly more than these 00:33:52.789 --> 00:33:58.370 older accounts. And that that's why we think that Facebook mostly deleted these 00:33:58.370 --> 00:34:06.770 newer accounts. And if an account is older than a certain age, then it is very 00:34:06.770 --> 00:34:13.069 unlikely that it gets deleted. And also, I think you can see the scales here. So, of 00:34:13.069 --> 00:34:17.960 course, the registered users are not the same thing as active users, but you can 00:34:17.960 --> 00:34:23.290 still see that there are much more registrations of, of new users than there 00:34:23.290 --> 00:34:30.139 are active users. And there are new active users during the last year. So what does 00:34:30.139 --> 00:34:37.909 this all mean? Does it mean that Facebook gets flooded by fake accounts? We don't 00:34:37.909 --> 00:34:42.980 really know. We only know these numbers. What Facebook is telling us is that they 00:34:42.980 --> 00:34:50.409 only count and publish active users, as I already said, that there is a disconnect 00:34:50.409 --> 00:34:56.759 between this record, registered users and active users and Facebook only reports on 00:34:56.759 --> 00:35:04.289 the active users. 
Also, they say that users register accounts, but they don't 00:35:04.289 --> 00:35:10.519 verify them or they don't use them, and that's how this number gets so high. But I 00:35:10.519 --> 00:35:19.319 think that that's not really explaining these high numbers and because that's just 00:35:19.319 --> 00:35:26.469 by orders of magnitude larger than anything that this could cause. Also, they 00:35:26.469 --> 00:35:31.819 say that they regularly delete fake accounts. But we have seen that these are 00:35:31.819 --> 00:35:37.519 mostly accounts that get deleted directly after their creation. And if they survive 00:35:37.519 --> 00:35:46.170 long enough, then they are getting through. So what does this all mean? 00:35:46.170 --> 00:35:55.390 Svea: Okay, so you got the full load, which I had like over two or three months. 00:35:55.390 --> 00:36:02.869 And what for me was, was a one very big conclusion was that we have some kind of 00:36:02.869 --> 00:36:08.530 broken metric here, that all the likes and all the hearts on Instagram and the 00:36:08.530 --> 00:36:13.650 followers that they can so easily be manipulated. And then it's it's so hard to 00:36:13.650 --> 00:36:19.029 tell in some cases, it's so hard to tell if they are real or not real. And this 00:36:19.029 --> 00:36:26.160 opens the gate for manipulation and yes, untrueness. And for economic losses, if 00:36:26.160 --> 00:36:33.109 you think as somebody who is investing money and or as an advertiser, for 00:36:33.109 --> 00:36:40.170 example. And in the very end, it is a case of eroding trust, which means that we 00:36:40.170 --> 00:36:45.739 cannot trust these numbers anymore. These numbers are, you know, they are so easily 00:36:45.739 --> 00:36:53.799 manipulated. And why should we trust this? And this has a severe consequence for all 00:36:53.799 --> 00:36:59.420 the social networks. If you are still in them. So what can be a solution? And 00:36:59.420 --> 00:37:05.150 Philip, you thought about that. 
Phillip: So basically we have two 00:37:05.150 --> 00:37:11.410 problems. One is click workers and one is fakes. Click workers are basically just 00:37:11.410 --> 00:37:18.420 hyperactive users and they are selling their hyperactivity. And so what social 00:37:18.420 --> 00:37:23.660 networks could do is just make interactions scarce, so just lower the 00:37:23.660 --> 00:37:29.180 value of more interactions. If you are a hyperactive user, then your interaction 00:37:29.180 --> 00:37:34.240 should count less than the interactions of a less active user. 00:37:34.240 --> 00:37:39.229 Mumbling That's kind of solvable, I think. The real 00:37:39.229 --> 00:37:46.890 problem is the authenticity. So if you get stopped from posting or liking 00:37:46.890 --> 00:37:52.640 hundreds of pages a day, then maybe you just create multiple accounts and operate 00:37:52.640 --> 00:37:58.599 them simultaneously. And this can only be solved by authenticity. So this can only 00:37:58.599 --> 00:38:04.990 be solved if you know that the person who is operating the account is just one 00:38:04.990 --> 00:38:10.569 person operating one account. And this is really hard to do, because Facebook 00:38:10.569 --> 00:38:14.940 doesn't know who is clicking. Is it a bot? Is it a clickworker, or is it one 00:38:14.940 --> 00:38:20.410 clickworker for ten accounts? How does this work? And so this is really hard for 00:38:20.410 --> 00:38:27.609 the social media companies to do. And you could say, OK, let's send in the 00:38:27.609 --> 00:38:32.359 passport or something like that to prove authenticity. But that's actually not a 00:38:32.359 --> 00:38:37.109 good idea because nobody wants to send their passport to Facebook. And so this is 00:38:37.109 --> 00:38:42.359 really a hard problem that has to be solved if we want to use social 00:38:42.359 --> 00:38:49.750 media in a meaningful way. And so this is what companies could do. And now...
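Note: the idea of making interactions scarce, so that a hyperactive user's like counts for less than a like from a less active user, could be sketched like this. The talk does not propose a concrete formula. The weight of 1/sqrt(n) per like, where n is the number of likes a user gave in the time window, is an arbitrary assumption chosen for illustration, as are the function and variable names.

```python
import math
from collections import Counter, defaultdict


def weighted_like_scores(likes):
    """Compute per-page scores from (user_id, page_id) likes in one time window.

    Each like is weighted by 1 / sqrt(number of likes the same user gave in
    the window), so very active users contribute less per like.
    """
    likes = list(likes)
    likes_per_user = Counter(user for user, _ in likes)
    scores = defaultdict(float)
    for user, page in likes:
        scores[page] += 1.0 / math.sqrt(likes_per_user[user])
    return dict(scores)


if __name__ == "__main__":
    # Hypothetical data: one casual user and one click worker liking four pages.
    sample = [("casual", "page_a")]
    sample += [("worker", page) for page in ("page_a", "page_b", "page_c", "page_d")]
    for page, score in sorted(weighted_like_scores(sample).items()):
        print(page, round(score, 2))
```

As the talk points out, a scheme like this does nothing against one person operating many accounts, because each account would look like a less active user.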
00:38:49.750 --> 00:38:53.200 Svea: But what could you do? Okay. Of course, you can delete 00:38:53.200 --> 00:38:56.469 your Facebook account or your Instagram account and stop. 00:38:56.469 --> 00:39:01.299 Slight Applause, Laughing Svea: Yeah! Stay away from social media. 00:39:01.299 --> 00:39:08.959 But this is maybe not a solution for all of us. So I think be aware, of course. 00:39:08.959 --> 00:39:17.499 Spread the word, tell others. And if you like, and you get more 00:39:17.499 --> 00:39:24.019 intelligence about that, we are really happy to dig deeper in these networks. And 00:39:24.019 --> 00:39:30.180 we will go on investigating. And so, last but not least, it's time to say thank you 00:39:30.180 --> 00:39:33.349 to you guys. Thank you very much for listening. 00:39:33.349 --> 00:39:40.089 Applause Svea: And we did not do this alone. We are 00:39:40.089 --> 00:39:44.849 not three people. There are many more standing behind and doing this 00:39:44.849 --> 00:39:50.709 beautiful research. And we are opening now for questions, please. 00:39:50.719 --> 00:39:55.429 Herald: Yes. Please, thank Svea, Phil and Dennis again. 00:39:55.429 --> 00:40:05.519 Applause And we have microphones out 00:40:05.519 --> 00:40:09.680 here in the room, about nine of them, actually. If you line up behind them to 00:40:09.680 --> 00:40:15.780 ask a question, remember that a question is a sentence with a question mark behind 00:40:15.780 --> 00:40:20.500 it. And I think I see somebody at number three. So let's start with that. 00:40:20.500 --> 00:40:25.979 Question: Hi. I just have a little question. Wouldn't a dislike button, the 00:40:25.979 --> 00:40:30.749 concept of a dislike button, wouldn't that be a solution to all the problems? 00:40:30.749 --> 00:40:38.039 Phillip: So we thought about recommending that Facebook ditches the like button 00:40:38.039 --> 00:40:42.299 altogether.
I think that would be a better solution than a dislike button, because a 00:40:42.299 --> 00:40:47.079 dislike button could also be manipulated and it would be even worse because you 00:40:47.079 --> 00:40:54.119 could actually manipulate the network into down ranking posts or kind of not showing 00:40:54.119 --> 00:41:00.670 posts to somebody. And that, I think, would be even worse. Imagine what dictators 00:41:00.670 --> 00:41:08.209 would do with that. And so I think the best option would be to actually not show 00:41:08.209 --> 00:41:18.029 like counts anymore, and with this to actually make people not invest into 00:41:18.029 --> 00:41:25.199 these counts once they become meaningless. Herald: I think I see microphone 7, up 00:41:25.199 --> 00:41:28.109 there. Question: Hello. So one question I had is 00:41:28.109 --> 00:41:37.210 you assigned creation dates to IDs. How did you do this? 00:41:37.210 --> 00:41:52.489 Phillip: So, we actually knew the creation date of some accounts. And then we kind of 00:41:52.489 --> 00:41:58.210 interpolated between the creation dates and the IDs. So you see this black line 00:41:58.210 --> 00:42:04.109 there. That's actually our interpolation. And with this black line, 00:42:04.109 --> 00:42:10.910 we can then estimate the creation dates for IDs that we do not yet know because 00:42:10.910 --> 00:42:17.430 it kind of fills in the gaps. Q: Follow up question, do you know why 00:42:17.430 --> 00:42:20.310 there are some points outside of this graph? 00:42:20.310 --> 00:42:23.999 Phillip: No. Q: No? Thank you. 00:42:23.999 --> 00:42:26.400 Herald: So there was a question from the Internet. 00:42:26.400 --> 00:42:33.723 Question: Did you report your findings to Facebook? And did they do anything? 00:42:33.723 --> 00:42:41.509 Svea: Because this research is very new, we just recently approached them and 00:42:41.509 --> 00:42:47.190 showed them the research and we got an answer.
But I think we also already showed 00:42:47.190 --> 00:42:54.480 the answer. It was, I think, that they only count and publish active users. 00:42:54.480 --> 00:42:59.680 They did not want to tell us how many registered users they have, and 00:42:59.680 --> 00:43:03.859 they say, oh, sometimes users register accounts, but don't use them or verify 00:43:03.859 --> 00:43:08.930 them. And that they regularly delete fake accounts. But we hope that we get into a 00:43:08.930 --> 00:43:12.469 closer discussion with them soon about this. 00:43:12.469 --> 00:43:19.469 Herald: Microphone two. Question: When hunting down the buyers of 00:43:19.469 --> 00:43:26.740 the campaigns, did you dig out your own campaign, Line below the line? Svea: No, 00:43:26.740 --> 00:43:34.039 because they stopped scraping in August. You stopped scraping in August. And 00:43:34.039 --> 00:43:39.449 then, you know, the whole project started with them coming to us 00:43:39.449 --> 00:43:44.599 with the list. And then we thought, oh, this is very interesting. And then the 00:43:44.599 --> 00:43:50.729 whole journalistic research started. But I think if we would do 00:43:50.729 --> 00:43:56.200 it again, of course, I think we would find ourselves. We also found there was another 00:43:56.200 --> 00:44:01.650 magazine, and they also did a paid test a couple of years ago. And we found 00:44:01.650 --> 00:44:04.920 their campaign. Phillip: So we actually did another 00:44:04.920 --> 00:44:11.480 test. And for the other test, I noted we also got like this ID, I think. And it 00:44:11.480 --> 00:44:20.329 worked to plug it into the URL and then we also got redirected to our own page. So 00:44:20.329 --> 00:44:22.569 that worked. Q: Thank you. 00:44:22.569 --> 00:44:26.379 Herald: Microphone three. Question: Hi. I'm Farhan, I'm a Pakistani 00:44:26.379 --> 00:44:30.759 journalist.
And first of all, I would like to say that you were right when you said 00:44:30.759 --> 00:44:34.910 that there might be people sitting in Pakistan clicking on the likes. That does 00:44:34.910 --> 00:44:41.329 happen. But my question would be that Facebook does have its own ad program that 00:44:41.329 --> 00:44:47.470 it aggressively pushes. And in that ad program, there is also options whereby 00:44:47.470 --> 00:44:53.701 people can buy likes and comments and impressions and reactions. Did you, would 00:44:53.701 --> 00:44:59.670 you also consider those as a fake? I mean, that they're not fake, per se, but they're 00:44:59.670 --> 00:45:05.799 still bought likes. So what's your view on those? Thank you. 00:45:05.799 --> 00:45:14.349 Phillip: So, when you buy ads on Facebook, then, so, what you what you actually want 00:45:14.349 --> 00:45:19.489 to have is fans for your page that are actually interested in your page. So 00:45:19.489 --> 00:45:25.460 that's kind of the difference, I think to the, to the paid likes system where the 00:45:25.460 --> 00:45:30.119 people themselves, they get paid for liking stuff that they wouldn't normally 00:45:30.119 --> 00:45:35.599 like. So I think that's the fundamental difference between the two programs. And 00:45:35.599 --> 00:45:40.529 that's why I think that one is unethical. And one is not really that unethical. 00:45:40.529 --> 00:45:47.749 Svea: The very problem is if you, if you buy these click workers, then you have 00:45:47.749 --> 00:45:52.789 many people in your fan page. They are not interested in you. They don't care about 00:45:52.789 --> 00:45:57.410 you. They don't look at your products. They don't look at your political party. 00:45:57.410 --> 00:46:03.539 And then often the people, they additionally, they make Facebook ads, and 00:46:03.539 --> 00:46:08.229 these ads, they are shown, again, the click workers and they don't look at them. 
00:46:08.229 --> 00:46:13.410 So, you know, people, they are burning money and money and money with this whole 00:46:13.410 --> 00:46:18.069 corrupt system. Herald: So, microphone two. 00:46:18.069 --> 00:46:22.039 Question: Hi. Thanks. Thanks for the talk and thanks for the effort of going through 00:46:22.039 --> 00:46:27.709 all of this project. From my understanding, this whole finding 00:46:27.709 --> 00:46:35.209 basically undermines the trust in Facebook's likes in general, per se. So I 00:46:35.209 --> 00:46:42.369 would expect now the price of likes to drop and the pay for click workers to drop 00:46:42.369 --> 00:46:49.250 as well. Do you have any metrics on that? Svea: The research just went public. I 00:46:49.250 --> 00:46:56.180 think one week ago. So, so what we have seen as an effect is that Facebook, they 00:46:56.180 --> 00:47:02.940 excluded paid likes for, for a moment. So, yes, of course, one platform is down. But 00:47:02.940 --> 00:47:08.010 I think there are so many outside. There are so many. So I think... 00:47:08.010 --> 00:47:14.229 Q: I meant the phenomenon of paid likes, not the company itself. Like the value of 00:47:14.229 --> 00:47:19.319 a like as a measure of credibility... Phillip: We didn't... 00:47:19.319 --> 00:47:22.829 Q: ...is declining now. That's my, that's my... 00:47:22.829 --> 00:47:27.869 Svea: Yes. That's why many people are buying Instagram hearts now. So, so, yes, 00:47:27.869 --> 00:47:32.900 that's true. The like is not the fancy hot shit anymore. Yes. And we also saw in the 00:47:32.900 --> 00:47:40.670 data that the likes for the fan pages, they rapidly went down and the likes for 00:47:40.670 --> 00:47:45.229 the posts and the comments, they went up. So I think, yes, there is a shift. And 00:47:45.229 --> 00:47:51.809 what we also saw in that data was that the Facebook likes, they, they went down from 00:47:51.809 --> 00:47:57.839 2016. They are rapidly down. 
And what is growing and rising is YouTube and 00:47:57.839 --> 00:48:01.609 Instagram. Now, everything is about, today, everything is about Instagram. 00:48:01.609 --> 00:48:05.270 Q: Thanks. Herald: So let's go to number one. 00:48:05.270 --> 00:48:09.630 Question: Hello and thank you very much for this fascinating talk, because I've 00:48:09.630 --> 00:48:15.400 been following this whole topic for a while. And I was wondering if you were 00:48:15.400 --> 00:48:20.849 looking also into the demographics, in terms of age groups and social class, not 00:48:20.849 --> 00:48:25.619 of the people who were doing the actual liking, but actually, you know, buying 00:48:25.619 --> 00:48:31.249 these likes. Because I think that what is changing is an entire social discourse on 00:48:31.249 --> 00:48:36.709 social capital and, the bold U.S. kind of term, because it can now be quantified. As 00:48:36.709 --> 00:48:43.650 a teacher, I hear of kids who buy likes to be more popular than their other 00:48:43.650 --> 00:48:47.880 schoolmates. So I'm wondering if you're looking into that, because I think that's 00:48:47.880 --> 00:48:52.559 fascinating, fascinating area to actually come up with numbers about it. 00:48:52.559 --> 00:48:59.229 Svea: It definitely is. And we were all so fascinated by this data set of 90,000 data 00:48:59.229 --> 00:49:05.479 points. And what we did was, and this was very hard, and was that we tried it, first 00:49:05.479 --> 00:49:11.869 of all, to look who is buying likes, like automotives, you know, to to, this some, 00:49:11.869 --> 00:49:18.910 you know, what, what kind of branches? Who is in that? And so this was this was 00:49:18.910 --> 00:49:24.769 doable. But to get more into demographics, you would have liked to, to crawl, to 00:49:24.769 --> 00:49:33.699 click every page. And so we we did not do this. What we did was, of course, that we 00:49:33.699 --> 00:49:38.489 that we were a team of three to ten people and manually looking into it. 
And what we, 00:49:38.489 --> 00:49:43.739 of course, saw that on Instagram and on YouTube, you have many of these very young 00:49:43.739 --> 00:49:47.219 people. Some of them, I actually called them and they were like, Yes, I bought 00:49:47.219 --> 00:49:54.089 likes. Very bad idea. So I think yes, I think there is a demographic shift away 00:49:54.089 --> 00:49:59.890 from the companies and the automotive and industries buying Facebook fan page likes 00:49:59.890 --> 00:50:04.390 to Instagram and YouTube wannabe- influencers. 00:50:04.390 --> 00:50:06.430 Q: Influencers, influencer culture is obviously... 00:50:06.430 --> 00:50:12.670 Svea: Yes. And I have to admit here we, we showed you the political side, but we have 00:50:12.670 --> 00:50:19.849 to admit that the political likes, they were like this small in the numbers. And 00:50:19.849 --> 00:50:25.640 the very, very vast majority of this data set, it's about wedding planners, 00:50:25.640 --> 00:50:31.440 photography, tattoo studios and influencers, influencers, influencers and 00:50:31.440 --> 00:50:34.479 YouTubers, of course. Q: Yes. Thank you so much. 00:50:34.479 --> 00:50:37.439 Herald: So we have a lot of questions in the room. I'm going to get to you as soon 00:50:37.439 --> 00:50:40.009 as we can. I'd like to go to the Internet first. 00:50:40.009 --> 00:50:44.680 Signal Angel: Do you think this will get bit better or worse if people move to more 00:50:44.680 --> 00:50:48.319 decentralized platforms? Phillip: To more what? 00:50:48.319 --> 00:50:54.910 Svea: If it get better or worse. Dennis: Can you repeat that, please? 00:50:54.910 --> 00:50:58.880 Herald: Would this issue get better or worse if people move to a more 00:50:58.880 --> 00:51:01.239 decentralized platform? Phillip: Decentralized. decentralized, 00:51:01.239 --> 00:51:12.160 okay. 
So, I mean, we can look at, at the, this slide, I think, and think about 00:51:12.160 --> 00:51:18.249 whether decentralized platforms would change any of these, any of these two 00:51:18.249 --> 00:51:25.999 points here. And I fear, I don't think so, because they cannot solve the interactions 00:51:25.999 --> 00:51:30.210 problem that people can be hyperactive. Actually, that's kind of a normal thing 00:51:30.210 --> 00:51:34.299 with social media. A small portion of social media users is much more active 00:51:34.299 --> 00:51:39.880 than everybody else. That's kind of. You have that without paying for it. So 00:51:39.880 --> 00:51:44.720 without even having paid likes, you will have to consider if social media is really 00:51:44.720 --> 00:51:51.189 kind of representative of the society. But, and the other thing is authenticity. 00:51:51.189 --> 00:51:57.170 And also in a decentralized platform, you could have multiple accounts run by the 00:51:57.170 --> 00:52:01.199 same person. Herald: So, microphone seven, all the way 00:52:01.199 --> 00:52:06.779 back there. Question: Hi. Do you know if Facebook even 00:52:06.779 --> 00:52:10.220 removes the likes when they delete fake accounts? 00:52:10.220 --> 00:52:17.319 Svea: Do you know that? Phillip: No, we don't know that. No, we 00:52:17.319 --> 00:52:21.259 don't. We don't know. We know they delete fake accounts, but we don't know if they 00:52:21.259 --> 00:52:27.619 also delete the likes. I know from our research that the people we approached, 00:52:27.619 --> 00:52:31.329 they did not delete the click workers. They get... 00:52:31.329 --> 00:52:35.839 Herald: Microphone two. Question: Yeah. Hi. So I have a question 00:52:35.839 --> 00:52:41.359 with respect to this, one out of four Facebook accounts are active in your, in 00:52:41.359 --> 00:52:46.949 your test. Did you see any difference with respect to age of the accounts? So is it 00:52:46.949 --> 00:52:52.489 always one out the four to the entire sample? 
Or does it maybe change over the, 00:52:52.489 --> 00:52:57.730 over the like going from a zero ID to, well, 10 billion or 40 billion? 00:52:57.730 --> 00:53:02.189 Phillip: So you're talking about the density of accounts in our ID? 00:53:02.189 --> 00:53:05.989 Q: Kind of. Phillip: So, so there are changes over 00:53:05.989 --> 00:53:12.150 time. Yeah. So I guess I think now it's less than it was before. So now they are 00:53:12.150 --> 00:53:19.089 less than for then, and before it was more and so I think it was. Yeah. I don't know. 00:53:19.089 --> 00:53:23.660 Q: But you don't see anything specific that now, only in the new accounts, only 00:53:23.660 --> 00:53:28.229 one out of 10 is active or valid and before it was one out of two or something 00:53:28.229 --> 00:53:31.259 like that. Phillip: It's not that extreme. So it's 00:53:31.259 --> 00:53:34.859 less than that. It's kind of... Dennis: We have to say we did not check 00:53:34.859 --> 00:53:41.239 this, but there were no special cases. Phillip: But it changed over time? So 00:53:41.239 --> 00:53:47.200 before it was less and, before it was more and now it is less. And so what we checked 00:53:47.200 --> 00:53:54.710 was whether an ID actually corresponds to an account. And so this metric, yeah. And 00:53:54.710 --> 00:53:57.299 it changed a little bit over time, but not much. 00:53:57.299 --> 00:54:02.239 Herald: So, so number three, please. Question: Yeah. Thank you for a very 00:54:02.239 --> 00:54:06.989 interesting talk. At the end, you gave some recommendations, how to fix the 00:54:06.989 --> 00:54:11.769 metrics, right? And it's always nice to have some metrics because then, well, we 00:54:11.769 --> 00:54:15.220 are the people who deal with the numbers. So we want the metrics. But I want to 00:54:15.220 --> 00:54:20.309 raise the issue whether quantitative measure is actually the right thing to do. 
00:54:20.309 --> 00:54:26.449 So would you buy your furniture from store A with 300 likes against store B with 200 00:54:26.449 --> 00:54:32.049 likes? Or would it not be better to have a more qualitative thing? And to what extent 00:54:32.049 --> 00:54:38.259 is a quantitative measure maybe also the source of a lot of bad developments we see 00:54:38.259 --> 00:54:43.390 in social media to begin with, even not with bot firms and anything, but just 00:54:43.390 --> 00:54:48.339 people who go for the quick like and say Hooray for Trump and then get, whatever, 00:54:48.339 --> 00:54:52.479 all the Trumpists is liking that and the others say Fuck Trump and you get all the 00:54:52.479 --> 00:54:57.229 non Trumpists like that and you get all the polarization, right? So, Instagram, I 00:54:57.229 --> 00:55:02.650 think they just don't just display their like equivalent anymore in order to 00:55:02.650 --> 00:55:04.929 prevent that, so could you maybe comment on that? 00:55:04.929 --> 00:55:12.299 Svea: I think this is a good idea, to, to hide the likes. Yes. But I you know, we 00:55:12.299 --> 00:55:17.799 talked to many clickworkers and they do a lot of stuff. And what they also do is 00:55:17.799 --> 00:55:23.309 taking comments and doing copy paste for comments section or for Amazon reviews. 00:55:23.309 --> 00:55:29.789 So, you know, I think it's really hard to get them out of the system because maybe 00:55:29.789 --> 00:55:34.390 if the likes are not shown and if and when the comments are counting, then you will 00:55:34.390 --> 00:55:41.069 have people who are copy pasting comments in the comments section. So I really think 00:55:41.069 --> 00:55:44.519 that the networks, that they really have an issue here. 00:55:44.519 --> 00:55:49.829 Herald: So let's try to squeeze the last three questions now. First, number seven, 00:55:49.829 --> 00:55:52.950 really quick. Question: Very quick. Thank you for the 00:55:52.950 --> 00:55:58.799 nice insights. 
And I have a question about the location of the users. So you made 00:55:58.799 --> 00:56:03.289 your point that you can analyze by the metadata when the account was 00:56:03.289 --> 00:56:08.650 made. But how about the location of the followers? Is there any way to analyze 00:56:08.650 --> 00:56:12.339 that as well? Phillip: So we can only analyze that if 00:56:12.339 --> 00:56:21.049 the users agreed to share it publicly and not all of them do that. I think a 00:56:21.049 --> 00:56:26.460 name check is often a very good way to check where somebody is from. For these 00:56:26.460 --> 00:56:32.190 fake likes, for example. But as I said, it always depends on what the user himself is 00:56:32.190 --> 00:56:36.130 willing to share. Herald: Internet? 00:56:36.130 --> 00:56:41.039 Signal Angel: Isn't this just the western version of the Chinese social credit 00:56:41.039 --> 00:56:43.999 system? Where do we go from here? What is the future of all this? 00:56:43.999 --> 00:56:54.089 Svea: Yeah, it's dystopian, right? Oh, yeah, after this research, you 00:56:54.089 --> 00:57:01.109 know, for me, I deleted my Facebook account like one or two years ago. So, 00:57:01.109 --> 00:57:07.279 you know, this did not matter to me so much. But I stayed on Instagram and 00:57:07.279 --> 00:57:13.359 when I saw all these bought likes and subscribers and followers and also YouTube, 00:57:13.359 --> 00:57:16.999 all these views, because the click workers, they also watch YouTube videos. 00:57:16.999 --> 00:57:20.859 They have to stay on them like 40 seconds, it's really funny because they hate 00:57:20.859 --> 00:57:27.239 hearing like techno music, rap music, all 40 seconds and then they go on. But when I 00:57:27.239 --> 00:57:34.589 sat next to Herald for two hours, three hours, I was so disillusioned about all 00:57:34.589 --> 00:57:40.960 the social network things. And I thought, OK, don't count on anything.
Just 00:57:40.960 --> 00:57:46.119 if you like the content, follow them and look at them. But don't believe anything. 00:57:46.119 --> 00:57:50.479 That was my personal take away from this research. 00:57:50.479 --> 00:57:53.970 Herald: So very last question, microphone two. 00:57:53.970 --> 00:57:59.150 Question: A couple of days ago, The Independent reported that Facebook, the 00:57:59.150 --> 00:58:06.839 Facebook App was activating the camera when reading a news feed. Could this be in 00:58:06.839 --> 00:58:10.779 use in the context of detecting fake accounts? 00:58:10.779 --> 00:58:18.400 Svea: I don't know. Phillip: So, I think that that in this 00:58:18.400 --> 00:58:26.799 particular instance that it was probably a bug. So, I don't know, but I mean that the 00:58:26.799 --> 00:58:30.679 people who work at Facebook are, not all of them are like crooks or anything that 00:58:30.679 --> 00:58:35.130 they will deliberately program this kind of stuff. So they said that it was kind of 00:58:35.130 --> 00:58:41.189 a bug from from an update that they did. And the question is whether we can 00:58:41.189 --> 00:58:49.430 actually detect fake accounts with the camera. And the problem is that current, I 00:58:49.430 --> 00:58:57.469 don't think that current face recognition technology is enough to detect that you 00:58:57.469 --> 00:59:02.940 are a unique person. So there are so many people on the planet that probably another 00:59:02.940 --> 00:59:08.959 person who has the same face. And I think the new iPhone, they also have this much 00:59:08.959 --> 00:59:14.579 more sophisticated version of this technology. And even they say, OK, there's 00:59:14.579 --> 00:59:19.079 a chance of one in, I don't know, that there is somebody who can unlock your 00:59:19.079 --> 00:59:23.829 phone. So I think it's really hard to do that with, do that with recording 00:59:23.829 --> 00:59:29.299 technology, to actually prove that somebody is just one person. 
00:59:29.299 --> 00:59:38.059 Herald: So with that, would you please help me thank Svea, Dennis and Philip 00:59:38.059 --> 00:59:41.160 one more time for this fantastic presentation! Very interesting and very, 00:59:41.160 --> 00:59:48.099 very disturbing. Thank you very much. Applause 00:59:48.099 --> 00:59:52.099 postroll music 00:59:52.099 --> 01:00:16.000 Subtitles created by c3subtitles.de in the year 2020. Join, and help us!