So, on April 23 of 2013, the Associated Press put out the following tweet on Twitter. It said, "Breaking news: Two explosions at the White House and Barack Obama has been injured." This tweet was retweeted 4,000 times in less than five minutes, and it went viral thereafter.

Now, this tweet wasn't real news put out by the Associated Press. In fact, it was false news, or fake news, that was propagated by Syrian hackers who had infiltrated the Associated Press Twitter handle. Their purpose was to disrupt society, but they disrupted much more, because automated trading algorithms immediately seized on the sentiment in this tweet and began trading based on the potential that the president of the United States had been injured or killed in this explosion. And as they started trading, they immediately sent the stock market crashing, wiping out 140 billion dollars in equity value in a single day.

Robert Mueller, special counsel prosecutor in the United States, issued indictments against three Russian companies and 13 Russian individuals for conspiracy to defraud the United States by meddling in the 2016 presidential election. And what this indictment tells us, as a story, is the story of the Internet Research Agency, the shadowy arm of the Kremlin on social media. During the presidential election alone, the Internet Research Agency's efforts reached 126 million people on Facebook in the United States, issued three million individual tweets and produced 43 hours' worth of YouTube content. All of it was fake: misinformation designed to sow discord in the US presidential election.

A recent study by Oxford University showed that in the recent Swedish elections, one third of all of the information spreading on social media about the election was fake or misinformation.

In addition, these types of social-media misinformation campaigns can spread what has been called "genocidal propaganda," for instance against the Rohingya in Burma, and have triggered mob killings in India.

We studied fake news and began studying it before it was a popular term.
And we recently published the largest-ever longitudinal study of the spread of fake news online on the cover of "Science" in March of this year. We studied all of the verified true and false news stories that ever spread on Twitter, from its inception in 2006 to 2017. And when we studied this information, we studied news stories that had been verified by six independent fact-checking organizations. So we knew which stories were true and which stories were false. We could measure their diffusion, the speed of their diffusion, the depth and breadth of their diffusion, how many people became entangled in this information cascade and so on. And what we did in this paper was compare the spread of true news to the spread of false news. And here's what we found.

We found that false news diffused further, faster, deeper and more broadly than the truth in every category of information that we studied, sometimes by an order of magnitude. And in fact, false political news was the most viral. It diffused further, faster, deeper and more broadly than any other type of false news. When we saw this, we were at once worried but also curious. Why? Why does false news travel so much further, faster, deeper and more broadly than the truth?

The first hypothesis that we came up with was, "Well, maybe people who spread false news have more followers or follow more people, or tweet more often, or maybe they're more often 'verified' users of Twitter, with more credibility, or maybe they've been on Twitter longer." So we checked each one of these in turn. And what we found was exactly the opposite. False-news spreaders had fewer followers, followed fewer people, were less active, were less often "verified" and had been on Twitter for a shorter period of time. And yet, false news was 70 percent more likely to be retweeted than the truth, controlling for all of these and many other factors.

So we had to come up with other explanations, and we devised what we called a "novelty hypothesis."
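
Before turning to that hypothesis, here is a minimal sketch of how the size, depth and breadth of a single retweet cascade, as described above, might be computed from parent-child retweet pairs; the data layout and function names are illustrative assumptions, not the paper's actual pipeline.

    from collections import defaultdict, deque

    def cascade_metrics(edges):
        """Compute size, depth and max breadth of one retweet cascade.

        `edges` is a list of (parent_user, child_user) pairs, where each pair
        means `child_user` retweeted the story from `parent_user`. The root
        (original poster) is the only parent that never appears as a child.
        This layout is an illustrative assumption, not the study's data model.
        """
        children = defaultdict(list)
        parents, kids = set(), set()
        for parent, child in edges:
            children[parent].append(child)
            parents.add(parent)
            kids.add(child)
        root = (parents - kids).pop()  # the original tweeter

        # Breadth-first traversal: record how deep each user sits in the cascade.
        depth_of = {root: 0}
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for child in children[node]:
                if child not in depth_of:
                    depth_of[child] = depth_of[node] + 1
                    queue.append(child)

        size = len(depth_of)            # how many people became entangled
        depth = max(depth_of.values())  # longest retweet chain
        per_level = defaultdict(int)
        for d in depth_of.values():
            per_level[d] += 1
        breadth = max(per_level.values())  # widest single level of the cascade
        return size, depth, breadth

    # Example: A posts, B and C retweet A, D retweets B.
    print(cascade_metrics([("A", "B"), ("A", "C"), ("B", "D")]))  # (4, 2, 2)
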
So if you read the literature, it is well known that human attention is drawn to novelty, things that are new in the environment. And if you read the sociology literature, you know that we like to share novel information. It makes us seem like we have access to inside information, and we gain in status by spreading this kind of information.

So what we did was measure the novelty of an incoming true or false tweet, compared to the corpus of what that individual had seen in the 60 days prior on Twitter. But that wasn't enough, because we thought to ourselves, "Well, maybe false news is more novel in an information-theoretic sense, but maybe people don't perceive it as more novel."

So to understand people's perceptions of false news, we looked at the information and the sentiment contained in the replies to true and false tweets. And what we found was that, across a bunch of different measures of sentiment (surprise, disgust, fear, sadness, anticipation, joy and trust), replies to false news exhibited significantly more surprise and disgust, and replies to true news exhibited significantly more anticipation, joy and trust. The surprise corroborates our novelty hypothesis: this is new and surprising, and so we're more likely to share it.

At the same time, there was congressional testimony in front of both houses of Congress in the United States, looking at the role of bots in the spread of misinformation. So we looked at this too: we used multiple sophisticated bot-detection algorithms to find the bots in our data and to pull them out. We pulled them out, we put them back in and we compared what happened to our measurements. And what we found was that, yes indeed, bots were accelerating the spread of false news online, but they were accelerating the spread of true news at approximately the same rate, which means bots are not responsible for the differential diffusion of truth and falsity online. We can't abdicate that responsibility, because we, humans, are responsible for that spread.
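
As a rough illustration of that bot-removal check, the sketch below recomputes a simple diffusion statistic with and without accounts flagged as bots; the data layout, the bot flag and the statistic itself are simplified assumptions for illustration, not the study's actual method.

    from statistics import mean

    def mean_cascade_size(cascades, include_bots=True):
        """Average number of retweeting users per story.

        `cascades` is assumed to be a list of dicts like
        {"veracity": "false", "retweeters": [{"user": "u1", "is_bot": False}, ...]}.
        """
        sizes = []
        for cascade in cascades:
            users = [r for r in cascade["retweeters"]
                     if include_bots or not r["is_bot"]]
            sizes.append(len(users))
        return mean(sizes) if sizes else 0.0

    def false_to_true_ratio(cascades, include_bots=True):
        """How much further false stories spread than true ones, on average."""
        false_cascades = [c for c in cascades if c["veracity"] == "false"]
        true_cascades = [c for c in cascades if c["veracity"] == "true"]
        return (mean_cascade_size(false_cascades, include_bots) /
                mean_cascade_size(true_cascades, include_bots))

    # If bots accelerated true and false news at about the same rate, this ratio
    # should barely move when bot accounts are excluded:
    # false_to_true_ratio(cascades, include_bots=True)
    # false_to_true_ratio(cascades, include_bots=False)
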
Now, everything that I have told you so far, unfortunately for all of us, is the good news.

The reason is that it's about to get a whole lot worse. And two specific technologies are going to make it worse. We are going to see the rise of a tremendous wave of synthetic media: fake video and fake audio that are very convincing to the human eye. And this will be powered by two technologies.

The first of these is known as "generative adversarial networks." This is a machine-learning model with two networks: a discriminator, whose job it is to determine whether something is real or fake, and a generator, whose job it is to generate synthetic media. So the generator generates synthetic video or audio, and the discriminator tries to tell, "Is this real or is this fake?" And in fact, it is the job of the generator to maximize the likelihood that it will fool the discriminator into thinking the synthetic video and audio that it is creating is actually real. Imagine a machine in a hyperloop, trying to get better and better at fooling us.

This, combined with the second technology, which is essentially the democratization of artificial intelligence to the people, the ability for anyone, without any background in artificial intelligence or machine learning, to deploy these kinds of algorithms to generate synthetic media, makes it ultimately so much easier to create such videos.

The White House issued a false, doctored video of a journalist interacting with an intern who was trying to take his microphone. They removed frames from this video in order to make his actions seem more punchy. And when videographers and stuntmen and women were interviewed about this type of technique, they said, "Yes, we use this in the movies all the time to make our punches and kicks look more choppy and more aggressive." They then put out this video and partly used it as justification to revoke the press pass of Jim Acosta, the reporter, from the White House. And CNN had to sue to have that press pass reinstated.
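
To make the adversarial loop described a moment ago concrete, here is a minimal, illustrative generator-versus-discriminator training step written in PyTorch. The tiny architectures, placeholder dimensions and hyperparameters are assumptions for the sketch; real synthetic-media systems are far larger and operate on video and audio rather than toy vectors.

    import torch
    import torch.nn as nn

    latent_dim, data_dim = 64, 784  # placeholder sizes, e.g. a flattened 28x28 image

    # Generator: turns random noise into a synthetic sample.
    generator = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, data_dim), nn.Tanh(),
    )

    # Discriminator: scores a sample's probability of being real.
    discriminator = nn.Sequential(
        nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1), nn.Sigmoid(),
    )

    bce = nn.BCELoss()
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

    def training_step(real_batch):
        batch_size = real_batch.size(0)
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # 1) Discriminator step: learn to call real media "real"
        #    and synthetic media "fake".
        noise = torch.randn(batch_size, latent_dim)
        fake_batch = generator(noise).detach()
        d_loss = (bce(discriminator(real_batch), real_labels) +
                  bce(discriminator(fake_batch), fake_labels))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # 2) Generator step: adjust the generator so its synthetic output is
        #    scored as "real", i.e. maximize the chance of fooling the discriminator.
        noise = torch.randn(batch_size, latent_dim)
        g_loss = bce(discriminator(generator(noise)), real_labels)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
        return d_loss.item(), g_loss.item()

Each call to training_step nudges the discriminator to separate real samples from synthetic ones and nudges the generator to erase that separation, which is exactly the escalating feedback loop described above.
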
There are about five different paths that I can think of that we can follow to try and address some of these very difficult problems today. Each one of them has promise, but each one of them has its own challenges. The first one is labeling. Think about it this way: when you go to the grocery store to buy food to consume, it's extensively labeled. You know how many calories it has, how much fat it contains. And yet when we consume information, we have no labels whatsoever. What is contained in this information? Is the source credible? Where is this information gathered from? We have none of that information when we are consuming information. That is a potential avenue, but it comes with its challenges. For instance, who gets to decide, in society, what's true and what's false? Is it the governments? Is it Facebook? Is it an independent consortium of fact-checkers? And who's checking the fact-checkers?

Another potential avenue is incentives. We know that during the US presidential election there was a wave of misinformation that came from Macedonia that didn't have any political motive but instead had an economic motive. And this economic motive existed because false news travels so much farther, faster and more deeply than the truth, and you can earn advertising dollars as you garner eyeballs and attention with this type of information. But if we can depress the spread of this information, perhaps it would reduce the economic incentive to produce it in the first place.

Third, we can think about regulation, and certainly, we should think about this option. In the United States, currently, we are exploring what might happen if Facebook and others are regulated. While we should consider things like regulating political speech, labeling the fact that it's political speech, making sure foreign actors can't fund political speech, it also has its own dangers. For instance, Malaysia just instituted a six-year prison sentence for anyone found spreading misinformation.
And in authoritarian regimes, these kinds of policies can be used to suppress minority opinions and to continue to extend repression.

The fourth possible option is transparency. We want to know how Facebook's algorithms work. How does the data combine with the algorithms to produce the outcomes that we see? We want them to open the kimono and show us exactly the inner workings of how Facebook operates. And if we want to know social media's effect on society, we need scientists, researchers and others to have access to this kind of information. But at the same time, we are asking Facebook to lock everything down, to keep all of the data secure.

So, Facebook and the other social media platforms are facing what I call a transparency paradox. We are asking them to be open and transparent and, simultaneously, secure. This is a very difficult needle to thread, but they will need to thread this needle if we are to achieve the promise of social technologies while avoiding their peril.

The final thing that we could think about is algorithms and machine learning: technology devised to root out and understand fake news, how it spreads, and to try and dampen its flow. Humans have to be in the loop of this technology, because we can never escape the fact that underlying any technological solution or approach is a fundamental ethical and philosophical question about how we define truth and falsity, to whom we give the power to define truth and falsity, which opinions are legitimate, which types of speech should be allowed and so on. Technology is not a solution for that. Ethics and philosophy are a solution for that.

Nearly every theory of human decision making, human cooperation and human coordination has some sense of the truth at its core. But with the rise of fake news, the rise of fake video, the rise of fake audio, we are teetering on the brink of the end of reality, where we cannot tell what is real from what is fake. And that's potentially incredibly dangerous.

We have to be vigilant in defending the truth against misinformation: with our technologies, with our policies and, perhaps most importantly, with our own individual responsibilities, decisions, behaviors and actions.

Thank you very much.

(Applause)