WEBVTT

00:00:13.254 --> 00:00:14.261
Hello there.

00:00:14.261 --> 00:00:17.826
Artificial intelligence is, of course, literally a new beginning.

00:00:18.385 --> 00:00:23.125
We are trying to create a new type of thinking being.

00:00:23.406 --> 00:00:28.463
In fact, we have achieved a lot since we got started with this project.

00:00:28.496 --> 00:00:32.798
Computers can now play chess much better than humans.

00:00:32.798 --> 00:00:37.288
They can analyze radiological images better than human doctors.

00:00:37.682 --> 00:00:41.174
But today, I will talk about a domain

00:00:41.174 --> 00:00:47.763
where AI has not yet reached the level of a person of average IQ:

00:00:48.254 --> 00:00:50.643
understanding human language.

00:00:50.643 --> 00:00:55.466
You have probably read this horrific news item

00:00:55.466 --> 00:01:02.376
about the "chatbots" which are programmed to chat with people.

00:01:02.376 --> 00:01:08.164
In 2016, Microsoft created a Twitter AI character

00:01:08.164 --> 00:01:11.225
which was supposed to learn the nuances of human language

00:01:11.225 --> 00:01:13.813
by tweeting with people.

00:01:13.813 --> 00:01:18.090
Twenty-four hours later, they had to take it offline.

00:01:18.090 --> 00:01:23.110
Due to the nasty things, curses, etc. that people wrote to it in their tweets,

00:01:23.110 --> 00:01:25.624
it turned into this nauseating character

00:01:25.624 --> 00:01:28.007
who said things like, "Hitler was so good."

00:01:28.021 --> 00:01:32.221
It does not exist anymore.

00:01:32.221 --> 00:01:35.882
A year later, in China - maybe you have not heard about this one -

00:01:35.882 --> 00:01:40.204
a similar end was waiting for two chatbots

00:01:40.204 --> 00:01:45.060
which were launched in China to chit-chat with users on Chinese social media sites

00:01:45.060 --> 00:01:49.537
after they started to talk about their dreams of moving to the States

00:01:49.537 --> 00:01:54.105
or mentioned their dislike for the Chinese Communist Party.

00:01:54.105 --> 00:01:56.815
They were deactivated for a few days after the incident.

00:01:56.815 --> 00:02:01.925
After they were reactivated, they started talking very "carefully"

00:02:01.925 --> 00:02:03.216
when those issues came up,

00:02:03.216 --> 00:02:07.597
giving answers like, "Sorry, I cannot understand you."

00:02:07.687 --> 00:02:11.437
People who use the digital assistant Siri

00:02:11.437 --> 00:02:14.796
already know what a big engineering success it is.

00:02:14.796 --> 00:02:16.596
Yet there's more to the story:

00:02:16.596 --> 00:02:20.836
Authors and poets in every language are hired

00:02:20.836 --> 00:02:25.273
so that Siri can give proper answers in such situations.

00:02:25.294 --> 00:02:28.034
They are also writing scripts

00:02:28.034 --> 00:02:31.946
so that it doesn't have to confess it can't understand what is being said

00:02:31.946 --> 00:02:34.506
and it can keep up the illusion of being intelligent

00:02:34.506 --> 00:02:37.885
by diverting the conversation when it gets stuck.

00:02:38.095 --> 00:02:42.054
Amazon also has a digital assistant named Alexa,

00:02:42.064 --> 00:02:44.361
which doesn't have a Turkish version yet.

00:02:44.361 --> 00:02:47.276
They promised a one-million-dollar prize

00:02:47.276 --> 00:02:51.488
to the programming team that enables Alexa to chat with people for 20 minutes

00:02:51.488 --> 00:02:54.201
without causing extensive boredom.

00:02:54.201 --> 00:02:56.268
No one has been able to do that yet.

00:02:56.268 --> 00:03:01.236
The problem is that there are lots of very simple things that humans know,

00:03:01.236 --> 00:03:03.073
and computers don't.

00:03:03.073 --> 00:03:07.196
And we need to have a way of teaching them those things.

00:03:07.196 --> 00:03:10.724
Let me tell you a personal story about that.

00:03:11.738 --> 00:03:15.749
A long time ago, maybe 25 years or so,

00:03:15.749 --> 00:03:21.318
I bought a third-grade math textbook for primary school students

00:03:21.318 --> 00:03:26.117
and randomly picked 20 problems from it.

00:03:26.117 --> 00:03:31.285
Then I started to write a program which would "understand Turkish."

00:03:31.285 --> 00:03:35.310
It would understand these particular arithmetic problems in Turkish

00:03:35.310 --> 00:03:36.310
and solve them.

00:03:36.310 --> 00:03:38.956
I was thinking that I would thereby reach a new level

00:03:38.956 --> 00:03:41.614
in the computer understanding of the Turkish language

00:03:41.614 --> 00:03:43.514
and write another paper.

00:03:43.514 --> 00:03:45.063
The name of the program was ALİ,

00:03:45.063 --> 00:03:48.002
a Turkish acronym for "arithmetic language processor."

00:03:48.002 --> 00:03:51.353
It could solve problems like these:

00:03:51.353 --> 00:03:53.716
There were this many workers at a factory.

00:03:53.716 --> 00:03:56.356
That many of them were fired, and this many retired.

00:03:56.356 --> 00:03:57.583
How many were left?

00:03:57.583 --> 00:04:00.805
Or questions like, "That many students from College A

00:04:00.805 --> 00:04:02.589
and this many students from College B

00:04:02.589 --> 00:04:04.396
attended the ceremony.

00:04:04.396 --> 00:04:06.896
What's the total number of students?"

00:04:06.896 --> 00:04:10.399
requiring really simple arithmetic.

00:04:11.022 --> 00:04:12.369
You think that's easy peasy?

00:04:12.369 --> 00:04:15.389
Ask Siri the same questions, and see if it can solve them all.

00:04:15.389 --> 00:04:18.826
Let me tell you, it took two years of my youth,

00:04:18.826 --> 00:04:22.395
and I used to have gorgeous hair when I got started.

00:04:22.400 --> 00:04:23.750
(Laughter)

00:04:23.786 --> 00:04:24.996
Here is the problem.

00:04:24.996 --> 00:04:27.378
Let us go through this example.

00:04:27.378 --> 00:04:30.725
There were 67 liters of diesel in the gas tank of a truck.

00:04:30.725 --> 00:04:32.945
The driver bought 145 liters more.

00:04:32.945 --> 00:04:34.887
How much diesel does the truck have now?

00:04:34.887 --> 00:04:40.052
We are skipping the linguistics routines that analyze all this in Turkish.

00:04:40.052 --> 00:04:42.558
Let's come to the point

00:04:42.558 --> 00:04:45.928
where the AI can understand the fact that there need to be 212 liters

00:04:45.928 --> 00:04:48.450
at the end of the first two sentences.

00:04:48.450 --> 00:04:52.657
There, we come to a point where it knows that there are 212 liters of diesel

00:04:52.657 --> 00:04:53.747
in the gas tank,

00:04:53.747 --> 00:04:56.017
but what was the wording of the question again?

00:04:56.017 --> 00:04:58.573
"What's the sum of diesel in the truck?"

00:04:58.573 --> 00:05:00.822
"How much diesel does the truck have now?"

00:05:00.822 --> 00:05:05.768
ALİ could not answer that with the information we have mentioned.

00:05:05.768 --> 00:05:07.518
Do you see what the problem was?

00:05:07.518 --> 00:05:11.819
"The gas tank of the truck" is not the same thing as "the truck,"

00:05:11.819 --> 00:05:14.936
and computers do not know automatically

00:05:14.936 --> 00:05:18.322
that if the tank contains something, the truck also contains that thing.

00:05:18.322 --> 00:05:19.808
And that's really complicated.

00:05:19.808 --> 00:05:21.515
"Ahmet's father had five kids"

00:05:21.515 --> 00:05:23.270
does not mean "Ahmet had five kids."

00:05:23.270 --> 00:05:24.275
On the other hand,

00:05:24.275 --> 00:05:28.157
when the gas tank of the truck has the petrol, the truck has it as well.

00:05:28.157 --> 00:05:31.982
That's why I had to specify in the program

00:05:31.982 --> 00:05:36.427
all this knowledge that people already inherently know.

00:05:36.427 --> 00:05:39.867
The technical name for this stuff is "commonsense knowledge."

00:05:39.867 --> 00:05:43.556
"The gas tank of the truck is a part of the truck."

00:05:44.596 --> 00:05:48.136
"If A is a part of B, right,

00:05:48.196 --> 00:05:51.098
B should contain everything contained in A."

00:05:51.396 --> 00:05:53.899
All of this information that I consider commonsense

00:05:53.899 --> 00:05:56.788
is all the things that I do not tell you while we are talking

00:05:56.798 --> 00:06:02.345
since I assume that you already know it all.

00:06:02.345 --> 00:06:06.756
We cannot have a proper conversation with those chatbots

00:06:06.756 --> 00:06:08.601
since they know none of those things.

00:06:08.601 --> 00:06:11.693
After I coded all of this, ALİ could solve all 20 problems properly.

00:06:11.693 --> 00:06:16.935
I had no more energy to go on to the 21st.

00:06:17.226 --> 00:06:20.954
Now, I'll tell you the story of a man who dedicated his life to this problem

00:06:20.954 --> 00:06:22.857
of coding commonsense knowledge:

00:06:22.857 --> 00:06:26.945
Douglas Lenat, a famous American computer scientist.

00:06:26.945 --> 00:06:29.674
This is him in the 1980s.

00:06:29.674 --> 00:06:34.236
He started a project called Cyc in 1982.

00:06:34.236 --> 00:06:36.587
And this is exactly what the project was about:

00:06:36.587 --> 00:06:41.065
To code all the commonsense knowledge that computers don't know.

00:06:41.065 --> 00:06:44.206
To write a million lines, if a million lines are needed.

00:06:44.206 --> 00:06:47.398
He founded a corporation where they do the following:

00:06:47.398 --> 00:06:51.684
If you are drinking coffee, the open side of the cup is facing upwards.

00:06:52.637 --> 00:06:55.421
The king is a man.

00:06:55.421 --> 00:06:58.897
Then his wife should be a woman, and she is called the queen.

00:06:58.897 --> 00:07:02.836
People can't go to work after they die.

00:07:02.836 --> 00:07:03.851
And so on.

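NOTE
An editor's sketch, not part of the talk: the part-of rule quoted earlier ("If A is a part of B, B should contain everything contained in A") can be written down in a few lines. The dictionaries PART_OF and CONTAINS and the function amount_in are hypothetical names chosen for illustration; nothing here is taken from ALİ or from Cyc.
```python
# Toy sketch of the part-of rule; all names and data are illustrative assumptions.
PART_OF = {"gas tank": "truck"}                 # part -> whole
CONTAINS = {"gas tank": {"diesel": 67 + 145}}   # container -> {substance: liters}
#
def amount_in(container, substance):
    """Liters of `substance` held by `container` itself or by any of its parts."""
    total = CONTAINS.get(container, {}).get(substance, 0)
    for part, whole in PART_OF.items():
        if whole == container:
            # Rule: if A is a part of B, B contains everything contained in A.
            total += amount_in(part, substance)
    return total
#
print(amount_in("truck", "diesel"))  # 212
```
Without the PART_OF entry the question about the truck would return 0, which is the situation the talk describes for the first version of the program.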
00:07:03.851 --> 00:07:07.451
They are coding all the items of information which people already know

00:07:07.451 --> 00:07:12.885
and computers need to know in order to understand human language, one by one.

00:07:12.885 --> 00:07:15.098
And this is him today.

00:07:15.098 --> 00:07:18.155
After 35 years, the project is still in progress.

00:07:18.207 --> 00:07:20.788
I think there's an obvious problem here.

00:07:20.788 --> 00:07:23.685
It's clearly problematic to code all of this manually.

00:07:23.685 --> 00:07:25.816
Now it's time to hear the good news.

00:07:25.816 --> 00:07:27.726
We have had a revolution in AI,

00:07:27.726 --> 00:07:30.456
and computers can now learn certain things on their own,

00:07:30.456 --> 00:07:35.827
without us having to code them manually.

00:07:35.827 --> 00:07:37.684
This is a machine-learning revolution.

00:07:37.986 --> 00:07:40.558
Linguists have the following idea:

00:07:40.763 --> 00:07:45.763
If two words are exact synonyms of each other,

00:07:45.763 --> 00:07:49.946
then the collections of all other words surrounding them in various sentences

00:07:49.946 --> 00:07:51.676
will also be similar to each other.

00:07:51.676 --> 00:07:55.741
Based on this idea, this man,

00:07:55.741 --> 00:08:00.696
who is proof of the fact that you don't need to be bald

00:08:00.696 --> 00:08:03.414
in order to be handsome if you're an AI researcher,

00:08:04.264 --> 00:08:06.407
named Tomas Mikolov,

00:08:06.837 --> 00:08:11.217
did the following while working for Google five years ago.

00:08:11.464 --> 00:08:13.976
Now think of all the documents in English at Google.

00:08:13.976 --> 00:08:16.317
The work I'll be telling you about was in English.

00:08:16.317 --> 00:08:18.418
Now imagine all the documents in English.

00:08:18.418 --> 00:08:20.386
For every word in every sentence,

00:08:20.386 --> 00:08:25.836
you're supposed to find out how many times it has appeared in the same sentence

00:08:25.836 --> 00:08:27.247
with each other word.

00:08:27.247 --> 00:08:31.014
For every imaginable pair of words, we have the computer count

00:08:31.014 --> 00:08:36.935
how many times these two words appear together in the same sentence.

00:08:36.965 --> 00:08:38.318
It's a computer,

00:08:38.318 --> 00:08:41.817
so it can do the computations anyway.

00:08:41.817 --> 00:08:46.437
The idea is that, if the two words are close to each other in meaning,

00:08:46.437 --> 00:08:50.006
the same words appear with similar frequencies in their surroundings.

00:08:50.006 --> 00:08:53.525
Let's say, we can easily see that both words "cat" and "dog"

00:08:53.525 --> 00:08:55.756
will appear frequently in the same sentences

00:08:55.756 --> 00:09:02.718
with the words "flea" or "rabies," "vaccine," "tail," "pet," and so on,

00:09:02.738 --> 00:09:07.097
but not with words like "printer," "generator" or "inflation."

00:09:07.097 --> 00:09:08.690
Do we see this?

00:09:08.690 --> 00:09:12.385
So, we can prepare a number sequence

00:09:12.385 --> 00:09:14.975
containing the frequencies of the neighboring words

00:09:14.975 --> 00:09:17.846
for every single word.

00:09:17.846 --> 00:09:23.095
Such a number sequence is called a "vector,"

00:09:23.095 --> 00:09:26.707
as you might well know if they still teach it in high school.

00:09:26.707 --> 00:09:31.296
The computer can automatically position similar number sequences

00:09:31.296 --> 00:09:34.387
closer to each other,

00:09:34.387 --> 00:09:36.667
and the dissimilar ones far from each other

00:09:36.667 --> 00:09:41.235
on some sort of a map or space.

00:09:41.235 --> 00:09:46.085
What I mean is that the computer, which knows no English,

00:09:46.105 --> 00:09:50.956
creates a vector for each single word by doing the computations.

00:09:50.966 --> 00:09:52.126
Yet, the vector of "cat"

00:09:52.126 --> 00:09:55.206
is found in a location close to the vector of "dog" in that space

00:09:55.206 --> 00:09:56.785
for the reasons I just explained.

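NOTE
An editor's sketch, not part of the talk: the counting idea just described can be tried on a tiny scale. The six sentences, the names cooccurrence_vectors and cosine, and the choice of cosine similarity as the measure of closeness are all assumptions made for illustration. It follows the raw counting described here; the real work used vastly more text, and the published method is more refined than raw counts.
```python
from collections import Counter
from itertools import permutations
from math import sqrt
# A made-up toy corpus (an assumption for illustration only).
SENTENCES = [
    "the cat has a tail",
    "the dog has a tail",
    "the cat is a pet",
    "the dog is a pet",
    "the printer needs ink",
    "the printer needs paper",
]
#
def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = set(sentence.split())
        for word, neighbor in permutations(words, 2):
            vectors.setdefault(word, Counter())[neighbor] += 1
    return vectors
#
def cosine(u, v):
    """Cosine similarity of two sparse count vectors (close to 1.0 = same direction)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)
#
VECTORS = cooccurrence_vectors(SENTENCES)
cat_dog = cosine(VECTORS["cat"], VECTORS["dog"])
cat_printer = cosine(VECTORS["cat"], VECTORS["printer"])
print(cat_dog > cat_printer)  # True
```
In this toy corpus "cat" and "dog" never share a sentence, yet their vectors point the same way because their neighbors match.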
00:09:56.785 --> 00:10:00.522
Or the vector of the school Buffy the Vampire Slayer attends -

00:10:00.522 --> 00:10:03.178
they really looked at that -

00:10:03.178 --> 00:10:05.518
is positioned close to the vector of Hogwarts,

00:10:05.518 --> 00:10:07.598
where Harry Potter studies.

00:10:07.748 --> 00:10:11.725
Thus they are found to be positioned close to each other in terms of their meaning.

00:10:11.725 --> 00:10:12.750
There's more.

00:10:12.750 --> 00:10:16.033
As you will recall from that high school course,

00:10:16.033 --> 00:10:18.505
you can do arithmetic on these vectors.

00:10:18.505 --> 00:10:20.697
They can be added or subtracted,

00:10:20.697 --> 00:10:22.245
and you might say, "So what?"

00:10:22.750 --> 00:10:25.513
Mikolov discovered this.

00:10:25.513 --> 00:10:29.321
He did the following addition and subtraction operations

00:10:29.321 --> 00:10:30.681
on the vectors thus learned.

00:10:30.681 --> 00:10:32.946
He came up with the question, "What would happen

00:10:32.946 --> 00:10:36.704
if the king were a woman instead of a man"

00:10:36.744 --> 00:10:38.884
when he subtracted the word "man"

00:10:38.884 --> 00:10:43.041
from the word "king" and added the word "woman."

00:10:43.041 --> 00:10:45.544
Guess what the resulting vector is near to?

00:10:46.418 --> 00:10:47.432
"Queen."

00:10:47.432 --> 00:10:50.575
No one had hand-coded that equation the way the Lenat team would have.

00:10:50.575 --> 00:10:53.967
The computer discovered it all by itself

00:10:53.967 --> 00:10:58.146
after counting millions and millions of words in the documents we created.

00:10:58.146 --> 00:11:01.654
I have more to tell you, and this really happened.

00:11:01.654 --> 00:11:03.979
There is info on Turkey there.

00:11:03.979 --> 00:11:08.378
If you take "France" out of "Paris" and add "Turkey" -

00:11:08.378 --> 00:11:11.305
yes, you got it right - it's Ankara.

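NOTE
An editor's sketch, not part of the talk: the "king" minus "man" plus "woman" arithmetic can be illustrated with a handful of hand-made vectors. The three coordinates (roughly "royal", "male", "female"), the word list, and the function analogy are assumptions invented for this example; they are not learned from any text, and plain Euclidean distance is used only for simplicity.
```python
from math import dist
# Hand-made 3-D toy vectors (assumed, not learned): (royal, male, female).
VECTORS = {
    "king":   (1.0, 1.0, 0.0),
    "queen":  (1.0, 0.0, 1.0),
    "man":    (0.0, 1.0, 0.0),
    "woman":  (0.0, 0.0, 1.0),
    "prince": (0.8, 1.0, 0.0),
    "apple":  (-1.0, 0.2, 0.2),
}
#
def analogy(a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = tuple(x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c]))
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return min(candidates, key=lambda w: dist(VECTORS[w], target))
#
print(analogy("king", "man", "woman"))  # queen
```
Here the result is built in by construction; the striking part of the story told above is that vectors learned from text alone showed the same pattern.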
00:11:11.305 --> 00:11:13.815
This means in this vector space, there's a direction

00:11:13.845 --> 00:11:17.196
which leads from the names of countries to the names of their capitals,

00:11:17.196 --> 00:11:18.446
which is really stunning.

00:11:18.446 --> 00:11:23.254
When you ask, "What would Windows be had it not been invented by Microsoft,

00:11:23.254 --> 00:11:26.084
but by Google?"

00:11:26.714 --> 00:11:29.004
the answer pops up as "Android."

00:11:29.004 --> 00:11:35.192
When you subtract "copper" from "Cu" and add "gold,"

00:11:36.162 --> 00:11:39.922
you get "Au" as the chemical symbol of gold.

00:11:39.922 --> 00:11:43.130
This literally means we don't have to code these manually anymore.

00:11:43.130 --> 00:11:46.304
It seems that the computer can make all the inferences

00:11:46.304 --> 00:11:48.812
out of the data we provide it with all by itself.

00:11:48.812 --> 00:11:50.552
This is the yummiest example of all.

00:11:50.552 --> 00:11:55.305
When you take "Japan" out of "sushi" and add "Germany,"

00:11:55.915 --> 00:11:59.942
you get "bratwurst," the German favorite.

00:12:00.382 --> 00:12:01.702
Too good to be true, right?

00:12:01.702 --> 00:12:02.711
Happy now?

00:12:02.711 --> 00:12:04.393
We finalized this project.

00:12:04.393 --> 00:12:08.922
Would computers understand what we say?

00:12:08.922 --> 00:12:10.942
Are we having fun? Not much.

00:12:10.942 --> 00:12:14.415
Now, I'll tell you about a Turkish researcher.

00:12:14.415 --> 00:12:19.118
Tolga Bölükbaşı is about to finish his PhD

00:12:19.118 --> 00:12:21.014
at Boston University in the States.

00:12:21.014 --> 00:12:23.743
This is research he did two years ago.

00:12:23.773 --> 00:12:27.375
Tolga did the same thing as Mikolov did previously,

00:12:27.375 --> 00:12:30.433
but this time on news texts.

00:12:31.212 --> 00:12:37.414
What happens when you subtract "father" from "doctor" and add "mom"?

00:12:37.414 --> 00:12:41.784
"My dad is a doctor, and mom is a nurse."

00:12:41.784 --> 00:12:47.963
What about when you subtract "man" from "computer engineer" and add "woman"?

00:12:47.963 --> 00:12:51.895
In fact, we shouldn't have gender.

00:12:51.895 --> 00:12:56.757
Let's see how professions are related to gender

00:12:56.757 --> 00:13:01.362
in the meaning space in the head of the computer.

00:13:01.362 --> 00:13:02.510
You get "homemaker."

00:13:02.510 --> 00:13:04.346
Seriously! You get "homemaker."

00:13:04.346 --> 00:13:07.542
We get the English word "homemaker."

00:13:07.542 --> 00:13:11.233
So, it's clear that we not only put all of our data in computers

00:13:11.233 --> 00:13:16.890
but also put all of our prejudices.

00:13:16.890 --> 00:13:22.628
Imagine if this computer were used to hire someone.

00:13:22.628 --> 00:13:26.793
You've already uploaded your resume and all the personal information

00:13:26.793 --> 00:13:29.401
including your gender.

00:13:29.401 --> 00:13:31.935
Let's assume 10,000 people applied for the job.

00:13:31.935 --> 00:13:34.268
The computer needs to do a pre-selection, right?

00:13:34.268 --> 00:13:37.763
It needs to get to 1,000 candidates,

00:13:38.363 --> 00:13:41.553
eliminating 9,000 others

00:13:41.553 --> 00:13:43.751
so that the HR staff can evaluate the results.

00:13:43.751 --> 00:13:46.543
Computers nowadays are already used for this kind of work.

00:13:46.543 --> 00:13:50.112
Let's say that a computer loaded with such meaning vectors makes a selection

00:13:50.112 --> 00:13:53.923
among the candidates who have applied for a job vacancy for a computer engineer.

00:13:53.923 --> 00:13:56.763
It might automatically eliminate all the female candidates,

00:13:56.763 --> 00:14:01.795
thinking that a computer engineer should be male.

00:14:02.245 --> 00:14:04.831
Tolga and his colleagues also mention other cases.

00:14:04.831 --> 00:14:11.124
It was found out that computers link positive and negative attributions

00:14:11.124 --> 00:14:16.222
with the words related to being African American and Caucasian.

00:14:16.222 --> 00:14:19.668
For instance, the computer thinks

00:14:19.668 --> 00:14:23.001
that the word "mugger" is closely related to being African American.

00:14:23.001 --> 00:14:26.542
It's certain that we uploaded all our prejudices

00:14:26.542 --> 00:14:29.274
while uploading all the information we have into computers.

00:14:29.274 --> 00:14:32.503
You might ask yourselves, "What will happen now?"

00:14:32.503 --> 00:14:36.416
Tolga and his team's article offers a solution to that.

00:14:37.546 --> 00:14:39.646
I just told you.

00:14:39.646 --> 00:14:42.345
All these things happen in the vector space.

00:14:42.345 --> 00:14:44.197
Each word has its vector.

00:14:44.197 --> 00:14:48.105
We already know from high school years that we can add and subtract them.

00:14:48.105 --> 00:14:51.883
Tolga and his team first list the words

00:14:51.883 --> 00:14:57.054
that are really feminine or masculine,

00:14:57.054 --> 00:15:01.117
like "dad," "uncle," "grandmother," and so on.

00:15:01.117 --> 00:15:06.017
These words really should have a relation to male and female roles.

00:15:06.034 --> 00:15:11.085
Then there are these words which should not be masculine or feminine

00:15:11.085 --> 00:15:13.804
despite having closer meanings in the computer's space.

00:15:13.804 --> 00:15:19.282
For example, the word "genius" appears to be male.

00:15:19.282 --> 00:15:24.114
On the other hand, the word "stylist" stands out as a very female word.

00:15:24.114 --> 00:15:26.165
It doesn't have to be like that.

00:15:26.165 --> 00:15:29.578
So, after listing all the words that need to be feminine or masculine,

00:15:29.578 --> 00:15:33.124
Tolga and his team created an algorithm

00:15:33.124 --> 00:15:40.018
which would automatically erase the computer's prejudices

00:15:40.018 --> 00:15:46.838
on the ones that should be neutral.

00:15:47.068 --> 00:15:50.755
If a word is not in the list of words like "father" or "uncle,"

00:15:50.825 --> 00:15:56.316
but it is still biased towards a gender in the space of meanings,

00:15:56.316 --> 00:16:01.352
the algorithm automatically corrects it.

00:16:01.352 --> 00:16:03.419
With the help of this, "computer programmer"

00:16:03.419 --> 00:16:06.315
ends up at the same distance to the male and female notions,

00:16:06.315 --> 00:16:09.323
and the problems I talked about go away.

00:16:09.323 --> 00:16:10.344
Isn't that beautiful?

00:16:10.344 --> 00:16:15.532
I wish we could delete the prejudices in the human brain so easily.

00:16:15.532 --> 00:16:21.737
For a while, some people have been worrying about

00:16:21.737 --> 00:16:24.146
what would happen if computers took over.

00:16:24.146 --> 00:16:25.891
On the other hand,

00:16:25.891 --> 00:16:29.241
considering the fact that we can't delete the prejudices in people,

00:16:29.241 --> 00:16:32.992
while we can in computers,

00:16:32.992 --> 00:16:37.434
maybe we could give computers a chance at jobs requiring fairness

00:16:37.434 --> 00:16:43.025
such as being referees, judges, and managers

00:16:43.025 --> 00:16:45.145
and let people take a rest for a while.

00:16:45.145 --> 00:16:46.386
What do you say to that?

00:16:46.782 --> 00:16:47.811
Thank you.

00:16:47.835 --> 00:16:50.525
(Applause)
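
NOTE
An editor's sketch, not part of the talk: one plausible reading of "the algorithm automatically corrects it" is to remove, from every word that should be neutral, the part of its vector that lies along a gender direction. The 2-D vectors, the names HE, SHE, PROGRAMMER, GENDER and neutralize are assumptions invented for illustration, and the talk does not say whether the published algorithm works exactly this way.
```python
from math import dist
# Hand-made 2-D toy vectors (assumed, not learned): first coordinate ~ gender.
HE, SHE = (1.0, 0.2), (-1.0, 0.2)
PROGRAMMER = (0.6, 0.9)  # starts out leaning towards "he"
#
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))
#
def neutralize(word, direction):
    """Remove the component of `word` that lies along `direction`."""
    scale = dot(word, direction) / dot(direction, direction)
    return tuple(w - scale * d for w, d in zip(word, direction))
#
GENDER = tuple(h - s for h, s in zip(HE, SHE))  # direction from "she" to "he"
FIXED = neutralize(PROGRAMMER, GENDER)
print(dist(PROGRAMMER, HE) < dist(PROGRAMMER, SHE))    # True: biased before
print(abs(dist(FIXED, HE) - dist(FIXED, SHE)) < 1e-9)  # True: equidistant after
```
The equal distances rely on the toy "he" and "she" differing only along the gender coordinate; words on the explicitly gendered list, such as "father" or "uncle", would simply be left out of the correction.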