The intelligence of the future: robots without prejudice | Cem Say | TEDxIstanbul

  • 0:13 - 0:14
    Hello there.
  • 0:14 - 0:18
    Artificial intelligence is, of course,
    literally a new beginning.
  • 0:18 - 0:23
    We are trying to create
    a new type of thinking being.
  • 0:23 - 0:28
    In fact, we have achieved a lot
    since we got started with this project.
  • 0:28 - 0:33
    Computers can now play chess
    much better than humans.
  • 0:33 - 0:37
    They can analyze radiological images
    better than human doctors.
  • 0:38 - 0:41
    But today, I will talk about a domain
  • 0:41 - 0:48
    where AI has not yet reached
    the level of a person of average IQ:
  • 0:48 - 0:51
    understanding human language.
  • 0:51 - 0:55
    You have probably read
    this horrific news item
  • 0:55 - 1:02
    about the "chatbots"
    which are programmed to chat with people.
  • 1:02 - 1:08
    In 2016, Microsoft created
    a Twitter AI character
  • 1:08 - 1:11
    which was supposed to learn
    the nuances of human language
  • 1:11 - 1:14
    by tweeting with people.
  • 1:14 - 1:18
    Twenty-four hours later,
    they had to take it offline.
  • 1:18 - 1:23
    Due to the nasty things, curses etc.
    that people wrote to it in their tweets,
  • 1:23 - 1:26
    it turned into this nauseating character
  • 1:26 - 1:28
    who said things like,
    "Hitler was so good."
  • 1:28 - 1:32
    It does not exist any more.
  • 1:32 - 1:36
    A year later, in China - maybe
    you have not heard about this one -
  • 1:36 - 1:40
    a similar end was waiting for two chatbots
  • 1:40 - 1:45
    which were launched in China to chit chat
    with users on Chinese social media sites
  • 1:45 - 1:50
    after they started to talk
    about their dreams of moving to the States
  • 1:50 - 1:54
    or mentioned their dislike
    for the Chinese Communist Party.
  • 1:54 - 1:57
    They were deactivated
    for a few days after the incident.
  • 1:57 - 2:02
    After they were reactivated,
    they started talking very "carefully"
  • 2:02 - 2:03
    when those issues came up,
  • 2:03 - 2:08
    giving answers like,
    "Sorry, I cannot understand you."
  • 2:08 - 2:11
    People who use the digital assistant Siri
  • 2:11 - 2:15
    already know what a big
    engineering success it is.
  • 2:15 - 2:17
    Yet there's more to the story:
  • 2:17 - 2:21
    Authors and poets
    in every language are hired
  • 2:21 - 2:25
    so that Siri can give proper answers
    in such situations.
  • 2:25 - 2:28
    They are also writing scripts
  • 2:28 - 2:32
    so that it doesn't have to confess
    it can't understand what is being said
  • 2:32 - 2:35
    and it can continue the illusion
    of being intelligent
  • 2:35 - 2:38
    by diverting the conversation
    when it gets stuck.
  • 2:38 - 2:42
    Amazon also has a digital
    assistant named Alexa,
  • 2:42 - 2:44
    which doesn't have a Turkish version yet.
  • 2:44 - 2:47
    They promised a one-million-dollar prize
  • 2:47 - 2:51
    to the programming team which will enable
    Alexa to chat with people for 20 minutes
  • 2:51 - 2:54
    without causing extensive boredom.
  • 2:54 - 2:56
    No one has been able to do that yet.
  • 2:56 - 3:01
    The problem is that there are lots
    of very simple things that humans know,
  • 3:01 - 3:03
    and computers don't.
  • 3:03 - 3:07
    And we need to have a way
    of teaching them those things.
  • 3:07 - 3:11
    Let me tell you
    a personal story about that.
  • 3:12 - 3:16
    A long time ago, maybe 25 years or so,
  • 3:16 - 3:21
    I bought a third-grade math textbook
    for primary school students
  • 3:21 - 3:26
    and randomly picked 20 problems from it.
  • 3:26 - 3:31
    Then I started to write a program
    which would "understand Turkish."
  • 3:31 - 3:35
    It would understand these particular
    arithmetic problems in Turkish
  • 3:35 - 3:36
    and solve them.
  • 3:36 - 3:39
    I was thinking that
    I would therefore reach a new level
  • 3:39 - 3:42
    in the computer understanding
    of the Turkish language
  • 3:42 - 3:44
    and write another paper.
  • 3:44 - 3:45
    The name of the program was ALİ,
  • 3:45 - 3:48
    a Turkish acronym for
    "arithmetic language processor."
  • 3:48 - 3:51
    It could solve problems like these:
  • 3:51 - 3:54
    There were this many workers at a factory.
  • 3:54 - 3:56
    That many of them were fired,
    and this many retired.
  • 3:56 - 3:58
    How many were left?
  • 3:58 - 4:01
    Or questions like,
    "That many students from College A
  • 4:01 - 4:03
    and this many students from College B
  • 4:03 - 4:04
    attended the ceremony.
  • 4:04 - 4:07
    What's the total number of students?"
  • 4:07 - 4:10
    requiring really simple arithmetic.
  • 4:11 - 4:12
    You think that's easy peasy?
  • 4:12 - 4:15
    Ask Siri the same questions,
    and see if it can solve them all.
  • 4:15 - 4:19
    Let me tell you,
    it took two years of my youth,
  • 4:19 - 4:22
    and I used to have gorgeous hair
    when I got started.
  • 4:22 - 4:24
    (Laughter)
  • 4:24 - 4:25
    Here is the problem.
  • 4:25 - 4:27
    Let us go through this example.
  • 4:27 - 4:31
    There were 67 liters of diesel
    in the gas tank of a truck.
  • 4:31 - 4:33
    The driver bought 145 liters more.
  • 4:33 - 4:35
    How much diesel does the truck have now?
  • 4:35 - 4:40
    We are skipping the linguistics routines
    that analyze all this in Turkish.
  • 4:40 - 4:43
    Let's come to the point
  • 4:43 - 4:46
    where the AI can understand
    the fact that there need to be 212 liters
  • 4:46 - 4:48
    at the end of the first two sentences.
  • 4:48 - 4:53
    There, we come to a point where it knows
    that there are 212 liters of diesel
  • 4:53 - 4:54
    in the gas tank,
  • 4:54 - 4:56
    but what was the wording
    of the question again?
  • 4:56 - 4:59
    "What's the sum of diesel in the truck?"
  • 4:59 - 5:01
    "How much diesel
    does the truck have now?"
  • 5:01 - 5:06
    ALİ could not answer that
    with the information we have mentioned.
  • 5:06 - 5:08
    Do you see what the problem was?
  • 5:08 - 5:12
    "The gas tank of the truck" is not
    the same thing as "the truck,"
  • 5:12 - 5:15
    and computers do not know automatically
  • 5:15 - 5:18
    that if the tank contains something,
    the truck also contains that thing.
  • 5:18 - 5:20
    And that's really complicated.
  • 5:20 - 5:22
    "Ahmet's father had five kids"
  • 5:22 - 5:23
    does not mean "Ahmet had five kids."
  • 5:23 - 5:24
    On the other hand,
  • 5:24 - 5:28
    when the gas tank of the truck
    has the petrol, the truck has it as well.
  • 5:28 - 5:32
    That's why I had to specify in the program
  • 5:32 - 5:36
    all this knowledge
    that people already inherently know.
  • 5:36 - 5:40
    The technical name for this stuff
    is "commonsense knowledge."
  • 5:40 - 5:44
    "The gas tank of the truck
    is a part of the truck."
  • 5:45 - 5:48
    "If A is a part of B, right,
  • 5:48 - 5:51
    B should contain
    everything contained in A."
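The two rules just quoted can be sketched in a few lines of Python. This is a minimal illustration, not ALİ's actual code: the dictionaries and the helper names `add` and `amount_in` are hypothetical, chosen only to show how a part-of rule lets a question about the truck be answered from facts about its gas tank.

```python
# Hypothetical miniature of the part-of rule; not ALİ's actual code.
part_of = {"gas tank": "truck"}   # "the gas tank of the truck is a part of the truck"
contents = {}                     # liters held directly by each object

def add(container, substance, liters):
    """Record that `liters` of `substance` were put into `container`."""
    held = contents.setdefault(container, {})
    held[substance] = held.get(substance, 0) + liters

def amount_in(obj, substance):
    """Liters of `substance` in `obj`, counting everything held by its parts."""
    total = contents.get(obj, {}).get(substance, 0)
    for part, whole in part_of.items():
        if whole == obj:
            total += amount_in(part, substance)
    return total

add("gas tank", "diesel", 67)    # "There were 67 liters of diesel in the gas tank"
add("gas tank", "diesel", 145)   # "The driver bought 145 liters more"
print(amount_in("truck", "diesel"))  # 212
```

The rule is attached to the part-of relation only, which is why nothing similar fires for a sentence like "Ahmet's father had five kids": a father is not a part of his son.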
  • 5:51 - 5:54
    All of this information
    that I consider commonsense
  • 5:54 - 5:57
    is all the things that I do not tell you
    while we are talking
  • 5:57 - 6:02
    since I assume
    that you already know it all.
  • 6:02 - 6:07
    We cannot have a proper conversation
    with those chatbots
  • 6:07 - 6:09
    since they know none of those things.
  • 6:09 - 6:12
    After I coded all of these,
    ALİ could solve all 20 problems properly.
  • 6:12 - 6:17
    I had no more energy to go on to the 21st.
  • 6:17 - 6:21
    Now, I'll tell you the story of a man
    who dedicated his life to this problem
  • 6:21 - 6:23
    of coding commonsense knowledge:
  • 6:23 - 6:27
    Douglas Lenat, a famous American
    computer scientist.
  • 6:27 - 6:30
    This is him in the 1980s.
  • 6:30 - 6:34
    He started a project called Cyc in 1982.
  • 6:34 - 6:37
    And this is exactly
    what the project was about:
  • 6:37 - 6:41
    To code all the commonsense knowledge
    that computers don't know.
  • 6:41 - 6:44
    To write a million lines,
    if a million lines are needed.
  • 6:44 - 6:47
    He founded a corporation
    where they do the following:
  • 6:47 - 6:52
    If you are drinking coffee, the open side
    of the cup is facing upwards.
  • 6:53 - 6:55
    The king is a man.
  • 6:55 - 6:59
    Then his wife should be a woman,
    and she is called the queen.
  • 6:59 - 7:03
    People can't go to work after they die.
  • 7:03 - 7:04
    And so on.
  • 7:04 - 7:07
    They are coding all the items
    of information which people already know
  • 7:07 - 7:13
    and computers need to know in order
    to understand human language, one by one.
  • 7:13 - 7:15
    And this is him today.
  • 7:15 - 7:18
    After 35 years, the project
    is still in progress.
  • 7:18 - 7:21
    I think there's an obvious problem here.
  • 7:21 - 7:24
    It's clearly problematic to code manually.
  • 7:24 - 7:26
    Now it's time to hear the good news.
  • 7:26 - 7:28
    We have had a revolution in AI,
  • 7:28 - 7:30
    and computers can now learn
    certain things on their own,
  • 7:30 - 7:36
    without us having to code them manually.
  • 7:36 - 7:38
    This is a machine-learning revolution.
  • 7:38 - 7:41
    Linguists have the following idea:
  • 7:41 - 7:46
    If two words are exact
    synonyms of each other,
  • 7:46 - 7:50
    then the collections of all other words
    surrounding them in various sentences
  • 7:50 - 7:52
    will also be similar to each other.
  • 7:52 - 7:56
    Based on this idea, this man,
  • 7:56 - 8:01
    who is proof of the fact
    that you don't need to be bald
  • 8:01 - 8:03
    in order to be handsome
    if you're an AI researcher,
  • 8:04 - 8:06
    named Tomas Mikolov,
  • 8:07 - 8:11
    did the following while working
    for Google five years ago.
  • 8:11 - 8:14
    Now think of all the documents
    in English at Google.
  • 8:14 - 8:16
    The work I'll be telling
    you about was in English.
  • 8:16 - 8:18
    Now imagine all the documents in English.
  • 8:18 - 8:20
    For every word in every sentence,
  • 8:20 - 8:26
    you're supposed to find out how many times
    it has appeared in the same sentence
  • 8:26 - 8:27
    with any other words.
  • 8:27 - 8:31
    For every imaginable pair of words,
    we have the computer count
  • 8:31 - 8:37
    how many times these two words appear
    together in the same sentence.
  • 8:37 - 8:38
    It's a computer,
  • 8:38 - 8:42
    so it can do the computations anyway.
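The counting step described here can be sketched as follows. The five sentences are a made-up stand-in for the document collection, and `cooccurrence_counts` is a hypothetical helper name. The talk gives a simplified picture: the published word2vec method learns its vectors by training a shallow neural network to predict nearby words rather than by storing raw counts, so this sketch follows the talk's description, not that implementation.

```python
from collections import Counter
from itertools import combinations

# Tiny made-up corpus standing in for "all the documents in English".
sentences = [
    "the cat has a tail",
    "the dog has a tail",
    "the dog needs a rabies vaccine",
    "the cat needs a rabies vaccine",
    "the printer needs ink",
]

def cooccurrence_counts(sentences):
    """Count, for each unordered word pair, the sentences that contain both."""
    counts = Counter()
    for sentence in sentences:
        # Sorting makes every pair appear as an alphabetically ordered key.
        words = sorted(set(sentence.split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts

counts = cooccurrence_counts(sentences)
print(counts[("cat", "tail")])      # 1
print(counts[("dog", "rabies")])    # 1
print(counts[("printer", "tail")])  # 0
```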
  • 8:42 - 8:46
    The idea is that, if the two words
    are close to each other in meaning,
  • 8:46 - 8:50
    the same words appear with similar
    frequencies in their surroundings.
  • 8:50 - 8:54
    Let's say, we can easily see
    that both words "cat" and "dog"
  • 8:54 - 8:56
    will appear frequently
    in the same sentences
  • 8:56 - 9:03
    with the words "flea" or "rabies,"
    "vaccine," "tail," "pet," and so on,
  • 9:03 - 9:07
    but not with words like "printer,"
    "generator" or "inflation."
  • 9:07 - 9:09
    Do we see this?
  • 9:09 - 9:12
    So, we can prepare a number sequence
  • 9:12 - 9:15
    containing the frequencies
    of the neighboring words
  • 9:15 - 9:18
    for every single word.
  • 9:18 - 9:23
    Such a number sequence
    is called a "vector,"
  • 9:23 - 9:27
    as you might well know
    if they still teach it in high school.
  • 9:27 - 9:31
    The computer can automatically position
    similar number sequences
  • 9:31 - 9:34
    closer to each other,
  • 9:34 - 9:37
    and the dissimilar ones
    far from each other
  • 9:37 - 9:41
    on some sort of a map or space.
  • 9:41 - 9:46
    What I mean is that the computer,
    which knows no English,
  • 9:46 - 9:51
    creates a vector for each single word
    by doing the computations.
  • 9:51 - 9:52
    Yet, the vector of "cat"
  • 9:52 - 9:55
    is found in a location
    close to the vector of "dog" in that space
  • 9:55 - 9:57
    for the reasons I just explained.
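One common way to measure how "close" two such number sequences are is cosine similarity. The sketch below uses it as an assumption, since the talk does not name a specific measure, and the neighbor counts are invented for illustration.

```python
import math

# Made-up neighbor counts; each position corresponds to a context word.
context = ["flea", "rabies", "tail", "pet", "ink", "inflation"]
vectors = {
    "cat":     [4, 3, 6, 9, 0, 0],
    "dog":     [5, 4, 5, 8, 0, 0],
    "printer": [0, 0, 0, 0, 7, 1],
}

def cosine(u, v):
    """Cosine of the angle between u and v: 1 means same direction, 0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_printer = cosine(vectors["cat"], vectors["printer"])
print(cat_dog > cat_printer)  # True
```

With these toy counts, "cat" and "printer" share no neighbors at all, so their similarity is exactly zero, while "cat" and "dog" come out nearly parallel.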
  • 9:57 - 10:01
    Or the vector of the school
    Buffy the Vampire Slayer attends -
  • 10:01 - 10:03
    they really looked at that -
  • 10:03 - 10:06
    is positioned close
    to the vector of Hogwarts,
  • 10:06 - 10:08
    where Harry Potter studies.
  • 10:08 - 10:12
    Thus they are found to be positioned close
    to each other in terms of their meaning.
  • 10:12 - 10:13
    There's more.
  • 10:13 - 10:16
    As you will recall
    from that high school course,
  • 10:16 - 10:19
    you can do arithmetic on these vectors.
  • 10:19 - 10:21
    They can be added or subtracted,
  • 10:21 - 10:22
    and you might say, "So what?"
  • 10:23 - 10:26
    Mikolov discovered this.
  • 10:26 - 10:29
    He did the following addition
    and subtraction operations
  • 10:29 - 10:31
    on the vectors thus learned.
  • 10:31 - 10:33
    He came up with the question,
    "What would happen
  • 10:33 - 10:37
    if the king were a woman instead of a man"
  • 10:37 - 10:39
    when he subtracted the word "man"
  • 10:39 - 10:43
    from the word "king"
    and added the word "woman."
  • 10:43 - 10:46
    Guess what the resulting
    vector is near to?
  • 10:46 - 10:47
    "Queen."
  • 10:47 - 10:51
    No one had hand-coded that equation
    the way the Lenat team would have.
  • 10:51 - 10:54
    The computer discovered it all by itself
  • 10:54 - 10:58
    after counting millions upon millions
    of words in the documents we created.
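The "king minus man plus woman" operation can be sketched with toy numbers. The two-dimensional vectors below are invented so that the arithmetic is easy to follow by hand (real learned vectors have hundreds of dimensions), and `analogy` is a hypothetical helper. It skips the three input words when searching for the nearest vector, which is the usual convention for this kind of query.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Invented vectors: first coordinate is roughly "royalty", second roughly "gender".
vocab = {
    "king":    [0.9, 0.8],
    "queen":   [0.9, -0.8],
    "man":     [0.1, 0.9],
    "woman":   [0.1, -0.9],
    "printer": [-0.7, 0.0],
}

def analogy(vocab, a, b, c):
    """Return the word whose vector is nearest to vocab[a] - vocab[b] + vocab[c]."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = (w for w in vocab if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vocab[w], target))

print(analogy(vocab, "king", "man", "woman"))  # queen
```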
  • 10:58 - 11:02
    I have more to tell you,
    and this really happened.
  • 11:02 - 11:04
    There is info on Turkey there.
  • 11:04 - 11:08
    If you take "France" out of "Paris"
    and add "Turkey" -
  • 11:08 - 11:11
    yes, you got it right - it's Ankara.
  • 11:11 - 11:14
    This means in this vector space,
    there's a direction
  • 11:14 - 11:17
    which leads from the names of countries
    to the names of their capitals,
  • 11:17 - 11:18
    which is really stunning.
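What "a direction from countries to capitals" means can be shown by comparing offsets. In this sketch the three-dimensional vectors are made up, and `offset` is a hypothetical helper. The point is only that if Paris minus France and Ankara minus Turkey point almost the same way, a single direction in the space encodes "capital of".

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Invented vectors, not taken from any trained model.
vectors = {
    "France": [0.1, 0.9, 0.0],
    "Paris":  [0.9, 0.8, 0.1],
    "Turkey": [0.0, 0.1, 0.9],
    "Ankara": [0.9, 0.0, 0.8],
}

def offset(vectors, start, end):
    """Vector pointing from the word `start` to the word `end`."""
    return [e - s for s, e in zip(vectors[start], vectors[end])]

france_to_paris = offset(vectors, "France", "Paris")
turkey_to_ankara = offset(vectors, "Turkey", "Ankara")
print(cosine(france_to_paris, turkey_to_ankara) > 0.9)  # True
```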
  • 11:18 - 11:23
    When you ask, What would Windows be
    had it not been invented by Microsoft,
  • 11:23 - 11:26
    but by Google?
  • 11:27 - 11:29
    the answer pops up as "Android."
  • 11:29 - 11:35
    When you subtract "copper"
    from "Cu" and add "gold,"
  • 11:36 - 11:40
    you get "Au" as the chemical
    symbol of gold.
  • 11:40 - 11:43
    This literally means we don't have to code
    these manually anymore.
  • 11:43 - 11:46
    It seems that the computer
    can make all the inferences
  • 11:46 - 11:49
    out of the data
    we provide it with all by itself.
  • 11:49 - 11:51
    This is the yummiest example of all.
  • 11:51 - 11:55
    When you take "Japan" out of "sushi"
    and add "Germany,"
  • 11:56 - 12:00
    you get the "bratwurst,"
    the German favorite.
  • 12:00 - 12:02
    Too good to be true, right?
  • 12:02 - 12:03
    Happy now?
  • 12:03 - 12:04
    Have we finalized this project?
  • 12:04 - 12:09
    Do computers now understand what we say?
  • 12:09 - 12:11
    Are we having fun? Not much.
  • 12:11 - 12:14
    Now, I'll tell you
    about a Turkish researcher.
  • 12:14 - 12:19
    Tolga Bölükbaşı is about to finish his PhD
  • 12:19 - 12:21
    at Boston University in the States.
  • 12:21 - 12:24
    This is research he did two years ago.
  • 12:24 - 12:27
    Tolga did the same thing
    as Mikolov did previously,
  • 12:27 - 12:30
    but this time on news texts.
  • 12:31 - 12:37
    What happens when you subtract "father"
    from "doctor" and add "mom"?
  • 12:37 - 12:42
    "My dad is a doctor, and mom is a nurse."
  • 12:42 - 12:48
    What about when you subtract "man"
    from "computer engineer" and add "woman"?
  • 12:48 - 12:52
    In fact, we shouldn't have gender.
  • 12:52 - 12:57
    Let's see how professions
    are related to gender
  • 12:57 - 13:01
    in the meaning space
    in the head of the computer.
  • 13:01 - 13:03
    You get "homemaker."
  • 13:03 - 13:04
    Seriously! You get "homemaker."
  • 13:04 - 13:08
    We get an English word "homemaker."
  • 13:08 - 13:11
    So, it's clear that we not only put
    all of our data in computers
  • 13:11 - 13:17
    but also put all of our prejudices.
  • 13:17 - 13:23
    Imagine if this computer
    were used to hire someone.
  • 13:23 - 13:27
    You've already uploaded your resume
    and all the personal information
  • 13:27 - 13:29
    including your gender.
  • 13:29 - 13:32
    Let's assume 10,000 people
    applied for the job.
  • 13:32 - 13:34
    The computer needs
    to do a pre-selection, right?
  • 13:34 - 13:38
    It needs to get to 1,000 candidates,
  • 13:38 - 13:42
    eliminating 9,000 others
  • 13:42 - 13:44
    so that the HR staff
    can evaluate the results.
  • 13:44 - 13:47
    Computers nowadays are already
    used for this kind of work.
  • 13:47 - 13:50
    Let's say that a computer loaded
    with such meaning vectors makes a selection
  • 13:50 - 13:54
    among the candidates who have applied
    for a job vacancy for a computer engineer.
  • 13:54 - 13:57
    It might automatically eliminate
    all the female candidates,
  • 13:57 - 14:02
    thinking that a computer
    engineer should be male.
  • 14:02 - 14:05
    Tolga and his colleagues
    also mention other cases.
  • 14:05 - 14:11
    It was found out that computers
    link positive and negative attributions
  • 14:11 - 14:16
    with the words related to being
    Afro-American and Caucasian.
  • 14:16 - 14:20
    For instance, the computer thinks
  • 14:20 - 14:23
    that the word "mugger" is closely related
    to being Afro-American.
  • 14:23 - 14:27
    It's certain that we uploaded
    all our prejudices
  • 14:27 - 14:29
    while uploading all the information
    we have in computers.
  • 14:29 - 14:33
    You might ask yourselves,
    What will happen now?
  • 14:33 - 14:36
    Tolga and his team's article
    offers a solution to that.
  • 14:38 - 14:40
    As I just told you,
  • 14:40 - 14:42
    all these things happen
    in the vector space.
  • 14:42 - 14:44
    Each word has its vector.
  • 14:44 - 14:48
    We already know from high school years
    that we can add and subtract them.
  • 14:48 - 14:52
    Tolga and his team first list the words
  • 14:52 - 14:57
    that are really feminine or masculine,
  • 14:57 - 15:01
    like "dad," "uncle,"
    "grandmother," and so on.
  • 15:01 - 15:06
    These words really should have
    a relation to male and female roles.
  • 15:06 - 15:11
    Then there are these words
    which should not be masculine or feminine
  • 15:11 - 15:14
    despite leaning toward one gender
    in the computer's space.
  • 15:14 - 15:19
    For example, the word "genius"
    appears to be male.
  • 15:19 - 15:24
    On the other hand, the word "stylist"
    stands out as a very female word.
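One way to put a number on "appears to be male" is to project a word's vector onto a gender direction. The sketch below assumes that direction is simply the normalized difference between the vectors for "man" and "woman". That choice, the toy vectors, and the helper names `gender_direction` and `gender_score` are all assumptions for illustration, and the research described in the talk may construct the direction differently.

```python
import math

# Made-up 2-D vectors; real embeddings have hundreds of dimensions.
vectors = {
    "man":     [0.8, 0.6],
    "woman":   [-0.8, 0.6],
    "genius":  [0.3, 0.9],
    "stylist": [-0.4, 0.8],
}

def gender_direction(vectors):
    """Unit vector pointing from "woman" toward "man"."""
    diff = [m - w for m, w in zip(vectors["man"], vectors["woman"])]
    length = math.sqrt(sum(d * d for d in diff))
    return [d / length for d in diff]

def gender_score(word, vectors):
    """Positive leans toward "man", negative toward "woman", zero is neutral."""
    g = gender_direction(vectors)
    return sum(a * b for a, b in zip(vectors[word], g))

for word in ("genius", "stylist"):
    print(word, round(gender_score(word, vectors), 2))
```

With these invented numbers, "genius" gets a positive score and "stylist" a negative one, mirroring the pattern described in the talk.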
  • 15:24 - 15:26
    It doesn't have to be like that.
  • 15:26 - 15:30
    So, after listing all the words
    that need to be feminine or masculine,
  • 15:30 - 15:33
    Tolga and his team created an algorithm
  • 15:33 - 15:40
    which would automatically erase
    the computer's prejudices
  • 15:40 - 15:47
    on the ones that should be neutral.
  • 15:47 - 15:51
    If a word is not in the list
    of words like "father" or "uncle,"
  • 15:51 - 15:56
    but it is still biased towards a gender
    in the space of meanings,
  • 15:56 - 16:01
    the algorithm automatically corrects it.
  • 16:01 - 16:03
    With the help of this,
    "computer programmer"
  • 16:03 - 16:06
    ends up at the same distance
    to the male and female notions,
  • 16:06 - 16:09
    and the problems I talked about go away.
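The correction step can be sketched as removing a word's component along the gender direction. This is a simplified illustration under stated assumptions, not the team's published algorithm: the vectors are invented, "man" and "woman" are given equal length on purpose, and `neutralize` and `gender_specific` are hypothetical names.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]

# Made-up 2-D vectors; "man" and "woman" have equal length by construction.
man, woman = [0.8, 0.6], [-0.8, 0.6]
gender = unit([m - w for m, w in zip(man, woman)])

words = {
    "father":     [0.9, 0.4],
    "programmer": [0.5, 0.8],
}
gender_specific = {"father"}  # words that are allowed to keep their gender component

def neutralize(vec, direction):
    """Subtract the part of vec that lies along direction, then rescale to length 1."""
    along = dot(vec, direction)
    return unit([a - along * d for a, d in zip(vec, direction)])

debiased = {
    word: vec if word in gender_specific else neutralize(vec, gender)
    for word, vec in words.items()
}

# Before: "programmer" is much closer to "man" than to "woman".
print(dot(words["programmer"], man), dot(words["programmer"], woman))
# After: it is equally close to both.
print(dot(debiased["programmer"], man), dot(debiased["programmer"], woman))
```

Words on the gender-specific list, such as "father" here, are left untouched, so legitimate gender information survives while the neutral words lose their lean.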
  • 16:09 - 16:10
    Isn't that beautiful?
  • 16:10 - 16:16
    I wish we could delete the prejudices
    in the human brain so easily.
  • 16:16 - 16:22
    For a while, some people
    have been worrying about
  • 16:22 - 16:24
    what would happen
    if computers took over.
  • 16:24 - 16:26
    On the other hand,
  • 16:26 - 16:29
    considering the fact that we can't delete
    the prejudices in people,
  • 16:29 - 16:33
    while we can in computers,
  • 16:33 - 16:37
    maybe we could give computers a chance
    at jobs requiring fairness
  • 16:37 - 16:43
    such as being referees,
    judges, and managers
  • 16:43 - 16:45
    and let people take a rest for a while.
  • 16:45 - 16:46
    What do you say to that?
  • 16:47 - 16:48
    Thank you.
  • 16:48 - 16:51
    (Applause)
Description:

Cem Say, known for his writings and research on artificial intelligence, explains what artificial intelligence is and when and where humanity will benefit from it. He also argues that we should not be afraid of artificial intelligence, adding that technology could let artificial intelligence overcome the prejudices of humanity.
Prof. Dr. Cem Say lectures at the Department of Computer Engineering at Boğaziçi University and is a pioneer of research on qualitative reasoning and on the computer understanding of Turkish.
One of the founders of the Cognitive Science graduate degree program, Say studies how the human mind works. He was among the computer experts who examined digital evidence in court cases and exposed forgeries in it.
He contributes to popular science journals such as "Technology for Everyone" and is well known for his easy-to-read articles explaining complicated technologies such as Bitcoin, blockchain, and the sharing economy.

This talk was given at a TEDx event using the TED conference format but independently organized by a local community. Learn more at https://www.ted.com/tedx

Video Language:
Turkish
Team:
closed TED
Project:
TEDxTalks
Duration:
16:55
