
Fake videos of real people -- and how to spot them

  • 0:01 - 0:02
    Look at these images.
  • 0:02 - 0:05
    Now, tell me which Obama here is real.
  • 0:05 - 0:08
    (Video) Barack Obama: To help families
    refinance their homes,
  • 0:08 - 0:10
    to invest in things
    like high-tech manufacturing,
  • 0:10 - 0:11
    clean energy
  • 0:11 - 0:14
    and the infrastructure
    that creates good new jobs.
  • 0:15 - 0:16
    Supasorn Suwajanakorn: Anyone?
  • 0:16 - 0:18
    The answer is none of them.
  • 0:18 - 0:19
    (Laughter)
  • 0:19 - 0:21
    None of these is actually real.
  • 0:21 - 0:23
    So let me tell you how we got here.
  • 0:24 - 0:26
    My inspiration for this work
  • 0:26 - 0:31
    was a project meant to preserve our last
    chance for learning about the Holocaust
  • 0:31 - 0:33
    from the survivors.
  • 0:33 - 0:35
    It's called New Dimensions in Testimony,
  • 0:35 - 0:39
    and it allows you to have
    interactive conversations
  • 0:39 - 0:41
    with a hologram
    of a real Holocaust survivor.
  • 0:42 - 0:44
    (Video) Man: How did you
    survive the Holocaust?
  • 0:44 - 0:45
    (Video) Hologram: How did I survive?
  • 0:46 - 0:48
    I survived,
  • 0:48 - 0:50
    I believe,
  • 0:50 - 0:53
    because providence watched over me.
  • 0:54 - 0:57
    SS: Turns out these answers
    were prerecorded in a studio.
  • 0:57 - 1:00
    Yet the effect is astounding.
  • 1:00 - 1:03
    You feel so connected to his story
    and to him as a person.
  • 1:04 - 1:07
    I think there's something special
    about human interaction
  • 1:07 - 1:10
    that makes it much more profound
  • 1:10 - 1:12
    and personal
  • 1:12 - 1:16
    than what books or lectures
    or movies could ever teach us.
  • 1:16 - 1:19
    So I saw this and began to wonder,
  • 1:19 - 1:22
    can we create a model
    like this for anyone?
  • 1:22 - 1:25
    A model that looks, talks
    and acts just like them?
  • 1:26 - 1:28
    So I set out to see if this could be done
  • 1:28 - 1:30
    and eventually came up with a new solution
  • 1:30 - 1:33
    that can build a model of a person
    using nothing but these:
  • 1:34 - 1:36
    existing photos and videos of a person.
  • 1:37 - 1:39
    If you can leverage
    this kind of passive information,
  • 1:39 - 1:41
    just photos and video that are out there,
  • 1:41 - 1:43
    that's the key to scaling to anyone.
  • 1:44 - 1:46
    By the way, here's Richard Feynman,
  • 1:46 - 1:49
    who in addition to being
    a Nobel Prize winner in physics
  • 1:49 - 1:52
    was also known as a legendary teacher.
  • 1:53 - 1:55
    Wouldn't it be great
    if we could bring him back
  • 1:55 - 1:59
    to give his lectures
    and inspire millions of kids,
  • 1:59 - 2:02
    perhaps not just in English
    but in any language?
  • 2:02 - 2:07
    Or if you could ask our grandparents
    for advice and hear those comforting words
  • 2:07 - 2:09
    even if they're no longer with us?
  • 2:10 - 2:13
    Or maybe using this tool,
    book authors, alive or not,
  • 2:13 - 2:16
    could read aloud all of their books
    for anyone interested.
  • 2:17 - 2:20
    The creative possibilities
    here are endless,
  • 2:20 - 2:21
    and to me, that's very exciting.
  • 2:23 - 2:25
    And here's how it's working so far.
  • 2:25 - 2:26
    First, we introduce a new technique
  • 2:26 - 2:31
    that can reconstruct a high-detailed
    3D face model from any image
  • 2:31 - 2:33
    without ever 3D-scanning the person.
  • 2:34 - 2:37
    And here's the same output model
    from different views.
  • 2:38 - 2:39
    This also works on videos,
  • 2:39 - 2:42
    by running the same algorithm
    on each video frame
  • 2:42 - 2:45
    and generating a moving 3D model.
  • 2:46 - 2:48
    And here's the same
    output model from different angles.
  • 2:50 - 2:52
    It turns out this problem
    is very challenging,
  • 2:52 - 2:55
    but the key trick
    is that we are going to analyze
  • 2:55 - 2:58
    a large photo collection
    of the person beforehand.
  • 2:59 - 3:01
    For George W. Bush,
    we can just search on Google,
  • 3:02 - 3:05
    and from that, we are able
    to build an average model,
  • 3:05 - 3:08
    an iterative, refined model
    to recover the expression
  • 3:08 - 3:10
    in fine details,
    like creases and wrinkles.
  • 3:11 - 3:13
    What's fascinating about this
  • 3:13 - 3:16
    is that the photo collection
    can come from your typical photos.
  • 3:16 - 3:19
    It doesn't really matter
    what expression you're making
  • 3:19 - 3:21
    or where you took those photos.
  • 3:21 - 3:23
    What matters is
    that there are a lot of them.
  • 3:23 - 3:25
    And we are still missing color here,
  • 3:25 - 3:27
    so next, we develop
    a new blending technique
  • 3:27 - 3:30
    that improves upon
    a single averaging method
  • 3:30 - 3:33
    and produces sharp
    facial textures and colors.
  • 3:34 - 3:37
    And this can be done for any expression.
  • 3:37 - 3:40
    Now we have a control
    of a model of a person,
  • 3:40 - 3:44
    and the way it's controlled now
    is by a sequence of static photos.
  • 3:44 - 3:47
    Notice how the wrinkles come and go,
    depending on the expression.
  • 3:48 - 3:51
    We can also use a video
    to drive the model.
  • 3:51 - 3:53
    (Video) Daniel Craig: Right, but somehow,
  • 3:53 - 3:57
    we've managed to attract
    some more amazing people.
  • 3:58 - 4:00
    SS: And here's another fun demo.
  • 4:00 - 4:02
    So what you see here
    are controllable models
  • 4:02 - 4:04
    of people I built
    from their internet photos.
  • 4:04 - 4:07
    Now, if you transfer
    the motion from the input video,
  • 4:07 - 4:10
    we can actually drive the entire party.
  • 4:10 - 4:12
    George W. Bush:
    It's a difficult bill to pass,
  • 4:12 - 4:14
    because there's a lot of moving parts,
  • 4:14 - 4:19
    and the legislative processes can be ugly.
  • 4:19 - 4:21
    (Applause)
  • 4:21 - 4:23
    SS: So coming back a little bit,
  • 4:23 - 4:26
    our ultimate goal, rather,
    is to capture their mannerisms
  • 4:26 - 4:29
    or the unique way each
    of these people talks and smiles.
  • 4:29 - 4:31
    So to do that, can we
    actually teach the computer
  • 4:31 - 4:34
    to imitate the way someone talks
  • 4:34 - 4:36
    by only showing it
    video footage of the person?
  • 4:37 - 4:39
    And what I did exactly was,
    I let a computer watch
  • 4:39 - 4:43
    14 hours of pure Barack Obama
    giving addresses.
  • 4:43 - 4:47
    And here's what we can produce
    given only his audio.
  • 4:47 - 4:49
    (Video) BO: The results are clear.
  • 4:49 - 4:53
    America's businesses have created
    14.5 million new jobs
  • 4:53 - 4:56
    over 75 straight months.
  • 4:56 - 4:59
    SS: So what's being synthesized here
    is only the mouth region,
  • 4:59 - 5:00
    and here's how we do it.
  • 5:01 - 5:03
    Our pipeline uses a neural network
  • 5:03 - 5:06
    to convert an input audio
    into these mouth points.
  • 5:07 - 5:11
    (Video) BO: We get it through our job
    or through Medicare or Medicaid.
  • 5:11 - 5:14
    SS: Then we synthesize the texture,
    enhance details and teeth,
  • 5:14 - 5:17
    and blend it into the head
    and background from a source video.
  • 5:17 - 5:19
    (Video) BO: Women can get free checkups,
  • 5:19 - 5:22
    and you can't get charged more
    just for being a woman.
  • 5:23 - 5:26
    Young people can stay
    on a parent's plan until they turn 26.
  • 5:27 - 5:30
    SS: I think these results
    seem very realistic and intriguing,
  • 5:30 - 5:33
    but at the same time
    frightening, even to me.
  • 5:33 - 5:37
    Our goal was to build an accurate model
    of a person, not to misrepresent them.
  • 5:38 - 5:41
    But one thing that concerns me
    is its potential for misuse.
  • 5:42 - 5:45
    People have been thinking
    about this problem for a long time,
  • 5:45 - 5:47
    since the days when Photoshop
    first hit the market.
  • 5:48 - 5:52
    As a researcher, I'm also working
    on countermeasure technology,
  • 5:52 - 5:55
    and I'm part of an ongoing
    effort at AI Foundation,
  • 5:55 - 5:58
    which uses a combination
    of machine learning and human moderators
  • 5:58 - 6:00
    to detect fake images and videos,
  • 6:00 - 6:02
    fighting against my own work.
  • 6:03 - 6:06
    And one of the tools we plan to release
    is called Reality Defender,
  • 6:06 - 6:10
    which is a web-browser plug-in
    that can flag potentially fake content
  • 6:10 - 6:12
    automatically, right in the browser.
  • 6:13 - 6:17
    (Applause)
  • 6:17 - 6:18
    Despite all this, though,
  • 6:18 - 6:20
    fake videos could do a lot of damage,
  • 6:20 - 6:23
    even before anyone has a chance to verify,
  • 6:23 - 6:26
    so it's very important
    that we make everyone aware
  • 6:26 - 6:28
    of what's currently possible
  • 6:28 - 6:32
    so we can have the right assumption
    and be critical about what we see.
  • 6:32 - 6:37
    There's still a long way to go before
    we can fully model individual people
  • 6:37 - 6:40
    and before we can ensure
    the safety of this technology.
  • 6:41 - 6:43
    But I'm excited and hopeful,
  • 6:43 - 6:46
    because if we use it right and carefully,
  • 6:46 - 6:51
    this tool can allow any individual's
    positive impact on the world
  • 6:51 - 6:53
    to be massively scaled
  • 6:53 - 6:56
    and really help shape our future
    the way we want it to be.
  • 6:56 - 6:57
    Thank you.
  • 6:57 - 7:02
    (Applause)
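
Note on the blending step (5:11 to 5:17): the talk says the synthesized mouth texture is blended "into the head and background from a source video" but does not say how. The following is a minimal sketch, assuming plain alpha compositing with a soft mask on grayscale pixel grids. The function name `blend_patch`, its parameters, and the toy data are hypothetical illustrations and are not taken from the speaker's pipeline.

```python
def blend_patch(frame, patch, mask, top, left):
    """Alpha-composite `patch` into a copy of `frame` at (top, left).

    frame, patch: 2D lists of grayscale values.
    mask: 2D list of weights in [0, 1], same shape as `patch`;
          1 keeps the patch pixel, 0 keeps the frame pixel.
    Assumption: simple per-pixel alpha compositing, not the method
    used in the talk (which is not specified).
    """
    out = [row[:] for row in frame]  # copy so the input frame is untouched
    for r, (patch_row, mask_row) in enumerate(zip(patch, mask)):
        for c, (p, a) in enumerate(zip(patch_row, mask_row)):
            base = frame[top + r][left + c]
            out[top + r][left + c] = a * p + (1 - a) * base
    return out


# Toy data: a flat 4x4 "frame" and a brighter 2x2 "mouth patch".
frame = [[100.0] * 4 for _ in range(4)]
patch = [[200.0, 200.0], [200.0, 200.0]]
mask = [[1.0, 0.5], [0.5, 0.0]]  # soft falloff toward one corner
blended = blend_patch(frame, patch, mask, top=1, left=1)
```

A soft mask like this lets the pasted region fade into the surrounding frame rather than ending at a hard edge, which is one common reason to composite with per-pixel weights.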
Title:
Fake videos of real people -- and how to spot them
Speaker:
Supasorn Suwajanakorn
Description:

Do you think you're good at spotting fake videos, where famous people say things they've never said in real life? See how they're made in this astonishing talk and tech demo. Computer scientist Supasorn Suwajanakorn shows how, as a grad student, he used AI and 3D modeling to create photorealistic fake videos of people synced to audio. Learn more about both the ethical implications and the creative possibilities of this tech -- and the steps being taken to fight against its misuse.

Video Language:
English
Team:
closed TED
Project:
TEDTalks
Duration:
07:15
  • @5:02 the speaker says "to convert AN input audio into these mouth points," not "to convert AND input audio into these mouth points," imho.
