
Digital humans that look just like us

  • 0:02 - 0:03
    Hello.
  • 0:03 - 0:05
    I'm not a real person.
  • 0:05 - 0:08
    I'm actually a copy of a real person.
  • 0:08 - 0:10
    Although, I feel like a real person.
  • 0:10 - 0:12
    It's kind of hard to explain.
  • 0:12 - 0:16
    Hold on -- I think I saw
    a real person ... there's one.
  • 0:17 - 0:18
    Let's bring him onstage.
  • 0:21 - 0:22
    Hello.
  • 0:23 - 0:27
    (Applause)
  • 0:28 - 0:31
    What you see up there is a digital human.
  • 0:32 - 0:35
    I'm wearing an inertial
    motion capture suit
  • 0:35 - 0:38
    that's figuring out what my body is doing.
  • 0:38 - 0:41
    And I've got a single camera here
    that's watching my face
  • 0:41 - 0:46
    and feeding some machine-learning software
    that's taking my expressions,
  • 0:46 - 0:50
    like, "Hm, hm, hm,"
  • 0:50 - 0:52
    and transferring it to that guy.
  • 0:53 - 0:57
    We call him "DigiDoug."
  • 0:57 - 1:02
    He's actually a 3-D character
    that I'm controlling live in real time.
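
To make the pipeline described above a little more concrete, here is a minimal sketch of what a live performance-transfer loop could look like. Everything in it is an assumption for illustration: `suit`, `camera`, `face_model`, `character`, and `renderer` stand in for the inertial-suit SDK, the face camera, the trained machine-learning model, and the render engine; none of these names come from the talk or from the actual system shown on stage.

```python
# Hypothetical sketch of the live performance-transfer loop described above.
# The suit, camera, face_model, character, and renderer objects are stand-ins,
# not real APIs from the system shown in the talk.

import time

TARGET_FPS = 60                      # assumed display rate
FRAME_BUDGET = 1.0 / TARGET_FPS      # seconds available per frame

def run_live_session(suit, camera, face_model, character, renderer):
    """Drive a 3-D character from an inertial body suit and a single face camera."""
    while True:
        start = time.perf_counter()

        body_pose = suit.read_pose()        # joint rotations from the inertial suit
        face_frame = camera.read_frame()    # one video frame of the performer's face

        # The trained network maps the raw face image to animation parameters
        # (expression, wrinkles, blood flow, eyelash motion, ...).
        face_params = face_model.infer(face_frame)

        character.apply_body_pose(body_pose)
        character.apply_face_params(face_params)
        renderer.draw(character)

        # Hold the loop to the target frame rate.
        elapsed = time.perf_counter() - start
        if elapsed < FRAME_BUDGET:
            time.sleep(FRAME_BUDGET - elapsed)
```
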
  • 1:04 - 1:07
    So, I work in visual effects.
  • 1:07 - 1:08
    And in visual effects,
  • 1:08 - 1:14
    one of the hardest things to do
    is to create believable, digital humans
  • 1:14 - 1:16
    that the audience accepts as real.
  • 1:16 - 1:21
    People are just really good
    at recognizing other people.
  • 1:21 - 1:22
    Go figure!
  • 1:24 - 1:27
    So, that's OK, we like a challenge.
  • 1:27 - 1:29
    Over the last 15 years,
  • 1:29 - 1:34
    we've been putting
    humans and creatures into film
  • 1:34 - 1:36
    that you accept as real.
  • 1:37 - 1:39
    If they're happy, you should feel happy.
  • 1:40 - 1:45
    And if they feel pain,
    you should empathize with them.
  • 1:46 - 1:49
    We're getting pretty good at it, too.
  • 1:49 - 1:51
    But it's really, really difficult.
  • 1:52 - 1:55
    Effects like these take thousands of hours
  • 1:55 - 1:58
    and hundreds of really talented artists.
  • 1:59 - 2:00
    But things have changed.
  • 2:01 - 2:03
    Over the last five years,
  • 2:03 - 2:07
    computers and graphics cards
    have gotten seriously fast.
  • 2:09 - 2:12
    And machine learning,
    deep learning, has happened.
  • 2:13 - 2:15
    So we asked ourselves:
  • 2:15 - 2:19
    Do you suppose we could create
    a photo-realistic human,
  • 2:19 - 2:21
    like we're doing for film,
  • 2:22 - 2:28
    but where you're seeing
    the actual emotions and the details
  • 2:28 - 2:32
    of the person who's controlling
    the digital human
  • 2:32 - 2:33
    in real time?
  • 2:34 - 2:35
    In fact, that's our goal:
  • 2:35 - 2:39
    If you were having
    a conversation with DigiDoug
  • 2:39 - 2:40
    one-on-one,
  • 2:41 - 2:47
    is it real enough so that you could tell
    whether or not I was lying to you?
  • 2:48 - 2:49
    So that was our goal.
  • 2:51 - 2:55
    About a year and a half ago,
    we set off to achieve this goal.
  • 2:55 - 2:59
    What I'm going to do now is take you
    basically on a little bit of a journey
  • 2:59 - 3:02
    to see exactly what we had to do
    to get where we are.
  • 3:04 - 3:08
    We had to capture
    an enormous amount of data.
  • 3:08 - 3:11
    In fact, by the end of this thing,
  • 3:11 - 3:16
    we had probably one of the largest
    facial data sets on the planet.
  • 3:16 - 3:18
    Of my face.
  • 3:18 - 3:20
    (Laughter)
  • 3:20 - 3:21
    Why me?
  • 3:21 - 3:24
    Well, I'll do just about
    anything for science.
  • 3:24 - 3:26
    I mean, look at me!
  • 3:27 - 3:28
    I mean, come on.
  • 3:31 - 3:37
    We had to first figure out
    what my face actually looked like.
  • 3:37 - 3:40
    Not just a photograph or a 3-D scan,
  • 3:40 - 3:44
    but what it actually looked like
    in any photograph,
  • 3:44 - 3:47
    how light interacts with my skin.
  • 3:48 - 3:53
    Luckily for us, about three blocks away
    from our Los Angeles studio
  • 3:53 - 3:55
    is this place called ICT.
  • 3:56 - 3:57
    They're a research lab
  • 3:57 - 4:00
    that's associated with the University
    of Southern California.
  • 4:01 - 4:04
    They have a device there,
    it's called the "light stage."
  • 4:04 - 4:08
    It has a zillion
    individually controlled lights
  • 4:08 - 4:10
    and a whole bunch of cameras.
  • 4:10 - 4:16
    And with that, we can reconstruct my face
    under a myriad of lighting conditions.
  • 4:18 - 4:19
    We even captured the blood flow
  • 4:19 - 4:22
    and how my face changes
    when I make expressions.
  • 4:23 - 4:29
    This let us build a model of my face
    that, quite frankly, is just amazing.
  • 4:29 - 4:34
    It's got an unfortunate
    level of detail, unfortunately.
  • 4:34 - 4:35
    (Laughter)
  • 4:35 - 4:39
    You can see every pore, every wrinkle.
  • 4:39 - 4:40
    But we had to have that.
  • 4:41 - 4:43
    Reality is all about detail.
  • 4:43 - 4:45
    And without it, you miss it.
  • 4:47 - 4:48
    We are far from done, though.
  • 4:49 - 4:53
    This let us build a model of my face
    that looked like me.
  • 4:53 - 4:56
    But it didn't really move like me.
  • 4:57 - 5:00
    And that's where
    machine learning comes in.
  • 5:00 - 5:03
    And machine learning needs a ton of data.
  • 5:03 - 5:08
    So I sat down in front of some
    high-resolution motion-capturing device.
  • 5:08 - 5:13
    And also, we did this traditional
    motion capture with markers.
  • 5:14 - 5:17
    We created a whole bunch
    of images of my face
  • 5:17 - 5:21
    and moving point clouds
    that represented the shapes of my face.
  • 5:22 - 5:25
    Man, I made a lot of expressions,
  • 5:25 - 5:28
    I said different lines
    in different emotional states ...
  • 5:28 - 5:31
    We had to do a lot of capture with this.
  • 5:32 - 5:35
    Once we had this enormous amount of data,
  • 5:35 - 5:38
    we built and trained deep neural networks.
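
The talk doesn't describe the training setup, but one setup consistent with the data just described would pair each captured face image with the facial parameters (expression, wrinkle and blood-flow detail) recorded at the same moment, and train a network to regress those parameters from the image. The PyTorch sketch below is purely illustrative: the architecture, the 256-value parameter vector, and the random toy batch are all assumptions, not the networks actually used.

```python
# Hypothetical training sketch: predict facial animation parameters from a face image.
# The dataset, parameter count, and architecture are assumptions, not the talk's setup.

import torch
import torch.nn as nn

NUM_PARAMS = 256  # assumed size of the facial parameter vector (expression, wrinkles, ...)

model = nn.Sequential(                    # deliberately small stand-in network
    nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, NUM_PARAMS),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(images, target_params):
    """One optimization step on a batch of (face image, captured parameters) pairs."""
    optimizer.zero_grad()
    pred = model(images)                  # (batch, NUM_PARAMS)
    loss = loss_fn(pred, target_params)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch with random data, just to show the shapes involved.
images = torch.rand(4, 3, 128, 128)
targets = torch.rand(4, NUM_PARAMS)
train_step(images, targets)
```
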
  • 5:39 - 5:41
    And when we were finished with that,
  • 5:41 - 5:43
    in 16 milliseconds,
  • 5:43 - 5:46
    the neural network can look at my image
  • 5:46 - 5:49
    and figure out everything about my face.
  • 5:50 - 5:56
    It can compute my expression,
    my wrinkles, my blood flow --
  • 5:56 - 5:58
    even how my eyelashes move.
  • 5:59 - 6:02
    This is then rendered
    and displayed up there
  • 6:02 - 6:05
    with all the detail
    that we captured previously.
  • 6:06 - 6:07
    We're far from done.
  • 6:08 - 6:10
    This is very much a work in progress.
  • 6:10 - 6:14
    This is actually the first time
    we've shown it outside of our company.
  • 6:14 - 6:18
    And, you know, it doesn't look
    as convincing as we want;
  • 6:18 - 6:20
    I've got wires coming out
    of the back of me,
  • 6:20 - 6:22
    and there's a sixth-of-a-second delay
  • 6:22 - 6:27
    between when we capture the video
    and we display it up there.
  • 6:27 - 6:29
    Sixth of a second -- that's crazy good!
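
A quick bit of arithmetic, assuming a 60 fps display, shows what that delay means in practice: a sixth of a second is about 167 ms, roughly ten display frames, of which the 16 ms neural-network inference quoted earlier is only a small part.

```python
# Back-of-the-envelope check of the numbers in the talk
# (the 60 fps display rate is an assumption).

inference_ms = 16                 # per-frame neural-network time quoted earlier
total_delay_ms = 1000 / 6         # "a sixth of a second" end to end, ~167 ms
other_ms = total_delay_ms - inference_ms        # capture, transfer, render, display, ...
frames_behind = total_delay_ms / (1000 / 60)    # ~10 frames at 60 fps

print(f"end-to-end delay ~{total_delay_ms:.0f} ms (~{frames_behind:.0f} frames at 60 fps); "
      f"inference is {inference_ms} ms, the remaining ~{other_ms:.0f} ms is everything else")
```
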
  • 6:30 - 6:33
    But it's still why you're hearing
    a bit of an echo and stuff.
  • 6:34 - 6:38
    And you know, this machine learning
    stuff is brand-new to us,
  • 6:38 - 6:42
    sometimes it's hard to convince it
    to do the right thing, you know?
  • 6:42 - 6:44
    It goes a little sideways.
  • 6:44 - 6:47
    (Laughter)
  • 6:48 - 6:51
    But why did we do this?
  • 6:51 - 6:53
    Well, there's two reasons, really.
  • 6:53 - 6:56
    First of all, it is just crazy cool.
  • 6:56 - 6:57
    (Laughter)
  • 6:57 - 6:59
    How cool is it?
  • 6:59 - 7:01
    Well, with the push of a button,
  • 7:01 - 7:05
    I can deliver this talk
    as a completely different character.
  • 7:06 - 7:08
    This is Elbor.
  • 7:10 - 7:12
    We put him together
    to test how this would work
  • 7:12 - 7:15
    with a different appearance.
  • 7:15 - 7:20
    And the cool thing about this technology
    is that, while I've changed my character,
  • 7:20 - 7:24
    the performance is still all me.
  • 7:24 - 7:26
    I tend to talk out of the right
    side of my mouth;
  • 7:26 - 7:28
    so does Elbor.
  • 7:28 - 7:29
    (Laughter)
  • 7:30 - 7:33
    Now, the second reason we did this,
    and you can imagine,
  • 7:33 - 7:35
    is this is going to be great for film.
  • 7:35 - 7:38
    This is a brand-new, exciting tool
  • 7:38 - 7:42
    for artists and directors
    and storytellers.
  • 7:43 - 7:45
    It's pretty obvious, right?
  • 7:45 - 7:47
    I mean, this is going to be
    really neat to have.
  • 7:47 - 7:49
    But also, now that we've built it,
  • 7:49 - 7:52
    it's clear that this
    is going to go way beyond film.
  • 7:54 - 7:55
    But wait.
  • 7:56 - 8:00
    Didn't I just change my identity
    with the push of a button?
  • 8:00 - 8:03
    Isn't this like "deepfake"
    and face-swapping
  • 8:03 - 8:04
    that you guys may have heard of?
  • 8:05 - 8:06
    Well, yeah.
  • 8:07 - 8:10
    In fact, we are using
    some of the same technology
  • 8:10 - 8:12
    that deepfake is using.
  • 8:12 - 8:17
    Deepfake is 2-D and image based,
    while ours is full 3-D
  • 8:17 - 8:19
    and way more powerful.
  • 8:19 - 8:21
    But they're very related.
  • 8:22 - 8:24
    And now I can hear you thinking,
  • 8:24 - 8:25
    "Darn it!
  • 8:25 - 8:29
    I thought I could at least
    trust and believe in video.
  • 8:29 - 8:32
    If it was live video,
    didn't it have to be true?"
  • 8:33 - 8:36
    Well, we know that's not
    really the case, right?
  • 8:37 - 8:41
    Even without this, there are simple tricks
    that you can do with video
  • 8:41 - 8:43
    like how you frame a shot
  • 8:43 - 8:48
    that can make it really misrepresent
    what's actually going on.
  • 8:48 - 8:52
    And I've been working
    in visual effects for a long time,
  • 8:52 - 8:54
    and I've known for a long time
  • 8:54 - 8:59
    that with enough effort,
    we can fool anyone about anything.
  • 9:00 - 9:02
    What this stuff and deepfake is doing
  • 9:02 - 9:07
    is making it easier and more accessible
    to manipulate video,
  • 9:07 - 9:12
    just like Photoshop did
    for manipulating images, some time ago.
  • 9:13 - 9:15
    I prefer to think about
  • 9:15 - 9:20
    how this technology could bring
    humanity to other technology
  • 9:20 - 9:22
    and bring us all closer together.
  • 9:22 - 9:24
    Now that you've seen this,
  • 9:25 - 9:26
    think about the possibilities.
  • 9:28 - 9:32
    Right off the bat, you're going to see it
    in live events and concerts, like this.
  • 9:34 - 9:38
    Digital celebrities, especially
    with new projection technology,
  • 9:38 - 9:42
    are going to be just like the movies,
    but alive and in real time.
  • 9:44 - 9:46
    And new forms of communication are coming.
  • 9:47 - 9:51
    You can already interact
    with DigiDoug in VR.
  • 9:52 - 9:54
    And it is eye-opening.
  • 9:54 - 9:58
    It's just like you and I
    are in the same room,
  • 9:58 - 10:00
    even though we may be miles apart.
  • 10:00 - 10:03
    Heck, the next time you make a video call,
  • 10:03 - 10:07
    you will be able to choose
    the version of you
  • 10:07 - 10:08
    you want people to see.
  • 10:09 - 10:12
    It's like really, really good makeup.
  • 10:13 - 10:16
    I was scanned about a year and a half ago.
  • 10:17 - 10:19
    I've aged.
  • 10:19 - 10:20
    DigiDoug hasn't.
  • 10:21 - 10:24
    On video calls, I never have to grow old.
  • 10:26 - 10:29
    And as you can imagine,
    this is going to be used
  • 10:29 - 10:33
    to give virtual assistants
    a body and a face.
  • 10:33 - 10:34
    A humanity.
  • 10:34 - 10:37
    I already love it that when I talk
    to virtual assistants,
  • 10:37 - 10:40
    they answer back in a soothing,
    humanlike voice.
  • 10:40 - 10:42
    Now they'll have a face.
  • 10:42 - 10:47
    And you'll get all the nonverbal cues
    that make communication so much easier.
  • 10:48 - 10:50
    It's going to be really nice.
  • 10:50 - 10:53
    You'll be able to tell when
    a virtual assistant is busy or confused
  • 10:53 - 10:56
    or concerned about something.
  • 10:58 - 11:00
    Now, I couldn't leave the stage
  • 11:00 - 11:03
    without you actually being able
    to see my real face,
  • 11:03 - 11:05
    so you can do some comparison.
  • 11:07 - 11:08
    So let me take off my helmet here.
  • 11:08 - 11:13
    Yeah, don't worry,
    it looks way worse than it feels.
  • 11:13 - 11:16
    (Laughter)
  • 11:17 - 11:19
    So this is where we are.
  • 11:19 - 11:21
    Let me put this back on here.
  • 11:21 - 11:22
    (Laughter)
  • 11:23 - 11:24
    Doink!
  • 11:25 - 11:27
    So this is where we are.
  • 11:28 - 11:32
    We're on the cusp of being able
    to interact with digital humans
  • 11:32 - 11:34
    that are strikingly real,
  • 11:34 - 11:37
    whether they're being controlled
    by a person or a machine.
  • 11:37 - 11:42
    And like all new technology these days,
  • 11:43 - 11:47
    it's going to come with some
    serious and real concerns
  • 11:47 - 11:49
    that we have to deal with.
  • 11:50 - 11:52
    But I am just so really excited
  • 11:52 - 11:57
    about the ability to bring something
    that I've seen only in science fiction
  • 11:57 - 12:00
    for my entire life
  • 12:00 - 12:01
    into reality.
  • 12:02 - 12:06
    Communicating with computers
    will be like talking to a friend.
  • 12:06 - 12:09
    And talking to faraway friends
  • 12:09 - 12:12
    will be like sitting with them
    together in the same room.
  • 12:13 - 12:14
    Thank you very much.
  • 12:14 - 12:21
    (Applause)
Title:
Digital humans that look just like us
Speaker:
Doug Roble
Description:

In an astonishing talk and tech demo, software researcher Doug Roble debuts "DigiDoug": a real-time, 3-D, digital rendering of his likeness that's accurate down to the scale of pores and wrinkles. Powered by an inertial motion capture suit, deep neural networks and enormous amounts of data, DigiDoug renders the real Doug's emotions (and even how his blood flows and eyelashes move) in striking detail. Learn more about how this exciting tech was built -- and its applications in movies, virtual assistants and beyond.

Video Language:
English
Team:
closed TED
Project:
TEDTalks
Duration:
12:34
