< Return to Video

What is cryptography? (15 min)

  • 0:00 - 0:03
    Before we start with the technical
    material, I wanna give you a quick
  • 0:03 - 0:06
    overview of what cryptography is about and
    the different areas of cryptography. So
  • 0:06 - 0:10
    the core of cryptography of course is
    secure communication that essentially
  • 0:10 - 0:15
    consists of two parts. The first is
    secured key establishment and then how do
  • 0:15 - 0:19
    we communicate securely once we have a
    shared key. We already said that secured
  • 0:19 - 0:23
    key establishment essentially amounts to
    Alice and Bob sending messages to one
  • 0:23 - 0:27
    another such that at the end of this
    protocol, there's a shared key that they
  • 0:27 - 0:31
    both agree on, shared key K, and beyond
    that, beyond just a shared key, in fact
  • 0:31 - 0:35
    Alice would know that she's talking to Bob
    and Bob would know that he's talking to
  • 0:35 - 0:40
    Alice. But a poor attacker who listens in
    on this conversation has no idea what the
  • 0:40 - 0:44
    shared key is. And we'll see how to do
    that later on in the course. Now once they
  • 0:44 - 0:48
    have a shared key, they want exchange
    messages securely using this key, and
  • 0:48 - 0:52
    we'll talk about encryption schemes that
    allow them to do that in such a way that
  • 0:52 - 0:55
    an attacker can't figure out what messages
    are being sent back and forth. And
  • 0:55 - 1:00
    furthermore an attacker cannot even tamper
    with this traffic without being detected.
  • 1:00 - 1:03
    In other words, these encryption schemes
    provide both confidentiality and
  • 1:03 - 1:07
    integrity. But cryptography does much,
    much, much more than just these two
  • 1:07 - 1:11
    things. And I want to give you a few
    examples of that. So the first example I
  • 1:11 - 1:14
    want to give you is what's called a
    digital signature. So a digital signature,
  • 1:14 - 1:19
    basically, is the analog of the signature
    in the physical world. In the physical
  • 1:19 - 1:23
    world, remember when you sign a document,
    essentially, you write your signature on
  • 1:23 - 1:28
    that document and your signature is always
    the same. You always write the same
  • 1:28 - 1:32
    signature on all documents that you wanna
    sign. In the digital world, this can't
  • 1:32 - 1:37
    possibly work because if the attacker just
    obtained one signed document from me, he
  • 1:37 - 1:41
    can cut and paste my signature unto some
    other document that I might not have
  • 1:41 - 1:45
    wanted to sign. And so, it's simply not
    possible in a digital world that my
  • 1:45 - 1:50
    signature is the same for all documents
    that I want to sign. So we're gonna talk
  • 1:50 - 1:54
    about how to construct digital signatures
    in the second half of the course. It's
  • 1:54 - 1:58
    really quite an interesting primitive and
    we'll see exactly how to do it. Just to
  • 1:58 - 2:02
    give you a hint, the way digital
    signatures work is basically by making the
  • 2:02 - 2:06
    digital signature via function of the
    content being signed. So an attacker who
  • 2:06 - 2:10
    tries to copy my signature from one
    document to another is not gonna succeed
  • 2:10 - 2:15
    because the signature. On the new document
    is not gonna be the proper function of the
  • 2:15 - 2:19
    data in the new document, and as a result,
    the signature won't verify. And as I said,
  • 2:19 - 2:23
    we're gonna see exactly how to construct
    digital signatures later on and then we'll
  • 2:23 - 2:27
    prove that those constructions are secure.
    Another application of cryptography that I
  • 2:27 - 2:31
    wanted to mention, is anonymous
    communication. So, here, imagine user
  • 2:31 - 2:36
    Alice wants to talk to some chat server,
    Bob. And, perhaps she wants to talk about
  • 2:36 - 2:40
    a medical condition, and so she wants to
    do that anonymously, so that the chat
  • 2:40 - 2:45
    server doesn't actually know who she is.
    Well, there's a standard method, called a
  • 2:45 - 2:50
    mixnet, that allows Alice to communicate
    over the public internet with Bob through
  • 2:50 - 2:55
    a sequence of proxies such that at the end
    of the communication Bob has no idea who he
  • 2:55 - 3:00
    just talked to. The way mixnets work
    is basically as Alice sends her messages
  • 3:00 - 3:04
    to Bob through a sequence of proxies,
    these messages get encrypted and
  • 3:04 - 3:08
    decrypted appropriately so that Bob has no
    idea who he talked to and the proxies
  • 3:08 - 3:13
    themselves don't even know that Alice is
    talking to Bob, or that actually who is
  • 3:13 - 3:17
    talking to whom more generally. One
    interesting thing about this anonymous
  • 3:17 - 3:20
    communication channel is, this is
    bi-directional. In other words, even
  • 3:20 - 3:25
    though Bob has no idea who he's talking
    to, he can still respond to Alice and
  • 3:25 - 3:29
    Alice will get those messages. Once we
    have anonymous communication, we can build
  • 3:29 - 3:34
    other privacy mechanisms. And I wanna give
    you one example which is called anonymous
  • 3:34 - 3:38
    digital cash. Remember that in the
    physical world if I have a physical
  • 3:38 - 3:42
    dollar, I can walk into a bookstore and
    buy a book and the merchant would have no
  • 3:42 - 3:47
    idea who I am. The question is whether we
    can do the exact same thing in the digital
  • 3:47 - 3:51
    world. In the digital world, basically,
    Alice might have a digital dollar,
  • 3:51 - 3:56
    a digital dollar coin. And she might wanna
    spend that digital dollar at some online
  • 3:56 - 4:01
    merchants, perhaps some online bookstore.
    Now, what we'd like to do is make it so
  • 4:01 - 4:06
    that when Alice spends her coin at the
    bookstore, the bookstore would have no
  • 4:06 - 4:11
    idea who Alice is. So we provide the same
    anonymity that we get from physical cash.
  • 4:11 - 4:15
    Now the problem is that in the digital
    world, Alice can take the coin that she
  • 4:15 - 4:20
    had, this one dollar coin, and before she spent
    it, she can actually make copies of it.
  • 4:20 - 4:24
    And then all of a sudden instead of just
    having just one dollar coin now all
  • 4:24 - 4:28
    of a sudden she has three dollar coins and
    they're all the same of course, and
  • 4:28 - 4:32
    there's nothing preventing her from taking
    those replicas of a dollar coin and
  • 4:32 - 4:36
    spending it at other merchants. And so the
    question is how do we provide anonymous
  • 4:36 - 4:40
    digital cash? But at the same time, also
    prevent Alice from double spending the
  • 4:40 - 4:44
    dollar coin at different merchants. In
    some sense there's a paradox here where
  • 4:44 - 4:48
    anonymity is in conflict with security
    because if we have anonymous cash there's
  • 4:48 - 4:52
    nothing to prevent Alice from double
    spending the coin and because the coin is
  • 4:52 - 4:56
    anonymous we have no way of telling who
    committed this fraud. And so the question
  • 4:56 - 5:00
    is how do we resolve this tension. And it
    turns out, it's completely doable. And
  • 5:00 - 5:05
    we'll talk about anonymous digital cash
    later on. Just to give you a hint, I'll
  • 5:05 - 5:09
    say that the way we do it is basically by
    making sure that if Alice spends the coin
  • 5:09 - 5:14
    once then no one knows who she is, but if
    she spends the coin more than once, all of
  • 5:14 - 5:18
    a sudden, her identity is completely
    exposed and then she could be subject to
  • 5:18 - 5:22
    all sorts of legal problems. And so that's
    how anonymous digital cash would
  • 5:22 - 5:26
    work at a high level and we'll see how to
    implement it later on in the course.
  • 5:26 - 5:30
    Another application of cryptography has to
    do with more abstract protocols, but
  • 5:30 - 5:34
    before I speak the general result, I
    want to give you two examples. So the
  • 5:34 - 5:38
    first example has to do with election
    systems. So here's the election problem.
  • 5:38 - 5:43
    Suppose ... we have two parties, party zero
    and party one. And voters vote for these
  • 5:43 - 5:47
    parties. So for example, this voter could
    have voted for party zero, this voter voted for
  • 5:47 - 5:52
    party one. And so on. So in this election,
    party zero got three votes and party two
  • 5:52 - 5:57
    got two votes. So the winner of the
    election, of course, is party zero. But
  • 5:57 - 6:02
    more generally, the winner of the election
    is the party who receives the majority of
  • 6:02 - 6:06
    the votes. Now, the voting problem is the
    following. The voters would somehow like
  • 6:06 - 6:12
    to compute the majority of the votes, but
    do it in such a way such that nothing else
  • 6:12 - 6:17
    is revealed about their individual votes.
    Okay? So the question is how to do that?
  • 6:17 - 6:21
    And to do so, we're gonna introduce an
    election center who's gonna help us
  • 6:21 - 6:27
    compute the majority, but keep the votes
    otherwise secret. And what the parties
  • 6:27 - 6:32
    will do is they will each send the funny
    encryption of their votes to the election
  • 6:32 - 6:37
    center in such a way that at the end of
    the election, the election center is able
  • 6:37 - 6:42
    to compute and output the winner of the
    election. However, other than the winner
  • 6:42 - 6:47
    of the election, nothing else is revealed
    about the individual votes. The individual
  • 6:47 - 6:51
    votes otherwise remain completely private.
    Of course the election center is also
  • 6:51 - 6:56
    gonna verify that this voter for example
    is allowed to vote and that the voter has
  • 6:56 - 7:01
    only voted once. But other than that
    information the election center and the
  • 7:01 - 7:05
    rest of the world learned nothing else
    about that voter's vote other than the
  • 7:05 - 7:10
    result of the election. So this is an
    example of a protocol that involves six
  • 7:10 - 7:14
    parties. In this case, there are five
    voters in one election center. These
  • 7:14 - 7:19
    parties compute amongst themselves. And at
    the end of the computation, the result of
  • 7:19 - 7:24
    the election is known but nothing else is
    revealed about the individual inputs. Now
  • 7:24 - 7:29
    a similar problem comes up in the context
    of private auctions. So in a private
  • 7:29 - 7:34
    auction every bidder has his own bid that
    he wants to bid. And now suppose the
  • 7:34 - 7:39
    auction mechanism that's being used is
    what's called a Vickrey auction where the
  • 7:39 - 7:45
    definition of a Vickrey auction is that
    the winner is the highest bidder. But the
  • 7:45 - 7:50
    amounts that the winner pays is actually
    the second highest bid. So he pays the
  • 7:50 - 7:55
    second highest bid. Okay, so this is a
    standard auction mechanism called the
  • 7:55 - 8:00
    Vickrey auction. And now what we'd like to
    do is basically enable the participants to
  • 8:00 - 8:05
    compute, to figure out who the highest
    bidder is and how much he's supposed to
  • 8:05 - 8:09
    pay, but other than that, all other
    information about the individual bids
  • 8:09 - 8:14
    should remain secret. So for example, the
    actual amount that the highest bidder bid
  • 8:14 - 8:19
    should remain secret. The only thing that
    should become public is the second highest
  • 8:19 - 8:24
    bid and the identity of the highest
    bidder. So again now, the way we will do
  • 8:24 - 8:28
    that is we'll introduce an auction center, and
    in a similar way, essentially, everybody
  • 8:28 - 8:33
    will send their encrypted bids to the
    auction center. The auction center will
  • 8:33 - 8:37
    compute the identity of the winner and in
    fact he will also compute the second
  • 8:37 - 8:42
    highest bid but other than these two
    values, nothing else is revealed about the
  • 8:42 - 8:46
    individual bids. Now, this is actually an
    example of a much more general problem
  • 8:46 - 8:50
    called secure multi-party computation. Let
    me explain what secure multi-party
  • 8:50 - 8:55
    computation is about. So here, basically
    abstractly, the participants have a secret
  • 8:55 - 8:59
    inputs to themselves. So, in the case of
    an election, the inputs would be the
  • 8:59 - 9:03
    votes. In the case of an auction, the
    inputs would be the secret bids. And then
  • 9:03 - 9:07
    what they would like to do is compute some
    sort of a function of their inputs.
  • 9:07 - 9:11
    Again, in the case of an election, the
    function's a majority. In the case of
  • 9:11 - 9:15
    auction, the function happens to be the
    second highest, largest number among x one
  • 9:15 - 9:19
    to x four. And the question is, how can
    they do that, such that the value of the
  • 9:19 - 9:23
    function is revealed, but nothing else is
    revealed about the individual votes? So
  • 9:23 - 9:28
    let me show you kind of a dumb, insecure
    way of doing it. What we do is introduce a
  • 9:28 - 9:32
    trusted party. And then, this trusted
    authority basically collects individual
  • 9:32 - 9:36
    inputs. And it kinda promises to keep the
    individual inputs secret, so that only it
  • 9:36 - 9:41
    would know what they are. And then, it
    publishes the value of the function, to
  • 9:41 - 9:45
    the world. So, the idea is now that the
    value of the function became public, but
  • 9:45 - 9:49
    nothing else is revealed about the
    individual inputs. But, of course, you got
  • 9:49 - 9:53
    this trusted authority that you got to
    trust, and if for some reason it's not
  • 9:53 - 9:57
    trustworthy, then you have a problem. And
    so, there's a very central theorem in
  • 9:57 - 10:01
    crypto and it really is quite a
    surprising fact. That says that any
  • 10:01 - 10:05
    computation you'd like to do, any function
    F you'd like to compute, that you can
  • 10:05 - 10:09
    compute with a trusted authority, you can
    also do without a trusted authority.
  • 10:09 - 10:14
    Let me at a high level explain what this
    means. Basically, what we're gonna do, is
  • 10:14 - 10:18
    we're gonna get rid of the authority. So
    the parties are actually not gonna send
  • 10:18 - 10:22
    their inputs to the authority. And, in
    fact, there's no longer going to be an
  • 10:22 - 10:26
    authority in the system. Instead, what the
    parties are gonna do, is they're gonna
  • 10:26 - 10:31
    talk to one another using some protocol.
    Such that at the end of the protocol all
  • 10:31 - 10:35
    of a sudden the value of the function
    becomes known to everybody. And yet
  • 10:35 - 10:39
    nothing other than the value of the
    function is revealed. In other words, the
  • 10:39 - 10:44
    individual inputs is still kept secret.
    But again there's no authority, there's
  • 10:44 - 10:48
    just a way for them to talk to one another
    such that the final output is revealed. So
  • 10:48 - 10:52
    this is a fairly general result, it's kind
    of a surprising fact that is at all
  • 10:52 - 10:56
    doable. And in fact it is and towards the
    end of the class we'll actually see how to
  • 10:56 - 11:01
    make this happen. Now, there are some
    applications of cryptography that I can't
  • 11:01 - 11:06
    classify any other way other than to say
    that they are purely magical. Let me give
  • 11:06 - 11:10
    you two examples of that. So the first is
    what's called privately outsourcing
  • 11:10 - 11:15
    computation. So I'll give you an example
    of a Google search just to illustrate the
  • 11:15 - 11:20
    point. So imagine Alice has a search query
    that she wants to issue. It turns out that
  • 11:20 - 11:25
    there are very special encryption schemes
    such that Alice can send an encryption of
  • 11:25 - 11:30
    her query to Google. And then because of
    the property of the encryption scheme
  • 11:30 - 11:35
    Google can actually compute on the
    encrypted values without knowing what the
  • 11:35 - 11:40
    plain texts are. So Google can actually
    run its massive search algorithm on the
  • 11:40 - 11:45
    encrypted query and recover in encrypted
    results. Okay. Google will send the
  • 11:45 - 11:49
    encrypted results back to Alice. Alice
    will decrypt and then she will receive the
  • 11:49 - 11:54
    results. But the magic here is all Google
    saw was just encryptions of her queries
  • 11:54 - 11:57
    and nothing else. And so, Google as a
    result has no idea what Alice just
  • 11:57 - 12:02
    searched for and nevertheless Alice
    actually learned exactly what she
  • 12:02 - 12:06
    wanted to learn. Okay so, these are
    magical kind of encryption schemes. They're
  • 12:06 - 12:10
    fairly recent, this is only a new
    development from about two or three years
  • 12:10 - 12:14
    ago, that allows us to compute on encrypted
    data, even though we don't really know
  • 12:14 - 12:19
    what's inside the encryption. Now, before
    you rush off and think about implementing
  • 12:19 - 12:22
    this, I should warn you that this is
    really at this point just theoretical, in
  • 12:22 - 12:26
    the sense that running a Google search on
    encryption data probably would take a
  • 12:26 - 12:31
    billion years. But nevertheless, just the
    fact that this is doable is already really
  • 12:31 - 12:34
    surprising, and is already quite useful
    for relatively simple computations. So in
  • 12:34 - 12:39
    fact, we'll see some applications of this
    later on. The other magical application I
  • 12:39 - 12:42
    want to show you is what's called zero
    knowledge. And in particular, I'll tell
  • 12:42 - 12:46
    you about something called the zero
    knowledge proof of knowledge. So here ...
  • 12:46 - 12:50
    what happens is there's a certain
    number N, which Alice knows. And the way
  • 12:50 - 12:54
    the number N was constructed is as a
    product of two large primes. So, imagine
  • 12:54 - 12:59
    here we have two primes, P and Q. Each one
    you can think of it as like 1000 digits.
  • 12:59 - 13:04
    And you probably know that multiplying
    two 1000-digit numbers is fairly easy. But if
  • 13:04 - 13:08
    I just give you their product, figuring
    out their factorization into primes is
  • 13:08 - 13:12
    actually quite difficult. And, in fact,
    we're gonna use the fact that factoring is
  • 13:12 - 13:17
    difficult to build public key cryptosystems
    in the second half of the course.
  • 13:17 - 13:21
    Okay, so Alice happens to have this number
    N, and she also knows the factorization of
  • 13:21 - 13:25
    N. Now Bob just has the number N. He
    doesn't actually know the factorization.
  • 13:25 - 13:29
    Now, the magical fact about the zero
    knowledge proof of knowledge, is that
  • 13:29 - 13:33
    Alice can prove to Bob that she knows the
    factorization of N. Yes, you can actually
  • 13:33 - 13:37
    give this proof to Bob, that Bob can
    check, and become convinced that Alice
  • 13:37 - 13:42
    knows the factorization of N, however Bob
    learns nothing at all. About the factors P
  • 13:42 - 13:47
    and Q, and this is provable. Bob
    absolutely learns nothing at all about the
  • 13:47 - 13:51
    factors P and Q. And the statement
    actually is very, very general. This is
  • 13:51 - 13:55
    not just about proving the factorization
    of N. In fact, almost any puzzle that you
  • 13:55 - 14:00
    want to prove that you know an answer to,
    you can prove it is your knowledge. So if
  • 14:00 - 14:04
    you have a crossword puzzle that you've
    solved. Well, maybe crosswords is not the
  • 14:04 - 14:08
    best example. But if you have like a
    Sudoku puzzle, for example, that you want
  • 14:08 - 14:12
    to prove that you've solved, you can prove
    it to Bob in a way that Bob would learn
  • 14:12 - 14:17
    nothing at all about the solution, and yet
    Bob would be convinced that you really do
  • 14:17 - 14:21
    have a solution to this puzzle. Okay. So
    those are kind of magical applications.
  • 14:21 - 14:25
    And so the last thing I want to say is
    that modern cryptography is a very
  • 14:25 - 14:29
    rigorous science. And in fact, every
    concept we're gonna describe is gonna
  • 14:29 - 14:33
    follow three very rigorous steps, okay,
    and we're gonna see these three steps
  • 14:33 - 14:37
    again and again and again so I wanna
    explain what they are. So the first thing
  • 14:37 - 14:41
    we're gonna do when we introduce a new
    primitive like a digital signature is
  • 14:41 - 14:46
    we're gonna specify precisely what the
    threat model is. That is, what can an
  • 14:46 - 14:50
    attacker do to attack a digital signature
    and what is his goal in forging
  • 14:50 - 14:54
    signatures? Okay, so we're gonna define
    exactly what does it mean for a signature
  • 14:54 - 14:58
    for example to be unforgeable.
    Unforgeable. Okay and I'm giving digital
  • 14:58 - 15:02
    signatures just as an example. For every
    primitive we describe we're gonna
  • 15:02 - 15:06
    precisely define what the threat model is.
    Then we're gonna propose a construction
  • 15:06 - 15:11
    and then we're gonna give a proof that any
    attacker that's able to attack the
  • 15:11 - 15:16
    construction under this threat model. That
    attacker can also be used to solve some
  • 15:16 - 15:20
    underlying hard problem. And as a result,
    if the problem really is hard, that
  • 15:20 - 15:24
    actually proves that no attacker can break
    the construction under the threat model.
  • 15:24 - 15:28
    Okay. But these three steps are actually
    quite important. In the case of
  • 15:28 - 15:32
    signatures, we'll define what it means for
    a signature to be, forgeable, then we'll
  • 15:32 - 15:36
    give a construction, and then for example
    we'll say that anyone who can break our
  • 15:36 - 15:40
    construction can then be used to say
    factor integers, which is believed to be a
  • 15:40 - 15:44
    hard problem. Okay, so we're going to
    follow these three steps throughout, and
  • 15:44 - 15:47
    then you'll see how this actually comes
    about. Okay, so this is the end of the
  • 15:47 - 15:51
    segment. And then in the next segment
    we'll talk a little bit about the history
  • 15:51 - 15:52
    of cryptography.
Title:
What is cryptography? (15 min)
Video Language:
English

English subtitles

Revisions