Return to Video

ElGamal Security (14 min)

  • 0:00 - 0:03
    In this segment, we're gonna study the
    security of the ElGamal public key
  • 0:03 - 0:07
    encryption system. So let me remind you
    that when we first presented the
  • 0:07 - 0:11
    Diffie-Hellman protocol, we said that the
    security is based on the assumption that
  • 0:11 - 0:15
    says that given G, G to the A, G to the B,
    it's difficult to compute the
  • 0:15 - 0:19
    Diffie-Hellman secret, G to the AB. This
    is basically what the attacker has to do.
  • 0:19 - 0:23
    He sees Alice's contribution. He sees
    Bob's contribution and then his goal is to
  • 0:23 - 0:27
    figure out what the Diffie-Hellman secret
    is. And we said that this problem is
  • 0:27 - 0:31
    difficult. The statement that the problem
    is difficult is what's called the
  • 0:31 - 0:35
    computational Diffie-Hellman assumption.
    So, let's take this assumption more
  • 0:35 - 0:40
    precisely. So, as usual we're going to
    look at a finite cyclic group of order N,
  • 0:40 - 0:44
    so G is some group, in your head you should be
    thinking ZP star, but there could
  • 0:44 - 0:48
    be other groups, for example, like an ellipctic curve group. And then we say that
  • 0:48 - 0:52
    the computational Diffie-Hellman
    assumption. I've often used the shorthand
  • 0:52 - 0:56
    CDH for Computational Diffie Hellman.
    We'll say this assumption holds in G if
  • 0:56 - 1:01
    the following condition holds, namely for
    all efficient algorithms. If we choose a
  • 1:01 - 1:05
    random generator, and then we choose
    random exponents A, B in ZN. Then when
  • 1:05 - 1:09
    we give the algorithm G, G to the A, and G
    to the B, the probability that the
  • 1:09 - 1:14
    algorithm will produce the Diffie Hellman
    secret is negligible. If this is true for
  • 1:14 - 1:18
    all efficient algorithms, then we say that
    the CDH assumption holds for G. CDH, as I
  • 1:18 - 1:23
    said, stands for Computational Diffie
    Hellman. As it turns out, this assumption
  • 1:23 - 1:28
    is not ideal for analyzing the security of
    the ElGamal system. And instead I'm gonna
  • 1:28 - 1:32
    go ahead and make a slightly stronger
    assumption called a hash Diffie-Hellman
  • 1:32 - 1:36
    assumption. Okay. So what is hash
    Diffie-Hellman assumption? Again, we are
  • 1:36 - 1:40
    focusing on a particular group G and now
    we're also gonna introduce a hash function
  • 1:40 - 1:45
    H that maps pairs of elements in G into
    the key space of some symmetric encryption
  • 1:45 - 1:49
    system. And then we say that the hash
    Diffie-Hellman a ssumption, or HDH for
  • 1:49 - 1:53
    short, holds for this pair, G comma H for
    the group in the hash function if the
  • 1:53 - 1:58
    following is true. Basically, if I choose
    a random generator and then I choose
  • 1:58 - 2:02
    random exponents A and B in ZN and then I
    also choose a random R and K, then the
  • 2:02 - 2:06
    following distributions are
    computationally indistinguishable. That
  • 2:06 - 2:10
    is, if I give you G, G to the A, G to the
    B, and then this hash value, this should
  • 2:10 - 2:13
    look familiar to you. This is the hash
    value that's used in the ElGamal system to
  • 2:13 - 2:17
    derive the symmetric encryption key. What
    we're saying is that this distribution is
  • 2:17 - 2:21
    indistinguishable from a distribution
    where you're also given. G, G to the A, G
  • 2:21 - 2:25
    to the B. But now, instead of giving the
    hash, you're given just really random key.
  • 2:25 - 2:30
    So what this assumption says is that the
    symmetric key that was derived during
  • 2:30 - 2:34
    encryption in the ElGamal system,
    essentially is indistinguishable from a
  • 2:34 - 2:38
    truly random key that is derived
    independently from the rest of the
  • 2:38 - 2:42
    parameters of the system. It's a truly
    independent random key, looks basically
  • 2:42 - 2:47
    the same as the key that we used, to
    derive from the G to the A and the G to
  • 2:47 - 2:52
    the B. Now I'd like to point out that the
    Hash Diffie-Hellman assumption is actually
  • 2:52 - 2:56
    a stronger assumption than the
    Computational Diffie-Hellman assumption.
  • 2:56 - 3:01
    The way to see that is using the contra
    positive, that is suppose the
  • 3:01 - 3:06
    Computational Diffie-Hellman assumption
    happens to be easy in the group G. Then I
  • 3:06 - 3:11
    claim that in fact the Hash Diffie-Hellman
    assumption is also easy in the group G. In
  • 3:11 - 3:16
    fact, I'll say for G, H and this is true
    in fact for all hash functions where the
  • 3:16 - 3:20
    image of the hash function. It contains at
    least two elements. In other words all I
  • 3:20 - 3:25
    want to say is that the Hash Diffie-Hellman assumption
    and it's easy for all
  • 3:25 - 3:29
    hash functions to go match everything to a
    single point. Which is true for almost all
  • 3:29 - 3:33
    hash functions of interest to us. So
    actually, this is a really simple
  • 3:33 - 3:37
    statement to prove. Basically, if the
    Computational Diffie-Hellman assumption is easy, what that
  • 3:37 - 3:42
    says is that, given G to the A and G to the B, I can actually calculate G to the AB
  • 3:42 - 3:46
    myself. Cuz I know the hash function H, I
    can calculate this complete value here. So
  • 3:46 - 3:50
    if you give me a tuple that's
    sampled from the left or sampled from the
  • 3:50 - 3:55
    right. I can very easily tell where it's
    from. If it's sampled from the left, then
  • 3:55 - 3:59
    once I've calculated G to the AB myself, I
    can just test that the last element in the
  • 3:59 - 4:03
    tuple is, in fact, the hash of G to
    the B and G to the AB. If it is, I know
  • 4:03 - 4:08
    the sample is from the left. If it isn't,
    I know the sample is from the right. Okay,
  • 4:08 - 4:13
    so this gives me pretty high advantage, in
    distinguishing these two distributions. So
  • 4:13 - 4:17
    CDH is easy, it's very easy to see that
    these distributions are distinguishable.
  • 4:17 - 4:22
    And this basically says that if the Hash
    Diffie-Hellman is in fact hard, then CDH
  • 4:22 - 4:27
    must also be hard. Which then we say, that
    therefore the Hash Diffie-Hellman is a
  • 4:27 - 4:31
    stronger assumption. There might be group
    where CDH is hard, but HDH is not hard.
  • 4:31 - 4:36
    Okay. So we say HDH is a
    stronger assumption. If you found this
  • 4:36 - 4:40
    discussion of weak assumption versus
    strong assumption confusing, it's not
  • 4:40 - 4:44
    really that important, it's fine to ignore
    it. The important thing that I want to
  • 4:44 - 4:47
    show you is in fact that the Hash
    Diffie-Hellman assumption is sufficient to
  • 4:47 - 4:52
    prove the semantic security of the ElGamal
    system. Before we do that let me quickly
  • 4:52 - 4:56
    ask you the following puzzle just to make
    sure the Hash Diffie-Hellman assumption is
  • 4:56 - 5:00
    clear. So imagine our key space is {0, 1} to the 128. So 128 bit strings and our
  • 5:00 - 5:04
    hash function will map pairs of group
    elements into this 128 byte strings.
  • 5:04 - 5:08
    Suppose it so happens that we chose a hash
    function Such that it always outputs
  • 5:08 - 5:13
    strings that begin with zero. In other
    words, of the 128 bit strings the most
  • 5:13 - 5:18
    significant bit of the output is always
    zero. If we chose such a hash function,
  • 5:18 - 5:24
    does the Hash Diffie-Hellman assumption
    hold for this pair, G comma H. So the
  • 5:24 - 5:27
    answer is no it doesn't hold. And the
    reason is because it now very easy to
  • 5:27 - 5:31
    distinguish the two distributions that we
    have here. The distribution on the left
  • 5:31 - 5:35
    an the distribution on the right. And the
    way you would distinguish the two is
  • 5:35 - 5:40
    basically if I choose a truly random
    element in K, in {0, 1} to the 128,
  • 5:40 - 5:44
    then mostly it can very well be zero with
    probability one half. However, the tuple
  • 5:44 - 5:49
    that's given to me is chosen from the left
    distribution, then the most significant
  • 5:49 - 5:53
    bit of the hash will always be zero and
    therefore if I see zero, I'm gonna say the
  • 5:53 - 5:57
    distribution is a distribution on the
    left. If I see one, I'm gonna say the
  • 5:57 - 6:01
    distribution is a distribution on the
    right. That will give me advantage one
  • 6:01 - 6:05
    half in distinguishing these two
    distributions. And so as a result, the
  • 6:05 - 6:10
    Hash Diffie-Hillman assumption is false in
    this case. So clearly you could see that,
  • 6:10 - 6:14
    even though CDH might be hard in a certain
    group G, if you choose a bad hash
  • 6:14 - 6:18
    function, then HDH will not hold for the
    pair G comma H. Okay. But suppose it so
  • 6:18 - 6:23
    happens that we choose a group G and a
    hash function H for which the Hash Diffie
  • 6:23 - 6:27
    Hellman assumption holds. And in fact, if
    you choose the hash function to be just
  • 6:27 - 6:32
    SHA-256, for all we know, the Hash
    Diffie Hellman assumption holds in the
  • 6:32 - 6:36
    group ZP star, and holds in elliptic curve
    groups. My goal then is to show you that
  • 6:36 - 6:40
    ElGamal is semantically secure under
    the Hash Diffie-Hellman assumption. So let me
  • 6:40 - 6:44
    quickly remind you how theElGamal
    system works. While we're gonna choose a
  • 6:44 - 6:48
    random generator G, we're gonna choose a
    random 'a' in ZN, the public key is
  • 6:48 - 6:52
    gonna be G, and G to the A, the secret key
    is simply gonna be A. And now here I wrote
  • 6:52 - 6:56
    shorthand for the ElGamal encryption.
    Basically, what to encrypt, what we do is
  • 6:56 - 7:02
    we choose a random B. We hash G as being H
    to the B. Remember this H to the B is the
  • 7:02 - 7:06
    value G to the AB. This is the Diffie
    Hellman secret. The hash function gave us
  • 7:06 - 7:11
    a symmetric encryption key K. We encrypt
    the message with K, and we output G to the
  • 7:11 - 7:15
    B and the symmetric cipher text. To
    decrypt we have to value U, and the Diffie
  • 7:15 - 7:19
    Hellman secret, G to the A. To derive a
    symmetric key, we decrypt the cipher text.
  • 7:19 - 7:23
    And then we output the plaintext message m. So let's see how to argue that,
  • 7:23 - 7:28
    in fact, ElGamal is semantically secure under
    this Hash Diffie-Hillman assumption is
  • 7:28 - 7:32
    fairly simple. So well we have to argue,
    remember we had, in semantic security, we
  • 7:32 - 7:36
    have these two experiments. One
    experiment, the attacker is given the
  • 7:36 - 7:40
    encryption of m0. In the other experiment,
    the attacker is given the encryption of m1.
  • 7:40 - 7:44
    So I wrote the two experiments here. Here
    you notice that the attacker starts by
  • 7:44 - 7:48
    sending off the public key to the
    adversary. The adversary then chooses two
  • 7:48 - 7:53
    equal length messages m0 and m1. And then
    if he gets the ElGamal encryption of M0,
  • 7:53 - 7:57
    and a kind of rough shorthand for what
    ElGamal ciphertext is, okay? Similarly, in
  • 7:57 - 8:02
    encryption one, he simply gets the
    encryption of M1 instead of M0, and
  • 8:02 - 8:06
    everything else is the same about these
    two experiments. Now, because of the Hash
  • 8:06 - 8:11
    Diffie-Hellman assumption, we know that
    even though the attacker got to see G, G
  • 8:11 - 8:16
    to the a and G to the b, we know that the
    output of the hash of G to the b, G to the
  • 8:16 - 8:21
    ab is indistinguishable from random.
    Therefore, if we replace the actual hash
  • 8:21 - 8:26
    function by a truly generated random key K
    that's independent of everything else, by
  • 8:26 - 8:31
    the Hash Diffie-Hellman assumption, the
    attacker can't distinguish these two
  • 8:31 - 8:35
    games. And again, it's a simple exercise
    for you to show that if he could
  • 8:35 - 8:39
    distinguish these two games, then he would
    break the Hash Diffie-Hellman assumption.
  • 8:39 - 8:43
    But then the same is true for experiment
    one. The attacker also could not
  • 8:43 - 8:47
    distinguish the output of the hash
    function from a truly random key, that was
  • 8:47 - 8:51
    used the generate the challenge cipher
    text. Okay. So now basically we look at
  • 8:51 - 8:55
    these two experiments and we realize that,
    wait a minute, what's going on in these
  • 8:55 - 8:59
    two experiments? Basically everything is
    the same except in one experiment, the
  • 8:59 - 9:03
    attacker gets the encryption under
    a symmetric encryption system of m0. In the
  • 9:03 - 9:07
    other one, he gets the encryption under a
    symmetric encryption system of m1 and the
  • 9:07 - 9:11
    key is chosen at random independently over
    everything else. So by the fact that the
  • 9:11 - 9:14
    symmetric encryption system is
    semantically secure, these two games are
  • 9:14 - 9:18
    indistinguishable. If they were
    distinguishable, then the attacker could
  • 9:18 - 9:22
    break the semantic security of this
    symmetric encryption system. So now we
  • 9:22 - 9:26
    kinda prove this, you know, this chain of
    equivalences. And that proves that the
  • 9:26 - 9:29
    original games, in fact, are
    indistinguishable, computationally
  • 9:29 - 9:33
    indistinguishable. And therefore, the
    ElGamal system is semantically secure.
  • 9:33 - 9:38
    Okay so it's like I said a very simple
    proof by pictures and you can make this
  • 9:38 - 9:42
    into a rigorous proof without too much
    work. But semantic security isn't enough,
  • 9:42 - 9:46
    what we really want is chosen ciphertext
    security, and the question is ElGamal chosen ciphertext
  • 9:46 - 9:50
    secure? So it turns out to prove the
    chosen ciphertext security of ElGamal we
  • 9:50 - 9:55
    actually need a stronger assumption, Hash Diffie-Hellman or Computational Diffie-Hellman
  • 9:55 - 9:59
    are actually not enough to prove
    chosen ciphertext security of the system as far
  • 9:59 - 10:03
    as we know. So to prove chosen-ciphertext
    security, I'm going to introduce a new
  • 10:03 - 10:06
    assumption called Interactive Diffie
    Hellman assumption. And actually,
  • 10:06 - 10:10
    technically we really don't like this
    assumption. And we're going to try to get
  • 10:10 - 10:15
    rid of this, later on. But for now, we
    just want to analyze the security of the
  • 10:15 - 10:19
    basic ElGamal system against chosen
    ciphertext attack. So to argue that it is
  • 10:19 - 10:24
    chosen ciphertext secure, here is what the
    assumption says. Basically the challenger
  • 10:24 - 10:28
    starts, you know, plays a game with the
    adversary, he generates a random G,
  • 10:28 - 10:32
    generates two exponents, and then he says
    to the adversary as usual, G, G to the a
  • 10:32 - 10:37
    and G to the b. Now the adversary's goal
    is basically to figure out the
  • 10:37 - 10:41
    Diffie-Helmman's secrets, mainly g to the
    ab. He outputs the value of V and he wins
  • 10:41 - 10:45
    the game if V happens to be equal to G to
    the AB. So, so far this looks identical to
  • 10:45 - 10:49
    the Computational Diffie-Hellman assumption,
    there's no difference - this is
  • 10:49 - 10:53
    the Computational Diffie-Hellman assumption
    except in Interactive Diffie-Hellman would
  • 10:53 - 10:57
    give the attacker a little bit more power.
    So because the attacker has more power,
  • 10:57 - 11:02
    it's harder to satisfy the assumption,
    which is why we say that this assumption
  • 11:02 - 11:06
    is stronger than Computational
    Diffie-Hellman. Now what is this power to be
  • 11:06 - 11:11
    given? We are given the ability to make
    queries. In particular, he gets to submit
  • 11:11 - 11:16
    two elements from the group G, so U1, V1
    disappear from the group G. And then he's
  • 11:16 - 11:20
    told whether U1 to the A is equal to V1,
    so he gets one. If you wanted the A equals
  • 11:20 - 11:24
    to V1 get zero, otherwise it's kind of a
    bizarre type of query. What, how does it
  • 11:24 - 11:28
    be possibly help the attacker? But it
    turns out we need to be able to answer
  • 11:28 - 11:32
    these types of queries to the adversary in
    order to be able to prove chosen
  • 11:32 - 11:36
    ciphertext security. And in fact, he can
    do these type of queries again and again
  • 11:36 - 11:39
    and again. So he can issue as many
    queries like these as he wants and then the
  • 11:39 - 11:43
    assumption says that despite all these
    queries he still can't figure out the
  • 11:43 - 11:47
    Diffie-Hellman secret, namely despite
    making all these queries, the probability
  • 11:47 - 11:51
    that he outputs the
    Diffie-Hellman secret is negligible. Okay.
  • 11:51 - 11:56
    So clearly if this assumption holds, that the Computational Diffie-Hellman assumption
  • 11:56 - 11:59
    holds, because here, the adversary has more power,
    and as a result we say that this assumption
  • 11:59 - 12:03
    is stronger that Computational Diffie-Hellman, the thing,
    we really don't like about this
  • 12:03 - 12:06
    assumption, is the fact, that, it's
    interactive, and that the adversary is allowed to
  • 12:06 - 12:10
    make queries to the challenger, and as I
    said, we're gonna trying to get rid
  • 12:10 - 12:14
    of this interaction later on. But for now
    I'll tell you that under this Interactive
  • 12:14 - 12:19
    Diffie-Hellman assumption and under the
    assumption that the symmetric encryption
  • 12:19 - 12:23
    system provides authenticated encryption, and
    under the assumption that the hash
  • 12:23 - 12:27
    function is kind of an ideal hash
    function, what we call the random oracle,
  • 12:27 - 12:31
    then in fact the ElGamal system is chosen ciphertext secure and again this
  • 12:31 - 12:35
    RO here denotes the fact that it's in the
    random oracle model. Which is not
  • 12:35 - 12:39
    important, so much for our purposes. The
    main thing to remember that under
  • 12:39 - 12:43
    kind of this weird, interactive
    assumption we can actually prove that
  • 12:43 - 12:47
    ElGamal is a chosen ciphertext secure.
    And as far as we know this assumption
  • 12:47 - 12:51
    actually holds for the group ZP star.
    So what we're gonna do next is basically
  • 12:51 - 12:56
    answer this question of whether we can get
    rid of the interactive assumption. Can we
  • 12:56 - 13:00
    prove security strictly based on CDH? And
    similarly can we proof security without
  • 13:00 - 13:04
    relying on random oracles? Just you know
    without having to assume, that the hash
  • 13:04 - 13:08
    function is ideal. Just you know, can we
    prove security using a concrete hash
  • 13:08 - 13:12
    function. And we'll do that very briefly
    in the next segment.
Title:
ElGamal Security (14 min)
Video Language:
English

English subtitles

Revisions