-
In this segment, we're gonna study the
security of the ElGamal public key
-
encryption system. So let me remind you
that when we first presented the
-
Diffie-Hellman protocol, we said that the
security is based on the assumption that
-
says that given G, G to the A, G to the B,
it's difficult to compute the
-
Diffie-Hellman secret, G to the AB. This
is basically what the attacker has to do.
-
He sees Alice's contribution. He sees
Bob's contribution and then his goal is to
-
figure out what the Diffie-Hellman secret
is. And we said that this problem is
-
difficult. The statement that the problem
is difficult is what's called the
-
computational Diffie-Hellman assumption.
So, let's take this assumption more
-
precisely. So, as usual we're going to
look at a finite cyclic group of order N,
-
so G is some group, in your head you should be
thinking ZP star, but there could
-
be other groups, for example, like an ellipctic curve group. And then we say that
-
the computational Diffie-Hellman
assumption. I've often used the shorthand
-
CDH for Computational Diffie Hellman.
We'll say this assumption holds in G if
-
the following condition holds, namely for
all efficient algorithms. If we choose a
-
random generator, and then we choose
random exponents A, B in ZN. Then when
-
we give the algorithm G, G to the A, and G
to the B, the probability that the
-
algorithm will produce the Diffie Hellman
secret is negligible. If this is true for
-
all efficient algorithms, then we say that
the CDH assumption holds for G. CDH, as I
-
said, stands for Computational Diffie
Hellman. As it turns out, this assumption
-
is not ideal for analyzing the security of
the ElGamal system. And instead I'm gonna
-
go ahead and make a slightly stronger
assumption called a hash Diffie-Hellman
-
assumption. Okay. So what is hash
Diffie-Hellman assumption? Again, we are
-
focusing on a particular group G and now
we're also gonna introduce a hash function
-
H that maps pairs of elements in G into
the key space of some symmetric encryption
-
system. And then we say that the hash
Diffie-Hellman a ssumption, or HDH for
-
short, holds for this pair, G comma H for
the group in the hash function if the
-
following is true. Basically, if I choose
a random generator and then I choose
-
random exponents A and B in ZN and then I
also choose a random R and K, then the
-
following distributions are
computationally indistinguishable. That
-
is, if I give you G, G to the A, G to the
B, and then this hash value, this should
-
look familiar to you. This is the hash
value that's used in the ElGamal system to
-
derive the symmetric encryption key. What
we're saying is that this distribution is
-
indistinguishable from a distribution
where you're also given. G, G to the A, G
-
to the B. But now, instead of giving the
hash, you're given just really random key.
-
So what this assumption says is that the
symmetric key that was derived during
-
encryption in the ElGamal system,
essentially is indistinguishable from a
-
truly random key that is derived
independently from the rest of the
-
parameters of the system. It's a truly
independent random key, looks basically
-
the same as the key that we used, to
derive from the G to the A and the G to
-
the B. Now I'd like to point out that the
Hash Diffie-Hellman assumption is actually
-
a stronger assumption than the
Computational Diffie-Hellman assumption.
-
The way to see that is using the contra
positive, that is suppose the
-
Computational Diffie-Hellman assumption
happens to be easy in the group G. Then I
-
claim that in fact the Hash Diffie-Hellman
assumption is also easy in the group G. In
-
fact, I'll say for G, H and this is true
in fact for all hash functions where the
-
image of the hash function. It contains at
least two elements. In other words all I
-
want to say is that the Hash Diffie-Hellman assumption
and it's easy for all
-
hash functions to go match everything to a
single point. Which is true for almost all
-
hash functions of interest to us. So
actually, this is a really simple
-
statement to prove. Basically, if the
Computational Diffie-Hellman assumption is easy, what that
-
says is that, given G to the A and G to the B, I can actually calculate G to the AB
-
myself. Cuz I know the hash function H, I
can calculate this complete value here. So
-
if you give me a tuple that's
sampled from the left or sampled from the
-
right. I can very easily tell where it's
from. If it's sampled from the left, then
-
once I've calculated G to the AB myself, I
can just test that the last element in the
-
tuple is, in fact, the hash of G to
the B and G to the AB. If it is, I know
-
the sample is from the left. If it isn't,
I know the sample is from the right. Okay,
-
so this gives me pretty high advantage, in
distinguishing these two distributions. So
-
CDH is easy, it's very easy to see that
these distributions are distinguishable.
-
And this basically says that if the Hash
Diffie-Hellman is in fact hard, then CDH
-
must also be hard. Which then we say, that
therefore the Hash Diffie-Hellman is a
-
stronger assumption. There might be group
where CDH is hard, but HDH is not hard.
-
Okay. So we say HDH is a
stronger assumption. If you found this
-
discussion of weak assumption versus
strong assumption confusing, it's not
-
really that important, it's fine to ignore
it. The important thing that I want to
-
show you is in fact that the Hash
Diffie-Hellman assumption is sufficient to
-
prove the semantic security of the ElGamal
system. Before we do that let me quickly
-
ask you the following puzzle just to make
sure the Hash Diffie-Hellman assumption is
-
clear. So imagine our key space is {0, 1} to the 128. So 128 bit strings and our
-
hash function will map pairs of group
elements into this 128 byte strings.
-
Suppose it so happens that we chose a hash
function Such that it always outputs
-
strings that begin with zero. In other
words, of the 128 bit strings the most
-
significant bit of the output is always
zero. If we chose such a hash function,
-
does the Hash Diffie-Hellman assumption
hold for this pair, G comma H. So the
-
answer is no it doesn't hold. And the
reason is because it now very easy to
-
distinguish the two distributions that we
have here. The distribution on the left
-
an the distribution on the right. And the
way you would distinguish the two is
-
basically if I choose a truly random
element in K, in {0, 1} to the 128,
-
then mostly it can very well be zero with
probability one half. However, the tuple
-
that's given to me is chosen from the left
distribution, then the most significant
-
bit of the hash will always be zero and
therefore if I see zero, I'm gonna say the
-
distribution is a distribution on the
left. If I see one, I'm gonna say the
-
distribution is a distribution on the
right. That will give me advantage one
-
half in distinguishing these two
distributions. And so as a result, the
-
Hash Diffie-Hillman assumption is false in
this case. So clearly you could see that,
-
even though CDH might be hard in a certain
group G, if you choose a bad hash
-
function, then HDH will not hold for the
pair G comma H. Okay. But suppose it so
-
happens that we choose a group G and a
hash function H for which the Hash Diffie
-
Hellman assumption holds. And in fact, if
you choose the hash function to be just
-
SHA-256, for all we know, the Hash
Diffie Hellman assumption holds in the
-
group ZP star, and holds in elliptic curve
groups. My goal then is to show you that
-
ElGamal is semantically secure under
the Hash Diffie-Hellman assumption. So let me
-
quickly remind you how theElGamal
system works. While we're gonna choose a
-
random generator G, we're gonna choose a
random 'a' in ZN, the public key is
-
gonna be G, and G to the A, the secret key
is simply gonna be A. And now here I wrote
-
shorthand for the ElGamal encryption.
Basically, what to encrypt, what we do is
-
we choose a random B. We hash G as being H
to the B. Remember this H to the B is the
-
value G to the AB. This is the Diffie
Hellman secret. The hash function gave us
-
a symmetric encryption key K. We encrypt
the message with K, and we output G to the
-
B and the symmetric cipher text. To
decrypt we have to value U, and the Diffie
-
Hellman secret, G to the A. To derive a
symmetric key, we decrypt the cipher text.
-
And then we output the plaintext message m. So let's see how to argue that,
-
in fact, ElGamal is semantically secure under
this Hash Diffie-Hillman assumption is
-
fairly simple. So well we have to argue,
remember we had, in semantic security, we
-
have these two experiments. One
experiment, the attacker is given the
-
encryption of m0. In the other experiment,
the attacker is given the encryption of m1.
-
So I wrote the two experiments here. Here
you notice that the attacker starts by
-
sending off the public key to the
adversary. The adversary then chooses two
-
equal length messages m0 and m1. And then
if he gets the ElGamal encryption of M0,
-
and a kind of rough shorthand for what
ElGamal ciphertext is, okay? Similarly, in
-
encryption one, he simply gets the
encryption of M1 instead of M0, and
-
everything else is the same about these
two experiments. Now, because of the Hash
-
Diffie-Hellman assumption, we know that
even though the attacker got to see G, G
-
to the a and G to the b, we know that the
output of the hash of G to the b, G to the
-
ab is indistinguishable from random.
Therefore, if we replace the actual hash
-
function by a truly generated random key K
that's independent of everything else, by
-
the Hash Diffie-Hellman assumption, the
attacker can't distinguish these two
-
games. And again, it's a simple exercise
for you to show that if he could
-
distinguish these two games, then he would
break the Hash Diffie-Hellman assumption.
-
But then the same is true for experiment
one. The attacker also could not
-
distinguish the output of the hash
function from a truly random key, that was
-
used the generate the challenge cipher
text. Okay. So now basically we look at
-
these two experiments and we realize that,
wait a minute, what's going on in these
-
two experiments? Basically everything is
the same except in one experiment, the
-
attacker gets the encryption under
a symmetric encryption system of m0. In the
-
other one, he gets the encryption under a
symmetric encryption system of m1 and the
-
key is chosen at random independently over
everything else. So by the fact that the
-
symmetric encryption system is
semantically secure, these two games are
-
indistinguishable. If they were
distinguishable, then the attacker could
-
break the semantic security of this
symmetric encryption system. So now we
-
kinda prove this, you know, this chain of
equivalences. And that proves that the
-
original games, in fact, are
indistinguishable, computationally
-
indistinguishable. And therefore, the
ElGamal system is semantically secure.
-
Okay so it's like I said a very simple
proof by pictures and you can make this
-
into a rigorous proof without too much
work. But semantic security isn't enough,
-
what we really want is chosen ciphertext
security, and the question is ElGamal chosen ciphertext
-
secure? So it turns out to prove the
chosen ciphertext security of ElGamal we
-
actually need a stronger assumption, Hash Diffie-Hellman or Computational Diffie-Hellman
-
are actually not enough to prove
chosen ciphertext security of the system as far
-
as we know. So to prove chosen-ciphertext
security, I'm going to introduce a new
-
assumption called Interactive Diffie
Hellman assumption. And actually,
-
technically we really don't like this
assumption. And we're going to try to get
-
rid of this, later on. But for now, we
just want to analyze the security of the
-
basic ElGamal system against chosen
ciphertext attack. So to argue that it is
-
chosen ciphertext secure, here is what the
assumption says. Basically the challenger
-
starts, you know, plays a game with the
adversary, he generates a random G,
-
generates two exponents, and then he says
to the adversary as usual, G, G to the a
-
and G to the b. Now the adversary's goal
is basically to figure out the
-
Diffie-Helmman's secrets, mainly g to the
ab. He outputs the value of V and he wins
-
the game if V happens to be equal to G to
the AB. So, so far this looks identical to
-
the Computational Diffie-Hellman assumption,
there's no difference - this is
-
the Computational Diffie-Hellman assumption
except in Interactive Diffie-Hellman would
-
give the attacker a little bit more power.
So because the attacker has more power,
-
it's harder to satisfy the assumption,
which is why we say that this assumption
-
is stronger than Computational
Diffie-Hellman. Now what is this power to be
-
given? We are given the ability to make
queries. In particular, he gets to submit
-
two elements from the group G, so U1, V1
disappear from the group G. And then he's
-
told whether U1 to the A is equal to V1,
so he gets one. If you wanted the A equals
-
to V1 get zero, otherwise it's kind of a
bizarre type of query. What, how does it
-
be possibly help the attacker? But it
turns out we need to be able to answer
-
these types of queries to the adversary in
order to be able to prove chosen
-
ciphertext security. And in fact, he can
do these type of queries again and again
-
and again. So he can issue as many
queries like these as he wants and then the
-
assumption says that despite all these
queries he still can't figure out the
-
Diffie-Hellman secret, namely despite
making all these queries, the probability
-
that he outputs the
Diffie-Hellman secret is negligible. Okay.
-
So clearly if this assumption holds, that the Computational Diffie-Hellman assumption
-
holds, because here, the adversary has more power,
and as a result we say that this assumption
-
is stronger that Computational Diffie-Hellman, the thing,
we really don't like about this
-
assumption, is the fact, that, it's
interactive, and that the adversary is allowed to
-
make queries to the challenger, and as I
said, we're gonna trying to get rid
-
of this interaction later on. But for now
I'll tell you that under this Interactive
-
Diffie-Hellman assumption and under the
assumption that the symmetric encryption
-
system provides authenticated encryption, and
under the assumption that the hash
-
function is kind of an ideal hash
function, what we call the random oracle,
-
then in fact the ElGamal system is chosen ciphertext secure and again this
-
RO here denotes the fact that it's in the
random oracle model. Which is not
-
important, so much for our purposes. The
main thing to remember that under
-
kind of this weird, interactive
assumption we can actually prove that
-
ElGamal is a chosen ciphertext secure.
And as far as we know this assumption
-
actually holds for the group ZP star.
So what we're gonna do next is basically
-
answer this question of whether we can get
rid of the interactive assumption. Can we
-
prove security strictly based on CDH? And
similarly can we proof security without
-
relying on random oracles? Just you know
without having to assume, that the hash
-
function is ideal. Just you know, can we
prove security using a concrete hash
-
function. And we'll do that very briefly
in the next segment.