ElGamal Security (14 min)

Edit subtitles

0:00 - 0:03

In this segment, we're gonna study the
security of the ElGamal public key
0:03 - 0:07

encryption system. So let me remind you
that when we first presented the
0:07 - 0:11

Diffie-Hellman protocol, we said that the
security is based on the assumption that
0:11 - 0:15

says that given G, G to the A, G to the B,
it's difficult to compute the
0:15 - 0:19

Diffie-Hellman secret, G to the AB. This
is basically what the attacker has to do.
0:19 - 0:23

He sees Alice's contribution. He sees
Bob's contribution and then his goal is to
0:23 - 0:27

figure out what the Diffie-Hellman secret
is. And we said that this problem is
0:27 - 0:31

difficult. The statement that the problem
is difficult is what's called the
0:31 - 0:35

computational Diffie-Hellman assumption.
So, let's take this assumption more
0:35 - 0:40

precisely. So, as usual we're going to
look at a finite cyclic group of order N,
0:40 - 0:44

so G is some group, in your head you should be
thinking ZP star, but there could
0:44 - 0:48

be other groups, for example, like an ellipctic curve group. And then we say that
0:48 - 0:52

the computational Diffie-Hellman
assumption. I've often used the shorthand
0:52 - 0:56

CDH for Computational Diffie Hellman.
We'll say this assumption holds in G if
0:56 - 1:01

the following condition holds, namely for
all efficient algorithms. If we choose a
1:01 - 1:05

random generator, and then we choose
random exponents A, B in ZN. Then when
1:05 - 1:09

we give the algorithm G, G to the A, and G
to the B, the probability that the
1:09 - 1:14

algorithm will produce the Diffie Hellman
secret is negligible. If this is true for
1:14 - 1:18

all efficient algorithms, then we say that
the CDH assumption holds for G. CDH, as I
1:18 - 1:23

said, stands for Computational Diffie
Hellman. As it turns out, this assumption
1:23 - 1:28

is not ideal for analyzing the security of
the ElGamal system. And instead I'm gonna
1:28 - 1:32

go ahead and make a slightly stronger
assumption called a hash Diffie-Hellman
1:32 - 1:36

assumption. Okay. So what is hash
Diffie-Hellman assumption? Again, we are
1:36 - 1:40

focusing on a particular group G and now
we're also gonna introduce a hash function
1:40 - 1:45

H that maps pairs of elements in G into
the key space of some symmetric encryption
1:45 - 1:49

system. And then we say that the hash
Diffie-Hellman a ssumption, or HDH for
1:49 - 1:53

short, holds for this pair, G comma H for
the group in the hash function if the
1:53 - 1:58

following is true. Basically, if I choose
a random generator and then I choose
1:58 - 2:02

random exponents A and B in ZN and then I
also choose a random R and K, then the
2:02 - 2:06

following distributions are
computationally indistinguishable. That
2:06 - 2:10

is, if I give you G, G to the A, G to the
B, and then this hash value, this should
2:10 - 2:13

look familiar to you. This is the hash
value that's used in the ElGamal system to
2:13 - 2:17

derive the symmetric encryption key. What
we're saying is that this distribution is
2:17 - 2:21

indistinguishable from a distribution
where you're also given. G, G to the A, G
2:21 - 2:25

to the B. But now, instead of giving the
hash, you're given just really random key.
2:25 - 2:30

So what this assumption says is that the
symmetric key that was derived during
2:30 - 2:34

encryption in the ElGamal system,
essentially is indistinguishable from a
2:34 - 2:38

truly random key that is derived
independently from the rest of the
2:38 - 2:42

parameters of the system. It's a truly
independent random key, looks basically
2:42 - 2:47

the same as the key that we used, to
derive from the G to the A and the G to
2:47 - 2:52

the B. Now I'd like to point out that the
Hash Diffie-Hellman assumption is actually
2:52 - 2:56

a stronger assumption than the
Computational Diffie-Hellman assumption.
2:56 - 3:01

The way to see that is using the contra
positive, that is suppose the
3:01 - 3:06

Computational Diffie-Hellman assumption
happens to be easy in the group G. Then I
3:06 - 3:11

claim that in fact the Hash Diffie-Hellman
assumption is also easy in the group G. In
3:11 - 3:16

fact, I'll say for G, H and this is true
in fact for all hash functions where the
3:16 - 3:20

image of the hash function. It contains at
least two elements. In other words all I
3:20 - 3:25

want to say is that the Hash Diffie-Hellman assumption
and it's easy for all
3:25 - 3:29

hash functions to go match everything to a
single point. Which is true for almost all
3:29 - 3:33

hash functions of interest to us. So
actually, this is a really simple
3:33 - 3:37

statement to prove. Basically, if the
Computational Diffie-Hellman assumption is easy, what that
3:37 - 3:42

says is that, given G to the A and G to the B, I can actually calculate G to the AB
3:42 - 3:46

myself. Cuz I know the hash function H, I
can calculate this complete value here. So
3:46 - 3:50

if you give me a tuple that's
sampled from the left or sampled from the
3:50 - 3:55

right. I can very easily tell where it's
from. If it's sampled from the left, then
3:55 - 3:59

once I've calculated G to the AB myself, I
can just test that the last element in the
3:59 - 4:03

tuple is, in fact, the hash of G to
the B and G to the AB. If it is, I know
4:03 - 4:08

the sample is from the left. If it isn't,
I know the sample is from the right. Okay,
4:08 - 4:13

so this gives me pretty high advantage, in
distinguishing these two distributions. So
4:13 - 4:17

CDH is easy, it's very easy to see that
these distributions are distinguishable.
4:17 - 4:22

And this basically says that if the Hash
Diffie-Hellman is in fact hard, then CDH
4:22 - 4:27

must also be hard. Which then we say, that
therefore the Hash Diffie-Hellman is a
4:27 - 4:31

stronger assumption. There might be group
where CDH is hard, but HDH is not hard.
4:31 - 4:36

Okay. So we say HDH is a
stronger assumption. If you found this
4:36 - 4:40

discussion of weak assumption versus
strong assumption confusing, it's not
4:40 - 4:44

really that important, it's fine to ignore
it. The important thing that I want to
4:44 - 4:47

show you is in fact that the Hash
Diffie-Hellman assumption is sufficient to
4:47 - 4:52

prove the semantic security of the ElGamal
system. Before we do that let me quickly
4:52 - 4:56

ask you the following puzzle just to make
sure the Hash Diffie-Hellman assumption is
4:56 - 5:00

clear. So imagine our key space is {0, 1} to the 128. So 128 bit strings and our
5:00 - 5:04

hash function will map pairs of group
elements into this 128 byte strings.
5:04 - 5:08

Suppose it so happens that we chose a hash
function Such that it always outputs
5:08 - 5:13

strings that begin with zero. In other
words, of the 128 bit strings the most
5:13 - 5:18

significant bit of the output is always
zero. If we chose such a hash function,
5:18 - 5:24

does the Hash Diffie-Hellman assumption
hold for this pair, G comma H. So the
5:24 - 5:27

answer is no it doesn't hold. And the
reason is because it now very easy to
5:27 - 5:31

distinguish the two distributions that we
have here. The distribution on the left
5:31 - 5:35

an the distribution on the right. And the
way you would distinguish the two is
5:35 - 5:40

basically if I choose a truly random
element in K, in {0, 1} to the 128,
5:40 - 5:44

then mostly it can very well be zero with
probability one half. However, the tuple
5:44 - 5:49

that's given to me is chosen from the left
distribution, then the most significant
5:49 - 5:53

bit of the hash will always be zero and
therefore if I see zero, I'm gonna say the
5:53 - 5:57

distribution is a distribution on the
left. If I see one, I'm gonna say the
5:57 - 6:01

distribution is a distribution on the
right. That will give me advantage one
6:01 - 6:05

half in distinguishing these two
distributions. And so as a result, the
6:05 - 6:10

Hash Diffie-Hillman assumption is false in
this case. So clearly you could see that,
6:10 - 6:14

even though CDH might be hard in a certain
group G, if you choose a bad hash
6:14 - 6:18

function, then HDH will not hold for the
pair G comma H. Okay. But suppose it so
6:18 - 6:23

happens that we choose a group G and a
hash function H for which the Hash Diffie
6:23 - 6:27

Hellman assumption holds. And in fact, if
you choose the hash function to be just
6:27 - 6:32

SHA-256, for all we know, the Hash
Diffie Hellman assumption holds in the
6:32 - 6:36

group ZP star, and holds in elliptic curve
groups. My goal then is to show you that
6:36 - 6:40

ElGamal is semantically secure under
the Hash Diffie-Hellman assumption. So let me
6:40 - 6:44

quickly remind you how theElGamal
system works. While we're gonna choose a
6:44 - 6:48

random generator G, we're gonna choose a
random 'a' in ZN, the public key is
6:48 - 6:52

gonna be G, and G to the A, the secret key
is simply gonna be A. And now here I wrote
6:52 - 6:56

shorthand for the ElGamal encryption.
Basically, what to encrypt, what we do is
6:56 - 7:02

we choose a random B. We hash G as being H
to the B. Remember this H to the B is the
7:02 - 7:06

value G to the AB. This is the Diffie
Hellman secret. The hash function gave us
7:06 - 7:11

a symmetric encryption key K. We encrypt
the message with K, and we output G to the
7:11 - 7:15

B and the symmetric cipher text. To
decrypt we have to value U, and the Diffie
7:15 - 7:19

Hellman secret, G to the A. To derive a
symmetric key, we decrypt the cipher text.
7:19 - 7:23

And then we output the plaintext message m. So let's see how to argue that,
7:23 - 7:28

in fact, ElGamal is semantically secure under
this Hash Diffie-Hillman assumption is
7:28 - 7:32

fairly simple. So well we have to argue,
remember we had, in semantic security, we
7:32 - 7:36

have these two experiments. One
experiment, the attacker is given the
7:36 - 7:40

encryption of m0. In the other experiment,
the attacker is given the encryption of m1.
7:40 - 7:44

So I wrote the two experiments here. Here
you notice that the attacker starts by
7:44 - 7:48

sending off the public key to the
adversary. The adversary then chooses two
7:48 - 7:53

equal length messages m0 and m1. And then
if he gets the ElGamal encryption of M0,
7:53 - 7:57

and a kind of rough shorthand for what
ElGamal ciphertext is, okay? Similarly, in
7:57 - 8:02

encryption one, he simply gets the
encryption of M1 instead of M0, and
8:02 - 8:06

everything else is the same about these
two experiments. Now, because of the Hash
8:06 - 8:11

Diffie-Hellman assumption, we know that
even though the attacker got to see G, G
8:11 - 8:16

to the a and G to the b, we know that the
output of the hash of G to the b, G to the
8:16 - 8:21

ab is indistinguishable from random.
Therefore, if we replace the actual hash
8:21 - 8:26

function by a truly generated random key K
that's independent of everything else, by
8:26 - 8:31

the Hash Diffie-Hellman assumption, the
attacker can't distinguish these two
8:31 - 8:35

games. And again, it's a simple exercise
for you to show that if he could
8:35 - 8:39

distinguish these two games, then he would
break the Hash Diffie-Hellman assumption.
8:39 - 8:43

But then the same is true for experiment
one. The attacker also could not
8:43 - 8:47

distinguish the output of the hash
function from a truly random key, that was
8:47 - 8:51

used the generate the challenge cipher
text. Okay. So now basically we look at
8:51 - 8:55

these two experiments and we realize that,
wait a minute, what's going on in these
8:55 - 8:59

two experiments? Basically everything is
the same except in one experiment, the
8:59 - 9:03

attacker gets the encryption under
a symmetric encryption system of m0. In the
9:03 - 9:07

other one, he gets the encryption under a
symmetric encryption system of m1 and the
9:07 - 9:11

key is chosen at random independently over
everything else. So by the fact that the
9:11 - 9:14

symmetric encryption system is
semantically secure, these two games are
9:14 - 9:18

indistinguishable. If they were
distinguishable, then the attacker could
9:18 - 9:22

break the semantic security of this
symmetric encryption system. So now we
9:22 - 9:26

kinda prove this, you know, this chain of
equivalences. And that proves that the
9:26 - 9:29

original games, in fact, are
indistinguishable, computationally
9:29 - 9:33

indistinguishable. And therefore, the
ElGamal system is semantically secure.
9:33 - 9:38

Okay so it's like I said a very simple
proof by pictures and you can make this
9:38 - 9:42

into a rigorous proof without too much
work. But semantic security isn't enough,
9:42 - 9:46

what we really want is chosen ciphertext
security, and the question is ElGamal chosen ciphertext
9:46 - 9:50

secure? So it turns out to prove the
chosen ciphertext security of ElGamal we
9:50 - 9:55

actually need a stronger assumption, Hash Diffie-Hellman or Computational Diffie-Hellman
9:55 - 9:59

are actually not enough to prove
chosen ciphertext security of the system as far
9:59 - 10:03

as we know. So to prove chosen-ciphertext
security, I'm going to introduce a new
10:03 - 10:06

assumption called Interactive Diffie
Hellman assumption. And actually,
10:06 - 10:10

technically we really don't like this
assumption. And we're going to try to get
10:10 - 10:15

rid of this, later on. But for now, we
just want to analyze the security of the
10:15 - 10:19

basic ElGamal system against chosen
ciphertext attack. So to argue that it is
10:19 - 10:24

chosen ciphertext secure, here is what the
assumption says. Basically the challenger
10:24 - 10:28

starts, you know, plays a game with the
adversary, he generates a random G,
10:28 - 10:32

generates two exponents, and then he says
to the adversary as usual, G, G to the a
10:32 - 10:37

and G to the b. Now the adversary's goal
is basically to figure out the
10:37 - 10:41

Diffie-Helmman's secrets, mainly g to the
ab. He outputs the value of V and he wins
10:41 - 10:45

the game if V happens to be equal to G to
the AB. So, so far this looks identical to
10:45 - 10:49

the Computational Diffie-Hellman assumption,
there's no difference - this is
10:49 - 10:53

the Computational Diffie-Hellman assumption
except in Interactive Diffie-Hellman would
10:53 - 10:57

give the attacker a little bit more power.
So because the attacker has more power,
10:57 - 11:02

it's harder to satisfy the assumption,
which is why we say that this assumption
11:02 - 11:06

is stronger than Computational
Diffie-Hellman. Now what is this power to be
11:06 - 11:11

given? We are given the ability to make
queries. In particular, he gets to submit
11:11 - 11:16

two elements from the group G, so U1, V1
disappear from the group G. And then he's
11:16 - 11:20

told whether U1 to the A is equal to V1,
so he gets one. If you wanted the A equals
11:20 - 11:24

to V1 get zero, otherwise it's kind of a
bizarre type of query. What, how does it
11:24 - 11:28

be possibly help the attacker? But it
turns out we need to be able to answer
11:28 - 11:32

these types of queries to the adversary in
order to be able to prove chosen
11:32 - 11:36

ciphertext security. And in fact, he can
do these type of queries again and again
11:36 - 11:39

and again. So he can issue as many
queries like these as he wants and then the
11:39 - 11:43

assumption says that despite all these
queries he still can't figure out the
11:43 - 11:47

Diffie-Hellman secret, namely despite
making all these queries, the probability
11:47 - 11:51

that he outputs the
Diffie-Hellman secret is negligible. Okay.
11:51 - 11:56

So clearly if this assumption holds, that the Computational Diffie-Hellman assumption
11:56 - 11:59

holds, because here, the adversary has more power,
and as a result we say that this assumption
11:59 - 12:03

is stronger that Computational Diffie-Hellman, the thing,
we really don't like about this
12:03 - 12:06

assumption, is the fact, that, it's
interactive, and that the adversary is allowed to
12:06 - 12:10

make queries to the challenger, and as I
said, we're gonna trying to get rid
12:10 - 12:14

of this interaction later on. But for now
I'll tell you that under this Interactive
12:14 - 12:19

Diffie-Hellman assumption and under the
assumption that the symmetric encryption
12:19 - 12:23

system provides authenticated encryption, and
under the assumption that the hash
12:23 - 12:27

function is kind of an ideal hash
function, what we call the random oracle,
12:27 - 12:31

then in fact the ElGamal system is chosen ciphertext secure and again this
12:31 - 12:35

RO here denotes the fact that it's in the
random oracle model. Which is not
12:35 - 12:39

important, so much for our purposes. The
main thing to remember that under
12:39 - 12:43

kind of this weird, interactive
assumption we can actually prove that
12:43 - 12:47

ElGamal is a chosen ciphertext secure.
And as far as we know this assumption
12:47 - 12:51

actually holds for the group ZP star.
So what we're gonna do next is basically
12:51 - 12:56

answer this question of whether we can get
rid of the interactive assumption. Can we
12:56 - 13:00

prove security strictly based on CDH? And
similarly can we proof security without
13:00 - 13:04

relying on random oracles? Just you know
without having to assume, that the hash
13:04 - 13:08

function is ideal. Just you know, can we
prove security using a concrete hash
13:08 - 13:12

function. And we'll do that very briefly
in the next segment.

Title:: ElGamal Security (14 min)
Video Language:: English

	vickymnit edited English subtitles for ElGamal Security (14 min)
	vickymnit edited English subtitles for ElGamal Security (14 min)
	vickymnit edited English subtitles for ElGamal Security (14 min)
	rodolfojcj edited English subtitles for ElGamal Security (14 min)
	rodolfojcj edited English subtitles for ElGamal Security (14 min)
	rodolfojcj edited English subtitles for ElGamal Security (14 min)
	stanford-bot edited English subtitles for ElGamal Security (14 min)
	stanford-bot edited English subtitles for ElGamal Security (14 min)

Show all

English subtitles

Revisions

Revision 3

vickymnit

ElGamal Security (14 min)

Revisions

Our website uses cookies

Operating cookies (Required)