WEBVTT

00:00:00.000 --> 00:00:03.574
In the last segment, we saw that the
ElGamal public key encryption system is

00:00:03.574 --> 00:00:07.525
chosen cipher text secure under a somewhat
strange assumption. In this segment, we're

00:00:07.525 --> 00:00:11.146
gonna look at variants of ElGamal that
have a much better chosen cipher text

00:00:11.146 --> 00:00:14.909
security analysis. Now, I should say that
over the past decade, there's been a ton

00:00:14.909 --> 00:00:18.766
of research on constructing, public key
encryptions that are chosen cipher text

00:00:18.766 --> 00:00:22.387
secure. I actually debated for some time
on how to best present this here. And

00:00:22.387 --> 00:00:26.197
finally, I decided to kind of show you a
survey of the main results from the last

00:00:26.197 --> 00:00:29.960
decades, specifically as they apply to the
ElGamal system. And then, at the end of

00:00:29.960 --> 00:00:33.770
the module, I suggest a number of papers
that you can look at for further reading.

00:00:33.770 --> 00:00:38.332
So let me start by reminding you how the
ElGamal encryption system works. I'm sure

00:00:38.332 --> 00:00:42.783
by now you all remember how ElGamal works,
but just to be safe, let me remind you

00:00:42.783 --> 00:00:47.623
that key generation in ElGamal picks a
random generator, a random exponent from ZN

00:00:47.623 --> 00:00:51.963
and then the public key is simply the
generator and this element g to the a,

00:00:51.963 --> 00:00:56.332
whereas the secret key is simply the
discrete log of h base g. This value A.

00:00:56.332 --> 00:01:01.255
Now the way we encrypt is we pick a random
exponent B from ZN. We compute the hash of

00:01:01.255 --> 00:01:05.947
G to the B and H to the B. And I'm gonna
remind you that H to the B is the Diffie

00:01:05.947 --> 00:01:10.233
Hellman secret, G to the AB. And then we
actually encrypt a message using a

00:01:10.233 --> 00:01:15.156
symmetric encryption system with the key K
that's derived from the hash function. And

00:01:15.156 --> 00:01:19.731
finally, the output cipher text is G to
the B, and the symmetric cipher text. The

00:01:19.731 --> 00:01:24.654
way we decrypt is you know, as we've seen
before basically, by hashing U and the

00:01:24.654 --> 00:01:28.940
Diffie Hellman secrets, decrypting the
symmetric system, and outputting the

00:01:28.940 --> 00:01:33.645
message M. Now in the last segment we said
that ElGamal is chosen ciphertext secure under

00:01:33.645 --> 00:01:37.881
this funny Interactive Diffie-Hellman
assumption. I am not gonna remind you what

00:01:37.881 --> 00:01:42.447
the assumption is here but I'll also say
that this theorem kind of raises two very

00:01:42.447 --> 00:01:46.683
natural questions. The first question is
can we prove CCA security of

00:01:46.683 --> 00:01:50.864
ElGamal just based on the standard
Computational Diffie-Hellman assumption,

00:01:50.864 --> 00:01:55.011
namely the G to the A, G to the B, you
can't compute G to the AB. Can we prove

00:01:55.011 --> 00:01:59.259
chosen-ciphertext security just based on
that? And the truth's that there are two

00:01:59.259 --> 00:02:03.454
ways to do it. The first option is to use
a group where the computational Diffie

00:02:03.454 --> 00:02:07.306
Hellman assumption is hard. But is
provably equivalent to the Interactive

00:02:07.306 --> 00:02:11.227
Diffie Hellman assumption. And it turns
out it's actually not that hard to

00:02:11.227 --> 00:02:15.147
construct such groups. These groups are
called bilinear groups. And in such

00:02:15.147 --> 00:02:19.544
groups, we know that the ElGamal system is
chosen cipher text secure, strictly based

00:02:19.544 --> 00:02:23.782
under the Computational Diffie Hellman
assumption, at least in the random oracle

00:02:23.782 --> 00:02:28.983
model. I'll tell you that these bi-linear
groups are actually constructed from very

00:02:28.983 --> 00:02:33.715
special elliptic curves. So there's a bit
more algebraic structure that enables us

00:02:33.715 --> 00:02:38.404
to prove this equivalence of IDH and CDH.
But in general, who knows, maybe you don't

00:02:38.404 --> 00:02:42.928
want to use elliptic curve groups, maybe
you want to use ZP star for some reason.

00:02:42.928 --> 00:02:47.817
So it's a pretty natural question to ask.
Can we change the ElGamal system such that

00:02:47.817 --> 00:02:52.828
chosen ciphertext security of ElGamal now can be proven, directly based on CDH, for any group

00:02:52.828 --> 00:02:57.840
where CDH is hard? [Now with that ??] assuming
any additional structure about the group,

00:02:57.840 --> 00:03:02.132
And it turns out the answer is yes. And
there's kind of an elegant construction

00:03:02.132 --> 00:03:06.696
called twin ElGamal, so let me show you
how twin ElGamal works. It's a very simple

00:03:06.696 --> 00:03:10.607
variation of ElGamal that does the
following. Again, in key generation, we

00:03:10.607 --> 00:03:14.954
choose a random generator. But this time,
we're gonna choose two exponents, A1 and

00:03:14.954 --> 00:03:19.409
A2 as the secret key. So the secret key is
gonna consist of these two exponents, A1

00:03:19.409 --> 00:03:23.783
and A2. You know the public key of course
is gonna consist of our generator. And

00:03:23.783 --> 00:03:28.340
then we're gonna release G to the A1 and G
to the A2. So remember that in regular

00:03:28.340 --> 00:03:32.840
ElGamal what the public key is simply g
to the A and that's it. Here we have two

00:03:32.840 --> 00:03:37.340
exponents A1 and A2 and therefore we, we
release both G to the A1 and G to the A2.

00:03:37.340 --> 00:03:42.297
Now the way we encrypt, you'll notice the
public key here is one element longer than

00:03:42.297 --> 00:03:47.137
regular ElGamal. The way we encrypt using
this public key is actually very similar

00:03:47.137 --> 00:03:51.859
to regular ElGamal. We choose a random B,
and now what we'll hash is actually not

00:03:51.859 --> 00:03:56.522
just G to the B and H1 to the B, but,
in fact, G to the B, H1 to the B, and H2

00:03:56.522 --> 00:04:01.691
to the B. Okay so we basically hashing
three elements instead of two elements and

00:04:01.691 --> 00:04:06.642
then we basically encrypt the message
using the derived symmetric encryption key

00:04:06.642 --> 00:04:11.410
and as usual we output g to the b and c as our
final ciphertext. How do we decrypt?

00:04:11.410 --> 00:04:16.027
Well, so now the secret key consists of
these two exponents, A1 and A2. And the

00:04:16.027 --> 00:04:20.584
cipher text consists of U, and the
symmetric cipher text C. So let me ask you

00:04:20.584 --> 00:04:25.501
a puzzle, and see if you can figure out
how to derive the symmetric encryption key

00:04:25.501 --> 00:04:31.934
K, just given the secret key and the value
U. So it's kind of a cute puzzle and I

00:04:31.934 --> 00:04:37.250
hope everybody worked out, the solution
which is basically what we can do is we

00:04:37.250 --> 00:04:42.566
can take U to the power of A1, and that is
basically G to the B A1 And U to the A2

00:04:42.566 --> 00:04:48.012
which is G to the B A2. And that basically
gives us exactly the same values, as H1 to

00:04:48.012 --> 00:04:53.069
the B and H2 to the B. So this way the
decryptor arrives at the same symmetric

00:04:53.069 --> 00:04:58.515
key that the encryptor did. Then he decrypts
the ciphertext using the symmetric system

00:04:58.515 --> 00:05:03.389
and finally outputs the message M. So you
notice this is a very simple modification

00:05:03.389 --> 00:05:07.802
to regular ElGamal. All we do is we stick
one more element to the public key. When

00:05:07.802 --> 00:05:11.888
we do the hashing, we hash one more
element, as opposed to just two elements.

00:05:11.888 --> 00:05:16.246
We hash three elements. And similarly, we
do doing decryption, and nothing else

00:05:16.246 --> 00:05:20.491
changes. The cipher text is the same
length as before, and that's it, Now the

00:05:20.491 --> 00:05:24.647
amazing thing is that this simple
modification allows us to now prove chosen

00:05:24.647 --> 00:05:28.640
cipher text security directly based on
standard Computational Diffie-Hellman

00:05:28.640 --> 00:05:32.795
assumption, okay? Still we're going to
need to assume that we have a symmetric

00:05:32.795 --> 00:05:36.897
encryption system that provides us
authenticated encryption and that the hash

00:05:36.897 --> 00:05:41.430
function that we're using is an ideal hash
function in what we call a random oracle

00:05:41.430 --> 00:05:45.747
But nevertheless given that, we can
prove chosen cipher text security strictly

00:05:45.747 --> 00:05:49.803
based on a Computational Diffie-Hellman
assumption. So now what's the cost of this?

00:05:49.803 --> 00:05:53.966
Of course, if you think about this, during
encryption and decryption, encryption has

00:05:53.966 --> 00:05:57.418
to do one more exponentiation, and the
decryption has to do one more

00:05:57.418 --> 00:06:01.581
exponentiation. So the encryptor now does
three exponentiations instead of two, and

00:06:01.581 --> 00:06:05.490
the decryptor does two exponentiations
instead of one. So the question is now,

00:06:05.490 --> 00:06:09.965
now it's a philosophical question. Is this
extra effort worth it? So you do more work

00:06:09.965 --> 00:06:14.228
during encryption and decryption. And your
public key is a little bit bigger, but

00:06:14.228 --> 00:06:18.331
that doesn't really matter. The work
during encryption and decryption is more

00:06:18.331 --> 00:06:22.434
significant. And as a result you get
chosen ciphertext security based on kind

00:06:22.434 --> 00:06:26.644
of a more natural assumption, namely
Computational Diffie-Hellman as opposed to

00:06:26.644 --> 00:06:30.480
these odd-looking Interactive
Diffie-Hellman assumption. But is it worth

00:06:30.480 --> 00:06:34.743
it? Kind of the question is are there
groups where CDH holds but IDH does not

00:06:34.743 --> 00:06:38.815
hold? If there were such groups, then it
would definitely be worth it, because

00:06:38.815 --> 00:06:43.050
those groups, the twin ElGamal would be
secure, but the regular ElGamal would not

00:06:43.050 --> 00:06:47.508
be CCA secure. But unfortunately we don't
know if there is any such group and in

00:06:47.508 --> 00:06:51.383
fact as far as we know, it's certainly
possible that any group where CDH holds,

00:06:51.383 --> 00:06:55.307
IDH also holds. So the answer is, really,
we don't know whether the extra cost is

00:06:55.307 --> 00:06:59.530
worth it or not but then nevertheless it's
a cute result that shows that if you want

00:06:59.530 --> 00:07:03.256
to achieve chosen ciphertext
security directly from CDH, you could do

00:07:03.256 --> 00:07:08.051
it without too many changes to the ElGamal
system. The next question is whether we

00:07:08.051 --> 00:07:12.230
can get rid of this assumption that the
hash function is an ideal hash function

00:07:12.230 --> 00:07:16.617
mainly that it's a random oracle. And this
is actually a huge topic. There's a lot of

00:07:16.617 --> 00:07:20.482
work in this area on how to build
efficient public key encryption systems

00:07:20.482 --> 00:07:24.922
that are chosen ciphertext secure without
random oracles. This is a very active area

00:07:24.922 --> 00:07:29.192
of research as I said in the past decade
and even longer. And I'll mention that as

00:07:29.192 --> 00:07:33.379
it applies to ElGamal, they are basically,
again two families of constructions. The

00:07:33.379 --> 00:07:37.515
first construction's a construction that
uses these special groups called these

00:07:37.515 --> 00:07:41.599
bilinear groups that I just mentioned
before. It turns out the extra structure

00:07:41.599 --> 00:07:45.682
of these bilinear groups allows us to
build very efficient chosen ciphertext

00:07:45.682 --> 00:07:50.128
secure systems without referring to random
oracles and as I said at the end of the

00:07:50.128 --> 00:07:54.367
module, I point to a number of papers that
explain how to do that. This is quite an

00:07:54.367 --> 00:07:58.876
interesting construction. But it will take
me several hours to kind of explain how it

00:07:58.876 --> 00:08:03.434
works. The other alternative is actually
to use groups where a stronger assumption

00:08:03.434 --> 00:08:07.830
called the Decision Diffie-Hellman assumption
holds. Again, I am not gonna define this

00:08:07.830 --> 00:08:11.798
assumption here. I'll just tell you that
this assumption actually holds in

00:08:11.798 --> 00:08:16.141
subgroups of ZP star, in particular if we
look at the prime order of a subgroup of

00:08:16.141 --> 00:08:19.812
ZP star. The Decision Diffie-Hellman
assumption seems to hold in those groups

00:08:19.812 --> 00:08:23.852
and then in those groups we can actually
build a variant of ElGamal, called the

00:08:23.852 --> 00:08:27.141
Cramer Shoup system that is chosen
ciphertext secure and the Decision

00:08:27.141 --> 00:08:30.665
Diffie-Hellman assumption without random
oracles. Again, this is a beautiful,

00:08:30.665 --> 00:08:34.659
beautiful result but again it would take a
couple of hours to explain and so I'm not

00:08:34.659 --> 00:08:38.324
gonna do that here. This is probably one
of these things that I gonna leave to

00:08:38.324 --> 00:08:42.083
either the advanced topics or to do an
advanced course so that we'll do it at a

00:08:42.083 --> 00:08:46.335
later time. But again I points to a paper
at the end of the module just in case you

00:08:46.335 --> 00:08:50.626
want to read more about this. So here is a
list of papers that provides more reading

00:08:50.626 --> 00:08:54.208
material. So if you wanna read about
Diffie Hellman assumptions, I guess I

00:08:54.208 --> 00:08:58.036
wrote a paper about this a long time ago,
that talks about various assumptions

00:08:58.036 --> 00:09:01.716
related to, to Diffie Hellman. And in
particular, studies the Decision Diffie

00:09:01.716 --> 00:09:05.685
Hellman assumption. And if you wanna learn
about how to build chosen ciphertext

00:09:05.685 --> 00:09:10.098
secure system from Decision Diffie-Hellman
and assumptions like it. There's a very

00:09:10.098 --> 00:09:14.563
cute paper by Kramer and Shoup back
from 2002 that shows how to do it. This is

00:09:14.563 --> 00:09:18.618
again a very highly recommended paper. If
you want to learn how to build chosen

00:09:18.618 --> 00:09:22.672
ciphertext security from these
bilinear groups, then the paper to read is

00:09:22.672 --> 00:09:26.983
the one, cited here, which actually uses a
general mechanism called Identity Based

00:09:26.983 --> 00:09:30.884
Encryption which very surprisingly it
turns out to actually gives us chosen

00:09:30.884 --> 00:09:34.638
ciphertext security almost for free.
So, once we build identity-based

00:09:34.638 --> 00:09:38.486
encryption chosen ciphertext security falls
immediately. And that's covered in this

00:09:38.486 --> 00:09:42.283
paper that I, that I describe her. The
Twin Diffie-Hellman construction and its

00:09:42.283 --> 00:09:46.381
proof is described in this paper that I
reference here And finally, if you kind of

00:09:46.381 --> 00:09:50.135
want to see a very recent paper. That
gives a very general framework for

00:09:50.135 --> 00:09:54.345
building, chosen ciphertext secure systems, using
what's called extractable hash proofs that

00:09:54.345 --> 00:09:58.506
is again a nice paper by Hoeteck Wee, that
was just recently published. I definitely

00:09:58.506 --> 00:10:02.416
recommend reading it all, all have
very, very elegant ideas in them. I wish I

00:10:02.416 --> 00:10:06.426
could cover all of them in the basic
course but I'm gonna have to leave some of

00:10:06.426 --> 00:10:10.436
these to the more advanced course or
perhaps the more advanced topics at the

00:10:10.436 --> 00:10:14.446
end of this course. Okay, so I'll stop
here and in the next segment I'm gonna tie

00:10:14.446 --> 00:10:18.506
ElGamal and RSA encryption
together so that you see how the two kind

00:10:18.506 --> 00:10:20.512
of follow from a more general principle.