1
00:00:00,000 --> 00:00:03,574
In the last segment, we saw that the
ElGamal public key encryption system is

2
00:00:03,574 --> 00:00:07,525
chosen cipher text secure under a somewhat
strange assumption. In this segment, we're

3
00:00:07,525 --> 00:00:11,146
gonna look at variants of ElGamal that
have a much better chosen cipher text

4
00:00:11,146 --> 00:00:14,909
security analysis. Now, I should say that
over the past decade, there's been a ton

5
00:00:14,909 --> 00:00:18,766
of research on constructing, public key
encryptions that are chosen cipher text

6
00:00:18,766 --> 00:00:22,387
secure. I actually debated for some time
on how to best present this here. And

7
00:00:22,387 --> 00:00:26,197
finally, I decided to kind of show you a
survey of the main results from the last

8
00:00:26,197 --> 00:00:29,960
decades, specifically as they apply to the
ElGamal system. And then, at the end of

9
00:00:29,960 --> 00:00:33,770
the module, I suggest a number of papers
that you can look at for further reading.

10
00:00:33,770 --> 00:00:38,332
So let me start by reminding you how the
ElGamal encryption system works. I'm sure

11
00:00:38,332 --> 00:00:42,783
by now you all remember how ElGamal works,
but just to be safe, let me remind you

12
00:00:42,783 --> 00:00:47,623
that key generation in ElGamal picks a
random generator, a random exponent from ZN

13
00:00:47,623 --> 00:00:51,963
and then the public key is simply the
generator and this element g to the a,

14
00:00:51,963 --> 00:00:56,332
whereas the secret key is simply the
discrete log of h base g. This value A.

15
00:00:56,332 --> 00:01:01,255
Now the way we encrypt is we pick a random
exponent B from ZN. We compute the hash of

16
00:01:01,255 --> 00:01:05,947
G to the B and H to the B. And I'm gonna
remind you that H to the B is the Diffie

17
00:01:05,947 --> 00:01:10,233
Hellman secret, G to the AB. And then we
actually encrypt a message using a

18
00:01:10,233 --> 00:01:15,156
symmetric encryption system with the key K
that's derived from the hash function. And

19
00:01:15,156 --> 00:01:19,731
finally, the output cipher text is G to
the B, and the symmetric cipher text. The

20
00:01:19,731 --> 00:01:24,654
way we decrypt is you know, as we've seen
before basically, by hashing U and the

21
00:01:24,654 --> 00:01:28,940
Diffie Hellman secrets, decrypting the
symmetric system, and outputting the

22
00:01:28,940 --> 00:01:33,645
message M. Now in the last segment we said
that ElGamal is chosen ciphertext secure under

23
00:01:33,645 --> 00:01:37,881
this funny Interactive Diffie-Hellman
assumption. I am not gonna remind you what

24
00:01:37,881 --> 00:01:42,447
the assumption is here but I'll also say
that this theorem kind of raises two very

25
00:01:42,447 --> 00:01:46,683
natural questions. The first question is
can we prove CCA security of

26
00:01:46,683 --> 00:01:50,864
ElGamal just based on the standard
Computational Diffie-Hellman assumption,

27
00:01:50,864 --> 00:01:55,011
namely the G to the A, G to the B, you
can't compute G to the AB. Can we prove

28
00:01:55,011 --> 00:01:59,259
chosen-ciphertext security just based on
that? And the truth's that there are two

29
00:01:59,259 --> 00:02:03,454
ways to do it. The first option is to use
a group where the computational Diffie

30
00:02:03,454 --> 00:02:07,306
Hellman assumption is hard. But is
provably equivalent to the Interactive

31
00:02:07,306 --> 00:02:11,227
Diffie Hellman assumption. And it turns
out it's actually not that hard to

32
00:02:11,227 --> 00:02:15,147
construct such groups. These groups are
called bilinear groups. And in such

33
00:02:15,147 --> 00:02:19,544
groups, we know that the ElGamal system is
chosen cipher text secure, strictly based

34
00:02:19,544 --> 00:02:23,782
under the Computational Diffie Hellman
assumption, at least in the random oracle

35
00:02:23,782 --> 00:02:28,983
model. I'll tell you that these bi-linear
groups are actually constructed from very

36
00:02:28,983 --> 00:02:33,715
special elliptic curves. So there's a bit
more algebraic structure that enables us

37
00:02:33,715 --> 00:02:38,404
to prove this equivalence of IDH and CDH.
But in general, who knows, maybe you don't

38
00:02:38,404 --> 00:02:42,928
want to use elliptic curve groups, maybe
you want to use ZP star for some reason.

39
00:02:42,928 --> 00:02:47,817
So it's a pretty natural question to ask.
Can we change the ElGamal system such that

40
00:02:47,817 --> 00:02:52,828
chosen ciphertext security of ElGamal now can be proven, directly based on CDH, for any group

41
00:02:52,828 --> 00:02:57,840
where CDH is hard? [Now with that ??] assuming
any additional structure about the group,

42
00:02:57,840 --> 00:03:02,132
And it turns out the answer is yes. And
there's kind of an elegant construction

43
00:03:02,132 --> 00:03:06,696
called twin ElGamal, so let me show you
how twin ElGamal works. It's a very simple

44
00:03:06,696 --> 00:03:10,607
variation of ElGamal that does the
following. Again, in key generation, we

45
00:03:10,607 --> 00:03:14,954
choose a random generator. But this time,
we're gonna choose two exponents, A1 and

46
00:03:14,954 --> 00:03:19,409
A2 as the secret key. So the secret key is
gonna consist of these two exponents, A1

47
00:03:19,409 --> 00:03:23,783
and A2. You know the public key of course
is gonna consist of our generator. And

48
00:03:23,783 --> 00:03:28,340
then we're gonna release G to the A1 and G
to the A2. So remember that in regular

49
00:03:28,340 --> 00:03:32,840
ElGamal what the public key is simply g
to the A and that's it. Here we have two

50
00:03:32,840 --> 00:03:37,340
exponents A1 and A2 and therefore we, we
release both G to the A1 and G to the A2.

51
00:03:37,340 --> 00:03:42,297
Now the way we encrypt, you'll notice the
public key here is one element longer than

52
00:03:42,297 --> 00:03:47,137
regular ElGamal. The way we encrypt using
this public key is actually very similar

53
00:03:47,137 --> 00:03:51,859
to regular ElGamal. We choose a random B,
and now what we'll hash is actually not

54
00:03:51,859 --> 00:03:56,522
just G to the B and H1 to the B, but,
in fact, G to the B, H1 to the B, and H2

55
00:03:56,522 --> 00:04:01,691
to the B. Okay so we basically hashing
three elements instead of two elements and

56
00:04:01,691 --> 00:04:06,642
then we basically encrypt the message
using the derived symmetric encryption key

57
00:04:06,642 --> 00:04:11,410
and as usual we output g to the b and c as our
final ciphertext. How do we decrypt?

58
00:04:11,410 --> 00:04:16,027
Well, so now the secret key consists of
these two exponents, A1 and A2. And the

59
00:04:16,027 --> 00:04:20,584
cipher text consists of U, and the
symmetric cipher text C. So let me ask you

60
00:04:20,584 --> 00:04:25,501
a puzzle, and see if you can figure out
how to derive the symmetric encryption key

61
00:04:25,501 --> 00:04:31,934
K, just given the secret key and the value
U. So it's kind of a cute puzzle and I

62
00:04:31,934 --> 00:04:37,250
hope everybody worked out, the solution
which is basically what we can do is we

63
00:04:37,250 --> 00:04:42,566
can take U to the power of A1, and that is
basically G to the B A1 And U to the A2

64
00:04:42,566 --> 00:04:48,012
which is G to the B A2. And that basically
gives us exactly the same values, as H1 to

65
00:04:48,012 --> 00:04:53,069
the B and H2 to the B. So this way the
decryptor arrives at the same symmetric

66
00:04:53,069 --> 00:04:58,515
key that the encryptor did. Then he decrypts
the ciphertext using the symmetric system

67
00:04:58,515 --> 00:05:03,389
and finally outputs the message M. So you
notice this is a very simple modification

68
00:05:03,389 --> 00:05:07,802
to regular ElGamal. All we do is we stick
one more element to the public key. When

69
00:05:07,802 --> 00:05:11,888
we do the hashing, we hash one more
element, as opposed to just two elements.

70
00:05:11,888 --> 00:05:16,246
We hash three elements. And similarly, we
do doing decryption, and nothing else

71
00:05:16,246 --> 00:05:20,491
changes. The cipher text is the same
length as before, and that's it, Now the

72
00:05:20,491 --> 00:05:24,647
amazing thing is that this simple
modification allows us to now prove chosen

73
00:05:24,647 --> 00:05:28,640
cipher text security directly based on
standard Computational Diffie-Hellman

74
00:05:28,640 --> 00:05:32,795
assumption, okay? Still we're going to
need to assume that we have a symmetric

75
00:05:32,795 --> 00:05:36,897
encryption system that provides us
authenticated encryption and that the hash

76
00:05:36,897 --> 00:05:41,430
function that we're using is an ideal hash
function in what we call a random oracle

77
00:05:41,430 --> 00:05:45,747
But nevertheless given that, we can
prove chosen cipher text security strictly

78
00:05:45,747 --> 00:05:49,803
based on a Computational Diffie-Hellman
assumption. So now what's the cost of this?

79
00:05:49,803 --> 00:05:53,966
Of course, if you think about this, during
encryption and decryption, encryption has

80
00:05:53,966 --> 00:05:57,418
to do one more exponentiation, and the
decryption has to do one more

81
00:05:57,418 --> 00:06:01,581
exponentiation. So the encryptor now does
three exponentiations instead of two, and

82
00:06:01,581 --> 00:06:05,490
the decryptor does two exponentiations
instead of one. So the question is now,

83
00:06:05,490 --> 00:06:09,965
now it's a philosophical question. Is this
extra effort worth it? So you do more work

84
00:06:09,965 --> 00:06:14,228
during encryption and decryption. And your
public key is a little bit bigger, but

85
00:06:14,228 --> 00:06:18,331
that doesn't really matter. The work
during encryption and decryption is more

86
00:06:18,331 --> 00:06:22,434
significant. And as a result you get
chosen ciphertext security based on kind

87
00:06:22,434 --> 00:06:26,644
of a more natural assumption, namely
Computational Diffie-Hellman as opposed to

88
00:06:26,644 --> 00:06:30,480
these odd-looking Interactive
Diffie-Hellman assumption. But is it worth

89
00:06:30,480 --> 00:06:34,743
it? Kind of the question is are there
groups where CDH holds but IDH does not

90
00:06:34,743 --> 00:06:38,815
hold? If there were such groups, then it
would definitely be worth it, because

91
00:06:38,815 --> 00:06:43,050
those groups, the twin ElGamal would be
secure, but the regular ElGamal would not

92
00:06:43,050 --> 00:06:47,508
be CCA secure. But unfortunately we don't
know if there is any such group and in

93
00:06:47,508 --> 00:06:51,383
fact as far as we know, it's certainly
possible that any group where CDH holds,

94
00:06:51,383 --> 00:06:55,307
IDH also holds. So the answer is, really,
we don't know whether the extra cost is

95
00:06:55,307 --> 00:06:59,530
worth it or not but then nevertheless it's
a cute result that shows that if you want

96
00:06:59,530 --> 00:07:03,256
to achieve chosen ciphertext
security directly from CDH, you could do

97
00:07:03,256 --> 00:07:08,051
it without too many changes to the ElGamal
system. The next question is whether we

98
00:07:08,051 --> 00:07:12,230
can get rid of this assumption that the
hash function is an ideal hash function

99
00:07:12,230 --> 00:07:16,617
mainly that it's a random oracle. And this
is actually a huge topic. There's a lot of

100
00:07:16,617 --> 00:07:20,482
work in this area on how to build
efficient public key encryption systems

101
00:07:20,482 --> 00:07:24,922
that are chosen ciphertext secure without
random oracles. This is a very active area

102
00:07:24,922 --> 00:07:29,192
of research as I said in the past decade
and even longer. And I'll mention that as

103
00:07:29,192 --> 00:07:33,379
it applies to ElGamal, they are basically,
again two families of constructions. The

104
00:07:33,379 --> 00:07:37,515
first construction's a construction that
uses these special groups called these

105
00:07:37,515 --> 00:07:41,599
bilinear groups that I just mentioned
before. It turns out the extra structure

106
00:07:41,599 --> 00:07:45,682
of these bilinear groups allows us to
build very efficient chosen ciphertext

107
00:07:45,682 --> 00:07:50,128
secure systems without referring to random
oracles and as I said at the end of the

108
00:07:50,128 --> 00:07:54,367
module, I point to a number of papers that
explain how to do that. This is quite an

109
00:07:54,367 --> 00:07:58,876
interesting construction. But it will take
me several hours to kind of explain how it

110
00:07:58,876 --> 00:08:03,434
works. The other alternative is actually
to use groups where a stronger assumption

111
00:08:03,434 --> 00:08:07,830
called the Decision Diffie-Hellman assumption
holds. Again, I am not gonna define this

112
00:08:07,830 --> 00:08:11,798
assumption here. I'll just tell you that
this assumption actually holds in

113
00:08:11,798 --> 00:08:16,141
subgroups of ZP star, in particular if we
look at the prime order of a subgroup of

114
00:08:16,141 --> 00:08:19,812
ZP star. The Decision Diffie-Hellman
assumption seems to hold in those groups

115
00:08:19,812 --> 00:08:23,852
and then in those groups we can actually
build a variant of ElGamal, called the

116
00:08:23,852 --> 00:08:27,141
Cramer Shoup system that is chosen
ciphertext secure and the Decision

117
00:08:27,141 --> 00:08:30,665
Diffie-Hellman assumption without random
oracles. Again, this is a beautiful,

118
00:08:30,665 --> 00:08:34,659
beautiful result but again it would take a
couple of hours to explain and so I'm not

119
00:08:34,659 --> 00:08:38,324
gonna do that here. This is probably one
of these things that I gonna leave to

120
00:08:38,324 --> 00:08:42,083
either the advanced topics or to do an
advanced course so that we'll do it at a

121
00:08:42,083 --> 00:08:46,335
later time. But again I points to a paper
at the end of the module just in case you

122
00:08:46,335 --> 00:08:50,626
want to read more about this. So here is a
list of papers that provides more reading

123
00:08:50,626 --> 00:08:54,208
material. So if you wanna read about
Diffie Hellman assumptions, I guess I

124
00:08:54,208 --> 00:08:58,036
wrote a paper about this a long time ago,
that talks about various assumptions

125
00:08:58,036 --> 00:09:01,716
related to, to Diffie Hellman. And in
particular, studies the Decision Diffie

126
00:09:01,716 --> 00:09:05,685
Hellman assumption. And if you wanna learn
about how to build chosen ciphertext

127
00:09:05,685 --> 00:09:10,098
secure system from Decision Diffie-Hellman
and assumptions like it. There's a very

128
00:09:10,098 --> 00:09:14,563
cute paper by Kramer and Shoup back
from 2002 that shows how to do it. This is

129
00:09:14,563 --> 00:09:18,618
again a very highly recommended paper. If
you want to learn how to build chosen

130
00:09:18,618 --> 00:09:22,672
ciphertext security from these
bilinear groups, then the paper to read is

131
00:09:22,672 --> 00:09:26,983
the one, cited here, which actually uses a
general mechanism called Identity Based

132
00:09:26,983 --> 00:09:30,884
Encryption which very surprisingly it
turns out to actually gives us chosen

133
00:09:30,884 --> 00:09:34,638
ciphertext security almost for free.
So, once we build identity-based

134
00:09:34,638 --> 00:09:38,486
encryption chosen ciphertext security falls
immediately. And that's covered in this

135
00:09:38,486 --> 00:09:42,283
paper that I, that I describe her. The
Twin Diffie-Hellman construction and its

136
00:09:42,283 --> 00:09:46,381
proof is described in this paper that I
reference here And finally, if you kind of

137
00:09:46,381 --> 00:09:50,135
want to see a very recent paper. That
gives a very general framework for

138
00:09:50,135 --> 00:09:54,345
building, chosen ciphertext secure systems, using
what's called extractable hash proofs that

139
00:09:54,345 --> 00:09:58,506
is again a nice paper by Hoeteck Wee, that
was just recently published. I definitely

140
00:09:58,506 --> 00:10:02,416
recommend reading it all, all have
very, very elegant ideas in them. I wish I

141
00:10:02,416 --> 00:10:06,426
could cover all of them in the basic
course but I'm gonna have to leave some of

142
00:10:06,426 --> 00:10:10,436
these to the more advanced course or
perhaps the more advanced topics at the

143
00:10:10,436 --> 00:10:14,446
end of this course. Okay, so I'll stop
here and in the next segment I'm gonna tie

144
00:10:14,446 --> 00:10:18,506
ElGamal and RSA encryption
together so that you see how the two kind

145
00:10:18,506 --> 00:10:20,512
of follow from a more general principle.