1
00:00:00,000 --> 00:00:04,152
Now that we understand chosen plaintext
security, let's build encryption schemes

2
00:00:04,152 --> 00:00:08,515
that are chosen plaintext secure. And the
first such encryption scheme is going to

3
00:00:08,515 --> 00:00:12,510
be called cipher bock chaining. So here
is how cipher block chaining works.

4
00:00:12,510 --> 00:00:16,610
Cipher block chaining is a way of using a
block cipher to get chosen plaintext

5
00:00:16,610 --> 00:00:20,868
security. In particular, we are going to
look at a mode called cipher block chaining

6
00:00:20,868 --> 00:00:25,021
with a random IV. CBC stands for cipher
block chaning. So suppose we have a block

7
00:00:25,021 --> 00:00:28,963
cipher, so EB is a block cipher. So now
let's define CBC to be the following

8
00:00:28,963 --> 00:00:33,248
encryption scheme. So the encryption
algorithm when it's asked to encrypt a

9
00:00:33,248 --> 00:00:37,991
message m, the first thing it's going to do
is it's going to choose a random IV that's

10
00:00:37,991 --> 00:00:41,958
exactly one block of the block
cipher. So IV is one cypher block.

11
00:00:41,958 --> 00:00:46,035
So in the case of AES the IV
would be 16 bytes. And then we're

12
00:00:46,035 --> 00:00:50,649
gonna run through the algorithm here, the
IV basically that we chose is gonna be XORed

13
00:00:50,649 --> 00:00:54,726
to the first plain text
block. And then the result is gonna be

14
00:00:54,726 --> 00:00:58,857
encrypted using the block cipher and
output of the first block of the ciphertext.

15
00:00:58,857 --> 00:01:03,041
And now comes the chaining part
where we actually use the first block of

16
00:01:03,041 --> 00:01:07,436
the ciphertext to kind of mask the second
block of the plaintext. So we XOR

17
00:01:07,436 --> 00:01:11,588
the two together and the encryption of
that becomes the second ciphertext block.

18
00:01:11,588 --> 00:01:15,535
And so on, and so on, and so forth. So
this is cIpher block chaining, you can

19
00:01:17,559 --> 00:01:19,584
see that each cIpher block is chained and
XORed into the next plaintext

20
00:01:19,584 --> 00:01:24,118
block, and the final ciphertext is going to
be essentially the IV, the initial IV

21
00:01:24,118 --> 00:01:30,024
that we chose along with all the ciphertext blocks. I should say that IV stands

22
00:01:30,024 --> 00:01:35,795
for Initialization Vector. And we're going to
be seeing that term used quite a bit,

23
00:01:35,795 --> 00:01:39,717
every time we need to pick something at
random at the beginning of the encryption

24
00:01:39,717 --> 00:01:43,543
scheme typically we'll call that an IV
for initialization vector. So you notice

25
00:01:43,543 --> 00:01:47,322
that the cIphertext is a little bit
longer than the plain text because we had

26
00:01:47,322 --> 00:01:51,149
to include this IV in the cIphertexts
which basically captures the randomness

27
00:01:51,149 --> 00:01:55,450
that was used during encryption. So the
first question is how do we decrypt the

28
00:01:55,450 --> 00:02:00,226
results of CBC encryption, and so
let me remind you again that if when we

29
00:02:00,226 --> 00:02:04,470
encrypt the first message block we
XOR it with the IV, encrypt the

30
00:02:04,470 --> 00:02:09,187
result and that becomes the first ciphertext
block. So let me ask you how would

31
00:02:09,187 --> 00:02:13,667
you decrypt that? So given the first
ciphertext block, how would you recover

32
00:02:13,667 --> 00:02:17,915
the original first plaintext block? So
decryption is actually very similar to

33
00:02:17,915 --> 00:02:21,660
encryption, here I wrote down the
decryption circuit, you can see basically

34
00:02:21,660 --> 00:02:25,961
it's almost the same thing except the XOR
is on the bottom, instead of on the top, and

35
00:02:25,961 --> 00:02:29,605
again you realize that essentially we
chopped off the IV as part of the

36
00:02:29,605 --> 00:02:33,754
decryption process and we only output the
original message back, the IV is dropped

37
00:02:33,754 --> 00:02:38,438
by the decryption algorithm. Okay, so the
following theorem is going to show that in

38
00:02:38,438 --> 00:02:43,762
fact CBC mode encryption with a random IV
is in fact semantically secure under a

39
00:02:43,762 --> 00:02:48,956
chosen plaintext attack, and so let's
take that more precisely, basically if we

40
00:02:48,956 --> 00:02:54,083
start with a PRP, in other words, our
block cipher E, that is defined over a

41
00:02:54,083 --> 00:02:59,079
space X, then we are gonna to end up with
a encryption algorithm Ecbc that takes

42
00:02:59,079 --> 00:03:03,944
messages of length L and outputs
ciphertexts of length L+1. And then

43
00:03:03,944 --> 00:03:09,324
suppose we have an adversary that makes q
chosen plaintext queries. Then we can

44
00:03:09,324 --> 00:03:15,024
state the following security fact, that
for every such adversary that's attacking

45
00:03:15,024 --> 00:03:20,184
Ecbc, to exist an adversary that's
attacking the PRP, the block cipher, with

46
00:03:20,184 --> 00:03:24,926
the following relation between the two
algorithms, in other words, the advantage

47
00:03:24,926 --> 00:03:29,851
of algorithm A against the encryption scheme
is less than the advantage of algorithm B

48
00:03:29,851 --> 00:03:35,080
against the original PRP plus some noise
term. So let me interpret this theorem for

49
00:03:35,080 --> 00:03:40,005
you as usual, so what this means is that
essentially since E is a secure PRP this

50
00:03:40,005 --> 00:03:45,051
quantity here is negligible, and our goal
is to say that adversary A's advantage is

51
00:03:45,051 --> 00:03:49,794
also negligible. However, here we are
prevented from saying that because we got

52
00:03:49,794 --> 00:03:54,630
this extra error term. This is often
called an error term and to argue that CBC

53
00:03:54,630 --> 00:03:59,676
is secure we have to make sure that the
error term is also negligible. Because if

54
00:03:59,676 --> 00:04:04,474
both of these terms on the right are
negligible, there sum is negligible and

55
00:04:04,474 --> 00:04:09,458
therefore the advantage of A against Ecbc
would also be negligible. So this says

56
00:04:09,458 --> 00:04:14,565
that in fact for Ecbc to be secure it has better
be the case that q squared L squared Is

57
00:04:14,565 --> 00:04:19,564
much, much, much smaller than the value X,
so let me remind you what q and L are, so

58
00:04:19,564 --> 00:04:24,566
L is simply the length of the messages
that we're encrypting. Okay, so L could be

59
00:04:24,566 --> 00:04:29,902
like say a 1000, which means that we are
encrypting messages that are at most 1000

60
00:04:29,902 --> 00:04:35,303
AES blocks. q is the number of ciphertexts
that the adversary gets to see under the

61
00:04:35,303 --> 00:04:40,770
CPA attack, but in real life what q is, is
basically the number of times that we have

62
00:04:40,770 --> 00:04:46,041
used the key K to encrypt messages, in other
words if we use a particular AES key to

63
00:04:46,041 --> 00:04:51,052
encrypt 100 messages, Q would be 100.
It is because the adversary would then see

64
00:04:51,052 --> 00:04:56,224
at most 100 messages encrypted under this key K. Okay
so lets see what this means in the real

65
00:04:56,224 --> 00:05:00,866
world. So here I've rewrote the error
balance from the theorem. And just to remind

66
00:05:00,866 --> 00:05:05,093
you to use the messages encrypted with K
and L with the lengths of the messages and so

67
00:05:05,093 --> 00:05:09,370
suppose we want the adversary's advantage
to be less than one over two to the thirty

68
00:05:09,370 --> 00:05:13,346
two. This means that the error term had
better be less than one over two to the

69
00:05:13,346 --> 00:05:17,853
thirty two. Okay, so let's look at AES and see
what this mean. For AES, AES of course uses

70
00:05:17,853 --> 00:05:22,300
128 bit blocks, so X is going to be two
to the 128, the

71
00:05:22,300 --> 00:05:26,363
size of X is going to be 2 to the
128, and if you

72
00:05:26,363 --> 00:05:30,865
plug this into the expression you see that
basically the product q times L had

73
00:05:30,865 --> 00:05:35,477
better be less than two to the forty eight.
This means that after we use a particular

74
00:05:35,477 --> 00:05:40,014
key to encrypt 2 to the 48 AES
blocks we have to change the key. Okay, so

75
00:05:40,014 --> 00:05:46,966
essentially CBC stops being secure after
the key is used to encrypt 2 to the 48 different AES blocks.

76
00:05:46,966 --> 00:05:49,572
So its
kinda nice that the security theorem tells

77
00:05:49,572 --> 00:05:54,499
you exactly how long the key can be used
and then how frequently, essentially, you have to

78
00:05:54,499 --> 00:05:59,575
replace the key. Now interestingly if you
apply the same analogy to the 3DES it

79
00:05:59,575 --> 00:06:04,909
actually has a much shorter block, maybe
only 64 bits, you see the key has to be

80
00:06:04,909 --> 00:06:10,485
changed much more frequently, maybe after every
65 thousand DES blocks, essentially you need to generate a new key. So

81
00:06:10,485 --> 00:06:15,275
this is one of the reasons why AES has a
larger block size so that in fact modes

82
00:06:15,275 --> 00:06:20,240
like CBC would be more secure and one can
use the keys for a longer period of time, before having

83
00:06:20,240 --> 00:06:24,796
to replace it. What this means is having
to replace two to the sixteen blocks,

84
00:06:24,796 --> 00:06:29,586
each block of course is 8 bytes, so
after you encrypt about half a megabyte of

85
00:06:29,586 --> 00:06:33,868
data you would have to change the DES key
which is actually quite low. And you

86
00:06:33,868 --> 00:06:37,645
notice with AES you can encrypt quite a
bit more data before you have to change the

87
00:06:37,645 --> 00:06:42,604
key. So I want to warn you about a very
common mistake that people have made when

88
00:06:42,604 --> 00:06:47,627
using CBC with a random IV. That is that
the minute that the attacker can predict

89
00:06:47,627 --> 00:06:52,712
the IV that you're going to be using for
encrypting a particular message decipher

90
00:06:52,712 --> 00:06:57,797
this Ecbc is no longer CPA secure. So when
using CBC with a random IV like we've

91
00:06:57,797 --> 00:07:02,246
just shown It's crucial that the IV is not
predictable. But lets see an attack. So

92
00:07:02,246 --> 00:07:06,282
suppose it so happens that given a
particular encryption in a message that

93
00:07:06,282 --> 00:07:10,695
attacker can actually predict that IV that
will be used for the next message. Well

94
00:07:10,695 --> 00:07:14,839
let's show that in fact the resulting
system is not CPA secure. So the first thing the

95
00:07:14,839 --> 00:07:19,197
adversary is going to do is, he is going
to ask for the encryption of a one block

96
00:07:19,197 --> 00:07:23,449
message. In particular that one block is
going to be zero. So what the adversary

97
00:07:23,449 --> 00:07:27,592
gets back is the encryption of one
block, which namely is the encryption of

98
00:07:27,592 --> 00:07:31,748
the message namely zero, XOR the IV. Okay
and of course the adversary also gets the

99
00:07:31,748 --> 00:07:35,877
IV. Okay so now the adversary by
assumption can predict the IV that's gonna

100
00:07:35,877 --> 00:07:40,196
be used for the next encryption. Okay so
let's say that IV is called, well IV. So

101
00:07:40,196 --> 00:07:44,460
next the adversary is going to issue his
semantic security challenge and the

102
00:07:44,460 --> 00:07:49,167
message m0 is going to be the predicted IV
XOR IV1 which was used in the encryption

103
00:07:49,167 --> 00:07:53,707
of c1. And the, the message of m1 is just
going to be some other message, it doesn't

104
00:07:53,707 --> 00:07:58,248
really matter what it is. So now let's see
what happens when the adversary receives

105
00:07:58,248 --> 00:08:02,346
the result of the semantic security
challenge. Well, he is going to get the

106
00:08:02,346 --> 00:08:06,470
encryption of m0 or m1. So when the
adversary receives the encryption of m0,

107
00:08:06,470 --> 00:08:10,800
tell me what is the actual plain text
that is encrypted in the ciphertext c?

108
00:08:11,260 --> 00:08:17,368
Well so the answer is that what is
actually encrypted is the message which is

109
00:08:17,368 --> 00:08:22,826
IV XOR IV1 XOR the IV that's used to
encrypt that message which happens to be

110
00:08:22,826 --> 00:08:28,301
IV and this of course is IV1. So when the
adversary receives the encryption of m0,

111
00:08:28,301 --> 00:08:33,167
he is actually receiving the block cipher
encryption of IV1. And lo and behold,

112
00:08:33,167 --> 00:08:38,440
you'll notice that he already has that
value from his chosen plaintext query.

113
00:08:38,440 --> 00:08:42,800
And then, when he is receiving the
encryption of message m1, he just received

114
00:08:42,800 --> 00:08:47,825
a normal CBC encryption of the message m1.
So you realize that now he has a simple

115
00:08:47,825 --> 00:08:53,057
way of breaking the scheme, namely what
he'll do is he'll say, he's gonna ask, "Is the second

116
00:08:53,057 --> 00:08:58,354
block of the ciphertext c equal to the
value that I received in my CPA query?" If

117
00:08:58,354 --> 00:09:03,843
so I'll say that I received the encryption
of m0, otherwise I'll say that I received

118
00:09:03,843 --> 00:09:09,209
the encryption of m1. So really his test
is c1 he refers to the second block

119
00:09:09,209 --> 00:09:14,441
of c and c11 refers to the second block of
c1, if the two are equal he says zero,

120
00:09:14,441 --> 00:09:20,095
otherwise he says one. So the advantage of
this adversary is going to be 1 and as a

121
00:09:20,095 --> 00:09:25,650
result, he completely breaks CPA security
of this CBC encryption. So the lesson here

122
00:09:25,650 --> 00:09:30,334
is, if the IV is predictable then, in
fact, there is no CPA security and

123
00:09:30,334 --> 00:09:35,621
unfortunately, this is actually a very
common mistake in practice. In particular

124
00:09:35,621 --> 00:09:41,339
even in SSL protocol and in TLS 1.1, it turns
out that IV for record number I is in fact

125
00:09:41,339 --> 00:09:46,363
the last ciphertext block of record I-1. That means that exactly given

126
00:09:46,363 --> 00:09:51,579
the encryption of record I-1, the
adversary knows exactly what IV is going

127
00:09:51,579 --> 00:09:56,031
to be used as record number I. Very
recently, just last summer, this was

128
00:09:56,031 --> 00:10:00,737
actually converted into a pretty
devastating attack on SSL. We'll describe

129
00:10:00,737 --> 00:10:06,016
that attack once we talk about SSL in more
detail, but for now I wanted to make sure

130
00:10:06,016 --> 00:10:12,371
you understand than when you use CBC encryption,
its absolutely crucial that the IV be random.

131
00:10:12,371 --> 00:10:16,372
Okay, so now I going to show you the nonce based version of CBC encryption

132
00:10:16,372 --> 00:10:21,443
So in this mode the IV is replaced by non random but unique nonce

133
00:10:21,443 --> 00:10:23,928
for example the numbers 1,2,3,4,5, could all be used as a nonce, and now, the appeal of this mode

134
00:10:23,928 --> 00:10:25,246
is that if the recipient actually knows
what the nonce is supposed to be

135
00:10:25,246 --> 00:10:25,880
then there's no reason to include the nonce
in the ciphertext, in which case, the ciphertext

136
00:10:25,880 --> 00:10:26,197
is exactly the same length as the plaintext,
unlike CBC with the random IV,

137
00:10:26,197 --> 00:10:26,276
where we had to expand the ciphertext to include the IV, here, if the nonce is already known to the recipient,

138
00:10:26,276 --> 00:10:26,315
there's no reason to include it in the ciphertext, and
the ciphertext is exactly the same length as the plaintext.

139
00:10:26,315 --> 00:10:26,335
So it's perfectly fine to use a non-random but unique nonce. However, it's absolutely crucial to know that,

140
00:10:26,335 --> 00:10:26,345
if you do this, there's one more step that you have
to do before you use the nonce in the CBC chain.

141
00:10:26,345 --> 00:10:26,355
In particular, in this mode now we're going to
be using two independent keys, k and k1.

142
00:10:26,355 --> 00:10:26,434
The key k is, as before, going to be used to
encrypt the individual message blocks,

143
00:10:26,434 --> 00:10:26,474
However, this key k1 is going to be used to
encrypt the non-random but unique nonce,

144
00:10:26,474 --> 00:10:26,494
so that the output is going to be a random IV,
which is then used in the CBC chain.

145
00:10:26,494 --> 00:10:26,504
So this extra step here, encrypting the nonce
with the key k1, is absolutely crucial.

146
00:10:26,504 --> 00:10:26,509
Without it, CBC mode encryption would not be secure.

147
00:10:26,509 --> 00:10:26,514

However it if is going to be a counter you

148
00:10:26,514 --> 00:10:32,046
need to do one more step. Before actually
encryption CBC and in particular you have

149
00:10:32,046 --> 00:10:37,380
to actually encrypt the notes to obtain
the IV that will actually be used for

150
00:10:37,380 --> 00:10:42,919
encryption. The notes on CBC is similar to
a random IV, the difference is that the

151
00:10:42,919 --> 00:10:48,047
notes is first encrypted and the results
is that the IV is used in the CBC

152
00:10:48,047 --> 00:10:52,728
encryption Now the beauty of this mode is
that the Nance doesn't necessarily have to

153
00:10:52,728 --> 00:10:56,975
be included in the cipher text. It only
needs to be in there if its unknowns are

154
00:10:56,975 --> 00:11:01,116
the decrypter but it if the decrypter
happens to already know the value of the

155
00:11:01,116 --> 00:11:05,310
counter by some other means then in fact
the cipher text is only as big as the

156
00:11:05,310 --> 00:11:09,291
plain text. There's no extra value
transmitted in the cipher text. And again,

157
00:11:09,291 --> 00:11:13,591
I warn that when you're using non spaced
encryption, it's absolutely crucial that

158
00:11:13,591 --> 00:11:17,679
the key common Nance spare is only used
for one message so for every message,

159
00:11:17,679 --> 00:11:22,028
either the Nance has changed or the key
has changed. Okay, so here emphasize the

160
00:11:22,028 --> 00:11:26,499
fact that you need to do this extra
encryption step before actual using the

161
00:11:26,499 --> 00:11:31,088
Nance. This is very common mistake that
actually forgotten in practice and for

162
00:11:31,088 --> 00:11:35,795
example in TLS, this was not done and as a
result there was a significant attack

163
00:11:35,795 --> 00:11:40,282
against CBC encryption in TLS. Remember
the reason that this is so important to

164
00:11:40,282 --> 00:11:44,950
know is that in fact many crypto APIs are
set up to almost deliberately mislead the

165
00:11:44,950 --> 00:11:49,451
user to using CBC incorrectly. So let's
look to see how CBC implemented inside of

166
00:11:49,451 --> 00:11:53,840
open SSL. So here are the arguments of the
function. Basically this is the plain

167
00:11:53,840 --> 00:11:58,119
text, this is the place where the cipher
text will get written to. This is the

168
00:11:58,119 --> 00:12:02,760
length of the plain text. This is a, a Yes
key Finally there is an argument here that

169
00:12:02,760 --> 00:12:06,438
says whether you are crypting or
decrypting. And the most important

170
00:12:06,438 --> 00:12:10,884
parameter that I wanted to point out here
is the actual IV and unfortunately, the

171
00:12:10,884 --> 00:12:15,330
user is asked to supply this IV and the
function uses the IV directly in the CBC

172
00:12:15,330 --> 00:12:19,831
encryption mechanism. It doesn't encrypt
the IV before using it and as a result, if

173
00:12:19,831 --> 00:12:24,332
you ever call this function using a non
random IV, the resulting encryption system

174
00:12:24,332 --> 00:12:28,816
won't be CPA secure. Okay, so it's very
important to know that when calling

175
00:12:28,816 --> 00:12:33,960
functions like this. Cbc encryption or
open SSL either supply a truly random IV

176
00:12:33,960 --> 00:12:38,836
or if you want the IV to be a counter than
you have to encrypt a counter using AAS

177
00:12:38,836 --> 00:12:43,668
before you actually call a CBC encrypt and
you have to that yourself. So again, it's

178
00:12:43,668 --> 00:12:48,340
very important that the programmer knows
that it needs to be done otherwise the CBC

179
00:12:48,340 --> 00:12:52,456
encryption is insecure. One last
technicality about CBC is what to do when

180
00:12:52,456 --> 00:12:57,183
the message is not a multiple of the block
cipher block length? That is what do we do

181
00:12:57,183 --> 00:13:01,689
if the last message block is shorter than
the block length of AES, for example? So

182
00:13:01,689 --> 00:13:06,281
the last message block is less than
sixteen bytes. And the answer is if we add

183
00:13:06,281 --> 00:13:11,586
a pad to the last block so it becomes as
long as sixteen bytes, as long as the AES

184
00:13:11,586 --> 00:13:16,633
block size. And this pad, of course, if
going to be removed during encryption. So

185
00:13:16,633 --> 00:13:21,873
here is a typical path, this is the path
that is used in TLS. Basically a pad with

186
00:13:21,873 --> 00:13:26,919
N bytes then essentially what you do is
you write the number N, N times. So for

187
00:13:26,919 --> 00:13:32,036
example if you pad with five bytes, you
pad with the string 555555. So five bytes

188
00:13:32,036 --> 00:13:37,175
where each byte is the value five. And the
key thing about this pad is basically when

189
00:13:37,175 --> 00:13:42,012
the decrypter receives the message, what
he does is he looks at the last byte of

190
00:13:42,012 --> 00:13:46,970
the last block. So suppose that value is
five, then he simply removes the last five

191
00:13:46,970 --> 00:13:51,818
bytes of the message. Now the question is
what do we do if in fact the message is a

192
00:13:51,818 --> 00:13:56,262
multiple of sixteen bytes so in fact no
pad is needed? If we don't pad at all,

193
00:13:56,262 --> 00:14:00,476
well that's a problem because the
decrypter is going to look at the very

194
00:14:00,476 --> 00:14:05,267
last byte of the last block which is not
part of the actual message and he's going

195
00:14:05,267 --> 00:14:10,000
to remove that many bytes from the plain
text. So that actually would be a problem.

196
00:14:10,000 --> 00:14:15,363
So the solution is, if in fact there is no
pad that's needed, nevertheless we still

197
00:14:15,363 --> 00:14:20,662
have to add a dummy block. And since we
add the dummy block this would be a block

198
00:14:20,662 --> 00:14:25,830
that's basically contains sixteen bytes
each one containing the number sixteen.

199
00:14:25,830 --> 00:14:30,042
Okay, so we add essentially sixteen dummy
blocks. The decrypter, that when he's

200
00:14:30,042 --> 00:14:34,473
decrypting, he looks at the last byte of
the last block, he sees that the value is

201
00:14:34,473 --> 00:14:38,823
sixteen, therefore he removes the entire
block. And whatever's left is the actual

202
00:14:38,823 --> 00:14:42,975
plain text. So it's a bit unfortunate that
in fact if you're encrypting short

203
00:14:42,975 --> 00:14:47,019
messages with CBC and the messages happen
to be, say, 32 bytes, so they are a

204
00:14:47,019 --> 00:14:51,387
multiple of sixteen bytes, then you have
to add one more block and make all these

205
00:14:51,387 --> 00:14:55,108
ciphertexts be 48 bytes just to
accommodate the CBC padding. I should

206
00:14:55,108 --> 00:14:59,584
mention there's a variant of CBC called
CBC with ciphertext stealing that actually

207
00:14:59,584 --> 00:15:03,790
avoids this problem, but I'm not gonna
describe that here. If you're interested

208
00:15:03,790 --> 00:15:07,907
you can look that up online. Okay, so
that's the end of our discussion of CBC

209
00:15:07,907 --> 00:15:12,198
and in the next segment we'll see how to
use counter modes to encrypt multiple

210
00:15:12,198 --> 00:15:13,720
messages using a single key.