Now that we understand chosen plaintext security, let's build encryption schemes that are chosen plaintext secure. And the first such encryption scheme is going to be called cipher bock chaining. So here is how cipher block chaining works. Cipher block chaining is a way of using a block cipher to get chosen plaintext security. In particular, we are going to look at a mode called cipher block chaining with a random IV. CBC stands for cipher block chaning. So suppose we have a block cipher, so EB is a block cipher. So now let's define CBC to be the following encryption scheme. So the encryption algorithm when it's asked to encrypt a message m, the first thing it's going to do is it's going to choose a random IV that's exactly one block of the block cipher. So IV is one cypher block. So in the case of AES the IV would be 16 bytes. And then we're gonna run through the algorithm here, the IV basically that we chose is gonna be XORed to the first plain text block. And then the result is gonna be encrypted using the block cipher and output of the first block of the ciphertext. And now comes the chaining part where we actually use the first block of the ciphertext to kind of mask the second block of the plaintext. So we XOR the two together and the encryption of that becomes the second ciphertext block. And so on, and so on, and so forth. So this is cIpher block chaining, you can see that each cIpher block is chained and XORed into the next plaintext block, and the final ciphertext is going to be essentially the IV, the initial IV that we chose along with all the ciphertext blocks. I should say that IV stands for Initialization Vector. And we're going to be seeing that term used quite a bit, every time we need to pick something at random at the beginning of the encryption scheme typically we'll call that an IV for initialization vector. So you notice that the cIphertext is a little bit longer than the plain text because we had to include this IV in the cIphertexts which basically captures the randomness that was used during encryption. So the first question is how do we decrypt the results of CBC encryption, and so let me remind you again that if when we encrypt the first message block we XOR it with the IV, encrypt the result and that becomes the first ciphertext block. So let me ask you how would you decrypt that? So given the first ciphertext block, how would you recover the original first plaintext block? So decryption is actually very similar to encryption, here I wrote down the decryption circuit, you can see basically it's almost the same thing except the XOR is on the bottom, instead of on the top, and again you realize that essentially we chopped off the IV as part of the decryption process and we only output the original message back, the IV is dropped by the decryption algorithm. Okay, so the following theorem is going to show that in fact CBC mode encryption with a random IV is in fact semantically secure under a chosen plaintext attack, and so let's take that more precisely, basically if we start with a PRP, in other words, our block cipher E, that is defined over a space X, then we are gonna to end up with a encryption algorithm Ecbc that takes messages of length L and outputs ciphertexts of length L+1. And then suppose we have an adversary that makes q chosen plaintext queries. Then we can state the following security fact, that for every such adversary that's attacking Ecbc, to exist an adversary that's attacking the PRP, the block cipher, with the following relation between the two algorithms, in other words, the advantage of algorithm A against the encryption scheme is less than the advantage of algorithm B against the original PRP plus some noise term. So let me interpret this theorem for you as usual, so what this means is that essentially since E is a secure PRP this quantity here is negligible, and our goal is to say that adversary A's advantage is also negligible. However, here we are prevented from saying that because we got this extra error term. This is often called an error term and to argue that CBC is secure we have to make sure that the error term is also negligible. Because if both of these terms on the right are negligible, there sum is negligible and therefore the advantage of A against Ecbc would also be negligible. So this says that in fact for Ecbc to be secure it has better be the case that q squared L squared Is much, much, much smaller than the value X, so let me remind you what q and L are, so L is simply the length of the messages that we're encrypting. Okay, so L could be like say a 1000, which means that we are encrypting messages that are at most 1000 AES blocks. q is the number of ciphertexts that the adversary gets to see under the CPA attack, but in real life what q is, is basically the number of times that we have used the key K to encrypt messages, in other words if we use a particular AES key to encrypt 100 messages, Q would be 100. It is because the adversary would then see at most 100 messages encrypted under this key K. Okay so lets see what this means in the real world. So here I've rewrote the error balance from the theorem. And just to remind you to use the messages encrypted with K and L with the lengths of the messages and so suppose we want the adversary's advantage to be less than one over two to the thirty two. This means that the error term had better be less than one over two to the thirty two. Okay, so let's look at AES and see what this mean. For AES, AES of course uses 128 bit blocks, so X is going to be two to the 128, the size of X is going to be 2 to the 128, and if you plug this into the expression you see that basically the product q times L had better be less than two to the forty eight. This means that after we use a particular key to encrypt 2 to the 48 AES blocks we have to change the key. Okay, so essentially CBC stops being secure after the key is used to encrypt 2 to the 48 different AES blocks. So its kinda nice that the security theorem tells you exactly how long the key can be used and then how frequently, essentially, you have to replace the key. Now interestingly if you apply the same analogy to the 3DES it actually has a much shorter block, maybe only 64 bits, you see the key has to be changed much more frequently, maybe after every 65 thousand DES blocks, essentially you need to generate a new key. So this is one of the reasons why AES has a larger block size so that in fact modes like CBC would be more secure and one can use the keys for a longer period of time, before having to replace it. What this means is having to replace two to the sixteen blocks, each block of course is 8 bytes, so after you encrypt about half a megabyte of data you would have to change the DES key which is actually quite low. And you notice with AES you can encrypt quite a bit more data before you have to change the key. So I want to warn you about a very common mistake that people have made when using CBC with a random IV. That is that the minute that the attacker can predict the IV that you're going to be using for encrypting a particular message decipher this Ecbc is no longer CPA secure. So when using CBC with a random IV like we've just shown It's crucial that the IV is not predictable. But lets see an attack. So suppose it so happens that given a particular encryption in a message that attacker can actually predict that IV that will be used for the next message. Well let's show that in fact the resulting system is not CPA secure. So the first thing the adversary is going to do is, he is going to ask for the encryption of a one block message. In particular that one block is going to be zero. So what the adversary gets back is the encryption of one block, which namely is the encryption of the message namely zero, XOR the IV. Okay and of course the adversary also gets the IV. Okay so now the adversary by assumption can predict the IV that's gonna be used for the next encryption. Okay so let's say that IV is called, well IV. So next the adversary is going to issue his semantic security challenge and the message m0 is going to be the predicted IV XOR IV1 which was used in the encryption of c1. And the, the message of m1 is just going to be some other message, it doesn't really matter what it is. So now let's see what happens when the adversary receives the result of the semantic security challenge. Well, he is going to get the encryption of m0 or m1. So when the adversary receives the encryption of m0, tell me what is the actual plain text that is encrypted in the ciphertext c? Well so the answer is that what is actually encrypted is the message which is IV XOR IV1 XOR the IV that's used to encrypt that message which happens to be IV and this of course is IV1. So when the adversary receives the encryption of m0, he is actually receiving the block cipher encryption of IV1. And lo and behold, you'll notice that he already has that value from his chosen plaintext query. And then, when he is receiving the encryption of message m1, he just received a normal CBC encryption of the message m1. So you realize that now he has a simple way of breaking the scheme, namely what he'll do is he'll say, he's gonna ask, "Is the second block of the ciphertext c equal to the value that I received in my CPA query?" If so I'll say that I received the encryption of m0, otherwise I'll say that I received the encryption of m1. So really his test is c1 he refers to the second block of c and c11 refers to the second block of c1, if the two are equal he says zero, otherwise he says one. So the advantage of this adversary is going to be 1 and as a result, he completely breaks CPA security of this CBC encryption. So the lesson here is, if the IV is predictable then, in fact, there is no CPA security and unfortunately, this is actually a very common mistake in practice. In particular even in SSL protocol and in TLS 1.1, it turns out that IV for record number I is in fact the last ciphertext block of record I-1. That means that exactly given the encryption of record I-1, the adversary knows exactly what IV is going to be used as record number I. Very recently, just last summer, this was actually converted into a pretty devastating attack on SSL. We'll describe that attack once we talk about SSL in more detail, but for now I wanted to make sure you understand than when you use CBC encryption, its absolutely crucial that the IV be random. Okay, so now I going to show you the nonce based version of CBC encryption So in this mode the IV is replaced by non random but unique nonce for example the numbers 1,2,3,4,5, could all be used as a nonce, and now, the appeal of this mode is that if the recipient actually knows what the nonce is supposed to be then there's no reason to include the nonce in the ciphertext, in which case, the ciphertext is exactly the same length as the plaintext, unlike CBC with the random IV, where we had to expand the ciphertext to include the IV, here, if the nonce is already known to the recipient, there's no reason to include it in the ciphertext, and the ciphertext is exactly the same length as the plaintext. So it's perfectly fine to use a non-random but unique nonce. However, it's absolutely crucial to know that, if you do this, there's one more step that you have to do before you use the nonce in the CBC chain. In particular, in this mode now we're going to be using two independent keys, k and k1. The key k is, as before, going to be used to encrypt the individual message blocks, However, this key k1 is going to be used to encrypt the non-random but unique nonce, so that the output is going to be a random IV, which is then used in the CBC chain. So this extra step here, encrypting the nonce with the key k1, is absolutely crucial. Without it, CBC mode encryption would not be secure. However it if is going to be a counter you need to do one more step. Before actually encryption CBC and in particular you have to actually encrypt the notes to obtain the IV that will actually be used for encryption. The notes on CBC is similar to a random IV, the difference is that the notes is first encrypted and the results is that the IV is used in the CBC encryption Now the beauty of this mode is that the Nance doesn't necessarily have to be included in the cipher text. It only needs to be in there if its unknowns are the decrypter but it if the decrypter happens to already know the value of the counter by some other means then in fact the cipher text is only as big as the plain text. There's no extra value transmitted in the cipher text. And again, I warn that when you're using non spaced encryption, it's absolutely crucial that the key common Nance spare is only used for one message so for every message, either the Nance has changed or the key has changed. Okay, so here emphasize the fact that you need to do this extra encryption step before actual using the Nance. This is very common mistake that actually forgotten in practice and for example in TLS, this was not done and as a result there was a significant attack against CBC encryption in TLS. Remember the reason that this is so important to know is that in fact many crypto APIs are set up to almost deliberately mislead the user to using CBC incorrectly. So let's look to see how CBC implemented inside of open SSL. So here are the arguments of the function. Basically this is the plain text, this is the place where the cipher text will get written to. This is the length of the plain text. This is a, a Yes key Finally there is an argument here that says whether you are crypting or decrypting. And the most important parameter that I wanted to point out here is the actual IV and unfortunately, the user is asked to supply this IV and the function uses the IV directly in the CBC encryption mechanism. It doesn't encrypt the IV before using it and as a result, if you ever call this function using a non random IV, the resulting encryption system won't be CPA secure. Okay, so it's very important to know that when calling functions like this. Cbc encryption or open SSL either supply a truly random IV or if you want the IV to be a counter than you have to encrypt a counter using AAS before you actually call a CBC encrypt and you have to that yourself. So again, it's very important that the programmer knows that it needs to be done otherwise the CBC encryption is insecure. One last technicality about CBC is what to do when the message is not a multiple of the block cipher block length? That is what do we do if the last message block is shorter than the block length of AES, for example? So the last message block is less than sixteen bytes. And the answer is if we add a pad to the last block so it becomes as long as sixteen bytes, as long as the AES block size. And this pad, of course, if going to be removed during encryption. So here is a typical path, this is the path that is used in TLS. Basically a pad with N bytes then essentially what you do is you write the number N, N times. So for example if you pad with five bytes, you pad with the string 555555. So five bytes where each byte is the value five. And the key thing about this pad is basically when the decrypter receives the message, what he does is he looks at the last byte of the last block. So suppose that value is five, then he simply removes the last five bytes of the message. Now the question is what do we do if in fact the message is a multiple of sixteen bytes so in fact no pad is needed? If we don't pad at all, well that's a problem because the decrypter is going to look at the very last byte of the last block which is not part of the actual message and he's going to remove that many bytes from the plain text. So that actually would be a problem. So the solution is, if in fact there is no pad that's needed, nevertheless we still have to add a dummy block. And since we add the dummy block this would be a block that's basically contains sixteen bytes each one containing the number sixteen. Okay, so we add essentially sixteen dummy blocks. The decrypter, that when he's decrypting, he looks at the last byte of the last block, he sees that the value is sixteen, therefore he removes the entire block. And whatever's left is the actual plain text. So it's a bit unfortunate that in fact if you're encrypting short messages with CBC and the messages happen to be, say, 32 bytes, so they are a multiple of sixteen bytes, then you have to add one more block and make all these ciphertexts be 48 bytes just to accommodate the CBC padding. I should mention there's a variant of CBC called CBC with ciphertext stealing that actually avoids this problem, but I'm not gonna describe that here. If you're interested you can look that up online. Okay, so that's the end of our discussion of CBC and in the next segment we'll see how to use counter modes to encrypt multiple messages using a single key.