WEBVTT 00:00:00.000 --> 00:00:04.669 In this segment we will look at how to use block ciphers to encrypt multiple messages 00:00:04.669 --> 00:00:08.959 using the same key. This comes up in practice for example in file systems where 00:00:08.959 --> 00:00:13.412 the same key's used to encrypt multiple files. It comes up in networking protocols 00:00:13.412 --> 00:00:17.647 where the same key is used to encrypt multiple packets. So let's see how to do 00:00:17.647 --> 00:00:21.883 it. The first thing we need to do is to define what is mean for a cipher to be 00:00:21.883 --> 00:00:26.185 secure when the same key is used to encrypt multiple messages. When we use the 00:00:26.185 --> 00:00:31.041 key more than once the result of that is that the adversary gets to see many cyber 00:00:31.041 --> 00:00:35.627 text encrypted using the same key. As a result, when we define security, we're 00:00:35.627 --> 00:00:40.512 gonna allow the adversary to mount what's called a chosen plain text attack. In 00:00:40.512 --> 00:00:45.522 other words, the adversary can obtain the encryption of arbitrary messages of his 00:00:45.522 --> 00:00:49.710 choice. So, for example, if the adversary's interacting with Alice. The 00:00:49.710 --> 00:00:53.924 adversary can ask Alice to encrypt arbitrary messages of the adversary's 00:00:53.924 --> 00:00:58.138 choosing. And Alice will go ahead and encrypt those messages and give the 00:00:58.138 --> 00:01:02.929 adversary the resulting cipher texts. You might wonder why would Alice ever do this. 00:01:02.929 --> 00:01:07.431 How could this possibly happen in real life? But it turns out this is actually 00:01:07.431 --> 00:01:11.760 very common in real life. And in fact, this modeling is quite a conservative 00:01:11.760 --> 00:01:16.320 modeling of real life. For example, the adversary might send Alice an email. When 00:01:16.320 --> 00:01:21.168 Alice receives the email, the writes it to her encrypted disk, thereby encrypting the 00:01:21.168 --> 00:01:26.140 adversary's email using her secret key. If later the adversary steals this disc, then 00:01:26.140 --> 00:01:31.280 he obtains the encryption of an email that he sent Alice under Alice's secret key. So 00:01:31.280 --> 00:01:36.298 that's an example of a chosen plain text attack, where the adversary provided Alice 00:01:36.298 --> 00:01:41.075 with a message and she encrypted that message using her own key. And then later 00:01:41.075 --> 00:01:45.429 the attacker was able to obtain the resulting cipher text. So that's the 00:01:45.429 --> 00:01:49.661 adversary's power. And then the adversary's goal is basically to break 00:01:49.661 --> 00:01:54.368 semantic security. So let's define this more precisely. As usual, we're gonna 00:01:54.368 --> 00:01:59.447 define semantic security under a chosen plain text attack using two experiments, 00:01:59.447 --> 00:02:04.717 experiment zero and experiment one, that are modeled as a game between a challenger 00:02:04.717 --> 00:02:09.669 and an adversary. When the game begins, the challenger is gonna choose a random 00:02:09.669 --> 00:02:14.006 key K. And now the adversary basically gets to query the challenger. So the 00:02:14.006 --> 00:02:18.198 adversary now begins by submitting a semantic security query, namely, he 00:02:18.198 --> 00:02:22.804 submits two messages, M zero and M one. I added another index, but let me ignore 00:02:22.804 --> 00:02:27.351 that extra index for a while. So the adversary submits two messages, M zero and 00:02:27.351 --> 00:02:31.780 M one, that happen to be of the same length. And then the adversary receives 00:02:31.780 --> 00:02:36.031 the encryption of one of those messages, either of M zero or of M one. In 00:02:36.031 --> 00:02:40.224 experiment zero, he receives the encryption of M zero. In experiment one, 00:02:40.224 --> 00:02:44.952 he receives the encryption of M one. So, so far this would look familiar this looks 00:02:44.952 --> 00:02:49.477 exactly like a standard semantic security [inaudible]. However, plain text attack 00:02:49.477 --> 00:02:54.007 the adversary can now repeat this query again. So now you can issue a query with 00:02:54.007 --> 00:02:58.485 two other plain texts, again of the same length, and again you would receive the 00:02:58.485 --> 00:03:03.189 encryption of one of them. In experiment zero you would receive the encryption of M 00:03:03.189 --> 00:03:07.837 zero. In experiment one you would receive the encryption of M one. And the attacker 00:03:07.837 --> 00:03:12.542 can continue issuing queries like this. In fact we'll say that he can issue up to Q 00:03:12.542 --> 00:03:17.020 queries of this type. And then, remember, every time he issues a pair of messages. 00:03:17.020 --> 00:03:21.416 That happen to be of the same length and every time he either gets the encryption 00:03:21.416 --> 00:03:25.867 of the left side or the right side again in experiment zero he will always get the 00:03:25.867 --> 00:03:29.727 encryption of the left message in experiment one he will always get the 00:03:29.727 --> 00:03:33.970 encryption of the left message. And, then adversary's goal is, basically, to figure 00:03:33.970 --> 00:03:38.289 out whether he's in experimental zero or in experiment one. In other words, whether 00:03:38.289 --> 00:03:42.713 he was constantly receiving the encryption of the left message or the encryption of 00:03:42.713 --> 00:03:47.032 the right message. So, in some sense, this is a standard semantic security game just 00:03:47.032 --> 00:03:51.193 iterated over many queries that the attacker can issue to adaptively one after 00:03:51.193 --> 00:03:56.014 the other. Now the chosen plain text attack is captured by the fact that if the 00:03:56.014 --> 00:04:00.646 attacker wants the encryption of a particular message m. What he could do is, 00:04:00.646 --> 00:04:05.234 for example, use query J for sum J, where in this query J he'll set both the zero 00:04:05.234 --> 00:04:09.593 message and the one message to be the exactly same message M. In other words, 00:04:09.593 --> 00:04:14.008 both the left message and the right message are the same, and both are set to 00:04:14.008 --> 00:04:18.653 the message M. In this case, what he will receive, since both messages are the same, 00:04:18.653 --> 00:04:23.126 he knows that he's gonna receive the encryption of this message M that he was 00:04:23.126 --> 00:04:27.600 interested in. So this is exactly what we meant by a chosen [inaudible] attack. 00:04:27.600 --> 00:04:32.598 Where the advisory can submit a message m and receive the encryption of that 00:04:32.598 --> 00:04:37.429 particular message m of his choice. So some of his queries might be of this chose 00:04:37.429 --> 00:04:42.157 plain text flavor where the message on the left is equal to the message on the right, 00:04:42.157 --> 00:04:46.775 but some of the queries might be standard semantic security queries where the two 00:04:46.775 --> 00:04:51.281 messages are distinct and that actually gives him information on whether he's in 00:04:51.281 --> 00:04:55.453 experiment zero or in experiment one. Now by now you should be used to this 00:04:55.453 --> 00:05:00.182 definition where we say that the system is semantically secure under a chosen plain 00:05:00.182 --> 00:05:04.141 text attack. If, for all efficient adversaries, they cannot distinguish 00:05:04.141 --> 00:05:08.703 experiment zero from experiment one. In other words, the probability that, at the 00:05:08.703 --> 00:05:13.091 end, the output, B Prime, which we're gonna denote by the output of experiment 00:05:13.091 --> 00:05:17.769 B. This output will be the same whether [inaudible] experiment zero or experiment 00:05:17.769 --> 00:05:22.310 one. So the attacker couldn't distinguish between always receiving encryptions of 00:05:22.310 --> 00:05:26.900 the left messages, versus always receiving encryptions of the right messages. So in 00:05:26.900 --> 00:05:31.267 your mind, I'd like you to be thinking of an adversary that is able to mount a 00:05:31.267 --> 00:05:35.745 chosen plaintext attack, namely, be given the encryption of arbitrary messages of 00:05:35.745 --> 00:05:40.168 his choice, and his goal is to break semantic security for some other challenge 00:05:40.168 --> 00:05:44.330 cipher texts. And as I said in this [inaudible] model of the real world the 00:05:44.330 --> 00:05:48.721 attacker is able to fool Alice into encrypting for him messages of his choice 00:05:48.721 --> 00:05:53.287 and then the attacker's goal is to somehow break some challenge cypher text. So I 00:05:53.287 --> 00:05:58.173 claim that all the ciphers that we've seen up until now, namely deterministic counter 00:05:58.173 --> 00:06:02.541 mode or the one time pad, are insecure under a chosen plain text attack. More 00:06:02.541 --> 00:06:07.312 generally, suppose we have an encryption scheme that always outputs the same cipher 00:06:07.312 --> 00:06:11.968 text for a particular message M. In other words, if I ask the encryption scheme to 00:06:11.968 --> 00:06:16.188 encrypt the message M once. And then I ask the encryption scheme to encrypt the 00:06:16.188 --> 00:06:21.183 message m again. If in both cases the encryption scheme outputs the same cypher 00:06:21.183 --> 00:06:26.550 text, then that system cannot possibly be secure under a chosen plain text attack. 00:06:26.550 --> 00:06:31.281 And both deterministic counter mode and the one time pad were of that flavor. They 00:06:31.281 --> 00:06:35.923 always output the same cipher text, given the same message. And so let's see why 00:06:35.923 --> 00:06:41.143 that cannot be chosen plain text secure. And the attack is fairly simple, what the 00:06:41.143 --> 00:06:46.300 attacker is gonna do, is he's gonna output the same message twice. This just says. 00:06:46.300 --> 00:06:51.233 That he really wants the encryption of M0. So here the attacker is given C0 which is 00:06:51.233 --> 00:06:55.872 the encryption of M0. So this was his chosen plain text query where he actually 00:06:55.872 --> 00:07:00.805 received the encryption of the message M0 of his choice. And now he's going to break 00:07:00.805 --> 00:07:05.445 semantic security. So what he does is he outputs two messages, M0 and M1 of the 00:07:05.445 --> 00:07:10.084 same length, and he's going to be given the encryption of MB. But low and behold, 00:07:10.084 --> 00:07:15.850 we said that the encryption system. Always outputs the same cipher text when its 00:07:15.850 --> 00:07:21.539 encrypting the message, M0. Therefore, if B is=to zero, we know that C, this 00:07:21.539 --> 00:07:27.310 challenged cipher text, is simply=to CO, because it's the encryption of M0. 00:07:27.310 --> 00:07:32.409 However, if B is=to one. Then we know that this challenge cypher text is the 00:07:32.409 --> 00:07:38.048 encryption of M1 which is something other than C zero so all the attacker does is he 00:07:38.048 --> 00:07:43.441 just checks his C is = to C0 the output's zero in other words he outputs one. So, in 00:07:43.441 --> 00:07:47.722 this case, the attacker is able to perfectly guess this bit B, so he knows 00:07:47.722 --> 00:07:52.412 exactly [inaudible] given the encryption of M0, or the encryption of M1. And as a 00:07:52.412 --> 00:07:57.103 result, his advantage in winning this game is one. Meaning that the system cannot 00:07:57.103 --> 00:08:01.491 possibly be CPA secure. One is not a negligible number. So this shows that the 00:08:01.491 --> 00:08:05.582 deterministic encryption schemes cannot possibly be CPA-secure, but you might 00:08:05.582 --> 00:08:09.345 wonder well, what does this mean in practice? Well in practice this means 00:08:09.345 --> 00:08:13.111 again that every message is always encrypted to the same cipher text. What 00:08:13.111 --> 00:08:17.234 this means is if you're encrypting files on disk, and you happen to be encrypting 00:08:17.234 --> 00:08:21.407 two files that happen to be the same, they will result in the same cipher text and 00:08:21.407 --> 00:08:25.327 then the attacker by looking at the encrypted disk, will learn that these two 00:08:25.327 --> 00:08:29.297 files actually contain the same content. The attacker might not learn what the 00:08:29.297 --> 00:08:33.419 content is, but he will learn that these two encrypted files are an encryption of 00:08:33.419 --> 00:08:37.524 the same content and he shouldn't be able to learn that. Similarly, if you send two 00:08:37.524 --> 00:08:41.287 encrypted packets on the network that happen to be the same, the attacker will 00:08:41.287 --> 00:08:45.146 not learn the content of those packets, but he will learn that those two packets 00:08:45.146 --> 00:08:49.301 actually contain the same information. Think for example of an encrypted voice 00:08:49.301 --> 00:08:53.769 conversation. Every time there's quiet on the line, the system will be sending 00:08:53.769 --> 00:08:58.072 encryptions of zero. But since encryption of zero are always mapped to the same 00:08:58.072 --> 00:09:02.334 cypher text. An attacker looking at the network will be able to identify exactly 00:09:02.334 --> 00:09:06.489 the points in the conversation where there's quiet because he will always see 00:09:06.489 --> 00:09:11.113 those exact same cypher text every time. So these are examples where deterministic 00:09:11.113 --> 00:09:15.492 encryption cannot possibly be secure. And as I say formerly we say that the 00:09:15.492 --> 00:09:19.800 terministic encryption can not be semantically secure under a chosen plain 00:09:19.800 --> 00:09:24.743 text attack. So what do we do, well the lesson here is if the secret keys gonna be 00:09:24.743 --> 00:09:29.674 used to encrypt multiple messages, it had better be the case that given the same 00:09:29.674 --> 00:09:33.572 plain text to encrypt twice. The encryption algorithm must produce 00:09:33.572 --> 00:09:38.147 different cipher texts. And so there are two ways to do that. The first method is 00:09:38.147 --> 00:09:42.836 what's called randomized encryption. Here, the encryption algorithm itself is going 00:09:42.836 --> 00:09:47.296 to choose some random string during the encryption process and it is going to 00:09:47.296 --> 00:09:51.642 encrypt the message M using that random string. So what this means is that a 00:09:51.642 --> 00:09:56.389 particular message, M0 for example, isn't just going to be mapped to one cipher text 00:09:56.389 --> 00:10:00.894 but it's going to be mapped to a whole ball of cipher texts. Whereon every 00:10:00.894 --> 00:10:06.692 encryption, basically, we output one point in this ball. So every time we encrypt, the 00:10:06.692 --> 00:10:11.292 encryption algorithm chooses a random string, and that random string leads to 00:10:11.292 --> 00:10:15.832 one point in this ball. Of course, the decryption algorithm, when it takes any 00:10:15.832 --> 00:10:20.610 point in this ball, will always map the result to M zero. Similarly cipher text M 00:10:20.610 --> 00:10:25.449 one will be mapped to a ball, and every time we encrypt M one, we basically output 00:10:25.449 --> 00:10:29.690 one point in this ball. And these balls have to be disjoint, so that the 00:10:29.690 --> 00:10:34.469 encryption algorithm, when it obtains a point in the ball corresponding to M one, 00:10:34.469 --> 00:10:38.964 will always output the message M one. In this way, since the encryption algorithm 00:10:38.964 --> 00:10:43.266 uses randomness, if we encrypt the same message twice, with high probability we'll 00:10:43.266 --> 00:10:47.144 get different cipher texts. Unfortunately this means that the cipher text 00:10:47.144 --> 00:10:51.393 necessarily has to be longer than the plain text because somehow the randomness 00:10:51.393 --> 00:10:55.855 that was used to generate the cipher text is now encoded somehow in the cipher text. 00:10:55.855 --> 00:11:00.158 So the cipher text takes more space. And roughly speaking, the cipher text size is 00:11:00.158 --> 00:11:04.620 going to be larger than the plain text. By basically the number of random bits that 00:11:04.620 --> 00:11:08.748 were used during encryption. So if the plain texts are very big, if the plain 00:11:08.748 --> 00:11:13.203 texts are gigabytes long, the number of random bits is going to be on the order of 00:11:13.203 --> 00:11:17.494 128. So maybe this extra space doesn't really matter. But if the plain texts are 00:11:17.494 --> 00:11:21.786 very short, maybe they themselves are 128 bits, then adding an extra 128 bits to 00:11:21.786 --> 00:11:26.240 every cipher text is going to double the total cipher text size. And that could be 00:11:26.240 --> 00:11:31.117 quite expensive. So as I say randomized encryption is a fine solution but in some 00:11:31.117 --> 00:11:35.862 cases it actually introduces quite a bit of costs. So let's look at a simple example. 00:11:35.862 --> 00:11:41.107 So imagine we have a pseudorandom function that takes inputs in a certain 00:11:41.107 --> 00:11:46.223 space r which is gonna be called a nonce space. And outputs, outputs in the message 00:11:46.223 --> 00:11:50.636 space. And, now, let's define the following randomize encryption scheme 00:11:50.636 --> 00:11:55.880 where we want to encrypt the message m with the encryption of whatever it's gonna 00:11:55.880 --> 00:12:01.149 do is first it's gonna generate a random r in this nonce space R. And then it's going 00:12:01.149 --> 00:12:06.232 to open a cypher text that consist of two components, the first component is going 00:12:06.232 --> 00:12:10.943 to be this value R and the second component is going to be an evaluation of 00:12:10.943 --> 00:12:16.180 the pseudo-random function at the point R XOR with the message M. And my question to 00:12:16.180 --> 00:12:21.397 you is, is this encryption system semantically secure under a chosen plain 00:12:21.397 --> 00:12:26.290 text attack. So the correct answer is yes. But only if the nonce space R is large 00:12:26.290 --> 00:12:31.249 enough so that little R never repeats with very, very high probability. And let's 00:12:31.249 --> 00:12:36.332 quickly argue why that's true. So first of all, because F is a secure pseudo-random 00:12:36.332 --> 00:12:41.352 function, we might as well replace it with a truly random function. In other words, 00:12:41.352 --> 00:12:46.373 this is indistinguishable from the case where we encrypt the message M, using the 00:12:46.373 --> 00:12:51.252 truly random function little F, evaluated to point R, and then XOR with M. 00:12:51.252 --> 00:12:57.320 But since this little r never repeats every cypher text uses a different little r what 00:12:57.320 --> 00:13:03.095 this means is that the values of F(r) are random uniform independent strings 00:13:03.095 --> 00:13:08.818 every time. So every time we encrypt a message, we encrypt it essentially using a 00:13:08.818 --> 00:13:14.369 new uniform random one time pad. And since XORing a uniform string with any string 00:13:14.369 --> 00:13:19.666 simply generates a new uniform string, the resulting cipher text is distributed as 00:13:19.666 --> 00:13:24.767 simply two random uniform strings. I'll call them r and r prime. And so both in 00:13:24.767 --> 00:13:30.325 experiment zero and in experiment one, all the attacker gets to see are truly uniform 00:13:30.325 --> 00:13:35.622 random strings r, r', and since in both experiments the attacker is seeing the same 00:13:35.622 --> 00:13:40.666 distribution, he cannot distinguish the two distributions. And so since security 00:13:40.666 --> 00:13:45.695 holds completely when we're using a truly random function it's also gonna hold when 00:13:45.695 --> 00:13:50.559 we're using a pseudorandom function. Okay, so this is a nice example of how we use 00:13:50.559 --> 00:13:55.435 the fact that the pseudo random function behaves like a random function to argue 00:13:55.435 --> 00:13:59.829 security of this particular encryption scheme. Okay, so now we have a nice 00:13:59.829 --> 00:14:04.465 example of randomized encryption. The other approach to building chosen plain 00:14:04.465 --> 00:14:09.344 text secure encryption schemes is what's called a nonce based encryption. Now, in 00:14:09.344 --> 00:14:14.012 a non-spaced encryption system, the encryption algorithm actually takes three 00:14:14.012 --> 00:14:19.044 inputs rather than two. As usual it takes the key and the message. But it also takes 00:14:19.044 --> 00:14:23.773 an additional input called a nonce. And similarly, the decryption algorithm also 00:14:23.773 --> 00:14:28.683 takes the nonce as input, and then produces the resulting decrypted plain text. And 00:14:28.683 --> 00:14:33.529 what is this nonce value n. This nonce is a public value. It does not need to be 00:14:33.529 --> 00:14:38.402 hidden from the adversary but the only requirement is that the pair (k,n) 00:14:38.402 --> 00:14:43.213 is only used to encrypt a single message. In other words, this pair (k,n) 00:14:43.213 --> 00:14:48.148 must change from message to message. And there are two ways to change it. One way 00:14:48.148 --> 00:14:53.144 to change it is by choosing a new random key for every message. And the other way 00:14:53.144 --> 00:14:58.276 is to keep using the same key all the time but then we must choose a new nonce for 00:14:58.276 --> 00:15:02.721 every message. And, and as I said, I wanna emphasize again, this nonce need not 00:15:02.721 --> 00:15:06.823 be secret, and it need not be random. The only requirement is the nonce is unique. 00:15:06.823 --> 00:15:11.029 And in fact, we're gonna use this term throughout the course. A nonce 00:15:11.029 --> 00:15:15.247 for us, means a unique value that doesn't repeat. It does not have to be random. So 00:15:15.247 --> 00:15:19.891 let's look at some examples of choosing an nonce, well the simplest option is 00:15:19.891 --> 00:15:24.255 simply to make the nonce of the accounter so for example the networking 00:15:24.255 --> 00:15:28.898 protocol you can imagine the nonce being a packet counter that's incremented 00:15:28.898 --> 00:15:33.598 every time a packet is sent by a sender or received by the receiver this means that 00:15:33.598 --> 00:15:37.962 the encrypter has to keep state from message to message mainly that he has to 00:15:37.962 --> 00:15:42.270 keep this counter around and increment it after every message is transmitted. 00:15:42.270 --> 00:15:47.487 Interestingly, if the decrypter actually has the same state then there is no need 00:15:47.487 --> 00:15:52.705 to include the nuance in the cipher text since the nuance is implicit. Let's look 00:15:52.705 --> 00:15:57.987 at an example. The https protocol is run over a reliable transport mechanism which 00:15:57.987 --> 00:16:03.075 means that packets sent by the sender are assumed to be received in order at a 00:16:03.075 --> 00:16:07.645 recipient. So if the sender sends packet #5 and then packet #6, the recipient 00:16:07.645 --> 00:16:12.068 will receive packet #5 and then packet #6 in that order. This 00:16:12.068 --> 00:16:16.215 means that if the sender maintains a packet counter, the recipient can also 00:16:16.215 --> 00:16:20.860 maintain a packet counter and two counters basically increment in sync. In this case 00:16:20.860 --> 00:16:24.896 there is no reason to include the nonce in the packets because the 00:16:24.896 --> 00:16:29.476 nonce is implicit between the two sides. However, in other protocols, for 00:16:29.476 --> 00:16:34.600 example, in IPsec, IPsec has a protocol designed to encrypt the IP layer. The IP 00:16:34.600 --> 00:16:39.330 layer does not guarantee in order delivery. And so the sender might send 00:16:39.330 --> 00:16:44.520 packet #5 and then packet #6, but those will be received in reverse order at 00:16:44.520 --> 00:16:49.164 the recipient. In this case it's still fine to use a packet counter as a nonce 00:16:49.164 --> 00:16:53.748 but now the nonce has to be included in the packet so that the recipient knows 00:16:53.748 --> 00:16:58.102 which nonce to use to decrypt the received packet. So as I say, nonce based 00:16:58.102 --> 00:17:02.686 encryption is a very efficient way to achieve CPA security. In particular if the 00:17:02.686 --> 00:17:07.098 nonce is implicit, it doesn't even increase the cipher text length. Of course 00:17:07.098 --> 00:17:11.796 another method to generate a unique nonce is simply to pick the nonce at random 00:17:11.796 --> 00:17:16.495 assuming the nonce space is sufficiently large so that with high probability the 00:17:16.495 --> 00:17:21.579 nonce will never repeat for the life of the key. Now in this case, nonce 00:17:21.579 --> 00:17:26.098 based encryption simply reduces to randomized encryption. However, the 00:17:26.098 --> 00:17:31.600 benefit here is that the sender does not need to maintain any state from message to 00:17:31.600 --> 00:17:36.382 message. So this is very useful for example if encryption happens to take 00:17:36.382 --> 00:17:41.425 place on multiple devices. For example, I might have both a laptop and a smart 00:17:41.425 --> 00:17:46.096 phone. They might both use the same key. But in this case if I require state full 00:17:46.097 --> 00:17:49.961 encryption, then my laptop and the smartphone would have to coordinate to 00:17:49.961 --> 00:17:54.302 make sure that they never reuse the same nonces. Whereas if both of them simply take 00:17:54.302 --> 00:17:58.114 nonces at random, they don't need to coordinate because it was very high 00:17:58.114 --> 00:18:02.243 probability they'll simply never choose the same nonce. Again assuming the nonce 00:18:02.243 --> 00:18:06.478 space is big enough. So there are some cases where stateless encryption is quite 00:18:06.478 --> 00:18:10.562 important, in particular where the same key is used by multiple machines. So I 00:18:10.562 --> 00:18:14.492 wanted to find, more precisely, what security means for nonce based 00:18:14.492 --> 00:18:18.694 encryption. And in particular, I want to emphasize that the system must remain 00:18:18.694 --> 00:18:23.122 secure when the nonce are chosen by the adversary. The reason it's important 00:18:23.122 --> 00:18:27.027 to allow the adversary to choose the nonces is because the adversary can 00:18:27.027 --> 00:18:31.090 choose which cipher text it wants to attack. So imagine the nonce happens to 00:18:31.090 --> 00:18:35.364 be a counter and it so happens that when the couter hits the value fifteen, maybe 00:18:35.364 --> 00:18:39.428 at that point it's easy for the adversary to break semantic security. So the 00:18:39.428 --> 00:18:43.702 adversary will wait until the fifteenth packet is sent and only then he will ask 00:18:43.702 --> 00:18:48.076 to break semantic security. So when we talk about nonce based encryption, we 00:18:48.076 --> 00:18:52.806 generally allow the adversary to choose the nonce and the system should remain 00:18:52.806 --> 00:18:57.772 secure even under those settings. So let's define the CPA game in this case and it's 00:18:57.772 --> 00:19:02.442 actually very similar to the game before. Basically the attacker gets to submit 00:19:02.442 --> 00:19:06.935 pairs of messages MI, MI0, and MI1. Obviously they both have to be of the same 00:19:06.935 --> 00:19:11.576 length. And he gets to supply the nonce. And in response, the adversary is given 00:19:11.576 --> 00:19:16.304 the encryption of either MI0, or MI1. But using the nonce that the adversary 00:19:16.304 --> 00:19:20.740 chose. And of course, as usual, the adversary's goal is to tell whether he was 00:19:20.740 --> 00:19:25.096 given the encryption of the left plain text or the right plain text. And as 00:19:25.096 --> 00:19:29.464 before the adversary gets to iterate these queries and he can issue as, as many 00:19:29.464 --> 00:19:33.610 queries as he wants, we usually let q denote the number of queries that the 00:19:33.610 --> 00:19:37.956 adversary issues. Now the only restriction of course, which is crucial, is that 00:19:37.956 --> 00:19:42.329 although the adversary gets to choose the nonces, he's restricted to choosing 00:19:42.329 --> 00:19:46.758 distinct nonces. The reason we force him to choose distinct nonces is because 00:19:46.758 --> 00:19:50.959 that's the requirement in practice. Even if the adversary fools Alice into 00:19:50.959 --> 00:19:55.161 encrypting multiple messages for him, Alice will never use the same nonce 00:19:55.161 --> 00:19:59.477 again. As a result, the adversary will never see messages encrypted using the 00:19:59.477 --> 00:20:03.678 same nonce and therefore, even in the game, we require that all nonce be 00:20:03.678 --> 00:20:08.305 distinct. And then as usual we say that the system is a nonce based encryption 00:20:08.305 --> 00:20:13.412 system that's, semantically secure under a chosen plain text attack if the adversary 00:20:13.421 --> 00:20:17.890 cannot distinguish experiment zero where he's given encryptions of the left 00:20:17.890 --> 00:20:22.593 messages from experiment one where he's given encryptions of the right messages. 00:20:22.593 --> 00:20:27.121 So let's look at an example of a nonce based encryption system. As before, we 00:20:27.121 --> 00:20:32.119 have a secure PRF that takes inputs in the nonce space R and outputs strings in the 00:20:32.119 --> 00:20:36.823 message space M. Now when a new key is chosen, we're going to reset our counter R 00:20:36.823 --> 00:20:41.006 to be zero. And now we encrypt the particular message M, what we will do is 00:20:41.006 --> 00:20:45.103 we will increment our counter R, and then encrypt the message M using the 00:20:45.103 --> 00:20:49.481 pseudorandom function applied to this value R. And as before, the cipher text is 00:20:49.481 --> 00:20:53.859 going to contain two components, our current value of the counter and then the 00:20:53.859 --> 00:20:58.518 one time pad encryption of the message M. And so my question to you is whether this 00:20:58.518 --> 00:21:03.312 is a secure, non-spaced encryption system. So the answer as before is yes, but only 00:21:03.312 --> 00:21:08.485 if the nuance space is large enough. So as we increment the counter R, it will never 00:21:08.485 --> 00:21:13.284 cycle back to zero so that the nuances will always, always be unique. We argue 00:21:13.284 --> 00:21:18.020 security the same way as before. Because the PRF is secure, we know that this 00:21:18.020 --> 00:21:22.819 encryption system is indistinguishable from using a truly random function. In 00:21:22.819 --> 00:21:27.493 other words, if we apply a truly random function to the counter and XOR the 00:21:27.493 --> 00:21:32.602 results with, the plain text M. But now since the nuance R never repeats, every 00:21:32.602 --> 00:21:37.447 time we compute this F of R, we get a truly random uniform and independent 00:21:37.447 --> 00:21:42.843 string so that we're actually encrypting every message using the one time pad. And 00:21:42.843 --> 00:21:48.306 as a result, all the adversary gets to see in both experiments are basically just a 00:21:48.306 --> 00:21:52.751 pair of random strings. So both the experiment zero and experiment one the 00:21:52.751 --> 00:21:57.408 adversary get's to see exactly the same distribution, namely, the responses to all 00:21:57.408 --> 00:22:02.064 this chosen plain text queries are just pairs of strings that are just uniformly 00:22:02.064 --> 00:22:06.950 distributed and this is basically the same in experiment zero and experiment one and, 00:22:06.950 --> 00:22:11.664 therefore, the attacker cannot distinguish the two experiments. And since he cannot 00:22:11.664 --> 00:22:16.206 win the semantic security game with a truly random function he, also, cannot win 00:22:16.206 --> 00:22:20.517 the semantics security game with the secure PRF, and, therefore, the scheme is 00:22:20.517 --> 00:22:25.222 secure. So now we understand what it means for a symmetric system to be secure when 00:22:25.222 --> 00:22:30.091 the keys used to encrypt multiple messages the requirement is that it be secure under 00:22:30.091 --> 00:22:34.777 a chosen plan of attack. And we said that basically, the only way to be secure under 00:22:34.777 --> 00:22:39.289 a chosen plain text attack is either to use randomized encryption, or to use, use 00:22:39.289 --> 00:22:43.462 nonce spaced encryption where the nonce never repeats. And then in the 00:22:43.462 --> 00:22:48.143 next two segments, we're gonna build two classic encryption systems that are secure 00:22:48.143 --> 00:22:50.174 when the key is used multiple times.