WEBVTT 00:00:00.000 --> 00:00:03.827 In the last segment, we explained what is a public key encryption system. And we 00:00:03.827 --> 00:00:07.557 defined what it means for a public key encryption system to be secure. If you 00:00:07.557 --> 00:00:11.142 remember, we required security against active attacks. And in particular, we 00:00:11.142 --> 00:00:15.211 defined chosen cipher text security as our goal. This week, we're gonna construct two 00:00:15.211 --> 00:00:19.281 families of public key encryption systems that are chosen cipher text secure. And in 00:00:19.281 --> 00:00:22.914 this segment, we're gonna start by constructing public key encryptions from, 00:00:23.059 --> 00:00:27.124 a concept called a trapdoor permutation. So let's start by defining a 00:00:27.124 --> 00:00:31.064 general concept called a trapdoor function. So what is a trapdoor function? 00:00:31.064 --> 00:00:35.484 Well, a trapdoor function basically is a function that goes from some set X to some 00:00:35.484 --> 00:00:39.585 set Y. And it's really defined by a triple of algorithms. There's a generation 00:00:39.585 --> 00:00:43.685 algorithm, the function f, and the inverse of the function f. So the generation 00:00:43.685 --> 00:00:47.892 algorithm, basically what it does when you run it, it will generate a key pair, a 00:00:47.892 --> 00:00:52.098 public key and a secret key. The public key is gonna define a specific function 00:00:52.098 --> 00:00:56.869 from the set X to the set Y. And then the secret key is going to define the inverse 00:00:56.869 --> 00:01:01.639 function now from the set Y to the set X. So the idea is that you can evaluate 00:01:01.639 --> 00:01:06.588 the function at any point using a public key PK and then you can invert that function 00:01:06.588 --> 00:01:12.443 using the secret key, SK. So what do I mean by inversion? More precisely, if we 00:01:12.443 --> 00:01:17.255 look at any public key and secret key pair generated by the key generation algorithm 00:01:17.255 --> 00:01:21.727 G, then it so happens that if I evaluate the function at the point X, and then I 00:01:21.727 --> 00:01:26.142 evaluate the inverse at the resulting point, I should get the original point X 00:01:26.142 --> 00:01:30.670 back. So the picture you should have in your mind, is there is this big set X and 00:01:30.670 --> 00:01:35.857 this big set Y And then, this function will map any point in X to a point in Y, 00:01:35.857 --> 00:01:41.508 and this can be done using the public key. So again any point in X can be mapped, to 00:01:41.508 --> 00:01:46.897 a point in Y. And then if someone has the secret key, then basically they can go in 00:01:46.897 --> 00:01:53.758 the reverse direction by applying, this secret key sk. So now that we 00:01:53.758 --> 00:01:58.289 understand what a trapdoor function is, let's define what it means for a trapdoor 00:01:58.289 --> 00:02:02.652 function to be secure. And so we'll say that this triple, (G, F, F inverse), is secure 00:02:02.652 --> 00:02:06.903 if in fact this function F(PK, .) is what's called a one way function. And let me 00:02:06.903 --> 00:02:10.986 explain what a, what is a one way function. The idea is that basically the 00:02:10.986 --> 00:02:15.516 function can be evaluated at any point, but inverting it is difficult without the 00:02:15.516 --> 00:02:19.639 secret key SK. So let's define that more precisely. As usual we define that using a 00:02:19.639 --> 00:02:23.764 game. So here we have our game between the challenger and the adversary. And the game 00:02:23.764 --> 00:02:27.496 proceeds as follows. Basically the challenger will generate a public key and 00:02:27.496 --> 00:02:31.622 a secret key. And then they will generate a random X. It will send the public key 00:02:31.622 --> 00:02:36.116 over to the adversary and then it will evaluate the function at the point X and 00:02:36.116 --> 00:02:40.160 send the resulting Y also to the adversary. So all the adversary gets to 00:02:40.160 --> 00:02:44.653 see is just a public key, which defines what the function is, and then he gets to 00:02:44.653 --> 00:02:49.483 see the image of this function on a random point X and is goal is basically to invert 00:02:49.483 --> 00:02:54.097 the function at this point Y. Okay, so he outputs some X prime. And we said that the 00:02:54.097 --> 00:02:58.507 trap door function is secure if the probability that the ad, adversary inverts 00:02:58.507 --> 00:03:03.143 the given at point y is negligible. In other words, given y the probability that 00:03:03.143 --> 00:03:07.271 the adversary's able to alter the pre image of y is in fact a negligible 00:03:07.271 --> 00:03:11.907 probability and if that's true for all efficient algorithms then we say that this 00:03:11.907 --> 00:03:17.882 trapdor function is secure. So again abstractly, it's a really interesting 00:03:17.882 --> 00:03:21.885 concept in that you can evaluate the function in the forward direction very 00:03:21.885 --> 00:03:26.150 easily. But then no one can evaluate the function in the reverse direction unless 00:03:26.150 --> 00:03:30.311 they have this trapdoor, the secret key SK, which then all of a sudden lets them 00:03:30.311 --> 00:03:35.424 invert the function very, very easily. So using the concept of a trapdoor function, 00:03:35.424 --> 00:03:39.552 it's not too difficult to build a public key encryption system, and let me show you how 00:03:39.552 --> 00:03:43.528 to do it. So here we have we our trap door function, (G, F, and F inverse). The other 00:03:43.528 --> 00:03:47.605 tool that we are going to need is a symmetric encryption scheme, and I'm going to 00:03:47.605 --> 00:03:51.531 assume that this encryption scheme is actually secure against active attacks, so in 00:03:51.531 --> 00:03:55.350 particular I needed to provide authenticated encryption. Notice that the 00:03:55.350 --> 00:04:00.726 symmetric encryption system takes keys in K and the trapdoor function takes inputs 00:04:00.726 --> 00:04:05.790 in X. Those are two different sets and so we're also gonna need the hash function. 00:04:05.790 --> 00:04:09.937 That goes from X to K. In other words, it maps elements in the set X into keys for 00:04:09.937 --> 00:04:14.033 the symmetric encryption systems. And now once we have these three components, we 00:04:14.033 --> 00:04:17.821 can actually construct the public key encryption system as follows: so the key 00:04:17.821 --> 00:04:21.764 generation for the public key encryption system is basically exactly the same as 00:04:21.764 --> 00:04:25.655 the key generation for the trap door function. So we run G for the trap door 00:04:25.655 --> 00:04:29.956 function, we get a public key and a secret key. And those are gonna be the public and 00:04:29.956 --> 00:04:34.171 secret keys for the public key encryption system. And how do we encrypt and decrypt? Let's 00:04:34.171 --> 00:04:38.978 start with encryption. So the encryption algorithm takes a public key and a message 00:04:38.978 --> 00:04:43.898 as input. So what it will do is it will generate a random X from the set capital 00:04:43.898 --> 00:04:48.545 X. It will then apply the trapdoor function to this random X, to obtain Y. So 00:04:48.545 --> 00:04:53.130 Y is the image of X under the trapdoor function. Then it will go ahead and 00:04:53.130 --> 00:04:58.272 generate a symmetric key by hashing X. So this is a symmetric key for the symmetric 00:04:58.272 --> 00:05:03.290 key system. And then finally it encrypts the plain text message 'm' using this key that was 00:05:03.290 --> 00:05:08.123 just generated. And then it outputs the value Y that it just computed, which is 00:05:08.123 --> 00:05:13.260 the image of X, along the encryption under the symmetric system of the message M. So 00:05:13.260 --> 00:05:18.366 that's how encryption works. And I want to emphasize again that the trapdoor function 00:05:18.366 --> 00:05:23.112 is only applied to this random value X, whereas the message itself is encrypted 00:05:23.112 --> 00:05:28.098 using a symmetric key system using a key that was derived from the value X that we 00:05:28.098 --> 00:05:32.959 chose at random. So now that we understand encryption, let's see how to decrypt. 00:05:32.959 --> 00:05:37.366 While the decryption algorithm takes a secret key as input, and the ciphertext. 00:05:37.366 --> 00:05:41.551 The ciphertext itself contains two components, the value Y and the value C. 00:05:41.551 --> 00:05:46.070 So the first step we're gonna do, is we're gonna apply the inverse transformation, 00:05:46.070 --> 00:05:50.366 the inverse trap door function to the value Y, and that will give us back the 00:05:50.366 --> 00:05:54.495 original X that was chosen during encryption. So now let me ask you, how do 00:05:54.495 --> 00:06:00.042 we derive the symmetric decryption key K from this X that we just obtained? Well, 00:06:00.042 --> 00:06:04.736 so that's an easy question. We basically hash X again. That gives us K just as 00:06:04.736 --> 00:06:09.372 during encryption. And now that we have this symmetric encryption key we can apply 00:06:09.372 --> 00:06:13.783 the, the symmetric decryption algorithm to decrypt the ciphertext C. We get the 00:06:13.783 --> 00:06:17.741 original message M and that's what we output. So, that's how the public key 00:06:17.741 --> 00:06:22.321 encryption system works were this trap door function is only used for encrypting 00:06:22.321 --> 00:06:26.788 some sort of a random value X and the actual message is encrypted using the 00:06:26.788 --> 00:06:31.244 symmetric system. So in pictures here, we have the message M obviously the plain 00:06:31.244 --> 00:06:35.545 text could be quite large. So, here we have the body of the deciphered text which 00:06:35.545 --> 00:06:39.953 can be quite long is actually encrypted using the symmetric system. And then again 00:06:39.953 --> 00:06:44.039 I emphasize that the key for the symmetric system is simply the hash of X. 00:06:44.039 --> 00:06:48.232 And then the header of ciphertext is simply this application of the trapdoor 00:06:48.232 --> 00:06:52.641 function to this random X that we picked. And so during decryption what happens is 00:06:52.641 --> 00:06:56.888 we first decrypt the header to get X and then we decrypt the body using the 00:06:56.888 --> 00:07:01.829 symmetric system to actually get the original plain text M. So as usual when I 00:07:01.829 --> 00:07:06.542 show you a system like this, obviously you want to verify that decryption in fact is 00:07:06.542 --> 00:07:10.605 the inverse of encryption. But more importantly you want to ask why is this 00:07:10.605 --> 00:07:14.963 system secure. And in fact there's a nice security theorem here that says. That if 00:07:14.963 --> 00:07:18.900 the trap door function that we started with is secure. In other words, that's a 00:07:18.900 --> 00:07:22.634 one way function if the adversary doesn't have a secret key. The symmetric 00:07:22.634 --> 00:07:26.621 encryption system provides authenticated encryption. And the hash function is a 00:07:26.621 --> 00:07:30.558 random oracle, which simply means that it's a random function from the set X to 00:07:30.558 --> 00:07:34.696 the set of keys K. So a random oracle is some sort of an idealization of, what a 00:07:34.696 --> 00:07:38.280 hash function is supposed to be. In practice, of course, when you come to 00:07:38.280 --> 00:07:42.317 implement a system like this, you would just use, SHA-256, or any of the 00:07:42.317 --> 00:07:47.252 other hash functions that we discussed in class. So, under those three conditions in 00:07:47.252 --> 00:07:51.863 fact the system that we just described is chosen cipher text secure so it is CCA 00:07:51.863 --> 00:07:56.416 secure, the little ro here just denote the fact that security is set in whats called 00:07:56.416 --> 00:08:00.572 a random oracle model. But, that's a detail that's actually not so important for 00:08:00.572 --> 00:08:05.012 discussion here, what I want you to remember is that if the trap door function 00:08:05.012 --> 00:08:09.000 is in fact a secure trap door function. The symmetric encryption system is secure 00:08:09.000 --> 00:08:13.017 against tampering so it provides authenticated encryption. And H 00:08:13.017 --> 00:08:17.468 is in some sense a good hash function. It's a random, function, which in practice 00:08:17.468 --> 00:08:22.245 you would just use SHA-256, then in fact the system that we just showed is CCA 00:08:22.245 --> 00:08:27.615 secure, is chosen ciphertext secure. I should tell you that there's actually an ISO 00:08:27.615 --> 00:08:31.752 standard that, defines this mode of encryption, of public encryption. ISO 00:08:31.752 --> 00:08:35.781 stands for International Standards Organization. So in fact this particular 00:08:35.781 --> 00:08:40.456 system has actually been standardized, and this is a fine thing to use. I'll refer to 00:08:40.456 --> 00:08:44.947 this as the ISO encryption in the next few segments. To conclude this segment, I want 00:08:44.947 --> 00:08:48.925 to warn you about an incorrect way of using a trapdoor function to build a 00:08:48.925 --> 00:08:53.328 public key encryption system. And in fact this method might be the first thing that 00:08:53.328 --> 00:08:57.572 comes to mind, and yet it's completely insecure. So let me show you, how not to 00:08:57.572 --> 00:09:01.762 encrypt using a trapdoor function. Well the first thing that might come to mind 00:09:01.762 --> 00:09:05.696 is, well, let's apply the trapdoor function directly to the message M. So we 00:09:05.696 --> 00:09:10.047 encrypt simply by applying a function to the message M, and we decrypt simply by 00:09:10.047 --> 00:09:14.180 applying F inverse to the ciphertext C to recover the original message M. So 00:09:14.180 --> 00:09:18.639 functionally, this is in fact, decryption is the inverse of encryption, and yet this 00:09:18.639 --> 00:09:22.881 is completely insecure for many, many different reasons. The easiest way to see 00:09:22.881 --> 00:09:26.960 that this is insecure, is that it's simply, this is deterministic encryption. 00:09:26.960 --> 00:09:30.944 You notice there is no randomness being used here. When we encrypt a message 00:09:30.944 --> 00:09:34.154 M, and since it is deterministic, it's cannot possibly be 00:09:34.154 --> 00:09:37.948 semantically secure. But in fact, as I said, when we instantiate this trap door 00:09:37.948 --> 00:09:41.644 function with particular implementations, for example with the RSA trap door 00:09:41.644 --> 00:09:44.951 function, then there are many, many attacks that are possible on this 00:09:44.951 --> 00:09:48.794 particular construction, and so you should never, ever, ever use it, and I'm gonna 00:09:48.794 --> 00:09:52.830 repeat this throughout this module, and in fact in the next segment I'll show you a 00:09:52.830 --> 00:09:56.699 number of attacks on this particular implementation. Okay so, what I would like 00:09:56.699 --> 00:10:00.717 you to remember is that you should be using an encryption system like the ISO 00:10:00.717 --> 00:10:04.992 standard, and you should never apply the trap door function directly to the message M. 00:10:04.992 --> 00:10:09.010 Although in the next segment we'll see other ways to encrypt using a trap 00:10:09.010 --> 00:10:13.233 door function that are also correct, but this particular method is clearly, clearly 00:10:13.233 --> 00:10:17.560 incorrect. Okay, so now that we understand how to build public key encryption 00:10:17.560 --> 00:10:21.423 given a trap door function, the next question is how to construct trap door 00:10:21.423 --> 00:10:24.360 functions, and we're going to do that in the next segment.