1 00:00:00,000 --> 00:00:03,827 In the last segment, we explained what is a public key encryption system. And we 2 00:00:03,827 --> 00:00:07,557 defined what it means for a public key encryption system to be secure. If you 3 00:00:07,557 --> 00:00:11,142 remember, we required security against active attacks. And in particular, we 4 00:00:11,142 --> 00:00:15,211 defined chosen cipher text security as our goal. This week, we're gonna construct two 5 00:00:15,211 --> 00:00:19,281 families of public key encryption systems that are chosen cipher text secure. And in 6 00:00:19,281 --> 00:00:22,914 this segment, we're gonna start by constructing public key encryptions from, 7 00:00:23,059 --> 00:00:27,124 a concept called a trapdoor permutation. So let's start by defining a 8 00:00:27,124 --> 00:00:31,064 general concept called a trapdoor function. So what is a trapdoor function? 9 00:00:31,064 --> 00:00:35,484 Well, a trapdoor function basically is a function that goes from some set X to some 10 00:00:35,484 --> 00:00:39,585 set Y. And it's really defined by a triple of algorithms. There's a generation 11 00:00:39,585 --> 00:00:43,685 algorithm, the function f, and the inverse of the function f. So the generation 12 00:00:43,685 --> 00:00:47,892 algorithm, basically what it does when you run it, it will generate a key pair, a 13 00:00:47,892 --> 00:00:52,098 public key and a secret key. The public key is gonna define a specific function 14 00:00:52,098 --> 00:00:56,869 from the set X to the set Y. And then the secret key is going to define the inverse 15 00:00:56,869 --> 00:01:01,639 function now from the set Y to the set X. So the idea is that you can evaluate 16 00:01:01,639 --> 00:01:06,588 the function at any point using a public key PK and then you can invert that function 17 00:01:06,588 --> 00:01:12,443 using the secret key, SK. So what do I mean by inversion? More precisely, if we 18 00:01:12,443 --> 00:01:17,255 look at any public key and secret key pair generated by the key generation algorithm 19 00:01:17,255 --> 00:01:21,727 G, then it so happens that if I evaluate the function at the point X, and then I 20 00:01:21,727 --> 00:01:26,142 evaluate the inverse at the resulting point, I should get the original point X 21 00:01:26,142 --> 00:01:30,670 back. So the picture you should have in your mind, is there is this big set X and 22 00:01:30,670 --> 00:01:35,857 this big set Y And then, this function will map any point in X to a point in Y, 23 00:01:35,857 --> 00:01:41,508 and this can be done using the public key. So again any point in X can be mapped, to 24 00:01:41,508 --> 00:01:46,897 a point in Y. And then if someone has the secret key, then basically they can go in 25 00:01:46,897 --> 00:01:53,758 the reverse direction by applying, this secret key sk. So now that we 26 00:01:53,758 --> 00:01:58,289 understand what a trapdoor function is, let's define what it means for a trapdoor 27 00:01:58,289 --> 00:02:02,652 function to be secure. And so we'll say that this triple, (G, F, F inverse), is secure 28 00:02:02,652 --> 00:02:06,903 if in fact this function F(PK, .) is what's called a one way function. And let me 29 00:02:06,903 --> 00:02:10,986 explain what a, what is a one way function. The idea is that basically the 30 00:02:10,986 --> 00:02:15,516 function can be evaluated at any point, but inverting it is difficult without the 31 00:02:15,516 --> 00:02:19,639 secret key SK. So let's define that more precisely. As usual we define that using a 32 00:02:19,639 --> 00:02:23,764 game. So here we have our game between the challenger and the adversary. And the game 33 00:02:23,764 --> 00:02:27,496 proceeds as follows. Basically the challenger will generate a public key and 34 00:02:27,496 --> 00:02:31,622 a secret key. And then they will generate a random X. It will send the public key 35 00:02:31,622 --> 00:02:36,116 over to the adversary and then it will evaluate the function at the point X and 36 00:02:36,116 --> 00:02:40,160 send the resulting Y also to the adversary. So all the adversary gets to 37 00:02:40,160 --> 00:02:44,653 see is just a public key, which defines what the function is, and then he gets to 38 00:02:44,653 --> 00:02:49,483 see the image of this function on a random point X and is goal is basically to invert 39 00:02:49,483 --> 00:02:54,097 the function at this point Y. Okay, so he outputs some X prime. And we said that the 40 00:02:54,097 --> 00:02:58,507 trap door function is secure if the probability that the ad, adversary inverts 41 00:02:58,507 --> 00:03:03,143 the given at point y is negligible. In other words, given y the probability that 42 00:03:03,143 --> 00:03:07,271 the adversary's able to alter the pre image of y is in fact a negligible 43 00:03:07,271 --> 00:03:11,907 probability and if that's true for all efficient algorithms then we say that this 44 00:03:11,907 --> 00:03:17,882 trapdor function is secure. So again abstractly, it's a really interesting 45 00:03:17,882 --> 00:03:21,885 concept in that you can evaluate the function in the forward direction very 46 00:03:21,885 --> 00:03:26,150 easily. But then no one can evaluate the function in the reverse direction unless 47 00:03:26,150 --> 00:03:30,311 they have this trapdoor, the secret key SK, which then all of a sudden lets them 48 00:03:30,311 --> 00:03:35,424 invert the function very, very easily. So using the concept of a trapdoor function, 49 00:03:35,424 --> 00:03:39,552 it's not too difficult to build a public key encryption system, and let me show you how 50 00:03:39,552 --> 00:03:43,528 to do it. So here we have we our trap door function, (G, F, and F inverse). The other 51 00:03:43,528 --> 00:03:47,605 tool that we are going to need is a symmetric encryption scheme, and I'm going to 52 00:03:47,605 --> 00:03:51,531 assume that this encryption scheme is actually secure against active attacks, so in 53 00:03:51,531 --> 00:03:55,350 particular I needed to provide authenticated encryption. Notice that the 54 00:03:55,350 --> 00:04:00,726 symmetric encryption system takes keys in K and the trapdoor function takes inputs 55 00:04:00,726 --> 00:04:05,790 in X. Those are two different sets and so we're also gonna need the hash function. 56 00:04:05,790 --> 00:04:09,937 That goes from X to K. In other words, it maps elements in the set X into keys for 57 00:04:09,937 --> 00:04:14,033 the symmetric encryption systems. And now once we have these three components, we 58 00:04:14,033 --> 00:04:17,821 can actually construct the public key encryption system as follows: so the key 59 00:04:17,821 --> 00:04:21,764 generation for the public key encryption system is basically exactly the same as 60 00:04:21,764 --> 00:04:25,655 the key generation for the trap door function. So we run G for the trap door 61 00:04:25,655 --> 00:04:29,956 function, we get a public key and a secret key. And those are gonna be the public and 62 00:04:29,956 --> 00:04:34,171 secret keys for the public key encryption system. And how do we encrypt and decrypt? Let's 63 00:04:34,171 --> 00:04:38,978 start with encryption. So the encryption algorithm takes a public key and a message 64 00:04:38,978 --> 00:04:43,898 as input. So what it will do is it will generate a random X from the set capital 65 00:04:43,898 --> 00:04:48,545 X. It will then apply the trapdoor function to this random X, to obtain Y. So 66 00:04:48,545 --> 00:04:53,130 Y is the image of X under the trapdoor function. Then it will go ahead and 67 00:04:53,130 --> 00:04:58,272 generate a symmetric key by hashing X. So this is a symmetric key for the symmetric 68 00:04:58,272 --> 00:05:03,290 key system. And then finally it encrypts the plain text message 'm' using this key that was 69 00:05:03,290 --> 00:05:08,123 just generated. And then it outputs the value Y that it just computed, which is 70 00:05:08,123 --> 00:05:13,260 the image of X, along the encryption under the symmetric system of the message M. So 71 00:05:13,260 --> 00:05:18,366 that's how encryption works. And I want to emphasize again that the trapdoor function 72 00:05:18,366 --> 00:05:23,112 is only applied to this random value X, whereas the message itself is encrypted 73 00:05:23,112 --> 00:05:28,098 using a symmetric key system using a key that was derived from the value X that we 74 00:05:28,098 --> 00:05:32,959 chose at random. So now that we understand encryption, let's see how to decrypt. 75 00:05:32,959 --> 00:05:37,366 While the decryption algorithm takes a secret key as input, and the ciphertext. 76 00:05:37,366 --> 00:05:41,551 The ciphertext itself contains two components, the value Y and the value C. 77 00:05:41,551 --> 00:05:46,070 So the first step we're gonna do, is we're gonna apply the inverse transformation, 78 00:05:46,070 --> 00:05:50,366 the inverse trap door function to the value Y, and that will give us back the 79 00:05:50,366 --> 00:05:54,495 original X that was chosen during encryption. So now let me ask you, how do 80 00:05:54,495 --> 00:06:00,042 we derive the symmetric decryption key K from this X that we just obtained? Well, 81 00:06:00,042 --> 00:06:04,736 so that's an easy question. We basically hash X again. That gives us K just as 82 00:06:04,736 --> 00:06:09,372 during encryption. And now that we have this symmetric encryption key we can apply 83 00:06:09,372 --> 00:06:13,783 the, the symmetric decryption algorithm to decrypt the ciphertext C. We get the 84 00:06:13,783 --> 00:06:17,741 original message M and that's what we output. So, that's how the public key 85 00:06:17,741 --> 00:06:22,321 encryption system works were this trap door function is only used for encrypting 86 00:06:22,321 --> 00:06:26,788 some sort of a random value X and the actual message is encrypted using the 87 00:06:26,788 --> 00:06:31,244 symmetric system. So in pictures here, we have the message M obviously the plain 88 00:06:31,244 --> 00:06:35,545 text could be quite large. So, here we have the body of the deciphered text which 89 00:06:35,545 --> 00:06:39,953 can be quite long is actually encrypted using the symmetric system. And then again 90 00:06:39,953 --> 00:06:44,039 I emphasize that the key for the symmetric system is simply the hash of X. 91 00:06:44,039 --> 00:06:48,232 And then the header of ciphertext is simply this application of the trapdoor 92 00:06:48,232 --> 00:06:52,641 function to this random X that we picked. And so during decryption what happens is 93 00:06:52,641 --> 00:06:56,888 we first decrypt the header to get X and then we decrypt the body using the 94 00:06:56,888 --> 00:07:01,829 symmetric system to actually get the original plain text M. So as usual when I 95 00:07:01,829 --> 00:07:06,542 show you a system like this, obviously you want to verify that decryption in fact is 96 00:07:06,542 --> 00:07:10,605 the inverse of encryption. But more importantly you want to ask why is this 97 00:07:10,605 --> 00:07:14,963 system secure. And in fact there's a nice security theorem here that says. That if 98 00:07:14,963 --> 00:07:18,900 the trap door function that we started with is secure. In other words, that's a 99 00:07:18,900 --> 00:07:22,634 one way function if the adversary doesn't have a secret key. The symmetric 100 00:07:22,634 --> 00:07:26,621 encryption system provides authenticated encryption. And the hash function is a 101 00:07:26,621 --> 00:07:30,558 random oracle, which simply means that it's a random function from the set X to 102 00:07:30,558 --> 00:07:34,696 the set of keys K. So a random oracle is some sort of an idealization of, what a 103 00:07:34,696 --> 00:07:38,280 hash function is supposed to be. In practice, of course, when you come to 104 00:07:38,280 --> 00:07:42,317 implement a system like this, you would just use, SHA-256, or any of the 105 00:07:42,317 --> 00:07:47,252 other hash functions that we discussed in class. So, under those three conditions in 106 00:07:47,252 --> 00:07:51,863 fact the system that we just described is chosen cipher text secure so it is CCA 107 00:07:51,863 --> 00:07:56,416 secure, the little ro here just denote the fact that security is set in whats called 108 00:07:56,416 --> 00:08:00,572 a random oracle model. But, that's a detail that's actually not so important for 109 00:08:00,572 --> 00:08:05,012 discussion here, what I want you to remember is that if the trap door function 110 00:08:05,012 --> 00:08:09,000 is in fact a secure trap door function. The symmetric encryption system is secure 111 00:08:09,000 --> 00:08:13,017 against tampering so it provides authenticated encryption. And H 112 00:08:13,017 --> 00:08:17,468 is in some sense a good hash function. It's a random, function, which in practice 113 00:08:17,468 --> 00:08:22,245 you would just use SHA-256, then in fact the system that we just showed is CCA 114 00:08:22,245 --> 00:08:27,615 secure, is chosen ciphertext secure. I should tell you that there's actually an ISO 115 00:08:27,615 --> 00:08:31,752 standard that, defines this mode of encryption, of public encryption. ISO 116 00:08:31,752 --> 00:08:35,781 stands for International Standards Organization. So in fact this particular 117 00:08:35,781 --> 00:08:40,456 system has actually been standardized, and this is a fine thing to use. I'll refer to 118 00:08:40,456 --> 00:08:44,947 this as the ISO encryption in the next few segments. To conclude this segment, I want 119 00:08:44,947 --> 00:08:48,925 to warn you about an incorrect way of using a trapdoor function to build a 120 00:08:48,925 --> 00:08:53,328 public key encryption system. And in fact this method might be the first thing that 121 00:08:53,328 --> 00:08:57,572 comes to mind, and yet it's completely insecure. So let me show you, how not to 122 00:08:57,572 --> 00:09:01,762 encrypt using a trapdoor function. Well the first thing that might come to mind 123 00:09:01,762 --> 00:09:05,696 is, well, let's apply the trapdoor function directly to the message M. So we 124 00:09:05,696 --> 00:09:10,047 encrypt simply by applying a function to the message M, and we decrypt simply by 125 00:09:10,047 --> 00:09:14,180 applying F inverse to the ciphertext C to recover the original message M. So 126 00:09:14,180 --> 00:09:18,639 functionally, this is in fact, decryption is the inverse of encryption, and yet this 127 00:09:18,639 --> 00:09:22,881 is completely insecure for many, many different reasons. The easiest way to see 128 00:09:22,881 --> 00:09:26,960 that this is insecure, is that it's simply, this is deterministic encryption. 129 00:09:26,960 --> 00:09:30,944 You notice there is no randomness being used here. When we encrypt a message 130 00:09:30,944 --> 00:09:34,154 M, and since it is deterministic, it's cannot possibly be 131 00:09:34,154 --> 00:09:37,948 semantically secure. But in fact, as I said, when we instantiate this trap door 132 00:09:37,948 --> 00:09:41,644 function with particular implementations, for example with the RSA trap door 133 00:09:41,644 --> 00:09:44,951 function, then there are many, many attacks that are possible on this 134 00:09:44,951 --> 00:09:48,794 particular construction, and so you should never, ever, ever use it, and I'm gonna 135 00:09:48,794 --> 00:09:52,830 repeat this throughout this module, and in fact in the next segment I'll show you a 136 00:09:52,830 --> 00:09:56,699 number of attacks on this particular implementation. Okay so, what I would like 137 00:09:56,699 --> 00:10:00,717 you to remember is that you should be using an encryption system like the ISO 138 00:10:00,717 --> 00:10:04,992 standard, and you should never apply the trap door function directly to the message M. 139 00:10:04,992 --> 00:10:09,010 Although in the next segment we'll see other ways to encrypt using a trap 140 00:10:09,010 --> 00:10:13,233 door function that are also correct, but this particular method is clearly, clearly 141 00:10:13,233 --> 00:10:17,560 incorrect. Okay, so now that we understand how to build public key encryption 142 00:10:17,560 --> 00:10:21,423 given a trap door function, the next question is how to construct trap door 143 00:10:21,423 --> 00:10:24,360 functions, and we're going to do that in the next segment.