0:00:00.000,0:00:03.752 In this segment, I want to tell you about[br]another form of encryption, called format 0:00:03.752,0:00:07.322 preserving encryption. This is again[br]something that comes up in practice quite 0:00:07.322,0:00:10.617 often, especially when it comes to[br]encrypting credit card numbers. And, so, 0:00:10.617,0:00:14.244 let's see how it works. Remember that a[br]typical credit card number is sixteen 0:00:14.244,0:00:18.978 digits, broken into four blocks of four[br]digits each. You've probably heard before 0:00:18.978,0:00:23.416 that the first six digits are what's[br]called the BIN number, which represent the 0:00:23.416,0:00:28.124 issuer ID. For example, all credit cards[br]issued by VISA always start with the digit 0:00:28.124,0:00:34.114 four; all credit cards issued by[br]MasterCard start with digits 51 to 55, and 0:00:34.114,0:00:38.808 so on and so forth. The next nine digits[br]are actually the account number that is 0:00:38.808,0:00:43.275 specific to the, to the particular[br]customer and the last digit is a check sum 0:00:43.275,0:00:48.031 that's computed from the previous fifteen[br]digits. So there are about 20,000 BIN 0:00:48.031,0:00:52.846 numbers and so if you do the calculation[br]it turns out there are about 2 to the 42 0:00:52.846,0:00:56.733 possible credit card numbers which[br]corresponds to about 42 bits of 0:00:56.733,0:01:01.489 information that you need to encode if you[br]want to represent a credit card number 0:01:01.489,0:01:05.999 compactly. Now the problem is this.[br]Suppose we wanted to encrypt credit card 0:01:05.999,0:01:10.713 numbers, so that when the user swipes the[br]credit card number at the point of sale 0:01:10.713,0:01:14.640 terminal, namely at the, you know,[br]the merchant's cash register. The cash 0:01:14.640,0:01:18.670 register, this is called a point of sale[br]terminal, goes ahead and encrypts the 0:01:18.670,0:01:22.751 credit card number using a key k and[br]it's encrypted all the way until it goes 0:01:22.751,0:01:27.036 to the acquiring bank or maybe even beyond[br]that, maybe even all the way when it goes 0:01:27.036,0:01:31.117 to Visa. Now, the problem is that these[br]credit card numbers actually pass through 0:01:31.117,0:01:35.300 many, many, many processing points. All of[br]them expect to get basically something 0:01:35.300,0:01:39.683 that looks like a credit card number as a[br]result. So if we simply wanted to encrypt 0:01:39.683,0:01:43.893 something at the end point, at one end[br]point, and decrypt it at the other end 0:01:43.893,0:01:48.087 point, it's actually not that easy because[br]if all we did was encrypt it using AES, 0:01:48.087,0:01:52.748 even if we just used deterministic AES,[br]the output of the encrypted credit card 0:01:52.748,0:01:57.164 number would be 128 bit block. Sixteen[br]bytes that would be, that would need to be 0:01:57.164,0:02:01.691 sent from one system to the next, until it[br]reaches its destination. But each one of 0:02:01.691,0:02:06.107 these processors, then, would have to be[br]changed, because they're all expecting to 0:02:06.107,0:02:10.192 get credit card numbers. And so the[br]question is, can we encrypt at the cash 0:02:10.192,0:02:14.388 register, such that the resulting[br]encryption itself looks like a credit card 0:02:14.388,0:02:18.528 number. And as a result, none of these[br]intermediate systems would have to be 0:02:18.528,0:02:22.937 changed. Only the endpoints would have to[br]be changed. And in doing so we would 0:02:22.937,0:02:27.399 actually obtain end-to-end encryption[br]without actually having to change a lot of 0:02:27.399,0:02:31.973 software along the way. So the question[br]then is, again, can we have this mechanism 0:02:31.973,0:02:36.546 called format preserving encryption, where[br]encrypting a credit card itself produces 0:02:36.546,0:02:40.881 something that looks like a credit card?[br]So that's our goal. The encrypted credit 0:02:40.881,0:02:44.719 card number should look like a regular[br]valid credit card number. So this is the 0:02:44.719,0:02:48.768 goal of format preserving encryption. More[br]abstractly what it is we're trying to do, 0:02:48.768,0:02:54.055 is basically build a pseudo random[br]permutation on the set zero to S minus one 0:02:54.055,0:02:59.105 for any given S. So for the set of credit[br]card numbers, S would be roughly, you 0:02:59.105,0:03:03.966 know, two to the 42. In fact, it's gonna[br]be, not exactly two to the 42. It's gonna 0:03:03.966,0:03:08.415 be some funny numbers that's around two to[br]the 42. And our goal is to build a PRP 0:03:08.415,0:03:14.326 that acts exactly on the interval, zero to[br]S minus one. And what we're given as input 0:03:14.326,0:03:20.369 is some PRF, say, something like AES, that[br]acts on much larger blocks, say, acts on 0:03:20.369,0:03:24.384 sixteen byte blocks. So we're trying to,[br]in some sense, shrink the domain of the 0:03:24.384,0:03:29.576 PRF to make it fit the data that we're[br]given. And the question is basically how 0:03:29.576,0:03:33.639 to do that. Well, once we have such a[br]construction we can easily use it to 0:03:33.639,0:03:37.610 encrypt credit card numbers. What we would[br]do is we would take the, a given credit 0:03:37.610,0:03:41.937 card number. We would encode it such that[br]now it's represented as a number between 0:03:41.937,0:03:47.312 zero and the total number of credit card[br]numbers. Then we would apply our PRP so we 0:03:47.312,0:03:51.815 would encrypt this credit card number, you[br]know, using construction number two from 0:03:51.815,0:03:56.428 the deterministic encryption segment. And[br]then we would map the final number back to 0:03:56.428,0:04:00.656 be a regular, to look like val--, regular,[br]valid credit card number and then send 0:04:00.656,0:04:05.083 this along the way. So this is, this[br]is again non expanding encryption. The 0:04:05.083,0:04:09.442 best we can do, as we said before is to[br]encrypt using a PRP except again the 0:04:09.442,0:04:14.145 technical challenge is we need a PRP that[br]acts on this particular funny looking set 0:04:14.145,0:04:19.811 from zero to S-1 for this prespecified[br]value of S. So my goal is to show you how 0:04:19.811,0:04:23.379 to construct this and once we see how to[br]do it, you will also know how to encrypt 0:04:23.379,0:04:27.738 credit card number so that the resulting[br]cipher text is itself a credit card 0:04:27.738,0:04:33.176 number. So the construction works in two[br]steps. The first thing we do is we shrink 0:04:33.176,0:04:38.552 our PRF from the set {0,1} to the n, say[br]{0,1} to the 128 in the case of AES, 0:04:38.552,0:04:43.965 to {0,1} to the t where t is the[br]closest power of two to the value S. 0:04:44.580,0:04:50.035 So say S is some crazy number around two to[br]the 41. T would basically be then 42 0:04:50.035,0:04:55.338 because that's the closest power of two[br]that's just above the value S. So the 0:04:55.338,0:04:59.396 construction is basically gonna use the[br]Luby-Rackoff construction. What we need 0:04:59.396,0:05:05.157 is a PRF F prime that acts on blocks of[br]size t over two. So imagine for example 0:05:05.157,0:05:10.103 in the credit card case, t would be 42[br]because two to the 42 we said is the 0:05:10.301,0:05:15.132 closest power of two that's bigger than,[br]than, than S. S is the number of credit, 0:05:15.132,0:05:20.387 total number of credit cards. Then we would[br]need a PRF now that acts on 21-bit 0:05:20.387,0:05:25.828 inputs. So one way to do that is simply to[br]take the AES block cipher, treat it as a 0:05:25.828,0:05:30.437 PRF on 128-bit inputs, and then simply[br]truncate the output to the least 0:05:30.437,0:05:35.238 significant 21 bits. Okay, so we were[br]given a 21 bit value. We would append 0:05:35.238,0:05:40.359 zeros to it so that we get 128 bits as a[br]result. We would apply AES to that and 0:05:40.359,0:05:45.016 then we would truncate back to 21 bits.[br]Now I should tell you that that's actually 0:05:45.016,0:05:48.722 a simple way to do it but it's actually[br]not the best way to do it. There are 0:05:48.722,0:05:54.181 actually better ways to truncate a PRF on[br]n bits to a PRF on t over two bits. 0:05:54.181,0:05:58.454 But for our pur--, for the purposes of this[br]segment, the truncation method I just said 0:05:58.626,0:06:03.113 is good enough. So let's just use that[br]particular method. Okay, so now, we've 0:06:03.113,0:06:09.279 converted our AES block cipher into a PRF[br]on t over two bits, say, on 21 bits. But 0:06:09.279,0:06:14.108 what we really want is a PRP. And so what[br]we'll do is we'll plug our new PRF, F prime, 0:06:14.108,0:06:17.937 directly into the Luby-Rackoff[br]construction. If you remember, this is 0:06:17.937,0:06:22.489 basically a Feistel construction. We saw[br]this when we talked about DES. It's a, 0:06:22.489,0:06:27.009 Luby-Rackoff used three rounds, and we[br]know that this converts a secure PRF into 0:06:27.009,0:06:31.961 a secure PRP on twice the block size. In[br]other words, we started from a secure PRF 0:06:31.961,0:06:36.973 on 21 bits, and that will give us, and[br]Luby-Rackoff will give us a secure PRF on 0:06:36.973,0:06:41.424 42 bits, which is what we wanted. Now, I[br]should tell you that the error parameters 0:06:41.424,0:06:45.531 in the Luby-Rackoff construction are not as[br]good as they could be. And, in fact, a 0:06:45.531,0:06:49.689 better thing to do is to use seven rounds[br]of Feistel. So in other words, we'll do 0:06:49.689,0:06:54.782 this seven times; every time we'll use a[br]different key. So you notice, here, we're 0:06:54.782,0:06:59.318 only using three keys. We should be using,[br]we should be doing this seven times using 0:06:59.318,0:07:03.839 seven different keys. And then there's a[br]nice result, due to Patarin, that 0:07:03.839,0:07:06.998 shows that the seven-round construction[br]basically has as good an error 0:07:06.998,0:07:11.312 terms as possible. So the error terms in[br]the security theorem will basically be two 0:07:11.312,0:07:15.978 to the T. Okay. So now we have a pseudo[br]random permutation that operates on close 0:07:15.978,0:07:21.529 power of two to the value of S. But that's[br]not good enough. We actually want to get a 0:07:21.529,0:07:26.701 pseudo random permutation on the set zero[br]to S minus one. So step two will take us 0:07:26.701,0:07:30.772 down from {0,1} to the T, to the actual[br]set zero to the S minus one that we're 0:07:30.772,0:07:35.135 interested in. And this construction is,[br]again, really, really cute, so let me show 0:07:35.135,0:07:39.073 you how it works. So, again, we're given[br]this PRP that operates on a power of two. 0:07:39.073,0:07:44.360 And we wanna go down to a PRP that[br]operates on something slightly smaller. 0:07:44.360,0:07:49.239 Okay, so here's on the construction works.[br]Basically we're given input X in the set 0:07:49.239,0:07:53.149 zero to S minus one that we want. And what[br]we're going to do is we're going to 0:07:53.149,0:07:57.436 iterate the following procedure again[br]and again. So first of all we map X into 0:07:57.436,0:08:02.477 this temporary variable Y. And now we're[br]just gonna encrypt the input X and put the 0:08:02.477,0:08:07.213 result into Y. If Y is inside of our[br]target set, we stop and we output Y. If 0:08:07.213,0:08:12.460 not we iterate this again and again and[br]again and again and again until finally Y 0:08:12.460,0:08:16.696 falls into our target set and then we[br]output that value. So in a picture, let me 0:08:16.696,0:08:22.513 explain how this works. So we start from a[br]point in our target set. And now, now we 0:08:22.513,0:08:27.272 apply the, the function E and we might be[br]mapped into this point outside our target 0:08:27.272,0:08:30.871 set, so we're not gonna stop. So now we[br]apply the function E again and we might 0:08:30.871,0:08:35.050 be mapped into this point and now we apply[br]the function E again and now all of a 0:08:35.050,0:08:39.118 sudden we map into this point, and then we[br]stop, and that's our output. Okay, so 0:08:39.118,0:08:44.199 that's how we map points between from zero[br]to S minus one, to other points in the 0:08:44.199,0:08:48.490 zero to S minus one. It should be pretty[br]clear that this is invertible. To invert, 0:08:48.490,0:08:52.466 all I'll do is I'll just decrypt and walk,[br]kind of, in the opposite direction. So 0:08:52.466,0:08:56.342 I'll decrypt, and then decrypt, and then[br]decrypt, until finally, I fall into the 0:08:56.342,0:09:00.419 set, zero to S minus one. And that gives[br]me the inverse of the point that I wanted 0:09:00.419,0:09:04.625 to. So this is, in fact, a PRP. The[br]question is whether it's a secure PRP, and 0:09:04.625,0:09:08.738 we'll see that in just a minute. But before [br]that, let me ask you, how many iterations 0:09:08.738,0:09:13.317 do you expect that we're gonna need? And [br]I wanna remind you again, before I ask you 0:09:13.317,0:09:19.635 that question, that, in fact, S is less than [br]two to the T, but is more than two to the T-1. 0:09:19.635,0:09:25.092 So in particular S is greater than two to [br]the T over two. And my question to you 0:09:25.092,0:09:29.661 now is how many iterations are we gonna [br]need, on average, until this process converges? 0:09:32.907,0:09:35.075 So the answer is we're going to need two iterations, 0:09:35.075,0:09:38.984 so really just a small[br]constant number of iterations. And so this 0:09:38.984,0:09:43.159 will give us a very, very efficient[br]mechanism that will move us down from 0:09:43.159,0:09:48.567 {0,1} to the T to zero to the S-1. So now[br]when we talk about security, it turns out 0:09:48.567,0:09:53.210 this step here is tight. In other words,[br]there is really no additional security 0:09:53.210,0:09:59.250 cost to going down from two to the T to[br]zero to S-1. And the reason that's true, 0:09:59.250,0:10:02.734 it's actually again a very cute theorem[br]to prove, but I, I won't do it here. Maybe 0:10:02.734,0:10:07.621 I'll leave it as an exercise for you guys[br]to argue. Which says that if you give me 0:10:07.621,0:10:11.921 any two sets Y and X, where Y is contained[br]inside of X, then applying the 0:10:11.921,0:10:16.520 transformation that we just saw to a[br]random permutation from X to X actually 0:10:16.520,0:10:21.340 gives a random permutation from Y to Y. So[br]this means that if we started with a truly 0:10:21.340,0:10:25.578 random permutation on X, you'll end up[br]with a truly random permutation on Y. And 0:10:25.578,0:10:29.445 since, well, the permutation we're[br]starting with is indistinguishable from 0:10:29.445,0:10:33.366 random on X, we'll end up with a[br]permutation that's indistinguishable from 0:10:33.366,0:10:37.763 random on Y. Okay, so this is a very tight[br]construction and as I said, the first step 0:10:37.763,0:10:42.001 actually is basically, the analysis is the[br]same as the Luby-Rackoff analysis. In 0:10:42.001,0:10:46.187 fact, it's better to use as I said, the[br]Patarin analysis using seven rounds. So 0:10:46.187,0:10:50.425 actually here, there's a bit of cost in[br]security. But, overall, we get a 0:10:50.425,0:10:55.558 construction that is a secure PRP for[br]actually, not such good security 0:10:55.558,0:10:59.992 parameters, but nevertheless, this is a[br]good secure PRP that we can construct that 0:10:59.992,0:11:04.644 actually will operate on the range zero to[br]S minus one. This in turn will allow us to 0:11:04.644,0:11:08.968 encrypt credit card numbers such that the[br]cipher text looks like a credit card 0:11:08.968,0:11:13.183 number. And again, I want to emphasize[br]that there's no integrity here. The best 0:11:13.183,0:11:17.124 we can achieve here is just [br]deterministic CPA security. No cipher text 0:11:17.124,0:11:21.251 integrity, and no randomness at all. Okay.[br]So this concludes this module. And as 0:11:21.251,0:11:25.620 usual I want to point to a few reading[br]materials that you can take a look at if 0:11:25.620,0:11:29.990 you're interested in learning more about[br]anything that we discussed in this module. 0:11:29.990,0:11:34.255 So first of all, the HKDF construction[br]that we talked about for key derivation is 0:11:34.255,0:11:38.950 described in a paper from 2010 by Hugo[br]Krawczyk. Deterministic encryption, The 0:11:38.950,0:11:44.477 SIV mode that we described is discussed in[br]this paper over here. The EME mode that we 0:11:44.477,0:11:49.628 described that allows us to build a, a[br]wider block pseudo random permutation is 0:11:49.628,0:11:54.402 described in this paper here by Halevi and[br]Rogaway. Tweakable block ciphers that 0:11:54.402,0:11:59.239 actually led to the XDS mode of operation[br]that's used for disk encryption is 0:11:59.239,0:12:04.704 described in this paper here. And finally,[br]a format preserving encryption is described 0:12:04.704,0:12:09.718 in this last paper that I point to over[br]here. Okay so this concludes this module. 0:12:09.718,0:12:13.831 And in the next module we gonna start[br]looking at how to do key exchange.