0:00:00.000,0:00:04.090
In the previous segment we saw how to[br]build public key encryption from trapdoor

0:00:04.090,0:00:08.390
functions, in this segment we're going to[br]build a classic trapdoor function called

0:00:08.390,0:00:13.295
RSA. But first let's quickly review what a[br]trapdoor function is. So recall that a

0:00:13.295,0:00:17.283
trapdoor function is made up of three[br]algorithms. There is a key generation

0:00:17.283,0:00:21.056
algorithm, the function itself, and the[br]inverse of the function. The key

0:00:21.056,0:00:25.313
generation algorithm outputs a public key[br]and a secret key. And in this case, in

0:00:25.313,0:00:29.786
this lecture the public key is going to[br]define a function that maps the set X onto

0:00:29.786,0:00:33.882
itself. Which is why I call these things[br]trapdoor permutations, as opposed to

0:00:33.882,0:00:37.978
trapdoor functions, simply because the[br]function maps X onto itself, whereas a

0:00:37.978,0:00:43.356
trapdoor function might map the set X to[br]some arbitrary set Y. Now given the public

0:00:43.356,0:00:47.819
key, the function, as we say, basically[br]defines this function from the set X to

0:00:47.819,0:00:52.914
the set X. And then given the secret key,[br]we can invert this function. So this

0:00:52.914,0:00:57.401
function F evaluates the function in the[br]forward direction, and this function F

0:00:57.401,0:01:02.059
inverse, which means the secret key SK,[br]computes F in the reverse direction. Now

0:01:02.059,0:01:06.489
we say that the trapdoor permutation is[br]secure if the function defined by the

0:01:06.489,0:01:11.033
public key is in fact a one-way function,[br]which means that it's easy to evaluate,

0:01:11.033,0:01:15.404
but without the trapdoor, the secret[br]trapdoor, it's difficult to invert. Before

0:01:15.404,0:01:20.076
we look at our first example of a trapdoor[br]permutation, I want to do a quick review

0:01:20.076,0:01:24.467
of some necessary arithmetic facts that[br]we're gonna need. And in particular,

0:01:24.467,0:01:28.632
let's look at some arithmetic facts[br]modulo composites. So here we have our

0:01:28.632,0:01:33.192
modulus N, which happens to be a product[br]of two primes, and you should be thinking

0:01:33.192,0:01:37.864
of P and Q as roughly equal size numbers,[br]since particular P and Q are both roughly

0:01:37.864,0:01:42.407
on the size of square root of N. So both[br]are roughly the same size. Recall that we

0:01:42.407,0:01:46.834
denoted by ZN the set of integers from[br]zero to N minus one, and we said that we

0:01:46.834,0:01:51.318
can do addition and multiplication module N. We denoted by ZN star the set of

0:01:51.318,0:01:55.801
invertible elements in ZN. So these are[br]all the elements, which have a modular

0:01:55.801,0:02:00.925
inverse. And we said that actually X is[br]invertible if and only if it's relatively

0:02:00.925,0:02:05.928
prime to N. Moreover, I also told you that[br]the number of invertible elements in ZN

0:02:05.928,0:02:11.147
star is denoted by this function phi(N). So phi(N)[br]is the number of invertible elements in ZN

0:02:11.147,0:02:15.814
star, And I told you that when N is a[br]product of two distinct primes, then in

0:02:15.814,0:02:20.788
fact phi(N) is equal to (P - 1) times (Q - 1) and if you extend that out, you

0:02:20.788,0:02:26.007
see that this is really equal to (N - P - Q + 1). Now remember that I said

0:02:26.007,0:02:30.858
that P and Q are on the order of square[br]root of N which means that P + Q is

0:02:30.858,0:02:35.675
also on the order of square root of N.[br]Which means that really phi(N) is on the

0:02:35.675,0:02:41.050
order of N minus two square root of N. So,[br]in other words, it's very, very close to

0:02:41.050,0:02:45.158
N. Basically, subtracting the square root[br]of N from a number, this is from, N is

0:02:45.158,0:02:49.425
going to be a huge number in our case,[br]like 600 digits. And so subtracting from a

0:02:49.425,0:02:53.533
600 digit number, the square root of that[br]600 digit number, namely a 300 digit

0:02:53.533,0:02:57.534
number, hardly affects the size of the[br]number. Which means that phi(N) is really,

0:02:57.534,0:03:01.908
really, really close to N, and I want you[br]to remember that, because we will actually

0:03:01.908,0:03:06.122
be using this now and again. And so the[br]fact that phi(N) is really close to N, means

0:03:06.122,0:03:11.094
that if we choose a random element modulo[br]N It's very, very, very likely to be in ZN

0:03:11.094,0:03:15.633
star. So it's very unlikely that by[br]choosing a random element in ZN, we will

0:03:15.633,0:03:20.085
end up choosing an element that is not[br]invertable. Okay. So just remember that,

0:03:20.085,0:03:25.479
that in fact almost all elements in ZN are[br]actually invertible. And the last thing

0:03:25.479,0:03:30.939
that we'll need is Euler's theorem, which[br]we covered last week, which basically says

0:03:30.939,0:03:36.332
that for any element X in ZN star, if I[br]raise X to the power of phi(N), I get one, in

0:03:36.332,0:03:42.527
ZN. So in other words, I get 1 modulo N. I'll say it one more time because this

0:03:42.527,0:03:47.466
is gonna be critical to what's coming.[br]Again X to the phi(N) is equal to 1 modulo

0:03:47.466,0:03:51.997
N. So now that we have the necessary[br]background we can actually describe the

0:03:51.997,0:03:55.927
RSA trapdoor permutation. This is a classic,[br]classic, classic construction in

0:03:55.927,0:04:00.811
cryptography that was first published in[br]Scientific American back in 1977, this is

0:04:00.811,0:04:05.576
a very well known article in cryptography.[br]And ever since then, this was 35 years

0:04:05.576,0:04:10.340
ago, the RSA trapdoor permutation has been used[br]extensively in cryptographic applications.

0:04:10.340,0:04:15.110
For example, SSL and TLS use RSA both for[br]certificates and for key exchange. There

0:04:15.110,0:04:19.452
are many secure email systems and secure[br]file systems that use RSA to encrypt

0:04:19.452,0:04:23.515
emails and encrypt files in the file[br]system. And there are many, many, many

0:04:23.515,0:04:27.690
other applications of this system. So this[br]is a very, very classic, crypto

0:04:27.690,0:04:32.541
construction, and I'll show you how it[br]works. I should say that RSA is named for

0:04:32.541,0:04:37.150
its inventors, Ron Rivest, Adi Shamir and[br]Len Adleman, who were all at MIT at the

0:04:37.150,0:04:41.758
time they made this important discovery.[br]So now we're ready to describe the RSA

0:04:41.758,0:04:46.425
trapdoor permutation. To do that, I have[br]to describe the key generation algorithm,

0:04:46.425,0:04:50.159
the function ƒ and the function ƒ−1.[br]So let's see. So the way the key

0:04:50.159,0:04:54.826
generation algorithm works is as follows:[br]What we do is we generate two primes, P

0:04:54.826,0:04:59.143
and Q, each would be, say on the order of[br]1000 bits, so, you know, roughly 300

0:04:59.143,0:05:03.751
digits, and then the RSA modulus is simply[br]going to be the product of those two

0:05:03.751,0:05:08.801
primes. The next thing we do is we pick[br]two exponents, e and d, such that e times

0:05:08.801,0:05:13.894
d is 1 modulo phi(N). What this means is[br]that e and d first of all are relatively

0:05:13.894,0:05:19.051
prime to phi(N) and second of all they're[br]actually modular inverses of one another,

0:05:19.051,0:05:24.014
modulo phi(N). And then we output the public[br]key as the pair N,e and the

0:05:24.014,0:05:29.172
secret key is the secret key N,d. I should mention that the exponent e,

0:05:29.172,0:05:34.586
that the number e is sometimes called the[br]encryption exponent and the exponent d is

0:05:34.586,0:05:39.135
sometimes called the decryption exponent.[br]And you'll see why I keep referring to

0:05:39.135,0:05:43.189
these as exponents in just a second. Now[br]the way the RSA function itself is

0:05:43.189,0:05:46.902
defined is really simple. For simplicity[br]I'm gonna define it as the function

0:05:46.902,0:05:51.801
from ZN star to ZN star. And the way[br]the function is defined is simply given an

0:05:51.801,0:05:57.001
input X, all we do is we simply take X and[br]raise it to the power of e in ZN. So we

0:05:57.001,0:06:02.137
just compute X to the e, modulo N. That's[br]it. And then to decrypt, what we do is we

0:06:02.137,0:06:07.451
simply, given an input Y, we simply raise[br]Y to the power of d, modulo N. And that's

0:06:07.451,0:06:12.483
it. So now you can see why as I refer to e[br]and d as exponents. They're the things

0:06:12.483,0:06:17.767
that X and Y are being raised to. So let's[br]quickly verify that F inverse really does

0:06:17.767,0:06:22.673
invert the function F. So let's see what[br]happens when we compute Y to the d. So

0:06:22.673,0:06:27.515
suppose Y itself happens to be the RSA[br]function of some value X. In which case,

0:06:27.515,0:06:33.045
what Y to the d is, is really RSA of X to[br]the power of d. While X by itself is

0:06:33.045,0:06:39.006
simply gonna be X to the e modulo N, And[br]therefore, Y to the d is simply X to the e

0:06:39.006,0:06:44.896
times d modulo N. Again, just using rules[br]of exponentiation, the exponents multiply,

0:06:44.896,0:06:50.857
so we get X to the e times d. But what do[br]we know about e times d? e times d we said

0:06:50.857,0:06:57.390
is one modulo phi(N). And what that means is[br]there exists some integer such that e

0:06:57.390,0:07:04.177
times d is equal to K times phi(N) plus one.[br]This is what it means that e times d is

0:07:04.177,0:07:09.820
one modulo phi(N). So, we can simply replace e[br]times d by K times phi(N)+1. So, that's

0:07:09.820,0:07:14.453
what I wrote here. So, we have X to the[br]power of K times phi(N)+1. But now again

0:07:14.453,0:07:19.868
just using rules of exponentiation, we can[br]re-write this as X to the power of phi(N) to

0:07:19.868,0:07:24.827
the power of K times X. This is really the[br]same thing as K times phi(N)+1 as the

0:07:24.827,0:07:29.917
exponent. I just kind of separated the[br]exponent into it's different components.

0:07:29.917,0:07:35.137
Now, X to the phi(N) by Euler's theorem, we know[br]that X to the phi(N) is equal to one. So what

0:07:35.137,0:07:41.394
is this whole product equal to? Well since[br]X to the phi(N) is equal to one, one to

0:07:41.394,0:07:45.961
the K is also equal to one, so this whole[br]thing over here is simply equal to one.

0:07:45.961,0:07:50.757
And what we're left with is X. So what we[br]just proved is that if I take the output of

0:07:50.757,0:07:55.210
the RSA function and raise it to the power[br]of 'd', I get back X. Which means that

0:07:55.210,0:07:59.663
raising to the power of 'd' basically[br]inverts the RSA function, which is what we

0:07:59.663,0:08:04.638
wanted to show. So that's it, that's the[br]whole description of the function, we've

0:08:04.638,0:08:08.972
described the key generation. The function[br]itself, which is simply raising things to

0:08:08.972,0:08:13.410
the power of e modulo N, and the inversion[br]function which is simply raising things to

0:08:13.410,0:08:17.483
the power of d, also modulo N. The next[br]question is, why is this function secure?

0:08:17.483,0:08:21.609
In other words, why is this function one[br]way if all I have is just a public key,

0:08:21.609,0:08:26.409
but I don't have the secret key? And so to[br]argue that this function is one way we

0:08:26.409,0:08:31.454
basically state the RSA assumption which[br]basically says that the RSA function is a

0:08:31.454,0:08:36.626
one way permutation. And formally the way[br]we state that is that, basically for all

0:08:36.626,0:08:41.416
efficient algorithms, A, it so happens[br]that if I generate two primes p and q,

0:08:41.416,0:08:46.397
random primes p and q. I multiply them to[br]get to modulus N and then I choose a

0:08:46.397,0:08:51.103
random 'y' in ZN star. And now I give[br]the modulus, the exponent and this 'y' to

0:08:51.103,0:08:55.893
algorithm A, the probability that I'll get[br]the inverse of RSA at the point Y, mainly

0:08:55.893,0:09:00.336
I'll get Y to the power of one over E.[br]That's really what the inverse is. This

0:09:00.336,0:09:04.606
probability is negligible. So this[br]assumption is called the RSA assumption.

0:09:04.606,0:09:09.338
It basically states that RSA is a one way[br]permutation just given the public [key?]. And

0:09:09.338,0:09:13.954
therefore, it is a trapdoor permutation[br]because it has a trapdoor. And makes this

0:09:13.954,0:09:19.501
easy to invert for anyone who knows the[br]trap door. So now that we have a secure

0:09:19.501,0:09:23.717
trap door permutation, we can simply plug[br]that into our construction for public key

0:09:23.717,0:09:27.826
encryption, and get our first real world[br]public key encryption system. And so what

0:09:27.826,0:09:32.362
we'll do is we'll simply plug the, the RSA[br]trapdoor permutation into the iso standard

0:09:32.362,0:09:36.151
construction that we saw in the previous[br]segment. So, if you recall, that

0:09:36.151,0:09:40.207
construction was based on a symmetric[br]encryption system that had to provide

0:09:40.207,0:09:44.438
authenticated encryption. And it was also[br]based on a hash function. Then mapped

0:09:44.615,0:09:48.866
while transferring it into the world of[br]RSA, it maps elements in

0:09:48.866,0:09:54.202
ZN, into secret keys for the[br]symmetric key system. And now the

0:09:54.202,0:09:58.947
way the encryption scheme works is really[br]easy to describe. Basically algorithm G

0:09:58.947,0:10:03.751
just runs the RSA key generation algorithm[br]and produces a public key and a secret

0:10:03.751,0:10:07.813
key. Just as before. So you notice the[br]public key contains the encryption

0:10:07.813,0:10:11.948
exponent and the, secret key contains the[br]decryption exponent. And the way we

0:10:11.948,0:10:16.298
encrypt is as follows. Well, we're going[br]to choose a random X in ZN. We're going

0:10:16.298,0:10:21.468
to apply the RSA function to this X, we're[br]going to deduce a symmetric key from this

0:10:21.468,0:10:26.453
X by hashing it, so we compute H of X to[br]obtain the key K, and then we output this

0:10:26.453,0:10:31.130
Y along with the encryption of the message[br]under the symmetric key K. And in

0:10:31.130,0:10:35.930
practice, the hash function H would be[br]just implemented just using SHA-256, and

0:10:35.930,0:10:40.977
you would use the output of SHA-256 to[br]generate a symmetric key that could then

0:10:40.977,0:10:45.687
be used for the symmetric encryption[br]assistant. So that's how we would encrypt

0:10:45.687,0:10:50.084
and then the way we would decrypt is[br]pretty much as we saw in the previous

0:10:50.084,0:10:54.951
segment, where the first thing we would do[br]is we would use the secret key to invert

0:10:54.951,0:10:59.758
the header of the ciphertext. So we would[br]compute RSA invert of Y, that would give

0:10:59.758,0:11:04.566
us the value X. Then we would apply the[br]hash function to X so that this would give

0:11:04.566,0:11:09.198
us the key K. And then we would run the[br]decryption algorithm for the symmetric

0:11:09.198,0:11:15.171
system on the ciphertext and that would[br]produce the original message m. And then

0:11:15.171,0:11:19.103
we stated a theorem in the previous[br]segment to say that if the RSA trapdoor

0:11:19.103,0:11:23.087
permutation is secure, Es and Ds, the[br]symmetric encryption scheme [inaudible]

0:11:23.087,0:11:27.175
provides authenticated encryption. And as[br]we said, H is just random oracle. It's,

0:11:27.332,0:11:31.421
you know, kind of a random function from[br]ZN to the key space. Then, in fact, this

0:11:31.421,0:11:35.720
system is chosen cipher text secure, and[br]is a good public key system to use.

0:11:36.240,0:11:41.729
So now that we have our first example of a[br]good public key system to use, I wanna

0:11:41.729,0:11:46.978
quickly warn you about how not to use RSA[br]for encryption. And this again something

0:11:46.978,0:11:51.101
that we said in the previous segment. And[br]I'm going to repeat it here, except I'm

0:11:51.101,0:11:55.534
going to make it specific to RSA. So[br]I'd like to call this, textbook RSA.

0:11:55.534,0:11:59.400
Which basically is the first thing that[br]comes to mind when you think about

0:11:59.400,0:12:03.678
encrypting using RSA, namely, the secret[br]key and the public key will be as before.

0:12:03.678,0:12:08.162
But now instead of running through, a hash[br]function to generate a symmetric key, what

0:12:08.162,0:12:12.337
we would do is we would directly use RSA[br]to encrypt the given message M. And then

0:12:12.337,0:12:16.202
we would directly use the decryption[br]exponent to decrypt the cipher text and

0:12:16.202,0:12:20.773
obtain the plain text M. I'm going to call[br]this textbook RSA, because there are many

0:12:20.773,0:12:25.350
textbooks that describe RSA encryption in[br]this way. And this is completely wrong.

0:12:25.350,0:12:29.495
This is not how RSA encryption works.[br]It's an insecure system. In particular,

0:12:29.495,0:12:33.936
it's deterministic encryption, and so it[br]can't possibly be semantically secure, but

0:12:33.936,0:12:38.542
in fact there are many attacks exist that[br]I'm going to show you an attack in just a

0:12:38.542,0:12:42.709
minute, but the message that I want to[br]make clear here, is that RSA, all it is,

0:12:42.709,0:12:47.151
is a trap door permutation. By itself[br]it's not an encryption system. You have to

0:12:47.151,0:12:51.427
embellish it with this ISO standard for[br]example, to make it into an encryption

0:12:51.427,0:12:55.826
system. By itself, all it is, is just a[br]function. So let's see what goes wrong if

0:12:55.826,0:13:00.225
you try to use textbook RSA, In other[br]words, if you try to encrypt a message

0:13:00.225,0:13:04.975
using RSA directly. And I'll give you an[br]example attack from the world of the web.

0:13:04.975,0:13:09.725
So imagine we have a web server. The web[br]server has an RSA secret key. Here's it's

0:13:09.725,0:13:14.738
denoted by N and D. And here we have a web[br]browser who's trying to establish a secure

0:13:14.738,0:13:19.124
session, a secure SSL session, with the web[br]server. So the way SSL works is that the

0:13:19.124,0:13:23.401
web browser starts off by sending this[br]client hello message saying, hey, I want

0:13:23.401,0:13:27.787
to set up a secure session with you. The[br]web server responds with a server hello

0:13:27.787,0:13:32.430
message that contains the server's public[br]key And then the web browser will go ahead

0:13:32.430,0:13:36.615
and generate a random what's called a[br]premaster secret K, it will encrypt the

0:13:36.615,0:13:40.692
premaster secret using K and send the[br]result in ciphertext over to the web

0:13:40.692,0:13:44.932
server. The web server will decrypt and[br]then the web server will also get K, so

0:13:44.932,0:13:49.336
now the two have a shared key that they[br]can use to then secure a session between

0:13:49.336,0:13:53.630
them. So I want to show you what goes[br]wrong if we directly use the RSA function

0:13:53.630,0:13:57.762
for encrypting K. In other words, if[br]directly K is encrypted as K to the e

0:13:57.762,0:14:02.828
modulo N. Just for the sake of argument[br]let's suppose that K is a 64-bit key.

0:14:02.828,0:14:08.097
We're going to treat K as an integer in[br]the range as zero to 2 to the 64.

0:14:08.097,0:14:13.100
More precisely two to the 64 minus one,[br]and now what we're going to do is the

0:14:13.100,0:14:18.302
following. First of all, suppose it so[br]happens that K factors into a product of

0:14:18.302,0:14:23.705
roughly equal sized numbers. So K we can[br]write as K1 times K2, where K1 and K2 are

0:14:23.705,0:14:29.745
integers. And both are say less than two[br]to the 34. So, it turns out this happens

0:14:29.745,0:14:34.508
with probability roughly twenty percent so[br]one in five times K can actually be

0:14:34.508,0:14:39.740
written in this way. But, now if we plug[br]this K, K=K1 x K2 if we plug that into the

0:14:39.740,0:14:45.241
equation that defines the cipher text you[br]see that we can simply substitute K by K1 x k2

0:14:45.241,0:14:50.875
and then we can move k1 to the other side.[br]So then we end up with this equation here,

0:14:50.875,0:14:55.897
namely C over K1 to the e is equal to K2 to the e. You notice if I multiply both

0:14:55.897,0:15:00.659
sides by K1 to the e, I get that C is[br]equal to K1 times K2 to the e,

0:15:00.659,0:15:06.374
which is precisely this equation here.[br]Okay, so all I did is I just replaced K by

0:15:06.374,0:15:11.955
K1 times K2 and then divided by K1 to the[br]e. So by now this should look familiar to

0:15:11.955,0:15:16.146
you. So now we have two variables in this[br]equation, of course. C is known to the

0:15:16.146,0:15:20.092
attacker, E is known to the attacker, and[br]N is known to the attacker. The two

0:15:20.092,0:15:24.518
variables in this equation are K1 and[br]K2, and we've separated them into a

0:15:24.518,0:15:28.891
different side of the equation, and as a[br]result we can do now a meet in the middle

0:15:28.891,0:15:33.157
attack on this equation. So let's do the[br]meet in the middle attack. What we would

0:15:33.157,0:15:37.524
do is we would build a table of all[br]possible values Of the left-hand side. So

0:15:37.524,0:15:43.392
all possible values of C over K1 to the E,[br]there are 2 to the 34 of them. And then,

0:15:43.584,0:15:48.878
for all possible values on the right-hand[br]side, [inaudible] for all possible values

0:15:48.878,0:15:54.175
of K2 to the e, we're gonna check whether[br]this value here lives in the table that we

0:15:54.175,0:15:58.749
constructed in step one. And if it is then[br]we found a collision, and basically we

0:15:58.749,0:16:03.265
have a solution to this equation. So as[br]soon as we find an element of the form K2

0:16:03.265,0:16:07.962
to the E that lives in the table that[br]we built in step one, we've solved this

0:16:07.962,0:16:12.717
equation and we found K1 and K2. And[br]of course once we found K1 and K2,

0:16:12.717,0:16:16.950
we can easily recover K simply by[br]multiplying them. So then we multiply K1

0:16:16.950,0:16:21.428
and K2 and we get, the secret key[br]K. Okay? So, we've broken, basically,

0:16:21.428,0:16:25.604
this, this encryption system. And how long[br]did it take? Well, by brute force, we

0:16:25.604,0:16:29.890
could have broken it in time 2 to the 64,[br]simply by trying all possible keys. But

0:16:29.890,0:16:33.792
this attack, you notice, it took 2 to[br]the 34 time for step number one. Well, it

0:16:33.792,0:16:38.353
took a little bit more than 2 to the 34,[br]'cause we had to do these exponentiations.

0:16:38.518,0:16:42.969
It took 2 to the 34 time for step two[br]against slightly more than two to the 34

0:16:42.969,0:16:47.530
because of the exponentiations. So let's[br]say that the whole algorithm took time two

0:16:47.530,0:16:52.277
to the 40. The point is that this is much,[br]much less time due to the 64. So here you

0:16:52.277,0:16:56.667
have an example. Where if you encrypt[br]using RSA directly, in other words you

0:16:56.667,0:17:01.296
directly compute, K to the E, mod N,[br]instead of going through the ISO standard,

0:17:01.296,0:17:05.983
for example, then you get an attack that[br]runs much faster than exhaustive search.

0:17:05.983,0:17:10.730
You get an attack that runs in time two to[br]the 40, rather than time two to the 64.

0:17:10.730,0:17:14.985
Okay, so this is a cute example of how[br]things can break if you use the RSA

0:17:14.985,0:17:19.299
trapdoor permutation directly to[br]encrypt a message. So the message to

0:17:19.299,0:17:23.670
remember here, is never, ever, ever use[br]RSA directly to encrypt. You have to use to go

0:17:23.670,0:17:27.868
through an encryption mechanism. For[br]example, like the ISO standard. And in

0:17:27.868,0:17:32.354
fact, in the next segment we are going to[br]see other ways of using RSA to build

0:17:32.354,0:17:33.620
public key encryption.