0:00:00.152,0:00:01.703 The next question, we're going to ask: 0:00:01.703,0:00:03.638 is RSA really an one-way function? 0:00:03.638,0:00:05.788 In other words, is it really hard to 0:00:05.788,0:00:08.104 invert RSA without knowing the trapdoor? 0:00:09.012,0:00:11.998 So, if an attacker wanted to invert the RSA function, 0:00:11.998,0:00:15.001 well, what the attacker has, is basically the public key, 0:00:15.001,0:00:19.054 namely he has N and e. And now, he is given x to the e 0:00:19.054,0:00:23.293 and his goal is to recover x. OK, so the question really is: 0:00:23.293,0:00:26.131 given x to the e modulo N, how hard is it to 0:00:26.131,0:00:28.933 recover x? So, what we're really asking is, 0:00:28.933,0:00:33.113 how hard is it to compute e'th roots modulo a composite? 0:00:34.178,0:00:38.544 If this problem turns out to be hard, then RSA is in fact an one-way function. 0:00:38.544,0:00:39.968 If this problem turns out to be easy, which 0:00:39.968,0:00:44.564 of course we don't believe it's easy, then RSA would actually be broken. 0:00:44.564,0:00:46.832 So, it turns out the best algorithm for this problem 0:00:46.832,0:00:49.563 requires us to first factor the modulus N. 0:00:50.050,0:00:52.236 And then, once we factored the modulus, we have already 0:00:52.236,0:00:55.892 seen last week, that it's easy to compute the e'th root modulo p, 0:00:55.892,0:00:58.483 it's easy to compute the e'th root modulo q. 0:00:58.483,0:01:02.107 And, then given both those e'th roots, it's actually easy to combine them together, 0:01:02.107,0:01:04.699 using what's called the Chinese remainder theorem 0:01:04.699,0:01:07.667 and actually recover the e'th root modulo N. 0:01:07.667,0:01:09.946 So, once we are able to factor the modulus, 0:01:09.946,0:01:12.848 computing e'th roots modulo N becomes easy. 0:01:12.848,0:01:14.636 But, factoring the modulus, as far as 0:01:14.636,0:01:16.761 we know, is a very, very difficult problem. 0:01:16.761,0:01:19.865 But a natural question is, is it true that in 0:01:19.865,0:01:22.568 order to compute e'th roots modulo N, we have to 0:01:22.568,0:01:25.707 factor the modulus N? As far as we know, the 0:01:25.707,0:01:27.369 best algorithm for computing e'th roots 0:01:27.369,0:01:30.002 modulo N requires factorization of N. 0:01:30.002,0:01:31.626 But, who knows, maybe there is a short cut 0:01:31.626,0:01:33.771 that allows us to compute e'th roots modulo N, 0:01:33.771,0:01:36.704 without factoring the modulus. To show that 0:01:36.704,0:01:38.808 that's not possible, we have to show a reduction. 0:01:38.808,0:01:40.314 That is, we have to show that, 0:01:40.314,0:01:42.001 if I give you an efficient algorithm for 0:01:42.001,0:01:43.872 computing e'th roots modulo N, that 0:01:43.872,0:01:48.132 efficient algorithm can be turned into a factoring algorithm. 0:01:48.132,0:01:51.015 So, this is called a reduction. Namely, given an algorithm for 0:01:51.015,0:01:53.137 e'th roots modulo N, we obtain a factoring algorithm. 0:01:53.137,0:01:57.314 That would show that one cannot compute e'th roots modulo N 0:01:57.314,0:02:00.101 faster than factoring the modulus. 0:02:00.101,0:02:02.285 If we had such a result, it would show that 0:02:02.285,0:02:05.716 actually breaking RSA, in fact is as hard as factoring. 0:02:05.716,0:02:08.110 But, unfortunately, this is not really known at the moment. 0:02:08.110,0:02:11.969 In fact, this is one of the oldest problems in public key crypto. 0:02:11.969,0:02:14.415 So, just let me give you a concrete example. 0:02:14.415,0:02:18.523 Suppose, I give you an algorithm that will compute cube roots modulo N. 0:02:19.037,0:02:23.693 So, for any x in ZN, the algorithm will compute the cube root of x modulo N. 0:02:23.693,0:02:25.971 My question is, can you show that using 0:02:25.971,0:02:29.066 such an algorithm you can factor the modulus N? 0:02:29.066,0:02:31.239 And, even that is not known. What is known, 0:02:31.239,0:02:33.920 I'll tell you, is for example that for e equals 2, 0:02:34.375,0:02:37.711 that is if I give you an algorithm for computing square roots modulo N, 0:02:37.711,0:02:40.696 then in fact, that does imply factoring the modulus. 0:02:40.696,0:02:44.699 So, computing square roots is in fact as hard as factoring the modulus. 0:02:44.712,0:02:47.779 Unfortunately, if you think back to the definition of RSA, 0:02:47.779,0:02:52.913 that required that e times d be 1 modulo phi(N). 0:02:52.913,0:02:58.612 What that means is, that e necessarily needs to be relatively prime to phi(N). 0:02:58.612,0:03:03.128 Right, this, what the first equation says is that e is invertible modulo phi(N). 0:03:03.128,0:03:06.125 But, if e is invertible modulo phi(N), necessarily that means that 0:03:06.125,0:03:09.107 e must be relatively prime to phi(N). 0:03:09.107,0:03:13.927 But, phi(N), if you remember, that is equal to p minus 1 times q minus 1. 0:03:13.927,0:03:19.377 And, since p and q are both large primes, p - 1 times q - 1 is always even. 0:03:19.377,0:03:25.503 And, as a result, the GCD of 2 and phi(N) is equal to 2, 0:03:25.503,0:03:28.221 because phi(N) is even. And therefore, the public 0:03:28.221,0:03:30.863 exponent 2 is not relatively prime to phi(N). 0:03:30.863,0:03:33.174 Which means that, even though we have a reduction[br] 0:03:33.174,0:03:35.263 from taking square roots to factoring, 0:03:35.263,0:03:38.758 e equals 2 cannot be used as an RSA exponent. 0:03:38.758,0:03:43.643 So, really the smallest RSA exponent that is legal is in fact e equals 3. 0:03:43.643,0:03:46.911 But, for e equal 3, the question of whether computing cube roots 0:03:46.911,0:03:48.976 is as hard as factoring, is an open problem. 0:03:48.976,0:03:50.978 It's actually a lot of fun to think about this question. 0:03:50.978,0:03:53.444 So, I would encourage you to think about it just a little bit. 0:03:53.444,0:03:58.352 That is, if I give you an efficient algorithm for computing cube roots modulo N, 0:03:58.352,0:04:02.111 can you use that algorithm to actually factor the modulus N? 0:04:02.111,0:04:05.301 I will tell you that there is a little bit of evidence to say 0:04:05.301,0:04:09.402 that a reduction like that, actually doesn't exist, but it is very, very weak evidence. 0:04:09.402,0:04:11.366 What this evidence says is basically, if you 0:04:11.366,0:04:13.500 give me a reduction of a very particular form. 0:04:13.500,0:04:16.070 In other words, if your reduction is what's called algebraic, 0:04:16.070,0:04:18.509 (I am not going to explain what that means here.) 0:04:18.509,0:04:23.087 That is, if given a cube root oracle, you could actually show me an algorithm 0:04:23.087,0:04:26.055 that would then factor. That reduction, by itself, 0:04:26.055,0:04:28.554 would actually imply a factoring algorithm. 0:04:28.554,0:04:31.084 Okay so, that would say that if factoring is hard, 0:04:31.084,0:04:33.637 a reduction actually doesn't exist. 0:04:33.637,0:04:35.537 But, as I say this is very weak evidence. 0:04:35.537,0:04:37.617 Because, who is to say that the reduction needs to be algebraic. 0:04:37.617,0:04:39.725 Maybe, there are some other types of reductions that 0:04:39.725,0:04:42.857 we haven't really considered. So, I would 0:04:42.857,0:04:44.339 encourage you to think a little bit about this 0:04:44.339,0:04:46.235 question. It's actually quite interesting. 0:04:46.235,0:04:50.087 How would you use a cube root algorithm to factor the modulus? 0:04:51.308,0:04:54.143 But as I said, as far as we know, RSA is a one way function. 0:04:54.143,0:05:00.274 In fact, breaking RSA, computing e'th roots that is, actually requires factoring the modulus. 0:05:00.274,0:05:02.918 We all believe that's true, and that's the state of the art. 0:05:02.918,0:05:07.637 But, now there has been a lot of work on trying to improve the performance of RSA. 0:05:07.637,0:05:12.041 Either, RSA encryption or improve the performance of RSA decryption. 0:05:12.041,0:05:14.901 And it turns out, there has been a number of false starts in this direction. 0:05:14.901,0:05:18.796 So I want to show you, this wonderful example as a warning. 0:05:18.796,0:05:23.232 So basically, this is an example of how not to improve the perfomance of RSA. 0:05:23.232,0:05:26.772 So, you might think that if I wanted to speed up RSA decryption, 0:05:26.772,0:05:28.578 remember decryption is done by raising the 0:05:28.578,0:05:30.895 ciphertext to the power of d. And, you remember 0:05:30.895,0:05:34.265 that the exponentiation algorithm ran in linear 0:05:34.265,0:05:37.767 time in the size of d. Linear time in log of d. 0:05:37.767,0:05:39.762 So, you might think to yourself, well if I wanted 0:05:39.762,0:05:43.522 to speed up RSA decryption, why don't I just use a small d. 0:05:43.522,0:05:48.265 I'll say, I'll say a decryption exponent that's on the order of 2 to the 128. 0:05:48.419,0:05:52.350 So it's clearly big enough so that exhaustive search against d is not possible. 0:05:52.350,0:05:57.418 But normally, the decryption exponent d would be as big as the modulus, say 2000 bits. 0:05:57.418,0:06:04.669 By using d that's only 128 bits, I basically speed up RSA decryption by a factor of 20. 0:06:04.669,0:06:07.533 Right, I went down from 2000 bits to 100 bits. 0:06:07.533,0:06:10.915 So, exponentiation would run 20 times as fast. 0:06:10.915,0:06:15.440 It turns out this is a terrible idea. Terrible, terrible idea, in the following sense. 0:06:15.440,0:06:18.638 There is an attack by Michael Wiener that shows that, in fact, 0:06:18.638,0:06:23.425 as soon as the private exponent d is less than the fourth root of the modulus. 0:06:23.425,0:06:27.526 Let's see, if the modulus is around 2048 bits 0:06:27.526,0:06:34.581 that means that if d is less that 2 to the 512, then RSA is completely, completely insecure. 0:06:34.581,0:06:37.509 And, this is, it's insecure in the worst possible way. 0:06:37.509,0:06:43.129 Namely, just given a public key and an e, you can very quickly recover the private key d. 0:06:44.227,0:06:48.493 Well, so some folks said: well this attack works up to 512 bits. 0:06:48.493,0:06:52.378 So, why donĀ“t we make the modulus, say, you know 530 bits. 0:06:52.378,0:06:54.234 Then, this attack actually wouldn't apply. 0:06:54.234,0:06:57.545 But still, we get to speed up RSA decryption by a factor of 4, 0:06:57.545,0:07:01.997 because we shrunk the exponent from 2000 bits to, say, 530 bits. 0:07:01.997,0:07:03.978 Well it turns out, even that is not secure. In fact, 0:07:03.978,0:07:06.243 there is an extension to Wiener's attack, that's actually much 0:07:06.243,0:07:08.176 more complicated, that shows that if d 0:07:08.176,0:07:13.233 is less than N to the 0.292, then also RSA is insecure. 0:07:13.233,0:07:16.674 And in fact, the conjecture that this is true up to N to the 0.5. 0:07:16.674,0:07:21.975 So even if d is like N to the 0.4999, RSA should still be insecure, 0:07:21.975,0:07:25.780 although this is an open problem. It's again, a wonderful open problem, 0:07:25.780,0:07:28.394 It's been open for like, what is it, 14 years now. 0:07:28.394,0:07:31.556 And, nobody can progress beyond this 0.292. 0:07:31.556,0:07:34.327 Somehow, it seems kind of strange, why would 0.292 0:07:34.327,0:07:38.217 be the right answer and yet no one can go beyond 0.292. 0:07:38.802,0:07:41.671 So, just to be precise, when I say that RSA is insecure, 0:07:41.671,0:07:45.218 what I mean is just given the public key N and e, 0:07:45.218,0:07:48.077 your goal is to recover the secret key d. 0:07:49.102,0:07:52.257 If you are curious where 0.292 comes from, 0:07:52.257,0:07:56.312 I'll tell you that what it is, it's basically 1 minus 1 over square root of 2. 0:07:56.312,0:07:58.503 Now how could this possibly be the right answer to this problem? 0:07:58.503,0:08:01.296 It's much more natural that the answer is N to the 0.5. 0:08:01.296,0:08:04.340 But this is still an open problem. Again if you want to think about that, 0:08:04.340,0:08:06.163 it's kind of a fun problem to work on. 0:08:06.163,0:08:10.101 So the lesson in this is that one should not enforce any structure on d 0:08:10.101,0:08:12.475 for improving the performance of RSA, and in fact 0:08:12.475,0:08:15.276 now there's a slew of results like this that show 0:08:15.276,0:08:19.714 that basically, any kind of tricks like this to try and improve RSA's performance 0:08:19.714,0:08:24.285 is going to end up in disaster. So this is not the right way to improve RSA's performance. 0:08:24.285,0:08:27.987 Initially I wasn't going to cover the details of Wiener's attack. 0:08:27.987,0:08:31.582 But given the discussions in the class, I think some of you would enjoy seeing the details. 0:08:31.582,0:08:35.229 All it involves is just manipulating some inequalities. 0:08:35.229,0:08:37.743 If you're not comfortable with that, feel free to skip over this slide, 0:08:37.743,0:08:40.979 although I think many of you would actually enjoy seeing the details. 0:08:40.979,0:08:43.365 So let me remind you in Wiener's attack basically 0:08:43.365,0:08:46.893 we're given the modulus and the RSA exponent N and e, 0:08:46.893,0:08:50.513 and our goal is to recover d, the private exponent d, 0:08:50.513,0:08:54.171 and all we know is that d is basically less than the fourth root of N. 0:08:54.171,0:08:57.711 In fact, I'm going to assume that d is less than the fourth root of N divided by 3. 0:08:57.711,0:09:02.373 This 3 doesn't really matter, but the dominating term here is that d is less than the 4th root of N. 0:09:02.373,0:09:05.944 So let's see how to do it. So first of all, recall that 0:09:05.944,0:09:09.144 because e and d are RSA public and private exponents, 0:09:09.144,0:09:14.145 we know that e times d is 1 modulo phi(N). Well what does that mean? 0:09:14.145,0:09:22.248 That means that there exists some integer k such that e times d is equal to k times phi(N) plus 1. 0:09:22.248,0:09:26.280 Basically that's what it means for e times d to be 1 modulo phi(N). 0:09:26.280,0:09:29.832 Basically some integer multiple of phi(N) plus 1. 0:09:29.832,0:09:32.592 So now let's stare at this equation a little bit. 0:09:32.592,0:09:35.826 And in fact this equation is the key equation in the attack. 0:09:35.826,0:09:40.352 And what we're going to do is first of all divide both sides by d times phi(N). 0:09:40.352,0:09:43.703 And in fact I'm going to move this term here to the left. 0:09:43.703,0:09:47.453 So after I divide by d times phi(N), what I get is that 0:09:47.453,0:09:58.500 e divided by phi(N) minus k divided by d is equal to 1 over d times phi(N). 0:09:59.469,0:10:02.902 Okay, so all I did is I just divided by d times phi(N) 0:10:02.902,0:10:05.850 and I moved the k times phi(N) term to the left-hand side. 0:10:05.850,0:10:09.119 Now, just for the heck of it I'm going to add absolute values here. 0:10:09.119,0:10:13.484 These will become useful in just a minute, but of course they don't change this equality at all. 0:10:13.484,0:10:20.184 Now, phi(N) of course is almost N; phi(N) is very, very close to N, as we said earlier. 0:10:20.184,0:10:23.371 And all I'm going to need then for this fraction is just to say that 0:10:23.371,0:10:26.639 it's less than 1 over square root of N. It's actually much, much smaller 0:10:26.639,0:10:30.571 than 1 over square root of N, it's actually on the order of 1 over N or even more than that, 0:10:30.571,0:10:35.638 but for our purposes all we need is that this fraction is less than 1 over square root of N. 0:10:35.638,0:10:37.939 Now let's stare at this fraction for just a minute. 0:10:37.939,0:10:44.506 You realize that this fraction on the left here is a fraction that we don't actually know. 0:10:44.506,0:10:49.039 We know e, but we don't know phi(N), and as a result we don't know e over phi(N). 0:10:49.039,0:10:53.631 But we have a good approximation to e over phi(N). Namely we know that phi(N) 0:10:53.631,0:10:59.238 is very close to N. Therefore e over phi(N) is very close to e over N. 0:10:59.238,0:11:03.436 So we have a good approximation to this left-hand side fraction, namely e over N. 0:11:03.436,0:11:06.028 What we really want is the right-hand side fraction, 0:11:06.028,0:11:07.649 because once we get the right-hand side fraction, 0:11:07.649,0:11:12.960 basically that's going to involve d, and then we'll be able to recover d. Okay, so let's see 0:11:12.960,0:11:19.041 if we replace e over phi(N) by e over N, let's see what kind of error we're going to get. 0:11:19.041,0:11:22.514 So to analyze that, what we'll do is first of all remind ourselves 0:11:22.514,0:11:26.204 that phi(N) is basically N - p - q + 1, 0:11:26.204,0:11:30.804 which means that N minus phi(N) is going to be less than p + q. 0:11:30.804,0:11:34.752 Actually I should, to be precise I should really write p + q + 1, but you know, 0:11:34.752,0:11:37.838 who cares about this 1, it's not, it doesn't really affect anything. 0:11:37.838,0:11:40.238 So I'm just going to ignore it for simplicity. 0:11:40.238,0:11:45.508 Okay, so N - phi(N) is less than p + q. Both p and q are roughly half the length of N, 0:11:45.508,0:11:48.918 so you know, they're both kind of on the order of square root of N, 0:11:48.918,0:11:53.876 so basically p + q we'll say is less than 3 times square root of N. 0:11:53.876,0:11:56.844 Okay, so we're going to use this inequality in just a minute. 0:11:56.844,0:12:00.243 But now we're going to start using the fact that we know something about d, 0:12:00.243,0:12:02.958 namely that d is small. So if we look at this inequality here, 0:12:02.958,0:12:05.543 d is less than the fourth root of N divided by 3. 0:12:05.543,0:12:08.596 It's actually fairly easy to see that if I square both sides 0:12:08.596,0:12:12.118 and I just manipulate things a little bit, it's [not] difficult to see that 0:12:12.118,0:12:15.510 this directly implies the following relation, 0:12:15.510,0:12:24.263 basically 1 over 2 d squared minus 1 over square root of N is greater than 3 over square root of N. 0:12:24.263,0:12:28.542 As I said, this basically follows by squaring both sides, then taking the 0:12:28.542,0:12:33.334 inverse of both sides, and then I guess multiplying one side by a half. 0:12:33.334,0:12:37.906 Okay, so you can easily derive this relation, and we'll need this relation in just a minute. 0:12:37.906,0:12:40.166 So now let's see, what we'd like to do is bound 0:12:40.166,0:12:45.059 the difference of e over N and k over d. Well what do we know? 0:12:45.059,0:12:47.872 By the triangular inequality, we know that this is equal to 0:12:47.872,0:12:52.122 the distance between e over N and e over phi(N) plus 0:12:52.122,0:12:56.566 the distance from e over phi(N) to k over d. 0:12:56.566,0:13:01.813 This is just what's called a triangular inequality; this is just a property of absolute values. 0:13:01.813,0:13:04.705 Now this absolute value here, we already know how to bound. 0:13:04.705,0:13:07.116 If you think about it it's basically the bound that we've already derived. 0:13:07.116,0:13:11.977 So we know that this absolute value here is basically less than 1 over square root of N. 0:13:11.977,0:13:17.737 Now what do we know about this absolute value here? What is e over N minus e over phi(N)? 0:13:17.737,0:13:20.486 Well let's do common denominators and see what we get. 0:13:20.486,0:13:25.236 So the common denominator is going to be N times phi(N), 0:13:25.236,0:13:31.253 and the numerator in this case is going to be e times phi(N) minus N. 0:13:31.253,0:13:35.760 Which we know from this expression here is less than 3 times square root of N. 0:13:35.760,0:13:40.842 So really what this numerator is going to be is e times 3 square root of N. 0:13:40.842,0:13:44.327 The numerator is going to be less than e times 3 square root of N. 0:13:44.327,0:13:51.246 So now I know that e is less than phi(N), so I know that e over phi(N) is less than 1. 0:13:51.246,0:13:57.313 In other words, if I erase e and I erase phi(N) I've only made the fraction larger. 0:13:57.313,0:14:00.016 Okay, so this initial absolute value is now going to be 0:14:00.016,0:14:03.655 smaller than 3 square root of N over N, which is simply, 0:14:03.655,0:14:09.237 3 square root of N over N is simply 3 over square root of N. 0:14:09.237,0:14:11.270 Okay, but what do we know about 3 over square root of N? 0:14:11.270,0:14:17.706 We know that it's less than 1 over 2 d squared minus 1 over square root of N. 0:14:17.706,0:14:20.473 Okay, so that's the end of our derivation. 0:14:20.473,0:14:26.439 So now we know that the first absolute value is less than 1 over 2 d squared minus square root of N. 0:14:26.439,0:14:29.509 The second absolute value is less than 1 over square root of N. 0:14:29.509,0:14:34.369 And therefore their sum is less than 1 over 2 d squared. 0:14:34.369,0:14:36.576 And this is the expression that I want you to stare at. 0:14:36.576,0:14:42.951 So here, let me circle it a little bit. So let me circle this part and this part. 0:14:43.582,0:14:46.445 Now, so let's stare a little bit at this fraction here. 0:14:46.445,0:14:51.444 And what we see is first of all, as before, now we know the value of e over N, 0:14:51.444,0:14:54.825 and what we'd like to find out is the value k over d. 0:14:54.825,0:14:56.862 But what do we know about this fraction k over d? 0:14:56.862,0:15:00.715 We know somehow that the difference of these two fractions is really small; 0:15:00.715,0:15:05.385 it's less than 1 over 2 d squared. Now this turns out to happen very infrequently, 0:15:05.385,0:15:10.588 that k over d approximates e over N so well that 0:15:10.588,0:15:15.307 the difference between the two is less than the square of the denominator of k over d. 0:15:15.307,0:15:17.324 It turns out that that can't happen very often. 0:15:17.324,0:15:22.338 It turns out that there are very few fractions of the form k over d that approximate another fraction 0:15:22.338,0:15:26.422 so well that their difference is less than 1 over 2 d squared. 0:15:26.422,0:15:30.308 And in fact the number of such fractions is going to be at most logarithmic in N. 0:15:30.308,0:15:33.953 So now there's a continued fraction algorithm. It's a very famous fraction 0:15:33.953,0:15:38.877 that basically what it will do is, from the fraction e over N, 0:15:38.877,0:15:42.977 it will recover log(N) possible candidates for k over d. 0:15:42.977,0:15:47.148 So we just try them all one by one until we find the correct k over d. 0:15:47.148,0:15:51.645 And then we're done. We're done, because we know that, 0:15:51.645,0:15:55.376 well e times d is 1 mod k, therefore d is relatively prime to k, 0:15:55.376,0:16:00.852 so if we just represent k over d as a rational fraction, you know, numerator by denominator, 0:16:00.852,0:16:05.271 the denominator must be d. And so we've just recovered, you know, 0:16:05.271,0:16:10.355 we've tried all possible log(N) fractions that approximate e over N so well 0:16:10.355,0:16:13.688 that the difference is less than 1 over 2 d squared. 0:16:13.688,0:16:19.338 And then we just look at the denominator of all those fractions, and one of those denominators must be d. 0:16:19.338,0:16:22.841 And then we're done; we've just recovered the private key. 0:16:22.841,0:16:26.341 So this is a pretty cute attack. And it shows basically how, 0:16:26.341,0:16:31.267 if the private exponent is small, smaller than the fourth root of N, 0:16:31.267,0:16:35.354 then we can recover d completely, quite easily. Okay, so I'll stop here.