1 00:00:00,000 --> 00:00:05,381 The next question we're gonna ask is RSA really a one way function. In other words, 2 00:00:05,381 --> 00:00:10,566 is it really hard to invert RSA without knowing the Trapdoor? So if an attacker 3 00:00:10,566 --> 00:00:15,685 wanted to invert the RSA function, well what the attacker has is basically the 4 00:00:15,685 --> 00:00:21,066 public key namely, he has n and e, and now he's given x to the e and his goal is to 5 00:00:21,066 --> 00:00:26,251 recover x, okay? So the question really is, given x to the e modulo n, how hard is 6 00:00:26,251 --> 00:00:31,370 it to recover x? So what we're really asking is, how hard is it to compute e-th 7 00:00:31,370 --> 00:00:36,816 roots modulo composite? If this problem turns out to be hard, then RSA is in fact 8 00:00:36,816 --> 00:00:41,077 the one way function. If this problem turns out to be easy which is of course we 9 00:00:41,077 --> 00:00:44,959 don't believe is easy, then RSA would actually be broken. So just at this 10 00:00:44,959 --> 00:00:49,456 [inaudible] for this problem requires us the first factor, the modulo n, and then 11 00:00:49,456 --> 00:00:53,841 the ones we factor the modulus we've already seen last week that it's easy to 12 00:00:53,841 --> 00:00:58,338 compute the e-th root modulo p, it's easy to compute e-th root modulo q and then 13 00:00:58,338 --> 00:01:03,117 given both those e-th roots it's actually easy to combine them together using what's 14 00:01:03,117 --> 00:01:07,839 called the Chinese Remainder Theorem and actually recover the e-th root modulo n. 15 00:01:07,839 --> 00:01:12,280 So once we're able to factor the modulus computing e-th roots, modulo n becomes 16 00:01:12,280 --> 00:01:16,890 easy but factoring the modulus as far as we know is a very, very difficult problem. 17 00:01:16,890 --> 00:01:21,444 But the natural question is, is it true that in order to compute [inaudible] we 18 00:01:21,444 --> 00:01:26,056 have to factor the modulus n. As far as we know the best algorithm for computing 19 00:01:26,056 --> 00:01:30,092 [inaudible] modular n requires factorization of n. But who knows, maybe 20 00:01:30,092 --> 00:01:34,531 there's a short cut that allows us to complete [inaudible] modular n without 21 00:01:34,531 --> 00:01:38,710 factoring the modulus. To show that that's not possible, we have to show our 22 00:01:38,710 --> 00:01:43,104 reduction. That is we have to show that if I give you an efficient algorithm for 23 00:01:43,104 --> 00:01:47,223 computing e-th root modulo m, that efficient algorithm can be turned into a 24 00:01:47,223 --> 00:01:51,726 factoring algorithm. So this is called the reduction namely given an algorithm for 25 00:01:51,726 --> 00:01:56,120 e-th root modulo m, we obtain a factoring algorithm and that would show that one 26 00:01:56,120 --> 00:02:00,736 cannot compute e-th root modulo m faster than factoring the modules. If we had such 27 00:02:00,736 --> 00:02:04,702 a result, it would show that actually breaking RSA in fact is as hard as 28 00:02:04,702 --> 00:02:09,109 factoring but unfortunately this is not really known at the moment. In fact this 29 00:02:09,109 --> 00:02:13,461 is the one of the oldest problems in Public-key crypto and so let me just give 30 00:02:13,461 --> 00:02:18,089 you a concrete example. I supposed I give you an algorithm what will compute q brute 31 00:02:18,089 --> 00:02:22,796 modular m. So for any x and zN the algorithm will compute the cube root of x 32 00:02:22,796 --> 00:02:27,707 modulo m and my question is can you show that using such an algorithm you can 33 00:02:27,707 --> 00:02:32,808 factor the modulo m? And even that's not known. What is known I'll tell you is for 34 00:02:32,808 --> 00:02:37,848 example that for e = two. That is if I give you an algorithm for computing square 35 00:02:37,848 --> 00:02:42,447 roots modulo n, then in fact that does imply factoring the modulus and so 36 00:02:42,447 --> 00:02:47,551 computing square roots, is in fact as hard as factoring the modulus. Unfortunately, 37 00:02:47,551 --> 00:02:52,213 if you think back to the definition of RSA, that required that e * d be one 38 00:02:52,213 --> 00:02:57,505 modulo phi of n and what that means is that e necessarily needs to be relatively 39 00:02:57,505 --> 00:03:03,277 prime to phi of n. What the first equation says that e is invertible modulo 5n but if 40 00:03:03,277 --> 00:03:08,884 e is invertible modulo 5n, necessarily that means that e must be relatively prime 41 00:03:08,884 --> 00:03:14,631 to 5n. But again if you remember that's = p - one * q - one and since p and q are 42 00:03:14,631 --> 00:03:20,413 both primes, p - one * q - one is always even. And as a result, the GCD of two and 43 00:03:20,413 --> 00:03:26,645 phi of n is = two because phi of n is even and therefore, the public exponent two is 44 00:03:26,645 --> 00:03:32,803 not relatively prime to phi of n which means that even though we have a reduction 45 00:03:32,803 --> 00:03:38,824 from taking square roots to factoring, e = two cannot be use as an RSA exponent. So 46 00:03:38,824 --> 00:03:43,043 really the smallest RSA exponent that's legal is in fact e = three but for e = 47 00:03:43,043 --> 00:03:47,481 three the question of whether computing cube roots is as hard as factoring is an 48 00:03:47,481 --> 00:03:51,864 open problem. It's actually a lot of fun to think about this question. So, I would 49 00:03:51,864 --> 00:03:55,973 encourage you to think about it just a little bit that is if I give you an 50 00:03:55,973 --> 00:04:00,411 efficient algorithm for computing cube roots modulo n, can you use that algorithm 51 00:04:00,411 --> 00:04:04,714 to actually factor the modulus n? I'll tell you that is a little b it of evidence 52 00:04:04,714 --> 00:04:09,014 to say that a reduction like that actually doesn't exist but it's very, very weak 53 00:04:09,014 --> 00:04:13,155 evidence. What this evidence says is basically if you give me a reduction of a 54 00:04:13,155 --> 00:04:17,454 very particular form, in other words if your reduction is what's called algebraic, 55 00:04:17,454 --> 00:04:21,276 I'm not gonna explain what that means here, that is if given a cube root, 56 00:04:21,276 --> 00:04:25,045 oracle, you could actually show me in algorithm that within factor, that 57 00:04:25,045 --> 00:04:29,374 reduction by itself would actually imply a factoring algorithm. Okay so that would 58 00:04:29,374 --> 00:04:33,622 say, that a factoring is hard, a reduction actually doesn't exist. But as I say, this 59 00:04:33,622 --> 00:04:37,871 is very weak evidence 'cause who's to say that the reduction needs to be algebraic 60 00:04:37,871 --> 00:04:41,653 maybe there are some other types of reductions that we haven't really 61 00:04:41,653 --> 00:04:45,694 considered. So, I would encourage you to think a little bit about this question, 62 00:04:45,694 --> 00:04:50,098 it's actually quite interesting how would you use a cube root algorithm to factor 63 00:04:50,098 --> 00:04:54,906 the modulus. But as I said as far as we know RSA is a one way function and in fact 64 00:04:54,906 --> 00:04:59,690 breaking RSA, computing e-th root status actually requires factoring the modules. 65 00:04:59,690 --> 00:05:04,504 We all believe that's true. That's state of the art. But now there's has been a lot 66 00:05:04,504 --> 00:05:08,717 of work on trying to improve the performance of RSA. Either RSA encryption 67 00:05:08,717 --> 00:05:12,986 or improve the performance of RSA decryption. And it turns out there's been 68 00:05:12,986 --> 00:05:17,824 a number of false start in this direction so I wanna show you this wonderful example 69 00:05:17,824 --> 00:05:22,094 as a warning and so this basically is an example on how not to improve the 70 00:05:22,094 --> 00:05:26,135 performance of RSA. So you might think that if I wanted to speed up RSA 71 00:05:26,135 --> 00:05:30,803 decryption, remember decryption is done but raising the ciphertext to the power of 72 00:05:30,803 --> 00:05:35,585 d and you remember that the exponentiation algorithm ran in linear time in the size 73 00:05:35,585 --> 00:05:40,368 of d Linear time and log of d. So you might think to yourself well, if I wanted 74 00:05:40,368 --> 00:05:44,990 to speed up RSA decryption, why don't I just use d of a, decryption exponent 75 00:05:44,990 --> 00:05:50,105 that's on the order of two to the 128th. So it's clearly big enough to that exhaust 76 00:05:50,105 --> 00:05:55,035 a search against d is not possible, but normally the decryption exponent d would 77 00:05:55,035 --> 00:06:00,068 be as big as the modulus say 2,000 bits. By using d that's only 128th bits, I 78 00:06:00,068 --> 00:06:05,402 basically speed up the decryption by a factor twenty. Then I went down from 2,000 79 00:06:05,402 --> 00:06:10,604 bits to 100 bits so exponentiation would run twenty times as fast. It turns out 80 00:06:10,604 --> 00:06:15,214 this is a terrible idea, terrible, terrible idea in the following sense 81 00:06:15,214 --> 00:06:20,482 there's an attack by Michael [inaudible] that shows that in fact as soon as the 82 00:06:20,482 --> 00:06:25,750 private exponent d is less than the fourth root of the modulus, let's see, if the 83 00:06:25,750 --> 00:06:31,084 modulus is around twenty to 48 bits, that means that if d is less than two to the 84 00:06:31,084 --> 00:06:36,203 512th. Then RSA is completely, completely insecure. And this is, it's insecure and 85 00:06:36,203 --> 00:06:41,637 the worse possible way namely just given a public key and an e, you can very quickly 86 00:06:41,637 --> 00:06:47,364 recover the private key d. Well some folks said well this attack works out to 512 87 00:06:47,364 --> 00:06:51,929 bits so why don't we make the modulo say you know 530 bits then this attack 88 00:06:51,929 --> 00:06:56,916 actually wouldn't apply but still we get to speed up RSA decryption by a factor of 89 00:06:56,916 --> 00:07:01,815 four because we shrunk the exponent from 2000 bits To say 530 bits. Well, it turns 90 00:07:01,815 --> 00:07:06,892 out even that that's not secure. In fact, there's an extension to [inaudible] attack 91 00:07:06,892 --> 00:07:11,848 that's actually much more complicated that shows that if d is less than n to the 92 00:07:11,848 --> 00:07:16,803 .292, then also RSA isn't secure and in fact the conjuncture is that this is true 93 00:07:16,803 --> 00:07:21,636 up to n to the .5 so even if d is like n to the .4999, RSA is still, be insecure 94 00:07:21,636 --> 00:07:26,347 although this is an open problem. So again, a wonderful open problem, It's been 95 00:07:26,347 --> 00:07:31,118 open for like, you know? Its fourteen years now and no one can progress beyond 96 00:07:31,118 --> 00:07:36,362 this .292 Somehow Seems kind of strange. Why would .292 be the right answer and yet 97 00:07:36,362 --> 00:07:42,174 no one can go beyond .292? So just to be precised when I say that RSA is insecure, 98 00:07:42,174 --> 00:07:47,384 what I mean is just giving them the Public-key in n e; your goal is to recover 99 00:07:47,384 --> 00:07:52,069 the secret key d. If you're curious where .292 comes from I'll tell you that what it 100 00:07:52,069 --> 00:07:56,254 is, it's basically one - one over square root of two. Now how could this possibly 101 00:07:56,254 --> 00:08:00,545 be the right answer? The problem is much more natural than the answer is n to the 102 00:08:00,545 --> 00:08:04,678 .5 but this is still an open problem. Again, if you wanna think about that it's 103 00:08:04,678 --> 00:08:08,810 kind of a fun problem to work on. So the lesson in this is that one should not 104 00:08:08,810 --> 00:08:13,048 enforce any structure on d for improving the performance of RSA and in fact now 105 00:08:13,048 --> 00:08:17,286 there's a slew of results like this that show, that basically any kind of tricks 106 00:08:17,286 --> 00:08:21,260 like this to try and improve RSA's performance is gonna end up in disaster. 107 00:08:21,260 --> 00:08:25,792 So this is not the right way to improve RSAs performance. Initially I wasn't going 108 00:08:25,792 --> 00:08:29,559 to cover the details of Wiener's Attack but given the discussions in the class I 109 00:08:29,559 --> 00:08:33,000 think some of your would enjoy seeing the details. All it involves is just 110 00:08:33,000 --> 00:08:36,721 manipulating so many qualities. If you're not comfortable with that, feel free to 111 00:08:36,721 --> 00:08:40,487 skip over the slide although I think many of you would actually enjoy seeing the 112 00:08:40,487 --> 00:08:44,853 details. So let me remind you in Wiener's Attack basically would given the modulus 113 00:08:44,853 --> 00:08:49,277 and the RSA exponent and an e and our goal is to recover d, the private exponent d 114 00:08:49,277 --> 00:08:53,486 and all we know is that d is basically less than fourth root of n. In fact, I'm 115 00:08:53,486 --> 00:08:57,803 gonna assume that d is less than four root of n divided by three and thus three 116 00:08:57,803 --> 00:09:02,120 doesn't really matter but the dominating term here is d is less than the fourth 117 00:09:02,120 --> 00:09:08,056 root of n. So let's see how to do it. So first of all, recall that because e and d 118 00:09:08,056 --> 00:09:14,255 are RSA public and private exponents, we know that e * d is one modulo phi of n. 119 00:09:14,255 --> 00:09:20,772 Well what does that mean, that means that you're exist some integer k such that e * 120 00:09:20,772 --> 00:09:25,545 d is = k * phi of n + one. Basically that's what it means for e * d to be one 121 00:09:25,545 --> 00:09:29,964 modulo phi of n Basically, some integer multiple phi of n + one. So now let's 122 00:09:29,964 --> 00:09:34,795 stare at this equation a little bit, and in fact this equation is the key equation 123 00:09:34,795 --> 00:09:43,409 on the attack. And what we're gonna do is first of all divide both sides by d * phi 124 00:09:43,409 --> 00:09:52,343 (N). And in fact, I'm gonna move this term here to the left. So after I divide by d * 125 00:09:52,343 --> 00:09:59,908 phi (n), what I get is that e / phi(n) - k / d = one / d * phi(n). Okay. So all I did 126 00:09:59,908 --> 00:10:04,391 is I just divide it by d * 5n and moved the k * 5n and turn to the left hand side. 127 00:10:04,391 --> 00:10:09,041 Now just for the heck of it, I'm going to add absolute values here. This will become 128 00:10:09,041 --> 00:10:13,748 useful in just a minute but of course they don't change this equality at all. Now 5n 129 00:10:13,748 --> 00:10:18,230 of course is almost n. 5n is very, very close to n as we said earlier and all I'm 130 00:10:18,230 --> 00:10:22,993 gonna need then for this fraction is just to say that it's less than one over square 131 00:10:22,993 --> 00:10:27,083 root of n. It's actually much, much smaller than one over square root of n. 132 00:10:27,083 --> 00:10:31,535 It's actually in the order of one over n or even more than that. But for our 133 00:10:31,535 --> 00:10:35,978 purposes, all we need is that this fraction is less than one over square root 134 00:10:35,978 --> 00:10:40,479 of n. Now let's stare at this fraction for just a minute. You realize that this 135 00:10:40,479 --> 00:10:45,789 fraction on the left here Is a fraction that we don't actually know. We know e but 136 00:10:45,789 --> 00:10:50,901 we don't know phi of n and as a result we don't know e over phi of n. But we have a 137 00:10:50,901 --> 00:10:55,829 good approximation to e over phi of n namely we know the phi of n is very close 138 00:10:55,829 --> 00:11:00,415 to n, therefore e over phi of n is very close to e over n. So we have a good 139 00:11:00,415 --> 00:11:05,077 approximation to this left hand side fraction we have ran. What we really want 140 00:11:05,077 --> 00:11:09,678 is the right hand fraction because once we get the right hand side fractions 141 00:11:09,678 --> 00:11:14,220 basically you guys wanna involve d and then we'll able to recover d, okay. So 142 00:11:14,220 --> 00:11:19,286 let's see if we replace e over 5n by e over n. Let's see what kind of error we're 143 00:11:19,286 --> 00:11:24,646 gonna get. So to analyze that what we'll do is first of all remind ourselves that 144 00:11:24,646 --> 00:11:29,023 5n is basically n - p - q +one. Which means that n - phi (n) is gonna be less 145 00:11:29,023 --> 00:11:33,331 than p + q. Actually I should, to be precise, I should really write p + q + 146 00:11:33,331 --> 00:11:38,009 one. But, you know, who cares about this one. It's not, it doesn't really affect 147 00:11:38,009 --> 00:11:42,625 anything so I'm just gonna ignore it for simplicity. Okay? So n - phi (n) is less 148 00:11:42,625 --> 00:11:47,610 than p + q. Both p and q are roughly half the length of n so, you know, they kind of 149 00:11:47,610 --> 00:11:52,472 both on the order of square root of n. So basically, p + q we'll say is less than 150 00:11:52,472 --> 00:11:57,246 three * square root of n. Okay. So we're going to use this inequality in just a 151 00:11:57,246 --> 00:12:02,035 minute but now we're gonna start using the fact that we know something about 152 00:12:02,035 --> 00:12:06,761 [inaudible] so if we look at this inequality here, d is less than the fourth 153 00:12:06,761 --> 00:12:11,737 root of n / three, it's act ually fairly easy to see if I square both sides and I 154 00:12:11,737 --> 00:12:16,463 just manipulate these a little bit, it's difficult to see that this directly 155 00:12:16,463 --> 00:12:21,625 implies the following relation basically one / two d squared - one / square root of 156 00:12:21,625 --> 00:12:26,523 n is greater than three / square root of n. As I said this basically follows by 157 00:12:26,523 --> 00:12:31,218 square in both sides then taking the inverse of both sides and then I guess 158 00:12:31,218 --> 00:12:36,736 multiplying one side by half. Okay. So you can easily derive this relation and we'll 159 00:12:36,736 --> 00:12:42,119 need this relation in just a minute. So now let's see, what do we like to do is 160 00:12:42,119 --> 00:12:47,570 bound the difference of e / n and k / d. Well, what do we know? By the triangular 161 00:12:47,570 --> 00:12:53,229 and equality we know that this is equal to the distance between e / n and e / phi (n) 162 00:12:53,229 --> 00:12:57,901 plus the distance from e/ phi (N). 2k / d, okay? This is just what's called the 163 00:12:57,901 --> 00:13:02,620 triangular and equality. This is just the property of absolute values. Now this 164 00:13:02,620 --> 00:13:07,702 absolute value here we already know how to bound if you think about it as basically 165 00:13:07,702 --> 00:13:12,603 the bound that we've already derived so we know that this absolute value here is 166 00:13:12,603 --> 00:13:17,624 basically less than one / square root of n. Now what do we know about this absolute 167 00:13:17,624 --> 00:13:22,464 value here? What is e / n - e / 5n? Well, let's do common denominators and see what 168 00:13:22,464 --> 00:13:28,881 we get so the common denominator is gonna be n * 5n. And the numerator in this case 169 00:13:28,881 --> 00:13:34,609 is going to be e * 5n - n. Which we know from this expression here is less than 170 00:13:34,609 --> 00:13:39,245 three * square root of n. So really what this numerator is gonna be is e * three 171 00:13:39,245 --> 00:13:43,940 square root of n. The numerator is gonna be less than e * three square root of n. 172 00:13:43,940 --> 00:13:49,578 So now I know that e is less than phi (N). So I know that e over phi (n) is less than 173 00:13:49,578 --> 00:13:54,630 one. In other words, if I erased e and I erased phi (n), I've only made the 174 00:13:54,630 --> 00:14:00,196 fraction larger. Okay? So this initial absolute value is not gonna be smaller 175 00:14:00,196 --> 00:14:06,127 than three square root of n over n which is simply, three square root of n over n 176 00:14:06,127 --> 00:14:11,015 is simply three over square root of n. Okay. But what do we know about three / 177 00:14:11,015 --> 00:14:16,945 square root of n, we know that it's less than one. Over 2d squared - one / square 178 00:14:16,945 --> 00:14:22,292 root of n. Okay, so that's the end of ou r derivation. So now we know that the first 179 00:14:22,292 --> 00:14:27,024 absolute value is less than one over two d square - square root of n. The second 180 00:14:27,024 --> 00:14:31,756 absolute value is less than one over square root of n and therefore the sum is 181 00:14:31,756 --> 00:14:36,847 less than one over two d squared. And this is the expression that I want you to stare 182 00:14:36,847 --> 00:14:42,194 at, so here let me circle it a little bit. So, let me circle this part And this part. 183 00:14:42,194 --> 00:14:47,736 Now so let's stir a little bit of a distraction here and what we see is first 184 00:14:47,736 --> 00:14:53,563 of all as before. Now we know the value of e / n and what we'd like to find out is 185 00:14:53,563 --> 00:14:58,368 the value k / d. But what do we know about this fraction k over d? We know somehow 186 00:14:58,368 --> 00:15:03,053 that the difference of these two fractions is really small, it's less than one over 187 00:15:03,053 --> 00:15:07,400 two d squared. Now this turns out to happen very and frequently that k over d. 188 00:15:07,400 --> 00:15:13,167 Approximate e over n so well That the distance between the two is less than the 189 00:15:13,167 --> 00:15:17,588 square of the denominator of k over d. It turns out that, that can happen very 190 00:15:17,588 --> 00:15:21,883 often. It turns out there are very few fractions of the form k / d than 191 00:15:21,883 --> 00:15:26,903 approximate another fraction so well that their difference is less than one of the 192 00:15:26,903 --> 00:15:31,923 2d squared. And in fact the number of such factions is gonna be most logarithmic in 193 00:15:31,923 --> 00:15:36,702 n. So now there's a continued fraction of algorithm. It's a very famous fraction 194 00:15:36,702 --> 00:15:41,541 that basically what it would do Si from the fraction e / n it would recover and 195 00:15:41,541 --> 00:15:46,561 possible candidate for k / d so we just try them all one by one until we find the 196 00:15:46,561 --> 00:15:51,851 direct k / over d. And then we're done. We're done because we know that well e * d 197 00:15:51,851 --> 00:15:56,730 is one [inaudible] k, therefore, d is relatively prime to k. So if we just 198 00:15:56,730 --> 00:16:02,219 represent k over d as a rational fraction, you know? Numerator by denominator, then 199 00:16:02,219 --> 00:16:06,706 denominator must be d. And so we've just recovered, you know? We've tried all 200 00:16:06,706 --> 00:16:11,400 possible log in fractions that approximate e over n so well that the difference is 201 00:16:11,400 --> 00:16:15,980 less than one over 2d squared and then we just look at the denominator of all of 202 00:16:15,980 --> 00:16:19,995 those fractions and one of those denominators must be d and then we're 203 00:16:19,995 --> 00:16:24,856 done, we're just recovered the private key. So this is a pretty huge attack and 204 00:16:24,856 --> 00:16:30,003 it shows basically how if the private exponent is small, smaller than the fourth 205 00:16:30,003 --> 00:16:34,377 root of n, then we can recover d completely and quite easily. Okay. So, 206 00:16:34,377 --> 00:16:35,343 I'll stop here.