0:00:08.965,0:00:11.637 Today we have Daniel J. Bernstein, 0:00:11.637,0:00:14.538 Nadia Heninger, 0:00:14.538,0:00:16.889 and Tanja Lange. 0:00:16.889,0:00:20.402 And they will talk about RSA factorization in the real world. 0:00:20.402,0:00:22.289 Talk is called FactHacks. 0:00:22.289,0:00:29.905 applause 0:00:29.905,0:00:31.591 Nadia: Thank you. 0:00:31.591,0:00:37.091 So we're going to start with a crypto lecture in 5 minutes. 0:00:37.091,0:00:40.557 So, just to begin, since we're talking about RSA. 0:00:40.557,0:00:43.678 Here is a picture of RSA for you. 0:00:43.678,0:00:48.605 RSA is the first published public-key cryptosystem. 0:00:48.605,0:00:51.222 So by a public-key cryptosystem, what I mean is 0:00:51.222,0:00:54.169 a cryptosystem that has two keys intead of just having 0:00:54.169,0:00:56.886 one key. You might think "ok you encrypt a message 0:00:56.886,0:00:59.035 to that key and you decrypt a message to that key" 0:00:59.035,0:01:01.402 That is a symmetric cryptosystem. 0:01:01.402,0:01:03.194 A public-key cryptosystem has two keys. 0:01:03.194,0:01:06.042 One is the public key and one is the private key. 0:01:06.042,0:01:07.688 The public key you publish to the world and 0:01:07.688,0:01:09.509 anyone can use that key to encrypt a message, 0:01:09.509,0:01:10.806 especially to you. 0:01:10.806,0:01:14.102 And only you, who have the private key, can decrypt that. 0:01:14.102,0:01:18.739 So the RSA paper was the first published public-key cryptosystem. 0:01:18.739,0:01:20.501 That was in 1977. 0:01:20.501,0:01:22.757 This revolutionized cryptography and enabled 0:01:22.757,0:01:25.542 the development of the Internet. 0:01:25.542,0:01:31.568 So 35 years later RSA is still the most commonly used public-key cryptosystem. 0:01:31.568,0:01:35.870 If you ever use the Internet, you probably use it every day. 0:01:35.870,0:01:42.154 So here is a sample SSL certificate which uses RSA. 0:01:42.154,0:01:45.241 If you use SSH you probably have an SSH key 0:01:45.241,0:01:48.440 which probably is RSA. 0:01:48.440,0:01:51.858 Before I launch into the math 0:01:51.858,0:01:54.676 I want to explain what we're doing here. 0:01:54.676,0:01:56.742 We wanted to, since you guys are hackers, 0:01:56.742,0:01:59.874 you like code instead of formulas 0:01:59.874,0:02:02.784 so we wanted to give you formulas in code. 0:02:02.784,0:02:05.455 So what we're giving you is working Python code for 0:02:05.455,0:02:07.720 all the math that we're going to do. 0:02:07.720,0:02:09.976 And in order to do that we're going to use 0:02:09.976,0:02:12.643 some software you call Sage. 0:02:12.643,0:02:15.970 Sage is free open-source mathematics software. 0:02:15.970,0:02:20.160 It is available at this URL sagemath.org. 0:02:20.160,0:02:22.093 It's based off of Python 0:02:22.093,0:02:24.190 so it's very simple if you know Python. 0:02:24.190,0:02:27.904 It has a nice sort of interpreter 0:02:27.904,0:02:29.622 that you can use just like Python. 0:02:29.622,0:02:32.493 Here an example, 2*3=6. 0:02:32.493,0:02:34.890 But there are some differences. 0:02:34.890,0:02:38.202 For example the caret instead of in Python it's XOR 0:02:38.202,0:02:40.155 in Sage it's exponentiation. 0:02:40.155,0:02:43.869 So 2^3 is 8 in Sage. 0:02:43.869,0:02:47.351 The other nice thing about Sage is that 0:02:47.351,0:02:49.703 it has lots of useful mathematical libraries. 0:02:49.703,0:02:53.029 For example if I, in the Sage interpreter, say factor(15) 0:02:53.029,0:02:55.512 it helpfully tells me 3*5. 0:02:55.512,0:02:56.513 That's pretty great. 0:02:56.513,0:02:59.152 It also does a lot more advanced things, for example 0:02:59.152,0:03:00.467 it can represent polynomials. 0:03:00.467,0:03:03.498 I can say factor the polynomial x^2-1 0:03:03.498,0:03:08.127 and it helpfully tells me the answer to that. 0:03:08.127,0:03:12.819 So now that we've gone through what is Sage, 0:03:12.819,0:03:17.443 I want to explain how to do RSA. 0:03:17.443,0:03:21.065 Perhaps some of you have seen this before. 0:03:21.065,0:03:24.677 In order to generate an RSA key, the first thing that you do 0:03:24.677,0:03:27.435 is you generate two large random primes. 0:03:27.435,0:03:30.441 So if we want a 1024 bit RSA key 0:03:30.441,0:03:34.779 we generate two primes of size 2^512, 0:03:34.779,0:03:38.515 so 512 bit primes, p and q. 0:03:38.515,0:03:41.697 In order to generate the public key 0:03:41.697,0:03:44.900 I multiply together p and q and you get some N. 0:03:44.900,0:03:46.433 N has 1024 bits. 0:03:46.433,0:03:49.915 And then you have what's known as the public exponent 0:03:49.915,0:03:54.435 which you can choose 3 or 65537 or 35. 0:03:54.435,0:03:56.300 It really doesn't matter what you choose. 0:03:56.300,0:03:59.117 These are some of the most commonly used values. 0:03:59.117,0:04:03.129 Now in order to generate a private key 0:04:03.129,0:04:07.695 you choose d, the decryption exponent, 0:04:07.695,0:04:12.936 to be the modular inverse of e mod (p-1)*(q-1) 0:04:12.936,0:04:15.394 It doesn't matter why so much, for this talk 0:04:15.394,0:04:17.276 but here is the formula 0:04:17.276,0:04:20.295 or here is the command to do this in Sage. 0:04:20.295,0:04:23.150 So now we have a public key and a private key 0:04:23.150,0:04:24.508 that are pairs. 0:04:24.508,0:04:27.948 And in RSA if you want to enrypt a message 0:04:27.948,0:04:31.547 you raise your message to the e'th power mod n 0:04:31.547,0:04:34.753 so you can do that with your public key here. 0:04:34.753,0:04:36.776 And similarly decryption, 0:04:36.776,0:04:39.998 you raise the ciphertext to the d'th power mod n 0:04:39.998,0:04:42.835 and that will give you the message back out again. 0:04:42.835,0:04:45.748 Fairly simple. 0:04:45.748,0:04:48.731 Now I'm lying to you slightly. 0:04:48.731,0:04:51.088 If you read this talk do not 0:04:51.088,0:04:53.181 absolutely do not implement RSA this way. 0:04:53.181,0:04:54.757 You must pad your message. 0:04:54.757,0:04:57.186 This is a public service warning. 0:04:57.186,0:05:00.662 We offered to tell you about factoring. 0:05:00.662,0:05:03.941 What is the relationship between RSA and factoring? 0:05:03.941,0:05:08.972 Now you can see easily from the formula 0:05:08.972,0:05:11.760 from the private key here 0:05:11.760,0:05:16.337 that if you know how to factor N into its prime factors p and q 0:05:16.337,0:05:19.496 then you can just compute d, the decryption exponent 0:05:19.496,0:05:20.812 of your private key. 0:05:20.812,0:05:24.395 Clearly, if you can factor N you can compute the private key. 0:05:24.395,0:05:28.930 But we don't actually know that factoring is the only way to break RSA. 0:05:28.930,0:05:34.163 There might be some way that you can compute messages from ciphertexts 0:05:34.163,0:05:36.877 that doesn't actually reveal the private key. 0:05:36.877,0:05:41.545 And there's no proof either way that this is or is not possible. 0:05:41.545,0:05:43.478 The last thing that I want to mention here 0:05:43.478,0:05:45.458 just to clarify things 0:05:45.458,0:05:46.830 some of you who might have taken 0:05:46.830,0:05:51.527 complexity or algorithms courses or a beginning CS course 0:05:51.527,0:05:55.874 you might have heard of NP-hardness, the P vs. NP problem. 0:05:55.874,0:06:00.904 And I just want to clarify that factoring is not known to be NP-hard. 0:06:00.904,0:06:05.092 Every computer scientist will love it if we could actually generate 0:06:05.092,0:06:11.139 a cryptosystem which is known to be based of average case NP-hardness. 0:06:11.139,0:06:13.573 We don't know how to do that. 0:06:13.573,0:06:16.992 RSA is not doing that. 0:06:16.992,0:06:22.309 In addition the factoring problem is probably not NP-hard. 0:06:22.309,0:06:25.892 The question is: how hard is factoring? 0:06:25.892,0:06:27.474 Since factoring is the best way that we know 0:06:27.474,0:06:31.227 to break the most commonly used public-key cryptosystem on the Internet. 0:06:31.227,0:06:32.935 Well I showed you some examples before 0:06:32.935,0:06:37.243 where Sage had this helpful little factor function. 0:06:37.243,0:06:39.164 So how well does it work? 0:06:39.164,0:06:41.181 How long do people think this takes? 0:06:41.181,0:06:46.149 two 32 bit integers multiplied together. 0:06:46.149,0:06:48.850 I heard it'll go a couple of seconds. 0:06:48.850,0:06:50.098 0.01 seconds. 0:06:50.098,0:06:55.108 This is on this computer. 0:06:55.108,0:06:57.869 How about 64 bit integers, two of them multiplied together? 0:06:57.869,0:07:01.289 So this is a 128 bit RSA modulus. 0:07:01.289,0:07:04.382 Any guesses? 0:07:04.382,0:07:05.464 One minute? 0:07:05.464,0:07:08.235 Nope, 0.1 seconds. 0:07:08.235,0:07:14.521 How about two 96 bit integers multiplied together? 0:07:14.521,0:07:15.939 A couple of minutes? 0:07:15.939,0:07:18.105 4 seconds. 0:07:18.105,0:07:21.965 How about two 128 bit integers multiplied together? 0:07:21.965,0:07:28.255 This is a 256 bit RSA modulus. 0:07:28.255,0:07:29.300 No guesses? Alright. 0:07:29.300,0:07:30.430 500 seconds. 0:07:30.430,0:07:32.791 So at this point it's starting to get a little bit iffy if I'm 0:07:32.791,0:07:36.327 sitting on a plane trying to do this on battery power. 0:07:36.327,0:07:39.248 laughter 0:07:39.248,0:07:43.764 Clearly this is growing faster than linear in the size of the key. 0:07:43.764,0:07:45.597 So what is actually going on? 0:07:45.597,0:07:49.066 Well. 0:07:49.066,0:07:50.734 Dan: Ok. 0:07:50.734,0:07:52.794 Turns out these factoring functions 0:07:52.794,0:07:54.597 they do look like they're getting pretty slow but 0:07:54.597,0:07:56.835 wait a minute. 0:07:56.835,0:07:59.814 Do we actually have to use this factor function 0:07:59.814,0:08:00.866 if it's getting really slow? 0:08:00.866,0:08:04.302 Maybe instead of trying to use Sage's factor function 0:08:04.302,0:08:08.710 maybe we could just guess what the prime was. 0:08:08.710,0:08:12.350 Now as an RSA user you might not think that this is a threat 0:08:12.350,0:08:13.928 having somebody guess your prime 0:08:13.928,0:08:17.114 because there is some way to count the number of primes, 0:08:17.114,0:08:18.719 number of 512 bit primes and there's 0:08:18.719,0:08:22.185 more than 2^500 512 bit primes. 0:08:22.185,0:08:24.011 And if your random number generator is giving you 0:08:24.011,0:08:27.097 any of those 512 bit primes with the same chance 0:08:27.097,0:08:30.829 then any particular prime that the attacker guesses 0:08:30.829,0:08:32.845 and just tries dividing n by that, 0:08:32.845,0:08:35.061 see if it's evenly divisable, well 0:08:35.061,0:08:39.417 any particular prime has only a 2^(-500) chance 0:08:39.417,0:08:41.312 of being guessed by the attacker. 0:08:41.312,0:08:43.319 And even if the attacker tries a huge number of times 0:08:43.319,0:08:44.897 he'll never guess your prime 0:08:44.897,0:08:47.621 except what if your random number generator 0:08:47.621,0:08:52.031 is working really really badly and doesn't give you 2^500 different primes? 0:08:52.031,0:08:55.902 What if it only gives you, say, 2^40 different primes? 0:08:55.902,0:08:58.055 And then suddenly the attacker could, well, 0:08:58.055,0:09:01.248 try figuring out what those primes are. 0:09:01.248,0:09:03.237 So here's how the attacker looks at this. 0:09:03.237,0:09:05.819 The attacker says: ok I'm gonna target your public key. 0:09:05.819,0:09:08.884 I'm gonna take your big N which is your secret... 0:09:08.884,0:09:10.912 sorry, public key is big N, 0:09:10.912,0:09:13.666 your secret key p times your secret key q, 0:09:13.666,0:09:14.920 your private p and q. 0:09:14.920,0:09:16.764 He's trying to figure out the p and q, given N. 0:09:16.764,0:09:20.169 So what he does, he takes a bunch of devices that are out there 0:09:20.169,0:09:22.902 and he looks at how they work 0:09:22.902,0:09:25.281 or downloads a bunch of software which generates keys 0:09:25.281,0:09:28.904 and then using that software, using those devices 0:09:28.904,0:09:32.216 he tries generating a bunch of p's and q's for himself 0:09:32.216,0:09:34.516 using whatever the random number generator is 0:09:34.516,0:09:36.932 in that device or in that software. 0:09:36.932,0:09:38.400 And then that's something which 0:09:38.400,0:09:40.682 maybe the random number generator works really badly 0:09:40.682,0:09:42.832 and maybe you're using that device, too. 0:09:42.832,0:09:45.765 So the attacker tries each of the primes that he see's 0:09:45.765,0:09:46.918 from this device 0:09:46.918,0:09:50.129 and tries seeing does that prime divide your N. 0:09:50.129,0:09:51.400 If you were using that device 0:09:51.400,0:09:52.932 and it doesn't generate very many primes 0:09:52.932,0:09:55.835 then maybe he gets lucky and finds your prime 0:09:55.835,0:09:57.968 because the device has generated the same prime for him 0:09:57.968,0:09:59.618 that it generated for you. 0:09:59.618,0:10:03.216 Now does anybody actually build devices which 0:10:03.216,0:10:07.005 screw up random numbers this badly? 0:10:07.005,0:10:08.519 laughter 0:10:08.519,0:10:11.538 As a matter of fact, yes, bears do shit in the woods. 0:10:11.538,0:10:13.543 So there are some famous examples. 0:10:13.543,0:10:16.952 We don't have a huge number of historical discussions in this talk 0:10:16.952,0:10:18.539 but just a few examples here. 0:10:18.539,0:10:22.401 Goldberg&Wagner totally broke the original Netscape 0:10:22.401,0:10:23.989 generation of public keys. 0:10:23.989,0:10:26.921 There are only something like 2^47 possible keys 0:10:26.921,0:10:29.216 which is a countable numbers. 0:10:29.216,0:10:31.077 So they could figure out all the possible keys 0:10:31.077,0:10:34.139 that Netscape could generate if they you knew when you generated a key 0:10:34.139,0:10:35.802 which was often very predictable. 0:10:35.802,0:10:37.724 Another example: the Debian bug, 0:10:37.724,0:10:40.974 the horrendous failure for SSH and lots of other applications 0:10:40.974,0:10:42.418 from a few years ago. 0:10:42.418,0:10:45.692 Debian and Ubuntu for a year and half were generating only 0:10:45.692,0:10:48.773 well, fewer than a million possible public keys. 0:10:48.773,0:10:51.421 So complete disasters of random number generation. 0:10:51.421,0:10:53.826 But if you want to add to this list, 0:10:53.826,0:10:55.202 if you want to do more and more of these 0:10:55.202,0:10:56.974 find bad random number generators 0:10:56.974,0:11:00.518 you have to say "ok what's the next device out there 0:11:00.518,0:11:01.858 that might have been screwed up." 0:11:01.858,0:11:03.422 "Let's look at it, look at the code." 0:11:03.422,0:11:05.039 "See what kind of primes it generates." 0:11:05.039,0:11:06.571 "Does it do badly?" 0:11:06.571,0:11:09.424 It takes some real work to figure this out. 0:11:09.424,0:11:11.801 Wouldn't it be nice if there was a systematic way 0:11:11.801,0:11:14.586 to just without having to do a lot of work, 0:11:14.586,0:11:16.576 to look at each device, each piece of software, 0:11:16.576,0:11:18.403 wouldn't it be nice to just figure out 0:11:18.403,0:11:21.568 "oh yeah, let's just magically figure out 0:11:21.568,0:11:23.726 if there's any primes out there generated 0:11:23.726,0:11:25.507 from bad random number generators." 0:11:25.507,0:11:28.020 You could imagine here's the easy attack. 0:11:28.020,0:11:29.622 Here's the systematic way to do this. 0:11:29.622,0:11:34.052 You take two users, so you and you. 0:11:34.052,0:11:35.442 And you each have a public key. 0:11:35.442,0:11:37.178 We're gonna grab those public keys 0:11:37.178,0:11:41.681 N1 and N2 0:11:41.681,0:11:46.983 and then hope that this N1 and N2 share a prime. 0:11:46.983,0:11:54.130 N1 is some prime p times some q1 and N2 is some same prime p times a different q2. 0:11:54.130,0:11:55.663 Now this could happen, you could have 0:11:55.663,0:11:59.100 N1 and N2 sharing both primes. 0:11:59.100,0:12:02.101 That would be legitimate that two people who are actually 0:12:02.101,0:12:05.645 the same webserver sharing their keys, sharing their certificates, 0:12:05.645,0:12:06.843 they know they're doing this, 0:12:06.843,0:12:09.281 you can get the same public key from two places. 0:12:09.281,0:12:12.183 Google has their keys copied to tens of thousands of servers, 0:12:12.183,0:12:13.399 that's not a surprise. 0:12:13.399,0:12:15.395 But if there's a different... 0:12:15.395,0:12:16.999 If you have two different public keys 0:12:16.999,0:12:18.430 they shouldn't be sharing primes. 0:12:18.430,0:12:19.582 What if they do? 0:12:19.582,0:12:21.979 Well, here's this magic Euclid's algorithm 0:12:21.979,0:12:24.744 which will just print out the shared prime p. 0:12:24.744,0:12:26.066 This is a very simple algorithm. 0:12:26.066,0:12:28.495 It just takes one of the numbers, say N1 0:12:28.495,0:12:31.961 and let's say N1 is the smaller one and N2 is the bigger one, 0:12:31.961,0:12:34.142 it divides N2 by N1 0:12:34.142,0:12:37.512 and replaces N2 by the remainder and then switches the numbers 0:12:37.512,0:12:39.593 and that's exactly what this code does. 0:12:39.593,0:12:43.111 The N2 % N1 that's again the N2 mod N1 0:12:43.111,0:12:45.543 the least remainder when you divide N2 by N1. 0:12:45.543,0:12:49.414 And then the result once one of the numbers gets down to zero 0:12:49.414,0:12:52.292 the other number is exactly the shared prime. 0:12:52.292,0:12:54.149 So this amazing algorithm, 0:12:54.149,0:12:57.051 here's just a little manual example 0:12:57.051,0:12:59.898 not with big 512, 1024 bit keys 0:12:59.898,0:13:04.143 this is with just tiny little I guess 8 bit primes 0:13:04.143,0:13:07.625 16 bit... well looks more like 13 bit keys. 0:13:07.625,0:13:12.375 So two numbers coming in and one is 4187 and two is 5989. 0:13:12.375,0:13:14.879 You can see if you keep dividing one number by the other, 0:13:14.879,0:13:16.162 replacing it by the remainder, 0:13:16.162,0:13:19.411 do that a few times, you quickly get down to zero. 0:13:19.411,0:13:22.774 And the number right before the zero in the middle of the slide is 53. 0:13:22.774,0:13:27.172 Now 53 is a shared prime between these two small keys. 0:13:27.172,0:13:28.772 If you scale this up, 0:13:28.772,0:13:32.256 if you do this computation for bigger numbers 0:13:32.256,0:13:34.176 then here's another timing example. 0:13:34.176,0:13:36.673 It's still really really fast, in the middle of the slide. 0:13:36.673,0:13:40.359 This is doing 1024 bit keys that share a 512 bit prime. 0:13:40.359,0:13:44.554 You can figure out this 512 bit prime so quickly that the timing 0:13:44.554,0:13:48.848 there is 0.00 seconds, can't even see how fast it is. 0:13:48.848,0:13:51.094 Ok, so that's Euclid's algorithm. 0:13:51.094,0:13:53.676 Now you can actually do this for tons and tons of keys. 0:13:53.676,0:13:57.695 You download, say, millions, tens of millions of keys from the Internet, 0:13:57.695,0:14:01.011 all the different SSL servers, maybe go for some SSH keys, 0:14:01.011,0:14:03.891 you download tons and tons of public keys and then 0:14:03.891,0:14:06.590 you try all the pairs with Euclid's algorithm. 0:14:06.590,0:14:09.406 And it's so fast that you can do this but actually 0:14:09.406,0:14:10.848 you can do better. 0:14:10.848,0:14:13.927 So there's an algorithm: batch GCD algorithm 0:14:13.927,0:14:15.396 which takes a whole bunch of keys 0:14:15.396,0:14:19.822 and figures out which of those keys share primes with the others. 0:14:19.822,0:14:22.525 Much much faster than trying every pair. 0:14:22.525,0:14:23.909 So you don't have to try every pair. 0:14:23.909,0:14:26.973 It's kind of like sorting, it's about that kind of speed 0:14:26.973,0:14:29.065 instead of having to try every pair. 0:14:29.065,0:14:30.811 If you have tens of millions you don't have to do 0:14:30.811,0:14:33.156 tens of millions times tens of millions of pairs. 0:14:33.156,0:14:35.739 You just have to pretty much reap through the whole list once. 0:14:35.739,0:14:38.877 So what does this batch GCD algorithm do? 0:14:38.877,0:14:41.447 It's figuring out for the first N1, 0:14:41.447,0:14:44.029 instead of computing the greatest common divisor, 0:14:44.029,0:14:45.908 the Euclid result for N1 and N2 0:14:45.908,0:14:47.846 and for N1 and N3 and so on, 0:14:47.846,0:14:52.340 it figures out the product of N2, N3, N4 and so on, 0:14:52.340,0:14:55.257 takes the greatest common divisor of that with N1, 0:14:55.257,0:14:57.191 applies Euclid's algorithm to that 0:14:57.191,0:15:00.874 and then does a similar thing for N2 and for N3 and so on. 0:15:00.874,0:15:05.656 Each of those is gonna show you whether N1 or N2 or N3, 0:15:05.656,0:15:08.148 it's gonna show you whether they share some prime 0:15:08.148,0:15:10.246 with any of the other N's. 0:15:10.246,0:15:13.326 If you did exactly the computation 0:15:13.326,0:15:15.248 shown on the slide in the math formulas 0:15:15.248,0:15:16.617 it would still be too slow. 0:15:16.617,0:15:20.457 But the batch GCD algorithm finds redundancies 0:15:20.457,0:15:23.218 in these GCDs, and here's exactly what it does: 0:15:23.218,0:15:25.620 It figures out the product of all of the keys. 0:15:25.620,0:15:29.923 So you take, say, a million keys and you multiply them all together. 0:15:29.923,0:15:33.534 And that's done with the code here where you do... 0:15:33.534,0:15:35.120 maybe I'll just skip to the bottom. 0:15:35.120,0:15:37.474 You see numbers 10, 20, 30, 40. 0:15:37.474,0:15:39.356 You build a product tree. 0:15:39.356,0:15:42.210 So you take 10 times 20, compute 200. 0:15:42.210,0:15:44.687 Take the next pair 30 times 40, compute 1200. 0:15:44.687,0:15:47.889 The take the pair of products 200 times 1200. 0:15:47.889,0:15:50.378 If you have more numbers, you have more levels in the tree. 0:15:50.378,0:15:55.227 And that's exactly what the Sage code does here. 0:15:55.227,0:15:58.924 And then the last step of the algorithm is 0:15:58.924,0:16:02.847 kind of going down the tree, building a remainder tree. 0:16:02.847,0:16:06.515 So what's happening here to figure out the greatest common divisor 0:16:06.515,0:16:10.013 of N1 with the product of all the other keys, 0:16:10.013,0:16:12.846 the idea is to take this product 0:16:12.846,0:16:15.807 which is called big R on this slide, 0:16:15.807,0:16:23.160 take the product of all the keys and then divide that by N1^2 0:16:23.160,0:16:26.291 and then take the remainders, so R mod N1^2, 0:16:26.291,0:16:31.148 divide by N1 and then compute the greatest common divisor of N1 mod N. 0:16:31.148,0:16:33.124 And the amazing effect is that it's exactly the 0:16:33.124,0:16:35.047 greatest common divisor we were looking for. 0:16:35.047,0:16:37.629 And what this little Sage script does is that it takes 0:16:37.629,0:16:42.282 R and divides R by, well a bunch of N's in a really fast way, 0:16:42.282,0:16:43.960 and then at the bottom, once it has figured out 0:16:43.960,0:16:47.227 all mod N1^2, all mod N2^2 and so on, 0:16:47.227,0:16:49.308 divides those by N1 and N2, 0:16:49.308,0:16:52.177 takes the greatest common divisor with N1 and N2 0:16:52.177,0:16:54.824 and that's exactly telling you, the outputs here 0:16:54.824,0:16:57.063 that are different from 1, are exactly telling you 0:16:57.063,0:17:00.398 which N's share a prime with some others. 0:17:00.398,0:17:04.048 This very remarkably short little script does work quite well. 0:17:04.048,0:17:06.524 Here's how fast these are. 0:17:06.524,0:17:10.248 So I'm on some 800 MHz core. 0:17:10.248,0:17:13.442 If you try this for, say, a hundred numbers, 0:17:13.442,0:17:16.608 just hundred random 1024 bit numbers, 0:17:16.608,0:17:20.917 it doesn't matter what the numbers are, they always run at the same speed, 0:17:20.917,0:17:23.198 then that takes 0.05 seconds. 0:17:23.198,0:17:27.249 If you have 10 times as many numbers it takes about 20 times as long. 0:17:27.249,0:17:31.229 And if you have another 10 times as many numbers it takes about 20 times as long. 0:17:31.229,0:17:33.247 And it keeps scaling up like that. 0:17:33.247,0:17:35.948 So you can get to millions or tens of millions of numbers 0:17:35.948,0:17:38.582 and it still finishes in a reasonable amount of time. 0:17:38.582,0:17:40.068 So, amazingly fast algorithm. 0:17:40.068,0:17:43.810 You don't have to scale this up to like 2^50 or 2^100 keys. 0:17:43.810,0:17:46.059 There's only, say, tens of millions of keys out there 0:17:46.059,0:17:48.330 and this runs reasonably fast. 0:17:48.330,0:17:50.626 Ok. 0:17:50.626,0:17:52.416 Can you actually hope for this to work? 0:17:52.416,0:17:54.629 Are random number generators really this bad? 0:17:54.629,0:17:55.866 Well... 0:17:55.866,0:18:00.478 laughs 0:18:00.478,0:18:03.435 There's a 2012 paper from Heninger 0:18:03.435,0:18:04.926 that's the same Heninger we have sitting here 0:18:04.926,0:18:06.300 and Durumeric, Wustrow, Halderman, 0:18:06.300,0:18:09.881 that's got the best paper award at USENIX Security 2012. 0:18:09.881,0:18:14.678 What they did was they tried exactly this on tens of millions of keys out there 0:18:14.678,0:18:18.276 and factored tens of thousands of keys. 0:18:18.276,0:18:22.677 Admittely they used a C version of this instead of this 0:18:22.677,0:18:24.061 Sage script but 0:18:24.061,0:18:25.827 the Sage script will go up to millions. 0:18:25.827,0:18:28.359 It's just when you get beyond that you want to write it in C 0:18:28.359,0:18:31.009 to deal with using a lot of memory. 0:18:31.009,0:18:34.887 But you can use exactly this script for pretty large chunks of keys. 0:18:34.887,0:18:37.951 And so they factored tens of thousands of keys and said 0:18:37.951,0:18:42.235 that these keys are typically not your bank keys. 0:18:42.235,0:18:44.909 They gonna be, say, keys for your home router. 0:18:44.909,0:18:47.639 So the vulnerable devices are not gonna be, 0:18:47.639,0:18:49.893 say, a big server out there. 0:18:49.893,0:18:53.902 The vulnerable devices will be your FRITZ!Box. 0:18:53.902,0:18:57.340 applause 0:18:57.340,0:18:59.838 Thank you, Nadia. 0:18:59.838,0:19:06.525 It's good to read the paper, go to factorable.net 0:19:06.525,0:19:09.073 and you get tons of analyses of why this happened 0:19:09.073,0:19:12.520 and why these devices are generated guessable numbers. 0:19:12.520,0:19:15.020 I mean numbers that are so non-random that they get 0:19:15.020,0:19:18.370 repeated between a lot of keys out there. 0:19:18.370,0:19:20.308 There was at the same time another team that 0:19:20.308,0:19:23.559 like within days announced the same result. 0:19:23.559,0:19:27.071 They independently did the same computation. 0:19:27.071,0:19:28.883 They said "yeah, we've also downloaded 0:19:28.883,0:19:31.352 a bunch of keys and factored as many as we could" 0:19:31.352,0:19:32.869 with basically the same algorithm. 0:19:32.869,0:19:35.198 At the end of it "yeah we factored 0:19:35.198,0:19:36.629 tens of thousands of keys." 0:19:36.629,0:19:38.882 That paper has no analysis. 0:19:38.882,0:19:41.582 They're like "yeah, there's keys" 0:19:41.582,0:19:43.267 "they share primes for some reason" 0:19:43.267,0:19:45.915 "I guess ecommerce is dead" 0:19:45.915,0:19:47.536 "go out, change your bank keys" 0:19:47.536,0:19:48.846 "call the New York Times" 0:19:48.846,0:19:50.553 There's the New York Times article: 0:19:50.553,0:19:52.601 "Flaw Found in an Online Encryption Method" 0:19:52.601,0:19:54.860 I also like the advertising on the side here. 0:19:54.860,0:19:57.151 Like iron key on the top and then: 0:19:57.151,0:20:01.168 "Encrypt, decrypt & access my secret data in real time" 0:20:01.168,0:20:03.135 "try it free for 30 days." 0:20:03.135,0:20:05.199 "I secured my cloud data. Have you?" 0:20:05.199,0:20:06.878 Awesome advertising there. 0:20:06.878,0:20:12.480 Once you found that there's a problem like this 0:20:12.480,0:20:14.919 then of course you start looking around for more and more places 0:20:14.919,0:20:17.435 where you might have some bad randomness. 0:20:17.435,0:20:22.795 For example there is a paper, or at least slides, 0:20:22.795,0:20:25.395 by Chou in Taiwan, saying 0:20:25.395,0:20:31.857 they took two million Taiwan Citizen Digital Certificates 0:20:31.857,0:20:34.327 and did some GCDs 0:20:34.327,0:20:36.626 and found that a 103 of those were 0:20:36.626,0:20:37.995 factorable. 0:20:37.995,0:20:40.052 So anybody who downloaded these certificates 0:20:40.052,0:20:43.177 which apparently are used in Taiwan for like paying your taxes, 0:20:43.177,0:20:44.393 talking to banks I don't know, 0:20:44.393,0:20:46.677 but paying taxes is one that I've checked. 0:20:46.677,0:20:49.537 So a 103 people out there have these Taiwan 0:20:49.537,0:20:51.011 Citizen Digital Certificates where 0:20:51.011,0:20:54.121 you're supposed to have in this database things like 0:20:54.121,0:20:57.122 the names of the people and their ID numbers 0:20:57.122,0:20:59.576 but you aren't supposed to have their secret keys. 0:20:59.576,0:21:01.543 The whole point is those are supposed to be private 0:21:01.543,0:21:03.938 kept on some smartcards that are issued to the citizens 0:21:03.938,0:21:05.295 in Taiwan. 0:21:05.295,0:21:08.706 And the randomness generation is so bad that 0:21:08.706,0:21:13.169 103 of those keys have now been factored. 0:21:13.169,0:21:15.451 The smartcard manufacturer, I also like this quote, 0:21:15.451,0:21:18.674 it's a company based in Munich called 0:21:18.674,0:21:22.315 Giesecke & Devrient and their motto is: 0:21:22.315,0:21:24.753 "Creating Confidence" 0:21:24.753,0:21:35.270 applause 0:21:35.270,0:21:37.698 Tanja: There's gonna be more of Dan, don't worry. 0:21:37.698,0:21:39.835 But we promised a bit more than just factoring 0:21:39.835,0:21:43.309 bad keys or the keys with their primes. 0:21:43.309,0:21:46.956 If you find another number like this one. 0:21:46.956,0:21:49.686 So here's coming some more of math stuff. 0:21:49.686,0:21:52.555 So if you see a number like this and you 0:21:52.555,0:21:56.427 observe this last digit, then yes. 0:21:56.427,0:21:58.738 It's very easy for you to see it's divisable by 5. 0:21:58.738,0:22:01.385 So if you find such a key that is easy to break 0:22:01.385,0:22:03.226 and you are a computer rather than a human being 0:22:03.226,0:22:05.324 you look at lots of those numbers. 0:22:05.324,0:22:07.076 You have a list of small primes stored 0:22:07.076,0:22:10.037 and what you're gonna do is just trial division. 0:22:10.037,0:22:11.987 So if there's any small factor 0:22:11.987,0:22:14.092 it's very easy to find this trial division. 0:22:14.092,0:22:16.708 Or what Dan showed was the remainder trees. 0:22:16.708,0:22:20.222 You can also do this for batch trial division. 0:22:20.222,0:22:23.728 So assuming that you're looking for a prime p somewhere 0:22:23.728,0:22:26.735 then you go linearly all the way up to the p 0:22:26.735,0:22:30.442 and there are about p/log(p) many primes 0:22:30.442,0:22:32.094 before you find this p. 0:22:32.094,0:22:33.969 For each of them you just do exact division. 0:22:33.969,0:22:36.558 If it works, fine you found a factor, otherwise not. 0:22:36.558,0:22:39.459 Now this is not going to work against your normal keys. 0:22:39.459,0:22:43.624 I know that in the example that Dan mentioned by Wagner&Goldberg 0:22:43.624,0:22:46.625 that they found one factor which had a 9 in there 0:22:46.625,0:22:49.975 but usually your keys don't have a 9. 0:22:49.975,0:22:54.687 For slightly larger numbers there is a method due to Pollard. 0:22:54.687,0:22:57.460 So if you see the respectful number here, 0:22:57.460,0:22:59.887 there is no obvious divisability of this one. 0:22:59.887,0:23:03.091 One thing you can try is 0:23:03.091,0:23:06.660 a walk, well we call this a walk. 0:23:06.660,0:23:10.404 You start with some random number smaller than N 0:23:10.404,0:23:12.902 and some offset here. 0:23:12.902,0:23:17.277 I should also say, all those scripts are available at this URL. 0:23:17.277,0:23:22.087 So if you go to facthacks.cr.yp.to then you'll find all the scripts 0:23:22.087,0:23:23.879 if you want to play with it at the same time, 0:23:23.879,0:23:26.625 including this one with some explanations and such. 0:23:26.625,0:23:31.092 So what you do is you start off two different numbers, 0:23:31.092,0:23:34.788 one at each step squares it and adds c. 0:23:34.788,0:23:37.226 And the other one squares it, adds c, 0:23:37.226,0:23:39.886 and squares it again and adds the c. 0:23:39.886,0:23:41.455 Each time computing mod N. 0:23:41.455,0:23:44.102 So you have two sequences running around, 0:23:44.102,0:23:46.023 one is twice the speed as the other. 0:23:46.023,0:23:50.953 At every step you check with the GCD of this number N 0:23:50.953,0:23:52.757 and the difference between the two walks 0:23:52.757,0:23:55.006 hasn't anything interesting now. 0:23:55.006,0:23:57.390 N is an RSA modulus, the only thing interesting 0:23:57.390,0:23:59.675 could either be p or q. 0:23:59.675,0:24:03.009 So for this particular number 0:24:03.009,0:24:04.270 when I run this 0:24:04.270,0:24:08.894 I find 2053 after a few milliseconds. 0:24:08.894,0:24:10.642 So this is again a cooked up example. 0:24:10.642,0:24:13.307 You won't find such a small factor if you're trying 0:24:13.307,0:24:16.173 a general RSA factor, otherwise your numbers 0:24:16.173,0:24:17.830 would be totally busted but 0:24:17.830,0:24:20.604 this shows if you have a small number in there 0:24:20.604,0:24:21.857 it's very easy to find. 0:24:21.857,0:24:24.357 All later on when Dan is talking about the number field sieve 0:24:24.357,0:24:26.157 this is an interesting subroutine. 0:24:26.157,0:24:28.089 So if you ever need to find small numbers 0:24:28.089,0:24:31.020 then, sure trial division is very easy for 0:24:31.020,0:24:34.080 really small numbers taking this long, 0:24:34.080,0:24:38.352 this one squared with p is much faster already. 0:24:38.352,0:24:42.070 Now even faster than that, also due to Pollard, 0:24:42.070,0:24:45.378 called Pollard's p-1 method. 0:24:45.378,0:24:47.904 Here is again an innocent looking number N. 0:24:47.904,0:24:49.842 You see the numbers get larger. 0:24:49.842,0:24:52.176 This is a 256 bit number. 0:24:52.176,0:24:56.744 In order to deal with such a number there is one step 0:24:56.744,0:24:59.342 which is expensive but you would do it only once 0:24:59.342,0:25:01.208 for all different numbers you want to try, 0:25:01.208,0:25:05.238 namely compute a big integer which has lots of little factors. 0:25:05.238,0:25:06.985 So you take the least common multiple 0:25:06.985,0:25:09.788 from 1, 2, 3, 4, 5... 0:25:09.788,0:25:12.203 Ok, once you hit the 4 you have another 2 in there. 0:25:12.203,0:25:13.808 So there's lots and lots of powers of 2, 0:25:13.808,0:25:15.408 lots and lots of powers of 3. 0:25:15.408,0:25:19.390 And then the larger primes only appear like once. 0:25:19.390,0:25:23.161 But you're not going up to 256 or 128 anywhere. 0:25:23.161,0:25:27.005 You're sticking here at 22 bits 0:25:27.005,0:25:29.059 and then use the same function that Nadia explained 0:25:31.023,0:25:32.987 in the RSA computation, namely the modular exponentiation. 0:25:32.987,0:25:36.576 So you take the 2^y, 0:25:36.576,0:25:38.241 this huge number, 0:25:38.241,0:25:40.486 but the whole computation mod N. 0:25:40.486,0:25:44.633 So this whole thing never grows beyond this 256 bit number. 0:25:44.633,0:25:47.939 Or for a real world example it would be a 1000 bit number. 0:25:47.939,0:25:52.518 And then you compute the GCD of this number and N. 0:25:52.518,0:25:55.091 For this particular one, again it's a cooked up example, 0:25:55.091,0:25:56.390 you get a factor. 0:25:56.390,0:25:59.791 Now this factor is actually 120 bits. 0:25:59.791,0:26:02.369 This is not a small factor. 0:26:02.369,0:26:05.688 So if you were using 256 bit numbers 0:26:05.688,0:26:07.856 this is something that could happen to you 0:26:07.856,0:26:11.305 and nevertheless this method would find it. 0:26:11.305,0:26:13.522 So this finds much larger factors than trial division 0:26:13.522,0:26:15.484 or the Rho method in the same amount of time. 0:26:15.484,0:26:17.376 But it doesn't work for all primes. 0:26:17.376,0:26:20.066 It works for primes which are kind of special. 0:26:20.066,0:26:23.859 Now what's special about this prime or the other factor? 0:26:23.859,0:26:29.553 If you take p and subtract -1, hence the name p-1 method, 0:26:29.553,0:26:31.660 and factor that thing, 0:26:31.660,0:26:35.272 that has lots, lots of small numbers. 0:26:35.272,0:26:36.937 So that's something we call smooth. 0:26:36.937,0:26:40.342 So a smooth number doesn't have big factors. 0:26:40.342,0:26:43.868 So a prime is like the least smooth number you can imagine. 0:26:43.868,0:26:47.367 And 2 to the power large is the most smooth number, 0:26:47.367,0:26:50.007 so lots of little factors means smooth. 0:26:50.007,0:26:55.568 This largest factor happens to be 21.7 bits 0:26:55.568,0:26:58.355 which is why I chose the 22 bits here. 0:26:58.355,0:27:03.702 So as soon I run over the largest prime in p-1 0:27:03.702,0:27:06.127 with this exponent y here 0:27:06.127,0:27:08.653 then I do find this factor. 0:27:08.653,0:27:10.851 Now if you thought this was math 0:27:10.851,0:27:12.322 there is even more scary math 0:27:12.322,0:27:15.893 namely there are methods which are generalizing 0:27:15.893,0:27:19.319 this p-1 method using elliptic curves. 0:27:19.319,0:27:21.569 If you listen to the crypto advertisement 0:27:21.569,0:27:23.472 elliptic curves are usually something very good, 0:27:23.472,0:27:25.005 something you should have on your smartcard. 0:27:25.005,0:27:27.245 It's faster than RSA and so on. 0:27:27.245,0:27:29.385 That's the same type of elliptic curve 0:27:29.385,0:27:32.342 but here elliptic curves come in as an attack tool. 0:27:32.342,0:27:35.088 There is a method called the elliptic curve method ECM 0:27:35.088,0:27:41.255 which is a generalization of this p-1 method and does not need anything special, 0:27:41.255,0:27:44.708 I mean avoiding such primes is easy. 0:27:44.708,0:27:46.325 If you look into older standards 0:27:46.325,0:27:49.852 they all warn about "make sure to use strong primes" 0:27:49.852,0:27:53.174 "make sure to check that your p-1 is not smooth" 0:27:53.174,0:27:55.301 And of course you don't want your p-1 to be smooth 0:27:55.301,0:27:57.670 because otherwise this attack works. 0:27:57.670,0:28:00.760 But if I have some elliptic curve 0:28:00.760,0:28:03.492 a different type of prime is weak for that curve 0:28:03.492,0:28:06.351 and I have many, many, many elliptic curves. 0:28:06.351,0:28:08.617 For each of the curves some primes are weak. 0:28:08.617,0:28:11.161 You can't exclude this method. 0:28:11.161,0:28:13.573 So the only thing you can do is just, well, 0:28:13.573,0:28:18.035 choose your prime large enough and hope, well, 0:28:18.035,0:28:20.436 make sure p-1 doesn't get it and hope 0:28:20.436,0:28:22.601 that the next elliptic curves won't get it. 0:28:22.601,0:28:24.855 So elliptic curves are good for attacking. 0:28:24.855,0:28:27.624 It's still not the fastest method out there 0:28:27.624,0:28:33.185 but it's a good method for random factoring. 0:28:33.185,0:28:35.917 It's not the best method for finding RSA factors 0:28:35.917,0:28:38.366 but if you have a number which is most likely 0:28:38.366,0:28:44.532 not too non-smooth then it's a good method. 0:28:44.532,0:28:47.234 Now this is what you do for finding small factors. 0:28:47.234,0:28:50.509 To show some method for big factors, 0:28:50.509,0:28:53.994 now here that is a big number. 0:28:53.994,0:28:58.978 But this is what I would call factorization by inspection. 0:28:58.978,0:29:07.876 What's wrong with this number? 0:29:07.876,0:29:10.129 I mean this number is not small and 0:29:10.129,0:29:13.162 it won't have small factors, I promise you this. 0:29:13.162,0:29:19.835 So it's a decimal representation, so 9 is the digit before 0. 0:29:19.835,0:29:23.246 This looks very close to the power of 10. 0:29:23.246,0:29:28.253 Actually this is very close to 10^340. 0:29:28.253,0:29:30.292 If you take the square root of it 0:29:30.292,0:29:34.150 then you almost exactly hit 10^170. 0:29:34.150,0:29:37.113 This example was cooked up as taking two primes 0:29:37.113,0:29:46.646 namely 10^170-33 and 10^170+63 and multiplying them. 0:29:46.646,0:29:48.667 So if you see lots of 0's and lots of 9's 0:29:48.667,0:29:50.934 then you know that you are very close to a certain power. 0:29:50.934,0:29:53.464 Now in real life this wouldn't be a power of 10, 0:29:53.464,0:29:54.817 it would be a power of 2. 0:29:54.817,0:29:56.179 And there actually are some examples. 0:29:56.179,0:29:58.817 There's a famous story of a letter bomber in Austria 0:29:58.817,0:30:02.650 who wanted his message to be crackable 0:30:02.650,0:30:06.047 and he sent a message, well he was some right-wing asshole 0:30:06.047,0:30:08.332 that was letter-bombing people 0:30:08.332,0:30:10.813 and then he sent an encrypted message to the police. 0:30:10.813,0:30:14.642 And the police tried to factor it and tried to decipher it 0:30:14.642,0:30:18.568 hoping it would be the list of the next victims. 0:30:18.568,0:30:20.980 In the end it was not but 0:30:20.980,0:30:24.419 this person was actually using very close to a power of 2 0:30:24.419,0:30:26.211 in order to have people crack it. 0:30:26.211,0:30:27.769 In the end it was just some propaganda, 0:30:27.769,0:30:29.462 some right-wing stuff as well. 0:30:29.462,0:30:32.783 So there are people using this for breakability. 0:30:32.783,0:30:35.499 I'm not aware of people actually using this as an accident. 0:30:35.499,0:30:40.437 But what can happen to you is not using close to the power of 2 0:30:40.437,0:30:46.200 but saying "ok I learned I shouldn't be close to the power of 2 0:30:46.200,0:30:48.848 so I take some offset and take the next prime." 0:30:48.848,0:30:51.130 Next prime just means, add 1, check is it prime, 0:30:51.130,0:30:53.798 add next one, add 2, add 3 0:30:53.798,0:30:57.082 Just check primes linearly from then on. 0:30:57.082,0:30:59.261 Now if we jump to this where we finally find our prime p 0:30:59.261,0:31:00.966 it's a good prime. 0:31:00.966,0:31:03.869 It's not anywhere close to the power of 2 because my c was large enough. 0:31:03.869,0:31:07.130 Made it, good. 0:31:07.130,0:31:08.562 Now on q. 0:31:08.562,0:31:10.776 So again call the next_prime(). 0:31:10.776,0:31:14.445 So I do p+1, p+2 until I find a prime. 0:31:14.445,0:31:17.215 That means that if you take the product of the two 0:31:17.215,0:31:19.469 then this N is very close to a square. 0:31:19.469,0:31:22.920 So this N is cooked up with this method. 0:31:22.920,0:31:24.412 You don't see anything on the end, 0:31:24.412,0:31:26.678 there's nothing wrong there. 0:31:26.678,0:31:29.650 But when you compute the square root of it 0:31:29.650,0:31:33.156 it is almost an integer, it's .99999... 0:31:33.156,0:31:35.208 and then some dirt here. 0:31:35.208,0:31:39.288 So then you'll know that your primes were too small. 0:31:39.288,0:31:42.286 Sorry, not too small, too close to each other. 0:31:42.286,0:31:45.483 So if you ever take an RSA factor modulus 0:31:45.483,0:31:46.841 and you compute the square root 0:31:46.841,0:31:47.991 and you see something like this 0:31:47.991,0:31:51.885 then you know: that user did this method. 0:31:51.885,0:31:53.661 You could just say, I take the number, 0:31:53.661,0:31:56.549 subtract 1, subtract 2, just go off from the square root. 0:31:56.549,0:32:02.022 All you can check how far away is it from being a square. 0:32:02.022,0:32:03.701 So you take the seeding, 0:32:03.701,0:32:07.950 that means the closest integer counting upwards. 0:32:07.950,0:32:11.817 Just round it to this two, here's 57, is there a 56? 0:32:11.817,0:32:14.520 Compute the square of that and subtract N. 0:32:14.520,0:32:17.987 Now in this example that is 4096 0:32:17.987,0:32:22.595 which itself is a square. 0:32:22.595,0:32:26.636 So the difference of N and a^2 is a square. 0:32:26.636,0:32:30.915 Then I take this 64, take a-64, divide N, 0:32:30.915,0:32:34.413 it's an exact division, so I found one of the factors. 0:32:34.413,0:32:37.200 And then there's the other factor. 0:32:37.200,0:32:39.738 It doesn't matter whether it's a power of 2 or a power of 10. 0:32:39.738,0:32:42.569 What matters is that the numbers are too close. 0:32:42.569,0:32:44.712 Now this method actually works surprisingly well 0:32:44.712,0:32:47.436 even if we don't do next_prime(). 0:32:47.436,0:32:50.332 So here's another example where I didn't do the next_prime() 0:32:50.332,0:32:52.281 but did something larger 0:32:52.281,0:32:55.114 so this is not really close to a square. 0:32:55.114,0:32:57.731 Some 9's but not too many. 0:32:57.731,0:33:00.053 What we did before we wrote it as a difference of squares 0:33:00.053,0:33:04.614 and then divided N by one of these two factors. 0:33:04.614,0:33:10.050 Now in this case here I would not start as a^2-N 0:33:10.050,0:33:12.255 but I say, well, if a^2-N is not a square 0:33:12.255,0:33:15.997 I try the next one, I do (a+1)^2, (a+2)^2 0:33:15.997,0:33:20.777 each time subtract N and check when this is a square. 0:33:20.777,0:33:24.170 Now for this example I get lucky after 2. 0:33:24.170,0:33:26.331 And this actually took me a long time to cook up this example 0:33:26.331,0:33:28.613 because I was starting at something which was 0:33:28.613,0:33:31.669 half of the bits of p were changed. 0:33:31.669,0:33:35.872 So I was actually starting with a cube quite a distance away. 0:33:35.872,0:33:38.216 Still, most of those were just giving a square. 0:33:38.216,0:33:42.070 They were not giving me anything larger than zero here. 0:33:42.070,0:33:45.598 Now for general numbers if I wouldn't do like next_number() 0:33:45.598,0:33:48.786 but just, well, random prime, another random prime. 0:33:48.786,0:33:52.687 It still works but takes a long time 0:33:52.687,0:33:55.488 because I can always write it this way. 0:33:55.488,0:33:56.786 This is a, this is b. 0:33:56.786,0:34:00.603 But then usually it would take about square root of N 0:34:00.603,0:34:03.752 which is the same size as p until I get there. 0:34:03.752,0:34:07.352 This is a nice method if it looks like lots of 0:34:07.352,0:34:13.386 9999 and otherwise do something better as Dan will tell now. 0:34:13.386,0:34:13.955 Dan: Ok. 0:34:13.955,0:34:21.508 applause 0:34:21.508,0:34:23.662 Alright, so it's bad to have p and q 0:34:23.662,0:34:24.805 too close to each other 0:34:24.805,0:34:28.640 or have a small p or to have p and q non-random. 0:34:28.640,0:34:31.010 So let's do everything right. 0:34:31.010,0:34:34.236 Let's make just independently choose some big p 0:34:34.236,0:34:37.033 between 2^511 and 2^512 0:34:37.033,0:34:38.667 and choose some big q 0:34:38.667,0:34:42.586 without being like the next prime or anywhere in the ballpark of p. 0:34:42.586,0:34:47.859 You could maybe say q has to be between 2^509 and 2^510 0:34:47.859,0:34:49.522 and then it's definitely nowhere near p. 0:34:49.522,0:34:52.539 Now you got this public key and p times q 0:34:52.539,0:34:54.473 which is a 1024 bit RSA key. 0:34:54.473,0:34:58.745 If nothing has gone wrong with your randomness generation 0:34:58.745,0:35:00.153 then what does the attacker do? 0:35:00.153,0:35:04.472 Well, we don't know anything fast. 0:35:04.472,0:35:06.840 So this is gonna be the one part of the talk 0:35:06.840,0:35:08.495 where there's these big powers of 2 0:35:08.495,0:35:09.778 for how fast things are 0:35:09.778,0:35:11.139 because they're not really fast. 0:35:11.139,0:35:14.624 You have to think about how much something like 2^80 is. 0:35:14.624,0:35:15.938 There's all these different methods, 0:35:15.938,0:35:17.944 I'll just skip down to the last two. 0:35:17.944,0:35:19.999 There's the quadratic sieve (QS), 0:35:19.999,0:35:22.114 the number field sieve (NFS). 0:35:22.114,0:35:24.109 These have been around since, well, 0:35:24.109,0:35:29.192 QS since the 80th, NFS since the early 90th. 0:35:29.192,0:35:32.327 The NFS, the general consensus is 0:35:32.327,0:35:35.197 it takes something like 2^80 operations. 0:35:35.197,0:35:36.675 Now here's something fun to try. 0:35:36.675,0:35:38.844 If somebody says 2^80 operations 0:35:38.844,0:35:41.473 and there's some cryptographer talking about the security or something, 0:35:41.473,0:35:44.244 ask them: what do you mean by an operation? 0:35:44.244,0:35:45.996 How fast is this operation? 0:35:45.996,0:35:47.480 And then most of the people saying this 0:35:47.480,0:35:48.624 will go running screaming like: 0:35:48.624,0:35:49.850 I don't know what an operation is, 0:35:49.850,0:35:51.193 don't ask me that question. 0:35:51.193,0:35:53.382 But still people will confidently tell you 0:35:53.382,0:35:56.230 that the NFS takes 2^80 operations 0:35:56.230,0:36:00.347 to break any 1024 bit RSA key 0:36:00.347,0:36:02.237 even if the user hasn't done anything wrong, 0:36:02.237,0:36:04.047 the implementor hasn't done anything wrong. 0:36:04.047,0:36:05.631 And this number, this 2^80 0:36:05.631,0:36:07.618 if you pin down what those operations are, 0:36:07.618,0:36:11.100 this is something which is doable today 0:36:11.100,0:36:15.746 by people with a big botnet or people with a big computer cluster. 0:36:15.746,0:36:18.611 I'll give some details what I mean, the size is there. 0:36:18.611,0:36:21.797 As your computers get cheaper and cheaper, 0:36:21.797,0:36:23.720 somehow chips aren't going up in clockspeed 0:36:23.720,0:36:25.149 but they're still getting cheaper. 0:36:25.149,0:36:27.636 You can get a really cheap ARM which is doing 0:36:27.636,0:36:28.904 the same kind of computations 0:36:28.904,0:36:31.501 that a pretty big CPU was doing several years ago. 0:36:31.501,0:36:33.181 And chips are gonna continue getting cheaper 0:36:33.181,0:36:36.604 for a while. These attacks against RSA 1024 0:36:36.604,0:36:39.568 are going to be common doable for more and more people. 0:36:39.568,0:36:43.119 How do these methods work? 0:36:43.119,0:36:47.210 I'll give one example and then one little Sage script. 0:36:47.210,0:36:50.169 Let's try just a really small number. 0:36:50.169,0:36:51.717 This will be with the quadratic sieve 0:36:51.717,0:36:53.727 which is not as fancy as the number field sieve 0:36:53.727,0:36:56.395 but it will give you some idea of how these methods work. 0:36:56.395,0:36:59.978 The QS starts with that Fermat method you heard about. 0:36:59.978,0:37:04.066 Remember, the idea there was to try to find a square 0:37:04.066,0:37:08.574 as a^2-N, so a was like the square root of N 0:37:08.574,0:37:10.525 or maybe the plus 1, plus 2 and so on 0:37:10.525,0:37:13.333 and you keep searching for a+1 and so on, 0:37:13.333,0:37:15.933 search for numbers that if you square them and subtract N 0:37:15.933,0:37:17.056 then you get a square. 0:37:17.056,0:37:20.708 So let's try that for 50=2759 which is not very big. 0:37:20.708,0:37:23.975 Then you try, 53 was the ceiling of the square root, 0:37:23.975,0:37:27.673 square that, subtract 2759, you get 50. 0:37:27.673,0:37:31.171 Well 25 would be a square, but 50 is not a square 0:37:31.171,0:37:33.309 it's 2 times 25, 2 times 5^2. 0:37:33.309,0:37:37.208 And then you try another number, 54^2-2759 0:37:37.208,0:37:39.911 umm 157, that's too complicated 0:37:39.911,0:37:42.707 I don't remember that being a square so let's just skip it. 0:37:42.707,0:37:45.687 And then you keep going like this, 266, 377, 0:37:45.687,0:37:48.223 490. I remember 49 is a square 0:37:48.223,0:37:50.354 but that doesn't mean that 490 is a square 0:37:50.354,0:37:52.372 cause 10, 2 times 5, that's not a square. 0:37:52.372,0:37:55.871 And 605, you can try... 0:37:55.871,0:37:58.640 We remember how to take out the 5 0:37:58.640,0:38:01.275 and then 60... 605 divided by 5. 0:38:01.275,0:38:03.588 You can figure out it's 5 times 11^2. 0:38:03.588,0:38:05.158 It's almost a square. 0:38:05.158,0:38:06.774 You keep doing this for a while 0:38:06.774,0:38:10.534 and it seems like Fermat's method is actually working pretty badly. 0:38:10.534,0:38:13.754 It does work eventually but it's taking quite a while. 0:38:13.754,0:38:16.671 Here's what the quadratic sieve does. 0:38:16.671,0:38:19.527 From the same computation you look at these 0:38:19.527,0:38:21.585 numbers that were not exactly squares 0:38:21.585,0:38:23.634 50 and 490 and 605 0:38:23.634,0:38:27.056 and you notice if you multiple them together 0:38:27.056,0:38:28.472 50 was 2 times 5^2 0:38:28.472,0:38:30.756 490 is 2 times 5 times 7^2 0:38:30.756,0:38:33.068 605 is 5 times 11^2 0:38:33.068,0:38:34.585 and then you multiply those together 0:38:34.585,0:38:38.524 you get 2^2 and 5^4 and 7^2 and 11^2 0:38:38.524,0:38:41.123 and that's exactly the square of something. 0:38:41.123,0:38:44.592 The square of 2^1 times 5^2 times 7 times 11. 0:38:44.592,0:38:48.175 Then the quadratic sieve is taking the square root of that number 0:38:48.175,0:38:52.553 and subtracting that from the product of the fifty-somethings 0:38:52.553,0:38:53.953 that were on the left side 0:38:53.953,0:38:56.826 and taking the greatest common divisor of that with N 0:38:56.826,0:38:59.787 and amazingly that produces a factor of N. 0:38:59.787,0:39:02.289 And it's not just some random accident that 0:39:02.289,0:39:04.370 if you had tried the greatest common divisor of 0:39:04.370,0:39:06.085 "write down some hocuspocus then" 0:39:06.085,0:39:08.803 "you eventually get 31 coming out" 0:39:08.803,0:39:10.576 "every 31 times you write down a random number" 0:39:10.576,0:39:13.020 "it will have this 31 dividing it" 0:39:13.020,0:39:15.347 No, you can try this for bigger and bigger numbers 0:39:15.347,0:39:17.038 and actually this keeps working. 0:39:17.038,0:39:18.663 As soon as you find a square product 0:39:18.663,0:39:23.212 then half of those square products are gonna factor N. 0:39:23.212,0:39:25.365 So that's how the quadratic sieve works. 0:39:25.365,0:39:28.915 Here's a more systematic script with a much bigger number 0:39:28.915,0:39:31.300 which if it were just hocuspocus this wouldn't work. 0:39:31.300,0:39:32.616 But this does work. 0:39:32.616,0:39:36.834 So there's a number from my computer's random number generator 0:39:36.834,0:39:41.661 31415926... laughter 0:39:41.661,0:39:45.099 You try writing down a bunch of squares... 0:39:45.099,0:39:48.285 maybe I should get my computer's random number generator checked. 0:39:48.285,0:39:50.636 It is this two year old laptop you heard about before 0:39:50.636,0:39:52.202 so I'm not sure it's working properly. 0:39:52.202,0:39:55.599 You take a bunch of a^2-N and then 0:39:55.599,0:39:57.414 let's make a list of those called X. 0:39:57.414,0:40:00.752 You can see the range, there is up to 500.000 numbers. 0:40:00.752,0:40:02.948 It's a pretty big list we're talking about. 0:40:02.948,0:40:05.749 And then search through those elements, 0:40:05.749,0:40:08.538 search through those a^2-N, those differences, 0:40:08.538,0:40:09.504 the elements in this list, 0:40:09.504,0:40:12.985 to see which ones have easy factorizations. 0:40:12.985,0:40:15.445 Now a lot of numbers like that 157 0:40:15.445,0:40:16.970 I know what the factorization of that is 0:40:16.970,0:40:19.846 but there is function which you can find on our website 0:40:19.846,0:40:22.696 easyfactorizations() which looks through the list X 0:40:22.696,0:40:25.251 and if it finds numbers that are easy to factor 0:40:25.251,0:40:27.935 then it quickly writes down the factorizations 0:40:27.935,0:40:30.482 using the small prime techniques you heard about. 0:40:30.482,0:40:33.169 And then makes a list of those easy factorizations, 0:40:33.169,0:40:34.396 puts those into F. 0:40:34.396,0:40:37.600 And then there's a big chunk which I won't try to explain 0:40:37.600,0:40:39.986 which is called linear algebra mod 2. 0:40:39.986,0:40:42.586 And that somehow figures out a square 0:40:42.586,0:40:45.697 that comes from the factorizations, 0:40:45.697,0:40:48.363 somehow looks through this list of factorizations 0:40:48.363,0:40:50.803 and magically applies some linear algebra stuff. 0:40:50.803,0:40:52.612 Well ok, I won't go through that. 0:40:52.612,0:40:54.071 It's something doable 0:40:54.071,0:40:58.604 and this is using some standard Sage functions to do it pretty easily. 0:40:58.604,0:41:01.738 There's all sorts of things that 0:41:01.738,0:41:03.664 in the interest of time I won't get into 0:41:03.664,0:41:05.867 like how this easyfactorizations() works. 0:41:05.867,0:41:07.603 And this is something where people write papers 0:41:07.603,0:41:09.801 and papers and papers talking about trying to make this 0:41:09.801,0:41:11.603 go as quickly as possible. 0:41:11.603,0:41:14.737 I do want to emphasize one fact about these methods: 0:41:14.737,0:41:17.484 the bold-faced **very small memory requirements**. 0:41:17.484,0:41:20.038 You can use the elliptic curve method 0:41:20.038,0:41:21.531 that you heard about before 0:41:21.531,0:41:24.882 you can use that to figure out easy factorizations 0:41:24.882,0:41:27.063 using very little memory. 0:41:27.063,0:41:28.503 That means you can build a chip 0:41:28.503,0:41:30.471 like a FPGA, a special purpose ASIC, 0:41:30.471,0:41:33.285 you can have thousands of little ECM units 0:41:33.285,0:41:35.066 running in parallel, so you can really 0:41:35.066,0:41:38.153 massively parallelize this easyfactorizations() function. 0:41:38.153,0:41:40.187 So if people tell you "oh you need tons of 0:41:40.187,0:41:43.386 memory for sieving to find easy factorizations" 0:41:43.386,0:41:45.868 then you should tell them "no no, that's not true" 0:41:45.868,0:41:47.979 "ECM you can do with very little memory 0:41:47.979,0:41:50.351 running massively parallelizable" 0:41:50.351,0:41:52.802 And then there's other things which, 0:41:52.802,0:41:54.490 I suppose if we had another hour 0:41:54.490,0:41:56.432 then we could get into all these details 0:41:56.432,0:41:59.117 of like how the number field sieve goes beyond this. 0:41:59.117,0:42:01.970 It's a similar kind of method but gets more complicated. 0:42:01.970,0:42:05.181 I do want to emphasize again thing in bold-face here 0:42:05.181,0:42:06.946 which is that if somebody tells you 0:42:06.946,0:42:10.438 "oh 2^80 operations, the attacker wouldn't do 0:42:10.438,0:42:13.796 that because the target isn't worth that much to them" 0:42:13.796,0:42:18.284 Well, **Batch NFS** is a way to take a bunch of N's 0:42:18.284,0:42:20.566 and like the batch techniques before 0:42:20.566,0:42:23.516 there's some magic ways to find redundancies 0:42:23.516,0:42:25.884 between doing factorizations of those N's. 0:42:25.884,0:42:26.937 So somebody tells you: 0:42:26.937,0:42:30.000 "oh 2^80, that isn't worth it for the attacker." 0:42:30.000,0:42:33.296 Well, **Batch NFS** reduces the cost quite a bit 0:42:33.296,0:42:34.821 for factoring each key. 0:42:34.821,0:42:36.846 If the attacker has a lot of keys to target 0:42:36.846,0:42:39.852 they can factor all of them in not much more cost 0:42:39.852,0:42:41.396 than just factoring one. 0:42:41.396,0:42:43.734 So don't believe people who take an economic view 0:42:43.734,0:42:46.987 how much the number field sieve is worth. 0:42:46.987,0:42:49.416 Alright, what does this mean? 0:42:49.416,0:42:52.118 Can the attacker actually do the NFS 0:42:52.118,0:42:54.134 once they've put in all these optimizations? 0:42:54.134,0:42:59.064 Well if you look carefully at the analyses of the NFS 0:42:59.064,0:43:02.416 there are only about 2^70 differences. 0:43:02.416,0:43:04.149 Things like these a^2-N's. 0:43:04.149,0:43:08.862 If you look through 2^70 of those and scan them properly, 0:43:08.862,0:43:10.773 find easy factorizations, do linear algebra, 0:43:10.773,0:43:13.669 you will factor any 1024 bit key. 0:43:13.669,0:43:16.314 And 2^70, how big is that number? 0:43:16.314,0:43:20.820 It's actually small enough that you can do this computation on 0:43:20.820,0:43:22.469 your favorite botnet. 0:43:22.469,0:43:25.749 Say, Conficker is shrinking, still 0:43:25.749,0:43:27.947 and it will probably be wiped out at some point 0:43:27.947,0:43:29.488 but it's an example of what you can do 0:43:29.488,0:43:32.468 using any of these security problems that are deployed 0:43:32.468,0:43:33.632 on enough machines. 0:43:33.632,0:43:35.398 You can break into a bunch of machines, 0:43:35.398,0:43:36.521 millions of machines. 0:43:36.521,0:43:40.296 Conficker estimates vary between 7 million and 12 million machines. 0:43:40.296,0:43:44.320 So use your, say, 2^23, that's 8 million machines. 0:43:44.320,0:43:46.519 Count how many seconds there are in a year 0:43:46.519,0:43:48.553 and then, say, ok that means we have to do 0:43:48.553,0:43:52.070 2^22 differences per second machine. 0:43:52.070,0:43:56.902 And that is slower than state of the art factorization code already runs. 0:43:56.902,0:43:59.231 So it's actually a pretty easy computation. 0:43:59.231,0:44:02.334 You don't even need that many millions of computers to do this. 0:44:02.334,0:44:04.470 If people tell you "wait a minute" 0:44:04.470,0:44:06.120 "you can't just use a bunch of machines" 0:44:06.120,0:44:08.073 "there's all this work for linear algebra" 0:44:08.073,0:44:09.966 Then I'll just skip this. 0:44:09.966,0:44:12.882 Be aware that linear algebra is not as much of a problem 0:44:12.882,0:44:15.334 as some people would say. 0:44:15.334,0:44:18.872 If you use a big botnet there's one little problem. 0:44:18.872,0:44:21.053 For an attacker who breaks into a bunch of machines 0:44:21.053,0:44:23.903 and then starts spinning them up, heating up the CPU, 0:44:23.903,0:44:25.921 the fan starts running, the user starts wondering 0:44:25.921,0:44:28.017 "why is my machine running so slowly?" 0:44:28.017,0:44:30.133 Maybe only use half of the CPU 0:44:30.133,0:44:33.570 but still it has still got a good chance of getting you detected 0:44:33.570,0:44:35.202 and kicked off the machines. 0:44:35.202,0:44:37.266 Users are gonna shut you down if you're doing a 0:44:37.266,0:44:39.463 say, year of computation on your botnet. 0:44:39.463,0:44:42.883 So what do you do instead? 0:44:42.883,0:44:45.021 Well you build a little computer cluster. 0:44:45.021,0:44:49.354 Now yesterday we heard about a big building 0:44:49.354,0:44:51.164 in Bluffdale, Utah, that apparently 0:44:51.164,0:44:54.832 is going to store all of the data collected by 0:44:54.832,0:44:57.171 the NSA for the next hundred years 0:44:57.171,0:44:58.854 or maybe forever. 0:44:58.854,0:45:00.914 But something that I didn't hear emphasized yesterday 0:45:00.914,0:45:04.916 was that this building also has a new power substation 0:45:04.916,0:45:08.150 which is drawing 65 megawatts, 0:45:08.150,0:45:10.155 well generating 65 megawatts. 0:45:10.155,0:45:13.999 Now what are we doing with 65 megawatts? 0:45:13.999,0:45:15.308 Is that the kind of power that you need to 0:45:15.308,0:45:17.490 store a bunch of data on tapes? 0:45:17.490,0:45:21.175 No, that's the kind of power you need to do computations. 0:45:21.175,0:45:24.725 So what is NSA doing with 2^26 watts? 0:45:24.725,0:45:26.955 Or maybe even with more watts. 0:45:26.955,0:45:30.909 You shouldn't actually think that 2^26 is such a big number. 0:45:30.909,0:45:33.526 Here's a little table with, well, 0:45:33.526,0:45:36.573 2^57 admittedly is pushing it a little bit. 0:45:36.573,0:45:40.620 This is the total power that the earth gets from the sun 0:45:40.620,0:45:43.207 of which about half gets to the earth's surface. 0:45:43.207,0:45:47.610 2^44 is the amount the human is using right now. 0:45:47.610,0:45:50.888 Varies a little bit but I mean this is the average power 0:45:50.888,0:45:53.427 including cars and such. 0:45:53.427,0:45:56.446 If you have the botnet that breaks into millions of machines 0:45:56.446,0:45:59.944 and runs in flat-out that's about 2^30 watts. 0:45:59.944,0:46:01.610 And actually if you think about it, wait a minute, 0:46:01.610,0:46:02.575 there's a lot of machines out there. 0:46:02.575,0:46:07.270 This means the 2^26 is not actually that much power. 0:46:07.270,0:46:11.456 It's just one dinky little Bluffdale 65 MW computer center 0:46:11.456,0:46:14.594 and actually if you're a government agency 0:46:14.594,0:46:16.138 you probably have more than one of those. 0:46:16.138,0:46:19.273 As you heard yesterday in the context of data storage 0:46:19.273,0:46:21.292 but again, it's not just data storage. 0:46:21.292,0:46:26.492 You don't make a 65 MW power station to just store data. 0:46:26.492,0:46:30.207 Ok, if you just take standard graphics processing units 0:46:30.207,0:46:33.223 and run 2^26 watts to those that will do about 0:46:33.223,0:46:36.446 2^84 floating point multiplications per year. 0:46:36.446,0:46:39.827 You have to figure out, ok, what are exactly the operations involved here. 0:46:39.827,0:46:44.829 This should be enough to break 1024 bit RSA 0:46:44.829,0:46:48.088 and if NSA is not just buying GPU's off-the-shelves 0:46:48.088,0:46:50.226 but really tuning chips that they build 0:46:50.226,0:46:57.409 using IBM to manufacture their chips at the moment. 0:46:57.409,0:47:01.626 Apparently it wasn't cost-effective for NSA to manufacture their own chips 0:47:01.626,0:47:04.475 so in 2005 they started subcontracting for IBM 0:47:04.475,0:47:08.153 but presumably through that fabrication IBM is building chips 0:47:08.153,0:47:11.073 that NSA wants and those should be about 10 times faster 0:47:11.073,0:47:13.025 than what GPU's can do. 0:47:13.025,0:47:18.362 So it should be possible with this little 65 MW computer cluster 0:47:18.362,0:47:21.476 to factor at least 1, maybe even 10, 0:47:21.476,0:47:23.478 and then with batch NFS maybe even more 0:47:23.478,0:47:30.010 1024 bit RSA keys in a year. 0:47:30.010,0:47:38.229 applause 0:47:38.229,0:47:41.130 Nadia: So to conclude things up here I want to 0:47:41.130,0:47:43.290 explain how you can use 0:47:43.290,0:47:46.057 from the comfort of your very own home 0:47:46.057,0:47:52.242 a distributed, a massive scale distributed cloud computing service 0:47:52.242,0:47:57.142 to calculate private keys on your own. 0:47:57.142,0:48:00.313 So many of you may be familiar with Amazon EC2 0:48:00.313,0:48:04.635 but it turns out Google also has a large amount of computing infrastructure 0:48:04.635,0:48:07.339 and they actually have a very convenient web interface 0:48:07.339,0:48:08.446 to access it. 0:48:08.446,0:48:12.978 laugther 0:48:12.978,0:48:21.677 applause 0:48:21.677,0:48:23.961 It's amazing what you can find on the Internet. 0:48:23.961,0:48:26.143 laughter 0:48:26.143,0:48:29.393 So ok, here's some keys except, you know, 0:48:29.393,0:48:30.578 if you look at these carefully 0:48:30.578,0:48:33.277 not all of these are sort of obviously RSA keys. 0:48:33.277,0:48:35.204 There are some problems here 0:48:35.204,0:48:40.704 like the second to last one on the bottom. 0:48:40.704,0:48:41.688 So... 0:48:41.688,0:48:43.338 laughter 0:48:43.338,0:48:49.820 After doing this search we found this key and 0:48:49.820,0:48:55.279 someone seems to have pasted a private key into pastebin 0:48:55.279,0:49:00.491 and someone came along and interrupted part of it 0:49:00.491,0:49:04.759 with some profanity. 0:49:04.759,0:49:07.387 This is an interesting problem. 0:49:07.387,0:49:08.812 What do we do with this key? 0:49:08.812,0:49:12.310 Clearly this is an interesting key, we must have this key. 0:49:12.310,0:49:14.204 laughter 0:49:14.204,0:49:24.847 applause 0:49:24.847,0:49:26.046 Step 1... yeah? 0:49:26.046,0:49:29.572 Male: Did you see the easter egg in row 1? 0:49:29.572,0:49:31.204 Nadia: Well, we'll discuss this. 0:49:31.204,0:49:34.308 Angel: Questions will be later, after the talk please. 0:49:34.308,0:49:39.368 Nadia: So yeah, we've removed the obvious problems here. 0:49:39.368,0:49:41.636 But there's still an issue which is that 0:49:41.636,0:49:43.844 we're missing hundreds and hundreds of bits of key. 0:49:43.844,0:49:45.940 We can't brute force-this. 0:49:45.940,0:49:47.658 What are we gonna do? 0:49:47.658,0:49:51.327 Well, let's look at what RSA keys actually are. 0:49:51.327,0:49:53.509 I introduced RSA keys and I was, like, 0:49:53.509,0:49:56.786 it's a d and d has like a thosand of bits. 0:49:56.786,0:50:00.671 And if we don't know p and q 0:50:00.671,0:50:02.608 how are we gonna get this d? 0:50:02.608,0:50:07.923 Here is the storage format for an RSA key. 0:50:07.923,0:50:11.258 Public key is the modulus N, the public exponent e. 0:50:11.258,0:50:14.710 Ok, private key has version, N, e, 0:50:14.710,0:50:16.536 d the decryption exponent, 0:50:16.536,0:50:20.273 p, q, d mod (p-1), d mod (q-1), 0:50:20.273,0:50:22.975 the inverse of q mod p, 0:50:22.975,0:50:25.305 this is all useful information if you want to 0:50:25.305,0:50:26.960 speed up your calculation and not have to 0:50:26.960,0:50:29.405 compute everything every time you use your key. 0:50:29.405,0:50:32.493 Okay. 0:50:32.493,0:50:35.952 What did we see here then? 0:50:35.952,0:50:40.975 Here is the key broken up into all of the difference pieces. 0:50:40.975,0:50:44.804 So we have: the red bit is approximately where N is sitting. 0:50:44.804,0:50:46.754 We've got e in orange. 0:50:46.754,0:50:50.569 We've got part of d in yellow. 0:50:50.569,0:50:52.372 We seem to be missing p 0:50:52.372,0:50:54.656 but we've got part of q 0:50:54.656,0:50:56.791 and then here's d mod p-1 0:50:56.791,0:51:02.408 and d mod q-1 and q inverse mod p. 0:51:02.408,0:51:07.712 There's also a little problem before the red here 0:51:07.712,0:51:08.436 laughter 0:51:08.436,0:51:10.504 in addition. 0:51:10.504,0:51:13.561 You might call this the private part of the private key. 0:51:13.561,0:51:15.936 laughter 0:51:15.936,0:51:21.391 applause 0:51:21.391,0:51:23.859 Fortunately this is just sitting in an encoding of the length 0:51:23.859,0:51:30.275 and the version. laughter 0:51:30.275,0:51:33.151 Alright, so what do we do with this information? 0:51:33.151,0:51:36.673 Basically given any part of this private key 0:51:36.673,0:51:38.910 we can easily compute all of the other parts. 0:51:38.910,0:51:40.257 Simple formulas: 0:51:40.257,0:51:48.938 q will be 2^(e times d mod p-1) - 1 and GCD that with N 0:51:48.938,0:51:51.872 then we get q and then 0:51:51.872,0:51:54.455 we can figure out what p was by dividing 0:51:54.455,0:51:56.393 and figure out what d is 0:51:56.393,0:51:57.953 and so on and so forth. 0:51:57.953,0:52:02.460 You can do this even if you have a part of a single value 0:52:02.460,0:52:04.109 from the private key. 0:52:04.109,0:52:08.195 You can still compute the rest of the private key from that. 0:52:08.195,0:52:12.338 We don't have time to get into this but it's online. 0:52:12.338,0:52:16.557 You've seen a little bit of math. 0:52:16.557,0:52:18.060 Single formulas, typed into Sage. 0:52:18.060,0:52:22.591 We can reconstruct the private key here. 0:52:22.591,0:52:29.487 applause 0:52:29.487,0:52:32.941 So, what are the lessons that we can learn 0:52:32.941,0:52:35.554 from everything that we've done today. 0:52:35.554,0:52:38.945 The first one is: stop using 1024 bit RSA 0:52:38.945,0:52:41.010 if you haven't already. 0:52:41.010,0:52:43.386 I have looked at the keys that are out there on the Internet 0:52:43.386,0:52:45.941 and you are still using 1024 bit RSA. 0:52:45.941,0:52:48.463 laughter 0:52:48.463,0:52:51.790 Second: make sure your primes are big enough. 0:52:51.790,0:52:54.227 Don't try to be clever about how you're generating them. 0:52:54.227,0:52:58.025 Make sure your primes are random. 0:52:58.025,0:52:59.303 You guys have some problems 0:52:59.303,0:53:01.394 especially in hardware. 0:53:01.394,0:53:03.621 Also... 0:53:03.621,0:53:04.921 laughter 0:53:04.921,0:53:09.091 "FUCK A DUCK" is not good crypto. 0:53:09.091,0:53:11.545 Pastebin is not a secure cloud store. 0:53:11.545,0:53:13.874 laughter 0:53:13.874,0:53:20.289 applause 0:53:20.289,0:53:22.657 And you probably shouldn't put your private key 0:53:22.657,0:53:24.778 in a secure cloud store anyway. 0:53:24.778,0:53:25.862 laughter 0:53:25.862,0:53:31.396 applause 0:53:31.396,0:53:32.577 And lastly... 0:53:32.577,0:53:34.211 laughter 0:53:34.211,0:53:40.427 applause 0:53:40.427,0:53:41.625 Thank you. 0:53:41.625,0:53:43.894 applause 0:53:43.894,0:53:44.804 Angel: Give them an applause. 0:53:44.804,0:54:05.937 applause 0:54:05.937,0:54:11.379 Angel: So questions will be taken at the numbered microphones. 0:54:11.379,0:54:14.962 So anyone who has a question please line up at a microphone 0:54:14.962,0:54:16.959 and we will start by the signal angel 0:54:16.959,0:54:19.528 who has a question from IRC. 0:54:19.528,0:54:23.628 *Signal Angel*: IRC has a lot of punch lines involving penisses 0:54:23.628,0:54:26.040 and some questions. 0:54:26.040,0:54:31.445 First question is: would you recommend more using 0:54:31.445,0:54:37.640 elliptic curves or RSA with multiple primes... 0:54:37.640,0:54:41.659 using proper primes. 0:54:41.659,0:54:47.192 Dan: There are ways to do elliptic curves wrong, 0:54:47.192,0:54:48.854 there are ways to do RSA wrong. 0:54:48.854,0:54:53.806 In general if you've got a particular performance requirement 0:54:53.806,0:54:55.625 then elliptic curves are gonna meet that 0:54:55.625,0:54:57.861 with a much higher security level 0:54:57.861,0:55:00.208 than RSA is gonna meet that. 0:55:00.208,0:55:03.005 Of course you can do RSA securely, 0:55:03.005,0:55:04.928 you can do elliptic curves securely 0:55:04.928,0:55:07.153 but if you've got some performance limitation 0:55:07.153,0:55:08.613 as many applications do 0:55:08.613,0:55:13.043 then elliptic curves tend to be a better choice. 0:55:13.043,0:55:14.429 Nadia: I also want to clarify 0:55:14.429,0:55:17.826 the other most commonly used public-key cryptosystem 0:55:17.826,0:55:19.979 is DSA, the Digital Signature Algorithm. 0:55:19.979,0:55:21.426 There's also elliptic curve DSA 0:55:21.426,0:55:23.291 which is probably what you're thinking about 0:55:23.291,0:55:25.570 with elliptic curve cryptography. 0:55:25.570,0:55:31.207 DSA is in my opinion actually much worse 0:55:31.207,0:55:34.395 than RSA in terms of randomness failures. 0:55:34.395,0:55:36.246 In the paper that Dan referenced 0:55:36.246,0:55:39.377 we also talk about randomness failures in DSA 0:55:39.377,0:55:40.993 and it's horrifying. 0:55:40.993,0:55:44.521 If you ever ever screw up randomness in a single signature 0:55:44.521,0:55:48.354 your entire public key is lost. 0:55:48.354,0:55:50.761 Angel: We will take the next question from microphone nr 4 0:55:50.761,0:55:53.072 to the right of the seat. 0:55:53.072,0:55:53.824 Jake: Hey guys. 0:55:53.824,0:55:56.277 Sorry I didn't mention the computing power of Bluffdale. 0:55:56.277,0:56:02.072 That said, when we want to transition from 1024 bit RSA 0:56:02.072,0:56:05.339 to something else, what do you think is a good idea? 0:56:05.339,0:56:10.028 So for example, in Tor we do use 1024 bit RSA for TLS. 0:56:10.028,0:56:12.528 laughter Yeah I know, right? 0:56:12.528,0:56:15.592 So the thing is we're working on changing these things 0:56:15.592,0:56:17.079 but what is... 0:56:17.079,0:56:19.766 I mean Zooko has this 100 year crypto project. 0:56:19.766,0:56:21.806 What is the real thing that we should, 0:56:21.806,0:56:23.910 like if we could switch tomorrow to something, 0:56:23.910,0:56:26.595 what's the useful that we can live with for 5 or 10 years? 0:56:26.595,0:56:28.795 Is 2048 going to be useful? 0:56:28.795,0:56:31.179 Is 4096 where we should go tomorrow? 0:56:31.179,0:56:32.780 Cause there is a trade-off between 0:56:32.780,0:56:34.588 performance and forward secrecy 0:56:34.588,0:56:37.962 and there are a lot of things to think about. 0:56:37.962,0:56:40.078 Tanja: If you tell me 5 to 10 years 0:56:40.078,0:56:42.360 I would be still comfortable with elliptic curves. 0:56:42.360,0:56:44.773 If you talk 100 years 0:56:44.773,0:56:47.691 and then there's all these worse quantum computers around 0:56:47.691,0:56:51.143 then I would go for something which is worse in performance 0:56:51.143,0:56:52.829 like code-based cryptography 0:56:52.829,0:56:54.043 or for signatures hashing 0:56:54.043,0:56:55.738 but if you'd say in 5 to 10 years 0:56:55.738,0:56:57.411 I'm very comfortable recommending to you 0:56:57.411,0:56:59.941 elliptic curves with 256 bits. 0:56:59.941,0:57:02.160 Jake: Ok, thanks. 0:57:02.160,0:57:04.878 Angel: Signal angel? 0:57:04.878,0:57:07.190 *Signal Angel*: Yeah, another question from IRC. 0:57:07.190,0:57:11.028 If you avoid all the easy primes 0:57:11.028,0:57:14.227 does this shrink the search space such that 0:57:14.227,0:57:17.862 it becomes easier to crack the remaining keys? 0:57:17.862,0:57:20.161 Or asked another way: 0:57:20.161,0:57:25.146 can you quantify the ratio of easy primes versus hard primes? 0:57:25.146,0:57:26.287 Dan: Yeah, that's a good question. 0:57:26.287,0:57:28.573 The answer is: yes, it can be quantified 0:57:28.573,0:57:32.829 and once you're up to about to the 1024 bit RSA key level 0:57:32.829,0:57:35.306 then there's very very few weak primes. 0:57:35.306,0:57:37.078 So if you have a random number, 0:57:37.078,0:57:38.593 you just generate a random prime 0:57:38.593,0:57:40.238 then your chance of bumping into something weak 0:57:40.238,0:57:42.793 is so small that you just don't have to worry about it. 0:57:42.793,0:57:44.862 It is one of those much less frequent than 0:57:44.862,0:57:46.580 being hit by a lightning kind of events. 0:57:46.580,0:57:49.210 It is something that have been quantified 0:57:49.210,0:57:52.244 and it is an issue for smaller RSA keys 0:57:52.244,0:57:54.872 back when people were using, say, 512 bit RSA 0:57:54.872,0:57:56.754 then it was actually a noticable issue. 0:57:56.754,0:57:59.123 But once you're at 1024 or above then 0:57:59.123,0:58:00.694 the issue is basically gone. 0:58:00.694,0:58:03.831 It's something where you can and you should 0:58:03.831,0:58:06.097 look at exactly what the chance is of bumping into a weak 0:58:06.097,0:58:08.828 but it's not reallistically something 0:58:08.828,0:58:13.259 you're gonna encounter with 1024 bit RSA. 0:58:13.259,0:58:16.542 Angel: We're gonna take the next question from nr 1. 0:58:16.542,0:58:19.313 Just one notice: we don't have a lot of time left 0:58:19.313,0:58:23.546 so even though there is a talk in one hour 0:58:23.546,0:58:25.078 just ask. 0:58:25.078,0:58:28.165 Male: There is a devastating attack that can break AES, 0:58:28.165,0:58:31.872 SHA2, most probably SHA3 and DES. 0:58:31.872,0:58:33.758 It's called a biclique attack. 0:58:33.758,0:58:35.664 Are you concerned that it might break RSA 0:58:35.664,0:58:37.281 with any key size and even ECC 0:58:37.281,0:58:39.988 and even any public-key crypto? 0:58:39.988,0:58:41.660 Dan: No. 0:58:41.660,0:58:43.594 laughter 0:58:43.594,0:58:46.675 Bicliques are something which are certainly interesting 0:58:46.675,0:58:49.590 and I think about it as a way to speed up brute force 0:58:49.590,0:58:52.508 and it speeds up brute force by a noticable factor, 0:58:52.508,0:58:55.148 often makes attacks, say, twice as fast. 0:58:55.148,0:58:58.523 But with public-key crypto we have attacks which are 0:58:58.523,0:59:00.760 much much faster than brute force to begin with 0:59:00.760,0:59:04.209 so that kind of speed-up of brute force won't be competitive 0:59:04.209,0:59:08.226 with things like the number field sieve. 0:59:08.226,0:59:12.964 Angel: Next is from number 3. 0:59:12.964,0:59:16.824 Male: Me not coming from the dark side of mathematics, 0:59:16.824,0:59:20.923 how do I know that my random number generators are fine 0:59:20.923,0:59:23.661 for generating keys? 0:59:23.661,0:59:26.223 laughter 0:59:26.223,0:59:31.354 Nadia: Seed them. 0:59:31.354,0:59:37.042 Basically the things that are out there if you're using 0:59:37.042,0:59:40.613 a standard random number generator like in Linux, 0:59:40.613,0:59:44.441 actually Linux has added patches to the kernel 0:59:44.441,0:59:48.022 to fix some of the problems that we've found. 0:59:48.022,0:59:50.431 If you want to know that you're keys are good: 0:59:50.431,0:59:54.694 if you generate them on a general purpose computer 0:59:54.694,0:59:56.600 after it has been running for a long time 0:59:56.600,0:59:59.559 you're probably fine. 0:59:59.559,1:00:04.793 I would not generate very important or long-term keys 1:00:04.793,1:00:09.707 on low-power hardware devices where you can't tell... 1:00:09.707,1:00:11.910 laughter 1:00:11.910,1:00:15.758 Male: Thank you. 1:00:15.758,1:00:18.341 Angel: Next question will be taken from number 4. 1:00:18.341,1:00:19.394 Male: Hi. 1:00:19.394,1:00:23.690 It's just a remark, not really a question. 1:00:23.690,1:00:25.407 I work in high performance computing. 1:00:25.407,1:00:26.910 I was at the supercomputing conference 1:00:26.910,1:00:29.376 a couple of weeks ago. 1:00:29.376,1:00:31.931 They're presenting large-scale installations 1:00:31.931,1:00:35.863 and dealing with problems of power. 1:00:35.863,1:00:38.927 They want to build petaflops and exaflops systems 1:00:38.927,1:00:42.262 that are in a range of 20 MW. 1:00:42.262,1:00:47.316 So I'm wondering what NSA is doing with 53 megawatts 1:00:47.316,1:00:53.815 in a data center because no storage takes that much capacity. 1:00:53.815,1:00:58.976 I think it's really something we should think about. 1:00:58.976,1:01:03.266 So thanks, nice talk. 1:01:03.266,1:01:04.883 Angel: Next question from... 1:01:04.883,1:01:06.961 does the signal angel have one? 1:01:06.961,1:01:08.679 *Signal angel*: Ok, another question is: 1:01:08.679,1:01:11.845 How big do you estimate the effort to 1:01:11.845,1:01:17.308 exchange keys and certificates involving 1024 bit primes 1:01:17.308,1:01:22.782 let's say worldwide at the moment? 1:01:22.782,1:01:24.502 Dan: I wasn't getting what the question was. 1:01:24.502,1:01:25.642 Could you repeat the question? 1:01:25.642,1:01:27.344 *Signal angel*: I don't really understand it myself. 1:01:27.344,1:01:28.875 laughter 1:01:28.875,1:01:33.181 Angel: Ok, let's switch to 1. 1:01:33.181,1:01:37.009 Male: My question goes back to the idea 1:01:37.009,1:01:42.876 of how can we know how good the private keys are? 1:01:42.876,1:01:50.227 Is there something like a keygen evaluation tool? 1:01:50.227,1:01:56.567 Or can the quality of a key generator be estimated 1:01:56.567,1:01:58.544 from a sample of keys? 1:01:58.544,1:02:01.827 Like a public tool that you can throw keys on 1:02:01.827,1:02:05.565 and it will tell me a bias of my keygen? 1:02:05.565,1:02:09.091 Nadia: If you go to factorable.net 1:02:09.091,1:02:12.574 my co-authors and I have made a key check tool 1:02:12.574,1:02:14.728 which will tell you if your key is in the 1:02:14.728,1:02:17.528 list of keys that we scraped off the Internet 1:02:17.528,1:02:19.296 that we were able to compromise. 1:02:19.296,1:02:21.249 That's a first check. 1:02:21.249,1:02:24.449 It's not a positive verification that your key is good 1:02:24.449,1:02:26.544 but it will tell you if it is very bad. 1:02:26.544,1:02:28.863 laughter 1:02:28.863,1:02:36.811 If you want to check your generator, 1:02:36.811,1:02:40.827 if you're worried about the factorization problems 1:02:40.827,1:02:43.242 we have fast code on the website in C 1:02:43.242,1:02:45.715 that will do the GCD calculation for you 1:02:45.715,1:02:50.201 if you just dump a gigantic collection of keys at it. 1:02:50.201,1:02:54.375 If you might have other problems like you are 1:02:54.375,1:02:56.993 not sure whether it's factorable like those don't come up 1:02:56.993,1:03:00.497 the experiment that you might want to do is 1:03:00.497,1:03:04.211 sort of restart the process in realistic conditions 1:03:04.211,1:03:06.179 and generate a gigantic quantity of keys 1:03:06.179,1:03:08.981 and just check for repeats. 1:03:08.981,1:03:12.378 The sort of obvious stress testings. 1:03:12.378,1:03:16.375 If you have a device you want to boot it up in realistic conditions 1:03:16.375,1:03:18.048 and then check them 1:03:18.048,1:03:19.798 and this is painstaking work 1:03:19.798,1:03:22.327 but I don't think there's any realistic way 1:03:22.327,1:03:25.546 of doing better than that. 1:03:25.546,1:03:28.396 Or inspect the code, the obvious thing. 1:03:28.396,1:03:30.924 Make sure you're seeding anything. 1:03:30.924,1:03:33.463 Male: Ok, thank you. 1:03:33.463,1:03:35.993 Angel: Next question will be from mic number 4. 1:03:35.993,1:03:37.045 Male: Hi. 1:03:37.045,1:03:40.745 If I got your numbers correctly so you said something like 1:03:40.745,1:03:45.450 it would take 8 million machines a year to factor 1024 1:03:45.450,1:03:48.976 so I was wondering what if I would like to factor like 1:03:48.976,1:03:52.243 800 bits or 900 bits which is, 1:03:52.243,1:03:55.149 as I understand, way easier than 1024 1:03:55.149,1:03:57.042 but was never done publicly. 1:03:57.042,1:04:00.259 So are you saying that would take like a day or something? 1:04:00.259,1:04:02.581 Dan: It depends on the size of cluster you have. 1:04:02.581,1:04:07.064 The RSA 768 factorization a couple of years ago 1:04:07.064,1:04:11.423 used something on the scale of a 1500 computers 1:04:11.423,1:04:14.130 for a year. 1:04:14.130,1:04:15.729 And those were normal kind of computers, 1:04:15.729,1:04:19.475 Desktop kind of computers with, I think, typically 2 cores. 1:04:19.475,1:04:24.240 Nowadays, if you have some faster, say, 3 GHz 4-core machines, 1:04:24.240,1:04:27.446 I think those were typically 2 GHZ 2-core, nowadays... 1:04:27.446,1:04:30.582 Tanja's mentioning you of course want to use GPUs 1:04:30.582,1:04:31.725 to speed things up. 1:04:31.725,1:04:33.397 And there's many ways to take advantage of 1:04:33.397,1:04:35.393 computer technology moving forward. 1:04:35.393,1:04:39.998 To get careful estimates: for going from 768 to 800, 1:04:39.998,1:04:42.192 that's a short enough extrapolation 1:04:42.192,1:04:44.113 that people are pretty confident about that. 1:04:44.113,1:04:48.014 Getting to 900 starts getting iffy what exactly the cost would be 1:04:48.014,1:04:50.951 and 1024 there's a fair amount of guess works. 1:04:50.951,1:04:53.796 There is some consensus on rougly what the costs are 1:04:53.796,1:04:55.429 but it's hard to say exactly. 1:04:55.429,1:04:58.059 But again, to give you a scale of it: 1:04:58.059,1:04:59.675 you were saying like 900 or 800. 1:04:59.675,1:05:03.213 Well, 768 RSA challenge was done 1:05:03.213,1:05:07.228 and that one was 1500 machine years. 1:05:07.228,1:05:08.501 Male: Ok, thank you. 1:05:08.501,1:05:10.260 Angel: We'll take four more questions. 1:05:10.260,1:05:12.357 Signal angel first. 1:05:12.357,1:05:16.676 *Signal angel*: It's a clarification of the question I asked before. 1:05:16.676,1:05:22.278 The actual question is: how big will be the effort 1:05:22.278,1:05:32.129 to upgrade existing keys from 1024 bits to 4K or 8K bits? 1:05:32.129,1:05:35.984 Considering that the keys are currently in use 1:05:35.984,1:05:37.910 how much effort worldwide would it be to 1:05:37.910,1:05:41.608 upgrade all that to something more secure? 1:05:41.608,1:05:43.559 Tanja: I mean, for the private user if you're running 1:05:43.559,1:05:46.447 SSH on the laptop you can just generate a new key. 1:05:46.447,1:05:47.876 Doesn't take you much time. 1:05:47.876,1:05:50.345 You will notice maybe some degradation of performance 1:05:50.345,1:05:52.193 if you're running it on a netbook 1:05:52.193,1:05:54.755 but going to 2048 is not a big deal. 1:05:54.755,1:05:57.710 However, there is a recommendation of the 1:05:57.710,1:06:01.932 US government to stop using 1024 bit RSA as of 2010 1:06:01.932,1:06:05.211 and we still see it everywhere. 1:06:05.211,1:06:07.563 Apparently it is bigger effort than just 1:06:07.563,1:06:10.743 everybody spending 10 minutes on the laptop. 1:06:10.743,1:06:13.635 Dan: Maybe part of the problem is things like 1:06:13.635,1:06:16.997 everybody says "yeah, use 2048 or bigger" 1:06:16.997,1:06:18.884 and there's some financial standard 1:06:18.884,1:06:21.157 which says you can use any key size you want 1:06:21.157,1:06:23.943 up to 1984 bits. 1:06:23.943,1:06:25.631 I have no idea how they came up with this 1:06:25.631,1:06:27.208 but then they run into some other standard 1:06:27.208,1:06:29.284 which says use 2048 and they just can't 1:06:29.284,1:06:33.345 implement it on the smartcards which only support up to 1984. 1:06:33.345,1:06:36.467 Now if you get the standards people talking to each other for several years 1:06:36.467,1:06:39.258 then they agree on using 1984. 1:06:39.258,1:06:41.796 It's just about as good as 2048 1:06:41.796,1:06:44.746 but realistically when you have these conflicts 1:06:44.746,1:06:47.128 in standards then it takes a while to work out. 1:06:47.128,1:06:48.742 And when you have a busy server 1:06:48.742,1:06:51.233 that's trying to do 2048 bit RSA 1:06:51.233,1:06:53.126 it doesn't matter what the standard says 1:06:53.126,1:06:55.449 if you got a ton of load you just can't handle that load 1:06:55.449,1:06:58.341 if you're spending a ton of time on cryptography. 1:06:58.341,1:07:00.460 And that again is where we like elliptic curves 1:07:00.460,1:07:02.560 because they give you for whatever your 1:07:02.560,1:07:05.341 performance constraints are much better security. 1:07:05.341,1:07:07.309 So if you're trying a given security level 1:07:07.309,1:07:09.831 the speed is much better. 1:07:09.831,1:07:12.807 Angel: Next question will be from number 2. 1:07:12.807,1:07:14.642 Female: So you've had two developers stand up 1:07:14.642,1:07:17.459 and ask how to verify whether or not their 1:07:17.459,1:07:19.726 key generation is correct and whether or not their 1:07:19.726,1:07:21.490 random numbers are correct. 1:07:21.490,1:07:23.887 And I think it's great that you give coding recommendations 1:07:23.887,1:07:27.441 like seed rand() but coming from the development 1:07:27.441,1:07:29.539 perspective I like to unit test my code 1:07:29.539,1:07:31.789 and specifically I like to write my unit tests 1:07:31.789,1:07:33.043 before I write my code, 1:07:33.043,1:07:34.709 it's called test-driven development. 1:07:34.709,1:07:38.175 So if I'm going about creating an algorithm to 1:07:38.175,1:07:39.994 encrypt something or whatever 1:07:39.994,1:07:42.771 what are the steps that I need to do 1:07:42.771,1:07:46.653 when I'm writing my unit test? 1:07:46.653,1:07:51.939 Has there been some kind of effort in that way 1:07:51.939,1:07:54.911 like what goes into unit test for determining that 1:07:54.911,1:07:56.606 your random number is correct? 1:07:56.606,1:07:59.091 I mean there's algorithms for determining 1:07:59.091,1:08:00.562 whether your random number generator is 1:08:00.562,1:08:02.773 operating correctly, right? 1:08:02.773,1:08:05.160 Dan: This is something where there is, 1:08:05.160,1:08:07.775 if you look at how the random number generators work, 1:08:07.775,1:08:10.209 there was just a NIST workshop 1:08:10.209,1:08:11.975 and there's some NIST standards 1:08:11.975,1:08:12.742 which are telling you 1:08:12.742,1:08:15.709 here's what to do to test your random number generators. 1:08:15.709,1:08:17.796 So you have a hardware random number generator, 1:08:17.796,1:08:19.589 some post-processing, and they say 1:08:19.589,1:08:22.406 "ok here's the standard for how to test those pieces" 1:08:22.406,1:08:24.704 Now that's not the same as testing the cryptography 1:08:24.704,1:08:26.335 that's using those pieces 1:08:26.335,1:08:28.994 and it's very helpful if there's a clear separation there. 1:08:28.994,1:08:30.463 So you have your cryptography 1:08:30.463,1:08:32.746 which is doing deterministic stuff 1:08:32.746,1:08:34.452 and says you have to have some randomness coming 1:08:34.452,1:08:37.806 from a totally separate unit where the only job 1:08:37.806,1:08:40.460 of that separate unit is do the randomness properly. 1:08:40.460,1:08:42.146 And then the NIST test will tell you 1:08:42.146,1:08:43.324 what the randomness does 1:08:43.324,1:08:45.938 and then deterministic cryptographic testing 1:08:45.938,1:08:48.391 will tell you that the other pieces are working correctly 1:08:48.391,1:08:50.211 for all sorts of inputs. 1:08:50.211,1:08:53.892 Where it goes wrong is something like the ECDSA standard 1:08:53.892,1:08:55.394 that Nadia was mentioning before 1:08:55.394,1:08:56.696 and that's one that says that 1:08:56.696,1:08:59.210 you don't just deterministically generate a signature 1:08:59.210,1:09:01.410 you make some new randomness every time you generate 1:09:01.410,1:09:03.705 a signature and that's what goes horribly wrong. 1:09:03.705,1:09:05.670 So that's something where we need new standards 1:09:05.670,1:09:07.790 which say instead of doing what ECDSA does 1:09:07.790,1:09:09.741 we have a clear separation between 1:09:09.741,1:09:12.192 the cryptography always does the same thing 1:09:12.192,1:09:15.043 making it testable, and then randomness is centralized 1:09:15.043,1:09:17.541 in one spot which hopefully the NIST standards 1:09:17.541,1:09:19.840 do a good enough job of testing. 1:09:19.840,1:09:23.508 Female: So there's something that says here's what determines... 1:09:23.508,1:09:26.946 I guess what I'm asking is do you know of any algorithms 1:09:26.946,1:09:30.139 that can be used for determining whether something 1:09:30.139,1:09:32.248 is a good random number and whether your random 1:09:32.248,1:09:34.525 number generator is generating things properly? 1:09:34.525,1:09:37.027 Tanja: Well, there are a bunch of tests for. 1:09:37.027,1:09:39.307 There's hardware random number generators 1:09:39.307,1:09:42.126 you can run them through the diehard tests. 1:09:42.126,1:09:44.710 There's certificates by FIPS. 1:09:44.710,1:09:48.526 In general I would recommend implementing all those steps 1:09:48.526,1:09:50.811 but the smartcards that have been mentioned by Dan earlier 1:09:50.811,1:09:52.742 from the Taiwan citizen card 1:09:52.742,1:09:55.445 those are actually FIPS-certified. 1:09:55.445,1:09:58.942 So even though, these go through what is the industry standard 1:09:58.942,1:10:00.922 of doing random number generation on smartcards 1:10:00.922,1:10:03.622 which everybody thought was good enough, 1:10:03.622,1:10:04.979 apparently they're not. 1:10:04.979,1:10:06.477 So randomness is a total mess. 1:10:06.477,1:10:08.613 I mean after the paper by Nadia and 1:10:08.613,1:10:10.410 also when the paper by the Lausanne team came out, 1:10:10.410,1:10:12.692 there is no some more movement in finding 1:10:12.692,1:10:14.023 better random number generators. 1:10:14.023,1:10:18.659 At this moment, well, hope for the best. 1:10:18.659,1:10:20.625 laughter 1:10:20.625,1:10:22.178 Implement the standards, yes. 1:10:22.178,1:10:24.672 If you have some source of randomness 1:10:24.672,1:10:28.424 try to stretch it with a stream cipher or something like that. 1:10:28.424,1:10:31.296 Nadia: I want to tell a little story. 1:10:31.296,1:10:33.227 After we published the paper 1:10:33.227,1:10:35.643 I heard some very interesting things from some of the vendors. 1:10:35.643,1:10:39.329 I think one of the things that makes randomness 1:10:39.329,1:10:42.206 and cryptography a difficult problem is that 1:10:42.206,1:10:44.612 this kind of issue is something that 1:10:44.612,1:10:46.808 crosses a lot of the abstraction boundaries 1:10:46.808,1:10:48.546 that you're used to in coding. 1:10:48.546,1:10:50.525 You want to have the clean unit test that you know 1:10:50.525,1:10:52.343 that this piece works, and this piece works, 1:10:52.343,1:10:53.923 and this piece works, and this piece works, 1:10:53.923,1:10:57.777 and you put them all together and it will all work flawlessly. 1:10:57.777,1:11:00.297 Somehow this kind of problem is something that 1:11:00.297,1:11:02.898 happens at the boundaries of each unit test. 1:11:02.898,1:11:05.223 We know that the operating system is like 1:11:05.223,1:11:08.431 "ok I'm getting my proper inputs from the hardware 1:11:08.431,1:11:11.297 and I'm correctly generating my randomness from there" 1:11:11.297,1:11:12.840 and then the application says 1:11:12.840,1:11:15.023 "I'm getting my randomness from the operating system, 1:11:15.023,1:11:16.310 it's good randomness" 1:11:16.310,1:11:19.605 and then the application uses the correct crypto algorithms 1:11:19.605,1:11:21.927 to then do good cryptography. 1:11:21.927,1:11:24.495 And at the very beginning of that stage 1:11:24.495,1:11:27.777 there was something broken and it translated all the way 1:11:27.777,1:11:29.526 through all of these pieces of software 1:11:29.526,1:11:31.532 that were working correctly. 1:11:31.532,1:11:35.415 Testing this holistically was the only way 1:11:35.415,1:11:37.312 to find this kind of error. 1:11:37.312,1:11:41.241 One story that I heard from a vendor was that 1:11:41.241,1:11:43.911 they ran into randomness problems. 1:11:43.911,1:11:45.745 They developed some device. 1:11:45.745,1:11:47.325 It was working perfectly 1:11:47.325,1:11:51.478 and on the assembly line they were being turned on 1:11:51.478,1:11:55.497 and run through the checks. 1:11:55.497,1:11:57.807 You boot it up it checks the integrity 1:11:57.807,1:12:00.641 on the assembly line and it was continuing on. 1:12:00.641,1:12:06.390 If you exactly booted them up at the exact same time 1:12:06.390,1:12:10.129 in the exact same conditions 1:12:10.129,1:12:13.221 then all of the inputs to the random number generator 1:12:13.221,1:12:16.323 would be the same and they would generate the same keys 1:12:16.323,1:12:17.743 and then these keys would be, 1:12:17.743,1:12:19.721 you know, they already generated the keys, 1:12:19.721,1:12:21.197 so they wouldn't generate them again after 1:12:21.197,1:12:25.397 they went to the consumer. 1:12:25.397,1:12:27.145 I don't know how to unit test that 1:12:27.145,1:12:29.976 except take the entire device, turn it on, 1:12:29.976,1:12:32.589 and take multiple of them and make sure the devices 1:12:32.589,1:12:35.396 as they're coming out of the production process 1:12:35.396,1:12:39.596 are working correctly. 1:12:39.596,1:12:41.477 Angel: So we will take 2 more questions. 1:12:41.477,1:12:46.692 One from number 4 first and then number 1 at the last. 1:12:46.692,1:12:48.005 Male: How good, do you think, 1:12:48.005,1:12:52.544 hardware-supported random number generators are 1:12:52.544,1:12:53.432 after what you've just said? 1:12:53.432,1:12:55.806 I think they're probably not good anymore? 1:12:55.806,1:13:01.566 Dan: Intel has their RdRand instruction in some of the newer chips 1:13:01.566,1:13:04.979 and they say that they've gone through 1:13:04.979,1:13:06.957 a reasonable number of evaluation steps. 1:13:06.957,1:13:09.442 It sounds like they're trying. 1:13:09.442,1:13:11.826 Whether they're successful is a different question. 1:13:11.826,1:13:13.661 Something I don't like about the API is 1:13:13.661,1:13:16.392 RdRand gives you no way to test the pieces. 1:13:16.392,1:13:18.798 You can only test the post-processed output. 1:13:18.798,1:13:22.157 So nobody else is able to test the actual 1:13:22.157,1:13:24.276 randomness generation part of the hardware. 1:13:24.276,1:13:26.543 You can only get a filtered version of that. 1:13:26.543,1:13:29.245 So Intel and people who are contracted by Intel 1:13:29.245,1:13:30.628 to see what's going on inside 1:13:30.628,1:13:32.829 are the only people who can run these tests. 1:13:32.829,1:13:35.874 It's not open in a way that allows the community 1:13:35.874,1:13:37.377 to see if there's further problems. 1:13:37.377,1:13:38.756 But at least they're trying 1:13:38.756,1:13:41.423 and maybe with enough effort the hardware manufacturers 1:13:41.423,1:13:43.458 will get randomness generation right. 1:13:43.458,1:13:47.579 Of course if you can generate any sort of secret once 1:13:47.579,1:13:50.462 if you have enough of a secret to start with 1:13:50.462,1:13:52.328 then you can use things like AES to 1:13:52.328,1:13:53.944 generate any number of secrets 1:13:53.944,1:13:55.494 and you can put those secrets into 1:13:55.494,1:13:57.296 any number of devices that you want. 1:13:57.296,1:13:59.594 If you use AES in counter mode with 1:13:59.594,1:14:02.123 encrypting 1, encrypting 2, encrypting 3... 1:14:02.123,1:14:04.981 you get totally unpredictable, unrelated outputs 1:14:04.981,1:14:07.378 and then use those, burn those into some hardware. 1:14:07.378,1:14:09.979 But where do you get that initial randomness from? 1:14:09.979,1:14:12.830 Are you going through a trustworthy process there? 1:14:12.830,1:14:16.308 It's a hard problem. 1:14:16.308,1:14:18.198 Angel: Next question from number 1. 1:14:18.198,1:14:19.729 And that's the last question. 1:14:19.729,1:14:21.379 Male: You said something about 1:14:21.379,1:14:24.799 you dropped this group-based cryptography word. 1:14:24.799,1:14:28.111 Most of the stuff I've encountered under that heading 1:14:28.111,1:14:30.008 kind of sounded like snake oil 1:14:30.008,1:14:32.898 just cause they're using non-abelian groups and stuff. 1:14:32.898,1:14:35.832 Is there anything here that isn't? 1:14:35.832,1:14:38.280 Tanja: Ok, I did not say group-based crypto. 1:14:38.280,1:14:38.914 Male: Ok. 1:14:38.914,1:14:40.259 Tanja: I said code-based cryptography. 1:14:40.259,1:14:41.241 Male: Sorry, thanks. 1:14:41.241,1:14:42.773 Tanja: So there is a class of cryptosystems 1:14:42.773,1:14:44.826 which are fine against quantum computers. 1:14:44.826,1:14:50.148 For all crypto it's always up to what we know. 1:14:50.148,1:14:51.959 Next day, there could be somebody with a 1:14:51.959,1:14:54.009 polynomial-time factor algorithm and so on. 1:14:54.009,1:14:56.723 Now there's also a group of people doing 1:14:56.723,1:14:59.357 braid group cryptography, doing non-abelian groups, 1:14:59.357,1:15:01.764 most of these systems have been broken. 1:15:01.764,1:15:07.083 If they weren't broken by classical computers 1:15:07.083,1:15:10.911 maybe they would remain secure against quantum computers. 1:15:10.911,1:15:13.828 It's lots of "would"s there. 1:15:13.828,1:15:17.612 Yes, they might have this on their flags as well 1:15:17.612,1:15:20.547 but I wouldn't count this as a secure system. 1:15:20.547,1:15:23.027 Whereas code-based cryptography or lattice-based cryptography 1:15:23.027,1:15:26.480 are systems which should be fine. 1:15:26.480,1:15:28.646 Dan: Maybe just a URL if you're interested 1:15:28.646,1:15:31.939 in post-quantum cryptography is pqcrypto.org 1:15:31.939,1:15:34.513 and that keeps track of papers on the various types 1:15:34.513,1:15:37.227 of cryptography that should survive for a 100 years 1:15:37.227,1:15:41.448 and in particular against quantum computers. 1:15:41.448,1:15:42.444 Angel: Ladies and gentleman, please give 1:15:42.444,1:15:46.546 Nadia, Tanja and Daniel a big round of applause. 1:15:46.546,1:15:47.714 Thank you so much. 1:15:47.714,1:15:56.643 applause