0:00:08.965,0:00:11.637
Today we have Daniel J. Bernstein,
0:00:11.637,0:00:14.538
Nadia Heninger,
0:00:14.538,0:00:16.889
and Tanja Lange.
0:00:16.889,0:00:20.402
And they will talk about RSA factorization in the real world.
0:00:20.402,0:00:22.289
Talk is called FactHacks.
0:00:22.289,0:00:29.905
applause
0:00:29.905,0:00:31.591
Nadia: Thank you.
0:00:31.591,0:00:37.091
So we're going to start with a crypto lecture in 5 minutes.
0:00:37.091,0:00:40.557
So, just to begin, since we're talking about RSA.
0:00:40.557,0:00:43.678
Here is a picture of RSA for you.
0:00:43.678,0:00:48.605
RSA is the first published public-key cryptosystem.
0:00:48.605,0:00:51.222
So by a public-key cryptosystem, what I mean is
0:00:51.222,0:00:54.169
a cryptosystem that has two keys intead of just having
0:00:54.169,0:00:56.886
one key. You might think "ok you encrypt a message
0:00:56.886,0:00:59.035
to that key and you decrypt a message to that key"
0:00:59.035,0:01:01.402
That is a symmetric cryptosystem.
0:01:01.402,0:01:03.194
A public-key cryptosystem has two keys.
0:01:03.194,0:01:06.042
One is the public key and one is the private key.
0:01:06.042,0:01:07.688
The public key you publish to the world and
0:01:07.688,0:01:09.509
anyone can use that key to encrypt a message,
0:01:09.509,0:01:10.806
especially to you.
0:01:10.806,0:01:14.102
And only you, who have the private key, can decrypt that.
0:01:14.102,0:01:18.739
So the RSA paper was the first published public-key cryptosystem.
0:01:18.739,0:01:20.501
That was in 1977.
0:01:20.501,0:01:22.757
This revolutionized cryptography and enabled
0:01:22.757,0:01:25.542
the development of the Internet.
0:01:25.542,0:01:31.568
So 35 years later RSA is still the most commonly used public-key cryptosystem.
0:01:31.568,0:01:35.870
If you ever use the Internet, you probably use it every day.
0:01:35.870,0:01:42.154
So here is a sample SSL certificate which uses RSA.
0:01:42.154,0:01:45.241
If you use SSH you probably have an SSH key
0:01:45.241,0:01:48.440
which probably is RSA.
0:01:48.440,0:01:51.858
Before I launch into the math
0:01:51.858,0:01:54.676
I want to explain what we're doing here.
0:01:54.676,0:01:56.742
We wanted to, since you guys are hackers,
0:01:56.742,0:01:59.874
you like code instead of formulas
0:01:59.874,0:02:02.784
so we wanted to give you formulas in code.
0:02:02.784,0:02:05.455
So what we're giving you is working Python code for
0:02:05.455,0:02:07.720
all the math that we're going to do.
0:02:07.720,0:02:09.976
And in order to do that we're going to use
0:02:09.976,0:02:12.643
some software you call Sage.
0:02:12.643,0:02:15.970
Sage is free open-source mathematics software.
0:02:15.970,0:02:20.160
It is available at this URL sagemath.org.
0:02:20.160,0:02:22.093
It's based off of Python
0:02:22.093,0:02:24.190
so it's very simple if you know Python.
0:02:24.190,0:02:27.904
It has a nice sort of interpreter
0:02:27.904,0:02:29.622
that you can use just like Python.
0:02:29.622,0:02:32.493
Here an example, 2*3=6.
0:02:32.493,0:02:34.890
But there are some differences.
0:02:34.890,0:02:38.202
For example the caret instead of in Python it's XOR
0:02:38.202,0:02:40.155
in Sage it's exponentiation.
0:02:40.155,0:02:43.869
So 2^3 is 8 in Sage.
0:02:43.869,0:02:47.351
The other nice thing about Sage is that
0:02:47.351,0:02:49.703
it has lots of useful mathematical libraries.
0:02:49.703,0:02:53.029
For example if I, in the Sage interpreter, say factor(15)
0:02:53.029,0:02:55.512
it helpfully tells me 3*5.
0:02:55.512,0:02:56.513
That's pretty great.
0:02:56.513,0:02:59.152
It also does a lot more advanced things, for example
0:02:59.152,0:03:00.467
it can represent polynomials.
0:03:00.467,0:03:03.498
I can say factor the polynomial x^2-1
0:03:03.498,0:03:08.127
and it helpfully tells me the answer to that.
0:03:08.127,0:03:12.819
So now that we've gone through what is Sage,
0:03:12.819,0:03:17.443
I want to explain how to do RSA.
0:03:17.443,0:03:21.065
Perhaps some of you have seen this before.
0:03:21.065,0:03:24.677
In order to generate an RSA key, the first thing that you do
0:03:24.677,0:03:27.435
is you generate two large random primes.
0:03:27.435,0:03:30.441
So if we want a 1024 bit RSA key
0:03:30.441,0:03:34.779
we generate two primes of size 2^512,
0:03:34.779,0:03:38.515
so 512 bit primes, p and q.
0:03:38.515,0:03:41.697
In order to generate the public key
0:03:41.697,0:03:44.900
I multiply together p and q and you get some N.
0:03:44.900,0:03:46.433
N has 1024 bits.
0:03:46.433,0:03:49.915
And then you have what's known as the public exponent
0:03:49.915,0:03:54.435
which you can choose 3 or 65537 or 35.
0:03:54.435,0:03:56.300
It really doesn't matter what you choose.
0:03:56.300,0:03:59.117
These are some of the most commonly used values.
0:03:59.117,0:04:03.129
Now in order to generate a private key
0:04:03.129,0:04:07.695
you choose d, the decryption exponent,
0:04:07.695,0:04:12.936
to be the modular inverse of e mod (p-1)*(q-1)
0:04:12.936,0:04:15.394
It doesn't matter why so much, for this talk
0:04:15.394,0:04:17.276
but here is the formula
0:04:17.276,0:04:20.295
or here is the command to do this in Sage.
0:04:20.295,0:04:23.150
So now we have a public key and a private key
0:04:23.150,0:04:24.508
that are pairs.
0:04:24.508,0:04:27.948
And in RSA if you want to enrypt a message
0:04:27.948,0:04:31.547
you raise your message to the e'th power mod n
0:04:31.547,0:04:34.753
so you can do that with your public key here.
0:04:34.753,0:04:36.776
And similarly decryption,
0:04:36.776,0:04:39.998
you raise the ciphertext to the d'th power mod n
0:04:39.998,0:04:42.835
and that will give you the message back out again.
0:04:42.835,0:04:45.748
Fairly simple.
0:04:45.748,0:04:48.731
Now I'm lying to you slightly.
0:04:48.731,0:04:51.088
If you read this talk do not
0:04:51.088,0:04:53.181
absolutely do not implement RSA this way.
0:04:53.181,0:04:54.757
You must pad your message.
0:04:54.757,0:04:57.186
This is a public service warning.
0:04:57.186,0:05:00.662
We offered to tell you about factoring.
0:05:00.662,0:05:03.941
What is the relationship between RSA and factoring?
0:05:03.941,0:05:08.972
Now you can see easily from the formula
0:05:08.972,0:05:11.760
from the private key here
0:05:11.760,0:05:16.337
that if you know how to factor N into its prime factors p and q
0:05:16.337,0:05:19.496
then you can just compute d, the decryption exponent
0:05:19.496,0:05:20.812
of your private key.
0:05:20.812,0:05:24.395
Clearly, if you can factor N you can compute the private key.
0:05:24.395,0:05:28.930
But we don't actually know that factoring is the only way to break RSA.
0:05:28.930,0:05:34.163
There might be some way that you can compute messages from ciphertexts
0:05:34.163,0:05:36.877
that doesn't actually reveal the private key.
0:05:36.877,0:05:41.545
And there's no proof either way that this is or is not possible.
0:05:41.545,0:05:43.478
The last thing that I want to mention here
0:05:43.478,0:05:45.458
just to clarify things
0:05:45.458,0:05:46.830
some of you who might have taken
0:05:46.830,0:05:51.527
complexity or algorithms courses or a beginning CS course
0:05:51.527,0:05:55.874
you might have heard of NP-hardness, the P vs. NP problem.
0:05:55.874,0:06:00.904
And I just want to clarify that factoring is not known to be NP-hard.
0:06:00.904,0:06:05.092
Every computer scientist will love it if we could actually generate
0:06:05.092,0:06:11.139
a cryptosystem which is known to be based of average case NP-hardness.
0:06:11.139,0:06:13.573
We don't know how to do that.
0:06:13.573,0:06:16.992
RSA is not doing that.
0:06:16.992,0:06:22.309
In addition the factoring problem is probably not NP-hard.
0:06:22.309,0:06:25.892
The question is: how hard is factoring?
0:06:25.892,0:06:27.474
Since factoring is the best way that we know
0:06:27.474,0:06:31.227
to break the most commonly used public-key cryptosystem on the Internet.
0:06:31.227,0:06:32.935
Well I showed you some examples before
0:06:32.935,0:06:37.243
where Sage had this helpful little factor function.
0:06:37.243,0:06:39.164
So how well does it work?
0:06:39.164,0:06:41.181
How long do people think this takes?
0:06:41.181,0:06:46.149
two 32 bit integers multiplied together.
0:06:46.149,0:06:48.850
I heard it'll go a couple of seconds.
0:06:48.850,0:06:50.098
0.01 seconds.
0:06:50.098,0:06:55.108
This is on this computer.
0:06:55.108,0:06:57.869
How about 64 bit integers, two of them multiplied together?
0:06:57.869,0:07:01.289
So this is a 128 bit RSA modulus.
0:07:01.289,0:07:04.382
Any guesses?
0:07:04.382,0:07:05.464
One minute?
0:07:05.464,0:07:08.235
Nope, 0.1 seconds.
0:07:08.235,0:07:14.521
How about two 96 bit integers multiplied together?
0:07:14.521,0:07:15.939
A couple of minutes?
0:07:15.939,0:07:18.105
4 seconds.
0:07:18.105,0:07:21.965
How about two 128 bit integers multiplied together?
0:07:21.965,0:07:28.255
This is a 256 bit RSA modulus.
0:07:28.255,0:07:29.300
No guesses? Alright.
0:07:29.300,0:07:30.430
500 seconds.
0:07:30.430,0:07:32.791
So at this point it's starting to get a little bit iffy if I'm
0:07:32.791,0:07:36.327
sitting on a plane trying to do this on battery power.
0:07:36.327,0:07:39.248
laughter
0:07:39.248,0:07:43.764
Clearly this is growing faster than linear in the size of the key.
0:07:43.764,0:07:45.597
So what is actually going on?
0:07:45.597,0:07:49.066
Well.
0:07:49.066,0:07:50.734
Dan: Ok.
0:07:50.734,0:07:52.794
Turns out these factoring functions
0:07:52.794,0:07:54.597
they do look like they're getting pretty slow but
0:07:54.597,0:07:56.835
wait a minute.
0:07:56.835,0:07:59.814
Do we actually have to use this factor function
0:07:59.814,0:08:00.866
if it's getting really slow?
0:08:00.866,0:08:04.302
Maybe instead of trying to use Sage's factor function
0:08:04.302,0:08:08.710
maybe we could just guess what the prime was.
0:08:08.710,0:08:12.350
Now as an RSA user you might not think that this is a threat
0:08:12.350,0:08:13.928
having somebody guess your prime
0:08:13.928,0:08:17.114
because there is some way to count the number of primes,
0:08:17.114,0:08:18.719
number of 512 bit primes and there's
0:08:18.719,0:08:22.185
more than 2^500 512 bit primes.
0:08:22.185,0:08:24.011
And if your random number generator is giving you
0:08:24.011,0:08:27.097
any of those 512 bit primes with the same chance
0:08:27.097,0:08:30.829
then any particular prime that the attacker guesses
0:08:30.829,0:08:32.845
and just tries dividing n by that,
0:08:32.845,0:08:35.061
see if it's evenly divisable, well
0:08:35.061,0:08:39.417
any particular prime has only a 2^(-500) chance
0:08:39.417,0:08:41.312
of being guessed by the attacker.
0:08:41.312,0:08:43.319
And even if the attacker tries a huge number of times
0:08:43.319,0:08:44.897
he'll never guess your prime
0:08:44.897,0:08:47.621
except what if your random number generator
0:08:47.621,0:08:52.031
is working really really badly and doesn't give you 2^500 different primes?
0:08:52.031,0:08:55.902
What if it only gives you, say, 2^40 different primes?
0:08:55.902,0:08:58.055
And then suddenly the attacker could, well,
0:08:58.055,0:09:01.248
try figuring out what those primes are.
0:09:01.248,0:09:03.237
So here's how the attacker looks at this.
0:09:03.237,0:09:05.819
The attacker says: ok I'm gonna target your public key.
0:09:05.819,0:09:08.884
I'm gonna take your big N which is your secret...
0:09:08.884,0:09:10.912
sorry, public key is big N,
0:09:10.912,0:09:13.666
your secret key p times your secret key q,
0:09:13.666,0:09:14.920
your private p and q.
0:09:14.920,0:09:16.764
He's trying to figure out the p and q, given N.
0:09:16.764,0:09:20.169
So what he does, he takes a bunch of devices that are out there
0:09:20.169,0:09:22.902
and he looks at how they work
0:09:22.902,0:09:25.281
or downloads a bunch of software which generates keys
0:09:25.281,0:09:28.904
and then using that software, using those devices
0:09:28.904,0:09:32.216
he tries generating a bunch of p's and q's for himself
0:09:32.216,0:09:34.516
using whatever the random number generator is
0:09:34.516,0:09:36.932
in that device or in that software.
0:09:36.932,0:09:38.400
And then that's something which
0:09:38.400,0:09:40.682
maybe the random number generator works really badly
0:09:40.682,0:09:42.832
and maybe you're using that device, too.
0:09:42.832,0:09:45.765
So the attacker tries each of the primes that he see's
0:09:45.765,0:09:46.918
from this device
0:09:46.918,0:09:50.129
and tries seeing does that prime divide your N.
0:09:50.129,0:09:51.400
If you were using that device
0:09:51.400,0:09:52.932
and it doesn't generate very many primes
0:09:52.932,0:09:55.835
then maybe he gets lucky and finds your prime
0:09:55.835,0:09:57.968
because the device has generated the same prime for him
0:09:57.968,0:09:59.618
that it generated for you.
0:09:59.618,0:10:03.216
Now does anybody actually build devices which
0:10:03.216,0:10:07.005
screw up random numbers this badly?
0:10:07.005,0:10:08.519
laughter
0:10:08.519,0:10:11.538
As a matter of fact, yes, bears do shit in the woods.
0:10:11.538,0:10:13.543
So there are some famous examples.
0:10:13.543,0:10:16.952
We don't have a huge number of historical discussions in this talk
0:10:16.952,0:10:18.539
but just a few examples here.
0:10:18.539,0:10:22.401
Goldberg&Wagner totally broke the original Netscape
0:10:22.401,0:10:23.989
generation of public keys.
0:10:23.989,0:10:26.921
There are only something like 2^47 possible keys
0:10:26.921,0:10:29.216
which is a countable numbers.
0:10:29.216,0:10:31.077
So they could figure out all the possible keys
0:10:31.077,0:10:34.139
that Netscape could generate if they you knew when you generated a key
0:10:34.139,0:10:35.802
which was often very predictable.
0:10:35.802,0:10:37.724
Another example: the Debian bug,
0:10:37.724,0:10:40.974
the horrendous failure for SSH and lots of other applications
0:10:40.974,0:10:42.418
from a few years ago.
0:10:42.418,0:10:45.692
Debian and Ubuntu for a year and half were generating only
0:10:45.692,0:10:48.773
well, fewer than a million possible public keys.
0:10:48.773,0:10:51.421
So complete disasters of random number generation.
0:10:51.421,0:10:53.826
But if you want to add to this list,
0:10:53.826,0:10:55.202
if you want to do more and more of these
0:10:55.202,0:10:56.974
find bad random number generators
0:10:56.974,0:11:00.518
you have to say "ok what's the next device out there
0:11:00.518,0:11:01.858
that might have been screwed up."
0:11:01.858,0:11:03.422
"Let's look at it, look at the code."
0:11:03.422,0:11:05.039
"See what kind of primes it generates."
0:11:05.039,0:11:06.571
"Does it do badly?"
0:11:06.571,0:11:09.424
It takes some real work to figure this out.
0:11:09.424,0:11:11.801
Wouldn't it be nice if there was a systematic way
0:11:11.801,0:11:14.586
to just without having to do a lot of work,
0:11:14.586,0:11:16.576
to look at each device, each piece of software,
0:11:16.576,0:11:18.403
wouldn't it be nice to just figure out
0:11:18.403,0:11:21.568
"oh yeah, let's just magically figure out
0:11:21.568,0:11:23.726
if there's any primes out there generated
0:11:23.726,0:11:25.507
from bad random number generators."
0:11:25.507,0:11:28.020
You could imagine here's the easy attack.
0:11:28.020,0:11:29.622
Here's the systematic way to do this.
0:11:29.622,0:11:34.052
You take two users, so you and you.
0:11:34.052,0:11:35.442
And you each have a public key.
0:11:35.442,0:11:37.178
We're gonna grab those public keys
0:11:37.178,0:11:41.681
N1 and N2
0:11:41.681,0:11:46.983
and then hope that this N1 and N2 share a prime.
0:11:46.983,0:11:54.130
N1 is some prime p times some q1 and N2 is some same prime p times a different q2.
0:11:54.130,0:11:55.663
Now this could happen, you could have
0:11:55.663,0:11:59.100
N1 and N2 sharing both primes.
0:11:59.100,0:12:02.101
That would be legitimate that two people who are actually
0:12:02.101,0:12:05.645
the same webserver sharing their keys, sharing their certificates,
0:12:05.645,0:12:06.843
they know they're doing this,
0:12:06.843,0:12:09.281
you can get the same public key from two places.
0:12:09.281,0:12:12.183
Google has their keys copied to tens of thousands of servers,
0:12:12.183,0:12:13.399
that's not a surprise.
0:12:13.399,0:12:15.395
But if there's a different...
0:12:15.395,0:12:16.999
If you have two different public keys
0:12:16.999,0:12:18.430
they shouldn't be sharing primes.
0:12:18.430,0:12:19.582
What if they do?
0:12:19.582,0:12:21.979
Well, here's this magic Euclid's algorithm
0:12:21.979,0:12:24.744
which will just print out the shared prime p.
0:12:24.744,0:12:26.066
This is a very simple algorithm.
0:12:26.066,0:12:28.495
It just takes one of the numbers, say N1
0:12:28.495,0:12:31.961
and let's say N1 is the smaller one and N2 is the bigger one,
0:12:31.961,0:12:34.142
it divides N2 by N1
0:12:34.142,0:12:37.512
and replaces N2 by the remainder and then switches the numbers
0:12:37.512,0:12:39.593
and that's exactly what this code does.
0:12:39.593,0:12:43.111
The N2 % N1 that's again the N2 mod N1
0:12:43.111,0:12:45.543
the least remainder when you divide N2 by N1.
0:12:45.543,0:12:49.414
And then the result once one of the numbers gets down to zero
0:12:49.414,0:12:52.292
the other number is exactly the shared prime.
0:12:52.292,0:12:54.149
So this amazing algorithm,
0:12:54.149,0:12:57.051
here's just a little manual example
0:12:57.051,0:12:59.898
not with big 512, 1024 bit keys
0:12:59.898,0:13:04.143
this is with just tiny little I guess 8 bit primes
0:13:04.143,0:13:07.625
16 bit... well looks more like 13 bit keys.
0:13:07.625,0:13:12.375
So two numbers coming in and one is 4187 and two is 5989.
0:13:12.375,0:13:14.879
You can see if you keep dividing one number by the other,
0:13:14.879,0:13:16.162
replacing it by the remainder,
0:13:16.162,0:13:19.411
do that a few times, you quickly get down to zero.
0:13:19.411,0:13:22.774
And the number right before the zero in the middle of the slide is 53.
0:13:22.774,0:13:27.172
Now 53 is a shared prime between these two small keys.
0:13:27.172,0:13:28.772
If you scale this up,
0:13:28.772,0:13:32.256
if you do this computation for bigger numbers
0:13:32.256,0:13:34.176
then here's another timing example.
0:13:34.176,0:13:36.673
It's still really really fast, in the middle of the slide.
0:13:36.673,0:13:40.359
This is doing 1024 bit keys that share a 512 bit prime.
0:13:40.359,0:13:44.554
You can figure out this 512 bit prime so quickly that the timing
0:13:44.554,0:13:48.848
there is 0.00 seconds, can't even see how fast it is.
0:13:48.848,0:13:51.094
Ok, so that's Euclid's algorithm.
0:13:51.094,0:13:53.676
Now you can actually do this for tons and tons of keys.
0:13:53.676,0:13:57.695
You download, say, millions, tens of millions of keys from the Internet,
0:13:57.695,0:14:01.011
all the different SSL servers, maybe go for some SSH keys,
0:14:01.011,0:14:03.891
you download tons and tons of public keys and then
0:14:03.891,0:14:06.590
you try all the pairs with Euclid's algorithm.
0:14:06.590,0:14:09.406
And it's so fast that you can do this but actually
0:14:09.406,0:14:10.848
you can do better.
0:14:10.848,0:14:13.927
So there's an algorithm: batch GCD algorithm
0:14:13.927,0:14:15.396
which takes a whole bunch of keys
0:14:15.396,0:14:19.822
and figures out which of those keys share primes with the others.
0:14:19.822,0:14:22.525
Much much faster than trying every pair.
0:14:22.525,0:14:23.909
So you don't have to try every pair.
0:14:23.909,0:14:26.973
It's kind of like sorting, it's about that kind of speed
0:14:26.973,0:14:29.065
instead of having to try every pair.
0:14:29.065,0:14:30.811
If you have tens of millions you don't have to do
0:14:30.811,0:14:33.156
tens of millions times tens of millions of pairs.
0:14:33.156,0:14:35.739
You just have to pretty much reap through the whole list once.
0:14:35.739,0:14:38.877
So what does this batch GCD algorithm do?
0:14:38.877,0:14:41.447
It's figuring out for the first N1,
0:14:41.447,0:14:44.029
instead of computing the greatest common divisor,
0:14:44.029,0:14:45.908
the Euclid result for N1 and N2
0:14:45.908,0:14:47.846
and for N1 and N3 and so on,
0:14:47.846,0:14:52.340
it figures out the product of N2, N3, N4 and so on,
0:14:52.340,0:14:55.257
takes the greatest common divisor of that with N1,
0:14:55.257,0:14:57.191
applies Euclid's algorithm to that
0:14:57.191,0:15:00.874
and then does a similar thing for N2 and for N3 and so on.
0:15:00.874,0:15:05.656
Each of those is gonna show you whether N1 or N2 or N3,
0:15:05.656,0:15:08.148
it's gonna show you whether they share some prime
0:15:08.148,0:15:10.246
with any of the other N's.
0:15:10.246,0:15:13.326
If you did exactly the computation
0:15:13.326,0:15:15.248
shown on the slide in the math formulas
0:15:15.248,0:15:16.617
it would still be too slow.
0:15:16.617,0:15:20.457
But the batch GCD algorithm finds redundancies
0:15:20.457,0:15:23.218
in these GCDs, and here's exactly what it does:
0:15:23.218,0:15:25.620
It figures out the product of all of the keys.
0:15:25.620,0:15:29.923
So you take, say, a million keys and you multiply them all together.
0:15:29.923,0:15:33.534
And that's done with the code here where you do...
0:15:33.534,0:15:35.120
maybe I'll just skip to the bottom.
0:15:35.120,0:15:37.474
You see numbers 10, 20, 30, 40.
0:15:37.474,0:15:39.356
You build a product tree.
0:15:39.356,0:15:42.210
So you take 10 times 20, compute 200.
0:15:42.210,0:15:44.687
Take the next pair 30 times 40, compute 1200.
0:15:44.687,0:15:47.889
The take the pair of products 200 times 1200.
0:15:47.889,0:15:50.378
If you have more numbers, you have more levels in the tree.
0:15:50.378,0:15:55.227
And that's exactly what the Sage code does here.
0:15:55.227,0:15:58.924
And then the last step of the algorithm is
0:15:58.924,0:16:02.847
kind of going down the tree, building a remainder tree.
0:16:02.847,0:16:06.515
So what's happening here to figure out the greatest common divisor
0:16:06.515,0:16:10.013
of N1 with the product of all the other keys,
0:16:10.013,0:16:12.846
the idea is to take this product
0:16:12.846,0:16:15.807
which is called big R on this slide,
0:16:15.807,0:16:23.160
take the product of all the keys and then divide that by N1^2
0:16:23.160,0:16:26.291
and then take the remainders, so R mod N1^2,
0:16:26.291,0:16:31.148
divide by N1 and then compute the greatest common divisor of N1 mod N.
0:16:31.148,0:16:33.124
And the amazing effect is that it's exactly the
0:16:33.124,0:16:35.047
greatest common divisor we were looking for.
0:16:35.047,0:16:37.629
And what this little Sage script does is that it takes
0:16:37.629,0:16:42.282
R and divides R by, well a bunch of N's in a really fast way,
0:16:42.282,0:16:43.960
and then at the bottom, once it has figured out
0:16:43.960,0:16:47.227
all mod N1^2, all mod N2^2 and so on,
0:16:47.227,0:16:49.308
divides those by N1 and N2,
0:16:49.308,0:16:52.177
takes the greatest common divisor with N1 and N2
0:16:52.177,0:16:54.824
and that's exactly telling you, the outputs here
0:16:54.824,0:16:57.063
that are different from 1, are exactly telling you
0:16:57.063,0:17:00.398
which N's share a prime with some others.
0:17:00.398,0:17:04.048
This very remarkably short little script does work quite well.
0:17:04.048,0:17:06.524
Here's how fast these are.
0:17:06.524,0:17:10.248
So I'm on some 800 MHz core.
0:17:10.248,0:17:13.442
If you try this for, say, a hundred numbers,
0:17:13.442,0:17:16.608
just hundred random 1024 bit numbers,
0:17:16.608,0:17:20.917
it doesn't matter what the numbers are, they always run at the same speed,
0:17:20.917,0:17:23.198
then that takes 0.05 seconds.
0:17:23.198,0:17:27.249
If you have 10 times as many numbers it takes about 20 times as long.
0:17:27.249,0:17:31.229
And if you have another 10 times as many numbers it takes about 20 times as long.
0:17:31.229,0:17:33.247
And it keeps scaling up like that.
0:17:33.247,0:17:35.948
So you can get to millions or tens of millions of numbers
0:17:35.948,0:17:38.582
and it still finishes in a reasonable amount of time.
0:17:38.582,0:17:40.068
So, amazingly fast algorithm.
0:17:40.068,0:17:43.810
You don't have to scale this up to like 2^50 or 2^100 keys.
0:17:43.810,0:17:46.059
There's only, say, tens of millions of keys out there
0:17:46.059,0:17:48.330
and this runs reasonably fast.
0:17:48.330,0:17:50.626
Ok.
0:17:50.626,0:17:52.416
Can you actually hope for this to work?
0:17:52.416,0:17:54.629
Are random number generators really this bad?
0:17:54.629,0:17:55.866
Well...
0:17:55.866,0:18:00.478
laughs
0:18:00.478,0:18:03.435
There's a 2012 paper from Heninger
0:18:03.435,0:18:04.926
that's the same Heninger we have sitting here
0:18:04.926,0:18:06.300
and Durumeric, Wustrow, Halderman,
0:18:06.300,0:18:09.881
that's got the best paper award at USENIX Security 2012.
0:18:09.881,0:18:14.678
What they did was they tried exactly this on tens of millions of keys out there
0:18:14.678,0:18:18.276
and factored tens of thousands of keys.
0:18:18.276,0:18:22.677
Admittely they used a C version of this instead of this
0:18:22.677,0:18:24.061
Sage script but
0:18:24.061,0:18:25.827
the Sage script will go up to millions.
0:18:25.827,0:18:28.359
It's just when you get beyond that you want to write it in C
0:18:28.359,0:18:31.009
to deal with using a lot of memory.
0:18:31.009,0:18:34.887
But you can use exactly this script for pretty large chunks of keys.
0:18:34.887,0:18:37.951
And so they factored tens of thousands of keys and said
0:18:37.951,0:18:42.235
that these keys are typically not your bank keys.
0:18:42.235,0:18:44.909
They gonna be, say, keys for your home router.
0:18:44.909,0:18:47.639
So the vulnerable devices are not gonna be,
0:18:47.639,0:18:49.893
say, a big server out there.
0:18:49.893,0:18:53.902
The vulnerable devices will be your FRITZ!Box.
0:18:53.902,0:18:57.340
applause
0:18:57.340,0:18:59.838
Thank you, Nadia.
0:18:59.838,0:19:06.525
It's good to read the paper, go to factorable.net
0:19:06.525,0:19:09.073
and you get tons of analyses of why this happened
0:19:09.073,0:19:12.520
and why these devices are generated guessable numbers.
0:19:12.520,0:19:15.020
I mean numbers that are so non-random that they get
0:19:15.020,0:19:18.370
repeated between a lot of keys out there.
0:19:18.370,0:19:20.308
There was at the same time another team that
0:19:20.308,0:19:23.559
like within days announced the same result.
0:19:23.559,0:19:27.071
They independently did the same computation.
0:19:27.071,0:19:28.883
They said "yeah, we've also downloaded
0:19:28.883,0:19:31.352
a bunch of keys and factored as many as we could"
0:19:31.352,0:19:32.869
with basically the same algorithm.
0:19:32.869,0:19:35.198
At the end of it "yeah we factored
0:19:35.198,0:19:36.629
tens of thousands of keys."
0:19:36.629,0:19:38.882
That paper has no analysis.
0:19:38.882,0:19:41.582
They're like "yeah, there's keys"
0:19:41.582,0:19:43.267
"they share primes for some reason"
0:19:43.267,0:19:45.915
"I guess ecommerce is dead"
0:19:45.915,0:19:47.536
"go out, change your bank keys"
0:19:47.536,0:19:48.846
"call the New York Times"
0:19:48.846,0:19:50.553
There's the New York Times article:
0:19:50.553,0:19:52.601
"Flaw Found in an Online Encryption Method"
0:19:52.601,0:19:54.860
I also like the advertising on the side here.
0:19:54.860,0:19:57.151
Like iron key on the top and then:
0:19:57.151,0:20:01.168
"Encrypt, decrypt & access my secret data in real time"
0:20:01.168,0:20:03.135
"try it free for 30 days."
0:20:03.135,0:20:05.199
"I secured my cloud data. Have you?"
0:20:05.199,0:20:06.878
Awesome advertising there.
0:20:06.878,0:20:12.480
Once you found that there's a problem like this
0:20:12.480,0:20:14.919
then of course you start looking around for more and more places
0:20:14.919,0:20:17.435
where you might have some bad randomness.
0:20:17.435,0:20:22.795
For example there is a paper, or at least slides,
0:20:22.795,0:20:25.395
by Chou in Taiwan, saying
0:20:25.395,0:20:31.857
they took two million Taiwan Citizen Digital Certificates
0:20:31.857,0:20:34.327
and did some GCDs
0:20:34.327,0:20:36.626
and found that a 103 of those were
0:20:36.626,0:20:37.995
factorable.
0:20:37.995,0:20:40.052
So anybody who downloaded these certificates
0:20:40.052,0:20:43.177
which apparently are used in Taiwan for like paying your taxes,
0:20:43.177,0:20:44.393
talking to banks I don't know,
0:20:44.393,0:20:46.677
but paying taxes is one that I've checked.
0:20:46.677,0:20:49.537
So a 103 people out there have these Taiwan
0:20:49.537,0:20:51.011
Citizen Digital Certificates where
0:20:51.011,0:20:54.121
you're supposed to have in this database things like
0:20:54.121,0:20:57.122
the names of the people and their ID numbers
0:20:57.122,0:20:59.576
but you aren't supposed to have their secret keys.
0:20:59.576,0:21:01.543
The whole point is those are supposed to be private
0:21:01.543,0:21:03.938
kept on some smartcards that are issued to the citizens
0:21:03.938,0:21:05.295
in Taiwan.
0:21:05.295,0:21:08.706
And the randomness generation is so bad that
0:21:08.706,0:21:13.169
103 of those keys have now been factored.
0:21:13.169,0:21:15.451
The smartcard manufacturer, I also like this quote,
0:21:15.451,0:21:18.674
it's a company based in Munich called
0:21:18.674,0:21:22.315
Giesecke & Devrient and their motto is:
0:21:22.315,0:21:24.753
"Creating Confidence"
0:21:24.753,0:21:35.270
applause
0:21:35.270,0:21:37.698
Tanja: There's gonna be more of Dan, don't worry.
0:21:37.698,0:21:39.835
But we promised a bit more than just factoring
0:21:39.835,0:21:43.309
bad keys or the keys with their primes.
0:21:43.309,0:21:46.956
If you find another number like this one.
0:21:46.956,0:21:49.686
So here's coming some more of math stuff.
0:21:49.686,0:21:52.555
So if you see a number like this and you
0:21:52.555,0:21:56.427
observe this last digit, then yes.
0:21:56.427,0:21:58.738
It's very easy for you to see it's divisable by 5.
0:21:58.738,0:22:01.385
So if you find such a key that is easy to break
0:22:01.385,0:22:03.226
and you are a computer rather than a human being
0:22:03.226,0:22:05.324
you look at lots of those numbers.
0:22:05.324,0:22:07.076
You have a list of small primes stored
0:22:07.076,0:22:10.037
and what you're gonna do is just trial division.
0:22:10.037,0:22:11.987
So if there's any small factor
0:22:11.987,0:22:14.092
it's very easy to find this trial division.
0:22:14.092,0:22:16.708
Or what Dan showed was the remainder trees.
0:22:16.708,0:22:20.222
You can also do this for batch trial division.
0:22:20.222,0:22:23.728
So assuming that you're looking for a prime p somewhere
0:22:23.728,0:22:26.735
then you go linearly all the way up to the p
0:22:26.735,0:22:30.442
and there are about p/log(p) many primes
0:22:30.442,0:22:32.094
before you find this p.
0:22:32.094,0:22:33.969
For each of them you just do exact division.
0:22:33.969,0:22:36.558
If it works, fine you found a factor, otherwise not.
0:22:36.558,0:22:39.459
Now this is not going to work against your normal keys.
0:22:39.459,0:22:43.624
I know that in the example that Dan mentioned by Wagner&Goldberg
0:22:43.624,0:22:46.625
that they found one factor which had a 9 in there
0:22:46.625,0:22:49.975
but usually your keys don't have a 9.
0:22:49.975,0:22:54.687
For slightly larger numbers there is a method due to Pollard.
0:22:54.687,0:22:57.460
So if you see the respectful number here,
0:22:57.460,0:22:59.887
there is no obvious divisability of this one.
0:22:59.887,0:23:03.091
One thing you can try is
0:23:03.091,0:23:06.660
a walk, well we call this a walk.
0:23:06.660,0:23:10.404
You start with some random number smaller than N
0:23:10.404,0:23:12.902
and some offset here.
0:23:12.902,0:23:17.277
I should also say, all those scripts are available at this URL.
0:23:17.277,0:23:22.087
So if you go to facthacks.cr.yp.to then you'll find all the scripts
0:23:22.087,0:23:23.879
if you want to play with it at the same time,
0:23:23.879,0:23:26.625
including this one with some explanations and such.
0:23:26.625,0:23:31.092
So what you do is you start off two different numbers,
0:23:31.092,0:23:34.788
one at each step squares it and adds c.
0:23:34.788,0:23:37.226
And the other one squares it, adds c,
0:23:37.226,0:23:39.886
and squares it again and adds the c.
0:23:39.886,0:23:41.455
Each time computing mod N.
0:23:41.455,0:23:44.102
So you have two sequences running around,
0:23:44.102,0:23:46.023
one is twice the speed as the other.
0:23:46.023,0:23:50.953
At every step you check with the GCD of this number N
0:23:50.953,0:23:52.757
and the difference between the two walks
0:23:52.757,0:23:55.006
hasn't anything interesting now.
0:23:55.006,0:23:57.390
N is an RSA modulus, the only thing interesting
0:23:57.390,0:23:59.675
could either be p or q.
0:23:59.675,0:24:03.009
So for this particular number
0:24:03.009,0:24:04.270
when I run this
0:24:04.270,0:24:08.894
I find 2053 after a few milliseconds.
0:24:08.894,0:24:10.642
So this is again a cooked up example.
0:24:10.642,0:24:13.307
You won't find such a small factor if you're trying
0:24:13.307,0:24:16.173
a general RSA factor, otherwise your numbers
0:24:16.173,0:24:17.830
would be totally busted but
0:24:17.830,0:24:20.604
this shows if you have a small number in there
0:24:20.604,0:24:21.857
it's very easy to find.
0:24:21.857,0:24:24.357
All later on when Dan is talking about the number field sieve
0:24:24.357,0:24:26.157
this is an interesting subroutine.
0:24:26.157,0:24:28.089
So if you ever need to find small numbers
0:24:28.089,0:24:31.020
then, sure trial division is very easy for
0:24:31.020,0:24:34.080
really small numbers taking this long,
0:24:34.080,0:24:38.352
this one squared with p is much faster already.
0:24:38.352,0:24:42.070
Now even faster than that, also due to Pollard,
0:24:42.070,0:24:45.378
called Pollard's p-1 method.
0:24:45.378,0:24:47.904
Here is again an innocent looking number N.
0:24:47.904,0:24:49.842
You see the numbers get larger.
0:24:49.842,0:24:52.176
This is a 256 bit number.
0:24:52.176,0:24:56.744
In order to deal with such a number there is one step
0:24:56.744,0:24:59.342
which is expensive but you would do it only once
0:24:59.342,0:25:01.208
for all different numbers you want to try,
0:25:01.208,0:25:05.238
namely compute a big integer which has lots of little factors.
0:25:05.238,0:25:06.985
So you take the least common multiple
0:25:06.985,0:25:09.788
from 1, 2, 3, 4, 5...
0:25:09.788,0:25:12.203
Ok, once you hit the 4 you have another 2 in there.
0:25:12.203,0:25:13.808
So there's lots and lots of powers of 2,
0:25:13.808,0:25:15.408
lots and lots of powers of 3.
0:25:15.408,0:25:19.390
And then the larger primes only appear like once.
0:25:19.390,0:25:23.161
But you're not going up to 256 or 128 anywhere.
0:25:23.161,0:25:27.005
You're sticking here at 22 bits
0:25:27.005,0:25:29.059
and then use the same function that Nadia explained
0:25:31.023,0:25:32.987
in the RSA computation, namely the modular exponentiation.
0:25:32.987,0:25:36.576
So you take the 2^y,
0:25:36.576,0:25:38.241
this huge number,
0:25:38.241,0:25:40.486
but the whole computation mod N.
0:25:40.486,0:25:44.633
So this whole thing never grows beyond this 256 bit number.
0:25:44.633,0:25:47.939
Or for a real world example it would be a 1000 bit number.
0:25:47.939,0:25:52.518
And then you compute the GCD of this number and N.
0:25:52.518,0:25:55.091
For this particular one, again it's a cooked up example,
0:25:55.091,0:25:56.390
you get a factor.
0:25:56.390,0:25:59.791
Now this factor is actually 120 bits.
0:25:59.791,0:26:02.369
This is not a small factor.
0:26:02.369,0:26:05.688
So if you were using 256 bit numbers
0:26:05.688,0:26:07.856
this is something that could happen to you
0:26:07.856,0:26:11.305
and nevertheless this method would find it.
0:26:11.305,0:26:13.522
So this finds much larger factors than trial division
0:26:13.522,0:26:15.484
or the Rho method in the same amount of time.
0:26:15.484,0:26:17.376
But it doesn't work for all primes.
0:26:17.376,0:26:20.066
It works for primes which are kind of special.
0:26:20.066,0:26:23.859
Now what's special about this prime or the other factor?
0:26:23.859,0:26:29.553
If you take p and subtract -1, hence the name p-1 method,
0:26:29.553,0:26:31.660
and factor that thing,
0:26:31.660,0:26:35.272
that has lots, lots of small numbers.
0:26:35.272,0:26:36.937
So that's something we call smooth.
0:26:36.937,0:26:40.342
So a smooth number doesn't have big factors.
0:26:40.342,0:26:43.868
So a prime is like the least smooth number you can imagine.
0:26:43.868,0:26:47.367
And 2 to the power large is the most smooth number,
0:26:47.367,0:26:50.007
so lots of little factors means smooth.
0:26:50.007,0:26:55.568
This largest factor happens to be 21.7 bits
0:26:55.568,0:26:58.355
which is why I chose the 22 bits here.
0:26:58.355,0:27:03.702
So as soon I run over the largest prime in p-1
0:27:03.702,0:27:06.127
with this exponent y here
0:27:06.127,0:27:08.653
then I do find this factor.
0:27:08.653,0:27:10.851
Now if you thought this was math
0:27:10.851,0:27:12.322
there is even more scary math
0:27:12.322,0:27:15.893
namely there are methods which are generalizing
0:27:15.893,0:27:19.319
this p-1 method using elliptic curves.
0:27:19.319,0:27:21.569
If you listen to the crypto advertisement
0:27:21.569,0:27:23.472
elliptic curves are usually something very good,
0:27:23.472,0:27:25.005
something you should have on your smartcard.
0:27:25.005,0:27:27.245
It's faster than RSA and so on.
0:27:27.245,0:27:29.385
That's the same type of elliptic curve
0:27:29.385,0:27:32.342
but here elliptic curves come in as an attack tool.
0:27:32.342,0:27:35.088
There is a method called the elliptic curve method ECM
0:27:35.088,0:27:41.255
which is a generalization of this p-1 method and does not need anything special,
0:27:41.255,0:27:44.708
I mean avoiding such primes is easy.
0:27:44.708,0:27:46.325
If you look into older standards
0:27:46.325,0:27:49.852
they all warn about "make sure to use strong primes"
0:27:49.852,0:27:53.174
"make sure to check that your p-1 is not smooth"
0:27:53.174,0:27:55.301
And of course you don't want your p-1 to be smooth
0:27:55.301,0:27:57.670
because otherwise this attack works.
0:27:57.670,0:28:00.760
But if I have some elliptic curve
0:28:00.760,0:28:03.492
a different type of prime is weak for that curve
0:28:03.492,0:28:06.351
and I have many, many, many elliptic curves.
0:28:06.351,0:28:08.617
For each of the curves some primes are weak.
0:28:08.617,0:28:11.161
You can't exclude this method.
0:28:11.161,0:28:13.573
So the only thing you can do is just, well,
0:28:13.573,0:28:18.035
choose your prime large enough and hope, well,
0:28:18.035,0:28:20.436
make sure p-1 doesn't get it and hope
0:28:20.436,0:28:22.601
that the next elliptic curves won't get it.
0:28:22.601,0:28:24.855
So elliptic curves are good for attacking.
0:28:24.855,0:28:27.624
It's still not the fastest method out there
0:28:27.624,0:28:33.185
but it's a good method for random factoring.
0:28:33.185,0:28:35.917
It's not the best method for finding RSA factors
0:28:35.917,0:28:38.366
but if you have a number which is most likely
0:28:38.366,0:28:44.532
not too non-smooth then it's a good method.
0:28:44.532,0:28:47.234
Now this is what you do for finding small factors.
0:28:47.234,0:28:50.509
To show some method for big factors,
0:28:50.509,0:28:53.994
now here that is a big number.
0:28:53.994,0:28:58.978
But this is what I would call factorization by inspection.
0:28:58.978,0:29:07.876
What's wrong with this number?
0:29:07.876,0:29:10.129
I mean this number is not small and
0:29:10.129,0:29:13.162
it won't have small factors, I promise you this.
0:29:13.162,0:29:19.835
So it's a decimal representation, so 9 is the digit before 0.
0:29:19.835,0:29:23.246
This looks very close to the power of 10.
0:29:23.246,0:29:28.253
Actually this is very close to 10^340.
0:29:28.253,0:29:30.292
If you take the square root of it
0:29:30.292,0:29:34.150
then you almost exactly hit 10^170.
0:29:34.150,0:29:37.113
This example was cooked up as taking two primes
0:29:37.113,0:29:46.646
namely 10^170-33 and 10^170+63 and multiplying them.
0:29:46.646,0:29:48.667
So if you see lots of 0's and lots of 9's
0:29:48.667,0:29:50.934
then you know that you are very close to a certain power.
0:29:50.934,0:29:53.464
Now in real life this wouldn't be a power of 10,
0:29:53.464,0:29:54.817
it would be a power of 2.
0:29:54.817,0:29:56.179
And there actually are some examples.
0:29:56.179,0:29:58.817
There's a famous story of a letter bomber in Austria
0:29:58.817,0:30:02.650
who wanted his message to be crackable
0:30:02.650,0:30:06.047
and he sent a message, well he was some right-wing asshole
0:30:06.047,0:30:08.332
that was letter-bombing people
0:30:08.332,0:30:10.813
and then he sent an encrypted message to the police.
0:30:10.813,0:30:14.642
And the police tried to factor it and tried to decipher it
0:30:14.642,0:30:18.568
hoping it would be the list of the next victims.
0:30:18.568,0:30:20.980
In the end it was not but
0:30:20.980,0:30:24.419
this person was actually using very close to a power of 2
0:30:24.419,0:30:26.211
in order to have people crack it.
0:30:26.211,0:30:27.769
In the end it was just some propaganda,
0:30:27.769,0:30:29.462
some right-wing stuff as well.
0:30:29.462,0:30:32.783
So there are people using this for breakability.
0:30:32.783,0:30:35.499
I'm not aware of people actually using this as an accident.
0:30:35.499,0:30:40.437
But what can happen to you is not using close to the power of 2
0:30:40.437,0:30:46.200
but saying "ok I learned I shouldn't be close to the power of 2
0:30:46.200,0:30:48.848
so I take some offset and take the next prime."
0:30:48.848,0:30:51.130
Next prime just means, add 1, check is it prime,
0:30:51.130,0:30:53.798
add next one, add 2, add 3
0:30:53.798,0:30:57.082
Just check primes linearly from then on.
0:30:57.082,0:30:59.261
Now if we jump to this where we finally find our prime p
0:30:59.261,0:31:00.966
it's a good prime.
0:31:00.966,0:31:03.869
It's not anywhere close to the power of 2 because my c was large enough.
0:31:03.869,0:31:07.130
Made it, good.
0:31:07.130,0:31:08.562
Now on q.
0:31:08.562,0:31:10.776
So again call the next_prime().
0:31:10.776,0:31:14.445
So I do p+1, p+2 until I find a prime.
0:31:14.445,0:31:17.215
That means that if you take the product of the two
0:31:17.215,0:31:19.469
then this N is very close to a square.
0:31:19.469,0:31:22.920
So this N is cooked up with this method.
0:31:22.920,0:31:24.412
You don't see anything on the end,
0:31:24.412,0:31:26.678
there's nothing wrong there.
0:31:26.678,0:31:29.650
But when you compute the square root of it
0:31:29.650,0:31:33.156
it is almost an integer, it's .99999...
0:31:33.156,0:31:35.208
and then some dirt here.
0:31:35.208,0:31:39.288
So then you'll know that your primes were too small.
0:31:39.288,0:31:42.286
Sorry, not too small, too close to each other.
0:31:42.286,0:31:45.483
So if you ever take an RSA factor modulus
0:31:45.483,0:31:46.841
and you compute the square root
0:31:46.841,0:31:47.991
and you see something like this
0:31:47.991,0:31:51.885
then you know: that user did this method.
0:31:51.885,0:31:53.661
You could just say, I take the number,
0:31:53.661,0:31:56.549
subtract 1, subtract 2, just go off from the square root.
0:31:56.549,0:32:02.022
All you can check how far away is it from being a square.
0:32:02.022,0:32:03.701
So you take the seeding,
0:32:03.701,0:32:07.950
that means the closest integer counting upwards.
0:32:07.950,0:32:11.817
Just round it to this two, here's 57, is there a 56?
0:32:11.817,0:32:14.520
Compute the square of that and subtract N.
0:32:14.520,0:32:17.987
Now in this example that is 4096
0:32:17.987,0:32:22.595
which itself is a square.
0:32:22.595,0:32:26.636
So the difference of N and a^2 is a square.
0:32:26.636,0:32:30.915
Then I take this 64, take a-64, divide N,
0:32:30.915,0:32:34.413
it's an exact division, so I found one of the factors.
0:32:34.413,0:32:37.200
And then there's the other factor.
0:32:37.200,0:32:39.738
It doesn't matter whether it's a power of 2 or a power of 10.
0:32:39.738,0:32:42.569
What matters is that the numbers are too close.
0:32:42.569,0:32:44.712
Now this method actually works surprisingly well
0:32:44.712,0:32:47.436
even if we don't do next_prime().
0:32:47.436,0:32:50.332
So here's another example where I didn't do the next_prime()
0:32:50.332,0:32:52.281
but did something larger
0:32:52.281,0:32:55.114
so this is not really close to a square.
0:32:55.114,0:32:57.731
Some 9's but not too many.
0:32:57.731,0:33:00.053
What we did before we wrote it as a difference of squares
0:33:00.053,0:33:04.614
and then divided N by one of these two factors.
0:33:04.614,0:33:10.050
Now in this case here I would not start as a^2-N
0:33:10.050,0:33:12.255
but I say, well, if a^2-N is not a square
0:33:12.255,0:33:15.997
I try the next one, I do (a+1)^2, (a+2)^2
0:33:15.997,0:33:20.777
each time subtract N and check when this is a square.
0:33:20.777,0:33:24.170
Now for this example I get lucky after 2.
0:33:24.170,0:33:26.331
And this actually took me a long time to cook up this example
0:33:26.331,0:33:28.613
because I was starting at something which was
0:33:28.613,0:33:31.669
half of the bits of p were changed.
0:33:31.669,0:33:35.872
So I was actually starting with a cube quite a distance away.
0:33:35.872,0:33:38.216
Still, most of those were just giving a square.
0:33:38.216,0:33:42.070
They were not giving me anything larger than zero here.
0:33:42.070,0:33:45.598
Now for general numbers if I wouldn't do like next_number()
0:33:45.598,0:33:48.786
but just, well, random prime, another random prime.
0:33:48.786,0:33:52.687
It still works but takes a long time
0:33:52.687,0:33:55.488
because I can always write it this way.
0:33:55.488,0:33:56.786
This is a, this is b.
0:33:56.786,0:34:00.603
But then usually it would take about square root of N
0:34:00.603,0:34:03.752
which is the same size as p until I get there.
0:34:03.752,0:34:07.352
This is a nice method if it looks like lots of
0:34:07.352,0:34:13.386
9999 and otherwise do something better as Dan will tell now.
0:34:13.386,0:34:13.955
Dan: Ok.
0:34:13.955,0:34:21.508
applause
0:34:21.508,0:34:23.662
Alright, so it's bad to have p and q
0:34:23.662,0:34:24.805
too close to each other
0:34:24.805,0:34:28.640
or have a small p or to have p and q non-random.
0:34:28.640,0:34:31.010
So let's do everything right.
0:34:31.010,0:34:34.236
Let's make just independently choose some big p
0:34:34.236,0:34:37.033
between 2^511 and 2^512
0:34:37.033,0:34:38.667
and choose some big q
0:34:38.667,0:34:42.586
without being like the next prime or anywhere in the ballpark of p.
0:34:42.586,0:34:47.859
You could maybe say q has to be between 2^509 and 2^510
0:34:47.859,0:34:49.522
and then it's definitely nowhere near p.
0:34:49.522,0:34:52.539
Now you got this public key and p times q
0:34:52.539,0:34:54.473
which is a 1024 bit RSA key.
0:34:54.473,0:34:58.745
If nothing has gone wrong with your randomness generation
0:34:58.745,0:35:00.153
then what does the attacker do?
0:35:00.153,0:35:04.472
Well, we don't know anything fast.
0:35:04.472,0:35:06.840
So this is gonna be the one part of the talk
0:35:06.840,0:35:08.495
where there's these big powers of 2
0:35:08.495,0:35:09.778
for how fast things are
0:35:09.778,0:35:11.139
because they're not really fast.
0:35:11.139,0:35:14.624
You have to think about how much something like 2^80 is.
0:35:14.624,0:35:15.938
There's all these different methods,
0:35:15.938,0:35:17.944
I'll just skip down to the last two.
0:35:17.944,0:35:19.999
There's the quadratic sieve (QS),
0:35:19.999,0:35:22.114
the number field sieve (NFS).
0:35:22.114,0:35:24.109
These have been around since, well,
0:35:24.109,0:35:29.192
QS since the 80th, NFS since the early 90th.
0:35:29.192,0:35:32.327
The NFS, the general consensus is
0:35:32.327,0:35:35.197
it takes something like 2^80 operations.
0:35:35.197,0:35:36.675
Now here's something fun to try.
0:35:36.675,0:35:38.844
If somebody says 2^80 operations
0:35:38.844,0:35:41.473
and there's some cryptographer talking about the security or something,
0:35:41.473,0:35:44.244
ask them: what do you mean by an operation?
0:35:44.244,0:35:45.996
How fast is this operation?
0:35:45.996,0:35:47.480
And then most of the people saying this
0:35:47.480,0:35:48.624
will go running screaming like:
0:35:48.624,0:35:49.850
I don't know what an operation is,
0:35:49.850,0:35:51.193
don't ask me that question.
0:35:51.193,0:35:53.382
But still people will confidently tell you
0:35:53.382,0:35:56.230
that the NFS takes 2^80 operations
0:35:56.230,0:36:00.347
to break any 1024 bit RSA key
0:36:00.347,0:36:02.237
even if the user hasn't done anything wrong,
0:36:02.237,0:36:04.047
the implementor hasn't done anything wrong.
0:36:04.047,0:36:05.631
And this number, this 2^80
0:36:05.631,0:36:07.618
if you pin down what those operations are,
0:36:07.618,0:36:11.100
this is something which is doable today
0:36:11.100,0:36:15.746
by people with a big botnet or people with a big computer cluster.
0:36:15.746,0:36:18.611
I'll give some details what I mean, the size is there.
0:36:18.611,0:36:21.797
As your computers get cheaper and cheaper,
0:36:21.797,0:36:23.720
somehow chips aren't going up in clockspeed
0:36:23.720,0:36:25.149
but they're still getting cheaper.
0:36:25.149,0:36:27.636
You can get a really cheap ARM which is doing
0:36:27.636,0:36:28.904
the same kind of computations
0:36:28.904,0:36:31.501
that a pretty big CPU was doing several years ago.
0:36:31.501,0:36:33.181
And chips are gonna continue getting cheaper
0:36:33.181,0:36:36.604
for a while. These attacks against RSA 1024
0:36:36.604,0:36:39.568
are going to be common doable for more and more people.
0:36:39.568,0:36:43.119
How do these methods work?
0:36:43.119,0:36:47.210
I'll give one example and then one little Sage script.
0:36:47.210,0:36:50.169
Let's try just a really small number.
0:36:50.169,0:36:51.717
This will be with the quadratic sieve
0:36:51.717,0:36:53.727
which is not as fancy as the number field sieve
0:36:53.727,0:36:56.395
but it will give you some idea of how these methods work.
0:36:56.395,0:36:59.978
The QS starts with that Fermat method you heard about.
0:36:59.978,0:37:04.066
Remember, the idea there was to try to find a square
0:37:04.066,0:37:08.574
as a^2-N, so a was like the square root of N
0:37:08.574,0:37:10.525
or maybe the plus 1, plus 2 and so on
0:37:10.525,0:37:13.333
and you keep searching for a+1 and so on,
0:37:13.333,0:37:15.933
search for numbers that if you square them and subtract N
0:37:15.933,0:37:17.056
then you get a square.
0:37:17.056,0:37:20.708
So let's try that for 50=2759 which is not very big.
0:37:20.708,0:37:23.975
Then you try, 53 was the ceiling of the square root,
0:37:23.975,0:37:27.673
square that, subtract 2759, you get 50.
0:37:27.673,0:37:31.171
Well 25 would be a square, but 50 is not a square
0:37:31.171,0:37:33.309
it's 2 times 25, 2 times 5^2.
0:37:33.309,0:37:37.208
And then you try another number, 54^2-2759
0:37:37.208,0:37:39.911
umm 157, that's too complicated
0:37:39.911,0:37:42.707
I don't remember that being a square so let's just skip it.
0:37:42.707,0:37:45.687
And then you keep going like this, 266, 377,
0:37:45.687,0:37:48.223
490. I remember 49 is a square
0:37:48.223,0:37:50.354
but that doesn't mean that 490 is a square
0:37:50.354,0:37:52.372
cause 10, 2 times 5, that's not a square.
0:37:52.372,0:37:55.871
And 605, you can try...
0:37:55.871,0:37:58.640
We remember how to take out the 5
0:37:58.640,0:38:01.275
and then 60... 605 divided by 5.
0:38:01.275,0:38:03.588
You can figure out it's 5 times 11^2.
0:38:03.588,0:38:05.158
It's almost a square.
0:38:05.158,0:38:06.774
You keep doing this for a while
0:38:06.774,0:38:10.534
and it seems like Fermat's method is actually working pretty badly.
0:38:10.534,0:38:13.754
It does work eventually but it's taking quite a while.
0:38:13.754,0:38:16.671
Here's what the quadratic sieve does.
0:38:16.671,0:38:19.527
From the same computation you look at these
0:38:19.527,0:38:21.585
numbers that were not exactly squares
0:38:21.585,0:38:23.634
50 and 490 and 605
0:38:23.634,0:38:27.056
and you notice if you multiple them together
0:38:27.056,0:38:28.472
50 was 2 times 5^2
0:38:28.472,0:38:30.756
490 is 2 times 5 times 7^2
0:38:30.756,0:38:33.068
605 is 5 times 11^2
0:38:33.068,0:38:34.585
and then you multiply those together
0:38:34.585,0:38:38.524
you get 2^2 and 5^4 and 7^2 and 11^2
0:38:38.524,0:38:41.123
and that's exactly the square of something.
0:38:41.123,0:38:44.592
The square of 2^1 times 5^2 times 7 times 11.
0:38:44.592,0:38:48.175
Then the quadratic sieve is taking the square root of that number
0:38:48.175,0:38:52.553
and subtracting that from the product of the fifty-somethings
0:38:52.553,0:38:53.953
that were on the left side
0:38:53.953,0:38:56.826
and taking the greatest common divisor of that with N
0:38:56.826,0:38:59.787
and amazingly that produces a factor of N.
0:38:59.787,0:39:02.289
And it's not just some random accident that
0:39:02.289,0:39:04.370
if you had tried the greatest common divisor of
0:39:04.370,0:39:06.085
"write down some hocuspocus then"
0:39:06.085,0:39:08.803
"you eventually get 31 coming out"
0:39:08.803,0:39:10.576
"every 31 times you write down a random number"
0:39:10.576,0:39:13.020
"it will have this 31 dividing it"
0:39:13.020,0:39:15.347
No, you can try this for bigger and bigger numbers
0:39:15.347,0:39:17.038
and actually this keeps working.
0:39:17.038,0:39:18.663
As soon as you find a square product
0:39:18.663,0:39:23.212
then half of those square products are gonna factor N.
0:39:23.212,0:39:25.365
So that's how the quadratic sieve works.
0:39:25.365,0:39:28.915
Here's a more systematic script with a much bigger number
0:39:28.915,0:39:31.300
which if it were just hocuspocus this wouldn't work.
0:39:31.300,0:39:32.616
But this does work.
0:39:32.616,0:39:36.834
So there's a number from my computer's random number generator
0:39:36.834,0:39:41.661
31415926... laughter
0:39:41.661,0:39:45.099
You try writing down a bunch of squares...
0:39:45.099,0:39:48.285
maybe I should get my computer's random number generator checked.
0:39:48.285,0:39:50.636
It is this two year old laptop you heard about before
0:39:50.636,0:39:52.202
so I'm not sure it's working properly.
0:39:52.202,0:39:55.599
You take a bunch of a^2-N and then
0:39:55.599,0:39:57.414
let's make a list of those called X.
0:39:57.414,0:40:00.752
You can see the range, there is up to 500.000 numbers.
0:40:00.752,0:40:02.948
It's a pretty big list we're talking about.
0:40:02.948,0:40:05.749
And then search through those elements,
0:40:05.749,0:40:08.538
search through those a^2-N, those differences,
0:40:08.538,0:40:09.504
the elements in this list,
0:40:09.504,0:40:12.985
to see which ones have easy factorizations.
0:40:12.985,0:40:15.445
Now a lot of numbers like that 157
0:40:15.445,0:40:16.970
I know what the factorization of that is
0:40:16.970,0:40:19.846
but there is function which you can find on our website
0:40:19.846,0:40:22.696
easyfactorizations() which looks through the list X
0:40:22.696,0:40:25.251
and if it finds numbers that are easy to factor
0:40:25.251,0:40:27.935
then it quickly writes down the factorizations
0:40:27.935,0:40:30.482
using the small prime techniques you heard about.
0:40:30.482,0:40:33.169
And then makes a list of those easy factorizations,
0:40:33.169,0:40:34.396
puts those into F.
0:40:34.396,0:40:37.600
And then there's a big chunk which I won't try to explain
0:40:37.600,0:40:39.986
which is called linear algebra mod 2.
0:40:39.986,0:40:42.586
And that somehow figures out a square
0:40:42.586,0:40:45.697
that comes from the factorizations,
0:40:45.697,0:40:48.363
somehow looks through this list of factorizations
0:40:48.363,0:40:50.803
and magically applies some linear algebra stuff.
0:40:50.803,0:40:52.612
Well ok, I won't go through that.
0:40:52.612,0:40:54.071
It's something doable
0:40:54.071,0:40:58.604
and this is using some standard Sage functions to do it pretty easily.
0:40:58.604,0:41:01.738
There's all sorts of things that
0:41:01.738,0:41:03.664
in the interest of time I won't get into
0:41:03.664,0:41:05.867
like how this easyfactorizations() works.
0:41:05.867,0:41:07.603
And this is something where people write papers
0:41:07.603,0:41:09.801
and papers and papers talking about trying to make this
0:41:09.801,0:41:11.603
go as quickly as possible.
0:41:11.603,0:41:14.737
I do want to emphasize one fact about these methods:
0:41:14.737,0:41:17.484
the bold-faced **very small memory requirements**.
0:41:17.484,0:41:20.038
You can use the elliptic curve method
0:41:20.038,0:41:21.531
that you heard about before
0:41:21.531,0:41:24.882
you can use that to figure out easy factorizations
0:41:24.882,0:41:27.063
using very little memory.
0:41:27.063,0:41:28.503
That means you can build a chip
0:41:28.503,0:41:30.471
like a FPGA, a special purpose ASIC,
0:41:30.471,0:41:33.285
you can have thousands of little ECM units
0:41:33.285,0:41:35.066
running in parallel, so you can really
0:41:35.066,0:41:38.153
massively parallelize this easyfactorizations() function.
0:41:38.153,0:41:40.187
So if people tell you "oh you need tons of
0:41:40.187,0:41:43.386
memory for sieving to find easy factorizations"
0:41:43.386,0:41:45.868
then you should tell them "no no, that's not true"
0:41:45.868,0:41:47.979
"ECM you can do with very little memory
0:41:47.979,0:41:50.351
running massively parallelizable"
0:41:50.351,0:41:52.802
And then there's other things which,
0:41:52.802,0:41:54.490
I suppose if we had another hour
0:41:54.490,0:41:56.432
then we could get into all these details
0:41:56.432,0:41:59.117
of like how the number field sieve goes beyond this.
0:41:59.117,0:42:01.970
It's a similar kind of method but gets more complicated.
0:42:01.970,0:42:05.181
I do want to emphasize again thing in bold-face here
0:42:05.181,0:42:06.946
which is that if somebody tells you
0:42:06.946,0:42:10.438
"oh 2^80 operations, the attacker wouldn't do
0:42:10.438,0:42:13.796
that because the target isn't worth that much to them"
0:42:13.796,0:42:18.284
Well, **Batch NFS** is a way to take a bunch of N's
0:42:18.284,0:42:20.566
and like the batch techniques before
0:42:20.566,0:42:23.516
there's some magic ways to find redundancies
0:42:23.516,0:42:25.884
between doing factorizations of those N's.
0:42:25.884,0:42:26.937
So somebody tells you:
0:42:26.937,0:42:30.000
"oh 2^80, that isn't worth it for the attacker."
0:42:30.000,0:42:33.296
Well, **Batch NFS** reduces the cost quite a bit
0:42:33.296,0:42:34.821
for factoring each key.
0:42:34.821,0:42:36.846
If the attacker has a lot of keys to target
0:42:36.846,0:42:39.852
they can factor all of them in not much more cost
0:42:39.852,0:42:41.396
than just factoring one.
0:42:41.396,0:42:43.734
So don't believe people who take an economic view
0:42:43.734,0:42:46.987
how much the number field sieve is worth.
0:42:46.987,0:42:49.416
Alright, what does this mean?
0:42:49.416,0:42:52.118
Can the attacker actually do the NFS
0:42:52.118,0:42:54.134
once they've put in all these optimizations?
0:42:54.134,0:42:59.064
Well if you look carefully at the analyses of the NFS
0:42:59.064,0:43:02.416
there are only about 2^70 differences.
0:43:02.416,0:43:04.149
Things like these a^2-N's.
0:43:04.149,0:43:08.862
If you look through 2^70 of those and scan them properly,
0:43:08.862,0:43:10.773
find easy factorizations, do linear algebra,
0:43:10.773,0:43:13.669
you will factor any 1024 bit key.
0:43:13.669,0:43:16.314
And 2^70, how big is that number?
0:43:16.314,0:43:20.820
It's actually small enough that you can do this computation on
0:43:20.820,0:43:22.469
your favorite botnet.
0:43:22.469,0:43:25.749
Say, Conficker is shrinking, still
0:43:25.749,0:43:27.947
and it will probably be wiped out at some point
0:43:27.947,0:43:29.488
but it's an example of what you can do
0:43:29.488,0:43:32.468
using any of these security problems that are deployed
0:43:32.468,0:43:33.632
on enough machines.
0:43:33.632,0:43:35.398
You can break into a bunch of machines,
0:43:35.398,0:43:36.521
millions of machines.
0:43:36.521,0:43:40.296
Conficker estimates vary between 7 million and 12 million machines.
0:43:40.296,0:43:44.320
So use your, say, 2^23, that's 8 million machines.
0:43:44.320,0:43:46.519
Count how many seconds there are in a year
0:43:46.519,0:43:48.553
and then, say, ok that means we have to do
0:43:48.553,0:43:52.070
2^22 differences per second machine.
0:43:52.070,0:43:56.902
And that is slower than state of the art factorization code already runs.
0:43:56.902,0:43:59.231
So it's actually a pretty easy computation.
0:43:59.231,0:44:02.334
You don't even need that many millions of computers to do this.
0:44:02.334,0:44:04.470
If people tell you "wait a minute"
0:44:04.470,0:44:06.120
"you can't just use a bunch of machines"
0:44:06.120,0:44:08.073
"there's all this work for linear algebra"
0:44:08.073,0:44:09.966
Then I'll just skip this.
0:44:09.966,0:44:12.882
Be aware that linear algebra is not as much of a problem
0:44:12.882,0:44:15.334
as some people would say.
0:44:15.334,0:44:18.872
If you use a big botnet there's one little problem.
0:44:18.872,0:44:21.053
For an attacker who breaks into a bunch of machines
0:44:21.053,0:44:23.903
and then starts spinning them up, heating up the CPU,
0:44:23.903,0:44:25.921
the fan starts running, the user starts wondering
0:44:25.921,0:44:28.017
"why is my machine running so slowly?"
0:44:28.017,0:44:30.133
Maybe only use half of the CPU
0:44:30.133,0:44:33.570
but still it has still got a good chance of getting you detected
0:44:33.570,0:44:35.202
and kicked off the machines.
0:44:35.202,0:44:37.266
Users are gonna shut you down if you're doing a
0:44:37.266,0:44:39.463
say, year of computation on your botnet.
0:44:39.463,0:44:42.883
So what do you do instead?
0:44:42.883,0:44:45.021
Well you build a little computer cluster.
0:44:45.021,0:44:49.354
Now yesterday we heard about a big building
0:44:49.354,0:44:51.164
in Bluffdale, Utah, that apparently
0:44:51.164,0:44:54.832
is going to store all of the data collected by
0:44:54.832,0:44:57.171
the NSA for the next hundred years
0:44:57.171,0:44:58.854
or maybe forever.
0:44:58.854,0:45:00.914
But something that I didn't hear emphasized yesterday
0:45:00.914,0:45:04.916
was that this building also has a new power substation
0:45:04.916,0:45:08.150
which is drawing 65 megawatts,
0:45:08.150,0:45:10.155
well generating 65 megawatts.
0:45:10.155,0:45:13.999
Now what are we doing with 65 megawatts?
0:45:13.999,0:45:15.308
Is that the kind of power that you need to
0:45:15.308,0:45:17.490
store a bunch of data on tapes?
0:45:17.490,0:45:21.175
No, that's the kind of power you need to do computations.
0:45:21.175,0:45:24.725
So what is NSA doing with 2^26 watts?
0:45:24.725,0:45:26.955
Or maybe even with more watts.
0:45:26.955,0:45:30.909
You shouldn't actually think that 2^26 is such a big number.
0:45:30.909,0:45:33.526
Here's a little table with, well,
0:45:33.526,0:45:36.573
2^57 admittedly is pushing it a little bit.
0:45:36.573,0:45:40.620
This is the total power that the earth gets from the sun
0:45:40.620,0:45:43.207
of which about half gets to the earth's surface.
0:45:43.207,0:45:47.610
2^44 is the amount the human is using right now.
0:45:47.610,0:45:50.888
Varies a little bit but I mean this is the average power
0:45:50.888,0:45:53.427
including cars and such.
0:45:53.427,0:45:56.446
If you have the botnet that breaks into millions of machines
0:45:56.446,0:45:59.944
and runs in flat-out that's about 2^30 watts.
0:45:59.944,0:46:01.610
And actually if you think about it, wait a minute,
0:46:01.610,0:46:02.575
there's a lot of machines out there.
0:46:02.575,0:46:07.270
This means the 2^26 is not actually that much power.
0:46:07.270,0:46:11.456
It's just one dinky little Bluffdale 65 MW computer center
0:46:11.456,0:46:14.594
and actually if you're a government agency
0:46:14.594,0:46:16.138
you probably have more than one of those.
0:46:16.138,0:46:19.273
As you heard yesterday in the context of data storage
0:46:19.273,0:46:21.292
but again, it's not just data storage.
0:46:21.292,0:46:26.492
You don't make a 65 MW power station to just store data.
0:46:26.492,0:46:30.207
Ok, if you just take standard graphics processing units
0:46:30.207,0:46:33.223
and run 2^26 watts to those that will do about
0:46:33.223,0:46:36.446
2^84 floating point multiplications per year.
0:46:36.446,0:46:39.827
You have to figure out, ok, what are exactly the operations involved here.
0:46:39.827,0:46:44.829
This should be enough to break 1024 bit RSA
0:46:44.829,0:46:48.088
and if NSA is not just buying GPU's off-the-shelves
0:46:48.088,0:46:50.226
but really tuning chips that they build
0:46:50.226,0:46:57.409
using IBM to manufacture their chips at the moment.
0:46:57.409,0:47:01.626
Apparently it wasn't cost-effective for NSA to manufacture their own chips
0:47:01.626,0:47:04.475
so in 2005 they started subcontracting for IBM
0:47:04.475,0:47:08.153
but presumably through that fabrication IBM is building chips
0:47:08.153,0:47:11.073
that NSA wants and those should be about 10 times faster
0:47:11.073,0:47:13.025
than what GPU's can do.
0:47:13.025,0:47:18.362
So it should be possible with this little 65 MW computer cluster
0:47:18.362,0:47:21.476
to factor at least 1, maybe even 10,
0:47:21.476,0:47:23.478
and then with batch NFS maybe even more
0:47:23.478,0:47:30.010
1024 bit RSA keys in a year.
0:47:30.010,0:47:38.229
applause
0:47:38.229,0:47:41.130
Nadia: So to conclude things up here I want to
0:47:41.130,0:47:43.290
explain how you can use
0:47:43.290,0:47:46.057
from the comfort of your very own home
0:47:46.057,0:47:52.242
a distributed, a massive scale distributed cloud computing service
0:47:52.242,0:47:57.142
to calculate private keys on your own.
0:47:57.142,0:48:00.313
So many of you may be familiar with Amazon EC2
0:48:00.313,0:48:04.635
but it turns out Google also has a large amount of computing infrastructure
0:48:04.635,0:48:07.339
and they actually have a very convenient web interface
0:48:07.339,0:48:08.446
to access it.
0:48:08.446,0:48:12.978
laugther
0:48:12.978,0:48:21.677
applause
0:48:21.677,0:48:23.961
It's amazing what you can find on the Internet.
0:48:23.961,0:48:26.143
laughter
0:48:26.143,0:48:29.393
So ok, here's some keys except, you know,
0:48:29.393,0:48:30.578
if you look at these carefully
0:48:30.578,0:48:33.277
not all of these are sort of obviously RSA keys.
0:48:33.277,0:48:35.204
There are some problems here
0:48:35.204,0:48:40.704
like the second to last one on the bottom.
0:48:40.704,0:48:41.688
So...
0:48:41.688,0:48:43.338
laughter
0:48:43.338,0:48:49.820
After doing this search we found this key and
0:48:49.820,0:48:55.279
someone seems to have pasted a private key into pastebin
0:48:55.279,0:49:00.491
and someone came along and interrupted part of it
0:49:00.491,0:49:04.759
with some profanity.
0:49:04.759,0:49:07.387
This is an interesting problem.
0:49:07.387,0:49:08.812
What do we do with this key?
0:49:08.812,0:49:12.310
Clearly this is an interesting key, we must have this key.
0:49:12.310,0:49:14.204
laughter
0:49:14.204,0:49:24.847
applause
0:49:24.847,0:49:26.046
Step 1... yeah?
0:49:26.046,0:49:29.572
Male: Did you see the easter egg in row 1?
0:49:29.572,0:49:31.204
Nadia: Well, we'll discuss this.
0:49:31.204,0:49:34.308
Angel: Questions will be later, after the talk please.
0:49:34.308,0:49:39.368
Nadia: So yeah, we've removed the obvious problems here.
0:49:39.368,0:49:41.636
But there's still an issue which is that
0:49:41.636,0:49:43.844
we're missing hundreds and hundreds of bits of key.
0:49:43.844,0:49:45.940
We can't brute force-this.
0:49:45.940,0:49:47.658
What are we gonna do?
0:49:47.658,0:49:51.327
Well, let's look at what RSA keys actually are.
0:49:51.327,0:49:53.509
I introduced RSA keys and I was, like,
0:49:53.509,0:49:56.786
it's a d and d has like a thosand of bits.
0:49:56.786,0:50:00.671
And if we don't know p and q
0:50:00.671,0:50:02.608
how are we gonna get this d?
0:50:02.608,0:50:07.923
Here is the storage format for an RSA key.
0:50:07.923,0:50:11.258
Public key is the modulus N, the public exponent e.
0:50:11.258,0:50:14.710
Ok, private key has version, N, e,
0:50:14.710,0:50:16.536
d the decryption exponent,
0:50:16.536,0:50:20.273
p, q, d mod (p-1), d mod (q-1),
0:50:20.273,0:50:22.975
the inverse of q mod p,
0:50:22.975,0:50:25.305
this is all useful information if you want to
0:50:25.305,0:50:26.960
speed up your calculation and not have to
0:50:26.960,0:50:29.405
compute everything every time you use your key.
0:50:29.405,0:50:32.493
Okay.
0:50:32.493,0:50:35.952
What did we see here then?
0:50:35.952,0:50:40.975
Here is the key broken up into all of the difference pieces.
0:50:40.975,0:50:44.804
So we have: the red bit is approximately where N is sitting.
0:50:44.804,0:50:46.754
We've got e in orange.
0:50:46.754,0:50:50.569
We've got part of d in yellow.
0:50:50.569,0:50:52.372
We seem to be missing p
0:50:52.372,0:50:54.656
but we've got part of q
0:50:54.656,0:50:56.791
and then here's d mod p-1
0:50:56.791,0:51:02.408
and d mod q-1 and q inverse mod p.
0:51:02.408,0:51:07.712
There's also a little problem before the red here
0:51:07.712,0:51:08.436
laughter
0:51:08.436,0:51:10.504
in addition.
0:51:10.504,0:51:13.561
You might call this the private part of the private key.
0:51:13.561,0:51:15.936
laughter
0:51:15.936,0:51:21.391
applause
0:51:21.391,0:51:23.859
Fortunately this is just sitting in an encoding of the length
0:51:23.859,0:51:30.275
and the version. laughter
0:51:30.275,0:51:33.151
Alright, so what do we do with this information?
0:51:33.151,0:51:36.673
Basically given any part of this private key
0:51:36.673,0:51:38.910
we can easily compute all of the other parts.
0:51:38.910,0:51:40.257
Simple formulas:
0:51:40.257,0:51:48.938
q will be 2^(e times d mod p-1) - 1 and GCD that with N
0:51:48.938,0:51:51.872
then we get q and then
0:51:51.872,0:51:54.455
we can figure out what p was by dividing
0:51:54.455,0:51:56.393
and figure out what d is
0:51:56.393,0:51:57.953
and so on and so forth.
0:51:57.953,0:52:02.460
You can do this even if you have a part of a single value
0:52:02.460,0:52:04.109
from the private key.
0:52:04.109,0:52:08.195
You can still compute the rest of the private key from that.
0:52:08.195,0:52:12.338
We don't have time to get into this but it's online.
0:52:12.338,0:52:16.557
You've seen a little bit of math.
0:52:16.557,0:52:18.060
Single formulas, typed into Sage.
0:52:18.060,0:52:22.591
We can reconstruct the private key here.
0:52:22.591,0:52:29.487
applause
0:52:29.487,0:52:32.941
So, what are the lessons that we can learn
0:52:32.941,0:52:35.554
from everything that we've done today.
0:52:35.554,0:52:38.945
The first one is: stop using 1024 bit RSA
0:52:38.945,0:52:41.010
if you haven't already.
0:52:41.010,0:52:43.386
I have looked at the keys that are out there on the Internet
0:52:43.386,0:52:45.941
and you are still using 1024 bit RSA.
0:52:45.941,0:52:48.463
laughter
0:52:48.463,0:52:51.790
Second: make sure your primes are big enough.
0:52:51.790,0:52:54.227
Don't try to be clever about how you're generating them.
0:52:54.227,0:52:58.025
Make sure your primes are random.
0:52:58.025,0:52:59.303
You guys have some problems
0:52:59.303,0:53:01.394
especially in hardware.
0:53:01.394,0:53:03.621
Also...
0:53:03.621,0:53:04.921
laughter
0:53:04.921,0:53:09.091
"FUCK A DUCK" is not good crypto.
0:53:09.091,0:53:11.545
Pastebin is not a secure cloud store.
0:53:11.545,0:53:13.874
laughter
0:53:13.874,0:53:20.289
applause
0:53:20.289,0:53:22.657
And you probably shouldn't put your private key
0:53:22.657,0:53:24.778
in a secure cloud store anyway.
0:53:24.778,0:53:25.862
laughter
0:53:25.862,0:53:31.396
applause
0:53:31.396,0:53:32.577
And lastly...
0:53:32.577,0:53:34.211
laughter
0:53:34.211,0:53:40.427
applause
0:53:40.427,0:53:41.625
Thank you.
0:53:41.625,0:53:43.894
applause
0:53:43.894,0:53:44.804
Angel: Give them an applause.
0:53:44.804,0:54:05.937
applause
0:54:05.937,0:54:11.379
Angel: So questions will be taken at the numbered microphones.
0:54:11.379,0:54:14.962
So anyone who has a question please line up at a microphone
0:54:14.962,0:54:16.959
and we will start by the signal angel
0:54:16.959,0:54:19.528
who has a question from IRC.
0:54:19.528,0:54:23.628
*Signal Angel*: IRC has a lot of punch lines involving penisses
0:54:23.628,0:54:26.040
and some questions.
0:54:26.040,0:54:31.445
First question is: would you recommend more using
0:54:31.445,0:54:37.640
elliptic curves or RSA with multiple primes...
0:54:37.640,0:54:41.659
using proper primes.
0:54:41.659,0:54:47.192
Dan: There are ways to do elliptic curves wrong,
0:54:47.192,0:54:48.854
there are ways to do RSA wrong.
0:54:48.854,0:54:53.806
In general if you've got a particular performance requirement
0:54:53.806,0:54:55.625
then elliptic curves are gonna meet that
0:54:55.625,0:54:57.861
with a much higher security level
0:54:57.861,0:55:00.208
than RSA is gonna meet that.
0:55:00.208,0:55:03.005
Of course you can do RSA securely,
0:55:03.005,0:55:04.928
you can do elliptic curves securely
0:55:04.928,0:55:07.153
but if you've got some performance limitation
0:55:07.153,0:55:08.613
as many applications do
0:55:08.613,0:55:13.043
then elliptic curves tend to be a better choice.
0:55:13.043,0:55:14.429
Nadia: I also want to clarify
0:55:14.429,0:55:17.826
the other most commonly used public-key cryptosystem
0:55:17.826,0:55:19.979
is DSA, the Digital Signature Algorithm.
0:55:19.979,0:55:21.426
There's also elliptic curve DSA
0:55:21.426,0:55:23.291
which is probably what you're thinking about
0:55:23.291,0:55:25.570
with elliptic curve cryptography.
0:55:25.570,0:55:31.207
DSA is in my opinion actually much worse
0:55:31.207,0:55:34.395
than RSA in terms of randomness failures.
0:55:34.395,0:55:36.246
In the paper that Dan referenced
0:55:36.246,0:55:39.377
we also talk about randomness failures in DSA
0:55:39.377,0:55:40.993
and it's horrifying.
0:55:40.993,0:55:44.521
If you ever ever screw up randomness in a single signature
0:55:44.521,0:55:48.354
your entire public key is lost.
0:55:48.354,0:55:50.761
Angel: We will take the next question from microphone nr 4
0:55:50.761,0:55:53.072
to the right of the seat.
0:55:53.072,0:55:53.824
Jake: Hey guys.
0:55:53.824,0:55:56.277
Sorry I didn't mention the computing power of Bluffdale.
0:55:56.277,0:56:02.072
That said, when we want to transition from 1024 bit RSA
0:56:02.072,0:56:05.339
to something else, what do you think is a good idea?
0:56:05.339,0:56:10.028
So for example, in Tor we do use 1024 bit RSA for TLS.
0:56:10.028,0:56:12.528
laughter Yeah I know, right?
0:56:12.528,0:56:15.592
So the thing is we're working on changing these things
0:56:15.592,0:56:17.079
but what is...
0:56:17.079,0:56:19.766
I mean Zooko has this 100 year crypto project.
0:56:19.766,0:56:21.806
What is the real thing that we should,
0:56:21.806,0:56:23.910
like if we could switch tomorrow to something,
0:56:23.910,0:56:26.595
what's the useful that we can live with for 5 or 10 years?
0:56:26.595,0:56:28.795
Is 2048 going to be useful?
0:56:28.795,0:56:31.179
Is 4096 where we should go tomorrow?
0:56:31.179,0:56:32.780
Cause there is a trade-off between
0:56:32.780,0:56:34.588
performance and forward secrecy
0:56:34.588,0:56:37.962
and there are a lot of things to think about.
0:56:37.962,0:56:40.078
Tanja: If you tell me 5 to 10 years
0:56:40.078,0:56:42.360
I would be still comfortable with elliptic curves.
0:56:42.360,0:56:44.773
If you talk 100 years
0:56:44.773,0:56:47.691
and then there's all these worse quantum computers around
0:56:47.691,0:56:51.143
then I would go for something which is worse in performance
0:56:51.143,0:56:52.829
like code-based cryptography
0:56:52.829,0:56:54.043
or for signatures hashing
0:56:54.043,0:56:55.738
but if you'd say in 5 to 10 years
0:56:55.738,0:56:57.411
I'm very comfortable recommending to you
0:56:57.411,0:56:59.941
elliptic curves with 256 bits.
0:56:59.941,0:57:02.160
Jake: Ok, thanks.
0:57:02.160,0:57:04.878
Angel: Signal angel?
0:57:04.878,0:57:07.190
*Signal Angel*: Yeah, another question from IRC.
0:57:07.190,0:57:11.028
If you avoid all the easy primes
0:57:11.028,0:57:14.227
does this shrink the search space such that
0:57:14.227,0:57:17.862
it becomes easier to crack the remaining keys?
0:57:17.862,0:57:20.161
Or asked another way:
0:57:20.161,0:57:25.146
can you quantify the ratio of easy primes versus hard primes?
0:57:25.146,0:57:26.287
Dan: Yeah, that's a good question.
0:57:26.287,0:57:28.573
The answer is: yes, it can be quantified
0:57:28.573,0:57:32.829
and once you're up to about to the 1024 bit RSA key level
0:57:32.829,0:57:35.306
then there's very very few weak primes.
0:57:35.306,0:57:37.078
So if you have a random number,
0:57:37.078,0:57:38.593
you just generate a random prime
0:57:38.593,0:57:40.238
then your chance of bumping into something weak
0:57:40.238,0:57:42.793
is so small that you just don't have to worry about it.
0:57:42.793,0:57:44.862
It is one of those much less frequent than
0:57:44.862,0:57:46.580
being hit by a lightning kind of events.
0:57:46.580,0:57:49.210
It is something that have been quantified
0:57:49.210,0:57:52.244
and it is an issue for smaller RSA keys
0:57:52.244,0:57:54.872
back when people were using, say, 512 bit RSA
0:57:54.872,0:57:56.754
then it was actually a noticable issue.
0:57:56.754,0:57:59.123
But once you're at 1024 or above then
0:57:59.123,0:58:00.694
the issue is basically gone.
0:58:00.694,0:58:03.831
It's something where you can and you should
0:58:03.831,0:58:06.097
look at exactly what the chance is of bumping into a weak
0:58:06.097,0:58:08.828
but it's not reallistically something
0:58:08.828,0:58:13.259
you're gonna encounter with 1024 bit RSA.
0:58:13.259,0:58:16.542
Angel: We're gonna take the next question from nr 1.
0:58:16.542,0:58:19.313
Just one notice: we don't have a lot of time left
0:58:19.313,0:58:23.546
so even though there is a talk in one hour
0:58:23.546,0:58:25.078
just ask.
0:58:25.078,0:58:28.165
Male: There is a devastating attack that can break AES,
0:58:28.165,0:58:31.872
SHA2, most probably SHA3 and DES.
0:58:31.872,0:58:33.758
It's called a biclique attack.
0:58:33.758,0:58:35.664
Are you concerned that it might break RSA
0:58:35.664,0:58:37.281
with any key size and even ECC
0:58:37.281,0:58:39.988
and even any public-key crypto?
0:58:39.988,0:58:41.660
Dan: No.
0:58:41.660,0:58:43.594
laughter
0:58:43.594,0:58:46.675
Bicliques are something which are certainly interesting
0:58:46.675,0:58:49.590
and I think about it as a way to speed up brute force
0:58:49.590,0:58:52.508
and it speeds up brute force by a noticable factor,
0:58:52.508,0:58:55.148
often makes attacks, say, twice as fast.
0:58:55.148,0:58:58.523
But with public-key crypto we have attacks which are
0:58:58.523,0:59:00.760
much much faster than brute force to begin with
0:59:00.760,0:59:04.209
so that kind of speed-up of brute force won't be competitive
0:59:04.209,0:59:08.226
with things like the number field sieve.
0:59:08.226,0:59:12.964
Angel: Next is from number 3.
0:59:12.964,0:59:16.824
Male: Me not coming from the dark side of mathematics,
0:59:16.824,0:59:20.923
how do I know that my random number generators are fine
0:59:20.923,0:59:23.661
for generating keys?
0:59:23.661,0:59:26.223
laughter
0:59:26.223,0:59:31.354
Nadia: Seed them.
0:59:31.354,0:59:37.042
Basically the things that are out there if you're using
0:59:37.042,0:59:40.613
a standard random number generator like in Linux,
0:59:40.613,0:59:44.441
actually Linux has added patches to the kernel
0:59:44.441,0:59:48.022
to fix some of the problems that we've found.
0:59:48.022,0:59:50.431
If you want to know that you're keys are good:
0:59:50.431,0:59:54.694
if you generate them on a general purpose computer
0:59:54.694,0:59:56.600
after it has been running for a long time
0:59:56.600,0:59:59.559
you're probably fine.
0:59:59.559,1:00:04.793
I would not generate very important or long-term keys
1:00:04.793,1:00:09.707
on low-power hardware devices where you can't tell...
1:00:09.707,1:00:11.910
laughter
1:00:11.910,1:00:15.758
Male: Thank you.
1:00:15.758,1:00:18.341
Angel: Next question will be taken from number 4.
1:00:18.341,1:00:19.394
Male: Hi.
1:00:19.394,1:00:23.690
It's just a remark, not really a question.
1:00:23.690,1:00:25.407
I work in high performance computing.
1:00:25.407,1:00:26.910
I was at the supercomputing conference
1:00:26.910,1:00:29.376
a couple of weeks ago.
1:00:29.376,1:00:31.931
They're presenting large-scale installations
1:00:31.931,1:00:35.863
and dealing with problems of power.
1:00:35.863,1:00:38.927
They want to build petaflops and exaflops systems
1:00:38.927,1:00:42.262
that are in a range of 20 MW.
1:00:42.262,1:00:47.316
So I'm wondering what NSA is doing with 53 megawatts
1:00:47.316,1:00:53.815
in a data center because no storage takes that much capacity.
1:00:53.815,1:00:58.976
I think it's really something we should think about.
1:00:58.976,1:01:03.266
So thanks, nice talk.
1:01:03.266,1:01:04.883
Angel: Next question from...
1:01:04.883,1:01:06.961
does the signal angel have one?
1:01:06.961,1:01:08.679
*Signal angel*: Ok, another question is:
1:01:08.679,1:01:11.845
How big do you estimate the effort to
1:01:11.845,1:01:17.308
exchange keys and certificates involving 1024 bit primes
1:01:17.308,1:01:22.782
let's say worldwide at the moment?
1:01:22.782,1:01:24.502
Dan: I wasn't getting what the question was.
1:01:24.502,1:01:25.642
Could you repeat the question?
1:01:25.642,1:01:27.344
*Signal angel*: I don't really understand it myself.
1:01:27.344,1:01:28.875
laughter
1:01:28.875,1:01:33.181
Angel: Ok, let's switch to 1.
1:01:33.181,1:01:37.009
Male: My question goes back to the idea
1:01:37.009,1:01:42.876
of how can we know how good the private keys are?
1:01:42.876,1:01:50.227
Is there something like a keygen evaluation tool?
1:01:50.227,1:01:56.567
Or can the quality of a key generator be estimated
1:01:56.567,1:01:58.544
from a sample of keys?
1:01:58.544,1:02:01.827
Like a public tool that you can throw keys on
1:02:01.827,1:02:05.565
and it will tell me a bias of my keygen?
1:02:05.565,1:02:09.091
Nadia: If you go to factorable.net
1:02:09.091,1:02:12.574
my co-authors and I have made a key check tool
1:02:12.574,1:02:14.728
which will tell you if your key is in the
1:02:14.728,1:02:17.528
list of keys that we scraped off the Internet
1:02:17.528,1:02:19.296
that we were able to compromise.
1:02:19.296,1:02:21.249
That's a first check.
1:02:21.249,1:02:24.449
It's not a positive verification that your key is good
1:02:24.449,1:02:26.544
but it will tell you if it is very bad.
1:02:26.544,1:02:28.863
laughter
1:02:28.863,1:02:36.811
If you want to check your generator,
1:02:36.811,1:02:40.827
if you're worried about the factorization problems
1:02:40.827,1:02:43.242
we have fast code on the website in C
1:02:43.242,1:02:45.715
that will do the GCD calculation for you
1:02:45.715,1:02:50.201
if you just dump a gigantic collection of keys at it.
1:02:50.201,1:02:54.375
If you might have other problems like you are
1:02:54.375,1:02:56.993
not sure whether it's factorable like those don't come up
1:02:56.993,1:03:00.497
the experiment that you might want to do is
1:03:00.497,1:03:04.211
sort of restart the process in realistic conditions
1:03:04.211,1:03:06.179
and generate a gigantic quantity of keys
1:03:06.179,1:03:08.981
and just check for repeats.
1:03:08.981,1:03:12.378
The sort of obvious stress testings.
1:03:12.378,1:03:16.375
If you have a device you want to boot it up in realistic conditions
1:03:16.375,1:03:18.048
and then check them
1:03:18.048,1:03:19.798
and this is painstaking work
1:03:19.798,1:03:22.327
but I don't think there's any realistic way
1:03:22.327,1:03:25.546
of doing better than that.
1:03:25.546,1:03:28.396
Or inspect the code, the obvious thing.
1:03:28.396,1:03:30.924
Make sure you're seeding anything.
1:03:30.924,1:03:33.463
Male: Ok, thank you.
1:03:33.463,1:03:35.993
Angel: Next question will be from mic number 4.
1:03:35.993,1:03:37.045
Male: Hi.
1:03:37.045,1:03:40.745
If I got your numbers correctly so you said something like
1:03:40.745,1:03:45.450
it would take 8 million machines a year to factor 1024
1:03:45.450,1:03:48.976
so I was wondering what if I would like to factor like
1:03:48.976,1:03:52.243
800 bits or 900 bits which is,
1:03:52.243,1:03:55.149
as I understand, way easier than 1024
1:03:55.149,1:03:57.042
but was never done publicly.
1:03:57.042,1:04:00.259
So are you saying that would take like a day or something?
1:04:00.259,1:04:02.581
Dan: It depends on the size of cluster you have.
1:04:02.581,1:04:07.064
The RSA 768 factorization a couple of years ago
1:04:07.064,1:04:11.423
used something on the scale of a 1500 computers
1:04:11.423,1:04:14.130
for a year.
1:04:14.130,1:04:15.729
And those were normal kind of computers,
1:04:15.729,1:04:19.475
Desktop kind of computers with, I think, typically 2 cores.
1:04:19.475,1:04:24.240
Nowadays, if you have some faster, say, 3 GHz 4-core machines,
1:04:24.240,1:04:27.446
I think those were typically 2 GHZ 2-core, nowadays...
1:04:27.446,1:04:30.582
Tanja's mentioning you of course want to use GPUs
1:04:30.582,1:04:31.725
to speed things up.
1:04:31.725,1:04:33.397
And there's many ways to take advantage of
1:04:33.397,1:04:35.393
computer technology moving forward.
1:04:35.393,1:04:39.998
To get careful estimates: for going from 768 to 800,
1:04:39.998,1:04:42.192
that's a short enough extrapolation
1:04:42.192,1:04:44.113
that people are pretty confident about that.
1:04:44.113,1:04:48.014
Getting to 900 starts getting iffy what exactly the cost would be
1:04:48.014,1:04:50.951
and 1024 there's a fair amount of guess works.
1:04:50.951,1:04:53.796
There is some consensus on rougly what the costs are
1:04:53.796,1:04:55.429
but it's hard to say exactly.
1:04:55.429,1:04:58.059
But again, to give you a scale of it:
1:04:58.059,1:04:59.675
you were saying like 900 or 800.
1:04:59.675,1:05:03.213
Well, 768 RSA challenge was done
1:05:03.213,1:05:07.228
and that one was 1500 machine years.
1:05:07.228,1:05:08.501
Male: Ok, thank you.
1:05:08.501,1:05:10.260
Angel: We'll take four more questions.
1:05:10.260,1:05:12.357
Signal angel first.
1:05:12.357,1:05:16.676
*Signal angel*: It's a clarification of the question I asked before.
1:05:16.676,1:05:22.278
The actual question is: how big will be the effort
1:05:22.278,1:05:32.129
to upgrade existing keys from 1024 bits to 4K or 8K bits?
1:05:32.129,1:05:35.984
Considering that the keys are currently in use
1:05:35.984,1:05:37.910
how much effort worldwide would it be to
1:05:37.910,1:05:41.608
upgrade all that to something more secure?
1:05:41.608,1:05:43.559
Tanja: I mean, for the private user if you're running
1:05:43.559,1:05:46.447
SSH on the laptop you can just generate a new key.
1:05:46.447,1:05:47.876
Doesn't take you much time.
1:05:47.876,1:05:50.345
You will notice maybe some degradation of performance
1:05:50.345,1:05:52.193
if you're running it on a netbook
1:05:52.193,1:05:54.755
but going to 2048 is not a big deal.
1:05:54.755,1:05:57.710
However, there is a recommendation of the
1:05:57.710,1:06:01.932
US government to stop using 1024 bit RSA as of 2010
1:06:01.932,1:06:05.211
and we still see it everywhere.
1:06:05.211,1:06:07.563
Apparently it is bigger effort than just
1:06:07.563,1:06:10.743
everybody spending 10 minutes on the laptop.
1:06:10.743,1:06:13.635
Dan: Maybe part of the problem is things like
1:06:13.635,1:06:16.997
everybody says "yeah, use 2048 or bigger"
1:06:16.997,1:06:18.884
and there's some financial standard
1:06:18.884,1:06:21.157
which says you can use any key size you want
1:06:21.157,1:06:23.943
up to 1984 bits.
1:06:23.943,1:06:25.631
I have no idea how they came up with this
1:06:25.631,1:06:27.208
but then they run into some other standard
1:06:27.208,1:06:29.284
which says use 2048 and they just can't
1:06:29.284,1:06:33.345
implement it on the smartcards which only support up to 1984.
1:06:33.345,1:06:36.467
Now if you get the standards people talking to each other for several years
1:06:36.467,1:06:39.258
then they agree on using 1984.
1:06:39.258,1:06:41.796
It's just about as good as 2048
1:06:41.796,1:06:44.746
but realistically when you have these conflicts
1:06:44.746,1:06:47.128
in standards then it takes a while to work out.
1:06:47.128,1:06:48.742
And when you have a busy server
1:06:48.742,1:06:51.233
that's trying to do 2048 bit RSA
1:06:51.233,1:06:53.126
it doesn't matter what the standard says
1:06:53.126,1:06:55.449
if you got a ton of load you just can't handle that load
1:06:55.449,1:06:58.341
if you're spending a ton of time on cryptography.
1:06:58.341,1:07:00.460
And that again is where we like elliptic curves
1:07:00.460,1:07:02.560
because they give you for whatever your
1:07:02.560,1:07:05.341
performance constraints are much better security.
1:07:05.341,1:07:07.309
So if you're trying a given security level
1:07:07.309,1:07:09.831
the speed is much better.
1:07:09.831,1:07:12.807
Angel: Next question will be from number 2.
1:07:12.807,1:07:14.642
Female: So you've had two developers stand up
1:07:14.642,1:07:17.459
and ask how to verify whether or not their
1:07:17.459,1:07:19.726
key generation is correct and whether or not their
1:07:19.726,1:07:21.490
random numbers are correct.
1:07:21.490,1:07:23.887
And I think it's great that you give coding recommendations
1:07:23.887,1:07:27.441
like seed rand() but coming from the development
1:07:27.441,1:07:29.539
perspective I like to unit test my code
1:07:29.539,1:07:31.789
and specifically I like to write my unit tests
1:07:31.789,1:07:33.043
before I write my code,
1:07:33.043,1:07:34.709
it's called test-driven development.
1:07:34.709,1:07:38.175
So if I'm going about creating an algorithm to
1:07:38.175,1:07:39.994
encrypt something or whatever
1:07:39.994,1:07:42.771
what are the steps that I need to do
1:07:42.771,1:07:46.653
when I'm writing my unit test?
1:07:46.653,1:07:51.939
Has there been some kind of effort in that way
1:07:51.939,1:07:54.911
like what goes into unit test for determining that
1:07:54.911,1:07:56.606
your random number is correct?
1:07:56.606,1:07:59.091
I mean there's algorithms for determining
1:07:59.091,1:08:00.562
whether your random number generator is
1:08:00.562,1:08:02.773
operating correctly, right?
1:08:02.773,1:08:05.160
Dan: This is something where there is,
1:08:05.160,1:08:07.775
if you look at how the random number generators work,
1:08:07.775,1:08:10.209
there was just a NIST workshop
1:08:10.209,1:08:11.975
and there's some NIST standards
1:08:11.975,1:08:12.742
which are telling you
1:08:12.742,1:08:15.709
here's what to do to test your random number generators.
1:08:15.709,1:08:17.796
So you have a hardware random number generator,
1:08:17.796,1:08:19.589
some post-processing, and they say
1:08:19.589,1:08:22.406
"ok here's the standard for how to test those pieces"
1:08:22.406,1:08:24.704
Now that's not the same as testing the cryptography
1:08:24.704,1:08:26.335
that's using those pieces
1:08:26.335,1:08:28.994
and it's very helpful if there's a clear separation there.
1:08:28.994,1:08:30.463
So you have your cryptography
1:08:30.463,1:08:32.746
which is doing deterministic stuff
1:08:32.746,1:08:34.452
and says you have to have some randomness coming
1:08:34.452,1:08:37.806
from a totally separate unit where the only job
1:08:37.806,1:08:40.460
of that separate unit is do the randomness properly.
1:08:40.460,1:08:42.146
And then the NIST test will tell you
1:08:42.146,1:08:43.324
what the randomness does
1:08:43.324,1:08:45.938
and then deterministic cryptographic testing
1:08:45.938,1:08:48.391
will tell you that the other pieces are working correctly
1:08:48.391,1:08:50.211
for all sorts of inputs.
1:08:50.211,1:08:53.892
Where it goes wrong is something like the ECDSA standard
1:08:53.892,1:08:55.394
that Nadia was mentioning before
1:08:55.394,1:08:56.696
and that's one that says that
1:08:56.696,1:08:59.210
you don't just deterministically generate a signature
1:08:59.210,1:09:01.410
you make some new randomness every time you generate
1:09:01.410,1:09:03.705
a signature and that's what goes horribly wrong.
1:09:03.705,1:09:05.670
So that's something where we need new standards
1:09:05.670,1:09:07.790
which say instead of doing what ECDSA does
1:09:07.790,1:09:09.741
we have a clear separation between
1:09:09.741,1:09:12.192
the cryptography always does the same thing
1:09:12.192,1:09:15.043
making it testable, and then randomness is centralized
1:09:15.043,1:09:17.541
in one spot which hopefully the NIST standards
1:09:17.541,1:09:19.840
do a good enough job of testing.
1:09:19.840,1:09:23.508
Female: So there's something that says here's what determines...
1:09:23.508,1:09:26.946
I guess what I'm asking is do you know of any algorithms
1:09:26.946,1:09:30.139
that can be used for determining whether something
1:09:30.139,1:09:32.248
is a good random number and whether your random
1:09:32.248,1:09:34.525
number generator is generating things properly?
1:09:34.525,1:09:37.027
Tanja: Well, there are a bunch of tests for.
1:09:37.027,1:09:39.307
There's hardware random number generators
1:09:39.307,1:09:42.126
you can run them through the diehard tests.
1:09:42.126,1:09:44.710
There's certificates by FIPS.
1:09:44.710,1:09:48.526
In general I would recommend implementing all those steps
1:09:48.526,1:09:50.811
but the smartcards that have been mentioned by Dan earlier
1:09:50.811,1:09:52.742
from the Taiwan citizen card
1:09:52.742,1:09:55.445
those are actually FIPS-certified.
1:09:55.445,1:09:58.942
So even though, these go through what is the industry standard
1:09:58.942,1:10:00.922
of doing random number generation on smartcards
1:10:00.922,1:10:03.622
which everybody thought was good enough,
1:10:03.622,1:10:04.979
apparently they're not.
1:10:04.979,1:10:06.477
So randomness is a total mess.
1:10:06.477,1:10:08.613
I mean after the paper by Nadia and
1:10:08.613,1:10:10.410
also when the paper by the Lausanne team came out,
1:10:10.410,1:10:12.692
there is no some more movement in finding
1:10:12.692,1:10:14.023
better random number generators.
1:10:14.023,1:10:18.659
At this moment, well, hope for the best.
1:10:18.659,1:10:20.625
laughter
1:10:20.625,1:10:22.178
Implement the standards, yes.
1:10:22.178,1:10:24.672
If you have some source of randomness
1:10:24.672,1:10:28.424
try to stretch it with a stream cipher or something like that.
1:10:28.424,1:10:31.296
Nadia: I want to tell a little story.
1:10:31.296,1:10:33.227
After we published the paper
1:10:33.227,1:10:35.643
I heard some very interesting things from some of the vendors.
1:10:35.643,1:10:39.329
I think one of the things that makes randomness
1:10:39.329,1:10:42.206
and cryptography a difficult problem is that
1:10:42.206,1:10:44.612
this kind of issue is something that
1:10:44.612,1:10:46.808
crosses a lot of the abstraction boundaries
1:10:46.808,1:10:48.546
that you're used to in coding.
1:10:48.546,1:10:50.525
You want to have the clean unit test that you know
1:10:50.525,1:10:52.343
that this piece works, and this piece works,
1:10:52.343,1:10:53.923
and this piece works, and this piece works,
1:10:53.923,1:10:57.777
and you put them all together and it will all work flawlessly.
1:10:57.777,1:11:00.297
Somehow this kind of problem is something that
1:11:00.297,1:11:02.898
happens at the boundaries of each unit test.
1:11:02.898,1:11:05.223
We know that the operating system is like
1:11:05.223,1:11:08.431
"ok I'm getting my proper inputs from the hardware
1:11:08.431,1:11:11.297
and I'm correctly generating my randomness from there"
1:11:11.297,1:11:12.840
and then the application says
1:11:12.840,1:11:15.023
"I'm getting my randomness from the operating system,
1:11:15.023,1:11:16.310
it's good randomness"
1:11:16.310,1:11:19.605
and then the application uses the correct crypto algorithms
1:11:19.605,1:11:21.927
to then do good cryptography.
1:11:21.927,1:11:24.495
And at the very beginning of that stage
1:11:24.495,1:11:27.777
there was something broken and it translated all the way
1:11:27.777,1:11:29.526
through all of these pieces of software
1:11:29.526,1:11:31.532
that were working correctly.
1:11:31.532,1:11:35.415
Testing this holistically was the only way
1:11:35.415,1:11:37.312
to find this kind of error.
1:11:37.312,1:11:41.241
One story that I heard from a vendor was that
1:11:41.241,1:11:43.911
they ran into randomness problems.
1:11:43.911,1:11:45.745
They developed some device.
1:11:45.745,1:11:47.325
It was working perfectly
1:11:47.325,1:11:51.478
and on the assembly line they were being turned on
1:11:51.478,1:11:55.497
and run through the checks.
1:11:55.497,1:11:57.807
You boot it up it checks the integrity
1:11:57.807,1:12:00.641
on the assembly line and it was continuing on.
1:12:00.641,1:12:06.390
If you exactly booted them up at the exact same time
1:12:06.390,1:12:10.129
in the exact same conditions
1:12:10.129,1:12:13.221
then all of the inputs to the random number generator
1:12:13.221,1:12:16.323
would be the same and they would generate the same keys
1:12:16.323,1:12:17.743
and then these keys would be,
1:12:17.743,1:12:19.721
you know, they already generated the keys,
1:12:19.721,1:12:21.197
so they wouldn't generate them again after
1:12:21.197,1:12:25.397
they went to the consumer.
1:12:25.397,1:12:27.145
I don't know how to unit test that
1:12:27.145,1:12:29.976
except take the entire device, turn it on,
1:12:29.976,1:12:32.589
and take multiple of them and make sure the devices
1:12:32.589,1:12:35.396
as they're coming out of the production process
1:12:35.396,1:12:39.596
are working correctly.
1:12:39.596,1:12:41.477
Angel: So we will take 2 more questions.
1:12:41.477,1:12:46.692
One from number 4 first and then number 1 at the last.
1:12:46.692,1:12:48.005
Male: How good, do you think,
1:12:48.005,1:12:52.544
hardware-supported random number generators are
1:12:52.544,1:12:53.432
after what you've just said?
1:12:53.432,1:12:55.806
I think they're probably not good anymore?
1:12:55.806,1:13:01.566
Dan: Intel has their RdRand instruction in some of the newer chips
1:13:01.566,1:13:04.979
and they say that they've gone through
1:13:04.979,1:13:06.957
a reasonable number of evaluation steps.
1:13:06.957,1:13:09.442
It sounds like they're trying.
1:13:09.442,1:13:11.826
Whether they're successful is a different question.
1:13:11.826,1:13:13.661
Something I don't like about the API is
1:13:13.661,1:13:16.392
RdRand gives you no way to test the pieces.
1:13:16.392,1:13:18.798
You can only test the post-processed output.
1:13:18.798,1:13:22.157
So nobody else is able to test the actual
1:13:22.157,1:13:24.276
randomness generation part of the hardware.
1:13:24.276,1:13:26.543
You can only get a filtered version of that.
1:13:26.543,1:13:29.245
So Intel and people who are contracted by Intel
1:13:29.245,1:13:30.628
to see what's going on inside
1:13:30.628,1:13:32.829
are the only people who can run these tests.
1:13:32.829,1:13:35.874
It's not open in a way that allows the community
1:13:35.874,1:13:37.377
to see if there's further problems.
1:13:37.377,1:13:38.756
But at least they're trying
1:13:38.756,1:13:41.423
and maybe with enough effort the hardware manufacturers
1:13:41.423,1:13:43.458
will get randomness generation right.
1:13:43.458,1:13:47.579
Of course if you can generate any sort of secret once
1:13:47.579,1:13:50.462
if you have enough of a secret to start with
1:13:50.462,1:13:52.328
then you can use things like AES to
1:13:52.328,1:13:53.944
generate any number of secrets
1:13:53.944,1:13:55.494
and you can put those secrets into
1:13:55.494,1:13:57.296
any number of devices that you want.
1:13:57.296,1:13:59.594
If you use AES in counter mode with
1:13:59.594,1:14:02.123
encrypting 1, encrypting 2, encrypting 3...
1:14:02.123,1:14:04.981
you get totally unpredictable, unrelated outputs
1:14:04.981,1:14:07.378
and then use those, burn those into some hardware.
1:14:07.378,1:14:09.979
But where do you get that initial randomness from?
1:14:09.979,1:14:12.830
Are you going through a trustworthy process there?
1:14:12.830,1:14:16.308
It's a hard problem.
1:14:16.308,1:14:18.198
Angel: Next question from number 1.
1:14:18.198,1:14:19.729
And that's the last question.
1:14:19.729,1:14:21.379
Male: You said something about
1:14:21.379,1:14:24.799
you dropped this group-based cryptography word.
1:14:24.799,1:14:28.111
Most of the stuff I've encountered under that heading
1:14:28.111,1:14:30.008
kind of sounded like snake oil
1:14:30.008,1:14:32.898
just cause they're using non-abelian groups and stuff.
1:14:32.898,1:14:35.832
Is there anything here that isn't?
1:14:35.832,1:14:38.280
Tanja: Ok, I did not say group-based crypto.
1:14:38.280,1:14:38.914
Male: Ok.
1:14:38.914,1:14:40.259
Tanja: I said code-based cryptography.
1:14:40.259,1:14:41.241
Male: Sorry, thanks.
1:14:41.241,1:14:42.773
Tanja: So there is a class of cryptosystems
1:14:42.773,1:14:44.826
which are fine against quantum computers.
1:14:44.826,1:14:50.148
For all crypto it's always up to what we know.
1:14:50.148,1:14:51.959
Next day, there could be somebody with a
1:14:51.959,1:14:54.009
polynomial-time factor algorithm and so on.
1:14:54.009,1:14:56.723
Now there's also a group of people doing
1:14:56.723,1:14:59.357
braid group cryptography, doing non-abelian groups,
1:14:59.357,1:15:01.764
most of these systems have been broken.
1:15:01.764,1:15:07.083
If they weren't broken by classical computers
1:15:07.083,1:15:10.911
maybe they would remain secure against quantum computers.
1:15:10.911,1:15:13.828
It's lots of "would"s there.
1:15:13.828,1:15:17.612
Yes, they might have this on their flags as well
1:15:17.612,1:15:20.547
but I wouldn't count this as a secure system.
1:15:20.547,1:15:23.027
Whereas code-based cryptography or lattice-based cryptography
1:15:23.027,1:15:26.480
are systems which should be fine.
1:15:26.480,1:15:28.646
Dan: Maybe just a URL if you're interested
1:15:28.646,1:15:31.939
in post-quantum cryptography is pqcrypto.org
1:15:31.939,1:15:34.513
and that keeps track of papers on the various types
1:15:34.513,1:15:37.227
of cryptography that should survive for a 100 years
1:15:37.227,1:15:41.448
and in particular against quantum computers.
1:15:41.448,1:15:42.444
Angel: Ladies and gentleman, please give
1:15:42.444,1:15:46.546
Nadia, Tanja and Daniel a big round of applause.
1:15:46.546,1:15:47.714
Thank you so much.
1:15:47.714,1:15:56.643
applause