0:00:08.965,0:00:11.637
Today we have Daniel J. Bernstein,

0:00:11.637,0:00:14.538
Nadia Heninger,

0:00:14.538,0:00:16.889
and Tanja Lange.

0:00:16.889,0:00:20.402
And they will talk about RSA factorization in the real world.

0:00:20.402,0:00:22.289
Talk is called FactHacks.

0:00:22.289,0:00:29.905
<i>applause</i>

0:00:29.905,0:00:31.591
<i>Nadia</i>: Thank you.

0:00:31.591,0:00:37.091
So we're going to start with a crypto lecture in 5 minutes.

0:00:37.091,0:00:40.557
So, just to begin, since we're talking about RSA.

0:00:40.557,0:00:43.678
Here is a picture of RSA for you.

0:00:43.678,0:00:48.605
RSA is the first published public-key cryptosystem.

0:00:48.605,0:00:51.222
So by a public-key cryptosystem, what I mean is

0:00:51.222,0:00:54.169
a cryptosystem that has two keys intead of just having

0:00:54.169,0:00:56.886
one key. You might think "ok you encrypt a message

0:00:56.886,0:00:59.035
to that key and you decrypt a message to that key"

0:00:59.035,0:01:01.402
That is a symmetric cryptosystem.

0:01:01.402,0:01:03.194
A public-key cryptosystem has two keys.

0:01:03.194,0:01:06.042
One is the public key and one is the private key.

0:01:06.042,0:01:07.688
The public key you publish to the world and

0:01:07.688,0:01:09.509
anyone can use that key to encrypt a message,

0:01:09.509,0:01:10.806
especially to you.

0:01:10.806,0:01:14.102
And only you, who have the private key, can decrypt that.

0:01:14.102,0:01:18.739
So the RSA paper was the first published public-key cryptosystem.

0:01:18.739,0:01:20.501
That was in 1977.

0:01:20.501,0:01:22.757
This revolutionized cryptography and enabled

0:01:22.757,0:01:25.542
the development of the Internet.

0:01:25.542,0:01:31.568
So 35 years later RSA is still the most commonly used public-key cryptosystem.

0:01:31.568,0:01:35.870
If you ever use the Internet, you probably use it every day.

0:01:35.870,0:01:42.154
So here is a sample SSL certificate which uses RSA.

0:01:42.154,0:01:45.241
If you use SSH you probably have an SSH key

0:01:45.241,0:01:48.440
which probably is RSA.

0:01:48.440,0:01:51.858
Before I launch into the math

0:01:51.858,0:01:54.676
I want to explain what we're doing here.

0:01:54.676,0:01:56.742
We wanted to, since you guys are hackers,

0:01:56.742,0:01:59.874
you like code instead of formulas

0:01:59.874,0:02:02.784
so we wanted to give you formulas in code.

0:02:02.784,0:02:05.455
So what we're giving you is working Python code for

0:02:05.455,0:02:07.720
all the math that we're going to do.

0:02:07.720,0:02:09.976
And in order to do that we're going to use

0:02:09.976,0:02:12.643
some software you call Sage.

0:02:12.643,0:02:15.970
Sage is free open-source mathematics software.

0:02:15.970,0:02:20.160
It is available at this URL sagemath.org.

0:02:20.160,0:02:22.093
It's based off of Python

0:02:22.093,0:02:24.190
so it's very simple if you know Python.

0:02:24.190,0:02:27.904
It has a nice sort of interpreter

0:02:27.904,0:02:29.622
that you can use just like Python.

0:02:29.622,0:02:32.493
Here an example, 2*3=6.

0:02:32.493,0:02:34.890
But there are some differences.

0:02:34.890,0:02:38.202
For example the caret instead of in Python it's XOR

0:02:38.202,0:02:40.155
in Sage it's exponentiation.

0:02:40.155,0:02:43.869
So 2^3 is 8 in Sage.

0:02:43.869,0:02:47.351
The other nice thing about Sage is that

0:02:47.351,0:02:49.703
it has lots of useful mathematical libraries.

0:02:49.703,0:02:53.029
For example if I, in the Sage interpreter, say factor(15)

0:02:53.029,0:02:55.512
it helpfully tells me 3*5.

0:02:55.512,0:02:56.513
That's pretty great.

0:02:56.513,0:02:59.152
It also does a lot more advanced things, for example

0:02:59.152,0:03:00.467
it can represent polynomials.

0:03:00.467,0:03:03.498
I can say factor the polynomial x^2-1

0:03:03.498,0:03:08.127
and it helpfully tells me the answer to that.

0:03:08.127,0:03:12.819
So now that we've gone through what is Sage,

0:03:12.819,0:03:17.443
I want to explain how to do RSA.

0:03:17.443,0:03:21.065
Perhaps some of you have seen this before.

0:03:21.065,0:03:24.677
In order to generate an RSA key, the first thing that you do

0:03:24.677,0:03:27.435
is you generate two large random primes.

0:03:27.435,0:03:30.441
So if we want a 1024 bit RSA key

0:03:30.441,0:03:34.779
we generate two primes of size 2^512,

0:03:34.779,0:03:38.515
so 512 bit primes, p and q.

0:03:38.515,0:03:41.697
In order to generate the public key

0:03:41.697,0:03:44.900
I multiply together p and q and you get some N.

0:03:44.900,0:03:46.433
N has 1024 bits.

0:03:46.433,0:03:49.915
And then you have what's known as the public exponent

0:03:49.915,0:03:54.435
which you can choose 3 or 65537 or 35.

0:03:54.435,0:03:56.300
It really doesn't matter what you choose.

0:03:56.300,0:03:59.117
These are some of the most commonly used values.

0:03:59.117,0:04:03.129
Now in order to generate a private key

0:04:03.129,0:04:07.695
you choose d, the decryption exponent,

0:04:07.695,0:04:12.936
to be the modular inverse of e mod (p-1)*(q-1)

0:04:12.936,0:04:15.394
It doesn't matter why so much, for this talk

0:04:15.394,0:04:17.276
but here is the formula

0:04:17.276,0:04:20.295
or here is the command to do this in Sage.

0:04:20.295,0:04:23.150
So now we have a public key and a private key

0:04:23.150,0:04:24.508
that are pairs.

0:04:24.508,0:04:27.948
And in RSA if you want to enrypt a message

0:04:27.948,0:04:31.547
you raise your message to the e'th power mod n

0:04:31.547,0:04:34.753
so you can do that with your public key here.

0:04:34.753,0:04:36.776
And similarly decryption,

0:04:36.776,0:04:39.998
you raise the ciphertext to the d'th power mod n

0:04:39.998,0:04:42.835
and that will give you the message back out again.

0:04:42.835,0:04:45.748
Fairly simple.

0:04:45.748,0:04:48.731
Now I'm lying to you slightly.

0:04:48.731,0:04:51.088
If you read this talk do not

0:04:51.088,0:04:53.181
absolutely do <b>not</b> implement RSA this way.

0:04:53.181,0:04:54.757
You <b>must</b> pad your message.

0:04:54.757,0:04:57.186
This is a public service warning.

0:04:57.186,0:05:00.662
We offered to tell you about factoring.

0:05:00.662,0:05:03.941
What is the relationship between RSA and factoring?

0:05:03.941,0:05:08.972
Now you can see easily from the formula

0:05:08.972,0:05:11.760
from the private key here

0:05:11.760,0:05:16.337
that if you know how to factor N into its prime factors p and q

0:05:16.337,0:05:19.496
then you can just compute d, the decryption exponent

0:05:19.496,0:05:20.812
of your private key.

0:05:20.812,0:05:24.395
Clearly, if you can factor N you can compute the private key.

0:05:24.395,0:05:28.930
But we don't actually know that factoring is the only way to break RSA.

0:05:28.930,0:05:34.163
There might be some way that you can compute messages from ciphertexts

0:05:34.163,0:05:36.877
that doesn't actually reveal the private key.

0:05:36.877,0:05:41.545
And there's no proof either way that this is or is not possible.

0:05:41.545,0:05:43.478
The last thing that I want to mention here

0:05:43.478,0:05:45.458
just to clarify things

0:05:45.458,0:05:46.830
some of you who might have taken

0:05:46.830,0:05:51.527
complexity or algorithms courses or a beginning CS course

0:05:51.527,0:05:55.874
you might have heard of NP-hardness, the P vs. NP problem.

0:05:55.874,0:06:00.904
And I just want to clarify that factoring is not known to be NP-hard.

0:06:00.904,0:06:05.092
Every computer scientist will love it if we could actually generate

0:06:05.092,0:06:11.139
a cryptosystem which is known to be based of average case NP-hardness.

0:06:11.139,0:06:13.573
We don't know how to do that.

0:06:13.573,0:06:16.992
RSA is not doing that.

0:06:16.992,0:06:22.309
In addition the factoring problem is probably not NP-hard.

0:06:22.309,0:06:25.892
The question is: how hard <b>is</b> factoring?

0:06:25.892,0:06:27.474
Since factoring is the best way that we know

0:06:27.474,0:06:31.227
to break the most commonly used public-key cryptosystem on the Internet.

0:06:31.227,0:06:32.935
Well I showed you some examples before

0:06:32.935,0:06:37.243
where Sage had this helpful little factor function.

0:06:37.243,0:06:39.164
So how well does it work?

0:06:39.164,0:06:41.181
How long do people think this takes?

0:06:41.181,0:06:46.149
two 32 bit integers multiplied together.

0:06:46.149,0:06:48.850
I heard it'll go a couple of seconds.

0:06:48.850,0:06:50.098
0.01 seconds.

0:06:50.098,0:06:55.108
This is on <i>this</i> computer.

0:06:55.108,0:06:57.869
How about 64 bit integers, two of them multiplied together?

0:06:57.869,0:07:01.289
So this is a 128 bit RSA modulus.

0:07:01.289,0:07:04.382
Any guesses?

0:07:04.382,0:07:05.464
One minute?

0:07:05.464,0:07:08.235
Nope, 0.1 seconds.

0:07:08.235,0:07:14.521
How about two 96 bit integers multiplied together?

0:07:14.521,0:07:15.939
A couple of minutes?

0:07:15.939,0:07:18.105
4 seconds.

0:07:18.105,0:07:21.965
How about two 128 bit integers multiplied together?

0:07:21.965,0:07:28.255
This is a 256 bit RSA modulus.

0:07:28.255,0:07:29.300
No guesses? Alright.

0:07:29.300,0:07:30.430
500 seconds.

0:07:30.430,0:07:32.791
So at this point it's starting to get a little bit iffy if I'm

0:07:32.791,0:07:36.327
sitting on a plane trying to do this on battery power.

0:07:36.327,0:07:39.248
<i>laughter</i>

0:07:39.248,0:07:43.764
Clearly this is growing faster than linear in the size of the key.

0:07:43.764,0:07:45.597
So what is actually going on?

0:07:45.597,0:07:49.066
Well.

0:07:49.066,0:07:50.734
<i>Dan</i>: Ok.

0:07:50.734,0:07:52.794
Turns out these factoring functions

0:07:52.794,0:07:54.597
they do look like they're getting pretty slow but

0:07:54.597,0:07:56.835
wait a minute.

0:07:56.835,0:07:59.814
Do we actually have to use this factor function

0:07:59.814,0:08:00.866
if it's getting really slow?

0:08:00.866,0:08:04.302
Maybe instead of trying to use Sage's factor function

0:08:04.302,0:08:08.710
maybe we could just guess what the prime was.

0:08:08.710,0:08:12.350
Now as an RSA user you might not think that this is a threat

0:08:12.350,0:08:13.928
having somebody guess your prime

0:08:13.928,0:08:17.114
because there is some way to count the number of primes,

0:08:17.114,0:08:18.719
number of 512 bit primes and there's

0:08:18.719,0:08:22.185
more than 2^500 512 bit primes.

0:08:22.185,0:08:24.011
And if your random number generator is giving you

0:08:24.011,0:08:27.097
any of those 512 bit primes with the same chance

0:08:27.097,0:08:30.829
then any particular prime that the attacker guesses

0:08:30.829,0:08:32.845
and just tries dividing n by that,

0:08:32.845,0:08:35.061
see if it's evenly divisable, well

0:08:35.061,0:08:39.417
any particular prime has only a 2^(-500) chance

0:08:39.417,0:08:41.312
of being guessed by the attacker.

0:08:41.312,0:08:43.319
And even if the attacker tries a huge number of times

0:08:43.319,0:08:44.897
he'll never guess your prime

0:08:44.897,0:08:47.621
except what if your random number generator

0:08:47.621,0:08:52.031
is working really really badly and doesn't give you 2^500 different primes?

0:08:52.031,0:08:55.902
What if it only gives you, say, 2^40 different primes?

0:08:55.902,0:08:58.055
And then suddenly the attacker could, well,

0:08:58.055,0:09:01.248
try figuring out what those primes are.

0:09:01.248,0:09:03.237
So here's how the attacker looks at this.

0:09:03.237,0:09:05.819
The attacker says: ok I'm gonna target your public key.

0:09:05.819,0:09:08.884
I'm gonna take your big N which is your secret...

0:09:08.884,0:09:10.912
sorry, public key is big N,

0:09:10.912,0:09:13.666
your secret key p times your secret key q,

0:09:13.666,0:09:14.920
your private p and q.

0:09:14.920,0:09:16.764
He's trying to figure out the p and q, given N.

0:09:16.764,0:09:20.169
So what he does, he takes a bunch of devices that are out there

0:09:20.169,0:09:22.902
and he looks at how they work

0:09:22.902,0:09:25.281
or downloads a bunch of software which generates keys

0:09:25.281,0:09:28.904
and then using that software, using those devices

0:09:28.904,0:09:32.216
he tries generating a bunch of p's and q's for himself

0:09:32.216,0:09:34.516
using whatever the random number generator is

0:09:34.516,0:09:36.932
in that device or in that software.

0:09:36.932,0:09:38.400
And then that's something which

0:09:38.400,0:09:40.682
maybe the random number generator works really badly

0:09:40.682,0:09:42.832
and maybe you're using that device, too.

0:09:42.832,0:09:45.765
So the attacker tries each of the primes that he see's

0:09:45.765,0:09:46.918
from this device

0:09:46.918,0:09:50.129
and tries seeing does that prime divide your N.

0:09:50.129,0:09:51.400
If you were using that device

0:09:51.400,0:09:52.932
and it doesn't generate very many primes

0:09:52.932,0:09:55.835
then maybe he gets lucky and finds your prime

0:09:55.835,0:09:57.968
because the device has generated the same prime for him

0:09:57.968,0:09:59.618
that it generated for you.

0:09:59.618,0:10:03.216
Now does anybody actually build devices which

0:10:03.216,0:10:07.005
screw up random numbers this badly?

0:10:07.005,0:10:08.519
<i>laughter</i>

0:10:08.519,0:10:11.538
As a matter of fact, yes, bears do shit in the woods.

0:10:11.538,0:10:13.543
So there are some famous examples.

0:10:13.543,0:10:16.952
We don't have a huge number of historical discussions in this talk

0:10:16.952,0:10:18.539
but just a few examples here.

0:10:18.539,0:10:22.401
Goldberg&Wagner totally broke the original Netscape

0:10:22.401,0:10:23.989
generation of public keys.

0:10:23.989,0:10:26.921
There are only something like 2^47 possible keys

0:10:26.921,0:10:29.216
which is a countable numbers.

0:10:29.216,0:10:31.077
So they could figure out all the possible keys

0:10:31.077,0:10:34.139
that Netscape could generate if they you knew when you generated a key

0:10:34.139,0:10:35.802
which was often very predictable.

0:10:35.802,0:10:37.724
Another example: the Debian bug,

0:10:37.724,0:10:40.974
the horrendous failure for SSH and lots of other applications

0:10:40.974,0:10:42.418
from a few years ago.

0:10:42.418,0:10:45.692
Debian and Ubuntu for a year and half were generating only

0:10:45.692,0:10:48.773
well, fewer than a million possible public keys.

0:10:48.773,0:10:51.421
So complete disasters of random number generation.

0:10:51.421,0:10:53.826
But if you want to add to this list,

0:10:53.826,0:10:55.202
if you want to do more and more of these

0:10:55.202,0:10:56.974
find bad random number generators

0:10:56.974,0:11:00.518
you have to say "ok what's the next device out there

0:11:00.518,0:11:01.858
that might have been screwed up."

0:11:01.858,0:11:03.422
"Let's look at it, look at the code."

0:11:03.422,0:11:05.039
"See what kind of primes it generates."

0:11:05.039,0:11:06.571
"Does it do badly?"

0:11:06.571,0:11:09.424
It takes some real work to figure this out.

0:11:09.424,0:11:11.801
Wouldn't it be nice if there was a systematic way

0:11:11.801,0:11:14.586
to just without having to do a lot of work,

0:11:14.586,0:11:16.576
to look at each device, each piece of software,

0:11:16.576,0:11:18.403
wouldn't it be nice to just figure out

0:11:18.403,0:11:21.568
"oh yeah, let's just magically figure out

0:11:21.568,0:11:23.726
if there's any primes out there generated

0:11:23.726,0:11:25.507
from bad random number generators."

0:11:25.507,0:11:28.020
You could imagine here's the easy attack.

0:11:28.020,0:11:29.622
Here's the systematic way to do this.

0:11:29.622,0:11:34.052
You take two users, so you and you.

0:11:34.052,0:11:35.442
And you each have a public key.

0:11:35.442,0:11:37.178
We're gonna grab those public keys

0:11:37.178,0:11:41.681
N1 and N2

0:11:41.681,0:11:46.983
and then hope that this N1 and N2 share a prime.

0:11:46.983,0:11:54.130
N1 is some prime p times some q1 and N2 is some same prime p times a different q2.

0:11:54.130,0:11:55.663
Now this could happen, you could have

0:11:55.663,0:11:59.100
N1 and N2 sharing both primes.

0:11:59.100,0:12:02.101
That would be legitimate that two people who are actually

0:12:02.101,0:12:05.645
the same webserver sharing their keys, sharing their certificates,

0:12:05.645,0:12:06.843
they know they're doing this,

0:12:06.843,0:12:09.281
you can get the same public key from two places.

0:12:09.281,0:12:12.183
Google has their keys copied to tens of thousands of servers,

0:12:12.183,0:12:13.399
that's not a surprise.

0:12:13.399,0:12:15.395
But if there's a different...

0:12:15.395,0:12:16.999
If you have two different public keys

0:12:16.999,0:12:18.430
they shouldn't be sharing primes.

0:12:18.430,0:12:19.582
What if they do?

0:12:19.582,0:12:21.979
Well, here's this magic Euclid's algorithm

0:12:21.979,0:12:24.744
which will just print out the shared prime p.

0:12:24.744,0:12:26.066
This is a very simple algorithm.

0:12:26.066,0:12:28.495
It just takes one of the numbers, say N1

0:12:28.495,0:12:31.961
and let's say N1 is the smaller one and N2 is the bigger one,

0:12:31.961,0:12:34.142
it divides N2 by N1

0:12:34.142,0:12:37.512
and replaces N2 by the remainder and then switches the numbers

0:12:37.512,0:12:39.593
and that's exactly what this code does.

0:12:39.593,0:12:43.111
The N2 % N1 that's again the N2 mod N1

0:12:43.111,0:12:45.543
the least remainder when you divide N2 by N1.

0:12:45.543,0:12:49.414
And then the result once one of the numbers gets down to zero

0:12:49.414,0:12:52.292
the other number is exactly the shared prime.

0:12:52.292,0:12:54.149
So this amazing algorithm,

0:12:54.149,0:12:57.051
here's just a little manual example

0:12:57.051,0:12:59.898
not with big 512, 1024 bit keys

0:12:59.898,0:13:04.143
this is with just tiny little I guess 8 bit primes

0:13:04.143,0:13:07.625
16 bit... well looks more like 13 bit keys.

0:13:07.625,0:13:12.375
So two numbers coming in and one is 4187 and two is 5989.

0:13:12.375,0:13:14.879
You can see if you keep dividing one number by the other,

0:13:14.879,0:13:16.162
replacing it by the remainder,

0:13:16.162,0:13:19.411
do that a few times, you quickly get down to zero.

0:13:19.411,0:13:22.774
And the number right before the zero in the middle of the slide is 53.

0:13:22.774,0:13:27.172
Now 53 is a shared prime between these two small keys.

0:13:27.172,0:13:28.772
If you scale this up,

0:13:28.772,0:13:32.256
if you do this computation for bigger numbers

0:13:32.256,0:13:34.176
then here's another timing example.

0:13:34.176,0:13:36.673
It's still really really fast, in the middle of the slide.

0:13:36.673,0:13:40.359
This is doing 1024 bit keys that share a 512 bit prime.

0:13:40.359,0:13:44.554
You can figure out this 512 bit prime so quickly that the timing

0:13:44.554,0:13:48.848
there is 0.00 seconds, can't even see how fast it is.

0:13:48.848,0:13:51.094
Ok, so that's Euclid's algorithm.

0:13:51.094,0:13:53.676
Now you can actually do this for tons and tons of keys.

0:13:53.676,0:13:57.695
You download, say, millions, tens of millions of keys from the Internet,

0:13:57.695,0:14:01.011
all the different SSL servers, maybe go for some SSH keys,

0:14:01.011,0:14:03.891
you download tons and tons of public keys and then

0:14:03.891,0:14:06.590
you try all the pairs with Euclid's algorithm.

0:14:06.590,0:14:09.406
And it's so fast that you can do this but actually

0:14:09.406,0:14:10.848
you can do better.

0:14:10.848,0:14:13.927
So there's an algorithm: batch GCD algorithm

0:14:13.927,0:14:15.396
which takes a whole bunch of keys

0:14:15.396,0:14:19.822
and figures out which of those keys share primes with the others.

0:14:19.822,0:14:22.525
Much much faster than trying every pair.

0:14:22.525,0:14:23.909
So you don't have to try every pair.

0:14:23.909,0:14:26.973
It's kind of like sorting, it's about that kind of speed

0:14:26.973,0:14:29.065
instead of having to try every pair.

0:14:29.065,0:14:30.811
If you have tens of millions you don't have to do

0:14:30.811,0:14:33.156
tens of millions times tens of millions of pairs.

0:14:33.156,0:14:35.739
You just have to pretty much reap through the whole list once.

0:14:35.739,0:14:38.877
So what does this batch GCD algorithm do?

0:14:38.877,0:14:41.447
It's figuring out for the first N1,

0:14:41.447,0:14:44.029
instead of computing the greatest common divisor,

0:14:44.029,0:14:45.908
the Euclid result for N1 and N2

0:14:45.908,0:14:47.846
and for N1 and N3 and so on,

0:14:47.846,0:14:52.340
it figures out the product of N2, N3, N4 and so on,

0:14:52.340,0:14:55.257
takes the greatest common divisor of that with N1,

0:14:55.257,0:14:57.191
applies Euclid's algorithm to that

0:14:57.191,0:15:00.874
and then does a similar thing for N2 and for N3 and so on.

0:15:00.874,0:15:05.656
Each of those is gonna show you whether N1 or N2 or N3,

0:15:05.656,0:15:08.148
it's gonna show you whether they share some prime

0:15:08.148,0:15:10.246
with any of the other N's.

0:15:10.246,0:15:13.326
If you did exactly the computation

0:15:13.326,0:15:15.248
shown on the slide in the math formulas

0:15:15.248,0:15:16.617
it would still be too slow.

0:15:16.617,0:15:20.457
But the batch GCD algorithm finds redundancies

0:15:20.457,0:15:23.218
in these GCDs, and here's exactly what it does:

0:15:23.218,0:15:25.620
It figures out the product of all of the keys.

0:15:25.620,0:15:29.923
So you take, say, a million keys and you multiply them all together.

0:15:29.923,0:15:33.534
And that's done with the code here where you do...

0:15:33.534,0:15:35.120
maybe I'll just skip to the bottom.

0:15:35.120,0:15:37.474
You see numbers 10, 20, 30, 40.

0:15:37.474,0:15:39.356
You build a product tree.

0:15:39.356,0:15:42.210
So you take 10 times 20, compute 200.

0:15:42.210,0:15:44.687
Take the next pair 30 times 40, compute 1200.

0:15:44.687,0:15:47.889
The take the pair of products 200 times 1200.

0:15:47.889,0:15:50.378
If you have more numbers, you have more levels in the tree.

0:15:50.378,0:15:55.227
And that's exactly what the Sage code does here.

0:15:55.227,0:15:58.924
And then the last step of the algorithm is

0:15:58.924,0:16:02.847
kind of going down the tree, building a remainder tree.

0:16:02.847,0:16:06.515
So what's happening here to figure out the greatest common divisor

0:16:06.515,0:16:10.013
of N1 with the product of all the other keys,

0:16:10.013,0:16:12.846
the idea is to take this product

0:16:12.846,0:16:15.807
which is called big R on this slide,

0:16:15.807,0:16:23.160
take the product of all the keys and then divide that by N1^2

0:16:23.160,0:16:26.291
and then take the remainders, so R mod N1^2,

0:16:26.291,0:16:31.148
divide by N1 and then compute the greatest common divisor of N1 mod N.

0:16:31.148,0:16:33.124
And the amazing effect is that it's exactly the

0:16:33.124,0:16:35.047
greatest common divisor we were looking for.

0:16:35.047,0:16:37.629
And what this little Sage script does is that it takes

0:16:37.629,0:16:42.282
R and divides R by, well a bunch of N's in a really fast way,

0:16:42.282,0:16:43.960
and then at the bottom, once it has figured out

0:16:43.960,0:16:47.227
all mod N1^2, all mod N2^2 and so on,

0:16:47.227,0:16:49.308
divides those by N1 and N2,

0:16:49.308,0:16:52.177
takes the greatest common divisor with N1 and N2

0:16:52.177,0:16:54.824
and that's exactly telling you, the outputs here

0:16:54.824,0:16:57.063
that are different from 1, are exactly telling you

0:16:57.063,0:17:00.398
which N's share a prime with some others.

0:17:00.398,0:17:04.048
This very remarkably short little script does work quite well.

0:17:04.048,0:17:06.524
Here's how fast these are.

0:17:06.524,0:17:10.248
So I'm on some 800 MHz core.

0:17:10.248,0:17:13.442
If you try this for, say, a hundred numbers,

0:17:13.442,0:17:16.608
just hundred random 1024 bit numbers,

0:17:16.608,0:17:20.917
it doesn't matter what the numbers are, they always run at the same speed,

0:17:20.917,0:17:23.198
then that takes 0.05 seconds.

0:17:23.198,0:17:27.249
If you have 10 times as many numbers it takes about 20 times as long.

0:17:27.249,0:17:31.229
And if you have another 10 times as many numbers it takes about 20 times as long.

0:17:31.229,0:17:33.247
And it keeps scaling up like that.

0:17:33.247,0:17:35.948
So you can get to millions or tens of millions of numbers

0:17:35.948,0:17:38.582
and it still finishes in a reasonable amount of time.

0:17:38.582,0:17:40.068
So, amazingly fast algorithm.

0:17:40.068,0:17:43.810
You don't have to scale this up to like 2^50 or 2^100 keys.

0:17:43.810,0:17:46.059
There's only, say, tens of millions of keys out there

0:17:46.059,0:17:48.330
and this runs reasonably fast.

0:17:48.330,0:17:50.626
Ok.

0:17:50.626,0:17:52.416
Can you actually hope for this to work?

0:17:52.416,0:17:54.629
Are random number generators really this bad?

0:17:54.629,0:17:55.866
Well...

0:17:55.866,0:18:00.478
<i>laughs</i>

0:18:00.478,0:18:03.435
There's a 2012 paper from Heninger

0:18:03.435,0:18:04.926
that's the same Heninger we have sitting here

0:18:04.926,0:18:06.300
and Durumeric, Wustrow, Halderman,

0:18:06.300,0:18:09.881
that's got the best paper award at USENIX Security 2012.

0:18:09.881,0:18:14.678
What they did was they tried exactly this on tens of millions of keys out there

0:18:14.678,0:18:18.276
and factored tens of thousands of keys.

0:18:18.276,0:18:22.677
Admittely they used a C version of this instead of this

0:18:22.677,0:18:24.061
Sage script but

0:18:24.061,0:18:25.827
the Sage script will go up to millions.

0:18:25.827,0:18:28.359
It's just when you get beyond that you want to write it in C

0:18:28.359,0:18:31.009
to deal with using a lot of memory.

0:18:31.009,0:18:34.887
But you can use exactly this script for pretty large chunks of keys.

0:18:34.887,0:18:37.951
And so they factored tens of thousands of keys and said

0:18:37.951,0:18:42.235
that these keys are typically not your bank keys.

0:18:42.235,0:18:44.909
They gonna be, say, keys for your home router.

0:18:44.909,0:18:47.639
So the vulnerable devices are not gonna be,

0:18:47.639,0:18:49.893
say, a big server out there.

0:18:49.893,0:18:53.902
The vulnerable devices will be your FRITZ!Box.

0:18:53.902,0:18:57.340
<i>applause</i>

0:18:57.340,0:18:59.838
Thank you, Nadia.

0:18:59.838,0:19:06.525
It's good to read the paper, go to factorable.net

0:19:06.525,0:19:09.073
and you get tons of analyses of why this happened

0:19:09.073,0:19:12.520
and why these devices are generated guessable numbers.

0:19:12.520,0:19:15.020
I mean numbers that are so non-random that they get

0:19:15.020,0:19:18.370
repeated between a lot of keys out there.

0:19:18.370,0:19:20.308
There was at the same time another team that

0:19:20.308,0:19:23.559
like within days announced the same result.

0:19:23.559,0:19:27.071
They independently did the same computation.

0:19:27.071,0:19:28.883
They said "yeah, we've also downloaded

0:19:28.883,0:19:31.352
a bunch of keys and factored as many as we could"

0:19:31.352,0:19:32.869
with basically the same algorithm.

0:19:32.869,0:19:35.198
At the end of it "yeah we factored

0:19:35.198,0:19:36.629
tens of thousands of keys."

0:19:36.629,0:19:38.882
That paper has no analysis.

0:19:38.882,0:19:41.582
They're like "yeah, there's keys"

0:19:41.582,0:19:43.267
"they share primes for some reason"

0:19:43.267,0:19:45.915
"I guess ecommerce is dead"

0:19:45.915,0:19:47.536
"go out, change your bank keys"

0:19:47.536,0:19:48.846
"call the New York Times"

0:19:48.846,0:19:50.553
There's the New York Times article:

0:19:50.553,0:19:52.601
"Flaw Found in an Online Encryption Method"

0:19:52.601,0:19:54.860
I also like the advertising on the side here.

0:19:54.860,0:19:57.151
Like iron key on the top and then:

0:19:57.151,0:20:01.168
"Encrypt, decrypt & access my secret data in real time"

0:20:01.168,0:20:03.135
"try it free for 30 days."

0:20:03.135,0:20:05.199
"I secured my cloud data. Have you?"

0:20:05.199,0:20:06.878
Awesome advertising there.

0:20:06.878,0:20:12.480
Once you found that there's a problem like this

0:20:12.480,0:20:14.919
then of course you start looking around for more and more places

0:20:14.919,0:20:17.435
where you might have some bad randomness.

0:20:17.435,0:20:22.795
For example there is a paper, or at least slides,

0:20:22.795,0:20:25.395
by Chou in Taiwan, saying

0:20:25.395,0:20:31.857
they took two million Taiwan Citizen Digital Certificates

0:20:31.857,0:20:34.327
and did some GCDs

0:20:34.327,0:20:36.626
and found that a 103 of those were

0:20:36.626,0:20:37.995
factorable.

0:20:37.995,0:20:40.052
So anybody who downloaded these certificates

0:20:40.052,0:20:43.177
which apparently are used in Taiwan for like paying your taxes,

0:20:43.177,0:20:44.393
talking to banks I don't know,

0:20:44.393,0:20:46.677
but paying taxes is one that I've checked.

0:20:46.677,0:20:49.537
So a 103 people out there have these Taiwan

0:20:49.537,0:20:51.011
Citizen Digital Certificates where

0:20:51.011,0:20:54.121
you're supposed to have in this database things like

0:20:54.121,0:20:57.122
the names of the people and their ID numbers

0:20:57.122,0:20:59.576
but you aren't supposed to have their secret keys.

0:20:59.576,0:21:01.543
The whole point is those are supposed to be private

0:21:01.543,0:21:03.938
kept on some smartcards that are issued to the citizens

0:21:03.938,0:21:05.295
in Taiwan.

0:21:05.295,0:21:08.706
And the randomness generation is so bad that

0:21:08.706,0:21:13.169
103 of those keys have now been factored.

0:21:13.169,0:21:15.451
The smartcard manufacturer, I also like this quote,

0:21:15.451,0:21:18.674
it's a company based in Munich called

0:21:18.674,0:21:22.315
Giesecke & Devrient and their motto is:

0:21:22.315,0:21:24.753
"Creating Confidence"

0:21:24.753,0:21:35.270
<i>applause</i>

0:21:35.270,0:21:37.698
<i>Tanja</i>: There's gonna be more of Dan, don't worry.

0:21:37.698,0:21:39.835
But we promised a bit more than just factoring

0:21:39.835,0:21:43.309
bad keys or the keys with their primes.

0:21:43.309,0:21:46.956
If you find another number like this one.

0:21:46.956,0:21:49.686
So here's coming some more of math stuff.

0:21:49.686,0:21:52.555
So if you see a number like this and you

0:21:52.555,0:21:56.427
observe this last digit, then yes.

0:21:56.427,0:21:58.738
It's very easy for you to see it's divisable by 5.

0:21:58.738,0:22:01.385
So if you find such a key that is easy to break

0:22:01.385,0:22:03.226
and you are a computer rather than a human being

0:22:03.226,0:22:05.324
you look at lots of those numbers.

0:22:05.324,0:22:07.076
You have a list of small primes stored

0:22:07.076,0:22:10.037
and what you're gonna do is just trial division.

0:22:10.037,0:22:11.987
So if there's any small factor

0:22:11.987,0:22:14.092
it's very easy to find this trial division.

0:22:14.092,0:22:16.708
Or what Dan showed was the remainder trees.

0:22:16.708,0:22:20.222
You can also do this for batch trial division.

0:22:20.222,0:22:23.728
So assuming that you're looking for a prime p somewhere

0:22:23.728,0:22:26.735
then you go linearly all the way up to the p

0:22:26.735,0:22:30.442
and there are about p/log(p) many primes

0:22:30.442,0:22:32.094
before you find this p.

0:22:32.094,0:22:33.969
For each of them you just do exact division.

0:22:33.969,0:22:36.558
If it works, fine you found a factor, otherwise not.

0:22:36.558,0:22:39.459
Now this is not going to work against your normal keys.

0:22:39.459,0:22:43.624
I know that in the example that Dan mentioned by Wagner&Goldberg

0:22:43.624,0:22:46.625
that they found one factor which had a 9 in there

0:22:46.625,0:22:49.975
but usually your keys don't have a 9.

0:22:49.975,0:22:54.687
For slightly larger numbers there is a method due to Pollard.

0:22:54.687,0:22:57.460
So if you see the respectful number here,

0:22:57.460,0:22:59.887
there is no obvious divisability of this one.

0:22:59.887,0:23:03.091
One thing you can try is

0:23:03.091,0:23:06.660
a walk, well we call this a walk.

0:23:06.660,0:23:10.404
You start with some random number smaller than N

0:23:10.404,0:23:12.902
and some offset here.

0:23:12.902,0:23:17.277
I should also say, all those scripts are available at this URL.

0:23:17.277,0:23:22.087
So if you go to facthacks.cr.yp.to then you'll find all the scripts

0:23:22.087,0:23:23.879
if you want to play with it at the same time,

0:23:23.879,0:23:26.625
including this one with some explanations and such.

0:23:26.625,0:23:31.092
So what you do is you start off two different numbers,

0:23:31.092,0:23:34.788
one at each step squares it and adds c.

0:23:34.788,0:23:37.226
And the other one squares it, adds c,

0:23:37.226,0:23:39.886
and squares it again and adds the c.

0:23:39.886,0:23:41.455
Each time computing mod N.

0:23:41.455,0:23:44.102
So you have two sequences running around,

0:23:44.102,0:23:46.023
one is twice the speed as the other.

0:23:46.023,0:23:50.953
At every step you check with the GCD of this number N

0:23:50.953,0:23:52.757
and the difference between the two walks

0:23:52.757,0:23:55.006
hasn't anything interesting now.

0:23:55.006,0:23:57.390
N is an RSA modulus, the only thing interesting

0:23:57.390,0:23:59.675
could either be p or q.

0:23:59.675,0:24:03.009
So for this particular number

0:24:03.009,0:24:04.270
when I run this

0:24:04.270,0:24:08.894
I find 2053 after a few milliseconds.

0:24:08.894,0:24:10.642
So this is again a cooked up example.

0:24:10.642,0:24:13.307
You won't find such a small factor if you're trying

0:24:13.307,0:24:16.173
a general RSA factor, otherwise your numbers

0:24:16.173,0:24:17.830
would be totally busted but

0:24:17.830,0:24:20.604
this shows if you have a small number in there

0:24:20.604,0:24:21.857
it's very easy to find.

0:24:21.857,0:24:24.357
All later on when Dan is talking about the number field sieve

0:24:24.357,0:24:26.157
this is an interesting subroutine.

0:24:26.157,0:24:28.089
So if you ever need to find small numbers

0:24:28.089,0:24:31.020
then, sure trial division is very easy for

0:24:31.020,0:24:34.080
really small numbers taking this long,

0:24:34.080,0:24:38.352
this one squared with p is much faster already.

0:24:38.352,0:24:42.070
Now even faster than that, also due to Pollard,

0:24:42.070,0:24:45.378
called Pollard's p-1 method.

0:24:45.378,0:24:47.904
Here is again an innocent looking number N.

0:24:47.904,0:24:49.842
You see the numbers get larger.

0:24:49.842,0:24:52.176
This is a 256 bit number.

0:24:52.176,0:24:56.744
In order to deal with such a number there is one step

0:24:56.744,0:24:59.342
which is expensive but you would do it only once

0:24:59.342,0:25:01.208
for all different numbers you want to try,

0:25:01.208,0:25:05.238
namely compute a big integer which has lots of little factors.

0:25:05.238,0:25:06.985
So you take the least common multiple

0:25:06.985,0:25:09.788
from 1, 2, 3, 4, 5...

0:25:09.788,0:25:12.203
Ok, once you hit the 4 you have another 2 in there.

0:25:12.203,0:25:13.808
So there's lots and lots of powers of 2,

0:25:13.808,0:25:15.408
lots and lots of powers of 3.

0:25:15.408,0:25:19.390
And then the larger primes only appear like once.

0:25:19.390,0:25:23.161
But you're not going up to 256 or 128 anywhere.

0:25:23.161,0:25:27.005
You're sticking here at 22 bits

0:25:27.005,0:25:29.059
and then use the same function that Nadia explained

0:25:31.023,0:25:32.987
in the RSA computation, namely the modular exponentiation.

0:25:32.987,0:25:36.576
So you take the 2^y,

0:25:36.576,0:25:38.241
this huge number,

0:25:38.241,0:25:40.486
but the whole computation mod N.

0:25:40.486,0:25:44.633
So this whole thing never grows beyond this 256 bit number.

0:25:44.633,0:25:47.939
Or for a real world example it would be a 1000 bit number.

0:25:47.939,0:25:52.518
And then you compute the GCD of this number and N.

0:25:52.518,0:25:55.091
For this particular one, again it's a cooked up example,

0:25:55.091,0:25:56.390
you get a factor.

0:25:56.390,0:25:59.791
Now this factor is actually 120 bits.

0:25:59.791,0:26:02.369
This is not a small factor.

0:26:02.369,0:26:05.688
So if you were using 256 bit numbers

0:26:05.688,0:26:07.856
this is something that could happen to you

0:26:07.856,0:26:11.305
and nevertheless this method would find it.

0:26:11.305,0:26:13.522
So this finds much larger factors than trial division

0:26:13.522,0:26:15.484
or the Rho method in the same amount of time.

0:26:15.484,0:26:17.376
But it doesn't work for all primes.

0:26:17.376,0:26:20.066
It works for primes which are kind of special.

0:26:20.066,0:26:23.859
Now what's special about this prime or the other factor?

0:26:23.859,0:26:29.553
If you take p and subtract -1, hence the name p-1 method,

0:26:29.553,0:26:31.660
and factor that thing,

0:26:31.660,0:26:35.272
that has lots, lots of small numbers.

0:26:35.272,0:26:36.937
So that's something we call smooth.

0:26:36.937,0:26:40.342
So a smooth number doesn't have big factors.

0:26:40.342,0:26:43.868
So a prime is like the least smooth number you can imagine.

0:26:43.868,0:26:47.367
And 2 to the power large is the most smooth number,

0:26:47.367,0:26:50.007
so lots of little factors means smooth.

0:26:50.007,0:26:55.568
This largest factor happens to be 21.7 bits

0:26:55.568,0:26:58.355
which is why I chose the 22 bits here.

0:26:58.355,0:27:03.702
So as soon I run over the largest prime in p-1

0:27:03.702,0:27:06.127
with this exponent y here

0:27:06.127,0:27:08.653
then I do find this factor.

0:27:08.653,0:27:10.851
Now if you thought this was math

0:27:10.851,0:27:12.322
there is even more scary math

0:27:12.322,0:27:15.893
namely there are methods which are generalizing

0:27:15.893,0:27:19.319
this p-1 method using elliptic curves.

0:27:19.319,0:27:21.569
If you listen to the crypto advertisement

0:27:21.569,0:27:23.472
elliptic curves are usually something very good,

0:27:23.472,0:27:25.005
something you should have on your smartcard.

0:27:25.005,0:27:27.245
It's faster than RSA and so on.

0:27:27.245,0:27:29.385
That's the same type of elliptic curve

0:27:29.385,0:27:32.342
but here elliptic curves come in as an attack tool.

0:27:32.342,0:27:35.088
There is a method called the elliptic curve method ECM

0:27:35.088,0:27:41.255
which is a generalization of this p-1 method and does not need anything special,

0:27:41.255,0:27:44.708
I mean avoiding such primes is easy.

0:27:44.708,0:27:46.325
If you look into older standards

0:27:46.325,0:27:49.852
they all warn about "make sure to use strong primes"

0:27:49.852,0:27:53.174
"make sure to check that your p-1 is not smooth"

0:27:53.174,0:27:55.301
And of course you don't want your p-1 to be smooth

0:27:55.301,0:27:57.670
because otherwise this attack works.

0:27:57.670,0:28:00.760
But if I have some elliptic curve

0:28:00.760,0:28:03.492
a different type of prime is weak for that curve

0:28:03.492,0:28:06.351
and I have many, many, many elliptic curves.

0:28:06.351,0:28:08.617
For each of the curves some primes are weak.

0:28:08.617,0:28:11.161
You can't exclude this method.

0:28:11.161,0:28:13.573
So the only thing you can do is just, well,

0:28:13.573,0:28:18.035
choose your prime large enough and hope, well,

0:28:18.035,0:28:20.436
make sure p-1 doesn't get it and hope

0:28:20.436,0:28:22.601
that the next elliptic curves won't get it.

0:28:22.601,0:28:24.855
So elliptic curves are good for attacking.

0:28:24.855,0:28:27.624
It's still not the fastest method out there

0:28:27.624,0:28:33.185
but it's a good method for random factoring.

0:28:33.185,0:28:35.917
It's not the best method for finding RSA factors

0:28:35.917,0:28:38.366
but if you have a number which is most likely

0:28:38.366,0:28:44.532
not too non-smooth then it's a good method.

0:28:44.532,0:28:47.234
Now this is what you do for finding small factors.

0:28:47.234,0:28:50.509
To show some method for big factors,

0:28:50.509,0:28:53.994
now here that is a big number.

0:28:53.994,0:28:58.978
But this is what I would call factorization by inspection.

0:28:58.978,0:29:07.876
What's wrong with this number?

0:29:07.876,0:29:10.129
I mean this number is not small and

0:29:10.129,0:29:13.162
it won't have small factors, I promise you this.

0:29:13.162,0:29:19.835
So it's a decimal representation, so 9 is the digit before 0.

0:29:19.835,0:29:23.246
This looks very close to the power of 10.

0:29:23.246,0:29:28.253
Actually this is very close to 10^340.

0:29:28.253,0:29:30.292
If you take the square root of it

0:29:30.292,0:29:34.150
then you almost exactly hit 10^170.

0:29:34.150,0:29:37.113
This example was cooked up as taking two primes

0:29:37.113,0:29:46.646
namely 10^170-33 and 10^170+63 and multiplying them.

0:29:46.646,0:29:48.667
So if you see lots of 0's and lots of 9's

0:29:48.667,0:29:50.934
then you know that you are very close to a certain power.

0:29:50.934,0:29:53.464
Now in real life this wouldn't be a power of 10,

0:29:53.464,0:29:54.817
it would be a power of 2.

0:29:54.817,0:29:56.179
And there actually are some examples.

0:29:56.179,0:29:58.817
There's a famous story of a letter bomber in Austria

0:29:58.817,0:30:02.650
who wanted his message to be crackable

0:30:02.650,0:30:06.047
and he sent a message, well he was some right-wing asshole

0:30:06.047,0:30:08.332
that was letter-bombing people

0:30:08.332,0:30:10.813
and then he sent an encrypted message to the police.

0:30:10.813,0:30:14.642
And the police tried to factor it and tried to decipher it

0:30:14.642,0:30:18.568
hoping it would be the list of the next victims.

0:30:18.568,0:30:20.980
In the end it was not but

0:30:20.980,0:30:24.419
this person was actually using very close to a power of 2

0:30:24.419,0:30:26.211
in order to have people crack it.

0:30:26.211,0:30:27.769
In the end it was just some propaganda,

0:30:27.769,0:30:29.462
some right-wing stuff as well.

0:30:29.462,0:30:32.783
So there are people using this for breakability.

0:30:32.783,0:30:35.499
I'm not aware of people actually using this as an accident.

0:30:35.499,0:30:40.437
But what can happen to you is not using close to the power of 2

0:30:40.437,0:30:46.200
but saying "ok I learned I shouldn't be close to the power of 2

0:30:46.200,0:30:48.848
so I take some offset and take the next prime."

0:30:48.848,0:30:51.130
Next prime just means, add 1, check is it prime,

0:30:51.130,0:30:53.798
add next one, add 2, add 3

0:30:53.798,0:30:57.082
Just check primes linearly from then on.

0:30:57.082,0:30:59.261
Now if we jump to this where we finally find our prime p

0:30:59.261,0:31:00.966
it's a good prime.

0:31:00.966,0:31:03.869
It's not anywhere close to the power of 2 because my c was large enough.

0:31:03.869,0:31:07.130
Made it, good.

0:31:07.130,0:31:08.562
Now on q.

0:31:08.562,0:31:10.776
So again call the next_prime().

0:31:10.776,0:31:14.445
So I do p+1, p+2 until I find a prime.

0:31:14.445,0:31:17.215
That means that if you take the product of the two

0:31:17.215,0:31:19.469
then this N is very close to a square.

0:31:19.469,0:31:22.920
So this N is cooked up with this method.

0:31:22.920,0:31:24.412
You don't see anything on the end,

0:31:24.412,0:31:26.678
there's nothing wrong there.

0:31:26.678,0:31:29.650
But when you compute the square root of it

0:31:29.650,0:31:33.156
it is almost an integer, it's .99999...

0:31:33.156,0:31:35.208
and then some dirt here.

0:31:35.208,0:31:39.288
So then you'll know that your primes were too small.

0:31:39.288,0:31:42.286
Sorry, not too small, too close to each other.

0:31:42.286,0:31:45.483
So if you ever take an RSA factor modulus

0:31:45.483,0:31:46.841
and you compute the square root

0:31:46.841,0:31:47.991
and you see something like this

0:31:47.991,0:31:51.885
then you know: that user did this method.

0:31:51.885,0:31:53.661
You could just say, I take the number,

0:31:53.661,0:31:56.549
subtract 1, subtract 2, just go off from the square root.

0:31:56.549,0:32:02.022
All you can check how far away is it from being a square.

0:32:02.022,0:32:03.701
So you take the seeding,

0:32:03.701,0:32:07.950
that means the closest integer counting upwards.

0:32:07.950,0:32:11.817
Just round it to this two, here's 57, is there a 56?

0:32:11.817,0:32:14.520
Compute the square of that and subtract N.

0:32:14.520,0:32:17.987
Now in this example that is 4096

0:32:17.987,0:32:22.595
which itself is a square.

0:32:22.595,0:32:26.636
So the difference of N and a^2 is a square.

0:32:26.636,0:32:30.915
Then I take this 64, take a-64, divide N,

0:32:30.915,0:32:34.413
it's an exact division, so I found one of the factors.

0:32:34.413,0:32:37.200
And then there's the other factor.

0:32:37.200,0:32:39.738
It doesn't matter whether it's a power of 2 or a power of 10.

0:32:39.738,0:32:42.569
What matters is that the numbers are too close.

0:32:42.569,0:32:44.712
Now this method actually works surprisingly well

0:32:44.712,0:32:47.436
even if we don't do next_prime().

0:32:47.436,0:32:50.332
So here's another example where I didn't do the next_prime()

0:32:50.332,0:32:52.281
but did something larger

0:32:52.281,0:32:55.114
so this is not really close to a square.

0:32:55.114,0:32:57.731
Some 9's but not too many.

0:32:57.731,0:33:00.053
What we did before we wrote it as a difference of squares

0:33:00.053,0:33:04.614
and then divided N by one of these two factors.

0:33:04.614,0:33:10.050
Now in this case here I would not start as a^2-N

0:33:10.050,0:33:12.255
but I say, well, if a^2-N is not a square

0:33:12.255,0:33:15.997
I try the next one, I do (a+1)^2, (a+2)^2

0:33:15.997,0:33:20.777
each time subtract N and check when this is a square.

0:33:20.777,0:33:24.170
Now for this example I get lucky after 2.

0:33:24.170,0:33:26.331
And this actually took me a long time to cook up this example

0:33:26.331,0:33:28.613
because I was starting at something which was

0:33:28.613,0:33:31.669
half of the bits of p were changed.

0:33:31.669,0:33:35.872
So I was actually starting with a cube quite a distance away.

0:33:35.872,0:33:38.216
Still, most of those were just giving a square.

0:33:38.216,0:33:42.070
They were not giving me anything larger than zero here.

0:33:42.070,0:33:45.598
Now for general numbers if I wouldn't do like next_number()

0:33:45.598,0:33:48.786
but just, well, random prime, another random prime.

0:33:48.786,0:33:52.687
It still works but takes a long time

0:33:52.687,0:33:55.488
because I can always write it this way.

0:33:55.488,0:33:56.786
This is a, this is b.

0:33:56.786,0:34:00.603
But then usually it would take about square root of N

0:34:00.603,0:34:03.752
which is the same size as p until I get there.

0:34:03.752,0:34:07.352
This is a nice method if it looks like lots of

0:34:07.352,0:34:13.386
9999 and otherwise do something better as Dan will tell now.

0:34:13.386,0:34:13.955
<i>Dan</i>: Ok.

0:34:13.955,0:34:21.508
<i>applause</i>

0:34:21.508,0:34:23.662
Alright, so it's bad to have p and q

0:34:23.662,0:34:24.805
too close to each other

0:34:24.805,0:34:28.640
or have a small p or to have p and q non-random.

0:34:28.640,0:34:31.010
So let's do everything right.

0:34:31.010,0:34:34.236
Let's make just independently choose some big p

0:34:34.236,0:34:37.033
between 2^511 and 2^512

0:34:37.033,0:34:38.667
and choose some big q

0:34:38.667,0:34:42.586
without being like the next prime or anywhere in the ballpark of p.

0:34:42.586,0:34:47.859
You could maybe say q has to be between 2^509 and 2^510

0:34:47.859,0:34:49.522
and then it's definitely nowhere near p.

0:34:49.522,0:34:52.539
Now you got this public key and p times q

0:34:52.539,0:34:54.473
which is a 1024 bit RSA key.

0:34:54.473,0:34:58.745
If nothing has gone wrong with your randomness generation

0:34:58.745,0:35:00.153
then what does the attacker do?

0:35:00.153,0:35:04.472
Well, we don't know anything fast.

0:35:04.472,0:35:06.840
So this is gonna be the one part of the talk

0:35:06.840,0:35:08.495
where there's these big powers of 2

0:35:08.495,0:35:09.778
for how fast things are

0:35:09.778,0:35:11.139
because they're not really fast.

0:35:11.139,0:35:14.624
You have to think about how much something like 2^80 is.

0:35:14.624,0:35:15.938
There's all these different methods,

0:35:15.938,0:35:17.944
I'll just skip down to the last two.

0:35:17.944,0:35:19.999
There's the quadratic sieve (QS),

0:35:19.999,0:35:22.114
the number field sieve (NFS).

0:35:22.114,0:35:24.109
These have been around since, well,

0:35:24.109,0:35:29.192
QS since the 80th, NFS since the early 90th.

0:35:29.192,0:35:32.327
The NFS, the general consensus is

0:35:32.327,0:35:35.197
it takes something like 2^80 operations.

0:35:35.197,0:35:36.675
Now here's something fun to try.

0:35:36.675,0:35:38.844
If somebody says 2^80 operations

0:35:38.844,0:35:41.473
and there's some cryptographer talking about the security or something,

0:35:41.473,0:35:44.244
ask them: what do you mean by an operation?

0:35:44.244,0:35:45.996
How fast is this operation?

0:35:45.996,0:35:47.480
And then most of the people saying this

0:35:47.480,0:35:48.624
will go running screaming like:

0:35:48.624,0:35:49.850
I don't know what an operation is,

0:35:49.850,0:35:51.193
don't ask me that question.

0:35:51.193,0:35:53.382
But still people will confidently tell you

0:35:53.382,0:35:56.230
that the NFS takes 2^80 operations

0:35:56.230,0:36:00.347
to break any 1024 bit RSA key

0:36:00.347,0:36:02.237
even if the user hasn't done anything wrong,

0:36:02.237,0:36:04.047
the implementor hasn't done anything wrong.

0:36:04.047,0:36:05.631
And this number, this 2^80

0:36:05.631,0:36:07.618
if you pin down what those operations are,

0:36:07.618,0:36:11.100
this is something which is doable today

0:36:11.100,0:36:15.746
by people with a big botnet or people with a big computer cluster.

0:36:15.746,0:36:18.611
I'll give some details what I mean, the size is there.

0:36:18.611,0:36:21.797
As your computers get cheaper and cheaper,

0:36:21.797,0:36:23.720
somehow chips aren't going up in clockspeed

0:36:23.720,0:36:25.149
but they're still getting cheaper.

0:36:25.149,0:36:27.636
You can get a really cheap ARM which is doing

0:36:27.636,0:36:28.904
the same kind of computations

0:36:28.904,0:36:31.501
that a pretty big CPU was doing several years ago.

0:36:31.501,0:36:33.181
And chips are gonna continue getting cheaper

0:36:33.181,0:36:36.604
for a while. These attacks against RSA 1024

0:36:36.604,0:36:39.568
are going to be common doable for more and more people.

0:36:39.568,0:36:43.119
How do these methods work?

0:36:43.119,0:36:47.210
I'll give one example and then one little Sage script.

0:36:47.210,0:36:50.169
Let's try just a really small number.

0:36:50.169,0:36:51.717
This will be with the quadratic sieve

0:36:51.717,0:36:53.727
which is not as fancy as the number field sieve

0:36:53.727,0:36:56.395
but it will give you some idea of how these methods work.

0:36:56.395,0:36:59.978
The QS starts with that Fermat method you heard about.

0:36:59.978,0:37:04.066
Remember, the idea there was to try to find a square

0:37:04.066,0:37:08.574
as a^2-N, so <i>a</i> was like the square root of N

0:37:08.574,0:37:10.525
or maybe the plus 1, plus 2 and so on

0:37:10.525,0:37:13.333
and you keep searching for a+1 and so on,

0:37:13.333,0:37:15.933
search for numbers that if you square them and subtract N

0:37:15.933,0:37:17.056
then you get a square.

0:37:17.056,0:37:20.708
So let's try that for 50=2759 which is not very big.

0:37:20.708,0:37:23.975
Then you try, 53 was the ceiling of the square root,

0:37:23.975,0:37:27.673
square that, subtract 2759, you get 50.

0:37:27.673,0:37:31.171
Well 25 would be a square, but 50 is not a square

0:37:31.171,0:37:33.309
it's 2 times 25, 2 times 5^2.

0:37:33.309,0:37:37.208
And then you try another number, 54^2-2759

0:37:37.208,0:37:39.911
umm 157, that's too complicated

0:37:39.911,0:37:42.707
I don't remember that being a square so let's just skip it.

0:37:42.707,0:37:45.687
And then you keep going like this, 266, 377,

0:37:45.687,0:37:48.223
<b>490</b>. I remember 49 is a square

0:37:48.223,0:37:50.354
but that doesn't mean that 490 is a square

0:37:50.354,0:37:52.372
cause 10, 2 times 5, that's not a square.

0:37:52.372,0:37:55.871
And 605, you can try...

0:37:55.871,0:37:58.640
We remember how to take out the 5

0:37:58.640,0:38:01.275
and then 60... 605 divided by 5.

0:38:01.275,0:38:03.588
You can figure out it's 5 times 11^2.

0:38:03.588,0:38:05.158
It's almost a square.

0:38:05.158,0:38:06.774
You keep doing this for a while

0:38:06.774,0:38:10.534
and it seems like Fermat's method is actually working pretty badly.

0:38:10.534,0:38:13.754
It does work eventually but it's taking quite a while.

0:38:13.754,0:38:16.671
Here's what the quadratic sieve does.

0:38:16.671,0:38:19.527
From the same computation you look at these

0:38:19.527,0:38:21.585
numbers that were not exactly squares

0:38:21.585,0:38:23.634
50 and 490 and 605

0:38:23.634,0:38:27.056
and you notice if you multiple them together

0:38:27.056,0:38:28.472
50 was 2 times 5^2

0:38:28.472,0:38:30.756
490 is 2 times 5 times 7^2

0:38:30.756,0:38:33.068
605 is 5 times 11^2

0:38:33.068,0:38:34.585
and then you multiply those together

0:38:34.585,0:38:38.524
you get 2^2 and 5^4 and 7^2 and 11^2

0:38:38.524,0:38:41.123
and that's exactly the square of something.

0:38:41.123,0:38:44.592
The square of 2^1 times 5^2 times 7 times 11.

0:38:44.592,0:38:48.175
Then the quadratic sieve is taking the square root of that number

0:38:48.175,0:38:52.553
and subtracting that from the product of the fifty-somethings

0:38:52.553,0:38:53.953
that were on the left side

0:38:53.953,0:38:56.826
and taking the greatest common divisor of that with N

0:38:56.826,0:38:59.787
and amazingly that produces a factor of N.

0:38:59.787,0:39:02.289
And it's not just some random accident that

0:39:02.289,0:39:04.370
if you had tried the greatest common divisor of

0:39:04.370,0:39:06.085
"write down some hocuspocus then"

0:39:06.085,0:39:08.803
"you eventually get 31 coming out"

0:39:08.803,0:39:10.576
"every 31 times you write down a random number"

0:39:10.576,0:39:13.020
"it will have this 31 dividing it"

0:39:13.020,0:39:15.347
No, you can try this for bigger and bigger numbers

0:39:15.347,0:39:17.038
and actually this keeps working.

0:39:17.038,0:39:18.663
As soon as you find a square product

0:39:18.663,0:39:23.212
then half of those square products are gonna factor N.

0:39:23.212,0:39:25.365
So that's how the quadratic sieve works.

0:39:25.365,0:39:28.915
Here's a more systematic script with a much bigger number

0:39:28.915,0:39:31.300
which if it were just hocuspocus this wouldn't work.

0:39:31.300,0:39:32.616
But this does work.

0:39:32.616,0:39:36.834
So there's a number from my computer's random number generator

0:39:36.834,0:39:41.661
31415926... <i>laughter</i>

0:39:41.661,0:39:45.099
You try writing down a bunch of squares...

0:39:45.099,0:39:48.285
maybe I should get my computer's random number generator checked.

0:39:48.285,0:39:50.636
It is this two year old laptop you heard about before

0:39:50.636,0:39:52.202
so I'm not sure it's working properly.

0:39:52.202,0:39:55.599
You take a bunch of a^2-N and then

0:39:55.599,0:39:57.414
let's make a list of those called X.

0:39:57.414,0:40:00.752
You can see the range, there is up to 500.000 numbers.

0:40:00.752,0:40:02.948
It's a pretty big list we're talking about.

0:40:02.948,0:40:05.749
And then search through those elements,

0:40:05.749,0:40:08.538
search through those a^2-N, those differences,

0:40:08.538,0:40:09.504
the elements in this list,

0:40:09.504,0:40:12.985
to see which ones have easy factorizations.

0:40:12.985,0:40:15.445
Now a lot of numbers like that 157

0:40:15.445,0:40:16.970
I know what the factorization of that is

0:40:16.970,0:40:19.846
but there is function which you can find on our website

0:40:19.846,0:40:22.696
easyfactorizations() which looks through the list X

0:40:22.696,0:40:25.251
and if it finds numbers that are easy to factor

0:40:25.251,0:40:27.935
then it quickly writes down the factorizations

0:40:27.935,0:40:30.482
using the small prime techniques you heard about.

0:40:30.482,0:40:33.169
And then makes a list of those easy factorizations,

0:40:33.169,0:40:34.396
puts those into F.

0:40:34.396,0:40:37.600
And then there's a big chunk which I won't try to explain

0:40:37.600,0:40:39.986
which is called linear algebra mod 2.

0:40:39.986,0:40:42.586
And that somehow figures out a square

0:40:42.586,0:40:45.697
that comes from the factorizations,

0:40:45.697,0:40:48.363
somehow looks through this list of factorizations

0:40:48.363,0:40:50.803
and magically applies some linear algebra stuff.

0:40:50.803,0:40:52.612
Well ok, I won't go through that.

0:40:52.612,0:40:54.071
It's something doable

0:40:54.071,0:40:58.604
and this is using some standard Sage functions to do it pretty easily.

0:40:58.604,0:41:01.738
There's all sorts of things that

0:41:01.738,0:41:03.664
in the interest of time I won't get into

0:41:03.664,0:41:05.867
like how this easyfactorizations() works.

0:41:05.867,0:41:07.603
And this is something where people write papers

0:41:07.603,0:41:09.801
and papers and papers talking about trying to make this

0:41:09.801,0:41:11.603
go as quickly as possible.

0:41:11.603,0:41:14.737
I do want to emphasize one fact about these methods:

0:41:14.737,0:41:17.484
the bold-faced **very small memory requirements**.

0:41:17.484,0:41:20.038
You can use the elliptic curve method

0:41:20.038,0:41:21.531
that you heard about before

0:41:21.531,0:41:24.882
you can use that to figure out easy factorizations

0:41:24.882,0:41:27.063
using very little memory.

0:41:27.063,0:41:28.503
That means you can build a chip

0:41:28.503,0:41:30.471
like a FPGA, a special purpose ASIC,

0:41:30.471,0:41:33.285
you can have thousands of little ECM units

0:41:33.285,0:41:35.066
running in parallel, so you can really

0:41:35.066,0:41:38.153
massively parallelize this easyfactorizations() function.

0:41:38.153,0:41:40.187
So if people tell you "oh you need tons of

0:41:40.187,0:41:43.386
memory for sieving to find easy factorizations"

0:41:43.386,0:41:45.868
then you should tell them "no no, that's not true"

0:41:45.868,0:41:47.979
"ECM you can do with very little memory

0:41:47.979,0:41:50.351
running massively parallelizable"

0:41:50.351,0:41:52.802
And then there's other things which,

0:41:52.802,0:41:54.490
I suppose if we had another hour

0:41:54.490,0:41:56.432
then we could get into all these details

0:41:56.432,0:41:59.117
of like how the number field sieve goes beyond this.

0:41:59.117,0:42:01.970
It's a similar kind of method but gets more complicated.

0:42:01.970,0:42:05.181
I do want to emphasize again thing in bold-face here

0:42:05.181,0:42:06.946
which is that if somebody tells you

0:42:06.946,0:42:10.438
"oh 2^80 operations, the attacker wouldn't do

0:42:10.438,0:42:13.796
that because the target isn't worth that much to them"

0:42:13.796,0:42:18.284
Well, **Batch NFS** is a way to take a bunch of N's

0:42:18.284,0:42:20.566
and like the batch techniques before

0:42:20.566,0:42:23.516
there's some magic ways to find redundancies

0:42:23.516,0:42:25.884
between doing factorizations of those N's.

0:42:25.884,0:42:26.937
So somebody tells you:

0:42:26.937,0:42:30.000
"oh 2^80, that isn't worth it for the attacker."

0:42:30.000,0:42:33.296
Well, **Batch NFS** reduces the cost quite a bit

0:42:33.296,0:42:34.821
for factoring each key.

0:42:34.821,0:42:36.846
If the attacker has a lot of keys to target

0:42:36.846,0:42:39.852
they can factor all of them in not much more cost

0:42:39.852,0:42:41.396
than just factoring one.

0:42:41.396,0:42:43.734
So don't believe people who take an economic view

0:42:43.734,0:42:46.987
how much the number field sieve is worth.

0:42:46.987,0:42:49.416
Alright, what does this mean?

0:42:49.416,0:42:52.118
Can the attacker actually do the NFS

0:42:52.118,0:42:54.134
once they've put in all these optimizations?

0:42:54.134,0:42:59.064
Well if you look carefully at the analyses of the NFS

0:42:59.064,0:43:02.416
there are only about 2^70 differences.

0:43:02.416,0:43:04.149
Things like these a^2-N's.

0:43:04.149,0:43:08.862
If you look through 2^70 of those and scan them properly,

0:43:08.862,0:43:10.773
find easy factorizations, do linear algebra,

0:43:10.773,0:43:13.669
you will factor any 1024 bit key.

0:43:13.669,0:43:16.314
And 2^70, how big is that number?

0:43:16.314,0:43:20.820
It's actually small enough that you can do this computation on

0:43:20.820,0:43:22.469
your favorite botnet.

0:43:22.469,0:43:25.749
Say, Conficker is shrinking, still

0:43:25.749,0:43:27.947
and it will probably be wiped out at some point

0:43:27.947,0:43:29.488
but it's an example of what you can do

0:43:29.488,0:43:32.468
using any of these security problems that are deployed

0:43:32.468,0:43:33.632
on enough machines.

0:43:33.632,0:43:35.398
You can break into a bunch of machines,

0:43:35.398,0:43:36.521
millions of machines.

0:43:36.521,0:43:40.296
Conficker estimates vary between 7 million and 12 million machines.

0:43:40.296,0:43:44.320
So use your, say, 2^23, that's 8 million machines.

0:43:44.320,0:43:46.519
Count how many seconds there are in a year

0:43:46.519,0:43:48.553
and then, say, ok that means we have to do

0:43:48.553,0:43:52.070
2^22 differences per second machine.

0:43:52.070,0:43:56.902
And that is slower than state of the art factorization code already runs.

0:43:56.902,0:43:59.231
So it's actually a pretty easy computation.

0:43:59.231,0:44:02.334
You don't even need that many millions of computers to do this.

0:44:02.334,0:44:04.470
If people tell you "wait a minute"

0:44:04.470,0:44:06.120
"you can't just use a bunch of machines"

0:44:06.120,0:44:08.073
"there's all this work for linear algebra"

0:44:08.073,0:44:09.966
Then I'll just skip this.

0:44:09.966,0:44:12.882
Be aware that linear algebra is not as much of a problem

0:44:12.882,0:44:15.334
as some people would say.

0:44:15.334,0:44:18.872
If you use a big botnet there's one little problem.

0:44:18.872,0:44:21.053
For an attacker who breaks into a bunch of machines

0:44:21.053,0:44:23.903
and then starts spinning them up, heating up the CPU,

0:44:23.903,0:44:25.921
the fan starts running, the user starts wondering

0:44:25.921,0:44:28.017
"why is my machine running so slowly?"

0:44:28.017,0:44:30.133
Maybe only use half of the CPU

0:44:30.133,0:44:33.570
but still it has still got a good chance of getting you detected

0:44:33.570,0:44:35.202
and kicked off the machines.

0:44:35.202,0:44:37.266
Users are gonna shut you down if you're doing a

0:44:37.266,0:44:39.463
say, year of computation on your botnet.

0:44:39.463,0:44:42.883
So what do you do instead?

0:44:42.883,0:44:45.021
Well you build a little computer cluster.

0:44:45.021,0:44:49.354
Now yesterday we heard about a big building

0:44:49.354,0:44:51.164
in Bluffdale, Utah, that apparently

0:44:51.164,0:44:54.832
is going to store all of the data collected by

0:44:54.832,0:44:57.171
the NSA for the next hundred years

0:44:57.171,0:44:58.854
or maybe forever.

0:44:58.854,0:45:00.914
But something that I didn't hear emphasized yesterday

0:45:00.914,0:45:04.916
was that this building also has a new power substation

0:45:04.916,0:45:08.150
which is drawing 65 megawatts,

0:45:08.150,0:45:10.155
well generating 65 megawatts.

0:45:10.155,0:45:13.999
Now what are we doing with 65 megawatts?

0:45:13.999,0:45:15.308
Is that the kind of power that you need to

0:45:15.308,0:45:17.490
store a bunch of data on tapes?

0:45:17.490,0:45:21.175
No, that's the kind of power you need to do computations.

0:45:21.175,0:45:24.725
So what is NSA doing with 2^26 watts?

0:45:24.725,0:45:26.955
Or maybe even with more watts.

0:45:26.955,0:45:30.909
You shouldn't actually think that 2^26 is such a big number.

0:45:30.909,0:45:33.526
Here's a little table with, well,

0:45:33.526,0:45:36.573
2^57 admittedly is pushing it a little bit.

0:45:36.573,0:45:40.620
This is the total power that the earth gets from the sun

0:45:40.620,0:45:43.207
of which about half gets to the earth's surface.

0:45:43.207,0:45:47.610
2^44 is the amount the human is using right now.

0:45:47.610,0:45:50.888
Varies a little bit but I mean this is the average power

0:45:50.888,0:45:53.427
including cars and such.

0:45:53.427,0:45:56.446
If you have the botnet that breaks into millions of machines

0:45:56.446,0:45:59.944
and runs in flat-out that's about 2^30 watts.

0:45:59.944,0:46:01.610
And actually if you think about it, wait a minute,

0:46:01.610,0:46:02.575
there's a lot of machines out there.

0:46:02.575,0:46:07.270
This means the 2^26 is not actually that much power.

0:46:07.270,0:46:11.456
It's just one dinky little Bluffdale 65 MW computer center

0:46:11.456,0:46:14.594
and actually if you're a government agency

0:46:14.594,0:46:16.138
you probably have more than one of those.

0:46:16.138,0:46:19.273
As you heard yesterday in the context of data storage

0:46:19.273,0:46:21.292
but again, it's not just data storage.

0:46:21.292,0:46:26.492
You don't make a 65 MW power station to just store data.

0:46:26.492,0:46:30.207
Ok, if you just take standard graphics processing units

0:46:30.207,0:46:33.223
and run 2^26 watts to those that will do about

0:46:33.223,0:46:36.446
2^84 floating point multiplications per year.

0:46:36.446,0:46:39.827
You have to figure out, ok, what are exactly the operations involved here.

0:46:39.827,0:46:44.829
This should be enough to break 1024 bit RSA

0:46:44.829,0:46:48.088
and if NSA is not just buying GPU's off-the-shelves

0:46:48.088,0:46:50.226
but really tuning chips that they build

0:46:50.226,0:46:57.409
using IBM to manufacture their chips at the moment.

0:46:57.409,0:47:01.626
Apparently it wasn't cost-effective for NSA to manufacture their own chips

0:47:01.626,0:47:04.475
so in 2005 they started subcontracting for IBM

0:47:04.475,0:47:08.153
but presumably through that fabrication IBM is building chips

0:47:08.153,0:47:11.073
that NSA wants and those should be about 10 times faster

0:47:11.073,0:47:13.025
than what GPU's can do.

0:47:13.025,0:47:18.362
So it should be possible with this little 65 MW computer cluster

0:47:18.362,0:47:21.476
to factor at least 1, maybe even 10,

0:47:21.476,0:47:23.478
and then with batch NFS maybe even more

0:47:23.478,0:47:30.010
1024 bit RSA keys in a year.

0:47:30.010,0:47:38.229
<i>applause</i>

0:47:38.229,0:47:41.130
<i>Nadia</i>: So to conclude things up here I want to

0:47:41.130,0:47:43.290
explain how you can use

0:47:43.290,0:47:46.057
from the comfort of your very own home

0:47:46.057,0:47:52.242
a distributed, a massive scale distributed cloud computing service

0:47:52.242,0:47:57.142
to calculate private keys on your own.

0:47:57.142,0:48:00.313
So many of you may be familiar with Amazon EC2

0:48:00.313,0:48:04.635
but it turns out Google also has a large amount of computing infrastructure

0:48:04.635,0:48:07.339
and they actually have a very convenient web interface

0:48:07.339,0:48:08.446
to access it.

0:48:08.446,0:48:12.978
<i>laugther</i>

0:48:12.978,0:48:21.677
<i>applause</i>

0:48:21.677,0:48:23.961
It's amazing what you can find on the Internet.

0:48:23.961,0:48:26.143
<i>laughter</i>

0:48:26.143,0:48:29.393
So ok, here's some keys except, you know,

0:48:29.393,0:48:30.578
if you look at these carefully

0:48:30.578,0:48:33.277
not all of these are sort of obviously RSA keys.

0:48:33.277,0:48:35.204
There are some problems here

0:48:35.204,0:48:40.704
like the second to last one on the bottom.

0:48:40.704,0:48:41.688
So...

0:48:41.688,0:48:43.338
<i>laughter</i>

0:48:43.338,0:48:49.820
After doing this search we found this key and

0:48:49.820,0:48:55.279
someone seems to have pasted a private key into pastebin

0:48:55.279,0:49:00.491
and someone came along and interrupted part of it

0:49:00.491,0:49:04.759
with some profanity.

0:49:04.759,0:49:07.387
This is an interesting problem.

0:49:07.387,0:49:08.812
What do we do with this key?

0:49:08.812,0:49:12.310
Clearly this is an interesting key, we must have this key.

0:49:12.310,0:49:14.204
<i>laughter</i>

0:49:14.204,0:49:24.847
<i>applause</i>

0:49:24.847,0:49:26.046
Step 1... yeah?

0:49:26.046,0:49:29.572
<i>Male</i>: Did you see the easter egg in row 1?

0:49:29.572,0:49:31.204
<i>Nadia</i>: Well, we'll discuss this.

0:49:31.204,0:49:34.308
<i>Angel</i>: Questions will be later, after the talk please.

0:49:34.308,0:49:39.368
<i>Nadia</i>: So yeah, we've removed the obvious problems here.

0:49:39.368,0:49:41.636
But there's still an issue which is that

0:49:41.636,0:49:43.844
we're missing hundreds and hundreds of bits of key.

0:49:43.844,0:49:45.940
We can't brute force-this.

0:49:45.940,0:49:47.658
What are we gonna do?

0:49:47.658,0:49:51.327
Well, let's look at what RSA keys actually are.

0:49:51.327,0:49:53.509
I introduced RSA keys and I was, like,

0:49:53.509,0:49:56.786
it's a d and d has like a thosand of bits.

0:49:56.786,0:50:00.671
And if we don't know p and q

0:50:00.671,0:50:02.608
how are we gonna get this d?

0:50:02.608,0:50:07.923
Here is the storage format for an RSA key.

0:50:07.923,0:50:11.258
Public key is the modulus N, the public exponent e.

0:50:11.258,0:50:14.710
Ok, private key has version, N, e,

0:50:14.710,0:50:16.536
d the decryption exponent,

0:50:16.536,0:50:20.273
p, q, d mod (p-1), d mod (q-1),

0:50:20.273,0:50:22.975
the inverse of q mod p,

0:50:22.975,0:50:25.305
this is all useful information if you want to

0:50:25.305,0:50:26.960
speed up your calculation and not have to

0:50:26.960,0:50:29.405
compute everything every time you use your key.

0:50:29.405,0:50:32.493
Okay.

0:50:32.493,0:50:35.952
What did we see here then?

0:50:35.952,0:50:40.975
Here is the key broken up into all of the difference pieces.

0:50:40.975,0:50:44.804
So we have: the red bit is approximately where N is sitting.

0:50:44.804,0:50:46.754
We've got e in orange.

0:50:46.754,0:50:50.569
We've got part of d in yellow.

0:50:50.569,0:50:52.372
We seem to be missing p

0:50:52.372,0:50:54.656
but we've got part of q

0:50:54.656,0:50:56.791
and then here's d mod p-1

0:50:56.791,0:51:02.408
and d mod q-1 and q inverse mod p.

0:51:02.408,0:51:07.712
There's also a little problem before the red here

0:51:07.712,0:51:08.436
<i>laughter</i>

0:51:08.436,0:51:10.504
in addition.

0:51:10.504,0:51:13.561
You might call this the private part of the private key.

0:51:13.561,0:51:15.936
<i>laughter</i>

0:51:15.936,0:51:21.391
<i>applause</i>

0:51:21.391,0:51:23.859
Fortunately this is just sitting in an encoding of the length

0:51:23.859,0:51:30.275
and the version. <i>laughter</i>

0:51:30.275,0:51:33.151
Alright, so what do we do with this information?

0:51:33.151,0:51:36.673
Basically given any part of this private key

0:51:36.673,0:51:38.910
we can easily compute all of the other parts.

0:51:38.910,0:51:40.257
Simple formulas:

0:51:40.257,0:51:48.938
q will be 2^(e times d mod p-1) - 1 and GCD that with N

0:51:48.938,0:51:51.872
then we get q and then

0:51:51.872,0:51:54.455
we can figure out what p was by dividing

0:51:54.455,0:51:56.393
and figure out what d is

0:51:56.393,0:51:57.953
and so on and so forth.

0:51:57.953,0:52:02.460
You can do this even if you have a part of a single value

0:52:02.460,0:52:04.109
from the private key.

0:52:04.109,0:52:08.195
You can still compute the rest of the private key from that.

0:52:08.195,0:52:12.338
We don't have time to get into this but it's online.

0:52:12.338,0:52:16.557
You've seen a little bit of math.

0:52:16.557,0:52:18.060
Single formulas, typed into Sage.

0:52:18.060,0:52:22.591
We can reconstruct the private key here.

0:52:22.591,0:52:29.487
<i>applause</i>

0:52:29.487,0:52:32.941
So, what are the lessons that we can learn

0:52:32.941,0:52:35.554
from everything that we've done today.

0:52:35.554,0:52:38.945
The first one is: stop using 1024 bit RSA

0:52:38.945,0:52:41.010
if you haven't already.

0:52:41.010,0:52:43.386
I have looked at the keys that are out there on the Internet

0:52:43.386,0:52:45.941
and you are still using 1024 bit RSA.

0:52:45.941,0:52:48.463
<i>laughter</i>

0:52:48.463,0:52:51.790
Second: make sure your primes are big enough.

0:52:51.790,0:52:54.227
Don't try to be clever about how you're generating them.

0:52:54.227,0:52:58.025
Make sure your primes are random.

0:52:58.025,0:52:59.303
You guys have some problems

0:52:59.303,0:53:01.394
especially in hardware.

0:53:01.394,0:53:03.621
Also...

0:53:03.621,0:53:04.921
<i>laughter</i>

0:53:04.921,0:53:09.091
"FUCK A DUCK" is not good crypto.

0:53:09.091,0:53:11.545
Pastebin is not a secure cloud store.

0:53:11.545,0:53:13.874
<i>laughter</i>

0:53:13.874,0:53:20.289
<i>applause</i>

0:53:20.289,0:53:22.657
And you probably shouldn't put your private key

0:53:22.657,0:53:24.778
in a secure cloud store anyway.

0:53:24.778,0:53:25.862
<i>laughter</i>

0:53:25.862,0:53:31.396
<i>applause</i>

0:53:31.396,0:53:32.577
And lastly...

0:53:32.577,0:53:34.211
<i>laughter</i>

0:53:34.211,0:53:40.427
<i>applause</i>

0:53:40.427,0:53:41.625
Thank you.

0:53:41.625,0:53:43.894
<i>applause</i>

0:53:43.894,0:53:44.804
<i>Angel</i>: Give them an applause.

0:53:44.804,0:54:05.937
<i>applause</i>

0:54:05.937,0:54:11.379
<i>Angel</i>: So questions will be taken at the numbered microphones.

0:54:11.379,0:54:14.962
So anyone who has a question please line up at a microphone

0:54:14.962,0:54:16.959
and we will start by the signal angel

0:54:16.959,0:54:19.528
who has a question from IRC.

0:54:19.528,0:54:23.628
*Signal Angel*: IRC has a lot of punch lines involving penisses

0:54:23.628,0:54:26.040
and some questions.

0:54:26.040,0:54:31.445
First question is: would you recommend more using

0:54:31.445,0:54:37.640
elliptic curves or RSA with multiple primes...

0:54:37.640,0:54:41.659
using proper primes.

0:54:41.659,0:54:47.192
<i>Dan</i>: There are ways to do elliptic curves wrong,

0:54:47.192,0:54:48.854
there are ways to do RSA wrong.

0:54:48.854,0:54:53.806
In general if you've got a particular performance requirement

0:54:53.806,0:54:55.625
then elliptic curves are gonna meet that

0:54:55.625,0:54:57.861
with a much higher security level

0:54:57.861,0:55:00.208
than RSA is gonna meet that.

0:55:00.208,0:55:03.005
Of course you can do RSA securely,

0:55:03.005,0:55:04.928
you can do elliptic curves securely

0:55:04.928,0:55:07.153
but if you've got some performance limitation

0:55:07.153,0:55:08.613
as many applications do

0:55:08.613,0:55:13.043
then elliptic curves tend to be a better choice.

0:55:13.043,0:55:14.429
<i>Nadia</i>: I also want to clarify

0:55:14.429,0:55:17.826
the other most commonly used public-key cryptosystem

0:55:17.826,0:55:19.979
is DSA, the Digital Signature Algorithm.

0:55:19.979,0:55:21.426
There's also elliptic curve DSA

0:55:21.426,0:55:23.291
which is probably what you're thinking about

0:55:23.291,0:55:25.570
with elliptic curve cryptography.

0:55:25.570,0:55:31.207
DSA is in my opinion actually much worse

0:55:31.207,0:55:34.395
than RSA in terms of randomness failures.

0:55:34.395,0:55:36.246
In the paper that Dan referenced

0:55:36.246,0:55:39.377
we also talk about randomness failures in DSA

0:55:39.377,0:55:40.993
and it's horrifying.

0:55:40.993,0:55:44.521
If you ever ever screw up randomness in a single signature

0:55:44.521,0:55:48.354
your entire public key is lost.

0:55:48.354,0:55:50.761
<i>Angel:</i> We will take the next question from microphone nr 4

0:55:50.761,0:55:53.072
to the right of the seat.

0:55:53.072,0:55:53.824
<i>Jake</i>: Hey guys.

0:55:53.824,0:55:56.277
Sorry I didn't mention the computing power of Bluffdale.

0:55:56.277,0:56:02.072
That said, when we want to transition from 1024 bit RSA

0:56:02.072,0:56:05.339
to something else, what do you think is a good idea?

0:56:05.339,0:56:10.028
So for example, in Tor we do use 1024 bit RSA for TLS.

0:56:10.028,0:56:12.528
<i>laughter</i> Yeah I know, right?

0:56:12.528,0:56:15.592
So the thing is we're working on changing these things

0:56:15.592,0:56:17.079
but what is...

0:56:17.079,0:56:19.766
I mean Zooko has this 100 year crypto project.

0:56:19.766,0:56:21.806
What is the real thing that we should,

0:56:21.806,0:56:23.910
like if we could switch tomorrow to something,

0:56:23.910,0:56:26.595
what's the useful that we can live with for 5 or 10 years?

0:56:26.595,0:56:28.795
Is 2048 going to be useful?

0:56:28.795,0:56:31.179
Is 4096 where we should go tomorrow?

0:56:31.179,0:56:32.780
Cause there is a trade-off between

0:56:32.780,0:56:34.588
performance and forward secrecy

0:56:34.588,0:56:37.962
and there are a lot of things to think about.

0:56:37.962,0:56:40.078
<i>Tanja</i>: If you tell me 5 to 10 years

0:56:40.078,0:56:42.360
I would be still comfortable with elliptic curves.

0:56:42.360,0:56:44.773
If you talk 100 years

0:56:44.773,0:56:47.691
and then there's all these worse quantum computers around

0:56:47.691,0:56:51.143
then I would go for something which is worse in performance

0:56:51.143,0:56:52.829
like code-based cryptography

0:56:52.829,0:56:54.043
or for signatures hashing

0:56:54.043,0:56:55.738
but if you'd say in 5 to 10 years

0:56:55.738,0:56:57.411
I'm very comfortable recommending to you

0:56:57.411,0:56:59.941
elliptic curves with 256 bits.

0:56:59.941,0:57:02.160
<i>Jake</i>: Ok, thanks.

0:57:02.160,0:57:04.878
<i>Angel</i>: Signal angel?

0:57:04.878,0:57:07.190
*Signal Angel*: Yeah, another question from IRC.

0:57:07.190,0:57:11.028
If you avoid all the easy primes

0:57:11.028,0:57:14.227
does this shrink the search space such that

0:57:14.227,0:57:17.862
it becomes easier to crack the remaining keys?

0:57:17.862,0:57:20.161
Or asked another way:

0:57:20.161,0:57:25.146
can you quantify the ratio of easy primes versus hard primes?

0:57:25.146,0:57:26.287
<i>Dan</i>: Yeah, that's a good question.

0:57:26.287,0:57:28.573
The answer is: yes, it can be quantified

0:57:28.573,0:57:32.829
and once you're up to about to the 1024 bit RSA key level

0:57:32.829,0:57:35.306
then there's very very few weak primes.

0:57:35.306,0:57:37.078
So if you have a random number,

0:57:37.078,0:57:38.593
you just generate a random prime

0:57:38.593,0:57:40.238
then your chance of bumping into something weak

0:57:40.238,0:57:42.793
is so small that you just don't have to worry about it.

0:57:42.793,0:57:44.862
It is one of those much less frequent than

0:57:44.862,0:57:46.580
being hit by a lightning kind of events.

0:57:46.580,0:57:49.210
It is something that have been quantified

0:57:49.210,0:57:52.244
and it is an issue for smaller RSA keys

0:57:52.244,0:57:54.872
back when people were using, say, 512 bit RSA

0:57:54.872,0:57:56.754
then it was actually a noticable issue.

0:57:56.754,0:57:59.123
But once you're at 1024 or above then

0:57:59.123,0:58:00.694
the issue is basically gone.

0:58:00.694,0:58:03.831
It's something where you can and you should

0:58:03.831,0:58:06.097
look at exactly what the chance is of bumping into a weak

0:58:06.097,0:58:08.828
but it's not reallistically something

0:58:08.828,0:58:13.259
you're gonna encounter with 1024 bit RSA.

0:58:13.259,0:58:16.542
<i>Angel</i>: We're gonna take the next question from nr 1.

0:58:16.542,0:58:19.313
Just one notice: we don't have a lot of time left

0:58:19.313,0:58:23.546
so even though there is a talk in one hour

0:58:23.546,0:58:25.078
just ask.

0:58:25.078,0:58:28.165
<i>Male</i>: There is a devastating attack that can break AES,

0:58:28.165,0:58:31.872
SHA2, most probably SHA3 and DES.

0:58:31.872,0:58:33.758
It's called a biclique attack.

0:58:33.758,0:58:35.664
Are you concerned that it might break RSA

0:58:35.664,0:58:37.281
with any key size and even ECC

0:58:37.281,0:58:39.988
and even any public-key crypto?

0:58:39.988,0:58:41.660
<i>Dan</i>: No.

0:58:41.660,0:58:43.594
<i>laughter</i>

0:58:43.594,0:58:46.675
Bicliques are something which are certainly interesting

0:58:46.675,0:58:49.590
and I think about it as a way to speed up brute force

0:58:49.590,0:58:52.508
and it speeds up brute force by a noticable factor,

0:58:52.508,0:58:55.148
often makes attacks, say, twice as fast.

0:58:55.148,0:58:58.523
But with public-key crypto we have attacks which are

0:58:58.523,0:59:00.760
much much faster than brute force to begin with

0:59:00.760,0:59:04.209
so that kind of speed-up of brute force won't be competitive

0:59:04.209,0:59:08.226
with things like the number field sieve.

0:59:08.226,0:59:12.964
<i>Angel</i>: Next is from number 3.

0:59:12.964,0:59:16.824
<i>Male</i>: Me not coming from the dark side of mathematics,

0:59:16.824,0:59:20.923
how do I know that my random number generators are fine

0:59:20.923,0:59:23.661
for generating keys?

0:59:23.661,0:59:26.223
<i>laughter</i>

0:59:26.223,0:59:31.354
<i>Nadia</i>: Seed them.

0:59:31.354,0:59:37.042
Basically the things that are out there if you're using

0:59:37.042,0:59:40.613
a standard random number generator like in Linux,

0:59:40.613,0:59:44.441
actually Linux has added patches to the kernel

0:59:44.441,0:59:48.022
to fix some of the problems that we've found.

0:59:48.022,0:59:50.431
If you want to know that you're keys are good:

0:59:50.431,0:59:54.694
if you generate them on a general purpose computer

0:59:54.694,0:59:56.600
after it has been running for a long time

0:59:56.600,0:59:59.559
you're probably fine.

0:59:59.559,1:00:04.793
I would not generate very important or long-term keys

1:00:04.793,1:00:09.707
on low-power hardware devices where you can't tell...

1:00:09.707,1:00:11.910
<i>laughter</i>

1:00:11.910,1:00:15.758
<i>Male</i>: Thank you.

1:00:15.758,1:00:18.341
<i>Angel</i>: Next question will be taken from number 4.

1:00:18.341,1:00:19.394
<i>Male</i>: Hi.

1:00:19.394,1:00:23.690
It's just a remark, not really a question.

1:00:23.690,1:00:25.407
I work in high performance computing.

1:00:25.407,1:00:26.910
I was at the supercomputing conference

1:00:26.910,1:00:29.376
a couple of weeks ago.

1:00:29.376,1:00:31.931
They're presenting large-scale installations

1:00:31.931,1:00:35.863
and dealing with problems of power.

1:00:35.863,1:00:38.927
They want to build petaflops and exaflops systems

1:00:38.927,1:00:42.262
that are in a range of 20 MW.

1:00:42.262,1:00:47.316
So I'm wondering what NSA is doing with 53 megawatts

1:00:47.316,1:00:53.815
in a data center because no storage takes that much capacity.

1:00:53.815,1:00:58.976
I think it's really something we should think about.

1:00:58.976,1:01:03.266
So thanks, nice talk.

1:01:03.266,1:01:04.883
<i>Angel</i>: Next question from...

1:01:04.883,1:01:06.961
does the signal angel have one?

1:01:06.961,1:01:08.679
*Signal angel*: Ok, another question is:

1:01:08.679,1:01:11.845
How big do you estimate the effort to

1:01:11.845,1:01:17.308
exchange keys and certificates involving 1024 bit primes

1:01:17.308,1:01:22.782
let's say worldwide at the moment?

1:01:22.782,1:01:24.502
<i>Dan</i>: I wasn't getting what the question was.

1:01:24.502,1:01:25.642
Could you repeat the question?

1:01:25.642,1:01:27.344
*Signal angel*: I don't really understand it myself.

1:01:27.344,1:01:28.875
<i>laughter</i>

1:01:28.875,1:01:33.181
<i>Angel</i>: Ok, let's switch to 1.

1:01:33.181,1:01:37.009
<i>Male</i>: My question goes back to the idea

1:01:37.009,1:01:42.876
of how can we know how good the private keys are?

1:01:42.876,1:01:50.227
Is there something like a keygen evaluation tool?

1:01:50.227,1:01:56.567
Or can the quality of a key generator be estimated

1:01:56.567,1:01:58.544
from a sample of keys?

1:01:58.544,1:02:01.827
Like a public tool that you can throw keys on

1:02:01.827,1:02:05.565
and it will tell me a bias of my keygen?

1:02:05.565,1:02:09.091
<i>Nadia</i>: If you go to factorable.net

1:02:09.091,1:02:12.574
my co-authors and I have made a key check tool

1:02:12.574,1:02:14.728
which will tell you if your key is in the

1:02:14.728,1:02:17.528
list of keys that we scraped off the Internet

1:02:17.528,1:02:19.296
that we were able to compromise.

1:02:19.296,1:02:21.249
That's a first check.

1:02:21.249,1:02:24.449
It's not a positive verification that your key is good

1:02:24.449,1:02:26.544
but it will tell you if it is very bad.

1:02:26.544,1:02:28.863
<i>laughter</i>

1:02:28.863,1:02:36.811
If you want to check your generator,

1:02:36.811,1:02:40.827
if you're worried about the factorization problems

1:02:40.827,1:02:43.242
we have fast code on the website in C

1:02:43.242,1:02:45.715
that will do the GCD calculation for you

1:02:45.715,1:02:50.201
if you just dump a gigantic collection of keys at it.

1:02:50.201,1:02:54.375
If you might have other problems like you are

1:02:54.375,1:02:56.993
not sure whether it's factorable like those don't come up

1:02:56.993,1:03:00.497
the experiment that you might want to do is

1:03:00.497,1:03:04.211
sort of restart the process in realistic conditions

1:03:04.211,1:03:06.179
and generate a gigantic quantity of keys

1:03:06.179,1:03:08.981
and just check for repeats.

1:03:08.981,1:03:12.378
The sort of obvious stress testings.

1:03:12.378,1:03:16.375
If you have a device you want to boot it up in realistic conditions

1:03:16.375,1:03:18.048
and then check them

1:03:18.048,1:03:19.798
and this is painstaking work

1:03:19.798,1:03:22.327
but I don't think there's any realistic way

1:03:22.327,1:03:25.546
of doing better than that.

1:03:25.546,1:03:28.396
Or inspect the code, the obvious thing.

1:03:28.396,1:03:30.924
Make sure you're seeding anything.

1:03:30.924,1:03:33.463
<i>Male</i>: Ok, thank you.

1:03:33.463,1:03:35.993
<i>Angel</i>: Next question will be from mic number 4.

1:03:35.993,1:03:37.045
<i>Male</i>: Hi.

1:03:37.045,1:03:40.745
If I got your numbers correctly so you said something like

1:03:40.745,1:03:45.450
it would take 8 million machines a year to factor 1024

1:03:45.450,1:03:48.976
so I was wondering what if I would like to factor like

1:03:48.976,1:03:52.243
800 bits or 900 bits which is,

1:03:52.243,1:03:55.149
as I understand, way easier than 1024

1:03:55.149,1:03:57.042
but was never done publicly.

1:03:57.042,1:04:00.259
So are you saying that would take like a day or something?

1:04:00.259,1:04:02.581
<i>Dan</i>: It depends on the size of cluster you have.

1:04:02.581,1:04:07.064
The RSA 768 factorization a couple of years ago

1:04:07.064,1:04:11.423
used something on the scale of a 1500 computers

1:04:11.423,1:04:14.130
for a year.

1:04:14.130,1:04:15.729
And those were normal kind of computers,

1:04:15.729,1:04:19.475
Desktop kind of computers with, I think, typically 2 cores.

1:04:19.475,1:04:24.240
Nowadays, if you have some faster, say, 3 GHz 4-core machines,

1:04:24.240,1:04:27.446
I think those were typically 2 GHZ 2-core, nowadays...

1:04:27.446,1:04:30.582
Tanja's mentioning you of course want to use GPUs

1:04:30.582,1:04:31.725
to speed things up.

1:04:31.725,1:04:33.397
And there's many ways to take advantage of

1:04:33.397,1:04:35.393
computer technology moving forward.

1:04:35.393,1:04:39.998
To get careful estimates: for going from 768 to 800,

1:04:39.998,1:04:42.192
that's a short enough extrapolation

1:04:42.192,1:04:44.113
that people are pretty confident about that.

1:04:44.113,1:04:48.014
Getting to 900 starts getting iffy what exactly the cost would be

1:04:48.014,1:04:50.951
and 1024 there's a fair amount of guess works.

1:04:50.951,1:04:53.796
There is some consensus on rougly what the costs are

1:04:53.796,1:04:55.429
but it's hard to say exactly.

1:04:55.429,1:04:58.059
But again, to give you a scale of it:

1:04:58.059,1:04:59.675
you were saying like 900 or 800.

1:04:59.675,1:05:03.213
Well, 768 RSA challenge was done

1:05:03.213,1:05:07.228
and that one was 1500 machine years.

1:05:07.228,1:05:08.501
<i>Male</i>: Ok, thank you.

1:05:08.501,1:05:10.260
<i>Angel</i>: We'll take four more questions.

1:05:10.260,1:05:12.357
Signal angel first.

1:05:12.357,1:05:16.676
*Signal angel*: It's a clarification of the question I asked before.

1:05:16.676,1:05:22.278
The actual question is: how big will be the effort

1:05:22.278,1:05:32.129
to upgrade existing keys from 1024 bits to 4K or 8K bits?

1:05:32.129,1:05:35.984
Considering that the keys are currently in use

1:05:35.984,1:05:37.910
how much effort worldwide would it be to

1:05:37.910,1:05:41.608
upgrade all that to something more secure?

1:05:41.608,1:05:43.559
<i>Tanja</i>: I mean, for the private user if you're running

1:05:43.559,1:05:46.447
SSH on the laptop you can just generate a new key.

1:05:46.447,1:05:47.876
Doesn't take you much time.

1:05:47.876,1:05:50.345
You will notice maybe some degradation of performance

1:05:50.345,1:05:52.193
if you're running it on a netbook

1:05:52.193,1:05:54.755
but going to 2048 is not a big deal.

1:05:54.755,1:05:57.710
However, there is a recommendation of the

1:05:57.710,1:06:01.932
US government to stop using 1024 bit RSA as of 2010

1:06:01.932,1:06:05.211
and we still see it everywhere.

1:06:05.211,1:06:07.563
Apparently it is bigger effort than just

1:06:07.563,1:06:10.743
everybody spending 10 minutes on the laptop.

1:06:10.743,1:06:13.635
<i>Dan</i>: Maybe part of the problem is things like

1:06:13.635,1:06:16.997
everybody says "yeah, use 2048 or bigger"

1:06:16.997,1:06:18.884
and there's some financial standard

1:06:18.884,1:06:21.157
which says you can use any key size you want

1:06:21.157,1:06:23.943
up to 1984 bits.

1:06:23.943,1:06:25.631
I have no idea how they came up with this

1:06:25.631,1:06:27.208
but then they run into some other standard

1:06:27.208,1:06:29.284
which says use 2048 and they just can't

1:06:29.284,1:06:33.345
implement it on the smartcards which only support up to 1984.

1:06:33.345,1:06:36.467
Now if you get the standards people talking to each other for several years

1:06:36.467,1:06:39.258
then they agree on using 1984.

1:06:39.258,1:06:41.796
It's just about as good as 2048

1:06:41.796,1:06:44.746
but realistically when you have these conflicts

1:06:44.746,1:06:47.128
in standards then it takes a while to work out.

1:06:47.128,1:06:48.742
And when you have a busy server

1:06:48.742,1:06:51.233
that's trying to do 2048 bit RSA

1:06:51.233,1:06:53.126
it doesn't matter what the standard says

1:06:53.126,1:06:55.449
if you got a ton of load you just can't handle that load

1:06:55.449,1:06:58.341
if you're spending a ton of time on cryptography.

1:06:58.341,1:07:00.460
And that again is where we like elliptic curves

1:07:00.460,1:07:02.560
because they give you for whatever your

1:07:02.560,1:07:05.341
performance constraints are much better security.

1:07:05.341,1:07:07.309
So if you're trying a given security level

1:07:07.309,1:07:09.831
the speed is much better.

1:07:09.831,1:07:12.807
<i>Angel</i>: Next question will be from number 2.

1:07:12.807,1:07:14.642
<i>Female</i>: So you've had two developers stand up

1:07:14.642,1:07:17.459
and ask how to verify whether or not their

1:07:17.459,1:07:19.726
key generation is correct and whether or not their

1:07:19.726,1:07:21.490
random numbers are correct.

1:07:21.490,1:07:23.887
And I think it's great that you give coding recommendations

1:07:23.887,1:07:27.441
like seed rand() but coming from the development

1:07:27.441,1:07:29.539
perspective I like to unit test my code

1:07:29.539,1:07:31.789
and specifically I like to write my unit tests

1:07:31.789,1:07:33.043
before I write my code,

1:07:33.043,1:07:34.709
it's called test-driven development.

1:07:34.709,1:07:38.175
So if I'm going about creating an algorithm to

1:07:38.175,1:07:39.994
encrypt something or whatever

1:07:39.994,1:07:42.771
what are the steps that I need to do

1:07:42.771,1:07:46.653
when I'm writing my unit test?

1:07:46.653,1:07:51.939
Has there been some kind of effort in that way

1:07:51.939,1:07:54.911
like what goes into unit test for determining that

1:07:54.911,1:07:56.606
your random number is correct?

1:07:56.606,1:07:59.091
I mean there's algorithms for determining

1:07:59.091,1:08:00.562
whether your random number generator is

1:08:00.562,1:08:02.773
operating correctly, right?

1:08:02.773,1:08:05.160
<i>Dan</i>: This is something where there is,

1:08:05.160,1:08:07.775
if you look at how the random number generators work,

1:08:07.775,1:08:10.209
there was just a NIST workshop

1:08:10.209,1:08:11.975
and there's some NIST standards

1:08:11.975,1:08:12.742
which are telling you

1:08:12.742,1:08:15.709
here's what to do to test your random number generators.

1:08:15.709,1:08:17.796
So you have a hardware random number generator,

1:08:17.796,1:08:19.589
some post-processing, and they say

1:08:19.589,1:08:22.406
"ok here's the standard for how to test those pieces"

1:08:22.406,1:08:24.704
Now that's not the same as testing the cryptography

1:08:24.704,1:08:26.335
that's using those pieces

1:08:26.335,1:08:28.994
and it's very helpful if there's a clear separation there.

1:08:28.994,1:08:30.463
So you have your cryptography

1:08:30.463,1:08:32.746
which is doing deterministic stuff

1:08:32.746,1:08:34.452
and says you have to have some randomness coming

1:08:34.452,1:08:37.806
from a totally separate unit where the only job

1:08:37.806,1:08:40.460
of that separate unit is do the randomness properly.

1:08:40.460,1:08:42.146
And then the NIST test will tell you

1:08:42.146,1:08:43.324
what the randomness does

1:08:43.324,1:08:45.938
and then deterministic cryptographic testing

1:08:45.938,1:08:48.391
will tell you that the other pieces are working correctly

1:08:48.391,1:08:50.211
for all sorts of inputs.

1:08:50.211,1:08:53.892
Where it goes wrong is something like the ECDSA standard

1:08:53.892,1:08:55.394
that Nadia was mentioning before

1:08:55.394,1:08:56.696
and that's one that says that

1:08:56.696,1:08:59.210
you don't just deterministically generate a signature

1:08:59.210,1:09:01.410
you make some new randomness every time you generate

1:09:01.410,1:09:03.705
a signature and that's what goes horribly wrong.

1:09:03.705,1:09:05.670
So that's something where we need new standards

1:09:05.670,1:09:07.790
which say instead of doing what ECDSA does

1:09:07.790,1:09:09.741
we have a clear separation between

1:09:09.741,1:09:12.192
the cryptography always does the same thing

1:09:12.192,1:09:15.043
making it testable, and then randomness is centralized

1:09:15.043,1:09:17.541
in one spot which hopefully the NIST standards

1:09:17.541,1:09:19.840
do a good enough job of testing.

1:09:19.840,1:09:23.508
<i>Female</i>: So there's something that says here's what determines...

1:09:23.508,1:09:26.946
I guess what I'm asking is do you know of any algorithms

1:09:26.946,1:09:30.139
that can be used for determining whether something

1:09:30.139,1:09:32.248
is a good random number and whether your random

1:09:32.248,1:09:34.525
number generator is generating things properly?

1:09:34.525,1:09:37.027
<i>Tanja</i>: Well, there are a bunch of tests for.

1:09:37.027,1:09:39.307
There's hardware random number generators

1:09:39.307,1:09:42.126
you can run them through the diehard tests.

1:09:42.126,1:09:44.710
There's certificates by FIPS.

1:09:44.710,1:09:48.526
In general I would recommend implementing all those steps

1:09:48.526,1:09:50.811
but the smartcards that have been mentioned by Dan earlier

1:09:50.811,1:09:52.742
from the Taiwan citizen card

1:09:52.742,1:09:55.445
those are actually FIPS-certified.

1:09:55.445,1:09:58.942
So even though, these go through what is the industry standard

1:09:58.942,1:10:00.922
of doing random number generation on smartcards

1:10:00.922,1:10:03.622
which everybody thought was good enough,

1:10:03.622,1:10:04.979
apparently they're not.

1:10:04.979,1:10:06.477
So randomness is a total mess.

1:10:06.477,1:10:08.613
I mean after the paper by Nadia and

1:10:08.613,1:10:10.410
also when the paper by the Lausanne team came out,

1:10:10.410,1:10:12.692
there is no some more movement in finding

1:10:12.692,1:10:14.023
better random number generators.

1:10:14.023,1:10:18.659
At this moment, well, hope for the best.

1:10:18.659,1:10:20.625
<i>laughter</i>

1:10:20.625,1:10:22.178
Implement the standards, yes.

1:10:22.178,1:10:24.672
If you have some source of randomness

1:10:24.672,1:10:28.424
try to stretch it with a stream cipher or something like that.

1:10:28.424,1:10:31.296
<i>Nadia</i>: I want to tell a little story.

1:10:31.296,1:10:33.227
After we published the paper

1:10:33.227,1:10:35.643
I heard some very interesting things from some of the vendors.

1:10:35.643,1:10:39.329
I think one of the things that makes randomness

1:10:39.329,1:10:42.206
and cryptography a difficult problem is that

1:10:42.206,1:10:44.612
this kind of issue is something that

1:10:44.612,1:10:46.808
crosses a lot of the abstraction boundaries

1:10:46.808,1:10:48.546
that you're used to in coding.

1:10:48.546,1:10:50.525
You want to have the clean unit test that you know

1:10:50.525,1:10:52.343
that this piece works, and this piece works,

1:10:52.343,1:10:53.923
and this piece works, and this piece works,

1:10:53.923,1:10:57.777
and you put them all together and it will all work flawlessly.

1:10:57.777,1:11:00.297
Somehow this kind of problem is something that

1:11:00.297,1:11:02.898
happens at the boundaries of each unit test.

1:11:02.898,1:11:05.223
We know that the operating system is like

1:11:05.223,1:11:08.431
"ok I'm getting my proper inputs from the hardware

1:11:08.431,1:11:11.297
and I'm correctly generating my randomness from there"

1:11:11.297,1:11:12.840
and then the application says

1:11:12.840,1:11:15.023
"I'm getting my randomness from the operating system,

1:11:15.023,1:11:16.310
it's good randomness"

1:11:16.310,1:11:19.605
and then the application uses the correct crypto algorithms

1:11:19.605,1:11:21.927
to then do good cryptography.

1:11:21.927,1:11:24.495
And at the very beginning of that stage

1:11:24.495,1:11:27.777
there was something broken and it translated all the way

1:11:27.777,1:11:29.526
through all of these pieces of software

1:11:29.526,1:11:31.532
that were working correctly.

1:11:31.532,1:11:35.415
Testing this holistically was the only way

1:11:35.415,1:11:37.312
to find this kind of error.

1:11:37.312,1:11:41.241
One story that I heard from a vendor was that

1:11:41.241,1:11:43.911
they ran into randomness problems.

1:11:43.911,1:11:45.745
They developed some device.

1:11:45.745,1:11:47.325
It was working perfectly

1:11:47.325,1:11:51.478
and on the assembly line they were being turned on

1:11:51.478,1:11:55.497
and run through the checks.

1:11:55.497,1:11:57.807
You boot it up it checks the integrity

1:11:57.807,1:12:00.641
on the assembly line and it was continuing on.

1:12:00.641,1:12:06.390
If you exactly booted them up at the exact same time

1:12:06.390,1:12:10.129
in the exact same conditions

1:12:10.129,1:12:13.221
then all of the inputs to the random number generator

1:12:13.221,1:12:16.323
would be the same and they would generate the same keys

1:12:16.323,1:12:17.743
and then these keys would be,

1:12:17.743,1:12:19.721
you know, they already generated the keys,

1:12:19.721,1:12:21.197
so they wouldn't generate them again after

1:12:21.197,1:12:25.397
they went to the consumer.

1:12:25.397,1:12:27.145
I don't know how to unit test that

1:12:27.145,1:12:29.976
except take the entire device, turn it on,

1:12:29.976,1:12:32.589
and take multiple of them and make sure the devices

1:12:32.589,1:12:35.396
as they're coming out of the production process

1:12:35.396,1:12:39.596
are working correctly.

1:12:39.596,1:12:41.477
<i>Angel</i>: So we will take 2 more questions.

1:12:41.477,1:12:46.692
One from number 4 first and then number 1 at the last.

1:12:46.692,1:12:48.005
<i>Male</i>: How good, do you think,

1:12:48.005,1:12:52.544
hardware-supported random number generators are

1:12:52.544,1:12:53.432
after what you've just said?

1:12:53.432,1:12:55.806
I think they're probably not good anymore?

1:12:55.806,1:13:01.566
<i>Dan</i>: Intel has their RdRand instruction in some of the newer chips

1:13:01.566,1:13:04.979
and they say that they've gone through

1:13:04.979,1:13:06.957
a reasonable number of evaluation steps.

1:13:06.957,1:13:09.442
It sounds like they're trying.

1:13:09.442,1:13:11.826
Whether they're successful is a different question.

1:13:11.826,1:13:13.661
Something I don't like about the API is

1:13:13.661,1:13:16.392
RdRand gives you no way to test the pieces.

1:13:16.392,1:13:18.798
You can only test the post-processed output.

1:13:18.798,1:13:22.157
So nobody else is able to test the actual

1:13:22.157,1:13:24.276
randomness generation part of the hardware.

1:13:24.276,1:13:26.543
You can only get a filtered version of that.

1:13:26.543,1:13:29.245
So Intel and people who are contracted by Intel

1:13:29.245,1:13:30.628
to see what's going on inside

1:13:30.628,1:13:32.829
are the only people who can run these tests.

1:13:32.829,1:13:35.874
It's not open in a way that allows the community

1:13:35.874,1:13:37.377
to see if there's further problems.

1:13:37.377,1:13:38.756
But at least they're trying

1:13:38.756,1:13:41.423
and maybe with enough effort the hardware manufacturers

1:13:41.423,1:13:43.458
will get randomness generation right.

1:13:43.458,1:13:47.579
Of course if you can generate any sort of secret once

1:13:47.579,1:13:50.462
if you have enough of a secret to start with

1:13:50.462,1:13:52.328
then you can use things like AES to

1:13:52.328,1:13:53.944
generate any number of secrets

1:13:53.944,1:13:55.494
and you can put those secrets into

1:13:55.494,1:13:57.296
any number of devices that you want.

1:13:57.296,1:13:59.594
If you use AES in counter mode with

1:13:59.594,1:14:02.123
encrypting 1, encrypting 2, encrypting 3...

1:14:02.123,1:14:04.981
you get totally unpredictable, unrelated outputs

1:14:04.981,1:14:07.378
and then use those, burn those into some hardware.

1:14:07.378,1:14:09.979
But where do you get that initial randomness from?

1:14:09.979,1:14:12.830
Are you going through a trustworthy process there?

1:14:12.830,1:14:16.308
It's a hard problem.

1:14:16.308,1:14:18.198
<i>Angel</i>: Next question from number 1.

1:14:18.198,1:14:19.729
And that's the last question.

1:14:19.729,1:14:21.379
<i>Male</i>: You said something about

1:14:21.379,1:14:24.799
you dropped this group-based cryptography word.

1:14:24.799,1:14:28.111
Most of the stuff I've encountered under that heading

1:14:28.111,1:14:30.008
kind of sounded like snake oil

1:14:30.008,1:14:32.898
just cause they're using non-abelian groups and stuff.

1:14:32.898,1:14:35.832
Is there anything here that isn't?

1:14:35.832,1:14:38.280
<i>Tanja</i>: Ok, I did not say group-based crypto.

1:14:38.280,1:14:38.914
<i>Male</i>: Ok.

1:14:38.914,1:14:40.259
<i>Tanja</i>: I said <b>code</b>-based cryptography.

1:14:40.259,1:14:41.241
<i>Male</i>: Sorry, thanks.

1:14:41.241,1:14:42.773
<i>Tanja</i>: So there is a class of cryptosystems

1:14:42.773,1:14:44.826
which are fine against quantum computers.

1:14:44.826,1:14:50.148
For all crypto it's always up to what we know.

1:14:50.148,1:14:51.959
Next day, there could be somebody with a

1:14:51.959,1:14:54.009
polynomial-time factor algorithm and so on.

1:14:54.009,1:14:56.723
Now there's also a group of people doing

1:14:56.723,1:14:59.357
braid group cryptography, doing non-abelian groups,

1:14:59.357,1:15:01.764
most of these systems have been broken.

1:15:01.764,1:15:07.083
If they weren't broken by classical computers

1:15:07.083,1:15:10.911
maybe they would remain secure against quantum computers.

1:15:10.911,1:15:13.828
It's lots of "would"s there.

1:15:13.828,1:15:17.612
Yes, they might have this on their flags as well

1:15:17.612,1:15:20.547
but I wouldn't count this as a secure system.

1:15:20.547,1:15:23.027
Whereas code-based cryptography or lattice-based cryptography

1:15:23.027,1:15:26.480
are systems which should be fine.

1:15:26.480,1:15:28.646
<i>Dan</i>: Maybe just a URL if you're interested

1:15:28.646,1:15:31.939
in post-quantum cryptography is pqcrypto.org

1:15:31.939,1:15:34.513
and that keeps track of papers on the various types

1:15:34.513,1:15:37.227
of cryptography that should survive for a 100 years

1:15:37.227,1:15:41.448
and in particular against quantum computers.

1:15:41.448,1:15:42.444
<i>Angel</i>: Ladies and gentleman, please give

1:15:42.444,1:15:46.546
Nadia, Tanja and Daniel a big round of applause.

1:15:46.546,1:15:47.714
Thank you so much.

1:15:47.714,1:15:56.643
<i>applause</i>