1
00:00:08,965 --> 00:00:11,637
Today we have Daniel J. Bernstein,
2
00:00:11,637 --> 00:00:14,538
Nadia Heninger,
3
00:00:14,538 --> 00:00:16,889
and Tanja Lange.
4
00:00:16,889 --> 00:00:20,402
And they will talk about RSA factorization in the real world.
5
00:00:20,402 --> 00:00:22,289
Talk is called FactHacks.
6
00:00:22,289 --> 00:00:29,905
applause
7
00:00:29,905 --> 00:00:31,591
Nadia: Thank you.
8
00:00:31,591 --> 00:00:37,091
So we're going to start with a crypto lecture in 5 minutes.
9
00:00:37,091 --> 00:00:40,557
So, just to begin, since we're talking about RSA.
10
00:00:40,557 --> 00:00:43,678
Here is a picture of RSA for you.
11
00:00:43,678 --> 00:00:48,605
RSA is the first published public-key cryptosystem.
12
00:00:48,605 --> 00:00:51,222
So by a public-key cryptosystem, what I mean is
13
00:00:51,222 --> 00:00:54,169
a cryptosystem that has two keys intead of just having
14
00:00:54,169 --> 00:00:56,886
one key. You might think "ok you encrypt a message
15
00:00:56,886 --> 00:00:59,035
to that key and you decrypt a message to that key"
16
00:00:59,035 --> 00:01:01,402
That is a symmetric cryptosystem.
17
00:01:01,402 --> 00:01:03,194
A public-key cryptosystem has two keys.
18
00:01:03,194 --> 00:01:06,042
One is the public key and one is the private key.
19
00:01:06,042 --> 00:01:07,688
The public key you publish to the world and
20
00:01:07,688 --> 00:01:09,509
anyone can use that key to encrypt a message,
21
00:01:09,509 --> 00:01:10,806
especially to you.
22
00:01:10,806 --> 00:01:14,102
And only you, who have the private key, can decrypt that.
23
00:01:14,102 --> 00:01:18,739
So the RSA paper was the first published public-key cryptosystem.
24
00:01:18,739 --> 00:01:20,501
That was in 1977.
25
00:01:20,501 --> 00:01:22,757
This revolutionized cryptography and enabled
26
00:01:22,757 --> 00:01:25,542
the development of the Internet.
27
00:01:25,542 --> 00:01:31,568
So 35 years later RSA is still the most commonly used public-key cryptosystem.
28
00:01:31,568 --> 00:01:35,870
If you ever use the Internet, you probably use it every day.
29
00:01:35,870 --> 00:01:42,154
So here is a sample SSL certificate which uses RSA.
30
00:01:42,154 --> 00:01:45,241
If you use SSH you probably have an SSH key
31
00:01:45,241 --> 00:01:48,440
which probably is RSA.
32
00:01:48,440 --> 00:01:51,858
Before I launch into the math
33
00:01:51,858 --> 00:01:54,676
I want to explain what we're doing here.
34
00:01:54,676 --> 00:01:56,742
We wanted to, since you guys are hackers,
35
00:01:56,742 --> 00:01:59,874
you like code instead of formulas
36
00:01:59,874 --> 00:02:02,784
so we wanted to give you formulas in code.
37
00:02:02,784 --> 00:02:05,455
So what we're giving you is working Python code for
38
00:02:05,455 --> 00:02:07,720
all the math that we're going to do.
39
00:02:07,720 --> 00:02:09,976
And in order to do that we're going to use
40
00:02:09,976 --> 00:02:12,643
some software you call Sage.
41
00:02:12,643 --> 00:02:15,970
Sage is free open-source mathematics software.
42
00:02:15,970 --> 00:02:20,160
It is available at this URL sagemath.org.
43
00:02:20,160 --> 00:02:22,093
It's based off of Python
44
00:02:22,093 --> 00:02:24,190
so it's very simple if you know Python.
45
00:02:24,190 --> 00:02:27,904
It has a nice sort of interpreter
46
00:02:27,904 --> 00:02:29,622
that you can use just like Python.
47
00:02:29,622 --> 00:02:32,493
Here an example, 2*3=6.
48
00:02:32,493 --> 00:02:34,890
But there are some differences.
49
00:02:34,890 --> 00:02:38,202
For example the caret instead of in Python it's XOR
50
00:02:38,202 --> 00:02:40,155
in Sage it's exponentiation.
51
00:02:40,155 --> 00:02:43,869
So 2^3 is 8 in Sage.
52
00:02:43,869 --> 00:02:47,351
The other nice thing about Sage is that
53
00:02:47,351 --> 00:02:49,703
it has lots of useful mathematical libraries.
54
00:02:49,703 --> 00:02:53,029
For example if I, in the Sage interpreter, say factor(15)
55
00:02:53,029 --> 00:02:55,512
it helpfully tells me 3*5.
56
00:02:55,512 --> 00:02:56,513
That's pretty great.
57
00:02:56,513 --> 00:02:59,152
It also does a lot more advanced things, for example
58
00:02:59,152 --> 00:03:00,467
it can represent polynomials.
59
00:03:00,467 --> 00:03:03,498
I can say factor the polynomial x^2-1
60
00:03:03,498 --> 00:03:08,127
and it helpfully tells me the answer to that.
61
00:03:08,127 --> 00:03:12,819
So now that we've gone through what is Sage,
62
00:03:12,819 --> 00:03:17,443
I want to explain how to do RSA.
63
00:03:17,443 --> 00:03:21,065
Perhaps some of you have seen this before.
64
00:03:21,065 --> 00:03:24,677
In order to generate an RSA key, the first thing that you do
65
00:03:24,677 --> 00:03:27,435
is you generate two large random primes.
66
00:03:27,435 --> 00:03:30,441
So if we want a 1024 bit RSA key
67
00:03:30,441 --> 00:03:34,779
we generate two primes of size 2^512,
68
00:03:34,779 --> 00:03:38,515
so 512 bit primes, p and q.
69
00:03:38,515 --> 00:03:41,697
In order to generate the public key
70
00:03:41,697 --> 00:03:44,900
I multiply together p and q and you get some N.
71
00:03:44,900 --> 00:03:46,433
N has 1024 bits.
72
00:03:46,433 --> 00:03:49,915
And then you have what's known as the public exponent
73
00:03:49,915 --> 00:03:54,435
which you can choose 3 or 65537 or 35.
74
00:03:54,435 --> 00:03:56,300
It really doesn't matter what you choose.
75
00:03:56,300 --> 00:03:59,117
These are some of the most commonly used values.
76
00:03:59,117 --> 00:04:03,129
Now in order to generate a private key
77
00:04:03,129 --> 00:04:07,695
you choose d, the decryption exponent,
78
00:04:07,695 --> 00:04:12,936
to be the modular inverse of e mod (p-1)*(q-1)
79
00:04:12,936 --> 00:04:15,394
It doesn't matter why so much, for this talk
80
00:04:15,394 --> 00:04:17,276
but here is the formula
81
00:04:17,276 --> 00:04:20,295
or here is the command to do this in Sage.
82
00:04:20,295 --> 00:04:23,150
So now we have a public key and a private key
83
00:04:23,150 --> 00:04:24,508
that are pairs.
84
00:04:24,508 --> 00:04:27,948
And in RSA if you want to enrypt a message
85
00:04:27,948 --> 00:04:31,547
you raise your message to the e'th power mod n
86
00:04:31,547 --> 00:04:34,753
so you can do that with your public key here.
87
00:04:34,753 --> 00:04:36,776
And similarly decryption,
88
00:04:36,776 --> 00:04:39,998
you raise the ciphertext to the d'th power mod n
89
00:04:39,998 --> 00:04:42,835
and that will give you the message back out again.
90
00:04:42,835 --> 00:04:45,748
Fairly simple.
91
00:04:45,748 --> 00:04:48,731
Now I'm lying to you slightly.
92
00:04:48,731 --> 00:04:51,088
If you read this talk do not
93
00:04:51,088 --> 00:04:53,181
absolutely do not implement RSA this way.
94
00:04:53,181 --> 00:04:54,757
You must pad your message.
95
00:04:54,757 --> 00:04:57,186
This is a public service warning.
96
00:04:57,186 --> 00:05:00,662
We offered to tell you about factoring.
97
00:05:00,662 --> 00:05:03,941
What is the relationship between RSA and factoring?
98
00:05:03,941 --> 00:05:08,972
Now you can see easily from the formula
99
00:05:08,972 --> 00:05:11,760
from the private key here
100
00:05:11,760 --> 00:05:16,337
that if you know how to factor N into its prime factors p and q
101
00:05:16,337 --> 00:05:19,496
then you can just compute d, the decryption exponent
102
00:05:19,496 --> 00:05:20,812
of your private key.
103
00:05:20,812 --> 00:05:24,395
Clearly, if you can factor N you can compute the private key.
104
00:05:24,395 --> 00:05:28,930
But we don't actually know that factoring is the only way to break RSA.
105
00:05:28,930 --> 00:05:34,163
There might be some way that you can compute messages from ciphertexts
106
00:05:34,163 --> 00:05:36,877
that doesn't actually reveal the private key.
107
00:05:36,877 --> 00:05:41,545
And there's no proof either way that this is or is not possible.
108
00:05:41,545 --> 00:05:43,478
The last thing that I want to mention here
109
00:05:43,478 --> 00:05:45,458
just to clarify things
110
00:05:45,458 --> 00:05:46,830
some of you who might have taken
111
00:05:46,830 --> 00:05:51,527
complexity or algorithms courses or a beginning CS course
112
00:05:51,527 --> 00:05:55,874
you might have heard of NP-hardness, the P vs. NP problem.
113
00:05:55,874 --> 00:06:00,904
And I just want to clarify that factoring is not known to be NP-hard.
114
00:06:00,904 --> 00:06:05,092
Every computer scientist will love it if we could actually generate
115
00:06:05,092 --> 00:06:11,139
a cryptosystem which is known to be based of average case NP-hardness.
116
00:06:11,139 --> 00:06:13,573
We don't know how to do that.
117
00:06:13,573 --> 00:06:16,992
RSA is not doing that.
118
00:06:16,992 --> 00:06:22,309
In addition the factoring problem is probably not NP-hard.
119
00:06:22,309 --> 00:06:25,892
The question is: how hard is factoring?
120
00:06:25,892 --> 00:06:27,474
Since factoring is the best way that we know
121
00:06:27,474 --> 00:06:31,227
to break the most commonly used public-key cryptosystem on the Internet.
122
00:06:31,227 --> 00:06:32,935
Well I showed you some examples before
123
00:06:32,935 --> 00:06:37,243
where Sage had this helpful little factor function.
124
00:06:37,243 --> 00:06:39,164
So how well does it work?
125
00:06:39,164 --> 00:06:41,181
How long do people think this takes?
126
00:06:41,181 --> 00:06:46,149
two 32 bit integers multiplied together.
127
00:06:46,149 --> 00:06:48,850
I heard it'll go a couple of seconds.
128
00:06:48,850 --> 00:06:50,098
0.01 seconds.
129
00:06:50,098 --> 00:06:55,108
This is on this computer.
130
00:06:55,108 --> 00:06:57,869
How about 64 bit integers, two of them multiplied together?
131
00:06:57,869 --> 00:07:01,289
So this is a 128 bit RSA modulus.
132
00:07:01,289 --> 00:07:04,382
Any guesses?
133
00:07:04,382 --> 00:07:05,464
One minute?
134
00:07:05,464 --> 00:07:08,235
Nope, 0.1 seconds.
135
00:07:08,235 --> 00:07:14,521
How about two 96 bit integers multiplied together?
136
00:07:14,521 --> 00:07:15,939
A couple of minutes?
137
00:07:15,939 --> 00:07:18,105
4 seconds.
138
00:07:18,105 --> 00:07:21,965
How about two 128 bit integers multiplied together?
139
00:07:21,965 --> 00:07:28,255
This is a 256 bit RSA modulus.
140
00:07:28,255 --> 00:07:29,300
No guesses? Alright.
141
00:07:29,300 --> 00:07:30,430
500 seconds.
142
00:07:30,430 --> 00:07:32,791
So at this point it's starting to get a little bit iffy if I'm
143
00:07:32,791 --> 00:07:36,327
sitting on a plane trying to do this on battery power.
144
00:07:36,327 --> 00:07:39,248
laughter
145
00:07:39,248 --> 00:07:43,764
Clearly this is growing faster than linear in the size of the key.
146
00:07:43,764 --> 00:07:45,597
So what is actually going on?
147
00:07:45,597 --> 00:07:49,066
Well.
148
00:07:49,066 --> 00:07:50,734
Dan: Ok.
149
00:07:50,734 --> 00:07:52,794
Turns out these factoring functions
150
00:07:52,794 --> 00:07:54,597
they do look like they're getting pretty slow but
151
00:07:54,597 --> 00:07:56,835
wait a minute.
152
00:07:56,835 --> 00:07:59,814
Do we actually have to use this factor function
153
00:07:59,814 --> 00:08:00,866
if it's getting really slow?
154
00:08:00,866 --> 00:08:04,302
Maybe instead of trying to use Sage's factor function
155
00:08:04,302 --> 00:08:08,710
maybe we could just guess what the prime was.
156
00:08:08,710 --> 00:08:12,350
Now as an RSA user you might not think that this is a threat
157
00:08:12,350 --> 00:08:13,928
having somebody guess your prime
158
00:08:13,928 --> 00:08:17,114
because there is some way to count the number of primes,
159
00:08:17,114 --> 00:08:18,719
number of 512 bit primes and there's
160
00:08:18,719 --> 00:08:22,185
more than 2^500 512 bit primes.
161
00:08:22,185 --> 00:08:24,011
And if your random number generator is giving you
162
00:08:24,011 --> 00:08:27,097
any of those 512 bit primes with the same chance
163
00:08:27,097 --> 00:08:30,829
then any particular prime that the attacker guesses
164
00:08:30,829 --> 00:08:32,845
and just tries dividing n by that,
165
00:08:32,845 --> 00:08:35,061
see if it's evenly divisable, well
166
00:08:35,061 --> 00:08:39,417
any particular prime has only a 2^(-500) chance
167
00:08:39,417 --> 00:08:41,312
of being guessed by the attacker.
168
00:08:41,312 --> 00:08:43,319
And even if the attacker tries a huge number of times
169
00:08:43,319 --> 00:08:44,897
he'll never guess your prime
170
00:08:44,897 --> 00:08:47,621
except what if your random number generator
171
00:08:47,621 --> 00:08:52,031
is working really really badly and doesn't give you 2^500 different primes?
172
00:08:52,031 --> 00:08:55,902
What if it only gives you, say, 2^40 different primes?
173
00:08:55,902 --> 00:08:58,055
And then suddenly the attacker could, well,
174
00:08:58,055 --> 00:09:01,248
try figuring out what those primes are.
175
00:09:01,248 --> 00:09:03,237
So here's how the attacker looks at this.
176
00:09:03,237 --> 00:09:05,819
The attacker says: ok I'm gonna target your public key.
177
00:09:05,819 --> 00:09:08,884
I'm gonna take your big N which is your secret...
178
00:09:08,884 --> 00:09:10,912
sorry, public key is big N,
179
00:09:10,912 --> 00:09:13,666
your secret key p times your secret key q,
180
00:09:13,666 --> 00:09:14,920
your private p and q.
181
00:09:14,920 --> 00:09:16,764
He's trying to figure out the p and q, given N.
182
00:09:16,764 --> 00:09:20,169
So what he does, he takes a bunch of devices that are out there
183
00:09:20,169 --> 00:09:22,902
and he looks at how they work
184
00:09:22,902 --> 00:09:25,281
or downloads a bunch of software which generates keys
185
00:09:25,281 --> 00:09:28,904
and then using that software, using those devices
186
00:09:28,904 --> 00:09:32,216
he tries generating a bunch of p's and q's for himself
187
00:09:32,216 --> 00:09:34,516
using whatever the random number generator is
188
00:09:34,516 --> 00:09:36,932
in that device or in that software.
189
00:09:36,932 --> 00:09:38,400
And then that's something which
190
00:09:38,400 --> 00:09:40,682
maybe the random number generator works really badly
191
00:09:40,682 --> 00:09:42,832
and maybe you're using that device, too.
192
00:09:42,832 --> 00:09:45,765
So the attacker tries each of the primes that he see's
193
00:09:45,765 --> 00:09:46,918
from this device
194
00:09:46,918 --> 00:09:50,129
and tries seeing does that prime divide your N.
195
00:09:50,129 --> 00:09:51,400
If you were using that device
196
00:09:51,400 --> 00:09:52,932
and it doesn't generate very many primes
197
00:09:52,932 --> 00:09:55,835
then maybe he gets lucky and finds your prime
198
00:09:55,835 --> 00:09:57,968
because the device has generated the same prime for him
199
00:09:57,968 --> 00:09:59,618
that it generated for you.
200
00:09:59,618 --> 00:10:03,216
Now does anybody actually build devices which
201
00:10:03,216 --> 00:10:07,005
screw up random numbers this badly?
202
00:10:07,005 --> 00:10:08,519
laughter
203
00:10:08,519 --> 00:10:11,538
As a matter of fact, yes, bears do shit in the woods.
204
00:10:11,538 --> 00:10:13,543
So there are some famous examples.
205
00:10:13,543 --> 00:10:16,952
We don't have a huge number of historical discussions in this talk
206
00:10:16,952 --> 00:10:18,539
but just a few examples here.
207
00:10:18,539 --> 00:10:22,401
Goldberg&Wagner totally broke the original Netscape
208
00:10:22,401 --> 00:10:23,989
generation of public keys.
209
00:10:23,989 --> 00:10:26,921
There are only something like 2^47 possible keys
210
00:10:26,921 --> 00:10:29,216
which is a countable numbers.
211
00:10:29,216 --> 00:10:31,077
So they could figure out all the possible keys
212
00:10:31,077 --> 00:10:34,139
that Netscape could generate if they you knew when you generated a key
213
00:10:34,139 --> 00:10:35,802
which was often very predictable.
214
00:10:35,802 --> 00:10:37,724
Another example: the Debian bug,
215
00:10:37,724 --> 00:10:40,974
the horrendous failure for SSH and lots of other applications
216
00:10:40,974 --> 00:10:42,418
from a few years ago.
217
00:10:42,418 --> 00:10:45,692
Debian and Ubuntu for a year and half were generating only
218
00:10:45,692 --> 00:10:48,773
well, fewer than a million possible public keys.
219
00:10:48,773 --> 00:10:51,421
So complete disasters of random number generation.
220
00:10:51,421 --> 00:10:53,826
But if you want to add to this list,
221
00:10:53,826 --> 00:10:55,202
if you want to do more and more of these
222
00:10:55,202 --> 00:10:56,974
find bad random number generators
223
00:10:56,974 --> 00:11:00,518
you have to say "ok what's the next device out there
224
00:11:00,518 --> 00:11:01,858
that might have been screwed up."
225
00:11:01,858 --> 00:11:03,422
"Let's look at it, look at the code."
226
00:11:03,422 --> 00:11:05,039
"See what kind of primes it generates."
227
00:11:05,039 --> 00:11:06,571
"Does it do badly?"
228
00:11:06,571 --> 00:11:09,424
It takes some real work to figure this out.
229
00:11:09,424 --> 00:11:11,801
Wouldn't it be nice if there was a systematic way
230
00:11:11,801 --> 00:11:14,586
to just without having to do a lot of work,
231
00:11:14,586 --> 00:11:16,576
to look at each device, each piece of software,
232
00:11:16,576 --> 00:11:18,403
wouldn't it be nice to just figure out
233
00:11:18,403 --> 00:11:21,568
"oh yeah, let's just magically figure out
234
00:11:21,568 --> 00:11:23,726
if there's any primes out there generated
235
00:11:23,726 --> 00:11:25,507
from bad random number generators."
236
00:11:25,507 --> 00:11:28,020
You could imagine here's the easy attack.
237
00:11:28,020 --> 00:11:29,622
Here's the systematic way to do this.
238
00:11:29,622 --> 00:11:34,052
You take two users, so you and you.
239
00:11:34,052 --> 00:11:35,442
And you each have a public key.
240
00:11:35,442 --> 00:11:37,178
We're gonna grab those public keys
241
00:11:37,178 --> 00:11:41,681
N1 and N2
242
00:11:41,681 --> 00:11:46,983
and then hope that this N1 and N2 share a prime.
243
00:11:46,983 --> 00:11:54,130
N1 is some prime p times some q1 and N2 is some same prime p times a different q2.
244
00:11:54,130 --> 00:11:55,663
Now this could happen, you could have
245
00:11:55,663 --> 00:11:59,100
N1 and N2 sharing both primes.
246
00:11:59,100 --> 00:12:02,101
That would be legitimate that two people who are actually
247
00:12:02,101 --> 00:12:05,645
the same webserver sharing their keys, sharing their certificates,
248
00:12:05,645 --> 00:12:06,843
they know they're doing this,
249
00:12:06,843 --> 00:12:09,281
you can get the same public key from two places.
250
00:12:09,281 --> 00:12:12,183
Google has their keys copied to tens of thousands of servers,
251
00:12:12,183 --> 00:12:13,399
that's not a surprise.
252
00:12:13,399 --> 00:12:15,395
But if there's a different...
253
00:12:15,395 --> 00:12:16,999
If you have two different public keys
254
00:12:16,999 --> 00:12:18,430
they shouldn't be sharing primes.
255
00:12:18,430 --> 00:12:19,582
What if they do?
256
00:12:19,582 --> 00:12:21,979
Well, here's this magic Euclid's algorithm
257
00:12:21,979 --> 00:12:24,744
which will just print out the shared prime p.
258
00:12:24,744 --> 00:12:26,066
This is a very simple algorithm.
259
00:12:26,066 --> 00:12:28,495
It just takes one of the numbers, say N1
260
00:12:28,495 --> 00:12:31,961
and let's say N1 is the smaller one and N2 is the bigger one,
261
00:12:31,961 --> 00:12:34,142
it divides N2 by N1
262
00:12:34,142 --> 00:12:37,512
and replaces N2 by the remainder and then switches the numbers
263
00:12:37,512 --> 00:12:39,593
and that's exactly what this code does.
264
00:12:39,593 --> 00:12:43,111
The N2 % N1 that's again the N2 mod N1
265
00:12:43,111 --> 00:12:45,543
the least remainder when you divide N2 by N1.
266
00:12:45,543 --> 00:12:49,414
And then the result once one of the numbers gets down to zero
267
00:12:49,414 --> 00:12:52,292
the other number is exactly the shared prime.
268
00:12:52,292 --> 00:12:54,149
So this amazing algorithm,
269
00:12:54,149 --> 00:12:57,051
here's just a little manual example
270
00:12:57,051 --> 00:12:59,898
not with big 512, 1024 bit keys
271
00:12:59,898 --> 00:13:04,143
this is with just tiny little I guess 8 bit primes
272
00:13:04,143 --> 00:13:07,625
16 bit... well looks more like 13 bit keys.
273
00:13:07,625 --> 00:13:12,375
So two numbers coming in and one is 4187 and two is 5989.
274
00:13:12,375 --> 00:13:14,879
You can see if you keep dividing one number by the other,
275
00:13:14,879 --> 00:13:16,162
replacing it by the remainder,
276
00:13:16,162 --> 00:13:19,411
do that a few times, you quickly get down to zero.
277
00:13:19,411 --> 00:13:22,774
And the number right before the zero in the middle of the slide is 53.
278
00:13:22,774 --> 00:13:27,172
Now 53 is a shared prime between these two small keys.
279
00:13:27,172 --> 00:13:28,772
If you scale this up,
280
00:13:28,772 --> 00:13:32,256
if you do this computation for bigger numbers
281
00:13:32,256 --> 00:13:34,176
then here's another timing example.
282
00:13:34,176 --> 00:13:36,673
It's still really really fast, in the middle of the slide.
283
00:13:36,673 --> 00:13:40,359
This is doing 1024 bit keys that share a 512 bit prime.
284
00:13:40,359 --> 00:13:44,554
You can figure out this 512 bit prime so quickly that the timing
285
00:13:44,554 --> 00:13:48,848
there is 0.00 seconds, can't even see how fast it is.
286
00:13:48,848 --> 00:13:51,094
Ok, so that's Euclid's algorithm.
287
00:13:51,094 --> 00:13:53,676
Now you can actually do this for tons and tons of keys.
288
00:13:53,676 --> 00:13:57,695
You download, say, millions, tens of millions of keys from the Internet,
289
00:13:57,695 --> 00:14:01,011
all the different SSL servers, maybe go for some SSH keys,
290
00:14:01,011 --> 00:14:03,891
you download tons and tons of public keys and then
291
00:14:03,891 --> 00:14:06,590
you try all the pairs with Euclid's algorithm.
292
00:14:06,590 --> 00:14:09,406
And it's so fast that you can do this but actually
293
00:14:09,406 --> 00:14:10,848
you can do better.
294
00:14:10,848 --> 00:14:13,927
So there's an algorithm: batch GCD algorithm
295
00:14:13,927 --> 00:14:15,396
which takes a whole bunch of keys
296
00:14:15,396 --> 00:14:19,822
and figures out which of those keys share primes with the others.
297
00:14:19,822 --> 00:14:22,525
Much much faster than trying every pair.
298
00:14:22,525 --> 00:14:23,909
So you don't have to try every pair.
299
00:14:23,909 --> 00:14:26,973
It's kind of like sorting, it's about that kind of speed
300
00:14:26,973 --> 00:14:29,065
instead of having to try every pair.
301
00:14:29,065 --> 00:14:30,811
If you have tens of millions you don't have to do
302
00:14:30,811 --> 00:14:33,156
tens of millions times tens of millions of pairs.
303
00:14:33,156 --> 00:14:35,739
You just have to pretty much reap through the whole list once.
304
00:14:35,739 --> 00:14:38,877
So what does this batch GCD algorithm do?
305
00:14:38,877 --> 00:14:41,447
It's figuring out for the first N1,
306
00:14:41,447 --> 00:14:44,029
instead of computing the greatest common divisor,
307
00:14:44,029 --> 00:14:45,908
the Euclid result for N1 and N2
308
00:14:45,908 --> 00:14:47,846
and for N1 and N3 and so on,
309
00:14:47,846 --> 00:14:52,340
it figures out the product of N2, N3, N4 and so on,
310
00:14:52,340 --> 00:14:55,257
takes the greatest common divisor of that with N1,
311
00:14:55,257 --> 00:14:57,191
applies Euclid's algorithm to that
312
00:14:57,191 --> 00:15:00,874
and then does a similar thing for N2 and for N3 and so on.
313
00:15:00,874 --> 00:15:05,656
Each of those is gonna show you whether N1 or N2 or N3,
314
00:15:05,656 --> 00:15:08,148
it's gonna show you whether they share some prime
315
00:15:08,148 --> 00:15:10,246
with any of the other N's.
316
00:15:10,246 --> 00:15:13,326
If you did exactly the computation
317
00:15:13,326 --> 00:15:15,248
shown on the slide in the math formulas
318
00:15:15,248 --> 00:15:16,617
it would still be too slow.
319
00:15:16,617 --> 00:15:20,457
But the batch GCD algorithm finds redundancies
320
00:15:20,457 --> 00:15:23,218
in these GCDs, and here's exactly what it does:
321
00:15:23,218 --> 00:15:25,620
It figures out the product of all of the keys.
322
00:15:25,620 --> 00:15:29,923
So you take, say, a million keys and you multiply them all together.
323
00:15:29,923 --> 00:15:33,534
And that's done with the code here where you do...
324
00:15:33,534 --> 00:15:35,120
maybe I'll just skip to the bottom.
325
00:15:35,120 --> 00:15:37,474
You see numbers 10, 20, 30, 40.
326
00:15:37,474 --> 00:15:39,356
You build a product tree.
327
00:15:39,356 --> 00:15:42,210
So you take 10 times 20, compute 200.
328
00:15:42,210 --> 00:15:44,687
Take the next pair 30 times 40, compute 1200.
329
00:15:44,687 --> 00:15:47,889
The take the pair of products 200 times 1200.
330
00:15:47,889 --> 00:15:50,378
If you have more numbers, you have more levels in the tree.
331
00:15:50,378 --> 00:15:55,227
And that's exactly what the Sage code does here.
332
00:15:55,227 --> 00:15:58,924
And then the last step of the algorithm is
333
00:15:58,924 --> 00:16:02,847
kind of going down the tree, building a remainder tree.
334
00:16:02,847 --> 00:16:06,515
So what's happening here to figure out the greatest common divisor
335
00:16:06,515 --> 00:16:10,013
of N1 with the product of all the other keys,
336
00:16:10,013 --> 00:16:12,846
the idea is to take this product
337
00:16:12,846 --> 00:16:15,807
which is called big R on this slide,
338
00:16:15,807 --> 00:16:23,160
take the product of all the keys and then divide that by N1^2
339
00:16:23,160 --> 00:16:26,291
and then take the remainders, so R mod N1^2,
340
00:16:26,291 --> 00:16:31,148
divide by N1 and then compute the greatest common divisor of N1 mod N.
341
00:16:31,148 --> 00:16:33,124
And the amazing effect is that it's exactly the
342
00:16:33,124 --> 00:16:35,047
greatest common divisor we were looking for.
343
00:16:35,047 --> 00:16:37,629
And what this little Sage script does is that it takes
344
00:16:37,629 --> 00:16:42,282
R and divides R by, well a bunch of N's in a really fast way,
345
00:16:42,282 --> 00:16:43,960
and then at the bottom, once it has figured out
346
00:16:43,960 --> 00:16:47,227
all mod N1^2, all mod N2^2 and so on,
347
00:16:47,227 --> 00:16:49,308
divides those by N1 and N2,
348
00:16:49,308 --> 00:16:52,177
takes the greatest common divisor with N1 and N2
349
00:16:52,177 --> 00:16:54,824
and that's exactly telling you, the outputs here
350
00:16:54,824 --> 00:16:57,063
that are different from 1, are exactly telling you
351
00:16:57,063 --> 00:17:00,398
which N's share a prime with some others.
352
00:17:00,398 --> 00:17:04,048
This very remarkably short little script does work quite well.
353
00:17:04,048 --> 00:17:06,524
Here's how fast these are.
354
00:17:06,524 --> 00:17:10,248
So I'm on some 800 MHz core.
355
00:17:10,248 --> 00:17:13,442
If you try this for, say, a hundred numbers,
356
00:17:13,442 --> 00:17:16,608
just hundred random 1024 bit numbers,
357
00:17:16,608 --> 00:17:20,917
it doesn't matter what the numbers are, they always run at the same speed,
358
00:17:20,917 --> 00:17:23,198
then that takes 0.05 seconds.
359
00:17:23,198 --> 00:17:27,249
If you have 10 times as many numbers it takes about 20 times as long.
360
00:17:27,249 --> 00:17:31,229
And if you have another 10 times as many numbers it takes about 20 times as long.
361
00:17:31,229 --> 00:17:33,247
And it keeps scaling up like that.
362
00:17:33,247 --> 00:17:35,948
So you can get to millions or tens of millions of numbers
363
00:17:35,948 --> 00:17:38,582
and it still finishes in a reasonable amount of time.
364
00:17:38,582 --> 00:17:40,068
So, amazingly fast algorithm.
365
00:17:40,068 --> 00:17:43,810
You don't have to scale this up to like 2^50 or 2^100 keys.
366
00:17:43,810 --> 00:17:46,059
There's only, say, tens of millions of keys out there
367
00:17:46,059 --> 00:17:48,330
and this runs reasonably fast.
368
00:17:48,330 --> 00:17:50,626
Ok.
369
00:17:50,626 --> 00:17:52,416
Can you actually hope for this to work?
370
00:17:52,416 --> 00:17:54,629
Are random number generators really this bad?
371
00:17:54,629 --> 00:17:55,866
Well...
372
00:17:55,866 --> 00:18:00,478
laughs
373
00:18:00,478 --> 00:18:03,435
There's a 2012 paper from Heninger
374
00:18:03,435 --> 00:18:04,926
that's the same Heninger we have sitting here
375
00:18:04,926 --> 00:18:06,300
and Durumeric, Wustrow, Halderman,
376
00:18:06,300 --> 00:18:09,881
that's got the best paper award at USENIX Security 2012.
377
00:18:09,881 --> 00:18:14,678
What they did was they tried exactly this on tens of millions of keys out there
378
00:18:14,678 --> 00:18:18,276
and factored tens of thousands of keys.
379
00:18:18,276 --> 00:18:22,677
Admittely they used a C version of this instead of this
380
00:18:22,677 --> 00:18:24,061
Sage script but
381
00:18:24,061 --> 00:18:25,827
the Sage script will go up to millions.
382
00:18:25,827 --> 00:18:28,359
It's just when you get beyond that you want to write it in C
383
00:18:28,359 --> 00:18:31,009
to deal with using a lot of memory.
384
00:18:31,009 --> 00:18:34,887
But you can use exactly this script for pretty large chunks of keys.
385
00:18:34,887 --> 00:18:37,951
And so they factored tens of thousands of keys and said
386
00:18:37,951 --> 00:18:42,235
that these keys are typically not your bank keys.
387
00:18:42,235 --> 00:18:44,909
They gonna be, say, keys for your home router.
388
00:18:44,909 --> 00:18:47,639
So the vulnerable devices are not gonna be,
389
00:18:47,639 --> 00:18:49,893
say, a big server out there.
390
00:18:49,893 --> 00:18:53,902
The vulnerable devices will be your FRITZ!Box.
391
00:18:53,902 --> 00:18:57,340
applause
392
00:18:57,340 --> 00:18:59,838
Thank you, Nadia.
393
00:18:59,838 --> 00:19:06,525
It's good to read the paper, go to factorable.net
394
00:19:06,525 --> 00:19:09,073
and you get tons of analyses of why this happened
395
00:19:09,073 --> 00:19:12,520
and why these devices are generated guessable numbers.
396
00:19:12,520 --> 00:19:15,020
I mean numbers that are so non-random that they get
397
00:19:15,020 --> 00:19:18,370
repeated between a lot of keys out there.
398
00:19:18,370 --> 00:19:20,308
There was at the same time another team that
399
00:19:20,308 --> 00:19:23,559
like within days announced the same result.
400
00:19:23,559 --> 00:19:27,071
They independently did the same computation.
401
00:19:27,071 --> 00:19:28,883
They said "yeah, we've also downloaded
402
00:19:28,883 --> 00:19:31,352
a bunch of keys and factored as many as we could"
403
00:19:31,352 --> 00:19:32,869
with basically the same algorithm.
404
00:19:32,869 --> 00:19:35,198
At the end of it "yeah we factored
405
00:19:35,198 --> 00:19:36,629
tens of thousands of keys."
406
00:19:36,629 --> 00:19:38,882
That paper has no analysis.
407
00:19:38,882 --> 00:19:41,582
They're like "yeah, there's keys"
408
00:19:41,582 --> 00:19:43,267
"they share primes for some reason"
409
00:19:43,267 --> 00:19:45,915
"I guess ecommerce is dead"
410
00:19:45,915 --> 00:19:47,536
"go out, change your bank keys"
411
00:19:47,536 --> 00:19:48,846
"call the New York Times"
412
00:19:48,846 --> 00:19:50,553
There's the New York Times article:
413
00:19:50,553 --> 00:19:52,601
"Flaw Found in an Online Encryption Method"
414
00:19:52,601 --> 00:19:54,860
I also like the advertising on the side here.
415
00:19:54,860 --> 00:19:57,151
Like iron key on the top and then:
416
00:19:57,151 --> 00:20:01,168
"Encrypt, decrypt & access my secret data in real time"
417
00:20:01,168 --> 00:20:03,135
"try it free for 30 days."
418
00:20:03,135 --> 00:20:05,199
"I secured my cloud data. Have you?"
419
00:20:05,199 --> 00:20:06,878
Awesome advertising there.
420
00:20:06,878 --> 00:20:12,480
Once you found that there's a problem like this
421
00:20:12,480 --> 00:20:14,919
then of course you start looking around for more and more places
422
00:20:14,919 --> 00:20:17,435
where you might have some bad randomness.
423
00:20:17,435 --> 00:20:22,795
For example there is a paper, or at least slides,
424
00:20:22,795 --> 00:20:25,395
by Chou in Taiwan, saying
425
00:20:25,395 --> 00:20:31,857
they took two million Taiwan Citizen Digital Certificates
426
00:20:31,857 --> 00:20:34,327
and did some GCDs
427
00:20:34,327 --> 00:20:36,626
and found that a 103 of those were
428
00:20:36,626 --> 00:20:37,995
factorable.
429
00:20:37,995 --> 00:20:40,052
So anybody who downloaded these certificates
430
00:20:40,052 --> 00:20:43,177
which apparently are used in Taiwan for like paying your taxes,
431
00:20:43,177 --> 00:20:44,393
talking to banks I don't know,
432
00:20:44,393 --> 00:20:46,677
but paying taxes is one that I've checked.
433
00:20:46,677 --> 00:20:49,537
So a 103 people out there have these Taiwan
434
00:20:49,537 --> 00:20:51,011
Citizen Digital Certificates where
435
00:20:51,011 --> 00:20:54,121
you're supposed to have in this database things like
436
00:20:54,121 --> 00:20:57,122
the names of the people and their ID numbers
437
00:20:57,122 --> 00:20:59,576
but you aren't supposed to have their secret keys.
438
00:20:59,576 --> 00:21:01,543
The whole point is those are supposed to be private
439
00:21:01,543 --> 00:21:03,938
kept on some smartcards that are issued to the citizens
440
00:21:03,938 --> 00:21:05,295
in Taiwan.
441
00:21:05,295 --> 00:21:08,706
And the randomness generation is so bad that
442
00:21:08,706 --> 00:21:13,169
103 of those keys have now been factored.
443
00:21:13,169 --> 00:21:15,451
The smartcard manufacturer, I also like this quote,
444
00:21:15,451 --> 00:21:18,674
it's a company based in Munich called
445
00:21:18,674 --> 00:21:22,315
Giesecke & Devrient and their motto is:
446
00:21:22,315 --> 00:21:24,753
"Creating Confidence"
447
00:21:24,753 --> 00:21:35,270
applause
448
00:21:35,270 --> 00:21:37,698
Tanja: There's gonna be more of Dan, don't worry.
449
00:21:37,698 --> 00:21:39,835
But we promised a bit more than just factoring
450
00:21:39,835 --> 00:21:43,309
bad keys or the keys with their primes.
451
00:21:43,309 --> 00:21:46,956
If you find another number like this one.
452
00:21:46,956 --> 00:21:49,686
So here's coming some more of math stuff.
453
00:21:49,686 --> 00:21:52,555
So if you see a number like this and you
454
00:21:52,555 --> 00:21:56,427
observe this last digit, then yes.
455
00:21:56,427 --> 00:21:58,738
It's very easy for you to see it's divisable by 5.
456
00:21:58,738 --> 00:22:01,385
So if you find such a key that is easy to break
457
00:22:01,385 --> 00:22:03,226
and you are a computer rather than a human being
458
00:22:03,226 --> 00:22:05,324
you look at lots of those numbers.
459
00:22:05,324 --> 00:22:07,076
You have a list of small primes stored
460
00:22:07,076 --> 00:22:10,037
and what you're gonna do is just trial division.
461
00:22:10,037 --> 00:22:11,987
So if there's any small factor
462
00:22:11,987 --> 00:22:14,092
it's very easy to find this trial division.
463
00:22:14,092 --> 00:22:16,708
Or what Dan showed was the remainder trees.
464
00:22:16,708 --> 00:22:20,222
You can also do this for batch trial division.
465
00:22:20,222 --> 00:22:23,728
So assuming that you're looking for a prime p somewhere
466
00:22:23,728 --> 00:22:26,735
then you go linearly all the way up to the p
467
00:22:26,735 --> 00:22:30,442
and there are about p/log(p) many primes
468
00:22:30,442 --> 00:22:32,094
before you find this p.
469
00:22:32,094 --> 00:22:33,969
For each of them you just do exact division.
470
00:22:33,969 --> 00:22:36,558
If it works, fine you found a factor, otherwise not.
471
00:22:36,558 --> 00:22:39,459
Now this is not going to work against your normal keys.
472
00:22:39,459 --> 00:22:43,624
I know that in the example that Dan mentioned by Wagner&Goldberg
473
00:22:43,624 --> 00:22:46,625
that they found one factor which had a 9 in there
474
00:22:46,625 --> 00:22:49,975
but usually your keys don't have a 9.
475
00:22:49,975 --> 00:22:54,687
For slightly larger numbers there is a method due to Pollard.
476
00:22:54,687 --> 00:22:57,460
So if you see the respectful number here,
477
00:22:57,460 --> 00:22:59,887
there is no obvious divisability of this one.
478
00:22:59,887 --> 00:23:03,091
One thing you can try is
479
00:23:03,091 --> 00:23:06,660
a walk, well we call this a walk.
480
00:23:06,660 --> 00:23:10,404
You start with some random number smaller than N
481
00:23:10,404 --> 00:23:12,902
and some offset here.
482
00:23:12,902 --> 00:23:17,277
I should also say, all those scripts are available at this URL.
483
00:23:17,277 --> 00:23:22,087
So if you go to facthacks.cr.yp.to then you'll find all the scripts
484
00:23:22,087 --> 00:23:23,879
if you want to play with it at the same time,
485
00:23:23,879 --> 00:23:26,625
including this one with some explanations and such.
486
00:23:26,625 --> 00:23:31,092
So what you do is you start off two different numbers,
487
00:23:31,092 --> 00:23:34,788
one at each step squares it and adds c.
488
00:23:34,788 --> 00:23:37,226
And the other one squares it, adds c,
489
00:23:37,226 --> 00:23:39,886
and squares it again and adds the c.
490
00:23:39,886 --> 00:23:41,455
Each time computing mod N.
491
00:23:41,455 --> 00:23:44,102
So you have two sequences running around,
492
00:23:44,102 --> 00:23:46,023
one is twice the speed as the other.
493
00:23:46,023 --> 00:23:50,953
At every step you check with the GCD of this number N
494
00:23:50,953 --> 00:23:52,757
and the difference between the two walks
495
00:23:52,757 --> 00:23:55,006
hasn't anything interesting now.
496
00:23:55,006 --> 00:23:57,390
N is an RSA modulus, the only thing interesting
497
00:23:57,390 --> 00:23:59,675
could either be p or q.
498
00:23:59,675 --> 00:24:03,009
So for this particular number
499
00:24:03,009 --> 00:24:04,270
when I run this
500
00:24:04,270 --> 00:24:08,894
I find 2053 after a few milliseconds.
501
00:24:08,894 --> 00:24:10,642
So this is again a cooked up example.
502
00:24:10,642 --> 00:24:13,307
You won't find such a small factor if you're trying
503
00:24:13,307 --> 00:24:16,173
a general RSA factor, otherwise your numbers
504
00:24:16,173 --> 00:24:17,830
would be totally busted but
505
00:24:17,830 --> 00:24:20,604
this shows if you have a small number in there
506
00:24:20,604 --> 00:24:21,857
it's very easy to find.
507
00:24:21,857 --> 00:24:24,357
All later on when Dan is talking about the number field sieve
508
00:24:24,357 --> 00:24:26,157
this is an interesting subroutine.
509
00:24:26,157 --> 00:24:28,089
So if you ever need to find small numbers
510
00:24:28,089 --> 00:24:31,020
then, sure trial division is very easy for
511
00:24:31,020 --> 00:24:34,080
really small numbers taking this long,
512
00:24:34,080 --> 00:24:38,352
this one squared with p is much faster already.
513
00:24:38,352 --> 00:24:42,070
Now even faster than that, also due to Pollard,
514
00:24:42,070 --> 00:24:45,378
called Pollard's p-1 method.
515
00:24:45,378 --> 00:24:47,904
Here is again an innocent looking number N.
516
00:24:47,904 --> 00:24:49,842
You see the numbers get larger.
517
00:24:49,842 --> 00:24:52,176
This is a 256 bit number.
518
00:24:52,176 --> 00:24:56,744
In order to deal with such a number there is one step
519
00:24:56,744 --> 00:24:59,342
which is expensive but you would do it only once
520
00:24:59,342 --> 00:25:01,208
for all different numbers you want to try,
521
00:25:01,208 --> 00:25:05,238
namely compute a big integer which has lots of little factors.
522
00:25:05,238 --> 00:25:06,985
So you take the least common multiple
523
00:25:06,985 --> 00:25:09,788
from 1, 2, 3, 4, 5...
524
00:25:09,788 --> 00:25:12,203
Ok, once you hit the 4 you have another 2 in there.
525
00:25:12,203 --> 00:25:13,808
So there's lots and lots of powers of 2,
526
00:25:13,808 --> 00:25:15,408
lots and lots of powers of 3.
527
00:25:15,408 --> 00:25:19,390
And then the larger primes only appear like once.
528
00:25:19,390 --> 00:25:23,161
But you're not going up to 256 or 128 anywhere.
529
00:25:23,161 --> 00:25:27,005
You're sticking here at 22 bits
530
00:25:27,005 --> 00:25:29,059
and then use the same function that Nadia explained
531
00:25:31,023 --> 00:25:32,987
in the RSA computation, namely the modular exponentiation.
532
00:25:32,987 --> 00:25:36,576
So you take the 2^y,
533
00:25:36,576 --> 00:25:38,241
this huge number,
534
00:25:38,241 --> 00:25:40,486
but the whole computation mod N.
535
00:25:40,486 --> 00:25:44,633
So this whole thing never grows beyond this 256 bit number.
536
00:25:44,633 --> 00:25:47,939
Or for a real world example it would be a 1000 bit number.
537
00:25:47,939 --> 00:25:52,518
And then you compute the GCD of this number and N.
538
00:25:52,518 --> 00:25:55,091
For this particular one, again it's a cooked up example,
539
00:25:55,091 --> 00:25:56,390
you get a factor.
540
00:25:56,390 --> 00:25:59,791
Now this factor is actually 120 bits.
541
00:25:59,791 --> 00:26:02,369
This is not a small factor.
542
00:26:02,369 --> 00:26:05,688
So if you were using 256 bit numbers
543
00:26:05,688 --> 00:26:07,856
this is something that could happen to you
544
00:26:07,856 --> 00:26:11,305
and nevertheless this method would find it.
545
00:26:11,305 --> 00:26:13,522
So this finds much larger factors than trial division
546
00:26:13,522 --> 00:26:15,484
or the Rho method in the same amount of time.
547
00:26:15,484 --> 00:26:17,376
But it doesn't work for all primes.
548
00:26:17,376 --> 00:26:20,066
It works for primes which are kind of special.
549
00:26:20,066 --> 00:26:23,859
Now what's special about this prime or the other factor?
550
00:26:23,859 --> 00:26:29,553
If you take p and subtract -1, hence the name p-1 method,
551
00:26:29,553 --> 00:26:31,660
and factor that thing,
552
00:26:31,660 --> 00:26:35,272
that has lots, lots of small numbers.
553
00:26:35,272 --> 00:26:36,937
So that's something we call smooth.
554
00:26:36,937 --> 00:26:40,342
So a smooth number doesn't have big factors.
555
00:26:40,342 --> 00:26:43,868
So a prime is like the least smooth number you can imagine.
556
00:26:43,868 --> 00:26:47,367
And 2 to the power large is the most smooth number,
557
00:26:47,367 --> 00:26:50,007
so lots of little factors means smooth.
558
00:26:50,007 --> 00:26:55,568
This largest factor happens to be 21.7 bits
559
00:26:55,568 --> 00:26:58,355
which is why I chose the 22 bits here.
560
00:26:58,355 --> 00:27:03,702
So as soon I run over the largest prime in p-1
561
00:27:03,702 --> 00:27:06,127
with this exponent y here
562
00:27:06,127 --> 00:27:08,653
then I do find this factor.
563
00:27:08,653 --> 00:27:10,851
Now if you thought this was math
564
00:27:10,851 --> 00:27:12,322
there is even more scary math
565
00:27:12,322 --> 00:27:15,893
namely there are methods which are generalizing
566
00:27:15,893 --> 00:27:19,319
this p-1 method using elliptic curves.
567
00:27:19,319 --> 00:27:21,569
If you listen to the crypto advertisement
568
00:27:21,569 --> 00:27:23,472
elliptic curves are usually something very good,
569
00:27:23,472 --> 00:27:25,005
something you should have on your smartcard.
570
00:27:25,005 --> 00:27:27,245
It's faster than RSA and so on.
571
00:27:27,245 --> 00:27:29,385
That's the same type of elliptic curve
572
00:27:29,385 --> 00:27:32,342
but here elliptic curves come in as an attack tool.
573
00:27:32,342 --> 00:27:35,088
There is a method called the elliptic curve method ECM
574
00:27:35,088 --> 00:27:41,255
which is a generalization of this p-1 method and does not need anything special,
575
00:27:41,255 --> 00:27:44,708
I mean avoiding such primes is easy.
576
00:27:44,708 --> 00:27:46,325
If you look into older standards
577
00:27:46,325 --> 00:27:49,852
they all warn about "make sure to use strong primes"
578
00:27:49,852 --> 00:27:53,174
"make sure to check that your p-1 is not smooth"
579
00:27:53,174 --> 00:27:55,301
And of course you don't want your p-1 to be smooth
580
00:27:55,301 --> 00:27:57,670
because otherwise this attack works.
581
00:27:57,670 --> 00:28:00,760
But if I have some elliptic curve
582
00:28:00,760 --> 00:28:03,492
a different type of prime is weak for that curve
583
00:28:03,492 --> 00:28:06,351
and I have many, many, many elliptic curves.
584
00:28:06,351 --> 00:28:08,617
For each of the curves some primes are weak.
585
00:28:08,617 --> 00:28:11,161
You can't exclude this method.
586
00:28:11,161 --> 00:28:13,573
So the only thing you can do is just, well,
587
00:28:13,573 --> 00:28:18,035
choose your prime large enough and hope, well,
588
00:28:18,035 --> 00:28:20,436
make sure p-1 doesn't get it and hope
589
00:28:20,436 --> 00:28:22,601
that the next elliptic curves won't get it.
590
00:28:22,601 --> 00:28:24,855
So elliptic curves are good for attacking.
591
00:28:24,855 --> 00:28:27,624
It's still not the fastest method out there
592
00:28:27,624 --> 00:28:33,185
but it's a good method for random factoring.
593
00:28:33,185 --> 00:28:35,917
It's not the best method for finding RSA factors
594
00:28:35,917 --> 00:28:38,366
but if you have a number which is most likely
595
00:28:38,366 --> 00:28:44,532
not too non-smooth then it's a good method.
596
00:28:44,532 --> 00:28:47,234
Now this is what you do for finding small factors.
597
00:28:47,234 --> 00:28:50,509
To show some method for big factors,
598
00:28:50,509 --> 00:28:53,994
now here that is a big number.
599
00:28:53,994 --> 00:28:58,978
But this is what I would call factorization by inspection.
600
00:28:58,978 --> 00:29:07,876
What's wrong with this number?
601
00:29:07,876 --> 00:29:10,129
I mean this number is not small and
602
00:29:10,129 --> 00:29:13,162
it won't have small factors, I promise you this.
603
00:29:13,162 --> 00:29:19,835
So it's a decimal representation, so 9 is the digit before 0.
604
00:29:19,835 --> 00:29:23,246
This looks very close to the power of 10.
605
00:29:23,246 --> 00:29:28,253
Actually this is very close to 10^340.
606
00:29:28,253 --> 00:29:30,292
If you take the square root of it
607
00:29:30,292 --> 00:29:34,150
then you almost exactly hit 10^170.
608
00:29:34,150 --> 00:29:37,113
This example was cooked up as taking two primes
609
00:29:37,113 --> 00:29:46,646
namely 10^170-33 and 10^170+63 and multiplying them.
610
00:29:46,646 --> 00:29:48,667
So if you see lots of 0's and lots of 9's
611
00:29:48,667 --> 00:29:50,934
then you know that you are very close to a certain power.
612
00:29:50,934 --> 00:29:53,464
Now in real life this wouldn't be a power of 10,
613
00:29:53,464 --> 00:29:54,817
it would be a power of 2.
614
00:29:54,817 --> 00:29:56,179
And there actually are some examples.
615
00:29:56,179 --> 00:29:58,817
There's a famous story of a letter bomber in Austria
616
00:29:58,817 --> 00:30:02,650
who wanted his message to be crackable
617
00:30:02,650 --> 00:30:06,047
and he sent a message, well he was some right-wing asshole
618
00:30:06,047 --> 00:30:08,332
that was letter-bombing people
619
00:30:08,332 --> 00:30:10,813
and then he sent an encrypted message to the police.
620
00:30:10,813 --> 00:30:14,642
And the police tried to factor it and tried to decipher it
621
00:30:14,642 --> 00:30:18,568
hoping it would be the list of the next victims.
622
00:30:18,568 --> 00:30:20,980
In the end it was not but
623
00:30:20,980 --> 00:30:24,419
this person was actually using very close to a power of 2
624
00:30:24,419 --> 00:30:26,211
in order to have people crack it.
625
00:30:26,211 --> 00:30:27,769
In the end it was just some propaganda,
626
00:30:27,769 --> 00:30:29,462
some right-wing stuff as well.
627
00:30:29,462 --> 00:30:32,783
So there are people using this for breakability.
628
00:30:32,783 --> 00:30:35,499
I'm not aware of people actually using this as an accident.
629
00:30:35,499 --> 00:30:40,437
But what can happen to you is not using close to the power of 2
630
00:30:40,437 --> 00:30:46,200
but saying "ok I learned I shouldn't be close to the power of 2
631
00:30:46,200 --> 00:30:48,848
so I take some offset and take the next prime."
632
00:30:48,848 --> 00:30:51,130
Next prime just means, add 1, check is it prime,
633
00:30:51,130 --> 00:30:53,798
add next one, add 2, add 3
634
00:30:53,798 --> 00:30:57,082
Just check primes linearly from then on.
635
00:30:57,082 --> 00:30:59,261
Now if we jump to this where we finally find our prime p
636
00:30:59,261 --> 00:31:00,966
it's a good prime.
637
00:31:00,966 --> 00:31:03,869
It's not anywhere close to the power of 2 because my c was large enough.
638
00:31:03,869 --> 00:31:07,130
Made it, good.
639
00:31:07,130 --> 00:31:08,562
Now on q.
640
00:31:08,562 --> 00:31:10,776
So again call the next_prime().
641
00:31:10,776 --> 00:31:14,445
So I do p+1, p+2 until I find a prime.
642
00:31:14,445 --> 00:31:17,215
That means that if you take the product of the two
643
00:31:17,215 --> 00:31:19,469
then this N is very close to a square.
644
00:31:19,469 --> 00:31:22,920
So this N is cooked up with this method.
645
00:31:22,920 --> 00:31:24,412
You don't see anything on the end,
646
00:31:24,412 --> 00:31:26,678
there's nothing wrong there.
647
00:31:26,678 --> 00:31:29,650
But when you compute the square root of it
648
00:31:29,650 --> 00:31:33,156
it is almost an integer, it's .99999...
649
00:31:33,156 --> 00:31:35,208
and then some dirt here.
650
00:31:35,208 --> 00:31:39,288
So then you'll know that your primes were too small.
651
00:31:39,288 --> 00:31:42,286
Sorry, not too small, too close to each other.
652
00:31:42,286 --> 00:31:45,483
So if you ever take an RSA factor modulus
653
00:31:45,483 --> 00:31:46,841
and you compute the square root
654
00:31:46,841 --> 00:31:47,991
and you see something like this
655
00:31:47,991 --> 00:31:51,885
then you know: that user did this method.
656
00:31:51,885 --> 00:31:53,661
You could just say, I take the number,
657
00:31:53,661 --> 00:31:56,549
subtract 1, subtract 2, just go off from the square root.
658
00:31:56,549 --> 00:32:02,022
All you can check how far away is it from being a square.
659
00:32:02,022 --> 00:32:03,701
So you take the seeding,
660
00:32:03,701 --> 00:32:07,950
that means the closest integer counting upwards.
661
00:32:07,950 --> 00:32:11,817
Just round it to this two, here's 57, is there a 56?
662
00:32:11,817 --> 00:32:14,520
Compute the square of that and subtract N.
663
00:32:14,520 --> 00:32:17,987
Now in this example that is 4096
664
00:32:17,987 --> 00:32:22,595
which itself is a square.
665
00:32:22,595 --> 00:32:26,636
So the difference of N and a^2 is a square.
666
00:32:26,636 --> 00:32:30,915
Then I take this 64, take a-64, divide N,
667
00:32:30,915 --> 00:32:34,413
it's an exact division, so I found one of the factors.
668
00:32:34,413 --> 00:32:37,200
And then there's the other factor.
669
00:32:37,200 --> 00:32:39,738
It doesn't matter whether it's a power of 2 or a power of 10.
670
00:32:39,738 --> 00:32:42,569
What matters is that the numbers are too close.
671
00:32:42,569 --> 00:32:44,712
Now this method actually works surprisingly well
672
00:32:44,712 --> 00:32:47,436
even if we don't do next_prime().
673
00:32:47,436 --> 00:32:50,332
So here's another example where I didn't do the next_prime()
674
00:32:50,332 --> 00:32:52,281
but did something larger
675
00:32:52,281 --> 00:32:55,114
so this is not really close to a square.
676
00:32:55,114 --> 00:32:57,731
Some 9's but not too many.
677
00:32:57,731 --> 00:33:00,053
What we did before we wrote it as a difference of squares
678
00:33:00,053 --> 00:33:04,614
and then divided N by one of these two factors.
679
00:33:04,614 --> 00:33:10,050
Now in this case here I would not start as a^2-N
680
00:33:10,050 --> 00:33:12,255
but I say, well, if a^2-N is not a square
681
00:33:12,255 --> 00:33:15,997
I try the next one, I do (a+1)^2, (a+2)^2
682
00:33:15,997 --> 00:33:20,777
each time subtract N and check when this is a square.
683
00:33:20,777 --> 00:33:24,170
Now for this example I get lucky after 2.
684
00:33:24,170 --> 00:33:26,331
And this actually took me a long time to cook up this example
685
00:33:26,331 --> 00:33:28,613
because I was starting at something which was
686
00:33:28,613 --> 00:33:31,669
half of the bits of p were changed.
687
00:33:31,669 --> 00:33:35,872
So I was actually starting with a cube quite a distance away.
688
00:33:35,872 --> 00:33:38,216
Still, most of those were just giving a square.
689
00:33:38,216 --> 00:33:42,070
They were not giving me anything larger than zero here.
690
00:33:42,070 --> 00:33:45,598
Now for general numbers if I wouldn't do like next_number()
691
00:33:45,598 --> 00:33:48,786
but just, well, random prime, another random prime.
692
00:33:48,786 --> 00:33:52,687
It still works but takes a long time
693
00:33:52,687 --> 00:33:55,488
because I can always write it this way.
694
00:33:55,488 --> 00:33:56,786
This is a, this is b.
695
00:33:56,786 --> 00:34:00,603
But then usually it would take about square root of N
696
00:34:00,603 --> 00:34:03,752
which is the same size as p until I get there.
697
00:34:03,752 --> 00:34:07,352
This is a nice method if it looks like lots of
698
00:34:07,352 --> 00:34:13,386
9999 and otherwise do something better as Dan will tell now.
699
00:34:13,386 --> 00:34:13,955
Dan: Ok.
700
00:34:13,955 --> 00:34:21,508
applause
701
00:34:21,508 --> 00:34:23,662
Alright, so it's bad to have p and q
702
00:34:23,662 --> 00:34:24,805
too close to each other
703
00:34:24,805 --> 00:34:28,640
or have a small p or to have p and q non-random.
704
00:34:28,640 --> 00:34:31,010
So let's do everything right.
705
00:34:31,010 --> 00:34:34,236
Let's make just independently choose some big p
706
00:34:34,236 --> 00:34:37,033
between 2^511 and 2^512
707
00:34:37,033 --> 00:34:38,667
and choose some big q
708
00:34:38,667 --> 00:34:42,586
without being like the next prime or anywhere in the ballpark of p.
709
00:34:42,586 --> 00:34:47,859
You could maybe say q has to be between 2^509 and 2^510
710
00:34:47,859 --> 00:34:49,522
and then it's definitely nowhere near p.
711
00:34:49,522 --> 00:34:52,539
Now you got this public key and p times q
712
00:34:52,539 --> 00:34:54,473
which is a 1024 bit RSA key.
713
00:34:54,473 --> 00:34:58,745
If nothing has gone wrong with your randomness generation
714
00:34:58,745 --> 00:35:00,153
then what does the attacker do?
715
00:35:00,153 --> 00:35:04,472
Well, we don't know anything fast.
716
00:35:04,472 --> 00:35:06,840
So this is gonna be the one part of the talk
717
00:35:06,840 --> 00:35:08,495
where there's these big powers of 2
718
00:35:08,495 --> 00:35:09,778
for how fast things are
719
00:35:09,778 --> 00:35:11,139
because they're not really fast.
720
00:35:11,139 --> 00:35:14,624
You have to think about how much something like 2^80 is.
721
00:35:14,624 --> 00:35:15,938
There's all these different methods,
722
00:35:15,938 --> 00:35:17,944
I'll just skip down to the last two.
723
00:35:17,944 --> 00:35:19,999
There's the quadratic sieve (QS),
724
00:35:19,999 --> 00:35:22,114
the number field sieve (NFS).
725
00:35:22,114 --> 00:35:24,109
These have been around since, well,
726
00:35:24,109 --> 00:35:29,192
QS since the 80th, NFS since the early 90th.
727
00:35:29,192 --> 00:35:32,327
The NFS, the general consensus is
728
00:35:32,327 --> 00:35:35,197
it takes something like 2^80 operations.
729
00:35:35,197 --> 00:35:36,675
Now here's something fun to try.
730
00:35:36,675 --> 00:35:38,844
If somebody says 2^80 operations
731
00:35:38,844 --> 00:35:41,473
and there's some cryptographer talking about the security or something,
732
00:35:41,473 --> 00:35:44,244
ask them: what do you mean by an operation?
733
00:35:44,244 --> 00:35:45,996
How fast is this operation?
734
00:35:45,996 --> 00:35:47,480
And then most of the people saying this
735
00:35:47,480 --> 00:35:48,624
will go running screaming like:
736
00:35:48,624 --> 00:35:49,850
I don't know what an operation is,
737
00:35:49,850 --> 00:35:51,193
don't ask me that question.
738
00:35:51,193 --> 00:35:53,382
But still people will confidently tell you
739
00:35:53,382 --> 00:35:56,230
that the NFS takes 2^80 operations
740
00:35:56,230 --> 00:36:00,347
to break any 1024 bit RSA key
741
00:36:00,347 --> 00:36:02,237
even if the user hasn't done anything wrong,
742
00:36:02,237 --> 00:36:04,047
the implementor hasn't done anything wrong.
743
00:36:04,047 --> 00:36:05,631
And this number, this 2^80
744
00:36:05,631 --> 00:36:07,618
if you pin down what those operations are,
745
00:36:07,618 --> 00:36:11,100
this is something which is doable today
746
00:36:11,100 --> 00:36:15,746
by people with a big botnet or people with a big computer cluster.
747
00:36:15,746 --> 00:36:18,611
I'll give some details what I mean, the size is there.
748
00:36:18,611 --> 00:36:21,797
As your computers get cheaper and cheaper,
749
00:36:21,797 --> 00:36:23,720
somehow chips aren't going up in clockspeed
750
00:36:23,720 --> 00:36:25,149
but they're still getting cheaper.
751
00:36:25,149 --> 00:36:27,636
You can get a really cheap ARM which is doing
752
00:36:27,636 --> 00:36:28,904
the same kind of computations
753
00:36:28,904 --> 00:36:31,501
that a pretty big CPU was doing several years ago.
754
00:36:31,501 --> 00:36:33,181
And chips are gonna continue getting cheaper
755
00:36:33,181 --> 00:36:36,604
for a while. These attacks against RSA 1024
756
00:36:36,604 --> 00:36:39,568
are going to be common doable for more and more people.
757
00:36:39,568 --> 00:36:43,119
How do these methods work?
758
00:36:43,119 --> 00:36:47,210
I'll give one example and then one little Sage script.
759
00:36:47,210 --> 00:36:50,169
Let's try just a really small number.
760
00:36:50,169 --> 00:36:51,717
This will be with the quadratic sieve
761
00:36:51,717 --> 00:36:53,727
which is not as fancy as the number field sieve
762
00:36:53,727 --> 00:36:56,395
but it will give you some idea of how these methods work.
763
00:36:56,395 --> 00:36:59,978
The QS starts with that Fermat method you heard about.
764
00:36:59,978 --> 00:37:04,066
Remember, the idea there was to try to find a square
765
00:37:04,066 --> 00:37:08,574
as a^2-N, so a was like the square root of N
766
00:37:08,574 --> 00:37:10,525
or maybe the plus 1, plus 2 and so on
767
00:37:10,525 --> 00:37:13,333
and you keep searching for a+1 and so on,
768
00:37:13,333 --> 00:37:15,933
search for numbers that if you square them and subtract N
769
00:37:15,933 --> 00:37:17,056
then you get a square.
770
00:37:17,056 --> 00:37:20,708
So let's try that for 50=2759 which is not very big.
771
00:37:20,708 --> 00:37:23,975
Then you try, 53 was the ceiling of the square root,
772
00:37:23,975 --> 00:37:27,673
square that, subtract 2759, you get 50.
773
00:37:27,673 --> 00:37:31,171
Well 25 would be a square, but 50 is not a square
774
00:37:31,171 --> 00:37:33,309
it's 2 times 25, 2 times 5^2.
775
00:37:33,309 --> 00:37:37,208
And then you try another number, 54^2-2759
776
00:37:37,208 --> 00:37:39,911
umm 157, that's too complicated
777
00:37:39,911 --> 00:37:42,707
I don't remember that being a square so let's just skip it.
778
00:37:42,707 --> 00:37:45,687
And then you keep going like this, 266, 377,
779
00:37:45,687 --> 00:37:48,223
490. I remember 49 is a square
780
00:37:48,223 --> 00:37:50,354
but that doesn't mean that 490 is a square
781
00:37:50,354 --> 00:37:52,372
cause 10, 2 times 5, that's not a square.
782
00:37:52,372 --> 00:37:55,871
And 605, you can try...
783
00:37:55,871 --> 00:37:58,640
We remember how to take out the 5
784
00:37:58,640 --> 00:38:01,275
and then 60... 605 divided by 5.
785
00:38:01,275 --> 00:38:03,588
You can figure out it's 5 times 11^2.
786
00:38:03,588 --> 00:38:05,158
It's almost a square.
787
00:38:05,158 --> 00:38:06,774
You keep doing this for a while
788
00:38:06,774 --> 00:38:10,534
and it seems like Fermat's method is actually working pretty badly.
789
00:38:10,534 --> 00:38:13,754
It does work eventually but it's taking quite a while.
790
00:38:13,754 --> 00:38:16,671
Here's what the quadratic sieve does.
791
00:38:16,671 --> 00:38:19,527
From the same computation you look at these
792
00:38:19,527 --> 00:38:21,585
numbers that were not exactly squares
793
00:38:21,585 --> 00:38:23,634
50 and 490 and 605
794
00:38:23,634 --> 00:38:27,056
and you notice if you multiple them together
795
00:38:27,056 --> 00:38:28,472
50 was 2 times 5^2
796
00:38:28,472 --> 00:38:30,756
490 is 2 times 5 times 7^2
797
00:38:30,756 --> 00:38:33,068
605 is 5 times 11^2
798
00:38:33,068 --> 00:38:34,585
and then you multiply those together
799
00:38:34,585 --> 00:38:38,524
you get 2^2 and 5^4 and 7^2 and 11^2
800
00:38:38,524 --> 00:38:41,123
and that's exactly the square of something.
801
00:38:41,123 --> 00:38:44,592
The square of 2^1 times 5^2 times 7 times 11.
802
00:38:44,592 --> 00:38:48,175
Then the quadratic sieve is taking the square root of that number
803
00:38:48,175 --> 00:38:52,553
and subtracting that from the product of the fifty-somethings
804
00:38:52,553 --> 00:38:53,953
that were on the left side
805
00:38:53,953 --> 00:38:56,826
and taking the greatest common divisor of that with N
806
00:38:56,826 --> 00:38:59,787
and amazingly that produces a factor of N.
807
00:38:59,787 --> 00:39:02,289
And it's not just some random accident that
808
00:39:02,289 --> 00:39:04,370
if you had tried the greatest common divisor of
809
00:39:04,370 --> 00:39:06,085
"write down some hocuspocus then"
810
00:39:06,085 --> 00:39:08,803
"you eventually get 31 coming out"
811
00:39:08,803 --> 00:39:10,576
"every 31 times you write down a random number"
812
00:39:10,576 --> 00:39:13,020
"it will have this 31 dividing it"
813
00:39:13,020 --> 00:39:15,347
No, you can try this for bigger and bigger numbers
814
00:39:15,347 --> 00:39:17,038
and actually this keeps working.
815
00:39:17,038 --> 00:39:18,663
As soon as you find a square product
816
00:39:18,663 --> 00:39:23,212
then half of those square products are gonna factor N.
817
00:39:23,212 --> 00:39:25,365
So that's how the quadratic sieve works.
818
00:39:25,365 --> 00:39:28,915
Here's a more systematic script with a much bigger number
819
00:39:28,915 --> 00:39:31,300
which if it were just hocuspocus this wouldn't work.
820
00:39:31,300 --> 00:39:32,616
But this does work.
821
00:39:32,616 --> 00:39:36,834
So there's a number from my computer's random number generator
822
00:39:36,834 --> 00:39:41,661
31415926... laughter
823
00:39:41,661 --> 00:39:45,099
You try writing down a bunch of squares...
824
00:39:45,099 --> 00:39:48,285
maybe I should get my computer's random number generator checked.
825
00:39:48,285 --> 00:39:50,636
It is this two year old laptop you heard about before
826
00:39:50,636 --> 00:39:52,202
so I'm not sure it's working properly.
827
00:39:52,202 --> 00:39:55,599
You take a bunch of a^2-N and then
828
00:39:55,599 --> 00:39:57,414
let's make a list of those called X.
829
00:39:57,414 --> 00:40:00,752
You can see the range, there is up to 500.000 numbers.
830
00:40:00,752 --> 00:40:02,948
It's a pretty big list we're talking about.
831
00:40:02,948 --> 00:40:05,749
And then search through those elements,
832
00:40:05,749 --> 00:40:08,538
search through those a^2-N, those differences,
833
00:40:08,538 --> 00:40:09,504
the elements in this list,
834
00:40:09,504 --> 00:40:12,985
to see which ones have easy factorizations.
835
00:40:12,985 --> 00:40:15,445
Now a lot of numbers like that 157
836
00:40:15,445 --> 00:40:16,970
I know what the factorization of that is
837
00:40:16,970 --> 00:40:19,846
but there is function which you can find on our website
838
00:40:19,846 --> 00:40:22,696
easyfactorizations() which looks through the list X
839
00:40:22,696 --> 00:40:25,251
and if it finds numbers that are easy to factor
840
00:40:25,251 --> 00:40:27,935
then it quickly writes down the factorizations
841
00:40:27,935 --> 00:40:30,482
using the small prime techniques you heard about.
842
00:40:30,482 --> 00:40:33,169
And then makes a list of those easy factorizations,
843
00:40:33,169 --> 00:40:34,396
puts those into F.
844
00:40:34,396 --> 00:40:37,600
And then there's a big chunk which I won't try to explain
845
00:40:37,600 --> 00:40:39,986
which is called linear algebra mod 2.
846
00:40:39,986 --> 00:40:42,586
And that somehow figures out a square
847
00:40:42,586 --> 00:40:45,697
that comes from the factorizations,
848
00:40:45,697 --> 00:40:48,363
somehow looks through this list of factorizations
849
00:40:48,363 --> 00:40:50,803
and magically applies some linear algebra stuff.
850
00:40:50,803 --> 00:40:52,612
Well ok, I won't go through that.
851
00:40:52,612 --> 00:40:54,071
It's something doable
852
00:40:54,071 --> 00:40:58,604
and this is using some standard Sage functions to do it pretty easily.
853
00:40:58,604 --> 00:41:01,738
There's all sorts of things that
854
00:41:01,738 --> 00:41:03,664
in the interest of time I won't get into
855
00:41:03,664 --> 00:41:05,867
like how this easyfactorizations() works.
856
00:41:05,867 --> 00:41:07,603
And this is something where people write papers
857
00:41:07,603 --> 00:41:09,801
and papers and papers talking about trying to make this
858
00:41:09,801 --> 00:41:11,603
go as quickly as possible.
859
00:41:11,603 --> 00:41:14,737
I do want to emphasize one fact about these methods:
860
00:41:14,737 --> 00:41:17,484
the bold-faced **very small memory requirements**.
861
00:41:17,484 --> 00:41:20,038
You can use the elliptic curve method
862
00:41:20,038 --> 00:41:21,531
that you heard about before
863
00:41:21,531 --> 00:41:24,882
you can use that to figure out easy factorizations
864
00:41:24,882 --> 00:41:27,063
using very little memory.
865
00:41:27,063 --> 00:41:28,503
That means you can build a chip
866
00:41:28,503 --> 00:41:30,471
like a FPGA, a special purpose ASIC,
867
00:41:30,471 --> 00:41:33,285
you can have thousands of little ECM units
868
00:41:33,285 --> 00:41:35,066
running in parallel, so you can really
869
00:41:35,066 --> 00:41:38,153
massively parallelize this easyfactorizations() function.
870
00:41:38,153 --> 00:41:40,187
So if people tell you "oh you need tons of
871
00:41:40,187 --> 00:41:43,386
memory for sieving to find easy factorizations"
872
00:41:43,386 --> 00:41:45,868
then you should tell them "no no, that's not true"
873
00:41:45,868 --> 00:41:47,979
"ECM you can do with very little memory
874
00:41:47,979 --> 00:41:50,351
running massively parallelizable"
875
00:41:50,351 --> 00:41:52,802
And then there's other things which,
876
00:41:52,802 --> 00:41:54,490
I suppose if we had another hour
877
00:41:54,490 --> 00:41:56,432
then we could get into all these details
878
00:41:56,432 --> 00:41:59,117
of like how the number field sieve goes beyond this.
879
00:41:59,117 --> 00:42:01,970
It's a similar kind of method but gets more complicated.
880
00:42:01,970 --> 00:42:05,181
I do want to emphasize again thing in bold-face here
881
00:42:05,181 --> 00:42:06,946
which is that if somebody tells you
882
00:42:06,946 --> 00:42:10,438
"oh 2^80 operations, the attacker wouldn't do
883
00:42:10,438 --> 00:42:13,796
that because the target isn't worth that much to them"
884
00:42:13,796 --> 00:42:18,284
Well, **Batch NFS** is a way to take a bunch of N's
885
00:42:18,284 --> 00:42:20,566
and like the batch techniques before
886
00:42:20,566 --> 00:42:23,516
there's some magic ways to find redundancies
887
00:42:23,516 --> 00:42:25,884
between doing factorizations of those N's.
888
00:42:25,884 --> 00:42:26,937
So somebody tells you:
889
00:42:26,937 --> 00:42:30,000
"oh 2^80, that isn't worth it for the attacker."
890
00:42:30,000 --> 00:42:33,296
Well, **Batch NFS** reduces the cost quite a bit
891
00:42:33,296 --> 00:42:34,821
for factoring each key.
892
00:42:34,821 --> 00:42:36,846
If the attacker has a lot of keys to target
893
00:42:36,846 --> 00:42:39,852
they can factor all of them in not much more cost
894
00:42:39,852 --> 00:42:41,396
than just factoring one.
895
00:42:41,396 --> 00:42:43,734
So don't believe people who take an economic view
896
00:42:43,734 --> 00:42:46,987
how much the number field sieve is worth.
897
00:42:46,987 --> 00:42:49,416
Alright, what does this mean?
898
00:42:49,416 --> 00:42:52,118
Can the attacker actually do the NFS
899
00:42:52,118 --> 00:42:54,134
once they've put in all these optimizations?
900
00:42:54,134 --> 00:42:59,064
Well if you look carefully at the analyses of the NFS
901
00:42:59,064 --> 00:43:02,416
there are only about 2^70 differences.
902
00:43:02,416 --> 00:43:04,149
Things like these a^2-N's.
903
00:43:04,149 --> 00:43:08,862
If you look through 2^70 of those and scan them properly,
904
00:43:08,862 --> 00:43:10,773
find easy factorizations, do linear algebra,
905
00:43:10,773 --> 00:43:13,669
you will factor any 1024 bit key.
906
00:43:13,669 --> 00:43:16,314
And 2^70, how big is that number?
907
00:43:16,314 --> 00:43:20,820
It's actually small enough that you can do this computation on
908
00:43:20,820 --> 00:43:22,469
your favorite botnet.
909
00:43:22,469 --> 00:43:25,749
Say, Conficker is shrinking, still
910
00:43:25,749 --> 00:43:27,947
and it will probably be wiped out at some point
911
00:43:27,947 --> 00:43:29,488
but it's an example of what you can do
912
00:43:29,488 --> 00:43:32,468
using any of these security problems that are deployed
913
00:43:32,468 --> 00:43:33,632
on enough machines.
914
00:43:33,632 --> 00:43:35,398
You can break into a bunch of machines,
915
00:43:35,398 --> 00:43:36,521
millions of machines.
916
00:43:36,521 --> 00:43:40,296
Conficker estimates vary between 7 million and 12 million machines.
917
00:43:40,296 --> 00:43:44,320
So use your, say, 2^23, that's 8 million machines.
918
00:43:44,320 --> 00:43:46,519
Count how many seconds there are in a year
919
00:43:46,519 --> 00:43:48,553
and then, say, ok that means we have to do
920
00:43:48,553 --> 00:43:52,070
2^22 differences per second machine.
921
00:43:52,070 --> 00:43:56,902
And that is slower than state of the art factorization code already runs.
922
00:43:56,902 --> 00:43:59,231
So it's actually a pretty easy computation.
923
00:43:59,231 --> 00:44:02,334
You don't even need that many millions of computers to do this.
924
00:44:02,334 --> 00:44:04,470
If people tell you "wait a minute"
925
00:44:04,470 --> 00:44:06,120
"you can't just use a bunch of machines"
926
00:44:06,120 --> 00:44:08,073
"there's all this work for linear algebra"
927
00:44:08,073 --> 00:44:09,966
Then I'll just skip this.
928
00:44:09,966 --> 00:44:12,882
Be aware that linear algebra is not as much of a problem
929
00:44:12,882 --> 00:44:15,334
as some people would say.
930
00:44:15,334 --> 00:44:18,872
If you use a big botnet there's one little problem.
931
00:44:18,872 --> 00:44:21,053
For an attacker who breaks into a bunch of machines
932
00:44:21,053 --> 00:44:23,903
and then starts spinning them up, heating up the CPU,
933
00:44:23,903 --> 00:44:25,921
the fan starts running, the user starts wondering
934
00:44:25,921 --> 00:44:28,017
"why is my machine running so slowly?"
935
00:44:28,017 --> 00:44:30,133
Maybe only use half of the CPU
936
00:44:30,133 --> 00:44:33,570
but still it has still got a good chance of getting you detected
937
00:44:33,570 --> 00:44:35,202
and kicked off the machines.
938
00:44:35,202 --> 00:44:37,266
Users are gonna shut you down if you're doing a
939
00:44:37,266 --> 00:44:39,463
say, year of computation on your botnet.
940
00:44:39,463 --> 00:44:42,883
So what do you do instead?
941
00:44:42,883 --> 00:44:45,021
Well you build a little computer cluster.
942
00:44:45,021 --> 00:44:49,354
Now yesterday we heard about a big building
943
00:44:49,354 --> 00:44:51,164
in Bluffdale, Utah, that apparently
944
00:44:51,164 --> 00:44:54,832
is going to store all of the data collected by
945
00:44:54,832 --> 00:44:57,171
the NSA for the next hundred years
946
00:44:57,171 --> 00:44:58,854
or maybe forever.
947
00:44:58,854 --> 00:45:00,914
But something that I didn't hear emphasized yesterday
948
00:45:00,914 --> 00:45:04,916
was that this building also has a new power substation
949
00:45:04,916 --> 00:45:08,150
which is drawing 65 megawatts,
950
00:45:08,150 --> 00:45:10,155
well generating 65 megawatts.
951
00:45:10,155 --> 00:45:13,999
Now what are we doing with 65 megawatts?
952
00:45:13,999 --> 00:45:15,308
Is that the kind of power that you need to
953
00:45:15,308 --> 00:45:17,490
store a bunch of data on tapes?
954
00:45:17,490 --> 00:45:21,175
No, that's the kind of power you need to do computations.
955
00:45:21,175 --> 00:45:24,725
So what is NSA doing with 2^26 watts?
956
00:45:24,725 --> 00:45:26,955
Or maybe even with more watts.
957
00:45:26,955 --> 00:45:30,909
You shouldn't actually think that 2^26 is such a big number.
958
00:45:30,909 --> 00:45:33,526
Here's a little table with, well,
959
00:45:33,526 --> 00:45:36,573
2^57 admittedly is pushing it a little bit.
960
00:45:36,573 --> 00:45:40,620
This is the total power that the earth gets from the sun
961
00:45:40,620 --> 00:45:43,207
of which about half gets to the earth's surface.
962
00:45:43,207 --> 00:45:47,610
2^44 is the amount the human is using right now.
963
00:45:47,610 --> 00:45:50,888
Varies a little bit but I mean this is the average power
964
00:45:50,888 --> 00:45:53,427
including cars and such.
965
00:45:53,427 --> 00:45:56,446
If you have the botnet that breaks into millions of machines
966
00:45:56,446 --> 00:45:59,944
and runs in flat-out that's about 2^30 watts.
967
00:45:59,944 --> 00:46:01,610
And actually if you think about it, wait a minute,
968
00:46:01,610 --> 00:46:02,575
there's a lot of machines out there.
969
00:46:02,575 --> 00:46:07,270
This means the 2^26 is not actually that much power.
970
00:46:07,270 --> 00:46:11,456
It's just one dinky little Bluffdale 65 MW computer center
971
00:46:11,456 --> 00:46:14,594
and actually if you're a government agency
972
00:46:14,594 --> 00:46:16,138
you probably have more than one of those.
973
00:46:16,138 --> 00:46:19,273
As you heard yesterday in the context of data storage
974
00:46:19,273 --> 00:46:21,292
but again, it's not just data storage.
975
00:46:21,292 --> 00:46:26,492
You don't make a 65 MW power station to just store data.
976
00:46:26,492 --> 00:46:30,207
Ok, if you just take standard graphics processing units
977
00:46:30,207 --> 00:46:33,223
and run 2^26 watts to those that will do about
978
00:46:33,223 --> 00:46:36,446
2^84 floating point multiplications per year.
979
00:46:36,446 --> 00:46:39,827
You have to figure out, ok, what are exactly the operations involved here.
980
00:46:39,827 --> 00:46:44,829
This should be enough to break 1024 bit RSA
981
00:46:44,829 --> 00:46:48,088
and if NSA is not just buying GPU's off-the-shelves
982
00:46:48,088 --> 00:46:50,226
but really tuning chips that they build
983
00:46:50,226 --> 00:46:57,409
using IBM to manufacture their chips at the moment.
984
00:46:57,409 --> 00:47:01,626
Apparently it wasn't cost-effective for NSA to manufacture their own chips
985
00:47:01,626 --> 00:47:04,475
so in 2005 they started subcontracting for IBM
986
00:47:04,475 --> 00:47:08,153
but presumably through that fabrication IBM is building chips
987
00:47:08,153 --> 00:47:11,073
that NSA wants and those should be about 10 times faster
988
00:47:11,073 --> 00:47:13,025
than what GPU's can do.
989
00:47:13,025 --> 00:47:18,362
So it should be possible with this little 65 MW computer cluster
990
00:47:18,362 --> 00:47:21,476
to factor at least 1, maybe even 10,
991
00:47:21,476 --> 00:47:23,478
and then with batch NFS maybe even more
992
00:47:23,478 --> 00:47:30,010
1024 bit RSA keys in a year.
993
00:47:30,010 --> 00:47:38,229
applause
994
00:47:38,229 --> 00:47:41,130
Nadia: So to conclude things up here I want to
995
00:47:41,130 --> 00:47:43,290
explain how you can use
996
00:47:43,290 --> 00:47:46,057
from the comfort of your very own home
997
00:47:46,057 --> 00:47:52,242
a distributed, a massive scale distributed cloud computing service
998
00:47:52,242 --> 00:47:57,142
to calculate private keys on your own.
999
00:47:57,142 --> 00:48:00,313
So many of you may be familiar with Amazon EC2
1000
00:48:00,313 --> 00:48:04,635
but it turns out Google also has a large amount of computing infrastructure
1001
00:48:04,635 --> 00:48:07,339
and they actually have a very convenient web interface
1002
00:48:07,339 --> 00:48:08,446
to access it.
1003
00:48:08,446 --> 00:48:12,978
laugther
1004
00:48:12,978 --> 00:48:21,677
applause
1005
00:48:21,677 --> 00:48:23,961
It's amazing what you can find on the Internet.
1006
00:48:23,961 --> 00:48:26,143
laughter
1007
00:48:26,143 --> 00:48:29,393
So ok, here's some keys except, you know,
1008
00:48:29,393 --> 00:48:30,578
if you look at these carefully
1009
00:48:30,578 --> 00:48:33,277
not all of these are sort of obviously RSA keys.
1010
00:48:33,277 --> 00:48:35,204
There are some problems here
1011
00:48:35,204 --> 00:48:40,704
like the second to last one on the bottom.
1012
00:48:40,704 --> 00:48:41,688
So...
1013
00:48:41,688 --> 00:48:43,338
laughter
1014
00:48:43,338 --> 00:48:49,820
After doing this search we found this key and
1015
00:48:49,820 --> 00:48:55,279
someone seems to have pasted a private key into pastebin
1016
00:48:55,279 --> 00:49:00,491
and someone came along and interrupted part of it
1017
00:49:00,491 --> 00:49:04,759
with some profanity.
1018
00:49:04,759 --> 00:49:07,387
This is an interesting problem.
1019
00:49:07,387 --> 00:49:08,812
What do we do with this key?
1020
00:49:08,812 --> 00:49:12,310
Clearly this is an interesting key, we must have this key.
1021
00:49:12,310 --> 00:49:14,204
laughter
1022
00:49:14,204 --> 00:49:24,847
applause
1023
00:49:24,847 --> 00:49:26,046
Step 1... yeah?
1024
00:49:26,046 --> 00:49:29,572
Male: Did you see the easter egg in row 1?
1025
00:49:29,572 --> 00:49:31,204
Nadia: Well, we'll discuss this.
1026
00:49:31,204 --> 00:49:34,308
Angel: Questions will be later, after the talk please.
1027
00:49:34,308 --> 00:49:39,368
Nadia: So yeah, we've removed the obvious problems here.
1028
00:49:39,368 --> 00:49:41,636
But there's still an issue which is that
1029
00:49:41,636 --> 00:49:43,844
we're missing hundreds and hundreds of bits of key.
1030
00:49:43,844 --> 00:49:45,940
We can't brute force-this.
1031
00:49:45,940 --> 00:49:47,658
What are we gonna do?
1032
00:49:47,658 --> 00:49:51,327
Well, let's look at what RSA keys actually are.
1033
00:49:51,327 --> 00:49:53,509
I introduced RSA keys and I was, like,
1034
00:49:53,509 --> 00:49:56,786
it's a d and d has like a thosand of bits.
1035
00:49:56,786 --> 00:50:00,671
And if we don't know p and q
1036
00:50:00,671 --> 00:50:02,608
how are we gonna get this d?
1037
00:50:02,608 --> 00:50:07,923
Here is the storage format for an RSA key.
1038
00:50:07,923 --> 00:50:11,258
Public key is the modulus N, the public exponent e.
1039
00:50:11,258 --> 00:50:14,710
Ok, private key has version, N, e,
1040
00:50:14,710 --> 00:50:16,536
d the decryption exponent,
1041
00:50:16,536 --> 00:50:20,273
p, q, d mod (p-1), d mod (q-1),
1042
00:50:20,273 --> 00:50:22,975
the inverse of q mod p,
1043
00:50:22,975 --> 00:50:25,305
this is all useful information if you want to
1044
00:50:25,305 --> 00:50:26,960
speed up your calculation and not have to
1045
00:50:26,960 --> 00:50:29,405
compute everything every time you use your key.
1046
00:50:29,405 --> 00:50:32,493
Okay.
1047
00:50:32,493 --> 00:50:35,952
What did we see here then?
1048
00:50:35,952 --> 00:50:40,975
Here is the key broken up into all of the difference pieces.
1049
00:50:40,975 --> 00:50:44,804
So we have: the red bit is approximately where N is sitting.
1050
00:50:44,804 --> 00:50:46,754
We've got e in orange.
1051
00:50:46,754 --> 00:50:50,569
We've got part of d in yellow.
1052
00:50:50,569 --> 00:50:52,372
We seem to be missing p
1053
00:50:52,372 --> 00:50:54,656
but we've got part of q
1054
00:50:54,656 --> 00:50:56,791
and then here's d mod p-1
1055
00:50:56,791 --> 00:51:02,408
and d mod q-1 and q inverse mod p.
1056
00:51:02,408 --> 00:51:07,712
There's also a little problem before the red here
1057
00:51:07,712 --> 00:51:08,436
laughter
1058
00:51:08,436 --> 00:51:10,504
in addition.
1059
00:51:10,504 --> 00:51:13,561
You might call this the private part of the private key.
1060
00:51:13,561 --> 00:51:15,936
laughter
1061
00:51:15,936 --> 00:51:21,391
applause
1062
00:51:21,391 --> 00:51:23,859
Fortunately this is just sitting in an encoding of the length
1063
00:51:23,859 --> 00:51:30,275
and the version. laughter
1064
00:51:30,275 --> 00:51:33,151
Alright, so what do we do with this information?
1065
00:51:33,151 --> 00:51:36,673
Basically given any part of this private key
1066
00:51:36,673 --> 00:51:38,910
we can easily compute all of the other parts.
1067
00:51:38,910 --> 00:51:40,257
Simple formulas:
1068
00:51:40,257 --> 00:51:48,938
q will be 2^(e times d mod p-1) - 1 and GCD that with N
1069
00:51:48,938 --> 00:51:51,872
then we get q and then
1070
00:51:51,872 --> 00:51:54,455
we can figure out what p was by dividing
1071
00:51:54,455 --> 00:51:56,393
and figure out what d is
1072
00:51:56,393 --> 00:51:57,953
and so on and so forth.
1073
00:51:57,953 --> 00:52:02,460
You can do this even if you have a part of a single value
1074
00:52:02,460 --> 00:52:04,109
from the private key.
1075
00:52:04,109 --> 00:52:08,195
You can still compute the rest of the private key from that.
1076
00:52:08,195 --> 00:52:12,338
We don't have time to get into this but it's online.
1077
00:52:12,338 --> 00:52:16,557
You've seen a little bit of math.
1078
00:52:16,557 --> 00:52:18,060
Single formulas, typed into Sage.
1079
00:52:18,060 --> 00:52:22,591
We can reconstruct the private key here.
1080
00:52:22,591 --> 00:52:29,487
applause
1081
00:52:29,487 --> 00:52:32,941
So, what are the lessons that we can learn
1082
00:52:32,941 --> 00:52:35,554
from everything that we've done today.
1083
00:52:35,554 --> 00:52:38,945
The first one is: stop using 1024 bit RSA
1084
00:52:38,945 --> 00:52:41,010
if you haven't already.
1085
00:52:41,010 --> 00:52:43,386
I have looked at the keys that are out there on the Internet
1086
00:52:43,386 --> 00:52:45,941
and you are still using 1024 bit RSA.
1087
00:52:45,941 --> 00:52:48,463
laughter
1088
00:52:48,463 --> 00:52:51,790
Second: make sure your primes are big enough.
1089
00:52:51,790 --> 00:52:54,227
Don't try to be clever about how you're generating them.
1090
00:52:54,227 --> 00:52:58,025
Make sure your primes are random.
1091
00:52:58,025 --> 00:52:59,303
You guys have some problems
1092
00:52:59,303 --> 00:53:01,394
especially in hardware.
1093
00:53:01,394 --> 00:53:03,621
Also...
1094
00:53:03,621 --> 00:53:04,921
laughter
1095
00:53:04,921 --> 00:53:09,091
"FUCK A DUCK" is not good crypto.
1096
00:53:09,091 --> 00:53:11,545
Pastebin is not a secure cloud store.
1097
00:53:11,545 --> 00:53:13,874
laughter
1098
00:53:13,874 --> 00:53:20,289
applause
1099
00:53:20,289 --> 00:53:22,657
And you probably shouldn't put your private key
1100
00:53:22,657 --> 00:53:24,778
in a secure cloud store anyway.
1101
00:53:24,778 --> 00:53:25,862
laughter
1102
00:53:25,862 --> 00:53:31,396
applause
1103
00:53:31,396 --> 00:53:32,577
And lastly...
1104
00:53:32,577 --> 00:53:34,211
laughter
1105
00:53:34,211 --> 00:53:40,427
applause
1106
00:53:40,427 --> 00:53:41,625
Thank you.
1107
00:53:41,625 --> 00:53:43,894
applause
1108
00:53:43,894 --> 00:53:44,804
Angel: Give them an applause.
1109
00:53:44,804 --> 00:54:05,937
applause
1110
00:54:05,937 --> 00:54:11,379
Angel: So questions will be taken at the numbered microphones.
1111
00:54:11,379 --> 00:54:14,962
So anyone who has a question please line up at a microphone
1112
00:54:14,962 --> 00:54:16,959
and we will start by the signal angel
1113
00:54:16,959 --> 00:54:19,528
who has a question from IRC.
1114
00:54:19,528 --> 00:54:23,628
*Signal Angel*: IRC has a lot of punch lines involving penisses
1115
00:54:23,628 --> 00:54:26,040
and some questions.
1116
00:54:26,040 --> 00:54:31,445
First question is: would you recommend more using
1117
00:54:31,445 --> 00:54:37,640
elliptic curves or RSA with multiple primes...
1118
00:54:37,640 --> 00:54:41,659
using proper primes.
1119
00:54:41,659 --> 00:54:47,192
Dan: There are ways to do elliptic curves wrong,
1120
00:54:47,192 --> 00:54:48,854
there are ways to do RSA wrong.
1121
00:54:48,854 --> 00:54:53,806
In general if you've got a particular performance requirement
1122
00:54:53,806 --> 00:54:55,625
then elliptic curves are gonna meet that
1123
00:54:55,625 --> 00:54:57,861
with a much higher security level
1124
00:54:57,861 --> 00:55:00,208
than RSA is gonna meet that.
1125
00:55:00,208 --> 00:55:03,005
Of course you can do RSA securely,
1126
00:55:03,005 --> 00:55:04,928
you can do elliptic curves securely
1127
00:55:04,928 --> 00:55:07,153
but if you've got some performance limitation
1128
00:55:07,153 --> 00:55:08,613
as many applications do
1129
00:55:08,613 --> 00:55:13,043
then elliptic curves tend to be a better choice.
1130
00:55:13,043 --> 00:55:14,429
Nadia: I also want to clarify
1131
00:55:14,429 --> 00:55:17,826
the other most commonly used public-key cryptosystem
1132
00:55:17,826 --> 00:55:19,979
is DSA, the Digital Signature Algorithm.
1133
00:55:19,979 --> 00:55:21,426
There's also elliptic curve DSA
1134
00:55:21,426 --> 00:55:23,291
which is probably what you're thinking about
1135
00:55:23,291 --> 00:55:25,570
with elliptic curve cryptography.
1136
00:55:25,570 --> 00:55:31,207
DSA is in my opinion actually much worse
1137
00:55:31,207 --> 00:55:34,395
than RSA in terms of randomness failures.
1138
00:55:34,395 --> 00:55:36,246
In the paper that Dan referenced
1139
00:55:36,246 --> 00:55:39,377
we also talk about randomness failures in DSA
1140
00:55:39,377 --> 00:55:40,993
and it's horrifying.
1141
00:55:40,993 --> 00:55:44,521
If you ever ever screw up randomness in a single signature
1142
00:55:44,521 --> 00:55:48,354
your entire public key is lost.
1143
00:55:48,354 --> 00:55:50,761
Angel: We will take the next question from microphone nr 4
1144
00:55:50,761 --> 00:55:53,072
to the right of the seat.
1145
00:55:53,072 --> 00:55:53,824
Jake: Hey guys.
1146
00:55:53,824 --> 00:55:56,277
Sorry I didn't mention the computing power of Bluffdale.
1147
00:55:56,277 --> 00:56:02,072
That said, when we want to transition from 1024 bit RSA
1148
00:56:02,072 --> 00:56:05,339
to something else, what do you think is a good idea?
1149
00:56:05,339 --> 00:56:10,028
So for example, in Tor we do use 1024 bit RSA for TLS.
1150
00:56:10,028 --> 00:56:12,528
laughter Yeah I know, right?
1151
00:56:12,528 --> 00:56:15,592
So the thing is we're working on changing these things
1152
00:56:15,592 --> 00:56:17,079
but what is...
1153
00:56:17,079 --> 00:56:19,766
I mean Zooko has this 100 year crypto project.
1154
00:56:19,766 --> 00:56:21,806
What is the real thing that we should,
1155
00:56:21,806 --> 00:56:23,910
like if we could switch tomorrow to something,
1156
00:56:23,910 --> 00:56:26,595
what's the useful that we can live with for 5 or 10 years?
1157
00:56:26,595 --> 00:56:28,795
Is 2048 going to be useful?
1158
00:56:28,795 --> 00:56:31,179
Is 4096 where we should go tomorrow?
1159
00:56:31,179 --> 00:56:32,780
Cause there is a trade-off between
1160
00:56:32,780 --> 00:56:34,588
performance and forward secrecy
1161
00:56:34,588 --> 00:56:37,962
and there are a lot of things to think about.
1162
00:56:37,962 --> 00:56:40,078
Tanja: If you tell me 5 to 10 years
1163
00:56:40,078 --> 00:56:42,360
I would be still comfortable with elliptic curves.
1164
00:56:42,360 --> 00:56:44,773
If you talk 100 years
1165
00:56:44,773 --> 00:56:47,691
and then there's all these worse quantum computers around
1166
00:56:47,691 --> 00:56:51,143
then I would go for something which is worse in performance
1167
00:56:51,143 --> 00:56:52,829
like code-based cryptography
1168
00:56:52,829 --> 00:56:54,043
or for signatures hashing
1169
00:56:54,043 --> 00:56:55,738
but if you'd say in 5 to 10 years
1170
00:56:55,738 --> 00:56:57,411
I'm very comfortable recommending to you
1171
00:56:57,411 --> 00:56:59,941
elliptic curves with 256 bits.
1172
00:56:59,941 --> 00:57:02,160
Jake: Ok, thanks.
1173
00:57:02,160 --> 00:57:04,878
Angel: Signal angel?
1174
00:57:04,878 --> 00:57:07,190
*Signal Angel*: Yeah, another question from IRC.
1175
00:57:07,190 --> 00:57:11,028
If you avoid all the easy primes
1176
00:57:11,028 --> 00:57:14,227
does this shrink the search space such that
1177
00:57:14,227 --> 00:57:17,862
it becomes easier to crack the remaining keys?
1178
00:57:17,862 --> 00:57:20,161
Or asked another way:
1179
00:57:20,161 --> 00:57:25,146
can you quantify the ratio of easy primes versus hard primes?
1180
00:57:25,146 --> 00:57:26,287
Dan: Yeah, that's a good question.
1181
00:57:26,287 --> 00:57:28,573
The answer is: yes, it can be quantified
1182
00:57:28,573 --> 00:57:32,829
and once you're up to about to the 1024 bit RSA key level
1183
00:57:32,829 --> 00:57:35,306
then there's very very few weak primes.
1184
00:57:35,306 --> 00:57:37,078
So if you have a random number,
1185
00:57:37,078 --> 00:57:38,593
you just generate a random prime
1186
00:57:38,593 --> 00:57:40,238
then your chance of bumping into something weak
1187
00:57:40,238 --> 00:57:42,793
is so small that you just don't have to worry about it.
1188
00:57:42,793 --> 00:57:44,862
It is one of those much less frequent than
1189
00:57:44,862 --> 00:57:46,580
being hit by a lightning kind of events.
1190
00:57:46,580 --> 00:57:49,210
It is something that have been quantified
1191
00:57:49,210 --> 00:57:52,244
and it is an issue for smaller RSA keys
1192
00:57:52,244 --> 00:57:54,872
back when people were using, say, 512 bit RSA
1193
00:57:54,872 --> 00:57:56,754
then it was actually a noticable issue.
1194
00:57:56,754 --> 00:57:59,123
But once you're at 1024 or above then
1195
00:57:59,123 --> 00:58:00,694
the issue is basically gone.
1196
00:58:00,694 --> 00:58:03,831
It's something where you can and you should
1197
00:58:03,831 --> 00:58:06,097
look at exactly what the chance is of bumping into a weak
1198
00:58:06,097 --> 00:58:08,828
but it's not reallistically something
1199
00:58:08,828 --> 00:58:13,259
you're gonna encounter with 1024 bit RSA.
1200
00:58:13,259 --> 00:58:16,542
Angel: We're gonna take the next question from nr 1.
1201
00:58:16,542 --> 00:58:19,313
Just one notice: we don't have a lot of time left
1202
00:58:19,313 --> 00:58:23,546
so even though there is a talk in one hour
1203
00:58:23,546 --> 00:58:25,078
just ask.
1204
00:58:25,078 --> 00:58:28,165
Male: There is a devastating attack that can break AES,
1205
00:58:28,165 --> 00:58:31,872
SHA2, most probably SHA3 and DES.
1206
00:58:31,872 --> 00:58:33,758
It's called a biclique attack.
1207
00:58:33,758 --> 00:58:35,664
Are you concerned that it might break RSA
1208
00:58:35,664 --> 00:58:37,281
with any key size and even ECC
1209
00:58:37,281 --> 00:58:39,988
and even any public-key crypto?
1210
00:58:39,988 --> 00:58:41,660
Dan: No.
1211
00:58:41,660 --> 00:58:43,594
laughter
1212
00:58:43,594 --> 00:58:46,675
Bicliques are something which are certainly interesting
1213
00:58:46,675 --> 00:58:49,590
and I think about it as a way to speed up brute force
1214
00:58:49,590 --> 00:58:52,508
and it speeds up brute force by a noticable factor,
1215
00:58:52,508 --> 00:58:55,148
often makes attacks, say, twice as fast.
1216
00:58:55,148 --> 00:58:58,523
But with public-key crypto we have attacks which are
1217
00:58:58,523 --> 00:59:00,760
much much faster than brute force to begin with
1218
00:59:00,760 --> 00:59:04,209
so that kind of speed-up of brute force won't be competitive
1219
00:59:04,209 --> 00:59:08,226
with things like the number field sieve.
1220
00:59:08,226 --> 00:59:12,964
Angel: Next is from number 3.
1221
00:59:12,964 --> 00:59:16,824
Male: Me not coming from the dark side of mathematics,
1222
00:59:16,824 --> 00:59:20,923
how do I know that my random number generators are fine
1223
00:59:20,923 --> 00:59:23,661
for generating keys?
1224
00:59:23,661 --> 00:59:26,223
laughter
1225
00:59:26,223 --> 00:59:31,354
Nadia: Seed them.
1226
00:59:31,354 --> 00:59:37,042
Basically the things that are out there if you're using
1227
00:59:37,042 --> 00:59:40,613
a standard random number generator like in Linux,
1228
00:59:40,613 --> 00:59:44,441
actually Linux has added patches to the kernel
1229
00:59:44,441 --> 00:59:48,022
to fix some of the problems that we've found.
1230
00:59:48,022 --> 00:59:50,431
If you want to know that you're keys are good:
1231
00:59:50,431 --> 00:59:54,694
if you generate them on a general purpose computer
1232
00:59:54,694 --> 00:59:56,600
after it has been running for a long time
1233
00:59:56,600 --> 00:59:59,559
you're probably fine.
1234
00:59:59,559 --> 01:00:04,793
I would not generate very important or long-term keys
1235
01:00:04,793 --> 01:00:09,707
on low-power hardware devices where you can't tell...
1236
01:00:09,707 --> 01:00:11,910
laughter
1237
01:00:11,910 --> 01:00:15,758
Male: Thank you.
1238
01:00:15,758 --> 01:00:18,341
Angel: Next question will be taken from number 4.
1239
01:00:18,341 --> 01:00:19,394
Male: Hi.
1240
01:00:19,394 --> 01:00:23,690
It's just a remark, not really a question.
1241
01:00:23,690 --> 01:00:25,407
I work in high performance computing.
1242
01:00:25,407 --> 01:00:26,910
I was at the supercomputing conference
1243
01:00:26,910 --> 01:00:29,376
a couple of weeks ago.
1244
01:00:29,376 --> 01:00:31,931
They're presenting large-scale installations
1245
01:00:31,931 --> 01:00:35,863
and dealing with problems of power.
1246
01:00:35,863 --> 01:00:38,927
They want to build petaflops and exaflops systems
1247
01:00:38,927 --> 01:00:42,262
that are in a range of 20 MW.
1248
01:00:42,262 --> 01:00:47,316
So I'm wondering what NSA is doing with 53 megawatts
1249
01:00:47,316 --> 01:00:53,815
in a data center because no storage takes that much capacity.
1250
01:00:53,815 --> 01:00:58,976
I think it's really something we should think about.
1251
01:00:58,976 --> 01:01:03,266
So thanks, nice talk.
1252
01:01:03,266 --> 01:01:04,883
Angel: Next question from...
1253
01:01:04,883 --> 01:01:06,961
does the signal angel have one?
1254
01:01:06,961 --> 01:01:08,679
*Signal angel*: Ok, another question is:
1255
01:01:08,679 --> 01:01:11,845
How big do you estimate the effort to
1256
01:01:11,845 --> 01:01:17,308
exchange keys and certificates involving 1024 bit primes
1257
01:01:17,308 --> 01:01:22,782
let's say worldwide at the moment?
1258
01:01:22,782 --> 01:01:24,502
Dan: I wasn't getting what the question was.
1259
01:01:24,502 --> 01:01:25,642
Could you repeat the question?
1260
01:01:25,642 --> 01:01:27,344
*Signal angel*: I don't really understand it myself.
1261
01:01:27,344 --> 01:01:28,875
laughter
1262
01:01:28,875 --> 01:01:33,181
Angel: Ok, let's switch to 1.
1263
01:01:33,181 --> 01:01:37,009
Male: My question goes back to the idea
1264
01:01:37,009 --> 01:01:42,876
of how can we know how good the private keys are?
1265
01:01:42,876 --> 01:01:50,227
Is there something like a keygen evaluation tool?
1266
01:01:50,227 --> 01:01:56,567
Or can the quality of a key generator be estimated
1267
01:01:56,567 --> 01:01:58,544
from a sample of keys?
1268
01:01:58,544 --> 01:02:01,827
Like a public tool that you can throw keys on
1269
01:02:01,827 --> 01:02:05,565
and it will tell me a bias of my keygen?
1270
01:02:05,565 --> 01:02:09,091
Nadia: If you go to factorable.net
1271
01:02:09,091 --> 01:02:12,574
my co-authors and I have made a key check tool
1272
01:02:12,574 --> 01:02:14,728
which will tell you if your key is in the
1273
01:02:14,728 --> 01:02:17,528
list of keys that we scraped off the Internet
1274
01:02:17,528 --> 01:02:19,296
that we were able to compromise.
1275
01:02:19,296 --> 01:02:21,249
That's a first check.
1276
01:02:21,249 --> 01:02:24,449
It's not a positive verification that your key is good
1277
01:02:24,449 --> 01:02:26,544
but it will tell you if it is very bad.
1278
01:02:26,544 --> 01:02:28,863
laughter
1279
01:02:28,863 --> 01:02:36,811
If you want to check your generator,
1280
01:02:36,811 --> 01:02:40,827
if you're worried about the factorization problems
1281
01:02:40,827 --> 01:02:43,242
we have fast code on the website in C
1282
01:02:43,242 --> 01:02:45,715
that will do the GCD calculation for you
1283
01:02:45,715 --> 01:02:50,201
if you just dump a gigantic collection of keys at it.
1284
01:02:50,201 --> 01:02:54,375
If you might have other problems like you are
1285
01:02:54,375 --> 01:02:56,993
not sure whether it's factorable like those don't come up
1286
01:02:56,993 --> 01:03:00,497
the experiment that you might want to do is
1287
01:03:00,497 --> 01:03:04,211
sort of restart the process in realistic conditions
1288
01:03:04,211 --> 01:03:06,179
and generate a gigantic quantity of keys
1289
01:03:06,179 --> 01:03:08,981
and just check for repeats.
1290
01:03:08,981 --> 01:03:12,378
The sort of obvious stress testings.
1291
01:03:12,378 --> 01:03:16,375
If you have a device you want to boot it up in realistic conditions
1292
01:03:16,375 --> 01:03:18,048
and then check them
1293
01:03:18,048 --> 01:03:19,798
and this is painstaking work
1294
01:03:19,798 --> 01:03:22,327
but I don't think there's any realistic way
1295
01:03:22,327 --> 01:03:25,546
of doing better than that.
1296
01:03:25,546 --> 01:03:28,396
Or inspect the code, the obvious thing.
1297
01:03:28,396 --> 01:03:30,924
Make sure you're seeding anything.
1298
01:03:30,924 --> 01:03:33,463
Male: Ok, thank you.
1299
01:03:33,463 --> 01:03:35,993
Angel: Next question will be from mic number 4.
1300
01:03:35,993 --> 01:03:37,045
Male: Hi.
1301
01:03:37,045 --> 01:03:40,745
If I got your numbers correctly so you said something like
1302
01:03:40,745 --> 01:03:45,450
it would take 8 million machines a year to factor 1024
1303
01:03:45,450 --> 01:03:48,976
so I was wondering what if I would like to factor like
1304
01:03:48,976 --> 01:03:52,243
800 bits or 900 bits which is,
1305
01:03:52,243 --> 01:03:55,149
as I understand, way easier than 1024
1306
01:03:55,149 --> 01:03:57,042
but was never done publicly.
1307
01:03:57,042 --> 01:04:00,259
So are you saying that would take like a day or something?
1308
01:04:00,259 --> 01:04:02,581
Dan: It depends on the size of cluster you have.
1309
01:04:02,581 --> 01:04:07,064
The RSA 768 factorization a couple of years ago
1310
01:04:07,064 --> 01:04:11,423
used something on the scale of a 1500 computers
1311
01:04:11,423 --> 01:04:14,130
for a year.
1312
01:04:14,130 --> 01:04:15,729
And those were normal kind of computers,
1313
01:04:15,729 --> 01:04:19,475
Desktop kind of computers with, I think, typically 2 cores.
1314
01:04:19,475 --> 01:04:24,240
Nowadays, if you have some faster, say, 3 GHz 4-core machines,
1315
01:04:24,240 --> 01:04:27,446
I think those were typically 2 GHZ 2-core, nowadays...
1316
01:04:27,446 --> 01:04:30,582
Tanja's mentioning you of course want to use GPUs
1317
01:04:30,582 --> 01:04:31,725
to speed things up.
1318
01:04:31,725 --> 01:04:33,397
And there's many ways to take advantage of
1319
01:04:33,397 --> 01:04:35,393
computer technology moving forward.
1320
01:04:35,393 --> 01:04:39,998
To get careful estimates: for going from 768 to 800,
1321
01:04:39,998 --> 01:04:42,192
that's a short enough extrapolation
1322
01:04:42,192 --> 01:04:44,113
that people are pretty confident about that.
1323
01:04:44,113 --> 01:04:48,014
Getting to 900 starts getting iffy what exactly the cost would be
1324
01:04:48,014 --> 01:04:50,951
and 1024 there's a fair amount of guess works.
1325
01:04:50,951 --> 01:04:53,796
There is some consensus on rougly what the costs are
1326
01:04:53,796 --> 01:04:55,429
but it's hard to say exactly.
1327
01:04:55,429 --> 01:04:58,059
But again, to give you a scale of it:
1328
01:04:58,059 --> 01:04:59,675
you were saying like 900 or 800.
1329
01:04:59,675 --> 01:05:03,213
Well, 768 RSA challenge was done
1330
01:05:03,213 --> 01:05:07,228
and that one was 1500 machine years.
1331
01:05:07,228 --> 01:05:08,501
Male: Ok, thank you.
1332
01:05:08,501 --> 01:05:10,260
Angel: We'll take four more questions.
1333
01:05:10,260 --> 01:05:12,357
Signal angel first.
1334
01:05:12,357 --> 01:05:16,676
*Signal angel*: It's a clarification of the question I asked before.
1335
01:05:16,676 --> 01:05:22,278
The actual question is: how big will be the effort
1336
01:05:22,278 --> 01:05:32,129
to upgrade existing keys from 1024 bits to 4K or 8K bits?
1337
01:05:32,129 --> 01:05:35,984
Considering that the keys are currently in use
1338
01:05:35,984 --> 01:05:37,910
how much effort worldwide would it be to
1339
01:05:37,910 --> 01:05:41,608
upgrade all that to something more secure?
1340
01:05:41,608 --> 01:05:43,559
Tanja: I mean, for the private user if you're running
1341
01:05:43,559 --> 01:05:46,447
SSH on the laptop you can just generate a new key.
1342
01:05:46,447 --> 01:05:47,876
Doesn't take you much time.
1343
01:05:47,876 --> 01:05:50,345
You will notice maybe some degradation of performance
1344
01:05:50,345 --> 01:05:52,193
if you're running it on a netbook
1345
01:05:52,193 --> 01:05:54,755
but going to 2048 is not a big deal.
1346
01:05:54,755 --> 01:05:57,710
However, there is a recommendation of the
1347
01:05:57,710 --> 01:06:01,932
US government to stop using 1024 bit RSA as of 2010
1348
01:06:01,932 --> 01:06:05,211
and we still see it everywhere.
1349
01:06:05,211 --> 01:06:07,563
Apparently it is bigger effort than just
1350
01:06:07,563 --> 01:06:10,743
everybody spending 10 minutes on the laptop.
1351
01:06:10,743 --> 01:06:13,635
Dan: Maybe part of the problem is things like
1352
01:06:13,635 --> 01:06:16,997
everybody says "yeah, use 2048 or bigger"
1353
01:06:16,997 --> 01:06:18,884
and there's some financial standard
1354
01:06:18,884 --> 01:06:21,157
which says you can use any key size you want
1355
01:06:21,157 --> 01:06:23,943
up to 1984 bits.
1356
01:06:23,943 --> 01:06:25,631
I have no idea how they came up with this
1357
01:06:25,631 --> 01:06:27,208
but then they run into some other standard
1358
01:06:27,208 --> 01:06:29,284
which says use 2048 and they just can't
1359
01:06:29,284 --> 01:06:33,345
implement it on the smartcards which only support up to 1984.
1360
01:06:33,345 --> 01:06:36,467
Now if you get the standards people talking to each other for several years
1361
01:06:36,467 --> 01:06:39,258
then they agree on using 1984.
1362
01:06:39,258 --> 01:06:41,796
It's just about as good as 2048
1363
01:06:41,796 --> 01:06:44,746
but realistically when you have these conflicts
1364
01:06:44,746 --> 01:06:47,128
in standards then it takes a while to work out.
1365
01:06:47,128 --> 01:06:48,742
And when you have a busy server
1366
01:06:48,742 --> 01:06:51,233
that's trying to do 2048 bit RSA
1367
01:06:51,233 --> 01:06:53,126
it doesn't matter what the standard says
1368
01:06:53,126 --> 01:06:55,449
if you got a ton of load you just can't handle that load
1369
01:06:55,449 --> 01:06:58,341
if you're spending a ton of time on cryptography.
1370
01:06:58,341 --> 01:07:00,460
And that again is where we like elliptic curves
1371
01:07:00,460 --> 01:07:02,560
because they give you for whatever your
1372
01:07:02,560 --> 01:07:05,341
performance constraints are much better security.
1373
01:07:05,341 --> 01:07:07,309
So if you're trying a given security level
1374
01:07:07,309 --> 01:07:09,831
the speed is much better.
1375
01:07:09,831 --> 01:07:12,807
Angel: Next question will be from number 2.
1376
01:07:12,807 --> 01:07:14,642
Female: So you've had two developers stand up
1377
01:07:14,642 --> 01:07:17,459
and ask how to verify whether or not their
1378
01:07:17,459 --> 01:07:19,726
key generation is correct and whether or not their
1379
01:07:19,726 --> 01:07:21,490
random numbers are correct.
1380
01:07:21,490 --> 01:07:23,887
And I think it's great that you give coding recommendations
1381
01:07:23,887 --> 01:07:27,441
like seed rand() but coming from the development
1382
01:07:27,441 --> 01:07:29,539
perspective I like to unit test my code
1383
01:07:29,539 --> 01:07:31,789
and specifically I like to write my unit tests
1384
01:07:31,789 --> 01:07:33,043
before I write my code,
1385
01:07:33,043 --> 01:07:34,709
it's called test-driven development.
1386
01:07:34,709 --> 01:07:38,175
So if I'm going about creating an algorithm to
1387
01:07:38,175 --> 01:07:39,994
encrypt something or whatever
1388
01:07:39,994 --> 01:07:42,771
what are the steps that I need to do
1389
01:07:42,771 --> 01:07:46,653
when I'm writing my unit test?
1390
01:07:46,653 --> 01:07:51,939
Has there been some kind of effort in that way
1391
01:07:51,939 --> 01:07:54,911
like what goes into unit test for determining that
1392
01:07:54,911 --> 01:07:56,606
your random number is correct?
1393
01:07:56,606 --> 01:07:59,091
I mean there's algorithms for determining
1394
01:07:59,091 --> 01:08:00,562
whether your random number generator is
1395
01:08:00,562 --> 01:08:02,773
operating correctly, right?
1396
01:08:02,773 --> 01:08:05,160
Dan: This is something where there is,
1397
01:08:05,160 --> 01:08:07,775
if you look at how the random number generators work,
1398
01:08:07,775 --> 01:08:10,209
there was just a NIST workshop
1399
01:08:10,209 --> 01:08:11,975
and there's some NIST standards
1400
01:08:11,975 --> 01:08:12,742
which are telling you
1401
01:08:12,742 --> 01:08:15,709
here's what to do to test your random number generators.
1402
01:08:15,709 --> 01:08:17,796
So you have a hardware random number generator,
1403
01:08:17,796 --> 01:08:19,589
some post-processing, and they say
1404
01:08:19,589 --> 01:08:22,406
"ok here's the standard for how to test those pieces"
1405
01:08:22,406 --> 01:08:24,704
Now that's not the same as testing the cryptography
1406
01:08:24,704 --> 01:08:26,335
that's using those pieces
1407
01:08:26,335 --> 01:08:28,994
and it's very helpful if there's a clear separation there.
1408
01:08:28,994 --> 01:08:30,463
So you have your cryptography
1409
01:08:30,463 --> 01:08:32,746
which is doing deterministic stuff
1410
01:08:32,746 --> 01:08:34,452
and says you have to have some randomness coming
1411
01:08:34,452 --> 01:08:37,806
from a totally separate unit where the only job
1412
01:08:37,806 --> 01:08:40,460
of that separate unit is do the randomness properly.
1413
01:08:40,460 --> 01:08:42,146
And then the NIST test will tell you
1414
01:08:42,146 --> 01:08:43,324
what the randomness does
1415
01:08:43,324 --> 01:08:45,938
and then deterministic cryptographic testing
1416
01:08:45,938 --> 01:08:48,391
will tell you that the other pieces are working correctly
1417
01:08:48,391 --> 01:08:50,211
for all sorts of inputs.
1418
01:08:50,211 --> 01:08:53,892
Where it goes wrong is something like the ECDSA standard
1419
01:08:53,892 --> 01:08:55,394
that Nadia was mentioning before
1420
01:08:55,394 --> 01:08:56,696
and that's one that says that
1421
01:08:56,696 --> 01:08:59,210
you don't just deterministically generate a signature
1422
01:08:59,210 --> 01:09:01,410
you make some new randomness every time you generate
1423
01:09:01,410 --> 01:09:03,705
a signature and that's what goes horribly wrong.
1424
01:09:03,705 --> 01:09:05,670
So that's something where we need new standards
1425
01:09:05,670 --> 01:09:07,790
which say instead of doing what ECDSA does
1426
01:09:07,790 --> 01:09:09,741
we have a clear separation between
1427
01:09:09,741 --> 01:09:12,192
the cryptography always does the same thing
1428
01:09:12,192 --> 01:09:15,043
making it testable, and then randomness is centralized
1429
01:09:15,043 --> 01:09:17,541
in one spot which hopefully the NIST standards
1430
01:09:17,541 --> 01:09:19,840
do a good enough job of testing.
1431
01:09:19,840 --> 01:09:23,508
Female: So there's something that says here's what determines...
1432
01:09:23,508 --> 01:09:26,946
I guess what I'm asking is do you know of any algorithms
1433
01:09:26,946 --> 01:09:30,139
that can be used for determining whether something
1434
01:09:30,139 --> 01:09:32,248
is a good random number and whether your random
1435
01:09:32,248 --> 01:09:34,525
number generator is generating things properly?
1436
01:09:34,525 --> 01:09:37,027
Tanja: Well, there are a bunch of tests for.
1437
01:09:37,027 --> 01:09:39,307
There's hardware random number generators
1438
01:09:39,307 --> 01:09:42,126
you can run them through the diehard tests.
1439
01:09:42,126 --> 01:09:44,710
There's certificates by FIPS.
1440
01:09:44,710 --> 01:09:48,526
In general I would recommend implementing all those steps
1441
01:09:48,526 --> 01:09:50,811
but the smartcards that have been mentioned by Dan earlier
1442
01:09:50,811 --> 01:09:52,742
from the Taiwan citizen card
1443
01:09:52,742 --> 01:09:55,445
those are actually FIPS-certified.
1444
01:09:55,445 --> 01:09:58,942
So even though, these go through what is the industry standard
1445
01:09:58,942 --> 01:10:00,922
of doing random number generation on smartcards
1446
01:10:00,922 --> 01:10:03,622
which everybody thought was good enough,
1447
01:10:03,622 --> 01:10:04,979
apparently they're not.
1448
01:10:04,979 --> 01:10:06,477
So randomness is a total mess.
1449
01:10:06,477 --> 01:10:08,613
I mean after the paper by Nadia and
1450
01:10:08,613 --> 01:10:10,410
also when the paper by the Lausanne team came out,
1451
01:10:10,410 --> 01:10:12,692
there is no some more movement in finding
1452
01:10:12,692 --> 01:10:14,023
better random number generators.
1453
01:10:14,023 --> 01:10:18,659
At this moment, well, hope for the best.
1454
01:10:18,659 --> 01:10:20,625
laughter
1455
01:10:20,625 --> 01:10:22,178
Implement the standards, yes.
1456
01:10:22,178 --> 01:10:24,672
If you have some source of randomness
1457
01:10:24,672 --> 01:10:28,424
try to stretch it with a stream cipher or something like that.
1458
01:10:28,424 --> 01:10:31,296
Nadia: I want to tell a little story.
1459
01:10:31,296 --> 01:10:33,227
After we published the paper
1460
01:10:33,227 --> 01:10:35,643
I heard some very interesting things from some of the vendors.
1461
01:10:35,643 --> 01:10:39,329
I think one of the things that makes randomness
1462
01:10:39,329 --> 01:10:42,206
and cryptography a difficult problem is that
1463
01:10:42,206 --> 01:10:44,612
this kind of issue is something that
1464
01:10:44,612 --> 01:10:46,808
crosses a lot of the abstraction boundaries
1465
01:10:46,808 --> 01:10:48,546
that you're used to in coding.
1466
01:10:48,546 --> 01:10:50,525
You want to have the clean unit test that you know
1467
01:10:50,525 --> 01:10:52,343
that this piece works, and this piece works,
1468
01:10:52,343 --> 01:10:53,923
and this piece works, and this piece works,
1469
01:10:53,923 --> 01:10:57,777
and you put them all together and it will all work flawlessly.
1470
01:10:57,777 --> 01:11:00,297
Somehow this kind of problem is something that
1471
01:11:00,297 --> 01:11:02,898
happens at the boundaries of each unit test.
1472
01:11:02,898 --> 01:11:05,223
We know that the operating system is like
1473
01:11:05,223 --> 01:11:08,431
"ok I'm getting my proper inputs from the hardware
1474
01:11:08,431 --> 01:11:11,297
and I'm correctly generating my randomness from there"
1475
01:11:11,297 --> 01:11:12,840
and then the application says
1476
01:11:12,840 --> 01:11:15,023
"I'm getting my randomness from the operating system,
1477
01:11:15,023 --> 01:11:16,310
it's good randomness"
1478
01:11:16,310 --> 01:11:19,605
and then the application uses the correct crypto algorithms
1479
01:11:19,605 --> 01:11:21,927
to then do good cryptography.
1480
01:11:21,927 --> 01:11:24,495
And at the very beginning of that stage
1481
01:11:24,495 --> 01:11:27,777
there was something broken and it translated all the way
1482
01:11:27,777 --> 01:11:29,526
through all of these pieces of software
1483
01:11:29,526 --> 01:11:31,532
that were working correctly.
1484
01:11:31,532 --> 01:11:35,415
Testing this holistically was the only way
1485
01:11:35,415 --> 01:11:37,312
to find this kind of error.
1486
01:11:37,312 --> 01:11:41,241
One story that I heard from a vendor was that
1487
01:11:41,241 --> 01:11:43,911
they ran into randomness problems.
1488
01:11:43,911 --> 01:11:45,745
They developed some device.
1489
01:11:45,745 --> 01:11:47,325
It was working perfectly
1490
01:11:47,325 --> 01:11:51,478
and on the assembly line they were being turned on
1491
01:11:51,478 --> 01:11:55,497
and run through the checks.
1492
01:11:55,497 --> 01:11:57,807
You boot it up it checks the integrity
1493
01:11:57,807 --> 01:12:00,641
on the assembly line and it was continuing on.
1494
01:12:00,641 --> 01:12:06,390
If you exactly booted them up at the exact same time
1495
01:12:06,390 --> 01:12:10,129
in the exact same conditions
1496
01:12:10,129 --> 01:12:13,221
then all of the inputs to the random number generator
1497
01:12:13,221 --> 01:12:16,323
would be the same and they would generate the same keys
1498
01:12:16,323 --> 01:12:17,743
and then these keys would be,
1499
01:12:17,743 --> 01:12:19,721
you know, they already generated the keys,
1500
01:12:19,721 --> 01:12:21,197
so they wouldn't generate them again after
1501
01:12:21,197 --> 01:12:25,397
they went to the consumer.
1502
01:12:25,397 --> 01:12:27,145
I don't know how to unit test that
1503
01:12:27,145 --> 01:12:29,976
except take the entire device, turn it on,
1504
01:12:29,976 --> 01:12:32,589
and take multiple of them and make sure the devices
1505
01:12:32,589 --> 01:12:35,396
as they're coming out of the production process
1506
01:12:35,396 --> 01:12:39,596
are working correctly.
1507
01:12:39,596 --> 01:12:41,477
Angel: So we will take 2 more questions.
1508
01:12:41,477 --> 01:12:46,692
One from number 4 first and then number 1 at the last.
1509
01:12:46,692 --> 01:12:48,005
Male: How good, do you think,
1510
01:12:48,005 --> 01:12:52,544
hardware-supported random number generators are
1511
01:12:52,544 --> 01:12:53,432
after what you've just said?
1512
01:12:53,432 --> 01:12:55,806
I think they're probably not good anymore?
1513
01:12:55,806 --> 01:13:01,566
Dan: Intel has their RdRand instruction in some of the newer chips
1514
01:13:01,566 --> 01:13:04,979
and they say that they've gone through
1515
01:13:04,979 --> 01:13:06,957
a reasonable number of evaluation steps.
1516
01:13:06,957 --> 01:13:09,442
It sounds like they're trying.
1517
01:13:09,442 --> 01:13:11,826
Whether they're successful is a different question.
1518
01:13:11,826 --> 01:13:13,661
Something I don't like about the API is
1519
01:13:13,661 --> 01:13:16,392
RdRand gives you no way to test the pieces.
1520
01:13:16,392 --> 01:13:18,798
You can only test the post-processed output.
1521
01:13:18,798 --> 01:13:22,157
So nobody else is able to test the actual
1522
01:13:22,157 --> 01:13:24,276
randomness generation part of the hardware.
1523
01:13:24,276 --> 01:13:26,543
You can only get a filtered version of that.
1524
01:13:26,543 --> 01:13:29,245
So Intel and people who are contracted by Intel
1525
01:13:29,245 --> 01:13:30,628
to see what's going on inside
1526
01:13:30,628 --> 01:13:32,829
are the only people who can run these tests.
1527
01:13:32,829 --> 01:13:35,874
It's not open in a way that allows the community
1528
01:13:35,874 --> 01:13:37,377
to see if there's further problems.
1529
01:13:37,377 --> 01:13:38,756
But at least they're trying
1530
01:13:38,756 --> 01:13:41,423
and maybe with enough effort the hardware manufacturers
1531
01:13:41,423 --> 01:13:43,458
will get randomness generation right.
1532
01:13:43,458 --> 01:13:47,579
Of course if you can generate any sort of secret once
1533
01:13:47,579 --> 01:13:50,462
if you have enough of a secret to start with
1534
01:13:50,462 --> 01:13:52,328
then you can use things like AES to
1535
01:13:52,328 --> 01:13:53,944
generate any number of secrets
1536
01:13:53,944 --> 01:13:55,494
and you can put those secrets into
1537
01:13:55,494 --> 01:13:57,296
any number of devices that you want.
1538
01:13:57,296 --> 01:13:59,594
If you use AES in counter mode with
1539
01:13:59,594 --> 01:14:02,123
encrypting 1, encrypting 2, encrypting 3...
1540
01:14:02,123 --> 01:14:04,981
you get totally unpredictable, unrelated outputs
1541
01:14:04,981 --> 01:14:07,378
and then use those, burn those into some hardware.
1542
01:14:07,378 --> 01:14:09,979
But where do you get that initial randomness from?
1543
01:14:09,979 --> 01:14:12,830
Are you going through a trustworthy process there?
1544
01:14:12,830 --> 01:14:16,308
It's a hard problem.
1545
01:14:16,308 --> 01:14:18,198
Angel: Next question from number 1.
1546
01:14:18,198 --> 01:14:19,729
And that's the last question.
1547
01:14:19,729 --> 01:14:21,379
Male: You said something about
1548
01:14:21,379 --> 01:14:24,799
you dropped this group-based cryptography word.
1549
01:14:24,799 --> 01:14:28,111
Most of the stuff I've encountered under that heading
1550
01:14:28,111 --> 01:14:30,008
kind of sounded like snake oil
1551
01:14:30,008 --> 01:14:32,898
just cause they're using non-abelian groups and stuff.
1552
01:14:32,898 --> 01:14:35,832
Is there anything here that isn't?
1553
01:14:35,832 --> 01:14:38,280
Tanja: Ok, I did not say group-based crypto.
1554
01:14:38,280 --> 01:14:38,914
Male: Ok.
1555
01:14:38,914 --> 01:14:40,259
Tanja: I said code-based cryptography.
1556
01:14:40,259 --> 01:14:41,241
Male: Sorry, thanks.
1557
01:14:41,241 --> 01:14:42,773
Tanja: So there is a class of cryptosystems
1558
01:14:42,773 --> 01:14:44,826
which are fine against quantum computers.
1559
01:14:44,826 --> 01:14:50,148
For all crypto it's always up to what we know.
1560
01:14:50,148 --> 01:14:51,959
Next day, there could be somebody with a
1561
01:14:51,959 --> 01:14:54,009
polynomial-time factor algorithm and so on.
1562
01:14:54,009 --> 01:14:56,723
Now there's also a group of people doing
1563
01:14:56,723 --> 01:14:59,357
braid group cryptography, doing non-abelian groups,
1564
01:14:59,357 --> 01:15:01,764
most of these systems have been broken.
1565
01:15:01,764 --> 01:15:07,083
If they weren't broken by classical computers
1566
01:15:07,083 --> 01:15:10,911
maybe they would remain secure against quantum computers.
1567
01:15:10,911 --> 01:15:13,828
It's lots of "would"s there.
1568
01:15:13,828 --> 01:15:17,612
Yes, they might have this on their flags as well
1569
01:15:17,612 --> 01:15:20,547
but I wouldn't count this as a secure system.
1570
01:15:20,547 --> 01:15:23,027
Whereas code-based cryptography or lattice-based cryptography
1571
01:15:23,027 --> 01:15:26,480
are systems which should be fine.
1572
01:15:26,480 --> 01:15:28,646
Dan: Maybe just a URL if you're interested
1573
01:15:28,646 --> 01:15:31,939
in post-quantum cryptography is pqcrypto.org
1574
01:15:31,939 --> 01:15:34,513
and that keeps track of papers on the various types
1575
01:15:34,513 --> 01:15:37,227
of cryptography that should survive for a 100 years
1576
01:15:37,227 --> 01:15:41,448
and in particular against quantum computers.
1577
01:15:41,448 --> 01:15:42,444
Angel: Ladies and gentleman, please give
1578
01:15:42,444 --> 01:15:46,546
Nadia, Tanja and Daniel a big round of applause.
1579
01:15:46,546 --> 01:15:47,714
Thank you so much.
1580
01:15:47,714 --> 01:15:56,643
applause