Today we have Daniel J. Bernstein,
Nadia Heninger,
and Tanja Lange.
And they will talk about RSA factorization in the real world.
The talk is called FactHacks.
applause
Nadia: Thank you.
So we're going to start with a crypto lecture in 5 minutes.
So, just to begin, since we're talking about RSA.
Here is a picture of RSA for you.
RSA is the first published public-key cryptosystem.
So by a public-key cryptosystem, what I mean is
a cryptosystem that has two keys instead of just having
one key. You might think "ok you encrypt a message
with that key and you decrypt a message with that key."
That is a symmetric cryptosystem.
A public-key cryptosystem has two keys.
One is the public key and one is the private key.
The public key you publish to the world and
anyone can use that key to encrypt a message,
especially to you.
And only you, who have the private key, can decrypt that.
So the RSA paper was the first published public-key cryptosystem.
That was in 1977.
This revolutionized cryptography and enabled
the development of the Internet.
So 35 years later RSA is still the most commonly used public-key cryptosystem.
If you ever use the Internet, you probably use it every day.
So here is a sample SSL certificate which uses RSA.
If you use SSH you probably have an SSH key
which probably is RSA.
Before I launch into the math
I want to explain what we're doing here.
We wanted to, since you guys are hackers,
you like code instead of formulas
so we wanted to give you formulas in code.
So what we're giving you is working Python code for
all the math that we're going to do.
And in order to do that we're going to use
some software called Sage.
Sage is free open-source mathematics software.
It is available at this URL sagemath.org.
It's based on Python
so it's very simple if you know Python.
It has a nice sort of interpreter
that you can use just like Python.
Here an example, 2*3=6.
But there are some differences.
For example the caret: in Python it's XOR,
but in Sage it's exponentiation.
So 2^3 is 8 in Sage.
The other nice thing about Sage is that
it has lots of useful mathematical libraries.
For example if I, in the Sage interpreter, say factor(15)
it helpfully tells me 3*5.
That's pretty great.
It also does a lot more advanced things, for example
it can represent polynomials.
I can say factor the polynomial x^2-1
and it helpfully tells me the answer to that.
So now that we've gone through what is Sage,
I want to explain how to do RSA.
Perhaps some of you have seen this before.
In order to generate an RSA key, the first thing that you do
is you generate two large random primes.
So if we want a 1024 bit RSA key
we generate two primes of size 2^512,
so 512 bit primes, p and q.
In order to generate the public key
I multiply together p and q and you get some N.
N has 1024 bits.
And then you have what's known as the public exponent,
which you can choose to be 3 or 65537 or 35.
It really doesn't matter which you choose.
These are some of the most commonly used values.
Now in order to generate a private key
you choose d, the decryption exponent,
to be the modular inverse of e mod (p-1)*(q-1)
It doesn't matter why so much, for this talk
but here is the formula
or here is the command to do this in Sage.
So now we have a public key and a private key
that are pairs.
And in RSA if you want to encrypt a message
you raise your message to the e'th power mod n
so you can do that with your public key here.
And similarly decryption,
you raise the ciphertext to the d'th power mod n
and that will give you the message back out again.
Fairly simple.
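The textbook scheme just described can be sketched in a few lines of plain Python (the talk's snippets are Sage; `pow(e, -1, m)` for the modular inverse assumes Python 3.8+, and the numbers here are toy values, not real key sizes):

```python
# Textbook RSA with toy numbers -- matching the formulas above.
# WARNING (as the talk says next): real RSA must pad the message.

p, q = 61, 53                  # two "large random primes" (tiny here)
N = p * q                      # public modulus, N = 3233
e = 17                         # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent: inverse of e mod (p-1)(q-1)

m = 65                         # the message
c = pow(m, e, N)               # encrypt: m^e mod N
assert pow(c, d, N) == m       # decrypt: c^d mod N gives m back
```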
Now I'm lying to you slightly.
If you take one thing from this talk: do not,
absolutely do not, implement RSA this way.
You must pad your message.
This is a public service warning.
We promised to tell you about factoring.
What is the relationship between RSA and factoring?
Now you can see easily from the formula
from the private key here
that if you know how to factor N into its prime factors p and q
then you can just compute d, the decryption exponent
of your private key.
Clearly, if you can factor N you can compute the private key.
But we don't actually know that factoring is the only way to break RSA.
There might be some way that you can compute messages from ciphertexts
that doesn't actually reveal the private key.
And there's no proof either way that this is or is not possible.
The last thing that I want to mention here
just to clarify things
some of you who might have taken
complexity or algorithms courses or a beginning CS course
you might have heard of NP-hardness, the P vs. NP problem.
And I just want to clarify that factoring is not known to be NP-hard.
Every computer scientist would love it if we could actually build
a cryptosystem which is known to be based on average-case NP-hardness.
We don't know how to do that.
RSA is not doing that.
In addition the factoring problem is probably not NP-hard.
The question is: how hard is factoring?
Because factoring is the best way that we know
to break the most commonly used public-key cryptosystem on the Internet.
Well I showed you some examples before
where Sage had this helpful little factor function.
So how well does it work?
How long do people think this takes:
two 32 bit integers multiplied together?
I heard a couple of seconds.
0.01 seconds.
This is on this computer.
How about 64 bit integers, two of them multiplied together?
So this is a 128 bit RSA modulus.
Any guesses?
One minute?
Nope, 0.1 seconds.
How about two 96 bit integers multiplied together?
A couple of minutes?
4 seconds.
How about two 128 bit integers multiplied together?
This is a 256 bit RSA modulus.
No guesses? Alright.
500 seconds.
So at this point it's starting to get a little bit iffy if I'm
sitting on a plane trying to do this on battery power.
laughter
Clearly this is growing faster than linear in the size of the key.
So what is actually going on?
Well.
Dan: Ok.
Turns out these factoring functions
they do look like they're getting pretty slow but
wait a minute.
Do we actually have to use this factor function
if it's getting really slow?
Maybe instead of trying to use Sage's factor function
maybe we could just guess what the prime was.
Now as an RSA user you might not think that this is a threat
having somebody guess your prime
because there is a way to count the number of
512 bit primes, and there are
more than 2^500 of them.
And if your random number generator is giving you
any of those 512 bit primes with the same chance
then any particular prime that the attacker guesses
and just tries dividing n by that,
see if it's evenly divisible, well
any particular prime has only a 2^(-500) chance
of being guessed by the attacker.
And even if the attacker tries a huge number of times
he'll never guess your prime
except what if your random number generator
is working really really badly and doesn't give you 2^500 different primes?
What if it only gives you, say, 2^40 different primes?
And then suddenly the attacker could, well,
try figuring out what those primes are.
So here's how the attacker looks at this.
The attacker says: ok I'm gonna target your public key.
I'm gonna take your big N which is your secret...
sorry, public key is big N,
your secret key p times your secret key q,
your private p and q.
He's trying to figure out the p and q, given N.
So what he does, he takes a bunch of devices that are out there
and he looks at how they work
or downloads a bunch of software which generates keys
and then using that software, using those devices
he tries generating a bunch of p's and q's for himself
using whatever the random number generator is
in that device or in that software.
And then that's something which
maybe the random number generator works really badly
and maybe you're using that device, too.
So the attacker tries each of the primes that he sees
from this device
and tries seeing does that prime divide your N.
If you were using that device
and it doesn't generate very many primes
then maybe he gets lucky and finds your prime
because the device has generated the same prime for him
that it generated for you.
Now does anybody actually build devices which
screw up random numbers this badly?
laughter
As a matter of fact, yes, bears do shit in the woods.
So there are some famous examples.
We don't have a huge number of historical discussions in this talk
but just a few examples here.
Goldberg&Wagner totally broke the original Netscape
generation of public keys.
There were only something like 2^47 possible keys,
which is a number small enough to enumerate.
So they could figure out all the possible keys
that Netscape could generate if you knew when you generated a key,
which was often very predictable.
Another example: the Debian bug,
the horrendous failure for SSH and lots of other applications
from a few years ago.
Debian and Ubuntu for a year and a half were generating only
well, fewer than a million possible public keys.
So complete disasters of random number generation.
But if you want to add to this list,
if you want to do more and more of these
find bad random number generators
you have to say "ok what's the next device out there
that might have been screwed up."
"Let's look at it, look at the code."
"See what kind of primes it generates."
"Does it do badly?"
It takes some real work to figure this out.
Wouldn't it be nice if there was a systematic way
to just without having to do a lot of work,
to look at each device, each piece of software,
wouldn't it be nice to just figure out
"oh yeah, let's just magically figure out
if there's any primes out there generated
from bad random number generators."
You could imagine here's the easy attack.
Here's the systematic way to do this.
You take two users, so you and you.
And you each have a public key.
We're gonna grab those public keys
N1 and N2
and then hope that this N1 and N2 share a prime.
N1 is some prime p times some q1 and N2 is some same prime p times a different q2.
Now this could happen, you could have
N1 and N2 sharing both primes.
That would be legitimate that two people who are actually
the same webserver sharing their keys, sharing their certificates,
they know they're doing this,
you can get the same public key from two places.
Google has their keys copied to tens of thousands of servers,
that's not a surprise.
But if there's a different...
If you have two different public keys
they shouldn't be sharing primes.
What if they do?
Well, here's this magic Euclid's algorithm
which will just print out the shared prime p.
This is a very simple algorithm.
It just takes one of the numbers, say N1
and let's say N1 is the smaller one and N2 is the bigger one,
it divides N2 by N1
and replaces N2 by the remainder and then switches the numbers
and that's exactly what this code does.
The N2 % N1, that's N2 mod N1,
the remainder when you divide N2 by N1.
And then the result once one of the numbers gets down to zero
the other number is exactly the shared prime.
So this amazing algorithm,
here's just a little manual example
not with big 512, 1024 bit keys
this is with just tiny little, I guess 8 bit primes,
16 bit... well, looks more like 13 bit keys.
So two numbers coming in: one is 4187 and the other is 5989.
You can see if you keep dividing one number by the other,
replacing it by the remainder,
do that a few times, you quickly get down to zero.
And the number right before the zero in the middle of the slide is 53.
Now 53 is a shared prime between these two small keys.
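That hand-worked example can be checked with a minimal plain-Python version of Euclid's algorithm (a sketch, not the talk's Sage code):

```python
def euclid_gcd(a, b):
    # Repeatedly replace one number by the remainder of dividing
    # the other by it, until one of them reaches zero.
    while b:
        a, b = b, a % b
    return a

# The two small "keys" from the slide: 4187 = 53*79, 5989 = 53*113.
shared = euclid_gcd(4187, 5989)
print(shared)            # 53, the shared prime
print(4187 // shared)    # 79, the other factor of the first key
```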
If you scale this up,
if you do this computation for bigger numbers
then here's another timing example.
It's still really really fast, in the middle of the slide.
This is doing 1024 bit keys that share a 512 bit prime.
You can figure out this 512 bit prime so quickly that the timing
there is 0.00 seconds, can't even see how fast it is.
Ok, so that's Euclid's algorithm.
Now you can actually do this for tons and tons of keys.
You download, say, millions, tens of millions of keys from the Internet,
all the different SSL servers, maybe go for some SSH keys,
you download tons and tons of public keys and then
you try all the pairs with Euclid's algorithm.
And it's so fast that you can do this but actually
you can do better.
So there's an algorithm: batch GCD algorithm
which takes a whole bunch of keys
and figures out which of those keys share primes with the others.
Much much faster than trying every pair.
So you don't have to try every pair.
It's kind of like sorting, it's about that kind of speed
instead of having to try every pair.
If you have tens of millions you don't have to do
tens of millions times tens of millions of pairs.
You just have to pretty much sweep through the whole list once.
So what does this batch GCD algorithm do?
It's figuring out for the first N1,
instead of computing the greatest common divisor,
the Euclid result for N1 and N2
and for N1 and N3 and so on,
it figures out the product of N2, N3, N4 and so on,
takes the greatest common divisor of that with N1,
applies Euclid's algorithm to that
and then does a similar thing for N2 and for N3 and so on.
Each of those is gonna show you whether N1 or N2 or N3,
it's gonna show you whether they share some prime
with any of the other N's.
If you did exactly the computation
shown on the slide in the math formulas
it would still be too slow.
But the batch GCD algorithm finds redundancies
in these GCDs, and here's exactly what it does:
It figures out the product of all of the keys.
So you take, say, a million keys and you multiply them all together.
And that's done with the code here where you do...
maybe I'll just skip to the bottom.
You see numbers 10, 20, 30, 40.
You build a product tree.
So you take 10 times 20, compute 200.
Take the next pair 30 times 40, compute 1200.
The take the pair of products 200 times 1200.
If you have more numbers, you have more levels in the tree.
And that's exactly what the Sage code does here.
And then the last step of the algorithm is
kind of going down the tree, building a remainder tree.
So what's happening here to figure out the greatest common divisor
of N1 with the product of all the other keys,
the idea is to take this product
which is called big R on this slide,
take the product of all the keys, reduce that mod N1^2,
divide the remainder by N1,
and then compute the greatest common divisor of the result with N1.
And the amazing effect is that it's exactly the
greatest common divisor we were looking for.
And what this little Sage script does is that it takes
R and divides R by, well a bunch of N's in a really fast way,
and then at the bottom, once it has figured out
all mod N1^2, all mod N2^2 and so on,
divides those by N1 and N2,
takes the greatest common divisor with N1 and N2
and that's exactly telling you, the outputs here
that are different from 1, are exactly telling you
which N's share a prime with some others.
This very remarkably short little script does work quite well.
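Here is a minimal plain-Python sketch of that product-tree/remainder-tree idea (the talk's actual Sage scripts are at facthacks.cr.yp.to; `math.prod` assumes Python 3.8+, and the three toy moduli are made up for illustration):

```python
from math import gcd, prod

def batch_gcd(moduli):
    # Product tree: leaves are the moduli, each higher level
    # holds products of pairs from the level below.
    tree = [moduli]
    while len(tree[-1]) > 1:
        level = tree[-1]
        tree.append([prod(level[i:i + 2]) for i in range(0, len(level), 2)])
    # Remainder tree: push R (the product of everything) back down,
    # reducing mod n^2 at each level.
    rems = tree.pop()
    while tree:
        level = tree.pop()
        rems = [rems[i // 2] % (n * n) for i, n in enumerate(level)]
    # gcd(n, (R mod n^2) / n) equals gcd(n, product of all the others).
    return [gcd(r // n, n) for r, n in zip(rems, moduli)]

# Three toy "keys"; the first two share the prime 11.
print(batch_gcd([11 * 13, 11 * 17, 19 * 23]))   # [11, 11, 1]
```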
Here's how fast these are.
So I'm on some 800 MHz core.
If you try this for, say, a hundred numbers,
just hundred random 1024 bit numbers,
it doesn't matter what the numbers are, they always run at the same speed,
then that takes 0.05 seconds.
If you have 10 times as many numbers it takes about 20 times as long.
And if you have another 10 times as many numbers it takes about 20 times as long.
And it keeps scaling up like that.
So you can get to millions or tens of millions of numbers
and it still finishes in a reasonable amount of time.
So, amazingly fast algorithm.
You don't have to scale this up to like 2^50 or 2^100 keys.
There's only, say, tens of millions of keys out there
and this runs reasonably fast.
Ok.
Can you actually hope for this to work?
Are random number generators really this bad?
Well...
laughs
There's a 2012 paper from Heninger
that's the same Heninger we have sitting here
and Durumeric, Wustrow, Halderman,
which got the best paper award at USENIX Security 2012.
What they did was they tried exactly this on tens of millions of keys out there
and factored tens of thousands of keys.
Admittedly they used a C version of this instead of this
Sage script but
the Sage script will go up to millions.
It's just when you get beyond that you want to write it in C
to deal with using a lot of memory.
But you can use exactly this script for pretty large chunks of keys.
And so they factored tens of thousands of keys and said
that these keys are typically not your bank keys.
They're gonna be, say, keys for your home router.
So the vulnerable devices are not gonna be,
say, a big server out there.
The vulnerable devices will be your FRITZ!Box.
applause
Thank you, Nadia.
It's good to read the paper, go to factorable.net
and you get tons of analyses of why this happened
and why these devices generated guessable numbers.
I mean numbers that are so non-random that they get
repeated between a lot of keys out there.
There was at the same time another team that
like within days announced the same result.
They independently did the same computation.
They said "yeah, we've also downloaded
a bunch of keys and factored as many as we could"
with basically the same algorithm.
At the end of it "yeah we factored
tens of thousands of keys."
That paper has no analysis.
They're like "yeah, there's keys"
"they share primes for some reason"
"I guess ecommerce is dead"
"go out, change your bank keys"
"call the New York Times"
There's the New York Times article:
"Flaw Found in an Online Encryption Method"
I also like the advertising on the side here.
Like iron key on the top and then:
"Encrypt, decrypt & access my secret data in real time"
"try it free for 30 days."
"I secured my cloud data. Have you?"
Awesome advertising there.
Once you found that there's a problem like this
then of course you start looking around for more and more places
where you might have some bad randomness.
For example there is a paper, or at least slides,
by Chou in Taiwan, saying
they took two million Taiwan Citizen Digital Certificates
and did some GCDs
and found that 103 of those were
factorable.
So anybody who downloaded these certificates
which apparently are used in Taiwan for like paying your taxes,
talking to banks I don't know,
but paying taxes is one that I've checked.
So 103 people out there have these Taiwan
Citizen Digital Certificates where
you're supposed to have in this database things like
the names of the people and their ID numbers
but you aren't supposed to have their secret keys.
The whole point is those are supposed to be private
kept on some smartcards that are issued to the citizens
in Taiwan.
And the randomness generation is so bad that
103 of those keys have now been factored.
The smartcard manufacturer, I also like this quote,
it's a company based in Munich called
Giesecke & Devrient and their motto is:
"Creating Confidence"
applause
Tanja: There's gonna be more of Dan, don't worry.
But we promised a bit more than just factoring
bad keys, keys sharing their primes.
What if you find another number like this one?
So here's coming some more of math stuff.
So if you see a number like this and you
observe the last digit, then yes:
it's very easy for you to see it's divisible by 5.
So if you find such a key that is easy to break
and you are a computer rather than a human being
you look at lots of those numbers.
You have a list of small primes stored
and what you're gonna do is just trial division.
So if there's any small factor
it's very easy to find by trial division.
Or, with the remainder trees that Dan showed,
you can also do batch trial division.
So assuming that you're looking for a prime p somewhere
then you go linearly all the way up to the p
and there are about p/log(p) many primes
before you find this p.
For each of them you just do exact division.
If it works, fine you found a factor, otherwise not.
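As a sketch, trial division up to a bound is just a loop in plain Python (the toy numbers below are chosen for illustration):

```python
def trial_division(n, bound):
    # Try dividing by every candidate up to the bound; the first
    # hit is necessarily prime (smaller factors were already tried).
    for d in range(2, bound):
        if n % d == 0:
            return d     # found a small factor
    return None          # no factor below the bound

print(trial_division(5 * 1000003, 100))   # 5
print(trial_division(2759, 30))           # None: smallest factor is 31
```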
Now this is not going to work against your normal keys.
I know that in the example that Dan mentioned by Wagner&Goldberg
that they found one factor which had a 9 in there
but usually your keys don't have a 9.
For slightly larger numbers there is a method due to Pollard.
So if you see this respectable number here,
there is no obvious divisibility to exploit.
One thing you can try is
a walk, well we call this a walk.
You start with some random number smaller than N
and some offset here.
I should also say, all those scripts are available at this URL.
So if you go to facthacks.cr.yp.to then you'll find all the scripts
if you want to play with it at the same time,
including this one with some explanations and such.
So what you do is you start off two different numbers,
one at each step squares it and adds c.
And the other one squares it, adds c,
and squares it again and adds c.
Each time computing mod N.
So you have two sequences running around,
one is twice the speed as the other.
At every step you check whether the GCD of N
and the difference between the two walks
is anything interesting.
N is an RSA modulus, the only thing interesting
could either be p or q.
So for this particular number
when I run this
I find 2053 after a few milliseconds.
So this is again a cooked up example.
You won't find such a small factor in
a general RSA modulus, otherwise your numbers
would be totally busted, but
this shows if you have a small number in there
it's very easy to find.
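The two-walk idea can be sketched in plain Python; the offset c = 1, the starting point 2, and the toy modulus below are arbitrary illustrative choices:

```python
from math import gcd

def pollard_rho(n, c=1):
    # Two walks x -> x^2 + c mod n; the "fast" walk takes
    # two steps for every one step of the "slow" walk.
    x = y = 2
    while True:
        x = (x * x + c) % n               # slow walk: one step
        y = (y * y + c) % n
        y = (y * y + c) % n               # fast walk: two steps
        d = gcd(abs(x - y), n)
        if d == n:                        # walks collided exactly:
            return pollard_rho(n, c + 1)  # retry with another offset
        if d > 1:
            return d                      # nontrivial factor found

f = pollard_rho(2053 * 10007)   # 2053 is the talk's small factor
print(f)
```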
Later on, when Dan is talking about the number field sieve,
this is an interesting subroutine.
So if you ever need to find small factors
then, sure, trial division is very easy for
really small numbers, taking this long,
but this rho method, with cost around the square root of p, is much faster already.
Now even faster than that, also due to Pollard,
called Pollard's p-1 method.
Here is again an innocent looking number N.
You see the numbers get larger.
This is a 256 bit number.
In order to deal with such a number there is one step
which is expensive but you would do it only once
for all different numbers you want to try,
namely compute a big integer which has lots of little factors.
So you take the least common multiple
of 1, 2, 3, 4, 5...
Ok, once you hit the 4 you have another 2 in there.
So there's lots and lots of powers of 2,
lots and lots of powers of 3.
And then the larger primes only appear like once.
But you're not going up to 256 or 128 anywhere.
You're sticking here at 22 bits
and then use the same function that Nadia explained
in the RSA computation, namely the modular exponentiation.
So you take the 2^y,
this huge number,
but the whole computation mod N.
So this whole thing never grows beyond this 256 bit number.
Or for a real world example it would be a 1000 bit number.
And then you compute the GCD of this number and N.
For this particular one, again it's a cooked up example,
you get a factor.
Now this factor is actually 120 bits.
This is not a small factor.
So if you were using 256 bit numbers
this is something that could happen to you
and nevertheless this method would find it.
So this finds much larger factors than trial division
or the Rho method in the same amount of time.
But it doesn't work for all primes.
It works for primes which are kind of special.
Now what's special about this prime or the other factor?
If you take p and subtract 1, hence the name p-1 method,
and factor that thing,
that has lots and lots of small factors.
So that's something we call smooth.
So a smooth number doesn't have big factors.
So a prime is like the least smooth number you can imagine.
And 2 to the power large is the most smooth number,
so lots of little factors means smooth.
This largest factor happens to be 21.7 bits
which is why I chose the 22 bits here.
So as soon as I run over the largest prime in p-1
with this exponent y here,
then I do find this factor.
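Here is a plain-Python sketch of the p-1 method; the toy primes 1009 (with 1008 = 2^4 * 3^2 * 7 smooth) and 10007 are made up for illustration. Raising 2 to each exponent 2, 3, ..., B in turn gives 2^(B!), and B! is a multiple of the lcm described above:

```python
from math import gcd

def pollard_p1(n, bound):
    # Compute a = 2^(bound!) mod n by repeated exponentiation;
    # bound! is a multiple of lcm(1, 2, ..., bound).
    a = 2
    for j in range(2, bound + 1):
        a = pow(a, j, n)
    # If p-1 divides that big exponent, then a = 1 mod p,
    # so gcd(a - 1, n) reveals p.
    return gcd(a - 1, n)

# p = 1009 has smooth p-1 = 1008 = 2^4 * 3^2 * 7, so a small
# bound suffices; q = 10007 has q-1 = 2 * 5003 (5003 prime).
print(pollard_p1(1009 * 10007, 10))   # 1009
```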
Now if you thought this was math
there is even more scary math
namely there are methods which are generalizing
this p-1 method using elliptic curves.
If you listen to the crypto advertisement
elliptic curves are usually something very good,
something you should have on your smartcard.
It's faster than RSA and so on.
That's the same type of elliptic curve
but here elliptic curves come in as an attack tool.
There is a method called the elliptic curve method, ECM,
which is a generalization of this p-1 method and does not need the prime to be special.
I mean, avoiding such special primes is easy.
If you look into older standards
they all warn about "make sure to use strong primes"
"make sure to check that your p-1 is not smooth"
And of course you don't want your p-1 to be smooth
because otherwise this attack works.
But if I have some elliptic curve
a different type of prime is weak for that curve
and I have many, many, many elliptic curves.
For each of the curves some primes are weak.
You can't exclude this method.
So the only thing you can do is just, well,
choose your prime large enough and hope, well,
make sure p-1 doesn't get it and hope
that the next elliptic curves won't get it.
So elliptic curves are good for attacking.
It's still not the fastest method out there
but it's a good method for factoring random numbers.
It's not the best method for finding RSA factors,
but if you have a number which most likely
has a reasonably small factor, then it's a good method.
Now this is what you do for finding small factors.
To show some method for big factors,
now here that is a big number.
But this is what I would call factorization by inspection.
What's wrong with this number?
I mean this number is not small and
it won't have small factors, I promise you this.
It's written in decimal, and 9 is the last digit before you roll over.
All those 9's mean this looks very close to a power of 10.
Actually this is very close to 10^340.
If you take the square root of it
then you almost exactly hit 10^170.
This example was cooked up as taking two primes
namely 10^170-33 and 10^170+63 and multiplying them.
So if you see lots of 0's and lots of 9's
then you know that you are very close to a certain power.
Now in real life this wouldn't be a power of 10,
it would be a power of 2.
And there actually are some examples.
There's a famous story of a letter bomber in Austria
who wanted his message to be crackable
and he sent a message, well he was some right-wing asshole
that was letter-bombing people
and then he sent an encrypted message to the police.
And the police tried to factor it and tried to decipher it
hoping it would be the list of the next victims.
In the end it was not but
this person was actually using very close to a power of 2
in order to have people crack it.
In the end it was just some propaganda,
some right-wing stuff as well.
So there are people using this for breakability.
I'm not aware of people ending up with this by accident.
But what can happen to you is not using close to the power of 2
but saying "ok I learned I shouldn't be close to the power of 2
so I take some offset and take the next prime."
Next prime just means: add 1, check is it prime;
add 2, check; add 3, check.
Just check numbers linearly from then on.
Now if we jump to this where we finally find our prime p
it's a good prime.
It's not anywhere close to the power of 2 because my c was large enough.
Made it, good.
Now on to q.
So again call the next_prime().
So I do p+1, p+2 until I find a prime.
That means that if you take the product of the two
then this N is very close to a square.
So this N is cooked up with this method.
You don't see anything on the end,
there's nothing wrong there.
But when you compute the square root of it
it is almost an integer, it's .99999...
and then some dirt here.
So then you'll know that your primes were too small.
Sorry, not too small, too close to each other.
So if you ever take an RSA modulus
and you compute the square root
and you see something like this
then you know: that user did this method.
You could just say: I take the number,
subtract 1, subtract 2, just go down from the square root.
Or you can check how far away it is from being a square.
So you take the ceiling,
that means the closest integer, rounding upwards.
Compute the square of that and subtract N.
Now in this example that is 4096
which itself is a square.
So the difference of a^2 and N is a square.
Then I take this 64, compute a-64, and divide N by it.
It's an exact division, so I found one of the factors.
And then there's the other factor.
It doesn't matter whether it's a power of 2 or a power of 10.
What matters is that the numbers are too close.
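The difference-of-squares search just described can be sketched in plain Python (`math.isqrt` assumes Python 3.8+; the two close primes below are illustrative, not from the talk):

```python
from math import isqrt

def fermat_factor(n):
    # Start at the ceiling of sqrt(n) and look for a such that
    # a^2 - n is a perfect square b^2; then n = (a-b)(a+b).
    a = isqrt(n)
    if a * a < n:
        a += 1                    # ceiling of the square root
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

# Two primes that are much too close together:
print(fermat_factor(10007 * 10037))   # (10007, 10037), found in one step
```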
Now this method actually works surprisingly well
even if we don't do next_prime().
So here's another example where I didn't do the next_prime()
but did something larger
so this is not really close to a square.
Some 9's but not too many.
What we did before we wrote it as a difference of squares
and then divided N by one of these two factors.
Now in this case I would not stop at a^2-N,
but I say, well, if a^2-N is not a square
I try the next one, (a+1)^2, (a+2)^2,
each time subtract N and check when this is a square.
Now for this example I get lucky after 2 steps.
And it actually took me a long time to cook up this example,
because I was starting with primes where
half of the bits of p were changed,
so I was actually starting with a q quite a distance away.
Still, most of those attempts were quickly giving a square anyway.
Now for general numbers, if I don't do next_prime()
but just take a random prime and another random prime,
it still works, but takes a long time:
I can always write N as a difference of squares,
this is a, this is b,
but then it usually takes about square root of N steps,
which is the same size as p, until I get there.
This is a nice method if the number looks like lots of
9999; otherwise do something better, as Dan will tell you now.
Dan: Ok.
applause
Alright, so it's bad to have p and q
too close to each other
or have a small p or to have p and q non-random.
So let's do everything right.
Let's just independently choose some big p
between 2^511 and 2^512
and choose some big q,
without it being the next prime or anywhere in the ballpark of p.
You could maybe say q has to be between 2^509 and 2^510
and then it's definitely nowhere near p.
Now you've got this public key, p times q,
which is a 1024 bit RSA key.
If nothing has gone wrong with your randomness generation
then what does the attacker do?
Well, we don't know anything fast.
So this is gonna be the one part of the talk
where there's these big powers of 2
for how fast things are
because they're not really fast.
You have to think about how much something like 2^80 is.
There's all these different methods,
I'll just skip down to the last two.
There's the quadratic sieve (QS),
the number field sieve (NFS).
These have been around since, well,
QS since the '80s, NFS since the early '90s.
The NFS, the general consensus is
it takes something like 2^80 operations.
Now here's something fun to try.
If somebody says 2^80 operations
and there's some cryptographer talking about the security or something,
ask them: what do you mean by an operation?
How fast is this operation?
And then most of the people saying this
will go running screaming like:
I don't know what an operation is,
don't ask me that question.
But still people will confidently tell you
that the NFS takes 2^80 operations
to break any 1024 bit RSA key
even if the user hasn't done anything wrong,
the implementor hasn't done anything wrong.
And this number, this 2^80
if you pin down what those operations are,
this is something which is doable today
by people with a big botnet or people with a big computer cluster.
I'll give some details what I mean, the size is there.
As your computers get cheaper and cheaper,
somehow chips aren't going up in clockspeed
but they're still getting cheaper.
You can get a really cheap ARM which is doing
the same kind of computations
that a pretty big CPU was doing several years ago.
And chips are gonna continue getting cheaper
for a while. These attacks against RSA 1024
are going to become doable for more and more people.
How do these methods work?
I'll give one example and then one little Sage script.
Let's try just a really small number.
This will be with the quadratic sieve
which is not as fancy as the number field sieve
but it will give you some idea of how these methods work.
The QS starts with that Fermat method you heard about.
Remember, the idea there was to look at a^2 - N,
where a is about the square root of N,
or maybe that plus 1, plus 2, and so on,
and you keep searching through those a's
for numbers that, if you square them and subtract N,
give you a square.
So let's try that for N = 2759, which is not very big.
Then you try: 53 is the ceiling of the square root;
square that, subtract 2759, you get 50.
Well 25 would be a square, but 50 is not a square,
it's 2 times 25, that is 2 times 5^2.
And then you try another number, 54^2-2759
umm 157, that's too complicated
I don't remember that being a square so let's just skip it.
And then you keep going like this: 266, 377,
490. I remember 49 is a square,
but that doesn't mean that 490 is a square,
because the leftover 10, 2 times 5, that's not a square.
And 605, you can try...
we remember how to take out the 5,
and 605 divided by 5 is 121,
so you can figure out it's 5 times 11^2.
It's almost a square.
You keep doing this for a while
and it seems like Fermat's method is actually working pretty badly.
It does work eventually but it's taking quite a while.
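In code, that fruitless search looks like this (a plain-Python sketch of the trial just described):

```python
from math import isqrt

N = 2759
a0 = isqrt(N) + 1   # 53, the ceiling of sqrt(N)
for a in range(a0, a0 + 6):
    d = a * a - N
    r = isqrt(d)
    status = "a square!" if r * r == d else "not a square"
    print(a, d, status)
# 53 50 not a square
# 54 157 not a square
# ... none of the first six differences is a square
```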
Here's what the quadratic sieve does.
From the same computation you look at these
numbers that were not exactly squares
50 and 490 and 605
and you notice if you multiply them together:
50 was 2 times 5^2
490 is 2 times 5 times 7^2
605 is 5 times 11^2
and then you multiply those together
you get 2^2 and 5^4 and 7^2 and 11^2
and that's exactly the square of something.
The square of 2^1 times 5^2 times 7 times 11.
Then the quadratic sieve is taking the square root of that number
and subtracting that from the product of the fifty-somethings
that were on the left side
and taking the greatest common divisor of that with N
and amazingly that produces a factor of N.
And it's not just some random accident,
like writing down some hocus-pocus where
you eventually get 31 coming out
because every 31st random number you write down
happens to have this 31 dividing it.
No, you can try this for bigger and bigger numbers
and actually this keeps working.
As soon as you find a square product
then half of those square products are gonna factor N.
So that's how the quadratic sieve works.
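For the toy number above, the whole trick fits in a few lines of plain Python; the 31 that comes out is a genuine factor, since 2759 = 31 * 89:

```python
from math import gcd, isqrt

N = 2759
# The three near-misses from the Fermat search:
#   53^2 - N = 50  = 2 * 5^2
#   57^2 - N = 490 = 2 * 5 * 7^2
#   58^2 - N = 605 = 5 * 11^2
# Their product is 2^2 * 5^4 * 7^2 * 11^2, a perfect square.
lhs = 53 * 57 * 58                # product of the a's
rhs = isqrt(50 * 490 * 605)       # 2 * 5^2 * 7 * 11 = 3850
factor = gcd(lhs - rhs, N)
print(factor, N // factor)        # 31 89
```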
Here's a more systematic script with a much bigger number
which if it were just hocuspocus this wouldn't work.
But this does work.
So there's a number from my computer's random number generator
31415926... laughter
You try writing down a bunch of squares...
maybe I should get my computer's random number generator checked.
It is this two year old laptop you heard about before
so I'm not sure it's working properly.
You take a bunch of a^2-N and then
let's make a list of those called X.
You can see the range, there are up to 500,000 numbers.
It's a pretty big list we're talking about.
And then search through those elements,
search through those a^2-N, those differences,
the elements in this list,
to see which ones have easy factorizations.
Now, a lot of numbers, like that 157,
I know what the factorization of that is,
but there is a function, which you can find on our website,
easyfactorizations(), which looks through the list X
and, if it finds numbers that are easy to factor,
quickly writes down the factorizations
using the small-prime techniques you heard about.
And then makes a list of those easy factorizations,
puts those into F.
And then there's a big chunk which I won't try to explain
which is called linear algebra mod 2.
And that somehow figures out a square
that comes from the factorizations,
somehow looks through this list of factorizations
and magically applies some linear algebra stuff.
Well ok, I won't go through that.
It's something doable
and this is using some standard Sage functions to do it pretty easily.
There's all sorts of things that
in the interest of time I won't get into
like how this easyfactorizations() works.
And this is something where people write papers
and papers and papers talking about trying to make this
go as quickly as possible.
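The talk's actual script lives on the speakers' website; as a rough illustration of what a function like easyfactorizations() has to do, here is a naive trial-division version over a tiny factor base (the real code uses much faster techniques such as ECM):

```python
from math import isqrt

PRIMES = (2, 3, 5, 7, 11, 13)   # a tiny factor base

def easy_factorization(n):
    """Trial-divide n over the factor base; return the exponent
    vector if n factors completely over it, else None."""
    exps = [0] * len(PRIMES)
    for i, p in enumerate(PRIMES):
        while n % p == 0:
            n //= p
            exps[i] += 1
    return exps if n == 1 else None

N = 2759
a0 = isqrt(N) + 1
X = [(a, a * a - N) for a in range(a0, a0 + 500)]
F = [(a, easy_factorization(d)) for a, d in X]
F = [(a, e) for a, e in F if e is not None]
# The linear algebra mod 2 then looks for a subset of these
# exponent vectors summing to all-even entries; here the first
# three already do, since 50 * 490 * 605 is a perfect square.
print(F[:3])
```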
I do want to emphasize one fact about these methods:
the bold-faced **very small memory requirements**.
You can use the elliptic curve method
that you heard about before
you can use that to figure out easy factorizations
using very little memory.
That means you can build a chip
like a FPGA, a special purpose ASIC,
you can have thousands of little ECM units
running in parallel, so you can really
massively parallelize this easyfactorizations() function.
So if people tell you "oh you need tons of
memory for sieving to find easy factorizations"
then you should tell them "no no, that's not true"
"ECM you can do with very little memory
running massively parallelizable"
And then there's other things which,
I suppose if we had another hour
then we could get into all these details
of like how the number field sieve goes beyond this.
It's a similar kind of method but gets more complicated.
I do want to emphasize again thing in bold-face here
which is that if somebody tells you
"oh 2^80 operations, the attacker wouldn't do
that because the target isn't worth that much to them"
Well, **Batch NFS** is a way to take a bunch of N's
and like the batch techniques before
there's some magic ways to find redundancies
between doing factorizations of those N's.
So somebody tells you:
"oh 2^80, that isn't worth it for the attacker."
Well, **Batch NFS** reduces the cost quite a bit
for factoring each key.
If the attacker has a lot of keys to target
they can factor all of them in not much more cost
than just factoring one.
So don't believe people who take an economic view
how much the number field sieve is worth.
Alright, what does this mean?
Can the attacker actually do the NFS
once they've put in all these optimizations?
Well if you look carefully at the analyses of the NFS
there are only about 2^70 differences.
Things like these a^2-N's.
If you look through 2^70 of those and scan them properly,
find easy factorizations, do linear algebra,
you will factor any 1024 bit key.
And 2^70, how big is that number?
It's actually small enough that you can do this computation on
your favorite botnet.
Say, Conficker. It's shrinking,
and it will probably be wiped out at some point,
but it's an example of what you can do
using any of these security problems that are deployed
on enough machines.
You can break into a bunch of machines,
millions of machines.
Conficker estimates vary between 7 million and 12 million machines.
So use your, say, 2^23, that's 8 million machines.
Count how many seconds there are in a year
and then say, ok, that means we have to do
2^22 differences per second per machine.
And state-of-the-art factorization code already runs faster than that.
So it's actually a pretty easy computation.
You don't even need that many millions of computers to do this.
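The arithmetic behind that estimate, under the talk's assumptions of 2^70 differences total and 2^23 machines:

```python
total_work = 2**70                  # differences to process (talk's estimate)
machines = 2**23                    # about 8 million bots
seconds_per_year = 365 * 24 * 3600  # roughly 2^25
rate = total_work / machines / seconds_per_year
print(rate)   # about 4.4 million, i.e. roughly 2^22 per machine per second
```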
If people tell you "wait a minute"
"you can't just use a bunch of machines"
"there's all this work for linear algebra"
Then I'll just skip this.
Be aware that linear algebra is not as much of a problem
as some people would say.
If you use a big botnet there's one little problem.
For an attacker who breaks into a bunch of machines
and then starts spinning them up, heating up the CPU,
the fan starts running, the user starts wondering
"why is my machine running so slowly?"
Maybe you only use half of the CPU,
but it has still got a good chance of getting you detected
and kicked off the machines.
Users are gonna shut you down if you're doing a
say, year of computation on your botnet.
So what do you do instead?
Well you build a little computer cluster.
Now yesterday we heard about a big building
in Bluffdale, Utah, that apparently
is going to store all of the data collected by
the NSA for the next hundred years
or maybe forever.
But something that I didn't hear emphasized yesterday
was that this building also has a new power substation
which is drawing 65 megawatts,
well generating 65 megawatts.
Now what are we doing with 65 megawatts?
Is that the kind of power that you need to
store a bunch of data on tapes?
No, that's the kind of power you need to do computations.
So what is NSA doing with 2^26 watts?
Or maybe even with more watts.
You shouldn't actually think that 2^26 is such a big number.
Here's a little table with, well,
2^57 admittedly is pushing it a little bit.
This is the total power that the earth gets from the sun
of which about half gets to the earth's surface.
2^44 watts is about what humanity is using right now.
It varies a little bit, but I mean this is the average power,
including cars and such.
If you have the botnet that breaks into millions of machines
and runs them flat-out, that's about 2^30 watts.
And actually if you think about it, wait a minute,
there's a lot of machines out there.
This means the 2^26 is not actually that much power.
It's just one dinky little Bluffdale 65 MW computer center
and actually if you're a government agency
you probably have more than one of those.
As you heard yesterday in the context of data storage
but again, it's not just data storage.
You don't make a 65 MW power station to just store data.
Ok, if you just take standard graphics processing units
and put 2^26 watts into those, that will do about
2^84 floating-point multiplications per year.
You have to figure out, ok, what are exactly the operations involved here.
This should be enough to break 1024 bit RSA
And if NSA is not just buying GPUs off the shelf
but really tuning chips that they build...
they're using IBM to manufacture their chips at the moment.
Apparently it wasn't cost-effective for NSA to manufacture their own chips,
so in 2005 they started subcontracting to IBM,
and presumably through that fabrication IBM is building the chips
that NSA wants, and those should be about 10 times faster
than what GPUs can do.
So it should be possible with this little 65 MW computer cluster
to factor at least 1, maybe even 10,
and then with batch NFS maybe even more
1024 bit RSA keys in a year.
applause
Nadia: So to wrap things up here I want to
explain how you can use
from the comfort of your very own home
a distributed, a massive scale distributed cloud computing service
to calculate private keys on your own.
So many of you may be familiar with Amazon EC2
but it turns out Google also has a large amount of computing infrastructure
and they actually have a very convenient web interface
to access it.
laughter
applause
It's amazing what you can find on the Internet.
laughter
So ok, here's some keys except, you know,
if you look at these carefully
not all of these are sort of obviously RSA keys.
There are some problems here
like the second to last one on the bottom.
So...
laughter
After doing this search we found this key:
someone seems to have pasted a private key into pastebin,
and someone else came along and replaced part of it
with some profanity.
This is an interesting problem.
What do we do with this key?
Clearly this is an interesting key, we must have this key.
laughter
applause
Step 1... yeah?
Male: Did you see the easter egg in row 1?
Nadia: Well, we'll discuss this.
Angel: Questions will be later, after the talk please.
Nadia: So yeah, we've removed the obvious problems here.
But there's still an issue which is that
we're missing hundreds and hundreds of bits of key.
We can't brute-force this.
What are we gonna do?
Well, let's look at what RSA keys actually are.
I introduced RSA keys and I was, like,
it's a d and d has like a thousand bits.
And if we don't know p and q
how are we gonna get this d?
Here is the storage format for an RSA key.
Public key is the modulus N, the public exponent e.
Ok, private key has version, N, e,
d the decryption exponent,
p, q, d mod (p-1), d mod (q-1),
the inverse of q mod p,
this is all useful information if you want to
speed up your calculation and not have to
compute everything every time you use your key.
Okay.
What did we see here then?
Here is the key broken up into all of the different pieces.
So we have: the red bit is approximately where N is sitting.
We've got e in orange.
We've got part of d in yellow.
We seem to be missing p
but we've got part of q
and then here's d mod p-1
and d mod q-1 and q inverse mod p.
There's also a little problem before the red here
laughter
in addition.
You might call this the private part of the private key.
laughter
applause
Fortunately this is just sitting in an encoding of the length
and the version. laughter
Alright, so what do we do with this information?
Basically given any part of this private key
we can easily compute all of the other parts.
Simple formulas:
take 2 to the power e times (d mod (p-1)), subtract 2, and GCD that with N;
then we get p, and then
we can figure out what q was by dividing,
and figure out what d is,
and so on and so forth.
You can do this even if you have a part of a single value
from the private key.
You can still compute the rest of the private key from that.
We don't have time to get into this but it's online.
You've seen a little bit of math.
Simple formulas, typed into Sage.
We can reconstruct the private key here.
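As a sketch of that formula with a toy key (tiny made-up primes standing in for the real 512-bit values; the three-argument modular inverse `pow(e, -1, m)` needs Python 3.8+):

```python
from math import gcd

# Toy key: tiny primes stand in for the real 512-bit ones.
p, q = 31, 89
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # the private exponent
dp = d % (p - 1)                     # the "d mod (p-1)" field of the key file

# e * dp = 1 (mod p-1), so 2^(e*dp) = 2 (mod p) by Fermat's little theorem;
# hence p divides 2^(e*dp) - 2, and a single gcd with N reveals p.
p_recovered = gcd(pow(2, e * dp, N) - 2, N)
q_recovered = N // p_recovered
print(p_recovered, q_recovered)   # 31 89
```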
applause
So, what are the lessons that we can learn
from everything that we've done today.
The first one is: stop using 1024 bit RSA
if you haven't already.
I have looked at the keys that are out there on the Internet
and you are still using 1024 bit RSA.
laughter
Second: make sure your primes are big enough.
Don't try to be clever about how you're generating them.
Make sure your primes are random.
You guys have some problems
especially in hardware.
Also...
laughter
"FUCK A DUCK" is not good crypto.
Pastebin is not a secure cloud store.
laughter
applause
And you probably shouldn't put your private key
in a secure cloud store anyway.
laughter
applause
And lastly...
laughter
applause
Thank you.
applause
Angel: Give them an applause.
applause
Angel: So questions will be taken at the numbered microphones.
So anyone who has a question please line up at a microphone
and we will start by the signal angel
who has a question from IRC.
*Signal Angel*: IRC has a lot of punch lines involving penises
and some questions.
First question is: would you recommend rather using
elliptic curves, or RSA with multiple primes...
using proper primes?
Dan: There are ways to do elliptic curves wrong,
there are ways to do RSA wrong.
In general if you've got a particular performance requirement
then elliptic curves are gonna meet that
with a much higher security level
than RSA is gonna meet that.
Of course you can do RSA securely,
you can do elliptic curves securely
but if you've got some performance limitation
as many applications do
then elliptic curves tend to be a better choice.
Nadia: I also want to clarify
the other most commonly used public-key cryptosystem
is DSA, the Digital Signature Algorithm.
There's also elliptic curve DSA
which is probably what you're thinking about
with elliptic curve cryptography.
DSA is in my opinion actually much worse
than RSA in terms of randomness failures.
In the paper that Dan referenced
we also talk about randomness failures in DSA
and it's horrifying.
If you ever, ever screw up randomness in a single signature,
your entire private key is compromised.
Angel: We will take the next question from microphone nr 4
to the right of the seat.
Jake: Hey guys.
Sorry I didn't mention the computing power of Bluffdale.
That said, when we want to transition from 1024 bit RSA
to something else, what do you think is a good idea?
So for example, in Tor we do use 1024 bit RSA for TLS.
laughter Yeah I know, right?
So the thing is we're working on changing these things
but what is...
I mean Zooko has this 100 year crypto project.
What is the real thing that we should,
like if we could switch tomorrow to something,
what's the useful thing that we can live with for 5 or 10 years?
Is 2048 going to be useful?
Is 4096 where we should go tomorrow?
Cause there is a trade-off between
performance and forward secrecy
and there are a lot of things to think about.
Tanja: If you tell me 5 to 10 years
I would be still comfortable with elliptic curves.
If you talk 100 years
and then there's all these quantum computers around,
then I would go for something which is worse in performance
like code-based cryptography
or for signatures hashing
but if you'd say in 5 to 10 years
I'm very comfortable recommending to you
elliptic curves with 256 bits.
Jake: Ok, thanks.
Angel: Signal angel?
*Signal Angel*: Yeah, another question from IRC.
If you avoid all the easy primes
does this shrink the search space such that
it becomes easier to crack the remaining keys?
Or asked another way:
can you quantify the ratio of easy primes versus hard primes?
Dan: Yeah, that's a good question.
The answer is: yes, it can be quantified
and once you're up to about to the 1024 bit RSA key level
then there's very very few weak primes.
So if you take a random number
and just generate a random prime,
then your chance of bumping into something weak
is so small that you just don't have to worry about it.
It is one of those much-less-frequent-than-
being-hit-by-lightning kind of events.
It is something that has been quantified,
and it was an issue for smaller RSA keys:
back when people were using, say, 512 bit RSA
it was actually a noticeable issue.
But once you're at 1024 or above then
the issue is basically gone.
It's something where you can and you should
look at exactly what the chance is of bumping into a weak prime,
but it's not realistically something
you're gonna encounter with 1024 bit RSA.
Angel: We're gonna take the next question from nr 1.
Just one notice: we don't have a lot of time left
so even though there is a talk in one hour
just ask.
Male: There is a devastating attack that can break AES,
SHA2, most probably SHA3 and DES.
It's called a biclique attack.
Are you concerned that it might break RSA
with any key size and even ECC
and even any public-key crypto?
Dan: No.
laughter
Bicliques are something which are certainly interesting
and I think about it as a way to speed up brute force
and it speeds up brute force by a noticeable factor,
often making attacks, say, twice as fast.
But with public-key crypto we have attacks which are
much much faster than brute force to begin with
so that kind of speed-up of brute force won't be competitive
with things like the number field sieve.
Angel: Next is from number 3.
Male: Me not coming from the dark side of mathematics,
how do I know that my random number generators are fine
for generating keys?
laughter
Nadia: Seed them.
Basically the things that are out there if you're using
a standard random number generator like in Linux,
actually Linux has added patches to the kernel
to fix some of the problems that we've found.
If you want to know that your keys are good:
if you generate them on a general purpose computer
after it has been running for a long time
you're probably fine.
I would not generate very important or long-term keys
on low-power hardware devices where you can't tell...
laughter
Male: Thank you.
Angel: Next question will be taken from number 4.
Male: Hi.
It's just a remark, not really a question.
I work in high performance computing.
I was at the supercomputing conference
a couple of weeks ago.
They're presenting large-scale installations
and dealing with problems of power.
They want to build petaflops and exaflops systems
that are in a range of 20 MW.
So I'm wondering what NSA is doing with 53 megawatts
in a data center because no storage takes that much capacity.
I think it's really something we should think about.
So thanks, nice talk.
Angel: Next question from...
does the signal angel have one?
*Signal angel*: Ok, another question is:
How big do you estimate the effort to
exchange keys and certificates involving 1024 bit primes
let's say worldwide at the moment?
Dan: I wasn't getting what the question was.
Could you repeat the question?
*Signal angel*: I don't really understand it myself.
laughter
Angel: Ok, let's switch to 1.
Male: My question goes back to the idea
of how can we know how good the private keys are?
Is there something like a keygen evaluation tool?
Or can the quality of a key generator be estimated
from a sample of keys?
Like a public tool that you can throw keys on
and it will tell me a bias of my keygen?
Nadia: If you go to factorable.net
my co-authors and I have made a key check tool
which will tell you if your key is in the
list of keys that we scraped off the Internet
that we were able to compromise.
That's a first check.
It's not a positive verification that your key is good
but it will tell you if it is very bad.
laughter
If you want to check your generator,
if you're worried about the factorization problems
we have fast code on the website in C
that will do the GCD calculation for you
if you just dump a gigantic collection of keys at it.
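The fast code on factorable.net uses a product-tree batch GCD; a quadratic-time pairwise version of the same idea fits in a few lines (the moduli below are toy values invented for illustration):

```python
from math import gcd

def shared_factor_check(moduli):
    """Naive pairwise GCD: report any modulus sharing a prime with
    another. The real tool does this in quasi-linear time with
    product trees; this version just shows the idea."""
    broken = {}
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            g = gcd(a, b)
            if g > 1:
                broken[a] = g
                broken[b] = g
    return broken

# two toy moduli accidentally sharing the prime 31:
print(shared_factor_check([31 * 89, 31 * 97, 7 * 11 * 13]))
# {2759: 31, 3007: 31}
```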
If you might have other problems, like you're
not sure whether your keys are factorable in ways that don't show up there,
the experiment that you might want to do is to
sort of restart the process in realistic conditions,
generate a gigantic quantity of keys,
and just check for repeats.
The sort of obvious stress testing.
If you have a device you want to boot it up in realistic conditions
and then check them
and this is painstaking work
but I don't think there's any realistic way
of doing better than that.
Or inspect the code, the obvious thing.
Make sure you're seeding everything.
Male: Ok, thank you.
Angel: Next question will be from mic number 4.
Male: Hi.
If I got your numbers correctly so you said something like
it would take 8 million machines a year to factor 1024
so I was wondering what if I would like to factor like
800 bits or 900 bits which is,
as I understand, way easier than 1024
but was never done publicly.
So are you saying that would take like a day or something?
Dan: It depends on the size of cluster you have.
The RSA 768 factorization a couple of years ago
used something on the scale of a 1500 computers
for a year.
And those were normal kind of computers,
desktop kind of computers, I think typically 2 GHz 2-core machines.
Nowadays you have some faster, say, 3 GHz 4-core machines,
and, as Tanja's mentioning, you of course want to use GPUs
to speed things up.
And there's many ways to take advantage of
computer technology moving forward.
To get careful estimates: for going from 768 to 800,
that's a short enough extrapolation
that people are pretty confident about it.
Getting to 900 it starts getting iffy what exactly the cost would be,
and at 1024 there's a fair amount of guesswork.
There is some consensus on roughly what the costs are
but it's hard to say exactly.
But again, to give you a scale of it:
you were saying like 900 or 800.
Well, 768 RSA challenge was done
and that one was 1500 machine years.
Male: Ok, thank you.
Angel: We'll take four more questions.
Signal angel first.
*Signal angel*: It's a clarification of the question I asked before.
The actual question is: how big will be the effort
to upgrade existing keys from 1024 bits to 4K or 8K bits?
Considering that the keys are currently in use
how much effort worldwide would it be to
upgrade all that to something more secure?
Tanja: I mean, for the private user if you're running
SSH on the laptop you can just generate a new key.
Doesn't take you much time.
You will notice maybe some degradation of performance
if you're running it on a netbook
but going to 2048 is not a big deal.
However, there is a recommendation of the
US government to stop using 1024 bit RSA as of 2010
and we still see it everywhere.
Apparently it is a bigger effort than just
everybody spending 10 minutes on their laptop.
Dan: Maybe part of the problem is things like
everybody says "yeah, use 2048 or bigger"
and there's some financial standard
which says you can use any key size you want
up to 1984 bits.
I have no idea how they came up with this
but then they run into some other standard
which says use 2048 and they just can't
implement it on the smartcards which only support up to 1984.
Now if you get the standards people talking to each other for several years
then they agree on using 1984.
It's just about as good as 2048
but realistically when you have these conflicts
in standards then it takes a while to work out.
And when you have a busy server
that's trying to do 2048 bit RSA
it doesn't matter what the standard says
if you got a ton of load you just can't handle that load
if you're spending a ton of time on cryptography.
And that again is where we like elliptic curves
because they give you for whatever your
performance constraints are much better security.
So if you're trying a given security level
the speed is much better.
Angel: Next question will be from number 2.
Female: So you've had two developers stand up
and ask how to verify whether or not their
key generation is correct and whether or not their
random numbers are correct.
And I think it's great that you give coding recommendations
like seed rand() but coming from the development
perspective I like to unit test my code
and specifically I like to write my unit tests
before I write my code,
it's called test-driven development.
So if I'm going about creating an algorithm to
encrypt something or whatever
what are the steps that I need to do
when I'm writing my unit test?
Has there been some kind of effort in that way
like what goes into unit test for determining that
your random number is correct?
I mean there's algorithms for determining
whether your random number generator is
operating correctly, right?
Dan: This is something where there is,
if you look at how the random number generators work,
there was just a NIST workshop
and there's some NIST standards
which are telling you
here's what to do to test your random number generators.
So you have a hardware random number generator,
some post-processing, and they say
"ok here's the standard for how to test those pieces"
Now that's not the same as testing the cryptography
that's using those pieces
and it's very helpful if there's a clear separation there.
So you have your cryptography
which is doing deterministic stuff
and says you have to have some randomness coming
from a totally separate unit where the only job
of that separate unit is do the randomness properly.
And then the NIST test will tell you
what the randomness does
and then deterministic cryptographic testing
will tell you that the other pieces are working correctly
for all sorts of inputs.
Where it goes wrong is something like the ECDSA standard
that Nadia was mentioning before
and that's one that says that
you don't just deterministically generate a signature
you make some new randomness every time you generate
a signature and that's what goes horribly wrong.
So that's something where we need new standards
which say instead of doing what ECDSA does
we have a clear separation between
the cryptography always does the same thing
making it testable, and then randomness is centralized
in one spot which hopefully the NIST standards
do a good enough job of testing.
Female: So there's something that says here's what determines...
I guess what I'm asking is do you know of any algorithms
that can be used for determining whether something
is a good random number and whether your random
number generator is generating things properly?
Tanja: Well, there are a bunch of tests for that.
There's hardware random number generators
you can run them through the diehard tests.
There's certificates by FIPS.
In general I would recommend implementing all those steps
but the smartcards that have been mentioned by Dan earlier
from the Taiwan citizen card
those are actually FIPS-certified.
So even so, these go through what is the industry standard
of doing random number generation on smartcards,
which everybody thought was good enough;
apparently it's not.
So randomness is a total mess.
I mean, after the paper by Nadia and
also when the paper by the Lausanne team came out,
there is now some more movement toward finding
better random number generators.
At this moment, well, hope for the best.
laughter
Implement the standards, yes.
If you have some source of randomness
try to stretch it with a stream cipher or something like that.
Nadia: I want to tell a little story.
After we published the paper
I heard some very interesting things from some of the vendors.
I think one of the things that makes randomness
and cryptography a difficult problem is that
this kind of issue is something that
crosses a lot of the abstraction boundaries
that you're used to in coding.
You want to have the clean unit test that you know
that this piece works, and this piece works,
and this piece works, and this piece works,
and you put them all together and it will all work flawlessly.
Somehow this kind of problem is something that
happens at the boundaries of each unit test.
We know that the operating system is like
"ok I'm getting my proper inputs from the hardware
and I'm correctly generating my randomness from there"
and then the application says
"I'm getting my randomness from the operating system,
it's good randomness"
and then the application uses the correct crypto algorithms
to then do good cryptography.
And at the very beginning of that stage
there was something broken and it translated all the way
through all of these pieces of software
that were working correctly.
Testing this holistically was the only way
to find this kind of error.
One story that I heard from a vendor was that
they ran into randomness problems.
They developed some device.
It was working perfectly
and on the assembly line they were being turned on
and run through the checks:
you boot a device up, it checks its integrity
on the assembly line, and it continues on.
If you booted them up at the exact same time
in the exact same conditions
then all of the inputs to the random number generator
would be the same and they would generate the same keys
and then these keys would be,
you know, they already generated the keys,
so they wouldn't generate them again after
they went to the consumer.
I don't know how to unit test that
except take the entire device, turn it on,
and take multiple of them and make sure the devices
as they're coming out of the production process
are working correctly.
Angel: So we will take 2 more questions.
One from number 4 first and then number 1 at the last.
Male: How good, do you think,
hardware-supported random number generators are
after what you've just said?
I think they're probably not good anymore?
Dan: Intel has their RdRand instruction in some of the newer chips
and they say that they've gone through
a reasonable number of evaluation steps.
It sounds like they're trying.
Whether they're successful is a different question.
Something I don't like about the API is
RdRand gives you no way to test the pieces.
You can only test the post-processed output.
So nobody else is able to test the actual
randomness generation part of the hardware.
You can only get a filtered version of that.
So Intel and people who are contracted by Intel
to see what's going on inside
are the only people who can run these tests.
It's not open in a way that allows the community
to see if there's further problems.
But at least they're trying
and maybe with enough effort the hardware manufacturers
will get randomness generation right.
Of course if you can generate any sort of secret once
if you have enough of a secret to start with
then you can use things like AES to
generate any number of secrets
and you can put those secrets into
any number of devices that you want.
If you use AES in counter mode with
encrypting 1, encrypting 2, encrypting 3...
you get totally unpredictable, unrelated outputs
and then use those, burn those into some hardware.
But where do you get that initial randomness from?
Are you going through a trustworthy process there?
It's a hard problem.
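Dan's suggestion is AES in counter mode; since Python's standard library has no AES, here is the same counter-mode construction sketched with SHA-256 as the PRF instead (a stand-in, not the talk's recommendation verbatim):

```python
import hashlib

def stretch(seed: bytes, n: int) -> list:
    """Derive n unrelated-looking secrets from one good seed by
    hashing it together with a counter 1, 2, 3, ...
    (counter mode, with SHA-256 standing in for AES)."""
    return [hashlib.sha256(seed + i.to_bytes(8, "big")).digest()
            for i in range(1, n + 1)]

# One trustworthy high-entropy secret can seed any number of devices:
device_keys = stretch(b"one good high-entropy master secret", 4)
assert len(set(device_keys)) == 4   # all distinct
```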
Angel: Next question from number 1.
And that's the last question.
Male: You said something about
you dropped this group-based cryptography word.
Most of the stuff I've encountered under that heading
kind of sounded like snake oil
just cause they're using non-abelian groups and stuff.
Is there anything here that isn't?
Tanja: Ok, I did not say group-based crypto.
Male: Ok.
Tanja: I said code-based cryptography.
Male: Sorry, thanks.
Tanja: So there is a class of cryptosystems
which are fine against quantum computers.
For all crypto it's always up to what we know.
The next day, there could be somebody with a
polynomial-time factoring algorithm and so on.
Now there's also a group of people doing
braid group cryptography, doing non-abelian groups,
most of these systems have been broken.
If they weren't broken by classical computers
maybe they would remain secure against quantum computers.
It's lots of "would"s there.
Yes, they might have this on their flags as well
but I wouldn't count this as a secure system.
Whereas code-based cryptography or lattice-based cryptography
are systems which should be fine.
Dan: Maybe just a URL if you're interested
in post-quantum cryptography is pqcrypto.org
and that keeps track of papers on the various types
of cryptography that should survive for 100 years
and in particular against quantum computers.
Angel: Ladies and gentleman, please give
Nadia, Tanja and Daniel a big round of applause.
Thank you so much.
applause