preroll music
Herald: Tanja Lange and djb,
or, Daniel Bernstein,
who came from elliptic curves,
are gonna talk today also about
the McEliece crypto and where that took us.
They're both on the steering committee
for PQCrypto,
and I've talked to other people
in the community
and they said,
especially about Tanja,
she is the one that brings practice
and reality into all this theoretical mess
so let's please have a hand for
Tanja Lange and Daniel Bernstein.
applause
djb: Okay. Am I on? I apparently am.
Welcome to a late night show,
with Dan and Tanja.
laughter
There's a lot of crypto talks out there
where you'll get the idea that
crypto has problems.
Crypto has massive usability problems,
has performance problems,
has pitfalls for implementors,
has crazy complexity in implementation,
stupid standards,
millions of lines of unauditable code,
and then all of these problems
are combined into a
grand unified clusterfuck
called transport layer security.
laughter, applause
But, actually, the situation is worse.
laughter
Because of what's coming, namely,
quantum computers.
These typical talks will give you the idea
that the core of crypto,
the basic cryptographic primitives
like elliptic curve crypto, ECC,
that those are just fine,
and all the problems
are integration into applications,
that's where the security problems happen.
But quantum computers will break ECC.
This has been known since the 1990s.
For people who don't recognise
the picture here,
this is a mathematician named Peter Shor,
20 years ago.
laughter, applause
20 years ago he wrote a paper saying
that a big quantum computer would be
able to factor big integers
in polynomial time.
Now, if you like doing attacks,
and you hear about something like this,
then your first thought is
"Ooh, I wanna quantum computer!"
"I want a big quantum computer
so I can run this attack!"
"And, where is it? Does anybody have one?"
"Can anybody gimme one?"
"Oh, it isn't there yet?
Well, what's the progress?"
"Are we there yet?
Can I have one, please?"
And, well, in the news, you now hear
about this D-wave quantum computer,
which, okay, there's some very good
quantum engineering
that goes into this machine.
It's pretty clear that it's quantum,
I mean, it's actually doing the quantum
stuff they say it does.
And there's some fantastic
marketing in this machine.
But, unfortunately,
it's not actually useful.
So the D-wave quantum computer
doesn't do the basic things
that we expect a universal
quantum computer to do.
It can only do very limited
kinds of quantum computation.
It cannot maintain qubits,
do error correction on qubits for a while,
hold on to them to be able to do
basic qubit computations.
It can't even do those computations,
even if it could hold on to the qubits.
It can't do Shor's algorithm.
Can't factor big numbers.
Can't do other quantum algorithms
that we care about.
Now, the D-wave company says that's
not the point,
there's some other computations
that we designed this machine to do.
But they cherry-picked the computations
for this machine, saying basically
Here is the problem of
simulating this machine
simulating the quantum
thing that it's doing,
and of course the machine is very good
at simulating itself,
by just running,
whereas, for comparison, they ran
sort of standard programs
on a normal laptop,
and the latest results say 10^8 times faster,
except, well, there's a much
more optimised program
that somebody put out last year,
which basically is the same speed
as the D-wave machine
and like ten thousand times cheaper.
So, the D-wave machine is not,
for the moment, something useful.
Lange: Well, if this makes you
kind of comfortable,
so, no Shor monster coming,
there is progress.
There will be a universal
quantum computer,
so, something that can run
Shor's algorithm,
and there's a lot of research effort
going into this,
so if you track, like, spending on crypto,
and track spending on quantum computers,
that is a lot of money.
And there's a lot of institutions
and companies
and of course governments
all over the world
building stuff.
And we do see progress, so
we're keeping track of this,
and, if you go to the Wikipedia page here,
we see over the last few years
huge progress in stability of qubits,
so these things usually
forget what they had,
they decay,
but they get more stable,
there's better error correction,
and there's better communication.
So the latest was that
even silicon-based qubits can communicate.
So something is happening.
Um, IBM has
Mark Ketchen
on the record saying
"We actually do things that make us think
like, hey, this isn't 50 years off"
"This is maybe just 10 or 15 years off.
It's within reach."
So IBM is leaning out the window
and saying, yup, we're gonna be there
pretty damn soon.
They said that in 2012, so
let's fast-forward by 10 to 15 years
so it's now 2022 or 2027
and so there is a universal
quantum computer.
The effect this has on Internet crypto
is that RSA, which is still used in the majority
of all connections today,
is broken. RSA is dead.
Discrete logs in finite fields:
you might have heard a lot about these
earlier this year with Logjam.
But Logjam just breaks the very small ones.
This is breaking any big one.
So, discrete logs in finite fields
is totally dead as well.
Elliptic curves, that I already mentioned,
ECC is dead.
So this breaks all public-key crypto
on the Internet.
There's another algorithm due to Grover,
which is somewhat better under control,
somewhat less scary for crypto,
but it still has quite an effect,
so if you believe in the security of AES
and go like, well, AES-128 seems just fine,
there has not been any
big progress in cryptanalysis,
well, Grover halves the security,
so AES 128-bit keys
are only 64 bits secure.
However, you can go to 256 bits.
djb: Okay, so, the AES situation:
not so bad,
but all this public key stuff is broken.
So what do we do?
Well, you could bury
your head in the sand,
you could hope that quantum computers
don't come along,
you could deploy
a physically secure crypto.
You could take a locked briefcase
and chain it to your wrist,
or you could do quantum key distribution.
Now these things, obviously,
are very very expensive.
This is crypto, information protection
for rich people.
Or, quantum key distribution is crypto
for even richer people.
You, obviously, aren't going to be able
to democratically give this
to everybody who has a laptop.
But, well, okay, imagine that
you have enough money,
that you don't mind this.
There's a bigger problem
with physical cryptography,
which is the security.
Now these things are always advertised
as provably secure, guaranteed secure.
Nobody can get into this briefcase!
Nobody can get into this
quantum key distribution system!
Assuming, of course, blah blah blah.
And then you look at the assumptions
and you start saying
"Wait a minute, is that actually true?"
And then it turns out that when people try
attacking these systems,
the assumptions are not true.
And the systems get broken
again and again and again
for a very reasonable attack cost.
Okay. Even if you have a system
where you think it's secure,
which nobody's managed to build yet,
even if you had that, the design of
this physical security system
is incredibly difficult to audit.
It's something where, if somebody wants
to slip in a backdoor,
it's really easy.
But there's an even worse problem
with physical cryptography,
which is that it doesn't
do very much crypto.
Typically these systems do about
the same thing that AES does.
You start with a shared secret,
and then from that shared secret
you can authenticate more secrets
that you transmit through this
physically secure communication mechanism.
For instance, quantum key distribution
starts with some way,
some pre-existing way
of authenticating.
But that's just like AES starts with
a, a secret key
which is then used to generate
more and more secret data.
And quantum key distribution says
well, it's provably secure under
certain assumptions,
as opposed to AES, which, well,
we just heard AES may be a little affected
by quantum computers, but
AES-256 is not broken.
Nobody has any idea how to break it.
So this physical cryptography
isn't actually serving
the needs that we want.
Imagine trying to distribute
operating system updates
through locked briefcases.
It's just not going to happen.
Microsoft sending out billions
of locked briefcases
to all of its customers.
If you think that's realistic,
or if you think that,
just, lasers are cool,
then there's a talk about quantum crypto
tomorrow, I believe.
Back in the real world, we obviously
need to do something better with crypto.
Lange: Alright, so,
let me take, momentarily,
a slightly more optimistic perspective.
Is there any hope? Yes.
So, the title of this talk, PQCHacks,
comes from post-quantum crypto,
and that's the term Dan coined
back in 2003
for crypto that remains secure
even if the adversary has
a quantum computer.
So you, the normal people,
me, him, the normal people,
the Internet, all the connections,
we have normal computers.
The adversary has a quantum computer.
We, today, encrypt something.
Adversary has all the time in the world
to build the quantum computer,
break us now or break us in the future.
And so this took off,
and there was the first workshop
on post-quantum crypto,
called PQCrypto 2006,
and since then there's been
a series of workshops,
which you can see here.
These are the attendees
of the last edition
which was in Waterloo.
It's more than a hundred people,
going like, yes,
this is an important topic,
let's do some research on it.
So something is happening.
If you're curious what's happening,
next there's a workshop in Japan
in February
and then in the Netherlands
in 2017,
and we also got a project funded
by the EU
to support research on post-quantum crypto.
So something is happening.
Actually, um, did I forget somebody?
Yes.
Um, the NSA is also saying,
let's do something.
So, on August 11 the NSA posted
on their Suite B... page
saying, "The Information Assurance
Director (so, IAD)
recognizes that there will be a move, in
the not distant future, to a
quantum resistant algorithm suite."
Oh my god! Somebody is doing something!
The public has realised
we have to do something!
Quick, quick,
let's put out a press release!
Um.
Looks like they kind of realised
it was embarrassing.
So, eight days later they pushed
a new release, saying
"IAD will initiate a transition to quantum
resistant algorithms in the not distant future."
So, NSA will lead us to a bright and
glorious future.
djb: America, fuck yeah.
laughter, applause
Lange: But if you thought that,
so far, this was bad,
it actually takes a lot of time
to build crypto.
whether it's, let's say, the NSA
or all the researchers.
If you have some idea of
what could be secure
against a quantum computer,
say AES-256,
the reason that you have confidence in it
is that we had more than ten years of people
trying to break it,
trying all kinds of approaches
and not getting anywhere.
It takes a lot of time to explore
the space of cryptosystems.
Once you have figured out
what could be actually doable,
you want to figure out
what the best attacks are,
you want to focus on the secure systems,
you want to figure out
how to implement them,
how to implement them securely,
that is a whole lot of work
that needs to be done
before you can say, well,
this is actually practical.
We can actually use this.
Or sometimes, this is as practical
as we can get it.
And then, you have this tiny little
crypto primitive,
and then you have to build the hooks
and connections to get it into TLS.
Then we said at the beginning
that maybe you believe
that the cute little crypto cores
are all secure
and it's just the connections
with the AES, sorry, with the TLS
and the rest of the real world that is the problem.
That still needs to be done
for post-quantum as well.
And then there are the side-channel attacks.
Uh, as an example,
ECC was introduced back in 85,
and now 30 years later,
we're seeing that ECC is actually
getting deployed on the Internet,
and now the IETF has called
for elliptic curve crypto,
because people have started saying, yes,
we would like to use this.
So we cannot wait until
there's a quantum computer
in order to start research on it.
If you remember what 85 looked like:
that's a while ago.
Now, if that sounded bad,
it's actually worse.
It's not just that, in 15 years from now,
whatever you send then will be broken.
There's some indication that some entities
might be recording your traffic.
We've given a talk like this in 2012
which was "Crypto for the Paranoid",
everybody thought it was, like,
pfft, you're crazy,
now it goes like
"well, there might be an entity"
and everybody goes like, yeah, hmm, yeah.
So, let's assume that this entity
has the records of all the connections,
and, that in ten years, you're going
to be an interesting person.
Maybe you're a politician,
maybe you've become rich,
maybe you associate with the wrong people,
or so they think,
and so somehow they go back to
the 27th of December 2015,
and figure out what you
were emailing during the talks,
because they have all the connections here
so they can go back in time
just by the metadata,
and of course the real data.
From the metadata, they can decrypt it,
because this metadata is using RSA or ECC
which are broken.
And then they get the same key
that you're currently using with the
other side to communicate.
And so they can go and decrypt this.
So it's not only that we can't wait for
quantum computers to come,
we might already be too late.
Well, on the next slide,
here's a bunch of people who are getting
together in this EU project,
and we have come up with
what we're currently thinking
are good recommendations.
We think it's necessary for people
who really care about the security
of their connections,
and when I say people I include myself,
I care about the security of
my connections,
we would like to have something
that we believe is secure
for the next hundred years.
And so we put out
a list of recommendations.
So, for symmetric, it's relatively easy.
You want something which is at least
a 256-bit key
and is sufficiently well understood
so there we have Salsa20,
and we have AES-256.
Then, for authentication in symmetric,
there, we don't actually have any decrease
from a quantum computer, because
these are already
information-theoretically secure.
So that's the nice part.
These two items are where we go like
We can do better,
we can have nice research
and get something which is even
better protected,
even faster under the new scenario,
but it's not so dire.
The bottom two, these are
the public key systems.
That's public key encryption
and public key signatures.
That's what we're going to
focus on in this talk.
So, for public key encryption,
we're recommending
the McEliece cryptosystem
with Goppa codes
for a certain parameter size,
and then for signatures,
the recommendation is to use hash-based.
djb: Okay, so...
Let me dive into
the hash-based part of this.
This is something which goes back
even further than the 80s.
It goes back to the 1970s. Lamport.
So here's a way to use hashes
to do one-time signatures.
Which are, we'll see in a few minutes,
signatures that you can use
to sign one message under a given key.
And then, well, that's a little bit
inconvenient
that you can't sign more messages
so then Merkle came along
and said, you can sign more messages
by modifying the system,
and we'll also take a look at that.
The only thing you need
for these public-key signature systems
is a good hash function.
And, okay, that's something where
historically,
some hash functions designed in like, 1991
had trouble.
But, now, we have some good hash functions
so, for instance, SHA-3 has
some great hash functions,
even SHA-2, there's no sign of trouble,
of those things being broken.
And then after these original systems
from Lamport and Merkle,
there's lots and lots of improvements,
but all of these hash-based systems,
it's really easy to
understand the security.
It's something where the basic idea
is really, really simple,
and the security analysis also ends up
being very straightforward.
You have to be careful about some things,
but, when you're careful about those,
and there's nothing subtle
about it, nothing tricky,
then we understand exactly
what security we get from these.
So let's dive into a signature scheme
that can only sign empty messages.
Now, this sounds kind of, like,
wait a minute,
what do you mean,
"can only sign empty messages?"
There's only one empty string,
and that's the only thing you can sign.
But imagine that you want
to have a panic button,
like, your revocation key,
you want to be able to say,
"Here's a message that says,
my house, my computer's been compromised,
don't trust anything from this anymore".
It's this one message
that you want to sign,
under a certain key,
and if anybody has that public key,
then they should be able to verify
that you sent that message,
and nobody should be able to forge that,
because it's really bad
if somebody can forge
your panic message.
So, being able to sign just one message
is not actually such a useless thing,
and we'll also see that it builds up
to the rest of hash-based crypto.
Okay. Let's look at
some Python stuff here,
simplesha3 you can get online;
under Ubuntu or Debian
you install python-pip and python-dev,
because it's actually a C library,
and then, pip install simplesha3,
that will give you SHA3-256,
and then to generate a keypair
in this empty-message signature system,
all you do is you make 32 bytes
of random data
and just hash it again,
just in case... urandom is not
too well-reviewed...
I mean, we should trust urandom,
but it's really cheap
to put an extra hash on it.
So that's your secret key,
it's a 32-byte hash.
And then the public key is,
hash that secret again.
So the public key is simply
a hash of the secret key.
And then return the public key
and the secret key,
of course the public key
will get published,
and the secret key you keep for yourself.
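The keypair generation just described fits in a few lines. This is a sketch written for illustration, not the slide code: it uses sha3_256 from Python's standard hashlib instead of the simplesha3 package mentioned above, and the function names are mine.

```python
import os
import hashlib

def sha3(x):
    # SHA3-256; the slides use the simplesha3 package,
    # here we use Python's standard hashlib instead
    return hashlib.sha3_256(x).digest()

def keypair():
    # 32 random bytes, hashed once more as cheap insurance
    # in case urandom's output is ever less random than hoped
    secretkey = sha3(os.urandom(32))
    # the public key is simply the hash of the secret key
    publickey = sha3(secretkey)
    return publickey, secretkey

publickey, secretkey = keypair()
assert sha3(secretkey) == publickey   # anyone holding sk can re-derive pk
```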
As an example of doing this, well, um,
you just get random-looking data
coming out:
your secret key right at the bottom,
and the public key above that.
And you can check that one
of them's a hash of the other:
if you know the secret key,
you can hash it to get the public key.
Okay, now, how do you sign a message?
Well, this is maybe sort of spelled out
in more steps than it has to be.
The API here, this is, I would say,
getting to be a pretty popular API
for signatures and for verification,
where you include the signature
and the message together,
as a signed message.
And to emphasise that, here's a returned
signed message from the sign function.
Now, the signed message
will be later on verified,
and you get a message out of it,
where the only possible message
to be signed is the empty string,
and you can see at the top there
that the sign function is checking:
if the message is anything
other than the empty string,
then you're not allowed to sign it.
Um, if you have the empty string coming in
then the signature is simply
your secret key.
So you reveal your secret key
and the whole idea of hash-based crypto
is that somebody can publicly check
the hashes of something that you reveal
to sign, in this case, the empty message.
And we'll see later how
you can use the same idea
to sign lots more.
And then, okay, verification:
you simply check the signedmessage,
which is supposed to be the secret key;
compute its hash
and see if that matches the public key.
What would somebody have to do
to attack this?
He would have to,
if you haven't actually signed a message,
the one message,
then he would have to figure out
some string that, when you hash it,
gives this public key.
And, well, that public key,
this is a preimage problem,
inverting the hash function.
The hash is supposed to be one-way,
if the input were low-entropy,
then this would be doable,
but the input was a 32-byte random string,
so nobody will be able to guess that,
or find any other string that passes this.
And then you return the message
and you can try this out
and see that it works.
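The sign/open pair just described can be sketched like this, again using hashlib's sha3_256 rather than simplesha3, with keypair repeated so the snippet stands alone; the name open_signed is mine, not the slide API.

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

def keypair():
    secretkey = sha3(os.urandom(32))
    publickey = sha3(secretkey)
    return publickey, secretkey

def sign(message, secretkey):
    # this system can sign only the empty message
    if message != b'':
        raise Exception('message must be empty')
    # the "signed message" is simply the revealed secret key
    return secretkey

def open_signed(signedmessage, publickey):
    # the revealed secret must hash to the public key
    if sha3(signedmessage) != publickey:
        raise Exception('bad signature')
    return b''   # the only message this system can carry

publickey, secretkey = keypair()
assert open_signed(sign(b'', secretkey), publickey) == b''
```

Forging without the secret key means finding a preimage of the public key, exactly the problem described above.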
Alright, let's move on.
We've managed to sign empty messages.
How about signing 0 or 1?
So now we'll make a signature system
where your key can sign 0
and your key can sign 1.
And, well, this is going
to be kind of stupid,
what you do is, you make two keys,
one of them is signing 0,
and the other one is signing 1.
You can see how complicated
this hash-based signature stuff is,
it's, okay, you put two keys together,
you make one key that will sign 0,
one key that will sign 1,
concatenate the public keys,
concatenate the secret keys,
the p0+p1, s0+s1,
and then if you want to sign, then,
well, if you're signing 0,
now the messages being signed
are integers
instead of the empty string,
if you want to sign 0,
then the signed message will be
the string "0", followed by the 32 bytes,
well, this is again more complicated
than it has to be,
but think of this as
signing the empty message
with the empty-message signature system,
which is just copying the 32 bytes
of the secret key.
And then if you want to sign message 1,
then you do that with the other 32 bytes
of the secret key.
And then, to verify it,
well, depending on whether the
signed message is 0 or 1,
just do the obvious thing.
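A sketch of this 0-or-1 system, in the same hashlib-based style as before (my own function names, not the slide code):

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

def keypair():
    # one 32-byte secret for signing 0, another for signing 1
    s0, s1 = sha3(os.urandom(32)), sha3(os.urandom(32))
    # public key: the two hashes, concatenated; secret key likewise
    return sha3(s0) + sha3(s1), s0 + s1

def sign(m, secretkey):
    if m not in (0, 1):
        raise Exception('message must be 0 or 1')
    # reveal only the half of the secret key matching the bit
    return str(m).encode() + secretkey[32 * m:32 * m + 32]

def open_signed(signedmessage, publickey):
    m = int(signedmessage[:1])
    if sha3(signedmessage[1:]) != publickey[32 * m:32 * m + 32]:
        raise Exception('bad signature')
    return m

publickey, secretkey = keypair()
assert open_signed(sign(1, secretkey), publickey) == 1
```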
Maybe I should emphasise this code
was written just for this talk,
it has not been reviewed,
and you should audit everything.
You know, you might feel like
six lines of code,
you can't possibly screw it up,
but, don't ever
use code like that in crypto.
So, this is just meant for
you to understand
what this is supposed to be doing,
this has not passed review.
But, I like to think it would pass review.
Um, okay, if you try
signing the 1 message, for example,
and take the signed 1 message
and open that signed message,
you do get the integer 1 back.
And if you try forging it,
you're again faced with this problem
of coming up with some 32-byte string
that hashes to a particular result.
Alright, let's move on to 4-bit messages!
So, I promise I won't do
every possible length.
But, 4 bits, this will make
an important point
that you don't see from 1 bit.
Let's try to sign a 4-bit message
by signing each of the four bits.
This all scales up to signing 1000 bits
if you want.
Um, so, okay, let's make
4 sign-bit keypairs
where each of those was
two 32-byte hashes,
I mean, each secret key is
two 32-byte random strings
and each of the public keys is the hash
of those 32-byte random strings.
Make 4 of those pairs
and concatenate them
to make some new public keys
and secret keys.
Alright, and now to sign a message, well,
you look at the message
and check,
is it an integer between 0 and 15,
and reject otherwise,
and then sign each bit of the message.
You can see m shifted right by 0, 1, 2, 3,
extract the bottom bit of each of those,
and then sign each of those bits,
and then concatenate the signatures
to get, well, signatures of the four bits
of the message.
And then verification works the way
you expect
and I'll just skip that.
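A sketch of the 4-bit scheme; unlike the slide API, this passes message and signature separately, purely for brevity, and the names are mine:

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

def keypair():
    # four one-bit keypairs side by side: for bit i of the message,
    # secret[64*i:64*i+32] signs a 0 and secret[64*i+32:64*i+64] signs a 1
    secret = os.urandom(4 * 64)
    public = b''.join(sha3(secret[32 * j:32 * j + 32]) for j in range(8))
    return public, secret

def sign(m, secret):
    if not (0 <= m <= 15):
        raise Exception('message must be an integer between 0 and 15')
    sig = b''
    for i in range(4):
        bit = (m >> i) & 1          # i-th bit, m shifted right by i
        pos = 64 * i + 32 * bit     # reveal the matching secret half
        sig += secret[pos:pos + 32]
    return sig

def verify(m, sig, public):
    for i in range(4):
        bit = (m >> i) & 1
        pos = 64 * i + 32 * bit
        if sha3(sig[32 * i:32 * i + 32]) != public[pos:pos + 32]:
            raise Exception('bad signature')
    return m

public, secret = keypair()
assert verify(11, sign(11, secret), public) == 11
```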
Alright, this has a problem.
That if you use this signature system
to sign two different messages,
then you might actually allow forgeries.
So let's look, for example,
suppose you sign 11 and you sign 7.
Now what is that signature of 11,
11, uh-oh, I have to do 11 in binary now,
so 11 sounds like 8+2+1.
And you sign 7, which is 4+2+1.
So what have you revealed now?
You reveal the signature on that 8,
so the 3rd bit you revealed
a signature saying, "the 3rd bit is 1"
as part of the signature of 11.
But as part of the signature of 7,
you revealed, you signed a message
saying "the 3rd bit is 0".
And now you can just mix and match
those signatures,
wherever the bits were different.
So for instance if you take the top bit
from the 11
and the bottom 3 bits from the 7
then you end up signing 15.
So this signature system,
it's totally safe
if you're signing one message.
But if you think about the data flow,
what does it mean
to sign the individual bits
of two different messages,
then you can get in big trouble.
So this is a one-time signature system.
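The mix-and-match forgery can be demonstrated concretely. A sketch, reusing the same bit-by-bit 4-bit layout (my own compact variant, not the slide code):

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

# same bit-by-bit 4-bit scheme as above
def keypair():
    secret = os.urandom(256)
    public = b''.join(sha3(secret[32 * j:32 * j + 32]) for j in range(8))
    return public, secret

def sign(m, secret):
    return b''.join(secret[64 * i + 32 * ((m >> i) & 1):][:32]
                    for i in range(4))

def verify(m, sig, public):
    return all(sha3(sig[32 * i:32 * i + 32])
               == public[64 * i + 32 * ((m >> i) & 1):][:32]
               for i in range(4))

public, secret = keypair()
sig11 = sign(11, secret)   # 11 = 8+2+1: bits (low to high) 1,1,0,1
sig7 = sign(7, secret)     #  7 = 4+2+1: bits (low to high) 1,1,1,0

# mix and match: bottom three bit-signatures from the 7,
# top bit-signature from the 11; this forges 15 = 8+4+2+1
forged = sig7[:96] + sig11[96:]
assert verify(15, forged, public)   # accepted, though 15 was never signed
```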
Alright. Here's how Lamport's
signature system
works for one-time signatures
of any length of message.
First of all, you scale that 4 bits
up to 256 bits.
And then, if you want to sign
whatever length of message,
you just hash the message to 256 bits.
And the code for it is very simple.
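A sketch of the full scheme, signing the SHA3-256 hash of an arbitrary-length message bit by bit (again hashlib instead of simplesha3, and my own function names):

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

BITS = 256   # sign the SHA3-256 hash of the message, bit by bit

def keypair():
    # 256 one-bit keypairs: two 32-byte secrets per message bit
    secret = os.urandom(BITS * 64)
    public = b''.join(sha3(secret[32 * j:32 * j + 32])
                      for j in range(2 * BITS))
    return public, secret

def sign(message, secret):
    h = sha3(message)   # any message length: hash down to 256 bits
    sig = b''
    for i in range(BITS):
        bit = (h[i // 8] >> (i % 8)) & 1
        sig += secret[64 * i + 32 * bit:][:32]
    return sig

def verify(message, sig, public):
    h = sha3(message)
    for i in range(BITS):
        bit = (h[i // 8] >> (i % 8)) & 1
        if sha3(sig[32 * i:32 * i + 32]) != public[64 * i + 32 * bit:][:32]:
            return False
    return True

public, secret = keypair()
assert verify(b'hello', sign(b'hello', secret), public)
```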
This is not quite the state of the art
for one-time signatures,
there's fancier things you can do,
you can sign with Winternitz signatures
and, instead of something
like 256 or 512 hash values,
compress that down to like
50 or even fewer hash values.
There's all sorts of tricks to
trade space for time
in these systems
but it's totally feasible to deploy
Lamport signatures
and, well, Winternitz makes it
even more practical.
Alright. What about this
one-time restriction?
So these last, the 4-bit,
and the Lamport bigger message system,
these are only usable for,
you can only use your key
to sign one message.
And this was fixed by Merkle.
What Merkle said was,
you take 8 Lamport keys, for example 8,
it can be any number,
here's how we'll make a public key
that can sign 8 different messages.
You make, well, 8 different Lamport keys
and you concatenate them together
and you use each one just once.
But it's actually more space-efficient
than that sounds.
So here's what you do.
You make 8 Ss there, those are the secret
Lamport keys,
that are able to each sign one message,
and then you have 8
corresponding public keys,
P1 through P8,
and then you hash together in a tree,
you hash together P1 and P2,
and P3 and P4
to form P9 and P10,
and hash those together to get P13,
and same over here to get
a final public key P15.
So just one hash value, that's your whole
public key.
And then you have to remember
lots of secrets,
but, okay, nobody has to see
those secrets,
you just keep them on your computer.
And now what does it look like to,
to hash, sorry, to sign one message?
Well, here's what it looks like signing
the first message.
That's when you use S1.
Once you use S1, you cross it out,
you never use it again,
you kill the secret.
You sign your message with S1,
and somebody can verify it,
if they see the public key P1,
using Lamport signature verification.
Or whatever one-time signature system
you put at the top.
But, well, your public key
in Merkle systems is P15.
And how does somebody verify that
P1 and P15 are related?
Well, you show them the P2 and the P10
and the P14
that they need to hash together with P1
to get your public key, P15.
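The tree hashing and the authentication path can be sketched as follows, using random 32-byte stand-ins for the one-time public keys P1 through P8; this shows the structure only, not a full signature scheme:

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

# stand-ins for the one-time public keys P1..P8 at the leaves
leaves = [sha3(os.urandom(32)) for _ in range(8)]

def merkle_root(nodes):
    # hash pairs together level by level: P9, P10, ... up to P15
    while len(nodes) > 1:
        nodes = [sha3(nodes[i] + nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

def auth_path(nodes, index):
    # the sibling hashes a verifier needs to link one leaf to the root
    path = []
    while len(nodes) > 1:
        path.append(nodes[index ^ 1])   # sibling at this level
        nodes = [sha3(nodes[i] + nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
        index //= 2
    return path

def verify_path(leaf, index, path, root):
    node = leaf
    for sibling in path:
        # concatenation order depends on which side this node is on
        node = sha3(sibling + node) if index & 1 else sha3(node + sibling)
        index //= 2
    return node == root

root = merkle_root(leaves)    # the whole public key: one hash value
path = auth_path(leaves, 0)   # e.g. P2, P10, P14, to prove P1 belongs
assert verify_path(leaves[0], 0, path, root)
```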
Alright, and that's as complicated
as the Merkle signature system gets,
if you want to be able to sign
a million messages,
you have to have a million secrets,
but, again, just put it on
your local hard drive,
you're not worried about the space.
It takes a few moments to generate the key,
but that's also not a problem.
Okay.
Good things about hash-based signatures,
and a few bad things.
Good things: It's post-quantum secure,
we totally understand what hash function
security looks like,
after quantum computers
and before quantum computers,
all you need is a good hash function,
and there's lots of hash functions
which have been thoroughly studied,
so, we're confident they're secure,
SHA-3 was the result of a long competition
with lots of people bashing on it,
and really has no problems.
The public key is very small,
just one hash value.
The security, as I said,
is well-understood.
All the computations are very fast,
and you can already find
standard proposals
for this system.
This is going to be the first standardised
and the first deployed
post-quantum public key system.
On the other hand,
if you look at the signature size,
it's kind of big,
and then the more scary thing
is the statefulness.
That you can only use S1 once,
and then you have to cross it out.
You can only use S2 once, and you have to
cross it out.
What if you clone your VM?
What if you have a backup and restore?
Then, you've forgotten that you've
used S2.
And then you use it again,
and then somebody can forge messages,
just like I was saying before.
Alright, so state is a problem,
actually, some of you, I'm sure, were
in Rutkowska's talk earlier today,
where you also heard that
state is harmful.
And the solution:
Eliminate the state!
applause
I think I only have
about three minutes left
for this, this section, well,
this slide,
but, let me try to briefly say, the idea
for getting rid of the state
is, you have, instead of signatures,
you have certificate chains.
So you have a whole chain of CAs;
like, as a signer, you build
a whole bunch of CAs,
you build, say, 2^256
certificate authorities.
Now, that's too much computation to do,
but you do it on the fly,
as you need them.
So you say, I'm going to sign
using one of these bottom-level
certificate authorities,
that will sign my actual message.
And now, you don't have to know
anything about any of the others,
you just pick one and use that to
sign your message.
And then it has to be signed
by the parent CA,
and that's signed by the next parent,
and so on, up the tree, to the top level,
only 256 levels.
And then, okay, you have your
certificate chain,
how do you manufacture these
certificate authorities?
Well, a certificate authority is just
some bytes of code on disk,
that's what real CAs are like,
and you do the same thing here,
and, those bytes of code, you can,
instead of storing the CA as a program
in a configuration file on disk,
you just generate it on the fly
when you need it,
by taking the position of the CA
within this huge tree
and then hashing that together with
some long-term secret.
That one long-term secret
is the only thing you remember.
So you can manufacture CAs on demand
in some huge tree
and have them sign
certificates for each other
only looking at a few CAs that you need
for the particular signature
that you want.
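A sketch of that on-demand derivation; real designs (SPHINCS, for instance) use more careful keyed constructions, so treat this purely as an illustration of the idea:

```python
import os
import hashlib

def sha3(x):
    return hashlib.sha3_256(x).digest()

longterm_secret = os.urandom(32)   # the only thing the signer stores

def ca_secret(position):
    # regenerate the one-time secret key of the CA at `position`
    # in the huge tree, on demand, instead of ever storing it:
    # hash the position together with the long-term secret
    return sha3(longterm_secret + position.to_bytes(32, 'big'))

# the same position always gives back the same CA key,
# and different positions give independent-looking keys
assert ca_secret(5) == ca_secret(5)
assert ca_secret(5) != ca_secret(6)
```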
The reason for having this big tree
is that then you're never going
to randomly use the same CA
at the bottom twice.
So the problem of having
one-time signatures
is no longer a problem.
Each CA will only sign one message.
Okay, and this is something where,
with the original proposal
from Goldreich in 87,
if you use good one-time signatures,
Winternitz and all that stuff,
you get something like 0.6 megs
for a signature.
Now that's a little bit inconvenient,
for comparison, if you do an OS update,
you look at the average
Debian package size,
then it's 1.2 megs.
And then, there is some
number of signatures
with each update, and some
of them are in packages,
and well, it's not inconceivable
to send 0.6 megs for each signature,
but it's kind of big.
And then if you look at, well,
web pages, say the Alexa top
million web pages,
the average is 1.8 megs.
And again there are several
signatures on a web page,
depending on how many sites are providing
graphics for it and so on.
So, 0.6 megs is maybe a problem.
But, okay, we took a look at this and
a bunch of people
made the signature a lot smaller,
0.041 megabytes.
You know how to make a 41-kilobyte
signature sound small:
only 0.041 megabytes for a SPHINCS 2^128
post-quantum secure signature system.
If you're interested in more about
what SPHINCS does,
go to sphincs.cr.yp.to
Lange: Alright, so.
Now we have some idea of
how we can do signatures.
And signatures are just the thing
that quantum crypto really,
really can't do;
I mean, also, locked briefcases,
how do you trust that this is
actually your locked briefcase?
But also, public key crypto is a problem.
So, the one that we recommend
is code-based cryptography,
and code, in this context, is not like
code as in writing a program
but it comes from error-correcting codes.
So when you think about,
say, your computer,
you have RAM in there
and this RAM might get cosmic radiation
or just a bump somewhere,
might forget something.
Or, a more trivial example,
if you order a book, or, nowadays,
order any media,
it has an ISBN number.
The last digit is dependent on
the previous digits.
So they can figure out whether any
of the digits before
was mistransmitted
and then go like, hmm,
that number doesn't exist, try again.
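The ISBN-10 variant of that check fits in a few lines; a minimal sketch, where the sample numbers are just a well-known valid ISBN and a corrupted copy of it:

```python
# ISBN-10 check: the weighted sum of all ten digits must be 0 mod 11.
# Weights run from 10 down to 1, and the last digit is the check digit
# (written as 'X' when its value is 10).

def isbn10_ok(isbn: str) -> bool:
    digits = [10 if c in "Xx" else int(c) for c in isbn if c not in "- "]
    if len(digits) != 10:
        return False
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0

print(isbn10_ok("0-306-40615-2"))  # valid ISBN -> True
print(isbn10_ok("0-306-40615-3"))  # last digit mistransmitted -> False
```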
With your ECC RAM it's actually
a little more sophisticated.
So you have your RAM sitting there,
and it stores 64 bits.
But it stores these 64 bits
with some redundancy.
Internally, it has some extra check bits.
It stores 8 extra bits
and those 8 extra bits allow you
to recover the 64
that you're interested in
in case there was some error.
Not in case of a massive error,
not in case somebody
took a screwdriver and hit it,
but if there was just one random bit flip,
you can recover it.
Also, if two bits flip,
you can at least figure out
that there was something
and raise a flag.
So the common scenario in error correction
is that we have a certain number, say, k
bits that we care about
and then in order to be able
to recover those,
to correct those,
we add some redundancy.
So we encapsulate, we encode those k bits
into n bits.
Now, we would like to have those n bits
be not too much larger than k.
We call those the parity check,
so this is like from the old days
where you check
"those two add up to zero,
zero zero zero...
oops! there's one, aha, there must
have been one bit flip."
So parity as in, it has to be
an even number at the end.
If you're talking about more positions
then it's obviously more
than just the parity
but it's parity of some
more complicated equations.
So, if no error occurred, if those 64 bits
in your ECC RAM are just fine,
then all those extra n-k
checks will be okay.
If there was an error,
then something will fail,
raise a flag,
1, 2, 3 of those will not be satisfied,
and depending on this error pattern,
you will be able to recover what
was going wrong in those bits.
It might actually be that
it was your parity check bits.
It might be that one of
those 8 extra bits flipped.
In which case, your 64 bits were just fine
but you don't know this when
you get the error message.
And, if you have a good code,
so the kind of code
that coding theorists study,
then you would like to have
your k pretty large, and the n
not too much larger.
Because that's the amount of storage
you actually have to afford
for just getting this much data out of it.
Now, we get some guarantees
when we design these,
there's a guarantee of correcting t errors,
but for most of the codes that we know,
the guarantees are actually worse
than we get in practice,
so if something guarantees you 100 errors,
most of the time,
you can actually correct 203.
Um, to get a little bit further
we actually need to, um, represent
those equations with a matrix,
um, not quite this one, sorry.
So, here's our equations.
So, small example,
we would like to encode 4 bits,
k is 4,
and we're adding 3 extra bits.
That's not very efficient,
but it's a very nice example
that one can see.
So, those rows there are
our parity equations.
So let's assume we have those 7 bits,
which add redundancy to it,
then let's translate this first row,
which is 1 0 0 1 1 0 1,
that means we take the first bit,
skip the second and third,
so we have b0,
and then, bit 3, then the next bit is set
so bit 4, and then bit 6.
Second row, similarly,
we skip the first one,
so there's no bit 0, there's a bit 1,
no bit 2, there's a bit 3,
it was a 1 1 0 column,
and then bit 5 and bit 6,
and similarly for the third row.
So we have a nice diagonal
on the left-hand side,
and then the rest is determined
by these equations.
Now let's assume that
something went wrong.
So we have our 7 bits here,
and a few hours later we come back
and want to look at those 7 bits,
we check whether anything went wrong,
we run through this parity check program,
and we actually get a failure pattern.
If everything was fine,
we would have gotten 0 0 0,
but we're getting 1 0 1.
So the first equation doesn't hold,
the second one does hold,
and the third one does not hold.
Okay. Where could this come from?
We're pretty sure that b1 is okay
because otherwise the second
equation would be wrong
because b1 only appears there,
and we're also making the promise
that it would be only one error.
If you have two errors, three errors,
then lots of other combinations
could occur.
But it's much more likely that
one bit flipped
than that a whole bunch
of bits flipped at once.
Okay, so, tracing this
a little bit further,
b3 would make the first two
equations fail,
but b4, yes! b4 would make
the first and the third
equations fail.
So by seeing the error pattern 1 0 1,
we know it was b4.
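This whole walkthrough fits in a few lines of Python. A sketch; note that the talk only spells out the first two rows of H, so the third row here is filled in by me so that every column is distinct and b4's column is 1 0 1 as in the example:

```python
# The talk's toy code: n = 7 bits, k = 4 data bits, 3 parity equations.
# Each row of H is one parity check; the third row is an assumption
# chosen so b4's column is (1, 0, 1), matching the failure pattern
# 1 0 1 from the talk.
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def syndrome(b):
    # Evaluate every parity equation mod 2: this is H times b over GF(2).
    return [sum(h * x for h, x in zip(row, b)) % 2 for row in H]

def correct_single_error(b):
    s = syndrome(b)
    if s == [0, 0, 0]:
        return b  # all checks pass, nothing to fix
    # For a single flip, the failure pattern equals the column of H at
    # the flipped position, so scan the 7 columns for a match.
    for i in range(7):
        if [row[i] for row in H] == s:
            return [x ^ (j == i) for j, x in enumerate(b)]
    raise ValueError("more than one bit flipped")

codeword = [1, 0, 1, 1, 0, 1, 0]       # satisfies all three checks
received = [1, 0, 1, 1, 1, 1, 0]       # same word with b4 flipped
print(syndrome(received))              # -> [1, 0, 1]
print(correct_single_error(received))  # -> [1, 0, 1, 1, 0, 1, 0]
```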
Now this is a very nice and small example,
which doesn't even cover like the ECC RAM,
but it gives you an idea of how to try it.
On the other hand, it also gives you
the idea that you need to do
a kind of brute-force search.
So for just n=7, you have to
try up to 7 times.
If you now have two errors,
you would need to try every combination
of 2 out of those n.
If I have an n which is like 2000 bits,
really long,
and I tell you there's 100 errors,
so you would need to try every combination
of 100 positions in there.
So that would be a huge number.
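To make that count concrete, Python's `math.comb` gives the exact numbers:

```python
from math import comb

# One error among n = 7 bits: at most 7 candidate positions to try.
print(comb(7, 1))  # -> 7

# 100 errors among 2000 bits: the number of candidate position sets
# is a roughly 570-bit number, far beyond any feasible search.
print(comb(2000, 100) > 2**500)  # -> True
```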
That's obviously not a good way
of error correction,
and that's certainly not how
DVDs and whatever else works.
Oh yeah, one bit of maths notation,
so, we call these things,
the parentheses up there, a matrix,
and in order to have a shorthand,
because I can't quite put my 2000 bits
times 1000 bits matrix on the page,
I call this thing H,
and boldface b is such a bit vector.
So boldface b is the
bits that I'm storing,
and the combination of
applying this matrix,
wherever there's a 1, I take the variable,
where there's a 0, I don't write it,
that I write as H times b.
So in maths, if you have seen this,
this is just a matrix times
vector multiplication,
otherwise, just take this as
the instruction of evaluating
this, each row, as a set of equations.
Alright, so, to give this some names,
in coding theory we call c the code word,
so that's an element
which is not corrupted,
which will give you 0 on all checks,
then there might be an error vector,
that's the bits that flip,
and so, when you retrieve the memory
or when you have a transmission,
we call this the received word,
and that's my b from the previous slide.
We do like to save on space,
so when there's this identity diagonal,
which has 1s down there and 0s elsewhere,
we just skip it.
So instead of writing all 7 columns,
we just write 4 columns and 3 rows.
Now there's lots of stuff
happening in coding theory,
it's a, well, 65-year-old topic,
and we can go up to very large matrices,
and for some special codes,
these are the ones that coding theorists
come up with,
we actually have efficient methods.
Much much nicer than taking
every combination of 100 positions
out of 2000 positions.
Of course, if you get too many errors,
you can't correct.
You can only correct up to
whatever the promise is,
plus maybe a little bit more.
But if you don't know
any of the structure,
if you don't know what
the coding theorists put into it,
or if this H matrix got somewhat perturbed
by me giving a slightly
different representation,
I don't call this b1 anymore,
I call it now b17,
and let's flip those and so on,
suddenly it doesn't look like
the code that you're used to.
If it's a random matrix,
the best attacks are not quite as bad
as picking 100 out of 2000,
but they are close to that,
they're pretty slow attacks,
it's an exponential-time algorithm,
if you have a random matrix.
And so what we're doing
in code-based crypto
is to use this difference for encryption.
Now going back again to the 70s,
so, basically, yes, the stuff that we're
really confident in
that it will last another 100 years,
is the stuff from the 70s that lots of
people have looked at,
so, McEliece in 1978,
so just the year after RSA,
came up with a system
which is using encoding as encryption.
So your method is,
you just encode a vector, your message,
and then you add some errors to it.
And, if the other side doesn't have
an efficient algorithm to decode,
they actually can't break it.
They're using a special code in there,
so there's a code from Goppa
from a few years earlier than that,
and that code, so far, nobody has found
any way to take the perturbed,
this complicated way of representing it,
and come up with an
efficient decoding algorithm.
1986, Niederreiter came up with
a version of McEliece
which is smaller,
the code size is smaller,
the public key size is smaller,
and we have similar things to
what you've seen before,
so we have this H matrix,
we skip the diagonal,
we just represent this H as our public key
as the remaining part of the matrix,
the secret key, that's the algorithm
that only I know.
It's a Goppa code,
but there are many many Goppa codes
of the same size.
So that's something that Goppa says,
well, if you want to have Goppa codes
with 2000 bits
that can correct 200 errors,
here's how you set it up,
but there's lots and lots and lots
of choices in there.
And your encryption is just,
take an error vector
with a fixed number of bits set,
and send me the syndrome of it,
so the outcome of multiplying H by this e.
And then we want to use this in crypto,
so we use the hash of this e to encrypt.
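A toy sketch of that flow, reusing the tiny 7-bit code from before with t = 1 error standing in for a real hidden Goppa code. This only illustrates the data flow, it is not a secure system, and the function names are made up for illustration:

```python
from hashlib import sha256
import secrets

# Public key: the parity-check matrix H (toy 3x7 example, t = 1).
# A real Niederreiter system hides a Goppa code with n around 2000+
# bits and t around 100 errors; the brute-force loop below stands in
# for the legitimate user's secret, efficient decoder.
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def syndrome(e):
    # H times e over GF(2).
    return [sum(h * x for h, x in zip(row, e)) % 2 for row in H]

def encapsulate():
    # Sender: pick a random weight-1 error vector e, transmit H*e,
    # keep hash(e) as the shared secret key.
    i = secrets.randbelow(7)
    e = [int(j == i) for j in range(7)]
    return syndrome(e), sha256(bytes(e)).digest()

def decapsulate(s):
    # Receiver: recover e from its syndrome, here by trying all 7
    # weight-1 vectors (feasible only because this toy has t = 1).
    for i in range(7):
        e = [int(j == i) for j in range(7)]
        if syndrome(e) == s:
            return sha256(bytes(e)).digest()
    raise ValueError("decoding failed")

sent_syndrome, key_sender = encapsulate()
assert decapsulate(sent_syndrome) == key_sender  # both sides share the key
```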
Um, then Peter Schwabe and
Tung Chou, our PhD student,
wrote a very efficient
implementation of this
called McBits,
so if you want to have more detail
on how you could really
use this in practice,
then go to that url.
Um, but let me talk a little bit
about why we are confident in this.
Code-based crypto is less obvious
than hash-based,
I mean with hash-based it's like,
the, all we're doing is, hash.
A hash function by definition is
something which is one-way.
For this code-based crypto,
you need to think a little more,
to have people studying it
to figure out why this
actually can be secure.
Now, the attacks, if I may say so,
started before the cryptosystem
was proposed,
so it was another hard problem
that coding theorists were
studying naturally,
and then McEliece said, hey, we have
this hard problem here,
decoding a random code is not easy,
well, actually, afterwards, figuring out,
well, if you have a really random code,
even finding out whether there is
a low-weight code word is NP-hard.
And then, once it was a cryptosystem,
so, Omura, Lee-Brickell, all of those,
those come after
it was proposed for crypto.
There's some which have an extra
parenthetic comment,
those are papers which study
the security of McEliece
if the attacker has a quantum computer,
so there's been lots and lots of work
for studying it
against a classical computer,
and there's been some work
studying it against quantum computers.
It's pretty clear that we can't use Shor,
but we can use Grover,
and it's not immediately obvious
how much Grover can give us.
However, the best that Grover can do
is basically halve the bit size.
So we had this AES-128,
leading to 64-bit security,
so if you're conservative,
if you want to be like really really
on the secure side,
then let's assume that we want to go for
pre-quantum security at least 256
in order to get post-quantum security 128.
So here are some key sizes,
so if you're okay with 1000 kilobytes,
then you're getting something
which is at least 131 post-quantum secure,
and that's actually very conservative,
most likely we can go down a lot
on the key size.
But that's where more research is needed.
If you need something where you can get
a guarantee of 100 years security,
take this one,
it's not small,
it's fast, but it's not small,
and hopefully in 5 years,
less than 5 years,
they can give you something
which is smaller
and still as secure.
djb: Okay.
There are lots of possibilities for
what the smaller things might be,
for instance, there's something
called QC-MDPC,
there's actually lots and lots
of variations of McEliece
that people have proposed,
and some of those variations have held up,
some of them haven't.
QC-MDPC is something that has
held up so far,
but how many people've looked at it,
and what's going to happen when
more attacks get optimised?
You can't be trusting something which
is new or hasn't been studied enough,
or hasn't been studied
in the right directions,
you also have to be sceptical
about crypto.
There's, well, lots of proposals,
QC-MDPC looks like one of the most
interesting ones,
that nobody's been able to damage
its security,
for 2^80 pre-quantum security,
which is not very good,
but, just as an example,
they have 0.6 kilobytes, or 600 bytes,
for the public key,
and that's a very reasonable size
for a public key,
not as small as ECC
but it's quite tolerable.
Um, lattice-based crypto,
there's NTRU, for example,
that's been around since the 1990s,
and maybe that's enough time to start
getting confident,
but, well, again, there's
lattice-based systems
that get proposed and broken,
for instance, there's a system from
Smart-Vercauteren in 2009,
that was, it's not broken pre-quantum,
but it was shown in 2014-2015,
some follow-up papers,
to be broken in polynomial time
by a quantum computer.
So, it's a lattice-based system which
sounded good at the beginning,
but is definitely not going to be part of
post-quantum cryptography.
There's multivariate-quadratic,
now those have
the interesting feature that
the signatures they provide
can be very very small,
like, you can have 100-bit signatures
and still get some reasonable security
that nobody knows how to break these,
well, there's lots and lots
of these proposals
and some of them are broken,
some of them...
HFEv-, that's a multivariate
quadratic system
that's been unbroken for 20 years.
Maybe it will continue holding up,
but, well, how many people have looked?
You have to make sure enough people
look at these things.
The reason we like these
simple systems from the 70s is
enough people have looked
but maybe if you've got more serious
performance constraints
you should look at these other things,
and of course we are looking at
these other things,
trying to figure out what's secure,
and how well we can actually, er,
choose among all these
different possibilities.
Something else, just last mention here,
is isogeny-based,
this is for people who like
elliptic curves,
that there is maybe a way to use
elliptic curves post-quantum,
and this has the interesting feature
that it does exactly the same data flows
as Diffie-Hellman.
So everything you're used to doing
with Diffie-Hellman,
and having, like, a secret key over here
and a public key over there,
agree on a shared secret,
that you can also do with
isogeny-based systems,
but only a few people
have studied the security
and maybe this'll also get broken.
So, lots more work,
lots more possibilities.
Last slide, if you're interested
in more information
here are a few places to look,
first of all you can go to pqcrypto.org,
so this is our first survey site
for post-quantum cryptography,
and the biggest chunk of information there
is bibliography, and if anybody has
newer papers,
if you happen to write papers on
post-quantum crypto,
and you'd like us to add those,
then just let us know,
um, and then, well,
we also have on the front page
things like pointers to all
the PQCrypto conferences,
that's the main place to look,
go to pqcrypto.org and follow links to,
for instance, the February
Japan conference.
Then pqcrypto.eu.org, that's the main page
for the EU project that Tanja mentioned,
which is putting out as it progresses,
well, it just started this year, but
it's putting out, soon, free libraries!
Software libraries, for actually doing
post-quantum cryptography.
And benchmarking results,
so you can actually say
"Yeah, I've got the following performance
constraints here,
that's what I'm able to use."
And then in 2017, there's going to be
a workshop, and a summer school, maybe
it'll be a spring school, we'll see,
if you're interested in the Twitter feed
for that, then twitter.com/pqc_eu
and final resource on the slide is you.
You have the responsibility
if you care about crypto,
then you have to get used to this stuff.
You have to learn about hash-based,
code-based, maybe lattice-based,
multivariate quadratics,
these are the things which will be
the future of crypto.
In the future, all of crypto will be
post-quantum crypto,
because eventually the attackers
will have quantum computers.
If you want to adapt to that future,
then, well, take a look at these systems
and see, maybe find improvements,
then, cool, then let us know,
and, you know, publish papers,
integrate them into real applications,
networking applications,
implement the things
that aren't implemented,
speed them up,
there's lots of work that
still has to be done.
That's it, thank you for your attention.
applause
Herald: Wow. Now we'll have some
short questions please.
Because we're a bit late on time.
Questioners, go around ahead,
there's a mike there,
talk into it please.
Can we have mike 1?
Q: Oh, okay. Uh, quick question.
So there was this result of
Laszlo Babai that there's a,
that graph isomorphism has a
quasipolynomial time algorithm,
um, not really knowing the subject at all,
there's some very, very
superficial similarities,
like, that is a hidden subgroup
problem, basically.
Um, is there going to be any, like,
is there any indication that the methods
he used in that proof
are relevant to breaking, like,
to weakening NTRU or things like this?
djb: That's a good question. So, the, um,
the graph isomorphism advance now,
is basically saying,
so, graph isomorphism is a problem
which was famous as being
not known to be solvable in polynomial
or even, like you said,
quasipolynomial time.
Um, except there were some really good
software algorithms
which would solve most cases of it,
really quickly.
There were just some really weird,
highly symmetric cases,
that were hard to solve
and that's what Babai managed
to completely kill now,
so he's handled all the cases.
But we try to stay away from
problems like that in crypto,
so, an example that's very closely
analogous to that is
what's called support splitting,
which is a certain type of attack strategy
against code-based crypto
if somebody gives away lots of information
from the legitimate user.
And that's something where the
support-splitting algorithm
works in most cases,
but it kind of fails in some corner cases,
and, well, we try to stay away from
thinking about this anyway,
because you just don't want to give
somebody that extra information,
but if you did, then maybe Babai's
technique could be useful there.
Herald: Thank you. This talk is not over.
If you're leaving, which is okay,
please be silent.
Next question, number 3, no, number 1.
Q: Uh, hello, thank you for the talk.
Um, how are the chances to have something
like forward secrecy with that?
I recognise the last algorithm
has a chance
to reuse Diffie-Hellman,
that's possibly the only one?
Lange: So, if you think that
forward secrecy means Diffie-Hellman,
you can have forward secrecy
from normal encryption algorithms,
at the expense of generating
a new key each time.
You can have forward secrecy with RSA,
if Alice talking to Bob starts by saying
"Okay, this is my one-time RSA key,
and I send this to Bob,
with a request to encrypt to this key."
If Alice never reuses this key,
then this method is forward-secure.
And similar to how you can
do this with RSA keys,
you can also do this with McEliece keys.
At the moment, there is a difficulty
that the keys are very large.
It is inconvenient when you want
to start talking,
"Hey, I'm a client, hello server,
please talk to me",
and the first thing you have to do
is transmit a megabyte of key.
On the other hand, you can do it.
It just requires you to engineer
your protocol to expect,
yes, you have to send a few packets.
And then it has all the
forward secrecy that you want.
Q: But, uh, a way without
transferring the key,
like Diffie-Hellman?
Lange: But, Diffie-Hellman
is transferring a key.
Q: Okay.
Lange: I mean, you're basically
transferring
the first part of a discrete log pair.
If Alice sends a*p, and there is
some one-time key,
she's sending a public key.
It's just that the method of how
those two public keys interact
is slightly different from how
RSA encryption works.
Q: Okay, thank you.
Herald: Thank you. Do we have
an Internet message? Question?
Signal Angel: Actually, we do. There are
two questions that are somehow related.
The first one is:
Given that there is no actual working
quantum computer,
how do you start developing
a crypto algorithm?
Where do you start, how do you design it,
how do you test it? Is there a way
to prove that it's secure?
And the second question is related:
The whole thing is based on the property
of the hash functions being one-way,
how does one know that there is no
quantum algorithm
that breaks this property?
Can you prove this?
djb: Okay. For both of these questions,
the technique that the community
goes through,
that we all go through,
is cryptanalysis.
So we have as many smart people as
we can find
focusing on these problems and saying
"Can you find an algorithm which is
breaking these problems
better than any previous algorithms can?"
and we put as many incentives as we can
so that we try to get as many smart people
to stay ahead of the bad guys,
and hopefully find the best algorithms.
But there's no guarantees in this.
And you do always have to be sceptical
about whether people have
actually looked at, for instance
quantum algorithms to attack systems.
And there is that extra difficulty
that the first part of the question,
at the beginning, was saying,
that we don't have a quantum computer.
So if we're trying to verify quantum
algorithms that we're developing,
we don't get to experiment with them.
That's the usual procedure for
making sure your algorithms work.
In state-of-the-art cryptanalysis,
like the number field sieve for factoring,
that does not have a proof that
it works at any particular speed,
well, at the speed that we think;
it works out from experiment.
So experiments are really important,
because we don't have proofs
for state-of-the-art cryptanalysis,
and that's something where it's actually
really tough for quantum computers.
Of course eventually we'll all have
quantum computers,
and there's ideas for simulating
quantum algorithms
which have had some success
at verifying that algorithms work
even though we can't actually
run them yet.
That we're actually able to verify
a simulation of those algorithms.
Lange: Let me do a second part of this.
Um, when we use quantum cryptanalysis,
for estimates, we usually go
for the worst-case estimate.
So we say, well, McEliece, at worst,
gets broken to half the security level.
Most likely it won't be that fast,
but we're staying on the side
where we're very confident.
Um, if I understood correctly
there was also the part of the question,
how can we test the algorithms?
If this is for
the constructive algorithms,
then all the algorithms we analyse,
all the algorithms we propose
for post-quantum crypto,
are algorithms that you can run on
your normal computer.
So, you can test those, you can run those,
we have benchmarking numbers from those,
on our current hardware.
Herald: We'll do two more questions,
please. Number 1.
Q: Yeah, I got a question on
the practicality of the attacks.
So, um, if we assume there is
a quantum computer,
how much time will it take in practice,
order of magnitude,
to break, let's say, RSA 2048-bit key?
Is it on the order of hours,
weeks, months, years?
Lange: snaps fingers Done.
Q: Okay, thanks.
laughter
djb: That was easy!
Herald: Number 3.
Q: Okay.
Herald: Talk into the mike, please.
Q: Thank you. So, it's very,
very nice to have
both quantum encryption and signing,
but do you know anything about
some other cryptographic primitives,
such as zero-knowledge proofs?
Lange: Well, I mean, zero-knowledge proofs
are basically signatures
which are not interactive.
So if you have something which is, um,
a primitive for a signature is usually
very closely related
to zero-knowledge proofs.
So, there is work going on,
we are focusing on
the most important things
that we see on the Internet,
but, that shouldn't mean that people
shouldn't do research on it.
Please do research on
zero-knowledge proofs!
Herald: Okay. Last question. Number 1.
Q: So, why do you put so much emphasis
on smaller key sizes and
on performance in encryption,
um, especially in a delicate topic like
post-quantum computing?
Why can't we just use 1-megabyte keys
and why can't we just use a few seconds
instead of milliseconds
to compute those?
Lange: So...
Q: What's the problem here?
Lange: We are suggesting
to use a key of 1 megabyte.
So, the recommendations that we have on
the Internet on the pqcrypto.org page
are precisely using this system
which has a 1-megabyte key.
The nice thing is, that actually
encryption and decryption
are very efficient.
But that was not our main goal,
our main goal was to get something
which is very secure,
and where we have a high confidence
that we actually understand the attack.
And then, well, once we have this system,
then you try to optimise
how to implement it,
and the implementation,
when we looked at the numbers
is actually faster than
elliptic curve implementations
except for the size.
So, you get a very nice speed,
even though it was not our
target for optimisation.
Herald: Okay, this is it.
Thank you very much,
let's have a final hand
for Tanja Lange and djb Bernstein!
applause
postroll music
Subtitles created by c3subtitles.de
in 2016. Join in and help us!