preroll music

Herald: Tanja Lange and djb, or, Daniel Bernstein, who came from the elliptic curve world, are gonna talk today also about the McEliece cryptosystem and where that takes us. They're both in the steering committee for the PQCrypto conferences, and I've talked to other people in the community and they said, especially about Tanja, she is the one that brings practice and reality into all this theoretical mess. So let's please have a hand for Tanja Lange and Daniel Bernstein.

applause

djb: Okay. Am I on? I apparently am. Welcome to a late night show, with Dan and Tanja. laughter There's a lot of crypto talks out there where you'll get the idea that crypto has problems. Crypto has massive usability problems, has performance problems, has pitfalls for implementors, has crazy complexity in implementation, stupid standards, millions of lines of unauditable code, and then all of these problems are combined into a grand unified clusterfuck called Transport Layer Security. laughter, applause But, actually, the situation is worse. laughter Because of what's coming, namely, quantum computers. These typical talks will give you the idea that the core of crypto, the basic cryptographic primitives like elliptic curve crypto, ECC, that those are just fine, and all the problems are in the integration into applications, that that's where the security problems happen. But quantum computers will break ECC. This has been known since the 1990s. For people who don't recognise the picture here, this is a mathematician named Peter Shor, 20 years ago. laughter, applause 20 years ago he wrote a paper saying that a big quantum computer would be able to factor big integers in polynomial time. Now, if you like doing attacks, and you hear about something like this, then your first thought is "Ooh, I want a quantum computer!" "I want a big quantum computer so I can run this attack!" "And, where is it? Does anybody have one?" "Can anybody gimme one?" "Oh, it isn't there yet? Well, what's the progress?" "Are we there yet? Can I have one, please?" And, well, in the news, you now hear about this D-Wave quantum computer, which, okay, there's some very good quantum engineering that goes into this machine. It's pretty clear that it's quantum, I mean, it's actually doing the quantum stuff they say it does. And there's some fantastic marketing in this machine. But, unfortunately, it's not actually useful. The D-Wave quantum computer doesn't do the basic things that we expect a universal quantum computer to do. It can only do very limited kinds of quantum computation. It cannot maintain qubits, do error correction on qubits for a while, hold on to them to be able to do basic qubit computations. It can't even do those computations, even if it could hold on to the qubits. It can't do Shor's algorithm. Can't factor big numbers. Can't do other quantum algorithms that we care about. Now, the D-Wave company says that's not the point, that there's some other computations they designed this machine to do. But they cherry-picked the computations for this machine, saying basically: here is the problem of simulating the quantum thing that this machine is doing. And of course the machine is very good at simulating itself, by just running, whereas a normal laptop, they compared to sort of standard programs on a normal laptop, and the latest result was 10^8 times faster. Except, well, there's a much more optimised program that somebody put out last year, which is basically the same speed as the D-Wave machine and like ten thousand times cheaper.
So, the D-Wave machine is not, for the moment, something useful.

Lange: Well, if this made you kind of comfortable, like there's no Shor monster coming: there is progress. There will be a universal quantum computer, so, something that can run Shor's algorithm, and there's a lot of research effort going into this. If you track spending on quantum computers, that is a lot of money, and there's a lot of institutions and companies and of course governments all over the world building stuff. And we do see progress. We're keeping track of this, and if you go to the Wikipedia page here you see, over the last few years, huge progress in stability of qubits, so these things usually forget what they had, they decay, but they're getting more stable, there's better error correction, and there's better communication. The latest was that even silicon-based qubits can communicate. So something is happening. Um, IBM has Mark Ketchen on the record saying "We actually do things that make us think like, hey, this isn't 50 years off. This is maybe just 10 or 15 years off. It's within reach." So IBM is leaning out the window and saying, yup, we're gonna be there pretty damn soon. They said that in 2012, so let's fast-forward by 10 to 15 years, so it's now 2022 or 2027, and there is a universal quantum computer. The effect this has on Internet crypto is that RSA, which is still the majority of all the connections today, RSA is broken. RSA is dead. Discrete logs in finite fields: you might have heard a lot about those earlier this year, with Logjam. But Logjam just breaks the very small ones. This is breaking any big one. So, discrete logs in finite fields are totally dead as well. Elliptic curves, as I already mentioned: ECC is dead. So this breaks all public-key crypto on the Internet. There's another algorithm, due to Grover, which is somewhat better under control, somewhat less scary for crypto, but it still has quite an effect. So if you believe in the security of AES and go like, well, AES-128 seems just fine, there has not been any big progress in cryptanalysis, but Grover halves the security, so AES 128-bit keys are only 64-bit secure. However, you can go to 256 bits.

djb: Okay, so, the AES situation: not so bad, but all this public-key stuff is broken. So what do we do? Well, you could bury your head in the sand, you could hope that quantum computers don't come along, or you could deploy physically secure crypto. You could take a locked briefcase and chain it to your wrist, or you could do quantum key distribution. Now these things, obviously, are very very expensive. This is crypto, information protection, for rich people. Or, quantum key distribution is crypto for even richer people. You, obviously, aren't going to be able to democratically give this to everybody who has a laptop. But, well, okay, imagine that you have enough money that you don't mind this. There's a bigger problem with physical cryptography, which is the security. Now these things are always advertised as provably secure, guaranteed secure. Nobody can get into this briefcase! Nobody can get into this quantum key distribution system! Assuming, of course, blah blah blah. And then you look at the assumptions and you start saying "Wait a minute, is that actually true?" And then it turns out that when people try attacking these systems, the assumptions are not true. And the systems get broken again and again and again, for a very reasonable attack cost. Okay.
Even if you have a system where you think it's secure, which nobody's managed to build yet, even if you had that, the design of this physical security system is incredibly difficult to audit. It's something where, if somebody wants to slip in a backdoor, it's really easy. But there's an even worse problem with physical cryptography, which is that it doesn't do very much crypto. Typically these systems do about the same thing that AES does. You start with a shared secret, and then from that shared secret you can authenticate more secrets that you transmit through this physically secure communication mechanism. For instance, quantum key distribution starts with some pre-existing way of authenticating. But that's just like AES, which starts with a secret key that is then used to generate more and more secret data. And quantum key distribution says, well, it's provably secure under certain assumptions, as opposed to AES, which, well, we just heard AES may be a little affected by quantum computers, but AES-256 is not broken. Nobody has any idea how to break it. So this physical cryptography isn't actually serving the needs that we want. Imagine trying to distribute operating system updates through locked briefcases. It's just not going to happen. Microsoft sending out billions of locked briefcases to all of its customers. If you think that's realistic, or if you think that, just, lasers are cool, then there's a talk about quantum crypto tomorrow, I believe. Back in the real world, we obviously need to do something better with crypto.

Lange: Alright, so, let me take, momentarily, a slightly more optimistic perspective. Is there any hope? Yes. The title of this talk, PQCHacks, comes from post-quantum crypto, and that's the term Dan coined back in 2003 for crypto that remains secure even if the adversary has a quantum computer. So you, the normal people, me, him, the Internet, all the connections: we have normal computers. The adversary has a quantum computer. We, today, encrypt something. The adversary has all the time in the world to build the quantum computer and break us now or break us in the future. And so this took off, and there was the first workshop on post-quantum crypto, called PQCrypto 2006, and then there's been a series of workshops, you can see here. These are the attendees of the last edition, which was in Waterloo. It's more than a hundred people going like, yes, this is an important topic, let's do some research on it. So something is happening. If you're curious what's happening next, there's a workshop in Japan in February and then in the Netherlands in 2017, and we also got a project through from the EU to support research on post-quantum crypto. So something is happening. Actually, um, did I forget somebody? Yes. Um, the NSA is also saying, let's do something. So, on August 11 the NSA posted on their Suite B... page, saying "The Information Assurance Directorate (so, IAD) recognizes that there will be a move, in the not distant future, to a quantum resistant algorithm suite." Oh my god! Somebody is doing something! The public has realised we have to do something! Quick, quick, let's put out a press release! Um. Looks like they kind of realised it was embarrassing. So, eight days later they pushed a new release, saying "IAD will initiate a transition to quantum resistant algorithms in the not distant future." So, NSA will lead us to a bright and glorious future.

djb: America, fuck yeah.
laughter, applause

Lange: But if you thought that, so far, this was bad: it actually takes a lot of time to build crypto. If, let's say, the NSA or anybody else asks the researchers whether some idea could be secure against a quantum computer: take AES-256, the reason that you have confidence in it is that we had more than ten years of people trying to break it, trying all kinds of approaches and not getting anywhere. It takes a lot of time to explore the space of cryptosystems. Once you have figured out what could actually be doable, you want to figure out what the best attacks are, you want to focus on the secure systems, you want to figure out how to implement them, how to implement them securely. That is a whole lot of work that needs to be done before you can say, well, this is actually practical, we can actually use this. Or sometimes: this is as practical as we can get it. And then, you have this tiny little crypto primitive, and then you have to build the hooks and connections to get it into TLS. We said at the beginning that maybe you believe that the cute little crypto cores are all secure and it's just the connections with, sorry, with TLS and the outside world that are the problem. That still needs to be done for post-quantum as well. And then there's the side-channel attacks. As an example: ECC was introduced back in '85, and now, 30 years later, we're seeing that ECC is actually getting deployed on the Internet, with the IETF now calling for elliptic curve crypto, because people started saying, yes, we would like to use this. So we cannot wait until there's a quantum computer in order to start research on this. If you remember what '85 looked like: that's a while ago. Now, if that sounded bad, it's actually worse. It's not just that, 15 years from now, whatever you send then will be broken. There's some indication that some entities might be recording your traffic. We gave a talk like this in 2012, which was "Crypto for the Paranoid"; everybody thought it was, like, pfft, you're crazy. Now it goes like "well, there might be an entity", and everybody goes like, yeah, hmm, yeah. So, let's assume that this entity has the records of all the connections, and that in ten years you're going to be an interesting person. Maybe you're a politician, maybe you've become rich, maybe you associate with the wrong people, or so they think. And so somehow they go back to the 27th of December 2015 and figure out what you were emailing during the talks, because they have all the connections here, so they can go back in time, finding the connection by the metadata, and of course they have the real data. And they can decrypt it, because the key exchange was using RSA or ECC, which are broken. And then they get the same key that you were using with your other side to communicate, and so they can go and decrypt this. So it's not only that we can't wait for quantum computers to come; we might already be too late. Well, on the next slide, here's a bunch of people who are getting together in this EU project, and we have come up with what we're currently thinking are good recommendations. We think it's necessary, for people who really care about the security of their connections, and when I say people I include myself, I care about the security of my connections, to have something that we believe is secure for the next hundred years. And so we put out a list of recommendations. So, for symmetric crypto, it's relatively easy.
You want something which has at least a 256-bit key and is sufficiently well understood, so there we have Salsa20, and we have AES-256. Then, for authentication in symmetric crypto, we don't actually have any decrease from a quantum computer, because these are already information-theoretically secure. So these are the nice part. These two top items are where we go like "Pff. We might be okay on those." We can do better, we can have nice research and get something which is even better protected, even faster under the new scenario, but it's not so dire. The bottom two, these are the public-key systems. That's public-key encryption and public-key signatures. That's what we're going to focus on in this talk. So, for public-key encryption, we're recommending the McEliece cryptosystem with Goppa codes for a certain parameter size, and then for signatures, the recommendation is to use hash-based signatures.

djb: Okay, so... Let me dive into the hash-based part of this. This is something which goes back even further than the 80s. It goes back to the 1970s. Lamport. So here's a way to use hashes to do one-time signatures. Which are, we'll see in a few minutes, signatures that you can use to sign one message under a given key. And then, well, that's a little bit inconvenient, that you can't sign more messages, so then Merkle came along and said, you can sign more messages by modifying the system, and we'll also take a look at that. The only thing you need for these public-key signature systems is a good hash function. And, okay, that's something where historically, some hash functions designed in, like, 1991 had trouble. But now we have some good hash functions; for instance, SHA-3 has some great hash functions, even SHA-2, there's no sign of trouble, of those things being broken. And then after these original systems from Lamport and Merkle, there's lots and lots of improvements, but for all of these hash-based systems, it's really easy to understand the security. It's something where the basic idea is really, really simple, and the security analysis also ends up being very straightforward. You have to be careful about some things, but when you're careful about those, and there's nothing subtle about it, nothing tricky, then we understand exactly what security we get from these. So let's dive into a signature scheme that can only sign empty messages. Now, this sounds kind of, like, wait a minute, what do you mean, "can only sign empty messages"? There's only one empty string, and that's the only thing you can sign. But imagine that you want to have a panic button, like, your revocation key: you want to be able to say, "Here's a message that says, my house, my computer's been compromised, don't trust anything from this anymore." It's this one message that you want to sign, under a certain key, and if anybody has that public key, then they should be able to verify that you sent that message, and nobody should be able to forge it, because it's really bad if somebody can forge your panic message. So, being able to sign just one message is not actually such a useless thing, and we'll also see that it builds up to the rest of hash-based crypto. Okay.
Let's look at some Python stuff here. simplesha3 you can get online; under Ubuntu or Debian, install python-pip and python-dev, because it's actually a C library, and then pip install simplesha3. That will give you SHA3-256. And then to generate a keypair in this empty-message signature system, all you do is make 32 bytes of random data and just hash it, just in case... urandom is not too well-reviewed... I mean, we should trust urandom, but it's really cheap to put an extra hash on it. So that's your secret key, it's a 32-byte hash. And then the public key is: hash that secret again. So the public key is simply a hash of the secret key. And then return the public key and the secret key; of course the public key will get published, and the secret key you keep for yourself. As an example of doing this, well, um, you just get random-looking data coming out: your secret key right at the bottom and the public key above that. And you can check that one of them is a hash of the other: if you know the secret key, you can hash it to get the public key. Okay, now, how do you sign a message? Well, this is maybe spelled out in more steps than it has to be. The API here, this is, I would say, getting to be a pretty popular API for signatures and for verification, where you include the signature and the message together, as a signed message. And to emphasise that, here's a returned signed message from the sign function. Now, the signed message will later on be verified, and you get a message out of it, where the only possible message to be signed is the empty string, and you can see that the top of the sign function is checking: if the message is anything other than the empty string, then you're not allowed to sign it. Um, if you have the empty string coming in, then the signature is simply your secret key. So you reveal your secret key, and the whole idea of hash-based crypto is that somebody can publicly check the hashes of something that you reveal to sign, in this case, the empty message. And we'll see later how you can use the same idea to sign lots more. And then, okay, verification: you simply take the signed message, which is supposed to be the secret key, hash it, and see if that matches the public key. What would somebody have to do to attack this? If you haven't actually signed the one message, then he would have to figure out some string that, when you hash it, gives this public key. And, well, that's a preimage problem, inverting the hash function. The hash is supposed to be one-way. If the input were low-entropy, then this would be doable, but the input was a 32-byte random string, so nobody will be able to guess that, or find any other string that passes this check. And then you return the message, and you can try this out and see that it works. Alright, let's move on. We've managed to sign empty messages. How about signing 0 or 1? So now we'll make a signature system where your key can sign 0 and your key can sign 1. And, well, this is going to be kind of stupid: what you do is, you make two keys, one of them is for signing 0, and the other one is for signing 1.
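Before the two-key version, here is a minimal sketch of the empty-message scheme just described. It is not the talk's exact code: it uses hashlib's sha3_256 (Python 3.6+) as a stand-in for the simplesha3 package named above, and the function names are our own.

```python
# Minimal sketch of the empty-message signature system.
# hashlib's sha3_256 stands in for simplesha3.
import os
import hashlib

def h(x):
    return hashlib.sha3_256(x).digest()   # 32-byte hash

def keypair():
    secret = h(os.urandom(32))   # extra hash in case urandom is weak
    public = h(secret)           # public key = hash of the secret key
    return public, secret

def sign(message, secret):
    if message != b'':
        raise Exception('this key signs only the empty message')
    return secret + message      # signing reveals the secret key

def open_signed(signedmessage, public):
    if h(signedmessage[:32]) != public:
        raise Exception('bad signature')
    return signedmessage[32:]    # the recovered (empty) message

public, secret = keypair()
assert open_signed(sign(b'', secret), public) == b''
```

A forger who never saw a signature faces exactly the preimage problem described above: finding any string whose hash is the public key.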
You can see how complicated this hash-based signature stuff is: okay, you put two keys together, you make one key that will sign 0 and one key that will sign 1, concatenate the public keys, concatenate the secret keys, the p0+p1, s0+s1. And then if you want to sign: well, the messages being signed are integers now instead of the empty string, and if you want to sign 0, then the signed message will be the string "0", followed by 32 bytes. Well, this is again more complicated than it has to be, but think of this as signing the empty message with the empty-message signature system, which is just copying the first 32 bytes of the secret key. And then if you want to sign the message 1, then you do that with the other 32 bytes of the secret key. And then, to verify, well, depending on whether the signed message is 0 or 1, you just do the obvious thing. Maybe I should emphasise: this code was written just for this talk, it has not been reviewed, and you should audit everything. You know, you might feel like, six lines of code, you can't possibly screw it up, but don't ever use unreviewed code like that in crypto. So, this is just meant for you to understand what this is supposed to be doing; this has not passed review. But I like to think it would pass review. Um, okay, if you try signing the 1 message, for example, and take the signed 1 message and open that signed message, you do get the integer 1 back. And if you try forging it, you're again faced with this problem of coming up with some 32-byte string that hashes to a particular result. Alright, let's move on to 4-bit messages! So, I promise I won't do every possible length. But 4 bits will make an important point that you don't see from 1 bit. Let's try to sign a 4-bit message by signing each of the four bits. This all scales up to signing 1000 bits if you want. Um, so, okay, let's make 4 sign-bit keypairs, where each secret key is two 32-byte random strings and each public key is the hashes of those 32-byte random strings. Make 4 of those pairs and concatenate them to make the new public key and secret key. Alright, and now to sign a message, well, you look at the message and check, is it an integer between 0 and 15, and reject otherwise, and then sign each bit of the message. You can see m shifted right by 0, 1, 2, 3; extract the bottom bit of each of those, and then sign each of those bits, and then concatenate the signatures to get, well, signatures of the four bits of the message. And then verification works the way you expect, and I'll just skip that. Alright, this has a problem: if you use this signature system to sign two different messages, then you might actually allow forgeries. So let's look, for example: suppose you sign 11 and you sign 7. Now what is the signature of 11? Uh-oh, I have to do 11 in binary now, so 11 is 8+2+1. And you sign 7, which is 4+2+1. So what have you revealed now? You revealed the signature on that 8, so as part of the signature of 11 you revealed a signature saying "the 3rd bit is 1". But as part of the signature of 7, you signed a message saying "the 3rd bit is 0". And now you can just mix and match those messages wherever the bits were different. So for instance, if you take the top bit from the 11 and the bottom 3 bits from the 7, then you end up signing 15. So this signature system is totally safe if you're signing one message.
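Here is a sketch of the 4-bit scheme just walked through, under the same caveat the speakers give about their own slides: written for illustration, unreviewed, not for production. The helper names are our own, and it reuses the same hash idea as the previous sketch.

```python
# Sketch of the 4-bit scheme: one keypair per bit position, each
# holding a secret for "this bit is 0" and one for "this bit is 1".
import os
import hashlib

def h(x):
    return hashlib.sha3_256(x).digest()

def keypair4():
    secrets = [(h(os.urandom(32)), h(os.urandom(32))) for _ in range(4)]
    publics = [(h(s0), h(s1)) for (s0, s1) in secrets]
    return publics, secrets

def sign4(m, secrets):
    if not (0 <= m <= 15):
        raise Exception('message must be an integer between 0 and 15')
    # reveal, per bit position, the secret matching that bit of m
    return (m, b''.join(secrets[i][(m >> i) & 1] for i in range(4)))

def open4(signed, publics):
    m, sig = signed
    for i in range(4):
        if h(sig[32*i:32*(i+1)]) != publics[i][(m >> i) & 1]:
            raise Exception('bad signature')
    return m

publics, secrets = keypair4()
assert open4(sign4(11, secrets), publics) == 11
# One-time only: signing both 11 (1011) and 7 (0111) with this key
# reveals both secrets for bits 2 and 3, enough to forge 15 (1111).
```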
But if you think about the data flow, what it means to sign the individual bits of two different messages, then you can get in big trouble. So this is a one-time signature system. Alright. Here's how Lamport's signature system works for one-time signatures of any length of message. First of all, you scale that 4 bits up to 256 bits. And then, if you want to sign whatever length of message, you just hash the message to 256 bits. And the code for it is very simple. This is not quite the state of the art for one-time signatures; there's fancier things you can do, you can sign with Winternitz signatures and, instead of something like 256 or 512 hash values, compress that down to like 50 or even fewer hash values. There's all sorts of tricks to trade space for time in these systems, but it's totally feasible to deploy Lamport signatures, and, well, Winternitz makes it even more practical. Alright. What about this one-time restriction? These last systems, the 4-bit one and the Lamport bigger-message system, are only usable for one message: you can only use your key to sign one message. And this was fixed by Merkle. What Merkle said was: you take 8 Lamport keys, for example 8, it can be any number, and here's how we'll make a public key that can sign 8 different messages. You make, well, 8 different Lamport keys, you concatenate them together, and you use each one just once. But it's actually more space-efficient than that sounds. So here's what you send along. You make 8 S's there, those are the secret Lamport keys, each able to sign one message, and then you have 8 corresponding public keys, P1 through P8, and then you hash them together in a tree: you hash together P1 and P2, and P3 and P4, to form P9 and P10, and hash those together to get P13, and the same over here to get a final public key, P15. So just one hash value, that's your whole public key. And then you have to remember lots of secrets, but, okay, nobody has to see those secrets, you just keep them on your computer. And now what does it look like to, to hash, sorry, to sign one message? Well, here's what it looks like signing the first message. That's when you use S1. Once you use S1, you cross it out, you never use it again, you kill the secret. You sign your message with S1, and somebody can verify it if they see the public key P1 for Lamport signatures, or whatever one-time signature system you put at the top. But, well, your public key in Merkle's system is P15. And how does somebody verify that P1 and P15 are related? Well, you show them the P2 and the P10 and the P14 that they need to hash together with P1 to get your public key, P15. Alright, and that's as complicated as the Merkle signature system gets. If you want to be able to sign a million messages, you have to have a million secrets, but, again, just put them on your local hard drive, you're not worried about the space. It takes a few moments to generate the key, but that's also not a problem. Okay. Good things about hash-based signatures, and a few bad things. Good things: it's post-quantum secure, we totally understand what hash function security looks like, after quantum computers and before quantum computers, all you need is a good hash function, and there's lots of hash functions which have been thoroughly studied, so we're confident they're secure. SHA-3 was the result of a long competition with lots of people bashing on it, and really has no problems. The public key is very small, just one hash value. The security, as I said, is well understood.
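A sketch of the tree part described above, with hypothetical helper names: it collapses 8 one-time public keys into the single root (the P15 of the example) and produces the sibling hashes (P2, P10, P14 for the first leaf) that a verifier hashes together with P1.

```python
# Sketch of the height-3 Merkle tree: 8 one-time public keys are
# hashed pairwise up to a single root, which is the whole public key.
import hashlib

def h(x):
    return hashlib.sha3_256(x).digest()

def merkle_root(leaves):                 # leaves: the 8 one-time public keys
    level = leaves
    while len(level) > 1:
        level = [h(level[i] + level[i+1]) for i in range(0, len(level), 2)]
    return level[0]

def auth_path(leaves, index):            # sibling hashes the verifier needs
    path, level = [], leaves
    while len(level) > 1:
        path.append(level[index ^ 1])    # sibling at this level
        level = [h(level[i] + level[i+1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def root_from_path(leaf, index, path):   # verifier recomputes the root
    node = leaf
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

leaves = [h(bytes([i])) for i in range(8)]   # stand-ins for P1..P8
path = auth_path(leaves, 0)                  # the P2, P10, P14 of the example
assert root_from_path(leaves[0], 0, path) == merkle_root(leaves)
```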
All the computations are very fast, and you can already find standards proposals for this system. This is going to be the first standardised and the first deployed post-quantum public-key system. On the other hand, if you look at the signature size, it's kind of big, and then the more scary thing is the statefulness. You can only use S1 once, and then you have to cross it out. You can only use S2 once, and you have to cross it out. What if you clone your VM? What if you have a backup and restore? Then you've forgotten that you've used S2. And then you use it again, and then somebody can forge messages, just like I was saying before. Alright, so state is a problem; actually, some of you, I'm sure, were in Rutkowska's talk earlier today, where you also heard that state is harmful. And the solution: eliminate the state! applause I think I only have about three minutes left for this section, well, this slide, but let me try to briefly say: the idea for getting rid of the state is that instead of signatures you have certificate chains. So you have a whole chain of CAs. As a signer, you build a whole bunch of CAs, you build, say, 2^256 certificate authorities. Now, that's too much computation to do, but you do it on the fly, as you need them. So you say, I'm going to sign using one of these bottom-level certificate authorities, and that will sign my actual message. And now, you don't have to know anything about any of the others, you just pick one and use that to sign your message. And then it has to be signed by the parent CA, and that's signed by the next parent, and so on, up the tree, to the top level, only 256 levels. And then, okay, you have your certificate chain. How do you manufacture these certificate authorities? Well, a certificate authority is just some bytes on disk, that's what real CAs are like, and you do the same thing signing. But instead of storing the CA as a program and a configuration file on disk, you just generate it on the fly when you need it, by taking the position of the CA within this huge tree and hashing that together with some long-term secret. That one long-term secret is the only thing you remember. So you can manufacture CAs on demand in some huge tree, and have them sign certificates for each other, only looking at the few CAs that you need for the particular signature that you want; there's a sketch of this derivation below. The reason for having this big tree is that then you're never going to randomly use the same CA at the bottom twice. So the problem of having one-time signatures is no longer a problem. Each CA will only sign one message. Okay, and this is something where, with the original proposal from Goldreich in '87, if you use good one-time signatures, Winternitz and all that stuff, you get something like 0.6 megs for a signature. Now that's a little bit inconvenient. For comparison, if you do an OS update and look at the average Debian package size, it's 1.2 megs. And there's some number of signatures with each update, some of them on packages, and, well, it's not inconceivable to send 0.6 megs for each signature, but it's kind of big. And then if you look at, well, web pages, say the Alexa top million web pages, the average is 1.8 megs. And again there's several signatures on a web page, depending on how many sites are providing graphics for it and so on. So, 0.6 megs is maybe a problem. But, okay, we took a look at this, and a bunch of people made the signature a lot smaller: 0.041 megabytes.
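Coming back to the on-the-fly manufacturing for a moment, the derivation idea fits in a few lines. This is a hypothetical illustration of deriving per-position keys from one long-term secret, not the actual construction used in SPHINCS.

```python
# Sketch: the one-time signing key for any node of the huge tree is
# derived from one long-term secret plus the node's position.
# Determinism replaces state: a position always yields the same key.
import hashlib

def h(x):
    return hashlib.sha3_256(x).digest()

def ca_key_seed(longterm_secret, level, index):
    # the position (level, index) identifies one "certificate authority"
    position = level.to_bytes(2, 'big') + index.to_bytes(32, 'big')
    return h(longterm_secret + position)   # seed for that CA's keypair
```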
You know how to make a 41-kilobyte signature sound small: only 0.041 megabytes, for a SPHINCS signature at a 2^128 post-quantum security level. If you're interested in more about what SPHINCS does, go to sphincs.cr.yp.to.

Lange: Alright, so. Now we have some idea of how we can do signatures. And signatures are just the thing that quantum crypto really, really can't do. I mean, also, locked briefcases: how do you trust that this is actually your locked briefcase? But also, public-key encryption is a problem. So, the one that we recommend is code-based cryptography, and "code", in this context, is not code as in writing a program; it comes from error-correcting codes. So think about, say, your computer: you have RAM in there, and this RAM might get cosmic radiation, or just a bump somewhere, and might forget something. Or, a more trivial example: if you order a book, or, nowadays, order any media, it has an ISBN number. The last digit depends on the previous digits. So they can figure out whether any of the digits before was mistransmitted, and then go like, hmm, that number doesn't exist, try again. With your ECC RAM it's actually a little more sophisticated. So you have your RAM sitting there, and it stores 64 bits. But it stores these 64 bits with some redundancy. Internally, it has some extra check bits. It stores 8 extra bits, and those 8 extra bits allow you to recover the 64 that you're interested in, in case there was some error. Not in case of a massive error, not in case somebody took a screwdriver and hit it, but if there was just one random bit flip, you can recover it. Also, if two bits flip, you can at least figure out that there was something, and raise a flag. So the common scenario in error correction is that we have a certain number, say k, of bits that we care about, and then, in order to be able to recover those, to correct those, we add some redundancy. So we encapsulate, we encode those k bits into n bits. Now, we would like those n bits to be not too much larger than k. We call the extra conditions parity checks; this is like from the old days where you check "those bits add up to zero, zero, zero, zero... oops! there's a one, aha, there must have been a bit flip." So parity as in: the number of ones has to be even. If you're talking about more positions, then it's obviously more than just one parity bit, but it's the parity of some more complicated equations. So, if no error occurred, if those 64 bits in your ECC RAM are just fine, then all those extra n-k checks will be okay. If there was an error, then something will fail, raise a flag: 1, 2, 3 of those will not be satisfied, and depending on this error pattern, you will be able to recover what went wrong in those bits. It might actually be that it was one of your parity check bits, that one of those 8 extra bits flipped, in which case your 64 bits were just fine, but you don't know this when you get the error message. And, if you have a good code, the kind of code that coding theorists study, then you would like to have your k pretty large, and the n not too much larger, because n is the amount of storage you actually have to afford for getting this much data out of it. Now, we get some guarantees when we design these: there's a guarantee of correcting t errors, but for most of the codes that we know, the guarantees are actually worse than what we get in practice, so if something guarantees you 100 errors, most of the time you can actually correct 203.
Um, to get a little bit further, we actually need to represent those equations with a matrix. Um, not quite this one, sorry. So, here's our equations. Small example: we would like to encode 4 bits, so k is 4, and we're adding 3 extra bits. That's not very efficient, but it's a nice example that one can see. So, the rows there are our parity equations. Let's assume we have those 7 bits, with the redundancy added. Then let's translate this first row, which is 1 0 0 1 1 0 1: that means we take the first bit, b0, skip the second and third, then bit 3 is set, the next bit is set, so bit 4, and then bit 6. Second row, similarly: we skip the first one, so there's no bit 0, there's a bit 1, no bit 2, there's a bit 3 (it was a 1 1 0 column), and then bit 5 and bit 6, and similarly for the third row. So we have a nice diagonal on the left-hand side, and then the rest is determined by these equations. Now let's assume that something went wrong. So we have our 7 bits here, and a few hours later we come back and want to look at those 7 bits. We check whether anything went wrong, we run through this parity-check program, and we actually get a failure pattern. If everything was fine, we would have gotten 0 0 0, but we're getting 1 0 1. So the first equation doesn't hold, the second one does hold, and the third one does not hold. Okay. Where could this come from? We're pretty sure that b1 is okay, because otherwise the second equation would be wrong, because b1 only appears there. And we're also making the promise that there would be only one error; if you have two errors, three errors, then lots of other combinations could occur. But it's much more likely that one bit flipped than that a whole bunch of bits flipped at once. Okay, so, tracing this a little bit further: b3 would trip the first two equations; b4, yes! b4 would make exactly the first and the third equations false. So by seeing the error pattern 1 0 1, we know it was b4. Now this is a very nice and small example, which doesn't even cover the ECC RAM, but it gives you an idea of how to try it. On the other hand, it also gives you the idea that you kind of need to do a brute-force search. So for just n=7, you have to try up to 7 positions. If you now have two errors, I would need to try every combination of 2 out of those n. If I have an n which is like 2000 bits, really long, and I tell you there's 100 errors, you would need to try every combination of 100 positions in there. That would be a huge number. That's obviously not a good way of doing error correction, and that's certainly not how DVDs and whatever else work. Oh yeah, one bit of maths notation: we call these things in the parentheses up there a matrix, and in order to have a shorthand, because I can't quite put my 2000-bits-times-1000-bits matrix on the page, I call this thing H, and boldface b is such a bit vector. So boldface b is the bits that I'm storing, and the result of applying this matrix, where, wherever there's a 1, I take the variable, and where there's a 0, I don't write it, that I write as H times b. So in maths, if you have seen this, this is just a matrix-times-vector multiplication; otherwise, just take this as the instruction to evaluate each row as an equation.
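Here is that walkthrough as code. The matrix H below is reconstructed from the rows as read out above (the slide itself may differ slightly); with it, the failure pattern 1 0 1 matches exactly the column for b4.

```python
# Sketch of the 7-bit example: the syndrome H*b is the failure pattern,
# and with a single bit flip it equals the column of H at that position.
H = [
    [1, 0, 0, 1, 1, 0, 1],   # equation 1: b0+b3+b4+b6 = 0
    [0, 1, 0, 1, 0, 1, 1],   # equation 2: b1+b3+b5+b6 = 0
    [0, 0, 1, 0, 1, 1, 1],   # equation 3: b2+b4+b5+b6 = 0
]

def syndrome(H, b):
    return [sum(hj * bj for hj, bj in zip(row, b)) % 2 for row in H]

def correct_single_error(H, b):
    s = syndrome(H, b)
    if s == [0, 0, 0]:
        return b                          # all parity checks pass
    for j in range(len(b)):
        if [row[j] for row in H] == s:    # which column matches the pattern?
            return b[:j] + [b[j] ^ 1] + b[j+1:]
    raise Exception('pattern matches no column: more than one error?')

codeword = [1, 0, 1, 1, 0, 1, 0]   # satisfies all three equations
received = codeword[:]
received[4] ^= 1                    # b4 flips in storage
assert syndrome(H, received) == [1, 0, 1]
assert correct_single_error(H, received) == codeword
```

Note that this is exactly the brute-force column search the text warns about; the efficient decoders for special codes avoid it.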
Alright, so, to give this some names: in coding theory we call c the code word, so that's an element which is not corrupted, which will give you syndrome 0. Then there might be an error vector, that's the bits that flip, and so, when you retrieve the memory, or when you have a transmission, we call what you get the received word, and that's my b from the previous slide. We do like to save on space, so that part with the diagonal and all the 0s down there, we just skip it. So instead of writing all 7 columns and 3 rows, we just write 4 columns and 3 rows. Now, there's lots of stuff happening in coding theory, it's a, well, 65-year-old topic, and we can go up to very large matrices, and for some special codes, the ones that coding theorists come up with, we actually have efficient decoding methods. Much, much nicer than taking every combination of 100 positions out of 2000 positions. Of course, if you get too many errors, you can't correct; you can only correct up to whatever the promise is, plus maybe a little bit more. But if you don't know any of the structure, if you don't know what the coding theorists put into it, or if this H matrix got somewhat perturbed by me giving a slightly different representation, I don't call this b1 anymore, I call it b17 now, and let's flip those around and so on, suddenly it doesn't look like the code that you're used to. If it's a random matrix, the best attacks are not quite as bad as picking 100 out of 2000, but they are close to that; they're pretty slow attacks, it's an exponential-time algorithm if you have a random matrix. And so what we're doing in code-based crypto is to use this difference for encryption. Now, going back again to the 70s, so, basically, yes, the stuff that we're really confident will last another 100 years is the stuff from the 70s that lots of people have looked at. So, McEliece, in 1978, just the year after RSA, came up with a system which is using encoding as encryption. Your method is: you just encode a vector, your message, and then you add some errors to it. And if the attacker doesn't have an efficient algorithm to decode, they actually can't break it. There's a special code in there, a code from Goppa from a few years earlier, and for that code, so far, nobody has found any way to take the perturbed version, this complicated way of representing it, and come up with an efficient decoding algorithm. In 1986, Niederreiter came up with a version of McEliece which is smaller: the ciphertext size is smaller, the public-key size is smaller. And we have similar things to what you've seen before: we have this H matrix, we skip the diagonal part, and we represent just the remaining part of the matrix as our public key. The secret key, that's the decoding algorithm that only I know. It's a Goppa code, and there's many, many Goppa codes of the same size. So that's something where Goppa says, well, if you want to have Goppa codes with 2000 bits that can correct 200 errors, here's how you set it up, but there's lots and lots and lots of choices in there. And your encryption is just: take an error vector with a fixed number of bits set, and send me the error pattern of it, so, the outcome of multiplying H by this e. And then we want to use this in crypto, so we use the hash of e as the key to encrypt with. Um, then Peter Schwabe and Tung Chou, our PhD student, wrote a very efficient implementation of this, called McBits, so if you want more detail on how you could really use this in practice, then go to that URL.
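As a toy illustration of that encryption flow, reusing the tiny 3-by-7 H from the earlier example purely for shape; real parameters have n in the thousands, t in the hundreds, and an H hiding a secretly transformed Goppa code, and real code needs a cryptographic RNG, not random.sample.

```python
# Toy sketch of the Niederreiter-style flow: ciphertext = H*e for a
# random weight-t error vector e; hash(e) becomes the session key.
import random
import hashlib

H = [
    [1, 0, 0, 1, 1, 0, 1],   # toy public matrix (identity part skipped
    [0, 1, 0, 1, 0, 1, 1],   # in real systems to save space)
    [0, 0, 1, 0, 1, 1, 1],
]

def syndrome(H, b):
    return [sum(hj * bj for hj, bj in zip(row, b)) % 2 for row in H]

def encrypt(H, n, t):
    e = [0] * n
    for i in random.sample(range(n), t):      # exactly t error positions
        e[i] = 1
    c = syndrome(H, e)                         # ciphertext: the syndrome H*e
    k = hashlib.sha3_256(bytes(e)).digest()    # session key: hash of e
    return c, k

c, k = encrypt(H, 7, 1)   # send c; use k with AES-256 etc.
# Only someone holding the secret decoding algorithm can recover e
# (and hence k) from c.
```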
Um, but let me talk a little bit about why we are confident in this. Code-based crypto is less obvious than hash-based. I mean, with hash-based, all we're doing is hash, and a hash function by definition is something which is one-way. For this code-based crypto, you need to think a little more, you need people studying it, to figure out why this actually can be secure. Now, the attacks, if I may say so, started before the cryptosystem was proposed: it was another hard problem that coding theorists were studying naturally, and then McEliece said, hey, we have this hard problem here, decoding a random code is not easy. Actually, it was figured out afterwards that, if you have a really random code, even finding out whether there is a small code word is NP-hard. And then, once it was a cryptosystem, so, Omura, Lee-Brickell, all of those, those came after it was proposed for crypto. There's some which have an extra parenthetical comment; those are papers which study the security of McEliece if the attacker has a quantum computer. So there's been lots and lots of work studying it against a classical computer, and there's been some work studying it against quantum computers. It's pretty clear that the attacker can't use Shor, but they can use Grover, and it's not so obvious how much Grover can give them. However, the best that Grover can do is basically halve the bit size. So, we had this AES-128 leading to 64-bit security; so if you're conservative, if you want to be really, really on the secure side, then let's assume that we want to go for pre-quantum security of at least 2^256 in order to get post-quantum security of 2^128. So here are some key sizes. If you're okay with 1000 kilobytes, then you're getting something which has at least 2^131 post-quantum security, and that's actually very conservative; most likely we can go much lower on the key size. But that's where more research is needed. If you need something where you can get a guarantee of 100 years of security, take this one. It's not small, it's fast, but it's not small, and hopefully in 5 years, less than 5 years, we can give you something which is smaller and still as secure.

djb: Okay. There are lots of possibilities for what the smaller things might be. For instance, there's something called QC-MDPC; there's actually lots and lots of variations of McEliece that people have proposed, and some of those variations have held up, some of them haven't. QC-MDPC is something that has held up so far, but how many people have looked at it, and what's going to happen when more attacks get optimised? You can't be trusting something which is new, or hasn't been studied enough, or hasn't been studied in the right directions; you also have to be sceptical about crypto. There's, well, lots of proposals; QC-MDPC looks like one of the most interesting ones, where nobody's been able to damage its security. For 2^80 pre-quantum security, which is not very good, but just as an example, they have 0.6 kilobytes, or 600 bytes, for the public key, and that's a very reasonable size for a public key. Not as small as ECC, but it's quite tolerable.
Um, lattice-based crypto: there's NTRU, for example, that's been around since the 1990s, and maybe that's enough time to start getting confident. But, well, again, there's lattice-based systems that get proposed and broken. For instance, there's a system from Smart-Vercauteren in 2009 that is not broken pre-quantum, but it was shown in 2014-2015, in some follow-up papers, to be broken in polynomial time by a quantum computer. So, it's a lattice-based system which sounded good at the beginning, but is definitely not going to be part of post-quantum cryptography. There's multivariate quadratics. Now, those have the interesting feature that the signatures they provide can be very, very small; like, you can have 100-bit signatures and still get some reasonable security, in that nobody knows how to break them. Well, there's lots and lots of these proposals, and some of them are broken, some of them... HFEv-, that's a multivariate quadratic system that's been unbroken for 20 years. Maybe it will continue holding up, but, well, how many people have looked? You have to make sure enough people look at these things. The reason we like those simple systems from the 70s is that enough people have looked. But maybe, if you've got more serious performance constraints, you should look at these other things, and of course we are looking at these other things, trying to figure out what's secure, and how well we can actually, er, choose among all these different possibilities. Something else, just a last mention here, is isogeny-based crypto. This is for people who like elliptic curves: there is maybe a way to use elliptic curves post-quantum, and this has the interesting feature that it does exactly the same data flows as Diffie-Hellman. So everything you're used to doing with Diffie-Hellman, having, like, a secret key over here and a public key over there, agreeing on a shared secret, that you can also do with isogeny-based systems. But only a few people have studied the security, and maybe this'll also get broken. So, lots more work, lots more possibilities. Last slide: if you're interested in more information, here are a few places to look. First of all, you can go to pqcrypto.org; this is our survey site for post-quantum cryptography, and the biggest chunk of information there is the bibliography, and if anybody has newer papers, if you happen to write papers on post-quantum crypto and you'd like us to add those, then just let us know. Um, and then, well, we also have on the front page things like pointers to all the PQCrypto conferences; that's the main place to look, go to pqcrypto.org and follow links to, for instance, the February Japan conference. Then pqcrypto.eu.org, that's the main page for the EU project that Tanja mentioned, which is putting out, as it progresses, well, it just started this year, but it's putting out, soon, free libraries! Software libraries for actually doing post-quantum cryptography. And benchmarking results, so you can actually say "Yeah, I've got the following performance constraints here, that's what I'm able to use." And then in 2017 there's going to be a workshop, and a summer school, maybe it'll be a spring school, we'll see. If you're interested in the Twitter feed for that, it's twitter.com/pqc_eu. And the final resource on the slide is you. You have the responsibility: if you care about crypto, then you have to get used to this stuff.
You have to learn about hash-based, code-based, maybe lattice-based, multivariate quadratics; these are the things which will be the future of crypto. In the future, all of crypto will be post-quantum crypto, because eventually the attackers will have quantum computers. If you want to adapt to that future, then, well, take a look at these systems, and maybe find improvements. Then, cool, let us know, and, you know, publish papers, integrate them into real applications, networking applications, implement the things that aren't implemented, speed them up. There's lots of work that still has to be done. That's it, thank you for your attention.

applause

Herald: Wow. Now we'll have some short questions, please, because we're a bit late on time. Questioners, come around ahead, there's a mike there, talk into it please. Can we have mike 1?

Q: Oh, okay. Uh, quick question. So there was this result of Laszlo Babai, that graph isomorphism has a quasipolynomial-time algorithm. Um, not really knowing the subject at all, there's some very, very superficial similarities, like, that is a hidden subgroup problem, basically. Um, is there any indication that the methods he used in that proof are relevant to breaking, like, to weakening NTRU or things like this?

djb: That's a good question. So, the, um, the graph isomorphism advance now, is basically saying... so, graph isomorphism is a problem which was famous as being not known to be solvable in polynomial or even, like you said, quasipolynomial time. Um, except there were some really good software algorithms which would solve most cases of it really quickly. There were just some really weird, highly symmetric cases that were hard to solve, and that's what Babai managed to completely kill now, so he's handled all the cases. But we try to stay away from problems like that in crypto. An example that's very closely analogous to that is what's called support splitting, which is a certain type of attack strategy against code-based crypto, if somebody gives away lots of information from the legitimate user. And that's something where the support-splitting algorithm works in most cases, but it kind of fails in some corner cases, and, well, we try to stay away from thinking about this anyway, because you just don't want to give somebody that extra information. But if you did, then maybe Babai's technique could be useful there.

Herald: Thank you. This talk is not over. If you're leaving, which is okay, please be silent. Next question, number 3, no, number 1.

Q: Uh, hello, thank you for the talk. Um, what are the chances to have something like forward secrecy with that? I recognise the last algorithm has a chance to reuse Diffie-Hellman; that's possibly the only one?

Lange: So, if you think that forward secrecy means Diffie-Hellman: you can have forward secrecy from normal encryption algorithms, at the expense of generating a new key each time. You can have forward secrecy with RSA, if Alice, talking to Bob, starts by saying "Okay, this is my one-time RSA key, and I send this to Bob, with a request to encrypt to this key." If Alice never reuses this key, then this method is forward-secure. And similar to how you can do this with RSA keys, you can also do this with McEliece keys. At the moment, there is the difficulty that the keys are very large. It is inconvenient when you want to start talking, "Hey, I'm a client, hello server, please talk to me", and the first thing you have to do is transmit a megabyte of key.
On the other hand, you can do it. It just requires you to engineer your protocol to expect that, yes, you have to send a few packets. And then it has all the forward secrecy that you want.

Q: But, uh, is there a way without transferring the key, like Diffie-Hellman?

Lange: But Diffie-Hellman is transferring a key.

Q: Okay.

Lange: I mean, you're basically transferring the first part of a discrete-log pair. If Alice sends a times P, where a is some one-time key, she's sending a public key. It's just that the method of how those two public keys interact is slightly different from how RSA encryption works.

Q: Okay, thank you.

Herald: Thank you. Do we have an Internet message? Question?

Signal Angel: Actually, we do. There are two questions that are somehow related. The first one is: given that there is no actual working quantum computer, how do you start developing a crypto algorithm? Where do you start, how do you design it, how do you test it? Is there a way to prove that it's secure? And the second question is related: the whole thing is based on the property of the hash functions being one-way; how does one know that there is no quantum algorithm that breaks this property? Can you prove this?

djb: Okay. For both of these questions, the technique that the community goes through, that we all go through, is cryptanalysis. So we have as many smart people as we can find focusing on these problems and saying "Can you find an algorithm which is breaking these problems better than any previous algorithm can?", and we put up as many incentives as we can, so that we try to get as many smart people as possible to stay ahead of the bad guys, and hopefully find the best algorithms. But there's no guarantees in this. And you do always have to be sceptical about whether people have actually looked at, for instance, quantum algorithms to attack systems. And there is that extra difficulty that the first part of the question was pointing at: we don't have a quantum computer. So if we're trying to verify quantum algorithms that we're developing, we don't get to experiment with them, and that's the usual procedure for making sure your algorithms work. State-of-the-art cryptanalysis, like the number field sieve for factoring, does not have a proof that it works at the speed that we think; that works out from experiment. So experiments are really important, because we don't have proofs for state-of-the-art cryptanalysis, and that's something where it's actually really tough for quantum computers. Of course, eventually we'll all have quantum computers, and there's ideas for simulating quantum algorithms which have had some success at verifying that algorithms work even though we can't actually run them yet; we're actually able to verify a simulation of those algorithms.

Lange: Let me do the second part of this. Um, when we use quantum cryptanalysis for estimates, we usually go for the worst-case estimate. So we say, well, McEliece, at worst, gets broken down to half the security level. Most likely it won't be that fast, but we're staying on the side where we're very confident. Um, if I understood correctly, there was also the part of the question, how can we test the algorithms? If this is about the constructive algorithms: all the algorithms we analyse, all the algorithms we propose for post-quantum crypto, are algorithms that you can run on your normal computer. So you can test those, you can run those; we have benchmarking numbers for those, on our current hardware.
Herald: We'll do two more questions, please. Number 1.

Q: Yeah, I got a question on the practicality of the attacks. So, um, if we assume there is a quantum computer, how much time will it take in practice, order of magnitude, to break, let's say, an RSA 2048-bit key? Is it on the order of hours, weeks, months, years?

Lange: snaps fingers Done.

Q: Okay, thanks. laughter

djb: That was easy!

Herald: Number 3.

Q: Okay.

Herald: Talk into the mike, please.

Q: Thank you. So, it's very, very nice to have both post-quantum encryption and signing, but do you know anything about some other cryptographic primitives, such as zero-knowledge proofs?

Lange: Well, I mean, signatures are basically zero-knowledge proofs made non-interactive. So a primitive for a signature is usually very closely related to zero-knowledge proofs. There is work going on; we are focusing on the most important things that we see on the Internet, but that shouldn't mean that people shouldn't do research on it. Please do research on zero-knowledge proofs!

Herald: Okay. Last question. Number 1.

Q: So, why do you put so much emphasis on smaller key sizes and on performance in encryption, um, especially in a delicate topic like post-quantum computing? Why can't we just use 1-megabyte keys, and why can't we just use a few seconds instead of milliseconds to compute those?

Lange: So...

Q: What's the problem here?

Lange: We are suggesting to use a key of 1 megabyte. So, our recommendations that we have on the Internet, on the pqcrypto.org page, are precisely using this system which has a 1-megabyte key. The nice thing is that encryption and decryption are actually very efficient. But that was not our main goal; our main goal was to get something which is very secure, and where we have high confidence that we actually understand the attacks. And then, well, once we have this system, you try to optimise how to implement it, and the implementation, when we looked at the numbers, is actually faster than elliptic curve implementations, except for the size. So you get a very nice speed, even though it was not our target for optimisation.

Herald: Okay, this is it. Thank you very much, let's have a final hand for Tanja Lange and djb Bernstein!

applause

postroll music

Subtitles created by c3subtitles.de in 2016. Join in and help us!