35C3 preroll music Herald: The next talk is called "The year in post-quantum crypto" by Tanja Lange, who does research in Eindhoven, and Daniel Bernstein, who does research at the University of Chicago. They both organized the first ever conference on post-quantum cryptography in 2006 and they both contribute to the NaCl crypto library. Their talk is going to be a summary of the NIST post-quantum crypto competition, and they're gonna provide an overview of the problems of post-quantum cryptography and hopefully show us how to integrate post-quantum crypto into existing software. Please welcome them with a huge round of applause. applause Tanja Lange: All right, well, thank you. First of all, this title, post-quantum cryptography: what it means is that we're talking about cryptography where we're designing the systems under the assumption that our attacker - not the user, just the attacker - has a large quantum computer. Also, to remind you from three years ago, when we showed you the attacker: we're talking about attackers. All right. So, what we've been seeing over the last few years: in 2015, say the middle of August, there was a big announcement by the NSA saying "Oh yeah, actually the world should care about post-quantum cryptography. We need more research." They finally woke up to it, and it actually had a nice roll-out effect, in that pretty much every agency - just highlighting three of them here, so the U.K., the Netherlands and the NSA themselves - made some statements about the urgency of rolling out post-quantum: that the normal cryptography we're currently using, so elliptic curves and RSA and Diffie-Hellman in finite fields, will be broken as soon as there is a quantum computer. And NIST, which is the National Institute of Standards and Technology - so it's an institute in the U.S. which is working on standards - said, OK, we're gonna call for a standardization project for post-quantum cryptography. So yes, it's a U.S. institution, but in cryptography it has a mostly good history. They have been running competitions which gave us AES. They've been running competitions which gave us SHA-3. And they also, without a competition, gave us Dual_EC. laughter TL: So, competitions with NIST, good thing; everything else, well, judge for yourself. This sounds like a great story. It's also a little bit disappointing when you look at where it started. So back in 2003, Dan here coined the term post-quantum cryptography, and then we've been running around for 10 years going like "The end is nigh, please invest in post-quantum cryptography, do something", just that 10 years later the NSA said "Oh my God, we have to do something, quantum computers are coming". So it's a little bit, yeah, "I told you so". Not a great story, but. Daniel J. Bernstein: All right, so what happened with this competition. Well, after NIST said "Hey everybody, send us some proposals for post-quantum crypto to standardize", a lot of people did. So, they said actually more than 80 submissions came in and some of them were incomplete - like, one of the requirements was to include software, and for whatever reasons, they said some of the submissions are not complete - but they posted 69 complete submissions, and the submission teams include 260 people from around the world, all sorts of academia, industry, maybe even some government people. There's lots and lots of names and we're not gonna go through, like, what all these things are, but we are going to say something... TL: Oh, we have 60 minutes. DJB: Oh, that's true. So BIG QUAKE: this is a code-based system...
laughter DJB: No, we're gonna give you some idea of, like, what the big picture is of how well these things have held up. So, these were all posted by NIST on the 21st of December last year, and some of you who saw our LatticeHacks talk last year remember that there was already some damage to this list by the time of our talk on the 28th of December 2017. By the end of the year, there were eight submissions that had attacks, at different levels of severity. I should explain the color coding here. The stuff that's in brown is less security than claimed, which could be, for instance, they claimed it would take 2 to the 128 operations to break something and somebody says "no, it's 2 to the 100 or 2 to the 80". Does that really matter? Or maybe a different direction is somebody says "this is a system where you can reuse the key any number of times", which we expect for normal crypto systems - that you can publish your key and people can keep sending you messages, or you can sign many times under some key - but sometimes people claimed they had that feature and they turned out to be wrong; those were attackable, like HILA5 in this list, which is not necessarily kicking them out of the competition, because NIST said you can have a system with one-time keys, that can be useful for some applications. The things in red are those where everything that they proposed, all the concrete parameters, are broken. The underlines are those where there are attack scripts, Python scripts, Sage scripts, whatever it takes to demonstrate that, yeah, look, here's your secret key, or decrypting something. So that was the end of 2017. How about now? Well, OK, let's extrapolate, three days from now. Probably the situation is at least the following. At least by the 28th of December 2018, 22 submissions have attacks. So it's about a third of those 69 submissions, and you can see, well, most, well, 13 of those, out of the 22, are red, mostly with underlines. I think some of this is from us, a lot from other people. And I think we did well early on establishing that people should put out scripts to demonstrate that, yeah, you really can break these things. So again, the underlines are demonstrated attacks; some of the submitters have withdrawn the broken submissions; some of them have not. TL: All right, so when you look at this, we're not going to go through explaining those, but let me categorize these. So when we look at the ones which are not completely smashed and broken, we can put them into boxes; we can say "hey, what is the underlying mathematical problem", "what do we hope is something the attacker has to do in order to break the cryptosystem". And then there is one category using error-correcting codes, which is then used for building an encryption system or a signature scheme; hash functions, which you can only build signature schemes from; isogeny-based crypto, where for the competition we're only seeing an encryption system, and honestly all the signatures we have seen so far are pretty inefficient. Lattices we see for both; that is something we talked about last year, so if you want to get a full-blown lattice explanation, go for last year's talk. And then finally, there's something using systems of many equations in many variables, and we get signatures and encryption from that. So that's one way of saying "Well, these are good things for post-quantum". It does not mean that everything which is in these boxes is secure.
You can still design a system which somehow relates to this math problem, but the attacker can find a way around it. One of those systems, RaCoSS, which I'll get back to later, is a code-based system - code-based is on the list, first one up there - so it should be totally secure, and yet it's one of the red underlined systems. So just being in one of the categories does not mean it's secure, but it's at least some hopeful and helpful way to box things. There are other ways of describing why this is the situation we have now. DJB: As an example of this kind of categorization, sometimes people might say: "Oh, lattice-based cryptography is the safe thing to do. All of that red was people who were not using lattice-based cryptography, and everything lattice-based is good, everything else is scary". But if you look at the lattice-based systems, it's not all black. There's some red stuff here. Compact LWE: that one is broken, with a script, we're quite sure it's broken. There's some others with some damage, DRS, HILA5. So it's not that the lattice-based submissions have held up perfectly, and also it's not just that these are isolated mistakes that people have made. There's ongoing research which is making better and better lattice attacks. For instance, some papers from last month and this month from the three authors listed there are talking about lattice-based systems being broken by decryption failures. Now, this phenomenon: most of the lattice submissions have occasional decryption failures. Once every 2 to the 64 ciphertexts maybe, you won't be able to decrypt. Which might sound like "oh, it's no big deal, maybe occasionally some user has that happen and the browser will just reconnect, whatever it takes, it's not a significant failure rate". Except that, if the attacker is trying to decrypt a particular ciphertext or maybe figure out somebody's secret key, they can usually get that information out of watching the pattern of decryption failures, if decryption failures happen often enough. And these papers are saying, for two different reasons, decryption failures are actually happening more often, maybe much more often, than people claim. So that's kind of scary. TL: All right. One explanation, which I of course like very much: I've been running a European project, PQCRYPTO, and just use everything in our portfolio. It's good, right? Actually, it's looking better than the argument saying "I'll use everything lattice-based". We have one which is slightly scratched, but everything else is good. Right. Yay. DJB: Except, well, there's another explanation than just, well, whoever these PQCRYPTO project people are, obviously the right people, putting together a great portfolio. There's another possibility. Not saying this is right, but you could imagine the cryptanalysts who are deciding what to do out of that huge pile of 69 submissions: maybe they thought the people who were in this project, doing this stuff for some number of years, are maybe not the top targets. Maybe you should look at the other 49 submissions. Maybe you should look at the submissions with the specification written in Microsoft Word. Probably more likely to be broken. Maybe there's other ways that you can decide how to - no offence to Microsoft people - TL: It worked. DJB: Yeah, that did work coincidentally. Yeah.
So, the thing to keep in mind is that there's a huge pile of submissions, more than a million words in these 69 submission documents, and a word in English is usually a kind of imprecise thing; reviewing that is, I would say, more effort than reviewing a million lines of code. This is a lot of stuff, a lot of junk to go through. There's a real flood; there's too much for all the people who care about attacking systems to actually go through everything. So people are making a selection, and maybe they just haven't bothered looking at these PQCRYPTO submissions. So, if you want to actually have security review, it's really important to keep this denial-of-service effect in mind: the flood of submissions has to be small enough that it can be handled by the number of people who are looking at these submissions and evaluating the security. TL: So, call for help, please join, please break. Don't break ours. laughter TL: Now, one thing which in this audience is a little funny, but in a normal academic conference where you talk to like 40 or 100 people: we actually have a lot of people now being interested in post-quantum cryptography. This was this year's conference on this; when Dan and I started this in 2006, we were looking at an audience of 40. This is 350 people. Well, they would probably fit in here, but, well - for academics, this is big. There's a huge interest right now. This was the combined conference together with a NIST workshop. And so people are looking. So that's the good news. There's more denial of service going on; I mentioned RaCoSS already as one which is broken, but not withdrawn. Broken actually three different ways, where our last message to them was basically "abandon all hope, this is not gonna work", but they keep on hoping. If I zoom in there, they are on the way to publishing new parameters. laughter TL: When he was reading it out, actually at the conference, he was saying "the keys and signature size may be several dozens of terabytes". Well, there is some effect that we're most likely not going to break it so easily, because when you're trying to download several terabytes of data, it might not reconnect. DJB: So, NIST is certainly aware of the fact that there's just this denial of service on security analysis. And one of the ways that they tried to simplify the picture is, after their conference they put out this call saying, "all right, if you have similar submissions, see if you can merge them". And then hopefully you get something which is, you know, going to be a combined submission that's easier for us to analyze than the two separate submissions in the first place. And they gave this interesting little guarantee. They said "NIST will accept a merged submission to the second round if either of the submissions being merged would have been accepted". So you can imagine kind of attacking this: if there's a strong submission and you have a weak submission, the strong one, they surely have to accept that, and then you merge your weak submission into the strong one - if you can somehow convince the other team to do that - and then your weak submission is also going to get into the second round. But NIST said "well, you should only merge submissions that are similar, and the merged submission should be like a combination of the two original submissions".
So that sounds kind of reasonable. An example - well, the first announcement of a merger; I don't think NIST said that you have to publicly announce - HILA5 merged with Round2, and of course this was after the HILA5 attack, and part of the merger was fixing the issue in that attack, and they formed Round5, and they said "Round5, this result of the merge, is a leading lattice-based candidate in terms of security, bandwidth and CPU performance". So, three weeks later, the security turned out to be kind of a problem. Mike Hamburg said that there is a very strong reason to believe that decryption failures are much, much more likely than what they claimed, and they accepted the argument and said, yeah, oops. As a result of that - like I mentioned before, decryption failures are something that attackers can use to break security. And it's not that a full attack was implemented, but it's pretty clear that the attack would work, and this is also an interesting attack because the mergers were supposed to be just taking, like, the best features of the two submissions that you're merging. But this was a mistake: the vulnerability that Hamburg exploited was a mistake that was not in either of the submissions that were being put together. So there's some process of break and fix and merge, making more mistakes, which get broken and then fixed, and, well, what was the fix? They said, "oh, here's a proposed fix". They're "looking at the security proof adjustments", there will be a Round5 real proposal, their actual merge will be in the future. I think now they have Round5a, Round5b and Round5c, where a is broken, b is questionable, and c is still not defined. What does a security proof mean, if you previously had a security proof that they were adjusting, and the security proof is for something that is not actually secure? Very strange. More merger announcements: post-quantum RSA encryption and post-quantum RSA signatures merged to form post-quantum RSA, saying that that "is a leading candidate in terms of depth of security analysis, amount of network traffic, and flexibility". For people not familiar with post-quantum RSA, this means using RSA with gigabyte or terabyte keys, which is indeed leading in the amount of network traffic. laughter applause DJB: We want the internet to have as much cryptography as possible. TL: Use more bandwidth. DJB: Yeah. Remember, if you're measuring the amount of encrypted data on your network, this increases that. More mergers, more mergers, more mergers. Some of these are kind of gluing submissions together in a way that does not simplify the security analysis. But this last one is a good example, I would say, of a merge entry. NTRU-HRSS and NTRUEncrypt, two of the NTRU-based submissions: they actually put some thought into what they wanted to keep and what they wanted to throw away. And so analyzing the merge is easier than analyzing both of the initial submissions. After the November deadline for mergers, NIST said they will announce the second-round candidates. Maybe it'll be, probably, less than 30 candidates, some hints it'll be 25 or even 20, and maybe that's starting to get down to a small enough flood that we can start seriously analyzing what's left.
They're going to announce that, I think on exactly January 10th they're scheduled to do that, and then a week after this announcement they said, "Well, the government might be shutting down, and in that case we're not allowed to do any work, so we're going to be completely silent in case of a shutdown". It's important in the U.S. government: during a shutdown there's a definition of essential personnel, like the NSA, and non-essential personnel, like the people protecting us against the NSA. laughter DJB: - and only the essential personnel are allowed to do work. You know what else is not allowed to do work? The backend database for NIST's web pages, which might sound a little bit weird, although maybe they're paying Oracle for the backend database, and they have to turn off the payments to Oracle. I don't know what's going on here, but if you look for the competition information, you can't find it on their web pages anymore. We're not quite sure how long this shutdown is going to last. Of course, there are some people who say that this is not a problem, because we can figure out without this competition how to protect ourselves against quantum computers. laughter TL: All right, now that we have aliens already: are quantum computers actually coming? Big question, we don't know. What we can monitor is progress in quantum computing, and just in the middle of December there was a news item from IonQ, which is a small startup company, announcing their largest ever quantum computer, based on ion traps. All the other quantum computers we've seen so far which are of this size, like 40 to 70 qubits, were using superconducting qubits. So, it is again a race between different technologies, but both of them are advancing, and there are some more which are growing. So, it looks like it's coming. Whenever I see a picture like this, I'm reminded of a nice joke from a colleague of mine, Steven Galbraith. laughter TL: Can you distinguish? Yep. So, with all this news coming up, the National Academy of Sciences in the US has been interviewing people for about the last year and a half. So they got people in physics and engineering building quantum computers, people doing quantum algorithms, people doing quantum error-correcting codes, and they put all of this together into a report, which just came out, where they look into, well, what is the progress and what are the prospects. So the first part is key findings, and the first one is the good news, saying don't panic: we do not expect that anything is going to happen in the next 10 years which will threaten RSA-2048 or similar, where I assume they also mean elliptic curves and discrete logs in finite fields. So that's the good news. But they wouldn't have just one key finding; it actually goes on, two, three, ..., ten, and by the time they reach 10, I think this is panic. So what they say is, well, it takes forever to roll out these things, and the hazard of such a machine is high enough. And then, "the deployment of post-quantum cryptography is critical for minimizing the chances of a potential security and privacy disaster". These are strong words from the National Academy of Sciences. DJB: So, OK, can we deploy post-quantum cryptography? Is it deployable? Well, some people would say we've already deployed it, but maybe that doesn't include the NIST submissions, so let's look at the deployability of the NIST submissions. The main thing that matters for deployment in most applications, the main problem for post-quantum cryptography, is the sizes.
So, here's a picture of the night sky over Leipzig. No, this is a picture where, on the horizontal axis, is the size of your public key for a bunch of signature systems - not all of the signature systems, for instance WalnutDSA, which is broken with a script in the first five versions - TL: Post-quantum RSA is missing. DJB: Yeah, post-quantum RSA is also omitted from this, sorry. TL: It's kind of, of - laughter DJB: Yeah, that would be like, I'm one of the designers of post-quantum RSA by the way. TL: I'm not. DJB: It's the future of cryptography. laughter DJB: That was good. Yeah. So what you can see here is, for example, this "gui" submission: you can get the vertical axis, the signature size, down to just 32 bytes or 35 bytes, something like that, but you need something like 400,000 bytes in your public key. And then there's three different dots for gui. Those are three different security levels, and maybe all the different submissions here are not exactly comparable on the security levels. It should be a three-dimensional graph, if we measured everything properly by exactly how secure it is, which of course we're not quite sure about until there's been enough security analysis. You can see, various different trade-offs are possible, none of which are down where we want to be, with things like under 100 bytes for your public key and under 100 bytes for your signature, which is what we're used to right now. That's what elliptic-curve crypto gives us: signature sizes and public-key sizes which are both below 100 bytes. And that's something you can fit into your applications much more easily than, say, 100,000-byte public keys or 10,000-byte signatures. There's various different trade-offs, and maybe your application can handle some of that, but there's nothing that's just really small, which is what we're used to right now. Another, more complicated graph. This is for encryption, and showing more candidates. Well, there are more encryption submissions. This is still not all of the encryption submissions, but a representative sample. And you can see that, well, there's still no really great sizes here. The best in terms of sizes is SIKE, supersingular isogeny key encapsulation, which is something like 400, a little less than 400 bytes for the public key and for the ciphertext, and then it starts getting bigger from there. And on this graph you get things up to a megabyte or bigger. You can get a little below 300-400 bytes, you can get down to 100 or so bytes for the ciphertext, as long as you are willing to accept a public key that's much, much bigger, with some of these code-based systems. And then, just to zoom in on some of the smaller ones, you can start seeing where some different candidates are. This is everything where public key and ciphertext are below 1280 bytes. And again, you see SIKE down there, a little below 400 bytes, and then some other possibilities. But well, what are the security levels of these things? Could all of these be broken? There's not actually that many of them; how many of these have actually been studied? It's kind of scary. And again, much bigger sizes than we're used to in cryptography. TL: So, yes, size does matter. laughter TL: So, Google and Cloudflare this year in April were saying, well, we don't really know what the outcome of this competition is going to be, but we have some categories of different cryptosystems, so let's just send dummy packets of data of the respective sizes and see what happens when we do this on the Internet.
Now, this is Google and Cloudflare, so they were doing this in the Chrome browser for connections that go through Cloudflare, so they could actually see from where it came, where it ended, whether it came back, whether it dropped. Dan mentioned SIKE, so this is the one category of supersingular isogenies; those are just 400 bytes. And that was pretty much fine, so when you look at the first column, you have a small latency increase. You also see the inaccuracy, that's a -0.2. So, the numbers are mostly correct. Then there is, within the lattices - this was the zoom-in that Dan was showing - we have structured lattices; those are around the MTU, so 1,100-something bytes. And that mostly worked, some small increases, but under 20 percent, so this is also something where we feel like, yes, you could actually deploy this on the Internet. Then a different category, still within lattices, are unstructured lattices. Those would come with 10 kilobytes of data for the public key. And there they just noticed that too many pages, including, like, top pages on the Alexa 500, such as LinkedIn, were just dropping. They tried, funnily enough, 9999, where fewer pages were dropping, so 10K was worse than 9999, but even then LinkedIn was still dropping. And so they decreased it to a third, as basically a placeholder, measured with these 3300 bytes, and then scaled up by a factor of three. Now, for those increases in the latency they said, well, this is not acceptable. So then, for the next experiments, they were only looking at isogenies and structured lattices. Isogenies are also special, not just in being the smallest, but also in being the slowest. Well, okay, not absolutely the slowest, but it's the only system where the speed is much more of an issue than the size. And so, despite Google having quite a few computers, they were saying we can't actually use isogenies, for the speed reasons. Size would be awesome, but speed, not so sure. It's also a relatively recent system, it's just from 2012. So maybe it's also a security question. So just now in December, they announced that they're actually building a new experiment, and they announced which candidate they have chosen: NTRU-HRSS, which Dan just mentioned was also one of the recent mergers. And so this is one of the structured lattice systems; the designers are Andreas Hülsing, Joost Rijneveld, Peter Schwabe and John Schanck. Great score for the Eindhoven team: current professor, former student, and then some collaborators. And so they're now building a system which is a combined elliptic-curve and post-quantum key exchange - so that's the combined EC, combined elliptic curves and post-quantum - and this is the second round of that running, and so this would come to some Internet browser near you soon. Another nice result, I mean, this is not the only thing up there: the IETF this year finished standardizing XMSS, which is a hash-based signature system. It's been in the making - down there you see the timeline - it's been in the making for three years. It's not really fast, but it's also the first of its kind. This was the first time that the IETF has published a post-quantum RFC - request for comments, but that's basically their standards. And so there's a lot of boilerplate text which was developed in the process of making this standard, which is dealing with, yeah, post-quantum, quantum attacks and so on, and how you should handle it. XMSS is interesting.
It's not one of the NIST submissions, because it doesn't satisfy the normal thing that you learn in primary school about what a signature should be doing. A signature scheme, you expect to have a key; you use it to sign something, and then you have a signature. XMSS: you have a public key, you have a state. You get a message, you sign it, and you update your state. And if you ever forget to update your state, you will lose security. So it's something which is not as cool as a normal signature scheme, but there are also many applications where you actually know how many signatures you've made. I mean, if you're doing operating-system updates, you'd better know how often you got your key out of the drawer and used it. So it is not impossible to use, but it might not be exactly what you want for your web applications. DJB: The good thing about XMSS is still: if you can count, then the size is much smaller than the signature systems that we were looking at before. Another size advance is something called Glowstick. I should explain the name. This is starting from a lattice submission called Saber, which is one of the unbroken ones, and Saber has a big version called FireSaber, high security level, scaled up. It also has a small version called LightSaber. laughter and groans DJB: And this Glowstick is the even smaller version. It's like, let's scale it down as far as we can while it's still not quite broken, and there's various technical details, and it hasn't been broken in the months since it's been proposed, the six months or so, and it is interesting. I mean, it is something which is a good challenge. It's nice to have these scaled-down problems, so we can try different ways of attacking these things; for people who like breaking stuff, it's good to have the simpler systems to practice doing attacks on, and that gives you some insight into what could work against the larger systems. TL: All right. So, since we're coming to funny names... Oh, no, since we're coming to sizes: why do we still care about big sizes? I mean, people are scaling things down. Google says, oh, we don't like big sizes. So why do people say post-quantum systems are bigger, and why do we still care about them? So one of the reasons is, well, highlighting McEliece here, which is our submission in this competition: these have had a lot of analysis. So Classic McEliece is based on a system from 1978, basically unchanged except for some changes where we can really prove this is the same as that. It has one of the shorter ciphertexts, just 200 bytes, so that's actually kind of tolerable on the Internet, but a megabyte of key. Key generation is also pretty slow, but then the - well, it's nowadays called encapsulation and decapsulation, because all you want is your AES key, you don't want to actually encrypt the message - but the basic thing of encryption and decryption of that is fast. So, nice system, very good history in security, pretty fast, pretty small, except for the public keys. It's like, "Grandma, why do you have such big keys?". Why are these keys so big? So one thing is, it's like a two-dimensional key. We have this big matrix there; what you see on the left is an identity matrix. This thing has about 7000 columns, so it's pretty long. The height is only n-k, which is just about fifteen hundred, so it's really long and stretched. And all of this part on your right-hand side, that part is random, and you have to send that. The identity part - you can of course, everybody remembers what an identity matrix looks like.
You can forget about that one, but this part you have to send, because the encryption works by the sender deciding which of those around 7000 columns he wants to pick, and then just XORing them up, and for doing that, well, you need to have this big matrix. And if you calculate, well, 1547 times 5413 - that's the right-hand part of the matrix - you're getting to this one-megabyte size. Now, what are the issues with having big keys? It's bandwidth, but honestly, when you download a picture, it's also a megabyte. So, it might not be so bad. I mean, if you're on the German trains then you will hate it, but - laughter TL: - normally, elsewhere in the world or on your mobile, it's fine. Google was saying, we actually excluded some systems, so they didn't experiment with Classic McEliece; the largest they looked at was 10,000 bytes, and even then some connections dropped, and they said, well, some are too large to be viable within TLS. So they just said, well, we don't do this. But then again, you have a secure system; you can just also design a secure protocol to go with it. We don't need to stick with TLS. But there is a real problem with having a megabyte of key, because if your protocol assumes that the client generates this one megabyte and then just throws it at the server, and the server has to accept one megabyte from every single client that is throwing a megabyte at it, and then has to do something with it, well, this is really an invitation for a denial-of-service attack, because you're allocating memory on the server for doing these operations. The operations are pretty fast, it's just XORing 0s and 1s, but you have to allocate 1 megabyte for each client. And so this is a problem; no matter what protocol we design, we have to deal with the possibility of denial-of-service attacks, or avoid them. So, can servers avoid storing these big keys? You want to XOR these columns, so one of the first implementations on small devices was saying, well, "I'm a very small device, but I can pick those positions, and then, outside world, please be nice to me and spoon-feed me one column at a time". So the small device memorizes 1500 bits, not so big, and then gets the next column. If it was selected, it XORs it in; if it wasn't selected, well, it keeps the intermediate state. So this works. And at the end, you output the normal ciphertext. But what we have here is, we're operating in a friendly environment where we do not expect this outside world to do something nasty to us, and also we have some memory. Now, if we put this on the real Internet and we don't want to have any state, then we cannot memorize these 1500 bits, because, well, we don't know when the next column is going to come, so we would have to output this and send it back to the client. That's not gonna work. When you tell the client, "Oh, this is my current result", then the server gets the next column, maybe XORs it in, maybe not, sends it back to the client - anybody who is watching this traffic could see whether there was a change or there was no change. So this is not a way of dealing with it. So what Dan and I have been busy with - and, well, I put 2018 with a question mark, we still have like three days, right? - so we're working on this system called McTiny, tiny because it's made for tiny web servers, where we assume that this web server is not allocating any per-client storage, any per-client state.
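Since the column-at-a-time idea comes up again in a moment, here is a rough sketch of it in Python. This is purely illustrative, not code from any submission; the sizes are the rough numbers mentioned above, the particular weight t is an assumption on our part, and a real Classic McEliece encapsulation involves more than this XOR.

```python
# Illustrative sketch only, not code from any NIST submission.
# A small device keeps an (n-k)-bit accumulator and is fed one public-key
# column at a time; selected columns are XORed in, the rest are skipped.
import secrets

n, k = 6960, 5413        # roughly the Classic McEliece sizes from the talk
r = n - k                # 1547: matrix height = ciphertext length in bits
t = 119                  # number of columns the sender picks (an assumed value)

def pick_positions():
    # the sender's secret choice: t distinct positions out of the n columns
    positions = set()
    while len(positions) < t:
        positions.add(secrets.randbelow(n))
    return positions

def spoon_fed_xor(columns, positions):
    # 'columns' yields (index, column) pairs, each column an r-bit integer;
    # only the random right-hand part of the matrix ever has to be sent,
    # since everybody knows what the identity columns look like.
    acc = 0
    for index, column in columns:
        if index in positions:
            acc ^= column          # fold the selected column into the r-bit state
    return acc                     # r = 1547 bits, roughly the 200-byte ciphertext
```

The point of this shape is exactly what was just described: the device only ever holds the roughly 1500-bit intermediate result, never the full megabyte. McTiny keeps that shape but hardens it, so that an untrusted network can do the feeding.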
And so, we again work with spoon-feeding things, but we're making sure that everything that the server gets and sends out is encrypted and authenticated, and there is some stuff to avoid replay attacks, so that somebody can't just say, "oh, what if I change the column here or there". So all these things are encrypted, and what we do is we use properties of doing these sums in pieces, by chunking up this big matrix into chewable pieces that are small enough to fit in one MTU and still leave some space for some cookies. So this is similar to normal uses of cookies: this is a cookie, encrypted to the server, sent to a client - you handle the storage - and then the client sends the next piece and sends the old cookie. Now, the cookie is encrypted, but the way that the key is handled is the same for all clients. So there is no per-client storage of any keys; it's a symmetric key, it's pretty small. So that's the one thing that the server remembers, and then it gets a packet, and from this cookie part it recovers, like, which columns to pick and what's the intermediate result, and then it does some computation and sends it back. So, the result of this is that we need several round trips, but there's absolutely no per-client state on the server. DJB: Of course you could say, well, there's still all that bandwidth, and what if you do have bandwidth problems; but some people say that we're familiar with sending a lot of data around, so that's really not a big deal. Something else that could interfere with deployment is patents. Now, Tanja mentioned before that Classic McEliece does not have patents, but what if somebody says, "I just don't want to handle the megabyte", or for whatever reason people want something smaller, or there are signature questions. Well, we have a lot of information about some systems which are patented, the 18 systems shown here, because NIST had as one of the rules for the competition that you had to deliver statements to them, signed by every member of the submission team, saying either "we do not have patents or patent applications on our submission", or "here are the patents and patent applications, here are their numbers". And so, OK, as a result of that, NIST, after checking they had a complete pile of statements, put them online, so now we know that these are exactly the 18 submissions where the submission teams claim patents on their submissions, including for example Compact LWE and DME and WalnutDSA, which were rapidly broken by scripts that are online. And RLCE-KEM - that one has half of the parameters broken, and there's another half which are not broken. It's not that the patented submissions are somehow better than the rest of the submissions, but, well, for some reason people think they can make money off of patents, and maybe they're not actually so wrong, because you can't just throw away these 18 submissions and say "that's the end of it". The problem is that there are some patents which cover more submissions. Now, NIST does not require the submitters to say which other submissions are covered by their patents; the submitters are only required to say something about their own submissions, and also NIST has no way to say anything about whatever random patent trolls are out there that have not submitted anything. They can't impose any rules on that. I mean, of course you can try doing patent searches, but you won't necessarily find things, for instance this patent.
Nobody noticed, until it was revealed by a member of some submission team, this patent, which was issued in 2015 - at the top there - which might make you think, "oh, if something was published before 2015, it would be okay", and some submissions were published earlier. But what's important is the date down at the bottom here, which is the priority date of February 18th, 2010. If you look on Google Patents, one good thing is they put the priority date pretty far up, where you can easily see it. What this means is that, in order to be prior art for this patent, well, you have to check what exactly they filed in 2010. They might have made later changes, but for the 2010 filing - assuming that has all the same stuff as the patent, which it's possible to find out - anything that was published after 2010 is not prior art for this. Now, what's really scary about this patent - and I hope that really soon I'm gonna have analysis online of which submissions are covered by which patents, of all the patents I've seen - this one is by far the scariest, because this one covers a whole bunch of submissions. This one covers basically every submission which is using what's called the LPR cryptosystem, a ring-LWE lattice-based cryptosystem. This is a very popular type of lattice-based cryptosystem, which was published by L, P, and R - Lyubashevsky, Peikert, and Regev - in May 2010, which is after this patent application was filed. Now, there was a talk in April which had the same stuff from LPR, and it seems like there might even have been a talk in January from LPR, but they didn't put the slides online, and then, well, it starts getting into interesting questions of patent law. This looks like a very strong patent, covering a whole lot of submissions, and there are more cases; there's a whole company called ISARA that is specializing in planting landmines, patent landmines, around things that other people are doing, sometimes on things that other people have already published, and then you get a court fight about it. This is gonna be a problem; it's something we really have to watch out for, what is patented, and again, I hope to be done sometime soon with some patent analysis. Of course, some people would say that we don't have to worry about patents as long as we find something that we can deploy, that somebody tries deploying it and they don't get sued. TL: Not sure that's going to be deployed anytime soon. I mean, 95 out of 3000... laughter okay. All right. Funny names, I said. So what do you see here? Can anybody read phonetics? Yeah? "si-said". Okay, so, seaside. Now, CSIDH is what you really always wanted. CSIDH is an efficient post-quantum commutative group action. Did you know that you wanted a commutative group action? Actually, you did. So, what all people ask me is, like, "I'm using Diffie-Hellman these days. What can you give me in the post-quantum world?". And then it depends a lot on how you define Diffie-Hellman. Some features that we have come to use from Diffie-Hellman are that, well, you publish a public key, I publish a public key, other people publish public keys, and we can reuse them. Kind of nice. Also, we don't have to talk to each other. We can just look up the other public key in the phone book, have a shared key, and start using that one. And if I send you something encrypted with our shared key - the combination of our public keys - then you will be able to decrypt this.
There are some nice other features; you can blind things, you can take your g to the a and then compute in the exponent, multiply by some r, so put some blinding factors in there, and there is no difference whether I'm the initiator or the responder in this. We don't have this anywhere else in post-quantum cryptography. All the systems that you see among the NIST submissions make a difference between whether you are the sender or the responder. So this is the first efficient post-quantum, well, Diffie-Hellman-like thing, which in fancy math is called a commutative group action. So, if you are a user, you don't want to know all the details - I'm not going to give an entire talk about this, unless, maybe, next year. What is exposed to you is just one single finite-field element. So there is some fixed prime that all the people in the system know, and then everybody's public key is just one single field element. So Alice computes her field element, Bob computes his field element, they post these somewhere, and then sometime later, years later, maybe when quantum computers are built, they find each other, they compute their shared public key - they combine the public keys into their shared secret key, sorry - and then they have the shared secret. Now, a little bit of the math behind this. A actually appears in some form there, so this is one of the elliptic curves that we've been talking about in, gosh, when was this, 2013 or so. No, 14 at least. DJB: 14. TL: So there's E_A: y^2 = x^3 + Ax^2 + x, with this public key A as the coefficient, and the computation to go from one key to another key is using an isogeny - the same isogeny kind of thing that you heard about in SIKE before - it's a math object that just means you move from one elliptic curve to another elliptic curve. If somebody tells you to implement this, what you need to get going is, well, take this prime p, compute modulo p for additions, multiplications and divisions; out of those you for instance build the curve operations, and then some more operations which compute an isogeny, but all of those are just combined from those things. So there's nothing particularly scary behind it, except for, well, we came up with this thing in January 2018 at this lovely beach. It was great there, but please don't use it yet. Experiment with it all you want, but this has not had enough analysis. But another reason why you might want this is security, key sizes and so on. So, what are we looking at? First of all, how many keys are there? So how big do we have to make this p? When you have fixed your prime p, say n bits, then there are square root of p, so 2 to the n over 2, many such curves. So that's the number of public keys. And then, similar to how the meet-in-the-middle attacks on the elliptic-curve discrete log work - so basically smart brute-force search - you get the square root of the number of keys as your security. So if you have square root of p many keys, it takes fourth root of p time to find out what Alice's key is. So if you want 128-bit security, you have to choose your prime p with four times as many bits, so a 512-bit prime. But this is a talk on post-quantum cryptography. So where do we stand there? Elliptic curves would be totally broken. Nicely enough, for isogenies we don't have any complete break.
There are some sub-exponential attacks, so it doesn't have the full exponential security we would maybe like to have; on the other hand, with RSA and finite-field Diffie-Hellman we have been dealing with the growth of keys under sub-exponential attacks, so this is something we're familiar with. It doesn't kill things, but if you look at the literature, it's mostly asymptotics, so we, and also some others, have been looking into the details. I think our analysis, which we have at quantum.isogeny.org, is the most detailed one, looking into, like, what actual security you get against somebody with a really, really big quantum computer. Now, with elliptic curves you'll hopefully also have learned that you must always validate: you get a point, somebody says "oh, this is my public key", and the first thing you do is check, is this thing on the curve, does it have the right order? Same thing for isogeny-based crypto. For CSIDH you have a very quick check: you check that this curve has the right number of points - you know what it should be - and you don't even need to do full point counting, you just take a point, do some scalar multiplications, and check it. This is another thing that we've gotten totally used to doing, and another thing that is really, really, really hard for most post-quantum systems. For most post-quantum systems you have to add another proof to it; so typically, when you encrypt to somebody's key and you're sending something which looks like a key, you reveal all the secrets in there - that's why you can't reuse it - or you have to do a big zero-knowledge proof to prove that you actually generated this thing properly. With CSIDH, it's just there. All you need to do is check that it's a valid curve, and you're done. The size is also pretty neat: 32 bytes for the secret key, 64 bytes for the public key. So just twice as large as normal elliptic curves; that is really like the bottom-left corner of Dan's graph where there was nothing, so CSIDH does fill a big gap - a big gap, but with a small key - with something which pre-quantum has at least 128-bit security, and post-quantum, well, what NIST was asking for is comparisons with AES-128, and then you look at, like, how big are the sizes, how big are the quantum computers and so on. And we think that CSIDH-512, to the best of our knowledge, based on the latest analysis, will be as secure as that. There is some code written by Lorenz - somewhere? Ah, Lorenz. Yay. This is on Skylake. Your mileage may vary. It's not a super-quick hack, but it's not deployment code. So this is not yet constant time; there are some others who've been working on constant time, and it makes it about three times slower. It is similar to SIKE in that it has really nice small keys, but is somewhat slow. On the other hand, this is still very new, it's just from January. So we're still figuring out ways to make it faster, whereas, well, SIKE has had a lot of work put into getting to the speed that they have now. So there's hope that this will get faster, and there's some hope it will remain unbroken until next year, but, well, I'm not sure yet where I'd put my money; at this moment I think actually CSIDH has a better chance than SIKE of surviving, but who knows. Don't use it for anything yet.
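To keep the quantities from the last few minutes in one place, here they are written out symbolically. This only restates what was said above; the square brackets are shorthand for applying a secret to a curve.

```latex
\[
E_A:\; y^2 = x^3 + A x^2 + x \quad\text{over } \mathbb{F}_p,
\qquad \text{public key} = A \in \mathbb{F}_p .
\]
\[
\text{Alice: } E_A = [a]E_0, \qquad
\text{Bob: } E_B = [b]E_0, \qquad
\text{shared: } [a]E_B = [a][b]E_0 = [b][a]E_0 = [b]E_A .
\]
\[
\#\{\text{public keys}\} \approx \sqrt{p} = 2^{n/2}, \qquad
\text{classical attack cost} \approx \sqrt{\#\{\text{keys}\}} = p^{1/4}, \qquad
128\text{-bit security} \;\Rightarrow\; n \approx 4\cdot 128 = 512 .
\]
```

A 512-bit prime is exactly what gives the 64-byte public keys mentioned above.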
DJB: Speaking of broke, there's a lot of people who are investing in cryptocurrencies - laughter DJB: - and I think, I think it's Nick Mathewson's fault, this whole quantum cyber blockchain idea; if you know of something earlier than 2016, well, anyway, there have been variations of it since then, like quantum AI blockchain, apparently you can buy the t-shirt. We have about 10 minutes left, so I'd like to finish things off with some comments on software. Now, this is looking back at 40 years of public-key cryptography - well, RSA was from '77 or so, the McEliece cryptosystem from '78 - and then, well, here's a schematic of what the software quality has been in cryptography, on a scale of "good", "bad", "terrible", and "horrifying". 1978, I don't actually know, I haven't seen software from back then. By 1988, it was clear that the software quality was horrifying. By 1998, it had moved up to terrible. By 2008, it moved up to bad. And by 2018, it has jumped back down to horrifying. laughter and applause DJB: And of course a major contributor to this is all of these NIST submissions, which have code written by mathematicians, who can barely implement anything and certainly don't have good code quality. There's occasional submission teams that have people who can write code, but in general, if you, well, for a good time, pick up a random - TL: McEliece is fine! DJB: Yeah, the Classic McEliece code is fine. There's other submissions where the code is fine, but if you just take a random submission and look at the code, it's, well, interesting. If you would like to find out where the software is and download it... laughter DJB: Yeah, NIST doesn't work very well. I did look: archive.org has - like, you search for "NIST round one" on DuckDuckGo, and the top link is to the NIST page, and then you take the URL for that, put it into archive.org, and I tried a few of the submissions, and the zip files that NIST prepared with the specifications and the code, those are available from archive.org. I guess they got most or all of them. You can also look, for more than half the submissions, at upstream websites with newer code. NIST has not updated the code, but lots of submission teams have. Lots of the fastest code, and even some further improved code, is available in our SUPERCOP benchmarking framework - this is the System for Unified Performance Evaluation Related to Cryptographic Operations and Primitives, bench.cr.yp.to - and this one has, well, 170 primitives from 40 of the 69 submissions. Might have accidentally left out all of the patented submissions; well, the SUPERCOP policy is: anybody who sends us code to put in, we'll benchmark it, it doesn't have to be unpatented, it doesn't have to be secure - we benchmark MD5, we benchmark RSA-512. But anyways, there's 40 submissions where code is in there, from other people or from me painfully going through getting code to actually work. The primitives are, for instance, RSA-512, and RSA-1024, and RSA-2048. They're all RSA, but they're different primitives, or different mathematical functions with different security levels, and, well, in these submissions there's typically three different security levels, sometimes more choices, sometimes less, and then a lot of the primitives have multiple implementations, like reference code and optimized code for different platforms maybe. So, okay, a lot of those are collected in this benchmarking framework, all with the same API.
libpqcrypto - I think I might have a few minutes to say a little more about this - libpqcrypto is focusing on having an API which is suitable for cryptographic deployment in the future. If you imagine that the implementation quality of the underlying crypto is dramatically improved, at least that interface layer is supposed to be something that you'll be able to use. Some more examples of things out there: pqm4 is a library optimized for small ARM microcontrollers, the ARM Cortex-M4, or pqhw is for FPGAs, and this last one, Open Quantum Safe - they don't have as many primitives maybe as libpqcrypto or SUPERCOP, but what's cool about that project is they've got working integrations of all of these into OpenSSL and OpenSSH. So if you're in, say, the TLS world, then that's clearly the way to use these post-quantum proposals, at least quite a few of them, inside TLS. Okay, let me look a little bit at libpqcrypto and then we'll finish this off. There's lots of cryptographic libraries which give you a nice, simple API for hashing. They'll have some simple function like sha256, which takes a message - okay, in C you have to say there's a pointer to the beginning of the message plus the length of the message - and then gives you back some 256-bit, 32-byte hash. In a higher-level language, of course, you say something like "h = sha256(m)", and m knows what its length is, but in C it looks like you have h and m and the length of m as arguments. Why not do this for all cryptographic functions? Somehow, it's really weird, lots of cryptographic libraries just have a nice, simple interface for hashing, and then if you want to do something like public-key signatures, it's, well, okay, first we're going to find the factory which is producing the keys, or the generator method for the key, blah blah blah. What libpqcrypto does is, it simply says: you sign something, with whichever signature scheme; you tell it, well, I'm going to put the signed message somewhere, and then the length of the signed message is an output, the message you're signing and its length are inputs, and your secret key is an input. And then it just takes everything in wire format and produces everything in wire format. You don't have to have conversion functions, input/output serializations etc. This is actually an API that goes back to - we've been doing SUPERCOP for many years, and SUPERCOP, the NaCl ("salt") library, libsodium etc. are all using the same API. And this is something which - actually, people have measured the impact on usability of cryptographic libraries depending on the different APIs provided by these libraries, and so we're pretty confident about the benefits of having a nice, simple way to use crypto. NIST looked at this and said, well, OK, yeah, people should submit something to the post-quantum competition using this API, but they didn't have test code that people could use to make sure that they were following the rules. They didn't require that everybody pass any particular set of tests, and they accepted submissions which didn't work, for example, in SUPERCOP. So, well, OK, that's why I had to do a bunch of work to integrate a bunch of submissions into SUPERCOP. But it's been sufficiently close to everybody using this that there has been a lot of code shared between these different projects; Open Quantum Safe is also starting from the same API and then providing higher-level integrations into OpenSSL and OpenSSH. OK.
There's a bunch of different signature systems and a bunch of different encryption systems in libpqcrypto. Here's an example of what the higher-level API looks like in Python. If you want to use libpqcrypto and sign a message, well, first somebody has to generate a public key and a secret key using the pqcrypto library. Signature system - here's one of the signature systems: SPHINCS is a stateless hash-based signature system, you don't have to record anything when you sign a message, and then 128 is security level 2 to the 128, using the SHA-256 hash, and, well, you just have to know this name, and then you say: give me a key pair, sign a message using a secret key, open a message using a public key. Now, this is not "you get a signature and then you verify a message and a signature". This is another little API detail, designed to protect people against screwing up. There's lots of applications which verify signatures, and then if the verification fails, nobody's ever tested it, and the verification failure is ignored. What actually works to protect application programmers is to have an interface where you have a signed message as one bundle; it goes into opening a signed message and producing a message, and the cryptographic library does not produce a message as output if the signature was invalid. So the signed message is handled by the cryptographic library, which produces a message only if the signature is valid. Also, there's an exception being raised, but even if you ignore the exception in Python, or if you're using a lower-level language without exceptions, then you just aren't given back a message. There's lots of little thought like this that goes into the API. Maybe a bigger example in Python; this is a whole thing of using the library, generating a key, signing some random message and opening the message. OK. What's going to happen in libpqcrypto coming up is, first of all, one of the big problems with code quality is there's lots of exposure to timing attacks. I saw a great talk earlier today about Spectre, and there's lots and lots of these attacks, and part of fixing these attacks is fixing software, along with the lots of hardware fixes we're going to have to do. There's been some work on some implementations to fix this, but much more is required. We need a lot more work on correctness. Lots and lots of the code doesn't even pass address sanitizer. And this was - I don't want to tell you how much pain it was to get code working under address sanitizer, where, I mean, anybody writing code professionally is going to be using these automatic tests as they're writing the code, and this is something that just doesn't happen when you ask a bunch of mathematicians to write code. Formal verification: that will do much more than testing, and much more than, say, address sanitizer does, and much more than even an expert auditor will do. That formal verification is going to guarantee that your code is doing what it's supposed to for every possible input, which I used to be very skeptical about, because it seemed so painful to do for any realistic code, but I've started getting much more enthusiastic about it, because the tools are getting much, much better.
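For readers following along without the slide, here is a minimal sketch of the Python flow just described. The exact module path is an assumption pieced together from how the name is spelled out in the talk (SPHINCS, 2 to the 128, SHA-256); the keypair/sign/open shape is the point.

```python
# Minimal sketch of the libpqcrypto Python API as described in the talk.
# The module name below is an assumption ("SPHINCS, 128, SHA-256").
import os
import pqcrypto

sig = pqcrypto.sign.sphincs128sha256   # stateless hash-based signatures

pk, sk = sig.keypair()     # wire-format public key and secret key (bytes)
m = os.urandom(32)         # some random message, as in the talk's example
sm = sig.sign(m, sk)       # one bundle: the signed message, not a bare signature
m2 = sig.open(sm, pk)      # returns the message only if the signature checks out;
                           # on a bad signature it raises, and you never see m2
assert m2 == m
```

The design choice described above is visible in the last lines: there is no separate verify call returning a boolean that an application can forget to check; an invalid signature simply never yields a message.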
And one example of the formal verification work I did was a sorting verification, where some really fast sorting code is completely verified to work correctly. The machine code is what's verified, so you compile it, and even if there are compiler bugs, the machine code is what's verified; the verification isn't going to rely on some compiler being correct. This is using the angr toolkit; also, I don't know if there are any Trail of Bits people here, Manticore I understand has similar features, I used angr, but, well. There are really cool tools out there for doing symbolic execution and, as part of that, formal verification. Speed is important, and trying to get the code volume down: there's lots of duplication, we need more internal libraries to get post-quantum crypto on a smaller, easier to review code base. And finally, hopefully, at the end of all this we'll be able to throw away as many primitives as possible and focus on a small number of things, where we can say "We've really, seriously reviewed these. We've reviewed the designs, we've reviewed the implementations, and we're confident that these things are secure." That's it. Thank you for your attention. applause Herald: Thank you Tanja & Daniel. If you would like to leave at this point, please do that very quietly, we'll have a short run of Q&A. Signal angel, your first question. Microphone: [unintelligible] from all the code-based submissions, are there any other ones that use smaller keys? TL: Use smaller keys, you said? Signal angel: Yeah. From all the submissions - TL: Yeah, so code-based cryptography, there's two submissions, Classic McEliece, which I highlighted because it's ours, and NTS-KEM, which both have these gigantic keys. Both of those are using Goppa codes, which is what has received the most analysis so far, but in this gigantic list of - Yep. What Dan is showing here, several of those are actually code-based. So, BIG QUAKE for instance, down there, is a code-based system, then - DJB: LAKE, that's, that's - TL: BIKE is one, LAKE is one, down there, so LAKE would fit this, with very small keys and ciphertexts. The downside is, it is using far less well-studied codes. So we need to see how that develops. Herald: Thank you. For people in the room, please try to limit your questions to a single sentence. Microphone number 3, your question. Microphone 3: OK. How exactly do you define post-quantum crypto? I mean, you have Shor's algorithm, you have the other algorithms, but do you say, OK, it's just secure against factoring, discrete logarithms, or do you also take into account optimization problems and stuff like that? DJB: Yeah. So, I mean, the definition is, we're trying to protect against any attacker who has a big quantum computer, and we have a rough understanding of what quantum computers can do, because they're limited by the laws of quantum physics. Which tells us that, OK, if you can build a computer that supports what are called Toffoli gates and Hadamard gates, then you can, well, it's not completely proven, but it's very plausible that you can simulate any quantum computation at that point. Microphone 3: Yes, but that's the universal model. DJB: Yes, you have a universal quantum computer at that point. The problem is, how do we know, even if we say, well, OK, by believing that quantum physics is everything we can do in the universe, that tells us we have a computation built out of Hadamards and Toffolis. That doesn't tell you what kinds of algorithms you can put together.
And there's this big problem, which has always been a problem for cryptography: trying to imagine what all possible algorithms are, and sometimes people miss something. So if somebody ever tells you "oh, the system is provably secure, there can't possibly be an algorithm which is faster than this to break the system", no, there are no guarantees, and lots of people have been overconfident and burned, because there is actually a faster algorithm. We've had a lot of work on people trying to figure out good algorithms using quantum computers, for instance the sub-exponential attacks that Tanja was mentioning against CSIDH, and that's something where there's a long history to those attacks, starting with Kuperberg's algorithm, and this is going beyond Shor's algorithm and Grover's algorithm. And it's really important to look more at what sort of quantum algorithms could attack cryptographic systems. There's been some initial work, but there definitely needs to be more. TL: I mean, our attacker is allowed to do whatever they want. That's why I'm showing this as the attacker. The attacker is not playing by the rules. The only thing we know is our attacker has a quantum computer. Microphone 3: Okay. Herald: Right, signal angel, your next question. Signal angel: Question from the Internet. Size does matter, but what about the performance of post-quantum cryptography compared to classical algorithms on embedded or FPGA devices, for example for firmware signing or communication encryption? TL: OK. So, on the big list, quickly firing it up: pqm4, that's using an ARM Cortex-M4, so that's a rather small device. They did not implement all algorithms, and for some of them they said it is very cumbersome to do with the big keys. So yes, it's more of an issue. I mean, we're spoiled with elliptic curves, just having 256 bits there, and all the systems are larger than that. CSIDH is the closest you get, but then it has the big computation. But there is effort, and the smaller and better-fitting systems have been implemented. Hopefully we'll get better. Herald: Thanks. Microphone number 4, your question. Microphone 4: You said, when Google did some tests, they said it's just too slow, they cannot really use it. Would the solution be acceleration units, like those used for AES in CPUs? TL: So, Google was excluding the use of the supersingular isogenies based on speed; I assume that's what you mean, rather than the big ones with bandwidth. I don't know all the details of it. My assumption is, it was also factoring in the security, like how much time people have spent on analyzing it, which made them more comfortable with the structured lattices than with the supersingular isogenies. You can speed things up if you have a big engine, which would be manufactured to do finite-field arithmetic, but that is much, much bigger than, say, an AES engine. DJB: Maybe just an extra comment. I think that the choice they made of NTRU-HRSS is really an excellent choice: it's something which is small enough to fit into most applications, I mean 1,000 bytes or so, it's much bigger than elliptic-curve crypto, but compared to all the data we tend to send around, it usually fits; unless you've got, like, some really small communication happening, you usually can fit a kilobyte or so, which is the NTRU-HRSS size. And it's something which has got some history of study.
I would be the last person to say that lattices are definitely secure, and actually our NTRU Prime submission is worried about ways that something like NTRU-HRSS could maybe be broken, but there's no evidence of any problems, and NTRU has held up for about 20 years of study without being broken. It's also reasonably fast, so it's a reasonable compromise between the different constraints of trying to have something secure and not ridiculously big, and, well, if it gets broken, then we're in trouble. But hopefully it's okay. Herald: Thanks. Signal angel, the final question please. Signal angel: The final question. Can CSIDH run on a hardware accelerator made for regular elliptic curves, or is the handling of isogenies more problematic? TL: All right, so it depends what your hardware accelerator has. If it's one with fairly generic elliptic-curve arithmetic, you can probably use it. We're getting some speed from not using elliptic curves in Weierstrass form but in Montgomery form, so you would probably want to modify the accelerators currently in use to fit this better. Also, most systems are optimized for 256-bit elliptic curves, or 384; with 512 bits we're a little bit outside, but most of the operations would look just the same. Most of the time is spent on doing a big scalar multiplication, and then we have some operations in these isogenies, but they are fairly similar. If you have, like, the field arithmetic built up, you can just put these together and have an isogeny computation as well. So yes, it can get faster. As I said, this is from January, we're still working on the security analysis, so don't build any hardware quite yet. laughter Herald: Thank you so much. Please give them a huge round of applause for the talk. applause postroll music subtitles created by c3subtitles.de in the year 2020. Join, and help us!
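To illustrate the Montgomery-form scalar multiplication mentioned in the last Q&A answer above, here is a minimal, unoptimized Python sketch of the x-only Montgomery ladder over a prime field, the kind of operation an elliptic-curve accelerator would speed up. The parameters in the example call are Curve25519's, used only to show the call shape; CSIDH uses different curves and a 512-bit prime, and real implementations need constant-time field arithmetic rather than Python integers.

```python
# Sketch of the x-only Montgomery ladder (RFC 7748 style), i.e. the big scalar
# multiplication mentioned above. Not constant time; for illustration only.

def ladder(k, x1, A, p):
    """Return the x-coordinate of k*P, where P has x-coordinate x1 on the
    Montgomery curve B*y^2 = x^3 + A*x^2 + x over GF(p)."""
    a24 = (A + 2) * pow(4, p - 2, p) % p
    x2, z2 = 1, 0          # "point at infinity" in projective x-only form
    x3, z3 = x1 % p, 1     # the base point
    swap = 0
    for t in reversed(range(k.bit_length())):
        kt = (k >> t) & 1
        swap ^= kt
        if swap:  # conditional swap; real code would do this in constant time
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = kt
        # one combined doubling / differential-addition ladder step
        a = (x2 + z2) % p; aa = a * a % p
        b = (x2 - z2) % p; bb = b * b % p
        e = (aa - bb) % p
        c = (x3 + z3) % p; d = (x3 - z3) % p
        da = d * a % p; cb = c * b % p
        x3 = (da + cb) % p; x3 = x3 * x3 % p
        z3 = (da - cb) % p; z3 = x1 * z3 * z3 % p
        x2 = aa * bb % p
        z2 = e * (aa + a24 * e) % p
    if swap:
        x2, z2 = x3, z3
    return x2 * pow(z2, p - 2, p) % p   # convert back to affine x

# Example call using Curve25519's parameters (A = 486662, p = 2^255 - 19),
# base point x = 9, scalar 7 -- just to show the shape of the computation.
print(ladder(7, 9, 486662, 2**255 - 19))
```

As the answer above notes, the isogeny steps in CSIDH are built from the same field operations, which is why an accelerator providing generic field and Montgomery-curve arithmetic can help with both.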