36c3 preroll music Herald: The next talk is on how to break PDFs, breaking the encryption and the signatures, by Fabian Ising and Vladislav Mladenov. Their talk was accepted at CCS this year in London, which took place in November. It comes from research that basically produced two different papers, and people worldwide have been interested in what has been going on. Please give them a great round of applause and welcome them to the stage. Applause Vladi: So can you hear me? Yeah. Perfect. OK. Now you can see the slides. My name is Vladislav Mladenov, or just Vladi if you have some questions for me, and this is Fabian. And we are allowed today to talk about how to break PDF security or, more specifically, about how to break the cryptographic operations in PDF files. We are a large team from the university of Bochum, Münster and Hackmanit GmbH. So as I mentioned: We will talk about cryptography and PDF files. Does it work? Fabian: All right. OK. Let's try that again. Okay. Vladi: Perfect. This talk will consist of two parts. The first part is about digitally signed PDF files, and how can we recognize such files? If we open them we see the information that the file was signed and that all verification procedures were valid, and more information in the signature validation panel, including information about who signed this file. This is the first part of the talk and I will present this topic. And the second part is about encrypted PDF files, and how can we recognize such files? If you try to open such files, the first thing you see is the password prompt. And after entering the correct password, the file is decrypted and you can read the content within this file. If you open it with Adobe, additional information on whether this file is secured or not is displayed as well. And this is the second part of our talk, and Fabian will talk about how we can break the PDF encryption.
So before we start with the attacks on signatures or encryption, we first need some basics. And after six slides, you will be experts regarding PDF files and you will understand everything about it. But maybe it's a little bit boring, so be patient: there are only 6 slides. So the first is quite easy. PDF files are... the first specification was in 1993 and almost at the beginning PDF cryptography operations like signatures and encryption was already there. The last version is PDF 2.0 and it was released in 2017. And according to Adobe 1.6 billion files are on the web and perhaps more exchange beyond the web. So basically PDF files are everywhere. And that's the reason why we consider this topic and tried to find or to analyze the security of the features. If we have some very simple file and we open it with Adobe Reader, the first thing we see is, of course, the content. "Hello, world!" in this case, and additional information regarding the focused page and how many pages this document has. But what would happen if we don't use a PDF viewer and just use some text editor? We use the Notepad++ to open and later manipulate the files. So I will zoom this thing... this file. And the first thing we see is that we can read it. Perhaps it's quite, quite funny. And but we can still extract some information of this file. For example, some information regarding the pages. And here you can see the information that the PDF file consists of one page. But more interesting is that we can see the content of the file itself. So the lessons we learned is that we can use a simple text editor to view and edit PDF files. And for our attacks, we used only this text editor. So let's go to the details. How PDF files are structured and how they are processed. PDF files consist of 4 parts: header, body and body is the most important part of the PDF files. The body contains the entire information presented to the user. And 2 other sections: Xref section and trailer. 
A very important thing about processing PDF files is that they're processed not from the top to the bottom, but from the bottom to the top. So the first thing that the PDF viewer analyses or processes is the trailer. So let's start doing that. What information is stored in this trailer? Basically, there are two very important pieces of information. The first one is: what is the root element of this PDF? So which is the first object which will be processed? And the second important piece of information is where the Xref section starts. It's just a byte offset pointing to the position of the Xref section within the PDF file. So this pointer, as mentioned before, points to the Xref section. But what is the Xref section about? The Xref section is a catalog holding the information where the objects defined in the body are located, that is, the byte positions of these objects. So how can we read this weird Xref section? The first information we extract is that the first object which is defined here is the object with ID 0, and we have 5 further elements or objects which are defined. So the first object is here. The first entry is the byte position within the file. The second is its generation number. And the last character indicates whether this object is in use or not. So reading this Xref section, we extract the information that the object with ID 0 is at byte position 0 and is not in use. The object with ID 1 is at position 9, and so on and so forth. The object number comes from counting: 0, 1, 2, 3 and 4. So the object with ID 4 can be found at offset 184 and it's in use. In other words, the PDF viewer knows where each object will be found and can properly display it and process it. Now we come to the most important part: the body. And I mentioned that the body contains the entire content which is presented to the user. So let's see.
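The way a viewer reads such an Xref section can be sketched in a few lines of Python. This is only an illustration of the table layout described above, not how any particular viewer implements it. The function name `parse_xref` is made up for this sketch, and the offsets for objects 2 and 3 are invented placeholders, since the talk only gives the values for objects 0, 1 and 4.

```python
def parse_xref(section: str) -> dict:
    """Parse a classic Xref table into {object_id: (offset, generation, in_use)}."""
    lines = section.strip().splitlines()
    if lines[0].strip() != "xref":
        raise ValueError("not an Xref section")
    # Subsection header: ID of the first object and number of entries.
    first_id, count = map(int, lines[1].split())
    entries = {}
    for obj_id, line in enumerate(lines[2:2 + count], start=first_id):
        offset, generation, flag = line.split()
        # 'n' marks an object in use, 'f' a free (unused) entry.
        entries[obj_id] = (int(offset), int(generation), flag == "n")
    return entries


# Object 0, 1 and 4 follow the talk; the offsets of 2 and 3 are placeholders.
SAMPLE_XREF = """xref
0 5
0000000000 65535 f
0000000009 00000 n
0000000058 00000 n
0000000115 00000 n
0000000184 00000 n
"""

table = parse_xref(SAMPLE_XREF)
print(table[4])  # (184, 0, True)
```

The object ID is never written in an entry; it follows from the position in the table, which is the counting "0, 1, 2, 3 and 4" mentioned in the talk.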
Object 4 0 is this one and as you can see, it contains the words "Hello World". The other objects are referenced, too. So each pointer points exactly to the starting position of each of the objects. And how can we read this object? You see, we have an object starting with the ID number, then the generation number and the word "obj". So you now know where the object starts and where it ends. Now how can we process this body? As I mentioned before, in the trailer there was a reference to the root element, and this element had ID 1 and generation number 0. So now we start reading the document here and we have a catalog and a reference to some pages. Pages is just a description of all the pages contained within the file. And what we can see here is that we have this number, count one, so we have only one page, and a reference to the page object which contains the entire description of the page. If we have multiple pages, then we will have multiple elements here. Then we have one page. And here we have the contents, which is a reference to the string we already saw. Perfect. If you understand this then you know everything or almost everything about PDF files. Now you can just use your editor and open such files and analyze them. Then we need one feature... I forgot the last part. The simplest one. The header. It is just one line stating which version is used. For example, in our case, 1.4. For the latest version, 2.0 will be stated here. Now, we need this one feature called "Incremental Update". And do you know this feature of highlighting something in the PDF file or putting in some sticky notes? Technically, it's called "incremental update." I just call it reviewing the master and bachelor theses of my students because this is exactly the procedure I follow. I just read the text and highlight something and store the information I added. Technically, by putting in such a sticky note,
this additional information is appended after the end of the file. So we have a body update which contains exactly the information additionally of the new objects and of course, new Xref section and a new trailer pointing to this new object. Okay, we are done. Considering incremental update, we saw that it is used mainly for sticky notes or highlighting. But we observed something which is very important because an incremental update we can redefine existing objects, for example, we can redefine the object with ID 4 and put new content. So we replace in this manner the word "Hello World" with another sentence and of course the Xref section and the trailer point to this new object. So this is very important. With incremental update we are not stuck to only adding some highlighting or notes. We can redefine already existing content and perhaps we need this for the attacks we will present. So let's talk about PDF signatures. First, we need a difference between electronic signature and digital signature. Electronic signature. From a technical point of view, it's just an image. I just wrote it on my PC and put it into the file. There is no cryptographic protection. It could be me lying on the beach doing something. From cryptographic point of view is the same. It does not provide any security, any cryptographic security. What we will talk about here is about digitally signed files, so if you open such files, you have the additional information regarding the validation about the signatures and who signed this PDF file. So as I mentioned before, this talk will concentrate only on these digitally signed PDF files. How? What kind of process is behind digitally signing PDF files? Imagine we have this abstract overview of a PDF document. We have the header, body, Xref section and trailer. We want to sign it. What happens is that we take this PDF file and via incremental update we put additional information regarding that. 
There is a new catalog and, more important, a new signature object containing the signature value and information about who signed this PDF file. And of course, there is an Xref section and trailer. And relevant for you: The entire file is now protected by the PDF signature. So manipulations within this area should not be possible, right? Yeah, let's talk about this: why it's not possible and how can we break it? First, we need an attack scenario: what we want to achieve as an attacker. We assumed in our research that the attacker possesses this signed PDF file. This could be an old contract, receipt or, in our case, a bill from Amazon. And if we open this file, the signature is valid. So everything is green. No warnings are thrown and everything is fine. What we tried to do is to take this file, manipulate it somehow and then send it to the victim. And now the victim expects to receive a digitally signed PDF file, so just stripping the digital signature is a very trivial scenario and we did not consider it because it's trivial. We considered that the victim expects to see that there is a signature and that it is valid. So no warnings are thrown and the entire left side is exactly the same as in the normal behavior. But on the other side, the content was exchanged, so we manipulated the receipt and exchanged it with other content. The question is now: how can we do it on a technical level? And we came up with three attacks: incremental saving attacks, signature wrapping and universal signature forgery. And I will now introduce the techniques and how these attacks work. The first attack is the incremental saving attack. So I mentioned before that via incremental saving or via incremental updates, we can add and remove and even redefine already existing objects and the signature still stays valid. Why is this happening? Consider now again our case.
We have some header, body, Xref table and trailer and the file is now signed and the signature protects only the signed area. So what would happen if I put a sticky note or some highlighting? An incremental update happens. If I open this file, usually this happens: We have the information that this signature is valid, when it was signed and so on and so forth. So our first idea was to just put new body updates, redefine already existing content and with a Xref table and trailer we point to the new content. This is quite trivial because it's a legitimate feature in PDF files, so we didn't expect to be quite successful and we were not so successful. But the first idea: we applied this attack, we opened it and we got this message. So it's kind of a weird message because an experienced user sees valid, but the document has been updated and you should know what does this exactly mean. But we did not consider this attack as successful because the warning is not the same or the status of the signature validation is not the same. So what we did is to evaluate this first against this trivial case, against older viewers we have, and Libre office, for example, was vulnerable against this trivial attack. This was the only viewer which was vulnerable against this trivial variation. But then we asked ourselves: Okay, the other viewers are quite secure. But how do they detect these incremental updates? And from developer point of view, the laziest thing we can do is just to check if another Xref table and trailer were added after the signature was applied. So we just put our body updates but just deleted the other two parts. This is not a standard compliant PDF file. It's broken. But our hope was that the PDF viewer fixes this kind of stuff for us and that these viewers are error-tolerant. And we were quite successful because the verification logic just checked: Is there an Xref table and trailer after the signature was applied? No? Okay. Everything's fine. 
The signature is valid. No warning was thrown. But then the application logic saw that incremental updates were applied and fixed this for us and processed these body updates and no warning was thrown. Some of the viewers required to have a trailer. I don't know why - it was a Black box testing. So we just removed the Xref table, but the trailer was there and we were able to break further PDF viewers. The most complex variation of the attack was the following: We had the PDF viewers checked if every incremental update contains a signature object. But they did not check if this signature is covered by the incremental update. So we just copy-pasted the signature which was provided here and we just forced the PDF viewer to validate this signed content twice - and still our body updates were processed and for example, Foxit or Master PDF were vulnerable against this type of attack. So the evaluation of our attack: We considered as part of our evaluation 22 different viewers - among others, Adobe with different versions, Foxit, and so on. And as you can see 11 of 22 were vulnerable against incremental saving. So 50 percent, and we were quite surprised because we saw that the developers saw that incremental updates could be dangerous regarding the signature validation. But we were still able to bypass their considerations. We had - a full signature bypass means that there is no possibility for the victim to detect the attack. A limited signature bypass means that the victim, if the victim clicks on one - at least one - additional window and explicitly wants to validate the signature, then the viewer was vulnerable. But the most important thing is by opening the file, there was a status message that the signature validation and all signatures are valid. So this was the first layer and the viewers were vulnerable against this. So let's talk about the second attack class. We called it "signature wrapping attack" and this is the most complex attack of the 3 classes. 
And now we have to go a little bit into the details of how PDF signatures are made. So imagine now we have a PDF file. We have some header and the original document. The original document contains the header, the body, the Xref section and so on and so forth. And we want to sign this document. Technically, again, an incremental update is applied and we have a new catalog here. We have some other objects, for example, certificates and so on, and the signature object. And we will now concentrate on this signature object because it's essential for the attack we want to carry out. The signature object contains a lot of information, but for this attack only two elements are relevant: the contents and the byte range. The contents contains the signature value. It's a PKCS7 container containing the signature value and the certificates used to validate the signature. The byte range contains four different values, and this is how these values are used. The first two, A and B, define the first signed area. And this is here from the beginning of the document until the start of the signature value. Why do we need this? Because the signature value is stored inside the file that is being signed. So we need to exclude the signature value from the computation over the document. And this is how the byte range is used. The first part is from the beginning of the document until the signature value starts, and from where the signature value ends until the end of the file is the second area, specified by the two values C and D. So, now we have everything protected besides the signature value itself. What we wanted to try is to create additional space for our attacks. So our idea was to move the second signed area. And how can we do it? Basically we can do it by just defining another byte range. And as you can see here, the byte range points from area A to B. So in this area we didn't make any manipulation, right? It was not modified at all. So it's still valid.
And the second part, the new C value and the next D bytes, we didn't change anything here, right? So basically, we didn't change anything in the signed area. And the signature is still valid. But what we created was space for some malicious objects; sometimes we needed some padding, and a new Xref section pointing to these malicious objects. The important thing was that the position of this malicious Xref section is defined by the trailer. And since we cannot modify this trailer, this position is fixed. So this is the only limitation of the attack, but it works like a charm. And the question is now: How many PDF viewers were vulnerable to this attack? And as you can see, this is the signature wrapping column. 17 out of 22 applications were vulnerable to this attack. This was quite an expected result: because the attack is complex, many developers were not aware of this threat, and that's the reason why so many vulnerabilities were there. Now to the last class of attacks, universal signature forgery. We called it universal signature forgery, but I prefer to use another definition for these attacks. I call them stupid implementation flaws. We are coming from the PenTesting area and I know a lot of you are PenTesters, too. And many of you have experience, quite interesting experience, with zero bytes, null values or some kind of weird values. And this is what we tried in this kind of attack. We just tried some stupid values or removed references to see what happens. Considering the signature, there are two important elements: the contents containing the signature value and the byte range pointing to what exactly is signed. So, what would happen if we remove the contents? Our hope was that the information regarding the signature is still shown by the viewer as valid without validating any signature, because it was not possible. And just removing the signature value is quite an obvious idea.
And we were not successful with this kind of attack. But let's proceed with other values, for example, contents without any value or contents equal to NULL or zero bytes. And considering this last version, we had two viewers which were vulnerable to this attack. And another case is, for example, removing the byte range. By removing this byte range we have some signature value, but we don't know what exactly is signed. So, we tried this attack and of course, byte range without any value or NULL bytes or byte range with negative numbers. And usually this last one crashed a lot of viewers. But the most interesting thing is that Adobe made this mistake: by just removing the byte range, we were able to bypass the entire security. We didn't expect this behavior, but it was a stupid implementation flaw, allowing us to do anything in this document, and all the exploits we show in our presentations were made on Adobe with this attack. So let's see what the results of this attack were. As you can see, only 4 of 22 viewers were vulnerable to this attack and only Adobe without limitation; for the others, there was a limitation because if you click on the signature validation, then a warning was thrown. It was very easy for Adobe to fix. And as you can see, Adobe didn't make any mistake regarding incremental saving or signature wrapping, but regarding universal signature forgery they were vulnerable to this attack. And this was the hope of our approach. In summary, we were able to break 21 of 22 PDF viewers. The only Applause Thanks. Applause The only secure PDF viewer is Adobe 9, which is deprecated and has remote code execution. The only Laughter The only users allowed to use it, or who are using it, are Linux users, because this is the last version available for Linux, and that's the reason why we considered it. So, I'm done with the talk about PDF signatures and now Fabian can talk about PDF encryption. Thank you.
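The role of the byte range in signature wrapping and in universal signature forgery can be illustrated with a small Python sketch. It shows which bytes the four values A, B, C, D select, and one possible set of sanity checks against the malformed values mentioned above. This is an assumption-laden illustration, not the check any viewer actually performs: the function names and the `contents_span` parameter (start and end offset of the signature value in the file) are invented for this example, and the "PDF" is just a toy byte string.

```python
def signed_bytes(pdf: bytes, byte_range):
    """Return the bytes covered by a ByteRange [A, B, C, D]."""
    a, b, c, d = byte_range
    # A..A+B is the first signed area, C..C+D the second one.
    return pdf[a:a + b] + pdf[c:c + d]


def byte_range_problems(pdf: bytes, byte_range, contents_span):
    """List reasons why a ByteRange should not be trusted (hypothetical checks)."""
    if byte_range is None:
        return ["ByteRange is missing"]
    if len(byte_range) != 4 or any(
        not isinstance(v, int) or v < 0 for v in byte_range
    ):
        return ["ByteRange must be four non-negative integers"]
    a, b, c, d = byte_range
    start, end = contents_span
    problems = []
    if a != 0:
        problems.append("first signed area does not start at byte 0")
    if a + b != start:
        problems.append("first signed area does not end where the signature value starts")
    if c != end:
        problems.append("second signed area does not start right after the signature value")
    if c + d != len(pdf):
        problems.append("second signed area does not reach the end of the file")
    return problems


# Toy file: 10 signed bytes, a 5-byte signature value, 5 more signed bytes.
original = b"A" * 10 + b"<sig>" + b"B" * 5
# Wrapping: the second signed area is moved to make room for attacker content.
wrapped = b"A" * 10 + b"<sig>" + b"EVIL" + b"B" * 5

print(signed_bytes(original, [0, 10, 15, 5]) == signed_bytes(wrapped, [0, 10, 19, 5]))  # True
print(byte_range_problems(wrapped, [0, 10, 19, 5], (10, 15)))
```

The first print shows why the signature still verifies after wrapping: the signed bytes are identical. The second shows that the gap between the two signed areas no longer matches the signature value alone, which is the property a stricter verifier could test.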
Fabian: Yes Applause OK, now that we have dealt with the signatures, let's talk about another cryptographic aspect in PDFs. And that is encryption. And some of you might remember our PDFex vulnerability from earlier this year. It's, of course, an attack with a logo and it presents two novel techniques targeting PDF encryption that have never been applied to PDF encryption before. So one of them is the so-called direct exfiltration, where we break the cryptography without even touching the cryptography. So no ciphertext manipulation here. The second one is so-called malleability gadgets. And those are actually targeted modifications of the ciphertext of the document. But first, let's take a step back and again take in some keywords. So PDF uses AES. OK. Well, AES is good. Nothing can go wrong, right? So let's go home. Encryption is fine. Well, of course, we didn't stop here, but took a closer look. So they use the CBC mode of operation, so cipher block chaining. And, what's more important is that they don't use any integrity protection. So it's AES-CBC without integrity protection. And you might remember the scenario from the attacks against encrypted e-mail, so against OpenPGP and S/MIME; it's basically the same problem. But first, who actually uses PDF encryption? You might ask. For one, we found that some local banks in Germany use encrypted PDFs as a drop-in replacement for S/MIME or OpenPGP because their customers might not want to deal with the setup of encrypted e-mail. The second one was some drop-in plugins for encrypted e-mail as well. So there are some companies out there that produce products that you can put into your Outlook and you can use encrypted PDF files instead of encrypted e-mail. We also found that some scanners and medical devices were able to send encrypted PDF files via e-mail. So you can set a password on that machine and they will send the encrypted PDF via e-mail and you have to communicate the password some other way.
And lastly, we found that some governmental organizations use encrypted PDF documents, for example, the US Department of Justice allows for the send, sending in some claims via encrypted PDFs. And I've exactly no idea how you how they get the password, but at least they allow it. So as we are from academia, let's take a step back and look at our attacker model. So we've got Alice and Bob. Alice wants to send a document to Bob. And she wants to send it over an unencrypted channel or a channel she doesn't trust. So of course, she decides to encrypt it. Second scenario is, they want to upload it to a shared storage. For example, Dropbox or any other shared storage. And of course, they don't trust the storage. So, again, they use end-to- end encryption. So let's assume that this shared storage is indeed dangerous or malicious. So, Alice will, of course, again upload the encrypted document to the attacker in this case, will perform some targeted modification of that, and will send the modified documents back to Bob, who will happily put in the password because from his point of view, it's undistinguishable from the original document and the original plain text will be leaked back to the attacker, breaking the confidentiality. So let's take a look at the first attack on how we did that. That's the direct exfiltration, so breaking the cryptography without touching any cryptography, as I like to say. But first, encryption in, in a nutshell, PDF encryption. So you have seen the structure of the PDF document. There is a header with a version number. There's a body where all the interesting objects live. So there is our confidential content that we want to actually, well, to actually exfiltrate as an attacker. And finally, there is Xref table and the trailer. So what changes if we decide to encrypt this document? Well, actually, not a whole lot. So instead of confidential data, of course, there's now some encrypted ciphertext. Okay. And the rest pretty much remains the same. 
The only thing that is added is a new value in the trailer that tells us how to decrypt this data again. So pretty much all of the structure is left unencrypted. And we thought about: Why is this? And we took a look at the standard. So, this is an excerpt from the PDF specification and I've highlighted the interesting parts for you. Encryption is only applied to strings and streams. Well, those are the values that can actually contain any text in the document, and all other objects are not encrypted. And that is because, well, they want to allow random access to the whole document. So no parsing the whole document before actually showing page 16 of the encrypted document. Well, that seems kind of reasonable. But that also means that the whole document structure is unencrypted and only the streams and strings are encrypted. This reveals a lot of information to an attacker that he or she probably shouldn't have. That's for one the number and size of pages, that's the number and size of objects in the document and that's also including any links, so any hyperlinks in the document that are actually there. So, that's a lot of information an attacker probably shouldn't have. So, next we thought maybe we can do some more stuff. Can we add our own unencrypted content? And we took a look at the standard again and found the so-called crypt filters, which provide finer granularity control of the encryption. This basically means as an attacker, I can change a document to say, hey, only strings in this document are encrypted and streams are unencrypted. That's what the identity filter is for. I have no idea why they decided to add that to a document format, but it's there. So that means there is support for partial encryption and that means attacker's content can be mixed with actual encrypted content. And we found 18 different techniques to do that in different readers. So there are a lot of ways to do that in the different readers. So let's have a look at a demo.
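How much the unencrypted structure gives away without any password can be sketched with a few regular expressions over the raw bytes. This is a rough illustration only: the function name `leaked_structure` and the sample document are made up, and the approach assumes a classic PDF whose objects are not packed into compressed object streams.

```python
import re


def leaked_structure(pdf_bytes: bytes) -> dict:
    """Count structural elements that stay readable in an encrypted PDF."""
    return {
        # /Type /Page but not /Type /Pages
        "pages": len(re.findall(rb"/Type\s*/Page(?![a-zA-Z])", pdf_bytes)),
        # "<id> <generation> obj" headers
        "objects": len(re.findall(rb"\b\d+\s+\d+\s+obj\b", pdf_bytes)),
        # link annotations: their existence is visible even if strings are encrypted
        "link_annotations": len(re.findall(rb"/Subtype\s*/Link\b", pdf_bytes)),
        "encrypted": b"/Encrypt" in pdf_bytes,
    }


# Made-up sample: only the stream content of object 4 is ciphertext.
SAMPLE = b"""%PDF-1.7
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /Contents 4 0 R /Annots [5 0 R] >>
endobj
4 0 obj
<< /Length 16 >>
stream
\x8f\x13\xa2\x07\xd4\x5e\x99\x21\xc0\x3b\x7a\xee\x10\x64\xfb\x02
endstream
endobj
5 0 obj
<< /Type /Annot /Subtype /Link /Rect [0 0 10 10] >>
endobj
trailer
<< /Root 1 0 R /Encrypt 6 0 R >>
"""

print(leaked_structure(SAMPLE))
```

Everything the function reports comes from parts of the file that encryption never touches, which is also what makes it possible to add new unencrypted objects next to the ciphertext.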
So we have this document, this encrypted document, we put in our password and get our secret message. We now open it again in a text editor. We see, in object 4 0 down here, there's the actual ciphertext of the object, so of the message, and we see it's AES encrypted, with a 32 byte key, so it's AES-256. OK. Now we decide to add a new object that contains, well, plaintext. And, well, we simply add that to the contents array of this document. So, we say "Display this on the first page", save the document. We open it, and we'll put in our password and, oh well, this is indeed awkward. OK. So, now, we have broken the integrity of an encrypted document. Well, you might think maybe they didn't want any integrity in the encrypted files. Maybe that's the use case people have, I don't know. But we thought, maybe we can somehow exfiltrate the plaintext this way. So again, we took a step back, and looked at the PDF specification. And the first thing we found were so-called submit-form actions. And that's basically the same as a form on a website. You can put in data. You might have seen this in a contract, in a PDF contract, where you can put in your name, and your address, and so on, and so on, and the data inside of that is saved in strings and streams. And now remember that is everything that is encrypted in a document. And, of course, you can also send that back to an attacker, or well, to a legitimate use case, of course, via clicking a button, but clicking buttons is pretty lame. So we again looked at the standard and found the so-called open action. And that is an action, for example, submitting a form, that can be performed upon opening a document. So how might this look? This is how a PDF form looks, already with the attack applied. So, we've got a URL here that is unencrypted, because all strings in this document are unencrypted, and we've got the value, object 2 0, where the actual encrypted data lives. So, that is the value of the form field.
And what will happen on the attacker side as soon as this document is opened? Well, we'll get a post request with a confidential content. Let's have a demo. Again, we have this document. We put in our password. It's the original document you have already seen. We reopen it in a text viewer, or a text editor, again see it's encrypted, and we decide to change all strings to the identity filter. So, no encryption is applied to strings from now on. And then we add a whole blob of information for the open action, and for the form. So this will be op- this will be performed, as soon as the document is opened. There is a URL, p.df, and the value is the encrypted object 4 0. We start an HTTP server on the domain we specified, we open the document, put in the password again, and as soon as we open the document Adobe will helpfully show us a warning, but they will already click the button for remembering that for the future. And if you accept that, you will see your secret message on the attacker server. And that is pretty bad already. OK. The same works for hyperlinks, so, of course, there are links in PDF documents, and as on the Web, we can define a base URL for hyperlinks. So we can say all URLs from this document start with http://p.df. And of course we can define any object as a URL. So any object we prepared this way can be sent as a URL, and that will, of course, trigger a GET request upon opening the document again, if you defined an open action for the same object. So again, pretty bad and breaks confidentiality. And of course, everybody loves JavaScript in PDF files, and that works as well. Okay. Let's talk about ciphertext attacks, so actual cryptographic attacks, no more not touching the crypto. So you might remember the efail attacks on OpenPGP and S/MIME, and those had basically three prerequisites. 1: Well, ciphertext malleability, so it's called malleability gadgets. 
That's why we need ciphertext malleability, and we've got no integrity protection, that's a plus. Then we need some known plaintext for actual targeted modifications. And we need an exfiltration channel to send the data back to an attacker. Well, exfiltration channels are already dealt with as we have hyperlinks and forms. So we can already check that. Nice. Let's talk about ciphertext malleability, or what we call gadgets. So, some of you might remember this from crypto 101, or whatever lecture you ever had on cryptography. This is the decryption function of CBC, so cipher block chaining. And it's basically, you've got your ciphertext up here, and your plaintext down here. And it works by simply decrypting a block of ciphertext, XORing the previous block of ciphertext onto that, and you'll get the plaintext. So what happens, if you decide to change a single bit in the ciphertext, for example, the first bit of the initialization vector? Well, that same bit will flip in the actual plaintext. Wait a second. What happens, if you happen to know a whole plaintext block? Well, we can XOR that onto the first block, and basically get all zeros, or what we call a gadget, or a blank sheet of paper, because we can write on that by taking a chosen plaintext and XORing that onto this results. And this way we can, for example, construct URLs in the actual ciphertext, or in the actual resulting plaintext. What we can also do with these gadget is, gadgets is moving them somewhere else in the document, cloning them, so we can have multiple gadgets, at multiple places in the ciphertext. But remember, if you do that, there's always the avalanche effect of CBC, so you will have some random bytes in here, but the URL still remains in place. Okay. That's ciphertext malleability done. As I've said we need some plaintext. We need to have some known plaintext. And as the PDF standard has been pretty helpful up until now, in breaking PDF encryption, let's take a look again. 
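The gadget construction described here can be tried out in a few lines of Python. To keep the sketch runnable with the standard library only, AES is replaced by a toy 16-byte Feistel cipher built from SHA-256; that stand-in is an assumption of this example and is not secure, but the CBC malleability shown is independent of the block cipher. All function names are made up for the illustration.

```python
import hashlib
import os

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (stand-in for AES, NOT secure).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt the block, then XOR the previous ciphertext block onto it.
        out += xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out


def rewrite_block(prev_block: bytes, known_plain: bytes, chosen_plain: bytes) -> bytes:
    # XOR the known plaintext away (blank sheet), then XOR the chosen text on.
    return xor(xor(prev_block, known_plain), chosen_plain)


key, iv = os.urandom(16), os.urandom(BLOCK)
known = b"known 16 bytes!!"    # plaintext block the attacker knows
secret = b"confidential txt"   # plaintext block the attacker does not know
ciphertext = cbc_encrypt(key, iv, known + secret)

chosen = b"http://p.df/x?q="   # 16 bytes the attacker wants to appear
forged_iv = rewrite_block(iv, known, chosen)  # done without knowing the key
tampered = cbc_decrypt(key, forged_iv, ciphertext)
print(tampered[:BLOCK])  # b'http://p.df/x?q='
print(tampered[BLOCK:])  # b'confidential txt'
```

Only the initialization vector is modified here, so the second block decrypts unchanged. Modifying a ciphertext block in the middle instead would garble the block it belongs to, which is where the random bytes mentioned in the talk come from.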
And what we found here: Permissions. So PDF documents can have different permissions for the author and the user of the document. This basically means the author can edit the document and the users might not be able to do that. And of course, people started to tamper with that value if it was left unencrypted, so in the newest version, it was decided this should be encrypted as a 16-byte value. So we've got 16 bytes. How do they look? Well, at first, we need room for extension. We need lots of permissions. Then we put 4 bytes of the actual permission value, which is also present in unencrypted form in the document. Then we need one byte for encrypted metadata, and for some reason we need some acronym, "adb", I'll leave it to you to figure out what that stands for. And finally, we've got four random bytes, because we have to fill up 16 bytes, and we have run out of ideas. Okay. We take all of that, encrypt it, and oh well, we know a lot of that, and that is basically known plaintext by design. Which is bad. Let's look at how this looks in a document. So, you see the perms value, I've marked it down here. That is the actual extended value I've shown you on the last slide. And above that you'll see the unencrypted value that's inside this perms value, so the minus 4 in this case, it's basically a bit field. On the right side you see the actual encrypted contents, and helpfully, all of this is encrypted under the same document-wide key in the newest version of the specification. And that means we can reuse this plaintext anywhere in the document we want, and we can reuse this to build gadgets. To sum that last point up for you: Adobe decided to add permissions to the PDF format, and people thought of tampering with them. So they decided to encrypt these permissions to prevent tampering, and now known plaintext is available to attackers. All right. So that's basically all of the prerequisites done, and let's again have a demo.
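To make the "known plaintext by design" point concrete, here is a small Python sketch that rebuilds the predictable part of the 16-byte permissions block from the unencrypted permission value. The function name is hypothetical, and the byte order (permission value first, low-order byte first, followed by the 0xFF extension bytes) is an assumption based on how the PDF specification describes the block; the talk lists the same fields in a slightly different order.

```python
import struct


def perms_known_plaintext(p_value: int, encrypt_metadata: bool = True) -> bytes:
    """Return the 12 predictable bytes of the 16-byte permissions block.

    Assumed layout:
      bytes 0-3   permission value, low-order byte first
      bytes 4-7   0xFF (room for extension to 64 bits)
      byte  8     'T' or 'F' for the encrypted-metadata flag
      bytes 9-11  'adb'
      bytes 12-15 random, unknown to the attacker (not returned here)
    """
    low = struct.pack("<i", p_value)           # signed 32-bit, little-endian
    extension = b"\xff" * 4
    flag = b"T" if encrypt_metadata else b"F"
    return low + extension + flag + b"adb"


# The demo document in the talk uses a permission value of -4.
prefix = perms_known_plaintext(-4)
print(prefix.hex())   # fcffffffffffffff54616462
print(len(prefix))    # 12
```

Under these assumptions an attacker knows 12 of the 16 plaintext bytes of that block, which is enough room to write a short URL prefix with the gadget technique.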
So, we again open this document, put in our password, well, as soon as Chrome decides to open this document, we put in our password. It's the same as before. Now, I've prepared a script for you, because I really can't do this live, and it basically does what I've told you. It's getting a blank gadget from the perms value. It's generating a URL from that. It's generating a field name, so that it will look nice on the server side, we regenerate this document and put a form in there. We start a web server, open this modified document, put in the password again and oh well, Chrome doesn't even ask. So as soon as this document is opened in Chrome and the password is put in, we'll get our secret message delivered to the attacker. Applause So we took a look at 27 viewers and found all of them vulnerable to at least one of our attacks. So some of them work with no user interaction as we have seen in Chrome. Some work with user interaction in specific cases, as you've seen with Adobe with a warning, but generally all of these were attackable in one way or the other. So what can be done about all of this? Well, you might think signatures might help. That's usually the first point people bring up: "A signature on the encrypted file will help." Well, no, not really. Why is that? Well, for one, a broken signature does not prevent opening the document. So we'll still be able to exfiltrate as soon as a password is put in. Signatures can be stripped because they're not encrypted. And as you have seen before, they can also be forged in most viewers. Signatures are not the answer. Closing exfiltration channels is also not the answer because for one, it's hard to do. And how would you even find all exfiltration channels in an 800-page standard? And I mean, we have barely scratched the surface of exfiltration channels. And should we really remove forms and hyperlinks from documents? And should we remove JavaScript? OK, maybe we should.
And finally, if you have to do that, please ask the user before connecting to a web server. So let's look at some vendor reactions. Apple decided to do exactly what I've told you: to add a dialog to warn the user and even show the whole URL with the encrypted plaintext. And Google decided to stop trying to fix the unfixable in Chrome. They fixed the automatic exfiltration, but there's really nothing they can do about the standard. So this is a problem that has to be fixed in the standard. And that is basically that. For mitigating wrapping attacks, we have to deprecate partial encryption and disallow access from unencrypted to encrypted objects. And against the gadget attacks, we have to use authenticated encryption like AES-GCM. OK. And Adobe has told us that they were escalating this to the ISO working group that's now responsible for the PDF standard and this will be taken up in the next revision. So that's a win in my book. Applause Herald: Thank you so much, guys. That was really awesome. Please queue up by the microphones if you have any questions, we still have some time left for Q and A. But I think your research is really, really interesting because it opens my mind to like how would this actually be able to be misused in practice? Like, and I don't know, like, what's your take? I guess since you've been working so much with this, you must have some kind of idea as to what devious things you could come up with. Fabian: I mean, it's still an attacker scenario that requires a lot of resources and a very motivated attacker. So this might not be very important to the normal user. Let's be real here. So most of us are not targeted by the NSA, I guess. So you need an active attacker, an active man in the middle to actually perform these attacks. Herald: Great. Thank you. And then I think we have a question from microphone number four, please. Microphone 4: Yes. You said that the next standard might have a fix.
Do you know a time frame on how long it takes to build such a standard? Fabian: Well, no, we don't really know. We have talked with Adobe and they told us they will show the next version of the standard to us before actually releasing that, but we have no time frame at all from them. Microphone 4: OK. Thank you. Herald: Thank you. Microphone number five, please. Microphone 5: Thank you for a very interesting talk. You showed in the first part that the signature has like these four numbers with the byte range. And why is this, like four numbers, not part of a signature? Is there a technical reason for that? Because the byte offset is predictable. Vladi: It is! The byte range is protected by the signature. But we just defined a second one and just moved the signed one to be validated later. So there are two byte ranges. But only the first one, the manipulated one, will be processed. Microphone 5: Thank you. Herald: Thank you so much. Microphone number four, please. Microphone 4: Oh, this is way too high for me. OK. I have an answer and a question for you. You mentioned during the talk that you weren't sure how the Department of Justice distributes the passwords for encrypting PDFs. The answer is: in plain text, in a separate email or as the password of the week, which is distributed through various means. That is also what the Department of Homeland Security does, and the military is somewhat less stupid. As a question: I have roughly half a terabyte of sensitive PDFs that I would like to scan for your attack and also for redaction failures. Do you know of any fast, feasible ways to scan documents for the presence of this kind of attack? Fabian: I don't know of any tools, but I mean, scanning for the gadget attacks is actually possible if you try to do some entropy detection. So, because you reuse ciphertext, you will have less entropy in your ciphertext, but that's pretty hard to do.
Direct exfiltration should probably be detectable by scanning simply for words like "identity". Well, beyond that, there are 18 different techniques that we provided in the paper. But I don't know of any tools to do that automatically. Microphone 4: Thank you. Herald: Great. Thank you. And microphone number two, please. Microphone 2: Thank you for your very interesting presentation. I have one suggestion and one question. For the mitigation: you could simply run your PDF reader in a virtual machine that is firewalled away, so your firewall won't let anything go out. But for the signature forgeries, I had an idea. I'm not sure if this is actually a stupid idea, but did you consider faking the certificate? Because presumably the signature is protected by the signer's certificate. You make up your own and sign with that. Does it catch it and how? Vladi: We considered it but not in this paper. We assume that the certificate and the entire chain of trust for this path is totally secure. It was just an assumption to concentrate only on the attacks we already found. So, perhaps there will be further research provided by us in the next months and years. Herald: We might just hear more from you in the future. Thank you so much. And now questions from the Internet, please. Signal Angel: I have two questions about the first part of your talk from the Internet. The first one is you mentioned a few reactions, but can you give a bit more detail about your experience with vendors while reporting these issues? Vladi: Yeah. When we first started, we asked the CERT team from BSI, CERT-Bund, to help us because there were a lot of affected vendors and we were not able to provide the support in a feasible way. So they supported us the entire way. We first created the report containing the exact description of the vulnerabilities and all exploits.
Then, we distributed it to the BSI and they contacted the vendors and just proxied the communication, and there was a lot of communication. So I'm not aware of the entire communication, but only about the technical stuff where we were asked to just retest the fix and so on. So there was some reaction from Adobe, Foxit and a lot of viewers reacted to our attacks and contacted us, but not everybody. Herald: Thank you so much. Unfortunately, that's the only time that we have available for questions today. I think you guys might stay around for a couple of minutes, just if someone has any more questions. Fabian and Vladislav, I can't thank you enough. Thank you so much. It was very interesting. Please give them a great round of applause. Vladi: Thank you. Applause 36c3 postroll music Subtitles created by c3subtitles.de in the year 2019.