36c3 preroll music
Herald: The next talk is on how to break
PDFs, breaking the encryption and the
signatures, by Fabian Ising and Vladislav
Mladenov. Their talk was accepted at CCS
this year in London and they had that in
November. It comes from research that
basically produced two different kinds of
papers and it has been... people worldwide
have been interested in what has been
going on. Please give them a great round
of applause and welcome them to the stage.
Applause
Vladi: So can you hear me? Yeah. Perfect.
OK. Now you can see the slides. My name is
Vladislav Mladenov, or just Vladi if you
have some questions for me, and this is
Fabian. And we are allowed today to talk
about how to break PDF security or more
specifically about how to break the
cryptographic operations in PDF files. We
are a large team from the University of
Bochum, Münster and Hackmanit GmbH. So as
I mentioned: We will talk about
cryptography and PDF files. Does it work?
Fabian: All right. OK. Let's try that
again. Okay.
Vladi: Perfect. This talk will consist of
two parts. The first part is about
digitally signed PDF files and how can we
recognize such files? If we open them we
see the information regarding that the
file was signed and all verification
procedures were valid. And more
information regarding the signature
validation panel and information about who
signed this file. This is the first part
of the talk and I will present this topic.
And the second part is regarding PDF
encrypted files and how can we recognize
such files? If you tried to open such
files, the first thing you see is the
password prompt. And after entering the
correct password, the file is decrypted
and you can read the content within this
file. If you open it with Adobe,
additional information regarding if this
file is secured or not is displayed
further. And this is the second part of
our talk, and Fabian will talk about how we
can break PDF encryption. So before we
start with the attacks on signatures or
encryption, we first need some basics. And
after six slides, you will be experts
regarding PDF files and you will
understand everything about it. But maybe
it's a little bit boring, so be patient:
there are only 6 slides. So the first is
quite easy. PDF files are... the first
specification was in 1993 and almost at
the beginning PDF cryptographic operations
like signatures and encryption were already
there. The last version is PDF 2.0 and it
was released in 2017. And according to
Adobe 1.6 billion files are on the web and
perhaps more exchange beyond the web. So
basically PDF files are everywhere. And
that's the reason why we consider this
topic and tried to find or to analyze the
security of the features. If we have some
very simple file and we open it with Adobe
Reader, the first thing we see is, of
course, the content. "Hello, world!" in
this case, and additional information
regarding the focused page and how many
pages this document has. But what would
happen if we don't use a PDF viewer and
just use some text editor? We use the
Notepad++ to open and later manipulate the
files. So I will zoom this thing... this
file. And the first thing we see is that
we can read it. Perhaps it's quite, quite
funny. But we can still extract some
information of this file. For example,
some information regarding the pages. And
here you can see the information that the
PDF file consists of one page. But more
interesting is that we can see the
content of the file itself. So the lesson
we learned is that we can use a simple
text editor to view and edit PDF files.
And for our attacks, we used only this
text editor. So let's go to the details.
How PDF files are structured and how they
are processed. PDF files consist of 4
parts: header, body and body is the most
important part of the PDF files. The body
contains the entire information presented
to the user. And 2 other sections: Xref
section and trailer. A very important thing
about processing PDF files is that
they're processed not from the top to the
bottom, but from the bottom to the top. So
the first thing that the PDF viewer
analyses or processes is the trailer. So
let's start doing that. What information
is stored in this trailer? Basically, there
are two very important pieces of information.
The first is the information:
what is the root element of this PDF? So
which is the first object which will be
processed? And the second important
information is where the Xref section
starts. It's just a byte offset pointing
to the position of the XRef section within
the PDF file. So this pointer, as
mentioned before, points to the Xref
section. But what is the Xref section
about? The Xref section is a catalog
pointing or holding the information where
the objects defined in the body are
contained or the byte positions of this
object. So how can we read this weird Xref
section? The first information we extract
is that the first object, which is defined
here, is the object with ID 0 and we have
5 further elements or objects which are
defined. So the first object is here. The
first entry is the byte position within
the file. The second is its generation
number. And the last character indicates whether
this object is in use or not. So
reading it, reading this Xref section, we
extract the information that the object
with ID 0 is at byte position 0 and is not
in use. So the object with ID 1 is at the
position 9 and so on and so forth. So for
the object with ID 4 and the object number
comes from counting it: 0 1, 2, 3 and 4.
So the object with ID 4 can be found at
the offset 184 and it's in use. In other
words, the PDF viewer knows where each
object will be found and can properly
display it and process it. Now we come to
the most important part: the body, and I
mentioned it that in the body the entire
content which is presented to the user is
contained. So let's see. Object 4 0 is
this one and as you can see, it contains
the word "Hello World". The other objects
are referenced, too. So each pointer
points exactly to the starting position of
each of the objects. And how can we read
this object? You see, we have an object
starting with the ID number, then the
generation number and the word "obj". So
you now know where the object starts
and when it ends. Now how can we process
this body? As I mentioned before in the
trailer, there was a reference regarding
the root element and this element was with
ID 1 and generation number 0. So, now
we start reading the document here and we
have a catalog and a reference to some
pages. Pages is just a description of all
the pages contained within the file. And
what we can see here is that we have a
count of one, so we have only one page,
and a reference to the page object which
contains the entire
description of the page. If we had
multiple pages, then we would have
multiple elements here. Then we have one page.
And here we have the contents, which is a
reference to the string we already saw.
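The bottom-up reading order described above (trailer, then Xref section, then objects) can be sketched in Python. This is a minimal sketch, not how any particular viewer is implemented: the helper names `build_minimal_pdf` and `read_xref` are made up for illustration, the page has no font resources so it is only structurally complete, and the parser assumes a single classic Xref table with one subsection.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a tiny one-page file and record each object's byte offset."""
    stream = b"BT /F1 12 Tf 20 100 Td (Hello World) Tj ET"
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(stream), stream),
    ]
    pdf = b"%PDF-1.4\n"                      # header: one line with the version
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(pdf))             # byte position of "N 0 obj"
        pdf += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_pos = len(pdf)
    pdf += b"xref\n0 %d\n" % (len(bodies) + 1)
    pdf += b"0000000000 65535 f \n"          # object 0: always free
    for off in offsets:
        pdf += b"%010d 00000 n \n" % off     # offset, generation, in use
    pdf += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    pdf += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return pdf


def read_xref(pdf: bytes) -> dict:
    """Start at the end of the file, follow startxref, parse the Xref table."""
    start = pdf.rindex(b"startxref")
    xref_pos = int(pdf[start:].split()[1])
    lines = pdf[xref_pos:].split(b"\n")
    if lines[0] != b"xref":
        raise ValueError("startxref does not point at an Xref table")
    first, count = map(int, lines[1].split())
    table = {}
    for i in range(count):
        offset, gen, flag = lines[2 + i].split()
        table[first + i] = (int(offset), int(gen), flag == b"n")
    return table


pdf = build_minimal_pdf()
table = read_xref(pdf)
for num, (offset, gen, in_use) in table.items():
    print(num, offset, gen, "in use" if in_use else "free")
```

Each offset in the table points exactly at the `N 0 obj` keyword of the corresponding object, which is what lets a viewer jump straight to an object without scanning the file.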
Perfect. If you understand this then you
know everything or almost everything about
PDF files. Now you can just use your
editor and open such files and analyze
them. Then we need one feature... I forgot
the last part. The most simple one. The
header. It should be just one line stating
which version is used. For example, in our
case, 1.4. For the latest version,
2.0 would be stated here. Now, we need this
one feature called "Incremental Update".
And I call this feature - do you know this
feature highlighting something in the PDF
file or putting some sticky notes?
Technically, it's called "incremental
update." I just call it reviewing master
and bachelor thesis of my students because
this is exactly the procedure I follow. I
just read the text and highlight something
and store the information I put at it.
Technically, by putting such a sticky note,
this additional information is appended
after the end of the file. So we have a
body update which contains exactly the
additional information about the new
objects and, of course, a new Xref section
and a new trailer pointing to this new
object. Okay, we are done. Considering
incremental update, we saw that it is used
mainly for sticky notes or highlighting.
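The append-only mechanism can be sketched as follows. This is a minimal sketch under stated assumptions: `append_incremental_update` is a hypothetical helper, the `base` bytes are a truncated stand-in (objects 1 to 3 are omitted, so a real viewer would not open it), and only a classic Xref table is handled. It shows that the original bytes stay untouched while a new body update, Xref section and trailer are added at the end, and that the new Xref entry may point at an object number that already exists.

```python
import re


def append_incremental_update(pdf: bytes, obj_num: int, new_body: bytes) -> bytes:
    """Append one object plus a new Xref section and trailer to the file."""
    prev_xref = int(pdf[pdf.rindex(b"startxref"):].split()[1])
    trailer = pdf[pdf.rindex(b"trailer"):]
    size = int(re.search(rb"/Size\s+(\d+)", trailer).group(1))
    root = re.search(rb"/Root\s+(\d+ \d+ R)", trailer).group(1)

    out = pdf if pdf.endswith(b"\n") else pdf + b"\n"
    obj_offset = len(out)
    out += b"%d 0 obj\n%s\nendobj\n" % (obj_num, new_body)       # body update
    xref_offset = len(out)
    out += b"xref\n%d 1\n%010d 00000 n \n" % (obj_num, obj_offset)
    out += b"trailer\n<< /Size %d /Root %s /Prev %d >>\n" % (
        max(size, obj_num + 1), root, prev_xref)                 # /Prev: old Xref
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return out


# Truncated stand-in for a signed or original file (objects 1-3 omitted).
base = (b"%PDF-1.4\n"
        b"4 0 obj\n(Hello World)\nendobj\n"
        b"xref\n4 1\n0000000009 00000 n \n"
        b"trailer\n<< /Size 5 /Root 1 0 R >>\n"
        b"startxref\n38\n%%EOF\n")

updated = append_incremental_update(base, 4, b"(Some other sentence)")
new_xref = int(updated[updated.rindex(b"startxref"):].split()[1])
print(updated[len(base):].decode())
```

Because a viewer reads from the end of the file, it finds the newest trailer first and resolves object 4 through the newest Xref entry.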
But we observed something which is very
important because with an incremental update we
can redefine existing objects, for
example, we can redefine the object with
ID 4 and put new content. So we replace in
this manner the word "Hello World" with
another sentence and of course the Xref
section and the trailer point to this new
object. So this is very important. With
incremental update we are not limited to
only adding some highlighting or notes. We
can redefine already existing content and
perhaps we need this for the attacks we
will present. So let's talk about PDF
signatures. First, we need to differentiate
between an electronic signature and a digital
signature. An electronic signature, from a
technical point of view, is just an
image. I just wrote it on my PC and put it
into the file. There is no cryptographic
protection. It could be me lying on the
beach doing something. From a cryptographic
point of view it is the same. It does not
provide any security, any cryptographic
security. What we will talk about here is
about digitally signed files, so if you
open such files, you have the additional
information regarding the validation about
the signatures and who signed this PDF
file. So as I mentioned before, this talk
will concentrate only on these digitally
signed PDF files. How? What kind of
process is behind digitally signing PDF
files? Imagine we have this abstract
overview of a PDF document. We have the
header, body, Xref section and trailer. We
want to sign it. What happens is that we
take this PDF file and via incremental
update we append additional information
to it. There is a new catalog and,
more importantly, a new signature object
containing the signature value and
information about who signed this PDF
file. And of course, there is an Xref
section and trailer. And relevant for you:
The entire file is now protected by the
PDF signature. So manipulations within
this area should not be possible, right?
Yeah, let's talk about this: why it's not
possible and how we can break it. First,
we need an attack scenario: what do we want
to achieve as an attacker? We assumed in
our research that the attacker possesses
this signed PDF file. This could be an old
contract, receipt or, in our case, a bill
from Amazon. And if we open this file, the
signature is valid. So everything is
green. No warnings are thrown and
everything is fine. What we tried to do is
to take this file, manipulate it somehow
and then send it to the victim. And now
the victim expects to receive a digitally
signed PDF file, so just stripping the
digital signature is a very trivial
scenario and we did not consider it
because it's trivial. We considered that
the victim expects to see that there is a
signature and it is valid. So no
warnings are thrown and the entire left side
is exactly the same as in the normal
behavior. But on the other side, the
content was exchanged so we manipulated
the receipt and exchanged it with another
content. The question is now: how can we
do it on a technical level? And we came up
with three attacks: incremental saving
attacks, signature wrapping and universal
signature forgery. And I will now
introduce the techniques and how these
attacks are working. The first attack is
the incremental saving attack. So I
mentioned before that via incremental
saving or via incremental updates, we can
add and remove and even redefine already
existing objects and the signature still
stays valid. Why is this happening?
Consider now again our case. We have some
header, body, Xref table and trailer and
the file is now signed and the signature
protects only the signed area. So what
would happen if I put a sticky note or
some highlighting? An incremental update
happens. If I open this file, usually this
happens: We have the information that this
signature is valid, when it was signed and
so on and so forth. So our first idea was
to just put new body updates, redefine
already existing content and with an Xref
table and trailer we point to the new
content. This is quite trivial because
it's a legitimate feature in PDF files, so
we didn't expect to be quite successful
and we were not so successful. But the
first idea: we applied this attack, we
opened it and we got this message. So it's
kind of a weird message because an
experienced user sees valid, but the
document has been updated and you have to
know what exactly this means. But we
did not consider this attack as successful
because the warning is not the same or the
status of the signature validation is not
the same. So what we did is to evaluate
this first against this trivial case,
against older viewers we have, and
LibreOffice, for example, was vulnerable
against this trivial attack. This was the
only viewer which was vulnerable against
this trivial variation. But then we asked
ourselves: Okay, the other viewers are
quite secure. But how do they detect these
incremental updates? And from a developer's
point of view, the laziest thing we can do
is just to check if another Xref table and
trailer were added after the signature was
applied. So we just put our body updates
but just deleted the other two parts. This
is not a standard compliant PDF file. It's
broken. But our hope was that the PDF
viewer fixes this kind of stuff for us and
that these viewers are error-tolerant. And
we were quite successful because the
verification logic just checked: Is there
an Xref table and trailer after the
signature was applied? No? Okay.
Everything's fine. The signature is valid.
No warning was thrown. But then the
application logic saw that incremental
updates were applied and fixed this for us
and processed these body updates and no
warning was thrown. Some of the viewers
required a trailer. I don't know
why - it was black-box testing. So we
just removed the Xref table, but the
trailer was there and we were able to
break further PDF viewers. The most
complex variation of the attack was the
following: We had PDF viewers that checked
whether every incremental update contains a
signature object. But they did not check
if this signature is covered by the
incremental update. So we just copy-pasted
the signature which was provided here and
we just forced the PDF viewer to validate
this signed content twice - and still our
body updates were processed and for
example, Foxit or Master PDF were
vulnerable against this type of attack. So
the evaluation of our attack: We
considered as part of our evaluation 22
different viewers - among others, Adobe
with different versions, Foxit, and so on.
And as you can see 11 of 22 were
vulnerable against incremental saving. So
50 percent, and we were quite surprised
because we saw that the developers saw
that incremental updates could be
dangerous regarding the signature
validation. But we were still able to
bypass their considerations. We had - a
full signature bypass means that there is
no possibility for the victim to detect
the attack. A limited signature bypass
means that the victim, if the victim
clicks on one - at least one - additional
window and explicitly wants to validate
the signature, then the viewer was
vulnerable. But the most important thing
is by opening the file, there was a status
message that the signature validation and
all signatures are valid. So this was the
first layer and the viewers were
vulnerable against this. So let's talk
about the second attack class. We called
it "signature wrapping attack" and this is
the most complex attack of the 3 classes.
And now we have to go a little bit into
the details of how PDF signatures are
made. So imagine now we have a PDF file.
We have some header and the original
document. The original document contains
the header, the body, the Xref section and
so on and so forth. And we want to sign
this document. Technically, again, an
incremental update is provided and we have
a new catalog here. We have some other
objects, for example, certificates and so
on and the signature objects. And we will
now concentrate on this signature object
because it's essential for the attack we
want to carry out. And the signature
object contains a lot of information, but
for this attack only two elements
are relevant: The contents and the byte
range. The contents contains the signature
value. It's a PKCS#7 container containing
the signature value and the certificates
used to validate the signature. Then the
byte range. The byte range contains four
different values, and this is how these values
are being used. The first two, A and B
define the first signed area. And this is
here from the beginning of the document
until the start of the signature value.
Why do we need this? Because the signature
value lies inside the signed area. So we need
to exclude the signature value from the
hash computation over the document. And this is how the
byte range is used. The first part is
from the beginning of the document until
the signature value starts, and
after the signature ends until the end of
the file is the second area, specified by
the two values C and D. So, now we have
everything protected besides the signature
value itself. What we wanted to try is to
create additional space for our attacks.
So our idea was to move the second signed
area. And how can we do it? So basically
we can do it by just defining another byte
range. And as you can see here, the byte
range points from area A to B. So in this
area we didn't make any manipulation,
right? It was not modified at
all. So it's still valid. And the second
part, the new C value and the next D
bytes, we didn't change anything here,
right? So basically, we didn't change
anything in the signed area. And the
signature is still valid. But what we
created was a space for some malicious
objects; sometimes we needed some padding
and a new Xref section pointing to these
malicious objects. The important thing was
that the position of this malicious Xref section
is defined by the trailer. And
since we can not modify this trailer, this
position is fixed. So this is the only
limitation of the attack, but it works
like a charm. And the question is now: How
many PDF viewers were vulnerable against
this attack? And as you can see, this is
the signature wrapping column. 17 out of
22 applications were vulnerable against
this attack. This was quite an expected
result because the attack is complex, and we
saw that many developers were not
aware of this threat and that's the reason
why so many vulnerabilities were there.
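The role of the four byte range values can be sketched in Python. This is a minimal sketch following the description in the talk, with simplifications that are assumptions: the byte range is passed in separately instead of being parsed from the signature object, the "file" is a toy byte string, and `covers_whole_file` is a hypothetical countermeasure check rather than the logic of any real viewer.

```python
import re


def signed_bytes(data: bytes, byte_range) -> bytes:
    """Concatenate the two signed areas given as [A, B, C, D]."""
    a, b, c, d = byte_range
    return data[a:a + b] + data[c:c + d]   # (offset, length) twice


def covers_whole_file(data: bytes, byte_range) -> bool:
    """Hypothetical check: only one hex string may be left unsigned."""
    a, b, c, d = byte_range
    gap = data[a + b:c]                    # everything excluded from signing
    return (a == 0
            and c + d == len(data)
            and re.fullmatch(rb"<[0-9A-Fa-f]*>", gap) is not None)


part_one = b"%PDF-1.4 ...first signed area... /Contents "
signature = b"<deadbeef>"
part_two = b" ...second signed area... %%EOF"
original = part_one + signature + part_two
honest_range = [0, len(part_one), len(part_one) + len(signature), len(part_two)]

# Wrapping: shift C so that unsigned space opens up behind the signature value.
injected = b" 9 0 obj (malicious) endobj "
wrapped = part_one + signature + injected + part_two
shifted_range = [0, len(part_one),
                 len(part_one) + len(signature) + len(injected), len(part_two)]

print(signed_bytes(original, honest_range) == signed_bytes(wrapped, shifted_range))
print(covers_whole_file(original, honest_range), covers_whole_file(wrapped, shifted_range))
```

The first comparison shows why the signature still verifies: the bytes that are hashed are identical. The second shows one way the shifted range could be noticed, because the unsigned gap is no longer just the signature value.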
Now to the last class of attacks,
universal signature forgery. And we called
it universal signature forgery, but I
preferred to use another definition for
these attacks. I call them stupid
implementation flaws. We are coming from
the PenTesting area and I know a lot of
you are PenTesters, too. And, many of you
have experience, quite interesting
experience with zero bytes, null values or
some kind of weird values. And this is
what we tried in this kind of attack.
We just tried to use some stupid values or
remove references and see what happens.
Considering the signature, there are two
different important elements: The contents
containing the signature value and the
byte range pointing to what is exactly
signed. So, what would happen if we remove
the contents? Our hope was that the
information regarding the signature is
still shown by the viewer as valid without
validating any signature because it was
not possible. But just removing the
signature value is a quite obvious idea, and
we were not successful with this kind of
attack. But let's proceed with other
values, like, for example, contents without
any value, or contents equal to NULL or
zero bytes. And considering this last
version, we had two viewers which were
vulnerable against this attack. And
another case is, for example,
removing the byte range. By removing this
byte range we have some signature value,
but we don't know what is exactly signed.
So, we tried this attack and of course,
byte range without any value or NULL bytes
or byte range with negative
numbers. And usually this last one
crashed a lot of viewers. But the
most interesting is that Adobe made this
mistake by just removing the byte range.
We were able to bypass the entire
security. We didn't expect this behavior,
but it was a stupid implementation flaw,
allowing us to do anything in this
document and all the exploits we show in
our presentations were made on Adobe with
this attack. So let's see what were the
results of this attack. As you can see,
only 4 of 22 viewers were vulnerable
against this attack, and only Adobe
without limitation; for the others, there was a
limitation because if you click on the
signature validation, then a warning was
thrown. It was very easy for Adobe to fix.
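The malformed inputs listed above (missing contents or byte range, empty values, zero bytes, negative numbers) can be rejected before any cryptographic verification is attempted. The following is a minimal sketch of such a check; the function name and the idea of receiving already-parsed values are assumptions for illustration, and it does not describe how Adobe or any other vendor fixed the issue.

```python
def signature_fields_sane(byte_range, contents) -> bool:
    """Return False for the degenerate values tried in the forgery attacks."""
    if byte_range is None or contents is None:
        return False                       # entry removed from the signature
    if len(byte_range) != 4:
        return False                       # empty or truncated byte range
    if any(not isinstance(v, int) or v < 0 for v in byte_range):
        return False                       # negative or non-numeric values
    if len(contents) == 0 or set(contents) <= {0}:
        return False                       # empty value or only zero bytes
    return True


print(signature_fields_sane([0, 840, 960, 240], b"\x30\x82\x01\x0a"))  # plausible
print(signature_fields_sane(None, b"\x30\x82\x01\x0a"))                # no byte range
print(signature_fields_sane([0, 840, 960, 240], b"\x00\x00"))          # zero bytes
```

A signature that fails this check should be reported as invalid rather than silently skipped, since skipping verification is exactly what made the forgery work.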
And as you can see, Adobe didn't make
any mistake regarding incremental
saving or signature wrapping, but
regarding universal signature forgery
they were vulnerable against this attack.
And this was the hope of our approach. In
summary, we were able to break 21 of 22
PDF viewers. The only
Applause
Thanks.
Applause
The only secure PDF viewer is Adobe 9,
which is deprecated and has remote code
execution. The only
Laugh
The only users allowed to use it, or who are
using it, are Linux users, because this is
the last version available for Linux and
that's the reason why we considered it. So,
I'm done with the talk about PDF
signatures and now Fabian can talk about
PDF encryption. Thank you.
Fabian: Yes
Applause
OK, now that we have dealt with the
signatures, let's talk about another
cryptographic aspect in PDFs. And that is
encryption. And some of you might remember
our PDFex vulnerability from earlier this
year. It's, of course, an attack with a
logo and it presents two novel
techniques targeting PDF encryption that
have never been applied to PDF encryption
before. So one of them is the so-called
direct exfiltration where we break the
cryptography without even touching the
cryptography. So no ciphertext
manipulation here. The second one is the
so-called malleability gadgets. And those are
actually targeted modifications of the
ciphertext of the document. But first,
let's take a step back and again take
some keywords in. So PDF uses AES. OK.
Well, AES is good. Nothing can go wrong,
right? So let's go home. Encryption is
fine. Well, of course, we didn't stop
here, but took a closer look. So they use
CBC mode of operation, so cipher block
chaining. And, what's more important is
that they don't use any integrity
protection. So it's AES-CBC without integrity
protection. And you might remember the
scenario from the attacks against
encrypted e-mail, so against OpenPGP and
S/MIME; it's basically the same problem.
But first, who actually uses PDF
encryption? You might ask. For one, we
found some local banks in Germany use
encrypted PDFs as a drop-in replacement
for S/MIME or OpenPGP because their
customers might not want to deal with
the setup of encrypted e-mail.
Second, there were some drop-in plugins for
encrypted e-mail as well. So there are some
companies out there that produce products
that you can put into your Outlook and you
can use encrypted PDF files instead of
encrypted email. We also found that some
scanners and medical devices were able to
send encrypted PDF files via e-mail. So
you can set a password on that machine and
they will send the encrypted PDF via
e-mail and you have to put in the
password some other way. And lastly, we
found that some governmental organizations
use encrypted PDF documents, for example,
the US Department of Justice allows for
sending in some claims via
encrypted PDFs. And I have exactly no idea
how they get the password, but at
least they allow it. So as we are from
academia, let's take a step back and look
at our attacker model. So we've got Alice
and Bob. Alice wants to send a document to
Bob. And she wants to send it over an
unencrypted channel or a channel she
doesn't trust. So of course, she decides
to encrypt it. Second scenario is, they
want to upload it to a shared storage. For
example, Dropbox or any other shared
storage. And of course, they don't trust
the storage. So, again, they use end-to-
end encryption. So let's assume that this
shared storage is indeed dangerous or
malicious. So, Alice will, of course,
again upload the encrypted document to the
attacker in this case, will perform some
targeted modification of that, and will
send the modified documents back to Bob,
who will happily put in the password
because from his point of view, it's
undistinguishable from the original
document and the original plain text will
be leaked back to the attacker, breaking
the confidentiality. So let's take a look
at the first attack on how we did that.
That's the direct exfiltration, so
breaking the cryptography without touching
any cryptography, as I like to say. But
first, PDF encryption in a nutshell.
So you have seen the structure
of the PDF document. There is a header
with a version number. There's a body
where all the interesting objects live. So
there is our confidential content that we
want to actually, well, to actually
exfiltrate as an attacker. And finally,
there is Xref table and the trailer. So
what changes if we decide to encrypt this
document? Well, actually, not a whole lot.
So instead of confidential data, of
course, there's now some encrypted
ciphertext. Okay. And the rest pretty much
remains the same. The only thing that is
added is a new value in the trailer that
tells us how to decrypt this data again.
So there's pretty much all of the structure
left unencrypted. And we thought about:
Why is this? And we took a look at the
standard. So, this is an excerpt from the
PDF specification and I've highlighted the
interesting parts for you. Encryption is
only applied to strings and streams. Well,
those are the values that actually can
contain any text in the document and all
other objects are not encrypted. And that
is because, well, they want to allow
random access to the whole document. So no
parsing the whole document before actually
showing page 16 of the encrypted document.
Well, that seems kind of reasonable. So,
but that also means that the whole
document structure is unencrypted and
only the streams and strings are
encrypted. This reveals a lot of
information to an attacker that he or she
probably shouldn't have. That's, for one,
the number and size of pages, that's the
number and size of objects in the document
and that's also including any links, so
any hyperlinks in the document that are
actually there. So, that's a lot of
information an attacker probably shouldn't
have. So, next we thought maybe we can do
some more stuff. Can we add our own
unencrypted content? And we took a look at
the standard again and found the so-called
crypt filters, which provide finer
granularity control of the encryption.
This basically means as an attacker, I can
change a document to say, hey, only
strings in this document are encrypted and
streams are unencrypted. That's what the
identity filter is for. I have no idea why
they decided to add that to a document
format, but it's there. So that means
there is support for partial encryption and
that means attacker content can be mixed
with actual encrypted content. And we
found 18 different techniques to do that
in different readers. So there are a lot of
ways to do that in the different readers.
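One of these techniques, switching the default filter for strings to Identity, leaves a visible trace in the unencrypted encryption dictionary. The sketch below shows how that trace could be spotted. It is a minimal sketch under assumptions: the dictionary text is a hand-written example rather than one taken from the talk's demo file, the helper names are made up, and it covers only the document-wide `/StmF` and `/StrF` defaults, not per-stream filters or the other techniques.

```python
import re


def crypt_filter_names(encrypt_dict: bytes) -> dict:
    """Read the default crypt filters for streams (/StmF) and strings (/StrF)."""
    names = {}
    for key in (b"StmF", b"StrF"):
        match = re.search(rb"/" + key + rb"\s*/([A-Za-z0-9]+)", encrypt_dict)
        # If the entry is absent, the Identity filter (no encryption) applies.
        names[key.decode()] = match.group(1).decode() if match else "Identity"
    return names


def mixes_plain_and_encrypted(encrypt_dict: bytes) -> bool:
    """True when strings and streams are not protected by the same filter."""
    names = crypt_filter_names(encrypt_dict)
    return names["StmF"] != names["StrF"]


# Hand-written example of an AES-256 encryption dictionary (values assumed).
original = (b"<< /Filter /Standard /V 5 /R 6 /Length 256 "
            b"/CF << /StdCF << /CFM /AESV3 /AuthEvent /DocOpen /Length 32 >> >> "
            b"/StmF /StdCF /StrF /StdCF >>")

# The attacker's edit: strings are declared unencrypted from now on.
tampered = original.replace(b"/StrF /StdCF", b"/StrF /Identity")

print(crypt_filter_names(original), mixes_plain_and_encrypted(original))
print(crypt_filter_names(tampered), mixes_plain_and_encrypted(tampered))
```

Since nothing in the file is integrity-protected, the attacker can make this edit with a text editor and the viewer has no cryptographic way to notice it.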
So let's have a look at a demo. So we have
this document, this encrypted document, we
put in our password and get our secret
message. We now open it again in a text
editor. We see, in object 4 0 down here,
there's the actual ciphertext of the
object, so of the message, and we see it's
AES encrypted, with a 32 byte key, so it's
AES-256. OK. Now we decide to add a new
object that contains, well, plaintext.
And, well, we simply add that to the
contents array of this document. So, we
say "Display this on the first page", save
the document. We open it, and we'll put in
our password and, oh well, this is indeed
awkward. OK. So, now, we have broken the
integrity of an encrypted document. Well,
you might think maybe they didn't want any
integrity in the encrypted files. Maybe
that's the use case people have, I don't
know. But we thought, maybe we can somehow
exfiltrate the plaintext this way. So
again, we took a step back, and looked at
the PDF specification. And the first thing
we found were so-called submit-form
actions. And that's basically the same as
a form on a website. You can put in data.
You might have seen this in a contract, in
a PDF contract, where you can put in your
name, and your address, and so on, and so
on, and the data that is saved inside of
that is saved in strings and streams. And
now remember that is everything that is
encrypted in a document. And, of course,
you can also send that back to an
attacker, or well, to a legitimate use
case, of course, via clicking a button,
but clicking buttons is pretty lame. So we
again looked at the standard and found the
so-called open action. And that is an
action, for example, submitting a form
that can be performed upon opening a
document. So how might this look? This is
how a PDF form looks, already with the
attack applied. So, we've got an URL here
that is unencrypted, because all strings
in this document are unencrypted, and
we've got the value object 2 0, where the
actual encrypted data lives. So, that is
the value of the form fields. And what
will happen on the attacker side as soon
as this document is opened? Well, we'll
get a POST request with the confidential
content. Let's have a demo. Again, we have
this document. We put in our password.
It's the original document you have
already seen. We reopen it in a text
viewer, or a text editor, again see it's
encrypted, and we decide to change all
strings to the identity filter. So, no
encryption is applied to strings from now
on. And then we add a whole blob of
information for the open action, and for
the form. So this will be
performed as soon as the document is
opened. There is a URL, p.df, and the
value is the encrypted object 4 0. We
start an HTTP server on the domain we
specified, we open the document, put in
the password again, and as soon as we open
the document Adobe will helpfully show us
a warning, but they have already selected the
option for remembering that for the
future. And if you accept that, you will
see your secret message on the attacker
server. And that is pretty bad already.
OK. The same works for hyperlinks, so, of
course, there are links in PDF documents,
and as on the Web, we can define a base
URL for hyperlinks. So we can say all URLs
from this document start with http://p.df.
And of course we can define any object as
a URL. So any object we prepared this way
can be sent as a URL, and that will, of
course, trigger a GET request upon opening
the document again, if you defined an open
action for the same object. So again,
pretty bad and breaks confidentiality. And
of course, everybody loves JavaScript in
PDF files, and that works as well. Okay.
Let's talk about ciphertext attacks, so
actual cryptographic attacks, no more not
touching the crypto. So you might remember
the efail attacks on OpenPGP and S/MIME,
and those had basically three
prerequisites. 1: Well, ciphertext
malleability, so it's called malleability
gadgets. That's why we need ciphertext
malleability, and we've got no integrity
protection, that's a plus. Then we need
some known plaintext for actual targeted
modifications. And we need an exfiltration
channel to send the data back to an
attacker. Well, exfiltration channels are
already dealt with as we have hyperlinks
and forms. So we can already check that.
Nice. Let's talk about ciphertext
malleability, or what we call gadgets. So,
some of you might remember this from
crypto 101, or whatever lecture you ever
had on cryptography. This is the
decryption function of CBC, so cipher
block chaining. And it's basically, you've
got your ciphertext up here, and your
plaintext down here. And it works by
simply decrypting a block of ciphertext,
XORing the previous block of ciphertext
onto that, and you'll get the plaintext.
So what happens, if you decide to change a
single bit in the ciphertext, for example,
the first bit of the initialization
vector? Well, that same bit will flip in
the actual plaintext. Wait a second. What
happens, if you happen to know a whole
plaintext block? Well, we can XOR that
onto the first block, and basically get
all zeros, or what we call a gadget, or a
blank sheet of paper, because we can write
on that by taking a chosen plaintext and
XORing that onto this result. And this
way we can, for example, construct URLs in
the actual ciphertext, or in the actual
resulting plaintext. What we can also do
with these gadgets is move them somewhere
else in the document or clone them, so we
can have multiple gadgets at multiple
places in the
ciphertext. But remember, if you do that,
there's always the avalanche effect of
CBC, so you will have some random bytes in
here, but the URL still remains in place.
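The gadget construction described here can
be sketched in a few lines of Python. This
is only an illustration under stated
assumptions, not the tooling used in the
talk: the block cipher is a toy Feistel
construction standing in for AES (the
malleability comes from CBC itself, not
from the cipher), and the names
toy_encrypt_block, cbc_decrypt and
make_gadget, as well as the sample strings,
are made up for this example.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes, same as AES


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT AES, illustration only).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt the block, then XOR the previous ciphertext block (or IV).
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def make_gadget(iv: bytes, known: bytes, chosen: bytes) -> bytes:
    # iv XOR known cancels the known plaintext ("blank sheet of paper");
    # XORing the chosen plaintext on top writes onto that blank sheet.
    return xor(xor(iv, known), chosen)


key, iv = os.urandom(16), os.urandom(BLOCK)
known = b"KNOWN-PLAINTEXT!"    # first plaintext block, known to the attacker
secret = b"top secret data!"   # second block, unknown to the attacker
ct = cbc_encrypt(key, iv, known + secret)

chosen = b"http://p.df/?x=0"   # 16 bytes the attacker wants to appear
forged_iv = make_gadget(iv, known, chosen)  # computed without the key
forged_plain = cbc_decrypt(key, forged_iv, ct[:BLOCK])
print(forged_plain)
```

Note that make_gadget never touches the
key: the attacker only needs the original
IV, the known plaintext block and the
text to be written.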
Okay. That's ciphertext malleability done.
As I've said we need some plaintext. We
need to have some known plaintext. And as
the PDF standard has been pretty helpful
up until now, in breaking PDF encryption,
let's take a look again. And what we found
here: Permissions. So a PDF document can
have different permissions for the author,
and the user of the document. This
basically means the author can edit the
document and the users might not be able
to do that. And of course, people started
to tamper with that value if it was left
unencrypted, so in the newest version, it
was decided this should be encrypted as a
16 byte value. So we've got 16 bytes. How
do they look? Well, at first, we need room
for extension. We need lots of
permissions. Then we put 4 bytes of the
actual permission value, which is also in
the document in unencrypted form. Then we
need
one byte for encrypted metadata, and for
some reason we need some acronym, "adb",
I'll leave it to you to figure out what
that stands for. And finally, we've got
four random bytes, because we have to fill
up 16 bytes, and we have run out of ideas.
Okay. We take all of that, encrypt it, and
oh well, we know a lot of that, and that
is basically known plaintext by design.
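As a rough sketch of that 16-byte block,
assuming the byte order of the PDF 2.0
specification (the permission value first,
little-endian, followed by the four 0xFF
extension bytes): the function name
perms_plaintext is hypothetical, and the
encryption step itself is left out.

```python
import os
import struct


def perms_plaintext(p: int, encrypt_metadata: bool, rand: bytes = None) -> bytes:
    """Build the 16-byte plaintext that gets encrypted into the Perms value.

    Assumed layout: 4 bytes permission value, 4 bytes 0xFF extension,
    1 byte 'T'/'F' for encrypted metadata, 3 bytes 'adb', 4 random bytes.
    """
    tail = os.urandom(4) if rand is None else rand
    return (
        struct.pack("<i", p)                    # permission bit field, e.g. -4
        + b"\xff" * 4                           # room for extension
        + (b"T" if encrypt_metadata else b"F")  # encrypted-metadata flag
        + b"adb"                                # fixed marker
        + tail                                  # random filler
    )


block = perms_plaintext(-4, True)
known_prefix = block[:12]  # everything except the 4 random bytes
print(known_prefix.hex(), len(block))
```

With the permission value readable in the
document, 12 of the 16 bytes are
predictable before any decryption happens.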
Which is bad. Let's look at how this looks
in a document. So, you see the perms
value, I've marked it down here. That is
the actual extended value I've shown you
on the last slide. And above that you'll
see the unencrypted value that's inside
this perms value, so the minus 4 in this
case, it's basically a bit field. On the
right side you see the actual encrypted
contents, and helpfully, all of this is
encrypted under the same document-wide key
in the newest version of the
specification. And that means we can
reuse this plaintext anywhere in the
document we want, and we can reuse this
to build gadgets. To sum that last point
up for you: Adobe decided to add
permissions to the PDF format, and people
thought of tampering with them. So they
decided to encrypt these permissions to
prevent tampering, and now known plaintext
is available to attackers. All right. So
that's basically all of the prerequisites
done, and let's again have a demo. So, we
again open this document, put in our
password, well, as soon as Chrome decides
to open this document, we put in our
password. It's the same as before. Now,
I've prepared a script for you, because I
really can't do this live, and it
basically does what I've told you. It's
getting a blank gadget from the perms
value. It's generating a URL from that.
It's generating a field name, so that it
will look nice on the server side, we
regenerate this document and put a form in
there. We start a web server, open this
modified document, put in the password
again and oh well, Chrome doesn't even
ask. So as soon as this document is opened
in Chrome and the password is put in,
we'll get our secret message delivered to
the attacker.
Applause
So we took a look at 27 viewers and found
all of them vulnerable to at least one of
our attacks. So some of them work with no
user interaction as we have seen in
Chrome. Some work with user interaction in
specific cases, as you've seen with Adobe
with a warning, but generally all of these
were attackable in one way or the other.
So what can be done about all of this?
Well, you might think signatures might
help. That's usually the first point
people bring up: "A signature on the
encrypted file will help." Well, no, not
really. Why is that? Well, for one, a
broken signature does not prevent opening
the document. So we'll still be able to
exfiltrate as soon as a password is put
in. Signatures can be stripped because
they're not encrypted. And as you have
seen before, they can also be forged in
most viewers. Signatures are not the
answer. Closing exfiltration channels is
also not the answer because for one, it's
hard to do. And how would you even find
all exfiltration channels in an 800-page
standard? And I mean, we have barely
scratched the surface of exfiltration
channels. And should we really remove
forms and hyperlinks from documents? And
should we remove JavaScript? OK, maybe we
should. And finally, if you have to do
that, please ask the user before
connecting to a web server. So let's look
at some vendor reactions. Apple decided to
do exactly what I've told you: to add a
dialog to warn the user and even show the
whole URL with the encrypted plaintext.
And Google decided to stop trying to fix
the unfixable in Chrome. They fixed the
automatic exfiltration, but there's really
nothing they can do about the standard. So
this is a problem that has to be fixed in
the standard. And that is basically that.
For mitigating wrapping attacks, we have
to deprecate partial encryption and
disallow access from unencrypted to
encrypted objects. And against the gadget
attacks, we have to use authenticated
encryption like AES-GCM. OK. And Adobe has
told us that they were escalating this to
the ISO working group that's now
responsible for the PDF standard and this
will be taken up in the next revision. So
that's a win in my book.
Applause
Herald: Thank you so much, guys. That was
really awesome. Please queue up by the
microphones if you have any questions, we
still have some time left for Q and A. But
I think your research is really, really
interesting because it opens my mind to
like how would this actually be able to be
misused in practice? Like, and I don't
know, like, what's your take? I guess
since you've been working so much with
this, you must have some kind of idea as
to what devious things you could come up
with.
Fabian: I mean, it's still an attacker
scenario that requires a lot of resources
and a very motivated attacker. So this
might not be very important to the normal
user. Let's be real here. So most of us
are not targeted by the NSA, I guess. So
you need an active attacker, an active man
in the middle to actually perform these
attacks.
Herald: Great. Thank you. And then I think
we have a question from microphone number
four, please.
Microphone 4: Yes. You said that the
next standard might have a fix.
Do you know a time frame on how long it
takes to build such a standard?
Fabian: Well, no, we don't really know. We
have talked with Adobe and they told us
they will show the next version of the
standard to us before actually releasing
that, but we have no time frame at all
from them.
Microphone 4: OK. Thank you.
Herald: Thank you.
Microphone number five, please.
Microphone 5: Thank you for a very
interesting talk. You showed in the first
part that the signature has like these
four numbers with the byte range. And why
is this, like four numbers, not part of a
signature? Is there a technical reason for
that? Because the byte offset is
predictable.
Vladi: It is! The byte range is protected
by the signature. But we just defined the
second one and just moved the signed one
to be validated later. So there are two
byte ranges. But only the first one, the
manipulated one, will be processed.
Microphone 5: Thank you.
Herald: Thank you so much. Microphone
number four, please.
Microphone 4: Oh, this is way too high for
me. OK. I have an answer and a question
for you. You mentioned during the talk
that you weren't sure how the Department
of Justice distributes the passwords
for encrypting PDFs. The answer is: in
plain text, in a separate email or as the
password of the week, which is distributed
through various means. That is also what
the Department of Homeland Security does,
and the military is somewhat less stupid.
As a question: I have roughly a half
terabyte of sensitive PDFs that I would
like to scan for your attack and also for
redaction failures. Do you know of any
fast, feasible ways to scan documents for
the presence of this kind of attack?
Fabian: I don't know of any tools, but I
mean, scanning for the gadget attacks is
actually possible if you try to do some
entropy detection. So, because you reuse
ciphertext, you will have less entropy in
your ciphertext, but that's pretty hard to
do. Direct exfiltration should probably be
detectable by scanning simply for words
like "identity". Well, beyond that, there
are 18 different techniques that we
provided in the paper. But I don't know of
any tools
to do that automatically.
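The two ideas in this answer can be
sketched as a naive scanner. This is an
assumption-laden starting point, not an
existing tool: the keyword list is only a
guess at useful indicators, the search
works on raw bytes and so misses anything
inside compressed object streams, and
counting repeated 16-byte blocks is a
simpler stand-in for the entropy detection
mentioned above. The function names are
hypothetical.

```python
from collections import Counter

# PDF names that hint at partial encryption or exfiltration channels.
SUSPICIOUS_NAMES = (
    b"/Identity",
    b"/SubmitForm",
    b"/URI",
    b"/JavaScript",
    b"/OpenAction",
)


def find_suspicious_names(pdf_bytes: bytes) -> dict:
    """Count occurrences of each suspicious name in the raw file bytes."""
    return {
        name.decode(): pdf_bytes.count(name)
        for name in SUSPICIOUS_NAMES
        if name in pdf_bytes
    }


def repeated_blocks(data: bytes, block_size: int = 16) -> int:
    """Count aligned blocks that repeat an earlier block.

    Reused ciphertext (cloned gadgets) shows up as repeats; random-looking
    ciphertext should have almost none. The data passed in should be a
    single encrypted string or stream so that blocks are aligned.
    """
    blocks = [
        data[i:i + block_size]
        for i in range(0, len(data) - block_size + 1, block_size)
    ]
    counts = Counter(blocks)
    return sum(c - 1 for c in counts.values() if c > 1)


sample = b"<< /Type /Catalog /OpenAction << /S /SubmitForm /F (http://p.df) >> >>"
print(find_suspicious_names(sample))
print(repeated_blocks(b"A" * 16 + b"B" * 16 + b"A" * 16))
```

For a real collection, the bytes would
come from reading each file in binary
mode, and any hit would still need manual
review, since forms, links and JavaScript
also appear in harmless documents.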
Microphone 4: Thank you.
Herald: Great. Thank you. And microphone
number two, please.
Microphone 2: Thank you for your very
interesting
presentation. I have one suggestion and
one question for the mitigation scheme. If
you simply run your PDF reader in a
virtual machine, that is firewalled away,
so your firewall won't let anything go
out. But for the signature
forgeries, I had an idea. I'm not sure if
this is actually a stupid idea, but did
you consider faking the certificate?
Because presumably the signature is
protected by the signer's certificate. You
make up your own, signing with that. Does
it catch it and how?
Vladi: We considered it but not in this
paper. We assume that the certificate and
the entire chain of trust for this path is
totally secure. It was just an assumption
to just concentrate only on the attacks we
already found. So, perhaps there will be
further research provided by us in the
next months and years.
Herald: We might just hear more from you
in the future. Thank you so much. And now
questions from the Internet, please.
Signal Angel: I have two questions to the
first part of your talk from the Internet.
The first one is you mentioned a few
reactions, but can you give a bit more
detail about your experience with vendors
while reporting these issues?
Vladi: Yeah. When we first
started, we asked the CERT team from BSI,
CERT-Bund, to help us because there were a
lot of affected vendors and we were not
able to provide the support in a feasible
way. So they supported us the entire way.
We first created the report
containing the exact description of the
vulnerabilities and all exploits. Then, we
distributed it to the BSI and they
contacted the vendors and just proxied
the communication and there was a lot of
communication. So I'm not aware of the
entire communication, but only about the
technical stuff where we were asked to
just retest the fix and so on. So there
was some reaction from Adobe, FoxIt and a
lot of viewers reacted on our attacks and
contacted us, but not everybody.
Herald: Thank you so much. Unfortunately,
that's the only time that we have
available for questions today. I think you
guys might stay around for a couple of
minutes, just if someone has any more
questions. Fabian and Vladislav, I can't
thank you enough. Thank you so much.
It was very interesting. Please give them
a great round of applause.
Vladi: Thank you.
Applause
36c3 postroll music
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!