-
36c3 preroll music
-
Herald: The next talk is on how to break
PDFs, breaking the encryption and the
-
signatures, by Fabian Ising and Vladislav
Mladenov. Their talk was accepted at CCS
-
this year in London and they had that in
November. It comes from research that
-
basically produced two different kinds of
papers and it has been... people worldwide
-
have been interested in what has been
going on. Please give them a great round
-
of applause and welcome them to the stage.
-
Applause
-
Vladi: So can you hear me? Yeah. Perfect.
OK. Now you can see the slides. My name is
-
Vladislav Mladenov, or just Vladi if you
have some questions for me, and this is
-
Fabian. And we are allowed today to talk
about how to break PDF security or, more
-
specifically, about how to break the
cryptographic operations in PDF files. We
-
are a large team from the University of
Bochum, Münster and Hackmanit GmbH. So as
-
I mentioned: We will talk about
cryptography and PDF files. Does it work?
-
Fabian: All right. OK. Let's try that
again. Okay.
-
Vladi: Perfect. This talk will consist of
two parts. The first part is about
-
digitally signed PDF files and how can we
recognize such files? If we open them we
-
see the information regarding that the
file was signed and all verification
-
procedures were valid. And more
information regarding the signature
-
validation panel and information about who
signed this file. This is the first part
-
of the talk and I will present this topic.
And the second part is regarding
-
encrypted PDF files and how can we recognize
such files? If you tried to open such
-
files, the first thing you see is the
password prompt. And after entering the
-
correct password, the file is decrypted
and you can read the content within this
-
file. If you open it with Adobe,
additional information regarding if this
-
file is secured or not is displayed
further. And this is the second part of
-
our talk, and Fabian will talk about how
we can break PDF encryption. So before we
-
start with the attacks on signatures or
encryption, we first need some basics. And
-
after six slides, you will be experts
regarding PDF files and you will
-
understand everything about it. But maybe
it's a little bit boring, so be patient:
-
there are only 6 slides. So the first is
quite easy. PDF files are... the first
-
specification was in 1993 and almost from
the beginning PDF cryptographic operations
-
like signatures and encryption were already
there. The last version is PDF 2.0 and it
-
was released in 2017. And according to
Adobe 1.6 billion files are on the web and
-
perhaps more are exchanged beyond the web. So
basically PDF files are everywhere. And
-
that's the reason why we consider this
topic and tried to find or to analyze the
-
security of the features. If we have some
very simple file and we open it with Adobe
-
Reader, the first thing we see is, of
course, the content. "Hello, world!" in
-
this case, and additional information
regarding the focused page and how many
-
pages this document has. But what would
happen if we don't use a PDF viewer and
-
just use some text editor? We use the
Notepad++ to open and later manipulate the
-
files. So I will zoom this thing... this
file. And the first thing we see is that
-
we can read it. Perhaps it's quite, quite
funny. But we can still extract some
-
information of this file. For example,
some information regarding the pages. And
-
here you can see the information that the
PDF file consists of one page. But more
-
interesting is that we can see the
content of the file itself. So the lessons
-
we learned is that we can use a simple
text editor to view and edit PDF files.
-
And for our attacks, we used only this
text editor. So let's go to the details.
-
How PDF files are structured and how they
are processed. PDF files consist of 4
-
parts: header, body and body is the most
important part of the PDF files. The body
-
contains the entire information presented
to the user. And 2 other sections: Xref
-
section and trailer. A very important thing
about processing PDF files is that
-
they're processed not from the top to the
bottom, but from the bottom to the top. So
-
the first thing that the PDF viewer
analyzes or processes is the trailer. So
-
let's start doing that. What information
is stored in this trailer? Basically, there
-
are two very important pieces of information.
The first is the information:
-
what is the root element of this PDF? So
which is the first object which will be
-
processed? And the second important
information is where the Xref section
-
starts. It's just a byte offset pointing
to the position of the XRef section within
-
the PDF file. So this pointer, as
mentioned before, points to the Xref
-
section. But what is the Xref section
about? The Xref section is a catalog
-
pointing or holding the information where
the objects defined in the body are
-
contained or the byte positions of this
object. So how can we read this weird Xref
-
section? The first information we extract
is that the first object, which is defined
-
here, is the object with ID 0 and we have
5 further elements or objects which are
-
defined. So the first object is here. The
first entry is the byte position within
-
the file. The second is its generation
number. And the last character indicates whether
-
this object is in use or not. So
reading it, reading this Xref section, we
-
extract the information that the object
with ID 0 is at byte position 0 and is not
-
in use. So the object with ID 1 is at the
position 9 and so on and so forth. The
-
object number comes from counting the
entries: 0, 1, 2, 3 and 4.
-
So the object with ID 4 can be found at
the offset 184 and it's in use. In other
-
words, the PDF viewer knows where each
object will be found and can properly
-
display it and process it. Now we come to
the most important part: the body, and I
-
mentioned it that in the body the entire
content which is presented to the user is
-
contained. So let's see. Object 4 0 is
this one and as you can see, it contains
-
the word "Hello World". The other objects
are referenced, too. So each pointer
-
points exactly to the starting position of
each of the objects. And how can we read
-
this object? You see, we have an object
starting with the ID number, then the
-
generation number and the word "obj". So
you now know where the object starts
-
and when it ends. Now how can we process
this body? As I mentioned before in the
-
trailer, there was a reference regarding
the root element and this element was with
-
ID 1 and generation number 0. So now
we start reading the document here and we
-
have a catalog and a reference to some
pages. Pages is just a description of all
-
the pages contained within the file. And
what we can see here is that we have this
-
number, count 1, so we have only one page
and a reference to the page object which
-
contains the entire
description of the page. If we have
-
multiple pages, then we will have here
multiple elements. Then we have one page.
-
And here we have the contents, which is a
reference to the string we already saw.
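
The walk through trailer, Xref section and body described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names (`build_minimal_pdf`, `parse_xref`, `root_object_id`, `read_object`) are made up for this example, the file is a stripped-down PDF-like skeleton that is not meant to render in a viewer, and only a single classic Xref table is handled (no incremental updates, no Xref streams).

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble a tiny PDF-like file and compute the Xref offsets ourselves."""
    content = b"BT (Hello World) Tj ET"
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n",
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>\nendobj\n",
        b"4 0 obj\n<< /Length %d >>\nstream\n%s\nendstream\nendobj\n"
        % (len(content), content),
    ]
    pdf = b"%PDF-1.4\n"                      # header: one line with the version
    offsets = []
    for obj in objects:                      # body: remember where each object starts
        offsets.append(len(pdf))
        pdf += obj
    xref_pos = len(pdf)
    pdf += b"xref\n0 %d\n" % (len(objects) + 1)
    pdf += b"0000000000 65535 f \n"          # object 0: not in use
    for off in offsets:
        pdf += b"%010d 00000 n \n" % off     # offset, generation number, in use
    pdf += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    pdf += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return pdf


def parse_xref(pdf: bytes) -> dict:
    """Return {object_id: (byte_offset, generation, in_use)}."""
    # Read from the bottom: startxref holds the byte offset of the Xref section.
    xref_pos = int(re.findall(rb"startxref\s+(\d+)", pdf)[-1])
    lines = pdf[xref_pos:].split(b"\n")
    assert lines[0].strip() == b"xref"
    first_id, count = map(int, lines[1].split())
    table = {}
    for i in range(count):
        offset, generation, flag = lines[2 + i].split()
        table[first_id + i] = (int(offset), int(generation), flag == b"n")
    return table


def root_object_id(pdf: bytes) -> int:
    """The trailer names the root element, i.e. the first object to process."""
    return int(re.search(rb"/Root\s+(\d+)\s+\d+\s+R", pdf).group(1))


def read_object(pdf: bytes, table: dict, obj_id: int) -> bytes:
    """Jump to the byte offset from the Xref table and read up to 'endobj'."""
    offset, _generation, in_use = table[obj_id]
    if not in_use:
        raise KeyError(f"object {obj_id} is not in use")
    end = pdf.index(b"endobj", offset) + len(b"endobj")
    return pdf[offset:end]
```

Calling `read_object(pdf, parse_xref(pdf), root_object_id(pdf))` on the generated file returns the catalog, and following the references 2, 3 and 4 from there leads to the content stream with the text.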
-
Perfect. If you understand this then you
know everything or almost everything about
-
PDF files. Now you can just use your
editor and open such files and analyze
-
them. Then we need one feature... I forgot
the last part. The most simple one. The
-
header. It should be just one line stating
which version is used. For example, in our
-
case, 1.4. For the latest version of PDF,
2.0 would be stated here. Now, we need this
-
one feature called "Incremental Update".
And I call this feature - do you know this
-
feature highlighting something in the PDF
file or putting some sticky notes?
-
Technically, it's called "incremental
update." I just call it reviewing master
-
and bachelor thesis of my students because
this is exactly the procedure I follow. I
-
just read the text and highlight something
and store the information I put at it.
-
Technically, by putting such a sticky note,
this additional information is appended
-
after the end of the file. So we have a
body update which contains exactly the
-
additional information of the new
objects and, of course, a new Xref section
-
and a new trailer pointing to this new
object. Okay, we are done. Considering
-
incremental update, we saw that it is used
mainly for sticky notes or highlighting.
-
But we observed something which is very
important: with an incremental update we
-
can redefine existing objects, for
example, we can redefine the object with
-
ID 4 and put new content. So we replace in
this manner the word "Hello World" with
-
another sentence and of course the Xref
section and the trailer point to this new
-
object. So this is very important. With
incremental update we are not stuck to
-
only adding some highlighting or notes. We
can redefine already existing content and
-
perhaps we need this for the attacks we
will present. So let's talk about PDF
-
signatures. First, we need to differentiate
between electronic signatures and digital
-
signatures. Electronic signature: from a
technical point of view, it's just an
-
image. I just wrote it on my PC and put it
into the file. There is no cryptographic
-
protection. It could be me lying on the
beach doing something. From a cryptographic
-
point of view it is the same. It does not
provide any security, any cryptographic
-
security. What we will talk about here is
about digitally signed files, so if you
-
open such files, you have the additional
information regarding the validation about
-
the signatures and who signed this PDF
file. So as I mentioned before, this talk
-
will concentrate only on these digitally
signed PDF files. How? What kind of
-
process is behind digitally signing PDF
files? Imagine we have this abstract
-
overview of a PDF document. We have the
header, body, Xref section and trailer. We
-
want to sign it. What happens is that we
take this PDF file and via incremental
-
update we append additional information:
there is a new catalog and,
-
more important, a new signature object
containing the signature value and
-
information about who signed this PDF
file. And of course, there is an Xref
-
section and trailer. And relevant for you:
The entire file is now protected by the
-
PDF signature. So manipulations within
this area should not be possible, right?
-
Yeah, let's talk about this: why it's not
possible and how can we break it? First,
-
we need an attack scenario: what do we want
to achieve as an attacker? We assumed in
-
our research that the attacker possesses
this signed PDF file. This could be an old
-
contract, receipt or, in our case, a bill
from Amazon. And if we open this file, the
-
signature is valid. So everything is
green. No warnings are thrown and
-
everything is fine. What we tried to do is
to take this file, manipulate it somehow
-
and then send it to the victim. And now
the victim expects to receive a digitally
-
signed PDF file, so just stripping the
digital signature is a very trivial
-
scenario and we did not consider it
because it's trivial. We considered that
-
the victim expects to see that there is a
signature and it is valid. So no warnings
-
are thrown and the entire left side
is exactly the same as in the normal
-
behavior. But on the other side, the
content was exchanged so we manipulated
-
the receipt and exchanged it with another
content. The question is now: how can we
-
do it on a technical level? And we came up
with three attacks: incremental saving
-
attacks, signature wrapping and universal
signature forgery. And I will now
-
introduce the techniques and how these
attacks are working. The first attack is
-
the incremental saving attack. So I
mentioned before that via incremental
-
saving or via incremental updates, we can
add and remove and even redefine already
-
existing objects and the signature still
stays valid. Why is this happening?
-
Consider now again our case. We have some
header, body, Xref table and trailer and
-
the file is now signed and the signature
protects only the signed area. So what
-
would happen if I put a sticky note or
some highlighting? An incremental update
-
happens. If I open this file, usually this
happens: We have the information that this
-
signature is valid, when it was signed and
so on and so forth. So our first idea was
-
to just put new body updates, redefine
already existing content and with a Xref
-
table and trailer we point to the new
content. This is quite trivial because
-
it's a legitimate feature in PDF files, so
we didn't expect to be quite successful
-
and we were not so successful. But the
first idea: we applied this attack, we
-
opened it and we got this message. So it's
kind of a weird message because an
-
experienced user sees "valid, but the
document has been updated" and you should
-
know what exactly this means. But we
did not consider this attack as successful
-
because the warning is not the same or the
status of the signature validation is not
-
the same. So what we did is to evaluate
this first, trivial case
-
against the viewers we had, and LibreOffice,
for example, was vulnerable
-
against this trivial attack. This was the
only viewer which was vulnerable against
-
this trivial variation. But then we asked
ourselves: Okay, the other viewers are
-
quite secure. But how do they detect these
incremental updates? And from developer
-
point of view, the laziest thing we can do
is just to check if another Xref table and
-
trailer were added after the signature was
applied. So we just put our body updates
-
but just deleted the other two parts. This
is not a standard compliant PDF file. It's
-
broken. But our hope was that the PDF
viewer fixes this kind of stuff for us and
-
that these viewers are error-tolerant. And
we were quite successful because the
-
verification logic just checked: Is there
an Xref table and trailer after the
-
signature was applied? No? Okay.
Everything's fine. The signature is valid.
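
The lazy check described here can be modelled in a few lines. This is a hypothetical model of the viewer logic as the talk describes it, not code taken from any real viewer; the byte strings are simplified placeholders and the function names are invented for the example.

```python
# Placeholder for the part of the file that the signature covers.
SIGNED = b"%PDF-1.4\n(signed content)\nxref\n(table)\ntrailer\n(dict)\n%%EOF\n"
# Attacker's body update that redefines an existing object.
BODY_UPDATE = b"4 0 obj\n(replaced content)\nendobj\n"
# The parts a standard-compliant incremental update would also carry.
XREF_AND_TRAILER = b"xref\n(table)\ntrailer\n(dict)\n%%EOF\n"


def lazy_detects_update(pdf: bytes, signed_len: int) -> bool:
    """Lazy check: only react if an Xref table AND a trailer follow the signed part."""
    tail = pdf[signed_len:]
    return b"xref" in tail and b"trailer" in tail


def strict_detects_update(pdf: bytes, signed_len: int) -> bool:
    """Stricter check: any byte after the signed part counts as a modification."""
    return len(pdf) > signed_len


compliant_update = SIGNED + BODY_UPDATE + XREF_AND_TRAILER
broken_update = SIGNED + BODY_UPDATE  # Xref table and trailer deleted
```

With these inputs the lazy check flags `compliant_update` but misses `broken_update`, while the strict check flags both. Whether a real viewer then still processes the trailing objects depends on how error-tolerant its parser is.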
-
No warning was thrown. But then the
application logic saw that incremental
-
updates were applied and fixed this for us
and processed these body updates and no
-
warning was thrown. Some of the viewers
required to have a trailer. I don't know
-
why - it was a Black box testing. So we
just removed the Xref table, but the
-
trailer was there and we were able to
break further PDF viewers. The most
-
complex variation of the attack was the
following: the PDF viewers checked
-
whether every incremental update contains a
signature object. But they did not check
-
whether the incremental update is actually
covered by this signature. So we just copy-pasted
-
the signature which was provided here and
we just forced the PDF viewer to validate
-
this signed content twice - and still our
body updates were processed and for
-
example, Foxit or Master PDF were
vulnerable against this type of attack. So
-
the evaluation of our attack: We
considered as part of our evaluation 22
-
different viewers - among others, Adobe
with different versions, Foxit, and so on.
-
And as you can see 11 of 22 were
vulnerable against incremental saving. So
-
50 percent, and we were quite surprised
because we saw that the developers saw
-
that incremental updates could be
dangerous regarding the signature
-
validation. But we were still able to
bypass their considerations. We had - a
-
full signature bypass means that there is
no possibility for the victim to detect
-
the attack. A limited signature bypass
means that the victim can detect the
-
attack, but only by clicking on at least
one additional window and explicitly
-
validating the signature. But the most
important thing
-
is that on opening the file, there was a status
message that the signature validation succeeded and
-
all signatures are valid. So this was the
first layer and the viewers were
-
vulnerable against this. So let's talk
about the second attack class. We called
-
it "signature wrapping attack" and this is
the most complex attack of the 3 classes.
-
And now we have to go a little bit into
the details of how PDF signatures are
-
made. So imagine now we have a PDF file.
We have some header and the original
-
document. The original document contains
the header, the body, the Xref section and
-
so on and so forth. And we want to sign
this document. Technically, again, an
-
incremental update is provided and we have
a new catalog here. We have some other
-
objects, for example, certificates and so
on and the signature objects. And we will
-
now concentrate on this signature object
because it's essential for the attack we
-
want to carry out. And the signature
object contains a lot of information, but
-
for this attack only two elements
are relevant: the contents and the byte
-
range. The contents contains the signature
value. It's a PKCS7 container containing
-
the signature value and the certificates
used to validate the signature. Then there
-
is the byte range. The byte range contains four
different values, and this is how these values
-
are being used. The first two, A and B
define the first signed area. And this is
-
here from the beginning of the document
until the start of the signature value.
-
Why do we need this? Because the signature
value is part of the file that is signed. So we need
-
to exclude the signature value from the
computation over the document. And this is how the
-
byte range is used. The first part is
from the beginning of the document until
-
the signature value starts, and
after the signature ends until the end of
-
the file is the second area, specified by
the two values C and D. So, now we have
-
everything protected besides the signature
value itself. What we wanted to try is to
-
create additional space for our attacks.
So our idea was to move the second signed
-
area. And how can we do it? So basically
we can do it by just defining another byte
-
range. And as you can see here, the byte
range points from area A to B. So this
-
area, we didn't make any manipulation in
this part, right? It was not modified at
-
all. So it's still valid. And the second
part, the new C value and the next D
-
bytes, we didn't change anything here,
right? So basically, we didn't change
-
anything in the signed area. And the
signature is still valid. But what we
-
created was a space for some malicious
objects; sometimes we needed some padding
-
and a new Xref section pointing to these
malicious objects. The important thing was
-
that the position of this malicious Xref section
is defined by the trailer. And
-
since we cannot modify this trailer, this
position is fixed. So this is the only
-
limitation of the attack, but it works
like a charm. And the question is now: How
-
many PDF viewers were vulnerable against
this attack? And as you can see, this is
-
the signature wrapping column. 17 out of
22 applications were vulnerable against
-
this attack. This was a quite expected
result: because the attack is complex,
-
many developers were not
aware of this threat and that's the reason
-
why so many vulnerabilities were there.
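
The byte range mechanics behind signature wrapping can be illustrated with a small sketch. It assumes the four values are read as two (offset, length) pairs and that the only thing allowed in the gap between the two signed areas is the hex-encoded contents string including its angle brackets. The function names and the sample bytes are made up for this example.

```python
import re


def signed_bytes(pdf: bytes, byte_range) -> bytes:
    """Concatenate the two areas [a, a+b) and [c, c+d) that get hashed."""
    a, b, c, d = byte_range
    return pdf[a:a + b] + pdf[c:c + d]


def byte_range_covers_file(pdf: bytes, byte_range) -> bool:
    """Accept only if everything except the signature value itself is signed."""
    a, b, c, d = byte_range
    if a != 0 or b <= 0 or d < 0 or c < a + b:
        return False
    if c + d != len(pdf):          # second area must run to the end of the file
        return False
    gap = pdf[a + b:c]             # must be exactly the <hex> contents value
    return re.fullmatch(rb"<[0-9A-Fa-f]*>", gap) is not None


head = b"%PDF-1.4 (document and signature object up to) /Contents "
sig = b"<" + b"ab" * 8 + b">"
tail = b" >> endobj (Xref section and trailer) %%EOF"

original = head + sig + tail
original_range = [0, len(head), len(head) + len(sig), len(tail)]

# Wrapping: the second signed area is moved back to make room for new objects.
wrapped = head + sig + b" 9 0 obj (malicious) endobj " + tail
wrapped_range = [0, len(head), len(wrapped) - len(tail), len(tail)]
```

Both byte ranges select the same signed bytes, so a hash over them matches in both files, but only `original_range` passes the coverage check. A real attack also has to deal with where the byte range itself is stored and with the fixed Xref position mentioned above, which this sketch ignores.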
Now to the last class of attacks,
-
universal signature forgery. And we called
it universal signature forgery, but I
-
prefer to use another definition for
these attacks. I call them stupid
-
implementation flaws. We are coming from
the PenTesting area and I know a lot of
-
you are PenTesters, too. And, many of you
have experience, quite interesting
-
experience with zero bytes, null values or
some kind of weird values. And this is
-
what we tried in this kind of attack.
We just tried some stupid values or
-
removed references to see what happens.
Considering the signature, there are two
-
different important elements: The contents
containing the signature value and the
-
byte range pointing to what is exactly
signed. So, what would happen if we remove
-
the contents? Our hope was that the
information regarding the signature is
-
still shown by the viewer as valid without
validating any signature because it was
-
not possible. Just removing the
signature value is a quite obvious idea. And
-
we were not successful with this kind of
attack. But let's proceed with
-
other values like, for example, contents without
any value or contents equal to NULL or
-
zero bytes. And considering this last
version, we had two viewers which were
-
vulnerable against this attack. And
another case is, for example,
-
removing the byte range. By removing this
byte range we have some signature value,
-
but we don't know what is exactly signed.
So, we tried this attack and of course,
-
byte range without any value or NULL bytes
or byte range with negative
-
numbers. And usually this last one
crashed a lot of viewers. But the
-
most interesting thing is that Adobe made this
mistake: by just removing the byte range,
-
we were able to bypass the entire
security. We didn't expect this behavior,
-
but it was a stupid implementation flaw,
allowing us to do anything in this
-
document and all the exploits we show in
our presentations were made on Adobe with
-
this attack. So let's see what were the
results of this attack. As you can see,
-
only 4 of 22 viewers were vulnerable
against this attack, and only Adobe
-
without limitation; for the others, there was
a limitation because if you click on the
-
signature validation, then a warning was
thrown. It was very easy for Adobe to fix.
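
A sanity check against the malformed values listed above could look roughly like this. It is a sketch under assumptions: the signature object is modelled as a plain Python dict with the keys `Contents` and `ByteRange`, and the function name is hypothetical. It only rejects obviously broken input before any cryptographic verification and does not verify the signature itself.

```python
def signature_fields_are_sane(sig: dict, file_size: int) -> bool:
    """Reject missing, null, empty, all-zero or negative signature fields."""
    contents = sig.get("Contents")
    byte_range = sig.get("ByteRange")
    # Contents must be present, non-empty and not consist of zero bytes only.
    if not isinstance(contents, bytes) or not any(contents):
        return False
    # ByteRange must be present and hold exactly four non-negative integers.
    if not isinstance(byte_range, (list, tuple)) or len(byte_range) != 4:
        return False
    if not all(isinstance(n, int) and n >= 0 for n in byte_range):
        return False
    a, b, c, d = byte_range
    # The two areas must not overlap and must stay inside the file.
    return a + b <= c and c + d <= file_size
```

For example, `signature_fields_are_sane({"Contents": b"\x30\x82"}, 25)` is rejected because the byte range is missing, which is the case that the talk reports as working against Adobe.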
-
And as you can see, Adobe didn't
make any mistake regarding incremental
-
saving or signature wrapping, but
regarding universal signature forgery
-
they were vulnerable.
And this was the hope of our approach. In
-
summary, we were able to break 21 of 22
PDF viewers. The only
-
Applause
Thanks.
-
Applause
The only secure PDF viewer is Adobe 9,
-
which is deprecated and has remote code
execution. The only
-
Laugh
The only users still using it
-
are Linux users, because this is
the last version available for Linux and
-
that's the reason why we considered it. So,
I'm done with the talk about PDF
-
signatures and now Fabian can talk about
PDF encryption. Thank you.
-
Fabian: Yes
Applause
-
OK, now that we have dealt with the
signatures, let's talk about another
-
cryptographic aspect in PDFs. And that is
encryption. And some of you might remember
-
our PDFex vulnerability from earlier this
year. It's, of course, an attack with a
-
logo and it presents two novel
techniques targeting PDF encryption that
-
have never been applied to PDF encryption
before. So one of them is the so-called
-
direct exfiltration where we break the
cryptography without even touching the
-
cryptography. So no ciphertext
manipulation here. The second one is
-
so-called malleability gadgets. And those are
actually targeted modifications of the
-
ciphertext of the document. But first,
let's take a step back and again take
-
in some keywords. So PDF uses AES. OK.
Well, AES is good. Nothing can go wrong,
-
right? So let's go home. Encryption is
fine. Well, of course, we didn't stop
-
here, but took a closer look. So they use
CBC mode of operation, so cipher block
-
chaining. And, what's more important is
that they don't use any integrity
-
protection. So it's AES-CBC without
integrity protection. And you might remember the
-
scenario from the attacks against
encrypted e-mail, so against OpenPGP and
-
S/MIME; it's basically the same problem.
But first, who actually uses PDF
-
encryption? You might ask. For one, we
found some local banks in Germany use
-
encrypted PDFs as a drop-in replacement
for S/MIME or OpenPGP because their
-
customers might not want to deal with
the setup of encrypted e-mail.
-
Second, there were some drop-in plugins for
encrypted e-mail as well. So there are some
-
companies out there that produce products
that you can put into your Outlook and you
-
can use encrypted PDF files instead of
encrypted email. We also found that some
-
scanners and medical devices were able to
send encrypted PDF files via e-mail. So
-
you can set a password on that machine and
they will send the encrypted PDF via
-
e-mail and you have to put in the
password some other way. And lastly, we
-
found that some governmental organizations
use encrypted PDF documents, for example,
-
the US Department of Justice allows
sending in some claims via
-
encrypted PDFs. And I have exactly no idea
how they get the password, but at
-
least they allow it. So as we are from
academia, let's take a step back and look
-
at our attacker model. So we've got Alice
and Bob. Alice wants to send a document to
-
Bob. And she wants to send it over an
unencrypted channel or a channel she
-
doesn't trust. So of course, she decides
to encrypt it. Second scenario is, they
-
want to upload it to a shared storage. For
example, Dropbox or any other shared
-
storage. And of course, they don't trust
the storage. So, again, they use end-to-
-
end encryption. So let's assume that this
shared storage is indeed dangerous or
-
malicious. So, Alice will, of course,
again upload the encrypted document. The
-
attacker in this case will perform some
targeted modification of that and will
-
send the modified document back to Bob,
who will happily put in the password
-
because from his point of view, it's
indistinguishable from the original
-
document and the original plain text will
be leaked back to the attacker, breaking
-
the confidentiality. So let's take a look
at the first attack on how we did that.
-
That's the direct exfiltration, so
breaking the cryptography without touching
-
any cryptography, as I like to say. But
first, PDF encryption in a nutshell.
-
So you have seen the structure
of the PDF document. There is a header
-
with a version number. There's a body
where all the interesting objects live. So
-
there is our confidential content that we
want to actually, well, to actually
-
exfiltrate as an attacker. And finally,
there is Xref table and the trailer. So
-
what changes if we decide to encrypt this
document? Well, actually, not a whole lot.
-
So instead of confidential data, of
course, there's now some encrypted
-
ciphertext. Okay. And the rest pretty much
remains the same. The only thing that is
-
added is a new value in the trailer that
tells us how to decrypt this data again.
-
So pretty much all of the structure is
left unencrypted. And we thought about:
-
Why is this? And we took a look at the
standard. So, this is an excerpt from the
-
PDF specification and I've highlighted the
interesting parts for you. Encryption is
-
only applied to strings and streams. Well,
those are the values that actually can
-
contain any text in the document and all
other objects are not encrypted. And that
-
is because, well, they want to allow
random access to the whole document. So no
-
parsing the whole document before actually
showing page 16 of the encrypted document.
-
Well, that seems kind of reasonable. So,
but that also means that the whole
-
document structure is unencrypted and
only the streams and strings are
-
encrypted. This reveals a lot of
information to an attacker that he or she
-
probably shouldn't have. That's for one
the number and size of pages, that's the
-
number and size of objects in the document
and that's also including any links, so
-
any hyperlinks in the document that are
actually there. So, that's a lot of
-
information an attacker probably shouldn't
have. So, next we thought maybe we can do
-
some more stuff. Can we add our own
unencrypted content? And we took a look at
-
the standard again and found so-called
crypt filters, which provide
-
finer-grained control of the encryption.
This basically means as an attacker, I can
-
change a document to say, hey, only
strings in this document are encrypted and
-
streams are unencrypted. That's what the
identity filter is for. I have no idea why
-
they decided to add that to a document
format, but it's there. So that means
-
there is support for partial encryption and
that means attacker content can be mixed
-
with actual encrypted content. And we
found 18 different techniques to do that
-
in different readers. So there are a lot of
ways to do that in the different readers.
-
So let's have a look at a demo. So we have
this document, this encrypted document, we
-
put in our password and get our secret
message. We now open it again in a text
-
editor. We see, in object 4 0 down here,
there's the actual ciphertext of the
-
object, so of the message, and we see it's
AES encrypted, with a 32 byte key, so it's
-
AES-256. OK. Now we decide to add a new
object that contains, well, plaintext.
-
And, well, we simply add that to the
contents array of this document. So, we
-
say "Display this on the first page", save
the document. We open it, and we'll put in
-
our password and, oh well, this is indeed
awkward. OK. So, now, we have broken the
-
integrity of an encrypted document. Well,
you might think maybe they didn't want any
-
integrity in the encrypted files. Maybe
that's the use case people have, I don't
-
know. But we thought, maybe we can somehow
exfiltrate the plaintext this way. So
-
again, we took a step back, and looked at
the PDF specification. And the first thing
-
we found were so-called submit-form
actions. And that's basically the same as
-
a form on a website. You can put in data.
You might have seen this in a contract, in
-
a PDF contract, where you can put in your
name, and your address, and so on, and so
-
on, and the data that is saved inside of
that is saved in strings and streams. And
-
now remember that is everything that is
encrypted in a document. And, of course,
-
you can also send that back to an
attacker, or well, to a legitimate use
-
case, of course, via clicking a button,
but clicking buttons is pretty lame. So we
-
again looked at the standard and found the
so-called open action. And that is an
-
action, for example, submitting a form
that can be performed upon opening a
-
document. So how might this look? This is
how a PDF form looks, already with the
-
attack applied. So, we've got a URL here
that is unencrypted, because all strings
-
in this document are unencrypted, and
we've got the value, object 2 0, where the
-
actual encrypted data lives. So, that is
the value of the form fields. And what
-
will happen on the attacker side as soon
as this document is opened? Well, we'll
-
get a POST request with the confidential
content. Let's have a demo. Again, we have
-
this document. We put in our password.
It's the original document you have
-
already seen. We reopen it in a text
viewer, or a text editor, again see it's
-
encrypted, and we decide to change all
strings to the identity filter. So, no
-
encryption is applied to strings from now
on. And then we add a whole blob of
-
information for the open action, and for
the form. So this will be
-
performed as soon as the document is
opened. There is a URL, p.df, and the
-
value is the encrypted object 4 0. We
start an HTTP server on the domain we
-
specified, we open the document, put in
the password again, and as soon as we open
-
the document Adobe will helpfully show us
a warning, but they will already click the
-
button for remembering that for the
future. And if you accept that, you will
-
see your secret message on the attacker
server. And that is pretty bad already.
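
From the defender's side, the ingredients of this demo can be spotted with a crude byte search. The following is only a heuristic sketch with an invented function name: it assumes the relevant names appear uncompressed in the file, so it can miss documents that hide them in compressed object streams, and a hit does not prove an attack because forms, links and open actions are legitimate features.

```python
# PDF names that, together with encryption, match the ingredients described
# in the talk: automatic actions, form submission, links, JavaScript and the
# Identity crypt filter that leaves strings or streams unencrypted.
SUSPICIOUS_NAMES = (
    b"/OpenAction",
    b"/SubmitForm",
    b"/URI",
    b"/JavaScript",
    b"/Identity",
)


def exfiltration_indicators(pdf: bytes) -> list:
    """List suspicious names found in an encrypted PDF (empty if unencrypted)."""
    if b"/Encrypt" not in pdf:
        return []
    return [name.decode() for name in SUSPICIOUS_NAMES if name in pdf]
```

A result that contains `/OpenAction`, `/SubmitForm` and `/Identity` at the same time is a reason to open the file in a text editor before entering the password.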
-
OK. The same works for hyperlinks, so, of
course, there are links in PDF documents,
-
and as on the Web, we can define a base
URL for hyperlinks. So we can say all URLs
-
from this document start with http://p.df.
And of course we can define any object as
-
a URL. So any object we prepared this way
can be sent as a URL, and that will, of
-
course, trigger a GET request upon opening
the document again, if you defined an open
-
action for the same object. So again,
pretty bad and breaks confidentiality. And
-
of course, everybody loves JavaScript in
PDF files, and that works as well. Okay.
-
Let's talk about ciphertext attacks, so
actual cryptographic attacks, no more not
-
touching the crypto. So you might remember
the efail attacks on OpenPGP and S/MIME,
-
and those had basically three
prerequisites. 1: Well, ciphertext
-
malleability, so-called malleability
gadgets. That's why we need ciphertext
-
malleability, and we've got no integrity
protection, that's a plus. Then we need
-
some known plaintext for actual targeted
modifications. And we need an exfiltration
-
channel to send the data back to an
attacker. Well, exfiltration channels are
-
already dealt with as we have hyperlinks
and forms. So we can already check that.
-
Nice. Let's talk about ciphertext
malleability, or what we call gadgets. So,
-
some of you might remember this from
crypto 101, or whatever lecture you ever
-
had on cryptography. This is the
decryption function of CBC, so cipher
-
block chaining. And it's basically, you've
got your ciphertext up here, and your
-
plaintext down here. And it works by
simply decrypting a block of ciphertext,
-
XORing the previous block of ciphertext
onto that, and you'll get the plaintext.
-
So what happens, if you decide to change a
single bit in the ciphertext, for example,
-
the first bit of the initialization
vector? Well, that same bit will flip in
-
the actual plaintext. Wait a second. What
happens, if you happen to know a whole
-
plaintext block? Well, we can XOR that
onto the first block, and basically get
-
all zeros, or what we call a gadget, or a
blank sheet of paper, because we can write
-
on that by taking a chosen plaintext and
XORing that onto this result. And this
-
way we can, for example, construct URLs in
the actual ciphertext, or in the actual
-
resulting plaintext. What we can also do
with these gadgets is moving
-
them somewhere else in the document,
cloning them, so we can have multiple
-
gadgets, at multiple places in the
ciphertext. But remember, if you do that,
-
there's always the avalanche effect of
CBC, so you will have some random bytes in
-
here, but the URL still remains in place.
Okay. That's ciphertext malleability done.
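
The gadget construction can be sketched in a few lines. The example below assumes a toy 16-byte block cipher (a four-round Feistel network built from SHA-256) standing in for AES, because the Python standard library has no AES; the CBC logic around it is the standard one. All function names are hypothetical.

```python
import hashlib
import os

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT AES, illustration only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, _round(key, i, r))
    return l + r


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _round(key, i, l)), l
    return l + r


def cbc_encrypt(key: bytes, iv: bytes, pt: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(pt), BLOCK):
        prev = toy_encrypt_block(key, xor(pt[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ct: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ct), BLOCK):
        blk = ct[i:i + BLOCK]
        # Decrypt the block, then XOR the previous ciphertext block onto it.
        out += xor(toy_decrypt_block(key, blk), prev)
        prev = blk
    return out


# The victim encrypts; the attacker never learns `key`.
key, iv = os.urandom(16), os.urandom(16)
known = b"known plaintext!"    # one fully known plaintext block
secret = b"secret message!!"
ct = cbc_encrypt(key, iv, known + secret)

# Attacker: IV xor known gives the all-zero "blank sheet"; XOR the chosen text on.
chosen = b"http://p.df/?x=1"
forged_iv = xor(xor(iv, known), chosen)
recovered = cbc_decrypt(key, forged_iv, ct[:BLOCK])
print(recovered)
```

Because no integrity check covers the ciphertext, the victim's own decryption turns the forged IV and the untouched first block into the attacker's chosen 16 bytes.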
-
As I've said we need some plaintext. We
need to have some known plaintext. And as
-
the PDF standard has been pretty helpful
up until now, in breaking PDF encryption,
-
let's take a look again. And what we found
here: Permissions. So PDF documents can
-
have different permissions for the author,
and the user of the document. This
-
basically means the author can edit the
document and the users might not be able
-
to do that. And of course, people started
to change with that- started to tamper
-
with that value, if it was left
unencrypted, so in the newest version, it
-
was decided this should be encrypted as a
16 byte value. So we've got 16 bytes. How
-
do they look? Well, at first, we need room
for extension. We need lots of
-
permissions. Then we put 4 bytes of the
actual permission value - That is also in
-
unencrypted form in the document. Then we need
one byte for encrypted metadata, and for
-
some reason we need some acronym, "adb",
I'll leave it to you to figure out what
-
that stands for. And finally, we've got
four random bytes, because we have to fill
-
up 16 bytes, and we have run out of ideas.
Okay. We take all of that, encrypt it, and
-
oh well, we know a lot of that, and that
is basically known plaintext by design.
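
A small sketch of how the predictable part of that 16-byte block could be rebuilt from the unencrypted permission value. The byte order used here (permission value first, little-endian, then the extension bytes) follows a reading of the PDF 2.0 specification and should be treated as an assumption; the talk lists the fields in slide order. The function name is hypothetical.

```python
import struct


def perms_known_prefix(p_value: int, encrypt_metadata: bool = True) -> bytes:
    """Return the 12 predictable bytes of the 16-byte Perms plaintext."""
    low = struct.pack("<i", p_value)           # 4 bytes: the /P value, little-endian
    ext = b"\xff" * 4                           # 4 bytes: room for extension, all ones
    meta = b"T" if encrypt_metadata else b"F"   # 1 byte: encrypted-metadata flag
    return low + ext + meta + b"adb"            # 3 bytes: the literal "adb"


known = perms_known_prefix(-4)
print(known.hex(), "followed by 4 random bytes")
```

With the permission value of minus 4 from the example document, 12 of the 16 plaintext bytes are fixed before the attacker has seen any key material; only the trailing 4 random bytes are unknown.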
-
Which is bad. Let's look at how this looks
in a document. So, you see the perms
-
value, I've marked it down here. That is
the actual extended value I've shown you
-
on the last slide. And above that you'll
see the unencrypted value that's inside
-
this perms value, so the minus 4 in this
case, it's basically a bit field. On the
-
right side you see the actual encrypted
contents, and helpfully, all of this is
-
encrypted under the same document-wide key
in the newest version of the
-
specification. And that means we can
reuse this plaintext anywhere in the
-
document we want, and we can reuse this
to build gadgets. To sum that last point
-
up for you: Adobe decided to add
permissions to the PDF format, and people
-
thought of tampering with them. So they
decided to encrypt these permissions to
-
prevent tampering, and now known plaintext
is available to attackers. All right. So
-
that's basically all of the prerequisites
done, and let's again have a demo. So, we
-
again open this document, put in our
password, well, as soon as Chrome decides
-
to open this document, we put in our
password. It's the same as before. Now,
-
I've prepared a script for you, because I
really can't do this live, and it
-
basically does what I've told you. It's
getting a blank gadget from the perms
-
value. It's generating a URL from that.
It's generating a field name, so that it
-
will look nice on the server side, we
regenerate this document and put a form in
-
there. We start a web server, open this
modified document, put in the password
-
again and oh well, Chrome doesn't even
ask. So as soon as this document is opened
-
in Chrome and the password is put in,
we'll get our secret message delivered to
-
the attacker.
Applause
-
So we took a look at 27 viewers and found
all of them vulnerable to at least one of
-
our attacks. So some of them work with no
user interaction as we have seen in
-
Chrome. Some work with user interaction in
specific cases, as you've seen with Adobe
-
with a warning, but generally all of these
were attackable in one way or the other.
-
So what can be done about all of this?
Well, you might think signatures might
-
help. That's usually the first point
people bring up: "A signature on the
-
encrypted file will help." Well, no, not
really. Why is that? Well, for one, a
-
broken signature does not prevent opening
the document. So we'll still be able to
-
exfiltrate as soon as a password is put
in. Signatures can be stripped because
-
they're not encrypted. And as you have
seen before, they can also be forged in
-
most viewers. Signatures are not the
answer. Closing exfiltration channels is
-
also not the answer because for one, it's
hard to do. And how would you even find
-
all exfiltration channels in an 800-page
standard? And I mean, we have barely
-
scratched the surface of exfiltration
channels. And should we really remove
-
forms and hyperlinks from documents? And
should we remove JavaScript? OK, maybe we
-
should. And finally, if you have to do
that, please ask the user before
-
connecting to a web server. So let's look
at some vendor reactions. Apple decided to
-
do exactly what I've told you: to add a
dialog to warn the user and even show the
-
whole URL with the encrypted plaintext.
And Google decided to stop trying to fix
-
the unfixable in Chrome. They fixed the
automatic exfiltration, but there's really
-
nothing they can do about the standard. So
this is a problem that has to be done in
-
the standard. And that is basically that.
For mitigating wrapping attacks, we have
-
to deprecate partial encryption and
disallow access from unencrypted to
-
encrypted objects. And against the gadget
attacks, we have to use authenticated
-
encryption like AES-GCM. OK. And Adobe has
told us that they were escalating this to
-
the ISO working group that's now
responsible for the PDF standard and this
-
will be taken up in the next revision. So
that's a win in my book.
-
Applause
-
Herald: Thank you so much, guys. That was
really awesome. Please queue up by the
-
microphones if you have any questions, we
still have some time left for Q and A. But
-
I think your research is really, really
interesting because it opens my mind to
-
like how would this actually be able to be
misused in practice? Like, and I don't
-
know, like, what's your take? I guess
since you've been working so much with
-
this, you must have some kind of idea as
to what devious things you could come up
-
with.
Fabian: I mean, it's still an attacker
-
scenario that requires a lot of resources
and a very motivated attacker. So this
-
might not be very important to the normal
user. Let's be real here. So most of us
-
are not targeted by the NSA, I guess. So
you need an active attacker, an active man
-
in the middle to actually perform these
attacks.
-
Herald: Great. Thank you. And then I think
we have a question from microphone number
-
four, please.
Microphone 4: Yes. You said that the
-
next standard might have a fix.
Do you know a time frame on how long it
-
takes to build such a standard?
Fabian: Well, no, we don't really know. We
-
have talked with Adobe and they told us
they will show the next version of the
-
standard to us before actually releasing
that, but we have no time frame at all
-
from them.
Microphone 4: OK. Thank you.
-
Herald: Thank you.
Microphone number five, please.
-
Microphone 5: Thank you for a very
interesting talk. You showed in the first
-
part that the signature has like these
four numbers with the byte range. And why
-
is this, like four numbers, not part of a
signature? Is there a technical reason for
-
that? Because the byte offset is
predictable.
-
Vladi: It is! The byte range is protected
by the signature. But we just defined the
-
second one and just moved the signed one
to be validated later. So there are two
-
byte ranges. But only the first one, the
manipulated one, will be processed.
-
Microphone 5: Thank you.
Herald: Thank you so much. Microphone
-
number four, please.
Microphone 4: Oh, this is way too high for
-
me. OK. I have an answer and a question
for you. You mentioned during the talk
-
that you weren't sure how the Department
of Justice distributes the passwords
-
for encrypting PDFs. The answer is: in
plain text, in a separate email or as the
-
password of the week, which is distributed
through various means. That is also what
-
the Department of Homeland Security does,
and the military is somewhat less stupid.
-
As a question: I have roughly a half
terabyte of sensitive PDFs that I would
-
like to scan for your attack and also for
redaction failures. Do you know of any
-
fast, feasible ways to scan documents for
the presence of this kind of attack?
-
Fabian: I don't know of any tools, but I
mean, scanning for the gadget attacks is
-
actually possible if you tried to do some
entropy detection. So, because you reuse
-
ciphertext, you will have less entropy in
your ciphertext, but that's pretty hard to
-
do. Direct exfiltration should probably be
detectable by scanning simply for words
-
like "identity". Well, beyond that, 18
different techniques that we provided in
-
the paper. But I don't know of any tools
to do that automatically.
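
Both ideas from this answer can be sketched as rough heuristics. The example below assumes the whole file fits in memory, looks for the Identity crypt filter only next to the keys where an encryption dictionary or crypt filter would name it, and counts repeated aligned 16-byte blocks as a crude stand-in for entropy detection. The function names and the choice of keys are assumptions; this is not a tool from the paper and it will produce both false positives and false negatives.

```python
import collections
import re

# /Identity also appears in harmless places (e.g. /Identity-H font encodings),
# so only match it directly after crypt-filter related keys.
_IDENTITY = re.compile(rb"/(?:StmF|StrF|EFF|Name)\s*/Identity(?![-\w])")


def uses_identity_filter(pdf_bytes: bytes) -> bool:
    """Heuristic: encrypted file that leaves some strings or streams unencrypted."""
    return b"/Encrypt" in pdf_bytes and _IDENTITY.search(pdf_bytes) is not None


def repeated_blocks(data: bytes, block: int = 16) -> int:
    """Count aligned blocks that occur more than once (reused ciphertext)."""
    counts = collections.Counter(
        data[i:i + block] for i in range(0, len(data) - block + 1, block)
    )
    return sum(c - 1 for c in counts.values() if c > 1)


suspicious = b"<< /Filter /Standard /V 5 /StrF /Identity /StmF /StdCF >> /Encrypt 5 0 R"
harmless = b"<< /Encoding /Identity-H >> /Encrypt 5 0 R"
print(uses_identity_filter(suspicious), uses_identity_filter(harmless))
print(repeated_blocks(b"A" * 16 + b"B" * 16 + b"A" * 16))
```

For a large archive, the same functions could be applied to `pathlib.Path(name).read_bytes()` for each file, with `repeated_blocks` run on the raw bytes of individual encrypted streams rather than on the whole file.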
-
Microphone 4: Thank you.
Herald: Great. Thank you. And microphone
-
number two, please.
Microphone 2: Thank you for your very interesting
-
presentation. I have one suggestion and
one question for the mitigation scheme. If
-
you simply run your PDF reader in a
virtual machine, that is firewalled away,
-
so your firewall won't let anything
go out. But for the signature
-
forgeries, I had an idea. I'm not sure if
this is actually a stupid idea, but did
-
you consider faking the certificate?
Because presumably the signature is
-
protected by the seller's certificate. You
make up your own, signing with that. Does
-
it catch it and how?
Vladi: We considered it but not in this
-
paper. We assume that the certificate and
the entire chain of trust for this path is
-
totally secure. It was just an assumption
to just concentrate only on the attacks we
-
already found. So, perhaps there will be
further research provided by us in the
-
next months and years.
Herald: We might just hear more from you
-
in the future. Thank you so much. And now
questions from the Internet, please.
-
Signal Angel: I have two questions to the
first part of your talk from the Internet.
-
The first one is you mentioned a few
reactions, but can you give a bit more
-
detail about your experience with vendors
while reporting these issues?
-
Vladi: Yeah. We, ... for the first time we
started, we asked the CERT team from BSI,
-
CERT-Bund, to help us because there were a
lot of affected vendors and we were not
-
able to provide the support in a feasible
way. So they supported us the entire way.
-
We first created the report with,
containing the exact description of the
-
vulnerabilities and all exploits. Then, we
distributed it to the BSI and they
-
contacted the vendors and just proxied
the communication and there was a lot of
-
communication. So I'm not aware of the
entire communication, but only about the
-
technical stuff where we were asked to
just retest the fix and so on. So there
-
was some reaction from Adobe, Foxit and a
lot of viewers reacted on our attacks and
-
contacted us, but not everybody.
Herald: Thank you so much. Unfortunately,
-
that's the only time that we have
available for questions today. I think you
-
guys might stay around for a couple of
minutes, just if someone has any more
-
questions. Fabian and Vladislav, I can't
thank you enough. Thank you so much.
-
It was very interesting. Please give them
a great round of applause.
-
Vladi: Thank you.
Applause
-
36c3 postroll music
-
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!