1
00:00:00,000 --> 00:00:19,480
36c3 preroll music
2
00:00:19,480 --> 00:00:25,090
Herald: The next talk is on how to break
PDFs, breaking the encryption and the
3
00:00:25,090 --> 00:00:32,910
signatures, by Fabian Ising and Vladislav
Mladenov. Their talk was accepted at CCS
4
00:00:32,910 --> 00:00:37,750
this year in London and they had that in
November. It comes from research that
5
00:00:37,750 --> 00:00:43,660
basically produced two different kinds of
papers and it has been... people worldwide
6
00:00:43,660 --> 00:00:47,540
have been interested in what has been
going on. Please give them a great round
7
00:00:47,540 --> 00:00:51,758
of applause and welcome them to the stage.
8
00:00:51,758 --> 00:00:59,150
Applause
9
00:00:59,150 --> 00:01:11,590
Vladi: So can you hear me? Yeah. Perfect.
OK. Now you can see the slides. My name is
10
00:01:11,590 --> 00:01:15,220
Vladislav Mladenov, or just Vladi if you
have some questions for me, and this is
11
00:01:15,220 --> 00:01:20,670
Fabian. And we are allowed today to talk
about how to break PDF security or, more
12
00:01:20,670 --> 00:01:28,230
specifically, about how to break the
cryptographic operations in PDF files. We
13
00:01:28,230 --> 00:01:36,590
are a large team from the University of
Bochum, Münster and Hackmanit GmbH. So as
14
00:01:36,590 --> 00:01:46,159
I mentioned: We will talk about
cryptography and PDF files. Does it work?
15
00:01:46,159 --> 00:01:57,720
Fabian: All right. OK. Let's try that
again. Okay.
16
00:01:57,720 --> 00:02:02,070
Vladi: Perfect. This talk will consist of
two parts. The first part is about
17
00:02:02,070 --> 00:02:07,829
digitally signed PDF files and how can we
recognize such files? If we open them we
18
00:02:07,829 --> 00:02:16,230
see the information regarding that the
file was signed and all verification
19
00:02:16,230 --> 00:02:20,690
procedures were valid. And more
information regarding the signature
20
00:02:20,690 --> 00:02:27,220
validation panel and information about who
signed this file. This is the first part
21
00:02:27,220 --> 00:02:35,660
of the talk and I will present this topic.
And the second part is regarding PDF
22
00:02:35,660 --> 00:02:41,280
encrypted files and how can we recognize
such files? If you try to open such
23
00:02:41,280 --> 00:02:47,080
files, the first thing you see is the
password prompt. And after entering the
24
00:02:47,080 --> 00:02:51,800
correct password, the file is decrypted
and you can read the content within this
25
00:02:51,800 --> 00:02:57,720
file. If you open it with Adobe,
additional information regarding if this
26
00:02:57,720 --> 00:03:04,420
file is secured or not is displayed
further. And this is the second part of
27
00:03:04,420 --> 00:03:11,650
our talk, and Fabian will talk about how
we can break PDF encryption. So before we
28
00:03:11,650 --> 00:03:19,450
start with the attacks on signatures or
encryption, we first need some basics. And
29
00:03:19,450 --> 00:03:22,700
after six slides, you will be experts
regarding PDF files and you will
30
00:03:22,700 --> 00:03:28,820
understand everything about it. But maybe
it's a little bit boring, so be patient:
31
00:03:28,820 --> 00:03:34,830
there are only 6 slides. So the first is
quite easy. PDF files are... the first
32
00:03:34,830 --> 00:03:42,250
specification was in 1993 and almost at
the beginning PDF cryptographic operations
33
00:03:42,250 --> 00:03:48,920
like signatures and encryption were already
there. The latest version is PDF 2.0 and it
34
00:03:48,920 --> 00:03:57,610
was released in 2017. And according to
Adobe 1.6 billion files are on the web and
35
00:03:57,610 --> 00:04:06,140
perhaps more are exchanged beyond the web. So
basically PDF files are everywhere. And
36
00:04:06,140 --> 00:04:11,790
that's the reason why we consider this
topic and tried to find or to analyze the
37
00:04:11,790 --> 00:04:19,730
security of the features. If we have some
very simple file and we open it with Adobe
38
00:04:19,730 --> 00:04:25,390
Reader, the first thing we see is, of
course, the content. "Hello, world!" in
39
00:04:25,390 --> 00:04:32,060
this case, and additional information
regarding the focused page and how many
40
00:04:32,060 --> 00:04:39,630
pages this document has. But what would
happen if we don't use a PDF viewer and
41
00:04:39,630 --> 00:04:48,210
just use some text editor? We use
Notepad++ to open and later manipulate the
42
00:04:48,210 --> 00:04:56,400
files. So I will zoom this thing... this
file. And the first thing we see is that
43
00:04:56,400 --> 00:05:04,500
we can read it. Perhaps it's quite, quite
funny. But we can still extract some
44
00:05:04,500 --> 00:05:10,910
information from this file. For example,
some information regarding the pages. And
45
00:05:10,910 --> 00:05:19,740
here you can see the information that the
PDF file consists of one page. But more
46
00:05:19,740 --> 00:05:27,350
interesting is that we can see the
content of the file itself. So the lesson
47
00:05:27,350 --> 00:05:34,960
we learned is that we can use a simple
text editor to view and edit PDF files.
48
00:05:34,960 --> 00:05:43,900
And for our attacks, we used only this
text editor. So let's go to the details.
49
00:05:43,900 --> 00:05:51,560
How PDF files are structured and how they
are processed. PDF files consist of 4
50
00:05:51,560 --> 00:05:59,170
parts: the header; the body, which is the most
important part of a PDF file. The body
51
00:05:59,170 --> 00:06:03,820
contains the entire information presented
to the user. And 2 other sections: Xref
52
00:06:03,820 --> 00:06:11,490
section and trailer. A very important thing
about processing PDF files is that
53
00:06:11,490 --> 00:06:18,020
they're processed not from the top to the
bottom, but from the bottom to the top. So
54
00:06:18,020 --> 00:06:23,700
the first thing that the PDF viewer
analyses or processes is the trailer. So
55
00:06:23,700 --> 00:06:28,981
let's start doing that. What information
is stored in this trailer? Basically, there
56
00:06:28,981 --> 00:06:35,540
are two very important pieces of information.
The first is the information:
57
00:06:35,540 --> 00:06:41,410
what is the root element of this PDF? So
which is the first object which will be
58
00:06:41,410 --> 00:06:47,860
processed? And the second important
information is where the Xref section
59
00:06:47,860 --> 00:06:54,000
starts. It's just a byte offset pointing
to the position of the XRef section within
60
00:06:54,000 --> 00:07:00,201
the PDF file. So this pointer, as
mentioned before, points to the Xref
61
00:07:00,201 --> 00:07:05,710
section. But what is the Xref section
about? The Xref section is a catalog
62
00:07:05,710 --> 00:07:11,180
pointing or holding the information where
the objects defined in the body are
63
00:07:11,180 --> 00:07:18,741
contained or the byte positions of these
objects. So how can we read this weird Xref
64
00:07:18,741 --> 00:07:25,540
section? The first information we extract
is that the first object, which is defined
65
00:07:25,540 --> 00:07:34,610
here, is the object with ID 0 and we have
5 further elements or objects which are
66
00:07:34,610 --> 00:07:41,090
defined. So the first object is here. The
first entry is the byte position within
67
00:07:41,090 --> 00:07:46,610
the file. The second is its generation
number. And the last character indicates whether
68
00:07:46,610 --> 00:07:53,200
this object is used or not used. So
reading it, reading this Xref section, we
69
00:07:53,200 --> 00:08:00,590
extract the information that the object
with ID 0 is at byte position 0 and is not
70
00:08:00,590 --> 00:08:08,650
in use. So the object with ID 1 is at the
position 9 and so on and so forth. So for
71
00:08:08,650 --> 00:08:18,370
the object with ID 4 and the object number
comes from counting it: 0 1, 2, 3 and 4.
72
00:08:18,370 --> 00:08:29,430
So the object with ID 4 can be found at
the offset 184 and it's in use. In other
73
00:08:29,430 --> 00:08:35,449
words, the PDF viewer knows where each
object will be found and can properly
74
00:08:35,449 --> 00:08:42,329
display it and process it. Now we come to
the most important part: the body, and I
75
00:08:42,329 --> 00:08:48,810
mentioned it that in the body the entire
content which is presented to the user is
76
00:08:48,810 --> 00:08:58,220
contained. So let's see. Object 4 0 is
this one and as you can see, it contains
77
00:08:58,220 --> 00:09:04,870
the word "Hello World". The other objects
are referenced, too. So each pointer
78
00:09:04,870 --> 00:09:10,119
points exactly to the starting position of
each of the objects. And how can we read
79
00:09:10,119 --> 00:09:15,910
this object? You see, we have an object
starting with the ID number, then the
80
00:09:15,910 --> 00:09:24,999
generation number and the word "obj". So
you now know where the object starts
81
00:09:24,999 --> 00:09:32,259
and where it ends. Now how can we process
this body? As I mentioned before in the
82
00:09:32,259 --> 00:09:40,970
trailer, there was a reference regarding
the root element and this element was with
83
00:09:40,970 --> 00:09:48,769
ID 1 and generation number 0. So now
we start reading the document here and we
84
00:09:48,769 --> 00:09:55,910
have a catalog and a reference to some
pages. Pages is just a description of all
85
00:09:55,910 --> 00:10:02,889
the pages contained within the file. And
what we can see here is that we have this
86
00:10:02,889 --> 00:10:09,779
number, a count of one, so we have only one page,
and a reference to the page object which
87
00:10:09,779 --> 00:10:15,170
contains the entire
description of the page. If we have
88
00:10:15,170 --> 00:10:22,230
multiple pages, then we will have here
multiple elements. Then we have one page.
89
00:10:22,230 --> 00:10:29,850
And here we have the contents, which is a
reference to the string we already saw.
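As a rough illustration of the structure walked through above, the following Python sketch builds a tiny PDF-like byte string and reads it the way a viewer would: trailer first, then the Xref table, then the objects. The sample file and the helper names `build_sample` and `read_xref` are made up for this illustration; real viewers also handle Xref streams, multiple sections and broken offsets.

```python
import re

def build_sample():
    """Assemble a minimal PDF-like file with a matching classic xref table."""
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n",
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>\nendobj\n",
        b"4 0 obj\n<< /Length 22 >>\nstream\nBT (Hello World) Tj ET\nendstream\nendobj\n",
    ]
    data = b"%PDF-1.4\n"          # header: one line stating the version
    offsets = []
    for obj in objects:
        offsets.append(len(data))  # remember where each object starts
        data += obj
    xref_pos = len(data)
    data += b"xref\n0 %d\n" % (len(objects) + 1)
    data += b"0000000000 65535 f \n"           # object 0 is never in use
    for off in offsets:
        data += b"%010d 00000 n \n" % off      # offset, generation, in-use flag
    data += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    data += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return data

def read_xref(data):
    """Return (root_ref, {object id: (offset, generation, in_use)})."""
    # 1. Start at the end: 'startxref' gives the byte offset of the xref section.
    xref_pos = int(re.search(rb"startxref\s+(\d+)\s+%%EOF", data).group(1))
    # 2. The trailer names the root object (the catalog).
    root = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", data[xref_pos:])
    root_ref = (int(root.group(1)), int(root.group(2)))
    # 3. The subsection header "0 5" means: first id 0, five entries follow.
    lines = data[xref_pos:].split(b"\n")
    first_id, count = map(int, lines[1].split())
    table = {}
    for i, line in enumerate(lines[2:2 + count]):
        offset, gen, flag = line.split()
        table[first_id + i] = (int(offset), int(gen), flag == b"n")
    return root_ref, table

data = build_sample()
root_ref, table = read_xref(data)
offset_of_4 = table[4][0]
print(root_ref, table[0], data[offset_of_4:offset_of_4 + 7])
```

The lookup goes in the same order as in the talk: the trailer yields the root reference and the Xref position, and the Xref entry for object 4 yields the byte offset at which `4 0 obj` begins.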
90
00:10:29,850 --> 00:10:35,139
Perfect. If you understand this then you
know everything or almost everything about
91
00:10:35,139 --> 00:10:39,360
PDF files. Now you can just use your
editor and open such files and analyze
92
00:10:39,360 --> 00:10:50,310
them. Then we need one feature... I forgot
the last part. The most simple one. The
93
00:10:50,310 --> 00:10:56,129
header. It is just one line stating
which version is used. For example, in our
94
00:10:56,129 --> 00:11:04,779
case, 1.4. For the latest version,
2.0 will be stated here. Now, we need this
95
00:11:04,779 --> 00:11:13,699
one feature called "Incremental Update".
And I call this feature - do you know this
96
00:11:13,699 --> 00:11:19,629
feature highlighting something in the PDF
file or putting some sticky notes?
97
00:11:19,629 --> 00:11:24,119
Technically, it's called "incremental
update." I just call it reviewing master
98
00:11:24,119 --> 00:11:30,680
and bachelor theses of my students because
this is exactly the procedure I follow. I
99
00:11:30,680 --> 00:11:38,100
just read the text and highlight something
and store the information I added to it.
100
00:11:38,100 --> 00:11:46,970
Technically, when putting such a sticky note,
this additional information is appended
101
00:11:46,970 --> 00:11:53,160
after the end of the file. So we have a
body update which contains exactly the
102
00:11:53,160 --> 00:12:01,369
additional information for the new
objects and, of course, a new Xref section
103
00:12:01,369 --> 00:12:15,610
and a new trailer pointing to this new
object. Okay, we are done. Considering
104
00:12:15,610 --> 00:12:23,860
incremental update, we saw that it is used
mainly for sticky notes or highlighting.
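A minimal sketch of the append-only idea behind incremental updates, in Python. It is a toy under stated assumptions: the Xref section and trailer that a real update carries are left out for brevity, and the helper `latest_objects` (a made-up name) simply scans for object definitions instead of following the Xref chain as a real viewer would.

```python
import re

def latest_objects(data):
    """Map object id -> body of the last definition found in the file."""
    found = {}
    for m in re.finditer(rb"(\d+) (\d+) obj\n(.*?)\nendobj", data, re.S):
        found[int(m.group(1))] = m.group(3)  # later definitions overwrite earlier ones
    return found

# Toy "file": header, one object, end-of-file marker.
original = b"%PDF-1.4\n4 0 obj\n(Hello World)\nendobj\n%%EOF\n"

# The update is appended after the end of the file; the original bytes stay untouched.
update = b"4 0 obj\n(Something else)\nendobj\n%%EOF\n"
updated = original + update

print(latest_objects(original)[4], latest_objects(updated)[4])
```

The point of the sketch is only that nothing in the original bytes changes, while the object a reader ends up with can still be a different one.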
105
00:12:23,860 --> 00:12:29,679
But we observed something which is very
important: with an incremental update we
106
00:12:29,679 --> 00:12:36,930
can redefine existing objects, for
example, we can redefine the object with
107
00:12:36,930 --> 00:12:45,730
ID 4 and put new content. So we replace in
this manner the word "Hello World" with
108
00:12:45,730 --> 00:12:51,699
another sentence and of course the Xref
section and the trailer point to this new
109
00:12:51,699 --> 00:13:00,100
object. So this is very important. With
incremental update we are not limited to
110
00:13:00,100 --> 00:13:06,220
only adding some highlighting or notes. We
can redefine already existing content and
111
00:13:06,220 --> 00:13:14,399
perhaps we need this for the attacks we
will present. So let's talk about PDF
112
00:13:14,399 --> 00:13:23,339
signatures. First, we need to differentiate
between an electronic signature and a digital
113
00:13:23,339 --> 00:13:28,699
signature. Electronic signature. From a
technical point of view, it's just an
114
00:13:28,699 --> 00:13:36,369
image. I just wrote it on my PC and put it
into the file. There is no cryptographic
115
00:13:36,369 --> 00:13:40,890
protection. It could be me lying on the
beach doing something. From a cryptographic
116
00:13:40,890 --> 00:13:45,509
point of view, it is the same. It does not
provide any security, any cryptographic
117
00:13:45,509 --> 00:13:52,739
security. What we will talk about here is
about digitally signed files, so if you
118
00:13:52,739 --> 00:14:00,290
open such files, you have the additional
information regarding the validation about
119
00:14:00,290 --> 00:14:08,309
the signatures and who signed this PDF
file. So as I mentioned before, this talk
120
00:14:08,309 --> 00:14:16,689
will concentrate only on these digitally
signed PDF files. How? What kind of
121
00:14:16,689 --> 00:14:22,879
process is behind digitally signing PDF
files? Imagine we have this abstract
122
00:14:22,879 --> 00:14:28,639
overview of a PDF document. We have the
header, body, Xref section and trailer. We
123
00:14:28,639 --> 00:14:35,480
want to sign it. What happens is that we
take this PDF file and via incremental
124
00:14:35,480 --> 00:14:41,899
update we put additional information
regarding that. There is a new catalog and
125
00:14:41,899 --> 00:14:46,379
more important, a new signature object
containing the signature value and
126
00:14:46,379 --> 00:14:52,100
information about who signed this PDF
file. And of course, there is an Xref
127
00:14:52,100 --> 00:14:58,970
section and trailer. And relevant for you:
The entire file is now protected by the
128
00:14:58,970 --> 00:15:06,860
PDF signature. So manipulations within
this area should not be possible, right?
129
00:15:06,860 --> 00:15:15,879
Yeah, let's talk about this: why it's not
possible and how can we break it? First,
130
00:15:15,879 --> 00:15:21,370
we need an attack scenario. What do we want
to achieve as an attacker? We assumed in
131
00:15:21,370 --> 00:15:27,839
our research that the attacker possesses
this signed PDF file. This could be an old
132
00:15:27,839 --> 00:15:35,989
contract, receipt or, in our case, a bill
from Amazon. And if we open this file, the
133
00:15:35,989 --> 00:15:41,440
signature is valid. So everything is
green. No warnings are thrown and
134
00:15:41,440 --> 00:15:48,329
everything is fine. What we tried to do is
to take this file, manipulate it somehow
135
00:15:48,329 --> 00:15:56,319
and then send it to the victim. And now
the victim expects to receive a digitally
136
00:15:56,319 --> 00:16:01,779
signed PDF file, so just stripping the
digital signature is a very trivial
137
00:16:01,779 --> 00:16:07,600
scenario and we did not consider it
because it's trivial. We considered that
138
00:16:07,600 --> 00:16:13,240
the victim expects to see that there is a
signature and it is valid. So no warnings
139
00:16:13,240 --> 00:16:20,420
are thrown and the entire left side
is exactly the same as in the normal
140
00:16:20,420 --> 00:16:28,109
behavior. But on the other side, the
content was exchanged so we manipulated
141
00:16:28,109 --> 00:16:33,790
the receipt and exchanged it with another
content. The question is now: how can we
142
00:16:33,790 --> 00:16:41,079
do it on a technical level? And we came up
with three attacks: incremental saving
143
00:16:41,079 --> 00:16:45,929
attacks, signature wrapping and universal
signature forgery. And I will now
144
00:16:45,929 --> 00:16:51,209
introduce the techniques and how these
attacks are working. The first attack is
145
00:16:51,209 --> 00:16:56,839
the incremental saving attack. So I
mentioned before that via incremental
146
00:16:56,839 --> 00:17:06,439
saving or via incremental updates, we can
add and remove and even redefine already
147
00:17:06,439 --> 00:17:14,650
existing objects and the signature still
stays valid. Why is this happening?
148
00:17:14,650 --> 00:17:21,110
Consider now again our case. We have some
header, body, Xref table and trailer and
149
00:17:21,110 --> 00:17:27,559
the file is now signed and the signature
protects only the signed area. So what
150
00:17:27,559 --> 00:17:32,600
would happen if I put a sticky note or
some highlighting? An incremental update
151
00:17:32,600 --> 00:17:39,169
happens. If I open this file, usually this
happens: We have the information that this
152
00:17:39,169 --> 00:17:45,799
signature is valid, when it was signed and
so on and so forth. So our first idea was
153
00:17:45,799 --> 00:17:53,250
to just put new body updates, redefine
already existing content and with an Xref
154
00:17:53,250 --> 00:17:59,419
table and trailer we point to the new
content. This is quite trivial because
155
00:17:59,419 --> 00:18:04,820
it's a legitimate feature in PDF files, so
we didn't expect to be quite successful
156
00:18:04,820 --> 00:18:11,760
and we were not so successful. But the
first idea: we applied this attack, we
157
00:18:11,760 --> 00:18:22,080
opened it and we got this message. So it's
kind of a weird message because an
158
00:18:22,080 --> 00:18:27,970
experienced user sees valid, but the
document has been updated and you should
159
00:18:27,970 --> 00:18:33,580
know what does this exactly mean. But we
did not consider this attack as successful
160
00:18:33,580 --> 00:18:41,110
because the warning is not the same or the
status of the signature validation is not
161
00:18:41,110 --> 00:18:50,909
the same. So what we did is to evaluate
this first, trivial case
162
00:18:50,909 --> 00:18:56,860
against all the viewers we have, and
LibreOffice, for example, was vulnerable
163
00:18:56,860 --> 00:19:01,769
against this trivial attack. This was the
only viewer which was vulnerable against
164
00:19:01,769 --> 00:19:07,440
this trivial variation. But then we asked
ourselves: Okay, the other viewers are
165
00:19:07,440 --> 00:19:14,250
quite secure. But how do they detect these
incremental updates? And from a developer's
166
00:19:14,250 --> 00:19:22,410
point of view, the laziest thing we can do
is just to check if another Xref table and
167
00:19:22,410 --> 00:19:28,330
trailer were added after the signature was
applied. So we just put our body updates
168
00:19:28,330 --> 00:19:37,450
but just deleted the other two parts. This
is not a standard compliant PDF file. It's
169
00:19:37,450 --> 00:19:44,789
broken. But our hope was that the PDF
viewer fixes this kind of stuff for us and
170
00:19:44,789 --> 00:19:51,210
that these viewers are error-tolerant. And
we were quite successful because the
171
00:19:51,210 --> 00:19:56,320
verification logic just checked: Is there
an Xref table and trailer after the
172
00:19:56,320 --> 00:20:01,580
signature was applied? No? Okay.
Everything's fine. The signature is valid.
173
00:20:01,580 --> 00:20:05,450
No warning was thrown. But then the
application logic saw that incremental
174
00:20:05,450 --> 00:20:13,580
updates were applied and fixed this for us
and processed these body updates and no
175
00:20:13,580 --> 00:20:21,159
warning was thrown. Some of the viewers
required a trailer. I don't know
176
00:20:21,159 --> 00:20:25,350
why; it was black-box testing. So we
just removed the Xref table, but the
177
00:20:25,350 --> 00:20:32,030
trailer was there and we were able to
break further PDF viewers. The most
178
00:20:32,030 --> 00:20:38,490
complex variation of the attack was the
following: the PDF viewers checked
179
00:20:38,490 --> 00:20:47,330
if every incremental update contains a
signature object. But they did not check
180
00:20:47,330 --> 00:20:53,200
if this signature covers the
incremental update. So we just copy-pasted
181
00:20:53,200 --> 00:21:01,290
the signature which was provided here and
we just forced the PDF viewer to validate
182
00:21:01,290 --> 00:21:10,100
this signed content twice - and still our
body updates were processed and for
183
00:21:10,100 --> 00:21:18,669
example, Foxit or Master PDF were
vulnerable against this type of attack. So
184
00:21:18,669 --> 00:21:24,909
the evaluation of our attack: We
considered as part of our evaluation 22
185
00:21:24,909 --> 00:21:31,050
different viewers - among others, Adobe
with different versions, Foxit, and so on.
186
00:21:31,050 --> 00:21:41,140
And as you can see 11 of 22 were
vulnerable against incremental saving. So
187
00:21:41,140 --> 00:21:47,160
50 percent, and we were quite surprised
because we saw that the developers saw
188
00:21:47,160 --> 00:21:51,639
that incremental updates could be
dangerous regarding the signature
189
00:21:51,639 --> 00:22:01,070
validation. But we were still able to
bypass their considerations. We had - a
190
00:22:01,070 --> 00:22:07,769
full signature bypass means that there is
no possibility for the victim to detect
191
00:22:07,769 --> 00:22:14,269
the attack. A limited signature bypass
means that the victim, if the victim
192
00:22:14,269 --> 00:22:23,470
clicks on one - at least one - additional
window and explicitly wants to validate
193
00:22:23,470 --> 00:22:31,520
the signature, then the viewer was
vulnerable. But the most important thing
194
00:22:31,520 --> 00:22:38,080
is by opening the file, there was a status
message that the signature validation and
195
00:22:38,080 --> 00:22:44,289
all signatures are valid. So this was the
first layer and the viewers were
196
00:22:44,289 --> 00:22:51,390
vulnerable against this. So let's talk
about the second attack class. We called
197
00:22:51,390 --> 00:22:57,970
it "signature wrapping attack" and this is
the most complex attack of the 3 classes.
198
00:22:57,970 --> 00:23:04,580
And now we have to go a little bit into
the details of how PDF signatures are
199
00:23:04,580 --> 00:23:10,450
made. So imagine now we have a PDF file.
We have some header and the original
200
00:23:10,450 --> 00:23:15,549
document. The original document contains
the header, the body, the Xref section and
201
00:23:15,549 --> 00:23:21,919
so on and so forth. And we want to sign
this document. Technically, again, an
202
00:23:21,919 --> 00:23:28,700
incremental update is provided and we have
a new catalog here. We have some other
203
00:23:28,700 --> 00:23:35,159
objects, for example, certificates and so
on and the signature objects. And we will
204
00:23:35,159 --> 00:23:38,720
now concentrate on this signature object
because it's essential for the attack we
205
00:23:38,720 --> 00:23:45,399
want to carry out. And the signature
object contains a lot of information, but
206
00:23:45,399 --> 00:23:51,460
for this attack only two elements
are relevant: the contents and the byte
207
00:23:51,460 --> 00:23:57,940
range. The contents contains the signature
value. It's a PKCS#7 container containing
208
00:23:57,940 --> 00:24:05,710
the signature value and the certificates
used to validate the signature and the
209
00:24:05,710 --> 00:24:11,299
byte range. The byte range contains four
different values. How are these values
210
00:24:11,299 --> 00:24:23,090
used? The first two, A and B,
define the first signed area. And this is
211
00:24:23,090 --> 00:24:29,159
here from the beginning of the document
until the start of the signature value.
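A small Python sketch of how the byte range selects what gets hashed. It reads the four integers as two (offset, length) pairs, the first pair being the area described above and the second pair the rest of the file after the signature value. The toy file contents, the variable names and the choice of SHA-256 are assumptions made for illustration only.

```python
import hashlib

def signed_bytes(data, byte_range):
    """Concatenate the two regions named by a byte range [a b c d]."""
    a, b, c, d = byte_range          # (offset, length) of region 1, then of region 2
    return data[a:a + b] + data[c:c + d]

# Toy file: everything before the signature value, the value itself, the rest.
prefix = b"%PDF-1.4\n... /Contents <"
signature_hex = b"00" * 16           # placeholder for the real signature value
suffix = b"> ...\n%%EOF\n"
data = prefix + signature_hex + suffix

byte_range = (0, len(prefix), len(prefix) + len(signature_hex), len(suffix))
digest = hashlib.sha256(signed_bytes(data, byte_range)).hexdigest()

# Bytes appended after the second region are simply not part of the hashed input.
extended = data + b"5 0 obj\n(appended later)\nendobj\n"
digest_after = hashlib.sha256(signed_bytes(extended, byte_range)).hexdigest()
print(digest == digest_after)
```

The gap between the two regions is exactly the signature value, which is why it can be written into the file after the digest has been computed.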
212
00:24:29,159 --> 00:24:35,370
Why do we need this? Because the signature
value is part of the signed area. So we need
213
00:24:35,370 --> 00:24:42,780
to exclude the signature value from the
document computation. And this is how the
214
00:24:42,780 --> 00:24:49,179
byte range is used. The first part is
from the beginning of the document until
215
00:24:49,179 --> 00:24:54,629
the signature value starts, and
after the signature ends until the end of
216
00:24:54,629 --> 00:25:04,759
the file is the second area specified by
the two values C and D. So, now we have
217
00:25:04,759 --> 00:25:13,500
everything protected besides the signature
value itself. What we wanted to try is to
218
00:25:13,500 --> 00:25:21,889
create additional space for our attacks.
So our idea was to move the second signed
219
00:25:21,889 --> 00:25:30,350
area. And how can we do it? So basically
we can do it by just defining another byte
220
00:25:30,350 --> 00:25:40,240
range. And as you can see here, the byte
range points from area A to B. So this
221
00:25:40,240 --> 00:25:46,889
area we didn't make any manipulation in
this part, right? It was not modified at
222
00:25:46,889 --> 00:25:53,309
all. So it's still valid. And the second
part, the new C value and the next D
223
00:25:53,309 --> 00:26:00,169
bytes, we didn't change anything here,
right? So basically, we didn't change
224
00:26:00,169 --> 00:26:06,750
anything in the signed area. And the
signature is still valid. But what we
225
00:26:06,750 --> 00:26:13,980
created was a space for some malicious
objects; sometimes we needed some padding
226
00:26:13,980 --> 00:26:20,960
and a new Xref section pointing to these
malicious objects. The important thing was
227
00:26:20,960 --> 00:26:27,559
that the position of this malicious Xref
section is defined by the trailer. And
228
00:26:27,559 --> 00:26:32,799
since we can not modify this trailer, this
position is fixed. So this is the only
229
00:26:32,799 --> 00:26:42,880
limitation of the attack, but it works
like a charm. And the question is now: How
230
00:26:42,880 --> 00:26:49,730
many PDF viewers were vulnerable against
this attack? And as you can see, this is
231
00:26:49,730 --> 00:26:58,169
the signature wrapping column. 17 out of
22 applications were vulnerable against
232
00:26:58,169 --> 00:27:06,000
this attack. This was quite an expected
result: because the attack is complex, we
233
00:27:06,000 --> 00:27:14,789
saw that many developers were not
aware of this threat and that's the reason
234
00:27:14,789 --> 00:27:22,600
why so many vulnerabilities were there.
Now to the last class of attacks,
235
00:27:22,600 --> 00:27:28,580
universal signature forgery. And we called
it universal signature forgery, but I
236
00:27:28,580 --> 00:27:33,879
prefer to use another name for
these attacks. I call them stupid
237
00:27:33,879 --> 00:27:40,909
implementation flaws. We are coming from
the PenTesting area and I know a lot of
238
00:27:40,909 --> 00:27:49,880
you are PenTesters, too. And, many of you
have experience, quite interesting
239
00:27:49,880 --> 00:27:58,460
experience with zero bytes, null values or
some kind of weird values. And this is
240
00:27:58,460 --> 00:28:06,309
what we tried in this kind of attack.
We just tried some stupid values or
241
00:28:06,309 --> 00:28:13,100
removed references to see what happens.
Considering the signature, there are two
242
00:28:13,100 --> 00:28:18,389
different important elements: The contents
containing the signature value and the
243
00:28:18,389 --> 00:28:25,220
byte range pointing to what is exactly
signed. So, what would happen if we remove
244
00:28:25,220 --> 00:28:30,679
the contents? Our hope was that the
information regarding the signature is
245
00:28:30,679 --> 00:28:37,779
still shown by the viewer as valid without
validating any signature because it was
246
00:28:37,779 --> 00:28:45,169
not possible. And just removing the
signature value is quite an obvious idea. And
247
00:28:45,169 --> 00:28:48,899
we were not successful with this kind of
attack. But let's proceed with other
248
00:28:48,899 --> 00:28:57,090
values like for example, contents without
any value or contents like equals NULL or
249
00:28:57,090 --> 00:29:04,710
zero bytes. And considering this last
version, we had two viewers which were
250
00:29:04,710 --> 00:29:15,049
vulnerable against this attack. And
another, another case is, for example, by
251
00:29:15,049 --> 00:29:19,929
removing the byte range. By removing this
byte range we have some signature value,
252
00:29:19,929 --> 00:29:29,590
but we don't know what is exactly signed.
So, we tried this attack and of course,
253
00:29:29,590 --> 00:29:38,390
byte range without any value or NULL bytes
or byte range with a minus or negative,
254
00:29:38,390 --> 00:29:46,169
negative numbers. And usually this last one
crashed a lot of viewers. But the
255
00:29:46,169 --> 00:29:51,800
most interesting is that Adobe made this
mistake: by just removing the byte range,
256
00:29:51,800 --> 00:29:56,990
we were able to bypass the entire
security. We didn't expect this behavior,
257
00:29:56,990 --> 00:30:00,950
but it was a stupid implementation flaw,
allowing us to do anything in this
258
00:30:00,950 --> 00:30:08,190
document and all the exploits we show in
our presentations were made on Adobe with
259
00:30:08,190 --> 00:30:14,909
this attack. So let's see what were the
results of this attack. As you can see,
260
00:30:14,909 --> 00:30:21,110
only 4 of 22 viewers were vulnerable
against this attack and only Adobe
261
00:30:21,110 --> 00:30:26,280
without limitation; for the others, there was
a limitation because if you click on the
262
00:30:26,280 --> 00:30:32,760
signature validation, then a warning was
thrown. It was very easy for Adobe to fix.
263
00:30:32,760 --> 00:30:37,540
And as you can see, Adobe didn't
make any mistake regarding incremental
264
00:30:37,540 --> 00:30:40,820
saving or signature wrapping, but
regarding universal signature forgery
265
00:30:40,820 --> 00:30:48,169
they were vulnerable against this attack.
And this was the hope of our approach. In
266
00:30:48,169 --> 00:30:56,029
summary, we were able to break 21 of 22
PDF viewers. The only
267
00:30:56,029 --> 00:31:00,850
Applause
Thanks.
268
00:31:00,850 --> 00:31:08,149
Applause
The only secure PDF viewer is Adobe 9,
269
00:31:08,149 --> 00:31:12,860
which is deprecated and has remote code
execution. The only
270
00:31:12,860 --> 00:31:18,039
Laugh
The only users who are
271
00:31:18,039 --> 00:31:25,450
using it are Linux users, because this is
the last version available for Linux and
272
00:31:25,450 --> 00:31:31,779
that's the reason why we considered it. So,
I'm done with the talk about PDF
273
00:31:31,779 --> 00:31:36,644
signatures and now Fabian can talk about
PDF encryption. Thank you.
274
00:31:36,644 --> 00:31:42,540
Fabian: Yes
Applause
275
00:31:42,540 --> 00:31:46,759
OK, now that we have dealt with the
signatures, let's talk about another
276
00:31:46,759 --> 00:31:52,759
cryptographic aspect in PDFs. And that is
encryption. And some of you might remember
277
00:31:52,759 --> 00:31:58,481
our PDFex vulnerability from earlier this
year. It's, of course, an attack with a
278
00:31:58,481 --> 00:32:03,720
logo, and it presents two novel
techniques targeting PDF encryption that
279
00:32:03,720 --> 00:32:08,029
have never been applied to PDF encryption
before. So one of them is the so-called
280
00:32:08,029 --> 00:32:12,549
direct exfiltration where we break the
cryptography without even touching the
281
00:32:12,549 --> 00:32:18,840
cryptography. So no ciphertext
manipulation here. The second one is the
282
00:32:18,840 --> 00:32:24,690
so-called malleability gadgets. And those are
actually targeted modifications of the
283
00:32:24,690 --> 00:32:31,240
ciphertext of the document. But first,
let's take a step back and again take
284
00:32:31,240 --> 00:32:39,519
in some keywords. So PDF uses AES. OK.
Well, AES is good. Nothing can go wrong,
285
00:32:39,519 --> 00:32:44,220
right? So let's go home. Encryption is
fine. Well, of course, we didn't stop
286
00:32:44,220 --> 00:32:52,160
here, but took a closer look. So they use
CBC mode of operation, so cipher block
287
00:32:52,160 --> 00:32:58,309
chaining. And, what's more important is
that they don't use any integrity
288
00:32:58,309 --> 00:33:04,120
protection. So it's non-integrity-protected
AES-CBC. And you might remember the
289
00:33:04,120 --> 00:33:08,909
scenario from the attacks against
encrypted e-mail, so against OpenPGP and
290
00:33:08,909 --> 00:33:15,999
S/MIME, it's basically the same problem.
But first, who actually uses PDF
291
00:33:15,999 --> 00:33:20,940
encryption? You might ask. For one, we
found some local banks in Germany use
292
00:33:20,940 --> 00:33:26,030
encrypted PDFs as a drop-in replacement
for S/MIME or OpenPGP because their
293
00:33:26,030 --> 00:33:34,899
customers might not want to deal with
the setup of encrypted e-mail.
294
00:33:34,899 --> 00:33:39,740
The second were some drop-in plugins for
encrypted e-mail as well. So there are some
295
00:33:39,740 --> 00:33:44,570
companies out there that produce products
that you can put into your Outlook and you
296
00:33:44,570 --> 00:33:51,330
can use encrypted PDF files instead of
encrypted e-mail. We also found that some
297
00:33:51,330 --> 00:33:57,919
scanners and medical devices were able to
send encrypted PDF files via e-mail. So
298
00:33:57,919 --> 00:34:02,990
you can set a password on that machine and
they will send the encrypted PDF via
299
00:34:02,990 --> 00:34:10,369
e-mail and you have to put in the
password some other way. And lastly, we
300
00:34:10,369 --> 00:34:14,639
found that some governmental organizations
use encrypted PDF documents, for example,
301
00:34:14,639 --> 00:34:20,409
the US Department of Justice allows
sending in some claims via
302
00:34:20,409 --> 00:34:25,280
encrypted PDFs. And I have exactly no idea
how they get the password, but at
303
00:34:25,280 --> 00:34:30,850
least they allow it. So as we are from
academia, let's take a step back and look
304
00:34:30,850 --> 00:34:36,860
at our attacker model. So we've got Alice
and Bob. Alice wants to send a document to
305
00:34:36,860 --> 00:34:42,120
Bob. And she wants to send it over an
unencrypted channel or a channel she
306
00:34:42,120 --> 00:34:48,610
doesn't trust. So of course, she decides
to encrypt it. Second scenario is, they
307
00:34:48,610 --> 00:34:53,020
want to upload it to a shared storage. For
example, Dropbox or any other shared
308
00:34:53,020 --> 00:34:57,190
storage. And of course, they don't trust
the storage. So, again, they use end-to-
309
00:34:57,190 --> 00:35:05,120
end encryption. So let's assume that this
shared storage is indeed dangerous or
310
00:35:05,120 --> 00:35:11,420
malicious. So, Alice will, of course,
again upload the encrypted document. The
311
00:35:11,420 --> 00:35:17,490
attacker, in this case, will perform some
targeted modification of that, and will
312
00:35:17,490 --> 00:35:22,290
send the modified document back to Bob,
who will happily put in the password
313
00:35:22,290 --> 00:35:26,800
because from his point of view, it's
indistinguishable from the original
314
00:35:26,800 --> 00:35:32,880
document and the original plaintext will
be leaked back to the attacker, breaking
315
00:35:32,880 --> 00:35:39,730
the confidentiality. So let's take a look
at the first attack on how we did that.
316
00:35:39,730 --> 00:35:43,410
That's the direct exfiltration, so
breaking the cryptography without touching
317
00:35:43,410 --> 00:35:51,360
any cryptography, as I like to say. But
first, encryption in a nutshell, PDF
318
00:35:51,360 --> 00:35:54,570
encryption. So you have seen the structure
of the PDF document. There is a header
319
00:35:54,570 --> 00:35:59,990
with a version number. There's a body
where all the interesting objects live. So
320
00:35:59,990 --> 00:36:06,820
there is our confidential content that we
want to actually, well, to actually
321
00:36:06,820 --> 00:36:14,740
exfiltrate as an attacker. And finally,
there is the Xref table and the trailer. So
322
00:36:14,740 --> 00:36:19,730
what changes if we decide to encrypt this
document? Well, actually, not a whole lot.
323
00:36:19,730 --> 00:36:24,080
So instead of confidential data, of
course, there's now some encrypted
324
00:36:24,080 --> 00:36:29,010
ciphertext. Okay. And the rest pretty much
remains the same. The only thing that is
325
00:36:29,010 --> 00:36:36,960
added is a new value in the trailer that
tells us how to decrypt this data again.
326
00:36:36,960 --> 00:36:43,560
So there's pretty much all of the structure
left unencrypted. And we thought about:
327
00:36:43,560 --> 00:36:50,120
Why is this? And we took a look at the
standard. So, this is an excerpt from the
328
00:36:50,120 --> 00:36:55,940
PDF specification and I've highlighted the
interesting parts for you. Encryption is
329
00:36:55,940 --> 00:37:00,690
only applied to strings and streams. Well,
those are the values that actually can
330
00:37:00,690 --> 00:37:07,640
contain any text in the document and all
other objects are not encrypted. And that
331
00:37:07,640 --> 00:37:12,270
is because, well, they want to allow
random access to the whole document. So no
332
00:37:12,270 --> 00:37:17,600
parsing the whole document before actually
showing page 16 of the encrypted document.
333
00:37:17,600 --> 00:37:24,560
Well, that seems kind of reasonable. So,
but that also means that the whole
334
00:37:24,560 --> 00:37:27,970
document structure is unencrypted and
only the streams and strings are
335
00:37:27,970 --> 00:37:31,380
encrypted. This reveals a lot of
information to an attacker that he or she
336
00:37:31,380 --> 00:37:36,420
probably shouldn't have. That's for one
the number and size of pages, that's the
337
00:37:36,420 --> 00:37:42,610
number and size of objects in the document
and that's also including any links, so
338
00:37:42,610 --> 00:37:48,120
any hyperlinks in the document that are
actually there. So, that's a lot of
339
00:37:48,120 --> 00:37:55,260
information an attacker probably shouldn't
have. So, next we thought maybe we can do
340
00:37:55,260 --> 00:38:01,270
some more stuff. Can we add our own
unencrypted content? And we took a look at
341
00:38:01,270 --> 00:38:05,910
the standard again and found that there are
so-called crypt filters, which provide finer
342
00:38:05,910 --> 00:38:10,750
granularity control of the encryption.
This basically means as an attacker, I can
343
00:38:10,750 --> 00:38:15,920
change a document to say, hey, only
strings in this document are encrypted and
344
00:38:15,920 --> 00:38:21,340
streams are unencrypted. That's what the
identity filter is for. I have no idea why
345
00:38:21,340 --> 00:38:27,190
they decided to add that to a document
format, but it's there. So that means
346
00:38:27,190 --> 00:38:31,570
there's support for partial encryption and
that means attacker's content can be mixed
347
00:38:31,570 --> 00:38:36,880
with actual encrypted content. And we
found 18 different techniques to do that
348
00:38:36,880 --> 00:38:42,290
in different readers. So there are a lot of
ways to do that in the different readers.
349
00:38:42,290 --> 00:38:48,150
So let's have a look at a demo. So we have
this document, this encrypted document, we
350
00:38:48,150 --> 00:38:54,170
put in our password and get our secret
message. We now open it again in a text
351
00:38:54,170 --> 00:39:00,140
editor. We see, in object 4 0 down here,
there's the actual ciphertext of the
352
00:39:00,140 --> 00:39:06,110
object, so of the message, and we see it's
AES encrypted, with a 32 byte key, so it's
353
00:39:06,110 --> 00:39:15,670
AES-256. OK. Now we decide to add a new
object that contains, well, plaintext.
354
00:39:15,670 --> 00:39:22,220
And, well, we simply add that to the
contents array of this document. So, we
355
00:39:22,220 --> 00:39:28,241
say "Display this on the first page", save
the document. We open it, and we'll put in
356
00:39:28,241 --> 00:39:38,300
our password and, oh well, this is indeed
awkward. OK. So, now, we have broken the
357
00:39:38,300 --> 00:39:44,160
integrity of an encrypted document. Well,
you might think maybe they didn't want any
358
00:39:44,160 --> 00:39:49,190
integrity in the encrypted files. Maybe
that's the use case people have, I don't
359
00:39:49,190 --> 00:39:55,060
know. But we thought, maybe we can somehow
exfiltrate the plaintext this way. So
360
00:39:55,060 --> 00:40:00,040
again, we took a step back, and looked at
the PDF specification. And the first thing
361
00:40:00,040 --> 00:40:06,080
we found were so-called submit-form
actions. And that's basically the same as
362
00:40:06,080 --> 00:40:10,550
a form on a website. You can put in data.
You might have seen this in a contract, in
363
00:40:10,550 --> 00:40:14,740
a PDF contract, where you can put in your
name, and your address, and so on, and so
364
00:40:14,740 --> 00:40:23,330
on, and the data that is saved inside of
that is saved in strings and streams. And
365
00:40:23,330 --> 00:40:27,760
now remember that is everything that is
encrypted in a document. And, of course,
366
00:40:27,760 --> 00:40:32,101
you can also send that back to an
attacker, or well, to a legitimate use
367
00:40:32,101 --> 00:40:37,890
case, of course, via clicking a button,
but clicking buttons is pretty lame. So we
368
00:40:37,890 --> 00:40:42,120
again looked at the standard and found the
so-called open action. And that is an
369
00:40:42,120 --> 00:40:47,190
action, for example, submitting a form
that can be performed upon opening a
370
00:40:47,190 --> 00:40:54,980
document. So how might this look? This is
how a PDF form looks, already with the
371
00:40:54,980 --> 00:41:01,390
attack applied. So, we've got a URL here
that is unencrypted, because all strings
372
00:41:01,390 --> 00:41:07,400
in this document are unencrypted, and
we've got the value object 2 0, where the
373
00:41:07,400 --> 00:41:13,335
actual encrypted data lives. So, that is
the value of the form fields. And what
374
00:41:13,335 --> 00:41:17,120
will happen on the attacker side as soon
as this document is opened? Well, we'll
375
00:41:17,120 --> 00:41:24,540
get a POST request with the confidential
content. Let's have a demo. Again, we have
376
00:41:24,540 --> 00:41:30,620
this document. We put in our password.
It's the original document you have
377
00:41:30,620 --> 00:41:36,160
already seen. We reopen it in a text
viewer, or a text editor, again see it's
378
00:41:36,160 --> 00:41:44,160
encrypted, and we decide to change all
strings to the identity filter. So, no
379
00:41:44,160 --> 00:41:49,480
encryption is applied to strings from now
on. And then we add a whole blob of
380
00:41:49,480 --> 00:41:55,940
information for the open action, and for
the form. So this will be
381
00:41:55,940 --> 00:42:00,350
performed, as soon as the document is
opened. There is a URL, p.df, and the
382
00:42:00,350 --> 00:42:07,540
value is the encrypted object 4 0. We
start an HTTP server on the domain we
383
00:42:07,540 --> 00:42:12,970
specified, we open the document, put in
the password again, and as soon as we open
384
00:42:12,970 --> 00:42:17,770
the document Adobe will helpfully show us
a warning, but they will already click the
385
00:42:17,770 --> 00:42:22,170
button for remembering that for the
future. And if you accept that, you will
386
00:42:22,170 --> 00:42:29,390
see your secret message on the attacker
server. And that is pretty bad already.
387
00:42:29,390 --> 00:42:36,480
OK. The same works for hyperlinks, so, of
course, there are links in PDF documents,
388
00:42:36,480 --> 00:42:43,600
and as on the Web, we can define a base
URL for hyperlinks. So we can say all URLs
389
00:42:43,600 --> 00:42:49,940
from this document start with http://p.df.
And of course we can define any object as
390
00:42:49,940 --> 00:42:57,260
a URL. So any object we prepared this way
can be sent as a URL, and that will, of
391
00:42:57,260 --> 00:43:01,180
course, trigger a GET request upon opening
the document again, if you defined an open
392
00:43:01,180 --> 00:43:08,750
action for the same object. So again,
pretty bad and breaks confidentiality. And
393
00:43:08,750 --> 00:43:16,380
of course, everybody loves JavaScript in
PDF files, and that works as well. Okay.
394
00:43:16,380 --> 00:43:21,350
Let's talk about ciphertext attacks, so
actual cryptographic attacks, no more not
395
00:43:21,350 --> 00:43:29,190
touching the crypto. So you might remember
the efail attacks on OpenPGP and S/MIME,
396
00:43:29,190 --> 00:43:34,160
and those had basically three
prerequisites. 1: Well, ciphertext
397
00:43:34,160 --> 00:43:38,690
malleability, so it's called malleability
gadgets. That's why we need ciphertext
398
00:43:38,690 --> 00:43:43,850
malleability, and we've got no integrity
protection, that's a plus. Then we need
399
00:43:43,850 --> 00:43:48,680
some known plaintext for actual targeted
modifications. And we need an exfiltration
400
00:43:48,680 --> 00:43:53,070
channel to send the data back to an
attacker. Well, exfiltration channels are
401
00:43:53,070 --> 00:43:59,730
already dealt with as we have hyperlinks
and forms. So we can already check that.
402
00:43:59,730 --> 00:44:05,800
Nice. Let's talk about ciphertext
malleability, or what we call gadgets. So,
403
00:44:05,800 --> 00:44:10,180
some of you might remember this from
crypto 101, or whatever lecture you ever
404
00:44:10,180 --> 00:44:15,290
had on cryptography. This is the
decryption function of CBC, so cipher
405
00:44:15,290 --> 00:44:24,030
block chaining. And it's basically, you've
got your ciphertext up here, and your
406
00:44:24,030 --> 00:44:29,730
plaintext down here. And it works by
simply decrypting a block of ciphertext,
407
00:44:29,730 --> 00:44:35,850
XORing the previous block of ciphertext
onto that, and you'll get the plaintext.
408
00:44:35,850 --> 00:44:41,070
So what happens, if you decide to change a
single bit in the ciphertext, for example,
409
00:44:41,070 --> 00:44:47,530
the first bit of the initialization
vector? Well, that same bit will flip in
410
00:44:47,530 --> 00:44:53,110
the actual plaintext. Wait a second. What
happens, if you happen to know a whole
411
00:44:53,110 --> 00:45:00,150
plaintext block? Well, we can XOR that
onto the first block, and basically get
412
00:45:00,150 --> 00:45:05,890
all zeros, or what we call a gadget, or a
blank sheet of paper, because we can write
413
00:45:05,890 --> 00:45:14,130
on that by taking a chosen plaintext and
XORing that onto this result. And this
414
00:45:14,130 --> 00:45:18,740
way we can, for example, construct URLs in
the actual ciphertext, or in the actual
415
00:45:18,740 --> 00:45:24,420
resulting plaintext. What we can also do
with these gadgets is moving
416
00:45:24,420 --> 00:45:28,580
them somewhere else in the document,
cloning them, so we can have multiple
417
00:45:28,580 --> 00:45:34,150
gadgets, at multiple places in the
ciphertext. But remember, if you do that,
418
00:45:34,150 --> 00:45:37,800
there's always the avalanche effect of
CBC, so you will have some random bytes in
419
00:45:37,800 --> 00:45:45,590
here, but the URL still remains in place.
Okay. That's ciphertext malleability done.
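The gadget construction described in the preceding cues can be sketched in Python. This is a minimal illustration under stated assumptions, not the speakers' tooling: the block cipher is a hypothetical toy Feistel network standing in for AES (the standard library has no AES), and the names `make_gadget` and `inject` are invented for this sketch. It relies on the IV being stored as the first block in front of the ciphertext, so an attacker can replace it freely.

```python
import hashlib

BLOCK = 16  # block size in bytes, same as AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (a stand-in, NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt the block, then XOR the previous ciphertext block onto it.
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def make_gadget(prev_block: bytes, known_plain: bytes) -> bytes:
    # XORing the known plaintext onto the preceding block makes the
    # following block decrypt to all zeros: the "blank sheet of paper".
    return xor(prev_block, known_plain)


def inject(gadget: bytes, cipher_block: bytes, chosen: bytes) -> bytes:
    # Writing on the blank sheet: the block pair decrypts to `chosen`.
    return xor(gadget, chosen) + cipher_block


# The attacker sees iv and ct, and knows the first plaintext block only.
key, iv = b"k" * 16, bytes(range(16))
known, secret = b"KNOWN PLAINTEXT!", b"top secret data!"
ct = cbc_encrypt(key, iv, known + secret)

forged = inject(make_gadget(iv, known), ct[:BLOCK], b"http://p.df/?id=")
# Prepend the forged pair; re-inserting the IV keeps the original text intact.
plain = cbc_decrypt(key, iv, forged + iv + ct)
print(plain[16:32])  # the attacker-chosen block
print(plain[48:])    # the untouched original plaintext
```

The blocks at offsets 0 and 32 of the result decrypt to unpredictable bytes, which matches the remark in the talk that some random bytes appear while the URL stays in place.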
420
00:45:45,590 --> 00:45:50,610
As I've said we need some plaintext. We
need to have some known plaintext. And as
421
00:45:50,610 --> 00:45:54,460
the PDF standard has been pretty helpful
up until now, in breaking PDF encryption,
422
00:45:54,460 --> 00:46:02,071
let's take a look again. And what we found
here: Permissions. So PDF documents can
423
00:46:02,071 --> 00:46:08,040
have different permissions for the author,
and the user of the document. This
424
00:46:08,040 --> 00:46:11,020
basically means the author can edit the
document and the users might not be able
425
00:46:11,020 --> 00:46:16,060
to do that. And of course, people started
to tamper
426
00:46:16,060 --> 00:46:20,220
with that value, if it was left
unencrypted, so in the newest version, it
427
00:46:20,220 --> 00:46:27,310
was decided this should be encrypted as a
16 byte value. So we've got 16 bytes. How
428
00:46:27,310 --> 00:46:30,890
do they look? Well, at first, we need room
for extension. We need lots of
429
00:46:30,890 --> 00:46:36,100
permissions. Then we put 4 bytes of the
actual permission value - That is also in
430
00:46:36,100 --> 00:46:42,270
unencrypted form in the document. Then we need
one byte for encrypted metadata, and for
431
00:46:42,270 --> 00:46:46,840
some reason we need some acronym, "adb",
I'll leave it to you to figure out what
432
00:46:46,840 --> 00:46:52,700
that stands for. And finally, we've got
four random bytes, because we have to fill
433
00:46:52,700 --> 00:47:00,260
up 16 bytes, and we have run out of ideas.
Okay. We take all of that, encrypt it, and
434
00:47:00,260 --> 00:47:05,980
oh well, we know a lot of that, and that
is basically known plaintext by design.
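The predictable part of that 16-byte block can be written down in a few lines of Python. This is a sketch under assumptions: the byte order (permission value first, little-endian, followed by four 0xFF extension bytes) follows one reading of the PDF 2.0 specification and may differ from the order shown on the slide, and the function name `perms_known_plaintext` is hypothetical.

```python
import struct


def perms_known_plaintext(p: int, encrypt_metadata: bool = True) -> bytes:
    """Return the 12 predictable bytes of the 16-byte permissions block.

    Assumed layout (to be checked against the specification):
      bytes 0-3   the unencrypted permission value, 32-bit little-endian
      bytes 4-7   0xFF padding, the "room for extension"
      byte  8     'T' or 'F' for the encrypted-metadata flag
      bytes 9-11  the literal "adb"
      bytes 12-15 random, so unknown to the attacker and not returned
    """
    low = struct.pack("<i", p)
    extension = b"\xff\xff\xff\xff"
    flag = b"T" if encrypt_metadata else b"F"
    return low + extension + flag + b"adb"


# The permission value -4 is the one mentioned in the talk.
print(perms_known_plaintext(-4).hex())
```

Only the last four bytes are unknown, so an attacker who reads the unencrypted permission value from the document knows three quarters of the plaintext block.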
435
00:47:05,980 --> 00:47:12,940
Which is bad. Let's look at how this looks
in a document. So, you see the perms
436
00:47:12,940 --> 00:47:16,410
value, I've marked it down here. That is
the actual extended value I've shown you
437
00:47:16,410 --> 00:47:22,750
on the last slide. And above that you'll
see the unencrypted value that's inside
438
00:47:22,750 --> 00:47:28,030
this perms value, so the minus 4 in this
case, it's basically a bit field. On the
439
00:47:28,030 --> 00:47:33,610
right side you see the actual encrypted
contents, and helpfully, all of this is
440
00:47:33,610 --> 00:47:37,750
encrypted under the same document-wide key
in the newest version of the
441
00:47:37,750 --> 00:47:43,510
specification. And that means we can
reuse this plaintext anywhere in the
442
00:47:43,510 --> 00:47:48,930
document we want, and we can reuse this
to build gadgets. To sum that last point
443
00:47:48,930 --> 00:47:53,190
up for you: Adobe decided to add
permissions to the PDF format, and people
444
00:47:53,190 --> 00:47:56,950
thought of tampering with them. So they
decided to encrypt these permissions to
445
00:47:56,950 --> 00:48:06,360
prevent tampering, and now known plaintext
is available to attackers. All right. So
446
00:48:06,360 --> 00:48:14,330
that's basically all of the prerequisites
done, and let's again have a demo. So, we
447
00:48:14,330 --> 00:48:20,180
again open this document, put in our
password, well, as soon as Chrome decides
448
00:48:20,180 --> 00:48:26,740
to open this document, we put in our
password. It's the same as before. Now,
449
00:48:26,740 --> 00:48:31,630
I've prepared a script for you, because I
really can't do this live, and it
450
00:48:31,630 --> 00:48:35,400
basically does what I've told you. It's
getting a blank gadget from the perms
451
00:48:35,400 --> 00:48:39,670
value. It's generating a URL from that.
It's generating a field name, so that it
452
00:48:39,670 --> 00:48:45,410
will look nice on the server side, we
regenerate this document and put a form in
453
00:48:45,410 --> 00:48:50,080
there. We start a web server, open this
modified document, put in the password
454
00:48:50,080 --> 00:48:55,540
again and oh well, Chrome doesn't even
ask. So as soon as this document is opened
455
00:48:55,540 --> 00:48:59,160
in Chrome and the password is put in,
we'll get our secret message delivered to
456
00:48:59,160 --> 00:49:07,080
the attacker.
Applause
457
00:49:07,080 --> 00:49:13,510
So we took a look at 27 viewers and found
all of them vulnerable to at least one of
458
00:49:13,510 --> 00:49:18,390
our attacks. So some of them work with no
user interaction as we have seen in
459
00:49:18,390 --> 00:49:22,730
Chrome. Some work with user interaction in
specific cases, as you've seen with Adobe
460
00:49:22,730 --> 00:49:30,660
with a warning, but generally all of these
were attackable in one way or the other.
461
00:49:30,660 --> 00:49:35,670
So what can be done about all of this?
Well, you might think signatures might
462
00:49:35,670 --> 00:49:40,250
help. That's usually the first point
people bring up: "A signature on the
463
00:49:40,250 --> 00:49:46,550
encrypted file will help." Well, no, not
really. Why is that? Well, for one, a
464
00:49:46,550 --> 00:49:50,332
broken signature does not prevent opening
the document. So we'll still be able to
465
00:49:50,332 --> 00:49:54,360
exfiltrate as soon as a password is put
in. Signatures can be stripped because
466
00:49:54,360 --> 00:49:57,700
they're not encrypted. And as you have
seen before, they can also be forged in
467
00:49:57,700 --> 00:50:02,960
most viewers. Signatures are not the
answer. Closing exfiltration channels is
468
00:50:02,960 --> 00:50:08,360
also not the answer because for one, it's
hard to do. And how would you even find
469
00:50:08,360 --> 00:50:14,690
all exfiltration channels in an 800-page
standard? And I mean, we have barely
470
00:50:14,690 --> 00:50:18,430
scratched the surface of exfiltration
channels. And should we really remove
471
00:50:18,430 --> 00:50:24,290
forms and hyperlinks from documents? And
should we remove JavaScript? OK, maybe we
472
00:50:24,290 --> 00:50:28,700
should. And finally, if you have to do
that, please ask the user before
473
00:50:28,700 --> 00:50:34,300
connecting to a web server. So let's look
at some vendor reactions. Apple decided to
474
00:50:34,300 --> 00:50:38,680
do exactly what I've told you: to add a
dialog to warn the user and even show the
475
00:50:38,680 --> 00:50:44,460
whole URL with the encrypted plaintext.
And Google decided to stop trying to fix
476
00:50:44,460 --> 00:50:49,830
the unfixable in Chrome. They fixed the
automatic exfiltration, but there's really
477
00:50:49,830 --> 00:50:54,290
nothing they can do about the standard. So
this is a problem that has to be fixed in
478
00:50:54,290 --> 00:51:00,230
the standard. And that is basically that.
For mitigating wrapping attacks, we have
479
00:51:00,230 --> 00:51:04,110
to deprecate partial encryption and
disallow access from unencrypted to
480
00:51:04,110 --> 00:51:08,450
encrypted objects. And against the gadget
attacks, we have to use authenticated
481
00:51:08,450 --> 00:51:16,221
encryption like AES-GCM. OK. And Adobe has
told us that they were escalating this to
482
00:51:16,221 --> 00:51:19,980
the ISO working group that's now
responsible for the PDF standard and this
483
00:51:19,980 --> 00:51:24,710
will be taken up in the next revision. So
that's a win in my book.
484
00:51:24,710 --> 00:51:30,950
Applause
485
00:51:30,950 --> 00:51:36,330
Herald: Thank you so much, guys. That was
really awesome. Please queue up by the
486
00:51:36,330 --> 00:51:41,290
microphones if you have any questions, we
still have some time left for Q and A. But
487
00:51:41,290 --> 00:51:45,180
I think your research is really, really
interesting because it opens my mind to
488
00:51:45,180 --> 00:51:51,490
like how would this actually be able to be
misused in practice? Like, and I don't
489
00:51:51,490 --> 00:51:54,760
know, like, what's your take? I guess
since you've been working so much with
490
00:51:54,760 --> 00:51:59,020
this, you must have some kind of idea as
to what devious things you could come up
491
00:51:59,020 --> 00:52:02,680
with.
Fabian: I mean, it's still an attacker
492
00:52:02,680 --> 00:52:08,080
scenario that requires a lot of resources
and a very motivated attacker. So this
493
00:52:08,080 --> 00:52:13,680
might not be very important to the normal
user. Let's be real here. So most of us
494
00:52:13,680 --> 00:52:19,100
are not targeted by the NSA, I guess. So
you need an active attacker, an active man
495
00:52:19,100 --> 00:52:21,070
in the middle to actually perform these
attacks.
496
00:52:21,070 --> 00:52:25,800
Herald: Great. Thank you. And then I think
we have a question from microphone number
497
00:52:25,800 --> 00:52:28,850
four, please.
Microphone 4: Yes. You said that the
498
00:52:28,850 --> 00:52:32,700
next standard might have a fix.
Do you know a time frame on how long it
499
00:52:32,700 --> 00:52:41,450
takes to build such a standard?
Fabian: Well, no, we don't really know. We
500
00:52:41,450 --> 00:52:44,640
have talked with Adobe and they told us
they will show the next version of the
501
00:52:44,640 --> 00:52:48,950
standard to us before actually releasing
that, but we have no time frame at all
502
00:52:48,950 --> 00:52:51,950
from them.
Microphone 4: OK. Thank you.
503
00:52:51,950 --> 00:52:57,400
Herald: Thank you.
Microphone number five, please.
504
00:52:57,400 --> 00:53:02,300
Microphone 5: Thank you for a very
interesting talk. You showed in the first
505
00:53:02,300 --> 00:53:09,140
part that the signature has like these
four numbers with the byte range. And why
506
00:53:09,140 --> 00:53:15,580
is this, like four numbers, not part of a
signature? Is there a technical reason for
507
00:53:15,580 --> 00:53:18,480
that? Because the byte offset is
predictable.
508
00:53:18,480 --> 00:53:24,470
Vladi: It is! The byte range is protected
by the signature. But we just defined the
509
00:53:24,470 --> 00:53:31,710
second one and just moved the signed one
to be validated later. So there are two
510
00:53:31,710 --> 00:53:37,530
byte ranges. But only the first one, the
manipulated one, will be processed.
511
00:53:37,530 --> 00:53:42,580
Microphone 5: Thank you.
Herald: Thank you so much. Microphone
512
00:53:42,580 --> 00:53:47,940
number four, please.
Microphone 4: Oh, this is way too high for
513
00:53:47,940 --> 00:53:53,870
me. OK. I have an answer and a question
for you. You mentioned during the talk
514
00:53:53,870 --> 00:53:58,690
that you weren't sure how the Department
of Justice distributes the passwords
515
00:53:58,690 --> 00:54:07,940
for encrypting PDFs. The answer is: in
plain text, in a separate email or as the
516
00:54:07,940 --> 00:54:14,300
password of the week, which is distributed
through various means. That is also what
517
00:54:14,300 --> 00:54:20,370
the Department of Homeland Security does,
and the military is somewhat less stupid.
518
00:54:20,370 --> 00:54:27,030
As a question: I have roughly a half
terabyte of sensitive PDFs that I would
519
00:54:27,030 --> 00:54:36,910
like to scan for your attack and also for
redaction failures. Do you know of any
520
00:54:36,910 --> 00:54:45,560
fast, feasible ways to scan documents for
the presence of this kind of attack?
521
00:54:45,560 --> 00:54:51,970
Fabian: I don't know of any tools, but I
mean, scanning for the gadget attacks is
522
00:54:51,970 --> 00:54:58,390
actually possible if you try to do some
entropy detection. So, because you reuse
523
00:54:58,390 --> 00:55:01,870
ciphertext, you will have less entropy in
your ciphertext, but that's pretty hard to
524
00:55:01,870 --> 00:55:07,350
do. Direct exfiltration should probably be
detectable by scanning simply for words
525
00:55:07,350 --> 00:55:12,300
like "identity". Well, beyond that, there are the 18
different techniques that we provided in
526
00:55:12,300 --> 00:55:15,980
the paper. But I don't know of any tools
to do that automatically.
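The keyword scan suggested in this answer could look roughly like the following Python sketch. It is a heuristic under assumptions, not a tool from the paper: the list of names is a guess at what is worth flagging, the function name `scan_pdf_bytes` is hypothetical, and names hidden inside compressed object streams or written with `#xx` escapes will be missed.

```python
import re

# PDF names that the direct exfiltration attacks in the talk rely on.
# The selection is an assumption made for this sketch.
SUSPICIOUS_NAMES = (
    b"/Identity",
    b"/OpenAction",
    b"/SubmitForm",
    b"/URI",
    b"/JavaScript",
)

# A PDF name ends at whitespace or at one of these delimiter characters,
# so "/Identity-H" (a common font encoding name) is not counted.
NAME_END = rb"(?=[\s()<>\[\]{}/%]|\Z)"


def scan_pdf_bytes(data: bytes) -> dict:
    """Count occurrences of each suspicious name in raw PDF bytes."""
    return {
        name.decode(): len(re.findall(re.escape(name) + NAME_END, data))
        for name in SUSPICIOUS_NAMES
    }


sample = b"<< /StmF /Identity /StrF /StdCF >> /OpenAction 5 0 R"
print(scan_pdf_bytes(sample))
```

A hit is only a reason to look closer, since forms, links and open actions also appear in legitimate documents.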
527
00:55:15,980 --> 00:55:21,560
Microphone 4: Thank you.
Herald: Great. Thank you. And microphone
528
00:55:21,560 --> 00:55:24,200
number two, please.
Microphone 2: Thank you for your very interesting
529
00:55:24,200 --> 00:55:30,220
presentation. I have one suggestion and
one question for the mitigation scheme. If
530
00:55:30,220 --> 00:55:33,810
you simply run your PDF reader in a
virtual machine, that is firewalled away,
531
00:55:33,810 --> 00:55:38,660
so your firewall won't let anything
go out. But for the signature
532
00:55:38,660 --> 00:55:43,020
forgeries, I had an idea. I'm not sure if
this is actually a stupid idea, but did
533
00:55:43,020 --> 00:55:47,440
you consider faking the certificate?
Because presumably the signature is
534
00:55:47,440 --> 00:55:52,250
protected by the signer's certificate. You
make up your own, signing with that. Does
535
00:55:52,250 --> 00:55:57,670
it catch it and how?
Vladi: We considered it but not in this
536
00:55:57,670 --> 00:56:04,900
paper. We assume that the certificate and
the entire chain of trust for this path is
537
00:56:04,900 --> 00:56:11,750
totally secure. It was just an assumption
to just concentrate only on the attacks we
538
00:56:11,750 --> 00:56:19,600
already found. So, perhaps there will be
further research provided by us in the
539
00:56:19,600 --> 00:56:22,810
next months and years.
Herald: We might just hear more from you
540
00:56:22,810 --> 00:56:27,890
in the future. Thank you so much. And now
questions from the Internet, please.
541
00:56:27,890 --> 00:56:34,800
Signal Angel: I have two questions to the
first part of your talk from the Internet.
542
00:56:34,800 --> 00:56:40,540
The first one is you mentioned a few
reactions, but can you give a bit more
543
00:56:40,540 --> 00:56:46,510
detail about your experience with vendors
while reporting these issues?
544
00:56:46,510 --> 00:56:58,480
Vladi: Yeah. When we first
started, we asked the CERT team from BSI,
545
00:56:58,480 --> 00:57:04,790
CERT-Bund, to help us because there were a
lot of affected vendors and we were not
546
00:57:04,790 --> 00:57:13,580
able to provide the support in a feasible
way. So they supported us the entire way.
547
00:57:13,580 --> 00:57:19,880
We first created the report,
containing the exact description of the
548
00:57:19,880 --> 00:57:26,190
vulnerabilities and all exploits. Then, we
distributed it to the BSI and they
549
00:57:26,190 --> 00:57:32,540
contacted the vendors and just proxied
the communication and there was a lot of
550
00:57:32,540 --> 00:57:36,680
communication. So I'm not aware of the
entire communication, but only about the
551
00:57:36,680 --> 00:57:45,930
technical stuff where we were asked to
just retest the fix and so on. So there
552
00:57:45,930 --> 00:57:52,810
was some reaction from Adobe, Foxit and a
lot of viewers reacted on our attacks and
553
00:57:52,810 --> 00:57:58,210
contacted us, but not everybody.
Herald: Thank you so much. Unfortunately,
554
00:57:58,210 --> 00:58:01,670
that's the only time that we have
available for questions today. I think you
555
00:58:01,670 --> 00:58:06,080
guys might stay around for a couple of
minutes, just if someone has any more
556
00:58:06,080 --> 00:58:10,930
questions. Fabian and Vladislav, I can't thank
you enough. Thank you so much.
557
00:58:10,930 --> 00:58:13,040
It was very interesting. Please give them
a great round of applause.
558
00:58:13,040 --> 00:58:14,793
Vladi: Thank you.
Applause
559
00:58:14,793 --> 00:58:20,299
36c3 postroll music
560
00:58:20,299 --> 00:58:43,000
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!