WEBVTT
00:00:00.000 --> 00:00:09.520
36c3 preroll music
00:00:18.410 --> 00:00:23.250
Herald: So, the next talk for this
afternoon is about high speed binary
00:00:23.250 --> 00:00:28.110
fuzzing. We have two researchers that will
be presenting the product of their latest
00:00:28.110 --> 00:00:33.640
work, which is a framework for static
binary rewriting. Our speakers are—the
00:00:33.640 --> 00:00:38.580
first one is a computer science master's
student at EPFL and the second one is a
00:00:38.580 --> 00:00:42.730
security researcher and assistant
professor at EPFL. Please give a big round
00:00:42.730 --> 00:00:45.048
of applause to Nspace and gannimo.
00:00:45.048 --> 00:00:50.280
Applause
00:00:50.280 --> 00:00:52.610
gannimo (Mathias Payer): Thanks for the
introduction. It's a pleasure to be here,
00:00:52.610 --> 00:00:57.850
as always. We're going to talk about
different ways to speed up your fuzzing
00:00:57.850 --> 00:01:02.050
and to find different kinds of
vulnerabilities or to tweak your binaries
00:01:02.050 --> 00:01:08.070
in somewhat unintended ways. I'm Mathias
Payer or I go by gannimo on Twitter and I
00:01:08.070 --> 00:01:14.440
am an assistant professor at EPFL working
on different forms of software security:
00:01:14.440 --> 00:01:18.700
fuzzing, sanitization, but also different
kinds of mitigations. And Matteo over
00:01:18.700 --> 00:01:24.160
there is working on his master's thesis on
different forms of binary rewriting for
00:01:24.160 --> 00:01:27.820
the kernel. And today we're going to take
you on a journey on how to actually
00:01:27.820 --> 00:01:32.180
develop very fast and very efficient
binary rewriting mechanisms that allow you
00:01:32.180 --> 00:01:37.710
to do unintended modifications to the
binaries and allow you to explore
00:01:37.710 --> 00:01:45.700
different kinds of unintended features in
binaries. So about this talk. What we
00:01:45.700 --> 00:01:49.729
discovered or the reason why we set out on
this journey was that fuzzing binaries is
00:01:49.729 --> 00:01:56.460
really, really hard. There are very few
tools in user space, and it's
00:01:56.460 --> 00:01:59.680
extremely hard to set it up and it's
extremely hard to set it up in a
00:01:59.680 --> 00:02:04.479
performant way. The setup is complex. You
have to compile different tools. You have
00:02:04.479 --> 00:02:08.520
to modify it. And the results are not
really that satisfactory. As soon as you
00:02:08.520 --> 00:02:13.320
move to the kernel, fuzzing binaries in a
kernel is even harder. There's no tooling
00:02:13.320 --> 00:02:16.880
whatsoever, there's very few users
actually working with binary code in the
00:02:16.880 --> 00:02:22.630
kernel or modifying binary code, and it's
just a nightmare to work with. So what we
00:02:22.630 --> 00:02:26.850
are presenting today is a new approach
that allows you to instrument any form of
00:02:26.850 --> 00:02:31.920
binary code or modern binary code based on
static rewriting, which gives you full
00:02:31.920 --> 00:02:36.819
native performance. You only pay for the
instrumentation that you add, and you can
00:02:36.819 --> 00:02:41.690
do very heavyweight transformations on top
of it. The picture, if you look at the
00:02:41.690 --> 00:02:47.470
modern system, let's say we are looking at
a modern setup. Let's say you're looking
00:02:47.470 --> 00:02:52.700
at cat pictures in your browser: Chrome
plus the kernel plus the libc plus the
00:02:52.700 --> 00:02:57.920
graphical user interface together clock in
at about 100 million lines of code.
00:02:57.920 --> 00:03:02.670
Instrumenting all of this for some form of
security analysis is a nightmare,
00:03:02.670 --> 00:03:06.690
especially along this large stack of
software. There's quite a bit of different
00:03:06.690 --> 00:03:11.260
compilers involved. There's different
linkers. It may be compiled on a different
00:03:11.260 --> 00:03:14.620
system, with different settings and so on.
And then getting your instrumentation
00:03:14.620 --> 00:03:18.569
across all of this is pretty much
impossible and extremely hard to work
00:03:18.569 --> 00:03:24.269
with. And we want to enable you to select
those different parts that you're actually
00:03:24.269 --> 00:03:29.629
interested in. Modify those and then focus
your fuzzing or analysis approaches on
00:03:29.629 --> 00:03:35.040
those small subsets of the code, giving
you a much better and stronger capability
00:03:35.040 --> 00:03:38.690
to test those parts of the system that
you're really,
00:03:38.690 --> 00:03:45.659
really interested in. Who's worked on
fuzzing before? Quick show of hands. Wow,
00:03:45.659 --> 00:03:54.379
that's a bunch of you. Do you use AFL?
Yeah, most of you, AFL. Libfuzzer? Cool,
00:03:54.379 --> 00:03:59.760
about 10, 15 percent libfuzzer, 30 percent
fuzzing, and AFL. There's a quite good
00:03:59.760 --> 00:04:03.980
knowledge of fuzzing, so I'm not going to
spend too much time on fuzzing, but for
00:04:03.980 --> 00:04:07.500
those that haven't really run their
fuzzing campaigns yet, it's a very simple
00:04:07.500 --> 00:04:12.060
software testing technique. You're
effectively taking a binary, let's say
00:04:12.060 --> 00:04:16.480
Chrome, as a target and you're running
this in some form of execution
00:04:16.480 --> 00:04:20.959
environment. And fuzzing then consists of
some form of input generation that creates
00:04:20.959 --> 00:04:26.620
new test cases, throws them at your
program and checks what is
00:04:26.620 --> 00:04:31.310
happening with your program. And either
everything is OK, and your code is being
00:04:31.310 --> 00:04:35.640
executed, the program terminates, and
everything is fine, or you
00:04:35.640 --> 00:04:39.773
have a bug report. If you have a bug
report, you can use this. Find the
00:04:39.773 --> 00:04:44.520
vulnerability, maybe develop a PoC and
then come up with some form of either
00:04:44.520 --> 00:04:49.240
exploit or patch or anything else. Right.
So this is pretty much fuzzing in a
00:04:49.240 --> 00:04:55.560
nutshell. How do you get fuzzing to be
effective? How can you cover large source
00:04:55.560 --> 00:05:00.419
bases, complex code, and complex
environment? Well, there's a couple of
00:05:00.419 --> 00:05:04.979
simple steps that you can take. And let's
walk quickly through effective fuzzing
00:05:04.979 --> 00:05:12.630
101. Well, first, you want to be able to
create test cases that actually trigger
00:05:12.630 --> 00:05:18.100
bugs. And this is a very, very
complicated part. And we need
00:05:18.100 --> 00:05:22.800
to have some notion of the inputs that a
program accepts. And we need to have some
00:05:22.800 --> 00:05:27.780
notion of how we can explore different
parts of the program, right? Different
00:05:27.780 --> 00:05:30.870
parts of functionality. Well, on one hand,
we could have a developer write all the
00:05:30.870 --> 00:05:34.370
test cases by hand, but this would be kind
of boring. It would also require a lot of
00:05:34.370 --> 00:05:40.220
human effort in creating these different
inputs and so on. So coverage guided
00:05:40.220 --> 00:05:46.990
fuzzing has evolved as a very simple way
to guide the fuzzing process, leveraging
00:05:46.990 --> 00:05:51.220
the information on which parts of the code
have been executed by simply tracing the
00:05:51.220 --> 00:05:58.500
individual path through the program based
on the execution flow. So we can—the
00:05:58.500 --> 00:06:03.460
fuzzer can use this feedback to then
modify the inputs that are being thrown at
00:06:03.460 --> 00:06:09.830
the fuzzing process. The second step is
the fuzzer must be able to detect bugs. If
00:06:09.830 --> 00:06:13.080
you've ever looked at a memory corruption,
if you're just writing one byte after the
00:06:13.080 --> 00:06:18.490
end of a buffer, it's highly likely that
your software is not going to crash. But
00:06:18.490 --> 00:06:21.180
it's still a bug, and it may still be
exploitable based on the underlying
00:06:21.180 --> 00:06:26.690
conditions. So we want to be able to
detect violations as soon as they happen,
00:06:26.690 --> 00:06:31.600
for example, based on some form of
sanitization that we add, some form of
00:06:31.600 --> 00:06:35.400
instrumentation that we add to the
binary, that then tells us, hey, there's a
00:06:35.400 --> 00:06:39.729
violation of the memory safety property,
and we terminate the application right
00:06:39.729 --> 00:06:45.300
away as a feedback to the fuzzer. Third,
and last but not least: Speed is
00:06:45.300 --> 00:06:49.569
key, right? If you're running a
fuzzing campaign, you have a fixed
00:06:49.569 --> 00:06:54.639
resource budget. You have a couple of
cores, and you want to run for 24 hours,
00:06:54.639 --> 00:06:59.470
48 hours, a couple of days. In any
case, whatever your constraints are, you
00:06:59.470 --> 00:07:04.210
have a fixed amount of instructions that
you can actually execute. And you have to
00:07:04.210 --> 00:07:08.699
decide, am I spending my instructions on
generating new inputs, tracking
00:07:08.699 --> 00:07:14.139
constraints, finding bugs, running
sanitization or executing the program? And
00:07:14.139 --> 00:07:17.790
you need to find a balance between all of
them, as it is a zero sum game. You have a
00:07:17.790 --> 00:07:20.870
fixed amount of resources and you're
trying to make the best with these
00:07:20.870 --> 00:07:26.890
resources. So any overhead is slowing you
down. And again, this becomes an
00:07:26.890 --> 00:07:30.819
optimization problem. How can you most
effectively use the resources that you
00:07:30.819 --> 00:07:37.580
have available? When we are fuzzing with
source code, it's quite easy to actually
00:07:37.580 --> 00:07:41.770
leverage existing mechanisms, and we add
all that instrumentation at compile time.
00:07:41.770 --> 00:07:45.630
We take source code, we pipe it through
the compiler, and modern compiler
00:07:45.630 --> 00:07:51.169
platforms allow you to instrument and add
little code snippets during the
00:07:51.169 --> 00:07:55.419
compilation process that then carry out
all these tasks that are useful for
00:07:55.419 --> 00:08:00.270
fuzzing. For example, modern compilers can
add short snippets of code for coverage
00:08:00.270 --> 00:08:03.990
tracking that will record which parts of
the code you have executed, or for
00:08:03.990 --> 00:08:08.770
sanitization which record and check every
single memory access if it is safe or not.
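[Editor's note: a minimal Python sketch of what such an inserted coverage snippet does conceptually. The block IDs and bitmap size are made-up illustration values, not what any real compiler pass emits.]

```python
# Conceptual sketch of the "short snippet" a compiler inserts for coverage
# tracking: each basic block gets an ID, and entering the block marks that
# ID in a shared bitmap the fuzzer inspects after every run.

BITMAP_SIZE = 1 << 16
coverage_bitmap = bytearray(BITMAP_SIZE)

def trace_block(block_id: int) -> None:
    """What the inserted snippet does: mark this basic block as executed."""
    coverage_bitmap[block_id % BITMAP_SIZE] = 1

def run_instrumented(data: bytes) -> None:
    trace_block(0)          # snippet at the entry basic block
    if data.startswith(b"P"):
        trace_block(1)      # snippet in the block guarded by the check

run_instrumented(b"hello")
hits_before = coverage_bitmap.count(1)   # only the entry block: 1
run_instrumented(b"P...")
hits_after = coverage_bitmap.count(1)    # entry plus guarded block: 2
```

The fuzzer compares the bitmap before and after a run: any newly set entry means the input reached a part of the program it had not seen before.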
00:08:08.770 --> 00:08:12.360
And then when you're running the
instrumented binary, everything is fine
00:08:12.360 --> 00:08:17.380
and you can detect the policy violations
as you go along. Now if you would have
00:08:17.380 --> 00:08:21.330
source code for everything, this would be
amazing. But it's often not the case,
00:08:21.330 --> 00:08:28.129
right? We may be able on Linux to cover a
large part of the software stack by
00:08:28.129 --> 00:08:33.940
focusing only on source-code-based
approaches. But there may be applications
00:08:33.940 --> 00:08:39.300
where no source code is available. If we
move to Android or other mobile systems,
00:08:39.300 --> 00:08:43.199
there's many drivers that are not
available as open source or just available
00:08:43.199 --> 00:08:48.630
as binary blobs, or the full software
stack may be closed-source and we only get
00:08:48.630 --> 00:08:52.329
the binaries. And we still want to find
vulnerabilities in these complex software
00:08:52.329 --> 00:08:59.530
stacks that span hundreds of millions of
lines of code in a very efficient way. The
00:08:59.530 --> 00:09:04.620
only solution to cover this part of
massive code base is to actually rewrite
00:09:04.620 --> 00:09:08.990
and focus on binaries. A very simple
approach could be black box fuzzing, but
00:09:08.990 --> 00:09:11.620
this doesn't really get you
anywhere because you don't get any
00:09:11.620 --> 00:09:16.100
feedback; you don't get any information if
you're triggering bugs. So one simple
00:09:16.100 --> 00:09:20.290
approach, and this is the approach that is
most dominantly used today, is to rewrite
00:09:20.290 --> 00:09:26.040
the program or the binary dynamically. So
you're taking the binary and during
00:09:26.040 --> 00:09:32.010
execution you use some form of dynamic
binary instrumentation based on Pin, angr,
00:09:32.010 --> 00:09:37.140
or some other binary rewriting tool and
translate the target at runtime, adding
00:09:37.140 --> 00:09:43.330
this binary instrumentation on top of it
as you're executing it. It's simple, it's
00:09:43.330 --> 00:09:46.930
straightforward, but it comes at a
terrible performance cost of ten to a
00:09:46.930 --> 00:09:51.600
hundred x slow down, which is not really
effective. And you're spending all your
00:09:51.600 --> 00:09:57.600
cores and your cycles on just executing
the binary instrumentation. So we don't
00:09:57.600 --> 00:10:01.790
really want to do this and we want to have
something that's more effective than that.
00:10:01.790 --> 00:10:07.360
So what we are focusing on is to do static
rewriting. It involves a much more complex
00:10:07.360 --> 00:10:12.380
analysis as we are rewriting the binary
before it is being executed, and we have
00:10:12.380 --> 00:10:17.880
to recover all of the control flow, all of
the different mechanisms, but it results
00:10:17.880 --> 00:10:24.690
in a much better performance. And we can
get more bang for our buck. So why is
00:10:24.690 --> 00:10:30.830
static rewriting so challenging? Well,
first, simply adding code will break the
00:10:30.830 --> 00:10:35.320
target. So if you are disassembling this
piece of code here, which is a simple loop
00:10:35.320 --> 00:10:40.620
that loads data, decrements the registers,
and then jumps if you're not at the end of
00:10:40.620 --> 00:10:46.470
the array and keeps iterating through this
array. Now, as you look at the jump-not-
00:10:46.470 --> 00:10:52.100
zero instruction, the last instruction of
the snippet, it is a relative offset. So
00:10:52.100 --> 00:10:57.990
it jumps backward seven bytes. Which is
nice if you just execute the code as is.
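[Editor's note: a toy Python model of that displacement arithmetic, with made-up addresses, to illustrate why inserting code forces the rewriter to fix the offset.]

```python
# A jnz encoded with displacement -7 jumps back over a 7-byte loop body.
# x86 resolves relative branches from the END of the branch instruction.

def branch_target(insn_end: int, disp: int) -> int:
    return insn_end + disp

loop_start = 0x100   # first instruction of the loop (hypothetical address)
jnz_end    = 0x107   # address just past the jnz
disp       = -7      # encoded relative offset: 7 bytes backward

assert branch_target(jnz_end, disp) == loop_start   # works as-is

# Insert 5 bytes of instrumentation between the target and the jump:
jnz_end += 5
# the stale displacement no longer reaches the loop start...
assert branch_target(jnz_end, disp) != loop_start
# ...so the rewriter must recompute it from the recovered target:
fixed_disp = loop_start - jnz_end   # -12
assert branch_target(jnz_end, fixed_disp) == loop_start
```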
00:10:57.990 --> 00:11:02.040
But as soon as you want to insert new
code, you change the offsets in the
00:11:02.040 --> 00:11:07.110
program, and you're modifying all these
different offsets. And simply adding new
00:11:07.110 --> 00:11:12.769
code somewhere in between will break the
target. So a core feature that we need to
00:11:12.769 --> 00:11:18.170
enforce, or core property that we need to
enforce, is that we must find all the
00:11:18.170 --> 00:11:24.050
references and properly adjust them, both
relative offsets and absolute offsets as
00:11:24.050 --> 00:11:29.800
well. Getting a single one wrong will
break everything. What makes this problem
00:11:29.800 --> 00:11:34.520
really, really hard is that if you're
looking at the binary, a byte is a byte,
00:11:34.520 --> 00:11:38.320
right? There's no way for us to
distinguish between scalars and
00:11:38.320 --> 00:11:43.649
references, and in fact they are
indistinguishable. Getting a single
00:11:43.649 --> 00:11:50.400
reference wrong breaks the target and
would introduce arbitrary crashes. So we
00:11:50.400 --> 00:11:54.460
have to come up with ways that allow us to
distinguish between the two. So for
00:11:54.460 --> 00:11:59.899
example, if you have this code here, it
takes a value and stores it somewhere on
00:11:59.899 --> 00:12:07.060
the stack. This could come from two
different kind of high-level constructs.
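[Editor's note: a quick Python sketch of the ambiguity described next. The addresses are hypothetical; the point is only that the two interpretations have byte-identical encodings.]

```python
import struct

# In the raw binary a byte is a byte: the immediate operand in something
# like 'movq $0x4005d0, -8(%rbp)' could be the address of a function or an
# unrelated integer constant.

function_address = 0x4005D0          # taking the address of some function
scalar_constant  = 0x4005D0          # a plain number that happens to match

bytes_if_reference = struct.pack("<Q", function_address)
bytes_if_scalar    = struct.pack("<Q", scalar_constant)

# The encodings are byte-identical, so nothing local to the instruction
# tells the rewriter whether the value must be adjusted when code moves.
assert bytes_if_reference == bytes_if_scalar
```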
00:12:07.060 --> 00:12:12.170
On one hand, it could be taking the
address of a function and storing this
00:12:12.170 --> 00:12:16.540
function address in a stack
variable. Or it could be just storing a
00:12:16.540 --> 00:12:21.579
scalar in a stack variable. And these two
are indistinguishable, and rewriting them,
00:12:21.579 --> 00:12:25.220
as soon as we add new code, the offsets
will change. If it is a function, we would
00:12:25.220 --> 00:12:31.800
have to modify the value; if it is a
scalar, we have to keep the same value. So
00:12:31.800 --> 00:12:35.510
how can we come up with a way that allows
us to distinguish between the two and
00:12:35.510 --> 00:12:44.610
rewrite binaries by recovering this
missing information? So let us take you
00:12:44.610 --> 00:12:48.120
on a journey
towards instrumenting binaries in the
00:12:48.120 --> 00:12:53.070
kernel. This is what we aim for. We'll
start with the simple case of
00:12:53.070 --> 00:12:57.410 line:1
instrumenting binaries in user land, talk
about different kinds of coverage guided
00:12:57.410 --> 00:13:01.750 line:1
fuzzing and what kind of instrumentation
we can add, what kind of sanitization we
00:13:01.750 --> 00:13:06.390
can add, and then focusing on taking it
all together and applying it to kernel
00:13:06.390 --> 00:13:11.480
binaries to see what will fall out of
it. Let's start with instrumenting
00:13:11.480 --> 00:13:17.019
binaries first. I will now talk a little
bit about RetroWrite, our mechanism and
00:13:17.019 --> 00:13:24.560
our tool that enables static binary
instrumentation by symbolizing existing
00:13:24.560 --> 00:13:30.800
binaries. So we recover the information
and we translate relative offsets and
00:13:30.800 --> 00:13:39.710
absolute offsets into actual labels that
are added to the assembly file. The
00:13:39.710 --> 00:13:42.760
instrumentation can then work on the
recovered assembly file, which can then be
00:13:42.760 --> 00:13:48.110
reassembled into a binary that can then be
executed for fuzzing. We implement
00:13:48.110 --> 00:13:52.459
coverage tracking and binary address
sanitizer on top of this, leveraging
00:13:52.459 --> 00:13:57.970
abstraction as we go forward. The key to
enabling this kind of binary rewriting is
00:13:57.970 --> 00:14:02.170
position-independent code. And position-
independent code has become the de-facto
00:14:02.170 --> 00:14:07.420
standard for any code that is being
executed on a modern system. And it
00:14:07.420 --> 00:14:12.019
effectively says that it is code that can
be loaded at any arbitrary address in your
00:14:12.019 --> 00:14:15.600
address space as you are executing
binaries. It is essential and a
00:14:15.600 --> 00:14:19.010
requirement if you want to have address
space layout randomization or if you want
00:14:19.010 --> 00:14:22.269
to use shared libraries, which de facto
you want to use in all these different
00:14:22.269 --> 00:14:26.090
systems. So since a couple of years, all
the code that you're executing on your
00:14:26.090 --> 00:14:33.079
phones, on your desktops, on your laptops
is position-independent code. And the idea
00:14:33.079 --> 00:14:36.680
behind position-independent code is
that you can load it anywhere in your
00:14:36.680 --> 00:14:41.040
address space and you can therefore not
use any hard-coded static addresses and
00:14:41.040 --> 00:14:44.420
you have to inform the system, via
relocations or PC-relative
00:14:44.420 --> 00:14:52.920
addresses, of how it can
relocate these different references. On
00:14:52.920 --> 00:14:58.540
x86_64, position-independent code
leverages addressing that is relative to
00:14:58.540 --> 00:15:03.440
the instruction pointer. So for example,
it uses the current instruction pointer
00:15:03.440 --> 00:15:07.519
and then a relative offset to that
instruction pointer to reference global
00:15:07.519 --> 00:15:12.030
variables, other functions and so on. And
this is a very easy way for us to
00:15:12.030 --> 00:15:17.710
distinguish references from constants,
especially in PIE binaries. If it is RIP-
00:15:17.710 --> 00:15:21.360
relative, it is a reference; everything
else is a constant. And we can build our
00:15:21.360 --> 00:15:25.690
translation algorithm and our translation
mechanism based on this fundamental
00:15:25.690 --> 00:15:30.130
finding to remove any form of heuristic
that is needed by focusing especially on
00:15:30.130 --> 00:15:35.030
position-independent code. So we're
supporting position-independent code; we
00:15:35.030 --> 00:15:38.920
are—we don't support non-position-
independent code, but we give you the
00:15:38.920 --> 00:15:43.200
guarantee that we can rewrite all the
different code that is out there. So
00:15:43.200 --> 00:15:48.449
symbolization works as follows: If you
have the little bit of code on the lower
00:15:48.449 --> 00:15:54.030
right, symbolization replaces first all
the references with assembler labels. So
00:15:54.030 --> 00:15:57.700
look at the call instruction and the jump-
not-zero instruction; the call instruction
00:15:57.700 --> 00:16:02.399
references an absolute address and the
jump-not-zero instruction jumps backward
00:16:02.399 --> 00:16:08.259
relative 15 bytes. So by focusing on these
relative jumps and calls, we can replace
00:16:08.259 --> 00:16:12.020
them with actual labels and rewrite the
binary as follows: so we're calling a
00:16:12.020 --> 00:16:15.839
function, replacing it with the actual
label, and for the jump-not-zero we are
00:16:15.839 --> 00:16:21.020
inserting an actual label in the assembly
code and adding a backward reference. For
00:16:21.020 --> 00:16:26.089
PC-relative addresses, for example the
data load, we can then replace it with the
00:16:26.089 --> 00:16:30.329
name of the actual data that we have
recovered, and we can then add all the
00:16:30.329 --> 00:16:35.630 line:1
different relocations and use that as
auxiliary information on top of it. After
00:16:35.630 --> 00:16:43.480
these three steps, we can insert any new
code in between, and can therefore add
00:16:43.480 --> 00:16:47.420
different forms of instrumentation or run
some higher-level analysis on top of
00:16:47.420 --> 00:16:53.940
it, and then reassemble the file for
fuzzing or coverage-guided tracking,
00:16:53.940 --> 00:16:59.100
address sanitization or whatever else you
want to do. I will now hand over to
00:16:59.100 --> 00:17:04.490
Matteo, who will cover coverage-guided
fuzzing and sanitization and then
00:17:04.490 --> 00:17:07.260
instrumenting the binaries in the kernel.
Go ahead.
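[Editor's note: a toy symbolizer in the spirit of what Mathias just described. This is not RetroWrite's actual implementation; the addresses and AT&T-style disassembly below are made up for illustration.]

```python
import re

# Direct branch targets become assembler labels, so the assembler
# recomputes every offset when the file is reassembled after
# instrumentation has been inserted.

disasm = [
    (0x1000, "call 0x2000"),          # 0x2000 is external to this snippet
    (0x1005, "mov (%rbx), %eax"),
    (0x1007, "sub $1, %rbx"),
    (0x100b, "jnz 0x1005"),
]

def symbolize(lines):
    targets = set()
    rewritten = []
    for addr, insn in lines:
        m = re.match(r"(call|jmp|jnz)\s+0x([0-9a-f]+)", insn)
        if m:
            target = int(m.group(2), 16)
            targets.add(target)
            insn = f"{m.group(1)} .L{target:x}"   # reference -> label
        rewritten.append((addr, insn))
    out = []
    for addr, insn in rewritten:
        if addr in targets:
            out.append(f".L{addr:x}:")            # emit label definition
        out.append(f"    {insn}")
    return out

asm = symbolize(disasm)
# New code can now be inserted anywhere in 'asm' without breaking the jnz:
# the assembler will recompute its displacement from the .L1005 label.
```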
00:17:07.260 --> 00:17:11.300
Nspace (Matteo Rizzo): So, now that we
have this really nice framework to rewrite
00:17:11.300 --> 00:17:16.500
binaries, one of the things that we want
to add to actually get fuzzing working is this
00:17:16.500 --> 00:17:22.960
coverage-tracking instrumentation. So
coverage-guided fuzzing is a
00:17:22.960 --> 00:17:27.549
method to let the fuzzer discover
interesting inputs and interesting paths through
00:17:27.549 --> 00:17:35.520
the target by itself. So the basic idea is
that the fuzzer will track coverage—the
00:17:35.520 --> 00:17:39.190
parts of the program that are covered by
different inputs by inserting some kind of
00:17:39.190 --> 00:17:43.419
instrumentation. So, for example, here we
have this target program that checks if
00:17:43.419 --> 00:17:48.651
the input contains the string "PNG" at the
beginning, and if it does, then it does
00:17:48.651 --> 00:17:53.559
something interesting, otherwise it just
bails out and fails. So if we track the
00:17:53.559 --> 00:17:58.240
parts of the program that each input
executes, the fuzzer can figure out that
00:17:58.240 --> 00:18:03.100
an input that contains "P" will have
discovered a different path through the
00:18:03.100 --> 00:18:08.080
program than an input that doesn't contain
it. And so on: it can, one byte at a
00:18:08.080 --> 00:18:13.360
time, discover that this program expects
this magic sequence "PNG" at the start of
00:18:13.360 --> 00:18:19.280
the input. So the way that the fuzzer does
this is that every time a new input
00:18:19.280 --> 00:18:23.730
discovers a new path through the target, it
is considered interesting and added to a
00:18:23.730 --> 00:18:28.890
corpus of interesting inputs. And every
time the fuzzer needs to generate a new
00:18:28.890 --> 00:18:35.610
input, it will select something from the
corpus, mutate it randomly, and then use
00:18:35.610 --> 00:18:39.830
it as the new input. So this is
conceptually pretty
00:18:39.830 --> 00:18:43.150
simple, but in practice it works really
well and it really lets the fuzzer
00:18:43.150 --> 00:18:47.740
discover the format that the target
expects in an unsupervised way. So as an
00:18:47.740 --> 00:18:53.010
example, this is an experiment that was
run by the author of AFL—AFL is the fuzzer
00:18:53.010 --> 00:18:58.049
that sort of popularized this
technique—where he was fuzzing a JPEG-
00:18:58.049 --> 00:19:02.160
parsing library, starting from a corpus
that only contained the string "hello". So
00:19:02.160 --> 00:19:07.650
now clearly "hello" is not a valid JPEG
image, but the fuzzer
00:19:07.650 --> 00:19:12.070
was still able to discover the
correct format. So after a while it
00:19:12.070 --> 00:19:17.580
started generating some grayscale images,
on the top left, and as it generated more
00:19:17.580 --> 00:19:20.720
and more inputs, it started generating
more interesting images, such as some
00:19:20.720 --> 00:19:25.120
grayscale gradients, and later on even
some color images. So as you can see, this
00:19:25.120 --> 00:19:30.630
really works, and it allows us to fuzz a
program without really teaching the fuzzer
00:19:30.630 --> 00:19:34.600
how the input should look. So that's
it for coverage-guided fuzzing. Now we'll
00:19:34.600 --> 00:19:38.190
talk a bit about sanitizations. As a
reminder, the core idea behind
00:19:38.190 --> 00:19:42.330
sanitization is that just looking for
crashes is likely to miss some of the
00:19:42.330 --> 00:19:45.919
bugs. So, for example, if you have this
out-of-bounds one-byte read, that will
00:19:45.919 --> 00:19:49.590
probably not crash the target, but you
would still like to catch it because it
00:19:49.590 --> 00:19:53.080
could be used for an info leak, for
example. So one of the most popular
00:19:53.080 --> 00:19:59.030
sanitizers is Address Sanitizer. So
Address Sanitizer will instrument all the
00:19:59.030 --> 00:20:04.630
memory accesses in your program and check
for memory corruption. Memory
00:20:04.630 --> 00:20:08.809
corruption is a pretty dangerous class of
bugs that unfortunately still plagues C
00:20:08.809 --> 00:20:16.770
and C++ programs and unsafe languages in
general. And ASan tries to catch it by
00:20:16.770 --> 00:20:21.220
instrumenting the target. It is very
popular; it has been used to find
00:20:21.220 --> 00:20:26.900
thousands of bugs in complex software like
Chrome and Linux, and even though it has,
00:20:26.900 --> 00:20:31.500
a bit of a slowdown of about 2x, it
is still really popular because it lets
00:20:31.500 --> 00:20:37.120
you find many, many more bugs. So how does
it work? The basic idea is that ASan will
00:20:37.120 --> 00:20:41.790
insert some special regions of memory
called 'red zones' around every object in
00:20:41.790 --> 00:20:47.270
memory. So we have a small example here
where we declare a 4-byte array on the
00:20:47.270 --> 00:20:53.700
stack. So ASan will allocate the array
"buf" and then add a red zone before it
00:20:53.700 --> 00:20:59.060
and a red zone after it. Whenever the
program accesses the red zones, it is
00:20:59.060 --> 00:21:02.660
terminated with a security violation. So
the instrumentation just prints a bug
00:21:02.660 --> 00:21:07.419
report and then crashes the target. This
is very useful for detecting, for example,
00:21:07.419 --> 00:21:11.400
buffer overflows or underflows and many
other kinds of bugs such as use-after-free
00:21:11.400 --> 00:21:16.230
and so on. So, as an example here, we are
trying to copy 5 bytes into a 4-byte
00:21:16.230 --> 00:21:22.580
buffer, and ASan will check each of the
accesses one by one. And when it sees that
00:21:22.580 --> 00:21:26.810
the last byte writes to a red zone, it
detects the violation and crashes the
00:21:26.810 --> 00:21:32.370
program. So this is good for us because
this bug might have not been found by
00:21:32.370 --> 00:21:36.120
simply looking for crashes, but it's
definitely found if we use ASan. So this
00:21:36.120 --> 00:21:40.750
is something we want for fuzzing. So now
that we've covered—briefly covered ASan we
00:21:40.750 --> 00:21:45.970
can talk about instrumenting binaries in
the kernel. So Mathias left us with
00:21:45.970 --> 00:21:52.580
RetroWrite, and with RetroWrite we can add
both coverage tracking and ASan to
00:21:52.580 --> 00:21:57.410
binaries. It's a really
simple idea: now that we can rewrite this
00:21:57.410 --> 00:22:02.760
binary and add instructions wherever we
want, we can implement both coverage
00:22:02.760 --> 00:22:07.390
tracking and ASan. In order to implement
coverage tracking, we simply have to
00:22:07.390 --> 00:22:11.710
identify the start of every basic block
and add a little piece of instrumentation
00:22:11.710 --> 00:22:15.789
at the start of the basic block that tells
the fuzzer 'hey, we've reached this part
00:22:15.789 --> 00:22:19.400
of the program'—'hey, we've reached this
other part of the program'. Then the
00:22:19.400 --> 00:22:25.039
fuzzer can figure out whether that's a new
part or not. ASan
00:22:25.039 --> 00:22:29.240
can also be
implemented in this way by finding all
00:22:29.240 --> 00:22:33.929
memory accesses, and then linking with
libASan. libASan is a sort of runtime for
00:22:33.929 --> 00:22:38.820
ASan that takes care of inserting the red
zones, keeping
00:22:38.820 --> 00:22:43.340
around all the metadata that
ASan needs to know where the red zones
00:22:43.340 --> 00:22:48.419
are, and detecting whether a memory access
is invalid. So, how can we apply all of
00:22:48.419 --> 00:22:52.309
this in the kernel? Well, first of all,
fuzzing the kernel is not as easy as
00:22:52.309 --> 00:22:57.920
fuzzing some userspace program. There's
some issues here. So first of all, there's
00:22:57.920 --> 00:23:01.950 line:1
crash handling. So whenever you're fuzzing
a userspace program, you expect crashes,
00:23:01.950 --> 00:23:06.289
well, because that's what we're after. And
if a userspace program crashes, then the
00:23:06.289 --> 00:23:11.410
OS simply terminates the process gracefully.
And so the fuzzer can detect this, and
00:23:11.410 --> 00:23:16.270
save the input as a crashing input, and so
on. And this is all fine. But when you're
00:23:16.270 --> 00:23:19.470
fuzzing the kernel, so—if you were fuzzing
the kernel of the machine that you were
00:23:19.470 --> 00:23:23.040 line:1
using for fuzzing, after a while, the
machine would just go down. Because, after
00:23:23.040 --> 00:23:27.180
all, the kernel runs the machine, and if
it starts misbehaving, then all of it can
00:23:27.180 --> 00:23:31.720
go wrong. And more importantly, you can
lose your crashes, because if the
00:23:31.720 --> 00:23:35.450
machine crashes, then the state of the
fuzzer is lost and you have no idea what
00:23:35.450 --> 00:23:39.590
your crashing input was. So what most
kernel fuzzers have to do is that they
00:23:39.590 --> 00:23:43.419
resort to some kind of VM to keep the
system stable. So they fuzz the kernel in
00:23:43.419 --> 00:23:48.500
a VM and then run the fuzzing agent
outside the VM. On top of that is tooling.
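[Editor's note: a hedged sketch of that VM setup. The kernel and disk image paths are hypothetical placeholders, and a real fuzzing agent would do much more than build a command line.]

```python
# Run the target kernel inside QEMU so a panic only takes down the VM
# while the fuzzer on the host keeps its state.

def qemu_cmdline(kernel: str, disk: str, snapshot: bool = True) -> list:
    cmd = [
        "qemu-system-x86_64",
        "-kernel", kernel,       # the (instrumented) kernel under test
        "-hda", disk,
        "-append", "console=ttyS0 oops=panic panic=1",  # fail fast on bugs
        "-nographic",
        "-m", "2G",
    ]
    if snapshot:
        cmd.append("-snapshot")  # discard disk writes: each run starts clean
    return cmd

cmd = qemu_cmdline("bzImage", "rootfs.img")
# A host-side agent would launch this with subprocess, feed inputs over a
# channel into the guest, and watch the serial console for panic messages.
```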
00:23:48.500 --> 00:23:52.710
So, if you want to fuzz a user space
program, you can just download AFL or use
00:23:52.710 --> 00:23:57.540
libfuzzer; there's plenty of tutorials
online, it's really easy to set up and
00:23:57.540 --> 00:24:01.200
you just compile your program, you start
fuzzing and you're good to go. If you want
00:24:01.200 --> 00:24:05.240
to fuzz the kernel, it's already much more
complicated. So, for example, if you want
00:24:05.240 --> 00:24:09.390
to fuzz Linux with, say, syzkaller, which
is a popular kernel fuzzer, you have to
00:24:09.390 --> 00:24:14.030
compile the kernel, you have to use a
special config that supports syzkaller,
00:24:14.030 --> 00:24:20.100
you have way less guides available than
for userspace fuzzing, and in general it's
00:24:20.100 --> 00:24:24.940
just much more complex and less intuitive
than just fuzzing userspace. And lastly,
00:24:24.940 --> 00:24:29.330
we have the issue of determinism. So in
general, if you have a single threaded
00:24:29.330 --> 00:24:32.770
userspace program, unless it uses some
kind of random number generator, it is
00:24:32.770 --> 00:24:38.210
more or less deterministic. There's
nothing that affects the execution of the
00:24:38.210 --> 00:24:42.299
program. But—and this is really nice if
you want to try to reproduce a test case,
00:24:42.299 --> 00:24:46.340
because if you have a non-deterministic
test case, then it's really hard to know
00:24:46.340 --> 00:24:50.680
whether this is really a crash or if it's
just something that you should ignore, and
00:24:50.680 --> 00:24:56.280
in the kernel this is even harder, because
you don't only have concurrency, like
00:24:56.280 --> 00:25:01.200
multi-processing, you also have interrupts.
So interrupts can happen at any time, and
00:25:01.200 --> 00:25:05.850
if one time you got an interrupt while
executing your test case and the second
00:25:05.850 --> 00:25:09.947
time you didn't, then maybe it only
crashes one time - you don't really know,
00:25:09.947 --> 00:25:15.910
it's not pretty. And so again, we
have several approaches to fuzzing
00:25:15.910 --> 00:25:20.550
binaries in the kernel. First one is to do
black box fuzzing. We don't really
00:25:20.550 --> 00:25:23.677
like this because it doesn't find much,
especially in something complex
00:25:23.677 --> 00:25:27.380
like a kernel. Approach 1 is to
use dynamic translation,
00:25:27.380 --> 00:25:32.620
so, use something
like QEMU or—you name it. This works, and
00:25:32.620 --> 00:25:35.121
people have used it successfully; the
problem is that it is really, really,
00:25:35.121 --> 00:25:41.500
really slow. Like, we're talking about
10x-plus overhead. And as we said before,
00:25:41.500 --> 00:25:45.570
the more iterations, the more test cases
you can execute in the same amount of
00:25:45.570 --> 00:25:50.700
time, the better, because you find more
bugs. And on top of that, there's no
00:25:50.700 --> 00:25:57.520
currently available sanitizer for
kernel binaries that works—is based on
00:25:57.520 --> 00:26:01.309
this approach. So in userspace you have
something like valgrind; in the kernel,
00:26:01.309 --> 00:26:05.071
you don't have anything, at least that we
know of. There is another approach, which
00:26:05.071 --> 00:26:09.951
is to use Intel Processor Trace. This has
been, like—there's been some research
00:26:09.951 --> 00:26:14.240
papers on this recently, and this is nice
because it allows you to collect coverage
00:26:14.240 --> 00:26:18.040
at nearly zero overhead. It's, like,
really fast, but the problem is that it
00:26:18.040 --> 00:26:23.020
requires hardware support, so it requires
a fairly new x86 CPU, and if you want to
00:26:23.020 --> 00:26:27.159
fuzz something on ARM, say, like, your
Android driver, or if you want to use an
00:26:27.159 --> 00:26:32.120
older CPU, then you're out of luck. And
what's worse, you cannot really use it for
00:26:32.120 --> 00:26:36.490
sanitization, or at least not the kind of
sanitization that ASan does, because it
00:26:36.490 --> 00:26:41.770
just traces the execution; it doesn't
allow you to do checks on memory accesses.
00:26:41.770 --> 00:26:47.350
So Approach 3, which is what we will use,
is static rewriting. So, we had this very
00:26:47.350 --> 00:26:50.750
nice framework for rewriting userspace
binaries, and then we asked ourselves, can
00:26:50.750 --> 00:26:56.659
we make this work in the kernel? So we
took the system, the original RetroWrite,
00:26:56.659 --> 00:27:02.650
we modified it, we implemented support for
Linux modules, and... it works! So we have
00:27:02.650 --> 00:27:08.110
implemented it—we have used it to fuzz
some kernel modules, and it really shows
00:27:08.110 --> 00:27:11.640
that this approach doesn't only work for
userspace; it can also be applied to the
00:27:11.640 --> 00:27:18.510
kernel. So as for some implementation, the
nice thing about kernel modules is that
00:27:18.510 --> 00:27:22.170
they're always position independent. So
you cannot have position—like, fixed-
00:27:22.170 --> 00:27:26.370
position kernel modules because Linux just
doesn't allow it. So we sort of get that
00:27:26.370 --> 00:27:32.220
for free, which is nice. And Linux modules
are also a special class of ELF files,
00:27:32.220 --> 00:27:35.890
which means that the format is—even though
it's not the same as userspace binaries,
00:27:35.890 --> 00:27:40.310
it's still somewhat similar, so we didn't
have to change the symbolizer that much,
00:27:40.310 --> 00:27:46.539
which is also nice. And we implemented
symbolization with this, and we used it to
00:27:46.539 --> 00:27:54.490
implement both code coverage and binary
ASan for kernel binary modules. So for
00:27:54.490 --> 00:27:59.039
coverage: The idea behind the whole
RetroWrite project was that we wanted to
00:27:59.039 --> 00:28:03.500
integrate with existing tools. So existing
fuzzing tools. We didn't want to force our
00:28:03.500 --> 00:28:08.770
users to write their own fuzzer that is
compatible with RetroWrite. So for—in
00:28:08.770 --> 00:28:13.470
userspace we had AFL-style coverage
tracking, and binary ASan which is
00:28:13.470 --> 00:28:16.490
compatible with source-based ASan, and we
wanted to follow the same principle in the
00:28:16.490 --> 00:28:21.900
kernel. So it turns out that Linux has
this built-in coverage-tracking framework
00:28:21.900 --> 00:28:26.529
called kCov that is used by several
popular kernel fuzzers like syzkaller, and
00:28:26.529 --> 00:28:31.049
we wanted to use it ourselves. So we
designed our coverage instrumentation so
00:28:31.049 --> 00:28:36.590
that it integrates with kCov. The downside
is that you need to compile the kernel
00:28:36.590 --> 00:28:40.690
with kCov, but then again, Linux is open
source, so you can sort of always do that;
00:28:40.690 --> 00:28:44.279
the kernel usually—it's usually not the
kernel that is the binary blob; it's
00:28:44.279 --> 00:28:48.929
usually only the modules. So that's just
still fine. And the way you do this is—the
00:28:48.929 --> 00:28:53.370
way you implement kCov for binary modules
is that you just have to find the start of
00:28:53.370 --> 00:28:58.539
every basic block, and add a call to some
function that then stores the collected
00:28:58.539 --> 00:29:02.530
coverage. So here's an example: we have a
short snippet of code with three basic
00:29:02.530 --> 00:29:07.620
blocks, and all we have to do is add a
call to "trace_pc" to the start of the
00:29:07.620 --> 00:29:11.940
basic block. "trace_pc" is a function that
is part of the main kernel image that then
00:29:11.940 --> 00:29:17.230
collects this coverage and makes it
available to a userspace fuzzing agent. So
00:29:17.230 --> 00:29:21.210
this is all really easy and it works.
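As a rough userspace sketch of the instrumentation described above (names here are hypothetical; in the kernel the inserted call is to `__sanitizer_cov_trace_pc`, which records the caller's program counter into a per-task buffer that the fuzzer reads via kCov):

```c
#include <stddef.h>

#define COV_MAX 1024

/* Coverage buffer standing in for kCov's per-task buffer. */
static unsigned long cov_buf[COV_MAX];
static size_t cov_count;

/* The call the rewriter prepends to every basic block. */
static void trace_pc(unsigned long pc)
{
    if (cov_count < COV_MAX)
        cov_buf[cov_count++] = pc;
}

/* A function with three basic blocks; the block ids stand in for the
 * real instruction addresses a rewriter would record. */
int classify(int x)
{
    trace_pc(0x100);     /* entry block */
    if (x > 0) {
        trace_pc(0x110); /* taken branch */
        return 1;
    }
    trace_pc(0x120);     /* fall-through block */
    return 0;
}

size_t coverage_count(void) { return cov_count; }
```

Each execution thus leaves a trace of which blocks ran, which is exactly what AFL-style fuzzers use to decide whether an input found new coverage.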
let's now see how we implemented binary
00:29:21.210 --> 00:29:25.600
ASan. So as I mentioned before, when we
instrument the program with binary ASan in
00:29:25.600 --> 00:29:29.690
userspace we link with libASan, which
takes care of setting up the metadata,
00:29:29.690 --> 00:29:33.880
takes care of putting the red zones around
our allocations, and so on. So we had to
00:29:33.880 --> 00:29:37.330
do something similar in the kernel; of
course, you cannot link with libASan in
00:29:37.330 --> 00:29:42.630
the kernel, because that doesn't work, but
what we can do instead is, again, compile
00:29:42.630 --> 00:29:47.240
the kernel with kASan support. So this
instruments the allocator, kmalloc, to add
00:29:47.240 --> 00:29:52.110
the red zones; it allocates space for the
metadata, it keeps this metadata around,
00:29:52.110 --> 00:29:56.279
does this all for us, which is really
nice. And again, the big advantage of
00:29:56.279 --> 00:30:00.580
using this approach is that we can
integrate seamlessly with a kASan-
00:30:00.580 --> 00:30:05.800
instrumented kernel and with fuzzers that
rely on kASan such as syzkaller. So we see
00:30:05.800 --> 00:30:11.500
this as more of a plus than, like, a
limitation. And how do you implement ASan?
00:30:11.500 --> 00:30:16.561
Well, you have to find every memory access
and instrument it to check the—to check
00:30:16.561 --> 00:30:22.370
whether this is accessing a red zone. And
if it does then you just call this bug
00:30:22.370 --> 00:30:26.010
report function that produces a stack
trace, a bug report, and crashes the
00:30:26.010 --> 00:30:29.649
kernel, so that the fuzzer can detect it.
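A minimal userspace sketch of the check inserted before an N-byte access (simplified: real kASan derives the shadow address from a fixed offset and has specialized per-size variants; the shadow array here is a stand-in):

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Shadow encoding: one byte covers an 8-byte granule; 0 = fully
 * addressable, 1..7 = only the first k bytes valid, negative
 * (e.g. 0xfc, a heap red zone) = not addressable at all. */
static const int8_t shadow[4] = {
    0,            /* bytes  0..7:  fully valid        */
    4,            /* bytes  8..15: first 4 bytes valid */
    (int8_t)0xfc, /* bytes 16..23: heap red zone      */
    (int8_t)0xfc, /* bytes 24..31: heap red zone      */
};

static bool asan_check(uintptr_t addr, size_t size)
{
    int8_t s = shadow[addr >> 3];  /* shadow byte for this granule */
    if (s == 0)
        return true;               /* whole granule addressable */
    if (s < 0)
        return false;              /* red zone or freed memory */
    /* partial granule: the access must end within the first s bytes */
    return (int8_t)((addr & 7) + size) <= s;
}
```

When `asan_check` fails, the instrumented module would call the report function that prints the stack trace and crashes the kernel.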
Again, this is compatible with source-
00:30:29.649 --> 00:30:36.990
based kASan, so we're happy. We can simply
load the rewritten module with added
00:30:36.990 --> 00:30:40.220
instrumentation into a kernel, as long as
you have compiled the kernel with the
00:30:40.220 --> 00:30:44.340
right flags, and we can use a standard
kernel fuzzer. Here for the—our
00:30:44.340 --> 00:30:49.910
evaluation, we used syzkaller, a popular
kernel fuzzer by some folks at Google, and
00:30:49.910 --> 00:30:55.460
it worked really well. So we've finally
reached the end of our journey, and now we
00:30:55.460 --> 00:31:00.470
wanted to present some experiments we did
to see if this really works. So for
00:31:00.470 --> 00:31:05.289
userspace, we wanted to compare the
performance of our binary ASan with
00:31:05.289 --> 00:31:10.360
source-based ASan and with existing
solutions that also work on binaries. So
00:31:10.360 --> 00:31:15.860
for userspace, you can use Valgrind
Memcheck. It's a memory error detector that is
00:31:15.860 --> 00:31:20.850
based on binary translation and dynamic
binary translation and works on binaries.
00:31:20.850 --> 00:31:25.460
We compared it with source ASan and
RetroWrite ASan on the SPEC CPU benchmark
00:31:25.460 --> 00:31:31.100
and saw how fast it was. And for the
kernel we decided to fuzz some file
00:31:31.100 --> 00:31:37.519
systems and some drivers with syzkaller
using both source-based kASan and kCov and
00:31:37.519 --> 00:31:44.671
kRetroWrite-based kASan and kCov. So these
are our results for userspace. So the red
00:31:44.671 --> 00:31:48.990
bar is Valgrind. We can see that the
execution time of Valgrind is the highest.
NOTE Paragraph
00:31:48.990 --> 00:31:55.892
It is really, really slow—like, 3, 10, 30x
overhead, way too slow for fuzzing. Then
00:31:55.892 --> 00:32:02.580
in green, we have our binary ASan, which
is, like, already a large improvement. In
00:32:02.580 --> 00:32:07.059
orange we have source-based ASan. And then
finally in blue we have the original code
00:32:07.059 --> 00:32:11.090
without any instrumentation whatsoever. So
we can see that source-based ASan has,
00:32:11.090 --> 00:32:16.659
like, 2x or 3x overhead, and binary ASan
is a bit higher, like, a bit less
00:32:16.659 --> 00:32:21.312
efficient, but still somewhat close. So
that's for userspace, and for the kernel,
00:32:21.312 --> 00:32:25.440
we—these are some preliminary results, so,
this is, like—I'm doing this work as part
00:32:25.440 --> 00:32:29.897
of my master's thesis, and so I'm still,
like, running the evaluation. Here we can
00:32:29.897 --> 00:32:33.419
see that the overhead is already, like, a
bit lower. So the reason for this is that
00:32:33.419 --> 00:32:39.690
SPEC is a pure CPU benchmark; it doesn't
interact with the system that much. And so
00:32:39.690 --> 00:32:44.416
any instrumentation that you add is going
to massively slow down, or, like,
00:32:44.416 --> 00:32:49.320
considerably slow down the execution. By
contrast, when you fuzz a file system with
00:32:49.320 --> 00:32:56.460
syzkaller, not only every test case has to
go from the high—the host to the guest and
00:32:56.460 --> 00:33:01.770
then do multiple syscalls and so on, but
also every system call has to go through
00:33:01.770 --> 00:33:05.368
several layers of abstraction before it
gets to the actual file system. And all
00:33:05.368 --> 00:33:09.610
these—like, all of this takes a lot of
time, and so in practice the overhead of
00:33:09.610 --> 00:33:15.581
our instrumentation seems to be pretty
reasonable. So, since we know that you
00:33:15.581 --> 00:33:32.838
like demos, we've prepared a small demo of
kRetroWrite. So. Let's see. Yep. Okay. All
00:33:32.838 --> 00:33:40.470
right, so we've prepared a small kernel
module. And this module is just, like,
00:33:40.470 --> 00:33:45.669
really simple; it contains a
vulnerability, and what it does is that it
00:33:45.669 --> 00:33:49.929
creates a character device. So if you're
not familiar with this, a character device
00:33:49.929 --> 00:33:55.130
is like a fake file that is exposed by a
kernel driver and that it can read to and
00:33:55.130 --> 00:34:01.630
write from. And instead of going to a
file, the data that you read—that you, in
00:34:01.630 --> 00:34:05.590
this case, write to the fake file—goes to
the driver and is handled by this demo
00:34:05.590 --> 00:34:10.481
write function. So as we can see, this
function allocates a buffer, a 16-byte
00:34:10.481 --> 00:34:14.850
buffer on the heap, and then copies some
data into it, and then it checks if the
00:34:14.850 --> 00:34:19.970
data contains the string "1337". If it
does, then it accesses the buffer out of
00:34:19.970 --> 00:34:23.446
bounds; you can see "alloc[16]" and the
buffer is sixteen bytes; this is an out-
00:34:23.446 --> 00:34:27.550
of-bounds read by one byte. And if it
doesn't then it just accesses the buffer
00:34:27.550 --> 00:34:33.050
in bounds, which is fine, and it's not a
vulnerability. So we can compile this
00:34:33.050 --> 00:34:47.450
driver. OK, um... OK, and then so we have
our module, and then we will instrument it
00:34:47.450 --> 00:35:01.495
using kRetroWrite. So, instrument... Yes,
please. OK. Right. So kRetroWrite did some
00:35:01.495 --> 00:35:07.329
processing, and it produced an
instrumented module with ASan or kASan and
00:35:07.329 --> 00:35:09.770
a symbolized assembly file. We can
actually have a look at the symbolized
00:35:09.770 --> 00:35:17.740
assembly file to see what it looks like.
Yes. Yes. OK. So, is this big enough?
00:35:17.740 --> 00:35:22.900
Yeah... As you can see, so—we can actually
see here the ASan instrumentation. Ah,
00:35:22.900 --> 00:35:29.329
shouldn't—yeah. So, we—this is the ASan
instrumentation. The original code loads
00:35:29.329 --> 00:35:33.290
some data from this address. And as you
can see, the ASan instrumentation first
00:35:33.290 --> 00:35:38.240
computes the actual address, and then does
some checking—basically, this is checking
00:35:38.240 --> 00:35:44.430
some metadata that ASan stores to check if
the address is in a red zone or not, and
00:35:44.430 --> 00:35:49.430
then if the check fails, it
calls this ASan report which produces a
00:35:49.430 --> 00:35:54.829
stack trace and crashes the kernel. So
this is fine. We can actually even look at
00:35:54.829 --> 00:36:17.820
the disassembly of both modules, so...
objdump and then demo... Ah, nope. OK,
00:36:17.820 --> 00:36:21.830
so on the left, we have the original
module without any instrumentation; on the
00:36:21.830 --> 00:36:27.070
right, we have the module instrumented
with ASan. So as you can see, the original
00:36:27.070 --> 00:36:33.160
module has "push r13" and then has this
memory load here; on the right in the
00:36:33.160 --> 00:36:38.559
instrumented module, kRetroWrite inserted
the ASan instrumentation. So the original
00:36:38.559 --> 00:36:43.940
load is still down here, but between that,
between the first instruction and this
00:36:43.940 --> 00:36:47.851
instruction, we have—now have the kASan
instrumentation that does our check. So
00:36:47.851 --> 00:36:56.700
this is all fine. Now we can actually test
it and see what it does. So we can—we will
00:36:56.700 --> 00:37:02.210
boot a very simple, a very minimal Linux
system, and try to target the
00:37:02.210 --> 00:37:05.793
vulnerability first with the non-
instrumented module and then with the
00:37:05.793 --> 00:37:10.410
instrumented module. And we can—we will
see that in the—with the non-instrumented
00:37:10.410 --> 00:37:14.550
module, the kernel will not crash, but
with the instrumented module it will crash
00:37:14.550 --> 00:37:22.434
and produce a bug report. So. Let's see.
Yeah, this is a QEMU VM, I have no idea
00:37:22.434 --> 00:37:27.481
why it's taking so long to boot. I'll
blame the demo gods for not being kind to
00:37:27.481 --> 00:37:39.730
us. Yeah, I guess we just have to wait.
OK. So. All right, so we loaded the
00:37:39.730 --> 00:37:47.334
module. We will see that it has created a
fake file character device in /dev/demo.
00:37:47.334 --> 00:37:59.020
Yep. We can write this file. Yep. So this
will—this accesses the array in bounds,
00:37:59.020 --> 00:38:04.410
and so this is fine. Then what we can also
do is write "1337" to it so it will access
00:38:04.410 --> 00:38:08.968
the array out of bounds. So this is the
non-instrumented module, so this will not
00:38:08.968 --> 00:38:14.050
crash. It will just print some garbage
value. Okay, that's it. Now we can load
00:38:14.050 --> 00:38:25.890
the instrumented module instead... and do
the same experiment again. All right. We
00:38:25.890 --> 00:38:31.640
can see that /dev/demo is still here. So
the module still works. Let's try to write
00:38:31.640 --> 00:38:38.540
"1234" into it. This, again, doesn't
crash. But when we try to write "1337",
00:38:38.540 --> 00:38:47.940
this will produce a bug report.
applause
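For reference, a userspace analog of the demo module's write handler might look like this (a hypothetical reconstruction; the real module allocates with kmalloc and is driven through the character device, and only the in-bounds path is safe to exercise without a sanitizer):

```c
#include <stdlib.h>
#include <string.h>

/* The buffer is 16 bytes; input starting with "1337" triggers a
 * one-byte out-of-bounds read, exactly as in the demo. */
static int demo_write(const char *data, size_t len)
{
    char *alloc = calloc(1, 16);  /* kmalloc(16, GFP_KERNEL) analog */
    int out;

    if (!alloc)
        return -1;
    memcpy(alloc, data, len < 16 ? len : 16);

    if (len >= 4 && memcmp(data, "1337", 4) == 0)
        out = alloc[16];  /* OOB: index 16 of a 16-byte buffer */
    else
        out = alloc[15];  /* in bounds: last valid byte */

    free(alloc);
    return out;
}
```

Without instrumentation the out-of-bounds branch just returns a garbage byte; with kASan instrumentation the same access trips the red-zone check and produces the bug report shown in the demo.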
00:38:47.940 --> 00:38:51.129
So this has quite a lot of information. We
00:38:51.129 --> 00:38:55.700
can see, like, the—where the memory was
allocated, there's a stack trace for that;
00:38:55.700 --> 00:39:02.150
it wasn't freed, so there's no stack trace
for the free. And we see that the cache
00:39:02.150 --> 00:39:06.760
size of the memory, like, it was a 16-byte
allocation. We can see the shape of the
00:39:06.760 --> 00:39:10.900
memory. We see that these two zeros means
that there's two 8-byte chunks of valid
00:39:10.900 --> 00:39:15.550
memory. And then these "fc fc fc" is
the—are the red zones that I was talking
00:39:15.550 --> 00:39:19.980
about before. All right, so that's it for
the demo. We will switch back to our
00:39:19.980 --> 00:39:24.630
presentation now. So... hope you enjoyed
it.
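The shadow layout visible in that bug report can be sketched as follows; `shadow_for_alloc` is a hypothetical helper, not a kernel API, showing why a 16-byte allocation appears as two 0x00 bytes followed by the "fc fc" red-zone bytes:

```c
#include <stdint.h>
#include <stddef.h>

/* One shadow byte per 8 bytes of the object; 0x00 = fully valid,
 * 1..7 = partial granule, 0xfc = the value kASan uses for kmalloc
 * red zones. */
#define KASAN_REDZONE ((int8_t)0xfc)

static size_t shadow_for_alloc(size_t size, int8_t *out, size_t redzones)
{
    size_t n = 0;
    while (size >= 8) {   /* full granules -> 0x00 */
        out[n++] = 0;
        size -= 8;
    }
    if (size)             /* trailing partial granule */
        out[n++] = (int8_t)size;
    while (redzones--)    /* the "fc fc" after the object */
        out[n++] = KASAN_REDZONE;
    return n;
}

/* Shadow byte i for the demo's 16-byte allocation with two
 * red-zone granules: yields 00 00 fc fc. */
int8_t demo_shadow_byte(size_t i)
{
    static int8_t s[8];
    static size_t n;
    if (!n)
        n = shadow_for_alloc(16, s, 2);
    return s[i];
}
```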
00:39:24.630 --> 00:39:30.530
gannimo: Cool. So after applying this to a
demo module, we also wanted to see what
00:39:30.530 --> 00:39:35.365
happens if we apply this to a real file
system. After a couple of hours we
00:39:35.365 --> 00:39:41.390
were—when we came back and checked on the
results, we saw a couple of issues popping
00:39:41.390 --> 00:39:48.720
up, including a nice set of use-after-free
reads, a set of use-after-free writes, and
00:39:48.720 --> 00:39:56.220
we checked the bug reports and we saw a
whole bunch of Linux kernel issues popping
00:39:56.220 --> 00:40:02.640
up one after the other in this nondescript
module that we fuzzed. We're in the
00:40:02.640 --> 00:40:06.930
process of reporting it. This will take
some time until it is fixed; that's why
00:40:06.930 --> 00:40:13.470
you see the blurry lines. But as you see,
there's still quite a bit of opportunity
00:40:13.470 --> 00:40:19.190
in the Linux kernel where you can apply
different forms of targeted fuzzing into
00:40:19.190 --> 00:40:26.349
different modules, leverage these modules
on top of a kASan instrumented kernel and
00:40:26.349 --> 00:40:31.720
then leverage this as part of your
fuzzing toolchain to find interesting
00:40:31.720 --> 00:40:39.080
kernel 0days that... yeah. You can then
develop further, or report, or do whatever
00:40:39.080 --> 00:40:44.766
you want with them. Now, we've shown you
how you can take existing binary-only
00:40:44.766 --> 00:40:51.250
modules, think different binary-only
drivers, or even existing modules where
00:40:51.250 --> 00:40:55.800
you don't want to instrument a full set of
the Linux kernel, but only focus fuzzing
00:40:55.800 --> 00:41:02.130
and exploration on a small different—small
limited piece of code and then do security
00:41:02.130 --> 00:41:09.247
tests on those. We've shown you how we can
do coverage-based tracking and address
00:41:09.247 --> 00:41:13.500
sanitization. But this is also up to you
on what kind of other instrumentation you
00:41:13.500 --> 00:41:17.890
want. Like this is just a tool, a
framework that allows you to do arbitrary
00:41:17.890 --> 00:41:23.780
forms of instrumentation. So we've taken
you on a journey from instrumenting
00:41:23.780 --> 00:41:29.380
binaries over coverage-guided fuzzing and
sanitization to instrumenting modules in
00:41:29.380 --> 00:41:36.692
the kernel and then finding crashes in the
kernel. Let me wrap up the talk. So, this
00:41:36.692 --> 00:41:41.581
is one of the fun pieces of work that
we do in the hexhive lab at EPFL. So if
00:41:41.581 --> 00:41:45.740
you're looking for postdoc opportunities
or if you're thinking about a PhD, come
00:41:45.740 --> 00:41:51.809
talk to us. We're always hiring. The tools
will be released as open source. A large
00:41:51.809 --> 00:41:57.319
chunk of the userspace work is already
open source. We're working on a set of
00:41:57.319 --> 00:42:02.350
additional demos and so on so that you can
get started faster, leveraging the
00:42:02.350 --> 00:42:07.810
different existing instrumentation that is
already out there. The userspace work is
00:42:07.810 --> 00:42:12.139
already available. The kernel work will be
available in a couple of weeks. This
00:42:12.139 --> 00:42:16.770
allows you to instrument real-world
binaries for fuzzing, leveraging existing
00:42:16.770 --> 00:42:21.200
transformations for coverage tracking to
enable fast and effective fuzzing and
00:42:21.200 --> 00:42:26.490
memory checking to detect the actual bugs
that exist there. The key takeaway from
00:42:26.490 --> 00:42:32.430
this talk is that RetroWrite and
kRetroWrite enables static binary
00:42:32.430 --> 00:42:38.300
rewriting at zero instrumentation cost. We
take the limitation of focusing only on
00:42:38.300 --> 00:42:43.240
position-independent code, which is not a
real limitation, but we get the
00:42:43.240 --> 00:42:47.800
advantage of being able to symbolize
without actually relying on heuristics, so
00:42:47.800 --> 00:42:55.380
we can even symbolize large, complex
source—large, complex applications and
00:42:55.380 --> 00:43:01.090
effectively rewrite those aspects and then
you can focus fuzzing on these parts.
00:43:01.090 --> 00:43:06.329
Another point I want to mention is that
this enables you to reuse existing tooling
00:43:06.329 --> 00:43:10.981
so you can take a binary blob, instrument
it, and then reuse, for example, Address
00:43:10.981 --> 00:43:15.966
Sanitizer or existing fuzzing tools, as it
integrates really, really nicely. As I said,
00:43:15.966 --> 00:43:22.700
all the code is open source. Check it out.
Try it. Let us know if it breaks. We're
00:43:22.700 --> 00:43:27.521
happy to fix. We are committed to open
source. And let us know if there are any
00:43:27.521 --> 00:43:36.750
questions. Thank you.
applause
00:43:36.750 --> 00:43:42.250
Herald: So, thanks, guys, for an
interesting talk. We have some time for
00:43:42.250 --> 00:43:47.180
questions, so we have microphones along
the aisles. We'll start with a question from
00:43:47.180 --> 00:43:51.079
microphone number two.
Q: Hi. Thanks for your talk and for the
00:43:51.079 --> 00:43:59.400
demo. I'm not sure about the use-case you
showed for the kernel RetroWrite. 'Cause
00:43:59.400 --> 00:44:05.579
you're usually interested in fuzzing
binaries in kernelspace when you don't have
00:44:05.579 --> 00:44:13.980
source code for the kernel. For example,
for IoT or Android and so on. But you just
00:44:13.980 --> 00:44:22.260
reuse the kCov and kASan in the kernel,
and you never have the kernel in IoT or
00:44:22.260 --> 00:44:28.599
Android which is compiled with that. So
are you—do you have any plans to binary
00:44:28.599 --> 00:44:31.666
instrument the kernel itself, not the
modules?
00:44:31.666 --> 00:44:39.390
Nspace: So we thought about that. I think
that there's some additional problems that
00:44:39.390 --> 00:44:43.910
we would have to solve in order to be able
to instrument the full kernel. So other
00:44:43.910 --> 00:44:47.819
than the fact that it gives us
compatibility with, like, existing tools,
00:44:47.819 --> 00:44:51.720
the reason why we decided to go with
compiling the kernel with kASan and kCov
00:44:51.720 --> 00:44:56.757
is that building the, like—you would
have to, like, think about it. You
00:44:56.757 --> 00:45:01.540
have to instrument the memory allocator to
add red zones, which is, like, already
00:45:01.540 --> 00:45:07.069
somewhat complex. You have to instrument
the exception handlers to catch, like, any
00:45:07.069 --> 00:45:12.240
faults that the instrumentation detects.
You would have to, like, set up some
00:45:12.240 --> 00:45:17.480
memory for the ASan shadow. So this is,
like—I think you should be able to do it,
00:45:17.480 --> 00:45:21.690
but it would require a lot of additional
work. So this is, like—this was like four
00:45:21.690 --> 00:45:25.510
months' thesis. So we decided to start
small and prove that it works in
00:45:25.510 --> 00:45:30.470
the kernel for modules, and then leave it
to future work to actually extend it to
00:45:30.470 --> 00:45:37.558
the full kernel. Also, like, I think for
Android—so in the case of Linux, the
00:45:37.558 --> 00:45:42.072
kernel is GPL, right, so if the
manufacturer ships a custom kernel, they
00:45:42.072 --> 00:45:44.614
have to release the source code, right?
Q: They never do.
00:45:44.614 --> 00:45:47.220
Nspace: They never—well, that's a
different issue. Right?
00:45:47.220 --> 00:45:49.009
gannimo: Right.
Q: So that's why I ask, because I don't
00:45:49.009 --> 00:45:51.839
see how it just can be used in the real
world.
00:45:51.839 --> 00:45:57.122
gannimo: Well, let me try to put this into
perspective a little bit as well. Right.
00:45:57.122 --> 00:46:02.030
So there's the—what we did so far is we
leveraged existing tools, like kASan or
00:46:02.030 --> 00:46:09.440
kCov, and integrated into these existing
tools. Now, doing heap-based allocation is
00:46:09.440 --> 00:46:13.572
fairly simple and replacing those with
additional red zones—that instrumentation
00:46:13.572 --> 00:46:20.203
you can carry out fairly well by focusing
on the different allocators. Second to
00:46:20.203 --> 00:46:24.972
that, simply oopsing the kernel and
printing the stack trace is also fairly
00:46:24.972 --> 00:46:29.250
straightforward. So it's not a lot of
additional effort. So it is—it involves
00:46:29.250 --> 00:46:38.471
some engineering effort to port this to
non-kASan-compiled kernels. But we think
00:46:38.471 --> 00:46:44.740
it is very feasible. In the interest of
time, we focused on kASan-enabled kernels,
00:46:44.740 --> 00:46:50.960
so that some form of ASan is already
enabled. But yeah, this is additional
00:46:50.960 --> 00:46:55.660
engineering effort. But there is also a
community out there that can help us with
00:46:55.660 --> 00:47:00.960
these kind of changes. So kRetroWrite and
RetroWrite themselves are the binary
00:47:00.960 --> 00:47:07.060
rewriting platform that allows you to turn
a binary into an assembly file that you
00:47:07.060 --> 00:47:11.619
can then instrument and run different
passes on top of it. So another pass would
00:47:11.619 --> 00:47:16.399
be a full ASan pass or kASan pass that
somebody could add and then contribute
00:47:16.399 --> 00:47:19.100
back to the community.
Q: Yeah, it would be really useful.
00:47:19.100 --> 00:47:20.186
Thanks.
gannimo: Cool.
00:47:20.186 --> 00:47:24.260
Angel: Next question from the Internet.
Q: Yes, there is a question regarding the
00:47:24.260 --> 00:47:30.890
slide on the SPEC CPU benchmark. The
second or third graph from the right had
00:47:30.890 --> 00:47:36.700
an instrumented version that was faster
than the original program. Why is that?
00:47:36.700 --> 00:47:42.299
gannimo: Cache effect. Thank you.
Angel: Microphone number one.
00:47:42.299 --> 00:47:47.032
Q: Thank you. Thank you for the presentation.
I have a question: how many architectures do
00:47:47.032 --> 00:47:51.210
you support, and do you plan to support more,
what then?
00:47:51.210 --> 00:47:56.400
gannimo: x86_64.
Q: Okay. So no plans for ARM or MIPS,
00:47:56.400 --> 00:47:58.130
or...?
gannimo: Oh, there are plans.
00:47:58.130 --> 00:48:01.390
Q: Okay.
Nspace: Right, so—
00:48:01.390 --> 00:48:05.980
gannimo: Right. Again, there's a finite
amount of time. We focused on the
00:48:05.980 --> 00:48:11.778
technology. ARM is high up on the list. If
somebody is interested in working on it
00:48:11.778 --> 00:48:17.670
and contributing, we're happy to hear from
it. Our list of targets is ARM first and
00:48:17.670 --> 00:48:22.915
then maybe something else. But I think
with x86_64 and ARM we've covered a
00:48:22.915 --> 00:48:33.420
majority of the interesting platforms.
Q: And second question, did you try to
00:48:33.420 --> 00:48:37.970
fuzz any real closed-source program?
Because as I understand from presentation,
00:48:37.970 --> 00:48:44.710
you fuzz, like, just file systems, which we
can compile and fuzz with syzkaller like
00:48:44.710 --> 00:48:48.570
in the past.
Nspace: So for the evaluation, we wanted
00:48:48.570 --> 00:48:52.130
to be able to compare between the source-
based instrumentation and the binary-based
00:48:52.130 --> 00:48:57.460
instrumentation, so we focused mostly on
open-source filesystems and drivers because
00:48:57.460 --> 00:49:02.058
then we could instrument them with a
compiler. We haven't yet tried, but this
00:49:02.058 --> 00:49:05.740
is, like, also pretty high up on the list.
We wanted to try to find some closed-
00:49:05.740 --> 00:49:10.609
source drivers—there's lots of them, like
for GPUs or anything—and we'll give it a
00:49:10.609 --> 00:49:15.460
try and find some 0days, perhaps.
Q: Yes, but with syzkaller, you still have
00:49:15.460 --> 00:49:22.582
a problem. You have to write rules, like,
dictionaries. I mean, you have to
00:49:22.582 --> 00:49:24.599
understand the format, have to communicate
with the driver.
00:49:24.599 --> 00:49:28.550
Nspace: Yeah, right. But there's, for
example, closed-source file systems that
00:49:28.550 --> 00:49:33.270
we are looking at.
Q: Okay. Thinking.
00:49:33.270 --> 00:49:38.657
Herald: Number two.
Q: Hi. Thank you for your talk. So I don't
00:49:38.657 --> 00:49:45.070
know if there are any kCov- or kASan-
equivalent solution to Windows, but I was
00:49:45.070 --> 00:49:49.933
wondering if you tried, or are you
planning to do it on Windows, the
00:49:49.933 --> 00:49:52.540
framework? Because I know it might be
challenging because of the driver
00:49:52.540 --> 00:49:56.849
signature enforcement and PatchGuard, but
I wondered if you tried or thought about
00:49:56.849 --> 00:49:59.290
it.
gannimo: Yes, we thought about it and we
00:49:59.290 --> 00:50:06.383
decided against it. Windows is incredibly
hard and we are academics. The research I
00:50:06.383 --> 00:50:11.800
do in my lab, or we do in my research lab,
focuses on predominantly open-source
00:50:11.800 --> 00:50:17.060
software and empowers open-source
software. Doing full support for Microsoft
00:50:17.060 --> 00:50:20.780
Windows is somewhat out of scope. If
somebody wants to port these tools, we are
00:50:20.780 --> 00:50:24.190
happy to hear it and work with these
people. But it's a lot of additional
00:50:24.190 --> 00:50:28.530
engineering effort, versus very low
additional research
00:50:28.530 --> 00:50:33.060
value, so we'll have to find some form of
compromise. And, like, if you would be
00:50:33.060 --> 00:50:38.650
willing to fund us, we would go ahead. But
it's—yeah, it's a cost question.
00:50:38.650 --> 00:50:42.089
Q: And you're referring both to kernel and
user space, right?
00:50:42.089 --> 00:50:45.089
gannimo: Yeah.
Q: Okay. Thank you.
00:50:45.089 --> 00:50:48.105
Herald: Number five.
Q: Hi, thanks for the talk. This seems
00:50:48.105 --> 00:50:52.400
most interesting if you're looking for
vulnerabilities in closed source kernel
00:50:52.400 --> 00:50:58.359
modules, but not giving it too much
thought, it seems it's really trivial to
00:50:58.359 --> 00:51:01.920
prevent this if you're writing a closed
source module.
00:51:01.920 --> 00:51:07.130
gannimo: Well, how would you prevent this?
Q: Well, for starters, you would just take
00:51:07.130 --> 00:51:11.492
a difference between the address of two
functions. That's not gonna be IP
00:51:11.492 --> 00:51:15.860
relative, so...
Nspace: Right. So we explicitly—like, even
00:51:15.860 --> 00:51:21.589
in the original RetroWrite paper—we
explicitly decided to not try to deal with
00:51:21.589 --> 00:51:25.777
obfuscated code, or code that is
purposefully trying to defeat this kind of
00:51:25.777 --> 00:51:30.510
rewriting. Because, like, the assumption
is that first of all, there are techniques
00:51:30.510 --> 00:51:34.099
to, like, deobfuscate code or remove
these, like, checks in some way, but this
00:51:34.099 --> 00:51:39.510
is, like, sort of orthogonal work. And at
the same time, I guess most drivers are
00:51:39.510 --> 00:51:43.980
not really compiled with the sort of
obfuscation; they're just, like, you know,
00:51:43.980 --> 00:51:47.657
they're compiled with a regular compiler.
But yeah, of course, this is, like, a
00:51:47.657 --> 00:51:50.070
limitation.
gannimo: They're likely stripped, but not
00:51:50.070 --> 00:51:54.281
necessarily obfuscated. At least from what
we've seen when we looked at binary-only
00:51:54.281 --> 00:51:58.980
drivers.
Herald: Microphone number two.
00:51:58.980 --> 00:52:04.350
Q: How do you decide where to place the
red zones? From what I heard, you talked
00:52:04.350 --> 00:52:10.030
about instrumenting the allocators, but,
well, there are a lot of variables on the
00:52:10.030 --> 00:52:13.270
stack, so how do you deal with those?
gannimo: Oh, yeah, that's actually super
00:52:13.270 --> 00:52:20.159
cool. I refer to some extent to the paper
that is on the GitHub repo as well. If you
00:52:20.159 --> 00:52:26.778
think about it, modern compilers use
canaries for buffers. Are you aware of
00:52:26.778 --> 00:52:31.150
stack canaries—how stack canaries work?
So, stack canaries—like, if the compiler
00:52:31.150 --> 00:52:34.440
sees there's a buffer that may be
overflown, it places a stack canary
00:52:34.440 --> 00:52:39.740
between the buffer and any other data.
What we use is we—as part of our analysis
00:52:39.740 --> 00:52:44.750
tool, we find these stack canaries, remove
the code that does the stack canary, and
00:52:44.750 --> 00:52:49.420
use this space to place our red zones. So
we actually hijack the stack canaries,
00:52:49.420 --> 00:52:54.569
remove that code, and add ASan red zones
into the empty stack canaries that are now
00:52:54.569 --> 00:52:58.599
there. It's actually a super cool
optimization because we piggyback on what
00:52:58.599 --> 00:53:02.630
kind of work the compiler already did for
us before, and we can then leverage that
00:53:02.630 --> 00:53:06.780
to gain additional benefits and protect
the stack as well.
00:53:06.780 --> 00:53:11.120
Q: Thanks.
Angel: Another question from the Internet.
00:53:16.039 --> 00:53:20.920
Q: Yes. Did you consider lifting the
binary code to LLVM IR instead of
00:53:20.920 --> 00:53:28.370
generating assembler source?
gannimo: Yes. laughter But, so—a little
00:53:28.370 --> 00:53:32.060
bit longer answer. Yes, we did consider
that. Yes, it would be super nice to lift
00:53:32.060 --> 00:53:38.710
to LLVM IR. We've actually looked into
this. It's incredibly hard. It's
00:53:38.710 --> 00:53:42.270
incredibly complex. There's no direct
mapping between the machine code
00:53:42.270 --> 00:53:48.490
equivalent and the LLVM IR. You would
still need to recover all the types. So
00:53:48.490 --> 00:53:51.800
it's like this magic dream that you
recover full LLVM IR, then do heavyweight
00:53:51.800 --> 00:53:57.470
transformations on top of it. But this is
incredibly hard because if you compile
00:53:57.470 --> 00:54:03.570
down from LLVM IR to machine code, you
lose a massive amount of information. You
00:54:03.570 --> 00:54:07.150
would have to find a way to recover all of
that information, which is pretty much
00:54:07.150 --> 00:54:14.990
impossible and undecidable for many cases.
So for example, just as a note, we only
00:54:14.990 --> 00:54:19.420
recover control flow and we only
desymbolize control flow. For data
00:54:19.420 --> 00:54:23.030
references—we don't support
instrumentation of data references yet
00:54:23.030 --> 00:54:28.839
because there's still an undecidable
problem that we are facing. I can
00:54:28.839 --> 00:54:32.859
talk more about this offline, or there is
a note in the paper as well. So this is
00:54:32.859 --> 00:54:37.270
just a small problem. Only if you're
lifting to assembly files. If you're
00:54:37.270 --> 00:54:41.700
lifting to LLVM IR, you would have to do
full end-to-end type recovery, which is
00:54:41.700 --> 00:54:46.400
massively more complicated. Yes, it would
be super nice. Unfortunately, it is
00:54:46.400 --> 00:54:50.530
undecidable and really, really hard. So
you can come up with some heuristics, but
00:54:50.530 --> 00:54:55.270
there is no solution that will do this
in—that will be correct 100 percent of the
00:54:55.270 --> 00:54:57.490
time.
Angel: We'll take one more question from
00:54:57.490 --> 00:55:02.609
microphone number six.
Q: Thank you for your talk. What kind of
00:55:02.609 --> 00:55:07.299
disassemblers did you use for RetroWrite,
and did you have problems with the wrong
00:55:07.299 --> 00:55:12.880
disassembly? And if so, how did you handle
it?
00:55:12.880 --> 00:55:18.790
Nspace: So, for RetroWrite we used
Capstone for the disassembly.
00:55:18.790 --> 00:55:24.150
gannimo: An amazing tool, by the way.
Nspace: Yeah. So the idea is that, like,
00:55:24.150 --> 00:55:30.240
we need some kind of—some information
about where the functions are. So for the
00:55:30.240 --> 00:55:33.549
kernel modules, this is actually fine
because kernel modules come with this sort
00:55:33.549 --> 00:55:37.730
of information because the kernel needs
it, to build stack traces, for example.
00:55:37.730 --> 00:55:41.869
For userspace binaries, this is somewhat
less common, but you can use another tool
00:55:41.869 --> 00:55:46.170
to try to do function identification. And
we do, like—sort of, like, disassemble the
00:55:46.170 --> 00:55:54.500
entire function. So we have run into some
issues with AT&T syntax, because
00:55:54.500 --> 00:55:59.650
we wanted to use gas, the GNU
assembler, for, for...
00:55:59.650 --> 00:56:04.240
gannimo: Reassembling.
Nspace: Reassembly, yeah. And some
00:56:04.240 --> 00:56:09.819
instructions are—you can express two
different instructions,
00:56:09.819 --> 00:56:15.670
like a five-byte NOP and a six-byte NOP,
using the same text string—a mnemonic,
00:56:15.670 --> 00:56:19.970
an operand string. But the problem is
that, like, the kernel doesn't like it and
00:56:19.970 --> 00:56:21.970
crashes. This took me like two days to
debug.
00:56:21.970 --> 00:56:27.640
gannimo: So the kernel uses dynamic binary
patching when it runs, at runtime, and it
00:56:27.640 --> 00:56:32.980
uses fixed offsets, so if you replace a
five-byte NOP with a six-byte NOP or vice
00:56:32.980 --> 00:56:37.830
versa, your offsets change and your kernel
just blows up in your face.
00:56:37.830 --> 00:56:43.099
Q: So it was kind of a case-by-case basis
where you saw the errors coming out of the
00:56:43.099 --> 00:56:47.920
disassembly and you had to fix it?
Nspace: So sorry, can you repeat the
00:56:47.920 --> 00:56:51.030
question?
Q: Like, for example, if you—if some
00:56:51.030 --> 00:56:54.910
instruction is not supported by the
disassembler, so you saw that it crashed,
00:56:54.910 --> 00:56:58.000
that there's something wrong, and then you
fix it by hand?
00:56:58.000 --> 00:57:02.940
Nspace: Yeah, well, if we saw that there
was a problem with it, this—like, I don't
00:57:02.940 --> 00:57:06.960
recall having any unknown instructions in
the disassembler. I don't think I've ever
00:57:06.960 --> 00:57:11.290
had a problem with that. But yeah, this
was a lot of, like, you know, engineering
00:57:11.290 --> 00:57:14.290
work.
gannimo: So let me repeat. The problem was
00:57:14.290 --> 00:57:19.220
not a bug in the disassembler, but an
issue with the instruction format—that the
00:57:19.220 --> 00:57:24.530
same mnemonic can be translated into two
different instructions, one of which was
00:57:24.530 --> 00:57:29.089
five bytes long, the other one was six
bytes long. Both used the exact same
00:57:29.089 --> 00:57:32.880
mnemonic. Right, so this was an issue with
assembly encoding.
00:57:32.880 --> 00:57:38.290
Q: But you had no problems with
unsupported instructions which couldn't be
00:57:38.290 --> 00:57:41.339
disassembled?
Nspace: No, no. Not as far as I know, at
00:57:41.339 --> 00:57:43.339
least.
Angel: We have one more minute, so a very
00:57:43.339 --> 00:57:52.069
short question from microphone number two.
Q: Does it work? Ah. Is your binary
00:57:52.069 --> 00:58:02.020
instrumentation equally powerful as kernel
address space... I mean, kASan? So, does
00:58:02.020 --> 00:58:06.349
it detect all the memory corruptions on
stack, heap and globals?
00:58:06.349 --> 00:58:13.050
gannimo: No globals. But heap—it does all
of them on the heap. There's some slight
00:58:13.050 --> 00:58:20.150
variation on the stack because we have to
piggyback on the canary stuff. As I
00:58:20.150 --> 00:58:23.880
mentioned quickly before, there is no
reflowing and full recovery of data
00:58:23.880 --> 00:58:28.990
layouts. So to get anything on the stack,
we have to piggyback on existing compiler
00:58:28.990 --> 00:58:36.650
extensions like stack canaries. But—so we
don't support intra-object overflows on
00:58:36.650 --> 00:58:40.631
the stack. But we do leverage the stack
canaries to get some stack benefits, which
00:58:40.631 --> 00:58:45.490
is, I don't know, 90, 95 percent there
because the stack canaries are pretty
00:58:45.490 --> 00:58:51.319
good. For heap, we get the same precision.
For globals, we have very limited support.
00:58:51.319 --> 00:58:54.290
Q: Thanks.
Angel: So that's all the time we have for
00:58:54.290 --> 00:58:57.600
this talk. You can find the speakers, I
think, afterwards offline. Please give
00:58:57.600 --> 00:58:59.820
them a big round of applause for an
interesting talk.
00:58:59.820 --> 00:59:03.050
Applause
00:59:03.050 --> 00:59:07.360
36c3 postroll music
00:59:07.360 --> 00:59:29.000
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!