35C3 - Attacking Chrome IPC

0:00 - 0:17

35c3 Preroll music
0:17 - 0:24

Herald: And now Ned Williamson's talk:
"Attacking Chrome Inter Process
0:24 - 0:30

Communication: reliably finding bugs to
escape the Chrome sandbox". He will be
0:30 - 0:33

talking about finding bugs in the Chrome
inter-process communication in order to
0:33 - 0:40

escape from the sandbox using a fuzzing
method to enumerate the attack surface of
0:40 - 0:47

the Chrome inter-process communication.
Ned is a vulnerability researcher. He
0:47 - 0:52

likes C and C++ vulnerabilities. He did
research for consoles and browsers and now
0:52 - 0:57

started to work on mobile devices. Please
welcome him with a huge round of applause.
0:57 - 1:04

Applause
1:04 - 1:10

Ned: All right. Hello everyone. My name is
Ned and today I'll be talking about Chrome
1:10 - 1:17

IPC. And actually as I was writing this
talk, I kind of came up with this idea to
1:17 - 1:22

make it more usable for everyone and the
way I ended up doing this was by trying to
1:22 - 1:26

start really general and then kind of
going more and more specific all the way
1:26 - 1:32

down to the Chrome IPC fuzzing. So if
you're really technical the end will be
1:32 - 1:36

still interesting and then if you're new
to this stuff hopefully the beginning part
1:36 - 1:45

will show some of how to get started. So
just a quick overview about me. I've
1:45 - 1:50

mostly been spending the last several
years on low level vulnerability research
1:50 - 1:56

and my particular interest is on any kind
of critical bugs. Meaning kind of the more
1:56 - 2:00

severe individual bug the more interesting
to me. So I'm trying to kind of solve this
2:00 - 2:07

problem of: How do we make the bug finding
process effective enough to bring out
2:07 - 2:13

these really rare hidden bugs? And you will
see an example of how to do that by the
2:13 - 2:19

end. But just an overview. I've basically
worked on four things the first being CTFs
2:19 - 2:26

then went to 3DS and Chrome. Now I'm
starting on XNU, but just a month ago
2:26 - 2:33

so, not too much yet. Before we get into it
I just want to give a little recap of what
2:33 - 2:40

happened since last time. So I was part of
the Nintendo hacking talk two years ago
2:40 - 2:52

here and I presented two exploits called
Soundhax and Fasthax and not to go into it
2:52 - 2:56

too much but I did want to share like what
happened here because I was actually
2:56 - 3:01

surprised. I put Google analytics on the
Soundhax website and I thought like maybe
3:01 - 3:06

a thousand people use it or something but
I just looked at the stats like a couple
3:06 - 3:12

weeks ago and then turned out like 100k
people used it or something. And then I
3:12 - 3:19

searched YouTube and found like these huge
videos where they were copied so like I
3:19 - 3:23

wanted to have a screenshot but it's
copyrighted so didn't do that, but
3:23 - 3:28

basically it looks like something on the
order of about a million users which is
3:28 - 3:34

crazy because this is one of my intro
projects really so I think like this
3:34 - 3:38

should show you that you don't have to be
all the way up onto Chrome or whatever it
3:38 - 3:44

is to get into this and do some huge fun
project. And then I just wanted to
3:44 - 3:50

publicly talk about the donations because
I had a donation link on the Soundhax
3:50 - 3:56

website and we fortunately received about
a thousand dollars in donations and then
3:56 - 4:03

half of that went to the emulator people
because they, uh, that's how I eventually
4:03 - 4:08

wrote my exploit for Soundhax so it made
sense to repay that, and then the other
4:08 - 4:14

half went to buying switches for like the
toolchain developers who couldn't afford
4:14 - 4:22

it and so just wanted to thank everyone
who used this or whoever donated. Just a
4:22 - 4:27

shout out. So we will get into the actual
meat of the talk. So basically I want to
4:27 - 4:33

focus on the bug finding process, not
exploitation necessarily because this
4:33 - 4:39

topic is pretty well explored I think and
I think the bug hunting aspect is kind of
4:39 - 4:44

what's the most prohibitive for people to
join in. And when I look at the number of
4:44 - 4:49

people who I play CTF with who are really
good at exploitation and then a number of
4:49 - 4:55

like these prolific bug hunters it just
seems like from what I see from how smart
4:55 - 4:58

people are there should be more people
doing the bug hunting and I hope that if I
4:58 - 5:06

can talk about it more people can come
over. So with that, the agenda will be
5:06 - 5:13

just overall how do you make a process to
achieve any goal. Then next how do you
5:13 - 5:19

apply this kind of, some kind of strategy
to bug hunting, then this new fuzzing
5:19 - 5:24

style I've been kind of developing, some
other people out in the industry have
5:24 - 5:31

been working on. And then finally how does
this all tie back to Chrome IPC. So also
5:31 - 5:36

just to mention - I should mention that
the bug I'll be showing in this
5:36 - 5:41

presentation was used in a full chain
exploit that I developed with a couple
5:41 - 5:47

other people and the details of the
exploitation of that will be discussed at
5:47 - 5:51

OffensiveCon, so that's also here in
Germany and hopefully people will check it
5:51 - 5:56

out. So how do you become an expert in
anything and I kind of was thinking this
5:56 - 6:01

before I even started anything and I was
like in the CTF stage and I was just kind
6:01 - 6:06

of curious like if I approach this with
the mindset of there's this arbitrary
6:06 - 6:11

skill I want to learn. And if I approach
it strategically like what's going to
6:11 - 6:16

happen. So I looked into this expert
research and then there's kind of this
6:16 - 6:22

idea of pop psych like you need to study
something deliberately for 10000 hours to
6:22 - 6:27

get good at it. And you know there's some
debate about this number it's kind of made
6:27 - 6:31

up, I guess, but the essential idea of
deliberate practice I think is very
6:31 - 6:39

useful. And it's exactly how I structured
my study. And so what deliberate means is,
6:39 - 6:45

when you're learning you want to be
thinking like purposefully like I want to
6:45 - 6:50

make sure that the project that I'm doing
is making me get better. I want to be
6:50 - 6:57

actually thinking about how I'm
structuring my training and then you want
6:57 - 7:00

to make sure that you're kind of always
struggling because that's just how you're
7:00 - 7:06

growing. So essentially to do this you
just need to keep picking projects that
7:06 - 7:12

have some like success and failure
feedback mechanism that's tied to the real
7:12 - 7:16

world and you know with bug hunting like
this is very obvious you know you're
7:16 - 7:21

either finding a bug or not. And as I
mentioned you want something difficult but
7:21 - 7:26

achievable. And so this kind of order that
I did the different projects I mentioned
7:26 - 7:32

in the beginning was specifically chosen
so that each stage would be achievable to
7:32 - 7:39

me. But also like really really stretching
what I could do. And there's a funny
7:39 - 7:44

anecdote there's this guy named Ben
Franklin from American history. And I read
7:44 - 7:49

the story that he used to be really bad at
writing and wanted to get better. So the
7:49 - 7:54

way he did it was he took an essay that
looked perfect to him and then he took
7:54 - 8:00

notes on it and then a week later he
rewrote the essay from the notes and then
8:00 - 8:05

he would just compare the goal versus what
he had done and basically saw all the
8:05 - 8:12

shortcomings. And so that kind of just
stuck in my head. And so I'll show like
8:12 - 8:19

how do you apply this kind of trick to bug
finding practice. And then just another
8:19 - 8:26

thing with setting goals for bug hunting.
A lot of it is psychological. I think it's
8:26 - 8:34

almost psychological more than
intelligence, for sure. And basically you
8:34 - 8:39

want to iteratively pick harder and harder
projects so that your tolerance for failure goes
8:39 - 8:45

up and up. And so by the time I was working
on Chrome I worked on it every day for six
8:45 - 8:50

months, like right home from work until 1:00
AM, sleep, up, and then all day every
8:50 - 8:56

weekend and found nothing the whole time.
And then just one day found something and
8:56 - 9:02

then from there all that accumulated
struggle and effort, when the bug
9:02 - 9:07

precipitated it was just like a sign that
all of these necessary skills were there
9:07 - 9:13

and then I was able to repeat it. And so now
we'll talk about what that actually looked
9:13 - 9:20

like for bug hunting. So when you think
about how to train the skill I think
9:20 - 9:25

there's kind of two constituent skills
that are important and those are knowing
9:25 - 9:31

where to look and then recognizing the bug
when you're looking at it. And this first
9:31 - 9:37

part is just from my own experience it
seemed like just being a developer it's
9:37 - 9:42

pretty easy to get a sense for, you know,
you can look at the Git logs like I'm
9:42 - 9:48

mentioning here: are there crashes happening
somewhere in the library, are bugs getting
9:48 - 9:54

reported publicly? Does the code look bad?
You know it's not hard to tell that
9:54 - 9:58

something looks sketchy. But I think
what's really hard is getting the bug to
9:58 - 10:05

kind of come out. And so that's where I'll
talk about strategy kind of directly. And
10:05 - 10:15

so I kind of have this training idea, where
essentially once you have this kind of
10:15 - 10:18

target in mind where it's a little bit out of
your skill range but you think it's
10:18 - 10:24

doable, you try to enumerate all the
existing bug reports and then look through
10:24 - 10:28

each of them and then it's this like Ben
Franklin idea like you take the bug and
10:28 - 10:33

then you look at it. Usually there will be
like you know this block of text and
10:33 - 10:37

they're mentioning like the file where
it's happening and stuff and you can kind
10:37 - 10:44

of skim it and sense where the bug is
without actually looking at what it
10:44 - 10:49

is. And so then you go over and you try to
find it yourself. And you know it's really
10:49 - 10:54

important that you actively try to look
for the bug yourself and kind of strain
10:54 - 10:58

yourself and when you've given up
essentially then you look at what was the
10:58 - 11:04

bug. And then through that struggle it's
usually pretty clear like what was the
11:04 - 11:10

fundamental thing you were missing. And
you know just by repeating this process
11:10 - 11:15

constantly this is how you train. And so
this is actually how I first ever started
11:15 - 11:20

on bug hunting was you know, some of you
may know j00ru he's this like, really
11:20 - 11:25

talented researcher. He's been at it for a
long time and I remember seeing this blog
11:25 - 11:29

post from him showing all these IDA pro
bugs and it just kind of blew my mind.
11:29 - 11:36

Like wow someone took IDA and found like
security vulnerabilities in it and then
11:36 - 11:40

when I looked at the bug reports they're
pretty small so I thought OK how do I
11:40 - 11:47

practice and how could I have done this
myself. So basically the first day, you
11:47 - 11:51

know they're all like integer overflow
bugs I could barely even, like, I knew
11:51 - 11:57

what integer overflow was, but I hadn't
actively looked for it before. And so I
11:57 - 12:01

was looking at the function and I couldn't
find it. Basically I went to sleep feeling
12:01 - 12:05

like oh god like "I'll never be able to do
this stuff." And then the next day I looked
12:05 - 12:10

at it again. I was like "Oh yeah that's
actually easy". And then I kind of failed
12:10 - 12:14

the second one and then by the third day I
was like able to just see where they were
12:14 - 12:20

once you know I knew where to look. So
that kind of made me think "OK, I'll just
12:20 - 12:25

keep doing this for a long time and keep
doing harder and harder." So this is
12:25 - 12:31

essentially the strategy. Like, I think, you
know, I'm probably the perfect example of
12:31 - 12:37

someone who was like an intermediate CTF
player, really like insecure or whatever
12:37 - 12:43

like and just wanted to get into this but
I had no idea what I was doing. And I just
12:43 - 12:46

kept thinking if I just believe in this
kind of process you know hopefully it
12:46 - 12:52

works out. And so here's just like a
little really basic roadmap if you want to
12:52 - 13:00

try to replicate what I did which is to
focus on CTF because if you can do CTF
13:00 - 13:05

binary problems these are perfect examples
of a kind of training where you try to do
13:05 - 13:10

something yourself. There's a write up and
like once you can do these problems you
13:10 - 13:14

know all the kind of low level details
that are needed. You know what a bug is.
13:14 - 13:18

Things like that. And then from there you
just kind of progressively do harder and
13:18 - 13:24

harder targets. And so there's kind of
this component where like you know, I
13:24 - 13:29

don't - you can't really assess your own
ability like how much of this is innate or
13:29 - 13:37

something and it just seemed to me that
regardless of that, like I'm saying here,
13:37 - 13:42

you know, "This isn't chess where you have
people trained from birth with perfect
13:42 - 13:48

study and decades of, you know, like we're
barely figuring this stuff out and it's
13:48 - 13:54

just kind of a huge mess." And so there's
plenty of room for new people to join in.
13:54 - 13:59

And then also there's a lot of these kind
of stories about people who are just
13:59 - 14:04

insanely naturally gifted and stuff. And I
tried really hard to like, look into what
14:04 - 14:09

these people are actually doing and I
haven't found a case where someone wasn't
14:09 - 14:15

working extremely hard. And so you know
just keep that in mind. So just for the
14:15 - 14:19

sake of time I won't go into this too much
but if you're looking at the slides later
14:19 - 14:24

I just kind of give more detail on like
how I pick the mini projects and got down
14:24 - 14:32

to Chrome. So now let's talk about fuzzing
and so before I get into it, I should
14:32 - 14:37

emphasize that you should really know how
to do auditing and the first couple of
14:37 - 14:44

years like not until into the six months
of failure on Chrome, you know, I was
14:44 - 14:50

doing auditing the whole time. And I think
fuzzing gets a bad rap because people
14:50 - 14:55

think that these are unrelated strategies
and people are only a fuzzer person or an
14:55 - 15:03

auditor person. And really I think these
things work
15:03 - 15:07

really well together. But you can't really
know why fuzzing is failing, or how to
15:07 - 15:15

even apply it or where to apply it without
being able to audit yourself. And part of
15:15 - 15:20

this was like, I noticed on Chrome that I
could audit things but essentially the bug
15:20 - 15:24

density was so low on the sandbox attack
surface that I needed a way to kind of
15:24 - 15:32

automate what I was looking for in each
subsystem I was looking at. So you know
15:32 - 15:37

you have like 20 subsystems that you want
to read, well you know it takes about a
15:37 - 15:42

week each minimum to learn. It's a lot
faster to try to fuzz for like a day or
15:42 - 15:48

two each thing and then... I don't know
like it's... I can't explain it it just
15:48 - 15:54

did random things and then this is what
worked. So. So how would you practice
15:54 - 16:00

fuzzing. It's really the same idea that I
had about auditing where you take a bug
16:00 - 16:05

and just ask yourself like how would I
have written a fuzzer in the first place
16:05 - 16:15

to hit the bug. How could I have known to
write the fuzzer that would have triggered
16:15 - 16:20

this. Am I lacking something in auditing
ability? Am I not able to write fuzzers
16:20 - 16:25

well enough and actually it took me
probably like a year of fuzzer writing to
16:25 - 16:32

get good enough where I could actually act
on my ideas, like, just it's kind of
16:32 - 16:40

tricky. And so we'll get back to it later
but this exact idea of practicing fuzzing
16:40 - 16:45

on something that looks un-fuzzable is how
I found this real exploitable sandbox
16:45 - 16:52

escape. So really quick, just for those of
you who don't know too much about fuzzing,
16:52 - 16:57

at least in like the current meta.
Essentially there's this tool called AFL
16:57 - 17:05

that came out in 2014 which I think really
shifted how well fuzzing worked. And the
17:05 - 17:09

idea is essentially that you have some
corpus of inputs that you want to fuzz and
17:09 - 17:14

then as you're mutating them you're
looking for coverage feedback which is
17:14 - 17:19

compiled into your code and then as you're
mutating and running new test cases when
17:19 - 17:25

you find new coverage you take that input
and put in your corpus and over time your
17:25 - 17:31

corpus kind of grows and grows as more
coverage is hit. And so there's... this
17:31 - 17:37

just seems to work really well and then
there's another version of this basically
17:37 - 17:44

called libFuzzer, and this is just written
by the LLVM project and the same people
17:44 - 17:50

who wrote AddressSanitizer also wrote
libFuzzer and just in my experience it's
17:50 - 17:53

written in a way that's a lot more
extensible and easy to understand and play
17:53 - 18:01

with. And so it makes it kind of easier to
audit and fuzz together. And so if you
18:01 - 18:06

want to think about what fuzzing is,
essentially you're trying to replicate the
18:06 - 18:13

normal testing process, but kind of
parameterizing like what a unit test would
18:13 - 18:18

be doing with some input bytes, that you're
just feeding into something and seeing if
18:18 - 18:25

it crashes. And so what's interesting is
there's kind of this gap in the middle, between
18:25 - 18:29

an end-to-end test, which AFL will give you
(you just feed it a binary), and
18:29 - 18:34

the unit test which libFuzzer will give
you where you just keep stuffing bytes
18:34 - 18:39

into a parser and real security
vulnerabilities are kind of logical in
18:39 - 18:45

nature. And I think that's why people
think that fuzzing isn't applicable and I
18:45 - 18:50

think there's actually kind of this part
in the middle where if you see a few
18:50 - 18:54

components that look suspicious and then
you can integrate them and fuzz them in
18:54 - 19:00

isolation, but have the complexity that
you'd kind of see in the real program,
19:00 - 19:08

that's where a lot of bugs come out. And
so how we do this is using a grammar. And
19:08 - 19:14

so essentially it's combining generative
fuzzing with coverage guided fuzzing and
19:14 - 19:22

so we'll touch on how that works in a
minute, but just for some more evidence on
19:22 - 19:27

why does this work well, like I'm not the
only person who is doing this. Kind of
19:27 - 19:35

simultaneously myself and two other people
I guess seem to have stumbled across this
19:35 - 19:42

idea last year or two years ago, and those
are Syzkaller and Lokihardt. So Syzkaller
19:42 - 19:48

is a kind of fully automated Linux kernel
fuzzer. And if you guys haven't seen this
19:48 - 19:54

it's kind of hilarious like essentially
they are automatically generating zero day
19:54 - 20:01

bugs, like tens per month at least and
they automatically generate the test case,
20:01 - 20:06

like submit the report when the commit
comes in it's like automatically tracked.
20:06 - 20:10

It's basically this 0-day generator
sitting there. laughter Yeah I know! And
20:10 - 20:16

I see this. I'm like OK there's 3000 bugs
that are being found. There's a web app
20:16 - 20:21

for it and you can just download it, you
know. And I saw the Linux talk from the
20:21 - 20:25

author of Syzkaller and the YouTube video
has like 100 views and stuff. I'm just
20:25 - 20:31

like OK, so people need to reiterate how
important this stuff is. So then there's
20:31 - 20:37

Lokihardt as well who's like a famous,
extremely talented, kind of canonical
20:37 - 20:46

auditing person and he seems to be doing a
very similar thing with Chakra and V8 and
20:46 - 20:52

he's finding like tens of interesting
exploitable bugs. And then there's me who
20:52 - 20:57

applied this on the Chrome sandbox and
found over 30 bugs, about half of which
20:57 - 21:03

are security relevant, and then five of
which were sandbox escapes without renderer
21:03 - 21:08

code execution. So you know this is just
to emphasize like we're finding really
21:08 - 21:14

important things with this technique. And
since I discussed this the first time a
21:14 - 21:21

couple of months ago at PoC conference
it's been used by someone on the Chrome
21:21 - 21:25

security team to fuzz SQLite, and they're
already finding new bugs in the first
21:25 - 21:32

week. So just more of the evidence like
here's the kind of the breakdown of some
21:32 - 21:39

of the bugs I found with this strategy. So
just to highlight a couple of them, or
21:39 - 21:43

maybe three of them. So the first one was
an out of bounds read - just an integer
21:43 - 21:52

overflow in blobs. And this lets you...
you can make a blob and then ask to read
21:52 - 21:56

part of it, and then the offset could have
been negative and there's integer overflow
21:56 - 22:01

- they got the check wrong so it was a
full memory disclosure from the browser
22:01 - 22:08

process. There's also this AppCache use
after free which is what I used in the
22:08 - 22:13

exploit this year. And then finally... I
guess the critical bugs are pretty
22:13 - 22:21

interesting so, two of these I guess the
first pair are in QUIC and the first
22:21 - 22:25

one is a stack buffer overflow with just a
bad packet that comes in over the network.
22:25 - 22:32

So you just browse to an attacker site and
they stack buffer overflow Chrome browser
22:32 - 22:40

process which is outside the sandbox and
it jumped over the stack cookie so that
22:40 - 22:49

was bad. And then there are these block file
cache problems. These were in the HTTP
22:49 - 22:57

caching mechanism which is also in the
privileged process and these were actually
22:57 - 23:03

crashing in the wild for three years and
they didn't know how to... I guess I don't
23:03 - 23:05

know if they didn't have resources or they
didn't know how to address the problem or
23:05 - 23:09

something, but I sent them the test case
and then they closed like four ancient bug
23:09 - 23:16

reports. So you know it just goes
to show that this kind of technique works
23:16 - 23:22

in a variety of really interesting places
that are really important. And so now
23:22 - 23:30

let's get to the boring stuff. So what's
Protobuf. Well Protobuf is this data
23:30 - 23:34

serialisation format from Google, and it
doesn't really matter that it's Protobuf,
23:34 - 23:41

just this idea is you want some kind of...
you want to encode like a little language
23:41 - 23:47

for yourself that expresses what you want
to fuzz at kind of a higher abstraction
23:47 - 23:52

layer than just fuzzing bytes randomly.
And so if any of you have done functional
23:52 - 23:57

programming like, I had been doing stuff
with OCaml and quickcheck for a couple of
23:57 - 24:02

years, and then when I saw this I just
immediately recognized the pattern.
24:02 - 24:07

Essentially what you can do is, you can
create this little tree structure of just
24:07 - 24:14

basic types like enum, you create these
messages, you can just kind of specify
24:14 - 24:22

actions you want your fuzzer to take. And
then next what libprotobuf-mutator will do
24:22 - 24:28

is, it will take the specification you've
written and link it into libFuzzer so that
24:28 - 24:34

it will automatically fuzz and create
these, like, trees that are these kind of
24:34 - 24:37

random ASTs from this little language you
wrote and then you can kind of parse this
24:37 - 24:45

language, which sounds crazy, or harder
than it is, but essentially you can
24:45 - 24:49

generate this highly structured input
which makes it a lot easier to explore
24:49 - 24:59

like, logical type of bugs. So I just
really want to emphasize this strategy can
24:59 - 25:05

be used to fuzz anything and so kind of
this same exact idea is being used to find
25:05 - 25:12

bugs in caching APIs, encrypted networking
protocols, kernels, sandbox, serialisation
25:12 - 25:18

code, stateful systems that have IPC and
network interaction and timing as part of
25:18 - 25:25

it, which is what we'll show at the end.
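
To make the idea concrete, here is a minimal sketch in Python of structured, coverage-guided fuzzing. It is not the Chrome fuzzer and does not use libFuzzer or libprotobuf-mutator: `ToyCache`, its planted "write to a doomed entry" bug, the `Action` type and the hand-rolled coverage set are all assumptions made up for illustration. What it is meant to show is the shape of the technique: test cases are sequences of typed actions rather than raw bytes, the target is driven in-process, and any mutated sequence that reaches new coverage is kept in the corpus.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Coverage = Set[Tuple[str, bool, bool]]


class ToyCrash(Exception):
    """Stands in for a memory-safety crash (for example a sanitizer report)."""


@dataclass(frozen=True)
class Action:
    op: str   # one of OPS
    key: int  # which cache entry the action targets


OPS = ("open", "write", "doom", "close")
KEYS = (0, 1)


class ToyCache:
    """Hypothetical target with a planted bug: writing to a doomed entry."""

    def __init__(self) -> None:
        self.open_entries: Set[int] = set()
        self.doomed: Set[int] = set()
        self.coverage: Coverage = set()

    def apply(self, a: Action) -> None:
        is_open = a.key in self.open_entries
        is_doomed = a.key in self.doomed
        # Crude stand-in for compiled-in coverage: (operation, state) pairs.
        self.coverage.add((a.op, is_open, is_doomed))
        if a.op == "open":
            self.open_entries.add(a.key)
        elif a.op == "close":
            self.open_entries.discard(a.key)
            self.doomed.discard(a.key)
        elif a.op == "doom" and is_open:
            self.doomed.add(a.key)
        elif a.op == "write" and is_open and is_doomed:
            raise ToyCrash(f"write to doomed entry {a.key}")


def run(actions: List[Action]) -> Coverage:
    """Execute one test case in-process and return the coverage it reached."""
    cache = ToyCache()
    for a in actions:
        cache.apply(a)
    return cache.coverage


def random_action(rng: random.Random) -> Action:
    return Action(rng.choice(OPS), rng.choice(KEYS))


def mutate(actions: List[Action], rng: random.Random) -> List[Action]:
    """Structured mutation: insert, replace or delete a whole action."""
    out = list(actions)
    choice = rng.randrange(3)
    if choice == 0 or not out:
        out.insert(rng.randrange(len(out) + 1), random_action(rng))
    elif choice == 1:
        out[rng.randrange(len(out))] = random_action(rng)
    else:
        del out[rng.randrange(len(out))]
    return out[:8]  # keep test cases short


def fuzz(iterations: int, seed: int) -> Optional[List[Action]]:
    """Return the first crashing action sequence found, or None."""
    rng = random.Random(seed)
    corpus: List[List[Action]] = [[]]
    seen: Coverage = set()
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            cov = run(candidate)
        except ToyCrash:
            return candidate
        if not cov <= seen:  # new coverage: keep this input
            seen |= cov
            corpus.append(candidate)
    return None
```

Calling `fuzz(iterations=20000, seed=1)` returns a short action sequence that can be replayed with `run` to reproduce the crash. In the real setup the action types would be Protobuf messages and the mutation and coverage tracking would come from the fuzzing engine instead of being written by hand.
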
And so, like, what's common here? You
25:25 - 25:30

know we just fuzz all of these different
systems in the same way. The idea is, as
25:30 - 25:36

an auditor what you do is you kind of
notice like "okay there's some subsystem,
25:36 - 25:42

like some caching mechanism with a simple
API" and you look at how it's implemented
25:42 - 25:45

and it looks complicated, so you think
"okay, you know if I can write a fuzzer in
25:45 - 25:51

like a few hours for this, you know it
seems like high value". So once you kind
25:51 - 25:56

of play with the API a bit and understand
like how the API works you know you can
25:56 - 26:00

just write this little specification for
the API in Protobuf, and go ahead and
26:00 - 26:11

write the fuzzer. So basically I'll show
how this works on Chrome. So just to make
26:11 - 26:16

sure I cover all of the background
knowledge, for those of you that don't
26:16 - 26:19

really care about fuzzing or don't care
about anything else at least you can get
26:19 - 26:28

bootstrapped on Chrome IPC research. The
basic idea of how the Chrome sandboxing
26:28 - 26:35

situation works is - when I'm saying I'm
finding bugs in the sandbox, like it's really
26:35 - 26:41

finding bugs in the browser process which
are reachable from a sandboxed process and
26:41 - 26:49

so the sandbox itself is just constraining
these renderer tab processes. So they can't
26:49 - 26:54

really do much and then what you want to
do is jump from there to the browser
26:54 - 27:01

process which can do anything. It's a very
common model. Like almost... on 3DS you
27:01 - 27:07

have userland, kernel, then the security
coprocessor; on Linux you might
27:07 - 27:11

have a userland process and then in the
sandbox there's some APIs in the kernel
27:11 - 27:15

you can hit - syscalls you can hit and
basically everything just keeps boiling
27:15 - 27:20

down to "there's some API that you can
look at from the less privileged context".
27:20 - 27:25

And then if you can trigger a bug in that
API you escape, and you know, this kind of
27:25 - 27:33

applies everywhere. And so this idea of
understanding self-contained chunks of
27:33 - 27:39

syscalls in Linux - like hundreds - but
being able to look at and say like okay,
27:39 - 27:47

here are 10 related syscalls. This is a
subsystem that I want to fuzz in isolation
27:47 - 27:52

- like this is kind of how you think about
it. And so if you just want to get started
27:52 - 27:59

on Chrome what you want to do is look at,
"OK what are these endpoints in the
27:59 - 28:05

browser process that I can reach from the
renderer", and then... you don't really have
28:05 - 28:10

to understand how IPC works to do this.
You just have to be able to recognize what
28:10 - 28:16

you're allowed to hit from the renderer to
the browser and what's actually in the
28:16 - 28:20

browser. And so fortunately the Chrome
codebase is pretty well organized so they
28:20 - 28:25

just tell you: you literally just go
into any folder that says browser in it.
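
As a rough illustration of that first step, the sketch below walks a source checkout and lists every directory whose name contains "browser". The function name `find_browser_dirs` and the idea of pointing it at a local checkout root are assumptions for this example, not part of any Chrome tooling.

```python
import os
from typing import List


def find_browser_dirs(src_root: str) -> List[str]:
    """Return paths (relative to src_root) of directories whose name contains 'browser'."""
    hits: List[str] = []
    for dirpath, dirnames, _filenames in os.walk(src_root):
        for name in dirnames:
            if "browser" in name:
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, src_root))
    return sorted(hits)
```

Run against a checkout root, this would be expected to list directories such as the content browser folder mentioned here; each hit is a candidate area of privileged code to read or fuzz.
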
28:25 - 28:32

All of this is outside the sandbox and
prone to sandbox escape. And so most of my
28:32 - 28:39

bugs I found were in this content browser
subsystem kind of thing. But you can look
28:39 - 28:42

anywhere and I think like all these
results I've had in the last year were
28:42 - 28:48

just in one folder. And so you know
there's so many other places where bugs
28:48 - 28:52

can manifest that I didn't even look at.
So basically there's plenty of room for
28:52 - 29:03

more. So just to go in on what I did is -
in this kind of content stuff - is: you
29:03 - 29:08

just want to see where the APIs that are reachable
from the renderer are enumerated, and those
29:08 - 29:17

are in this RenderProcessHostImpl::Init
function. So yeah C++ kind of wordy but
29:17 - 29:22

you get used to it! Basically there's
two places where the APIs are set
29:22 - 29:28

up, or the interfaces are exposed. Those
are CreateMessageFilters() and
29:28 - 29:33

RegisterMojoInterfaces() and it took me a
while to realize where these were. Like a
29:33 - 29:40

year or something. But like those are the
key functions to look at. And so I'll skip
29:40 - 29:46

over old style IPC because it's going
away, but it's pretty easy to figure out
29:46 - 29:53

what's going on if you look at it. So I'll
talk a bit about Mojo. So essentially this
29:53 - 30:01

is a new IPC platform that the Chrome team
has developed and the idea is they want
30:01 - 30:08

to, I guess, simplify this process for
developers in terms of defining an
30:08 - 30:12

interface that you want to expose to a
renderer or some other client somewhere
30:12 - 30:19

else, and essentially you write these
little interface files called .mojom and
30:19 - 30:24

then the build system will generate all
the C++ glue for you - you can just like
30:24 - 30:32

subclass something and then it handles all
the mechanics of actually exposing this to
30:32 - 30:36

other processes and so on. And so as a
security researcher you know you don't
30:36 - 30:42

really care about that. All you care about
is "what can I reach" and "how do I know
30:42 - 30:48

what to fuzz" or something. So what I
guess I looked at is just: what are some
30:48 - 30:54

of the .mojom files that are subclassed in
this content/browser and you can
30:54 - 31:01

just do little grep to check this. So
essentially the AppCache is one of the
31:01 - 31:09

bugs I found this year. And here's the API
that the renderer can... these are all the
31:09 - 31:14

messages that the renderer can send to the
browser, along with the types of
31:14 - 31:19

arguments. And so you know that's pretty
straightforward. So in the browser process
31:19 - 31:25

this is the code that we're trying to
attack which is the actual C++
31:25 - 31:30

implementation code for this API. And so
you can see they're subclassing there and
31:30 - 31:35

then they just make sure to override all
these virtual functions that actually
31:35 - 31:41

implement the API. And so I won't go too
into detail on this part because it's a
31:41 - 31:48

little boring, but essentially: how does a
renderer get all the way over to
31:48 - 31:55

this kind of browser C++ code? Well it
essentially goes through this request
31:55 - 31:59

mechanism where the renderer tells the
browser process "Hey I have this kind of
31:59 - 32:09

request to access this interface" and then
it'll actually just create that
32:09 - 32:17

DispatcherHost implementation object and
just feed in that request over there. So
32:17 - 32:25

essentially stuff gets glued together
somehow. And then there's this stuff which
32:25 - 32:32

is kind of ugly, but I mean here's where
you're actually exposing the ability to do
32:32 - 32:39

this. So here's where the request
comes in, and then where this request
32:39 - 32:42

handler function gets fed in as that thing
I mentioned earlier - the
32:42 - 32:47

RegisterMojoInterfaces. So it's named
pretty well it's kind of easy to follow.
32:47 - 32:50

And they're adding new stuff constantly
all of this stuff is on the attack
32:50 - 32:55

surface. Like I think I stopped Chrome a
couple of months ago, I think I looked and
32:55 - 33:00

there's like you know five new APIs in
there, they're constantly adding things.
33:00 - 33:08

So just a quick point about this.
Essentially you want to do fuzzing
33:08 - 33:17

in-process with this LibFuzzer + protobuf-mutator
strategy, and you don't want to be
33:17 - 33:22

actually doing IPC - it's just very
brittle and weird. So what you really want
33:22 - 33:28

to do is just like here's the C++ object I
want to just instantiate it and call those
33:28 - 33:35

functions myself and then this whole thing
is just very lightweight and easy to play
33:35 - 33:39

with which is... you know having a
lightweight and very easy to rebuild,
33:39 - 33:45

tweak something and play with it, print
things... like the faster you can iterate
33:45 - 33:49

the better so anything that's too
complicated, like the success rate goes
33:49 - 33:58

way down. So essentially you know the
fuzzer that I made open source is the way
33:58 - 34:03

you should do it. But the way I actually
did it was: I just made the object...
34:03 - 34:08

commented out the private... I don't know
if you can see it on here... Yeah, so just
34:08 - 34:11

commented out the private, created the
object, started calling these things
34:11 - 34:15

randomly, it would crash and I would just
hand fix things and you know it's kind of
34:15 - 34:21

sloppy but you're testing something in a
very small unit that's not really exposed
34:21 - 34:27

to that kind of testing. So now let's kind
of put together everything I've talked
34:27 - 34:35

about so far. So this exploitable AppCache
Use-After-Free I found this year was found
34:35 - 34:44

using this same idea of deliberate
practice. So I looked at this AppCache
34:44 - 34:49

subsystem in the browser process, and I
noticed that there were three old bug
34:49 - 34:54

reports that were triggering memory
corruption and they were pretty
34:54 - 35:00

interesting because they involved
different kind of ways of attacking and
35:00 - 35:05

these things had clearly been audited and
I had actually seen these bugs a couple of
35:05 - 35:11

years ago and I kind of used it as
evidence to myself at the time that
35:11 - 35:16

fuzzing doesn't work and you need
auditing. But it kind of stuck in my head
35:16 - 35:22

and I kept thinking "Someday I'll come
back to this and like I'll overcome it"
35:22 - 35:26

you know. So essentially what's
interesting is - I've already talked
35:26 - 35:31

about - you know - it's easy to specify
the API and just feed IPC messages into
35:31 - 35:35

it. And I think everyone kind of
understands that, who does any IPC
35:35 - 35:40

fuzzing. But then there's also this idea
that you've got some remote server that
35:40 - 35:47

the AppCache thing like creates a network
request, some server's serving the request
35:47 - 35:52

and doing different things, and so on the
second bug it actually matters when things
35:52 - 35:59

were - like when the server was returning
data. Because some jobs like stay alive
35:59 - 36:03

and then if you send an IPC message to
close your session and then the job is
36:03 - 36:07

still alive, there's like a raw pointer
somewhere, and you know something going on
36:07 - 36:13

that it matters that the server keeps the
connection open. And then the last thing
36:13 - 36:21

is just kind of a logical issue. And if
the server returns these HTTP codes in the
36:21 - 36:25

headers of the response in this kind of
weird order, you trigger some logical bug
36:25 - 36:31

that actually leads to memory corruption.
And so you know I looked at this and I
36:31 - 36:38

said OK, well, so what do we need to test
to cover all this? Basically IPC, Network,
36:38 - 36:44

and that timing. And so not only that, but
this is kind of a stateful thing. So we
36:44 - 36:51

want to make sure that for each fuzzing
session that we kind of reset the state
36:51 - 36:56

completely. And fortunately in C++ this
isn't too hard because you just destroy
36:56 - 37:02

the object and if it doesn't exist anymore
what state is there. So you just make sure
37:02 - 37:08

that you don't leave things lingering. So
yes I just had this basic idea: we'll call
37:08 - 37:16

random IPCs with this fuzzed input, we
return random data from the network, and
37:16 - 37:22

then, we reset the state of the cache on
every iteration. And then part of it was
37:22 - 37:27

thinking OK, if I can repro these old bugs
if I reintroduce them by editing the
37:27 - 37:32

source, this is kind of appealing to this
deliberate practice idea that I could have
37:32 - 37:36

written a fuzzer that would trigger these
old things, and this is kind of the idea
37:36 - 37:43

I was pursuing when I actually triggered a
new bug. So now what's tricky about this
37:43 - 37:47

is if you just return random data from the
network you're not going to make much
37:47 - 37:54

progress. And this is kind of where the
auditing background comes in. You want to
37:54 - 38:00

think about what is expressive enough
of... Like, how do I make my fuzzer
38:00 - 38:06

expressive enough that I can hit
everything, but then not so generic that
38:06 - 38:12

it's just spraying... like it's just
noise. And so I'll show how I did that in
38:12 - 38:19

this specification and so at a high level
the kind of root node in the AST or the tree
38:19 - 38:26

of the fuzzer message is this session
message, and then this just contains a
38:26 - 38:32

sequence of commands and so... Commands
are something I also made up, and so the
38:32 - 38:39

first ten of them are all the different
IPC calls I can do, the eleventh one is
38:39 - 38:44

handling any pending requests or pre-
caching like a response to any new
38:44 - 38:49

requests that comes in. So that handles
like both the asynchronous case where it
38:49 - 38:53

makes a request and it's waiting for the
server and also like the synchronous
38:53 - 38:57

version where the response comes
immediately. And then lastly this run
38:57 - 39:05

until idle thing which essentially just...
it helps you... like if you place these
39:05 - 39:12

RunUntilIdles randomly as you're doing
these IPC messages, you're kind of
39:12 - 39:17

flushing the queue of accumulated work.
And so, what this lets you do, is kind of
39:17 - 39:27

identify these race condition type things.
Because you can do something like do a
39:27 - 39:31

bunch of IPCs that come in and are handled
at the same time without actually
39:31 - 39:35

serving... like actually doing the work
yet, and then you do this RunUntilIdle and
39:35 - 39:42

then all the work happens. And you know I
didn't think of this a priori in some
39:42 - 39:47

smart way, I just looked at the unit tests
and I just tried to think about like "OK
39:47 - 39:50

how are these developers already testing
it". And this is what it looked like
39:50 - 39:58

they're doing. So these messages are very
easy to write, essentially just provide
39:58 - 40:03

for each IPC message that I could have
sent to this thing, just make sure all the
40:03 - 40:10

arguments are correct, and then there's a
little bit of cleverness which is like the
40:10 - 40:17

HostID also breaks down to just an
enum of like 0,1,2 because just from
40:17 - 40:23

looking at the code you know that if I'm
randomly creating hosts, destroying them
40:23 - 40:29

and stuff over the whole 4 billion int32
IDs, like it's just going to fall apart
40:29 - 40:36

and not find anything interesting. So I
just constrained that for the URL. That is
40:36 - 40:41

also a custom message that I constrained
to just return a few premade legit URLs so
40:41 - 40:49

that way I'm also not testing the URL
parsing stuff. So then, how do I handle
40:49 - 40:53

the network? Well, I just read the source
and looked at "what are all the types of
40:53 - 40:59

HTTP response codes that affect control
flow?". And I just enumerated them and
40:59 - 41:09

then for any given request, that comes in
from the AppCache system, I just encode
41:09 - 41:14

anything interesting about the response
that I thought of just by reviewing the
41:14 - 41:19

source. And it seems like the things that
mattered were those HTTP codes whether or
41:19 - 41:27

not the headers asked AppCache to do
caching or just download it once. And then
41:27 - 41:35

also the AppCache can request from the
server this manifest file which has some
41:35 - 41:42

metadata about what files that it should
be caching. And so essentially just all of
41:42 - 41:48

this is encoded in one message. And so how
you go from this like high level
41:48 - 41:53

description to actually fuzzing is just
this. So you can see how simple it is.
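
To make the shape of that description concrete, here is a minimal Python sketch of the idea. It assumes a toy target, not the real AppCache code, and plain dataclasses instead of the protobuf messages used in the talk. The names `Session`, `Command`, `HostId`, `ToyBackend` and the three command kinds are stand-ins, and a seeded random generator stands in for the LibFuzzer and protobuf-mutator input.

```python
import random
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Set


class HostId(Enum):
    # Only three IDs, so register/unregister calls collide often.
    HOST_0 = 0
    HOST_1 = 1
    HOST_2 = 2


@dataclass
class Command:
    kind: str                      # "register", "unregister" or "run_until_idle"
    host: HostId = HostId.HOST_0


@dataclass
class Session:
    # Root node of the fuzzer input: a sequence of commands.
    commands: List[Command] = field(default_factory=list)


class ToyBackend:
    """Stand-in for the object under test; work is queued, not run inline."""

    def __init__(self) -> None:
        self.hosts: Set[int] = set()
        self.queue: List[Callable[[], None]] = []

    def register_host(self, host: HostId) -> None:
        self.queue.append(lambda: self.hosts.add(host.value))

    def unregister_host(self, host: HostId) -> None:
        self.queue.append(lambda: self.hosts.discard(host.value))

    def run_until_idle(self) -> None:
        # Flush all accumulated work in arrival order.
        while self.queue:
            self.queue.pop(0)()


def run_session(session: Session) -> ToyBackend:
    backend = ToyBackend()         # fresh object per session: no state carried over
    for cmd in session.commands:
        if cmd.kind == "register":
            backend.register_host(cmd.host)
        elif cmd.kind == "unregister":
            backend.unregister_host(cmd.host)
        else:
            backend.run_until_idle()
    return backend


def random_session(rng: random.Random, length: int) -> Session:
    kinds = ["register", "unregister", "run_until_idle"]
    return Session([Command(rng.choice(kinds), rng.choice(list(HostId)))
                    for _ in range(length)])


rng = random.Random(0)
for _ in range(100):
    run_session(random_session(rng, 8))
```

Because the backend is rebuilt inside `run_session`, every input starts from a clean state, and where `run_until_idle` appears in the sequence decides how much work piles up before it is processed.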
41:53 - 42:01

You're really just... you know I looked at
the unit test code and saw how they set up
42:01 - 42:08

this AppCache service. And so they let you
pass in this URLLoaderFactory, and what
42:08 - 42:18

this is, is just this kind of unit testable
network request thing so this is how I'm
42:18 - 42:22

like, intercepting the network requests
and feeding data. And so I do this little
42:22 - 42:30

setup and then here I just create the one
renderer-to-browser host. This is just kind of
42:30 - 42:37

simulating how you would do the Mojo stuff
if it was the real renderer-to-browser
42:37 - 42:41

interaction, and then I just go through
those commands that I mentioned and just
42:41 - 42:44

do these things. So I mean this is all it
is. You just pull the HostID out of this
42:44 - 42:51

Protobuf message that we're getting at the
top there - that session that I defined as
42:51 - 42:59

the top level TreeNode. And you just go
through and you just call the APIs that
42:59 - 43:04

are there. And so, how to get the network
stuff to work, as I mentioned they have
43:04 - 43:16

this like mock URLLoaderFactory - also C++-y!
But essentially it's this... well okay
43:16 - 43:24

so this is when I basically handle one of
my request messages that I came up with. I
43:24 - 43:28

just simulated a response. This is a built
in like unit test function that they have
43:28 - 43:36

in their codebase and I just pass in the
relevant bits that came from that message.
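
A minimal Python sketch of that kind of test double follows, under the assumption that requests stay pending until the fuzzer decides to answer them. `MockLoaderFactory`, `FakeResponse` and `serve_one` are hypothetical names, not the helper in the Chrome codebase, and the status code used is only an illustrative value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FakeResponse:
    status: int          # only codes that change control flow are worth generating
    cacheable: bool      # whether the headers ask for caching
    body: bytes = b""


class MockLoaderFactory:
    """Hypothetical test double: requests stay pending until the fuzzer serves them."""

    def __init__(self) -> None:
        self.pending: List[Tuple[str, Callable[[FakeResponse], None]]] = []

    def request(self, url: str, on_done: Callable[[FakeResponse], None]) -> None:
        self.pending.append((url, on_done))

    def serve_one(self, response: FakeResponse) -> bool:
        # Serve the oldest pending request, if any; report whether one was served.
        if not self.pending:
            return False
        _url, on_done = self.pending.pop(0)
        on_done(response)
        return True


factory = MockLoaderFactory()
seen: Dict[str, int] = {}
factory.request("http://example.test/manifest",
                lambda r: seen.update(manifest=r.status))
# The fuzzer is free to issue more IPC calls here before answering.
served = factory.serve_one(FakeResponse(status=404, cacheable=False))
print(served, seen)
```

Keeping the response fields down to the few values that matter is the same constraint applied to host IDs and URLs: the input stays expressive without turning into noise.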
43:36 - 43:45

So, yes this is what it looks like. I have
some kind of DoRequest helper function and
43:45 - 43:51

then I just pass my stuff through to it.
And so it takes that URLLoaderFactory and then
43:51 - 43:57

serves responses to anything that's
waiting and then what's interesting here
43:57 - 44:02

and what's necessary to find the bug is
that - I mentioned that this is
44:02 - 44:08

asynchronous - so what will happen is when
you do RegisterHost... and then if I go
44:08 - 44:15

back to.. Yeah, you like register host,
select a cache, do some things - like the
44:15 - 44:20

AppCache will make a request to the server
and then get this manifest, and then it
44:20 - 44:27

will start making requests to download
things and then these things are pending like
44:27 - 44:35

responses that it's waiting for from the
server. And so it actually mattered that
44:35 - 44:40

you mutate the state further before those
responses come in. And so by doing this
44:40 - 44:46

like in between the IPC messages - not
like preloading the network factory with a
44:46 - 44:53

bunch of responses - I'm actually serving
things… Like I'm not encoding an assumption
44:53 - 44:58

about when I'm serving responses. And I
know this is kind of tedious to go so into
44:58 - 45:07

detail, but essentially you run this thing
that's maybe 150 lines or something, and
45:07 - 45:15

then trigger this bug with AddressSanitizer.
And so essentially a Use-After-Free
45:15 - 45:22

happens and what's going
on here is you can see the scoped_refptr
45:22 - 45:31

pointer destructor. And it turns out that
when... yes so when you go to unregister
45:31 - 45:38

the host - like it's an IPC message there
at the bottom that I send - and
45:38 - 45:45

then it just accidentally... This is kind of
inaccurate, this stack trace, but
45:45 - 45:52

essentially some ref count goes from one
to zero, and then it starts destroying
45:52 - 45:58

this AppCache object. And then in the
destructor one of these requests was
45:58 - 46:05

waiting on a response from the server, and
then it essentially gives a reference back
46:05 - 46:12

to that other object. And that's kind of
eliding some details but essentially the
46:12 - 46:17

refcount went back up to 1 and then, now
you're adding a bunch of references all
46:17 - 46:22

over the place to something while it's
being destroyed. And so what happens is
46:22 - 46:28

now you have all these pointers to a freed
object and then you can trigger access to
46:28 - 46:33

that freed thing again later. And so this is
kind of the recipe for an exploitable bug.
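
The pattern can be mimicked in a few lines of Python. This is a toy model of the general idea only (a reference handed out while the destructor is already running), with a made-up `RefCounted` class and a `freed` flag standing in for released memory. The actual AppCache objects and call paths are elided in the talk and are not reproduced here.

```python
from typing import Callable, List, Optional


class RefCounted:
    """Toy manual refcount; `freed` marks memory that would have been released."""

    def __init__(self) -> None:
        self.refs = 1
        self.freed = False
        self.on_destroy: Optional[Callable[[], None]] = None

    def add_ref(self) -> "RefCounted":
        self.refs += 1
        return self

    def release(self) -> None:
        self.refs -= 1
        if self.refs == 0:
            self._destroy()

    def _destroy(self) -> None:
        # Teardown cancels pending work, which may call back into other code.
        if self.on_destroy is not None:
            self.on_destroy()
        # Destruction completes regardless of references taken in the meantime.
        self.freed = True


cache = RefCounted()
holders: List[RefCounted] = []

# Hypothetical pending job: when cancelled during teardown it hands out
# a new reference to the object that is being destroyed.
cache.on_destroy = lambda: holders.append(cache.add_ref())

cache.release()   # 1 -> 0, destructor runs, callback bumps the count back to 1

dangling = [h for h in holders if h.freed]
print(cache.refs, cache.freed, len(dangling))
```

After the last line the count reads 1 although the object is gone, so every entry in `holders` is a reference to freed storage that can be used or released again later.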
46:33 - 46:39

And so I just want to point out that all
of this fuzzer is open source and it's
46:39 - 46:43

just in the Chrome codebase. So if you
download it or go online to the code
46:43 - 46:53

search tool you can search for appcache
fuzzer and it will come up. So then real
46:53 - 47:01

quickly, just to cover the exploitation.
You know I guess I have more time and I
47:01 - 47:07

thought - I compressed this a lot! But
essentially I did this part in a chain
47:07 - 47:13

with two other guys - Saelo and Niklas -
and so Saelo provided the RCE bug. And so
47:13 - 47:18

from there we get code execution in the
renderer, and then this lets us send
47:18 - 47:26

arbitrary IPC messages and so it's kind of
annoying to send IPC with Mojo like
47:26 - 47:33

arbitrarily, so we kind of piggybacked on
the renderer-side glue code for sending
47:33 - 47:37

these AppCache messages. So we just like
found the C++ object and called into it
47:37 - 47:46

and then all in all we end up with this
primitive where we can decref and like
47:46 - 47:52

release a reference to this refcounted
thing. Like after it's been freed multiple
47:52 - 47:59

times. So, there's two stages to
exploiting this - because we're in the
47:59 - 48:05

renderer and we only have one bug, we need
to turn this into a memory disclosure. And
48:05 - 48:09

so you know fortunately this bug can be
triggered repeatedly. And so the idea here
48:09 - 48:18

is triggering it once gives you this
decrement-by-n primitive. And so when
48:18 - 48:22

you're releasing - you know if you ever
hit zero - you'll trigger the destructor
48:22 - 48:27

again. And so essentially what you want to
do for the leak is to not trigger the
48:27 - 48:33

destructor because it will blow up, but
rather find a string somewhere in memory
48:33 - 48:38

where it is a string pointing to the heap
and then decrement the string pointer so
48:38 - 48:43

then it starts like sliding somewhere else
into the heap. So that when you read that
48:43 - 48:50

string back you're actually leaking heap
data. And so, we did that. So there's some
48:50 - 48:55

object that had a standard C++ string at
the beginning. On Windows the first..
48:55 - 49:01

qword is like the, er, pointer
to the string data. So you decrement this.
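
The effect of decrementing a string's data pointer can be shown with a toy memory model in Python. This is a simulation on a byte array under simplified assumptions, not real process memory or a real string layout. `ToyHeap`, `ToyString` and the addresses and contents are invented for the illustration.

```python
class ToyHeap:
    """Flat byte array standing in for process memory."""

    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)

    def write(self, addr: int, data: bytes) -> None:
        self.mem[addr:addr + len(data)] = data

    def read(self, addr: int, length: int) -> bytes:
        return bytes(self.mem[addr:addr + length])


class ToyString:
    """Models a string object whose first field is a pointer to its data."""

    def __init__(self, heap: ToyHeap, data_ptr: int, length: int) -> None:
        self.heap = heap
        self.data_ptr = data_ptr
        self.length = length

    def value(self) -> bytes:
        return self.heap.read(self.data_ptr, self.length)


heap = ToyHeap(64)
heap.write(16, b"NEIGHBOR")          # unrelated heap data
heap.write(24, b"cookie=1")          # the string's real contents
cookie = ToyString(heap, data_ptr=24, length=8)

before = cookie.value()
cookie.data_ptr -= 8                 # the decrement-by-n primitive, with n = 8
after = cookie.value()
print(before, after)
```

Reading the value back after the decrement returns the neighbouring bytes instead of the original contents, which is the whole point of the leak.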
49:01 - 49:06

It is actually a cookie object, so we just
read the cookie back from the browser and
49:06 - 49:12

then in the cookie value we see the leaked
bytes. And then from there there was a
49:12 - 49:21

vtable access that we can control in the
destructor. So we make another fake object
49:21 - 49:26

that looks like it has one reference left.
Make it hit zero so the destructor is
49:26 - 49:32

triggered and then this AppCache thing
gets confused and essentially calls a
49:32 - 49:36

controlled vtable pointer. And then from
there, those are the primitives you need
49:36 - 49:43

to write an exploit and then it was just a
matter of putting it together. And if
49:43 - 49:48

you're curious about that again you should
look forward to Niklas' talk. And so just
49:48 - 49:53

a summary: essentially starting all the
way from the beginning you want to be
49:53 - 50:00

practicing deliberately. Keep working
constantly, and keep identifying gaps and
50:00 - 50:07

actively working to improve. It sounds
weird but you want to keep that in mind.
50:07 - 50:12

And use this new technique with LibFuzzer
and protobuf-mutator - I can promise you
50:12 - 50:20

it's not going to be the last time you see
someone using this. And I mentioned I've
50:20 - 50:28

started on XNU and we'll see some initial
results pretty soon on that. It's working.
50:28 - 50:37

So yeah, and lastly never give up. It may
take months but it's fine. So with that I
50:37 - 50:43

guess I'll open to questions. Yeah, thank
you.
50:43 - 50:47

Applause
50:47 - 50:55

Herald: Okay thank you for that talk. If
you have a question please line up at the
50:55 - 51:00

microphones in the room and try to limit
your question to one single sentence. If
51:00 - 51:04

you would like to leave at this point
please do that as quietly as possible so
51:04 - 51:13

that everyone else can still stay for the
questions. And also if you're listening on
51:13 - 51:28

a stream you can ask a question online.
Seems there is... no question? Ah there is
51:28 - 51:31

one. Microphone number 2, your question
please.
51:31 - 51:38

Mic 2: Hello. I just want to ask why have
you chosen Chrome for bug hunting? Was it
51:38 - 51:41

just like you picked one random browser
and you started?
51:41 - 51:50

Ned: Yeah no. I mean it's basically just
kind of the hardest thing I could think of
51:50 - 51:57

that I could plausibly do? You know just
for the purpose of getting better. And
51:57 - 52:02

there's more to it. I think Chrome the way
it's written is very amenable to research
52:02 - 52:06

and like I actually didn't know C++ before
I worked on Chrome!
52:06 - 52:11

So like looking at a great
example of the C++ code base and learning
52:11 - 52:17

from that was really helpful to me. And
you know I glossed over kind of my path
52:17 - 52:22

but I was actually finding random obscure
library bugs that weren't even reachable
52:22 - 52:27

at first. So, just the quality of Chrome
makes it so that what you're training is the
52:27 - 52:34

real talent not just being able to
decipher bad code. So, I highly recommend
52:34 - 52:40

it. I can say that I definitely feel that
the two years I invested on that one
52:40 - 52:49

project completely helped me get better.
Herald: Thank you. Signal angel, question
52:49 - 52:57

from the internet.
Signal Angel: Hello, there is one question
52:57 - 53:03

from the internet.... inaudible
Ned: So the question is: "is it possible
53:03 - 53:10

to attack using Meltdown or SPECTRE?" I
don't know. I guess it's possible. I was
53:10 - 53:15

essentially focusing only on application
level bugs. So things that I could
53:15 - 53:24

trigger deterministically using only bugs
in the Chrome code itself. And so... I
53:24 - 53:28

mean also those things came along way
after I was doing my research. So you know
53:28 - 53:32

I can't comment on that but I'm sure
someone knows.
53:32 - 53:38

Herald: Thanks. Ned: Yeah.
Herald: I see no more people on the
53:38 - 53:42

microphones or questions on the internet.
Yeah. OK. Thanks for your talk and thanks
53:42 - 53:44

for Q&A.
Ned: Thank you.
53:44 - 53:50

Applause
53:50 - 53:54

postroll music
53:54 - 54:13

subtitles created by c3subtitles.de
in the year 2020. Join, and help us!

Title:: 35C3 - Attacking Chrome IPC
Video Language:: English
Duration:: 54:13
