
35C3 - Taming the Chaos: Can we build systems that actually work?

  • 0:00 - 0:18
    35C3 preroll music
  • 0:18 - 0:29
    Herald angel: OK. Our next speaker will be
    trying to tame the chaos. Peter Sewell
  • 0:29 - 0:34
    from the University of Cambridge. A warm
    round of applause please.
  • 0:34 - 0:38
    applause
  • 0:38 - 0:49
    Sewell: Thank you! Okay. So here we are.
    C3 again with a program full of
  • 0:49 - 0:57
    fascinating and exciting and cool stuff.
    And if we look just at the security talks
  • 0:57 - 1:00
    all manner of interesting things. I'm
    going to go to a lot of these. You should
  • 1:00 - 1:07
    too. Let's see though if we read some of
    the titles: exploiting kernel memory
  • 1:07 - 1:19
    corruptions, attacking end to end e-mail
    encryption. What else have we got: modern
  • 1:19 - 1:32
    Windows user space exploitation,
    compromising online accounts. OK so a lot
  • 1:32 - 1:36
    of these talks, as usual, they're
    explaining to us that our computers,
  • 1:36 - 1:43
    they're not doing what we would like and
    this as usual is the merest tip of a tiny
  • 1:43 - 1:49
    iceberg, alright. We have hundreds of
    thousands of known vulnerabilities and
  • 1:49 - 1:53
    many unknown vulnerabilities. Let me do it
    a bit dark here, but let me try some
  • 1:53 - 1:59
    audience participation. How many of you
    have in the last year written at least a
  • 1:59 - 2:09
    hundred lines of code? And of those people
    keep your hands up if you are completely
  • 2:09 - 2:18
    confident that code does the right thing.
    laughter I would like to talk to you
  • 2:18 - 2:23
    later laughter and so would the people
    in the self driving car that you're
  • 2:23 - 2:31
    probably engineering. So, so how many
    unknown vulnerabilities then. Well you can
  • 2:31 - 2:39
    take what's in your mind right now and
    multiply it by - oh, this doesn't work very
  • 2:39 - 2:49
    well. No. No, no. Computers, they do the
    wrong thing again and again and again.
  • 2:49 - 2:54
    laughter and some applause We can
    multiply by this estimate of about 100
  • 2:54 - 3:01
    billion lines of new code each year. So if
    we take all of this: these talks, these
  • 3:01 - 3:06
    numbers, these should make us not happy
    and excited, these should make us sad,
  • 3:06 - 3:16
    very sad and frankly rather scared of what
    is going on in the world. So what can we
  • 3:16 - 3:26
    do? We can, option one, give up using
    computers altogether. applause In most
  • 3:26 - 3:30
    audiences this will be a hard sell but
    here I'm sure you're all happy to just
  • 3:30 - 3:40
    turn them off now. Or we can throw up our
    hands in the air and collapse in some kind
  • 3:40 - 3:55
    of pit of despair. Well, maybe, maybe not.
    My task today is to give you a tiny
  • 3:55 - 4:00
    smidgen, a very small, a very small amount
    of hope that it may be possible to do
  • 4:00 - 4:04
    slightly better. But if we want to do
    better we first have to understand why
  • 4:04 - 4:11
    things are so bad and if we look at our
    aeronautical engineering colleagues for
  • 4:11 - 4:16
    example, they can make planes which very
    rarely fall out of the sky. Why is it that
  • 4:16 - 4:23
    they can do that and we are so shit at
    making computers? Well, I'm going to reuse
  • 4:23 - 4:28
    a picture that I used at 31C3, which is
    still the best description I can find of
  • 4:28 - 4:36
    the stack of hardware and software that we
    live above. Here we go. It's, there's so
  • 4:36 - 4:40
    much information in this, it's just ace.
    So we see down here all of this
  • 4:40 - 4:45
    transparent silicon, it's the hardware we
    live above. We see a stack of phases of a
  • 4:45 - 4:51
    C compiler, we see all kinds of other
    bits and pieces. There might be a
  • 4:51 - 4:59
    slightly hostile wireless environment in
    this room for some reason. If we look at
  • 4:59 - 5:03
    this and think about the root causes for
    why our systems are so bad we can see
  • 5:03 - 5:08
    terrible things. So the first is,
    obviously there's a lot of legacy
  • 5:08 - 5:13
    complexity that we're really stuck with,
    alright. If you pull out one of those
  • 5:13 - 5:16
    pieces from the middle and try and replace
    it with something which is not of exactly
  • 5:16 - 5:21
    the same shape the whole pile will
    collapse. So we somehow need to find an
  • 5:21 - 5:28
    incremental path to a better world. And
    then, this is the most fundamental reason
  • 5:28 - 5:33
    that these are not like aeroplanes, these
    systems are discrete not continuous. If
  • 5:33 - 5:38
    you take an honest girder made out of a
    piece of steel and you push on it a little
  • 5:38 - 5:44
    bit it moves a little bit, basically in
    proportion. If it is 1 percent too strong,
  • 5:44 - 5:50
    1 percent too weak, basically it doesn't
    matter. But in these things one line of
  • 5:50 - 5:56
    code can mean you're open to a
    catastrophic exploit and one line in many,
  • 5:56 - 6:06
    many million. OK. And next thing... this
    really doesn't work. They're very
  • 6:06 - 6:11
    complicated. But the scary thing is not
    the static complexity of those lines of
  • 6:11 - 6:15
    code and the number of components, although
    that's pretty intimidating; the really
  • 6:15 - 6:21
    scary thing is the exponential number
    of states and execution paths. So these
  • 6:21 - 6:26
    two together mean that the testing
    that we rely on - testing is the only way we
  • 6:26 - 6:31
    have to build systems which are not
    instantly broken - can never save
  • 6:31 - 6:41
    us from these exploitable errors. And that
    means ultimately we need to do proof.
  • 6:41 - 6:46
    Honest machine checked mathematical proof.
    And this also tells us that we need to
  • 6:46 - 6:51
    arrange for our systems to be secure for
    simple reasons that do not depend on the
  • 6:51 - 6:57
    correctness of all of those hundred
    billion lines of code. Then another thing
  • 6:57 - 7:03
    that we have here. All these interfaces:
    the architecture interface, the C language
  • 7:03 - 7:11
    interface, the sockets API interface, the
    TCP interface. All of these we rely on to
  • 7:11 - 7:18
    let different parts of the system be
    engineered by different organizations. But
  • 7:18 - 7:25
    they're all really badly described and
    badly defined. So what you find is, for
  • 7:25 - 7:30
    each of these, typically a prose book
    varying in thickness between about that
  • 7:30 - 7:37
    and about that, full of text. Well, we
    still rely on testing - limited though it
  • 7:37 - 7:44
    is - but you can never test anything
    against those books. So we need instead
  • 7:44 - 7:49
    interface descriptions, definitions,
    specifications, that are more rigorous,
  • 7:49 - 7:54
    mathematically rigorous and that are
    executable - not in the normal sense of
  • 7:54 - 7:59
    executable as an implementation but
    executable as a test oracle. So you can
  • 7:59 - 8:04
    compute whether some observed behaviour is
    allowed or not and not have to read the
  • 8:04 - 8:08
    book and argue on the Internet. And we
    also need interface definitions that
  • 8:08 - 8:20
    support this proof that we need. And then
    all of this stuff was made up in the 1970s
  • 8:20 - 8:26
    or the sixties and in the 70s and the 60s
    the world was a happy place and people
  • 8:26 - 8:32
    walked around with flowers in their hair
    and nothing bad ever happened. And that is
  • 8:32 - 8:36
    no longer the case. And so we need
    defensive design, but we need defensive
  • 8:36 - 8:41
    design that is somehow compatible or can
    be made enough compatible with this legacy
  • 8:41 - 8:47
    investment. That's quite hard. And then -
    I can't say very much about this, but we
  • 8:47 - 8:50
    have at the moment very bad incentives,
    because we have a very strong incentive to
  • 8:50 - 8:55
    make things that are more complex than we
    can possibly handle or understand. We make
  • 8:55 - 9:00
    things, we add features, we make a thing
    which is just about shippable and then we
  • 9:00 - 9:07
    ship it. And then we go on to the next
    thing. So we need economic and legal
  • 9:07 - 9:11
    incentives for simplicity and for security
    that we do not yet have. But it's hard to
  • 9:11 - 9:17
    talk about those until we have the
    technical means to react to those
  • 9:17 - 9:22
    incentives. OK, so what I'm going to do
    now, I'm going to talk about two of these
  • 9:22 - 9:26
    interfaces, the architecture interface and
    the C language interface, and see what we
  • 9:26 - 9:33
    can do to make things better. A lot of
    this has to do with memory. So whoever it
  • 9:33 - 9:38
    was that picked the subtitle for this
    meeting "refreshing memories" thank you,
  • 9:38 - 9:45
    because it's great, I love it. Let's
    refresh your memory quite a way. So I
  • 9:45 - 9:53
    invite you to think back to when you were
    very very young in about 1837 when cool
  • 9:53 - 9:59
    hacking was going on. Charles Babbage was
    designing the Analytical Engine and even
  • 9:59 - 10:03
    then you see there was this dichotomy
    between a mill performing operations and a
  • 10:03 - 10:11
    store holding numbers. This is a plan view
    of the analytical engine, well, it was
  • 10:11 - 10:15
    vaporware, but a design from the
    analytical engine, and you see here these
  • 10:15 - 10:22
    circles these are columns of numbers each
    made out of a stack of cogs, maybe a 50
  • 10:22 - 10:28
    digit decimal number in each of those
    columns. And this array, really, he
  • 10:28 - 10:35
    imagined it going on out to and over there
    somewhere, with about 1000 numbers. So
  • 10:35 - 10:43
    even then you have a memory which is an
    array of numbers. I think these were not-
  • 10:43 - 10:47
    I don't think you could do address
    computation on these things, I think
  • 10:47 - 10:54
    addresses were only constants, but, still,
    basically an array of numbers. So okay
  • 10:54 - 10:59
    what have we got now. Let's look a bit at C.
    How many of those people who have
  • 10:59 - 11:07
    programmed 100 lines of code, how many of
    you were C programmers? Quite a few. Or
  • 11:07 - 11:13
    maybe you're just embarrassed. I forgot to
    say. Those that didn't put your hands up
  • 11:13 - 11:18
    for that first question. You should feel
    proud and you should glory in your
  • 11:18 - 11:26
    innocence while you still have it. You are
    not yet complicit in this trainwreck. It
  • 11:26 - 11:32
    worked then. Okay so here's a little piece
    of C-code which I shall explain. I shall
  • 11:32 - 11:38
    explain it in several different ways. So
    we start out. It creates two locations x
  • 11:38 - 11:53
    and secret... Don't do that. This is so
    bad. Creates x, storing one and secret_key
  • 11:53 - 11:59
    which stores, I might get this wrong, your
    pin. And then there's some computation
  • 11:59 - 12:05
    which is supposed to only operate on x but
    maybe it's a teensy bit buggy or corrupted
  • 12:05 - 12:14
    by somebody and then we try and we make a
    pointer to x and in this execution x just
  • 12:14 - 12:21
    happened to be allocated at 14. This
    pointer contains the number 14 and then we
  • 12:21 - 12:25
    add one to that pointer. It's C so
    actually that adding one to the pointer it
  • 12:25 - 12:33
    really means add the four bytes which are
    the size of that thing. So we add 4 to 14
  • 12:33 - 12:40
    and we get 18. And then we try and read
    the thing which is pointed to. What's
  • 12:40 - 12:49
    going to happen? Let me compile this and
    run it in a conventional way.
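    [Note: a minimal sketch of the kind of C program being described;
    the names, the value standing in for the PIN, and the addresses 14
    and 18 follow the talk, but the exact code on the slide is an
    assumption, and whether the out-of-bounds read really prints the
    secret depends on the compiler and memory layout.]

        #include <stdio.h>

        int main(void) {
            int x = 1;              /* public value                          */
            int secret_key = 4091;  /* stands in for your PIN                */

            /* ...some computation that is only supposed to operate on x,
               but might be a teensy bit buggy or corrupted by somebody... */

            int *p = &x;            /* p holds the address of x, here 14     */
            p = p + 1;              /* C scales by sizeof(int): 14 + 4 = 18  */
            printf("%d\n", *p);     /* undefined behaviour: may read and
                                       print the adjacent secret_key         */
            return 0;
        }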
  • 12:49 - 12:54
    It printed the secret key. Bad. This program, rather
    distressingly - this is characteristic by
  • 12:54 - 12:59
    no means of all security flaws but a very
    disturbingly large fraction of all the
  • 12:59 - 13:11
    security flaws are basically doing this.
    So does C really let you do that? Yes and
  • 13:11 - 13:18
    no. So if you look at the C standard,
    which is one of these beautiful books, it
  • 13:18 - 13:23
    says - you have to read moderately
    carefully to understand this, but it says
  • 13:23 - 13:27
    that this program has undefined behavior.
    And many of you will know what that means
  • 13:27 - 13:33
    but others won't, but, so that means as
    far as the standard is concerned for
  • 13:33 - 13:39
    programs like that there is no constraint
    on the behavior of the implementation at
  • 13:39 - 13:47
    all. Or said another way and maybe a more
    useful way: the standard lets
  • 13:47 - 13:52
    implementations, so C compilers, assume
    that programmers don't have this kind of
  • 13:52 - 14:00
    stuff and it's the programmer's
    responsibility to ensure that. Now in
  • 14:00 - 14:05
    about 1970, 75 maybe 1980, this was a
    really good idea, it gives compilers a lot
  • 14:05 - 14:14
    of flexibility to do efficient
    implementation. Now not so much. How does
  • 14:14 - 14:19
    this work in the definition? So the
    standard somehow has to be able to look at
  • 14:19 - 14:29
    this program and identify it as having
    this undefined behavior. And indeed, well,
  • 14:29 - 14:37
    if we just think about the numbers in
    memory this 18, it points to a perfectly
  • 14:37 - 14:42
    legitimate storage location and that plus
    one was also something that you're
  • 14:42 - 14:48
    definitely allowed to do in C. So the only
    way that we can know that this is
  • 14:48 - 14:54
    undefined behavior is to keep track of the
    original allocation of this pointer. And
  • 14:54 - 14:59
    here I've got these allocation identifiers
    @3, @4, @5, and you see here I've
  • 14:59 - 15:04
    still got this pointer despite the fact
    that I added 4 to it. I still remembered
  • 15:04 - 15:10
    the allocation ID so I can check, or the
    standard can check, when we try and read
  • 15:10 - 15:15
    from this whether that address is within
    the footprint of that original allocation,
  • 15:15 - 15:19
    i.e. is within there and in fact it is
    not, it's over here, it is not within
  • 15:19 - 15:26
    there at all. So this program is deemed to
    have undefined behavior. Just to clarify
  • 15:26 - 15:31
    something people often get confused about. So we
    detect undefined behavior here but it
  • 15:31 - 15:35
    isn't really a temporal thing. The fact
    that there is undefined behavior anywhere
  • 15:35 - 15:42
    in the execution means the whole program
    is toast.
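    [Note: an illustrative sketch only, not the wording of the C standard
    or of the actual model: the point is that in the abstract semantics a
    pointer value carries the allocation it was derived from, so an access
    can be checked against that allocation's footprint even though the
    concrete address, 18, looks perfectly legitimate.]

        #include <stddef.h>

        typedef struct { size_t base, len; } allocation;  /* e.g. @4: base 14, length 4 */

        typedef struct {
            const allocation *alloc;  /* the allocation this pointer was derived from */
            size_t            addr;   /* the concrete address, e.g. 18 after the +1   */
        } model_ptr;

        /* Reading n bytes through p has defined behaviour only if the
           access stays within the footprint of the original allocation. */
        int access_is_defined(model_ptr p, size_t n) {
            return p.addr >= p.alloc->base
                && p.addr + n <= p.alloc->base + p.alloc->len;
        }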
  • 15:42 - 15:46
    Okay, but this is really interesting, because
    we're relying, for this critical part of the standard, on
  • 15:46 - 15:52
    information which is not there at run time
    in a conventional implementation. So just
  • 15:52 - 15:59
    to emphasize that point, if I compile this
    program, let's say to ARM assembly
  • 15:59 - 16:05
    language I get a sequence of store and
    load and add instructions, store, load,
  • 16:05 - 16:12
    add, store, load, load. And if I look at
    what the ARM architecture says can happen
  • 16:12 - 16:17
    these blue transitions... one thing to
    notice is that we can perfectly well do
  • 16:17 - 16:22
    some things out of order. At this point we
    could either do this load or we could do
  • 16:22 - 16:27
    that store. That would be a whole other
    talk. Let's stick with the sequential
  • 16:27 - 16:33
    execution for now. What I want to
    emphasize is that these, this load of this
  • 16:33 - 16:37
    address, really was loading a 64 bit
    number, which was the address of x and
  • 16:37 - 16:44
    adding 4 to it and then loading from that
    address, the secret key. So there is no
  • 16:44 - 16:56
    trace of these allocation IDs. No trace
    at all. Okay, let me step back a little
  • 16:56 - 17:05
    bit. I've been showing you some
    screenshots of C behaviour and ARM
  • 17:05 - 17:12
    behaviour and I claim that these are
    actually showing you all of the behaviours
  • 17:12 - 17:19
    allowed by what one would like the
    standards to be for these two things. And
  • 17:19 - 17:27
    really what I've been showing you are GUIs
    attached to mathematical models that we've
  • 17:27 - 17:31
    built in a big research project for the
    last many years. REMS: "Rigorous
  • 17:31 - 17:35
    Engineering of Mainstream Systems" sounds
    good, aspirational title I think I would
  • 17:35 - 17:42
    say. And in that we've built semantics, so
    mathematical models defining the envelope
  • 17:42 - 17:48
    of all allowed behaviour for a quite big
    fragment of C and for the concurrency
  • 17:48 - 17:54
    architecture of major processors ARM, and
    x86, and RISC-V, and IBM POWER and also
  • 17:54 - 17:58
    for the instruction set architecture of
    these processors, the details of how
  • 17:58 - 18:04
    individual instructions are executed. And
    in each case these are specs that are
  • 18:04 - 18:08
    executable as test oracles, you can
    compare algorithmically some observed
  • 18:08 - 18:13
    behaviour against whether the spec says it's
    allowed or not, which you can't do with
  • 18:13 - 18:19
    those books. So this is just an idiot-simple
    idea, but for some reason the industry has
  • 18:19 - 18:27
    not taken it on board any time in the last
    five decades. It's not that hard to do.
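    [Note: a toy illustration of the "executable as a test oracle" idea,
    not one of the real REMS specifications: the specification is written
    as a function that decides whether an observed behaviour lies inside
    the envelope of allowed behaviours - here a made-up example in which
    two outcomes are allowed because independent instructions may be
    reordered - so an implementation can be checked algorithmically.]

        #include <stdbool.h>

        /* Toy envelope: for a two-instruction test, the (made-up) spec says
           the load may be executed before or after an independent store, so
           the observed value of r0 may legitimately be either the old value
           0 or the new value 1.                                             */
        static bool spec_allows_r0(int observed_r0) {
            return observed_r0 == 0 || observed_r0 == 1;
        }

        /* Oracle use: run the implementation (hardware, emulator, ...),
           observe what it did, and compute allowed / not allowed - instead
           of reading the prose book and arguing on the Internet.            */
        int main(void) {
            int observed_r0 = 1;               /* whatever the run produced */
            return spec_allows_r0(observed_r0) ? 0 : 1;
        }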
  • 18:27 - 18:31
    These are also mathematical models, I'll
    come back to that later. But another thing
  • 18:31 - 18:35
    I want to emphasize here is that in each
    of these cases, especially these first
  • 18:35 - 18:41
    two, when you start looking really closely
    then you learn what you have to do is not
  • 18:41 - 18:44
    build a nice clean mathematical model of
    something which is well understood. What
  • 18:44 - 18:51
    you learn is that there are real open
    questions. For example within the C
  • 18:51 - 18:56
    standards committee and within ARM a few
    years ago about exactly what these things
  • 18:56 - 19:01
    even were. And that is a bit disturbing
    given that these are the foundations for
  • 19:01 - 19:06
    all of our information infrastructure.
    There is also a lot of other work going on
  • 19:06 - 19:11
    with other people within this REMS project
    on web assembly and Javascript and file
  • 19:11 - 19:16
    systems and other concurrency. I don't
    have time to talk about those but Hannes
  • 19:16 - 19:20
    Mehnert is going to talk to us a little
    bit about TCP later today. You should go
  • 19:20 - 19:23
    to that talk too if you like this one. If
    you don't like this one you should still
  • 19:23 - 19:33
    go to that talk. Okay so this is doing
    somewhat better. Certainly more rigorous
  • 19:33 - 19:42
    engineering of specifications, but so far
    only for existing infrastructure for this
  • 19:42 - 19:50
    C language, ARM architecture, what have
    you. So what if we try and do better, what
  • 19:50 - 19:55
    if we try and build better security in. So
    the architectures that we have, really
  • 19:55 - 20:00
    they date back to the 1970s or even 60s
    with the idea of virtual memory. That
  • 20:00 - 20:05
    gives you - I won't go into what it is -
    but that gives you very coarse-grained
  • 20:05 - 20:10
    protection. You can have your separate
    processes isolated from each other and on
  • 20:10 - 20:15
    a good day you might have separate browser
    tabs isolated from each other in some
  • 20:15 - 20:22
    browsers, sometimes, but it doesn't scale.
    It's just too expensive to have lots of
  • 20:22 - 20:27
    little compartments. One thing we can do
    we can certainly design much better
  • 20:27 - 20:34
    programming languages than C but all of
    that legacy code, that's got a massive
  • 20:34 - 20:41
    inertia. So an obvious question is whether
    we can build in better security into the
  • 20:41 - 20:46
    hardware that doesn't need some kind of
    massive pervasive change to all the
  • 20:46 - 20:51
    software we ever wrote. And that's the
    question that a different large project,
  • 20:51 - 20:56
    the CHERI project has been addressing. So
    this is something that's been going on for
  • 20:56 - 21:02
    the last 8 years or so, mostly at SRI and
    at Cambridge funded by DARPA and the EPSRC
  • 21:02 - 21:07
    and ARM and other people, mostly led by
    Robert Watson, Simon Moore, Peter Neumann
  • 21:07 - 21:15
    and a cast of thousands. And me a tiny
    bit. So... How, don't do that. I should
  • 21:15 - 21:21
    learn to stop pushing this button. It's
    very hard to not push the button. So
  • 21:21 - 21:25
    here's the question in a more focused way
    that CHERI is asking: Can we build
  • 21:25 - 21:33
    hardware support which is both efficient
    enough and deployable enough that gives us
  • 21:33 - 21:37
    both fine grained memory protection to
    prevent that kind of bug that we were
  • 21:37 - 21:42
    talking about earlier, well that kind of
    leak at least, and also scalable
  • 21:42 - 21:46
    compartmentalization. So you can take bits
    of untrusted code and put them in safe
  • 21:46 - 21:53
    boxes and know that nothing bad is going
    to get out. That's the question. The
  • 21:53 - 21:57
    answer... The basic idea of the answer is
    pretty simple and it also dates back to
  • 21:57 - 22:03
    the 70s is to add hardware support for
    what people call capabilities, and we'll
  • 22:03 - 22:08
    see that working in a few moments. The
    idea is that this will let programmers
  • 22:08 - 22:12
    exercise the principle of least privilege,
    so with each little bit of code having
  • 22:12 - 22:18
    only the permissions it needs to do its
    job and also the principle of intentional
  • 22:18 - 22:26
    use. So with each little bit of code when
    it uses one of those capabilities, will
  • 22:26 - 22:30
    require it to explicitly use it rather
    than implicitly. So the intention of the
  • 22:30 - 22:36
    code has to be made plain. So let's see
    how this works. So these capabilities
  • 22:36 - 22:44
    they're basically replacing some or maybe
    all of the pointers in your code. So if we
  • 22:44 - 22:51
    take this example again in ISO C we had an
    address and in the standard, well, the
  • 22:51 - 22:55
    standard is not very clear about this but
    we're trying to make it more clear, an
  • 22:55 - 23:01
    allocation ID, say something like it. Now
    in CHERI C we can run the same code but
  • 23:01 - 23:06
    we might represent this pointer not just
    with that number at runtime but with
  • 23:06 - 23:13
    something that contains that number and
    the base of the original allocation and
  • 23:13 - 23:17
    the length of the original allocation and
    some permissions. And having all of those
  • 23:17 - 23:23
    in the pointer means that we can do... the
    hardware can do, at run time, at access
  • 23:23 - 23:30
    time, a very efficient check that this is
    within this region of memory. And if it's
  • 23:30 - 23:36
    not it can fail stop and trap. No
    information leak. I need a bit more
  • 23:36 - 23:41
    machinery to make this actually work. So
    it would be nice if all of these were 64
  • 23:41 - 23:46
    bit numbers but then you get a 256 bit
    pointer, and that's a bit expensive so
  • 23:46 - 23:53
    there's a clever compression scheme that
    squeezes all this down into 128 bits
  • 23:53 - 24:00
    with enough accuracy. And then you need to
    design the instruction set of the CPU
  • 24:00 - 24:07
    carefully to ensure that you can never
    forge one of these capabilities with a
  • 24:07 - 24:13
    normal instruction. You can never just
    magic one up out of a bunch of bits that
  • 24:13 - 24:20
    you happen to find lying around. So all
    the capability manipulating instructions,
  • 24:20 - 24:27
    they're going to take a valid capability
    and construct another, possibly smaller,
  • 24:27 - 24:32
    in memory extent, or possibly with less
    permissions, new capability. They're never
  • 24:32 - 24:39
    going to grow the memory extent or add
    permissions. And when the hardware starts,
  • 24:39 - 24:43
    at hardware initialization time, the
    hardware will hand the software a
  • 24:43 - 24:49
    capability to everything and then as the
    operating system works and the linker
  • 24:49 - 24:56
    works and the C language allocator works,
    these capabilities will be chopped up into
  • 24:56 - 25:03
    ever finer and smaller pieces to be
    handed out to the right places. And then we need
  • 25:03 - 25:10
    one more little thing. So this C language
    world we're living in here, or really a
  • 25:10 - 25:16
    machine code language world so there's
    nothing stopping code writing bytes of
  • 25:16 - 25:23
    stuff. So you have to somehow protect
    these capabilities against being forged by
  • 25:23 - 25:28
    the code, either on purpose or
    accidentally writing bytes over the top.
  • 25:28 - 25:35
    You need to preserve their integrity. So
    that's done by adding for each of these
  • 25:35 - 25:43
    128-bit-sized, aligned units of memory
    just one extra bit. That holds a tag and
  • 25:43 - 25:55
    that tag records whether this memory holds
    a currently valid capability or not. So
  • 25:55 - 26:00
    all the capability manipulating
    instructions, if they're given a valid
  • 26:00 - 26:06
    capability with a tag set then the result
    will still have the tags set. But if you
  • 26:06 - 26:14
    write, so you just wrote that byte there
    which might change the base then the
  • 26:14 - 26:20
    hardware will clear that tag and these
    tags, they're not conventionally
  • 26:20 - 26:25
    addressable, they don't have addresses.
    They just sit there in the memory system
  • 26:25 - 26:30
    of the hardware. So there is no way that
    software can [inaudible] through these.
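    [Note: an illustrative sketch of the idea only, not the real CHERI
    encoding: the actual design compresses base, length and permissions
    into 128 bits, keeps the one-bit validity tag out of band in the
    memory system, and only ever lets capability-manipulating
    instructions shrink bounds and permissions, never grow them.]

        #include <stdbool.h>
        #include <stdint.h>

        typedef struct {
            uint64_t addr;    /* where the pointer currently points          */
            uint64_t base;    /* base of the original allocation             */
            uint64_t length;  /* length of the original allocation           */
            uint32_t perms;   /* e.g. load / store permissions               */
            bool     tag;     /* cleared by hardware if the bits are
                                 overwritten by ordinary data writes         */
        } capability;

        #define PERM_LOAD 0x1u

        /* The check the hardware makes at access time: a load of n bytes
           through cap needs a valid (tagged) capability, load permission,
           and an address inside the original allocation's bounds; anything
           else traps (fail-stop) rather than leaking data.                  */
        bool load_is_allowed(capability cap, uint64_t n) {
            return cap.tag
                && (cap.perms & PERM_LOAD)
                && cap.addr >= cap.base
                && cap.addr + n <= cap.base + cap.length;
        }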
  • 26:30 - 26:39
    So this is really cool. We've taken what in ISO C was
    undefined behavior, in the standard, and
  • 26:39 - 26:44
    in implementations was a memory leak, and
    we've turned it into something that in
  • 26:44 - 26:49
    CHERI C is in both the specification and
    the implementation, is guaranteed to trap,
  • 26:49 - 26:57
    to fail stop, and not leak that
    information. So another few things about
  • 26:57 - 27:02
    the design that I should mention. I think
    all these blue things I think I've talked
  • 27:02 - 27:08
    about. But then a lot of care has gone in
    to make this be very flexible to make it
  • 27:08 - 27:12
    possible to adopt it. So you can use
    capabilities for all pointers, if you want
  • 27:12 - 27:19
    to, or just at some interfaces. You might
    for example use them at the kernel
  • 27:19 - 27:25
    userspace interface. It coexists very
    nicely with existing C and C++. You don't
  • 27:25 - 27:29
    need to change the source code very much
    at all, and we'll see a tiny bit of
  • 27:29 - 27:37
    data about this, to start
    using these. It coexists with the existing
  • 27:37 - 27:41
    virtual memory machinery, so you can use
    both capabilities and virtual memory, if
  • 27:41 - 27:46
    you want, or you can just use capabilities
    if you want or just use virtual memory if
  • 27:46 - 27:51
    you want. And then there's some more
    machinery, which I'm not going to talk
  • 27:51 - 27:55
    about, for doing this secure encapsulation
    stuff. What I've talked about so far is
  • 27:55 - 27:59
    basically the fine grained memory
    protection story. And I should say the
  • 27:59 - 28:05
    focus so far is on spatial memory safety.
    So that's not in the first instance
  • 28:05 - 28:11
    protecting against reuse of an old
    capability in a bad way but there are various
  • 28:11 - 28:18
    schemes for supporting temporal memory
    safety with basically the same setup. Okay
  • 28:18 - 28:25
    so. So how do we... Okay this is some kind
    of a high level idea. How do we know that
  • 28:25 - 28:33
    it can be made to work. Well the only way
    is to actually build it and see. And this
  • 28:33 - 28:37
    has been a really interesting process
    because mostly when you're building
  • 28:37 - 28:41
    something either in academia or an
    industry you have to keep most of the
  • 28:41 - 28:47
    parts fixed, I mean you are not routinely
    changing both the processor and the
  • 28:47 - 28:51
    operating system, because that would just
    be too scary. But here we have been doing
  • 28:51 - 28:58
    that and indeed we've really been doing a
    three-way, three-way codesign of building
  • 28:58 - 29:03
    hardware and building and adapting
    software to run above it and also building
  • 29:03 - 29:09
    these mathematical models all at the same
    time. This is, well: A, it's a lot of fun
  • 29:09 - 29:13
    because you can often get groups of people
    together that can do all of those things
  • 29:13 - 29:20
    but B, it's really effective. So what
    we've produced then is an architecture
  • 29:20 - 29:25
    specification. In fact, extending MIPS
    because it was conveniently free of IP
  • 29:25 - 29:32
    concerns and that specification is one of
    these mathematically rigorous things
  • 29:32 - 29:36
    expressed in a domain specific language
    specifically for writing instruction set
  • 29:36 - 29:40
    architecture specifications, and we've
    designed and implemented actually two of
  • 29:40 - 29:44
    those. And then there are hardware
    processor implementations that actually
  • 29:44 - 29:49
    run on FPGAs. And a lot of hardware
    research devoted to making this go fast
  • 29:49 - 29:54
    and be efficient despite these extra tags
    and whatnot. And then there's a big
  • 29:54 - 30:01
    software stack adapted to run over this
    stuff. Including Clang and a FreeBSD
  • 30:01 - 30:07
    kernel and FreeBSD userspace and WebKit
    and all kinds of pieces. So this is a very
  • 30:07 - 30:12
    major evaluation effort that one is not
    normally able to do. This is why, this
  • 30:12 - 30:18
    combination of things is why that list of
    people up there was about 40 people. I say
  • 30:18 - 30:23
    this is based on MIPS but the ideas are
    not specific to MIPS, there are ongoing
  • 30:23 - 30:28
    research projects to see if this can be
    adapted to the ARM application class
  • 30:28 - 30:33
    architecture and the ARM microcontroller
    architecture and the RISC-V. We'll see how
  • 30:33 - 30:42
    that goes, in due course. Okay so then
    with all of these pieces we can evaluate
  • 30:42 - 30:46
    whether it actually works. And that's
    still kind of an ongoing process but the
  • 30:46 - 30:57
    data that we've got so far is pretty
    encouraging. So we see here first,
  • 30:57 - 31:08
    performance. So you see, maybe a percent
    or 2 in many cases of runtime overhead.
  • 31:08 - 31:12
    There are workloads where it performs
    badly if you have something that's very
  • 31:12 - 31:17
    pointer heavy, then the extra pressure
    from those larger pointers will be a bit
  • 31:17 - 31:24
    annoying, but really it seems to be
    surprisingly good. Then is it something
  • 31:24 - 31:31
    that can work without massive adaptation to
    the existing code. Well, in this port of
  • 31:31 - 31:38
    the FreeBSD userspace, so all the programs
    that, all in all programs that run, they
  • 31:38 - 31:45
    were something like 20000 files and only
    200 of those needed a change. And that's
  • 31:45 - 31:52
    been done by one or two people in not all
    that large an amount of time. Some of the
  • 31:52 - 31:57
    other code that's more and more like an
    operating system needs a bit more invasive
  • 31:57 - 32:06
    change but still, it seems to be viable.
    Is it actually more secure? How are you
  • 32:06 - 32:11
    going to measure that. We like measuring
    things as a whole we try and do science
  • 32:11 - 32:17
    where we can. Can we measure that? Not
    really. But it certainly does, the whole
  • 32:17 - 32:23
    setup certainly does, mitigate against
    quite a large family of known attacks. I
  • 32:23 - 32:28
    mean if you try and run Heartbleed you'll
    get a trap. It will say trap. I think he
  • 32:28 - 32:36
    will say signal 34 in fact. So that's
    pretty good. And if you think about
  • 32:36 - 32:43
    attacks, very many of them involve a chain
    of multiple flaws put together by an
  • 32:43 - 32:49
    ingenious member of the C3 community or
    somebody else, and very often, at least
  • 32:49 - 33:00
    one of those is a memory access or
    permission flaw of some kind or other.
  • 33:00 - 33:04
    Okay so this is nice, but I was talking a
    lot in the first part of the talk, about
  • 33:04 - 33:12
    rigorous engineering. So here we've got
    formal definitions of the
  • 33:12 - 33:17
    architecture and that's been really useful
    for testing against and for doing test
  • 33:17 - 33:23
    generation on the basis of, and doing
    specification coverage in an automated
  • 33:23 - 33:29
    way. But surely we should be able to do
    more than that. So this formal definition
  • 33:29 - 33:35
    of the architecture, what does it look
    like. So for each instruction of this
  • 33:35 - 33:41
    CHERI MIPS processor, there's a definition
    and that definition, it looks a bit like
  • 33:41 - 33:48
    normal code. So here's a definition of how
    the instruction for "capability increment
  • 33:48 - 33:54
    cursor immediate" goes. So this is a thing
    that takes a capability register and an
  • 33:54 - 33:58
    immediate value and it tries to add
    something on to the cursor of that
  • 33:58 - 34:04
    capability. And what you see here is some
    code, looks like code, that defines
  • 34:04 - 34:07
    exactly what happens and also defines
    exactly what happens if something goes
  • 34:07 - 34:13
    wrong. You can see there's some kind of an
    exception case there.
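    [Note: the real definition is written in a dedicated instruction-set
    description language, and it is that definition from which the
    emulators and theorem-prover models mentioned below are generated;
    the following is only a rough C rendering of its flavour, and the
    field names, the helper raise_capability_exception, and the exact
    exception condition are assumptions.]

        #include <stdbool.h>
        #include <stdint.h>

        typedef struct {
            bool     tag, sealed;
            uint64_t base, length, cursor;
        } capability;

        /* Hypothetical helper standing in for the spec's exception case. */
        extern void raise_capability_exception(void);

        /* "Capability increment cursor immediate": take a capability
           register and an immediate value, and move the cursor by that
           amount; bounds are checked later, when the capability is used
           to access memory.                                               */
        capability cap_inc_cursor_imm(capability cd, int64_t imm) {
            if (!cd.tag || cd.sealed) {
                raise_capability_exception();   /* the exception case in the
                                                   definition shown above    */
            }
            cd.cursor += (uint64_t)imm;
            return cd;
        }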
  • 34:13 - 34:19
    That's a... yeah, that's raising a
    processor exception. So it looks like
  • 34:19 - 34:25
    code, but, from this, we can generate both
    reasonably high performance emulators,
  • 34:25 - 34:31
    in C and in OCaml, and also
    mathematical models in the theorem
  • 34:31 - 34:35
    provers. So three of the main theorem
    provers in the world are called Isabelle
  • 34:35 - 34:42
    and HOL and Coq. And we generate
    definitions in all of those. So then we
  • 34:42 - 34:46
    got a mathematical definition of the
    architecture. Then finally, we can start
  • 34:46 - 34:52
    stating some properties. So CHERI was
  • 34:52 - 34:58
    designed with some kind of vague security
    goals in mind from the beginning, but now
    we can take, let's say, one of those goals and
  • 34:58 - 35:07
    make it into a precise thing. So this is a
    kind of a secure encapsulation, kind of
  • 35:07 - 35:13
    thing not exactly the memory leak that we
    were looking at before. Sorry, the secret
  • 35:13 - 35:18
    leak that we were looking at. So let's
    take imagine some CHERI compartment.
  • 35:18 - 35:22
    Haven't told you exactly what that means
    but let's suppose that that's understood
  • 35:22 - 35:27
    and that inside that compartment there is
    a piece of software running. And that
  • 35:27 - 35:36
    might be maybe a video codec or a web
    browser or even a data structure
  • 35:36 - 35:44
    implementation maybe, or a C++ class and
    all of its objects maybe. So imagine that
  • 35:44 - 35:50
    compartment running and imagine the
    initial capabilities that it has access to
  • 35:50 - 35:56
    via registers and memory and via all the
    capabilities it has access to via the
  • 35:56 - 36:00
    memory that it has access to and so on
    recursively. So imagine all of those
  • 36:00 - 36:06
    capabilities. So the theorem says no
    matter how this code executes whatever it
  • 36:06 - 36:16
    does however compromised or buggy it was
    in an execution up until any point at
  • 36:16 - 36:24
    which it raises an exception or makes an
    inter compartment call it can't make any
  • 36:24 - 36:31
    access which is not allowed by those
    initial capabilities. You have to exclude
  • 36:31 - 36:35
    exceptions and inter-compartment calls
    because by design they do introduce new
  • 36:35 - 36:43
    capabilities into the execution. But until
    that point you get this guarantee. And
  • 36:43 - 36:49
    this is something that we've proved
    actually [...] has approved with a machine
  • 36:49 - 36:56
    check proof in in fact the Isabelle
    theorem prover. So this is a fact which is
  • 36:56 - 37:04
    about as solidly known as any fact the
    human race knows. These provers they're
  • 37:04 - 37:10
    checking a mathematical proof in
    exceedingly great detail with great care
  • 37:10 - 37:14
    and attention and they're structured in
    such a way that the soundness of the prover
  • 37:14 - 37:21
    depends only on some tiny little kernel.
    So conceivably cosmic rays hit the
  • 37:21 - 37:27
    transistors at all the wrong points. But
    basically we know this for sure. I
  • 37:27 - 37:34
    emphasize this is a security property we
    have not proved, we certainly wouldn't
  • 37:34 - 37:39
    claim to have proved that CHERI is secure
    and there are all kinds of other kinds of
  • 37:39 - 37:45
    attack that this statement doesn't even
    address but at least this one property we
  • 37:45 - 37:50
    know it for real. So that's kind of
    comforting and not a thing that
  • 37:50 - 38:01
    conventional computer science engineering
    has been able to do on the whole. So, are
  • 38:01 - 38:11
    we taming the chaos, then? Well, no. Sorry.
    But I've shown you two kinds of things.
  • 38:11 - 38:16
    The first was showing how we can do
    somewhat more rigorous engineering for
  • 38:16 - 38:22
    that existing infrastructure. It's now
    feasible to do that and in fact I believe
  • 38:22 - 38:26
    it has been feasible to build
    specifications which you can use as test
  • 38:26 - 38:31
    oracles for many decades. We haven't
    needed any super fancy new tech for that.
  • 38:31 - 38:41
    We've just needed to focus on that idea.
    So that's for existing infrastructure and
  • 38:41 - 38:47
    then CHERI is a relatively light weight
    change that does build in improved
  • 38:47 - 38:54
    security and we think it's demonstrably
    deployable at least in principle. Is it
  • 38:54 - 38:58
    deployable in practice? Are we going to
    have in all of our phones five years from
  • 38:58 - 39:06
    now? Will these all be CHERI enabled? I
    can't say. I think it is plausible that
  • 39:06 - 39:10
    that should happen and we're exploring
    various routes that might mean that there
  • 39:10 - 39:20
    happens and we'll see how that goes. Okay
    so with that I am aiming to have given you at
  • 39:20 - 39:26
    least not a whole lot of hope but just a
    little bit of hope that the world can be
  • 39:26 - 39:31
    made a better place and I encourage you to
    think about ways to do it because Lord
  • 39:31 - 39:38
    knows we need it. But maybe you can at
    least dream for a few moments of living in
  • 39:38 - 39:46
    a better world. And with that I thank you
    for your attention.
  • 39:46 - 39:57
    applause
  • 39:57 - 40:06
    Herald Angel: Thanks very much. And with
    that we'll head straight into the Q and A.
  • 40:06 - 40:10
    We'll just start with mic number one which
    I can see right now.
  • 40:10 - 40:16
    Q: Yes thanks for the very encouraging
    talk. I have a question about the part below
  • 40:16 - 40:22
    the C code. The theorem that
    you stated assumes that the
  • 40:22 - 40:28
    description matches the hardware, at least
    describes the hardware accurately. But
  • 40:28 - 40:33
    are there any attempts to check that the
    hardware actually works like that? That
  • 40:33 - 40:37
    there are no holes in the hardware? That
    some combination of mentioned commands
  • 40:37 - 40:43
    work differently or allow memory access
    there is not in the model? So, is it
  • 40:43 - 40:49
    possible to take hardware designs and check
    that they match the spec and will Intel or
  • 40:49 - 40:54
    AMD try to check their model against that
    spec?
  • 40:54 - 41:00
    Sewell: OK. So the question is basically
    can we do provably correct hardware
  • 41:00 - 41:06
    underneath this architectural abstraction.
    And the answer is... So it's a good
  • 41:06 - 41:13
    question. People are working on that and
    it can be done I think on at least a
  • 41:13 - 41:22
    modest academic scale. Trying to do that
    in its full glory for a industrial
  • 41:22 - 41:28
    superscalar processor design, is right now
    out of reach. But it's certainly something
  • 41:28 - 41:35
    one would like to be able to do. I think
    we will get there maybe in a decade or so.
  • 41:35 - 41:42
    A decade is quick.
    Herald Angel: Thanks. Mic number four is
  • 41:42 - 41:48
    empty again. So we'll take the Internet.
    Sewell: Where is the internet?
  • 41:48 - 41:53
    Signal Angel: The internet is right here
    and everywhere so ruminate would like to
  • 41:53 - 42:00
    know is the effort and cost of building in security
    into hardware as you've described in your
  • 42:00 - 42:06
    talk significantly greater or less than
    the effort and cost of porting software to
  • 42:06 - 42:12
    more secure languages.
    Sewell: So, is the effort of building new
  • 42:12 - 42:17
    hardware more or less than the cost of
    porting existing software to better
  • 42:17 - 42:26
    languages? I think the answer has to be
    yes. It is less. The difficulty with
  • 42:26 - 42:31
    porting software is all of you and all
    your colleagues and all of your friends
  • 42:31 - 42:36
    and all your enemies have been beavering
    away writing lines of code for 50 years.
  • 42:36 - 42:42
    Really, I don't know what to say,
    effectively. Really numerously. There's an
  • 42:42 - 42:46
    awful lot of it. It's a really terrifying
    amount of code out there. It's very hard
  • 42:46 - 42:54
    to get people to rewrite it.
    Signal Angel: OK, mic two.
  • 42:54 - 43:03
    Q: OK, given the cost of ... of today on the
    software level, not on the hardware level, is it
  • 43:03 - 43:12
    possible to go halfway, using the entire
    system as an oracle? So during development
  • 43:12 - 43:19
    you test the code base with the secure
    hardware but then essentially ship the
  • 43:19 - 43:29
    same software to insecure hardware and [inaudible]
    Sewell: So this question, if I paraphrase
  • 43:29 - 43:36
    it is can we use this hardware
    implementation really as the perfect
  • 43:36 - 43:45
    sanitiser? And I think there is scope for
    doing that. And you can even imagine
  • 43:45 - 43:50
    running this not on actual silicon
    hardware but in like a QEMU implementation
  • 43:50 - 43:55
    which we have in order to do that. And I
    think for for that purpose it could
  • 43:55 - 44:01
    already be a pretty effective bug finding
    tool. It would be worth doing. But,
  • 44:01 - 44:06
    as for any testing, it will only
    explore a very tiny fraction of the
  • 44:06 - 44:14
    possible paths. So we would like to have
    it always on if we possibly can.
  • 44:14 - 44:20
    Herald Angel: OK. The internet again.
    Signal Angel: OK someone would like to
  • 44:20 - 44:26
    know how does your technique compare to
    Intel MPX which failed miserably with [inaudible]
  • 44:26 - 44:34
    ignoring it. Will it be better or faster?
    Could you talk a little bit about that.
  • 44:34 - 44:38
    Sewell: I think for that question, for a
    real answer to that question, you would
  • 44:38 - 44:46
    need one of my more hardware colleagues.
    What I can say is they make nasty faces
  • 44:46 - 44:53
    when you say MPX.
    Herald Angel: Well that's that. Mic number
  • 44:53 - 44:57
    one.
    Q: Somewhat inherent to your whole
  • 44:57 - 45:02
    construction was this idea of having an
    unchangeable bit which described whether
  • 45:02 - 45:09
    your pointer contents have been changed.
    Is this even something that can be done in
  • 45:09 - 45:13
    software? Or do you have to construct a
    whole extra RAM or something?
  • 45:13 - 45:21
    Sewell: So you can't do it exclusively in
    software because you need to protect it.
  • 45:21 - 45:26
    In the hardware implementations that my
    colleagues have built it's done in a
  • 45:26 - 45:33
    reasonably efficient way. So it's cached
    and managed in clever ways to ensure that
  • 45:33 - 45:39
    it's not terribly expensive to do. But you
    do need slightly wider data paths in your
  • 45:39 - 45:45
    cache protocol and things like that to
    manage it.
  • 45:45 - 45:52
    Herald Angel: OK. Mic seven.
    Q: How well does this system help secure
  • 45:52 - 45:57
    control flow integrity compared to
    other systems, like hardware
  • 45:57 - 46:04
    data flow isolation or code
    pointer integrity in ARM [inaudible]?
  • 46:04 - 46:09
    Sewell: Ah well, that's another question
    which is a bit hard to answer because then
  • 46:09 - 46:18
    it depends somewhat on how you're using
    the capabilities. So you can arrange here
  • 46:18 - 46:26
    that each C language allocation
    is independently protected and indeed you
  • 46:26 - 46:30
    can also arrange - and this is
    sensible - that each subobject is
  • 46:30 - 46:35
    independently protected. And those
    protections are going to give you
  • 46:35 - 46:39
    protection against trashing bits of the
    stack because they'll restrict the
  • 46:39 - 46:51
    capability to just those objects.
    Herald Angel: OK the internet again.
  • 46:51 - 46:58
    Signal Angel: Twitter would like to know
    is it possible to run
  • 46:58 - 47:04
    CHERI C without custom hardware?
    Sewell: Can you run CHERI C without custom
  • 47:04 - 47:13
    hardware? And the answer is you can run it
    in emulation above this QEMU. That works
  • 47:13 - 47:18
    pretty fast, I mean fast enough for
    debugging as the earlier question was
  • 47:18 - 47:29
    talking about. You can imagine you know
    somewhat faster emulations of that. It's
  • 47:29 - 47:34
    not clear how much it's worth going down
    that route. The real potential big win is
  • 47:34 - 47:41
    to have this be always on and that's what
    we would like to have.
  • 47:41 - 47:47
    Herald Angel: OK. Mic one.
    Q: Do you have some examples of the kinds
  • 47:47 - 47:52
    of code changes that you need to the
    operating system and userland that you
  • 47:52 - 47:56
    mentioned?
    Sewell: So, okay so what kind of changes
  • 47:56 - 48:07
    do you need to adapt code to CHERI? So,
    if you look at C code there is the C
  • 48:07 - 48:13
    standard that deems a whole lot of things
    to be undefined behavior and then if you
  • 48:13 - 48:20
    look at actual C code you find that very
    very much of it does depend on some idiom
  • 48:20 - 48:29
    that the C standard does not approve of.
    So there is for example, pause there are
  • 48:29 - 48:35
    quite a lot of instances of code that
    constructs a pointer by doing more than
  • 48:35 - 48:42
    a plus-one offset and then
    brings it back within range before you use
  • 48:42 - 48:49
    it. So in fact in CHERI C many of those
    cases are allowed but not all of them.
  • 48:49 - 48:58
    More exotic cases where there's really
    some crazy intra-object stuff going on or
  • 48:58 - 49:03
    where pointer arithmetic between objects -
    which the C standard is certainly quite
  • 49:03 - 49:09
    down on but which does happen - and code
    which is, say manipulating the low order
  • 49:09 - 49:14
    bits of a pointer to store some data in
    it. That's pretty common.
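    [Note: a sketch of the common idiom being described, illustrative
    only: because an aligned pointer's low bits are zero, code packs a
    flag into them and masks it off again before dereferencing; this
    pointer/integer round-trip is the kind of thing that may need small
    adaptations under CHERI C.]

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
            static int value = 42;      /* aligned, so the low bit is free     */
            uintptr_t bits = (uintptr_t)&value;
            bits |= 1u;                 /* stash a one-bit flag in the pointer */

            int  flag = (int)(bits & 1u);                /* recover the flag   */
            int *p    = (int *)(bits & ~(uintptr_t)1u);  /* and the pointer    */
            assert(flag == 1 && *p == 42);
            return 0;
        }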
  • 49:14 - 49:18
    You can do it in CHERI C but you might have to adapt the
    code a little bit. Other cases are more
  • 49:18 - 49:23
    fundamental. So say, in operating system
    code if you swap out a page to disk and
  • 49:23 - 49:28
    then you swap it back in then somehow the
    operating system has to reconstruct new
  • 49:28 - 49:35
    capabilities from a large capability which
    it kept for the purpose. So that needs a
  • 49:35 - 49:42
    bit more work. Though it's kind of a
    combination. Some things you would regard
  • 49:42 - 49:48
    as good changes to the code anyway and
    others are more radical.
  • 49:48 - 49:52
    Herald Angel: OK, another one from the
    internet.
  • 49:52 - 50:00
    Signal Angel: I lost one pause
    Sewell: The Internet has gone quiet. Yes.
  • 50:00 - 50:04
    Signal Angel: Last question from the
    internet. Are there any plans to impact
  • 50:04 - 50:12
    test on Linux or just FreeBSD?
    Sewell: If anyone has a good number of
  • 50:12 - 50:20
    spare minutes then that would be lovely to
    try, the impact on Linux, not just
  • 50:20 - 50:26
    FreeBSD. The BSD code base is simpler
    and maybe cleaner and my systemsy
  • 50:26 - 50:30
    colleagues are already familiar with it.
    It's the obvious target for an academic
  • 50:30 - 50:35
    project; doing it is already a humongous
    amount of work. So I think we would love
  • 50:35 - 50:41
    to know how Linux and especially how
    Android could be adapted but it's not
  • 50:41 - 50:46
    something that we have the resource to do
    at the moment.
  • 50:46 - 50:51
    Herald Angel: Mic number one again.
    Q: How likely or dangerous.
  • 50:51 - 50:55
    Herald Angel: Mic number one is not
    working.
  • 50:55 - 51:00
    Q: How likely or dangerous do you think it is
    for a programmer to screw up the
  • 51:00 - 51:05
    capabilities he specifies?
    Sewell: So how often will the programmer
  • 51:05 - 51:11
    screw up the capabilities? So in many
    cases the programmer is just doing an
  • 51:11 - 51:17
    access with a C or C++ pointer or C++
    object in a normal way. They're not
  • 51:17 - 51:24
    manually constructing these things. So a
    lot of that is going to just work. If
  • 51:24 - 51:28
    you're building some particular secure
    encapsulation setup you have to be a bit
  • 51:28 - 51:34
    careful. But on the whole I think this is
    a small risk compared to the state of
  • 51:34 - 51:41
    software as we have it now.
    Herald Angel: OK. Mic eight.
  • 51:41 - 51:49
    Q: So existing memory protection
    techniques are quite patch-worky, like
  • 51:49 - 52:06
    canaries and address layout randomisation.
    Is this a system designed to replace or
  • 52:06 - 52:18
    supplement these measures?
    Sewell: I think if you pause So again it
  • 52:18 - 52:24
    depends a bit how you use it. So if you
    have capabilities really everywhere then
  • 52:24 - 52:33
    there's not much need for address space
    layout randomization or canaries to
  • 52:33 - 52:42
    protect against explicit information
    leaks. I imagine that you might still want
  • 52:42 - 52:48
    randomisation to protect against some side
    channel flows, but canaries not so much.
  • 52:48 - 52:52
    Whether you actually do, for side-
    channel flows - I don't know. That's a
  • 52:52 - 52:56
    good question.
    Herald Angel: Mic four please.
  • 52:56 - 53:03
    Q: Thanks for the very interesting talk
    you presented with CHERI, a system of
  • 53:03 - 53:08
    verifying existing C software, sort of
    post hoc and improving its security via
  • 53:08 - 53:12
    run-time mechanism...?
    A: Improving or Proving? Improving yes,
  • 53:12 - 53:19
    proving no, yes.
    Q: So what's your opinion on the viability
  • 53:19 - 53:23
    of using a static analysis and static
    verification for that. Would it be
  • 53:23 - 53:30
    possible to somehow analyse software that
    already exists and is written in these
  • 53:30 - 53:34
    unsafe languages and show that it
    nevertheless has some security properties
  • 53:34 - 53:41
    without writing all the proofs manually in
    like HOL or Isabelle?
  • 53:41 - 53:47
    A: Well so if you want to analyse existing
    software... So, static analysis is already
  • 53:47 - 53:53
    useful for finding bugs in existing
    software. If you want to have assurance
  • 53:53 - 53:59
    from static analysis, that's really very
    tough. So you certainly wouldn't want to
  • 53:59 - 54:07
    manually write proofs in terms of the
    definitional C semantics, you would need
  • 54:07 - 54:11
    intermediate infrastructure. And if you
    look at the people, who have done verified
  • 54:11 - 54:16
    software, so verified compilers and
    verified hypervisors, and verified
  • 54:16 - 54:25
    operating systems, all of those are now
    possible on significant scales, but they
  • 54:25 - 54:31
    use whole layers of proof and verification
    infrastructure for doing that. They're not
  • 54:31 - 54:36
    writing proofs by hand for every little
    detail. And some of those verification
  • 54:36 - 54:42
    methods either are, you can imagine them
    being basically like static analysis, you
  • 54:42 - 54:48
    know, you might use a static analysis to
    prove some relatively simple facts about
  • 54:48 - 54:55
    the code and then stitch those together
    into a larger assembly. I think any kind
  • 54:55 - 54:59
    of more complete thing I think is hard to
    imagine. I mean these analysis tools
  • 54:59 - 55:03
    mostly, they rely on making some
    approximation in order to be able to do
  • 55:03 - 55:10
    their thing at all. So it's hard to get
    assurance out of them.
  • 55:10 - 55:17
    Herald Angel: Okay, Mic one.
    Q: You said you modified a MIPS
  • 55:17 - 55:23
    architecture to add some logic to check the
    capabilities. Do you know what the cost of
  • 55:23 - 55:30
    this is, regarding computational power
    and energy consumption?
  • 55:30 - 55:33
    A: Sorry, can you repeat the last part of
    that?
  • 55:33 - 55:39
    Q: What is the cost of this modification
    regarding computational power, and power
  • 55:39 - 55:42
    consumption?
    A: Ah, so what's the energy cost? So I
  • 55:42 - 55:47
    gave you a performance cost. I didn't give
    you, I carefully didn't give you an energy
  • 55:47 - 55:53
    cost estimate, because it's really hard to
    do in a scientifically credible way,
  • 55:53 - 55:59
    without making a more or less production
    superscalar implementation. And we are
  • 55:59 - 56:04
    sadly not in a position to do that,
    although if you have 10 or 20 million
  • 56:04 - 56:09
    pounds, then I would be happy to accept
    it.
  • 56:09 - 56:15
    Herald Angel: Mic number 7, please.
    Q: How does the class of problems that you
  • 56:15 - 56:21
    can address with CHERI compare to the
    class of problems that are excluded by,
  • 56:21 - 56:27
    for example, the Rust programming
    language?
  • 56:27 - 56:30
    A: So how do the problems of CHERI
    relate to the problems, sorry the problems
  • 56:30 - 56:40
    excluded by CHERI, relate to the problems
    excluded by Rust. So if you are happy to
  • 56:40 - 56:51
    write all of your code in Rust without
    ever using the word "unsafe", then maybe
  • 56:51 - 56:56
    there would be no point in CHERI, at all.
    Are you happy to do that?
  • 56:56 - 57:01
    Q: Probably not, no.
    A: I think someone shakes their head
  • 57:01 - 57:10
    sideways, yeah.
    Herald Angel: Mic number 1.
  • 57:10 - 57:15
    Q: What do you think about the following
    thesis: We are building a whole system of
  • 57:15 - 57:21
    things, with artificial intelligence and
    something like that. That is above this
  • 57:21 - 57:30
    technical level and it's building another
    chaos that isn't healed by those
  • 57:30 - 57:34
    things.
    A: It's dreadful, isn't it.
  • 57:34 - 57:46
    Laughter
    A: There are, so you might, in fact some
  • 57:46 - 57:50
    of my colleagues are interested in this
    question, if you, you might well want to
  • 57:50 - 57:59
    bound what your artificial intelligence
    can access or touch and for that you might
  • 57:59 - 58:04
    want this kind of technology. But this is
    not, we are, with machine learning systems
  • 58:04 - 58:07
    we are intrinsically building things that
    we, on the whole, don't understand and
  • 58:07 - 58:12
    that will have edge cases that go wrong.
    And this is not speaking to that in any
  • 58:12 - 58:17
    way.
    Herald Angel: OK. Does the internet have a
  • 58:17 - 58:25
    question? No, good. I don't see anyone
    else, so let's conclude this. Thank you
  • 58:25 - 58:29
    very much.
    A: Thank you.
  • 58:29 - 58:30
    applause
  • 58:30 - 58:36
    35c3 postroll music
  • 58:36 - 58:53
    subtitles created by c3subtitles.de
    in the year 2019. Join, and help us!