0:00:00.000,0:00:18.010
35C3 preroll music
0:00:18.010,0:00:29.239
Herald angel: OK. Our next speaker will be[br]trying to tame the chaos. Peter Sewell
0:00:29.239,0:00:34.120
from the University of Cambridge. A warm[br]round of applause please.
0:00:34.120,0:00:37.910
applause
0:00:37.910,0:00:48.740
Sewell: Thank you! Okay. So here we are.[br]C3 again with a program full of
0:00:48.740,0:00:56.710
fascinating and exciting and cool stuff.[br]And if we look just at the security talks
0:00:56.710,0:00:59.940
all manner of interesting things. I'm[br]going to go to a lot of these. You should
0:00:59.940,0:01:06.830
too. Let's see though if we read some of[br]the titles: exploiting kernel memory
0:01:06.830,0:01:18.750
corruptions, attacking end to end e-mail[br]encryption. What else have we got: modern
0:01:18.750,0:01:31.550
Windows user space exploitation,[br]compromising online accounts. OK so a lot
0:01:31.550,0:01:36.130
of these talks, as usual, they're[br]explaining to us that our computers,
0:01:36.130,0:01:42.950
they're not doing what we would like and[br]this as usual is the merest tip of a tiny
0:01:42.950,0:01:48.970
iceberg, alright. We have hundreds of[br]thousands of known vulnerabilities and
0:01:48.970,0:01:53.280
many unknown vulnerabilities. It's[br]a bit dark here, but let me try some
0:01:53.280,0:01:59.490
audience participation. How many of you[br]have in the last year written at least a
0:01:59.490,0:02:09.019
hundred lines of code? And of those people[br]keep your hands up if you are completely
0:02:09.019,0:02:17.550
confident that code does the right thing.[br]laughter I would like to talk to you
0:02:17.550,0:02:23.200
later laughter and so would the people[br]in the self driving car that you're
0:02:23.200,0:02:30.790
probably engineering. So, so how many[br]unknown vulnerabilities then. Well you can
0:02:30.790,0:02:39.160
take what's in your mind right now and[br]multiply it by - oh, this doesn't work very
0:02:39.160,0:02:48.920
well. No. No, no. Computers, they do the[br]wrong thing again and again and again.
0:02:48.920,0:02:54.421
laughter and some applause We can[br]multiply by this estimate of about 100
0:02:54.421,0:03:00.630
billion lines of new code each year. So if[br]we take all of this: these talks, these
0:03:00.630,0:03:05.900
numbers, these should make us not happy[br]and excited, these should make us sad,
0:03:05.900,0:03:16.070
very sad and frankly rather scared of what[br]is going on in the world. So what can we
0:03:16.070,0:03:25.610
do? We can, option one, give up using[br]computers altogether. applause In most
0:03:25.610,0:03:29.640
audiences this will be a hard sell but[br]here I'm sure you're all happy to just
0:03:29.640,0:03:40.180
turn them off now. Or we can throw up our[br]hands in the air and collapse in some kind
0:03:40.180,0:03:54.830
of pit of despair. Well, maybe, maybe not.[br]My task today is to give you a tiny
0:03:54.830,0:04:00.050
smidgen, a very, very small amount[br]of hope that it may be possible to do
0:04:00.050,0:04:04.099
slightly better. But if we want to do[br]better we first have to understand why
0:04:04.099,0:04:10.560
things are so bad and if we look at our[br]aeronautical engineering colleagues for
0:04:10.560,0:04:15.739
example, they can make planes which very[br]rarely fall out of the sky. Why is it that
0:04:15.739,0:04:22.880
they can do that and we are so shit at[br]making computers? Well, I'm going to reuse
0:04:22.880,0:04:27.970
a picture that I used at 31C3, which is[br]still the best description I can find of
0:04:27.970,0:04:35.990
the stack of hardware and software that we[br]live above. Here we go. It's, there's so
0:04:35.990,0:04:40.160
much information in this, it's just ace.[br]So we see down here all of this
0:04:40.160,0:04:45.430
transparent silicon, it's the hardware we[br]live above. We see a stack of phases of a
0:04:45.430,0:04:51.241
C compiler, we see all kinds of other[br]other bits and pieces. There might be a
0:04:51.241,0:04:58.870
slightly hostile wireless environment in[br]this room for some reason. If we look at
0:04:58.870,0:05:03.230
this and think about the root causes for[br]why our systems are so bad we can see
0:05:03.230,0:05:07.889
terrible things. So the first is,[br]obviously there's a lot of legacy
0:05:07.889,0:05:12.810
complexity that we're really stuck with,[br]alright. If you pull out one of those
0:05:12.810,0:05:16.240
pieces from the middle and try and replace[br]it with something which is not of exactly
0:05:16.240,0:05:21.350
the same shape the whole pile will[br]collapse. So we somehow need to find an
0:05:21.350,0:05:27.970
incremental path to a better world. And[br]then, this is the most fundamental reason
0:05:27.970,0:05:33.220
that these are not like aeroplanes, these[br]systems are discrete not continuous. If
0:05:33.220,0:05:38.290
you take an honest girder made out of a[br]piece of steel and you push on it a little
0:05:38.290,0:05:44.270
bit it moves a little bit, basically in[br]proportion. If it is 1 percent too strong,
0:05:44.270,0:05:50.090
1 percent too weak, basically it doesn't[br]matter. But in these things one line of
0:05:50.090,0:05:56.490
code can mean you're open to a[br]catastrophic exploit and one line in many,
0:05:56.490,0:06:05.970
many million. OK. And next thing... this[br]really doesn't work. They're very
0:06:05.970,0:06:11.380
complicated. But the scary thing is not[br]the static complexity of those lines of
0:06:11.380,0:06:15.080
code and the number of components, although[br]that's pretty intimidating. The really
0:06:15.080,0:06:21.110
scary thing is the exponential number[br]of states and execution paths. So these
0:06:21.110,0:06:26.139
two together mean that the testing[br]that we rely on - and testing is the only
0:06:26.139,0:06:30.710
way we have to build systems which are not[br]instantly broken - can never save
0:06:30.710,0:06:40.550
us from these exploitable errors. And that[br]means ultimately we need to do proof.
0:06:40.550,0:06:45.610
Honest machine checked mathematical proof.[br]And this also tells us that we need to
0:06:45.610,0:06:51.289
arrange for our systems to be secure for[br]simple reasons that do not depend on the
0:06:51.289,0:06:57.180
correctness of all of those hundred[br]billion lines of code. Then another thing
0:06:57.180,0:07:03.360
that we have here. All these interfaces:[br]the architecture interface, the C language
0:07:03.360,0:07:10.650
interface, the sockets API interface, the[br]TCP interface. All of these we rely on to
0:07:10.650,0:07:17.500
let different parts of the system be[br]engineered by different organizations. But
0:07:17.500,0:07:24.870
they're all really badly described and[br]badly defined. So what you find is, for
0:07:24.870,0:07:30.130
each of these, typically a prose book[br]varying in thickness between about that
0:07:30.130,0:07:36.909
and about that, full of text. Well, we[br]still rely on testing - limited though it
0:07:36.909,0:07:44.000
is - but you can never test anything[br]against those books. So we need instead
0:07:44.000,0:07:48.940
interface descriptions, definitions,[br]specifications, that are more rigorous,
0:07:48.940,0:07:54.419
mathematically rigorous and that are[br]executable - not in the normal sense of
0:07:54.419,0:07:58.940
executable as an implementation but[br]executable as a test oracle. So you can
0:07:58.940,0:08:03.880
compute whether some observed behaviour is[br]allowed or not and not have to read the
0:08:03.880,0:08:07.700
book and argue on the Internet. And we[br]also need interface definitions that
0:08:07.700,0:08:19.820
support this proof that we need. And then[br]all of this stuff was made up in the 1970s
0:08:19.820,0:08:25.639
or the sixties and in the 70s and the 60s[br]the world was a happy place and people
0:08:25.639,0:08:32.068
walked around with flowers in their hair[br]and nothing bad ever happened. And that is
0:08:32.068,0:08:36.340
no longer the case. And so we need[br]defensive design, but we need defensive
0:08:36.340,0:08:41.149
design that is somehow compatible or can[br]be made enough compatible with this legacy
0:08:41.149,0:08:46.879
investment. That's quite hard. And then -[br]I can't say very much about this, but we
0:08:46.879,0:08:50.360
have at the moment very bad incentives,[br]because we have a very strong incentive to
0:08:50.360,0:08:55.420
make things that are more complex than we[br]can possibly handle or understand. We make
0:08:55.420,0:09:00.410
things, we add features, we make a thing[br]which is just about shippable and then we
0:09:00.410,0:09:06.550
ship it. And then we go on to the next[br]thing. So we need economic and legal
0:09:06.550,0:09:11.300
incentives for simplicity and for security[br]that we do not yet have. But it's hard to
0:09:11.300,0:09:17.109
talk about those until we have the[br]technical means to react to those
0:09:17.109,0:09:21.980
incentives. OK, so what I'm going to do[br]now, I'm going to talk about two of these
0:09:21.980,0:09:26.230
interfaces, the architecture interface and[br]the C language interface, and see what we
0:09:26.230,0:09:32.819
can do to make things better. A lot of[br]this has to do with memory. So whoever it
0:09:32.819,0:09:38.209
was that picked the subtitle for this[br]meeting "refreshing memories" thank you,
0:09:38.209,0:09:45.429
because it's great, I love it. Let's[br]refresh your memory quite a way. So I
0:09:45.429,0:09:52.959
invite you to think back to when you were[br]very very young in about 1837 when cool
0:09:52.959,0:09:58.519
hacking was going on. Charles Babbage was[br]designing the Analytical Engine and even
0:09:58.519,0:10:03.369
then you see there was this dichotomy[br]between a mill performing operations and a
0:10:03.369,0:10:11.100
store holding numbers. This is a plan view[br]of the analytical engine, well, it was
0:10:11.100,0:10:15.240
vaporware, but a design from the[br]analytical engine, and you see here these
0:10:15.240,0:10:21.689
circles these are columns of numbers each[br]made out of a stack of cogs, maybe a 50
0:10:21.689,0:10:28.340
digit decimal number in each of those[br]columns. And this array, really, he
0:10:28.340,0:10:34.681
imagined it going on out to and over there[br]somewhere, with about 1000 numbers. So
0:10:34.681,0:10:42.850
even then you have a memory which is an[br]array of numbers. I think these were not-
0:10:42.850,0:10:46.519
I don't think you could do address[br]computation on these things, I think
0:10:46.519,0:10:53.649
addresses were only constants, but, still,[br]basically an array of numbers. So okay
0:10:53.649,0:10:59.370
what have we got now? Let's look a bit at C.[br]How many of those people who have
0:10:59.370,0:11:06.980
programmed 100 lines of code, how many of[br]you were C programmers? Quite a few. Or
0:11:06.980,0:11:12.660
maybe you're just embarrassed. I forgot to[br]say. Those that didn't put your hands up
0:11:12.660,0:11:18.499
for that first question. You should feel[br]proud and you should glory in your
0:11:18.499,0:11:26.460
innocence while you still have it. You are[br]not yet complicit in this trainwreck. It
0:11:26.460,0:11:31.680
worked then. Okay so here's a little piece[br]of C-code which I shall explain. I shall
0:11:31.680,0:11:38.279
explain it in several different ways. So[br]we start out. It creates two locations x
0:11:38.279,0:11:52.670
and secret... Don't do that. This is so[br]bad. Creates x, storing one and secret_key
0:11:52.670,0:11:58.980
which stores, I might get this wrong, your[br]pin. And then there's some computation
0:11:58.980,0:12:05.129
which is supposed to only operate on x but[br]maybe it's a teensy bit buggy or corrupted
0:12:05.129,0:12:13.620
by somebody, and then we make a[br]pointer to x, and in this execution x just
0:12:13.620,0:12:21.209
happened to be allocated at 14. This[br]pointer contains the number 14 and then we
0:12:21.209,0:12:25.389
add one to that pointer. It's C so[br]actually that adding one to the pointer it
0:12:25.389,0:12:32.600
really means add the four bytes which are[br]the size of that thing. So we add 4 to 14
0:12:32.600,0:12:40.059
and we get 18. And then we try and read[br]the thing which is pointed to. What's
0:12:40.059,0:12:48.549
going to happen? Let me compile this and[br]run it in a conventional way. It printed the
0:12:48.549,0:12:54.300
secret key. Bad. This program, rather[br]distressingly, this is characteristic by
0:12:54.300,0:12:58.830
no means all security flaws, but a very[br]disturbingly large fraction of all the
0:12:58.830,0:13:10.600
security flaws are basically doing this.[br]So does C really let you do that? Yes and
0:13:10.600,0:13:17.911
no. So if you look at the C standard,[br]which is one of these beautiful books, it
0:13:17.911,0:13:22.859
says, you have to read moderately[br]carefully to understand this, but it says
0:13:22.859,0:13:27.399
that this program has undefined behavior.[br]And many of you will know what that means
0:13:27.399,0:13:32.949
but others won't. It means that as[br]far as the standard is concerned, for
0:13:32.949,0:13:39.329
programs like that there is no constraint[br]on the behavior of the implementation at
0:13:39.329,0:13:47.039
all. Or, said another way, and maybe a more[br]useful way: the standard lets
0:13:47.039,0:13:52.329
implementations, so C compilers, assume[br]that programmers don't do this kind of
0:13:52.329,0:13:59.569
stuff and it's the programmer's[br]responsibility to ensure that. Now in
0:13:59.569,0:14:04.781
about 1970, 75 maybe 1980, this was a[br]really good idea, it gives compilers a lot
0:14:04.781,0:14:14.000
of flexibility to do efficient[br]implementation. Now not so much. How does
0:14:14.000,0:14:19.199
this work in the definition? So the[br]standard somehow has to be able to look at
0:14:19.199,0:14:29.470
this program and identify it as having[br]this undefined behavior. And indeed, well,
0:14:29.470,0:14:37.089
if we just think about the numbers in[br]memory this 18, it points to a perfectly
0:14:37.089,0:14:42.189
legitimate storage location and that plus[br]one was also something that you're
0:14:42.189,0:14:48.149
definitely allowed to do in C. So the only[br]way that we can know that this is
0:14:48.149,0:14:54.419
undefined behavior is to keep track of the[br]original allocation of this pointer. And
0:14:54.419,0:14:59.240
here I've got these allocation identifiers[br]at 3, at 4, at 5, and you see here I've
0:14:59.240,0:15:03.689
still got this pointer despite the fact[br]that I added 4 to it. I still remembered
0:15:03.689,0:15:09.939
the allocation ID, so I can check, or the[br]standard can check, when we try and read
0:15:09.939,0:15:15.290
from this whether that address is within[br]the footprint of that original allocation,
0:15:15.290,0:15:19.019
i.e. is within there and in fact it is[br]not, it's over here, it is not within
0:15:19.019,0:15:25.509
there at all. So this program is deemed to[br]have undefined behavior. Just to clarify
0:15:25.509,0:15:30.649
something people often get confused. So we[br]detect undefined behavior here but it
0:15:30.649,0:15:35.299
isn't really a temporal thing. The fact[br]that there is undefined behavior anywhere
0:15:35.299,0:15:41.790
in the execution means the whole program[br]is toast. Okay but this is really
0:15:41.790,0:15:46.339
interesting because we're relying for this[br]critical part of the standard on
0:15:46.339,0:15:52.319
information which is not there at run time[br]in a conventional implementation. So just
0:15:52.319,0:15:59.050
to emphasize that point, if I compile this[br]program, let's say to ARM assembly
0:15:59.050,0:16:04.850
language I get a sequence of store and[br]load and add instructions, store, load,
0:16:04.850,0:16:12.089
add, store, load, load. And if I look at[br]what the ARM architecture says can happen
0:16:12.089,0:16:17.139
these blue transitions... one thing to[br]notice is that we can perfectly well do
0:16:17.139,0:16:21.639
some things out of order. At this point we[br]could either do this load or we could do
0:16:21.639,0:16:27.240
that store. That would be a whole other[br]talk. Let's stick with the sequential
0:16:27.240,0:16:33.100
execution for now. What I want to[br]emphasize is that these, this load of this
0:16:33.100,0:16:36.569
address, really was loading a 64 bit[br]number, which was the address of x and
0:16:36.569,0:16:44.419
adding 4 to it and then loading from that[br]address, the secret key. So there is no
0:16:44.419,0:16:55.910
trace of these allocation IDs. No trace[br]at all. Okay, let me step back a little
0:16:55.910,0:17:04.939
bit. I've been showing you some[br]screenshots of C behaviour and ARM
0:17:04.939,0:17:12.109
behaviour and I claim that these are[br]actually showing you all of the behaviours
0:17:12.109,0:17:19.359
allowed by what one would like the[br]standards to be for these two things. And
0:17:19.359,0:17:26.529
really what I've been showing you are GUIs[br]attached to mathematical models that we've
0:17:26.529,0:17:31.000
built in a big research project for the[br]last many years. REMS: "Rigorous
0:17:31.000,0:17:34.929
Engineering of Mainstream Systems" sounds[br]good, aspirational title I think I would
0:17:34.929,0:17:42.049
say. And in that we've built semantics, so[br]mathematical models defining the envelope
0:17:42.049,0:17:47.580
of all allowed behaviour for a quite big[br]fragment of C and for the concurrency
0:17:47.580,0:17:53.950
architecture of major processors ARM, and[br]x86, and RISC-V, and IBM POWER and also
0:17:53.950,0:17:57.770
for the instruction set architecture of[br]these processors, the details of how
0:17:57.770,0:18:03.940
individual instructions are executed. And[br]in each case these are specs that are
0:18:03.940,0:18:08.350
executable as test oracles, you can[br]compare algorithmically some observed
0:18:08.350,0:18:13.320
behaviour against whether spec says it's[br]allowed or not, which you can't do with
0:18:13.320,0:18:19.030
those books. So it's just an idiot-simple[br]idea but for some reason the industry has
0:18:19.030,0:18:26.960
not taken it on board any time in the last[br]five decades. It's not that hard to do.
0:18:26.960,0:18:30.710
These are also mathematical models, I'll[br]come back to that later. But another thing
0:18:30.710,0:18:34.529
I want to emphasize here is that in each[br]of these cases, especially these first
0:18:34.529,0:18:40.669
two, when you start looking really closely[br]then you learn what you have to do is not
0:18:40.669,0:18:44.400
build a nice clean mathematical model of[br]something which is well understood. What
0:18:44.400,0:18:50.540
you learn is that there are real open[br]questions. For example within the C
0:18:50.540,0:18:55.759
standards committee and within ARM a few[br]years ago about exactly what these things
0:18:55.759,0:19:01.130
even were. And that is a bit disturbing[br]given that these are the foundations for
0:19:01.130,0:19:06.009
all of our information infrastructure.[br]There is also a lot of other work going on
0:19:06.009,0:19:11.010
with other people within this REMS project[br]on web assembly and Javascript and file
0:19:11.010,0:19:16.000
systems and other concurrency. I don't[br]have time to talk about those but Hannes
0:19:16.000,0:19:19.640
Mehnert is going to talk to us a little[br]bit about TCP later today. You should go
0:19:19.640,0:19:22.760
to that talk too if you like this one. If[br]you don't like this one you should still
0:19:22.760,0:19:33.169
go to that talk. Okay so this is doing[br]somewhat better. Certainly more rigorous
0:19:33.169,0:19:41.519
engineering of specifications, but so far[br]only for existing infrastructure for this
0:19:41.519,0:19:49.539
C language, ARM architecture, what have[br]you. So what if we try and do better, what
0:19:49.539,0:19:54.750
if we try and build better security in. So[br]the architectures that we have, really
0:19:54.750,0:20:00.250
they date back to the 1970s or even 60s[br]with the idea of virtual memory. That
0:20:00.250,0:20:04.759
gives you - I won't go into what it is -[br]but that gives you very coarse-grained
0:20:04.759,0:20:10.269
protection. You can have your separate[br]processes isolated from each other and on
0:20:10.269,0:20:15.350
a good day you might have separate browser[br]tabs isolated from each other in some
0:20:15.350,0:20:22.429
browsers, sometimes, but it doesn't scale.[br]It's just too expensive if you have lots of
0:20:22.429,0:20:27.090
little compartments. One thing we can do[br]is certainly design much better
0:20:27.090,0:20:34.389
programming languages than C but all of[br]that legacy code, that's got a massive
0:20:34.389,0:20:40.520
inertia. So an obvious question is whether[br]we can build in better security into the
0:20:40.520,0:20:45.591
hardware that doesn't need some kind of[br]massive pervasive change to all the
0:20:45.591,0:20:51.409
software we ever wrote. And that's the[br]question that a different large project,
0:20:51.409,0:20:55.630
the CHERI project has been addressing. So[br]this is something that's been going on for
0:20:55.630,0:21:01.529
the last 8 years or so, mostly at SRI and[br]at Cambridge funded by DARPA and the EPSRC
0:21:01.529,0:21:06.570
and ARM and other people, mostly led by[br]Robert Watson, Simon Moore, Peter Neumann
0:21:06.570,0:21:14.660
and a cast of thousands. And me a tiny[br]bit. So... How, don't do that. I should
0:21:14.660,0:21:20.889
learn to stop pushing this button. It's[br]very hard to not push the button. So
0:21:20.889,0:21:25.190
here's the question in a more focused way[br]that CHERI is asking: Can we build
0:21:25.190,0:21:33.330
hardware support which is both efficient[br]enough and deployable enough that gives us
0:21:33.330,0:21:37.340
both fine grained memory protection to[br]prevent that kind of bug that we were
0:21:37.340,0:21:42.090
talking about earlier, well that kind of[br]leak at least, and also scalable
0:21:42.090,0:21:45.990
compartmentalization. So you can take bits[br]of untrusted code and put them in safe
0:21:45.990,0:21:52.879
boxes and know that nothing bad is going[br]to get out. That's the question. The
0:21:52.879,0:21:57.389
answer... The basic idea of the answer is[br]pretty simple and it also dates back to
0:21:57.389,0:22:02.800
the 70s is to add hardware support for[br]what people call capabilities, and we'll
0:22:02.800,0:22:07.840
see that working in a few moments. The[br]idea is that this will let programmers
0:22:07.840,0:22:12.460
exercise the principle of least privilege,[br]so with each little bit of code having
0:22:12.460,0:22:17.809
only the permissions it needs to do its[br]job and also the principle of intentional
0:22:17.809,0:22:25.790
use. So each little bit of code, when[br]it uses one of those capabilities, will be
0:22:25.790,0:22:30.419
required to use it explicitly rather[br]than implicitly. So the intention of the
0:22:30.419,0:22:36.129
code has to be made plain. So let's see[br]how this works. So these capabilities
0:22:36.129,0:22:44.120
they're basically replacing some or maybe[br]all of the pointers in your code. So if we
0:22:44.120,0:22:51.050
take this example again in ISO C we had an[br]address and in the standard, well, the
0:22:51.050,0:22:54.549
standard is not very clear about this but[br]we're trying to make it more clear, an
0:22:54.549,0:23:00.960
allocation ID, or something like it. Now[br]in CHERI C we can run the same code but
0:23:00.960,0:23:05.529
we might represent this pointer not just[br]with that number at runtime but with
0:23:05.529,0:23:12.769
something that contains that number and[br]the base of the original allocation and
0:23:12.769,0:23:17.399
the length of the original allocation and[br]some permissions. And having all of those
0:23:17.399,0:23:23.020
in the pointer means that we can do - the[br]hardware can do - at run time, at access
0:23:23.020,0:23:29.530
time, a very efficient check that this is[br]within this region of memory. And if it's
0:23:29.530,0:23:36.269
not it can fail stop and trap. No[br]information leak. I need a bit more
0:23:36.269,0:23:41.090
machinery to make this actually work. So[br]it would be nice if all of these were 64
0:23:41.090,0:23:45.940
bit numbers but then you get a 256 bit[br]pointer, and that's a bit expensive so
0:23:45.940,0:23:52.789
there's a clever compression scheme that[br]squeezes all this down into 128 bits
0:23:52.789,0:24:00.299
with enough accuracy. And then you need to[br]design the instruction set of the CPU
0:24:00.299,0:24:07.390
carefully to ensure that you can never[br]forge one of these capabilities with a
0:24:07.390,0:24:13.120
normal instruction. You can never just[br]magic one up out of a bunch of bits that
0:24:13.120,0:24:19.710
you happen to find lying around. So all[br]the capability manipulating instructions,
0:24:19.710,0:24:27.159
they're going to take a valid capability[br]and construct another, possibly smaller,
0:24:27.159,0:24:32.020
in memory extent, or possibly with less[br]permissions, new capability. They're never
0:24:32.020,0:24:38.610
going to grow the memory extent or add[br]permissions. And when the hardware starts,
0:24:38.610,0:24:43.370
at hardware initialization time, the[br]hardware will hand the software a
0:24:43.370,0:24:48.909
capability to everything and then as the[br]operating system works and the linker
0:24:48.909,0:24:55.820
works and the C language allocator works,[br]these capabilities will be chopped up into
0:24:55.820,0:25:03.181
ever finer and smaller pieces to be[br]handed out to the right places. And then we need
0:25:03.181,0:25:10.220
one more little thing. So this C language[br]world we're living in here, or really a
0:25:10.220,0:25:16.330
machine code language world so there's[br]nothing stopping code writing bytes of
0:25:16.330,0:25:23.340
stuff. So you have to somehow protect[br]these capabilities against being forged by
0:25:23.340,0:25:28.210
the code, either on purpose or[br]accidentally writing bytes over the top.
0:25:28.210,0:25:34.900
You need to preserve their integrity. So[br]that's done by adding for each of these
0:25:34.900,0:25:42.830
128-bit-sized aligned units of memory[br]just one extra bit. That holds a tag and
0:25:42.830,0:25:55.159
that tag records whether this memory holds[br]a currently valid capability or not. So
0:25:55.159,0:25:59.519
all the capability manipulating[br]instructions, if they're given a valid
0:25:59.519,0:26:06.100
capability with a tag set then the result[br]will still have the tag set. But if you
0:26:06.100,0:26:13.900
write, so you just wrote that byte there[br]which might change the base then the
0:26:13.900,0:26:20.230
hardware will clear that tag and these[br]tags, they're not conventionally
0:26:20.230,0:26:25.190
addressable, they don't have addresses.[br]They just stop there in the memory system
0:26:25.190,0:26:30.470
of the hardware. So there is no way that[br]software can inaudible through these. So this is
0:26:30.470,0:26:39.240
really cool. We've taken, what in ISO was[br]undefined behavior, in the standard, and
0:26:39.240,0:26:43.970
in implementations was an information leak, and[br]we've turned it into something that in
0:26:43.970,0:26:49.200
CHERI C, in both the specification and[br]the implementation, is guaranteed to trap,
0:26:49.200,0:26:56.700
to fail stop, and not leak that[br]information. So another few things about
0:26:56.700,0:27:01.500
the design that I should mention. I think[br]all these blue things I think I've talked
0:27:01.500,0:27:07.610
about. But then a lot of care has gone in[br]to make this be very flexible to make it
0:27:07.610,0:27:12.499
possible to adopt it. So you can use[br]capabilities for all pointers, if you want
0:27:12.499,0:27:18.999
to, or just at some interfaces. You might[br]for example use them at the kernel
0:27:18.999,0:27:24.830
userspace interface. It coexists very[br]nicely with existing C and C++. You don't
0:27:24.830,0:27:29.139
need to change the source code very much[br]at all - and we'll see a tiny bit of
0:27:29.139,0:27:36.860
data about this - to start[br]using these. It coexists with the existing
0:27:36.860,0:27:41.140
virtual memory machinery, so you can use[br]both capabilities and virtual memory, if
0:27:41.140,0:27:45.960
you want, or you can just use capabilities[br]if you want or just use virtual memory if
0:27:45.960,0:27:50.940
you want. And then there's some more[br]machinery, which I'm not going to talk
0:27:50.940,0:27:54.999
about, for doing this secure encapsulation[br]stuff. What I've talked about so far is
0:27:54.999,0:27:59.440
basically the fine-grained memory[br]protection story. And I should say the
0:27:59.440,0:28:05.379
focus so far is on spatial memory safety.[br]So that's not in the first instance
0:28:05.379,0:28:11.450
protecting against reuse of an old[br]capability in a bad way, but there are various
0:28:11.450,0:28:17.711
schemes for supporting temporal memory[br]safety with basically the same setup. Okay
0:28:17.711,0:28:24.820
so. So how do we... Okay this is some kind[br]of a high level idea. How do we know that
0:28:24.820,0:28:32.669
it can be made to work. Well the only way[br]is to actually build it and see. And this
0:28:32.669,0:28:36.950
has been a really interesting process[br]because mostly when you're building
0:28:36.950,0:28:40.980
something either in academia or an[br]industry you have to keep most of the
0:28:40.980,0:28:46.730
parts fixed, I mean you are not routinely[br]changing both the processor and the
0:28:46.730,0:28:51.149
operating system, because that would just[br]be too scary. But here we have been doing
0:28:51.149,0:28:57.850
that and indeed we've really been doing a[br]three-way codesign of building
0:28:57.850,0:29:02.809
hardware and building and adapting[br]software to run above it and also building
0:29:02.809,0:29:09.249
these mathematical models all at the same[br]time. This is, well, A: it's a lot of fun
0:29:09.249,0:29:13.220
because you can often get groups of people[br]together that can do all of those things
0:29:13.220,0:29:20.269
but B, it's really effective. So what[br]we've produced then is an architecture
0:29:20.269,0:29:25.050
specification. In fact, extending MIPS[br]because it was conveniently free of IP
0:29:25.050,0:29:32.190
concerns and that specification is one of[br]these mathematically rigorous things
0:29:32.190,0:29:36.350
expressed in a domain specific language[br]specifically for writing instruction set
0:29:36.350,0:29:40.139
architecture specifications, and we've[br]designed and implemented actually two of
0:29:40.139,0:29:44.399
those. And then there are hardware[br]processor implementations that actually
0:29:44.399,0:29:49.249
run on FPGAs. And a lot of hardware[br]research devoted to making this go fast
0:29:49.249,0:29:54.379
and be efficient despite these extra tags[br]and whatnot. And then there's a big
0:29:54.379,0:30:00.669
software stack adapted to run over this[br]stuff. Including Clang and a FreeBSD
0:30:00.669,0:30:06.620
kernel and FreeBSD userspace and WebKit[br]and all kinds of pieces. So this is a very
0:30:06.620,0:30:12.309
major evaluation effort. That one is not[br]normally able to do. This is why, this
0:30:12.309,0:30:18.049
combination of things is why that list of[br]people up there was about 40 people. I say
0:30:18.049,0:30:22.559
this is based on MIPS but the ideas are[br]not specific to MIPS, there are ongoing
0:30:22.559,0:30:28.190
research projects to see if this can be[br]adapted to the ARM application class
0:30:28.190,0:30:32.580
architecture and the ARM microcontroller[br]architecture and the RISC-V. We'll see how
0:30:32.580,0:30:41.529
that goes, in due course. Okay so then[br]with all of these pieces we can evaluate
0:30:41.529,0:30:46.269
whether it actually works. And that's[br]still kind of an ongoing process but the
0:30:46.269,0:30:57.180
data that we've got so far is pretty[br]encouraging. So we see here first,
0:30:57.180,0:31:08.230
performance. So you see, maybe a percent[br]or two in many cases of runtime overhead.
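The cache-pressure point can be illustrated with a toy calculation. Everything here is an illustrative assumption, not measured CHERI data: a cons cell with an 8-byte payload and one pointer field, a 64-byte cache line, and capability pointers modelled as 16 bytes with 16-byte alignment.

```c
#include <stddef.h>

/* Toy model: why doubling pointer size adds cache pressure in
   pointer-heavy workloads. With 8-byte pointers a node is 16 bytes,
   so 4 nodes share a 64-byte cache line; with 16-byte capabilities
   (and capability alignment) a node is 32 bytes, so only 2 nodes
   fit, roughly doubling cache traffic when chasing pointers.
   The layouts are hypothetical, not actual CHERI measurements. */

static size_t node_size(size_t payload, size_t ptr_size) {
    size_t sz = payload + ptr_size;
    /* round the node up to the pointer's alignment */
    return (sz + ptr_size - 1) / ptr_size * ptr_size;
}

static size_t nodes_per_cache_line(size_t node_sz) {
    return 64 / node_sz;   /* assuming a 64-byte line */
}
```

On these assumed layouts, `node_size(8, 8)` gives 16 bytes (4 nodes per line) and `node_size(8, 16)` gives 32 bytes (2 nodes per line), which is the "extra pressure from those larger pointers" in miniature.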
0:31:08.230,0:31:11.500
There are workloads where it performs[br]badly if you have something that's very
0:31:11.500,0:31:17.270
pointer heavy, then the extra pressure[br]from those larger pointers will be a bit
0:31:17.270,0:31:24.090
annoying, but really it seems to be[br]surprisingly good. Then is it something
0:31:24.090,0:31:31.419
that can work without massive adaptation to[br]the existing code. Well, in this port of
0:31:31.419,0:31:38.250
the FreeBSD userspace, so of all the[br]programs that run, there
0:31:38.250,0:31:45.120
were something like 20,000 files and only[br]200 of those needed a change. And that's
0:31:45.120,0:31:52.081
been done by one or two people in not all[br]that large an amount of time. Some of the
0:31:52.081,0:31:57.230
other code that's more like an[br]operating system needs a bit more invasive
0:31:57.230,0:32:06.070
change but still, it seems to be viable.[br]Is it actually more secure? How are you
0:32:06.070,0:32:10.710
going to measure that? We like measuring[br]things; as a whole we try and do science
0:32:10.710,0:32:17.440
where we can. Can we measure that? Not[br]really. But it certainly does, the whole
0:32:17.440,0:32:23.299
setup certainly does, mitigate against[br]quite a large family of known attacks. I
0:32:23.299,0:32:28.260
mean if you try and run Heartbleed you'll[br]get a trap. It will say trap. I think it
0:32:28.260,0:32:36.499
will say signal 34 in fact. So that's[br]pretty good. And if you think about
0:32:36.499,0:32:42.950
attacks, very many of them involve a chain[br]of multiple flaws put together by an
0:32:42.950,0:32:49.399
ingenious member of the C3 community or[br]somebody else, and very often, at least
0:32:49.399,0:33:00.119
one of those is some kind of memory access[br]permission flaw of some kind or other.
0:33:00.119,0:33:03.539
Okay so this is nice, but I was talking a[br]lot in the first part of the talk, about
0:33:03.539,0:33:11.920
rigorous engineering. So here we've got[br]formally defined definitions of the
0:33:11.920,0:33:16.779
architecture and that's been really useful[br]for testing against and for doing test
0:33:16.779,0:33:22.889
generation on the basis of, and doing[br]specification coverage in an automated
0:33:22.889,0:33:28.809
way. But surely we should be able to do[br]more than that. So this formal definition
0:33:28.809,0:33:34.860
of the architecture, what does it look[br]like. So for each instruction of this
0:33:34.860,0:33:40.669
CHERI MIPS processor, there's a definition[br]and that definition, it looks a bit like
0:33:40.669,0:33:47.980
normal code. So here's a definition of how[br]the instruction for "capability increment
0:33:47.980,0:33:54.159
cursor immediate" goes. So this is a thing[br]that takes a capability register and an
0:33:54.159,0:33:58.310
immediate value and it tries to add[br]something on to the cursor of that
0:33:58.310,0:34:03.640
capability. And what you see here is some[br]code, looks like code, that defines
0:34:03.640,0:34:06.830
exactly what happens and also defines[br]exactly what happens if something goes
0:34:06.830,0:34:13.140
wrong. You can see there's some kind of an[br]exception case there. That's a processor...
0:34:13.140,0:34:19.139
no, that's a... yeah, that's raising a[br]processor exception. So it looks like
0:34:19.139,0:34:25.270
code, but, from this, we can generate both[br]reasonably high performance emulators,
0:34:25.270,0:34:30.780
respectively in C and in OCaml, and also[br]mathematical models in the theorem
0:34:30.780,0:34:34.770
provers. So three of the main theorem[br]provers in the world are called Isabelle
0:34:34.770,0:34:41.690
and HOL and Coq. And we generate[br]definitions in all of those. So then we
0:34:41.690,0:34:46.489
got a mathematical definition of the[br]architecture. Then finally, we can start
0:34:46.489,0:34:51.530
stating some properties. So CHERI was[br]designed with some kind of vague security
0:34:51.530,0:34:57.809
goals in mind from the beginning, but now[br]we can take, let's say, one of those goals and
0:34:57.809,0:35:06.569
make it into a precise thing. So this is a[br]kind of a secure encapsulation, kind of
0:35:06.569,0:35:12.540
thing, not exactly the memory leak that we[br]were looking at before. Sorry, the secret
0:35:12.540,0:35:17.820
leak that we were looking at. So let's[br]imagine some CHERI compartment.
0:35:17.820,0:35:22.020
I haven't told you exactly what that means[br]but let's suppose that that's understood
0:35:22.020,0:35:27.490
and that inside that compartment there is[br]a piece of software running. And that
0:35:27.490,0:35:35.559
might be maybe a video codec or a web[br]browser or even a data structure
0:35:35.559,0:35:44.410
implementation maybe, or a C++ class and[br]all of its objects maybe. So imagine that
0:35:44.410,0:35:50.049
compartment running and imagine the[br]initial capabilities that it has access to
0:35:50.049,0:35:55.640
via registers and memory and via all the[br]capabilities it has access to via the
0:35:55.640,0:35:59.510
memory that it has access to and so on[br]recursively. So imagine all of those
0:35:59.510,0:36:06.359
capabilities. So the theorem says: no[br]matter how this code executes, whatever it
0:36:06.359,0:36:15.730
does, however compromised or buggy it was,[br]in an execution up until any point at
0:36:15.730,0:36:24.020
which it raises an exception or makes an[br]inter-compartment call, it can't make any
0:36:24.020,0:36:30.530
access which is not allowed by those[br]initial capabilities. You have to exclude
0:36:30.530,0:36:35.230
exceptions and inter-compartment calls[br]because by design they do introduce new
0:36:35.230,0:36:42.690
capabilities into the execution. But until[br]that point you get this guarantee. And
0:36:42.690,0:36:48.849
this is something that we've proved,[br]actually [...] has proved, with a machine-
0:36:48.849,0:36:55.980
checked proof in, in fact, the Isabelle[br]theorem prover. So this is a fact which is
0:36:55.980,0:37:03.589
about as solidly known as any fact the[br]human race knows. These provers they're
0:37:03.589,0:37:10.450
checking a mathematical proof in[br]exceedingly great detail with great care
0:37:10.450,0:37:14.230
and attention and they're structured in[br]such a way that the soundness of the prover
0:37:14.230,0:37:21.300
depends only on some tiny little kernel.[br]So conceivably cosmic rays hit the
0:37:21.300,0:37:27.230
transistors at all the wrong points. But[br]basically we know this for sure. I
0:37:27.230,0:37:34.299
emphasize: this is one security property. We[br]have not proved, we certainly wouldn't
0:37:34.299,0:37:39.440
claim to have proved that CHERI is secure[br]and there are all kinds of other kinds of
0:37:39.440,0:37:44.849
attack that this statement doesn't even[br]address but at least this one property we
0:37:44.849,0:37:49.880
know it for real. So that's kind of[br]comforting and not a thing that
0:37:49.880,0:38:01.420
conventional computer science engineering[br]has been able to do on the whole. So, are
0:38:01.420,0:38:10.750
we taming the chaos then? Well no. Sorry.[br]But I've shown you two kinds of things.
0:38:10.750,0:38:15.980
The first was showing how we can do[br]somewhat more rigorous engineering for
0:38:15.980,0:38:21.640
that existing infrastructure. It's now[br]feasible to do that and in fact I believe
0:38:21.640,0:38:26.370
it has been feasible to build[br]specifications which you can use as test
0:38:26.370,0:38:31.460
oracles for many decades. We haven't[br]needed any super fancy new tech for that.
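The test-oracle idea can be sketched in miniature: run the same inputs through an executable specification and through the implementation under test, and flag any divergence. The `spec_adc`/`impl_adc` functions below are invented toy stand-ins for one made-up 8-bit "add with carry" instruction, not part of any real ISA spec; real oracles generated from an architecture specification work the same way at vastly larger scale.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy "specification as test oracle": compare spec and implementation
   on many inputs. Both functions here are hypothetical examples. */

typedef struct { uint8_t result; uint8_t carry; } adc_out;

/* the executable specification: the intended meaning */
static adc_out spec_adc(uint8_t a, uint8_t b, uint8_t cin) {
    unsigned wide = (unsigned)a + b + (cin & 1);
    return (adc_out){ (uint8_t)wide, wide > 0xff };
}

/* the implementation under test (here deliberately identical) */
static adc_out impl_adc(uint8_t a, uint8_t b, uint8_t cin) {
    unsigned wide = (unsigned)a + b + (cin & 1);
    return (adc_out){ (uint8_t)wide, wide > 0xff };
}

/* run random inputs through both and compare against the oracle;
   returns 1 if they agree on every trial, 0 on first divergence */
static int differential_test(unsigned trials, unsigned seed) {
    srand(seed);
    for (unsigned i = 0; i < trials; i++) {
        uint8_t a = rand() & 0xff, b = rand() & 0xff, c = rand() & 1;
        adc_out s = spec_adc(a, b, c), t = impl_adc(a, b, c);
        if (s.result != t.result || s.carry != t.carry)
            return 0;   /* divergence: an implementation bug found */
    }
    return 1;
}
```

No fancy technology involved, which is the point being made: the hard part is having a specification precise enough to execute at all.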
0:38:31.460,0:38:41.140
We've just needed to focus on that idea.[br]So that's for existing infrastructure and
0:38:41.140,0:38:47.470
then CHERI is a relatively lightweight[br]change that gives built-in improved
0:38:47.470,0:38:53.520
security and we think it's demonstrably[br]deployable at least in principle. Is it
0:38:53.520,0:38:58.010
deployable in practice? Are we going to[br]have it in all of our phones five years from
0:38:58.010,0:39:05.890
now? Will these all be CHERI enabled? I[br]can't say. I think it is plausible that
0:39:05.890,0:39:09.590
that should happen and we're exploring[br]various routes that might mean that that
0:39:09.590,0:39:20.260
happens, and we'll see how that goes. Okay[br]so with that I am aiming to have given you at
0:39:20.260,0:39:25.760
least not a whole lot of hope but just a[br]little bit of hope that the world can be
0:39:25.760,0:39:30.670
made a better place and I encourage you to[br]think about ways to do it because Lord
0:39:30.670,0:39:38.290
knows we need it. But maybe you can at[br]least dream for a few moments of living in
0:39:38.290,0:39:46.010
a better world. And with that I thank you[br]for your attention.
0:39:46.010,0:39:57.000
applause
0:39:57.000,0:40:06.420
Herald Angel: Thanks very much. And with[br]that we'll head straight into the Q and A.
0:40:06.420,0:40:10.210
We'll just start with mic number one which[br]I can see right now.
0:40:10.210,0:40:15.720
Q: Yes, thanks for the very encouraging[br]talk. I have a question about how to
0:40:15.720,0:40:21.500
secure the part below that. The theorem that[br]you stated assumes that the machine
0:40:21.500,0:40:28.200
description matches the hardware, at least[br]it describes the hardware accurately. But
0:40:28.200,0:40:33.130
are there any attempts to check that the[br]hardware actually works like that? That
0:40:33.130,0:40:37.170
there are no holes in the hardware? That[br]some combination of machine commands
0:40:37.170,0:40:42.980
work differently or allow memory access[br]that is not in the model? So, is it
0:40:42.980,0:40:48.990
possible to use hardware designs and check[br]that it matches the spec and will Intel or
0:40:48.990,0:40:53.730
AMD try to check their model against that[br]spec?
0:40:53.730,0:40:59.700
Sewell: OK. So the question is basically[br]can we do provably correct hardware
0:40:59.700,0:41:05.560
underneath this architectural abstraction.[br]And the answer is... So it's a good
0:41:05.560,0:41:12.700
question. People are working on that and[br]it can be done I think on at least a
0:41:12.700,0:41:22.130
modest academic scale. Trying to do that[br]in its full glory for an industrial
0:41:22.130,0:41:27.830
superscalar processor design, is right now[br]out of reach. But it's certainly something
0:41:27.830,0:41:35.259
one would like to be able to do. I think[br]we will get there maybe in a decade or so.
0:41:35.259,0:41:41.910
A decade is quick.[br]Herald Angel: Thanks. Mic number four is
0:41:41.910,0:41:48.471
empty again. So we'll take the Internet.[br]Sewell: Where is the internet?
0:41:48.471,0:41:53.359
Signal Angel: The internet is right here[br]and everywhere so ruminate would like to
0:41:53.359,0:41:59.960
know is the effort and cost of building in security[br]to hardware as you've described in your
0:41:59.960,0:42:05.880
talk significantly greater or less than[br]the effort and cost of porting software to
0:42:05.880,0:42:12.059
more secure languages.[br]Sewell: So, is the effort of building new
0:42:12.059,0:42:17.349
hardware more or less than the cost of[br]porting existing software to better
0:42:17.349,0:42:26.280
languages? I think the answer has to be[br]yes. It is less. The difficulty with
0:42:26.280,0:42:30.700
porting software is all of you and all[br]your colleagues and all of your friends
0:42:30.700,0:42:36.009
and all your enemies have been beavering[br]away writing lines of code for 50 years.
0:42:36.009,0:42:42.409
Really, I don't know what to say,[br]effectively. Really numerously. There's an
0:42:42.409,0:42:46.280
awful lot of it. It's a really terrifying[br]amount of code out there. It's very hard
0:42:46.280,0:42:54.388
to get people to rewrite it.[br]Signal Angel: OK, mic two.
0:42:54.388,0:43:03.029
Q: OK, given that cost of ... of today on the[br]software level not on the hardware level, is it
0:43:03.029,0:43:11.940
possible to go halfway, using an entire[br]system as an oracle? So during development
0:43:11.940,0:43:19.280
you test the code base with the secure[br]hardware but then essentially ship the
0:43:19.280,0:43:28.859
same software to insecure hardware and inaudible[br]Sewell: So this question, if I paraphrase
0:43:28.859,0:43:35.819
it is can we use this hardware[br]implementation really as the perfect
0:43:35.819,0:43:45.020
sanitiser? And I think there is scope for[br]doing that. And you can even imagine
0:43:45.020,0:43:49.750
running this not on actual silicon[br]hardware but in like a QEMU implementation
0:43:49.750,0:43:54.819
which we have in order to do that. And I[br]think for that purpose it could
0:43:54.819,0:44:00.720
already be a pretty effective bug finding[br]tool. It would be worth doing. But you
0:44:00.720,0:44:06.250
would note that, as for any testing, it will only[br]explore a very tiny fraction of the
0:44:06.250,0:44:13.539
possible paths. So we would like to have[br]it always on if we possibly can.
0:44:13.539,0:44:19.690
Herald Angel: OK. The internet again.[br]Signal Angel: OK someone would like to
0:44:19.690,0:44:26.130
know how does your technique compare to[br]Intel MPX which failed miserably with a inaudible
0:44:26.130,0:44:33.820
ignoring it. Will it be better or faster?[br]Could you talk a little bit about that?
0:44:33.820,0:44:38.400
Sewell: I think for that question, for a[br]real answer to that question, you would
0:44:38.400,0:44:45.549
need one of my more hardware colleagues.[br]What I can say is they make nasty faces
0:44:45.549,0:44:53.450
when you say MPX.[br]Herald Angel: Well that's that. Mic number
0:44:53.450,0:44:56.880
one.[br]Q: Somewhat inherent to your whole
0:44:56.880,0:45:02.460
construction was this idea of having an[br]unchangeable bit which described whether
0:45:02.460,0:45:08.539
your pointer contents have been changed.[br]Is this even something that can be done in
0:45:08.539,0:45:12.850
software? Or do you have to construct a[br]whole extra RAM or something?
0:45:12.850,0:45:21.309
Sewell: So you can't do it exclusively in[br]software because you need to protect it.
0:45:21.309,0:45:25.550
In the hardware implementations that my[br]colleagues have built it's done in a
0:45:25.550,0:45:32.700
reasonably efficient way. So it's cached[br]and managed in clever ways to ensure that
0:45:32.700,0:45:38.730
it's not terribly expensive to do. But you[br]do need slightly wider data paths in your
0:45:38.730,0:45:45.460
cache protocol and things like that to[br]manage it.
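A minimal software model of that out-of-band tag bit may help: one hidden valid-bit per capability-sized memory granule, where any ordinary data write clears the tag, so a "capability" forged out of data writes can never be loaded as a valid one. This is purely illustrative; as the answer says, real hardware keeps the bit in slightly widened data paths and caches, not in a shadow array.

```c
#include <stdint.h>

/* Toy model of tagged memory: mem[] holds capability-sized granules,
   tag[] is the hidden out-of-band valid bit for each granule. */

#define GRANULES 16

static uint64_t mem[GRANULES];   /* capability-sized granules      */
static uint8_t  tag[GRANULES];   /* one hidden bit per granule     */

static void store_data(int i, uint64_t v) {
    mem[i] = v;
    tag[i] = 0;                  /* any data write clears the tag  */
}

static void store_cap(int i, uint64_t c) {
    mem[i] = c;
    tag[i] = 1;                  /* only capability stores set it  */
}

/* copies the capability out only if the tag is still set;
   returns 0 (a "trap" in the real system) otherwise */
static int load_cap(int i, uint64_t *out) {
    if (!tag[i]) return 0;
    *out = mem[i];
    return 1;
}
```

Note that overwriting a granule with the exact same bit pattern via `store_data` still invalidates it: the tag, not the bits, is what makes a capability.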
0:45:45.460,0:45:51.527
Herald Angel: OK. Mic seven.[br]Q: How well does this system help secure
0:45:51.527,0:45:56.850
the control flow integrity,[br]compared to other systems, like hardware
0:45:56.850,0:46:03.870
systems data flow isolation or code[br]pointer integrity in ARM inaudible?
0:46:03.870,0:46:09.310
Sewell: Ah well, that's another question[br]which is a bit hard to answer because then
0:46:09.310,0:46:18.179
it depends somewhat on how you're using[br]the capabilities. So you can arrange here
0:46:18.179,0:46:25.549
that each, say, C language allocation[br]is independently protected and indeed you
0:46:25.549,0:46:30.470
can also arrange - where this is[br]sensible - that each subobject is
0:46:30.470,0:46:34.920
independently protected. And those[br]protections are going to give you
0:46:34.920,0:46:39.310
protection against trashing bits of the[br]stack because they'll restrict the
0:46:39.310,0:46:50.669
capability to just those objects.[br]Herald Angel: OK the internet again.
0:46:50.669,0:46:57.740
Signal Angel: Twitter would like to know:[br]is it possible to run
0:46:57.740,0:47:04.289
CHERI C without custom hardware?[br]Sewell: Can you run CHERI C without custom
0:47:04.289,0:47:12.950
hardware? And the answer is you can run it[br]in emulation on top of QEMU. That works
0:47:12.950,0:47:18.040
pretty fast, I mean fast enough for[br]debugging as the earlier question was
0:47:18.040,0:47:28.609
talking about. You can imagine you know[br]somewhat faster emulations of that. It's
0:47:28.609,0:47:34.340
not clear how much it's worth going down[br]that route. The real potential big win is
0:47:34.340,0:47:41.240
to have this be always on and that's what[br]we would like to have.
0:47:41.240,0:47:47.390
Herald Angel: OK. Mic one.[br]Q: Do you have some examples of the kinds
0:47:47.390,0:47:51.900
of code changes that you need to the[br]operating system and userland that you
0:47:51.900,0:47:55.859
mentioned?[br]Sewell: So, okay so what kind of changes
0:47:55.859,0:48:07.140
do you need to adapt code to CHERI? So,[br]if you look at C code there is the C
0:48:07.140,0:48:12.520
standard that deems a whole lot of things[br]to be undefined behavior and then if you
0:48:12.520,0:48:19.940
look at actual C code you find that very[br]very much of it does depend on some idiom
0:48:19.940,0:48:28.650
that the C standard does not approve of.[br]So there are, for example, pause
0:48:28.650,0:48:34.740
quite a lot of instances of code that[br]constructs a pointer by going more than
0:48:34.740,0:48:42.380
one past the end, more than the plus-one offset, and then[br]brings it back within range before you use
0:48:42.380,0:48:49.450
it. So in fact in CHERI C many of those[br]cases are allowed but not all of them.
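A sketch of that idiom on a plain array: the pointer transiently goes further out of bounds than the one-past-the-end position ISO C permits, then is brought back in range before any dereference. This is illustrative only; the intermediate value is undefined behaviour per the C standard even though real code relies on it, and CHERI tolerates it because bounds are checked on dereference (very distant values may still not be representable in a compressed capability).

```c
/* The transient out-of-bounds idiom: UB per ISO C at the marked
   line, but the bad value is only held, never dereferenced. */
static int transient_oob(void) {
    int buf[4] = { 10, 20, 30, 40 };
    int *p = buf;
    p += 7;    /* well beyond one-past-the-end (UB per ISO C)  */
    p -= 6;    /* brought back within bounds before any use    */
    return *p; /* dereference happens in range: this is buf[1] */
}
```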
0:48:49.450,0:48:58.200
More exotic cases where there's really[br]some crazy intra-object stuff going on or
0:48:58.200,0:49:03.339
where pointer arithmetic between objects -[br]which the C standard is certainly quite
0:49:03.339,0:49:09.290
down on but which does happen - and code[br]which is, say, manipulating the low-order
0:49:09.290,0:49:13.549
bits of a pointer to store some data in[br]it. That's pretty common. You can do it in
0:49:13.549,0:49:18.170
CHERI C but you might have to adapt the[br]code a little bit. Other cases are more
0:49:18.170,0:49:22.849
fundamental. So say, in operating system[br]code if you swap out a page to disk and
0:49:22.849,0:49:28.460
then you swap it back in then somehow the[br]operating system has to reconstruct new
0:49:28.460,0:49:35.200
capabilities from a large capability which[br]it kept for the purpose. So that needs a
0:49:35.200,0:49:41.869
bit more work. So it's kind of a[br]combination. Some things you would regard
0:49:41.869,0:49:48.320
as good changes to the code anyway and[br]others are more radical.
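The low-order-bit idiom mentioned in that answer looks like this: hiding a small flag in the alignment bits of a pointer. The sketch below is plain C with illustrative names; the CHERI-relevant point is that the round-trip must go through `uintptr_t` (capability-sized in CHERI C) so the capability metadata survives, whereas stuffing the bits through an ordinary integer type would strip the tag and make the pointer trap on use.

```c
#include <stdint.h>

/* Hiding a 2-bit flag in the alignment bits of a pointer to a
   4-byte-aligned object. Illustrative sketch, not CHERI API. */

typedef struct { int value; } node;  /* int-aligned: bits 0-1 free */

static node *tag_ptr(node *p, unsigned flag) {
    /* round-trip via uintptr_t so, under CHERI C, the capability
       (bounds, permissions, tag) is preserved alongside the bits */
    return (node *)((uintptr_t)p | (flag & 0x3u));
}

static unsigned ptr_flag(node *p) {
    return (unsigned)((uintptr_t)p & 0x3u);
}

static node *strip_ptr(node *p) {
    return (node *)((uintptr_t)p & ~(uintptr_t)0x3u);
}
```

Code written this way usually needs only small adaptations for CHERI C; code that smuggles the pointer through `long` or a file is where the more radical changes come in.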
0:49:48.320,0:49:51.600
Herald Angel: OK, another one from the[br]internet.
0:49:51.600,0:50:00.000
Signal Angel: I lost one pause[br]Sewell: The Internet has gone quiet. Yes.
0:50:00.000,0:50:04.420
Signal Angel: Last question from the[br]internet. Are there any plans to test
0:50:04.420,0:50:11.950
the impact on Linux or just FreeBSD?[br]Sewell: If anyone has a good number of
0:50:11.950,0:50:19.980
spare minutes then that would be lovely:[br]to try the impact on Linux, not just
0:50:19.980,0:50:26.400
FreeBSD. The BSD code base is simpler[br]and maybe cleaner and my systemsy
0:50:26.400,0:50:30.230
colleagues are already familiar with it.[br]It's the obvious target for an academic
0:50:30.230,0:50:35.260
project, and doing it was already a humungous[br]amount of work. So I think we would love
0:50:35.260,0:50:40.910
to know how Linux and especially how[br]Android could be adapted but it's not
0:50:40.910,0:50:46.259
something that we have the resource to do[br]at the moment.
0:50:46.259,0:50:50.710
Herald Angel: Mic number one again.[br]Q: How likely or dangerous.
0:50:50.710,0:50:55.460
Herald Angel: Mic number one is not[br]working.
0:50:55.460,0:50:59.540
Q: How likely or dangerous do you think it is[br]for a programmer to screw up the
0:50:59.540,0:51:05.329
capabilities he specifies?[br]Sewell: So how often will the programmer
0:51:05.329,0:51:10.750
screw up the capabilities? So in many[br]cases the programmer is just doing an
0:51:10.750,0:51:17.370
access with a C or C++ pointer or C++[br]object in a normal way. They're not
0:51:17.370,0:51:23.560
manually constructing these things. So a[br]lot of that is going to just work. If
0:51:23.560,0:51:28.170
you're building some particular secure[br]encapsulation setup you have to be a bit
0:51:28.170,0:51:34.280
careful. But on the whole I think this is[br]a small risk compared to the state of
0:51:34.280,0:51:40.960
software as we have it now.[br]Herald Angel: OK. Mic eight.
0:51:40.960,0:51:49.220
Q: So existing memory protection[br]techniques are quite patch-worky, like
0:51:49.220,0:52:06.160
canaries and address space layout[br]randomisation. Is this a system designed to replace or
0:52:06.160,0:52:17.920
supplement these measures?[br]Sewell: I think if you pause So again it
0:52:17.920,0:52:24.480
depends a bit how you use it. So if you[br]have capabilities really everywhere then
0:52:24.480,0:52:33.240
there's not much need for address space[br]layout randomization or canaries to
0:52:33.240,0:52:41.829
protect against explicit information[br]leaks. I imagine that you might still want
0:52:41.829,0:52:48.500
randomisation to protect against some side-[br]channel flows but canaries not so much.
0:52:48.500,0:52:51.539
Whether you actually do, for side-[br]channel flows, I don't know. That's a
0:52:51.539,0:52:56.500
good question.[br]Herald Angel: Mic four please.
0:52:56.500,0:53:02.580
Q: Thanks for the very interesting talk.[br]You presented, with CHERI, a system of
0:53:02.580,0:53:07.529
verifying existing C software, sort of[br]post hoc, and improving its security via
0:53:07.529,0:53:12.450
run-time mechanism...?[br]A: Improving or Proving? Improving yes,
0:53:12.450,0:53:18.560
proving no, yes.[br]Q: So what's your opinion on the viability
0:53:18.560,0:53:23.020
of using a static analysis and static[br]verification for that. Would it be
0:53:23.020,0:53:29.579
possible to somehow analyse software that[br]already exists and is written in these
0:53:29.579,0:53:34.089
unsafe languages and show that it[br]nevertheless has some security properties
0:53:34.089,0:53:40.830
without writing all the proofs manually in[br]like HOL or Isabelle?
0:53:40.830,0:53:47.050
A: Well so if you want to analyse existing[br]software... So, static analysis is already
0:53:47.050,0:53:53.029
useful for finding bugs in existing[br]software. If you want to have assurance
0:53:53.029,0:53:58.869
from static analysis, that's really very[br]tough. So you certainly wouldn't want to
0:53:58.869,0:54:07.430
manually write proofs in terms of the[br]definitional C semantics, you would need
0:54:07.430,0:54:10.849
intermediate infrastructure. And if you[br]look at the people, who have done verified
0:54:10.849,0:54:15.750
software, so verified compilers and[br]verified hypervisor, and verified
0:54:15.750,0:54:25.029
operating systems, all of those are now[br]possible on significant scales, but they
0:54:25.029,0:54:30.510
use whole layers of proof and verification[br]infrastructure for doing that. They're not
0:54:30.510,0:54:35.829
writing proofs by hand for every little[br]detail. And some of those verification
0:54:35.829,0:54:42.180
methods either are, you can imagine them[br]being basically like static analysis, you
0:54:42.180,0:54:47.690
know, you might use a static analysis to[br]prove some relatively simple facts about
0:54:47.690,0:54:54.820
the code and then stitch those together[br]into a larger assembly. I think any kind
0:54:54.820,0:54:59.170
of more complete thing I think is hard to[br]imagine. I mean these analysis tools
0:54:59.170,0:55:03.339
mostly, they rely on making some[br]approximation in order to be able to do
0:55:03.339,0:55:09.750
their thing at all. So it's hard to get[br]assurance out of them.
0:55:09.750,0:55:16.869
Herald Angel: Okay, Mic one.[br]Q: You said you modified a MIPS
0:55:16.869,0:55:22.740
architecture to add some logic to check the[br]capabilities. Do you know what the cost of
0:55:22.740,0:55:30.219
this is regarding computational power[br]and power consumption?
0:55:30.219,0:55:33.030
A: Sorry, can you repeat the last part of[br]that?
0:55:33.030,0:55:39.190
Q: What the cost of this modification[br]regarding computational power, and power
0:55:39.190,0:55:42.300
consumption.[br]A: Ah, so what's the energy cost? So I
0:55:42.300,0:55:46.609
gave you a performance cost. I didn't give[br]you, I carefully didn't give you an energy
0:55:46.609,0:55:52.599
cost estimate, because it's really hard to[br]do in a scientifically credible way,
0:55:52.599,0:55:59.250
without making a more or less production[br]superscalar implementation. And we are
0:55:59.250,0:56:03.500
sadly not in a position to do that,[br]although if you have 10 or 20 million
0:56:03.500,0:56:09.249
pounds, then I would be happy to accept[br]it.
0:56:09.249,0:56:15.220
Herald Angel: Mic number 7, please.[br]Q: How does the class of problems that you
0:56:15.220,0:56:21.440
can address with CHERI compare to the[br]class of problems that are excluded by,
0:56:21.440,0:56:26.680
for example, the Rust programming[br]language?
0:56:26.680,0:56:30.460
A: So how do the problems of CHERI[br]relate to, sorry, how do the problems
0:56:30.460,0:56:39.529
excluded by CHERI relate to the problems[br]excluded by Rust? So if you are happy to
0:56:39.529,0:56:50.859
write all of your code in Rust without[br]ever using the word "unsafe", then maybe
0:56:50.859,0:56:56.240
there would be no point in CHERI, at all.[br]Are you happy to do that?
0:56:56.240,0:57:00.750
Q: Probably not, no.[br]A: I think someone shakes their head
0:57:00.750,0:57:10.000
sideways, yeah.[br]Herald Angel: Mic number 1.
0:57:10.000,0:57:14.640
Q: What do you think about the following[br]thesis: We are building a whole system of
0:57:14.640,0:57:21.109
things, with artificial intelligence and[br]something like that. That is above this
0:57:21.109,0:57:29.760
technical level and it's building another[br]chaos that isn't healed by those
0:57:29.760,0:57:34.009
things.[br]A: It's dreadful, isn't it.
0:57:34.009,0:57:45.700
Laughter[br]A: There are, so you might, in fact some
0:57:45.700,0:57:50.320
of my colleagues are interested in this[br]question, if you, you might well want to
0:57:50.320,0:57:58.819
bound what your artificial intelligence[br]can access or touch and for that you might
0:57:58.819,0:58:03.720
want this kind of technology. But this is[br]not, we are, with machine learning systems
0:58:03.720,0:58:07.410
we are intrinsically building things that[br]we, on the whole, don't understand and
0:58:07.410,0:58:11.960
that will have edge cases that go wrong.[br]And this is not speaking to that in any
0:58:11.960,0:58:17.240
way.[br]Herald Angel: OK. Does the internet have a
0:58:17.240,0:58:24.793
question? No, good. I don't see anyone[br]else, so let's conclude this. Thank you
0:58:24.793,0:58:28.932
very much.[br]A: Thank you.
0:58:28.932,0:58:30.289
applause
0:58:30.289,0:58:35.930
35c3 postroll music
0:58:35.930,0:58:53.000
subtitles created by c3subtitles.de[br]in the year 2019. Join, and help us!