[35C3 preroll music]

Herald Angel: OK. Our next speaker will be trying to tame the chaos. Peter Sewell from the University of Cambridge. A warm round of applause please.

[applause]

Sewell: Thank you! Okay. So here we are, C3 again, with a program full of fascinating and exciting and cool stuff. And if we look just at the security talks, all manner of interesting things. I'm going to go to a lot of these; you should too. Let's see, though, if we read some of the titles: exploiting kernel memory corruptions, attacking end-to-end e-mail encryption. What else have we got: modern Windows userspace exploitation, compromising online accounts. OK, so a lot of these talks, as usual, they're explaining to us that our computers are not doing what we would like, and this, as usual, is the merest tip of a tiny iceberg, alright. We have hundreds of thousands of known vulnerabilities and many unknown vulnerabilities. It's a bit dark here, but let me try some audience participation. How many of you have in the last year written at least a hundred lines of code? And of those people, keep your hands up if you are completely confident that code does the right thing. [laughter] I would like to talk to you later [laughter] and so would the people in the self-driving car that you're probably engineering. So how many unknown vulnerabilities, then? Well, you can take what's in your mind right now and multiply it by... oh, this doesn't work very well. No. No, no. Computers, they do the wrong thing again and again and again. [laughter and some applause] We can multiply by this estimate of about 100 billion lines of new code each year. So if we take all of this — these talks, these numbers — these should make us not happy and excited; these should make us sad, very sad, and frankly rather scared of what is going on in the world. So what can we do? We can, option one, give up using computers altogether. [applause] In most audiences this will be a hard sell, but here I'm sure you're all happy to just turn them off now. Or we can throw up our hands in the air and collapse into some kind of pit of despair. Well, maybe, maybe not. My task today is to give you a tiny smidgen, a very, very small amount of hope that it may be possible to do slightly better. But if we want to do better, we first have to understand why things are so bad. If we look at our aeronautical engineering colleagues, for example, they can make planes which very rarely fall out of the sky. Why is it that they can do that and we are so shit at making computers?
Well, I'm going to reuse a picture that I used at 31C3, which is still the best description I can find of the stack of hardware and software that we live above. Here we go. There's so much information in this, it's just ace. So we see down here all of this transparent silicon, it's the hardware we live above. We see a stack of phases of a C compiler, we see all kinds of other bits and pieces. There might be a slightly hostile wireless environment in this room, for some reason. If we look at this and think about the root causes for why our systems are so bad, we can see terrible things. So the first is, obviously, there's a lot of legacy complexity that we're really stuck with, alright. If you pull out one of those pieces from the middle and try and replace it with something which is not of exactly the same shape, the whole pile will collapse. So we somehow need to find an incremental path to a better world. And then, this is the most fundamental reason that these are not like aeroplanes: these systems are discrete, not continuous. If you take an honest girder made out of a piece of steel and you push on it a little bit, it moves a little bit, basically in proportion. If it is 1 percent too strong, 1 percent too weak, basically it doesn't matter. But in these things, one line of code can mean you're open to a catastrophic exploit — and that's one line in many, many million. OK. And the next thing... this really doesn't work. They're very complicated. But the scary thing is not the static complexity of those lines of code and the number of components, although that's pretty intimidating; the really scary thing is the exponential number of states and execution paths. These two together mean that testing — and testing is the only way we have to build systems which are not instantly broken — testing can never save us from these exploitable errors. And that means ultimately we need to do proof. Honest, machine-checked mathematical proof. And this also tells us that we need to arrange for our systems to be secure for simple reasons that do not depend on the correctness of all of those hundred billion lines of code. Then another thing that we have here. All these interfaces: the architecture interface, the C language interface, the sockets API interface, the TCP interface. All of these we rely on to let different parts of the system be engineered by different organizations. But they're all really badly described and badly defined. So what you find is, for each of these, typically a prose book varying in thickness between about that and about that, full of text.
Well, we still rely on testing — limited though it is — but you can never test anything against those books. So we need instead interface descriptions, definitions, specifications, that are more rigorous, mathematically rigorous, and that are executable — not in the normal sense of executable as an implementation, but executable as a test oracle. So you can compute whether some observed behaviour is allowed or not, and not have to read the book and argue on the Internet. And we also need interface definitions that support this proof that we need. And then, all of this stuff was made up in the 1970s or the sixties, and in the 70s and the 60s the world was a happy place and people walked around with flowers in their hair and nothing bad ever happened. And that is no longer the case. So we need defensive design, but we need defensive design that is somehow compatible, or can be made compatible enough, with this legacy investment. That's quite hard. And then — I can't say very much about this, but we have at the moment very bad incentives, because we have a very strong incentive to make things that are more complex than we can possibly handle or understand. We make things, we add features, we make a thing which is just about shippable, and then we ship it. And then we go on to the next thing. So we need economic and legal incentives for simplicity and for security that we do not yet have. But it's hard to talk about those until we have the technical means to react to those incentives. OK, so what I'm going to do now, I'm going to talk about two of these interfaces, the architecture interface and the C language interface, and see what we can do to make things better. A lot of this has to do with memory. So whoever it was that picked the subtitle for this meeting, "refreshing memories", thank you, because it's great, I love it. Let's refresh your memory quite a way. So I invite you to think back to when you were very, very young, in about 1837, when cool hacking was going on. Charles Babbage was designing the Analytical Engine, and even then you see there was this dichotomy between a mill performing operations and a store holding numbers. This is a plan view of the Analytical Engine — well, it was vapourware, but a design for the Analytical Engine — and you see here these circles, these are columns of numbers, each made out of a stack of cogs, maybe a 50-digit decimal number in each of those columns. And this array, really, he imagined it going on out to over there somewhere, with about 1000 numbers. So even then you have a memory which is an array of numbers.
I think these were not — I don't think you could do address computation on these things, I think addresses were only constants — but, still, basically an array of numbers. So okay, what have we got now? Let's look a bit at C. How many of those people who have programmed 100 lines of code, how many of you were C programmers? Quite a few. Or maybe you're just embarrassed. I forgot to say: those that didn't put your hands up for that first question, you should feel proud, and you should glory in your innocence while you still have it. You are not yet complicit in this trainwreck. It worked then. Okay, so here's a little piece of C code, which I shall explain. I shall explain it in several different ways. So we start out: it creates two locations, x and secret... — don't do that, this is so bad — it creates x, storing 1, and secret_key, which stores, I might get this wrong, your PIN. And then there's some computation which is supposed to only operate on x, but maybe it's a teensy bit buggy, or corrupted by somebody. We make a pointer to x, and in this execution x just happened to be allocated at 14, so this pointer contains the number 14, and then we add one to that pointer. It's C, so actually adding one to the pointer really means adding the four bytes which are the size of that thing. So we add 4 to 14 and we get 18. And then we try and read the thing which is pointed to. What's going to happen? Let me compile this and run it in a conventional way. It printed the secret key. Bad. This program, rather distressingly — this is characteristic by no means of all security flaws, but a very disturbingly large fraction of all the security flaws are basically doing this. So does C really let you do that? Yes and no. If you look at the C standard, which is one of these beautiful books, it says — you have to read moderately carefully to understand this — but it says this program has undefined behaviour. And many of you will know what that means, but others won't. So that means, as far as the standard is concerned, for programs like that there is no constraint on the behaviour of the implementation at all. Or, said another way, and maybe a more useful way: the standard lets implementations — so, C compilers — assume that programs don't do this kind of stuff, and it's the programmer's responsibility to ensure that. Now, in about 1970, '75, maybe 1980, this was a really good idea; it gives compilers a lot of flexibility to do efficient implementation. Now, not so much. How does this work in the definition? So the standard somehow has to be able to look at this program and identify it as having this undefined behaviour.
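A sketch of roughly that program, for reference — the variable names, the stored values, and the adjacent layout are assumptions about what the slide showed, and since the behaviour is undefined, nothing about the output is guaranteed:

```c
#include <stdio.h>

int main(void) {
    int x = 1;              /* suppose x happens to be allocated at address 14        */
    int secret_key = 4091;  /* ...and the PIN happens to sit right after it, at 18    */

    int *p = &x;            /* p holds the address of x: 14 in that execution         */
    p = p + 1;              /* pointer arithmetic adds sizeof(int) = 4, giving 18     */
    printf("%d\n", *p);     /* undefined behaviour: a conventional build may well
                               print the neighbouring secret_key                      */
    return 0;
}
```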
And indeed, well, if we just think about the numbers in memory, this 18, it points to a perfectly legitimate storage location, and that plus one was also something that you're definitely allowed to do in C. So the only way that we can know that this is undefined behaviour is to keep track of the original allocation of this pointer. And here I've got these allocation identifiers, @3, @4, @5, and you see here I've still got this pointer, despite the fact that I added 4 to it: I still remembered the allocation ID, so I can check — or the standard can check — when we try and read from this, whether that address is within the footprint of that original allocation, i.e. is within there. And in fact it is not, it's over here, it is not within there at all. So this program is deemed to have undefined behaviour. Just to clarify something people often get confused about: we detect undefined behaviour here, but it isn't really a temporal thing. The fact that there is undefined behaviour anywhere in the execution means the whole program is toast. Okay, but this is really interesting, because we're relying, for this critical part of the standard, on information which is not there at run time in a conventional implementation. So just to emphasize that point: if I compile this program, let's say to ARM assembly language, I get a sequence of store and load and add instructions — store, load, add, store, load, load. And if I look at what the ARM architecture says can happen, these blue transitions... one thing to notice is that we can perfectly well do some things out of order. At this point we could either do this load or we could do that store. That would be a whole other talk; let's stick with the sequential execution for now. What I want to emphasize is that this load of this address really was loading a 64-bit number, which was the address of x, and adding 4 to it, and then loading from that address the secret key. So there is no trace of these allocation IDs. No trace at all. Okay, let me step back a little bit. I've been showing you some screenshots of C behaviour and ARM behaviour, and I claim that these are actually showing you all of the behaviours allowed by what one would like the standards to be for these two things. And really what I've been showing you are GUIs attached to mathematical models that we've built in a big research project over the last many years. REMS: "Rigorous Engineering of Mainstream Systems" — sounds good; aspirational title, I think I would say.
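A sketch of the allocation-ID bookkeeping just described: a pointer value carries both its concrete address and the identifier of the allocation it was derived from, and an access is only defined if it lands within that allocation's footprint. The representation here is invented for illustration; it is not the real ISO or Cerberus memory object model:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One allocation, as the abstract C model sees it: an id and a footprint. */
typedef struct { int id; uintptr_t base; size_t size; } allocation_t;

/* An abstract pointer value: the concrete address plus the id of the
 * allocation it was originally derived from (which survives arithmetic). */
typedef struct { uintptr_t addr; int alloc_id; } abs_ptr_t;

/* Is a read of nbytes at this pointer within the footprint of its original
 * allocation?  If not, the program has undefined behaviour. */
static bool access_defined(abs_ptr_t p, size_t nbytes,
                           const allocation_t *allocs, int n) {
    for (int i = 0; i < n; i++)
        if (allocs[i].id == p.alloc_id)
            return p.addr >= allocs[i].base &&
                   p.addr + nbytes <= allocs[i].base + allocs[i].size;
    return false;                       /* allocation no longer live */
}

int main(void) {
    allocation_t allocs[] = { {3, 14, 4}, {4, 18, 4} };  /* x at 14, secret_key at 18 */
    abs_ptr_t p = { 14, 3 };            /* &x: address 14, derived from allocation @3 */
    p.addr += 4;                        /* p + 1 on an int*: address 18, id stays @3  */
    printf("defined: %s\n",
           access_defined(p, 4, allocs, 2) ? "yes" : "no");   /* prints "no" */
    return 0;
}
```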
And in that we've built semantics, so mathematical models defining the envelope of all allowed behaviour, for a quite big fragment of C, and for the concurrency architecture of the major processors — ARM, and x86, and RISC-V, and IBM POWER — and also for the instruction set architecture of these processors, the details of how individual instructions are executed. And in each case these are specs that are executable as test oracles: you can compare algorithmically some observed behaviour against whether the spec says it's allowed or not, which you can't do with those books. So this is just an idiot-simple idea, but for some reason the industry has not taken it on board at any time in the last five decades. It's not that hard to do. These are also mathematical models; I'll come back to that later. But another thing I want to emphasize here is that in each of these cases, especially the first two, when you start looking really closely, then you learn that what you have to do is not build a nice clean mathematical model of something which is well understood. What you learn is that there are real open questions — for example, within the C standards committee and within ARM a few years ago, about exactly what these things even were. And that is a bit disturbing, given that these are the foundations for all of our information infrastructure. There is also a lot of other work going on, with other people within this REMS project, on WebAssembly and JavaScript and file systems and other concurrency. I don't have time to talk about those, but Hannes Mehnert is going to talk to us a little bit about TCP later today. You should go to that talk too if you like this one. If you don't like this one, you should still go to that talk. Okay, so this is doing somewhat better — certainly more rigorous engineering of specifications, but so far only for existing infrastructure, for this C language, ARM architecture, what have you. So what if we try and do better, what if we try and build better security in? So the architectures that we have, really they date back to the 1970s or even 60s, with the idea of virtual memory. That gives you — I don't need to go into what it is — but that gives you very coarse-grained protection. You can have your separate processes isolated from each other, and on a good day you might have separate browser tabs isolated from each other, in some browsers, sometimes, but it doesn't scale. It's just too expensive to have lots of little compartments. One thing we can do: we can certainly design much better programming languages than C, but all of that legacy code, that's got a massive inertia.
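To make the granularity point concrete in a small way: two unrelated heap allocations will typically land in the same page, so page-based protection cannot keep them apart. The sketch below just observes whether that happens; the 4 KiB page size and the allocator's placement are assumptions, which is why it prints the result rather than asserting it:

```c
/* Why page-based virtual memory is too coarse for fine-grained protection:
 * if buf and key share one page, the MMU cannot stop an overflow of buf
 * from reaching key. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *buf = malloc(32);
    char *key = malloc(32);
    uintptr_t page_buf = (uintptr_t)buf / 4096;   /* assumes 4 KiB pages */
    uintptr_t page_key = (uintptr_t)key / 4096;
    printf("buf in page %#lx, key in page %#lx, same page: %s\n",
           (unsigned long)page_buf, (unsigned long)page_key,
           page_buf == page_key ? "yes" : "no");
    free(buf);
    free(key);
    return 0;
}
```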
So an obvious question is whether we can build better security into the hardware, in a way that doesn't need some kind of massive pervasive change to all the software we ever wrote. And that's the question that a different large project, the CHERI project, has been addressing. So this is something that's been going on for the last 8 years or so, mostly at SRI and at Cambridge, funded by DARPA and the EPSRC and ARM and other people, mostly led by Robert Watson, Simon Moore, Peter Neumann and a cast of thousands. And me, a tiny bit. So... how — don't do that. I should learn to stop pushing this button. It's very hard to not push the button. So here's the question, in a more focused way, that CHERI is asking: can we build hardware support which is both efficient enough and deployable enough, that gives us both fine-grained memory protection — to prevent that kind of bug that we were talking about earlier, well, that kind of leak at least — and also scalable compartmentalization, so you can take bits of untrusted code and put them in safe boxes and know that nothing bad is going to get out. That's the question. The answer... the basic idea of the answer is pretty simple, and it also dates back to the 70s: add hardware support for what people call capabilities, and we'll see that working in a few moments. The idea is that this will let programmers exercise the principle of least privilege — so with each little bit of code having only the permissions it needs to do its job — and also the principle of intentional use: each little bit of code, when it uses one of those capabilities, is required to use it explicitly rather than implicitly. So the intention of the code has to be made plain. So let's see how this works. These capabilities, they're basically replacing some, or maybe all, of the pointers in your code. So if we take this example again: in ISO C we had an address and — in the standard; well, the standard is not very clear about this, but we're trying to make it more clear — an allocation ID, or something like it. Now in CHERI C we can run the same code, but we might represent this pointer at runtime not just with that number, but with something that contains that number, and the base of the original allocation, and the length of the original allocation, and some permissions. And having all of those in the pointer means that we can do — the hardware can do, at run time, at access time — a very efficient check that this is within this region of memory. And if it's not, it can fail stop and trap. No information leak. I need a bit more machinery to make this actually work.
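A sketch of that fat-pointer idea in plain C, with the check the hardware would do at access time. The field names, the single load permission, and the abort-on-fault are illustrative stand-ins, not the real CHERI capability format:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint64_t cursor;       /* the address the pointer currently points at   */
    uint64_t base;         /* base of the original allocation               */
    uint64_t length;       /* length of the original allocation             */
    uint32_t perms;        /* e.g. load/store permissions                   */
    bool     tag;          /* validity tag (held out of band in hardware)   */
} capability_t;

#define PERM_LOAD 1u

/* What the hardware does on every load via a capability: check the tag,
 * the permission and the bounds, and trap (here: abort) otherwise. */
static uint8_t cap_load_u8(capability_t c) {
    if (!c.tag || !(c.perms & PERM_LOAD) ||
        c.cursor < c.base || c.cursor >= c.base + c.length) {
        fprintf(stderr, "capability fault\n");
        abort();                               /* fail stop, no information leak */
    }
    return *(uint8_t *)(uintptr_t)c.cursor;
}

int main(void) {
    uint8_t x[4] = {1, 0, 0, 0};
    capability_t p = { (uint64_t)(uintptr_t)x, (uint64_t)(uintptr_t)x,
                       4, PERM_LOAD, true };
    p.cursor += 4;                             /* the out-of-bounds p + 1 from before */
    printf("%d\n", cap_load_u8(p));            /* traps instead of leaking the neighbour */
    return 0;
}
```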
So it would be nice if all of these were 64-bit numbers, but then you get a 256-bit pointer, and that's a bit expensive, so there's a clever compression scheme that squeezes all this down into 128 bits, with enough accuracy. And then you need to design the instruction set of the CPU carefully to ensure that you can never forge one of these capabilities with a normal instruction. You can never just magic one up out of a bunch of bits that you happen to find lying around. So all the capability-manipulating instructions, they're going to take a valid capability and construct a new capability, possibly smaller in memory extent, or possibly with fewer permissions. They're never going to grow the memory extent or add permissions. And when the hardware starts, at hardware initialization time, the hardware will hand the software a capability to everything, and then, as the operating system works and the linker works and the C language allocator works, these capabilities will be chopped up into ever finer and smaller pieces, to be handed out to the right places. And then we need one more little thing. So this C language world we're living in here — or really a machine code world — there's nothing stopping code writing bytes of stuff. So you have to somehow protect these capabilities against being forged by the code, either on purpose or by accidentally writing bytes over the top. You need to preserve their integrity. So that's done by adding, for each of these 128-bit-sized aligned units of memory, just one extra bit. That holds a tag, and that tag records whether this memory holds a currently valid capability or not. So all the capability-manipulating instructions, if they're given a valid capability with the tag set, then the result will still have the tag set. But if you write — so you just wrote that byte there, which might change the base — then the hardware will clear that tag. And these tags, they're not conventionally addressable, they don't have addresses. They just sit there in the memory system of the hardware. So there is no way that software can [inaudible] through these. So this is really cool. We've taken what in ISO was undefined behaviour, in the standard, and in implementations was an information leak, and we've turned it into something that, in CHERI C, in both the specification and the implementation, is guaranteed to trap, to fail stop, and not leak that information. So another few things about the design that I should mention. All these blue things I think I've talked about. But then a lot of care has gone in to make this very flexible, to make it possible to adopt it.
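Continuing that toy model, here is a sketch of the two rules just described: capability derivation can only shrink bounds and permissions, and an ordinary data write clears the tag on the 128-bit granule it touches. Again, the types and functions are invented for illustration, not the real instruction set:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t cursor, base, length;
    uint32_t perms;
    bool     tag;
} capability_t;

/* Derive a smaller capability from an existing one.  The result is only
 * valid (tagged) if it does not grow the extent or add permissions. */
static capability_t cap_restrict(capability_t c, uint64_t new_base,
                                 uint64_t new_length, uint32_t new_perms) {
    capability_t r = { new_base, new_base, new_length, new_perms, false };
    if (c.tag &&
        new_base >= c.base &&
        new_base + new_length <= c.base + c.length &&
        (new_perms & ~c.perms) == 0)
        r.tag = true;            /* monotonic derivation: tag preserved       */
    return r;                    /* anything else yields an untagged value    */
}

/* One 128-bit granule of memory plus its out-of-band tag bit. */
typedef struct { uint8_t bytes[16]; bool tag; } granule_t;

/* An ordinary byte store: the data changes and the tag is cleared, so a
 * forged or corrupted capability is never valid. */
static void store_byte(granule_t *g, int offset, uint8_t v) {
    g->bytes[offset] = v;
    g->tag = false;
}

int main(void) {
    capability_t everything = { 0, 0, 1 << 20, 0x3, true };  /* initial capability to everything */
    capability_t small  = cap_restrict(everything, 0x100, 16, 0x1);  /* allocator hands out less  */
    capability_t forged = cap_restrict(small, 0, 1 << 20, 0x3);      /* trying to grow: untagged  */

    granule_t g = { {0}, true };
    store_byte(&g, 3, 0xff);                   /* ordinary data write: tag cleared */

    printf("small tagged: %d, forged tagged: %d, granule tagged: %d\n",
           small.tag, forged.tag, g.tag);      /* prints 1, 0, 0 */
    return 0;
}
```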
So you can use capabilities for all pointers, if you want to, or just at some interfaces. You might, for example, use them at the kernel–userspace interface. It coexists very nicely with existing C and C++: you don't need to change the source code very much at all to start using these, and we'll see a tiny bit of data about this. It coexists with the existing virtual memory machinery, so you can use both capabilities and virtual memory if you want, or you can just use capabilities, or just use virtual memory, if you want. And then there's some more machinery, which I'm not going to talk about, for doing this secure encapsulation stuff. What I've talked about so far is basically the fine-grained memory protection story. And I should say the focus so far is on spatial memory safety. So that's not, in the first instance, protecting against reuse of an old capability in a bad way, but there are various schemes for supporting temporal memory safety with basically the same setup. Okay, so: how do we... okay, this is some kind of high-level idea. How do we know that it can be made to work? Well, the only way is to actually build it and see. And this has been a really interesting process, because mostly when you're building something, either in academia or in industry, you have to keep most of the parts fixed — I mean, you are not routinely changing both the processor and the operating system, because that would just be too scary. But here we have been doing that, and indeed we've really been doing a three-way codesign: building hardware, and building and adapting software to run above it, and also building these mathematical models, all at the same time. This is, well — A, it's a lot of fun, because you don't often get groups of people together that can do all of those things, but B, it's really effective. So what we've produced, then, is an architecture specification — in fact extending MIPS, because it was conveniently free of IP concerns — and that specification is one of these mathematically rigorous things, expressed in a domain-specific language specifically for writing instruction set architecture specifications, and we've designed and implemented actually two of those. And then there are hardware processor implementations that actually run on FPGAs, and a lot of hardware research devoted to making this go fast and be efficient despite these extra tags and whatnot. And then there's a big software stack adapted to run over this stuff, including Clang and a FreeBSD kernel and FreeBSD userspace and WebKit and all kinds of pieces. So this is a very major evaluation effort.
That's something one is not normally able to do. This combination of things is why that list of people up there was about 40 people. I say this is based on MIPS, but the ideas are not specific to MIPS; there are ongoing research projects to see if this can be adapted to the ARM application-class architecture and the ARM microcontroller architecture and to RISC-V. We'll see how that goes, in due course. Okay, so then, with all of these pieces, we can evaluate whether it actually works. And that's still kind of an ongoing process, but the data that we've got so far is pretty encouraging. So we see here, first, performance. You see maybe a percent or two, in many cases, of runtime overhead. There are workloads where it performs badly: if you have something that's very pointer-heavy, then the extra pressure from those larger pointers will be a bit annoying, but really it seems to be surprisingly good. Then, is it something that can work without massive adaptation of the existing code? Well, in this port of the FreeBSD userspace — so all the programs that run there — there were something like 20,000 files, and only 200 of those needed a change. And that's been done by one or two people in not all that large an amount of time. Some of the other code that's more like an operating system needs a bit more invasive change, but still, it seems to be viable. Is it actually more secure? How are you going to measure that? We like measuring things; on the whole we try and do science where we can. Can we measure that? Not really. But it certainly does — the whole setup certainly does — mitigate against quite a large family of known attacks. I mean, if you try and run Heartbleed, you'll get a trap. It will say "trap". I think it will say "signal 34", in fact. So that's pretty good. And if you think about attacks, very many of them involve a chain of multiple flaws put together by an ingenious member of the C3 community or somebody else, and very often at least one of those is some kind of memory access permission flaw of some kind or other. Okay, so this is nice, but I was talking a lot, in the first part of the talk, about rigorous engineering. So here we've got formally defined definitions of the architecture, and that's been really useful for testing against, and for doing test generation on the basis of, and for doing specification coverage in an automated way. But surely we should be able to do more than that. So this formal definition of the architecture, what does it look like? For each instruction of this CHERI-MIPS processor there's a definition, and that definition looks a bit like normal code.
So here's a definition of how the instruction for "capability increment cursor immediate" goes. So this is a thing that takes a capability register and an immediate value, and it tries to add something onto the cursor of that capability. And what you see here is some code — it looks like code — that defines exactly what happens, and also defines exactly what happens if something goes wrong. You can see there's some kind of an exception case there; that's a... yeah, that's raising a processor exception. So it looks like code, but from this we can generate both reasonably high-performance emulators, in C and in OCaml, and also mathematical models in the theorem provers. So three of the main theorem provers in the world are called Isabelle and HOL and Coq, and we generate definitions in all of those. So then we've got a mathematical definition of the architecture. Then, finally, we can start stating some properties. So CHERI was designed with some kind of vague security goals in mind from the beginning, but now we can take, let's say, one of those goals and make it into a precise thing. So this is a kind of secure encapsulation kind of thing — not exactly the memory leak that we were looking at before; sorry, the secret leak that we were looking at. So let's imagine some CHERI compartment. I haven't told you exactly what that means, but let's suppose that that's understood, and that inside that compartment there is a piece of software running. And that might be maybe a video codec, or a web browser, or even a data structure implementation maybe, or a C++ class and all of its objects maybe. So imagine that compartment running, and imagine the initial capabilities that it has access to via registers and memory, and all the capabilities it has access to via the memory that it has access to, and so on recursively. So imagine all of those capabilities. The theorem says: no matter how this code executes, whatever it does, however compromised or buggy it was, in an execution up until any point at which it raises an exception or makes an inter-compartment call, it can't make any access which is not allowed by those initial capabilities. You have to exclude exceptions and inter-compartment calls because, by design, they do introduce new capabilities into the execution. But until that point you get this guarantee. And this is something that we've proved — actually [...] has proved, with a machine-checked proof, in fact in the Isabelle theorem prover. So this is a fact which is about as solidly known as any fact the human race knows.
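Going back to that instruction definition for a moment, here is a sketch, in C rather than in the ISA-definition language actually used, of the flavour of such a definition. The particular checks shown — a sealed source capability raising an exception, representability of the result elided — are a paraphrase, not a faithful transcription of the real one:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t cursor, base, length;
    uint32_t perms;
    bool     sealed;     /* sealed capabilities may not be modified */
    bool     tag;
} capability_t;

typedef enum { OK, PROCESSOR_EXCEPTION } result_t;

/* "Capability increment cursor immediate": take a source capability and an
 * immediate, and produce a capability whose cursor has moved; the bounds do
 * not change, so a later out-of-bounds access will still fault. */
static result_t cincoffset_imm(capability_t cb, int64_t imm, capability_t *cd) {
    if (cb.tag && cb.sealed)
        return PROCESSOR_EXCEPTION;          /* the "something goes wrong" case */
    *cd = cb;
    cd->cursor = cb.cursor + (uint64_t)imm;  /* move the cursor; bounds unchanged */
    /* In the real, compressed representation the result can become
     * unrepresentable, in which case the tag is cleared; elided here. */
    return OK;
}

int main(void) {
    capability_t cb = { 14, 14, 4, 0x1, false, true };   /* the &x capability from earlier */
    capability_t cd;
    if (cincoffset_imm(cb, 4, &cd) == OK)
        printf("new cursor: %llu, still in bounds: %d\n",
               (unsigned long long)cd.cursor,
               cd.cursor < cd.base + cd.length);          /* cursor 18, in bounds: 0 */
    return 0;
}
```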
These provers, they're checking a mathematical proof in exceedingly great detail, with great care and attention, and they're structured in such a way that the soundness of the prover depends only on some tiny little kernel. So conceivably cosmic rays hit the transistors at all the wrong points, but basically we know this for sure. I emphasize this is one security property; we have not proved — we certainly wouldn't claim to have proved — that CHERI is secure, and there are all kinds of other kinds of attack that this statement doesn't even address, but at least this one property we know for real. So that's kind of comforting, and not a thing that conventional computer science engineering has been able to do, on the whole. So, are we taming the chaos, then? Well, no. Sorry. But I've shown you two kinds of things. The first was showing how we can do somewhat more rigorous engineering for that existing infrastructure. It's now feasible to do that, and in fact I believe it has been feasible to build specifications which you can use as test oracles for many decades. We haven't needed any super fancy new tech for that; we've just needed to focus on that idea. So that's for existing infrastructure, and then CHERI is a relatively lightweight change that does build in improved security, and we think it's demonstrably deployable, at least in principle. Is it deployable in practice? Are we going to have it in all of our phones five years from now? Will these all be CHERI-enabled? I can't say. I think it is plausible that that should happen, and we're exploring various routes that might mean that that happens, and we'll see how that goes. Okay, so with that, I am aiming to have given you at least — not a whole lot of hope, but just a little bit of hope that the world can be made a better place, and I encourage you to think about ways to do it, because Lord knows we need it. But maybe you can at least dream, for a few moments, of living in a better world. And with that, I thank you for your attention.

[applause]

Herald Angel: Thanks very much. And with that we'll head straight into the Q and A. We'll just start with mic number one, which I can see right now.

Q: Yes, thanks for the very encouraging talk. I have a question about the C code — or rather the part below that. The theorem that you stated assumes that the architecture description matches the hardware, at least describes the hardware accurately. But are there any attempts to check that the hardware actually works like that? That there are no holes in the hardware? That no combination of instructions works differently or allows memory access that is not in the model?
So, is it possible to use hardware designs and check that they match the spec, and will Intel or AMD try to check their models against that spec?

Sewell: OK. So the question is basically: can we do provably correct hardware underneath this architectural abstraction? And the answer is... so, it's a good question. People are working on that, and it can be done, I think, on at least a modest academic scale. Trying to do that in its full glory for an industrial superscalar processor design is right now out of reach. But it's certainly something one would like to be able to do. I think we will get there, maybe in a decade or so. A decade is quick.

Herald Angel: Thanks. Mic number four is empty again. So we'll take the Internet.

Sewell: Where is the internet?

Signal Angel: The internet is right here and everywhere, so ruminate would like to know: is the effort and cost of building security into hardware, as you've described in your talk, significantly greater or less than the effort and cost of porting software to more secure languages?

Sewell: So, is the effort of building new hardware more or less than the cost of porting existing software to better languages? I think the answer has to be yes, it is less. The difficulty with porting software is: all of you and all your colleagues and all of your friends and all your enemies have been beavering away writing lines of code for 50 years — really, I don't know what to say, effectively, really numerously. There's an awful lot of it. It's a really terrifying amount of code out there. It's very hard to get people to rewrite it.

Signal Angel: OK, mic two.

Q: OK, given that cost [...] today on the software level, not on the hardware level: is it possible to go halfway, using the entire system as an oracle? So during development you test the code base with the secure hardware, but then essentially ship the same software to insecure hardware and [inaudible].

Sewell: So this question, if I paraphrase it, is: can we use this hardware implementation really as the perfect sanitiser? And I think there is scope for doing that. And you can even imagine running this not on actual silicon hardware but in, like, a QEMU implementation — which we have — in order to do that. And I think for that purpose it could already be a pretty effective bug-finding tool; it would be worth doing. But, as for any testing, it will only explore a very tiny fraction of the possible paths. So we would like to have it always on, if we possibly can.

Herald Angel: OK.
The internet again.

Signal Angel: OK, someone would like to know: how does your technique compare to Intel MPX, which failed miserably, with [inaudible] ignoring it. Will it be better or faster? Could you talk a little bit about that?

Sewell: I think for that question — for a real answer to that question — you would need one of my more hardware colleagues. What I can say is, they make nasty faces when you say MPX.

Herald Angel: Well, that's that. Mic number one.

Q: Somewhat inherent to your whole construction was this idea of having an unchangeable bit which describes whether your pointer contents have been changed. Is this even something that can be done in software? Or do you have to construct a whole extra RAM or something?

Sewell: So you can't do it exclusively in software, because you need to protect it. In the hardware implementations that my colleagues have built, it's done in a reasonably efficient way. So it's cached and managed in clever ways to ensure that it's not terribly expensive to do. But you do need slightly wider data paths in your cache protocol, and things like that, to manage it.

Herald Angel: OK. Mic seven.

Q: How well does this system help secure control-flow integrity, compared to other systems like hardware data-flow isolation or code-pointer integrity in ARM [inaudible]?

Sewell: Ah well, that's another question which is a bit hard to answer, because it depends somewhat on how you're using the capabilities. So you can arrange here that each C language allocation is independently protected, and indeed you can also arrange — and this is sensible — that each subobject is independently protected. And those protections are going to give you protection against trashing bits of the stack, because they'll restrict the capability to just those objects.

Herald Angel: OK, the internet again.

Signal Angel: Twitter would like to know: is it possible to run CHERI C without custom hardware?

Sewell: Can you run CHERI C without custom hardware? And the answer is: you can run it in emulation, above this QEMU. That works pretty fast — I mean, fast enough for debugging, as the earlier question was talking about. You can imagine, you know, somewhat faster emulations of that. It's not clear how much it's worth going down that route. The real potential big win is to have this be always on, and that's what we would like to have.

Herald Angel: OK.
Mic one.

Q: Do you have some examples of the kinds of code changes that you need to the operating system and userland that you mentioned?

Sewell: So, okay, what kind of changes do you need to adapt code to CHERI? So, if you look at C code: there is the C standard, which deems a whole lot of things to be undefined behaviour, and then if you look at actual C code you find that very, very much of it does depend on some idiom that the C standard does not approve of. So there are, for example [pause], quite a lot of instances of code that constructs a pointer that goes more than one past the end of its object and then brings it back within range before using it. In fact, in CHERI C many of those cases are allowed, but not all of them. More exotic cases, where there's really some crazy intra-object stuff going on, or where there's pointer arithmetic between objects — which the C standard is certainly quite down on, but which does happen — and code which is, say, manipulating the low-order bits of a pointer to store some data in it. That's pretty common. You can do it in CHERI C, but you might have to adapt the code a little bit. Other cases are more fundamental. So, say, in operating system code: if you swap out a page to disk and then you swap it back in, then somehow the operating system has to reconstruct new capabilities from a large capability which it kept for the purpose. So that needs a bit more work. So it's kind of a combination: some things you would regard as good changes to the code anyway, and others are more radical.

Herald Angel: OK, another one from the internet.

Signal Angel: I lost one... [pause]

Sewell: The Internet has gone quiet. Yes.

Signal Angel: Last question from the internet. Are there any plans to test the impact on Linux, or just FreeBSD?

Sewell: If anyone has a good number of spare minutes, then that would be lovely to try. As for the impact on Linux rather than just FreeBSD: the BSD code base is simpler and maybe cleaner, and my systemsy colleagues are already familiar with it, so it's the obvious target for an academic project — doing it is already a humongous amount of work. So I think we would love to know how Linux, and especially how Android, could be adapted, but it's not something that we have the resources to do at the moment.
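For reference, the low-order-bits idiom mentioned in the answer about code changes above looks roughly like this on conventional hardware — stashing a flag in the spare low bit of an aligned pointer and masking it off before use. The helper names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hide one bit of state in the low bit of an aligned pointer, and strip it
 * again before dereferencing.  Round-tripping pointers through integers like
 * this is exactly the sort of code that may need a little adaptation when
 * pointers become capabilities. */
static void *tag_ptr(void *p, uintptr_t flag) { return (void *)((uintptr_t)p | (flag & 1)); }
static void *untag_ptr(void *p)               { return (void *)((uintptr_t)p & ~(uintptr_t)1); }
static uintptr_t ptr_flag(void *p)            { return (uintptr_t)p & 1; }

int main(void) {
    int *x = malloc(sizeof *x);          /* malloc results are suitably aligned */
    *x = 42;
    void *tagged = tag_ptr(x, 1);        /* smuggle one bit of state in the pointer */
    assert(ptr_flag(tagged) == 1);
    printf("%d\n", *(int *)untag_ptr(tagged));   /* mask the bit off before use */
    free(x);
    return 0;
}
```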
Herald Angel: Mic number one again.

Q: How likely or dangerous...

Herald Angel: Mic number one is not working.

Q: How likely or dangerous do you think it is for a programmer to screw up the capabilities he specifies?

Sewell: So, how often will the programmer screw up the capabilities? So in many cases the programmer is just doing an access with a C or C++ pointer, or a C++ object, in a normal way. They're not manually constructing these things, so a lot of that is going to just work. If you're building some particular secure encapsulation setup, you have to be a bit careful. But on the whole I think this is a small risk compared to the state of software as we have it now.

Herald Angel: OK. Mic eight.

Q: So existing memory protection techniques are quite patchworky, like canaries and address layout randomisation. Is this a system designed to replace or supplement these measures?

Sewell: I think, if you — [pause] so again it depends a bit how you use it. If you have capabilities really everywhere, then there's not much need for address space layout randomization or canaries to protect against explicit information leaks. I imagine that you might still want randomisation to protect against some side-channel flows, but canaries not so much; whether you actually do, for side-channel flows — I don't know. That's a good question.

Herald Angel: Mic four, please.

Q: Thanks for the very interesting talk. You presented, with CHERI, a system for verifying existing C software, sort of post hoc, and improving its security via a run-time mechanism...?

A: Improving or proving? Improving yes, proving no, yes.

Q: So what's your opinion on the viability of using static analysis and static verification for that? Would it be possible to somehow analyse software that already exists and is written in these unsafe languages, and show that it nevertheless has some security properties, without writing all the proofs manually in, like, HOL or Isabelle?

A: Well, so, if you want to analyse existing software... So, static analysis is already useful for finding bugs in existing software. If you want to have assurance from static analysis, that's really very tough. So you certainly wouldn't want to manually write proofs in terms of the definitional C semantics; you would need intermediate infrastructure. And if you look at the people who have done verified software — so verified compilers and verified hypervisors and verified operating systems — all of those are now possible on significant scales, but they use whole layers of proof and verification infrastructure for doing that. They're not writing proofs by hand for every little detail. And some of those verification methods — you can imagine them being basically like static analysis; you know, you might use a static analysis to prove some relatively simple facts about the code and then stitch those together into a larger assembly.
I think any kind of more complete thing is hard to imagine. I mean, these analysis tools mostly rely on making some approximation in order to be able to do their thing at all, so it's hard to get assurance out of them.

Herald Angel: Okay, mic one.

Q: You said you modified a MIPS architecture to add some logic to check the capabilities. Do you know what the cost of this is regarding computational power and energy consumption?

A: Sorry, can you repeat the last part of that?

Q: What is the cost of this modification regarding computational power and power consumption?

A: Ah, so what's the energy cost? So I gave you a performance cost. I didn't give you — I carefully didn't give you — an energy cost estimate, because it's really hard to do in a scientifically credible way without making a more or less production superscalar implementation. And we are sadly not in a position to do that, although if you have 10 or 20 million pounds, then I would be happy to accept it.

Herald Angel: Mic number 7, please.

Q: How does the class of problems that you can address with CHERI compare to the class of problems that are excluded by, for example, the Rust programming language?

A: So, how do the problems addressed by CHERI — sorry, the problems excluded by CHERI — relate to the problems excluded by Rust? So if you are happy to write all of your code in Rust, without ever using the word "unsafe", then maybe there would be no point in CHERI at all. Are you happy to do that?

Q: Probably not, no.

A: I think someone shakes their head sideways, yeah.

Herald Angel: Mic number 1.

Q: What do you think about the following thesis: we are building a whole system of things, with artificial intelligence and something like that, that is above this technical level, and it's building another chaos that isn't healed by those things.

A: It's dreadful, isn't it.

[laughter]

A: So — in fact some of my colleagues are interested in this question — you might well want to bound what your artificial intelligence can access or touch, and for that you might want this kind of technology. But with machine learning systems we are intrinsically building things that we, on the whole, don't understand, and that will have edge cases that go wrong. And this is not speaking to that in any way.

Herald Angel: OK. Does the internet have a question? No, good. I don't see anyone else, so let's conclude this. Thank you very much.

A: Thank you.
[applause]

[35c3 postroll music]

Subtitles created by c3subtitles.de in the year 2019. Join, and help us!