0:00:09.519,0:00:14.479 applause 0:00:14.479,0:00:16.930 Peter Sewell: Thank you for that[br]introduction and to the organisers for 0:00:16.930,0:00:23.459 this opportunity to speak to you guys. So[br]my name is Peter Sewell. I'm an academic 0:00:23.459,0:00:30.609 computer scientist and as such, I bear a[br]heavy burden of guilt for what we have to 0:00:30.609,0:00:38.040 work with. Shared guilt, to be fair, but[br]guilt all the same. In many, many years of 0:00:38.040,0:00:44.670 higher education, I know two things about[br]computers. Maybe you've figured them out 0:00:44.670,0:00:55.000 slightly faster. The first thing, there's[br]an awful lot of them about. Shocking. And 0:00:55.000,0:01:03.300 the second is that they go wrong. They go[br]wrong a lot. Really a lot. A lot. A lot! 0:01:03.300,0:01:09.009 And they go wrong in many interestingly[br]different ways. So sometimes they go wrong 0:01:09.009,0:01:17.850 in spectacular ways, right? With[br]fireworks. And here, we see... 0:01:17.850,0:01:24.030 laughterapplause 0:01:24.030,0:01:31.260 Here we see half a billion dollars worth[br]of sparkle. You don't get that every day. 0:01:31.260,0:01:34.970 And so what was going on? So as I[br]understand it, what was going on was that 0:01:34.970,0:01:40.600 the - this was the first flight of the[br]Ariane 5 launcher and it reused some of 0:01:40.600,0:01:45.851 the guidance software from Ariane 4.[br]That's fine. Good, tested code. Should 0:01:45.851,0:01:53.030 reuse it. But the launch profile for[br]Ariane 5 went sideways a bit more - on 0:01:53.030,0:01:58.830 purpose - than the Ariane 5 launch. And as[br]a result, one of the sideways values in 0:01:58.830,0:02:04.740 the guidance control software overflowed.[br]Bad. And hence, the guidance software 0:02:04.740,0:02:08.449 decided that something bad was going wrong[br]and in order to not land rockets on 0:02:08.449,0:02:15.841 people's heads, it kindly blew itself up.[br]Okay. So that's a spectacular. Sometimes 0:02:15.841,0:02:21.200 they go wrong in insidious ways. And you[br]people probably know much more about that 0:02:21.200,0:02:29.370 than me. Very insidious ways. Exploited by[br]bad people to our detriment. And they go 0:02:29.370,0:02:36.990 wrong in insidious ways, exploited by our[br]kindly governments for our protection. 0:02:36.990,0:02:44.959 Right? And these - there are many, many,[br]many different causes of these things but 0:02:44.959,0:02:49.420 many of these causes are completely[br]trivial. You know, you have one line of 0:02:49.420,0:02:56.150 code in the wrong place - something like[br]that, right? So that's very, very bad and 0:02:56.150,0:03:01.040 you're all conscious of how bad that is.[br]But from my point of view, this is as 0:03:01.040,0:03:06.959 nothing with the real disaster that we[br]have been facing and continue to face. 0:03:06.959,0:03:14.950 Right? The real disaster is the enormous[br]waste of human life and energy and effort 0:03:14.950,0:03:21.360 and emotion dealing with these crap[br]machines that we've built. Right? And it's 0:03:21.360,0:03:25.489 hard to quantify that. I did a little[br]experiment a couple of months ago. I 0:03:25.489,0:03:34.030 Googled up "Android problem" and "Windows[br]problem" and got about 300 million hits 0:03:34.030,0:03:43.840 for each, right? Somehow indicative of the[br]amount of wrongness. So this should you 0:03:43.840,0:03:49.790 seem kind of familiar to you. If you think[br]back - think back to when you were young. 0:03:49.790,0:03:55.700 Maybe back in the 19th century when you[br]were young, and what were you doing then 0:03:55.700,0:04:00.940 as a community of hackers and makers and[br]builders? Well, this was the full-on 0:04:00.940,0:04:07.130 industrial revolution. You were building[br]stuff. You were building beautiful bridges 0:04:07.130,0:04:14.739 of cast-iron and stone with numerous[br]arches and pillars. And every so often 0:04:14.739,0:04:22.300 they would fall down. I believe someone[br]miscalculated the wind loading. And we 0:04:22.300,0:04:30.630 would build magical steam engines to whisk[br]us away. And every so often, they would 0:04:30.630,0:04:35.770 blow up. And that kind of thing, as you[br]may have noticed in the last hundred years 0:04:35.770,0:04:41.810 - sorry - that kind of thing doesn't[br]happen very often any more, right? We have 0:04:41.810,0:04:46.990 pretty good civil and mechanical[br]engineering. And what does it mean to have 0:04:46.990,0:04:52.620 good civil and mechanical engineering? It[br]means that we know enough thermodynamics 0:04:52.620,0:04:56.750 and materials science and quality control[br]management and all that kind of thing to 0:04:56.750,0:05:01.830 model our designs before we build them and[br]predict how they're going to behave pretty 0:05:01.830,0:05:10.750 well and with pretty high confidence,[br]right? I see an optimisation in this talk. 0:05:10.750,0:05:15.340 For computer science, for computing[br]systems we can predict with 100 per cent 0:05:15.340,0:05:20.450 certainty that they will not work.[br]Ah, time for lunch! 0:05:20.450,0:05:26.509 applause 0:05:26.509,0:05:33.330 But it's not a very satisfying answer,[br]really. So why is it so hard? Let me 0:05:33.330,0:05:38.500 discuss a couple of the reasons why[br]computing is tricky. In some sense, 0:05:38.500,0:05:45.430 trickier than steam engines. The first[br]reason is that there is just too much 0:05:45.430,0:05:50.360 code. Right? So this is one of the more[br]scary graphs I've found on the internet 0:05:50.360,0:05:55.470 recently. This is a measure of code in[br]hundreds of millions of lines. Each of 0:05:55.470,0:06:00.250 these blocks I think is 10 million lines[br]of code. And in the little letters - you 0:06:00.250,0:06:04.860 might not be able to see we have Android[br]and the Large Hadron Collider and Windows 0:06:04.860,0:06:12.820 Vista and Facebook, US Army Future Combat[br]System, Mac OS and software in a high-end 0:06:12.820,0:06:15.010 car. Right? 0:06:15.010,0:06:23.230 laughterapplause 0:06:23.230,0:06:25.320 I'm flying home. 0:06:25.320,0:06:27.810 laughter 0:06:27.810,0:06:33.520 So we that's a bit of a problem, really.[br]We're never going to get that right. Then 0:06:33.520,0:06:37.410 there's too much legacy complexity there,[br]right? Really too much. So as 0:06:37.410,0:06:41.840 right-minded, pure-minded people you might[br]think that software and hardware should be 0:06:41.840,0:06:46.490 architected beautifully with regular[br]structure a bit like a Mies van der Rohe 0:06:46.490,0:06:53.660 skyscraper or something. Clean. Okay. You[br]know that's not realistic, right? You 0:06:53.660,0:06:59.470 might expect a bit of baroqueness. You[br]might expect maybe something more like 0:06:59.470,0:07:02.680 this. You know, it's a bit twiddly but it[br]still hangs together. It's still got 0:07:02.680,0:07:08.160 structure. It still works. But then you[br]stop and you think and you realise that 0:07:08.160,0:07:15.770 software and hardware is built by very[br]smart people and usually in big groups and 0:07:15.770,0:07:20.300 subject to a whole assortment of rather[br]intense commercial pressure and related 0:07:20.300,0:07:25.210 open-source pressure and using the best[br]components and tools that they know and 0:07:25.210,0:07:27.169 they like. 0:07:27.169,0:07:34.520 laughterapplause 0:07:34.520,0:07:38.990 Someone from the room: (Indistinct)[br]You forgot management! 0:07:38.990,0:07:42.480 And the best management that[br]human beings can arrange. 0:07:42.480,0:07:49.680 laughterapplause 0:07:49.680,0:07:54.910 So we end up with something[br]spectacularly ingenious. 0:07:54.910,0:08:05.490 laughterapplause 0:08:05.490,0:08:09.730 You can see here, if you look[br]closely, you can see all the parts. 0:08:09.730,0:08:11.000 laughter 0:08:11.000,0:08:15.770 This was just a found picture from the[br]internet. But you can see there's a bunch 0:08:15.770,0:08:19.060 of, you know, transparent sort of[br]silicon-like stuff down here. Let's say 0:08:19.060,0:08:25.079 that's our hardware. And over here, I see[br]a C compiler with all of its phases. 0:08:25.079,0:08:28.599 laughter 0:08:28.599,0:08:36.930 And these bottles, that's the file system,[br]I reckon. And there's a TCP stack in here 0:08:36.930,0:08:43.690 somewhere. And there's a piece that[br]belongs to the NSA, obviously. And up 0:08:43.690,0:08:50.259 here, I think up here - yeah, there's a[br]JavaScript engine and a TLS stack and a 0:08:50.259,0:08:56.050 web browser. And then perched at the top[br]is the software that most people use to 0:08:56.050,0:08:57.910 talk to their banks. 0:08:57.910,0:09:01.489 laughter 0:09:01.489,0:09:09.489 It's just - it's just insane, right? And[br]then we have to look at how we build these 0:09:09.489,0:09:14.089 pieces, right? So we build them[br]fundamentally by a process of trial and 0:09:14.089,0:09:20.649 error, right? Just occasionally we write[br]some specifications in text. And then we 0:09:20.649,0:09:26.029 write some code and maybe we write a few[br]ad hoc texts and then we test and fix 0:09:26.029,0:09:30.610 until the thing is marketable and then we[br]test and fix and extend until the thing is 0:09:30.610,0:09:36.069 no longer marketable and then we keep on[br]using it until we can't anymore. Right? So 0:09:36.069,0:09:40.259 the fundamental process that we're using[br]to develop these systems is trial and 0:09:40.259,0:09:47.930 error by exploring particular execution[br]paths of the code on particular inputs. 0:09:47.930,0:09:54.459 But - you know this, but I want to[br]highlight - there are too many. The number 0:09:54.459,0:09:59.649 of execution paths scales typically at[br]least exponentially with the size of the 0:09:59.649,0:10:04.490 code and the number of states scales[br]exponentially with the size of the data 0:10:04.490,0:10:08.790 that you're dealing with and the number of[br]possible inputs is also shockingly large. 0:10:08.790,0:10:14.920 There is no way that we can do this and we[br]should perhaps stop deluding ourselves 0:10:14.920,0:10:21.129 that there is. And then the most[br]fundamental reason that computers are hard 0:10:21.129,0:10:27.899 is that they are too discrete, right? If[br]you take a bridge or, I don't know, a 0:10:27.899,0:10:36.519 chair or something - where's something[br]slightly bendable? You take a chair and 0:10:36.519,0:10:44.199 you exert a bit more force on it than it's[br]used to - I'm not very strong - and it 0:10:44.199,0:10:51.299 bends a bit. It deforms continuously. If[br]you take a bridge and you tighten up one 0:10:51.299,0:10:55.239 of the bolts a bit too much or you[br]underspecify one of the girders by 10 per 0:10:55.239,0:11:01.019 cent, occasionally you'll reach a point of[br]catastrophic failure but usually it will 0:11:01.019,0:11:04.089 just be a little bit weaker or a little[br]bit stronger or a little bit wibbly. 0:11:04.089,0:11:08.960 Right? But computer systems, you change a[br]bit - sometimes it doesn't make a 0:11:08.960,0:11:13.179 difference but disturbingly often you[br]change a line of code - as in system of 0:11:13.179,0:11:19.600 those bugs that we were talking about -[br]the whole thing becomes broken, right? So 0:11:19.600,0:11:26.929 it's quite different from that traditional[br]engineering. It's really different. Okay. 0:11:26.929,0:11:34.850 So, "A New Dawn", this thing is titled.[br]What can we do? What might we do? So I'm 0:11:34.850,0:11:40.879 going to talk about several different[br]possibilities, all right? One thing, we 0:11:40.879,0:11:46.490 can do more of the stuff which is[br]traditionally called software engineering. 0:11:46.490,0:11:50.910 It's not really engineering in my book,[br]but some of it's good. I mean, it's 0:11:50.910,0:11:55.119 useful, you know? We can do more testing[br]and have more assertions in our code and 0:11:55.119,0:11:58.920 better management maybe and all that kind[br]of stuff. We can do all that - we should 0:11:58.920,0:12:04.339 do all that, but it's not going to make a[br]very big change, all right? It's not 0:12:04.339,0:12:13.989 addressing any of those root causes of the[br]problem. What else could we do? So as an 0:12:13.989,0:12:18.570 academic computer scientist, I've devoted[br]several years of my life to working in the 0:12:18.570,0:12:24.509 area of programming languages and I look[br]at the programming languages used out 0:12:24.509,0:12:36.919 there in the world today. Oh, it's just[br]disgusting! It's shocking. 0:12:36.919,0:12:43.399 applause 0:12:43.399,0:12:50.420 The reasons that languages become popular[br]and built into our shared infrastructure, 0:12:50.420,0:12:54.790 the reasons for that have almost nothing[br]to do with how well those languages are 0:12:54.790,0:13:00.600 designed and in fact whether[br]they are designed at all, right? 0:13:00.600,0:13:04.110 applause 0:13:04.110,0:13:09.040 So I didn't really want to get too much[br]hate mail after this talk so I'm going to 0:13:09.040,0:13:13.999 try and avoid naming particular languages[br]as much as I can, but I encourage you to 0:13:13.999,0:13:20.279 think away as I speak and imagine the[br]examples. No, that was - I didn't want to 0:13:20.279,0:13:23.730 get much hate mail. Sorry. 0:13:23.730,0:13:25.160 laughter 0:13:25.160,0:13:26.579 So... 0:13:26.579,0:13:32.069 laughterapplause 0:13:32.069,0:13:36.540 So at the very least, we could use[br]programming languages based on ideas from 0:13:36.540,0:13:47.389 the 1970s instead of the 1960s. I mean,[br]really, come on. Right? You know, back in 0:13:47.389,0:13:54.120 '67 or '69 or whatever we had ECPL and C[br]and only a couple of years later we had 0:13:54.120,0:14:00.069 languages that guaranteed type and memory[br]safety and enforcement of abstraction 0:14:00.069,0:14:05.089 boundaries and gave us rich mechanisms for[br]programming so we could say what we mean 0:14:05.089,0:14:11.350 and not futz about with 23 million[br]pointers, about 1 per cent of which were 0:14:11.350,0:14:16.629 completely wrong, right? So for any[br]particular - in the context of any 0:14:16.629,0:14:20.459 particular project or any particular[br]existing legacy infrastructure there may 0:14:20.459,0:14:25.789 be very good reasons why one just can't[br]make a switch. Right? But for us 0:14:25.789,0:14:31.860 collectively, there is no excuse. It's[br]just wretched, right? 0:14:31.860,0:14:37.040 applause 0:14:37.040,0:14:40.969 It's completely bonkers. And then the[br]other thing that you see as a programming 0:14:40.969,0:14:49.239 language researcher is that - er, well, to[br]make another analogy, it looks as if 0:14:49.239,0:14:55.529 anyone who was capable of driving a car[br]considered themselves an expert on car 0:14:55.529,0:15:03.559 design, right? There is in fact a rather[br]well-established subject of, you know, 0:15:03.559,0:15:08.239 programming language design and we know[br]how to specify completely precisely, 0:15:08.239,0:15:13.019 mathematically precisely not just the[br]syntax like we had in these BNF grammars 0:15:13.019,0:15:18.160 from the 1960s and still have, but also[br]the behaviour of these things. We know how 0:15:18.160,0:15:22.699 to specify a few operational semantics and[br]the type systems. And we know how to do 0:15:22.699,0:15:27.360 mathematical proofs that our language[br]designs are self-consistent, that they do 0:15:27.360,0:15:35.059 guarantee good properties of arbitrary[br]well-typed programs that you write, right? 0:15:35.059,0:15:38.759 We can do this now. We can do this, if we[br]have to, post hoc. And people are doing 0:15:38.759,0:15:45.669 this for JavaScript and C and god help[br]them, PHP and other such languages. But 0:15:45.669,0:15:50.829 even better, we can do it at design time.[br]And if you do this at design time, you see 0:15:50.829,0:15:55.509 the structure. You have to - the act of[br]writing that specification forces you to 0:15:55.509,0:15:59.939 consider all the cases and all the[br]interactions between features. You can 0:15:59.939,0:16:07.861 still get it wrong, but you get it wrong[br]in more interesting ways. Right? So people 0:16:07.861,0:16:12.119 are starting to do this now. So I think[br]there was a talk on the first day of this 0:16:12.119,0:16:17.429 by Hannes and David who are building[br]system software in maybe not the best sys 0:16:17.429,0:16:21.800 language you could possibly imagine, but[br]something which compared with C is as 0:16:21.800,0:16:27.759 manna from heaven. Anil Madhavapeddy and[br]his colleagues in Cambridge have been 0:16:27.759,0:16:32.759 working hard to build system software in[br]at least moderately modern languages. And 0:16:32.759,0:16:40.780 it works. Sometimes you have to use C but[br]surprisingly often you really don't. Yeah, 0:16:40.780,0:16:47.049 so I teach, as my introducer said, in the[br]University of Cambridge. And I teach 0:16:47.049,0:16:51.269 semantics. And one of my fears is that I[br]will accidentally let someone out in the 0:16:51.269,0:16:54.970 world who will accidentally find[br]themselves over a weekend inventing a 0:16:54.970,0:16:58.149 little scripting language which will[br]accidentally take over the world and 0:16:58.149,0:17:05.359 become popular and I won't have explained[br]to them what they have to do or even that 0:17:05.359,0:17:08.189 there is a subject they have to[br]understand. And if you want to understand 0:17:08.189,0:17:18.740 it, there's lots of stuff you can look at.[br]Okay. Let's go to the uppermost extreme. 0:17:18.740,0:17:23.829 If we want software and hardware that[br]actually works, we should prove it 0:17:23.829,0:17:30.700 correct. Right? So this is something which[br]academics, especially in the domain of 0:17:30.700,0:17:36.830 labelled "formal methods" in the '70s and[br]'80s, they've been pushing this, promoting 0:17:36.830,0:17:44.789 this as an idea and a promise and an[br]exhortation for decades, right? Why don't 0:17:44.789,0:17:50.169 we just prove our programs correct? And[br]for some of those decades it wasn't really 0:17:50.169,0:17:54.570 plausible. I remember back when I was[br]considerably younger, it was a big thing 0:17:54.570,0:17:59.540 to proof that if you took a linked list[br]and reversed it twice, you got back the 0:17:59.540,0:18:08.059 same thing. Right? That was hard, then.[br]But now, well, we can't do this for the 0:18:08.059,0:18:14.419 Linux kernel, let's say, but we can[br]surprising often do this. And I just give 0:18:14.419,0:18:19.590 you a couple of the examples of modern[br]day, state-of-the-art academic 0:18:19.590,0:18:25.720 verification projects. People have[br]verified compilers all the way from C-like 0:18:25.720,0:18:31.769 languages down to assembly or from ML-like[br]languages. You know, higher-order 0:18:31.769,0:18:36.959 functional imperative languages all the[br]way down to binary models of how x86 0:18:36.959,0:18:44.470 behaves. And people have verified LLVM[br]optimisation paths. There's verified 0:18:44.470,0:18:49.880 software fault isolation which I believe[br]goes faster than the original non-verified 0:18:49.880,0:18:56.130 form. Win for them. There's a verified[br]hypervisor from Nectar group in Australia. 0:18:56.130,0:19:00.570 Right? Verified secure hypervisor with[br]proofs of security properties. There's any 0:19:00.570,0:19:06.690 amount of work verifying crypto protocols,[br]sometimes with respect to assumptions - I 0:19:06.690,0:19:12.510 think reasonable assumption on what the[br]underlying mathematics is giving you. So 0:19:12.510,0:19:22.919 we can do this. I maybe have to explain a[br]bit what I mean by "prove". To some 0:19:22.919,0:19:30.990 computery people, "I proved it works"[br]means "I tested it a little bit". 0:19:30.990,0:19:32.510 laughter 0:19:32.510,0:19:34.960 In extremis, "it compiled". 0:19:34.960,0:19:44.120 laughterapplause 0:19:44.120,0:19:50.770 To a program analysis person, you might[br]run some sophisticated tool but written in 0:19:50.770,0:19:54.649 untrusted code and possibly doing an[br]approximate and maybe unsound analysis and 0:19:54.649,0:19:59.720 you might find an absence of certain kinds[br]of bugs. That's tremendously useful; it's 0:19:59.720,0:20:04.640 not what we would really call "proved".[br]I mean, nor would they, to be fair. 0:20:04.640,0:20:08.970 Scientists generally don't use the word[br]because they know they're not doing it, 0:20:08.970,0:20:13.559 right? They're doing controlled[br]experiments, which gives them, you know, 0:20:13.559,0:20:20.070 information for or against some[br]hypothesis. Mathematician write proofs. 0:20:20.070,0:20:25.139 You probably did that when you were young.[br]And those proofs are careful mathematical 0:20:25.139,0:20:30.139 arguments usually written on paper with[br]pencils and their aim is to convince a 0:20:30.139,0:20:33.389 human being that if that mathematician was[br]really nailed up against the wall and 0:20:33.389,0:20:38.980 pushed, they could expand that into a[br]completely detailed proof. But what we 0:20:38.980,0:20:47.750 have in computing, we don't have the rich[br]mathematical structure that these people 0:20:47.750,0:20:51.950 are working with, we have a tremendous[br]amount of ad hoc and stupid detail mixed 0:20:51.950,0:20:57.769 in which a smidgin of rich mathematical[br]structure. So - and we have hundreds of 0:20:57.769,0:21:03.399 millions of lines of code. So this is just[br]not going to cut it. And if we want say 0:21:03.399,0:21:08.440 the word "prove" ever, then the only thing[br]which is legitimate is to do honest 0:21:08.440,0:21:13.230 program proof and that means you have to[br]have some system that machine-checks that 0:21:13.230,0:21:17.540 your proof is actually a proof. And[br]sometime we'll also have machines that 0:21:17.540,0:21:22.250 will sort of help and sort of hinder that[br]process, right? And there's a variety of 0:21:22.250,0:21:27.340 these systems. Coq and HOL4 and Isabelle4[br]- Isabelle/HOL and what have you that we 0:21:27.340,0:21:36.799 can look up. So using these systems we can[br]prove nontrivial facts about these. Not 0:21:36.799,0:21:41.530 necessarily that they do what you want[br]them to do, but that they meet the precise 0:21:41.530,0:21:48.519 specification that we've written down. We[br]can do that. But it's tricky, right? So 0:21:48.519,0:21:52.930 these projects, these are by academic[br]standards all quite big, you know? This is 0:21:52.930,0:21:58.009 something like, I don't know, 10 people[br]for a few years or something. By industry 0:21:58.009,0:22:03.399 scale, maybe not that big. By your scale,[br]maybe not that big. But still a lot of 0:22:03.399,0:22:08.149 work for a research group to do. And quite[br]high-end work. It can take, you know, 0:22:08.149,0:22:13.730 literally a few years to become really[br]fluent in using these tools, right? And 0:22:13.730,0:22:20.139 there isn't as yet sort of an open-source,[br]collaborative movement doing this kind of 0:22:20.139,0:22:26.710 stuff. Maybe it's the time to start. Maybe[br]in five years, ten years. I don't know. If 0:22:26.710,0:22:35.680 you want a challenge, there's a challenge.[br]But it's really quite hard, right? It's 0:22:35.680,0:22:40.830 not something that one can quite put[br]forward credibly as a solution for all 0:22:40.830,0:22:46.529 software and hardware development, right?[br]So that leads me to, like an intermediate 0:22:46.529,0:22:53.399 point. That should have been a four there.[br]What can we do which improves the quality 0:22:53.399,0:22:58.340 of our system, which injects some[br]mathematical rigour but which involves 0:22:58.340,0:23:06.370 less work and is kind of easy? And what we[br]can do is take some of these interfaces 0:23:06.370,0:23:11.679 and be precise about what they are. And[br]initially, we have to do that after the 0:23:11.679,0:23:16.941 fact. We have to reverse-engineer good[br]specs of existing stuff. But then we can 0:23:16.941,0:23:23.470 use the same techniques to make much[br]cleaner and better-tested and things which 0:23:23.470,0:23:30.670 are easier to test in the future. That's[br]the idea. So my colleagues and I have been 0:23:30.670,0:23:35.360 doing this kind of stuff for the last[br]quite a few years and I just want to give 0:23:35.360,0:23:41.570 you just a - two little vignettes just to[br]sort of show you how it goes. So I can't 0:23:41.570,0:23:46.759 give you much detail. And this is,[br]obviously, joint work with a whole bunch 0:23:46.759,0:23:55.399 of very smart and good people, some of[br]which I name here. So this is not me, this 0:23:55.399,0:24:00.840 is a whole community effort. So I'm going[br]to talk for a little bit about the 0:24:00.840,0:24:05.799 behaviour of multiprocessors, so at that[br]hardware interface. And I'm going to talk 0:24:05.799,0:24:11.090 a tiny little bit about the behaviour of[br]the TCP and socket API, right? Network 0:24:11.090,0:24:17.169 protocols. So - and there'll be some[br]similarities and some differences between 0:24:17.169,0:24:24.990 these two things. So multiprocessors. You[br]probably want your computers to go fast. 0:24:24.990,0:24:31.090 Most people do. And it's an obvious idea[br]to glom together, because processors don't 0:24:31.090,0:24:35.529 go that fast, let's glom together a whole[br]bunch of processors and make them all talk 0:24:35.529,0:24:46.150 to the same memory. This is not a new[br]idea. It goes back at least to 1962 when 0:24:46.150,0:24:51.380 the Burroughs D825 had, I think, two[br]processors talking to a single shared 0:24:51.380,0:24:55.850 memory. It had outstanding features[br]including truly modular hardware with 0:24:55.850,0:25:02.649 parallel processing throughout, and - some[br]things do not change very much - the 0:25:02.649,0:25:11.640 complement of compiling languages was to[br]be expanded, right? 1962, so that'll be 52 0:25:11.640,0:25:18.820 years ago. Deary me. Okay. Now, how do[br]these things behave? So let me show you 0:25:18.820,0:25:22.899 some code, right? It's a hacker[br]conference. There should be code. Let me 0:25:22.899,0:25:29.030 show you two example programs and these[br]will both be programs with two threads and 0:25:29.030,0:25:34.570 they will both be acting on two shared[br]variables, X and Y. And in the initial 0:25:34.570,0:25:42.899 state, X and Y are both zero. So, the[br]first program. On this thread we write X 0:25:42.899,0:25:48.970 and then we read Y, and over here we write[br]Y and then we read X, right? Now, these 0:25:48.970,0:25:53.629 are operating, you know, in an[br]interleaving concurrency, you might think, 0:25:53.629,0:25:58.030 so there's no synchronisation between[br]those two threads so we might interleave 0:25:58.030,0:26:03.269 them. We might run all of thread 0 before[br]all of thread 1 or all of thread 1 before 0:26:03.269,0:26:10.340 all of thread 0 or they might be mixed in[br]like that or like that or like that or 0:26:10.340,0:26:16.940 like that. There's six possible ways of[br]interleaving two lists of two things. And 0:26:16.940,0:26:22.980 in all of those ways you never see in the[br]same execution that this reads from the 0:26:22.980,0:26:30.169 initial state AND this reads from the[br]initial state. You never see that. So you 0:26:30.169,0:26:35.479 might expect and you might assume in your[br]code that these are the possible outcomes. 0:26:35.479,0:26:40.490 So what happens if you were to actually[br]run that code on my laptop in a particular 0:26:40.490,0:26:46.970 test harness? Well, okay, you see[br]occasionally they're quite well 0:26:46.970,0:26:52.499 synchronised and they both read the other[br]guy's right. Sorry, big call. They both 0:26:52.499,0:26:57.510 read the other guy's right. And quite[br]often one thread comes completely before 0:26:57.510,0:27:09.769 the other. But sometimes, sometimes they[br]both see 0. Huh. Rats. Well, that's 0:27:09.769,0:27:18.840 interesting. Let me show you another[br]program. So now, thread 0 writes some 0:27:18.840,0:27:23.440 stuff. Maybe it writes a big bunch of[br]data, maybe multi-word data. And then it 0:27:23.440,0:27:28.249 writes, let's say that's a flag announcing[br]to some another thread that that data is 0:27:28.249,0:27:33.249 ready to go now. And the other thread[br]busy-waits reading that value until it 0:27:33.249,0:27:38.710 gets 1, until it sees the flag. And then[br]it reads the data. So you might want, as a 0:27:38.710,0:27:43.610 programmer, you might want a guarantee[br]that this read will always see the data 0:27:43.610,0:27:48.759 that you have initialised. Hey? That would[br]be like a message-passing kind of an 0:27:48.759,0:27:56.229 idiom. So what happens if you run that?[br]Well, on the x86 processors that we've 0:27:56.229,0:28:03.929 tested, you always see that value. Good.[br]This is a descent coding idiom on x86. All 0:28:03.929,0:28:12.230 right. On Arma IBM Power processors,[br]however, you see sometimes you just ignore 0:28:12.230,0:28:16.850 the value written and you see the initial[br]state. So this is not a good 0:28:16.850,0:28:25.539 message-passing idiom, right? So what's[br]going on? Well, these behaviours, it might 0:28:25.539,0:28:31.260 be surprising, eh? And when you see[br]surprising behaviour in hardware, you have 0:28:31.260,0:28:36.100 two choices. Either it's a bug in the[br]hardware, and we have found a number of 0:28:36.100,0:28:45.259 bugs in hardware. It's always - it's very[br]rewarding. Or it's a bug in your 0:28:45.259,0:28:51.350 expectation. Or it's a bug in your test[br]harness. Right? So what you do is you walk 0:28:51.350,0:28:58.050 along and you walk up to your friendly[br]processor architect in IBM or ARM or x86 0:28:58.050,0:29:04.770 land, and we have worked with all of these[br]people. And you ask them, "Is this a bug?" 0:29:04.770,0:29:12.360 And a processor architect is a person who[br]is - has the authority to decide whether 0:29:12.360,0:29:16.240 some behaviour is intended or is a bug. 0:29:16.240,0:29:17.750 laughter 0:29:17.750,0:29:24.669 That's what they do, essentially. And for[br]these, they say, "Oh, we meant it to be 0:29:24.669,0:29:29.239 like that". Right? No question. "We meant[br]it to be like that. That's perfectly 0:29:29.239,0:29:36.559 proper. We have, because you and everybody[br]else wanted their computers to go fast, we 0:29:36.559,0:29:41.690 as processor designers have introduced all[br]manner of sophisticated optimizations, 0:29:41.690,0:29:45.200 which if you were running sequential code,[br]you would never notice, but if you start 0:29:45.200,0:29:49.289 running fancy concurrent code like this,[br]fancy concurrent code that isn't just 0:29:49.289,0:29:56.049 trivially locked, you can notice", so this[br]is intentional behaviour. And if you want 0:29:56.049,0:30:00.129 to write that code, like a mutual[br]exclusion algorithm or a message-passing 0:30:00.129,0:30:06.460 algorithm or something, then you need to[br]insert special instructions, memory 0:30:06.460,0:30:11.180 barrier instructions. And so you go away[br]and you say, "What do they do?" And you 0:30:11.180,0:30:18.949 look up in the vendor documentation. And[br]you get stuff like this. I'll read it out: 0:30:18.949,0:30:22.190 (Reads quickly) "For any applicable pair[br]AB, the memory barrier ensures that a will 0:30:22.190,0:30:24.740 be performed with respect to any processor[br]or mechanism, to the extent required by 0:30:24.740,0:30:27.590 the associated memory coherence required[br]at beauties, before b is performed with 0:30:27.590,0:30:30.360 respect to that prior or mechanism. A[br]includes all applicable storage accesses 0:30:30.360,0:30:33.090 by any such processor or mechanism that[br]have been performed with respect to P1 0:30:33.090,0:30:36.210 before the memory barrier is created. B[br]includes all applicable storage accesses 0:30:36.210,0:30:38.670 by any such processor or mechanism that[br]are performed after a Load instruction 0:30:38.670,0:30:41.290 executed by that processor or mechanism[br]has returned the value stored by a store 0:30:41.290,0:30:42.700 that is in B." 0:30:42.700,0:30:49.850 applause 0:30:49.850,0:30:58.630 Hands up if you understand what that[br]means? Hands up if you think that if you 0:30:58.630,0:31:02.139 thought about it and read the rest of the[br]book, you would understand or you could 0:31:02.139,0:31:09.730 understand what that means? Hands up the[br]people who think that we're doomed forever? 0:31:09.730,0:31:12.920 laughter 0:31:12.920,0:31:16.979 So I'm sorry to the first few groups,[br]but the last ones are right. 0:31:16.979,0:31:22.299 laughterapplause 0:31:22.299,0:31:28.850 This looks like it's really intricate and[br]really carefully thought out stuff. And we 0:31:28.850,0:31:32.220 thought that for a while and we spent[br]literally years trying to make precise 0:31:32.220,0:31:37.159 mathematical models that give precise[br]meaning to these words, but actually it 0:31:37.159,0:31:45.460 can't be done. So what do you have to do[br]in that circumstance? You have to go away 0:31:45.460,0:31:52.120 and you have to invent some kind of a[br]model. And there's a - this is a really 0:31:52.120,0:31:56.190 fundamental problem, that we - on the one[br]hand, we develop our software by this 0:31:56.190,0:32:01.960 trial and error process, but on the other[br]hand are points like this. We have these 0:32:01.960,0:32:06.470 loose specifications, supposedly, that[br]have to cover all manner of behaviour of 0:32:06.470,0:32:10.730 many generations of processors that might[br]all behave the same way, written just in 0:32:10.730,0:32:16.769 text. And we tell people they should write[br]"to the spec", but the only way they have 0:32:16.769,0:32:22.450 of developing their code is to run it and[br]run particular executions on the 0:32:22.450,0:32:26.960 particular hardware implementations that[br]they have, whose relationship to the spec 0:32:26.960,0:32:34.740 is very elusive. We can't use those thick[br]books that you get or those quite weighty 0:32:34.740,0:32:39.260 PDFs that you get these days from the[br]processor vendors to test programs. You 0:32:39.260,0:32:43.970 can't feed a program into that thick book[br]and get output out. And you can't test 0:32:43.970,0:32:47.429 processor implementations and you can't[br]prove anything and you can't even 0:32:47.429,0:32:53.430 communicate between human beings, right?[br]So these things, really, they don't exist. 0:32:53.430,0:32:58.500 So what can we do? Well, the best we can[br]do at this point in time is a bit of 0:32:58.500,0:33:04.950 empirical science, right? So we can invent[br]some model off the top of our heads, 0:33:04.950,0:33:09.649 trying to describe just the[br]programmer-visible behaviour, not the 0:33:09.649,0:33:15.529 internal structure. And we can make a[br]tool. Because that's not prose now. Now we 0:33:15.529,0:33:18.950 can - now it's stuff. It's real[br]mathematics and we can turn that into code 0:33:18.950,0:33:23.200 and execute it. We can make tools that[br]take programs and execute it and - small 0:33:23.200,0:33:28.280 programs - tell us the set of all[br]behaviours allowed by that model. And then 0:33:28.280,0:33:33.759 we can compare those sets of behaviours[br]against the experimental observations. And 0:33:33.759,0:33:38.360 you might have to fix the model and you[br]might find hard bugs. And then the model 0:33:38.360,0:33:42.869 also has to make sense to the architect so[br]you can discuss with the architects what 0:33:42.869,0:33:47.100 they intend and what their institution is,[br]and then you can also prove some other 0:33:47.100,0:33:50.980 facts about it to give a bit more[br]assurance and then because you probably 0:33:50.980,0:33:57.249 got this wrong the first couple of times,[br]you can go back. And this results or has 0:33:57.249,0:34:04.509 resulted in models which are not[br]guaranteed to be correct, but they are 0:34:04.509,0:34:08.639 effectively the de facto standard for[br]understanding the concurrency behaviour of 0:34:08.639,0:34:14.300 these things, you know? And various people[br]use them. And just to give you a tiny 0:34:14.300,0:34:18.570 snippet - I'll have to speed up a bit - a[br]tiny snippet of what the model looks like, 0:34:18.570,0:34:22.969 it's basically just an abstract machine.[br]It's got some stuff for the threads that 0:34:22.969,0:34:26.949 handles the programmer-visible speculative[br]execution, and some stuff for a storage 0:34:26.949,0:34:32.739 subsystem that handles the fact that in[br]these architectures, memory rights can be 0:34:32.739,0:34:38.270 propagated to different threads at[br]different times, right? And there's a - 0:34:38.270,0:34:42.540 that model has a state which is just a[br]collection of some sets and lists and 0:34:42.540,0:34:47.230 partial orders and a few other things[br]which I won't talk about. And it has some 0:34:47.230,0:34:51.030 transitions, right? In any state, there[br]might be several possible transitions. 0:34:51.030,0:34:54.699 This is just a sample. I'm not going to[br]explain all of this. But when you want to 0:34:54.699,0:34:58.570 propagate a write to another thread, there[br]have to be some preconditions that you have to 0:34:58.570,0:35:07.140 satisfy. And than there is an action. It's not[br]amazingly complicated, really. So this is text, 0:35:07.140,0:35:13.319 but it's very precisely written text and it has[br]the great advantage that you can go through 0:35:13.319,0:35:16.570 these bullet points with your friendly[br]architect and say, "Is this what you 0:35:16.570,0:35:22.440 mean?" And they can think about it and say[br]yes or no. But for the actual definition, 0:35:22.440,0:35:27.309 that's in mathematics but it's mathematics[br]that's quite close to code. I mean, it's 0:35:27.309,0:35:32.380 terribly close to pure, functional code[br]with outside effects. And just to give you 0:35:32.380,0:35:37.230 an idea, this is some of it and those are[br]three of those conditions in the real, 0:35:37.230,0:35:42.660 honest, true version. Then we can take[br]this mathematics and because it's in a 0:35:42.660,0:35:49.270 particular style, we can compile this into[br]actually OCaml and then OCaml byte code 0:35:49.270,0:35:55.770 and then we can run it for batch jobs. But[br]then you can compile that OCaml byte code 0:35:55.770,0:36:02.579 to JavaScript and glue on an user interface[br]and stick it into a web browser and then you 0:36:02.579,0:36:07.760 have a web interface that lets people[br]explore this model. And that's also a 0:36:07.760,0:36:16.190 necessary part of validating that it makes[br]sense. Okay. This is not rocket science. 0:36:16.190,0:36:21.710 You've missed the talk for rocket science,[br]I'm sorry. All we're doing here is 0:36:21.710,0:36:28.630 specifying the intended behaviour of a[br]system. Okay? That's - it's not very 0:36:28.630,0:36:32.490 technical but it's unusual. And we happen[br]to be doing it in this mathematical 0:36:32.490,0:36:37.049 language, but you could use in fact almost[br]any language so long as you understand 0:36:37.049,0:36:40.170 what you're doing. What you have to[br]understand is that you're writing 0:36:40.170,0:36:45.230 something which is not an implementation.[br]It is something which, given some observed 0:36:45.230,0:36:49.810 behaviour, some arbitrary observed[br]behaviour, will be able to decide if it's 0:36:49.810,0:36:54.580 allowed, if you want it to be allowed or[br]not, right? It has to be executed as a 0:36:54.580,0:37:02.059 test oracle. The key question for getting[br]this to work is to understand how much 0:37:02.059,0:37:05.730 non-determinism or loose specification[br]there is in the system that you're working 0:37:05.730,0:37:08.890 with, right? So if everything is[br]completely deterministic, you can write a 0:37:08.890,0:37:13.839 reference implementation in the cleanest[br]language of your choice and just compare 0:37:13.839,0:37:19.270 the output of that and the output of the[br]real thing, right? But if there's more 0:37:19.270,0:37:24.000 non-determinism, then you can't do that.[br]I'm going to have to abbreviate this a 0:37:24.000,0:37:31.369 little tiny bit. So for the TCP network[br]protocols, there is more non-determinism. 0:37:31.369,0:37:35.220 You know what the TCP is, yes? A protocol[br]that gives you sort of reliable 0:37:35.220,0:37:41.720 connections, assuming that the world is[br]made of good people, right? You look at 0:37:41.720,0:37:47.640 the standards for TCP, and they're the[br]same kind of wibbly text from the 1980s 0:37:47.640,0:37:52.400 and you look at the implementations and[br]they are a ghastly mess. And you try and 0:37:52.400,0:37:55.690 understand the relationship between the[br]two of them and you can't because the 0:37:55.690,0:38:01.740 standard, again, is not the definition. It[br]doesn't define in all circumstances what 0:38:01.740,0:38:07.010 behaviour is allowed and what isn't. So[br]again, we had to make up a specification 0:38:07.010,0:38:14.310 in the first instance just of this host, a[br]single endpoint, and observing its 0:38:14.310,0:38:18.510 sockets, API, calls and returns and its[br]behaviour on the wire, right? And when 0:38:18.510,0:38:21.969 you've decided - and a few internal debug[br]events. And when you've decided what to 0:38:21.969,0:38:26.619 observe, then observation is just a trace,[br]it's just a list of those events. And you 0:38:26.619,0:38:34.750 have to be able to ask your spec "Is that[br]list admissible or not"? But there's a lot 0:38:34.750,0:38:38.820 of non-determinism. Variation between the[br]implementations, variations within the 0:38:38.820,0:38:45.650 implementations. And that's internal. It's[br]not announced in the API or on the wire 0:38:45.650,0:38:50.980 maybe until much later when the[br]implementation picks a new window size or 0:38:50.980,0:38:56.890 something. You can't tell until quite a[br]lot later in the trace what it's picked. 0:38:56.890,0:39:02.740 And that makes the job of checking whether[br]a trace is admissible by the spec much 0:39:02.740,0:39:08.559 harder than it has to be. Right? And if[br]you had designed the TCP protocol and 0:39:08.559,0:39:14.470 those implementations for to be testable[br]in this very discriminating way, back in 0:39:14.470,0:39:20.470 the 1980s when you were designing TCP[br]protocol, it would be much easier. And for 0:39:20.470,0:39:25.650 new protocols, one should make sure that[br]you're doing this in this particular way. 0:39:25.650,0:39:30.930 I don't think I want to talk about that[br]slide. That's just one of the rules of 0:39:30.930,0:39:35.480 that specification. But the key fact about[br]that spec is the - well, we handled the 0:39:35.480,0:39:40.210 real protocols for arbitrary inputs. But[br]it's structured not just for this 0:39:40.210,0:39:46.240 executability as a test oracle, but it's[br]structured for clarity, not performance, 0:39:46.240,0:39:49.750 which is scarcely ever what anyone ever[br]does, right? So it's a whole new kind of 0:39:49.750,0:39:54.480 thing. And the testing is very[br]discriminating. So we found all manner of 0:39:54.480,0:40:03.839 amusing and bizarre bugs which I think I[br]don't have time to talk about. Okay. So 0:40:03.839,0:40:12.610 I've described three kinds of things that[br]we might do. The first thing - for 0:40:12.610,0:40:18.800 goodness sake - I mean, it's just a[br]no-brainer. Just do it already. Everybody. 0:40:18.800,0:40:29.670 All of you. All of you. Right? This middle[br]zone is a very interesting zone as far as 0:40:29.670,0:40:33.920 I'm concerned, right? It's - out of what -[br]a place where we could get a very good 0:40:33.920,0:40:39.220 benefit for not that much effort, right?[br]We can do it, if necessary, post hoc. We 0:40:39.220,0:40:44.490 can also do it and even better, at design[br]time. We have to do this in a way that 0:40:44.490,0:40:48.240 makes our executions testable as test[br]oracles, and another thing that that 0:40:48.240,0:40:52.119 enables is completely random testing. When[br]you've got a test oracle, you can just 0:40:52.119,0:40:58.210 feed in random goop. This is fuzzing, but[br]better fuzzing - feed in random goop and 0:40:58.210,0:41:02.859 check at every point whether what the[br]behaviour that you're getting is allowable 0:41:02.859,0:41:09.271 or not. And then they're written for[br]clarity, not for performance, and that 0:41:09.271,0:41:13.619 means you can see what you're doing,[br]right? You can see the complexity. If you 0:41:13.619,0:41:17.140 go ahead and you add some feature to your[br]protocol or your programming language or 0:41:17.140,0:41:22.420 whatever and you're working just with text[br]specifications, then you can't see the 0:41:22.420,0:41:24.759 interactions. You just chuck in a couple[br]of paragraphs and you think for a few 0:41:24.759,0:41:30.210 minutes, right? And if you're lucky, you[br]make an implementation. But if you're 0:41:30.210,0:41:35.309 writing a spec that has to cover all the[br]cases like this, the act of doing that 0:41:35.309,0:41:41.150 encourages you to think or helps you think[br]about the excess complexity. And you might 0:41:41.150,0:41:46.690 think that's too complex, I'm being silly,[br]I'll make it simpler. And it has to be 0:41:46.690,0:41:51.630 usable for proof. So this, I think also is[br]pretty much a no-brainer. And it doesn't 0:41:51.630,0:41:56.650 require great technical, you know,[br]mathematical skill either, right? Lots of 0:41:56.650,0:42:01.760 people can do this. And then there's[br]option 3, best option, full-on formal 0:42:01.760,0:42:07.930 verification of the key components. And[br]that is now also in reach. You can imagine 0:42:07.930,0:42:16.309 secure systems made using actual verified[br]compilers and verified hypervisors with 0:42:16.309,0:42:22.540 verified TLS stacks and you can imagine[br]making things out of those or making those 0:42:22.540,0:42:27.870 and then making things out of those. And[br]maybe one should be thinking about that. 0:42:27.870,0:42:34.070 So, maybe not appropriate for everything.[br]If you were writing Word, you would not 0:42:34.070,0:42:38.599 wish to do these things. Probably, you're[br]not. But for this key infrastructure that 0:42:38.599,0:42:46.420 we really depend on, we trust even though[br]it's not trustworthy, we have to do this. 0:42:46.420,0:42:52.450 Is this a new dawn? I wonder if there's[br]anyone standing next to a light switch 0:42:52.450,0:42:58.920 that can dim them for me. I didn't ask[br]them beforehand, so... If you think back 0:42:58.920,0:43:05.540 the last 70-odd years, we started building[br]computers in around 1940. It's been pretty 0:43:05.540,0:43:12.760 dark from the point of view of rigorous,[br]solid, reliable engineering, stuff that is 0:43:12.760,0:43:21.760 actually trustworthy. Pretty dark, I would[br]say. Maybe, just maybe there's a tiny 0:43:21.760,0:43:26.339 smidgen of light coming through the doors.[br]And we start to have a little bit of a 0:43:26.339,0:43:31.109 clue and we start to get reusable models[br]of processor architectures and programming 0:43:31.109,0:43:35.540 languages and network protocols and what[br]have you. It's just the beginnings of the 0:43:35.540,0:43:39.790 analogue of that thermodynamics and[br]materials science and quality control 0:43:39.790,0:43:47.819 management and what have you that we have[br]from mechanical engineering. And we've got 0:43:47.819,0:43:57.200 no choice. If we don't, the next 75 years[br]is going to be a lot worse. You can just 0:43:57.200,0:44:02.400 imagine, right? So I went to this - as I'm[br]sure many of you do - this great talk on 0:44:02.400,0:44:08.370 some Apple boot loader exploit yesterday[br]which was relying on some feature that had 0:44:08.370,0:44:14.900 been introduced in the early '80s and was[br]still present. And you can imagine in 100 0:44:14.900,0:44:22.800 years from now, you can imagine as long as[br]human civilisation endures, the x86 0:44:22.800,0:44:28.869 architecture and the socket API and all of[br]this stuff, it will be embedded in some 0:44:28.869,0:44:34.099 monumental ghastly stack of[br]virtualisation layers forever. 0:44:34.099,0:44:45.760 laughterapplause 0:44:45.760,0:44:50.720 So I'd like to thank especially all of my[br]colleagues that have been working with me 0:44:50.720,0:44:56.550 or not with me in these directions. I'd[br]like to thank our generous funders who 0:44:56.550,0:45:01.389 support this stuff. I'd like to thank you[br]for your attention and I exhort you, 0:45:01.389,0:45:04.850 sort it out. 0:45:04.850,0:45:20.800 applause 0:45:20.800,0:45:25.300 Herald: Thank you very much, Peter, for[br]those interesting insights to our 0:45:25.300,0:45:31.269 programming. We have now time for a Q & A,[br]so those people who have to leave the 0:45:31.269,0:45:39.970 room, please do so quietly and as fast as[br]possible, so we can go on and hear what 0:45:39.970,0:45:51.270 the questions are. So we[br]start with microphone 4, please. 0:45:51.270,0:45:58.790 Mic 4: Hello. Thanks for the talk and my[br]question is if you got an oracle of the 0:45:58.790,0:46:04.270 software behaviour which can translate[br]every possible input to a correct output, 0:46:04.270,0:46:08.410 how can this be less complex and[br]error-prone than the reference 0:46:08.410,0:46:10.510 implementation? 0:46:10.510,0:46:15.620 Peter: Good question. So the first point[br]is that this oracle doesn't have to 0:46:15.620,0:46:22.010 produce the outputs. It only has to check[br]that the inputs and the outputs match up. 0:46:22.010,0:46:26.560 And then the second point is that in[br]general it might have to have much of the 0:46:26.560,0:46:33.221 same complexity, but by writing it to be[br]clear as opposed to to be fast, you may 0:46:33.221,0:46:37.730 adopt a completely different structure,[br]right? So the structure of our TCP spec is 0:46:37.730,0:46:43.540 organised by the behaviour that you see[br]for particular transitions. The structure 0:46:43.540,0:46:49.160 of a real implementation has fast-path[br]code and lots of complicated intertwined 0:46:49.160,0:46:55.309 branching and all manner of excess[br]complexity, of implementation complexity, 0:46:55.309,0:46:56.879 if you will. 0:46:56.879,0:47:01.450 Mic 4: Thanks.[br]Herald: So microphone 3, please? 0:47:01.450,0:47:06.710 Mic 3: I wanted to ask you what you[br]thought about programming by manipulating 0:47:06.710,0:47:13.859 the abstract syntax tree directly so as to[br]not allow incompilable programs because of 0:47:13.859,0:47:18.410 some, like, you're missing a semicolon or[br]something like that. What's your thoughts 0:47:18.410,0:47:20.220 on that? 0:47:20.220,0:47:23.700 Peter: So that's - in the grand scheme of[br]things, I think that's not a very big 0:47:23.700,0:47:28.890 deal, right? So there's - people have[br]worked on structured editors for lordy 0:47:28.890,0:47:33.309 knows how many years. It's not a big deal[br]because it's very easy for a compiler to 0:47:33.309,0:47:38.569 detect that kind of thing. And even more,[br]it's very easy for a compiler of a 0:47:38.569,0:47:43.140 sensibly designed language to detect type[br]errors, even for a very sophisticated type 0:47:43.140,0:47:49.589 system. So I don't think that - that's not[br]- I mean, some people might find it 0:47:49.589,0:47:52.930 helpful, but I don't think it's getting to[br]the heart of the matter. 0:47:52.930,0:47:54.250 Mic 3: Thanks. 0:47:54.250,0:47:59.850 Herald: So we've got some questions[br]from our signal angels from the IRC. 0:47:59.850,0:48:06.150 Signal angel: Yes. There's one question.[br]It's about the repository for Isabelle and 0:48:06.150,0:48:11.300 HOL proofs. It should be on source forge,[br]and you said there are no open source 0:48:11.300,0:48:15.480 repositories. What about them? 0:48:15.480,0:48:20.589 Peter: I'm not quite sure what the[br]question is, really. So there's a 0:48:20.589,0:48:28.230 repository of Isabelle formal proofs which[br]is the archival form of proofs, it's 0:48:28.230,0:48:34.790 called. I didn't really mean to say there[br]are no open source repositories and in 0:48:34.790,0:48:38.500 fact I don't know under what conditions[br]most of those proofs are licensed. They 0:48:38.500,0:48:43.210 probably are open. But there isn't an open[br]source community building these things, 0:48:43.210,0:48:47.710 right? It's still pretty[br]much an academic pursuit. 0:48:47.710,0:48:50.329 Herald: Microphone 2, please. 0:48:50.329,0:48:55.530 Mic 2: Hello. Thanks again for your talk.[br]I just want to add something that you 0:48:55.530,0:49:01.970 didn't address, and that's that we[br]actually need more good studies in 0:49:01.970,0:49:07.970 software engineering. We often hear a lot[br]of experts and also very well-known 0:49:07.970,0:49:11.740 computer scientists say things like,[br]"Well, we just need to do functional 0:49:11.740,0:49:17.839 programming and OOP is bad and stuff like[br]that", which I think there's a lot of 0:49:17.839,0:49:24.109 truth to it, but we actually need studies[br]where we test these kinds of claims that 0:49:24.109,0:49:29.530 we make, because when you look at other[br]fields which it also did compare to, like, 0:49:29.530,0:49:34.250 medicine, if somebody comes around and[br]is well-known and says things like, 0:49:34.250,0:49:42.230 "homeopathy works", we don't believe them.[br]We do trials, we do good trials. And 0:49:42.230,0:49:50.430 there's a lot of myths out there, like, or[br]not well tested things, like "hire the 0:49:50.430,0:49:54.739 best programmers, they are 100 times[br]more productive", "do steady type 0:49:54.739,0:50:00.430 systems", and stuff like that. And we need[br]to verify that this is true, that this 0:50:00.430,0:50:02.210 helps. And it's... 0:50:02.210,0:50:07.490 Peter: Okay. So in the grand scheme of[br]things, arguably you're right. In the 0:50:07.490,0:50:12.739 grandest scheme of things, computer[br]science is actually a branch of psychology 0:50:12.739,0:50:18.609 or really sociology. We are trying to[br]understand how large groups of people can 0:50:18.609,0:50:25.839 combine to make things which are insanely[br]complicated. Now, for - would it be good 0:50:25.839,0:50:31.329 if we had, you know, solid evidence that[br]programming in this language was better 0:50:31.329,0:50:38.450 than programming in that language? Well,[br]yes, but it's staggeringly difficult and 0:50:38.450,0:50:45.079 expensive to do honest experiments of that[br]nature and they're scarcely ever done, 0:50:45.079,0:50:49.040 right? You might do little experiments on[br]little groups of students but it's really 0:50:49.040,0:50:55.930 difficult. And then some of these things[br]which I'm saying, when one is familiar 0:50:55.930,0:51:01.740 with the different possible options, are[br]just blindingly obvious. I mean, there are 0:51:01.740,0:51:09.161 reasons why these people are using OCaml[br]for their system programming, right? It's 0:51:09.161,0:51:16.609 not - you know, it's not - "Homeopathy is[br]right", but "Homeopathy is wrong" which is 0:51:16.609,0:51:21.230 the kind of argument being put forward. 0:51:21.230,0:51:24.260 Herald: Question from[br]microphone 5, please. 0:51:24.260,0:51:29.609 Mic 5: So where are you using ECC[br]memory for your testing - up here. 0:51:29.609,0:51:33.510 Peter: When you say up here,[br]there's a bit of ambiguity. 0:51:33.510,0:51:34.819 Mic 5: Here. 0:51:34.819,0:51:37.860 Peter: Thank you. Say again? 0:51:37.860,0:51:43.119 Mic 5: So where are you using ECC memory[br]for the testing you showed about the 0:51:43.119,0:51:51.070 results of the multithreaded results due[br]to memory barriers and memory reorderings? 0:51:51.070,0:51:55.450 Peter: Well... 0:51:55.450,0:52:00.130 Mic 5: Okay, but even if you were or you[br]were not, the point I want to make is that 0:52:00.130,0:52:05.950 you can expect about 1 bit flip each[br]minute in a modern computer system that 0:52:05.950,0:52:13.670 may completely change what your software[br]is doing, so perhaps we also have to look 0:52:13.670,0:52:19.180 in ways how we can work if something goes[br]wrong at the very low level so that we 0:52:19.180,0:52:25.010 kind of have a check against our[br]specification on a more higher level of 0:52:25.010,0:52:31.940 our software. So it is valuable to do good[br]engineering on the low levels, but still I 0:52:31.940,0:52:37.670 think we will not solve the problems of[br]computation and computer engineering just 0:52:37.670,0:52:42.520 by proving things in the mathematical[br]domain because computers are physical 0:52:42.520,0:52:47.739 entities. They have errors and so on and[br]we really have to deal with them as well. 0:52:47.739,0:52:54.340 Peter: So it's certainly true that there[br]are such random errors. In fact, I would 0:52:54.340,0:52:59.480 put the point that I would argue that you[br]have to be able to model the physics well 0:52:59.480,0:53:03.680 enough and you have to be able to model[br]the statistics of those errors well 0:53:03.680,0:53:07.750 enough, so that's a different kind of[br]mathematics. And it's certainly valuable 0:53:07.750,0:53:13.290 and necessary. But if you look at the[br]statistic you just quoted, is that the 0:53:13.290,0:53:22.109 main cause of why systems go wrong? Except[br]possibly for space hardware, no. So 0:53:22.109,0:53:26.499 important, yes. The most important thing[br]that we should be paying attention to? I 0:53:26.499,0:53:29.940 don't really think so. 0:53:29.940,0:53:32.980 Herald: Microphone 6, please? 0:53:32.980,0:53:41.200 Mic 6: Hi. I really think that what you're[br]proposing to specify and verify more key 0:53:41.200,0:53:47.160 components is - would be a valuable[br]addition, but how do you make sure that 0:53:47.160,0:53:51.880 your specification actually matches the[br]behaviour that you want to do or to have? 0:53:51.880,0:53:58.300 Peter: So as I described, we do as partial[br]validation of those specifications, we do 0:53:58.300,0:54:03.050 a lot of testing against the[br]implementations, against a range of 0:54:03.050,0:54:07.060 existing implementations. That's one[br]source of validation. Another source of 0:54:07.060,0:54:11.520 validation is that you talk to the[br]architects and the designers. You want the 0:54:11.520,0:54:16.929 internal structure to match their intent.[br]You want it to be comprehensible to them. 0:54:16.929,0:54:21.510 Another kind of validation that we do is[br]prove properties about them. So we proved 0:54:21.510,0:54:28.750 that you can correctly compile from a C11[br]concurrency, a mathematical model of that, 0:54:28.750,0:54:35.432 to a IBM Power concurrency. And that kind[br]of proof gives you a bit more level of 0:54:35.432,0:54:39.450 assurance in both. So none of this is[br]giving you a total guarantee. You 0:54:39.450,0:54:43.980 certainly don't claim a total guarantee.[br]But it gives you pretty good levels of 0:54:43.980,0:54:47.880 assurance by normal standards. 0:54:47.880,0:54:49.580 Herald: Mic 4? 0:54:49.580,0:54:56.309 Mic 4: Yes. Thanks again. You proposed[br]random tests, and with my expertise, it's 0:54:56.309,0:55:02.859 very annoying if you have tests where the[br]outcome is sometimes a failure and you 0:55:02.859,0:55:10.499 want to have reproducible tests to fix a[br]bug. So how do we bring this aspects to 0:55:10.499,0:55:17.850 test more variety in data and to have it[br]reproducible together? 0:55:17.850,0:55:24.800 Peter: Well, if - as I was mentioning, for[br]the TCP thing, one of the - so the problem 0:55:24.800,0:55:29.150 is reproducibility is exactly this[br]internal non-determinism, right? The 0:55:29.150,0:55:33.389 system makes this scheduling choice or[br]what have you and you can't see what it 0:55:33.389,0:55:39.599 is, or the checker can't see what it is.[br]So one way to do that is to really design 0:55:39.599,0:55:45.369 the protocols in a different way to expose[br]that non-determinism. The other fact about 0:55:45.369,0:55:53.421 this kind of general purpose specification[br]as test oracle idea is that in some sense, 0:55:53.421,0:55:59.240 it doesn't have to be reproducible. Right?[br]The specification is giving a yes-no 0:55:59.240,0:56:05.510 answer for an arbitrary test. And that[br]means that you can use arbitrary tests. 0:56:05.510,0:56:09.730 Figuring out the root causes of the[br]differences afterwards may be tricky, but 0:56:09.730,0:56:13.200 you can use them for assurance. 0:56:13.200,0:56:18.489 Herald: So we now got time for two[br]more questions. Microphone 1, please? 0:56:18.489,0:56:23.069 Mic 1: Hiya. Thanks for a great talk. So[br]what you've described seems to be a middle 0:56:23.069,0:56:28.259 ground between a full-on, formal[br]verification and more practical testing, 0:56:28.259,0:56:34.309 like, in between. So where do you see in[br]the future where formal verifications will 0:56:34.309,0:56:39.831 go? Will they converge to the middle or[br]whether it will still be there just to 0:56:39.831,0:56:44.299 verify something that's more[br]well-specified? 0:56:44.299,0:56:49.940 Peter: So the progress of sort of the[br]whole subject of formal verification in 0:56:49.940,0:56:56.930 general, if you look at that over the last[br]10 years or so, it's been very - it's been 0:56:56.930,0:57:01.349 really amazing compared with what we could[br]do in the 80s and 90s and early 2000s, 0:57:01.349,0:57:06.480 right? So the scope of things, the scale[br]of things which are - for which it is 0:57:06.480,0:57:11.180 becoming feasible to do verification is[br]getting bigger. And I think that will 0:57:11.180,0:57:16.339 continue. You know, I don't know when you[br]might see a completely verified TOR 0:57:16.339,0:57:23.980 client, let's say, but that's not[br]inconceivable now. And 20 years ago, that 0:57:23.980,0:57:29.470 would have been completely inconceivable.[br]So that is expanding and at the same time, 0:57:29.470,0:57:36.829 I hope we see more of this interface, text[br]based oracle specification - and these - 0:57:36.829,0:57:41.000 when you're doing formal verification of a[br]piece of the system, you have to have 0:57:41.000,0:57:47.810 these indicators already defined, all[br]right? So these two fit very well together. 0:57:47.810,0:57:51.489 Herald: So the last question[br]from microphone 2, please. 0:57:51.489,0:57:56.040 Mic 2: Hi. You mentioned that you often[br]find bugs in hardware. Is there any effort 0:57:56.040,0:58:00.310 to verify the transistors on chips[br]themselves as well? 0:58:00.310,0:58:03.440 Peter: So there's a whole world of[br]hardware verification. That wasn't my 0:58:03.440,0:58:10.640 topic today. And most of the big processor[br]vendors have teams working on this. 0:58:10.640,0:58:14.440 Unsurprisingly, if you remember your[br]history, many of them have teams focusing 0:58:14.440,0:58:21.059 on the floating-point behaviour of their[br]processors. And they're doing - they're 0:58:21.059,0:58:26.981 also doing by the standards of ten years[br]ago, pretty spectacularly well. So there 0:58:26.981,0:58:33.181 are companies that have the whole of some[br]execution unit formally verified. There's 0:58:33.181,0:58:36.979 been a lot of work over the years on[br]verification of cache protocols and such 0:58:36.979,0:58:43.100 like. Right? There's a lot of work on not[br]verifying the hardware as a whole, but 0:58:43.100,0:58:47.519 verifying that, you know, the RTL-level[br]description of the hardware matches some 0:58:47.519,0:58:54.060 lower level description. So[br]there is a lot of that going on. 0:58:54.060,0:58:57.619 Herald: Thank you very much[br]again for this great talk. 0:58:57.619,0:59:00.480 Give him another applause.[br]Thank you, Peter. 0:59:00.480,0:59:01.790 Peter: Thank you. 0:59:01.790,0:59:04.219 applause 0:59:04.219,0:59:15.000 subtitles created by c3subtitles.de[br]Join, and help us!