0:00:09.519,0:00:14.479
applause
0:00:14.479,0:00:16.930
Peter Sewell: Thank you for that[br]introduction and to the organisers for
0:00:16.930,0:00:23.459
this opportunity to speak to you guys. So[br]my name is Peter Sewell. I'm an academic
0:00:23.459,0:00:30.609
computer scientist and as such, I bear a[br]heavy burden of guilt for what we have to
0:00:30.609,0:00:38.040
work with. Shared guilt, to be fair, but[br]guilt all the same. In many, many years of
0:00:38.040,0:00:44.670
higher education, I know two things about[br]computers. Maybe you've figured them out
0:00:44.670,0:00:55.000
slightly faster. The first thing, there's[br]an awful lot of them about. Shocking. And
0:00:55.000,0:01:03.300
the second is that they go wrong. They go[br]wrong a lot. Really a lot. A lot. A lot!
0:01:03.300,0:01:09.009
And they go wrong in many interestingly[br]different ways. So sometimes they go wrong
0:01:09.009,0:01:17.850
in spectacular ways, right? With[br]fireworks. And here, we see...
0:01:17.850,0:01:24.030
laughterapplause
0:01:24.030,0:01:31.260
Here we see half a billion dollars worth[br]of sparkle. You don't get that every day.
0:01:31.260,0:01:34.970
And so what was going on? So as I[br]understand it, what was going on was that
0:01:34.970,0:01:40.600
the - this was the first flight of the[br]Ariane 5 launcher and it reused some of
0:01:40.600,0:01:45.851
the guidance software from Ariane 4.[br]That's fine. Good, tested code. Should
0:01:45.851,0:01:53.030
reuse it. But the launch profile for[br]Ariane 5 went sideways a bit more - on
0:01:53.030,0:01:58.830
purpose - than the Ariane 5 launch. And as[br]a result, one of the sideways values in
0:01:58.830,0:02:04.740
the guidance control software overflowed.[br]Bad. And hence, the guidance software
0:02:04.740,0:02:08.449
decided that something bad was going wrong[br]and in order to not land rockets on
0:02:08.449,0:02:15.841
people's heads, it kindly blew itself up.[br]Okay. So that's a spectacular. Sometimes
0:02:15.841,0:02:21.200
they go wrong in insidious ways. And you[br]people probably know much more about that
0:02:21.200,0:02:29.370
than me. Very insidious ways. Exploited by[br]bad people to our detriment. And they go
0:02:29.370,0:02:36.990
wrong in insidious ways, exploited by our[br]kindly governments for our protection.
0:02:36.990,0:02:44.959
Right? And these - there are many, many,[br]many different causes of these things but
0:02:44.959,0:02:49.420
many of these causes are completely[br]trivial. You know, you have one line of
0:02:49.420,0:02:56.150
code in the wrong place - something like[br]that, right? So that's very, very bad and
0:02:56.150,0:03:01.040
you're all conscious of how bad that is.[br]But from my point of view, this is as
0:03:01.040,0:03:06.959
nothing with the real disaster that we[br]have been facing and continue to face.
0:03:06.959,0:03:14.950
Right? The real disaster is the enormous[br]waste of human life and energy and effort
0:03:14.950,0:03:21.360
and emotion dealing with these crap[br]machines that we've built. Right? And it's
0:03:21.360,0:03:25.489
hard to quantify that. I did a little[br]experiment a couple of months ago. I
0:03:25.489,0:03:34.030
Googled up "Android problem" and "Windows[br]problem" and got about 300 million hits
0:03:34.030,0:03:43.840
for each, right? Somehow indicative of the[br]amount of wrongness. So this should you
0:03:43.840,0:03:49.790
seem kind of familiar to you. If you think[br]back - think back to when you were young.
0:03:49.790,0:03:55.700
Maybe back in the 19th century when you[br]were young, and what were you doing then
0:03:55.700,0:04:00.940
as a community of hackers and makers and[br]builders? Well, this was the full-on
0:04:00.940,0:04:07.130
industrial revolution. You were building[br]stuff. You were building beautiful bridges
0:04:07.130,0:04:14.739
of cast-iron and stone with numerous[br]arches and pillars. And every so often
0:04:14.739,0:04:22.300
they would fall down. I believe someone[br]miscalculated the wind loading. And we
0:04:22.300,0:04:30.630
would build magical steam engines to whisk[br]us away. And every so often, they would
0:04:30.630,0:04:35.770
blow up. And that kind of thing, as you[br]may have noticed in the last hundred years
0:04:35.770,0:04:41.810
- sorry - that kind of thing doesn't[br]happen very often any more, right? We have
0:04:41.810,0:04:46.990
pretty good civil and mechanical[br]engineering. And what does it mean to have
0:04:46.990,0:04:52.620
good civil and mechanical engineering? It[br]means that we know enough thermodynamics
0:04:52.620,0:04:56.750
and materials science and quality control[br]management and all that kind of thing to
0:04:56.750,0:05:01.830
model our designs before we build them and[br]predict how they're going to behave pretty
0:05:01.830,0:05:10.750
well and with pretty high confidence,[br]right? I see an optimisation in this talk.
0:05:10.750,0:05:15.340
For computer science, for computing[br]systems we can predict with 100 per cent
0:05:15.340,0:05:20.450
certainty that they will not work.[br]Ah, time for lunch!
0:05:20.450,0:05:26.509
applause
0:05:26.509,0:05:33.330
But it's not a very satisfying answer,[br]really. So why is it so hard? Let me
0:05:33.330,0:05:38.500
discuss a couple of the reasons why[br]computing is tricky. In some sense,
0:05:38.500,0:05:45.430
trickier than steam engines. The first[br]reason is that there is just too much
0:05:45.430,0:05:50.360
code. Right? So this is one of the more[br]scary graphs I've found on the internet
0:05:50.360,0:05:55.470
recently. This is a measure of code in[br]hundreds of millions of lines. Each of
0:05:55.470,0:06:00.250
these blocks I think is 10 million lines[br]of code. And in the little letters - you
0:06:00.250,0:06:04.860
might not be able to see we have Android[br]and the Large Hadron Collider and Windows
0:06:04.860,0:06:12.820
Vista and Facebook, US Army Future Combat[br]System, Mac OS and software in a high-end
0:06:12.820,0:06:15.010
car. Right?
0:06:15.010,0:06:23.230
laughterapplause
0:06:23.230,0:06:25.320
I'm flying home.
0:06:25.320,0:06:27.810
laughter
0:06:27.810,0:06:33.520
So we that's a bit of a problem, really.[br]We're never going to get that right. Then
0:06:33.520,0:06:37.410
there's too much legacy complexity there,[br]right? Really too much. So as
0:06:37.410,0:06:41.840
right-minded, pure-minded people you might[br]think that software and hardware should be
0:06:41.840,0:06:46.490
architected beautifully with regular[br]structure a bit like a Mies van der Rohe
0:06:46.490,0:06:53.660
skyscraper or something. Clean. Okay. You[br]know that's not realistic, right? You
0:06:53.660,0:06:59.470
might expect a bit of baroqueness. You[br]might expect maybe something more like
0:06:59.470,0:07:02.680
this. You know, it's a bit twiddly but it[br]still hangs together. It's still got
0:07:02.680,0:07:08.160
structure. It still works. But then you[br]stop and you think and you realise that
0:07:08.160,0:07:15.770
software and hardware is built by very[br]smart people and usually in big groups and
0:07:15.770,0:07:20.300
subject to a whole assortment of rather[br]intense commercial pressure and related
0:07:20.300,0:07:25.210
open-source pressure and using the best[br]components and tools that they know and
0:07:25.210,0:07:27.169
they like.
0:07:27.169,0:07:34.520
laughterapplause
0:07:34.520,0:07:38.990
Someone from the room: (Indistinct)[br]You forgot management!
0:07:38.990,0:07:42.480
And the best management that[br]human beings can arrange.
0:07:42.480,0:07:49.680
laughterapplause
0:07:49.680,0:07:54.910
So we end up with something[br]spectacularly ingenious.
0:07:54.910,0:08:05.490
laughterapplause
0:08:05.490,0:08:09.730
You can see here, if you look[br]closely, you can see all the parts.
0:08:09.730,0:08:11.000
laughter
0:08:11.000,0:08:15.770
This was just a found picture from the[br]internet. But you can see there's a bunch
0:08:15.770,0:08:19.060
of, you know, transparent sort of[br]silicon-like stuff down here. Let's say
0:08:19.060,0:08:25.079
that's our hardware. And over here, I see[br]a C compiler with all of its phases.
0:08:25.079,0:08:28.599
laughter
0:08:28.599,0:08:36.930
And these bottles, that's the file system,[br]I reckon. And there's a TCP stack in here
0:08:36.930,0:08:43.690
somewhere. And there's a piece that[br]belongs to the NSA, obviously. And up
0:08:43.690,0:08:50.259
here, I think up here - yeah, there's a[br]JavaScript engine and a TLS stack and a
0:08:50.259,0:08:56.050
web browser. And then perched at the top[br]is the software that most people use to
0:08:56.050,0:08:57.910
talk to their banks.
0:08:57.910,0:09:01.489
laughter
0:09:01.489,0:09:09.489
It's just - it's just insane, right? And[br]then we have to look at how we build these
0:09:09.489,0:09:14.089
pieces, right? So we build them[br]fundamentally by a process of trial and
0:09:14.089,0:09:20.649
error, right? Just occasionally we write[br]some specifications in text. And then we
0:09:20.649,0:09:26.029
write some code and maybe we write a few[br]ad hoc texts and then we test and fix
0:09:26.029,0:09:30.610
until the thing is marketable and then we[br]test and fix and extend until the thing is
0:09:30.610,0:09:36.069
no longer marketable and then we keep on[br]using it until we can't anymore. Right? So
0:09:36.069,0:09:40.259
the fundamental process that we're using[br]to develop these systems is trial and
0:09:40.259,0:09:47.930
error by exploring particular execution[br]paths of the code on particular inputs.
0:09:47.930,0:09:54.459
But - you know this, but I want to[br]highlight - there are too many. The number
0:09:54.459,0:09:59.649
of execution paths scales typically at[br]least exponentially with the size of the
0:09:59.649,0:10:04.490
code and the number of states scales[br]exponentially with the size of the data
0:10:04.490,0:10:08.790
that you're dealing with and the number of[br]possible inputs is also shockingly large.
0:10:08.790,0:10:14.920
There is no way that we can do this and we[br]should perhaps stop deluding ourselves
0:10:14.920,0:10:21.129
that there is. And then the most[br]fundamental reason that computers are hard
0:10:21.129,0:10:27.899
is that they are too discrete, right? If[br]you take a bridge or, I don't know, a
0:10:27.899,0:10:36.519
chair or something - where's something[br]slightly bendable? You take a chair and
0:10:36.519,0:10:44.199
you exert a bit more force on it than it's[br]used to - I'm not very strong - and it
0:10:44.199,0:10:51.299
bends a bit. It deforms continuously. If[br]you take a bridge and you tighten up one
0:10:51.299,0:10:55.239
of the bolts a bit too much or you[br]underspecify one of the girders by 10 per
0:10:55.239,0:11:01.019
cent, occasionally you'll reach a point of[br]catastrophic failure but usually it will
0:11:01.019,0:11:04.089
just be a little bit weaker or a little[br]bit stronger or a little bit wibbly.
0:11:04.089,0:11:08.960
Right? But computer systems, you change a[br]bit - sometimes it doesn't make a
0:11:08.960,0:11:13.179
difference but disturbingly often you[br]change a line of code - as in system of
0:11:13.179,0:11:19.600
those bugs that we were talking about -[br]the whole thing becomes broken, right? So
0:11:19.600,0:11:26.929
it's quite different from that traditional[br]engineering. It's really different. Okay.
0:11:26.929,0:11:34.850
So, "A New Dawn", this thing is titled.[br]What can we do? What might we do? So I'm
0:11:34.850,0:11:40.879
going to talk about several different[br]possibilities, all right? One thing, we
0:11:40.879,0:11:46.490
can do more of the stuff which is[br]traditionally called software engineering.
0:11:46.490,0:11:50.910
It's not really engineering in my book,[br]but some of it's good. I mean, it's
0:11:50.910,0:11:55.119
useful, you know? We can do more testing[br]and have more assertions in our code and
0:11:55.119,0:11:58.920
better management maybe and all that kind[br]of stuff. We can do all that - we should
0:11:58.920,0:12:04.339
do all that, but it's not going to make a[br]very big change, all right? It's not
0:12:04.339,0:12:13.989
addressing any of those root causes of the[br]problem. What else could we do? So as an
0:12:13.989,0:12:18.570
academic computer scientist, I've devoted[br]several years of my life to working in the
0:12:18.570,0:12:24.509
area of programming languages and I look[br]at the programming languages used out
0:12:24.509,0:12:36.919
there in the world today. Oh, it's just[br]disgusting! It's shocking.
0:12:36.919,0:12:43.399
applause
0:12:43.399,0:12:50.420
The reasons that languages become popular[br]and built into our shared infrastructure,
0:12:50.420,0:12:54.790
the reasons for that have almost nothing[br]to do with how well those languages are
0:12:54.790,0:13:00.600
designed and in fact whether[br]they are designed at all, right?
0:13:00.600,0:13:04.110
applause
0:13:04.110,0:13:09.040
So I didn't really want to get too much[br]hate mail after this talk so I'm going to
0:13:09.040,0:13:13.999
try and avoid naming particular languages[br]as much as I can, but I encourage you to
0:13:13.999,0:13:20.279
think away as I speak and imagine the[br]examples. No, that was - I didn't want to
0:13:20.279,0:13:23.730
get much hate mail. Sorry.
0:13:23.730,0:13:25.160
laughter
0:13:25.160,0:13:26.579
So...
0:13:26.579,0:13:32.069
laughterapplause
0:13:32.069,0:13:36.540
So at the very least, we could use[br]programming languages based on ideas from
0:13:36.540,0:13:47.389
the 1970s instead of the 1960s. I mean,[br]really, come on. Right? You know, back in
0:13:47.389,0:13:54.120
'67 or '69 or whatever we had ECPL and C[br]and only a couple of years later we had
0:13:54.120,0:14:00.069
languages that guaranteed type and memory[br]safety and enforcement of abstraction
0:14:00.069,0:14:05.089
boundaries and gave us rich mechanisms for[br]programming so we could say what we mean
0:14:05.089,0:14:11.350
and not futz about with 23 million[br]pointers, about 1 per cent of which were
0:14:11.350,0:14:16.629
completely wrong, right? So for any[br]particular - in the context of any
0:14:16.629,0:14:20.459
particular project or any particular[br]existing legacy infrastructure there may
0:14:20.459,0:14:25.789
be very good reasons why one just can't[br]make a switch. Right? But for us
0:14:25.789,0:14:31.860
collectively, there is no excuse. It's[br]just wretched, right?
0:14:31.860,0:14:37.040
applause
0:14:37.040,0:14:40.969
It's completely bonkers. And then the[br]other thing that you see as a programming
0:14:40.969,0:14:49.239
language researcher is that - er, well, to[br]make another analogy, it looks as if
0:14:49.239,0:14:55.529
anyone who was capable of driving a car[br]considered themselves an expert on car
0:14:55.529,0:15:03.559
design, right? There is in fact a rather[br]well-established subject of, you know,
0:15:03.559,0:15:08.239
programming language design and we know[br]how to specify completely precisely,
0:15:08.239,0:15:13.019
mathematically precisely not just the[br]syntax like we had in these BNF grammars
0:15:13.019,0:15:18.160
from the 1960s and still have, but also[br]the behaviour of these things. We know how
0:15:18.160,0:15:22.699
to specify a few operational semantics and[br]the type systems. And we know how to do
0:15:22.699,0:15:27.360
mathematical proofs that our language[br]designs are self-consistent, that they do
0:15:27.360,0:15:35.059
guarantee good properties of arbitrary[br]well-typed programs that you write, right?
0:15:35.059,0:15:38.759
We can do this now. We can do this, if we[br]have to, post hoc. And people are doing
0:15:38.759,0:15:45.669
this for JavaScript and C and god help[br]them, PHP and other such languages. But
0:15:45.669,0:15:50.829
even better, we can do it at design time.[br]And if you do this at design time, you see
0:15:50.829,0:15:55.509
the structure. You have to - the act of[br]writing that specification forces you to
0:15:55.509,0:15:59.939
consider all the cases and all the[br]interactions between features. You can
0:15:59.939,0:16:07.861
still get it wrong, but you get it wrong[br]in more interesting ways. Right? So people
0:16:07.861,0:16:12.119
are starting to do this now. So I think[br]there was a talk on the first day of this
0:16:12.119,0:16:17.429
by Hannes and David who are building[br]system software in maybe not the best sys
0:16:17.429,0:16:21.800
language you could possibly imagine, but[br]something which compared with C is as
0:16:21.800,0:16:27.759
manna from heaven. Anil Madhavapeddy and[br]his colleagues in Cambridge have been
0:16:27.759,0:16:32.759
working hard to build system software in[br]at least moderately modern languages. And
0:16:32.759,0:16:40.780
it works. Sometimes you have to use C but[br]surprisingly often you really don't. Yeah,
0:16:40.780,0:16:47.049
so I teach, as my introducer said, in the[br]University of Cambridge. And I teach
0:16:47.049,0:16:51.269
semantics. And one of my fears is that I[br]will accidentally let someone out in the
0:16:51.269,0:16:54.970
world who will accidentally find[br]themselves over a weekend inventing a
0:16:54.970,0:16:58.149
little scripting language which will[br]accidentally take over the world and
0:16:58.149,0:17:05.359
become popular and I won't have explained[br]to them what they have to do or even that
0:17:05.359,0:17:08.189
there is a subject they have to[br]understand. And if you want to understand
0:17:08.189,0:17:18.740
it, there's lots of stuff you can look at.[br]Okay. Let's go to the uppermost extreme.
0:17:18.740,0:17:23.829
If we want software and hardware that[br]actually works, we should prove it
0:17:23.829,0:17:30.700
correct. Right? So this is something which[br]academics, especially in the domain of
0:17:30.700,0:17:36.830
labelled "formal methods" in the '70s and[br]'80s, they've been pushing this, promoting
0:17:36.830,0:17:44.789
this as an idea and a promise and an[br]exhortation for decades, right? Why don't
0:17:44.789,0:17:50.169
we just prove our programs correct? And[br]for some of those decades it wasn't really
0:17:50.169,0:17:54.570
plausible. I remember back when I was[br]considerably younger, it was a big thing
0:17:54.570,0:17:59.540
to proof that if you took a linked list[br]and reversed it twice, you got back the
0:17:59.540,0:18:08.059
same thing. Right? That was hard, then.[br]But now, well, we can't do this for the
0:18:08.059,0:18:14.419
Linux kernel, let's say, but we can[br]surprising often do this. And I just give
0:18:14.419,0:18:19.590
you a couple of the examples of modern[br]day, state-of-the-art academic
0:18:19.590,0:18:25.720
verification projects. People have[br]verified compilers all the way from C-like
0:18:25.720,0:18:31.769
languages down to assembly or from ML-like[br]languages. You know, higher-order
0:18:31.769,0:18:36.959
functional imperative languages all the[br]way down to binary models of how x86
0:18:36.959,0:18:44.470
behaves. And people have verified LLVM[br]optimisation paths. There's verified
0:18:44.470,0:18:49.880
software fault isolation which I believe[br]goes faster than the original non-verified
0:18:49.880,0:18:56.130
form. Win for them. There's a verified[br]hypervisor from Nectar group in Australia.
0:18:56.130,0:19:00.570
Right? Verified secure hypervisor with[br]proofs of security properties. There's any
0:19:00.570,0:19:06.690
amount of work verifying crypto protocols,[br]sometimes with respect to assumptions - I
0:19:06.690,0:19:12.510
think reasonable assumption on what the[br]underlying mathematics is giving you. So
0:19:12.510,0:19:22.919
we can do this. I maybe have to explain a[br]bit what I mean by "prove". To some
0:19:22.919,0:19:30.990
computery people, "I proved it works"[br]means "I tested it a little bit".
0:19:30.990,0:19:32.510
laughter
0:19:32.510,0:19:34.960
In extremis, "it compiled".
0:19:34.960,0:19:44.120
laughterapplause
0:19:44.120,0:19:50.770
To a program analysis person, you might[br]run some sophisticated tool but written in
0:19:50.770,0:19:54.649
untrusted code and possibly doing an[br]approximate and maybe unsound analysis and
0:19:54.649,0:19:59.720
you might find an absence of certain kinds[br]of bugs. That's tremendously useful; it's
0:19:59.720,0:20:04.640
not what we would really call "proved".[br]I mean, nor would they, to be fair.
0:20:04.640,0:20:08.970
Scientists generally don't use the word[br]because they know they're not doing it,
0:20:08.970,0:20:13.559
right? They're doing controlled[br]experiments, which gives them, you know,
0:20:13.559,0:20:20.070
information for or against some[br]hypothesis. Mathematician write proofs.
0:20:20.070,0:20:25.139
You probably did that when you were young.[br]And those proofs are careful mathematical
0:20:25.139,0:20:30.139
arguments usually written on paper with[br]pencils and their aim is to convince a
0:20:30.139,0:20:33.389
human being that if that mathematician was[br]really nailed up against the wall and
0:20:33.389,0:20:38.980
pushed, they could expand that into a[br]completely detailed proof. But what we
0:20:38.980,0:20:47.750
have in computing, we don't have the rich[br]mathematical structure that these people
0:20:47.750,0:20:51.950
are working with, we have a tremendous[br]amount of ad hoc and stupid detail mixed
0:20:51.950,0:20:57.769
in which a smidgin of rich mathematical[br]structure. So - and we have hundreds of
0:20:57.769,0:21:03.399
millions of lines of code. So this is just[br]not going to cut it. And if we want say
0:21:03.399,0:21:08.440
the word "prove" ever, then the only thing[br]which is legitimate is to do honest
0:21:08.440,0:21:13.230
program proof and that means you have to[br]have some system that machine-checks that
0:21:13.230,0:21:17.540
your proof is actually a proof. And[br]sometime we'll also have machines that
0:21:17.540,0:21:22.250
will sort of help and sort of hinder that[br]process, right? And there's a variety of
0:21:22.250,0:21:27.340
these systems. Coq and HOL4 and Isabelle4[br]- Isabelle/HOL and what have you that we
0:21:27.340,0:21:36.799
can look up. So using these systems we can[br]prove nontrivial facts about these. Not
0:21:36.799,0:21:41.530
necessarily that they do what you want[br]them to do, but that they meet the precise
0:21:41.530,0:21:48.519
specification that we've written down. We[br]can do that. But it's tricky, right? So
0:21:48.519,0:21:52.930
these projects, these are by academic[br]standards all quite big, you know? This is
0:21:52.930,0:21:58.009
something like, I don't know, 10 people[br]for a few years or something. By industry
0:21:58.009,0:22:03.399
scale, maybe not that big. By your scale,[br]maybe not that big. But still a lot of
0:22:03.399,0:22:08.149
work for a research group to do. And quite[br]high-end work. It can take, you know,
0:22:08.149,0:22:13.730
literally a few years to become really[br]fluent in using these tools, right? And
0:22:13.730,0:22:20.139
there isn't as yet sort of an open-source,[br]collaborative movement doing this kind of
0:22:20.139,0:22:26.710
stuff. Maybe it's the time to start. Maybe[br]in five years, ten years. I don't know. If
0:22:26.710,0:22:35.680
you want a challenge, there's a challenge.[br]But it's really quite hard, right? It's
0:22:35.680,0:22:40.830
not something that one can quite put[br]forward credibly as a solution for all
0:22:40.830,0:22:46.529
software and hardware development, right?[br]So that leads me to, like an intermediate
0:22:46.529,0:22:53.399
point. That should have been a four there.[br]What can we do which improves the quality
0:22:53.399,0:22:58.340
of our system, which injects some[br]mathematical rigour but which involves
0:22:58.340,0:23:06.370
less work and is kind of easy? And what we[br]can do is take some of these interfaces
0:23:06.370,0:23:11.679
and be precise about what they are. And[br]initially, we have to do that after the
0:23:11.679,0:23:16.941
fact. We have to reverse-engineer good[br]specs of existing stuff. But then we can
0:23:16.941,0:23:23.470
use the same techniques to make much[br]cleaner and better-tested and things which
0:23:23.470,0:23:30.670
are easier to test in the future. That's[br]the idea. So my colleagues and I have been
0:23:30.670,0:23:35.360
doing this kind of stuff for the last[br]quite a few years and I just want to give
0:23:35.360,0:23:41.570
you just a - two little vignettes just to[br]sort of show you how it goes. So I can't
0:23:41.570,0:23:46.759
give you much detail. And this is,[br]obviously, joint work with a whole bunch
0:23:46.759,0:23:55.399
of very smart and good people, some of[br]which I name here. So this is not me, this
0:23:55.399,0:24:00.840
is a whole community effort. So I'm going[br]to talk for a little bit about the
0:24:00.840,0:24:05.799
behaviour of multiprocessors, so at that[br]hardware interface. And I'm going to talk
0:24:05.799,0:24:11.090
a tiny little bit about the behaviour of[br]the TCP and socket API, right? Network
0:24:11.090,0:24:17.169
protocols. So - and there'll be some[br]similarities and some differences between
0:24:17.169,0:24:24.990
these two things. So multiprocessors. You[br]probably want your computers to go fast.
0:24:24.990,0:24:31.090
Most people do. And it's an obvious idea[br]to glom together, because processors don't
0:24:31.090,0:24:35.529
go that fast, let's glom together a whole[br]bunch of processors and make them all talk
0:24:35.529,0:24:46.150
to the same memory. This is not a new[br]idea. It goes back at least to 1962 when
0:24:46.150,0:24:51.380
the Burroughs D825 had, I think, two[br]processors talking to a single shared
0:24:51.380,0:24:55.850
memory. It had outstanding features[br]including truly modular hardware with
0:24:55.850,0:25:02.649
parallel processing throughout, and - some[br]things do not change very much - the
0:25:02.649,0:25:11.640
complement of compiling languages was to[br]be expanded, right? 1962, so that'll be 52
0:25:11.640,0:25:18.820
years ago. Deary me. Okay. Now, how do[br]these things behave? So let me show you
0:25:18.820,0:25:22.899
some code, right? It's a hacker[br]conference. There should be code. Let me
0:25:22.899,0:25:29.030
show you two example programs and these[br]will both be programs with two threads and
0:25:29.030,0:25:34.570
they will both be acting on two shared[br]variables, X and Y. And in the initial
0:25:34.570,0:25:42.899
state, X and Y are both zero. So, the[br]first program. On this thread we write X
0:25:42.899,0:25:48.970
and then we read Y, and over here we write[br]Y and then we read X, right? Now, these
0:25:48.970,0:25:53.629
are operating, you know, in an[br]interleaving concurrency, you might think,
0:25:53.629,0:25:58.030
so there's no synchronisation between[br]those two threads so we might interleave
0:25:58.030,0:26:03.269
them. We might run all of thread 0 before[br]all of thread 1 or all of thread 1 before
0:26:03.269,0:26:10.340
all of thread 0 or they might be mixed in[br]like that or like that or like that or
0:26:10.340,0:26:16.940
like that. There's six possible ways of[br]interleaving two lists of two things. And
0:26:16.940,0:26:22.980
in all of those ways you never see in the[br]same execution that this reads from the
0:26:22.980,0:26:30.169
initial state AND this reads from the[br]initial state. You never see that. So you
0:26:30.169,0:26:35.479
might expect and you might assume in your[br]code that these are the possible outcomes.
0:26:35.479,0:26:40.490
So what happens if you were to actually[br]run that code on my laptop in a particular
0:26:40.490,0:26:46.970
test harness? Well, okay, you see[br]occasionally they're quite well
0:26:46.970,0:26:52.499
synchronised and they both read the other[br]guy's right. Sorry, big call. They both
0:26:52.499,0:26:57.510
read the other guy's right. And quite[br]often one thread comes completely before
0:26:57.510,0:27:09.769
the other. But sometimes, sometimes they[br]both see 0. Huh. Rats. Well, that's
0:27:09.769,0:27:18.840
interesting. Let me show you another[br]program. So now, thread 0 writes some
0:27:18.840,0:27:23.440
stuff. Maybe it writes a big bunch of[br]data, maybe multi-word data. And then it
0:27:23.440,0:27:28.249
writes, let's say that's a flag announcing[br]to some another thread that that data is
0:27:28.249,0:27:33.249
ready to go now. And the other thread[br]busy-waits reading that value until it
0:27:33.249,0:27:38.710
gets 1, until it sees the flag. And then[br]it reads the data. So you might want, as a
0:27:38.710,0:27:43.610
programmer, you might want a guarantee[br]that this read will always see the data
0:27:43.610,0:27:48.759
that you have initialised. Hey? That would[br]be like a message-passing kind of an
0:27:48.759,0:27:56.229
idiom. So what happens if you run that?[br]Well, on the x86 processors that we've
0:27:56.229,0:28:03.929
tested, you always see that value. Good.[br]This is a descent coding idiom on x86. All
0:28:03.929,0:28:12.230
right. On Arma IBM Power processors,[br]however, you see sometimes you just ignore
0:28:12.230,0:28:16.850
the value written and you see the initial[br]state. So this is not a good
0:28:16.850,0:28:25.539
message-passing idiom, right? So what's[br]going on? Well, these behaviours, it might
0:28:25.539,0:28:31.260
be surprising, eh? And when you see[br]surprising behaviour in hardware, you have
0:28:31.260,0:28:36.100
two choices. Either it's a bug in the[br]hardware, and we have found a number of
0:28:36.100,0:28:45.259
bugs in hardware. It's always - it's very[br]rewarding. Or it's a bug in your
0:28:45.259,0:28:51.350
expectation. Or it's a bug in your test[br]harness. Right? So what you do is you walk
0:28:51.350,0:28:58.050
along and you walk up to your friendly[br]processor architect in IBM or ARM or x86
0:28:58.050,0:29:04.770
land, and we have worked with all of these[br]people. And you ask them, "Is this a bug?"
0:29:04.770,0:29:12.360
And a processor architect is a person who[br]is - has the authority to decide whether
0:29:12.360,0:29:16.240
some behaviour is intended or is a bug.
0:29:16.240,0:29:17.750
laughter
0:29:17.750,0:29:24.669
That's what they do, essentially. And for[br]these, they say, "Oh, we meant it to be
0:29:24.669,0:29:29.239
like that". Right? No question. "We meant[br]it to be like that. That's perfectly
0:29:29.239,0:29:36.559
proper. We have, because you and everybody[br]else wanted their computers to go fast, we
0:29:36.559,0:29:41.690
as processor designers have introduced all[br]manner of sophisticated optimizations,
0:29:41.690,0:29:45.200
which if you were running sequential code,[br]you would never notice, but if you start
0:29:45.200,0:29:49.289
running fancy concurrent code like this,[br]fancy concurrent code that isn't just
0:29:49.289,0:29:56.049
trivially locked, you can notice", so this[br]is intentional behaviour. And if you want
0:29:56.049,0:30:00.129
to write that code, like a mutual[br]exclusion algorithm or a message-passing
0:30:00.129,0:30:06.460
algorithm or something, then you need to[br]insert special instructions, memory
0:30:06.460,0:30:11.180
barrier instructions. And so you go away[br]and you say, "What do they do?" And you
0:30:11.180,0:30:18.949
look up in the vendor documentation. And[br]you get stuff like this. I'll read it out:
0:30:18.949,0:30:22.190
(Reads quickly) "For any applicable pair[br]AB, the memory barrier ensures that a will
0:30:22.190,0:30:24.740
be performed with respect to any processor[br]or mechanism, to the extent required by
0:30:24.740,0:30:27.590
the associated memory coherence required[br]at beauties, before b is performed with
0:30:27.590,0:30:30.360
respect to that prior or mechanism. A[br]includes all applicable storage accesses
0:30:30.360,0:30:33.090
by any such processor or mechanism that[br]have been performed with respect to P1
0:30:33.090,0:30:36.210
before the memory barrier is created. B[br]includes all applicable storage accesses
0:30:36.210,0:30:38.670
by any such processor or mechanism that[br]are performed after a Load instruction
0:30:38.670,0:30:41.290
executed by that processor or mechanism[br]has returned the value stored by a store
0:30:41.290,0:30:42.700
that is in B."
0:30:42.700,0:30:49.850
applause
0:30:49.850,0:30:58.630
Hands up if you understand what that[br]means? Hands up if you think that if you
0:30:58.630,0:31:02.139
thought about it and read the rest of the[br]book, you would understand or you could
0:31:02.139,0:31:09.730
understand what that means? Hands up the[br]people who think that we're doomed forever?
0:31:09.730,0:31:12.920
laughter
0:31:12.920,0:31:16.979
So I'm sorry to the first few groups,[br]but the last ones are right.
0:31:16.979,0:31:22.299
laughterapplause
0:31:22.299,0:31:28.850
This looks like it's really intricate and[br]really carefully thought out stuff. And we
0:31:28.850,0:31:32.220
thought that for a while and we spent[br]literally years trying to make precise
0:31:32.220,0:31:37.159
mathematical models that give precise[br]meaning to these words, but actually it
0:31:37.159,0:31:45.460
can't be done. So what do you have to do[br]in that circumstance? You have to go away
0:31:45.460,0:31:52.120
and you have to invent some kind of a[br]model. And there's a - this is a really
0:31:52.120,0:31:56.190
fundamental problem, that we - on the one[br]hand, we develop our software by this
0:31:56.190,0:32:01.960
trial and error process, but on the other[br]hand are points like this. We have these
0:32:01.960,0:32:06.470
loose specifications, supposedly, that[br]have to cover all manner of behaviour of
0:32:06.470,0:32:10.730
many generations of processors that might[br]all behave the same way, written just in
0:32:10.730,0:32:16.769
text. And we tell people they should write[br]"to the spec", but the only way they have
0:32:16.769,0:32:22.450
of developing their code is to run it and[br]run particular executions on the
0:32:22.450,0:32:26.960
particular hardware implementations that[br]they have, whose relationship to the spec
0:32:26.960,0:32:34.740
is very elusive. We can't use those thick[br]books that you get or those quite weighty
0:32:34.740,0:32:39.260
PDFs that you get these days from the[br]processor vendors to test programs. You
0:32:39.260,0:32:43.970
can't feed a program into that thick book[br]and get output out. And you can't test
0:32:43.970,0:32:47.429
processor implementations and you can't[br]prove anything and you can't even
0:32:47.429,0:32:53.430
communicate between human beings, right?[br]So these things, really, they don't exist.
0:32:53.430,0:32:58.500
So what can we do? Well, the best we can[br]do at this point in time is a bit of
0:32:58.500,0:33:04.950
empirical science, right? So we can invent[br]some model off the top of our heads,
0:33:04.950,0:33:09.649
trying to describe just the[br]programmer-visible behaviour, not the
0:33:09.649,0:33:15.529
internal structure. And we can make a[br]tool. Because that's not prose now. Now we
0:33:15.529,0:33:18.950
can - now it's stuff. It's real[br]mathematics and we can turn that into code
0:33:18.950,0:33:23.200
and execute it. We can make tools that[br]take programs and execute it and - small
0:33:23.200,0:33:28.280
programs - tell us the set of all[br]behaviours allowed by that model. And then
0:33:28.280,0:33:33.759
we can compare those sets of behaviours[br]against the experimental observations. And
0:33:33.759,0:33:38.360
you might have to fix the model and you[br]might find hard bugs. And then the model
0:33:38.360,0:33:42.869
also has to make sense to the architect so[br]you can discuss with the architects what
0:33:42.869,0:33:47.100
they intend and what their institution is,[br]and then you can also prove some other
0:33:47.100,0:33:50.980
facts about it to give a bit more[br]assurance and then because you probably
0:33:50.980,0:33:57.249
got this wrong the first couple of times,[br]you can go back. And this results or has
0:33:57.249,0:34:04.509
resulted in models which are not[br]guaranteed to be correct, but they are
0:34:04.509,0:34:08.639
effectively the de facto standard for[br]understanding the concurrency behaviour of
0:34:08.639,0:34:14.300
these things, you know? And various people[br]use them. And just to give you a tiny
0:34:14.300,0:34:18.570
snippet - I'll have to speed up a bit - a[br]tiny snippet of what the model looks like,
0:34:18.570,0:34:22.969
it's basically just an abstract machine.[br]It's got some stuff for the threads that
0:34:22.969,0:34:26.949
handles the programmer-visible speculative[br]execution, and some stuff for a storage
0:34:26.949,0:34:32.739
subsystem that handles the fact that in[br]these architectures, memory rights can be
0:34:32.739,0:34:38.270
propagated to different threads at[br]different times, right? And there's a -
0:34:38.270,0:34:42.540
that model has a state which is just a[br]collection of some sets and lists and
0:34:42.540,0:34:47.230
partial orders and a few other things[br]which I won't talk about. And it has some
0:34:47.230,0:34:51.030
transitions, right? In any state, there[br]might be several possible transitions.
0:34:51.030,0:34:54.699
This is just a sample. I'm not going to[br]explain all of this. But when you want to
0:34:54.699,0:34:58.570
propagate a write to another thread, there[br]have to be some preconditions that you have to
0:34:58.570,0:35:07.140
satisfy. And than there is an action. It's not[br]amazingly complicated, really. So this is text,
0:35:07.140,0:35:13.319
but it's very precisely written text and it has[br]the great advantage that you can go through
0:35:13.319,0:35:16.570
these bullet points with your friendly[br]architect and say, "Is this what you
0:35:16.570,0:35:22.440
mean?" And they can think about it and say[br]yes or no. But for the actual definition,
0:35:22.440,0:35:27.309
that's in mathematics but it's mathematics[br]that's quite close to code. I mean, it's
0:35:27.309,0:35:32.380
terribly close to pure, functional code[br]with outside effects. And just to give you
0:35:32.380,0:35:37.230
an idea, this is some of it and those are[br]three of those conditions in the real,
0:35:37.230,0:35:42.660
honest, true version. Then we can take[br]this mathematics and because it's in a
0:35:42.660,0:35:49.270
particular style, we can compile this into[br]actually OCaml and then OCaml byte code
0:35:49.270,0:35:55.770
and then we can run it for batch jobs. But[br]then you can compile that OCaml byte code
0:35:55.770,0:36:02.579
to JavaScript and glue on an user interface[br]and stick it into a web browser and then you
0:36:02.579,0:36:07.760
have a web interface that lets people[br]explore this model. And that's also a
0:36:07.760,0:36:16.190
necessary part of validating that it makes[br]sense. Okay. This is not rocket science.
0:36:16.190,0:36:21.710
You've missed the talk for rocket science,[br]I'm sorry. All we're doing here is
0:36:21.710,0:36:28.630
specifying the intended behaviour of a[br]system. Okay? That's - it's not very
0:36:28.630,0:36:32.490
technical but it's unusual. And we happen[br]to be doing it in this mathematical
0:36:32.490,0:36:37.049
language, but you could use in fact almost[br]any language so long as you understand
0:36:37.049,0:36:40.170
what you're doing. What you have to[br]understand is that you're writing
0:36:40.170,0:36:45.230
something which is not an implementation.[br]It is something which, given some observed
0:36:45.230,0:36:49.810
behaviour, some arbitrary observed[br]behaviour, will be able to decide if it's
0:36:49.810,0:36:54.580
allowed, if you want it to be allowed or[br]not, right? It has to be executed as a
0:36:54.580,0:37:02.059
test oracle. The key question for getting[br]this to work is to understand how much
0:37:02.059,0:37:05.730
non-determinism or loose specification[br]there is in the system that you're working
0:37:05.730,0:37:08.890
with, right? So if everything is[br]completely deterministic, you can write a
0:37:08.890,0:37:13.839
reference implementation in the cleanest[br]language of your choice and just compare
0:37:13.839,0:37:19.270
the output of that and the output of the[br]real thing, right? But if there's more
0:37:19.270,0:37:24.000
non-determinism, then you can't do that.[br]I'm going to have to abbreviate this a
0:37:24.000,0:37:31.369
little tiny bit. So for the TCP network[br]protocols, there is more non-determinism.
0:37:31.369,0:37:35.220
You know what the TCP is, yes? A protocol[br]that gives you sort of reliable
0:37:35.220,0:37:41.720
connections, assuming that the world is[br]made of good people, right? You look at
0:37:41.720,0:37:47.640
the standards for TCP, and they're the[br]same kind of wibbly text from the 1980s
0:37:47.640,0:37:52.400
and you look at the implementations and[br]they are a ghastly mess. And you try and
0:37:52.400,0:37:55.690
understand the relationship between the[br]two of them and you can't because the
0:37:55.690,0:38:01.740
standard, again, is not the definition. It[br]doesn't define in all circumstances what
0:38:01.740,0:38:07.010
behaviour is allowed and what isn't. So[br]again, we had to make up a specification
0:38:07.010,0:38:14.310
in the first instance just of this host, a[br]single endpoint, and observing its
0:38:14.310,0:38:18.510
sockets, API, calls and returns and its[br]behaviour on the wire, right? And when
0:38:18.510,0:38:21.969
you've decided - and a few internal debug[br]events. And when you've decided what to
0:38:21.969,0:38:26.619
observe, then observation is just a trace,[br]it's just a list of those events. And you
0:38:26.619,0:38:34.750
have to be able to ask your spec "Is that[br]list admissible or not"? But there's a lot
0:38:34.750,0:38:38.820
of non-determinism. Variation between the[br]implementations, variations within the
0:38:38.820,0:38:45.650
implementations. And that's internal. It's[br]not announced in the API or on the wire
0:38:45.650,0:38:50.980
maybe until much later when the[br]implementation picks a new window size or
0:38:50.980,0:38:56.890
something. You can't tell until quite a[br]lot later in the trace what it's picked.
0:38:56.890,0:39:02.740
And that makes the job of checking whether[br]a trace is admissible by the spec much
0:39:02.740,0:39:08.559
harder than it has to be. Right? And if[br]you had designed the TCP protocol and
0:39:08.559,0:39:14.470
those implementations for to be testable[br]in this very discriminating way, back in
0:39:14.470,0:39:20.470
the 1980s when you were designing TCP[br]protocol, it would be much easier. And for
0:39:20.470,0:39:25.650
new protocols, one should make sure that[br]you're doing this in this particular way.
0:39:25.650,0:39:30.930
I don't think I want to talk about that[br]slide. That's just one of the rules of
0:39:30.930,0:39:35.480
that specification. But the key fact about[br]that spec is the - well, we handled the
0:39:35.480,0:39:40.210
real protocols for arbitrary inputs. But[br]it's structured not just for this
0:39:40.210,0:39:46.240
executability as a test oracle, but it's[br]structured for clarity, not performance,
0:39:46.240,0:39:49.750
which is scarcely ever what anyone ever[br]does, right? So it's a whole new kind of
0:39:49.750,0:39:54.480
thing. And the testing is very[br]discriminating. So we found all manner of
0:39:54.480,0:40:03.839
amusing and bizarre bugs which I think I[br]don't have time to talk about. Okay. So
0:40:03.839,0:40:12.610
I've described three kinds of things that[br]we might do. The first thing - for
0:40:12.610,0:40:18.800
goodness sake - I mean, it's just a[br]no-brainer. Just do it already. Everybody.
0:40:18.800,0:40:29.670
All of you. All of you. Right? This middle[br]zone is a very interesting zone as far as
0:40:29.670,0:40:33.920
I'm concerned, right? It's - out of what -[br]a place where we could get a very good
0:40:33.920,0:40:39.220
benefit for not that much effort, right?[br]We can do it, if necessary, post hoc. We
0:40:39.220,0:40:44.490
can also do it and even better, at design[br]time. We have to do this in a way that
0:40:44.490,0:40:48.240
makes our executions testable as test[br]oracles, and another thing that that
0:40:48.240,0:40:52.119
enables is completely random testing. When[br]you've got a test oracle, you can just
0:40:52.119,0:40:58.210
feed in random goop. This is fuzzing, but[br]better fuzzing - feed in random goop and
0:40:58.210,0:41:02.859
check at every point whether what the[br]behaviour that you're getting is allowable
0:41:02.859,0:41:09.271
or not. And then they're written for[br]clarity, not for performance, and that
0:41:09.271,0:41:13.619
means you can see what you're doing,[br]right? You can see the complexity. If you
0:41:13.619,0:41:17.140
go ahead and you add some feature to your[br]protocol or your programming language or
0:41:17.140,0:41:22.420
whatever and you're working just with text[br]specifications, then you can't see the
0:41:22.420,0:41:24.759
interactions. You just chuck in a couple[br]of paragraphs and you think for a few
0:41:24.759,0:41:30.210
minutes, right? And if you're lucky, you[br]make an implementation. But if you're
0:41:30.210,0:41:35.309
writing a spec that has to cover all the[br]cases like this, the act of doing that
0:41:35.309,0:41:41.150
encourages you to think or helps you think[br]about the excess complexity. And you might
0:41:41.150,0:41:46.690
think that's too complex, I'm being silly,[br]I'll make it simpler. And it has to be
0:41:46.690,0:41:51.630
usable for proof. So this, I think also is[br]pretty much a no-brainer. And it doesn't
0:41:51.630,0:41:56.650
require great technical, you know,[br]mathematical skill either, right? Lots of
0:41:56.650,0:42:01.760
people can do this. And then there's[br]option 3, best option, full-on formal
0:42:01.760,0:42:07.930
verification of the key components. And[br]that is now also in reach. You can imagine
0:42:07.930,0:42:16.309
secure systems made using actual verified[br]compilers and verified hypervisors with
0:42:16.309,0:42:22.540
verified TLS stacks and you can imagine[br]making things out of those or making those
0:42:22.540,0:42:27.870
and then making things out of those. And[br]maybe one should be thinking about that.
0:42:27.870,0:42:34.070
So, maybe not appropriate for everything.[br]If you were writing Word, you would not
0:42:34.070,0:42:38.599
wish to do these things. Probably, you're[br]not. But for this key infrastructure that
0:42:38.599,0:42:46.420
we really depend on, we trust even though[br]it's not trustworthy, we have to do this.
0:42:46.420,0:42:52.450
Is this a new dawn? I wonder if there's[br]anyone standing next to a light switch
0:42:52.450,0:42:58.920
that can dim them for me. I didn't ask[br]them beforehand, so... If you think back
0:42:58.920,0:43:05.540
the last 70-odd years, we started building[br]computers in around 1940. It's been pretty
0:43:05.540,0:43:12.760
dark from the point of view of rigorous,[br]solid, reliable engineering, stuff that is
0:43:12.760,0:43:21.760
actually trustworthy. Pretty dark, I would[br]say. Maybe, just maybe there's a tiny
0:43:21.760,0:43:26.339
smidgen of light coming through the doors.[br]And we start to have a little bit of a
0:43:26.339,0:43:31.109
clue and we start to get reusable models[br]of processor architectures and programming
0:43:31.109,0:43:35.540
languages and network protocols and what[br]have you. It's just the beginnings of the
0:43:35.540,0:43:39.790
analogue of that thermodynamics and[br]materials science and quality control
0:43:39.790,0:43:47.819
management and what have you that we have[br]from mechanical engineering. And we've got
0:43:47.819,0:43:57.200
no choice. If we don't, the next 75 years[br]is going to be a lot worse. You can just
0:43:57.200,0:44:02.400
imagine, right? So I went to this - as I'm[br]sure many of you do - this great talk on
0:44:02.400,0:44:08.370
some Apple boot loader exploit yesterday[br]which was relying on some feature that had
0:44:08.370,0:44:14.900
been introduced in the early '80s and was[br]still present. And you can imagine in 100
0:44:14.900,0:44:22.800
years from now, you can imagine as long as[br]human civilisation endures, the x86
0:44:22.800,0:44:28.869
architecture and the socket API and all of[br]this stuff, it will be embedded in some
0:44:28.869,0:44:34.099
monumental ghastly stack of[br]virtualisation layers forever.
0:44:34.099,0:44:45.760
laughterapplause
0:44:45.760,0:44:50.720
So I'd like to thank especially all of my[br]colleagues that have been working with me
0:44:50.720,0:44:56.550
or not with me in these directions. I'd[br]like to thank our generous funders who
0:44:56.550,0:45:01.389
support this stuff. I'd like to thank you[br]for your attention and I exhort you,
0:45:01.389,0:45:04.850
sort it out.
0:45:04.850,0:45:20.800
applause
0:45:20.800,0:45:25.300
Herald: Thank you very much, Peter, for[br]those interesting insights to our
0:45:25.300,0:45:31.269
programming. We have now time for a Q & A,[br]so those people who have to leave the
0:45:31.269,0:45:39.970
room, please do so quietly and as fast as[br]possible, so we can go on and hear what
0:45:39.970,0:45:51.270
the questions are. So we[br]start with microphone 4, please.
0:45:51.270,0:45:58.790
Mic 4: Hello. Thanks for the talk and my[br]question is if you got an oracle of the
0:45:58.790,0:46:04.270
software behaviour which can translate[br]every possible input to a correct output,
0:46:04.270,0:46:08.410
how can this be less complex and[br]error-prone than the reference
0:46:08.410,0:46:10.510
implementation?
0:46:10.510,0:46:15.620
Peter: Good question. So the first point[br]is that this oracle doesn't have to
0:46:15.620,0:46:22.010
produce the outputs. It only has to check[br]that the inputs and the outputs match up.
0:46:22.010,0:46:26.560
And then the second point is that in[br]general it might have to have much of the
0:46:26.560,0:46:33.221
same complexity, but by writing it to be[br]clear as opposed to to be fast, you may
0:46:33.221,0:46:37.730
adopt a completely different structure,[br]right? So the structure of our TCP spec is
0:46:37.730,0:46:43.540
organised by the behaviour that you see[br]for particular transitions. The structure
0:46:43.540,0:46:49.160
of a real implementation has fast-path[br]code and lots of complicated intertwined
0:46:49.160,0:46:55.309
branching and all manner of excess[br]complexity, of implementation complexity,
0:46:55.309,0:46:56.879
if you will.
0:46:56.879,0:47:01.450
Mic 4: Thanks.[br]Herald: So microphone 3, please?
0:47:01.450,0:47:06.710
Mic 3: I wanted to ask you what you[br]thought about programming by manipulating
0:47:06.710,0:47:13.859
the abstract syntax tree directly so as to[br]not allow incompilable programs because of
0:47:13.859,0:47:18.410
some, like, you're missing a semicolon or[br]something like that. What's your thoughts
0:47:18.410,0:47:20.220
on that?
0:47:20.220,0:47:23.700
Peter: So that's - in the grand scheme of[br]things, I think that's not a very big
0:47:23.700,0:47:28.890
deal, right? So there's - people have[br]worked on structured editors for lordy
0:47:28.890,0:47:33.309
knows how many years. It's not a big deal[br]because it's very easy for a compiler to
0:47:33.309,0:47:38.569
detect that kind of thing. And even more,[br]it's very easy for a compiler of a
0:47:38.569,0:47:43.140
sensibly designed language to detect type[br]errors, even for a very sophisticated type
0:47:43.140,0:47:49.589
system. So I don't think that - that's not[br]- I mean, some people might find it
0:47:49.589,0:47:52.930
helpful, but I don't think it's getting to[br]the heart of the matter.
0:47:52.930,0:47:54.250
Mic 3: Thanks.
0:47:54.250,0:47:59.850
Herald: So we've got some questions[br]from our signal angels from the IRC.
0:47:59.850,0:48:06.150
Signal angel: Yes. There's one question.[br]It's about the repository for Isabelle and
0:48:06.150,0:48:11.300
HOL proofs. It should be on source forge,[br]and you said there are no open source
0:48:11.300,0:48:15.480
repositories. What about them?
0:48:15.480,0:48:20.589
Peter: I'm not quite sure what the[br]question is, really. So there's a
0:48:20.589,0:48:28.230
repository of Isabelle formal proofs which[br]is the archival form of proofs, it's
0:48:28.230,0:48:34.790
called. I didn't really mean to say there[br]are no open source repositories and in
0:48:34.790,0:48:38.500
fact I don't know under what conditions[br]most of those proofs are licensed. They
0:48:38.500,0:48:43.210
probably are open. But there isn't an open[br]source community building these things,
0:48:43.210,0:48:47.710
right? It's still pretty[br]much an academic pursuit.
0:48:47.710,0:48:50.329
Herald: Microphone 2, please.
0:48:50.329,0:48:55.530
Mic 2: Hello. Thanks again for your talk.[br]I just want to add something that you
0:48:55.530,0:49:01.970
didn't address, and that's that we[br]actually need more good studies in
0:49:01.970,0:49:07.970
software engineering. We often hear a lot[br]of experts and also very well-known
0:49:07.970,0:49:11.740
computer scientists say things like,[br]"Well, we just need to do functional
0:49:11.740,0:49:17.839
programming and OOP is bad and stuff like[br]that", which I think there's a lot of
0:49:17.839,0:49:24.109
truth to it, but we actually need studies[br]where we test these kinds of claims that
0:49:24.109,0:49:29.530
we make, because when you look at other[br]fields which it also did compare to, like,
0:49:29.530,0:49:34.250
medicine, if somebody comes around and[br]is well-known and says things like,
0:49:34.250,0:49:42.230
"homeopathy works", we don't believe them.[br]We do trials, we do good trials. And
0:49:42.230,0:49:50.430
there's a lot of myths out there, like, or[br]not well tested things, like "hire the
0:49:50.430,0:49:54.739
best programmers, they are 100 times[br]more productive", "do steady type
0:49:54.739,0:50:00.430
systems", and stuff like that. And we need[br]to verify that this is true, that this
0:50:00.430,0:50:02.210
helps. And it's...
0:50:02.210,0:50:07.490
Peter: Okay. So in the grand scheme of[br]things, arguably you're right. In the
0:50:07.490,0:50:12.739
grandest scheme of things, computer[br]science is actually a branch of psychology
0:50:12.739,0:50:18.609
or really sociology. We are trying to[br]understand how large groups of people can
0:50:18.609,0:50:25.839
combine to make things which are insanely[br]complicated. Now, for - would it be good
0:50:25.839,0:50:31.329
if we had, you know, solid evidence that[br]programming in this language was better
0:50:31.329,0:50:38.450
than programming in that language? Well,[br]yes, but it's staggeringly difficult and
0:50:38.450,0:50:45.079
expensive to do honest experiments of that[br]nature and they're scarcely ever done,
0:50:45.079,0:50:49.040
right? You might do little experiments on[br]little groups of students but it's really
0:50:49.040,0:50:55.930
difficult. And then some of these things[br]which I'm saying, when one is familiar
0:50:55.930,0:51:01.740
with the different possible options, are[br]just blindingly obvious. I mean, there are
0:51:01.740,0:51:09.161
reasons why these people are using OCaml[br]for their system programming, right? It's
0:51:09.161,0:51:16.609
not - you know, it's not - "Homeopathy is[br]right", but "Homeopathy is wrong" which is
0:51:16.609,0:51:21.230
the kind of argument being put forward.
0:51:21.230,0:51:24.260
Herald: Question from[br]microphone 5, please.
0:51:24.260,0:51:29.609
Mic 5: So where are you using ECC[br]memory for your testing - up here.
0:51:29.609,0:51:33.510
Peter: When you say up here,[br]there's a bit of ambiguity.
0:51:33.510,0:51:34.819
Mic 5: Here.
0:51:34.819,0:51:37.860
Peter: Thank you. Say again?
0:51:37.860,0:51:43.119
Mic 5: So where are you using ECC memory[br]for the testing you showed about the
0:51:43.119,0:51:51.070
results of the multithreaded results due[br]to memory barriers and memory reorderings?
0:51:51.070,0:51:55.450
Peter: Well...
0:51:55.450,0:52:00.130
Mic 5: Okay, but even if you were or you[br]were not, the point I want to make is that
0:52:00.130,0:52:05.950
you can expect about 1 bit flip each[br]minute in a modern computer system that
0:52:05.950,0:52:13.670
may completely change what your software[br]is doing, so perhaps we also have to look
0:52:13.670,0:52:19.180
in ways how we can work if something goes[br]wrong at the very low level so that we
0:52:19.180,0:52:25.010
kind of have a check against our[br]specification on a more higher level of
0:52:25.010,0:52:31.940
our software. So it is valuable to do good[br]engineering on the low levels, but still I
0:52:31.940,0:52:37.670
think we will not solve the problems of[br]computation and computer engineering just
0:52:37.670,0:52:42.520
by proving things in the mathematical[br]domain because computers are physical
0:52:42.520,0:52:47.739
entities. They have errors and so on and[br]we really have to deal with them as well.
0:52:47.739,0:52:54.340
Peter: So it's certainly true that there[br]are such random errors. In fact, I would
0:52:54.340,0:52:59.480
put the point that I would argue that you[br]have to be able to model the physics well
0:52:59.480,0:53:03.680
enough and you have to be able to model[br]the statistics of those errors well
0:53:03.680,0:53:07.750
enough, so that's a different kind of[br]mathematics. And it's certainly valuable
0:53:07.750,0:53:13.290
and necessary. But if you look at the[br]statistic you just quoted, is that the
0:53:13.290,0:53:22.109
main cause of why systems go wrong? Except[br]possibly for space hardware, no. So
0:53:22.109,0:53:26.499
important, yes. The most important thing[br]that we should be paying attention to? I
0:53:26.499,0:53:29.940
don't really think so.
0:53:29.940,0:53:32.980
Herald: Microphone 6, please?
0:53:32.980,0:53:41.200
Mic 6: Hi. I really think that what you're[br]proposing to specify and verify more key
0:53:41.200,0:53:47.160
components is - would be a valuable[br]addition, but how do you make sure that
0:53:47.160,0:53:51.880
your specification actually matches the[br]behaviour that you want to do or to have?
0:53:51.880,0:53:58.300
Peter: So as I described, we do as partial[br]validation of those specifications, we do
0:53:58.300,0:54:03.050
a lot of testing against the[br]implementations, against a range of
0:54:03.050,0:54:07.060
existing implementations. That's one[br]source of validation. Another source of
0:54:07.060,0:54:11.520
validation is that you talk to the[br]architects and the designers. You want the
0:54:11.520,0:54:16.929
internal structure to match their intent.[br]You want it to be comprehensible to them.
0:54:16.929,0:54:21.510
Another kind of validation that we do is[br]prove properties about them. So we proved
0:54:21.510,0:54:28.750
that you can correctly compile from a C11[br]concurrency, a mathematical model of that,
0:54:28.750,0:54:35.432
to a IBM Power concurrency. And that kind[br]of proof gives you a bit more level of
0:54:35.432,0:54:39.450
assurance in both. So none of this is[br]giving you a total guarantee. You
0:54:39.450,0:54:43.980
certainly don't claim a total guarantee.[br]But it gives you pretty good levels of
0:54:43.980,0:54:47.880
assurance by normal standards.
0:54:47.880,0:54:49.580
Herald: Mic 4?
0:54:49.580,0:54:56.309
Mic 4: Yes. Thanks again. You proposed[br]random tests, and with my expertise, it's
0:54:56.309,0:55:02.859
very annoying if you have tests where the[br]outcome is sometimes a failure and you
0:55:02.859,0:55:10.499
want to have reproducible tests to fix a[br]bug. So how do we bring this aspects to
0:55:10.499,0:55:17.850
test more variety in data and to have it[br]reproducible together?
0:55:17.850,0:55:24.800
Peter: Well, if - as I was mentioning, for[br]the TCP thing, one of the - so the problem
0:55:24.800,0:55:29.150
is reproducibility is exactly this[br]internal non-determinism, right? The
0:55:29.150,0:55:33.389
system makes this scheduling choice or[br]what have you and you can't see what it
0:55:33.389,0:55:39.599
is, or the checker can't see what it is.[br]So one way to do that is to really design
0:55:39.599,0:55:45.369
the protocols in a different way to expose[br]that non-determinism. The other fact about
0:55:45.369,0:55:53.421
this kind of general purpose specification[br]as test oracle idea is that in some sense,
0:55:53.421,0:55:59.240
it doesn't have to be reproducible. Right?[br]The specification is giving a yes-no
0:55:59.240,0:56:05.510
answer for an arbitrary test. And that[br]means that you can use arbitrary tests.
0:56:05.510,0:56:09.730
Figuring out the root causes of the[br]differences afterwards may be tricky, but
0:56:09.730,0:56:13.200
you can use them for assurance.
0:56:13.200,0:56:18.489
Herald: So we now got time for two[br]more questions. Microphone 1, please?
0:56:18.489,0:56:23.069
Mic 1: Hiya. Thanks for a great talk. So[br]what you've described seems to be a middle
0:56:23.069,0:56:28.259
ground between a full-on, formal[br]verification and more practical testing,
0:56:28.259,0:56:34.309
like, in between. So where do you see in[br]the future where formal verifications will
0:56:34.309,0:56:39.831
go? Will they converge to the middle or[br]whether it will still be there just to
0:56:39.831,0:56:44.299
verify something that's more[br]well-specified?
0:56:44.299,0:56:49.940
Peter: So the progress of sort of the[br]whole subject of formal verification in
0:56:49.940,0:56:56.930
general, if you look at that over the last[br]10 years or so, it's been very - it's been
0:56:56.930,0:57:01.349
really amazing compared with what we could[br]do in the 80s and 90s and early 2000s,
0:57:01.349,0:57:06.480
right? So the scope of things, the scale[br]of things which are - for which it is
0:57:06.480,0:57:11.180
becoming feasible to do verification is[br]getting bigger. And I think that will
0:57:11.180,0:57:16.339
continue. You know, I don't know when you[br]might see a completely verified TOR
0:57:16.339,0:57:23.980
client, let's say, but that's not[br]inconceivable now. And 20 years ago, that
0:57:23.980,0:57:29.470
would have been completely inconceivable.[br]So that is expanding and at the same time,
0:57:29.470,0:57:36.829
I hope we see more of this interface, text[br]based oracle specification - and these -
0:57:36.829,0:57:41.000
when you're doing formal verification of a[br]piece of the system, you have to have
0:57:41.000,0:57:47.810
these indicators already defined, all[br]right? So these two fit very well together.
0:57:47.810,0:57:51.489
Herald: So the last question[br]from microphone 2, please.
0:57:51.489,0:57:56.040
Mic 2: Hi. You mentioned that you often[br]find bugs in hardware. Is there any effort
0:57:56.040,0:58:00.310
to verify the transistors on chips[br]themselves as well?
0:58:00.310,0:58:03.440
Peter: So there's a whole world of[br]hardware verification. That wasn't my
0:58:03.440,0:58:10.640
topic today. And most of the big processor[br]vendors have teams working on this.
0:58:10.640,0:58:14.440
Unsurprisingly, if you remember your[br]history, many of them have teams focusing
0:58:14.440,0:58:21.059
on the floating-point behaviour of their[br]processors. And they're doing - they're
0:58:21.059,0:58:26.981
also doing by the standards of ten years[br]ago, pretty spectacularly well. So there
0:58:26.981,0:58:33.181
are companies that have the whole of some[br]execution unit formally verified. There's
0:58:33.181,0:58:36.979
been a lot of work over the years on[br]verification of cache protocols and such
0:58:36.979,0:58:43.100
like. Right? There's a lot of work on not[br]verifying the hardware as a whole, but
0:58:43.100,0:58:47.519
verifying that, you know, the RTL-level[br]description of the hardware matches some
0:58:47.519,0:58:54.060
lower level description. So[br]there is a lot of that going on.
0:58:54.060,0:58:57.619
Herald: Thank you very much[br]again for this great talk.
0:58:57.619,0:59:00.480
Give him another applause.[br]Thank you, Peter.
0:59:00.480,0:59:01.790
Peter: Thank you.
0:59:01.790,0:59:04.219
applause
0:59:04.219,0:59:15.000
subtitles created by c3subtitles.de[br]Join, and help us!