Return to Video

Peter Sewell: Why are computers so @#!*, and what can we do about it?

  • 0:10 - 0:14
    applause
  • 0:14 - 0:17
    Peter Sewell: Thank you for that
    introduction and to the organisers for
  • 0:17 - 0:23
    this opportunity to speak to you guys. So
    my name is Peter Sewell. I'm an academic
  • 0:23 - 0:31
    computer scientist and as such, I bear a
    heavy burden of guilt for what we have to
  • 0:31 - 0:38
    work with. Shared guilt, to be fair, but
    guilt all the same. In many, many years of
  • 0:38 - 0:45
    higher education, I know two things about
    computers. Maybe you've figured them out
  • 0:45 - 0:55
    slightly faster. The first thing, there's
    an awful lot of them about. Shocking. And
  • 0:55 - 1:03
    the second is that they go wrong. They go
    wrong a lot. Really a lot. A lot. A lot!
  • 1:03 - 1:09
    And they go wrong in many interestingly
    different ways. So sometimes they go wrong
  • 1:09 - 1:18
    in spectacular ways, right? With
    fireworks. And here, we see...
  • 1:18 - 1:24
laughter and applause
  • 1:24 - 1:31
    Here we see half a billion dollars worth
    of sparkle. You don't get that every day.
  • 1:31 - 1:35
    And so what was going on? So as I
    understand it, what was going on was that
  • 1:35 - 1:41
    the - this was the first flight of the
    Ariane 5 launcher and it reused some of
  • 1:41 - 1:46
    the guidance software from Ariane 4.
    That's fine. Good, tested code. Should
  • 1:46 - 1:53
    reuse it. But the launch profile for
    Ariane 5 went sideways a bit more - on
  • 1:53 - 1:59
purpose - than the Ariane 4 launch. And as
    a result, one of the sideways values in
  • 1:59 - 2:05
    the guidance control software overflowed.
    Bad. And hence, the guidance software
  • 2:05 - 2:08
    decided that something bad was going wrong
    and in order to not land rockets on
  • 2:08 - 2:16
    people's heads, it kindly blew itself up.
Okay. So that's a spectacular one. Sometimes
  • 2:16 - 2:21
    they go wrong in insidious ways. And you
    people probably know much more about that
  • 2:21 - 2:29
    than me. Very insidious ways. Exploited by
    bad people to our detriment. And they go
  • 2:29 - 2:37
    wrong in insidious ways, exploited by our
    kindly governments for our protection.
  • 2:37 - 2:45
    Right? And these - there are many, many,
    many different causes of these things but
  • 2:45 - 2:49
    many of these causes are completely
    trivial. You know, you have one line of
  • 2:49 - 2:56
    code in the wrong place - something like
    that, right? So that's very, very bad and
  • 2:56 - 3:01
    you're all conscious of how bad that is.
    But from my point of view, this is as
  • 3:01 - 3:07
    nothing with the real disaster that we
    have been facing and continue to face.
  • 3:07 - 3:15
    Right? The real disaster is the enormous
    waste of human life and energy and effort
  • 3:15 - 3:21
    and emotion dealing with these crap
    machines that we've built. Right? And it's
  • 3:21 - 3:25
    hard to quantify that. I did a little
    experiment a couple of months ago. I
  • 3:25 - 3:34
    Googled up "Android problem" and "Windows
    problem" and got about 300 million hits
  • 3:34 - 3:44
    for each, right? Somehow indicative of the
amount of wrongness. So this should
  • 3:44 - 3:50
    seem kind of familiar to you. If you think
    back - think back to when you were young.
  • 3:50 - 3:56
    Maybe back in the 19th century when you
    were young, and what were you doing then
  • 3:56 - 4:01
    as a community of hackers and makers and
    builders? Well, this was the full-on
  • 4:01 - 4:07
    industrial revolution. You were building
    stuff. You were building beautiful bridges
  • 4:07 - 4:15
    of cast-iron and stone with numerous
    arches and pillars. And every so often
  • 4:15 - 4:22
    they would fall down. I believe someone
    miscalculated the wind loading. And we
  • 4:22 - 4:31
    would build magical steam engines to whisk
    us away. And every so often, they would
  • 4:31 - 4:36
    blow up. And that kind of thing, as you
    may have noticed in the last hundred years
  • 4:36 - 4:42
    - sorry - that kind of thing doesn't
    happen very often any more, right? We have
  • 4:42 - 4:47
    pretty good civil and mechanical
    engineering. And what does it mean to have
  • 4:47 - 4:53
    good civil and mechanical engineering? It
    means that we know enough thermodynamics
  • 4:53 - 4:57
    and materials science and quality control
    management and all that kind of thing to
  • 4:57 - 5:02
    model our designs before we build them and
    predict how they're going to behave pretty
  • 5:02 - 5:11
    well and with pretty high confidence,
    right? I see an optimisation in this talk.
  • 5:11 - 5:15
    For computer science, for computing
    systems we can predict with 100 per cent
  • 5:15 - 5:20
    certainty that they will not work.
    Ah, time for lunch!
  • 5:20 - 5:27
    applause
  • 5:27 - 5:33
    But it's not a very satisfying answer,
    really. So why is it so hard? Let me
  • 5:33 - 5:38
    discuss a couple of the reasons why
    computing is tricky. In some sense,
  • 5:38 - 5:45
    trickier than steam engines. The first
    reason is that there is just too much
  • 5:45 - 5:50
    code. Right? So this is one of the more
    scary graphs I've found on the internet
  • 5:50 - 5:55
    recently. This is a measure of code in
    hundreds of millions of lines. Each of
  • 5:55 - 6:00
    these blocks I think is 10 million lines
    of code. And in the little letters - you
  • 6:00 - 6:05
    might not be able to see we have Android
    and the Large Hadron Collider and Windows
  • 6:05 - 6:13
    Vista and Facebook, US Army Future Combat
    System, Mac OS and software in a high-end
  • 6:13 - 6:15
    car. Right?
  • 6:15 - 6:23
laughter and applause
  • 6:23 - 6:25
    I'm flying home.
  • 6:25 - 6:28
    laughter
  • 6:28 - 6:34
So that's a bit of a problem, really.
    We're never going to get that right. Then
  • 6:34 - 6:37
    there's too much legacy complexity there,
    right? Really too much. So as
  • 6:37 - 6:42
    right-minded, pure-minded people you might
    think that software and hardware should be
  • 6:42 - 6:46
    architected beautifully with regular
    structure a bit like a Mies van der Rohe
  • 6:46 - 6:54
    skyscraper or something. Clean. Okay. You
    know that's not realistic, right? You
  • 6:54 - 6:59
    might expect a bit of baroqueness. You
    might expect maybe something more like
  • 6:59 - 7:03
    this. You know, it's a bit twiddly but it
    still hangs together. It's still got
  • 7:03 - 7:08
    structure. It still works. But then you
    stop and you think and you realise that
  • 7:08 - 7:16
    software and hardware is built by very
    smart people and usually in big groups and
  • 7:16 - 7:20
    subject to a whole assortment of rather
    intense commercial pressure and related
  • 7:20 - 7:25
    open-source pressure and using the best
    components and tools that they know and
  • 7:25 - 7:27
    they like.
  • 7:27 - 7:35
laughter and applause
  • 7:35 - 7:39
    Someone from the room: (Indistinct)
    You forgot management!
  • 7:39 - 7:42
    And the best management that
    human beings can arrange.
  • 7:42 - 7:50
laughter and applause
  • 7:50 - 7:55
    So we end up with something
    spectacularly ingenious.
  • 7:55 - 8:05
laughter and applause
  • 8:05 - 8:10
    You can see here, if you look
    closely, you can see all the parts.
  • 8:10 - 8:11
    laughter
  • 8:11 - 8:16
    This was just a found picture from the
    internet. But you can see there's a bunch
  • 8:16 - 8:19
    of, you know, transparent sort of
    silicon-like stuff down here. Let's say
  • 8:19 - 8:25
    that's our hardware. And over here, I see
    a C compiler with all of its phases.
  • 8:25 - 8:29
    laughter
  • 8:29 - 8:37
    And these bottles, that's the file system,
    I reckon. And there's a TCP stack in here
  • 8:37 - 8:44
    somewhere. And there's a piece that
    belongs to the NSA, obviously. And up
  • 8:44 - 8:50
    here, I think up here - yeah, there's a
    JavaScript engine and a TLS stack and a
  • 8:50 - 8:56
    web browser. And then perched at the top
    is the software that most people use to
  • 8:56 - 8:58
    talk to their banks.
  • 8:58 - 9:01
    laughter
  • 9:01 - 9:09
    It's just - it's just insane, right? And
    then we have to look at how we build these
  • 9:09 - 9:14
    pieces, right? So we build them
    fundamentally by a process of trial and
  • 9:14 - 9:21
    error, right? Just occasionally we write
    some specifications in text. And then we
  • 9:21 - 9:26
    write some code and maybe we write a few
ad hoc tests and then we test and fix
  • 9:26 - 9:31
    until the thing is marketable and then we
    test and fix and extend until the thing is
  • 9:31 - 9:36
    no longer marketable and then we keep on
    using it until we can't anymore. Right? So
  • 9:36 - 9:40
    the fundamental process that we're using
    to develop these systems is trial and
  • 9:40 - 9:48
    error by exploring particular execution
    paths of the code on particular inputs.
  • 9:48 - 9:54
    But - you know this, but I want to
    highlight - there are too many. The number
  • 9:54 - 10:00
    of execution paths scales typically at
    least exponentially with the size of the
  • 10:00 - 10:04
    code and the number of states scales
    exponentially with the size of the data
  • 10:04 - 10:09
    that you're dealing with and the number of
    possible inputs is also shockingly large.
  • 10:09 - 10:15
    There is no way that we can do this and we
    should perhaps stop deluding ourselves
  • 10:15 - 10:21
    that there is. And then the most
    fundamental reason that computers are hard
  • 10:21 - 10:28
    is that they are too discrete, right? If
    you take a bridge or, I don't know, a
  • 10:28 - 10:37
    chair or something - where's something
    slightly bendable? You take a chair and
  • 10:37 - 10:44
    you exert a bit more force on it than it's
    used to - I'm not very strong - and it
  • 10:44 - 10:51
    bends a bit. It deforms continuously. If
    you take a bridge and you tighten up one
  • 10:51 - 10:55
    of the bolts a bit too much or you
    underspecify one of the girders by 10 per
  • 10:55 - 11:01
    cent, occasionally you'll reach a point of
    catastrophic failure but usually it will
  • 11:01 - 11:04
    just be a little bit weaker or a little
    bit stronger or a little bit wibbly.
  • 11:04 - 11:09
    Right? But computer systems, you change a
    bit - sometimes it doesn't make a
  • 11:09 - 11:13
    difference but disturbingly often you
change a line of code - as in some of
  • 11:13 - 11:20
    those bugs that we were talking about -
    the whole thing becomes broken, right? So
  • 11:20 - 11:27
    it's quite different from that traditional
    engineering. It's really different. Okay.
  • 11:27 - 11:35
    So, "A New Dawn", this thing is titled.
    What can we do? What might we do? So I'm
  • 11:35 - 11:41
    going to talk about several different
    possibilities, all right? One thing, we
  • 11:41 - 11:46
    can do more of the stuff which is
    traditionally called software engineering.
  • 11:46 - 11:51
    It's not really engineering in my book,
    but some of it's good. I mean, it's
  • 11:51 - 11:55
    useful, you know? We can do more testing
    and have more assertions in our code and
  • 11:55 - 11:59
    better management maybe and all that kind
    of stuff. We can do all that - we should
  • 11:59 - 12:04
    do all that, but it's not going to make a
    very big change, all right? It's not
  • 12:04 - 12:14
    addressing any of those root causes of the
    problem. What else could we do? So as an
  • 12:14 - 12:19
    academic computer scientist, I've devoted
    several years of my life to working in the
  • 12:19 - 12:25
    area of programming languages and I look
    at the programming languages used out
  • 12:25 - 12:37
    there in the world today. Oh, it's just
    disgusting! It's shocking.
  • 12:37 - 12:43
    applause
  • 12:43 - 12:50
    The reasons that languages become popular
    and built into our shared infrastructure,
  • 12:50 - 12:55
    the reasons for that have almost nothing
    to do with how well those languages are
  • 12:55 - 13:01
    designed and in fact whether
    they are designed at all, right?
  • 13:01 - 13:04
    applause
  • 13:04 - 13:09
    So I didn't really want to get too much
    hate mail after this talk so I'm going to
  • 13:09 - 13:14
    try and avoid naming particular languages
    as much as I can, but I encourage you to
  • 13:14 - 13:20
    think away as I speak and imagine the
    examples. No, that was - I didn't want to
  • 13:20 - 13:24
    get much hate mail. Sorry.
  • 13:24 - 13:25
    laughter
  • 13:25 - 13:27
    So...
  • 13:27 - 13:32
laughter and applause
  • 13:32 - 13:37
    So at the very least, we could use
    programming languages based on ideas from
  • 13:37 - 13:47
    the 1970s instead of the 1960s. I mean,
    really, come on. Right? You know, back in
  • 13:47 - 13:54
'67 or '69 or whatever we had BCPL and C
    and only a couple of years later we had
  • 13:54 - 14:00
    languages that guaranteed type and memory
    safety and enforcement of abstraction
  • 14:00 - 14:05
    boundaries and gave us rich mechanisms for
    programming so we could say what we mean
  • 14:05 - 14:11
    and not futz about with 23 million
    pointers, about 1 per cent of which were
  • 14:11 - 14:17
    completely wrong, right? So for any
    particular - in the context of any
  • 14:17 - 14:20
    particular project or any particular
    existing legacy infrastructure there may
  • 14:20 - 14:26
    be very good reasons why one just can't
    make a switch. Right? But for us
  • 14:26 - 14:32
    collectively, there is no excuse. It's
    just wretched, right?
  • 14:32 - 14:37
    applause
  • 14:37 - 14:41
    It's completely bonkers. And then the
    other thing that you see as a programming
  • 14:41 - 14:49
    language researcher is that - er, well, to
    make another analogy, it looks as if
  • 14:49 - 14:56
    anyone who was capable of driving a car
    considered themselves an expert on car
  • 14:56 - 15:04
    design, right? There is in fact a rather
    well-established subject of, you know,
  • 15:04 - 15:08
    programming language design and we know
    how to specify completely precisely,
  • 15:08 - 15:13
    mathematically precisely not just the
    syntax like we had in these BNF grammars
  • 15:13 - 15:18
    from the 1960s and still have, but also
    the behaviour of these things. We know how
  • 15:18 - 15:23
    to specify a few operational semantics and
    the type systems. And we know how to do
  • 15:23 - 15:27
    mathematical proofs that our language
    designs are self-consistent, that they do
  • 15:27 - 15:35
    guarantee good properties of arbitrary
    well-typed programs that you write, right?
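As a toy illustration of what such a precise definition looks like, here is a tiny expression language with a type system and an evaluator standing in for an operational semantics. The language is invented here, not from the talk; "soundness" for it means that if `typecheck()` succeeds, `eval()` never reaches its stuck state:

```c
/* A toy, invented example of a precisely specified language:
 * abstract syntax, a type system, and a big-step evaluator.
 * The soundness property ("well-typed programs don't go wrong")
 * here means: if typecheck() returns something other than T_ERR,
 * eval() never reaches the assert(0) stuck state. */
#include <assert.h>

typedef enum { LIT_INT, LIT_BOOL, ADD, IF } Tag;
typedef enum { T_INT, T_BOOL, T_ERR } Type;

typedef struct Expr {
    Tag tag;
    int n;                    /* literal value for LIT_INT / LIT_BOOL */
    struct Expr *a, *b, *c;   /* subexpressions (IF uses all three) */
} Expr;

/* The type system: one rule per syntactic form. */
Type typecheck(const Expr *e) {
    switch (e->tag) {
    case LIT_INT:  return T_INT;
    case LIT_BOOL: return T_BOOL;
    case ADD:                       /* both operands must be integers */
        return (typecheck(e->a) == T_INT &&
                typecheck(e->b) == T_INT) ? T_INT : T_ERR;
    case IF: {                      /* bool condition, equal branch types */
        Type t = typecheck(e->b);
        return (typecheck(e->a) == T_BOOL &&
                t != T_ERR && typecheck(e->c) == t) ? t : T_ERR;
    }
    }
    return T_ERR;
}

/* The evaluator: the executable face of the semantics. */
int eval(const Expr *e) {
    switch (e->tag) {
    case LIT_INT:
    case LIT_BOOL: return e->n;
    case ADD:      return eval(e->a) + eval(e->b);
    case IF:       return eval(e->a) ? eval(e->b) : eval(e->c);
    }
    assert(0 && "stuck: unreachable for well-typed programs");
    return 0;
}
```

For example, `if true then 1 + 2 else 0` typechecks as `T_INT` and evaluates to 3, while `true + 1` is rejected by `typecheck` before it can go wrong at run time. Writing this down forces you to consider every case and every feature interaction, which is exactly the point made above.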
  • 15:35 - 15:39
    We can do this now. We can do this, if we
    have to, post hoc. And people are doing
  • 15:39 - 15:46
    this for JavaScript and C and god help
    them, PHP and other such languages. But
  • 15:46 - 15:51
    even better, we can do it at design time.
    And if you do this at design time, you see
  • 15:51 - 15:56
    the structure. You have to - the act of
    writing that specification forces you to
  • 15:56 - 16:00
    consider all the cases and all the
    interactions between features. You can
  • 16:00 - 16:08
    still get it wrong, but you get it wrong
    in more interesting ways. Right? So people
  • 16:08 - 16:12
    are starting to do this now. So I think
    there was a talk on the first day of this
  • 16:12 - 16:17
    by Hannes and David who are building
    system software in maybe not the best sys
  • 16:17 - 16:22
    language you could possibly imagine, but
    something which compared with C is as
  • 16:22 - 16:28
    manna from heaven. Anil Madhavapeddy and
    his colleagues in Cambridge have been
  • 16:28 - 16:33
    working hard to build system software in
    at least moderately modern languages. And
  • 16:33 - 16:41
    it works. Sometimes you have to use C but
    surprisingly often you really don't. Yeah,
  • 16:41 - 16:47
    so I teach, as my introducer said, in the
    University of Cambridge. And I teach
  • 16:47 - 16:51
    semantics. And one of my fears is that I
    will accidentally let someone out in the
  • 16:51 - 16:55
    world who will accidentally find
    themselves over a weekend inventing a
  • 16:55 - 16:58
    little scripting language which will
    accidentally take over the world and
  • 16:58 - 17:05
    become popular and I won't have explained
    to them what they have to do or even that
  • 17:05 - 17:08
    there is a subject they have to
    understand. And if you want to understand
  • 17:08 - 17:19
    it, there's lots of stuff you can look at.
    Okay. Let's go to the uppermost extreme.
  • 17:19 - 17:24
    If we want software and hardware that
    actually works, we should prove it
  • 17:24 - 17:31
    correct. Right? So this is something which
    academics, especially in the domain of
  • 17:31 - 17:37
    labelled "formal methods" in the '70s and
    '80s, they've been pushing this, promoting
  • 17:37 - 17:45
    this as an idea and a promise and an
    exhortation for decades, right? Why don't
  • 17:45 - 17:50
    we just prove our programs correct? And
    for some of those decades it wasn't really
  • 17:50 - 17:55
    plausible. I remember back when I was
    considerably younger, it was a big thing
  • 17:55 - 18:00
to prove that if you took a linked list
    and reversed it twice, you got back the
  • 18:00 - 18:08
    same thing. Right? That was hard, then.
    But now, well, we can't do this for the
  • 18:08 - 18:14
    Linux kernel, let's say, but we can
surprisingly often do this. And I'll just give
  • 18:14 - 18:20
    you a couple of the examples of modern
    day, state-of-the-art academic
  • 18:20 - 18:26
    verification projects. People have
    verified compilers all the way from C-like
  • 18:26 - 18:32
    languages down to assembly or from ML-like
    languages. You know, higher-order
  • 18:32 - 18:37
    functional imperative languages all the
    way down to binary models of how x86
  • 18:37 - 18:44
    behaves. And people have verified LLVM
optimisation passes. There's verified
  • 18:44 - 18:50
    software fault isolation which I believe
    goes faster than the original non-verified
  • 18:50 - 18:56
    form. Win for them. There's a verified
hypervisor from the NICTA group in Australia.
  • 18:56 - 19:01
    Right? Verified secure hypervisor with
    proofs of security properties. There's any
  • 19:01 - 19:07
    amount of work verifying crypto protocols,
    sometimes with respect to assumptions - I
  • 19:07 - 19:13
think reasonable assumptions on what the
    underlying mathematics is giving you. So
  • 19:13 - 19:23
    we can do this. I maybe have to explain a
    bit what I mean by "prove". To some
  • 19:23 - 19:31
    computery people, "I proved it works"
    means "I tested it a little bit".
  • 19:31 - 19:33
    laughter
  • 19:33 - 19:35
    In extremis, "it compiled".
  • 19:35 - 19:44
laughter and applause
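For contrast, the machine-checked sense of "prove" applied to the reverse-a-list-twice fact mentioned earlier looks something like this. A sketch in Lean 4 with a hand-rolled `rev`; the exact simp lemma set may need adjustment in a given Lean version:

```lean
-- A hand-rolled list reverse and the machine-checked fact that
-- reversing twice is the identity.  Sketch only: the simp calls
-- may need tweaking depending on the Lean version in use.
def rev : List α → List α
  | []      => []
  | x :: xs => rev xs ++ [x]

theorem rev_append (xs ys : List α) :
    rev (xs ++ ys) = rev ys ++ rev xs := by
  induction xs with
  | nil => simp [rev]
  | cons x xs ih => simp [rev, ih]

theorem rev_rev (xs : List α) : rev (rev xs) = xs := by
  induction xs with
  | nil => rfl
  | cons x xs ih => simp [rev, rev_append, ih]
```

The proof assistant checks every step; if the induction hypothesis or a lemma is misapplied, the proof simply does not go through.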
  • 19:44 - 19:51
    To a program analysis person, you might
    run some sophisticated tool but written in
  • 19:51 - 19:55
    untrusted code and possibly doing an
    approximate and maybe unsound analysis and
  • 19:55 - 20:00
    you might find an absence of certain kinds
    of bugs. That's tremendously useful; it's
  • 20:00 - 20:05
    not what we would really call "proved".
    I mean, nor would they, to be fair.
  • 20:05 - 20:09
    Scientists generally don't use the word
    because they know they're not doing it,
  • 20:09 - 20:14
    right? They're doing controlled
    experiments, which gives them, you know,
  • 20:14 - 20:20
    information for or against some
hypothesis. Mathematicians write proofs.
  • 20:20 - 20:25
    You probably did that when you were young.
    And those proofs are careful mathematical
  • 20:25 - 20:30
    arguments usually written on paper with
    pencils and their aim is to convince a
  • 20:30 - 20:33
    human being that if that mathematician was
    really nailed up against the wall and
  • 20:33 - 20:39
    pushed, they could expand that into a
    completely detailed proof. But what we
  • 20:39 - 20:48
    have in computing, we don't have the rich
    mathematical structure that these people
  • 20:48 - 20:52
    are working with, we have a tremendous
    amount of ad hoc and stupid detail mixed
  • 20:52 - 20:58
in with a smidgen of rich mathematical
    structure. So - and we have hundreds of
  • 20:58 - 21:03
    millions of lines of code. So this is just
not going to cut it. And if we want to say
  • 21:03 - 21:08
    the word "prove" ever, then the only thing
    which is legitimate is to do honest
  • 21:08 - 21:13
    program proof and that means you have to
    have some system that machine-checks that
  • 21:13 - 21:18
    your proof is actually a proof. And
sometimes we'll also have machines that
  • 21:18 - 21:22
    will sort of help and sort of hinder that
    process, right? And there's a variety of
  • 21:22 - 21:27
these systems. Coq and HOL4 and
Isabelle/HOL and what have you that we
  • 21:27 - 21:37
    can look up. So using these systems we can
prove nontrivial facts about these systems. Not
  • 21:37 - 21:42
    necessarily that they do what you want
    them to do, but that they meet the precise
  • 21:42 - 21:49
    specification that we've written down. We
    can do that. But it's tricky, right? So
  • 21:49 - 21:53
    these projects, these are by academic
    standards all quite big, you know? This is
  • 21:53 - 21:58
    something like, I don't know, 10 people
    for a few years or something. By industry
  • 21:58 - 22:03
    scale, maybe not that big. By your scale,
    maybe not that big. But still a lot of
  • 22:03 - 22:08
    work for a research group to do. And quite
    high-end work. It can take, you know,
  • 22:08 - 22:14
    literally a few years to become really
    fluent in using these tools, right? And
  • 22:14 - 22:20
    there isn't as yet sort of an open-source,
    collaborative movement doing this kind of
  • 22:20 - 22:27
stuff. Maybe it's time to start. Maybe
    in five years, ten years. I don't know. If
  • 22:27 - 22:36
    you want a challenge, there's a challenge.
    But it's really quite hard, right? It's
  • 22:36 - 22:41
    not something that one can quite put
    forward credibly as a solution for all
  • 22:41 - 22:47
    software and hardware development, right?
    So that leads me to, like an intermediate
  • 22:47 - 22:53
    point. That should have been a four there.
    What can we do which improves the quality
  • 22:53 - 22:58
    of our system, which injects some
    mathematical rigour but which involves
  • 22:58 - 23:06
    less work and is kind of easy? And what we
    can do is take some of these interfaces
  • 23:06 - 23:12
    and be precise about what they are. And
    initially, we have to do that after the
  • 23:12 - 23:17
    fact. We have to reverse-engineer good
    specs of existing stuff. But then we can
  • 23:17 - 23:23
use the same techniques to make things
which are much cleaner and better-tested and
  • 23:23 - 23:31
    are easier to test in the future. That's
    the idea. So my colleagues and I have been
  • 23:31 - 23:35
    doing this kind of stuff for the last
    quite a few years and I just want to give
  • 23:35 - 23:42
    you just a - two little vignettes just to
    sort of show you how it goes. So I can't
  • 23:42 - 23:47
    give you much detail. And this is,
    obviously, joint work with a whole bunch
  • 23:47 - 23:55
    of very smart and good people, some of
whom I name here. So this is not me, this
  • 23:55 - 24:01
    is a whole community effort. So I'm going
    to talk for a little bit about the
  • 24:01 - 24:06
    behaviour of multiprocessors, so at that
    hardware interface. And I'm going to talk
  • 24:06 - 24:11
    a tiny little bit about the behaviour of
    the TCP and socket API, right? Network
  • 24:11 - 24:17
    protocols. So - and there'll be some
    similarities and some differences between
  • 24:17 - 24:25
    these two things. So multiprocessors. You
    probably want your computers to go fast.
  • 24:25 - 24:31
    Most people do. And it's an obvious idea
    to glom together, because processors don't
  • 24:31 - 24:36
    go that fast, let's glom together a whole
    bunch of processors and make them all talk
  • 24:36 - 24:46
    to the same memory. This is not a new
    idea. It goes back at least to 1962 when
  • 24:46 - 24:51
    the Burroughs D825 had, I think, two
    processors talking to a single shared
  • 24:51 - 24:56
    memory. It had outstanding features
    including truly modular hardware with
  • 24:56 - 25:03
    parallel processing throughout, and - some
    things do not change very much - the
  • 25:03 - 25:12
    complement of compiling languages was to
    be expanded, right? 1962, so that'll be 52
  • 25:12 - 25:19
    years ago. Deary me. Okay. Now, how do
    these things behave? So let me show you
  • 25:19 - 25:23
    some code, right? It's a hacker
    conference. There should be code. Let me
  • 25:23 - 25:29
    show you two example programs and these
    will both be programs with two threads and
  • 25:29 - 25:35
    they will both be acting on two shared
    variables, X and Y. And in the initial
  • 25:35 - 25:43
    state, X and Y are both zero. So, the
    first program. On this thread we write X
  • 25:43 - 25:49
    and then we read Y, and over here we write
    Y and then we read X, right? Now, these
  • 25:49 - 25:54
    are operating, you know, in an
    interleaving concurrency, you might think,
  • 25:54 - 25:58
    so there's no synchronisation between
    those two threads so we might interleave
  • 25:58 - 26:03
    them. We might run all of thread 0 before
    all of thread 1 or all of thread 1 before
  • 26:03 - 26:10
    all of thread 0 or they might be mixed in
    like that or like that or like that or
  • 26:10 - 26:17
    like that. There's six possible ways of
    interleaving two lists of two things. And
  • 26:17 - 26:23
    in all of those ways you never see in the
    same execution that this reads from the
  • 26:23 - 26:30
    initial state AND this reads from the
    initial state. You never see that. So you
  • 26:30 - 26:35
    might expect and you might assume in your
    code that these are the possible outcomes.
  • 26:35 - 26:40
    So what happens if you were to actually
    run that code on my laptop in a particular
  • 26:40 - 26:47
    test harness? Well, okay, you see
    occasionally they're quite well
  • 26:47 - 26:52
    synchronised and they both read the other
guy's write. Sorry, big call. They both
  • 26:52 - 26:58
read the other guy's write. And quite
    often one thread comes completely before
  • 26:58 - 27:10
    the other. But sometimes, sometimes they
    both see 0. Huh. Rats. Well, that's
  • 27:10 - 27:19
    interesting. Let me show you another
    program. So now, thread 0 writes some
  • 27:19 - 27:23
    stuff. Maybe it writes a big bunch of
    data, maybe multi-word data. And then it
  • 27:23 - 27:28
    writes, let's say that's a flag announcing
to another thread that that data is
  • 27:28 - 27:33
    ready to go now. And the other thread
    busy-waits reading that value until it
  • 27:33 - 27:39
    gets 1, until it sees the flag. And then
    it reads the data. So you might want, as a
  • 27:39 - 27:44
    programmer, you might want a guarantee
    that this read will always see the data
  • 27:44 - 27:49
    that you have initialised. Hey? That would
    be like a message-passing kind of an
  • 27:49 - 27:56
    idiom. So what happens if you run that?
    Well, on the x86 processors that we've
  • 27:56 - 28:04
    tested, you always see that value. Good.
This is a decent coding idiom on x86. All
  • 28:04 - 28:12
right. On ARM and IBM Power processors,
however, sometimes you see it just ignore
  • 28:12 - 28:17
    the value written and you see the initial
    state. So this is not a good
  • 28:17 - 28:26
    message-passing idiom, right? So what's
going on? Well, these behaviours might
  • 28:26 - 28:31
    be surprising, eh? And when you see
    surprising behaviour in hardware, you have
  • 28:31 - 28:36
    two choices. Either it's a bug in the
    hardware, and we have found a number of
  • 28:36 - 28:45
    bugs in hardware. It's always - it's very
    rewarding. Or it's a bug in your
  • 28:45 - 28:51
    expectation. Or it's a bug in your test
    harness. Right? So what you do is you walk
  • 28:51 - 28:58
    along and you walk up to your friendly
    processor architect in IBM or ARM or x86
  • 28:58 - 29:05
    land, and we have worked with all of these
    people. And you ask them, "Is this a bug?"
  • 29:05 - 29:12
    And a processor architect is a person who
    is - has the authority to decide whether
  • 29:12 - 29:16
    some behaviour is intended or is a bug.
  • 29:16 - 29:18
    laughter
  • 29:18 - 29:25
    That's what they do, essentially. And for
    these, they say, "Oh, we meant it to be
  • 29:25 - 29:29
    like that". Right? No question. "We meant
    it to be like that. That's perfectly
  • 29:29 - 29:37
    proper. We have, because you and everybody
    else wanted their computers to go fast, we
  • 29:37 - 29:42
    as processor designers have introduced all
manner of sophisticated optimisations,
  • 29:42 - 29:45
    which if you were running sequential code,
    you would never notice, but if you start
  • 29:45 - 29:49
    running fancy concurrent code like this,
    fancy concurrent code that isn't just
  • 29:49 - 29:56
    trivially locked, you can notice", so this
    is intentional behaviour. And if you want
  • 29:56 - 30:00
    to write that code, like a mutual
    exclusion algorithm or a message-passing
  • 30:00 - 30:06
    algorithm or something, then you need to
    insert special instructions, memory
  • 30:06 - 30:11
    barrier instructions. And so you go away
    and you say, "What do they do?" And you
  • 30:11 - 30:19
    look up in the vendor documentation. And
    you get stuff like this. I'll read it out:
  • 30:19 - 30:22
    (Reads quickly) "For any applicable pair
a,b, the memory barrier ensures that a will
  • 30:22 - 30:25
    be performed with respect to any processor
    or mechanism, to the extent required by
  • 30:25 - 30:28
the associated Memory Coherence Required
attributes, before b is performed with
  • 30:28 - 30:30
respect to that processor or mechanism. A
    includes all applicable storage accesses
  • 30:30 - 30:33
    by any such processor or mechanism that
    have been performed with respect to P1
  • 30:33 - 30:36
    before the memory barrier is created. B
    includes all applicable storage accesses
  • 30:36 - 30:39
    by any such processor or mechanism that
    are performed after a Load instruction
  • 30:39 - 30:41
    executed by that processor or mechanism
    has returned the value stored by a store
  • 30:41 - 30:43
    that is in B."
  • 30:43 - 30:50
    applause
  • 30:50 - 30:59
    Hands up if you understand what that
    means? Hands up if you think that if you
  • 30:59 - 31:02
    thought about it and read the rest of the
    book, you would understand or you could
  • 31:02 - 31:10
    understand what that means? Hands up the
    people who think that we're doomed forever?
  • 31:10 - 31:13
    laughter
  • 31:13 - 31:17
    So I'm sorry to the first few groups,
    but the last ones are right.
  • 31:17 - 31:22
    laughterapplause
  • 31:22 - 31:29
    This looks like it's really intricate and
    really carefully thought out stuff. And we
  • 31:29 - 31:32
    thought that for a while and we spent
    literally years trying to make precise
  • 31:32 - 31:37
    mathematical models that give precise
    meaning to these words, but actually it
  • 31:37 - 31:45
    can't be done. So what do you have to do
    in that circumstance? You have to go away
  • 31:45 - 31:52
    and you have to invent some kind of a
    model. And there's a - this is a really
  • 31:52 - 31:56
    fundamental problem, that we - on the one
    hand, we develop our software by this
  • 31:56 - 32:02
    trial and error process, but on the other
    hand are points like this. We have these
  • 32:02 - 32:06
    loose specifications, supposedly, that
    have to cover all manner of behaviour of
  • 32:06 - 32:11
    many generations of processors that might
not all behave the same way, written just in
  • 32:11 - 32:17
    text. And we tell people they should write
    "to the spec", but the only way they have
  • 32:17 - 32:22
    of developing their code is to run it and
    run particular executions on the
  • 32:22 - 32:27
    particular hardware implementations that
    they have, whose relationship to the spec
  • 32:27 - 32:35
    is very elusive. We can't use those thick
    books that you get or those quite weighty
  • 32:35 - 32:39
    PDFs that you get these days from the
    processor vendors to test programs. You
  • 32:39 - 32:44
    can't feed a program into that thick book
    and get output out. And you can't test
  • 32:44 - 32:47
    processor implementations and you can't
    prove anything and you can't even
  • 32:47 - 32:53
    communicate between human beings, right?
    So these things, really, they don't exist.
  • 32:53 - 32:58
    So what can we do? Well, the best we can
    do at this point in time is a bit of
  • 32:58 - 33:05
    empirical science, right? So we can invent
    some model off the top of our heads,
  • 33:05 - 33:10
    trying to describe just the
    programmer-visible behaviour, not the
  • 33:10 - 33:16
    internal structure. And we can make a
    tool. Because that's not prose now. Now we
  • 33:16 - 33:19
    can - now it's stuff. It's real
    mathematics and we can turn that into code
  • 33:19 - 33:23
    and execute it. We can make tools that
take small programs, execute them, and
tell us the set of all
    programs - tell us the set of all
    behaviours allowed by that model. And then
  • 33:28 - 33:34
    we can compare those sets of behaviours
    against the experimental observations. And
  • 33:34 - 33:38
    you might have to fix the model and you
might find hardware bugs. And then the model
  • 33:38 - 33:43
    also has to make sense to the architect so
    you can discuss with the architects what
  • 33:43 - 33:47
they intend and what their intuition is,
    and then you can also prove some other
  • 33:47 - 33:51
    facts about it to give a bit more
    assurance and then because you probably
  • 33:51 - 33:57
    got this wrong the first couple of times,
    you can go back. And this results or has
  • 33:57 - 34:05
    resulted in models which are not
    guaranteed to be correct, but they are
  • 34:05 - 34:09
    effectively the de facto standard for
    understanding the concurrency behaviour of
  • 34:09 - 34:14
    these things, you know? And various people
    use them. And just to give you a tiny
  • 34:14 - 34:19
    snippet - I'll have to speed up a bit - a
    tiny snippet of what the model looks like,
  • 34:19 - 34:23
    it's basically just an abstract machine.
    It's got some stuff for the threads that
  • 34:23 - 34:27
    handles the programmer-visible speculative
    execution, and some stuff for a storage
  • 34:27 - 34:33
    subsystem that handles the fact that in
these architectures, memory writes can be
  • 34:33 - 34:38
    propagated to different threads at
    different times, right? And there's a -
  • 34:38 - 34:43
    that model has a state which is just a
    collection of some sets and lists and
  • 34:43 - 34:47
    partial orders and a few other things
    which I won't talk about. And it has some
  • 34:47 - 34:51
    transitions, right? In any state, there
    might be several possible transitions.
  • 34:51 - 34:55
    This is just a sample. I'm not going to
    explain all of this. But when you want to
  • 34:55 - 34:59
    propagate a write to another thread, there
are some preconditions that you have to
  • 34:59 - 35:07
satisfy. And then there is an action. It's not
    amazingly complicated, really. So this is text,
  • 35:07 - 35:13
    but it's very precisely written text and it has
    the great advantage that you can go through
  • 35:13 - 35:17
    these bullet points with your friendly
    architect and say, "Is this what you
  • 35:17 - 35:22
    mean?" And they can think about it and say
    yes or no. But for the actual definition,
  • 35:22 - 35:27
    that's in mathematics but it's mathematics
    that's quite close to code. I mean, it's
  • 35:27 - 35:32
terribly close to pure functional code
without side effects. And just to give you
  • 35:32 - 35:37
    an idea, this is some of it and those are
    three of those conditions in the real,
  • 35:37 - 35:43
    honest, true version. Then we can take
    this mathematics and because it's in a
  • 35:43 - 35:49
    particular style, we can compile this into
    actually OCaml and then OCaml byte code
  • 35:49 - 35:56
    and then we can run it for batch jobs. But
    then you can compile that OCaml byte code
  • 35:56 - 36:03
to JavaScript and glue on a user interface
    and stick it into a web browser and then you
  • 36:03 - 36:08
    have a web interface that lets people
    explore this model. And that's also a
  • 36:08 - 36:16
    necessary part of validating that it makes
    sense. Okay. This is not rocket science.
  • 36:16 - 36:22
    You've missed the talk for rocket science,
    I'm sorry. All we're doing here is
  • 36:22 - 36:29
    specifying the intended behaviour of a
    system. Okay? That's - it's not very
  • 36:29 - 36:32
    technical but it's unusual. And we happen
    to be doing it in this mathematical
  • 36:32 - 36:37
    language, but you could use in fact almost
    any language so long as you understand
  • 36:37 - 36:40
    what you're doing. What you have to
    understand is that you're writing
  • 36:40 - 36:45
    something which is not an implementation.
    It is something which, given some observed
  • 36:45 - 36:50
    behaviour, some arbitrary observed
    behaviour, will be able to decide if it's
  • 36:50 - 36:55
    allowed, if you want it to be allowed or
not, right? It has to be executable as a
  • 36:55 - 37:02
    test oracle. The key question for getting
    this to work is to understand how much
  • 37:02 - 37:06
    non-determinism or loose specification
    there is in the system that you're working
  • 37:06 - 37:09
    with, right? So if everything is
    completely deterministic, you can write a
  • 37:09 - 37:14
    reference implementation in the cleanest
    language of your choice and just compare
  • 37:14 - 37:19
    the output of that and the output of the
    real thing, right? But if there's more
  • 37:19 - 37:24
    non-determinism, then you can't do that.
    I'm going to have to abbreviate this a
  • 37:24 - 37:31
    little tiny bit. So for the TCP network
    protocols, there is more non-determinism.
  • 37:31 - 37:35
You know what TCP is, yes? A protocol
    that gives you sort of reliable
  • 37:35 - 37:42
    connections, assuming that the world is
    made of good people, right? You look at
  • 37:42 - 37:48
    the standards for TCP, and they're the
    same kind of wibbly text from the 1980s
  • 37:48 - 37:52
    and you look at the implementations and
    they are a ghastly mess. And you try and
  • 37:52 - 37:56
    understand the relationship between the
    two of them and you can't because the
  • 37:56 - 38:02
    standard, again, is not the definition. It
    doesn't define in all circumstances what
  • 38:02 - 38:07
    behaviour is allowed and what isn't. So
    again, we had to make up a specification
  • 38:07 - 38:14
    in the first instance just of this host, a
    single endpoint, and observing its
  • 38:14 - 38:19
sockets API calls and returns and its
    behaviour on the wire, right? And when
  • 38:19 - 38:22
    you've decided - and a few internal debug
    events. And when you've decided what to
  • 38:22 - 38:27
    observe, then observation is just a trace,
    it's just a list of those events. And you
  • 38:27 - 38:35
    have to be able to ask your spec "Is that
list admissible or not?" But there's a lot
  • 38:35 - 38:39
    of non-determinism. Variation between the
    implementations, variations within the
  • 38:39 - 38:46
    implementations. And that's internal. It's
    not announced in the API or on the wire
  • 38:46 - 38:51
    maybe until much later when the
    implementation picks a new window size or
  • 38:51 - 38:57
    something. You can't tell until quite a
    lot later in the trace what it's picked.
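The shape of that checking problem can be sketched with a toy oracle. In the fragment below (illustrative Python with a made-up two-event protocol, nothing like real TCP), the spec is a nondeterministic transition relation, and an observed trace is admissible if some resolution of the hidden choices explains it. Tracking the whole set of states the spec could be in handles the internal non-determinism: the hidden "window size" is only pinned down by later events, just as in the TCP case.

```python
# A toy "specification as test oracle": the spec is a nondeterministic
# transition relation over abstract states, and a trace of observed events
# is admissible iff some sequence of internal choices explains it.
# This hypothetical protocol picks a hidden window size (1 or 2) at
# connection setup; we only learn which one from later "send" events.

def step(state, event):
    """Return the set of successor states after observing `event`."""
    succs = set()
    if state == "init" and event == "connect":
        succs |= {("open", 1), ("open", 2)}   # hidden choice of window size
    elif isinstance(state, tuple) and state[0] == "open":
        _, win = state
        if event == ("send", win):            # sends must respect the window
            succs.add(state)
    return succs

def admissible(trace):
    states = {"init"}                         # all states the spec might be in
    for event in trace:
        states = {s2 for s in states for s2 in step(s, event)}
        if not states:                        # no choice explains the trace
            return False
    return True

print(admissible(["connect", ("send", 2), ("send", 2)]))  # True
print(admissible(["connect", ("send", 1), ("send", 2)]))  # False: window fixed
```

The second trace is rejected because the first send forces the hidden window size to 1, and no resolution of the internal choice then allows a send of 2.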
  • 38:57 - 39:03
    And that makes the job of checking whether
    a trace is admissible by the spec much
  • 39:03 - 39:09
    harder than it has to be. Right? And if
    you had designed the TCP protocol and
  • 39:09 - 39:14
those implementations to be testable
    in this very discriminating way, back in
  • 39:14 - 39:20
the 1980s when you were designing the TCP
    protocol, it would be much easier. And for
  • 39:20 - 39:26
    new protocols, one should make sure that
    you're doing this in this particular way.
  • 39:26 - 39:31
    I don't think I want to talk about that
    slide. That's just one of the rules of
  • 39:31 - 39:35
    that specification. But the key fact about
    that spec is the - well, we handled the
  • 39:35 - 39:40
    real protocols for arbitrary inputs. But
    it's structured not just for this
  • 39:40 - 39:46
    executability as a test oracle, but it's
    structured for clarity, not performance,
  • 39:46 - 39:50
    which is scarcely ever what anyone ever
    does, right? So it's a whole new kind of
  • 39:50 - 39:54
    thing. And the testing is very
    discriminating. So we found all manner of
  • 39:54 - 40:04
    amusing and bizarre bugs which I think I
    don't have time to talk about. Okay. So
  • 40:04 - 40:13
    I've described three kinds of things that
    we might do. The first thing - for
  • 40:13 - 40:19
    goodness sake - I mean, it's just a
    no-brainer. Just do it already. Everybody.
  • 40:19 - 40:30
    All of you. All of you. Right? This middle
    zone is a very interesting zone as far as
  • 40:30 - 40:34
I'm concerned, right? It's
    a place where we could get a very good
  • 40:34 - 40:39
    benefit for not that much effort, right?
    We can do it, if necessary, post hoc. We
  • 40:39 - 40:44
    can also do it and even better, at design
    time. We have to do this in a way that
  • 40:44 - 40:48
makes our specifications executable as test
    oracles, and another thing that that
  • 40:48 - 40:52
    enables is completely random testing. When
    you've got a test oracle, you can just
  • 40:52 - 40:58
    feed in random goop. This is fuzzing, but
    better fuzzing - feed in random goop and
  • 40:58 - 41:03
check at every point whether the
    behaviour that you're getting is allowable
  • 41:03 - 41:09
    or not. And then they're written for
    clarity, not for performance, and that
  • 41:09 - 41:14
    means you can see what you're doing,
    right? You can see the complexity. If you
  • 41:14 - 41:17
    go ahead and you add some feature to your
    protocol or your programming language or
  • 41:17 - 41:22
    whatever and you're working just with text
    specifications, then you can't see the
  • 41:22 - 41:25
    interactions. You just chuck in a couple
    of paragraphs and you think for a few
  • 41:25 - 41:30
    minutes, right? And if you're lucky, you
    make an implementation. But if you're
  • 41:30 - 41:35
    writing a spec that has to cover all the
    cases like this, the act of doing that
  • 41:35 - 41:41
    encourages you to think or helps you think
    about the excess complexity. And you might
  • 41:41 - 41:47
    think that's too complex, I'm being silly,
    I'll make it simpler. And it has to be
  • 41:47 - 41:52
    usable for proof. So this, I think also is
    pretty much a no-brainer. And it doesn't
  • 41:52 - 41:57
    require great technical, you know,
    mathematical skill either, right? Lots of
  • 41:57 - 42:02
    people can do this. And then there's
    option 3, best option, full-on formal
  • 42:02 - 42:08
    verification of the key components. And
    that is now also in reach. You can imagine
  • 42:08 - 42:16
    secure systems made using actual verified
    compilers and verified hypervisors with
  • 42:16 - 42:23
    verified TLS stacks and you can imagine
    making things out of those or making those
  • 42:23 - 42:28
    and then making things out of those. And
    maybe one should be thinking about that.
  • 42:28 - 42:34
    So, maybe not appropriate for everything.
    If you were writing Word, you would not
  • 42:34 - 42:39
    wish to do these things. Probably, you're
    not. But for this key infrastructure that
  • 42:39 - 42:46
    we really depend on, we trust even though
    it's not trustworthy, we have to do this.
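As a hint of what "full-on formal verification" looks like in the small, here is a textbook toy in Lean (my own illustration, not one of the verified systems mentioned in the talk): arithmetic expressions compiled to a stack machine, with a machine-checked theorem that compiled code computes exactly the expression's value. Verified compilers scale this idea up enormously, but the statement has the same shape.

```lean
-- A miniature instance of compiler verification: arithmetic expressions
-- compiled to a stack machine, plus a proof that compilation preserves
-- the expression's meaning.
inductive Expr where
  | const : Nat → Expr
  | plus  : Expr → Expr → Expr

def eval : Expr → Nat
  | .const n  => n
  | .plus a b => eval a + eval b

inductive Instr where
  | push : Nat → Instr
  | add  : Instr

-- Run a list of instructions against a stack of naturals.
def exec : List Instr → List Nat → List Nat
  | [],              s           => s
  | .push n :: rest, s           => exec rest (n :: s)
  | .add    :: rest, x :: y :: s => exec rest ((y + x) :: s)
  | .add    :: rest, s           => exec rest s  -- stuck; unreached by compiled code

def compile : Expr → List Instr
  | .const n  => [.push n]
  | .plus a b => compile a ++ compile b ++ [.add]

-- Correctness: running compiled code pushes the expression's value.
theorem compile_correct (e : Expr) :
    ∀ (rest : List Instr) (s : List Nat),
      exec (compile e ++ rest) s = exec rest (eval e :: s) := by
  induction e with
  | const n => intro rest s; rfl
  | plus a b iha ihb =>
    intro rest s
    simp only [compile, eval, List.append_assoc]
    rw [iha, ihb]
    rfl
```

The theorem is proved by induction on the expression, generalising over the remaining instructions and the stack so the induction hypotheses apply to the sub-expressions.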
  • 42:46 - 42:52
    Is this a new dawn? I wonder if there's
    anyone standing next to a light switch
  • 42:52 - 42:59
    that can dim them for me. I didn't ask
them beforehand, so... If you think back over
  • 42:59 - 43:06
    the last 70-odd years, we started building
    computers in around 1940. It's been pretty
  • 43:06 - 43:13
    dark from the point of view of rigorous,
    solid, reliable engineering, stuff that is
  • 43:13 - 43:22
    actually trustworthy. Pretty dark, I would
    say. Maybe, just maybe there's a tiny
  • 43:22 - 43:26
    smidgen of light coming through the doors.
    And we start to have a little bit of a
  • 43:26 - 43:31
    clue and we start to get reusable models
    of processor architectures and programming
  • 43:31 - 43:36
    languages and network protocols and what
    have you. It's just the beginnings of the
  • 43:36 - 43:40
    analogue of that thermodynamics and
    materials science and quality control
  • 43:40 - 43:48
    management and what have you that we have
    from mechanical engineering. And we've got
  • 43:48 - 43:57
    no choice. If we don't, the next 75 years
    is going to be a lot worse. You can just
  • 43:57 - 44:02
    imagine, right? So I went to this - as I'm
sure many of you did - this great talk on
  • 44:02 - 44:08
    some Apple boot loader exploit yesterday
    which was relying on some feature that had
  • 44:08 - 44:15
    been introduced in the early '80s and was
    still present. And you can imagine in 100
  • 44:15 - 44:23
    years from now, you can imagine as long as
    human civilisation endures, the x86
  • 44:23 - 44:29
    architecture and the socket API and all of
    this stuff, it will be embedded in some
  • 44:29 - 44:34
    monumental ghastly stack of
    virtualisation layers forever.
  • 44:34 - 44:46
    laughterapplause
  • 44:46 - 44:51
    So I'd like to thank especially all of my
    colleagues that have been working with me
  • 44:51 - 44:57
    or not with me in these directions. I'd
    like to thank our generous funders who
  • 44:57 - 45:01
    support this stuff. I'd like to thank you
    for your attention and I exhort you,
  • 45:01 - 45:05
    sort it out.
  • 45:05 - 45:21
    applause
  • 45:21 - 45:25
    Herald: Thank you very much, Peter, for
those interesting insights into our
  • 45:25 - 45:31
    programming. We have now time for a Q & A,
    so those people who have to leave the
  • 45:31 - 45:40
    room, please do so quietly and as fast as
    possible, so we can go on and hear what
  • 45:40 - 45:51
    the questions are. So we
    start with microphone 4, please.
  • 45:51 - 45:59
    Mic 4: Hello. Thanks for the talk and my
    question is if you got an oracle of the
  • 45:59 - 46:04
    software behaviour which can translate
    every possible input to a correct output,
  • 46:04 - 46:08
    how can this be less complex and
    error-prone than the reference
  • 46:08 - 46:11
    implementation?
  • 46:11 - 46:16
    Peter: Good question. So the first point
    is that this oracle doesn't have to
  • 46:16 - 46:22
    produce the outputs. It only has to check
    that the inputs and the outputs match up.
  • 46:22 - 46:27
    And then the second point is that in
    general it might have to have much of the
  • 46:27 - 46:33
    same complexity, but by writing it to be
clear as opposed to fast, you may
  • 46:33 - 46:38
    adopt a completely different structure,
    right? So the structure of our TCP spec is
  • 46:38 - 46:44
    organised by the behaviour that you see
    for particular transitions. The structure
  • 46:44 - 46:49
    of a real implementation has fast-path
    code and lots of complicated intertwined
  • 46:49 - 46:55
    branching and all manner of excess
    complexity, of implementation complexity,
  • 46:55 - 46:57
    if you will.
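For the fully deterministic case described in the talk, the contrast between the two structures fits in a few lines. The example below is purely illustrative (population count, not TCP): a version written for clarity serves as the reference, a version written with a bit-twiddling trick stands in for the optimised implementation, and random testing compares them.

```python
import random

# "Reference implementation as oracle" for a deterministic function:
# a slow version written for clarity is compared against a fast version
# written for performance, on random inputs.

def popcount_clear(n):
    """Obviously-correct version: count the 1 bits one at a time."""
    count = 0
    while n:
        count += n & 1
        n >>= 1
    return count

def popcount_fast(n):
    """Optimised version: n & (n - 1) clears the lowest set bit,
    so the loop runs once per set bit rather than once per bit."""
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count

random.seed(0)
for _ in range(10000):
    n = random.getrandbits(64)
    assert popcount_fast(n) == popcount_clear(n), n
print("10000 random inputs checked")
```

Here the structures already differ; in a real system the implementation's fast paths and intertwined branching diverge from the reference far more, which is exactly why comparing them mechanically is valuable.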
  • 46:57 - 47:01
    Mic 4: Thanks.
    Herald: So microphone 3, please?
  • 47:01 - 47:07
    Mic 3: I wanted to ask you what you
    thought about programming by manipulating
  • 47:07 - 47:14
    the abstract syntax tree directly so as to
    not allow incompilable programs because of
  • 47:14 - 47:18
    some, like, you're missing a semicolon or
    something like that. What's your thoughts
  • 47:18 - 47:20
    on that?
  • 47:20 - 47:24
    Peter: So that's - in the grand scheme of
    things, I think that's not a very big
  • 47:24 - 47:29
    deal, right? So there's - people have
    worked on structured editors for lordy
  • 47:29 - 47:33
    knows how many years. It's not a big deal
    because it's very easy for a compiler to
  • 47:33 - 47:39
    detect that kind of thing. And even more,
    it's very easy for a compiler of a
  • 47:39 - 47:43
    sensibly designed language to detect type
    errors, even for a very sophisticated type
  • 47:43 - 47:50
    system. So I don't think that - that's not
    - I mean, some people might find it
  • 47:50 - 47:53
    helpful, but I don't think it's getting to
    the heart of the matter.
  • 47:53 - 47:54
    Mic 3: Thanks.
  • 47:54 - 48:00
    Herald: So we've got some questions
    from our signal angels from the IRC.
  • 48:00 - 48:06
    Signal angel: Yes. There's one question.
    It's about the repository for Isabelle and
  • 48:06 - 48:11
HOL proofs. It should be on SourceForge,
    and you said there are no open source
  • 48:11 - 48:15
    repositories. What about them?
  • 48:15 - 48:21
    Peter: I'm not quite sure what the
    question is, really. So there's a
  • 48:21 - 48:28
repository of Isabelle formal proofs -
the Archive of Formal Proofs, as it's
  • 48:28 - 48:35
    called. I didn't really mean to say there
    are no open source repositories and in
  • 48:35 - 48:38
    fact I don't know under what conditions
    most of those proofs are licensed. They
  • 48:38 - 48:43
    probably are open. But there isn't an open
    source community building these things,
  • 48:43 - 48:48
    right? It's still pretty
    much an academic pursuit.
  • 48:48 - 48:50
    Herald: Microphone 2, please.
  • 48:50 - 48:56
    Mic 2: Hello. Thanks again for your talk.
    I just want to add something that you
  • 48:56 - 49:02
    didn't address, and that's that we
    actually need more good studies in
  • 49:02 - 49:08
    software engineering. We often hear a lot
    of experts and also very well-known
  • 49:08 - 49:12
    computer scientists say things like,
    "Well, we just need to do functional
  • 49:12 - 49:18
    programming and OOP is bad and stuff like
    that", which I think there's a lot of
  • 49:18 - 49:24
    truth to it, but we actually need studies
    where we test these kinds of claims that
  • 49:24 - 49:30
    we make, because when you look at other
    fields which it also did compare to, like,
  • 49:30 - 49:34
    medicine, if somebody comes around and
    is well-known and says things like,
  • 49:34 - 49:42
    "homeopathy works", we don't believe them.
    We do trials, we do good trials. And
  • 49:42 - 49:50
    there's a lot of myths out there, like, or
    not well tested things, like "hire the
  • 49:50 - 49:55
    best programmers, they are 100 times
    more productive", "do steady type
  • 49:55 - 50:00
    systems", and stuff like that. And we need
    to verify that this is true, that this
  • 50:00 - 50:02
    helps. And it's...
  • 50:02 - 50:07
    Peter: Okay. So in the grand scheme of
    things, arguably you're right. In the
  • 50:07 - 50:13
    grandest scheme of things, computer
    science is actually a branch of psychology
  • 50:13 - 50:19
    or really sociology. We are trying to
    understand how large groups of people can
  • 50:19 - 50:26
    combine to make things which are insanely
    complicated. Now, for - would it be good
  • 50:26 - 50:31
    if we had, you know, solid evidence that
    programming in this language was better
  • 50:31 - 50:38
    than programming in that language? Well,
    yes, but it's staggeringly difficult and
  • 50:38 - 50:45
    expensive to do honest experiments of that
    nature and they're scarcely ever done,
  • 50:45 - 50:49
    right? You might do little experiments on
    little groups of students but it's really
  • 50:49 - 50:56
    difficult. And then some of these things
    which I'm saying, when one is familiar
  • 50:56 - 51:02
    with the different possible options, are
    just blindingly obvious. I mean, there are
  • 51:02 - 51:09
    reasons why these people are using OCaml
    for their system programming, right? It's
  • 51:09 - 51:17
    not - you know, it's not - "Homeopathy is
    right", but "Homeopathy is wrong" which is
  • 51:17 - 51:21
    the kind of argument being put forward.
  • 51:21 - 51:24
    Herald: Question from
    microphone 5, please.
  • 51:24 - 51:30
Mic 5: So were you using ECC
    memory for your testing - up here.
  • 51:30 - 51:34
    Peter: When you say up here,
    there's a bit of ambiguity.
  • 51:34 - 51:35
    Mic 5: Here.
  • 51:35 - 51:38
    Peter: Thank you. Say again?
  • 51:38 - 51:43
Mic 5: So were you using ECC memory
    for the testing you showed about the
  • 51:43 - 51:51
results of the multithreaded tests due
    to memory barriers and memory reorderings?
  • 51:51 - 51:55
    Peter: Well...
  • 51:55 - 52:00
    Mic 5: Okay, but even if you were or you
    were not, the point I want to make is that
  • 52:00 - 52:06
    you can expect about 1 bit flip each
    minute in a modern computer system that
  • 52:06 - 52:14
    may completely change what your software
    is doing, so perhaps we also have to look
  • 52:14 - 52:19
    in ways how we can work if something goes
    wrong at the very low level so that we
  • 52:19 - 52:25
    kind of have a check against our
    specification on a more higher level of
  • 52:25 - 52:32
    our software. So it is valuable to do good
    engineering on the low levels, but still I
  • 52:32 - 52:38
    think we will not solve the problems of
    computation and computer engineering just
  • 52:38 - 52:43
    by proving things in the mathematical
    domain because computers are physical
  • 52:43 - 52:48
    entities. They have errors and so on and
    we really have to deal with them as well.
  • 52:48 - 52:54
    Peter: So it's certainly true that there
    are such random errors. In fact, I would
  • 52:54 - 52:59
    put the point that I would argue that you
    have to be able to model the physics well
  • 52:59 - 53:04
    enough and you have to be able to model
    the statistics of those errors well
  • 53:04 - 53:08
    enough, so that's a different kind of
    mathematics. And it's certainly valuable
  • 53:08 - 53:13
    and necessary. But if you look at the
    statistic you just quoted, is that the
  • 53:13 - 53:22
    main cause of why systems go wrong? Except
    possibly for space hardware, no. So
  • 53:22 - 53:26
    important, yes. The most important thing
    that we should be paying attention to? I
  • 53:26 - 53:30
    don't really think so.
  • 53:30 - 53:33
    Herald: Microphone 6, please?
  • 53:33 - 53:41
    Mic 6: Hi. I really think that what you're
    proposing to specify and verify more key
  • 53:41 - 53:47
    components is - would be a valuable
    addition, but how do you make sure that
  • 53:47 - 53:52
    your specification actually matches the
    behaviour that you want to do or to have?
  • 53:52 - 53:58
Peter: So as I described, as partial
    validation of those specifications, we do
  • 53:58 - 54:03
    a lot of testing against the
    implementations, against a range of
  • 54:03 - 54:07
    existing implementations. That's one
    source of validation. Another source of
  • 54:07 - 54:12
    validation is that you talk to the
    architects and the designers. You want the
  • 54:12 - 54:17
    internal structure to match their intent.
    You want it to be comprehensible to them.
  • 54:17 - 54:22
    Another kind of validation that we do is
    prove properties about them. So we proved
  • 54:22 - 54:29
that you can correctly compile from the C11
concurrency model, a mathematical model of it,
  • 54:29 - 54:35
to the IBM Power concurrency model. And that kind
of proof gives you a higher level of
  • 54:35 - 54:39
    assurance in both. So none of this is
giving you a total guarantee. We
  • 54:39 - 54:44
    certainly don't claim a total guarantee.
    But it gives you pretty good levels of
  • 54:44 - 54:48
    assurance by normal standards.
  • 54:48 - 54:50
    Herald: Mic 4?
  • 54:50 - 54:56
    Mic 4: Yes. Thanks again. You proposed
    random tests, and with my expertise, it's
  • 54:56 - 55:03
    very annoying if you have tests where the
    outcome is sometimes a failure and you
  • 55:03 - 55:10
    want to have reproducible tests to fix a
bug. So how do we bring these aspects to
  • 55:10 - 55:18
    test more variety in data and to have it
    reproducible together?
  • 55:18 - 55:25
    Peter: Well, if - as I was mentioning, for
    the TCP thing, one of the - so the problem
  • 55:25 - 55:29
with reproducibility is exactly this
    internal non-determinism, right? The
  • 55:29 - 55:33
    system makes this scheduling choice or
    what have you and you can't see what it
  • 55:33 - 55:40
    is, or the checker can't see what it is.
    So one way to do that is to really design
  • 55:40 - 55:45
    the protocols in a different way to expose
    that non-determinism. The other fact about
  • 55:45 - 55:53
    this kind of general purpose specification
    as test oracle idea is that in some sense,
  • 55:53 - 55:59
    it doesn't have to be reproducible. Right?
    The specification is giving a yes-no
  • 55:59 - 56:06
    answer for an arbitrary test. And that
    means that you can use arbitrary tests.
  • 56:06 - 56:10
    Figuring out the root causes of the
    differences afterwards may be tricky, but
  • 56:10 - 56:13
    you can use them for assurance.
  • 56:13 - 56:18
Herald: So we've now got time for two
    more questions. Microphone 1, please?
  • 56:18 - 56:23
    Mic 1: Hiya. Thanks for a great talk. So
    what you've described seems to be a middle
  • 56:23 - 56:28
    ground between a full-on, formal
    verification and more practical testing,
  • 56:28 - 56:34
    like, in between. So where do you see in
    the future where formal verifications will
  • 56:34 - 56:40
    go? Will they converge to the middle or
will it still be there just to
  • 56:40 - 56:44
    verify something that's more
    well-specified?
  • 56:44 - 56:50
    Peter: So the progress of sort of the
    whole subject of formal verification in
  • 56:50 - 56:57
    general, if you look at that over the last
    10 years or so, it's been very - it's been
  • 56:57 - 57:01
    really amazing compared with what we could
    do in the 80s and 90s and early 2000s,
  • 57:01 - 57:06
    right? So the scope of things, the scale
    of things which are - for which it is
  • 57:06 - 57:11
    becoming feasible to do verification is
    getting bigger. And I think that will
  • 57:11 - 57:16
    continue. You know, I don't know when you
might see a completely verified Tor
  • 57:16 - 57:24
    client, let's say, but that's not
    inconceivable now. And 20 years ago, that
  • 57:24 - 57:29
    would have been completely inconceivable.
    So that is expanding and at the same time,
  • 57:29 - 57:37
I hope we see more of this interface,
test-oracle specification - and these -
  • 57:37 - 57:41
    when you're doing formal verification of a
    piece of the system, you have to have
  • 57:41 - 57:48
these interfaces already defined, all
    right? So these two fit very well together.
  • 57:48 - 57:51
    Herald: So the last question
    from microphone 2, please.
  • 57:51 - 57:56
    Mic 2: Hi. You mentioned that you often
    find bugs in hardware. Is there any effort
  • 57:56 - 58:00
    to verify the transistors on chips
    themselves as well?
  • 58:00 - 58:03
    Peter: So there's a whole world of
    hardware verification. That wasn't my
  • 58:03 - 58:11
    topic today. And most of the big processor
    vendors have teams working on this.
  • 58:11 - 58:14
    Unsurprisingly, if you remember your
    history, many of them have teams focusing
  • 58:14 - 58:21
    on the floating-point behaviour of their
    processors. And they're doing - they're
  • 58:21 - 58:27
    also doing by the standards of ten years
    ago, pretty spectacularly well. So there
  • 58:27 - 58:33
    are companies that have the whole of some
    execution unit formally verified. There's
  • 58:33 - 58:37
    been a lot of work over the years on
    verification of cache protocols and such
  • 58:37 - 58:43
    like. Right? There's a lot of work on not
    verifying the hardware as a whole, but
  • 58:43 - 58:48
    verifying that, you know, the RTL-level
    description of the hardware matches some
  • 58:48 - 58:54
    lower level description. So
    there is a lot of that going on.
  • 58:54 - 58:58
    Herald: Thank you very much
    again for this great talk.
  • 58:58 - 59:00
    Give him another applause.
    Thank you, Peter.
  • 59:00 - 59:02
    Peter: Thank you.
  • 59:02 - 59:04
    applause
  • 59:04 - 59:15
    subtitles created by c3subtitles.de
    Join, and help us!
Title:
Peter Sewell: Why are computers so @#!*, and what can we do about it?
Video Language:
English
Duration:
59:15
