< Return to Video

36C3 - Getting software right with properties, generated tests, and proofs

  • 0:00 - 0:19
    36C3 preroll music
  • 0:19 - 0:26
    Herald: Our next talk is held by Mike
    Sperber, and he is already very ready for
  • 0:26 - 0:31
    that. He's head of a software company in
    Tübingen in Germany and he's going to talk
  • 0:31 - 0:37
    about "Getting software right with
    properties, generator tests and proofs".
  • 0:37 - 0:42
    And the main thing here is functional
    programming. One tiny thing you might not
  • 0:42 - 0:51
    know about him is that 1986 he won a
    federal competition on IT and so give him
  • 0:51 - 0:54
    a warm applause for that and also for his
    talk.
  • 0:54 - 0:58
    applause
  • 0:58 - 1:05
    Mike Sperber: Thank you very much. So is
    anybody actively using the induction loop
  • 1:05 - 1:09
    feature in the first couple of rows? Cuz
    I know somebody who would like to know.
  • 1:09 - 1:17
    Not right now. Okay. Anyway, so let me get
    one shameless plug of advertising out of
  • 1:17 - 1:22
    the way. If you find the contents of this
    talk interesting, we're running a
  • 1:22 - 1:26
    developer conference in Berlin in February
    called Bob, which is very friendly and
  • 1:26 - 1:33
    very nice, very tiny compared to this one.
    And we'd love to see you there. Another
  • 1:33 - 1:40
    thing, this is an introductory talk. So if
    you were expecting the latest developments
  • 1:40 - 1:44
    on proof tactic, in fact, if you know what
    proof tactic is, then all you might get
  • 1:44 - 1:49
    from this talk is sort of mild amusement.
    And I won't be mad at you at all if you go
  • 1:49 - 1:56
    for one of the more exciting talks. Ok?
    So. Or leave at any time. That's perfectly
  • 1:56 - 2:02
    fine, if this material is not for you.
    Speaking of introductory talks, here's a
  • 2:02 - 2:06
    piece of code written in the language that
    I will use for this talk and it's called
  • 2:06 - 2:12
    Idris. Who has written an Idris program
    before? Very good. Ok. Oh, there's one
  • 2:12 - 2:18
    person back there. That means if any part
    of this program, as soon as I'm done
  • 2:18 - 2:22
    explaining, is not clear to you, it's also
    not clear to two or three hundred other
  • 2:22 - 2:27
    people in this room. And I would love to
    have your help. Interrupt me, ask a
  • 2:27 - 2:30
    question anytime in the talk if there's
    anything here not clear. It's going to
  • 2:30 - 2:35
    get, even though it's meant to be
    introductory, will get quite technical at
  • 2:35 - 2:41
    times. So let me try explaining this one.
    So this is a classic example in functional
  • 2:41 - 2:46
    programming that I use often in my talks,
    about animals on the Texas Highway. And if
  • 2:46 - 2:50
    you can see there, the central definition
    says data Animal. That's the data
  • 2:50 - 2:54
    definition of animals. And in this
    particular version of the Texas Highway,
  • 2:54 - 2:58
    there's two different kinds of animals.
    There is Armadillo, it's where it says
  • 2:58 - 3:04
    Dillo there. And there's Parrots, for some
    reason, on the Texas Highway. Does that
  • 3:04 - 3:07
    make sense? Two different kinds of
    animals. And you see that definition.
  • 3:07 - 3:14
    Yeah, nod, that greatly helps me. And if
    you see those two definitions for Dillo
  • 3:14 - 3:19
    and Parrot, you can see, while the arrows
    are kind of funny, but you can see that
  • 3:19 - 3:24
    Dillo and Parrots have two properties
    each, and it says their Liveness. That's
  • 3:24 - 3:28
    one of the properties of an armadillo. And
    up there at the very top, you see the
  • 3:28 - 3:32
    definition of Liveness, it says Liveness
    means dead or alive. It's an armadillo,
  • 3:32 - 3:37
    can be dead or alive on the Texas Highway.
    And there's also the Weight. And well, you
  • 3:37 - 3:42
    see, this colon thing is the type
    signature for the constructor for
  • 3:42 - 3:46
    armadillos. So it says there's a liveness
    going on, there's a weight going on, and
  • 3:46 - 3:50
    then it constructs an animal. And for a
    Parrot, there's a string. Every parrot
  • 3:50 - 3:54
    speaks, right? And so it's the sentence
    the Parrot says, and also the Weight. And
  • 3:54 - 3:59
    it also produces an animal. And, up there,
    you can see the definition of Weight is
  • 3:59 - 4:02
    for simplicity's sake, I'm saying that
    Weight is a type. So that's kind of
  • 4:02 - 4:06
    unusual for Idris, but you don't have to
    worry about it. But you can see there,
  • 4:06 - 4:11
    Weight is just the same thing as an
    integer. And if you look down there, where
  • 4:11 - 4:17
    it says a1, a2 and a3, that has three
    examples for animals. So it says a1:
  • 4:17 - 4:21
    Animal, just to say that a1 is an animal.
    So, Idris is a language that always has
  • 4:21 - 4:27
    type declarations. And it says a1 is Dillo
    Alive 10. And that means it's an
  • 4:27 - 4:31
    armadillo, it's still alive, and it
    weighs, let's say, ten kilograms. The
  • 4:31 - 4:36
    second one is dead, a little bit heavier,
    weighs twelve kilograms. And the third
  • 4:36 - 4:41
    animal is a parrot that knows, well, it's
    a pirate's parrot, obviously, and maybe
  • 4:41 - 4:48
    weighs three kilograms. Ok, so far? Ok. So
    if you have any question about any of
  • 4:48 - 4:52
    that, then please ask away. So, what
    happens to animals on the Texas Highway
  • 4:52 - 4:57
    is, you know, people drive cars, they run
    them over. So there's a function down
  • 4:57 - 5:01
    here, and, well, we're doing functional
    programing, shouldn't worry you at all.
  • 5:01 - 5:05
    But what's important here is that it says
    there is an animal going in, an animal
  • 5:05 - 5:10
    going out. And really what this means is
    that this animal object up there is not
  • 5:10 - 5:16
    really the animal. It is the state of the
    animal at a given time. So, runOverAnimal.
  • 5:16 - 5:20
    you can see the type signature that says
    an animal goes in, an animal goes out. And
  • 5:20 - 5:24
    what it really means is, the state of the
    animal goes in before it gets run over and
  • 5:24 - 5:30
    the state of the animal after it gets run
    over comes out. And then while we know
  • 5:30 - 5:34
    there's two different kinds of animals.
    And that means that for the definition of
  • 5:34 - 5:38
    runOverAnimal, we need what's called
    equations. There's two different
  • 5:38 - 5:42
    equations. And the first equation says
    what happens to armadillos when they get
  • 5:42 - 5:46
    run over. So an armadillo has a liveness
    and a weight. Here's something going on
  • 5:46 - 5:51
    called pattern matching. And the second
    equation says when there's a parrot going
  • 5:51 - 5:55
    on it has a sentence and a weight, and on
    the right hand side, you can see, well,
  • 5:55 - 5:59
    when an armadillo gets run over. Well, all
    that means is, the liveness sort of turns
  • 5:59 - 6:03
    to Dead. We're constructing a new
    armadillo object and it's dead and it has
  • 6:03 - 6:09
    the same weight as before. And the
    function, the equation at the bottom says,
  • 6:09 - 6:15
    well, when we run over a parrot, it turns
    really, really quiet. Ok? So, classic
  • 6:15 - 6:21
    example. Ok so far. We're going to return
    to that example at the very end. Right now
  • 6:21 - 6:25
    it's just to illustrate the language that
    we're doing things in and we're going to
  • 6:25 - 6:32
    do a lot of things without complicated
    programs. So, well. So, I'm going to jump
  • 6:32 - 6:36
    around a little bit. So, one thing. So,
    just the other day, two weeks ago, I was
  • 6:36 - 6:40
    teaching a course on architecture and
    somebody said: Well, there's this problem.
  • 6:40 - 6:44
    I'm building a domain model. I'm putting
    the domain model in a database. And, you
  • 6:44 - 6:47
    know, customer comes in, has new
    requirement or somebody comes in, has new
  • 6:47 - 6:51
    requirements. And that always ends the
    same way. I put a new call in the database
  • 6:51 - 6:56
    and, you know, seven, eight, nine, it just
    goes on and on. As the software gets older
  • 6:56 - 7:01
    and older and older, more columns that
    create the old ones seem a little stale.
  • 7:01 - 7:08
    And so, yes, well, how can we build models
    that are flexible? And so, here's
  • 7:08 - 7:13
    something completely different, you might
    think. So, here's sort of the key to that,
  • 7:13 - 7:17
    to building flexible models. Does anybody
    recognize this? Does anybody associate a
  • 7:17 - 7:23
    word with this? laughter Very good. So,
    you might remember, depending on what
  • 7:23 - 7:27
    state you went to school in, you might
    remember that this is a property called
  • 7:27 - 7:32
    associativity. Right? And it means that we
    can associate either the A and the B first
  • 7:32 - 7:38
    with the parentheses or the B and the C.
    So, and this is, if you take away one
  • 7:38 - 7:43
    thing from this talk, it's associativity.
    Knowing what that is is one of the most
  • 7:43 - 7:47
    useful things in software development. So,
    of course it's just a generic equation, we
  • 7:47 - 7:51
    really need to be more specific, namely
    that we're dealing with numbers and
  • 7:51 - 7:55
    addition. And you might know that it's not
    just addition that's associative. Also,
  • 7:55 - 7:59
    multiplication, for example, is
    associative. So here's a little mathy
  • 7:59 - 8:04
    stuff there at the beginning. So, you see
    that upside down A. That says "for all".
  • 8:04 - 8:09
    We just say for all. What that means is
    "for all A, B and C". And then, this funny
  • 8:09 - 8:14
    epsilon-shape letter kind of thing, it
    means "element of". And then that funky N
  • 8:14 - 8:17
    means the natural numbers. So all the
    numbers from zero, one, two, three, the
  • 8:17 - 8:23
    whole numbers from zero on up. So, what
    that means is, for all natural numbers A,
  • 8:23 - 8:30
    B and C, the associative property holds
    when you add them up. But while it says
  • 8:30 - 8:34
    numbers in addition, it doesn't just hold
    for numbers and, addition, in fact,
  • 8:34 - 8:38
    associativity is everywhere around us.
    Specifically, it's everywhere around us
  • 8:38 - 8:42
    when we program. So here's another
    example. When you're dealing with lists
  • 8:42 - 8:45
    and that funky, the two pluses that you
    see there, they are just list
  • 8:45 - 8:50
    concatenation. So you concatenate two
    lists and well, of course you can
  • 8:50 - 8:56
    concatenate three lists by just using that
    double plus in any order. And that's also
  • 8:56 - 9:01
    associative. So, it doesn't matter if you
    first concatenate the B and the C and then
  • 9:01 - 9:06
    tack the A onto the front, or if you
    concatenate the A and B and tack the C on
  • 9:06 - 9:10
    at the end. Doesn't matter, you always get
    the same result. So lists and
  • 9:10 - 9:17
    concatenation also have this associative
    property. And here's something that I
  • 9:17 - 9:22
    always find very, very enlightening is
    that you can construct images that way.
  • 9:22 - 9:27
    Well, you don't see it here. So here's an
    image. Well, it's from a cool researcher
  • 9:27 - 9:32
    of mine and functional programing, Brent
    Yorgey, and he has a great library out
  • 9:32 - 9:37
    called diagrams, for constructing diagrams
    out of parts. And so this really is what
  • 9:37 - 9:41
    associativity is about. It's about
    operators that construct things out of
  • 9:41 - 9:45
    parts. And so, as you can see here, well
    there's different shapes here, there's
  • 9:45 - 9:48
    sort of the black rectangles, there's a
    different rectangle set, that denote the
  • 9:48 - 9:52
    towers of Hanoi. We're not really going to
    deal with the towers of Hanoi here,
  • 9:52 - 9:58
    really. The important thing that the image
    consists of several parts. And well, in
  • 9:58 - 10:01
    normal or sort of in classic object-
    oriented programming, when you do
  • 10:01 - 10:06
    graphics, you have a canvas and you might
    draw pixels on that canvas. You know,
  • 10:06 - 10:11
    might be square shaped or a circle shaped
    canvas pixels. But what we're doing here
  • 10:11 - 10:18
    is, we are treating an image as a data
    type and the definition is not important.
  • 10:18 - 10:23
    What is important is that there are a
    couple of functions that construct sort of
  • 10:23 - 10:29
    simple images. So here's a function that
    you might imagine called star and it
  • 10:29 - 10:34
    constructs stars. And well, you can see up
    there there's a type declaration and it
  • 10:34 - 10:38
    says, well, the star function, it accepts
    an integer, it accepts a Mode, whatever
  • 10:38 - 10:42
    that is, accepts a Color and it produces
    an Image. And we can call that star
  • 10:42 - 10:49
    function with the arguments 200 and Solid
    and Gold. So Mode is Solid or Outline. And
  • 10:49 - 10:54
    then we have a Color and we get Image and
    that image is an object. Not particularly
  • 10:54 - 11:00
    exciting. But while we might have another
    function called Polygon, Polygon takes two
  • 11:00 - 11:06
    integers that denote the size of the
    polygon and the number of vertices, and
  • 11:06 - 11:10
    also whether it's an outline or whether
    it's solid and a color. And for example,
  • 11:10 - 11:17
    if we call it with 180, again, that's the
    size and 5 we get a five corner polygon
  • 11:17 - 11:23
    and we get that as an outline and it's in
    red. Now, the idea here is that we can
  • 11:23 - 11:27
    combine. Just as we can combine two
    numbers or we can combine two lists, we
  • 11:27 - 11:32
    can combine two images. Maybe the most
    intuitive way of combining two images is
  • 11:32 - 11:37
    just sticking them beside each other. So
    there's a function there called beside.
  • 11:37 - 11:41
    And it takes an image and it takes another
    image and produces an image. Right. And
  • 11:41 - 11:45
    this is exactly what we're thinking about
    when we talk about associativity. We're
  • 11:45 - 11:51
    talking about a sort of a binary operator
    that produces the same thing that went in.
  • 11:51 - 11:56
    And so, for example, we could stick those
    two images next to each other. We could
  • 11:56 - 12:01
    also imagine an operator called above that
    just puts one image above the other image.
  • 12:01 - 12:05
    And we can combine these two things. Here
    it really is important that the same thing
  • 12:05 - 12:10
    comes out so that it's image goes in,
    another image goes in, an image goes out.
  • 12:10 - 12:15
    So we could again call above on the result
    of beside and make arrangements. So here's
  • 12:15 - 12:19
    a tiling arrangement for your bathroom or
    something like that. Now, beside and above
  • 12:19 - 12:24
    are are two possible operators and you
    might already think about associativity,
  • 12:24 - 12:29
    but really the more fundamental one is
    overlay. You put two images on top of each
  • 12:29 - 12:33
    other. And so again, overlay has the right
    type. An image goes in, another image goes
  • 12:33 - 12:38
    in, and an image comes out. And if we take
    the gold star and the pentagon and put
  • 12:38 - 12:46
    them on top of each other, then it looks
    like this. And we can then formulate an
  • 12:46 - 12:52
    associativity property. It might not quite
    look the same because I wrote overlay in
  • 12:52 - 12:56
    front rather than between the operators.
    We could also write it between. But just
  • 12:56 - 13:00
    to show you that it's the same idea. So,
    it doesn't really matter if we first take
  • 13:00 - 13:05
    two images, A and B and superimpose those
    two and then put those two on top of C or
  • 13:05 - 13:13
    if we do it in another order. Does that
    make sense so far? Ok. No? Do you have a
  • 13:13 - 13:16
    question?
    Anonymous: Mumbling
  • 13:16 - 13:21
    Mike: Yeah, so ahh, good point, good
    point. So this implies that there must be
  • 13:21 - 13:25
    some kind of, that there's probably some
    notion of transparency involved. Yes. Yes,
  • 13:25 - 13:29
    there is. But then you have associativity.
    And really what it means. Very good
  • 13:29 - 13:38
    question. So, if you think of this image
    in terms of the color at certain
  • 13:38 - 13:44
    coordinates, Right, Well, you need to
    think about how to combine those two
  • 13:44 - 13:49
    colors that are in the constituent images.
    And you can imagine that there also has to
  • 13:49 - 13:53
    be a combination operation for the colour.
    And that also needs to be associative as a
  • 13:53 - 13:59
    prerequisite for the for the overlay
    operation to be associative. Does that
  • 13:59 - 14:08
    make sense now? Thank you. Good question.
    Great question. So anyway, so since this
  • 14:08 - 14:12
    associativity property is something that
    is not just restricted to numbers, as we
  • 14:12 - 14:15
    may have learned in school, it really
    makes sense to get. And that means that
  • 14:15 - 14:19
    when we talk about associativity, we
    always have to name two things. We have to
  • 14:19 - 14:23
    say what set we're operating on and what
    the operation is. And the combination
  • 14:23 - 14:28
    those two things has a name in mathematics
    and it's not the best name, but it's
  • 14:28 - 14:34
    called a semigroup. Right. And, but, you
    know, if you drop it in certain circles,
  • 14:34 - 14:39
    they'll think that you're an expert on
    mathematics, you might try that. So, just
  • 14:39 - 14:43
    to go over that: So, you have some subset
    S, and that S might be Image, it might be
  • 14:43 - 14:47
    the natural numbers or something like
    that. And we have an operation that I'm
  • 14:47 - 14:54
    just gonna call circle here, then take any
    a, b and c from that set S. We can use
  • 14:54 - 14:58
    circle as an operator and we have that
    associativity property and for that circle
  • 14:58 - 15:02
    you can put in overlay, you can put in
    beside, you can put in above, you can put
  • 15:02 - 15:09
    in + you can put in times or you can put
    in the list concatenation operator, the
  • 15:09 - 15:16
    ++. Okay? And that associativity is great.
    It's really my favorite property because
  • 15:16 - 15:20
    it means when we have a whole lot of
    things that we combine, we can
  • 15:20 - 15:25
    parenthesise in any way we want. We will
    get the same result no matter which way we
  • 15:25 - 15:29
    parenthesise them. And that really means,
    we can leave out the parentheses when we
  • 15:29 - 15:33
    write an expression that involves only the
    circle operator if it's associative, if we
  • 15:33 - 15:37
    can just leave out all the parentheses
    because the parentheses don't matter. And
  • 15:37 - 15:42
    that makes it well, that makes it
    instantly easier to read, I think. Also it
  • 15:42 - 15:47
    has practical uses. So if you do big data
    processing associativity means that if you
  • 15:47 - 15:52
    have large datasets that span several
    machines or several hard drives or several
  • 15:52 - 15:59
    data sources, and you're combining them
    and you have an associative combination
  • 15:59 - 16:03
    operation, it just means you can rearrange
    that combination operation according to
  • 16:03 - 16:08
    the load in your compute cluster. And that
    makes it a very useful property when
  • 16:08 - 16:12
    you're doing big data processing in sort
    of MapReduce based frameworks. But, I
  • 16:12 - 16:16
    mean, that's a practical application, but
    I think it's much more useful,
  • 16:16 - 16:21
    associativity is much more useful when you
    use it for designing your domain model.
  • 16:21 - 16:26
    And I talked in the beginning how, well,
    you want to avoid always adding more
  • 16:26 - 16:30
    database columns. And one way of doing
    that is to view your domain model, not as
  • 16:30 - 16:33
    something that has more and more
    properties, but your domain model as
  • 16:33 - 16:37
    building blocks that you combine into a
    larger building blocks the same way that
  • 16:37 - 16:43
    we combine images from simpler images. So
    here's one of the great papers from
  • 16:43 - 16:46
    functional programing, one of my two or
    three favorites from Brent Yorgey. And
  • 16:46 - 16:50
    it's called "Something Something: Theme
    and Variations". And you can see that it
  • 16:50 - 16:55
    is about images. And these images get
    superimposed with an operation that is
  • 16:55 - 17:01
    just like overlay and that is, that title
    is eminently googleable. Now, it has a
  • 17:01 - 17:05
    funny word there. It says, it doesn't say
    semigroup, could say"Semigroup: Theme and
  • 17:05 - 17:09
    Variation", it says "Monoid: Theme and
    Variation". And a monoid, well, it's also
  • 17:09 - 17:12
    not something, even though it sounds kind
    of fancy, it's actually not much more
  • 17:12 - 17:16
    complicated than a semigroup. It's a
    semigroup. And also the semigroup has a
  • 17:16 - 17:20
    special element called the neutral
    element. And whenever we combine something
  • 17:20 - 17:24
    with a neutral element, it doesn't matter
    if we do it in front or at the back, we
  • 17:24 - 17:28
    get the same thing back. So, of course,
    the neutral element with respect to
  • 17:28 - 17:34
    numbers, in addition, would be zero. The
    neutral element with respect to lists and
  • 17:34 - 17:39
    and concatenation would be the empty list.
    I always hear several voices. That's
  • 17:39 - 17:43
    wonderful. Thank you. And the same thing
    for the overlay and beside and above, you
  • 17:43 - 17:46
    can imagine that you have just an empty
    image that has only, that consists only of
  • 17:46 - 17:52
    transparency, that can work as the neutral
    element. So all of these things that I
  • 17:52 - 17:55
    showed you that are associative, they're
    not just associative. They're not just
  • 17:55 - 18:01
    semigroups, they're also monoids. And so,
    as I said, as long as you remember
  • 18:01 - 18:05
    associativity, that's the important thing.
    But often you also find a monoid, and
  • 18:05 - 18:08
    monoids in the wild they're just
    everywhere. We've seen them for numbers
  • 18:08 - 18:13
    and lists and images, music forms a
    natural monoid. You can you can describe
  • 18:13 - 18:20
    musical structure with monoid operations.
    You can treat animations, the time axis.
  • 18:20 - 18:25
    You can define monoidal combination of
    animations. A famous example in functional
  • 18:25 - 18:29
    programming is with financial contracts.
    If you were here last year for a talk of
  • 18:29 - 18:32
    mine, we talked about semiconductor-
    fabrication routes, which sounds very
  • 18:32 - 18:37
    concrete, but also they form a monoid. The
    properties themselves that we'll see for a
  • 18:37 - 18:41
    monoid are all kinds of things. They're
    everywhere around you. And these are
  • 18:41 - 18:45
    really the key towards making flexible
    domain models because in almost any domain
  • 18:45 - 18:49
    model you can find a monoid just by
    looking for building blocks and for ways
  • 18:49 - 18:54
    of combining those building blocks into a
    larger building blocks. So let me get
  • 18:54 - 19:00
    back. So I said, well, you can use
    associativity or you can use this monoid
  • 19:00 - 19:04
    thing to guide your design. And I haven't
    really made that concrete yet. And so I
  • 19:04 - 19:09
    stole a couple of pictures from Brent's
    paper. So you remember the beside and the
  • 19:09 - 19:14
    above operations. And those are fine for
    arranging things sort of in the vertical
  • 19:14 - 19:20
    and the horizontal axis. The way that they
    work is, they make, they put a bounding
  • 19:20 - 19:24
    box around every picture and then they
    arrange the bounding boxes either beside
  • 19:24 - 19:28
    each other or above each other. So it's a
    slightly more involved thought. And that
  • 19:28 - 19:32
    works great when you're, when, you know,
    your picture is, happens to be a square
  • 19:32 - 19:37
    that's aligned with the axes. It doesn't
    work so well if your picture is rotated,
  • 19:37 - 19:40
    right. Because the bounding box, the
    bounding box then is too big. And if you
  • 19:40 - 19:44
    want to attach anything about, just about
    in any direction, then there's going to be
  • 19:44 - 19:50
    a gap in your picture. And so beside and
    above are not particularly good operations
  • 19:50 - 19:55
    as the basis for an image library. The
    overlay operation is much better. But that
  • 19:55 - 19:59
    leaves open the question how you can
    arrange pictures, several pictures so that
  • 19:59 - 20:05
    they are beside each other or that they
    just touch. And Brent came up with this
  • 20:05 - 20:12
    idea of an envelope, technical idea. So
    the idea is that, well, if you give me, so
  • 20:12 - 20:18
    the red dot there, that's the origin. If
    you give me a vector starting at the
  • 20:18 - 20:23
    origin, I will tell you how far you have
    to go along that vector so that I can draw
  • 20:23 - 20:27
    that blue perpendicular line that's just
    outside the shape. And that's called an
  • 20:27 - 20:34
    envelope. And envelopes are wonderful. So
    if you ship each picture not just with
  • 20:34 - 20:38
    sort of the visuals that you see, but also
    with a function that describes the
  • 20:38 - 20:44
    envelope, then you can use that envelope
    to arrange pictures both in the horizontal
  • 20:44 - 20:48
    and the vertical, but also in the diagonal
    by just drawing vectors so that they
  • 20:48 - 20:53
    touch. So, that's a slightly more
    complicated idea. Does it make sense? And
  • 20:53 - 20:57
    Brent goes through the motions of using
    that inspiration from the monoid that he
  • 20:57 - 21:03
    is getting. He's saying "Everything must
    be a monoid! Absolutely.", and uses that
  • 21:03 - 21:06
    as a guiding principle through the
    library. So I'm not going to go into
  • 21:06 - 21:11
    technical detail on how that works, but
    it's a very pleasing paper to read on
  • 21:11 - 21:18
    that. And it results in a beautiful
    library that's great fun to use. So that
  • 21:18 - 21:22
    means, though, that you also have to find
    a monoidal combination operation for the
  • 21:22 - 21:26
    envelopes. You can't just, we've already
    seen how we can combine the pictures
  • 21:26 - 21:29
    themselves, but we also need to combine
    the envelopes. And fortunately, that's
  • 21:29 - 21:33
    pretty easy. If somebody sets a vector in
    a certain direction, then that envelope is
  • 21:33 - 21:37
    just a maximum. Those two pictures, right,
    if you combine that ellipse and that
  • 21:37 - 21:41
    square, you can see that I'm just going to
    have to go to the maximum of those two
  • 21:41 - 21:47
    numbers in order to just be outside the
    composite shape that that comes up
  • 21:47 - 21:53
    superimposing those two things. So that's
    great. Now, I sort of introduced these
  • 21:53 - 21:57
    properties as a mathematical thing, right.
    I said, well, there's this fancy, fancy
  • 21:57 - 22:03
    upside down operator says for all images
    and we might say for all images. Now, we
  • 22:03 - 22:09
    can also formulate these properties as
    code. And that's really where additional
  • 22:09 - 22:12
    magic is. So, for example, the
    associativity property, well, there's not
  • 22:12 - 22:16
    much of a difference except that the
    image1 and image2, they are now in
  • 22:16 - 22:20
    typewriter font. So we could put those in
    the program. But there's still that
  • 22:20 - 22:24
    mathematical stuff on top. But in a
    functional language, in a lot of other
  • 22:24 - 22:28
    languages too by now, we could also put
    the top line and translate that into code.
  • 22:28 - 22:32
    And it might look like this. So that's
    what it looks like in Idris. So, it's not
  • 22:32 - 22:37
    quite the same, but maybe we recognize the
    structure. So, we say, well, there's a
  • 22:37 - 22:41
    property called and the property is just
    called overlayAssociative. So we give it a
  • 22:41 - 22:48
    name. So, Idris is an ASCII language,
    still so, primarily. So, we say just
  • 22:48 - 22:53
    forAll there instead of the upside down
    all. And then it says arbTriple arbImage
  • 22:53 - 22:59
    arbImage arbImage. And that means for all
    arbitrary triples of arbitrary images and
  • 22:59 - 23:02
    other arbitrary image and another
    arbitrary image, so, triples, three
  • 23:02 - 23:08
    things. And we're going to call those
    three images image1, image2 and image3.
  • 23:08 - 23:15
    That funky backslash there, that's a
    Lambda in Idris. And then the overlay prop
  • 23:15 - 23:21
    means that while, if we overlay one way
    and we overlay another way, according to
  • 23:21 - 23:26
    associativity, we get the same result. Do
    you recognize that structure? Right. That
  • 23:26 - 23:31
    it's the same thing. So that we're writing
    structurally the same thing that we wrote
  • 23:31 - 23:37
    in mathematical notation. Now as a piece
    of code. And now the great thing is once
  • 23:37 - 23:41
    we've written it as a piece of code, we
    can manipulate it in a program.
  • 23:41 - 23:45
    So, one way, there is different ways
    of manipulating it. But one of the most
  • 23:45 - 23:48
    useful ones is, again by another great
    researcher in functional programming,
  • 23:48 - 23:52
    John Hughes, came up with something called
    QuickCheck. So if there's another thing
  • 23:52 - 23:56
    you take away from this talk is: google
    QuickCheck. And whatever language you use,
  • 23:56 - 24:01
    it doesn't have to be Idris. In fact, I
    had to hack together a QuickCheck for this
  • 24:01 - 24:04
    talk, but basically any other language is
    going to have a QuickCheck, whether that
  • 24:04 - 24:09
    language be a functional language or
    whether it's Java or Python or R or
  • 24:09 - 24:15
    something like that. You can always get a
    QuickCheck for that. And I'm going to try
  • 24:15 - 24:23
    and demonstrate this QuickCheck thing not
    by thinking about the design so much, but
  • 24:23 - 24:28
    by demonstrating a property of something
    that's very error prone. So, here's this
  • 24:28 - 24:35
    idea, we want to have a representation for
    sets of natural numbers. And we're going
  • 24:35 - 24:41
    to represent those sets of natural numbers
    by a list of intervals. So, by a list of
  • 24:41 - 24:45
    ranges, if you will, between two numbers.
    Now, I'll try to explain that. So, up
  • 24:45 - 24:50
    there at the top, it has a type
    definition. It says, ISet, interval set,
  • 24:50 - 24:57
    is a type. And that type is defined to be
    just a synonym for a list of pairs of
  • 24:57 - 25:00
    natural numbers. That's what those round
    parentheses with a comma in the middle
  • 25:00 - 25:05
    mean. OK. And just to see what that means
    is, there's a function there. I haven't,
  • 25:05 - 25:10
    I've lighted the definition, but what's
    important about it is its type signature.
  • 25:10 - 25:16
    It takes an interval set and it produces a
    list of all of the members of that set.
  • 25:16 - 25:21
    And you can see sort of a demo thing here
    that I typed in before the talk. So, if I
  • 25:21 - 25:27
    apply iToList so that, the brackets there
    they just mean the list, and we feed in a
  • 25:27 - 25:32
    list of intervals and those intervals are
    from zero to three, from five to seven,
  • 25:32 - 25:37
    and from nine to ten, respectively.
    They're all inclusive. And you can see
  • 25:37 - 25:40
    down there is a list of all of the
    members. So, the first interval is from 0
  • 25:40 - 25:47
    to 3. So, it has the numbers 0, 1, 2 and
    3. The next one goes from five to seven.
  • 25:47 - 25:51
    So it has the three numbers 5, 6, 7. And
    the last one goes from nine to ten. So it
  • 25:51 - 25:57
    has the two numbers, 9 and 10 there. Does
    that make any sense again? Slightly more
  • 25:57 - 26:06
    complicated example. So let's see. So, of
    course, well, not of course, but the way
  • 26:06 - 26:10
    we want to do it, the way I want to do it
    is, I want to have the interval set
  • 26:10 - 26:16
    structured in a certain way. I don't just
    want any list of any pair of numbers to
  • 26:16 - 26:20
    denote an interval set. And therefore,
    here is a function that describes what it
  • 26:20 - 26:26
    needs to be a valid interval set. Right.
    So, for example, we don't really, in order
  • 26:26 - 26:29
    to have efficient processing, we don't
    really want two intervals in one interval
  • 26:29 - 26:33
    set to overlap. Right. We want them to be
    disjoint and we also want them to be
  • 26:33 - 26:38
    ordered so we can have efficient
    operations for certain things. Right. And
  • 26:38 - 26:41
    so, let's go through this. So, there is an
    isValid function that just tells you
  • 26:41 - 26:46
    whether that interval set is valid or not.
    It says, well, if that set, and there's
  • 26:46 - 26:50
    three different cases here, which is why
    there's three different equations, in the
  • 26:50 - 26:55
    first equation says the empty interval
    set, the empty brackets mean the empty
  • 26:55 - 26:59
    list, and if the intervals, the list
    representing the interval set is empty,
  • 26:59 - 27:04
    then we're going to say True. Empty set -
    perfectly fine. The next case says, our
  • 27:04 - 27:09
    interval set consists only of a single
    interval and that single interval goes
  • 27:09 - 27:14
    from low to high. Well, we kind of
    interpret that there, but, and, well, that
  • 27:14 - 27:18
    interval set is valid, if low comes in
    front of high. Right, they shouldn't be
  • 27:18 - 27:24
    the other way around. So, does that make
    sense? Somebody in, can you nod at the
  • 27:24 - 27:29
    back, a little bit? You're still there?
    OK. Thank you. Great. So, then it becomes
  • 27:29 - 27:33
    a little bit more complicated and it says,
    well, this is the third case, when there's
  • 27:33 - 27:39
    at least two intervals in the interval
    set. And those two intervals are, the
  • 27:39 - 27:45
    first one goes from lo1 to hi1. The second
    one goes from lo2 to hi2. So, those ::,
  • 27:45 - 27:49
    they separate the first element of a list
    from the rest. And then there's the rest
  • 27:49 - 27:54
    of the list. And it says, well, again, we
    want the interval to be ordered so that
  • 27:54 - 27:59
    the lower numbers are on the left. That's
    where it says lo1 is less or equal hi1.
  • 27:59 - 28:04
    And then it says, well, that there should
    be a gap between two consecutive
  • 28:04 - 28:09
    intervals. Otherwise, they should be one
    interval, which is why the high from one
  • 28:09 - 28:15
    interval should be separated from the low
    of the next interval by at least one. And
  • 28:15 - 28:19
    then we're going to say, well, also we
    want all the rest of the list, including
  • 28:19 - 28:27
    lo2 and hi2 to be valid too. So far so
    good? OK, so this is probably, well, the
  • 28:27 - 28:32
    second most complicated piece of code. So,
    anyway, so, here's, so, we might imagine a
  • 28:32 - 28:37
    union function. And the union function,
    guess what, it forms a monoid, with
  • 28:37 - 28:43
    respect to interval sets. So, it takes,
    two internal sets go in and another one
  • 28:43 - 28:48
    comes out. And if you've written that kind
    of thing before, you might notice it's
  • 28:48 - 28:53
    probably a little tricky with that fancy
    validity condition that's there. So how
  • 28:53 - 28:58
    can we get this right? Well, what we do is
    we write down properties. Of course, we
  • 28:58 - 29:03
    could write down associativity. I'll leave
    that as an exercise. Another one is just
  • 29:03 - 29:09
    very simple. Just a very simple property
    that says for all pairs of two arbitrary
  • 29:09 - 29:14
    interval sets, we want the union of those
    two interval sets to be valid, a valid
  • 29:14 - 29:21
    data structure. We want the union function
    to preserve validity. OK? Makes sense? So
  • 29:21 - 29:27
    here's another property that says, well, I
    already gave you this function or I told
  • 29:27 - 29:33
    you that there is this function iToList,
    which just gives us a list of elements of
  • 29:33 - 29:37
    an interval set. And what we can do is, we
    can use sort of that representation,
  • 29:37 - 29:41
    that's also a representation for sets. We
    can use that representation sort of as a
  • 29:41 - 29:45
    model and say, well, if we take the
    unions, you see there for all pairs,
  • 29:45 - 29:50
    again, of arbitrary interval sets, we take
    the union. It says iUnion, iset1 and
  • 29:50 - 29:56
    iset2. And we convert that to a list. And,
    we could also do, we could instead convert
  • 29:56 - 30:01
    each individual set to a list and then
    just merge those two lists. And that
  • 30:01 - 30:06
    should yield the same result. So, in a
    way, we're just giving a very simple model
  • 30:06 - 30:10
    for our interval sets, right, and that
    would, so those two criteria would be kind
  • 30:10 - 30:16
    of nice to have in order to get our
    implementation correct. And I already got
  • 30:16 - 30:20
    started before the talk on this. Looks
    like this. No. Doesn't look like this.
  • 30:20 - 30:29
    We'll get to that later. But like this.
    So, here's what I came up with. So, you
  • 30:29 - 30:33
    see there is that, while there's all this
    other code there, ignore that. But there
  • 30:33 - 30:37
    is iUnion says ISet -> ISet -> ISet, do
    you see that? And then, there's two
  • 30:37 - 30:41
    equations that say, well, the first set is
    empty, then I'm just going to give you the
  • 30:41 - 30:45
    second one. And if the second one is
    empty, I'm just going to give you the
  • 30:45 - 30:50
    first one. Right? Classic things when you
    have union or concatenation operations or
  • 30:50 - 30:54
    something like that. And now you can see
    the third case. It gets tricky, right?
  • 30:54 - 30:58
    Again, you don't need, I mean, main thing
    is you need to understand it's tricky.
  • 30:58 - 31:06
    Well, the third one is such that, well,
    says so that both have at least one
  • 31:06 - 31:11
    element and that element is in the
    interval lo1 and hi1 in the first case and
  • 31:11 - 31:15
    lo2 and hi2 in the second case. And then
    there's the rest. And I already put in a
  • 31:15 - 31:21
    little bit of code, and I said, well if
    lo1 comes after hi1 (means hi2), then we
  • 31:21 - 31:25
    want to start with lo2 to hi2 and then
    continue with the union. In the other
  • 31:25 - 31:29
    case, if lo2 comes after hi2 (means hi1),
    then we're gonna start with lo1 and hi1.
  • 31:29 - 31:34
    And in the other case, it means, that no
    interval comes before the other, and
  • 31:34 - 31:39
    therefore we need to merge the two
    intervals at the beginning. Does that make
  • 31:39 - 31:44
    remote sense? Right. Don't worry. We'll
    get back on solid track. So, we just take
  • 31:44 - 31:48
    the minimum of those two intervals and
    maximum of those two intervals and we do
  • 31:48 - 31:52
    this. Now, the great thing is, I told you
    about this tool by John Hughes called
  • 31:52 - 31:59
    QuickCheck. And the great thing is, we can
    load this into Idris. And then here comes
  • 31:59 - 32:11
    a REPL, and we can say, I hope I'm doing
    this right. So, we want QuickCheck, and we
  • 32:11 - 32:17
    want, what was it called? It was called
    prop_unionCorrect. I hope I'm doing this
  • 32:17 - 32:22
    right. And, well, very small font. But you
    can see here it says "100 tests". And that
  • 32:22 - 32:27
    is what QuickCheck does, as, it takes your
    code version off the property and
  • 32:27 - 32:32
    automatically generates a lot of tests for
    them. And that is super effective at
  • 32:32 - 32:36
    weeding out bugs. So it says, well, the
    thing that you wrote is correct. It always
  • 32:36 - 32:44
    produces interval sets that when you take
    the list, it gives you the right result.
  • 32:44 - 32:50
    But there was that other criterion called
    unionValid. And there it says, and this is
  • 32:50 - 32:55
    really the better part, of course, of
    QuickCheck is, when it fails, it says it's
  • 32:55 - 33:00
    falsifiable. It says there is a counter
    example. And so, here it says, I did nine
  • 33:00 - 33:05
    tests, I generated nine random tests, and
    I found one where the result is not valid.
  • 33:05 - 33:11
    And the great thing is that we can then go
    and cut and paste this example. So I could
  • 33:11 - 33:18
    say iUnion, this, remove the comma in the
    middle, and call this. And well, what
  • 33:18 - 33:24
    happens here is, what we can see is, we
    can see 2 and 4, 1 and 1, and 3 and 5, and
  • 33:24 - 33:28
    what's not valid about. So, by the way,
    this is randomized. So, this always goes
  • 33:28 - 33:32
    differently. So I have to look at it, too.
    So, then it says, well, those two
  • 33:32 - 33:35
    intervals, they should really just be
    merged and they should just be one
  • 33:35 - 33:44
    interval. Right? And so, it didn't do that
    correctly. And the reason for that, maybe
  • 33:44 - 33:55
    you saw it. So, and, what happened is that
    it ran into one of those two cases here
  • 33:55 - 34:00
    where it says if lo1 greater than hi2 or
    lo2 greater than hi1. Remember that I told
  • 34:00 - 34:05
    you there needs to be a gap of at least
    one between them. Remember? And here's an
  • 34:05 - 34:11
    off-by-one error that says, well. So this
    says, they can, lo1 greater than hi2 says
  • 34:11 - 34:15
    they can still be right next to each
    other. Right? And this is what happened
  • 34:15 - 34:20
    here. We need to make sure that there is
    that gap in here. So, I can fix it like
  • 34:20 - 34:38
    this. Loaded again. Oh, no. There's still
    a counterexample. So, and we can try that
  • 34:38 - 34:43
    out, so, and that's great. We get test
    cases that sort of show where the bugs
  • 34:43 - 34:51
    are. And in this case, well, what happened
    here? They still overlap. And what
  • 34:51 - 35:00
    happened here? So can you see it? So, you
    can see that the first two intervals, they
  • 35:00 - 35:05
    must run into that last case. Right.
    Because they overlap. Zero is the interval
  • 35:05 - 35:13
    from 0 to 3 and the interval from 0 to 5.
    They overlap. So we need to get to that
  • 35:13 - 35:19
    case. And so it merges them. And then it
    went and and somehow didn't merge it with
  • 35:19 - 35:30
    the 6 and the 7 that's there. And, so,
    well, if you look at it. So it must have
  • 35:30 - 35:39
    done this. And and what it did is, it then
    went on with the rest there. Let's have
  • 35:39 - 35:59
    one more look. What actually happened? So
    there it is. So, it merged those and then
  • 35:59 - 36:03
    you can see that it went into a symmetry
    problem here. Well, maybe you don't see.
  • 36:03 - 36:08
    But, you know, this is tricky stuff. I
    couldn't do this by myself. So you can see
  • 36:08 - 36:19
    here that it just tacks the result onto
    iset1Rest, whereas the maximum of hi1 and
  • 36:19 - 36:26
    hi2 could, might violate the consistency
    criteria if it's the wrong one, and then
  • 36:26 - 36:30
    it runs into one of the other cases. Now
    I've never seen this tricky one. Does it
  • 36:30 - 36:34
    make sense? But, can you see that it
    should be symmetrical? The last one. Can
  • 36:34 - 36:40
    you see it? OK, so we'll try and make it
    symmetrical. Do it like this. So we'll
  • 36:40 - 36:52
    say, well, if so, this only works. So if
    hi1 is less than hi2. So we really need to
  • 36:52 - 36:59
    make sure, then it is perfectly. And then
    the maximum of those two numbers is hi1.
  • 36:59 - 37:05
    Does that make sense? And so the max of
    those two numbers is hi1 and then it's
  • 37:05 - 37:12
    perfectly valid to tack it onto iset1Rest.
    In the other case, hi2 is greater and we
  • 37:12 - 37:18
    need to go and do something different and
    rip this out here. Stick it in front here
  • 37:18 - 37:36
    and then. And then. And now it's
    symmetrical. OK. So, load this. And, ahh!
  • 37:36 - 37:41
    It has passed the test. OK, live great.
    applause
  • 37:41 - 37:47
    Thank you. I did practice getting it
    correct, right. But you can, you know,
  • 37:47 - 37:51
    this kind of stuff. It always gets me. I
    mean, you know, with old age especially,
  • 37:51 - 37:56
    this kind of stuff, it always drives the
    sweat on my forehead, right? You know,
  • 37:56 - 37:59
    there's off-by-one. There is, you know, I
    don't know how many cases there need to
  • 37:59 - 38:03
    be. And QuickCheck is the kind of thing
    that weeds out the bugs. And even though
  • 38:03 - 38:07
    it weeds out the bugs in a different order
    each time, it always weeds them all out.
  • 38:07 - 38:12
    OK. So it's a great tool. Now, I recommend
    that you try that. It generates tests from
  • 38:12 - 38:19
    properties. OK, where are we? So let me
    let me give you a couple of real world
  • 38:19 - 38:23
    examples. So if you're using X windows,
    there's a there's a tiling, a window
  • 38:23 - 38:27
    manager, xmonad. It's already a couple of
    years old and they don't do much
  • 38:27 - 38:33
    development on it anymore. That's because
    it's correct. Right. laughter Right. And
  • 38:33 - 38:39
    why is it correct? Well, it's because they
    wrote down a lot of properties for the
  • 38:39 - 38:46
    geometry and the tiling algorithms and
    verified them using QuickCheck. And so I
  • 38:46 - 38:49
    sort of loosely translated. So, Don
    Stewart, one of the authors of xmonad
  • 38:49 - 38:53
    graciously wrote a couple of blog posts on
    a simplified version of xmonad and I
  • 38:53 - 39:02
    translated them into Idris. So, here's a
    very simple idea of just a stacking window
  • 39:02 - 39:05
    manager. So, it doesn't do geometry, it
    just has stacks of windows and it has
  • 39:05 - 39:10
    several workspaces. In each workspace is a
    stack of windows. So here's a data type
  • 39:10 - 39:17
    called a StackSet, its parameterized by a
    type called window. We'll see later why
  • 39:17 - 39:21
    there's a type parameter and why it just
    doesn't say what the windows are. And then
  • 39:21 - 39:25
    it says there's a constructor StackSet
    and there's two fields in there. One is
  • 39:25 - 39:34
    called "current", that's the number of the
    workspace that's currently
  • 39:34 - 39:37
    active. And then there's "stacks",
    which is a map from the number of the
  • 39:37 - 39:44
    workspace to the stack of, to the list of
    windows that sit in that workspace. Again,
  • 39:44 - 39:48
    so here, really the technicalities are
    not particularly important, but there is a
  • 39:48 - 39:54
    bunch of operations that operate on this
    window manager configuration. And again,
  • 39:54 - 39:58
    here, really the details aren't important.
    So you could create an empty stack set.
  • 39:58 - 40:03
    You could say, well, you know, I have the
    number of a window that I want to get to
  • 40:03 - 40:07
    the front. And please make me, please
    rotate me, the stack set around so that I
  • 40:07 - 40:13
    can see it. "peek" means, you know, maybe
    I can get the topmost window that the user
  • 40:13 - 40:17
    is currently looking at. "rotate" means
    I'm just going to rotate the workspaces
  • 40:17 - 40:21
    around in either left or right direction.
    That's what that ordering argument. "push"
  • 40:21 - 40:26
    is, I push a new window onto the current
    workspace. "insert" means insert a window
  • 40:26 - 40:32
    into one of the other workspaces. "delete"
    means I delete a window. "shift" means,
  • 40:32 - 40:36
    also means I shift something with the
    windows. Not really important what they
  • 40:36 - 40:43
    do. But you can imagine again, just as we
    did with the interval sets is validity
  • 40:43 - 40:47
    criterion or an invariant that should hold
    for these operations. And it's very
  • 40:47 - 40:51
    simple. Well, it says well, if you have a
    stack set with some windows in it, I'm
  • 40:51 - 40:55
    just going to tell you whether that stack
    set is consistent. And by doing that, I'm
  • 40:55 - 41:02
    just going to say, well, the current, the
    number of the current stack
  • 41:02 - 41:07
    should not be higher than the number of
    window stacks that there are. Right. So,
  • 41:07 - 41:11
    the number of stacks that there are. And
    the other one, that just says a window
  • 41:11 - 41:18
    should not be in several of the
    workspaces. Right? And then I can go and
  • 41:18 - 41:23
    maybe with this definition, all
    those function definitions aren't very
  • 41:23 - 41:27
    complicated. But, I can go and write a
    whole bunch of properties. And if you just
  • 41:27 - 41:31
    understand, well, maybe the second one,
    "prop_view_I", you understand all of them.
  • 41:31 - 41:36
    It just says, well, for all pairs of a
    natural number and a stack set that are a
  • 41:36 - 41:40
    "stackIndex" and "stackSet", I want, if I
    call the "view" function, which is one of
  • 41:40 - 41:45
    the operations, I want the view function
    to produce a consistent stack set. And
  • 41:45 - 41:50
    then it goes on to do all of that for all
    the other ones. At the bottom here, you
  • 41:50 - 41:54
    can see some prerequisites that need to
    hold for the property so that invariant
  • 41:54 - 41:59
    only needs to hold if the window, if the
    number of the window is actually smaller
  • 41:59 - 42:05
    than the size of the stack set. Otherwise,
    I think the function just returns what
  • 42:05 - 42:10
    would go with it, what went in there, So
    that's a very, that's just a very
  • 42:10 - 42:14
    efficient way to invent properties, to
    think of some invariant that shall hold in
  • 42:14 - 42:17
    your data structure. And if you know
    Idris, you can sometimes encode that in
  • 42:17 - 42:21
    the types, but often that's kind of
    tedious. And you can just write it down as
  • 42:21 - 42:26
    a property and then have QuickCheck check
    it for you. And it's not particularly
  • 42:26 - 42:30
    exciting for the simple definition, but
    you can imagine that the actual definition
  • 42:30 - 42:34
    when you have tiling window management
    going on is much more complicated than the
  • 42:34 - 42:38
    one that you just saw. But you can keep
    those same properties, right? There still
  • 42:38 - 42:42
    needs to be some consistency invariant
    that, if you have tilings, the windows
  • 42:42 - 42:46
    don't overlap, and things like that. That
    should be obvious. Write those properties
  • 42:46 - 42:51
    down, check them using QuickCheck and that
    will weed out a lot of the bugs.
  • 42:51 - 42:56
    Here's an example from our
    practice. We, couple months ago, we were
  • 42:56 - 43:03
    tasked with migrating a giant Visual Basic
    6 application. It had a password checking
  • 43:03 - 43:07
    function there. You can see here a Visual
    Basic 6 type signature. And the property
  • 43:07 - 43:15
    that we wrote was, well, if we create the
    hash from the password and we compare it
  • 43:15 - 43:20
    with the hash that's in the database, then
    they should all come out the same. And to
  • 43:20 - 43:24
    our surprise, that function, that test,
    that property, failed when we ran it for
  • 43:24 - 43:29
    QuickCheck and we had to correct it
    because that password hash is restricted
  • 43:29 - 43:37
    to 11 characters by some restriction in
    the database schema. And so that means
  • 43:37 - 43:40
    that you can use QuickCheck not just to
    sort of check the correctness of things
  • 43:40 - 43:44
    that you already know, but to actually
    develop a model for what goes on in your
  • 43:44 - 43:51
    software, which you don't always know very
    well. So that's what we did there. Another
  • 43:51 - 43:57
    example is, we wrote, for a large
    industrial client, we needed to write a
  • 43:57 - 44:01
    synchronization application. So when you
    had two mobile devices and they would sort
  • 44:01 - 44:06
    of meet as strangers, they would exchange
    data and they all needed to look at the
  • 44:06 - 44:11
    same sort of device configuration data.
    And we didn't want them to exchange all
  • 44:11 - 44:15
    the data every single time. We just wanted
    to exchange them, the data blocks that the
  • 44:15 - 44:21
    other side was missing. And again, there's
    great algorithms for this based on Merkle
  • 44:21 - 44:24
    trees. They're pretty complicated. You
    have to do a lot of bit fiddling with
  • 44:24 - 44:29
    that. But fortunately, the property for
    that is pretty easy to write. So here's
  • 44:29 - 44:34
    the property that says, well, so the
    synchronization algorithm works on sets of
  • 44:34 - 44:40
    blocks, whatever a block is. So you can
    see the property here for all pairs of
  • 44:40 - 44:45
    sets of blocks and more sets of blocks. So
    they're called bs1 and bs2. Block set one
  • 44:45 - 44:53
    and block set two. What we can do is, we
    want, if we union those two, then we get
  • 44:53 - 44:57
    all the blocks in the system. We call that
    all, or we can call the synchronization
  • 44:57 - 45:05
    algorithm and that will give us two new
    block sets, block set bs1' and bs2'. And
  • 45:05 - 45:11
    those block sets are the ones that get
    transferred to the other side. OK. And the
  • 45:11 - 45:16
    criterion here just says if we take the
    ones that we have, if we union them with
  • 45:16 - 45:20
    the ones that we get, we should get all of
    them. That should be all of them. And that
  • 45:20 - 45:25
    should be the same for both sides. And
    also, we want the algorithm to be
  • 45:25 - 45:29
    efficient so we don't want it to transfer
    blocks. So we want to make sure that the
  • 45:29 - 45:33
    blocks that we have and the blocks that we
    get are disjoint. That they don't have any
  • 45:33 - 45:38
    elements in common. Otherwise, we could
    make that algorithm trivially correct by
  • 45:38 - 45:42
    just transferring all the blocks every
    single time. And I can tell you, I
  • 45:42 - 45:46
    sweated. You know, I sweated one or two
    weeks over this algorithm and it was
  • 45:46 - 45:53
    really hard to write. But this one test
    weeded out all the bugs that I found along
  • 45:53 - 45:59
    the way. So that is just super, super
    effective. John Hughes has a couple of
  • 45:59 - 46:03
    papers on hard bugs that he found. So he
    found a bug in a distributed database
  • 46:03 - 46:10
    called mnesia. And that bug was dependent
    on opening the database, closing it and
  • 46:10 - 46:15
    opening it again. So this is not the kind
    of bug that you find by just writing a
  • 46:15 - 46:20
    bunch of smart unit tests. Right? So, if
    you did anything shorter in the beginning,
  • 46:20 - 46:25
    so if you just open the file and then did
    some lookups there, that would not
  • 46:25 - 46:29
    manifest the bug. You really needed to
    close and then open again. Have you
  • 46:29 - 46:34
    turned, have you tried turning it off and
    on again? But then the database breaks in
  • 46:34 - 46:39
    this case. And here's another example
    called The Mysteries of Dropbox. So you
  • 46:39 - 46:46
    can imagine that with Dropbox you really
    want certain properties to hold. Right?
  • 46:46 - 46:50
    And it turns out they didn't. They never
    worried about writing properties down. But
  • 46:50 - 46:55
    John Hughes did it and found a couple of
    bugs. So here's one. It's kind of hard to
  • 46:55 - 47:01
    read where it says client 1 writes a into
    a file that was previously empty. So that
  • 47:01 - 47:07
    funky turnstile there is empty. So writes
    a into a file and then deletes the file
  • 47:07 - 47:12
    and another client writes, replaces, sees
    the a in the file replaces it with a b.
  • 47:12 - 47:18
    And then client 1 goes and writes c into
    the file that it previously thought to be
  • 47:18 - 47:23
    empty. And then unfortunately, even though
    you can imagine that you should see either
  • 47:23 - 47:29
    b or c in that file, but Dropbox deleted
    it. So I think they fixed that bug now.
  • 47:29 - 47:37
    But. so you go. So it goes. Oscar Wikstrom
    has a couple of great, pretty recent blog
  • 47:37 - 47:45
    posts on properties in a screencasts
    editor that I highly recommend. So this is
  • 47:45 - 47:51
    a great tool for finding bugs, but it's
    not the same as having a proof. Right? So,
  • 47:51 - 47:56
    you can still imagine that you can find
    very subtle bugs that are not covered by
  • 47:56 - 48:00
    QuickCheck. QuickCheck just randomizes,
    just generates randomized tests. So, that
  • 48:00 - 48:07
    is not the same thing as making sure that
    there aren't any bugs. So the great thing
  • 48:07 - 48:11
    about Idris and the reason I chose it for
    this talk is that Idris allows you to not
  • 48:11 - 48:16
    just encode properties in the language. It
    also allows you to encode proofs in the
  • 48:16 - 48:21
    language. So here is the associative
    property for the list concatenation
  • 48:21 - 48:25
    operation. And if you look at the top that
    has the definition of that function from
  • 48:25 - 48:30
    the Idris standard library, it says ++, in
    goes a list, in goes another list, out
  • 48:30 - 48:36
    comes a list. Then it says, well, if you
    concatenate the empty list with any list,
  • 48:36 - 48:41
    that is just that list "right". Do you
    see that? The second one says, well, if we
  • 48:41 - 48:46
    concatenate a list that starts with the
    element x and goes on with xs, then we
  • 48:46 - 48:51
    just sort of pull the x in front and
    concatenate the rest with "right". So
  • 48:51 - 48:55
    that's a classic recursive definition of
    list concatenation in functional
  • 48:55 - 48:59
    programming. And now here's something
    really strange in Idris. Here's the type
  • 48:59 - 49:04
    declaration for a definition, again in the
    standard library, called appendAssoc. And
  • 49:04 - 49:09
    it says, if you have a list a, you have a
    list b, and you have a list c, and in the
  • 49:09 - 49:15
    type it says, oh, the associative property
    should hold. Right. And so this is a
  • 49:15 - 49:21
    statement of that property. That's
    wonderful. It's not the same as a proof.
  • 49:21 - 49:30
    So, but writing proofs, who loved that in
    math? Oh. Oh, you're good! I didn't. I'm
  • 49:30 - 49:34
    sorry. So. So the great thing about Idris
    is, it helps you write down the proofs.
  • 49:34 - 49:39
    I'll show you how that works just really,
    really briefly. So here's that. So here's
  • 49:39 - 49:43
    just what I showed you on that slide. So I
    can load that in there and it says, well,
  • 49:43 - 49:47
    you're not done. You didn't write a proof
    for that property, but in Idris, you can
  • 49:47 - 49:51
    just push a bunch of buttons. Now, I love
    that. So I can push one button and it
  • 49:51 - 49:55
    says, oh, well, you should write a proof
    of that form. You have lists a, b and c.
  • 49:55 - 49:59
    Well, now and I can push another button
    that says, well, you're doing this on
  • 49:59 - 50:05
    lists and if you're writing anything on
    lists, you always need to distinguish
  • 50:05 - 50:09
    between the two cases of the empty list
    and the list that consists of the first
  • 50:09 - 50:15
    element x and further element xs. And then
    it says, well, write down something, but
  • 50:15 - 50:19
    then I can tell Idris: Well, I'm too lazy.
    I'm not going to write anything so I can
  • 50:19 - 50:23
    just push a button. And Idris wrote this
    so you can see me, but I didn't type this
  • 50:23 - 50:29
    right. I just pushed a button and it says,
    Refl. What is Refl? What could that be?
  • 50:29 - 50:36
    Well, you can ask it what Refl is. It says
    Refl. Oh, you can see here, landing here.
  • 50:36 - 50:40
    Refl is just a proof sort of a built in
    proof that says that if two things are
  • 50:40 - 50:45
    true, two things are equal, if they're
    identical, if they're the same. Right. And
  • 50:45 - 50:49
    that kind of makes sense in the first
    equation, because the first equation of
  • 50:49 - 50:53
    appendAssoc corresponds to the first
    equation of ++. Can you see that, how it
  • 50:53 - 51:01
    corresponds? Can you see that? The first
    list is empty. Can you see that? Can you
  • 51:01 - 51:05
    see how the first list is empty with the
    first equation of appendAssoc and the
  • 51:05 - 51:14
    first list is empty up there with ++. Can
    you see that? OK. And then it just says,
  • 51:14 - 51:19
    well, then obviously. Well, not quite
    obviously, but then sort of the the way
  • 51:19 - 51:23
    that the definition works, it comes out
    just right. So what's really important is
  • 51:23 - 51:28
    that Idris accepts that proof with the
    first. The second one is slightly more
  • 51:28 - 51:36
    tricky. But again, we can get help because
    we know that appendAssoc is this recursive
  • 51:36 - 51:39
    function. It recurses on the first
    argument. So we're just going to do the
  • 51:39 - 51:47
    same thing in the proof. And you can tell
    Idris that it should use that, that it
  • 51:47 - 51:52
    should use that fact, if you will. So
    here's the recursive call. And again, I'm
  • 51:52 - 51:56
    too lazy to push a button. But if I push
    that button, it also puts in Refl and
  • 51:56 - 52:02
    there's loads. So this it might be a
    mystery to you how it works, but this is a
  • 52:02 - 52:07
    proof of the associate of property of the
    list concatenation in Idris. And since
  • 52:07 - 52:12
    Idris helps you write it, it's kind of fun
    to do that. Oddly enough, even for
  • 52:12 - 52:16
    somebody who doesn't, usually who doesn't
    usually doesn't enjoy proofs. So the way
  • 52:16 - 52:22
    that you program in Idris, we haven't done
    that a lot in this talk, is that you put a
  • 52:22 - 52:26
    lot of information in the types and the
    more information you put in the types, the
  • 52:26 - 52:30
    better Idris will get at figuring out the
    correct definition. And you don't have to
  • 52:30 - 52:39
    do it by yourself. OK. So that's really
    nice. OK. So we got that and and sort of
  • 52:39 - 52:44
    these kinds of proof assisting systems
    such as Idris have been used in a lot of
  • 52:44 - 52:49
    real world systems. One one prominent
    example is SEL4, a version of the L4
  • 52:49 - 52:53
    micro kernel, has a long history, but
    important properties of that kernel have
  • 52:53 - 52:57
    been verified. It runs in the security
    enclave on iOS and even though it's
  • 52:57 - 53:01
    written in C, it provably does not have
    buffer overflows or a lot of the nasty
  • 53:01 - 53:06
    things that are responsible for a lot of
    security exploits. Compcert is another
  • 53:06 - 53:11
    example, which is a verified, I should
    mention this has been verified with the
  • 53:11 - 53:16
    help of a tool called Isabel. Also, great
    fun to use. There's a project called
  • 53:16 - 53:20
    called Compcert, which is a verified C
    compiler, which is important for a lot of
  • 53:20 - 53:24
    certified software where, you know, the
    source code might be certified. But how do
  • 53:24 - 53:30
    you know that the compiler generates
    correct code? And you know, because it's
  • 53:30 - 53:34
    been proven to be correct. And even there,
    you can shoot, you can cheat sometimes. So
  • 53:34 - 53:38
    for example, register allocators, very
    complicated, very hard to prove right.
  • 53:38 - 53:42
    But what you can do is you can write a
    checker that the register allocator did
  • 53:42 - 53:48
    its job, did its job well and you can
    verify the checker. And so you can cheat a
  • 53:48 - 53:52
    bit. So there's tools for that. We've seen
    Idris and there's a number of other tools
  • 53:52 - 53:57
    and they're getting more and more mature.
    And they're great fun, really. They really
  • 53:57 - 54:02
    are great fun. But, you know, going back,
    switching down a gear a little bit,
  • 54:02 - 54:05
    there's lots of useful properties that you
    can look for in your programs. So
  • 54:05 - 54:10
    commutativity might be useful that you can
    switch the two arguments for an operation.
  • 54:10 - 54:14
    Also, if you have relations, you might
    remember that from some math class,
  • 54:14 - 54:18
    there's some properties here like
    reflexivity, symmetry, antisymmetry and
  • 54:18 - 54:24
    transitivity. Reflexivity says that a is
    always related to a. Symmetry says if it's
  • 54:24 - 54:29
    one way, if a and b are one way related,
    they need to be related the other way too.
  • 54:29 - 54:33
    Antisymmetry intuitively would seem kind
    of the opposite. That doesn't make sense.
  • 54:33 - 54:39
    It's just says: if two things are related
    in both both ways around. So, for example,
  • 54:39 - 54:43
    you know, orders like less or equal are
    antisymmetrical, then they must be the
  • 54:43 - 54:48
    same. And transitivity just says that you
    can form chains of your relation. So those
  • 54:48 - 54:53
    are a little dictionary of useful
    properties that you can look for. Let me
  • 54:53 - 54:58
    close with one fancy property that you've
    probably seen somewhere and that property
  • 54:58 - 55:04
    is called Functor. And you might have seen
    in your programing language, in your list
  • 55:04 - 55:09
    library or in your stream library. There's
    a function called map, right? And you
  • 55:09 - 55:14
    know, even Java has that and has had it
    for many years. And what map does is, if
  • 55:14 - 55:19
    you have some, you know, in Java, for
    example, it says "Stream", or it might be
  • 55:19 - 55:23
    "List", right. It says, well, if I have a
    list of As, I can apply a function to each
  • 55:23 - 55:27
    element of that list. But you can
    generalize that, it doesn't have to be
  • 55:27 - 55:31
    lists. It could be an Optional of As, for
    example. You could also apply a function
  • 55:31 - 55:35
    to the value that's in there. So, you can
    generalize that notion, and then it's a
  • 55:35 - 55:43
    functor. And, of course, in Idris, you can
    write down equations for functors. And,
  • 55:43 - 55:48
    please ignore the technicalities here,
    (stammers) but, if you sort of pick out
  • 55:48 - 55:52
    where it says "functorIdentity",
    the middle row says g v equals to v,
  • 55:52 - 55:56
    which means g is the identity
    function. When you feed in v, you always
  • 55:56 - 56:01
    get v back. And when you use map with the
    identity function, you apply the identity
  • 56:01 - 56:06
    function on each element of your list or
    whatever it is. Then then you always get
  • 56:06 - 56:11
    back the same list. And here just says you
    get function composition. So if you apply
  • 56:11 - 56:15
    one function and then another function and
    you do that either inside or outside the
  • 56:15 - 56:18
    map, you should also get the same results.
    So there's also just as there is
  • 56:18 - 56:24
    associativity with monoids, with functors.
    There's these laws and you might think,
  • 56:24 - 56:28
    well, where would I look for a functor?
    I've never seen a functor except for the
  • 56:28 - 56:32
    ones on streams. A couple weeks ago in a
    training, somebody said, well, you always
  • 56:32 - 56:39
    start with that animal example. Shouldn't
    you look for a functor there? And I was
  • 56:39 - 56:43
    kind of, you know, sweat broke out on my
    forehead, I was like, where's that gonna
  • 56:43 - 56:47
    go? But, we came up with this. So, if you
    go back, you can see that this is obvious.
  • 56:47 - 56:52
    So, what you need for functors is, you
    need a type parameter. Right. And so you
  • 56:52 - 56:56
    just look for a place to stick a type
    parameter, any place here at all. And
  • 56:56 - 56:59
    if you look at Dillo and Parrot, they
    both prominently have this weight thing.
  • 56:59 - 57:03
    Right. And so that seems more important
    than the other two properties, which are
  • 57:03 - 57:10
    specific to particular kind of animal. And
    so the weight, the thing to do is just to,
  • 57:10 - 57:14
    well, you can see I replaced upper case
    Weight by lower case weight and made that
  • 57:14 - 57:20
    into a type parameter, and, I can then
    provide a functor implementation down
  • 57:20 - 57:24
    there. And you might think, what is that
    good for? Well, I don't know. Well, one
  • 57:24 - 57:27
    thing that you could do is you could
    provide a different representation for
  • 57:27 - 57:31
    weights. Another thing that you could do,
    if you look at the type for runOverAnimal,
  • 57:31 - 57:36
    it says animal weight -> animal weight and
    weight is a type variable. What that type
  • 57:36 - 57:41
    signature tells you is that runOverAnimal
    does not know what weight is. And that
  • 57:41 - 57:46
    means that the weight cannot change as a
    result of that function. And you see that
  • 57:46 - 57:50
    in the type signature you get immediate,
    small benefit, but you get a benefit even
  • 57:50 - 57:56
    with silly examples such as this one. And
    that really brings me to the end. So in
  • 57:56 - 58:00
    your software, in your domain model, look
    for a Combinator, look for a function that
  • 58:00 - 58:06
    will combine two things into a bigger
    thing. See if you can make that thing
  • 58:06 - 58:10
    associative and look for a neutral
    element. And very often you will find one;
  • 58:10 - 58:13
    make it a monoid, you know, say monoid a
    couple of times. You'll remember it.
  • 58:13 - 58:17
    You'll remember it. Generally, write
    properties for the things, for the
  • 58:17 - 58:23
    operations in your software, test those
    properties using QuickCheck. You know, if
  • 58:23 - 58:28
    you feel like you have a lot of time,
    prove them correct. Find the functor. If
  • 58:28 - 58:32
    you found, if you found the monoid, you
    know, find the functor next. You know, and
  • 58:32 - 58:36
    it takes, it might take time. I'm
    very old. As you noticed at the beginning.
  • 58:36 - 58:42
    So. So it gets easier over the years and
    it will just seem like a regular staple of
  • 58:42 - 58:47
    your of your arsenal when you program. And
    of course, when the important properties
  • 58:47 - 58:51
    in your program have either been written
    down, if they've been tested with
  • 58:51 - 58:55
    QuickCheck or even proven, then you can
    sleep much more soundly than maybe you
  • 58:55 - 59:05
    currently can. Thank you very much.
    applause
  • 59:05 - 59:11
    Herald: Thank you, Mike. So I see we have
    three minutes for questions. Maybe that's
  • 59:11 - 59:16
    two or three questions. If you have any
    come to the microphones, please. Do we
  • 59:16 - 59:26
    have a question from the Internet? No, not
    yet. So, microphone two. Right.
  • 59:26 - 59:34
    Question: Hi. So, QuickCheck generated
    hundred tests. Yes. What can we say about
  • 59:34 - 59:39
    the quality of this test? Could we say
    your program was correct using thoese
  • 59:39 - 59:43
    tests? Are these tests good?
    Answer: Yeah. Very good question. So what
  • 59:43 - 59:46
    would you say about the quality of the
    tests? And indeed, if you really do serve
  • 59:46 - 59:51
    industrial strength applications, a quick
    QuickCheck comes with a bunch of tools
  • 59:51 - 59:55
    that let you look, for example, at the
    distribution of the individual example
  • 59:55 - 60:02
    rated and while you didn't quite see me do
    that, but I mean, for your domain objects,
  • 60:02 - 60:06
    you will typically write generators that
    will generate those examples and you can
  • 60:06 - 60:10
    reason about the distribution of those.
    And you absolutely should do that because
  • 60:10 - 60:15
    otherwise you might miss large areas of
    your test space. So, but there
  • 60:15 - 60:20
    are tools and they help you do that. But
    even if you don't do that, you know, it's,
  • 60:20 - 60:24
    you find a lot of, I found a lot of
    bugs in my software even without worrying
  • 60:24 - 60:29
    about that. But if you go beyond that,
    look at the distribution thing.
  • 60:29 - 60:34
    Herald: Thank you, next one, please.
    Number two.
  • 60:34 - 60:41
    Q: Let's say I've hacked a program, for
    example, in Java or C# or whatever. How do
  • 60:41 - 60:50
    I, how do I apply what I learned so
    far? So, where do I start when I have
  • 60:50 - 60:58
    already completed the C# program with,
    yeah, how do I apply QuickCheck on that?
  • 60:58 - 61:01
    A: So, just pragmatically because it's
    written in C#? That's the question?
  • 61:01 - 61:04
    Q: Yes
    A: So, well, I have to be very concrete
  • 61:04 - 61:08
    here, I mean, so, if you can think
    properties, right, one way to do, so,
  • 61:08 - 61:15
    for example, so, C# you can link with F#
    and there is a QuickCheck version for F#
  • 61:15 - 61:19
    called "FsCheck". And FsCheck, actually,
    even though it's itself written in F#, you
  • 61:19 - 61:23
    can also use it from C#. So, you have two
    options. You can write your tests in a
  • 61:23 - 61:28
    slightly more awkward fashion in C#, or
    you could just link your code with F# test
  • 61:28 - 61:35
    suite and write it down there. And there
    is a fairly reasonable Java QuickCheck, I
  • 61:35 - 61:40
    hear. Another idea would be to use the
    slightly more fancier, the slightly
  • 61:40 - 61:45
    fancier QuickChecks that exist for Scala
    and Enclosure. I'm sure there's one for
  • 61:45 - 61:49
    Kotlin as well, and link that against your
    Java code. Does that answer your question?
  • 61:49 - 61:55
    Q: So whatever language I use, I have to
    find out what the correct implementation
  • 61:55 - 61:57
    of QuickCkeck?
    A: Yeah. Yeah. But as I said, I mean
  • 61:57 - 62:01
    usually, a fun thing I do in training is,
    I chat "QuickCheck" and somebody calls on
  • 62:01 - 62:05
    the language, you know. Quick, QuickCheck
    PHP or something like that. And there is
  • 62:05 - 62:08
    one, sure enough, I didn't know about before.
    Q: Thank you.
  • 62:08 - 62:12
    Herald: All right. Thank you. And thank
    you, Mike, again, for showing us a way to
  • 62:12 - 62:15
    sleeping soundly.
    A: Thank you.
  • 62:15 - 62:21
    applause
  • 62:21 - 62:27
    postroll music
  • 62:27 - 62:46
    Subtitles created by c3subtitles.de
    in the year 2020. Join, and help us!
Title:
36C3 - Getting software right with properties, generated tests, and proofs
Description:

more » « less
Video Language:
English
Duration:
01:02:48

English subtitles

Revisions