<i>36C3 preroll music</i>

Herald: Our next talk is held by Mike
Sperber, and he is already very ready for

that. He's head of a software company in
Tübingen in Germany and he's going to talk

about "Getting software right with
properties, generator tests and proofs".

And the main thing here is functional
programming. One tiny thing you might not

know about him is that in 1986 he won a
federal competition on IT and so give him

a warm applause for that and also for his
talk.

<i>applause</i>

Mike Sperber: Thank you very much. So is
anybody actively using the induction loop

feature in the first couple of rows? Cuz
I know somebody who would like to know.

Not right now. Okay. Anyway, so let me get
one shameless plug of advertising out of

the way. If you find the contents of this
talk interesting, we're running a

developer conference in Berlin in February
called Bob, which is very friendly and

very nice, very tiny compared to this one.
And we'd love to see you there. Another

thing, this is an introductory talk. So if
you were expecting the latest developments

on proof tactics, in fact, if you know what
a proof tactic is, then all you might get

from this talk is sort of mild amusement.
And I won't be mad at you at all if you go

for one of the more exciting talks. Ok?
So. Or leave at any time. That's perfectly

fine, if this material is not for you.
Speaking of introductory talks, here's a

piece of code written in the language that
I will use for this talk and it's called

Idris. Who has written an Idris program
before? Very good. Ok. Oh, there's one

person back there. That means if any part
of this program, as soon as I'm done

explaining, is not clear to you, it's also
not clear to two or three hundred other

people in this room. And I would love to
have your help. Interrupt me, ask a

question anytime in the talk if there's
anything here not clear. It's going to

get, even though it's meant to be
introductory, will get quite technical at

times. So let me try explaining this one.
So this is a classic example in functional

programming that I use often in my talks,
about animals on the Texas Highway. And if

you can see there, the central definition
says data Animal. That's the data

definition of animals. And in this
particular version of the Texas Highway,

there's two different kinds of animals.
There is Armadillo, it's where it says

Dillo there. And there's Parrots, for some
reason, on the Texas Highway. Does that

make sense? Two different kinds of
animals. And you see that definition.

Yeah, nod, that greatly helps me. And if
you see those two definitions for Dillo

and Parrot, you can see, well, the arrows
are kind of funny, but you can see that

Dillo and Parrots have two properties
each, and it says their Liveness. That's

one of the properties of an armadillo. And
up there at the very top, you see the

definition of Liveness, it says Liveness
means dead or alive. It's an armadillo,

can be dead or alive on the Texas Highway.
And there's also the Weight. And well, you

see, this colon thing is the type
signature for the constructor for

armadillos. So it says there's a liveness
going on, there's a weight going on, and

then it constructs an animal. And for a
Parrot, there's a string. Every parrot

speaks, right? And so it's the sentence
the Parrot says, and also the Weight. And

it also produces an animal. And, up there,
you can see the definition of Weight is

for simplicity's sake, I'm saying that
Weight is a type. So that's kind of

unusual for Idris, but you don't have to
worry about it. But you can see there,

Weight is just the same thing as an
integer. And if you look down there, where

it says a1, a2 and a3, that has three
examples for animals. So it says a1:

Animal, just to say that a1 is an animal.
So, Idris is a language that always has

type declarations. And it says a1 is Dillo
Alive 10. And that means it's an

armadillo, it's still alive, and it
weighs, let's say, ten kilograms. The

second one is dead, a little bit heavier,
weighs twelve kilograms. And the third

animal is a parrot that knows, well, it's
a pirate's parrot, obviously, and maybe

weighs three kilograms. Ok, so far? Ok. So
if you have any question about any of

that, then please ask away. So, what
happens to animals on the Texas Highway

is, you know, people drive cars, they run
them over. So there's a function down

here, and, well, we're doing functional
programming, shouldn't worry you at all.

But what's important here is that it says
there is an animal going in, an animal

going out. And really what this means is
that this animal object up there is not

really the animal. It is the state of the
animal at a given time. So, runOverAnimal,

you can see the type signature that says
an animal goes in, an animal goes out. And

what it really means is, the state of the
animal goes in before it gets run over and

the state of the animal after it gets run
over comes out. And then, well, we know

there's two different kinds of animals.
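
The Idris slide is not part of the transcript, so here is a rough Python rendering of the definitions discussed in this passage. The class and field names, and the parrot's sentence, are assumptions made for illustration. The two branches of `run_over_animal` mirror the two equations the talk describes: a run-over armadillo is dead with unchanged weight, and a run-over parrot falls silent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Liveness(Enum):
    DEAD = "Dead"
    ALIVE = "Alive"


Weight = int  # the talk treats Weight as a plain synonym for an integer


@dataclass(frozen=True)
class Dillo:
    liveness: Liveness
    weight: Weight


@dataclass(frozen=True)
class Parrot:
    sentence: str
    weight: Weight


# An Animal value is the state of an animal at a given time.
Animal = Union[Dillo, Parrot]

a1: Animal = Dillo(Liveness.ALIVE, 10)
a2: Animal = Dillo(Liveness.DEAD, 12)
a3: Animal = Parrot("Arrr!", 3)  # hypothetical sentence, not given in the talk


def run_over_animal(animal: Animal) -> Animal:
    """Return the state of the animal after it has been run over."""
    if isinstance(animal, Dillo):
        # A new armadillo value: dead, same weight as before.
        return Dillo(Liveness.DEAD, animal.weight)
    if isinstance(animal, Parrot):
        # A run-over parrot says nothing any more.
        return Parrot("", animal.weight)
    raise TypeError(f"not an animal: {animal!r}")
```

Because the dataclasses are frozen, `run_over_animal` cannot modify its argument; it has to build a new value, which matches the "state goes in, state comes out" reading.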
And that means that for the definition of

runOverAnimal, we need what's called
equations. There's two different

equations. And the first equation says
what happens to armadillos when they get

run over. So an armadillo has a liveness
and a weight. Here's something going on

called pattern matching. And the second
equation says when there's a parrot going

on it has a sentence and a weight, and on
the right hand side, you can see, well,

when an armadillo gets run over. Well, all
that means is, the liveness sort of turns

to Dead. We're constructing a new
armadillo object and it's dead and it has

the same weight as before. And the
function, the equation at the bottom says,

well, when we run over a parrot, it turns
really, really quiet. Ok? So, classic

example. Ok so far. We're going to return
to that example at the very end. Right now

it's just to illustrate the language that
we're doing things in and we're going to

do a lot of things without complicated
programs. So, well. So, I'm going to jump

around a little bit. So, one thing. So,
just the other day, two weeks ago, I was

teaching a course on architecture and
somebody said: Well, there's this problem.

I'm building a domain model. I'm putting
the domain model in a database. And, you

know, customer comes in, has new
requirement or somebody comes in, has new

requirements. And that always ends the
same way. I put a new column in the database

and, you know, seven, eight, nine, it just
goes on and on. As the software gets older

and older and older, more columns get
created, the old ones seem a little stale.

And so, yes, well, how can we build models
that are flexible? And so, here's

something completely different, you might
think. So, here's sort of the key to that,

to building flexible models. Does anybody
recognize this? Does anybody associate a

word with this? <i>laughter</i> Very good. So,
you might remember, depending on what

state you went to school in, you might
remember that this is a property called

associativity. Right? And it means that we
can associate either the A and the B first

with the parentheses or the B and the C.
So, and this is, if you take away one

thing from this talk, it's associativity.
Knowing what that is is one of the most

useful things in software development. So,
of course it's just a generic equation, we

really need to be more specific, namely
that we're dealing with numbers and

addition. And you might know that it's not
just addition that's associative. Also,

multiplication, for example, is
associative. So here's a little mathy

stuff there at the beginning. So, you see
that upside down A. That says "for all".

We just say for all. What that means is
"for all A, B and C". And then, this funny

epsilon-shape letter kind of thing, it
means "element of". And then that funky N

means the natural numbers. So all the
numbers from zero, one, two, three, the

whole numbers from zero on up. So, what
that means is, for all natural numbers A,

B and C, the associative property holds
when you add them up. But, well, it says

numbers and addition; it doesn't just hold
for numbers and addition. In fact,

associativity is everywhere around us.
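
The slide itself is not in the transcript. Reconstructed from the spoken description (the upside-down A, the "element of" sign, and the natural numbers), the property for addition, and the analogous one for multiplication mentioned above, would read roughly as follows:

```latex
\forall a, b, c \in \mathbb{N}:\quad (a + b) + c = a + (b + c)

\forall a, b, c \in \mathbb{N}:\quad (a \cdot b) \cdot c = a \cdot (b \cdot c)
```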
Specifically, it's everywhere around us

when we program. So here's another
example. When you're dealing with lists

and that funky, the two pluses that you
see there, they are just list

concatenation. So you concatenate two
lists and well, of course you can

concatenate three lists by just using that
double plus in any order. And that's also

associative. So, it doesn't matter if you
first concatenate the B and the C and then

tack the A onto the front, or if you
concatenate the A and B and tack the C on

at the end. Doesn't matter, you always get
the same result. So lists and

concatenation also have this associative
property. And here's something that I

always find very, very enlightening is
that you can construct images that way.

Well, you don't see it here. So here's an
image. Well, it's from a co-researcher

of mine in functional programming, Brent
Yorgey, and he has a great library out

called diagrams, for constructing diagrams
out of parts. And so this really is what

associativity is about. It's about
operators that construct things out of

parts. And so, as you can see here, well
there's different shapes here, there's

sort of the black rectangles, there's a
different rectangle set, that denote the

towers of Hanoi. We're not really going to
deal with the towers of Hanoi here,

really. The important thing is that the image
consists of several parts. And well, in

normal or sort of in classic object-
oriented programming, when you do

graphics, you have a canvas and you might
draw pixels on that canvas. You know,

might be square shaped or a circle shaped
canvas pixels. But what we're doing here

is, we are treating an image as a data
type and the definition is not important.

What is important is that there are a
couple of functions that construct sort of

simple images. So here's a function that
you might imagine called star and it

constructs stars. And well, you can see up
there there's a type declaration and it

says, well, the star function, it accepts
an integer, it accepts a Mode, whatever

that is, accepts a Color and it produces
an Image. And we can call that star

function with the arguments 200 and Solid
and Gold. So Mode is Solid or Outline. And

then we have a Color and we get Image and
that image is an object. Not particularly

exciting. But, well, we might have another
function called Polygon, Polygon takes two

integers that denote the size of the
polygon and the number of vertices, and

also whether it's an outline or whether
it's solid and a color. And for example,

if we call it with 180, again, that's the
size and 5 we get a five corner polygon

and we get that as an outline and it's in
red. Now, the idea here is that we can

combine. Just as we can combine two
numbers or we can combine two lists, we

can combine two images. Maybe the most
intuitive way of combining two images is

just sticking them beside each other. So
there's a function there called beside.

And it takes an image and it takes another
image and produces an image. Right. And

this is exactly what we're thinking about
when we talk about associativity. We're

talking about a sort of a binary operator
that produces the same thing that went in.

And so, for example, we could stick those
two images next to each other. We could

also imagine an operator called above that
just puts one image above the other image.

And we can combine these two things. Here
it really is important that the same thing

comes out so that it's image goes in,
another image goes in, an image goes out.
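
To make the "image in, image in, image out" idea concrete, here is a minimal Python sketch that uses ASCII art in place of real graphics. The representation (a tuple of text rows) and the helper names are assumptions for illustration only; the library shown in the talk works quite differently.

```python
from typing import Tuple

Image = Tuple[str, ...]  # rows of text; all rows in an image have equal width


def width(img: Image) -> int:
    return len(img[0]) if img else 0


def height(img: Image) -> int:
    return len(img)


def _pad(img: Image, w: int, h: int) -> Image:
    """Pad an image with blanks on the right and bottom to w columns, h rows."""
    rows = [row.ljust(w) for row in img]
    rows += [" " * w] * (h - len(rows))
    return tuple(rows)


def beside(left: Image, right: Image) -> Image:
    """Put two images next to each other; the result is again an Image."""
    h = max(height(left), height(right))
    padded_left = _pad(left, width(left), h)
    padded_right = _pad(right, width(right), h)
    return tuple(a + b for a, b in zip(padded_left, padded_right))


def above(top: Image, bottom: Image) -> Image:
    """Put one image above the other; the result is again an Image."""
    w = max(width(top), width(bottom))
    return _pad(top, w, height(top)) + _pad(bottom, w, height(bottom))


star: Image = ("*",)
box: Image = ("##", "##")

# Because both operations return an Image, their results can be fed
# straight back in, which is how the tiling arrangement is built.
tile = above(beside(star, box), beside(box, star))
print("\n".join(tile))
```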

So we could again call above on the result
of beside and make arrangements. So here's

a tiling arrangement for your bathroom or
something like that. Now, beside and above

are two possible operators and you
might already think about associativity,

but really the more fundamental one is
overlay. You put two images on top of each

other. And so again, overlay has the right
type. An image goes in, another image goes

in, and an image comes out. And if we take
the gold star and the pentagon and put

them on top of each other, then it looks
like this. And we can then formulate an

associativity property. It might not quite
look the same because I wrote overlay in

front rather than between the operands.
We could also write it between. But just

to show you that it's the same idea. So,
it doesn't really matter if we first take

two images, A and B and superimpose those
two and then put those two on top of C or

if we do it in another order. Does that
make sense so far? Ok. No? Do you have a

question?
Anonymous: <i>Mumbling</i>

Mike: Yeah, so ahh, good point, good
point. So this implies that there must be

some kind of, that there's probably some
notion of transparency involved. Yes. Yes,

there is. But then you have associativity.
And really what it means. Very good

question. So, if you think of this image
in terms of the color at certain

coordinates, right, well, you need to
think about how to combine those two

colors that are in the constituent images.
And you can imagine that there also has to

be a combination operation for the colour.
And that also needs to be associative as a

prerequisite for the overlay
operation to be associative. Does that

make sense now? Thank you. Good question.
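
The point raised in this exchange can be sketched in Python. The pixel-grid representation, the use of `None` for "transparent", and the "top colour wins unless transparent" rule are all assumptions chosen for illustration. The sketch also shows a colour combination that is not associative (plain averaging), to suggest why the prerequisite matters.

```python
from itertools import product
from typing import Optional, Tuple

Color = Optional[str]                  # None stands for "transparent"
Image = Tuple[Tuple[Color, ...], ...]  # rows of pixels; all images same size


def over(top: Color, bottom: Color) -> Color:
    """Combine two colours at one coordinate: the top one wins unless transparent."""
    return top if top is not None else bottom


def overlay(top: Image, bottom: Image) -> Image:
    """Overlay two images pixel by pixel using the colour combination `over`."""
    return tuple(
        tuple(over(t, b) for t, b in zip(top_row, bottom_row))
        for top_row, bottom_row in zip(top, bottom)
    )


def is_associative(op, elements) -> bool:
    """Check (a op b) op c == a op (b op c) for every triple drawn from elements."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )


def average(c1: float, c2: float) -> float:
    """A tempting but non-associative way of mixing two grey values."""
    return (c1 + c2) / 2


star: Image = ((None, "gold"), ("gold", None))
pentagon: Image = (("red", "red"), (None, None))
background: Image = (("white", "white"), ("white", "white"))

print(is_associative(over, [None, "gold", "red"]))            # colour level
print(is_associative(overlay, [star, pentagon, background]))  # image level
print(is_associative(average, [0.0, 1.0]))                    # fails
```

With averaging, `average(average(0, 0), 1)` is 0.5 while `average(0, average(0, 1))` is 0.25, so an overlay built on it would depend on how the parentheses are placed.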
Great question. So anyway, so since this

associativity property is something that
is not just restricted to numbers, as we

may have learned in school, it really
makes sense to get. And that means that

when we talk about associativity, we
always have to name two things. We have to

say what set we're operating on and what
the operation is. And the combination of

those two things has a name in mathematics
and it's not the best name, but it's

called a semigroup. Right. And, but, you
know, if you drop it in certain circles,

they'll think that you're an expert on
mathematics, you might try that. So, just

to go over that: So, you have some set
S, and that S might be Image, it might be

the natural numbers or something like
that. And we have an operation that I'm

just gonna call circle here, then take any
a, b and c from that set S. We can use

circle as an operator and we have that
associativity property and for that circle

you can put in overlay, you can put in
beside, you can put in above, you can put

in + you can put in times or you can put
in the list concatenation operator, the

++. Okay? And that associativity is great.
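
As a small Python sketch of "a set plus an operation": the helper below checks the associativity law on a finite sample of elements. That is evidence, not a proof, and the function name and sample values are assumptions for illustration. Subtraction is included as an operation that does not give a semigroup.

```python
from itertools import product
from operator import add, mul, sub


def is_associative(op, elements) -> bool:
    """Check (a op b) op c == a op (b op c) for all triples from a finite sample."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )


numbers = range(0, 6)
lists = [[], [1], [2, 3], [4, 5, 6]]

print(is_associative(add, numbers))  # numbers with +
print(is_associative(mul, numbers))  # numbers with *
print(is_associative(add, lists))    # lists with concatenation (+ on Python lists)
print(is_associative(sub, numbers))  # subtraction: (1 - 1) - 1 != 1 - (1 - 1)
```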
It's really my favorite property because

it means when we have a whole lot of
things that we combine, we can

parenthesise in any way we want. We will
get the same result no matter which way we

parenthesise them. And that really means,
we can leave out the parentheses when we

write an expression that involves only the
circle operator if it's associative, we

can just leave out all the parentheses
because the parentheses don't matter. And

that makes it well, that makes it
instantly easier to read, I think. Also it

has practical uses. So if you do big data
processing associativity means that if you

have large datasets that span several
machines or several hard drives or several

data sources, and you're combining them
and you have an associative combination

operation, it just means you can rearrange
that combination operation according to

the load in your compute cluster. And that
makes it a very useful property when

you're doing big data processing in sort
of MapReduce based frameworks. But, I

mean, that's a practical application, but
I think it's much more useful,

associativity is much more useful when you
use it for designing your domain model.
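
The big-data argument above can be sketched in a few lines of Python. The function names and the chunking scheme are assumptions for illustration; real MapReduce-style frameworks distribute the chunks across machines, which this sketch does not attempt. Both functions expect a non-empty list.

```python
from functools import reduce


def combine_sequential(op, items):
    """Combine everything left to right: ((a op b) op c) op d ..."""
    return reduce(op, items)


def combine_chunked(op, items, chunk_size):
    """Combine each chunk on its own, then combine the partial results.

    This regroups the parentheses, so it only agrees with the sequential
    version in general if `op` is associative.
    """
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    partials = [reduce(op, chunk) for chunk in chunks]
    return reduce(op, partials)


def concat(a, b):
    return a + b


def minus(a, b):
    return a - b


words = ["a", "b", "c", "d", "e", "f", "g"]
print(combine_sequential(concat, words) == combine_chunked(concat, words, 3))

numbers = [10, 1, 2, 3]
print(combine_sequential(minus, numbers) == combine_chunked(minus, numbers, 2))
```

For string concatenation the grouping does not matter. For subtraction the sequential result is 4, while the chunked one is (10 - 1) - (2 - 3) = 10, so the work cannot safely be regrouped.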

And I talked in the beginning how, well,
you want to avoid always adding more

database columns. And one way of doing
that is to view your domain model, not as

something that has more and more
properties, but your domain model as

building blocks that you combine into
larger building blocks the same way that

we combine images from simpler images. So
here's one of the great papers from

functional programming, one of my two or
three favorites from Brent Yorgey. And

it's called "Something Something: Theme
and Variations". And you can see that it

is about images. And these images get
superimposed with an operation that is

just like overlay and that is, that title
is eminently googleable. Now, it has a

funny word there. It says, it doesn't say
semigroup, could say "Semigroups: Theme and

Variations", it says "Monoids: Theme and
Variations". And a monoid, well, it's also

not something, even though it sounds kind
of fancy, it's actually not much more

complicated than a semigroup. It's a
semigroup. And also the semigroup has a

special element called the neutral
element. And whenever we combine something

with a neutral element, it doesn't matter
if we do it in front or at the back, we

get the same thing back. So, of course,
the neutral element with respect to

numbers and addition would be zero. The
neutral element with respect to lists and

concatenation would be the empty list.
I always hear several voices. That's

wonderful. Thank you. And the same thing
for the overlay and beside and above, you

can imagine that you have just an empty
image that has only, that consists only of

transparency, that can work as the neutral
element. So all of these things that I

showed you that are associative, they're
not just associative. They're not just

semigroups, they're also monoids. And so,
as I said, as long as you remember

associativity, that's the important thing.
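
A monoid can be sketched in Python as an operation bundled with its neutral element. The `Monoid` class, its method names, and the dictionary-based image representation (missing coordinates are transparent, so the empty dictionary is the empty image) are assumptions for illustration.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class Monoid:
    combine: Callable[[Any, Any], Any]  # the associative operation
    neutral: Any                        # the neutral element

    def concat(self, items):
        """Combine any number of items; with no items, the result is neutral."""
        return reduce(self.combine, items, self.neutral)


def has_neutral(m: Monoid, samples) -> bool:
    """Check that combining with the neutral element, in front or at the back,
    gives the same thing back."""
    return all(
        m.combine(m.neutral, x) == x == m.combine(x, m.neutral) for x in samples
    )


addition = Monoid(lambda a, b: a + b, 0)
list_concat = Monoid(lambda a, b: a + b, [])
# Images as {coordinate: colour}; the top image wins where both have a pixel.
overlay = Monoid(lambda top, bottom: {**bottom, **top}, {})

print(addition.concat([1, 2, 3]))
print(list_concat.concat([[1], [2, 3]]))
print(overlay.concat([{(0, 0): "gold"}, {(0, 0): "red", (1, 0): "red"}]))
```

One practical benefit of the neutral element shows up in `concat`: combining an empty collection has a well-defined answer.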
But often you also find a monoid, and

monoids in the wild they're just
everywhere. We've seen them for numbers

and lists and images, music forms a
natural monoid. You can describe

musical structure with monoid operations.
You can treat animations, the time axis.

You can define monoidal combination of
animations. A famous example in functional

programming is with financial contracts.
If you were here last year for a talk of

mine, we talked about semiconductor-
fabrication routes, which sounds very

concrete, but also they form a monoid. The
properties themselves that we'll see for a

monoid are all kinds of things. They're
everywhere around you. And these are

really the key towards making flexible
domain models because in almost any domain

model you can find a monoid just by
looking for building blocks and for ways

of combining those building blocks into
larger building blocks. So let me get

back. So I said, well, you can use
associativity or you can use this monoid

thing to guide your design. And I haven't
really made that concrete yet. And so I

stole a couple of pictures from Brent's
paper. So you remember the beside and the

above operations. And those are fine for
arranging things sort of in the vertical

and the horizontal axis. The way that they
work is, they make, they put a bounding

box around every picture and then they
arrange the bounding boxes either beside

each other or above each other. So it's a
slightly more involved thought. And that

works great when you're, when, you know,
your picture is, happens to be a square

that's aligned with the axes. It doesn't
work so well if your picture is rotated,

right. Because the bounding box, the
bounding box then is too big. And if you

want to attach anything about, just about
in any direction, then there's going to be

a gap in your picture. And so beside and
above are not particularly good operations

as the basis for an image library. The
overlay operation is much better. But that

leaves open the question how you can
arrange pictures, several pictures so that

they are beside each other or that they
just touch. And Brent came up with this

idea of an envelope, technical idea. So
the idea is that, well, if you give me, so

the red dot there, that's the origin. If
you give me a vector starting at the

origin, I will tell you how far you have
to go along that vector so that I can draw

that blue perpendicular line that's just
outside the shape. And that's called an

envelope. And envelopes are wonderful. So
if you ship each picture not just with

sort of the visuals that you see, but also
with a function that describes the

envelope, then you can use that envelope
to arrange pictures both in the horizontal

and the vertical, but also in the diagonal
by just drawing vectors so that they

touch. So, that's a slightly more
complicated idea. Does it make sense? And

Brent goes through the motions of using
that inspiration from the monoid that he

is getting. He's saying "Everything must
be a monoid! Absolutely.", and uses that

as a guiding principle through the
library. So I'm not going to go into

technical detail on how that works, but
it's a very pleasing paper to read on

that. And it results in a beautiful
library that's great fun to use. So that

means, though, that you also have to find
a monoidal combination operation for the

envelopes. You can't just, we've already
seen how we can combine the pictures

themselves, but we also need to combine
the envelopes. And fortunately, that's

pretty easy. If somebody sets a vector in
a certain direction, then that envelope is

just a maximum. Those two pictures, right,
if you combine that ellipse and that

square, you can see that I'm just going to
have to go to the maximum of those two

numbers in order to just be outside the
composite shape that comes out of

superimposing those two things. So that's
great. Now, I sort of introduced these

properties as a mathematical thing, right.
I said, well, there's this fancy, fancy

upside down operator says for all images
and we might say for all images. Now, we

can also formulate these properties as
code. And that's really where additional

magic is. So, for example, the
associativity property, well, there's not

much of a difference except that the
image1 and image2, they are now in

typewriter font. So we could put those in
the program. But there's still that

mathematical stuff on top. But in a
functional language, in a lot of other

languages too by now, we could also put
the top line and translate that into code.

And it might look like this. So that's
what it looks like in Idris. So, it's not

quite the same, but maybe we recognize the
structure. So, we say, well, there's a

property called and the property is just
called overlayAssociative. So we give it a

name. So, Idris is an ASCII language,
still so, primarily. So, we say just

forAll there instead of the upside down
all. And then it says arbTriple arbImage

arbImage arbImage. And that means for all
arbitrary triples of an arbitrary image and

another arbitrary image and another
arbitrary image, so, triples, three

things. And we're going to call those
three images image1, image2 and image3.

That funky backslash there, that's a
Lambda in Idris. And then the overlay prop

means that, well, if we overlay one way
and we overlay another way, according to

associativity, we get the same result. Do
you recognize that structure? Right. That

it's the same thing. So that we're writing
structurally the same thing that we wrote

in mathematical notation. Now as a piece
of code. And now the great thing is once

we've written it as a piece of code, we
can manipulate it in a program.
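
The Idris code on the slide is not in the transcript. As a hedged Python sketch of the same structure, a property can be an ordinary value made of a generator and a predicate. The names `for_all`, `arb_triple`, `arb_image` and `overlay_associative` echo the ones mentioned in the talk, but their definitions here, the dictionary-based images, and the `holds_on_samples` helper are assumptions for illustration.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Image = Dict[Tuple[int, int], str]     # missing coordinates are transparent
Gen = Callable[[random.Random], Any]   # a generator draws a random value


def overlay(top: Image, bottom: Image) -> Image:
    return {**bottom, **top}


def arb_image(rng: random.Random) -> Image:
    """An arbitrary small image on a 3x3 grid."""
    return {
        (x, y): rng.choice(["gold", "red"])
        for x in range(3)
        for y in range(3)
        if rng.random() < 0.5
    }


def arb_triple(g1: Gen, g2: Gen, g3: Gen) -> Gen:
    """A generator for triples, built from three generators."""
    return lambda rng: (g1(rng), g2(rng), g3(rng))


@dataclass(frozen=True)
class Property:
    gen: Gen
    holds: Callable[[Any], bool]


def for_all(gen: Gen, holds: Callable[[Any], bool]) -> Property:
    return Property(gen, holds)


overlay_associative = for_all(
    arb_triple(arb_image, arb_image, arb_image),
    lambda t: overlay(overlay(t[0], t[1]), t[2]) == overlay(t[0], overlay(t[1], t[2])),
)


def holds_on_samples(prop: Property, n: int = 100, seed: int = 0) -> bool:
    """One way of manipulating a property: evaluate it on n generated inputs."""
    rng = random.Random(seed)
    return all(prop.holds(prop.gen(rng)) for _ in range(n))


print(holds_on_samples(overlay_associative))
```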

So, one way, there are different ways
of manipulating it. But one of the most

useful ones is, again by another great
researcher in functional programming,

John Hughes, came up with something called
QuickCheck. So if there's another thing

you take away from this talk is: google
QuickCheck. And whatever language you use,

it doesn't have to be Idris. In fact, I
had to hack together a QuickCheck for this

talk, but basically any other language is
going to have a QuickCheck, whether that

language be a functional language or
whether it's Java or Python or R or

something like that. You can always get a
QuickCheck for that. And I'm going to try

and demonstrate this QuickCheck thing not
by thinking about the design so much, but

by demonstrating a property of something
that's very error prone. So, here's this

idea, we want to have a representation for
sets of natural numbers. And we're going

to represent those sets of natural numbers
by a list of intervals. So, by a list of

ranges, if you will, between two numbers.
Now, I'll try to explain that. So, up

there at the top, it has a type
definition. It says, ISet, interval set,

is a type. And that type is defined to be
just a synonym for a list of pairs of

natural numbers. That's what those round
parentheses with a comma in the middle

mean. OK. And just to see what that means
is, there's a function there. I haven't,

I've elided the definition, but what's
important about it is its type signature.

It takes an interval set and it produces a
list of all of the members of that set.

And you can see sort of a demo thing here
that I typed in before the talk. So, if I

apply iToList so that, the brackets there
they just mean the list, and we feed in a

list of intervals and those intervals are
from zero to three, from five to seven,

and from nine to ten, respectively.
They're all inclusive. And you can see

down there is a list of all of the
members. So, the first interval is from 0

to 3. So, it has the numbers 0, 1, 2 and
3. The next one goes from five to seven.

So it has the three numbers 5, 6, 7. And
the last one goes from nine to ten. So it

has the two numbers, 9 and 10 there. Does
that make any sense again? Slightly more

complicated example. So let's see. So, of
course, well, not of course, but the way

we want to do it, the way I want to do it
is, I want to have the interval set

structured in a certain way. I don't just
want <i>any</i> list of any pair of numbers to

denote an interval set. And therefore,
here is a function that describes what it

means to be a valid interval set. Right.
So, for example, we don't really, in order

to have efficient processing, we don't
really want two intervals in one interval

set to overlap. Right. We want them to be
disjoint and we also want them to be

ordered so we can have efficient
operations for certain things. Right. And

so, let's go through this. So, there is an
isValid function that just tells you

whether that interval set is valid or not.
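
For reference while following the three cases discussed in this passage, here is how such a validity check might look in Python. It is a sketch under the assumption that "a gap of at least one" means two consecutive intervals may neither overlap nor sit directly next to each other; the Idris original is not in the transcript.

```python
from typing import List, Tuple

ISet = List[Tuple[int, int]]


def is_valid(iset: ISet) -> bool:
    """Intervals must be well-formed, ordered, and separated by a gap."""
    # Case 1: the empty interval set is valid.
    if not iset:
        return True
    (lo1, hi1), rest = iset[0], iset[1:]
    # Case 2: a single interval is valid if low does not come after high.
    if not rest:
        return lo1 <= hi1
    # Case 3: at least two intervals. The first must be well-formed, there
    # must be a gap before the second, and the rest (which still includes
    # the second interval) must be valid too.
    lo2, _hi2 = rest[0]
    return lo1 <= hi1 and hi1 + 1 < lo2 and is_valid(rest)


print(is_valid([(0, 3), (5, 7), (9, 10)]))  # ordered, with gaps
print(is_valid([(0, 3), (4, 7)]))           # adjacent: should be one interval
```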
It says, well, if that set, and there's

three different cases here, which is why
there's three different equations, and the

first equation says the empty interval
set, the empty brackets mean the empty

list, and if the intervals, the list
representing the interval set is empty,

then we're going to say True. Empty set -
perfectly fine. The next case says, our

interval set consists only of a single
interval and that single interval goes

from low to high. Well, we kind of
interpret that there, but, and, well, that

interval set is valid, if low comes in
front of high. Right, they shouldn't be

the other way around. So, does that make
sense? Somebody in, can you nod at the

back, a little bit? You're still there?
OK. Thank you. Great. So, then it becomes

a little bit more complicated and it says,
well, this is the third case, when there's

at least two intervals in the interval
set. And those two intervals are, the

first one goes from lo1 to hi1. The second
one goes from lo2 to hi2. So, those ::,

they separate the first element of a list
from the rest. And then there's the rest

of the list. And it says, well, again, we
want the interval to be ordered so that

the lower numbers are on the left. That's
where it says lo1 is less or equal hi1.

And then it says, well, that there should
be a gap between two consecutive

intervals. Otherwise, they should be one
interval, which is why the high from one

interval should be separated from the low
of the next interval by at least one. And

then we're going to say, well, also we
want all the rest of the list, including

lo2 and hi2 to be valid too. So far so
good? OK, so this is probably, well, the

second most complicated piece of code. So,
anyway, so, here's, so, we might imagine a

union function. And the union function,
guess what, it forms a monoid, with

respect to interval sets. So, it takes,
two interval sets go in and another one

comes out. And if you've written that kind
of thing before, you might notice it's

probably a little tricky with that fancy
validity condition that's there. So how

can we get this right? Well, what we do is
we write down properties. Of course, we

could write down associativity. I'll leave
that as an exercise. Another one is just

very simple. Just a very simple property
that says for all pairs of two arbitrary

interval sets, we want the union of those
two interval sets to be valid, a valid

data structure. We want the union function
to preserve validity. OK? Makes sense? So

here's another property that says, well, I
already gave you this function or I told

you that there is this function iToList,
which just gives us a list of elements of

an interval set. And what we can do is, we
can use sort of that representation,

that's also a representation for sets. We
can use that representation sort of as a

model and say, well, if we take the
unions, you see there for all pairs,

again, of arbitrary interval sets, we take
the union. It says iUnion, iset1 and

iset2. And we convert that to a list. And,
we could also do, we could instead convert

each individual set to a list and then
just merge those two lists. And that

should yield the same result. So, in a
way, we're just giving a very simple model

for our interval sets, right, and that
would, so those two criteria would be kind

of nice to have in order to get our
implementation correct. And I already got

started before the talk on this. Looks
like this. No. Doesn't look like this.

We'll get to that later. But like this.
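
The two properties just described can be written down as plain Python predicates. This is a sketch, not the code from the talk: the properties take the union function as a parameter so they can be tried against different implementations, and `merge_sorted`, `from_list` and `model_union` are hypothetical helpers that go through the simple list model.

```python
from typing import Callable, List, Tuple

ISet = List[Tuple[int, int]]
Union = Callable[[ISet, ISet], ISet]


def i_to_list(iset: ISet) -> List[int]:
    return [n for lo, hi in iset for n in range(lo, hi + 1)]


def is_valid(iset: ISet) -> bool:
    if not iset:
        return True
    (lo1, hi1), rest = iset[0], iset[1:]
    if not rest:
        return lo1 <= hi1
    return lo1 <= hi1 and hi1 + 1 < rest[0][0] and is_valid(rest)


def merge_sorted(xs: List[int], ys: List[int]) -> List[int]:
    """The list model of union: sorted, without duplicates (an assumption)."""
    return sorted(set(xs) | set(ys))


def prop_union_valid(union: Union, s1: ISet, s2: ISet) -> bool:
    """The union of two interval sets is itself a valid interval set."""
    return is_valid(union(s1, s2))


def prop_union_correct(union: Union, s1: ISet, s2: ISet) -> bool:
    """Union then list equals list each set, then merge the lists."""
    return i_to_list(union(s1, s2)) == merge_sorted(i_to_list(s1), i_to_list(s2))


def from_list(ns: List[int]) -> ISet:
    """Rebuild an interval set from a sorted, duplicate-free list."""
    out: ISet = []
    for n in ns:
        if out and out[-1][1] + 1 == n:
            out[-1] = (out[-1][0], n)   # extend the current interval
        else:
            out.append((n, n))          # start a new interval
    return out


def model_union(s1: ISet, s2: ISet) -> ISet:
    """A slow but simple union that goes through the list model."""
    return from_list(merge_sorted(i_to_list(s1), i_to_list(s2)))
```

The two properties are independent. Plain list concatenation, for example, gives the right members for `[(0, 3)]` and `[(4, 6)]` but produces two adjacent intervals, so it satisfies `prop_union_correct` on that input while violating `prop_union_valid`.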
So, here's what I came up with. So, you

see there is that, well, there's all this
other code there, ignore that. But there

is iUnion says ISet -> ISet -> ISet, do
you see that? And then, there's two

equations that say, well, the first set is
empty, then I'm just going to give you the

second one. And if the second one is
empty, I'm just going to give you the

first one. Right? Classic things when you
have union or concatenation operations or

something like that. And now you can see
the third case. It gets tricky, right?

Again, you don't need, I mean, main thing
is you need to understand it's tricky.

Well, the third one is such that, well,
says so that both have at least one

element and that element is in the
interval lo1 and hi1 in the first case and

lo2 and hi2 in the second case. And then
there's the rest. And I already put in a

little bit of code, and I said, well if
lo1 comes after hi1 (means hi2), then we

want to start with lo2 to hi2 and then
continue with the union. In the other

case, if lo2 comes after hi2 (means hi1),
then we're gonna start with lo1 and hi1.

And in the other case, it means, that no
interval comes before the other, and

therefore we need to merge the two
intervals at the beginning. Does that make

remote sense? Right. Don't worry. We'll
get back on solid track. So, we just take

the minimum of those two intervals and
maximum of those two intervals and we do

this. Now, the great thing is, I told you
about this tool by John Hughes called

QuickCheck. And the great thing is, we can
load this into Idris. And then here comes

a REPL, and we can say, I hope I'm doing
this right. So, we want QuickCheck, and we

want, what was it called? It was called
prop_unionCorrect. I hope I'm doing this

right. And, well, very small font. But you
can see here it says "100 tests". And that

is what QuickCheck does: it takes your
code version of the property and

automatically generates a lot of tests for
them. And that is super effective at

weeding out bugs. So it says, well, the
thing that you wrote is correct. It always

produces interval sets that when you take
the list, it gives you the right result.
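A minimal sketch of the idea, in Python rather than Idris and not taken from the talk: a QuickCheck-style runner generates random inputs, evaluates the property on each, and stops at the first counterexample. The names `quick_check`, `gen_two_lists` and the two sample properties are invented for illustration, and the report strings only imitate what QuickCheck prints.

```python
import random

def quick_check(prop, gen, runs=100, seed=None):
    """Evaluate prop on `runs` randomly generated inputs and report the outcome."""
    rng = random.Random(seed)
    for n in range(1, runs + 1):
        args = gen(rng)
        if not prop(*args):
            # The failing input is the valuable part: it can be replayed by hand.
            return f"Falsifiable, after {n} tests: {args!r}"
    return f"OK, passed {runs} tests"

def gen_two_lists(rng):
    """Hypothetical generator: two short lists of small integers."""
    def one():
        return [rng.randint(-5, 5) for _ in range(rng.randint(0, 6))]
    return (one(), one())

def prop_reverse_concat(xs, ys):
    # Holds for all lists: reversing a concatenation swaps the two parts.
    return list(reversed(xs + ys)) == list(reversed(ys)) + list(reversed(xs))

def prop_concat_commutes(xs, ys):
    # Deliberately wrong: list concatenation is not commutative.
    return xs + ys == ys + xs

print(quick_check(prop_reverse_concat, gen_two_lists))
print(quick_check(prop_concat_commutes, gen_two_lists))
```

Because the inputs are random, the counterexample reported for the second property differs from run to run unless a seed is fixed.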

But there was that other criterion called
unionValid. And there it says, and this is

really the better part, of course, of
QuickCheck is, when it fails, it says it's

falsifiable. It says there is a counter
example. And so, here it says, I did nine

tests, I generated nine random tests, and
I found one where the result is not valid.

And the great thing is that we can then go
and cut and paste this example. So I could

say iUnion, this, remove the comma in the
middle, and call this. And well, what

happens here is, what we can see is, we
can see 2 and 4, 1 and 1, and 3 and 5, and

what's not valid about. So, by the way,
this is randomized. So, this always goes

differently. So I have to look at it, too.
So, then it says, well, those two

intervals, they should really just be
merged and they should just be one

interval. Right? And so, it didn't do that
correctly. And the reason for that, maybe

you saw it. So, and, what happened is that
it ran into one of those two cases here

where it says if lo1 greater than hi2 or
lo2 greater than hi1. Remember that I told

you there needs to be a gap of at least
one between them. Remember? And here's an

off-by-one error that says, well. So this
says, they can, lo1 greater than hi2 says

they can still be right next to each
other. Right? And this is what happened

here. We need to make sure that there is
that gap in here. So, I can fix it like

this. Loaded again. Oh, no. There's still
a counterexample. So, and we can try that

out, so, and that's great. We get test
cases that sort of show where the bugs

are. And in this case, well, what happened
here? They still overlap. And what

happened here? So can you see it? So, you
can see that the first two intervals, they

must run into that last case. Right.
Because they overlap. Zero is the interval

from 0 to 3 and the interval from 0 to 5.
They overlap. So we need to get to that

case. And so it merges them. And then it
went and somehow didn't merge it with

the 6 and the 7 that's there. And, so,
well, if you look at it. So it must have

done this. And what it did is, it then
went on with the rest there. Let's have

one more look. What actually happened? So
there it is. So, it merged those and then

you can see that it went into a symmetry
problem here. Well, maybe you don't see.

But, you know, this is tricky stuff. I
couldn't do this by myself. So you can see

here that it just tacks the result onto
iset1Rest, whereas the maximum of hi1 and

hi2 could, might violate the consistency
criteria if it's the wrong one, and then

it runs into one of the other cases. Now
I've never seen this tricky one. Does it

make sense? But, can you see that it
should be symmetrical? The last one. Can

you see it? OK, so we'll try and make it
symmetrical. Do it like this. So we'll

say, well, if so, this only works. So if
hi1 is less than hi2. So we really need to

make sure, then it is perfectly. And then
the maximum of those two numbers is hi1.

Does that make sense? And so the max of
those two numbers is hi1 and then it's

perfectly valid to tack it onto iset1Rest.
In the other case, hi2 is greater and we

need to go and do something different and
rip this out here. Stick it in front here

and then. And then. And now it's
symmetrical. OK. So, load this. And, ahh!

It has passed the test. OK, live great.
<i>applause</i>

Thank you. I did practice getting it
correct, right. But you can, you know,

this kind of stuff. It always gets me. I
mean, you know, with old age especially,

this kind of stuff, it always drives the
sweat on my forehead, right? You know,

there's off-by-one. There is, you know, I
don't know how many cases there need to

be. And QuickCheck is the kind of thing
that weeds out the bugs. And even though

it weeds out the bugs in a different order
each time, it always weeds them all out.
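The Idris source of the fixed iUnion is not part of this transcript, so the following is only a reconstruction of the idea in Python under stated assumptions: an interval set is a sorted list of `(lo, hi)` pairs with a gap of at least one number between neighbours. The `gap` parameter is a teaching device that does not appear in the talk: `gap=1` is the corrected comparison, `gap=0` reproduces the off-by-one version. The names `i_union`, `is_valid`, `members`, `gen_iset` and `check` are hypothetical.

```python
import random

def is_valid(iset):
    """Intervals are non-empty, sorted, and separated by a gap of at least one."""
    ordered = all(lo <= hi for lo, hi in iset)
    gapped = all(hi_a + 1 < lo_b for (_, hi_a), (lo_b, _) in zip(iset, iset[1:]))
    return ordered and gapped

def members(iset):
    """The plain set of integers that an interval set stands for."""
    return {n for lo, hi in iset for n in range(lo, hi + 1)}

def i_union(s1, s2, gap=1):
    if not s1:
        return s2
    if not s2:
        return s1
    (lo1, hi1), rest1 = s1[0], s1[1:]
    (lo2, hi2), rest2 = s2[0], s2[1:]
    if lo1 > hi2 + gap:              # second interval lies strictly before the first
        return [(lo2, hi2)] + i_union(s1, rest2, gap)
    if lo2 > hi1 + gap:              # first interval lies strictly before the second
        return [(lo1, hi1)] + i_union(rest1, s2, gap)
    lo = min(lo1, lo2)               # the intervals touch or overlap: merge them
    if hi1 >= hi2:                   # symmetric handling: keep the merged interval
        return i_union([(lo, hi1)] + rest1, rest2, gap)   # on the side it is valid for
    return i_union(rest1, [(lo, hi2)] + rest2, gap)

def gen_iset(rng):
    """Random valid interval set over small numbers."""
    iset, pos = [], rng.randint(0, 3)
    for _ in range(rng.randint(0, 4)):
        hi = pos + rng.randint(0, 3)
        iset.append((pos, hi))
        pos = hi + 2 + rng.randint(0, 3)   # next lo leaves at least one number out
    return iset

def prop_union_correct(a, b):
    return members(i_union(a, b)) == members(a) | members(b)

def prop_union_valid(a, b):
    return is_valid(i_union(a, b))

def check(prop, runs=100, seed=0):
    """Return None if all runs pass, otherwise the first counterexample."""
    rng = random.Random(seed)
    for _ in range(runs):
        a, b = gen_iset(rng), gen_iset(rng)
        if not prop(a, b):
            return (a, b)
    return None
```

With `gap=0`, `i_union([(1, 2)], [(3, 4)], gap=0)` keeps two adjacent intervals and fails `is_valid`, which is the kind of counterexample the demo produced first.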

OK. So it's a great tool. Now, I recommend
that you try that. It generates tests from

properties. OK, where are we? So let me
give you a couple of real-world

examples. So if you're using X windows,
there's a tiling window

manager, xmonad. It's already a couple of
years old and they don't do much

development on it anymore. That's because
it's correct. Right. <i>laughter</i> Right. And

why is it correct? Well, it's because they
wrote down a lot of properties for the

geometry and the tiling algorithms and
verified them using QuickCheck. And so I

sort of loosely translated. So, Don
Stewart, one of the authors of xmonad

graciously wrote a couple of blog posts on
a simplified version of xmonad and I

translated them into Idris. So, here's a
very simple idea of just a stacking window

manager. So, it doesn't do geometry, it
just has stacks of windows and it has

several workspaces. In each workspace is a
stack of windows. So here's a data type

called a StackSet, it's parameterized by a
type called window. We'll see later why

there's a type parameter and why it just
doesn't say what the windows are. And then

it says there's a constructor StackSet
and there's two fields in there. One is

called "current", that's the number of the
workspace that's currently

active. And then there's "stacks",
which is a map from the number of the

workspace to the stack of, to the list of
windows that sit in that workspace. Again,

so here, really the technicalities are
not particularly important, but there is a

bunch of operations that operate on this
window manager configuration. And again,

here, really the details aren't important.
So you could create an empty stack set.

You could say, well, you know, I have the
number of a window that I want to get to

the front. And please make me, please
rotate me, the stack set around so that I

can see it. "peek" means, you know, maybe
I can get the topmost window that the user

is currently looking at. "rotate" means
I'm just going to rotate the workspaces

around in either left or right direction.
That's what that ordering argument is for. "push"

is, I push a new window onto the current
workspace. "insert" means insert a window

into one of the other workspaces. "delete"
means I delete a window. "shift" means,

also means I shift something with the
windows. Not really important what they

do. But you can imagine, again just as we
did with the interval sets, a validity

criterion or an invariant that should hold
for these operations. And it's very

simple. Well, it says well, if you have a
stack set with some windows in it, I'm

just going to tell you whether that stack
set is consistent. And by doing that, I'm

just going to say, well, the current, the
number of the current stack

should not be higher than the number of
window stacks that there are. Right. So,

the number of stacks that there are. And
the other one, that just says a window

should not be in several of the
workspaces. Right? And then I can go and

maybe with this definition, all
those function definitions aren't very

complicated. But, I can go and write a
whole bunch of properties. And if you just

understand, well, maybe the second one,
"prop_view_I", you understand all of them.

It just says, well, for all pairs of a
natural number and a stack set that are a

"stackIndex" and "stackSet", I want, if I
call the "view" function, which is one of

the operations, I want the view function
to produce a consistent stack set. And

then it goes on to do all of that for all
the other ones. At the bottom here, you

can see some prerequisites that need to
hold for the property so that invariant

only needs to hold if the window, if the
number of the window is actually smaller

than the size of the stack set. Otherwise,
I think the function just returns what

would go with it, what went in there. So
that's a very, that's just a very

efficient way to invent properties, to
think of some invariant that shall hold in

your data structure. And if you know
Idris, you can sometimes encode that in

the types, but often that's kind of
tedious. And you can just write it down as

a property and then have QuickCheck check
it for you. And it's not particularly

exciting for the simple definition, but
you can imagine that the actual definition

when you have tiling window management
going on is much more complicated than the

one that you just saw. But you can keep
those same properties, right? There still

needs to be some consistency invariant
that, if you have tilings, the windows

don't overlap, and things like that. That
should be obvious. Write those properties

down, check them using QuickCheck and that
will weed out a lot of the bugs.
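To make the invariant idea concrete, here is a rough Python sketch of the simplified stacking window manager; the talk's version is in Idris and is not reproduced in this transcript. Only `view` and `push` are modelled, the behaviour of `push` for a window that is already managed (remove it first, then put it on top) is an assumption, and the helper names `gen_stack_set` and `check` are invented.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class StackSet:
    current: int   # number of the active workspace
    stacks: dict   # workspace number -> list of windows, topmost first

def empty(n):
    return StackSet(0, {i: [] for i in range(n)})

def view(i, s):
    # Switching to a workspace that does not exist leaves the stack set unchanged.
    return StackSet(i, s.stacks) if i < len(s.stacks) else s

def push(w, s):
    # Assumed behaviour: drop w wherever it is, then put it on top of the current stack.
    stacks = {i: [x for x in ws if x != w] for i, ws in s.stacks.items()}
    stacks[s.current] = [w] + stacks[s.current]
    return StackSet(s.current, stacks)

def invariant(s):
    """current names an existing workspace, and no window appears twice."""
    windows = [w for ws in s.stacks.values() for w in ws]
    return 0 <= s.current < len(s.stacks) and len(windows) == len(set(windows))

def gen_stack_set(rng):
    """Build a random stack set by applying random operations to an empty one."""
    s = empty(rng.randint(1, 4))
    for _ in range(rng.randint(0, 10)):
        if rng.random() < 0.5:
            s = push(rng.randint(0, 9), s)
        else:
            s = view(rng.randint(0, 5), s)
    return s

def prop_view_I(i, s):
    return invariant(view(i, s))

def prop_push_I(w, s):
    return invariant(push(w, s))

def check(prop, runs=100, seed=0):
    """Return None if all runs pass, otherwise the first counterexample."""
    rng = random.Random(seed)
    for _ in range(runs):
        arg, s = rng.randint(0, 9), gen_stack_set(rng)
        if not prop(arg, s):
            return (arg, s)
    return None
```

Each property has the same shape: apply one operation to an arbitrary stack set and ask whether the invariant still holds. That shape carries over unchanged when the implementation behind the operations becomes more complicated.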

Here's an example from our
practice. We, couple months ago, we were

tasked with migrating a giant Visual Basic
6 application. It had a password checking

function there. You can see here a Visual
Basic 6 type signature. And the property

that we wrote was, well, if we create the
hash from the password and we compare it

with the hash that's in the database, then
they should all come out the same. And to

our surprise, that function, that test,
that property, failed when we ran it with

QuickCheck and we had to correct it
because that password hash is restricted

to 11 characters by some restriction in
the database schema. And so that means

that you can use QuickCheck not just to
sort of check the correctness of things

that you already know, but to actually
develop a model for what goes on in your

software, which you don't always know very
well. So that's what we did there. Another

example is, we wrote, for a large
industrial client, we needed to write a

synchronization application. So when you
had two mobile devices and they would sort

of meet as strangers, they would exchange
data and they all needed to look at the

same sort of device configuration data.
And we didn't want them to exchange all

the data every single time. We just wanted
to exchange them, the data blocks that the

other side was missing. And again, there's
great algorithms for this based on Merkle

trees. They're pretty complicated. You
have to do a lot of bit fiddling with

that. But fortunately, the property for
that is pretty easy to write. So here's

the property that says, well, so the
synchronization algorithm works on sets of

blocks, whatever a block is. So you can
see the property here for all pairs of

sets of blocks and more sets of blocks. So
they're called bs1 and bs2. Block set one

and block set two. What we can do is, we
want, if we union those two, then we get

all the blocks in the system. We call that
all, or we can call the synchronization

algorithm and that will give us two new
block sets, block set bs1' and bs2'. And

those block sets are the ones that get
transferred to the other side. OK. And the

criterion here just says if we take the
ones that we have, if we union them with

the ones that we get, we should get all of
them. That should be all of them. And that

should be the same for both sides. And
also, we want the algorithm to be

efficient so we don't want it to transfer
blocks. So we want to make sure that the

blocks that we have and the blocks that we
get are disjoint. That they don't have any

elements in common. Otherwise, we could
make that algorithm trivially correct by

just transferring all the blocks every
single time. And I can tell you, I

sweated. You know, I sweated one or two
weeks over this algorithm and it was

really hard to write. But this one test
weeded out all the bugs that I found along

the way. So that is just super, super
effective. John Hughes has a couple of

papers on hard bugs that he found. So he
found a bug in a distributed database

called Mnesia. And that bug was dependent
on opening the database, closing it and

opening it again. So this is not the kind
of bug that you find by just writing a

bunch of smart unit tests. Right? So, if
you did anything shorter in the beginning,

so if you just open the file and then did
some lookups there, that would not

manifest the bug. You really needed to
close and then open again. Have you

turned, have you tried turning it off and
on again? But then the database breaks in

this case. And here's another example
called The Mysteries of Dropbox. So you

can imagine that with Dropbox you really
want certain properties to hold. Right?

And it turns out they didn't. They never
worried about writing properties down. But

John Hughes did it and found a couple of
bugs. So here's one. It's kind of hard to

read where it says client 1 writes a into
a file that was previously empty. So that

funky turnstile there is empty. So writes
a into a file and then deletes the file

and another client writes, replaces, sees
the a in the file replaces it with a b.

And then client 1 goes and writes c into
the file that it previously thought to be

empty. And then unfortunately, even though
you can imagine that you should see either

b or c in that file, but Dropbox deleted
it. So I think they fixed that bug now.

But, so it goes. Oskar Wickström
has a couple of great, pretty recent blog

posts on properties in a screencast
editor that I highly recommend. So this is

a great tool for finding bugs, but it's
not the same as having a proof. Right? So,

you can still imagine that you can find
very subtle bugs that are not covered by

QuickCheck. QuickCheck just randomizes,
just generates randomized tests. So, that

is not the same thing as making sure that
there aren't any bugs. So the great thing

about Idris and the reason I chose it for
this talk is that Idris allows you to not

just encode properties in the language. It
also allows you to encode proofs in the

language. So here is the associative
property for the list concatenation

operation. And if you look at the top that
has the definition of that function from

the Idris standard library, it says ++, in
goes a list, in goes another list, out

comes a list. Then it says, well, if you
concatenate the empty list with any list,

that is just that list "right". Do you
see that? The second one says, well, if we

concatenate a list that starts with the
element x and goes on with xs, then we

just sort of pull the x in front and
concatenate the rest with "right". So

that's a classic recursive definition of
list concatenation in functional

programming. And now here's something
really strange in Idris. Here's the type

declaration for a definition, again in the
standard library, called appendAssoc. And

it says, if you have a list a, you have a
list b, and you have a list c, and in the

type it says, oh, the associative property
should hold. Right. And so this is a

statement of that property. That's
wonderful. It's not the same as a proof.

So, but writing proofs, who loved that in
math? Oh. Oh, you're good! I didn't. I'm

sorry. So. So the great thing about Idris
is, it helps you write down the proofs.
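For readers following along in text, here is what such a statement together with a finished proof can look like. This is a Lean 4 analogue written for this transcript, not the Idris code from the slide: the type of `appendAssoc` is the associativity statement, `rfl` plays the role of Idris's Refl, and the second equation uses the recursive call on the shorter list. It assumes a recent Lean 4 without extra libraries.

```lean
-- The type is the property; the two equations below are its proof.
theorem appendAssoc {α : Type} :
    (a b c : List α) → a ++ (b ++ c) = (a ++ b) ++ c
  -- Empty first list: both sides reduce to b ++ c by the definition of ++.
  | [], _, _ => rfl
  -- Non-empty first list: use the proof for xs and put x back in front.
  | x :: xs, b, c => congrArg (x :: ·) (appendAssoc xs b c)
```

The case split mirrors the two equations of `++`, which is why the proof can follow the same recursion as the function it talks about.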

I'll show you how that works just really,
really briefly. So here's that. So here's

just what I showed you on that slide. So I
can load that in there and it says, well,

you're not done. You didn't write a proof
for that property, but in Idris, you can

just push a bunch of buttons. Now, I love
that. So I can push one button and it

says, oh, well, you should write a proof
of that form. You have lists a, b and c.

Well, now and I can push another button
that says, well, you're doing this on

lists and if you're writing anything on
lists, you always need to distinguish

between the two cases of the empty list
and the list that consists of the first

element x and further element xs. And then
it says, well, write down something, but

then I can tell Idris: Well, I'm too lazy.
I'm not going to write anything so I can

just push a button. And Idris wrote this
so you can see me, but I didn't type this

right. I just pushed a button and it says,
Refl. What is Refl? What could that be?

Well, you can ask it what Refl is. It says
Refl. Oh, you can see here, landing here.

Refl is just a proof, sort of a built-in
proof, that says that two things are

equal if they're
identical, if they're the same. Right. And

that kind of makes sense in the first
equation, because the first equation of

appendAssoc corresponds to the first
equation of ++. Can you see that, how it

corresponds? Can you see that? The first
list is empty. Can you see that? Can you

see how the first list is empty with the
first equation of appendAssoc and the

first list is empty up there with ++. Can
you see that? OK. And then it just says,

well, then obviously. Well, not quite
obviously, but then sort of the way

that the definition works, it comes out
just right. So what's really important is

that Idris accepts that proof with the
first. The second one is slightly more

tricky. But again, we can get help because
we know that appendAssoc is this recursive

function. It recurses on the first
argument. So we're just going to do the

same thing in the proof. And you can tell
Idris that it should use that, that it

should use that fact, if you will. So
here's the recursive call. And again, I'm

too lazy to push a button. But if I push
that button, it also puts in Refl and

it loads. So this might be a
mystery to you how it works, but this is a

proof of the associative property of the
list concatenation in Idris. And since

Idris helps you write it, it's kind of fun
to do that. Oddly enough, even for

somebody who usually
doesn't enjoy proofs. So the way

that you program in Idris, we haven't done
that a lot in this talk, is that you put a

lot of information in the types and the
more information you put in the types, the

better Idris will get at figuring out the
correct definition. And you don't have to

do it by yourself. OK. So that's really
nice. OK. So we got that and sort of

these kinds of proof assisting systems
such as Idris have been used in a lot of

real-world systems. One prominent
example is seL4, a version of the L4

microkernel, has a long history, but
important properties of that kernel have

been verified. It runs in the security
enclave on iOS and even though it's

written in C, it provably does not have
buffer overflows or a lot of the nasty

things that are responsible for a lot of
security exploits. CompCert is another

example, which is a verified, I should
mention this has been verified with the

help of a tool called Isabelle. Also, great
fun to use. There's a project called

CompCert, which is a verified C
compiler, which is important for a lot of

certified software where, you know, the
source code might be certified. But how do

you know that the compiler generates
correct code? And you know, because it's

been proven to be correct. And even there,
you can cheat sometimes. So

for example, register allocators, very
complicated, very hard to prove right.

But what you can do is you can write a
checker that the register allocator did

its job, did its job well and you can
verify the checker. And so you can cheat a

bit. So there's tools for that. We've seen
Idris and there's a number of other tools

and they're getting more and more mature.
And they're great fun, really. They really

are great fun. But, you know, going back,
switching down a gear a little bit,

there's lots of useful properties that you
can look for in your programs. So

commutativity might be useful that you can
switch the two arguments for an operation.

Also, if you have relations, you might
remember that from some math class,

there's some properties here like
reflexivity, symmetry, antisymmetry and

transitivity. Reflexivity says that a is
always related to a. Symmetry says if it's

one way, if a and b are one way related,
they need to be related the other way too.

Antisymmetry intuitively would seem kind
of the opposite. That doesn't make sense.

It just says: if two things are related
both ways around. So, for example,

you know, orders like less or equal are
antisymmetrical, then they must be the

same. And transitivity just says that you
can form chains of your relation. So those

are a little dictionary of useful
properties that you can look for. Let me

close with one fancy property that you've
probably seen somewhere and that property

is called Functor. And you might have seen
in your programming language, in your list

library or in your stream library. There's
a function called map, right? And you

know, even Java has that and has had it
for many years. And what map does is, if

you have some, you know, in Java, for
example, it says "Stream", or it might be

"List", right. It says, well, if I have a
list of As, I can apply a function to each

element of that list. But you can
generalize that, it doesn't have to be

lists. It could be an Optional of As, for
example. You could also apply a function

to the value that's in there. So, you can
generalize that notion, and then it's a

functor. And, of course, in Idris, you can
write down equations for functors. And,

please ignore the technicalities here,
(stammers) but, if you sort of pick out

where it says "functorIdentity",
the middle row says g v equals to v,

which means g is the identity
function. When you feed in v, you always

get v back. And when you use map with the
identity function, you apply the identity

function on each element of your list or
whatever it is. Then you always get

back the same list. And here just says you
get function composition. So if you apply

one function and then another function and
you do that either inside or outside the

map, you should also get the same results.
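The two laws can be written as ordinary testable properties. The sketch below uses Python instead of Idris and is not from the slides; an optional value is modelled as `None` or a one-element tuple, and all names (`fmap_list`, `fmap_optional`, `bad_map`, `functor_laws_hold`) are made up for illustration.

```python
import random

def fmap_list(f, xs):
    return [f(x) for x in xs]

def fmap_optional(f, v):
    """Optional value modelled as None (absent) or a 1-tuple (x,) (present)."""
    return None if v is None else (f(v[0]),)

def bad_map(f, xs):
    """Not a lawful map: it applies f but also reverses the list."""
    return [f(x) for x in reversed(xs)]

def identity(v):
    return v

def compose(g, f):
    return lambda v: g(f(v))

def functor_laws_hold(fmap, value, f, g):
    # Law 1: mapping the identity function changes nothing.
    identity_law = fmap(identity, value) == value
    # Law 2: mapping a composition equals mapping f first and then g.
    composition_law = fmap(compose(g, f), value) == fmap(g, fmap(f, value))
    return identity_law and composition_law

def check_list_functor(fmap, runs=100, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-9, 9) for _ in range(rng.randint(0, 6))]
        if not functor_laws_hold(fmap, xs, lambda n: n + 1, lambda n: n * 2):
            return xs          # counterexample
    return None
```

`bad_map` has the same type as a real map, which is the point of writing the laws down: the type alone does not rule it out, but the identity law does as soon as a list with two different elements is generated.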
So, just as there is

associativity with monoids, with functors
there's these laws, and you might think,

well, where would I look for a functor?
I've never seen a functor except for the

ones on streams. A couple weeks ago in a
training, somebody said, well, you always

start with that animal example. Shouldn't
you look for a functor there? And I was

kind of, you know, sweat broke out on my
forehead, I was like, where's that gonna

go? But, we came up with this. So, if you
go back, you can see that this is obvious.

So, what you need for functors is, you
need a type parameter. Right. And so you

just look for a place to stick a type
parameter, any place here at all. And

if you look at Dillo and Parrot, they
both prominently have this weight thing.

Right. And so that seems more important
than the other two properties, which are

specific to particular kind of animal. And
so the weight, the thing to do is just to,

well, you can see I replaced upper case
Weight by lower case weight and made that

into a type parameter, and, I can then
provide a functor implementation down

there. And you might think, what is that
good for? Well, I don't know. Well, one

thing that you could do is you could
provide a different representation for

weights. Another thing that you could do,
if you look at the type for runOverAnimal,

it says animal weight -> animal weight and
weight is a type variable. What that type

signature tells you is that runOverAnimal
does not know what weight is. And that

means that the weight cannot change as a
result of that function. And you see that

in the type signature you get immediate,
small benefit, but you get a benefit even

with silly examples such as this one. And
that really brings me to the end. So in

your software, in your domain model, look
for a combinator, look for a function that

will combine two things into a bigger
thing. See if you can make that thing

associative and look for a neutral
element. And very often you will find one;

make it a monoid, you know, say monoid a
couple of times. You'll remember it.

You'll remember it. Generally, write
properties for the things, for the

operations in your software, test those
properties using QuickCheck. You know, if

you feel like you have a lot of time,
prove them correct. Find the functor. If

you found, if you found the monoid, you
know, find the functor next. You know, and

it takes, it might take time. I'm
very old. As you noticed at the beginning.

So. So it gets easier over the years and
it will just seem like a regular staple of

your arsenal when you program. And
of course, when the important properties

in your program have either been written
down, if they've been tested with

QuickCheck or even proven, then you can
sleep much more soundly than maybe you

currently can. Thank you very much.
<i>applause</i>

Herald: Thank you, Mike. So I see we have
three minutes for questions. Maybe that's

two or three questions. If you have any
come to the microphones, please. Do we

have a question from the Internet? No, not
yet. So, microphone two. Right.

Question: Hi. So, QuickCheck generated
hundred tests. Yes. What can we say about

the quality of these tests? Could we say
your program was correct using those

tests? Are these tests good?
Answer: Yeah. Very good question. So what

would you say about the quality of the
tests? And indeed, if you really do

industrial-strength applications,
QuickCheck comes with a bunch of tools

that let you look, for example, at the
distribution of the individual examples

generated. And while you didn't quite see me do
that, but I mean, for your domain objects,

you will typically write generators that
will generate those examples and you can

reason about the distribution of those.
And you absolutely should do that because

otherwise you might miss large areas of
your test space. So, but there

are tools and they help you do that. But
even if you don't do that, you know, it's,

you find a lot of, I found a lot of
bugs in my software even without worrying

about that. But if you go beyond that,
look at the distribution thing.
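What "looking at the distribution" can mean in practice, sketched in plain Python under the assumption of a hand-written generator: label every generated example and report the share of each label. QuickCheck ships its own helpers for this; the functions `gen_list`, `classify` and `distribution` here are invented stand-ins, not its API.

```python
import random
from collections import Counter

def gen_list(rng):
    """Hypothetical generator for the inputs of some list-processing function."""
    return [rng.randint(0, 9) for _ in range(rng.randint(0, 5))]

def classify(xs):
    """Attach a coarse label to each generated example."""
    if not xs:
        return "empty"
    if len(xs) == 1:
        return "singleton"
    if xs == sorted(xs):
        return "already sorted"
    return "unsorted"

def distribution(gen, label, runs=1000, seed=0):
    """Percentage of generated examples that fall under each label."""
    rng = random.Random(seed)
    counts = Counter(label(gen(rng)) for _ in range(runs))
    return {name: 100 * n / runs for name, n in counts.items()}

for name, share in sorted(distribution(gen_list, classify).items()):
    print(f"{share:5.1f}% {name}")
```

If a label that matters for the property shows up rarely or never, the generator is leaving part of the test space uncovered and should be adjusted before the "100 tests passed" message is trusted.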

Herald: Thank you, next one, please.
Number two.

Q: Let's say I've hacked a program, for
example, in Java or C# or whatever. How do

I, how do I apply what I learned so
far? So, where do I start when I have

already completed the C# program with,
yeah, how do I apply QuickCheck on that?

A: So, just pragmatically because it's
written in C#? That's the question?

Q: Yes
A: So, well, I have to be very concrete

here, I mean, so, if you can think
properties, right, one way to do, so,

for example, so, C# you can link with F#
and there is a QuickCheck version for F#

called "FsCheck". And FsCheck, actually,
even though it's itself written in F#, you

can also use it from C#. So, you have two
options. You can write your tests in a

slightly more awkward fashion in C#, or
you could just link your code with F# test

suite and write it down there. And there
is a fairly reasonable Java QuickCheck, I

hear. Another idea would be to use the
slightly more fancier, the slightly

fancier QuickChecks that exist for Scala
and Clojure. I'm sure there's one for

Kotlin as well, and link that against your
Java code. Does that answer your question?

Q: So whatever language I use, I have to
find out what the correct implementation

of QuickCheck is?
A: Yeah. Yeah. But as I said, I mean

usually, a fun thing I do in training is,
I chat "QuickCheck" and somebody calls on

the language, you know. Quick, QuickCheck
PHP or something like that. And there is

one, sure enough, I didn't know about before.
Q: Thank you.

Herald: All right. Thank you. And thank
you, Mike, again, for showing us a way to

sleeping soundly.
A: Thank you.

<i>applause</i>

<i>postroll music</i>

Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!