1
00:00:00,000 --> 00:00:19,369
36C3 preroll music
2
00:00:19,369 --> 00:00:25,550
Herald: Our next talk is held by Mike
Sperber, and he is already very ready for
3
00:00:25,550 --> 00:00:31,009
that. He's head of a software company in
Tübingen in Germany and he's going to talk
4
00:00:31,009 --> 00:00:36,540
about "Getting software right with
properties, generator tests and proofs".
5
00:00:36,540 --> 00:00:42,470
And the main thing here is functional
programming. One tiny thing you might not
6
00:00:42,470 --> 00:00:50,740
know about him is that 1986 he won a
federal competition on IT and so give him
7
00:00:50,740 --> 00:00:53,870
a warm applause for that and also for his
talk.
8
00:00:53,870 --> 00:00:58,220
applause
9
00:00:58,220 --> 00:01:04,690
Mike Sperber: Thank you very much. So is
anybody actively using the induction loop
10
00:01:04,690 --> 00:01:09,110
feature in the first couple of rows? Cuz
I know somebody who would like to know.
11
00:01:09,110 --> 00:01:17,080
Not right now. Okay. Anyway, so let me get
one shameless plug of advertising out of
12
00:01:17,080 --> 00:01:22,230
the way. If you find the contents of this
talk interesting, we're running a
13
00:01:22,230 --> 00:01:26,170
developer conference in Berlin in February
called Bob, which is very friendly and
14
00:01:26,170 --> 00:01:32,580
very nice, very tiny compared to this one.
And we'd love to see you there. Another
15
00:01:32,580 --> 00:01:39,570
thing, this is an introductory talk. So if
you were expecting the latest developments
16
00:01:39,570 --> 00:01:44,280
on proof tactic, in fact, if you know what
proof tactic is, then all you might get
17
00:01:44,280 --> 00:01:48,890
from this talk is sort of mild amusement.
And I won't be mad at you at all if you go
18
00:01:48,890 --> 00:01:56,160
for one of the more exciting talks. Ok?
So. Or leave at any time. That's perfectly
19
00:01:56,160 --> 00:02:01,791
fine, if this material is not for you.
Speaking of introductory talks, here's a
20
00:02:01,791 --> 00:02:05,740
piece of code written in the language that
I will use for this talk and it's called
21
00:02:05,740 --> 00:02:12,160
Idris. Who has written an Idris program
before? Very good. Ok. Oh, there's one
22
00:02:12,160 --> 00:02:17,810
person back there. That means if any part
of this program, as soon as I'm done
23
00:02:17,810 --> 00:02:22,129
explaining, is not clear to you, it's also
not clear to two or three hundred other
24
00:02:22,129 --> 00:02:26,540
people in this room. And I would love to
have your help. Interrupt me, ask a
25
00:02:26,540 --> 00:02:30,250
question anytime in the talk if there's
anything here not clear. It's going to
26
00:02:30,250 --> 00:02:34,799
get, even though it's meant to be
introductory, will get quite technical at
27
00:02:34,799 --> 00:02:41,120
times. So let me try explaining this one.
So this is a classic example in functional
28
00:02:41,120 --> 00:02:46,139
programming that I use often in my talks,
about animals on the Texas Highway. And if
29
00:02:46,139 --> 00:02:49,680
you can see there, the central definition
says data Animal. That's the data
30
00:02:49,680 --> 00:02:54,189
definition of animals. And in this
particular version of the Texas Highway,
31
00:02:54,189 --> 00:02:58,189
there's two different kinds of animals.
There is Armadillo, it's where it says
32
00:02:58,189 --> 00:03:03,860
Dillo there. And there's Parrots, for some
reason, on the Texas Highway. Does that
33
00:03:03,860 --> 00:03:07,389
make sense? Two different kinds of
animals. And you see that definition.
34
00:03:07,389 --> 00:03:14,430
Yeah, nod, that greatly helps me. And if
you see those two definitions for Dillo
35
00:03:14,430 --> 00:03:18,529
and Parrot, you can see, while the arrows
are kind of funny, but you can see that
36
00:03:18,529 --> 00:03:23,930
Dillo and Parrots have two properties
each, and it says their Liveness. That's
37
00:03:23,930 --> 00:03:27,689
one of the properties of an armadillo. And
up there at the very top, you see the
38
00:03:27,689 --> 00:03:31,519
definition of Liveness, it says Liveness
means dead or alive. It's an armadillo,
39
00:03:31,519 --> 00:03:36,919
can be dead or alive on the Texas Highway.
And there's also the Weight. And well, you
40
00:03:36,919 --> 00:03:41,710
see, this colon thing is the type
signature for the constructor for
41
00:03:41,710 --> 00:03:45,979
armadillos. So it says there's a liveness
going on, there's a weight going on, and
42
00:03:45,979 --> 00:03:49,989
then it constructs an animal. And for a
Parrot, there's a string. Every parrot
43
00:03:49,989 --> 00:03:54,150
speaks, right? And so it's the sentence
the Parrot says, and also the Weight. And
44
00:03:54,150 --> 00:03:58,590
it also produces an animal. And, up there,
you can see the definition of Weight is
45
00:03:58,590 --> 00:04:02,442
for simplicity's sake, I'm saying that
Weight is a type. So that's kind of
46
00:04:02,442 --> 00:04:06,150
unusual for Idris, but you don't have to
worry about it. But you can see there,
47
00:04:06,150 --> 00:04:10,830
Weight is just the same thing as an
integer. And if you look down there, where
48
00:04:10,830 --> 00:04:16,549
it says a1, a2 and a3, that has three
examples for animals. So it says a1:
49
00:04:16,549 --> 00:04:21,480
Animal, just to say that a1 is an animal.
So, Idris is a language that always has
50
00:04:21,480 --> 00:04:26,610
type declarations. And it says a1 is Dillo
Alive 10. And that means it's an
51
00:04:26,610 --> 00:04:31,210
armadillo, it's still alive, and it
weighs, let's say, ten kilograms. The
52
00:04:31,210 --> 00:04:35,500
second one is dead, a little bit heavier,
weighs twelve kilograms. And the third
53
00:04:35,500 --> 00:04:40,590
animal is a parrot that knows, well, it's
a pirate's parrot, obviously, and maybe
54
00:04:40,590 --> 00:04:48,062
weighs three kilograms. Ok, so far? Ok. So
if you have any question about any of
55
00:04:48,062 --> 00:04:52,260
that, then please ask away. So, what
happens to animals on the Texas Highway
56
00:04:52,260 --> 00:04:56,540
is, you know, people drive cars, they run
them over. So there's a function down
57
00:04:56,540 --> 00:05:01,100
here, and, well, we're doing functional
programing, shouldn't worry you at all.
58
00:05:01,100 --> 00:05:04,870
But what's important here is that it says
there is an animal going in, an animal
59
00:05:04,870 --> 00:05:09,520
going out. And really what this means is
that this animal object up there is not
60
00:05:09,520 --> 00:05:15,820
really the animal. It is the state of the
animal at a given time. So, runOverAnimal.
61
00:05:15,820 --> 00:05:19,770
you can see the type signature that says
an animal goes in, an animal goes out. And
62
00:05:19,770 --> 00:05:24,400
what it really means is, the state of the
animal goes in before it gets run over and
63
00:05:24,400 --> 00:05:29,920
the state of the animal after it gets run
over comes out. And then while we know
64
00:05:29,920 --> 00:05:34,260
there's two different kinds of animals.
And that means that for the definition of
65
00:05:34,260 --> 00:05:37,910
runOverAnimal, we need what's called
equations. There's two different
66
00:05:37,910 --> 00:05:41,640
equations. And the first equation says
what happens to armadillos when they get
67
00:05:41,640 --> 00:05:45,840
run over. So an armadillo has a liveness
and a weight. Here's something going on
68
00:05:45,840 --> 00:05:50,700
called pattern matching. And the second
equation says when there's a parrot going
69
00:05:50,700 --> 00:05:54,690
on it has a sentence and a weight, and on
the right hand side, you can see, well,
70
00:05:54,690 --> 00:05:59,360
when an armadillo gets run over. Well, all
that means is, the liveness sort of turns
71
00:05:59,360 --> 00:06:03,320
to Dead. We're constructing a new
armadillo object and it's dead and it has
72
00:06:03,320 --> 00:06:09,210
the same weight as before. And the
function, the equation at the bottom says,
73
00:06:09,210 --> 00:06:15,180
well, when we run over a parrot, it turns
really, really quiet. Ok? So, classic
74
00:06:15,180 --> 00:06:20,940
example. Ok so far. We're going to return
to that example at the very end. Right now
75
00:06:20,940 --> 00:06:24,820
it's just to illustrate the language that
we're doing things in and we're going to
76
00:06:24,820 --> 00:06:32,210
do a lot of things without complicated
programs. So, well. So, I'm going to jump
77
00:06:32,210 --> 00:06:36,020
around a little bit. So, one thing. So,
just the other day, two weeks ago, I was
78
00:06:36,020 --> 00:06:40,120
teaching a course on architecture and
somebody said: Well, there's this problem.
79
00:06:40,120 --> 00:06:43,958
I'm building a domain model. I'm putting
the domain model in a database. And, you
80
00:06:43,958 --> 00:06:47,430
know, customer comes in, has new
requirement or somebody comes in, has new
81
00:06:47,430 --> 00:06:51,260
requirements. And that always ends the
same way. I put a new call in the database
82
00:06:51,260 --> 00:06:55,810
and, you know, seven, eight, nine, it just
goes on and on. As the software gets older
83
00:06:55,810 --> 00:07:01,330
and older and older, more columns that
create the old ones seem a little stale.
84
00:07:01,330 --> 00:07:08,250
And so, yes, well, how can we build models
that are flexible? And so, here's
85
00:07:08,250 --> 00:07:12,950
something completely different, you might
think. So, here's sort of the key to that,
86
00:07:12,950 --> 00:07:16,800
to building flexible models. Does anybody
recognize this? Does anybody associate a
87
00:07:16,800 --> 00:07:23,020
word with this? laughter Very good. So,
you might remember, depending on what
88
00:07:23,020 --> 00:07:26,810
state you went to school in, you might
remember that this is a property called
89
00:07:26,810 --> 00:07:31,910
associativity. Right? And it means that we
can associate either the A and the B first
90
00:07:31,910 --> 00:07:38,200
with the parentheses or the B and the C.
So, and this is, if you take away one
91
00:07:38,200 --> 00:07:42,860
thing from this talk, it's associativity.
Knowing what that is is one of the most
92
00:07:42,860 --> 00:07:47,060
useful things in software development. So,
of course it's just a generic equation, we
93
00:07:47,060 --> 00:07:51,025
really need to be more specific, namely
that we're dealing with numbers and
94
00:07:51,025 --> 00:07:54,880
addition. And you might know that it's not
just addition that's associative. Also,
95
00:07:54,880 --> 00:07:58,670
multiplication, for example, is
associative. So here's a little mathy
96
00:07:58,670 --> 00:08:03,760
stuff there at the beginning. So, you see
that upside down A. That says "for all".
97
00:08:03,760 --> 00:08:09,030
We just say for all. What that means is
"for all A, B and C". And then, this funny
98
00:08:09,030 --> 00:08:13,850
epsilon-shape letter kind of thing, it
means "element of". And then that funky N
99
00:08:13,850 --> 00:08:17,490
means the natural numbers. So all the
numbers from zero, one, two, three, the
100
00:08:17,490 --> 00:08:22,790
whole numbers from zero on up. So, what
that means is, for all natural numbers A,
101
00:08:22,790 --> 00:08:29,720
B and C, the associative property holds
when you add them up. But while it says
102
00:08:29,720 --> 00:08:33,500
numbers in addition, it doesn't just hold
for numbers and, addition, in fact,
103
00:08:33,500 --> 00:08:37,819
associativity is everywhere around us.
Specifically, it's everywhere around us
104
00:08:37,819 --> 00:08:41,680
when we program. So here's another
example. When you're dealing with lists
105
00:08:41,680 --> 00:08:45,100
and that funky, the two pluses that you
see there, they are just list
106
00:08:45,100 --> 00:08:50,449
concatenation. So you concatenate two
lists and well, of course you can
107
00:08:50,449 --> 00:08:55,809
concatenate three lists by just using that
double plus in any order. And that's also
108
00:08:55,809 --> 00:09:00,730
associative. So, it doesn't matter if you
first concatenate the B and the C and then
109
00:09:00,730 --> 00:09:06,029
tack the A onto the front, or if you
concatenate the A and B and tack the C on
110
00:09:06,029 --> 00:09:09,750
at the end. Doesn't matter, you always get
the same result. So lists and
111
00:09:09,750 --> 00:09:17,249
concatenation also have this associative
property. And here's something that I
112
00:09:17,249 --> 00:09:22,139
always find very, very enlightening is
that you can construct images that way.
113
00:09:22,139 --> 00:09:27,470
Well, you don't see it here. So here's an
image. Well, it's from a cool researcher
114
00:09:27,470 --> 00:09:31,610
of mine and functional programing, Brent
Yorgey, and he has a great library out
115
00:09:31,610 --> 00:09:37,019
called diagrams, for constructing diagrams
out of parts. And so this really is what
116
00:09:37,019 --> 00:09:40,839
associativity is about. It's about
operators that construct things out of
117
00:09:40,839 --> 00:09:44,740
parts. And so, as you can see here, well
there's different shapes here, there's
118
00:09:44,740 --> 00:09:48,500
sort of the black rectangles, there's a
different rectangle set, that denote the
119
00:09:48,500 --> 00:09:52,180
towers of Hanoi. We're not really going to
deal with the towers of Hanoi here,
120
00:09:52,180 --> 00:09:57,989
really. The important thing that the image
consists of several parts. And well, in
121
00:09:57,989 --> 00:10:01,300
normal or sort of in classic object-
oriented programming, when you do
122
00:10:01,300 --> 00:10:05,639
graphics, you have a canvas and you might
draw pixels on that canvas. You know,
123
00:10:05,639 --> 00:10:11,170
might be square shaped or a circle shaped
canvas pixels. But what we're doing here
124
00:10:11,170 --> 00:10:18,079
is, we are treating an image as a data
type and the definition is not important.
125
00:10:18,079 --> 00:10:23,350
What is important is that there are a
couple of functions that construct sort of
126
00:10:23,350 --> 00:10:28,671
simple images. So here's a function that
you might imagine called star and it
127
00:10:28,671 --> 00:10:33,790
constructs stars. And well, you can see up
there there's a type declaration and it
128
00:10:33,790 --> 00:10:37,949
says, well, the star function, it accepts
an integer, it accepts a Mode, whatever
129
00:10:37,949 --> 00:10:42,209
that is, accepts a Color and it produces
an Image. And we can call that star
130
00:10:42,209 --> 00:10:48,750
function with the arguments 200 and Solid
and Gold. So Mode is Solid or Outline. And
131
00:10:48,750 --> 00:10:53,980
then we have a Color and we get Image and
that image is an object. Not particularly
132
00:10:53,980 --> 00:11:00,140
exciting. But while we might have another
function called Polygon, Polygon takes two
133
00:11:00,140 --> 00:11:05,759
integers that denote the size of the
polygon and the number of vertices, and
134
00:11:05,759 --> 00:11:09,759
also whether it's an outline or whether
it's solid and a color. And for example,
135
00:11:09,759 --> 00:11:17,079
if we call it with 180, again, that's the
size and 5 we get a five corner polygon
136
00:11:17,079 --> 00:11:23,390
and we get that as an outline and it's in
red. Now, the idea here is that we can
137
00:11:23,390 --> 00:11:27,490
combine. Just as we can combine two
numbers or we can combine two lists, we
138
00:11:27,490 --> 00:11:31,629
can combine two images. Maybe the most
intuitive way of combining two images is
139
00:11:31,629 --> 00:11:36,560
just sticking them beside each other. So
there's a function there called beside.
140
00:11:36,560 --> 00:11:41,009
And it takes an image and it takes another
image and produces an image. Right. And
141
00:11:41,009 --> 00:11:44,820
this is exactly what we're thinking about
when we talk about associativity. We're
142
00:11:44,820 --> 00:11:50,709
talking about a sort of a binary operator
that produces the same thing that went in.
143
00:11:50,709 --> 00:11:55,640
And so, for example, we could stick those
two images next to each other. We could
144
00:11:55,640 --> 00:12:01,249
also imagine an operator called above that
just puts one image above the other image.
145
00:12:01,249 --> 00:12:05,270
And we can combine these two things. Here
it really is important that the same thing
146
00:12:05,270 --> 00:12:09,519
comes out so that it's image goes in,
another image goes in, an image goes out.
147
00:12:09,519 --> 00:12:14,529
So we could again call above on the result
of beside and make arrangements. So here's
148
00:12:14,529 --> 00:12:18,829
a tiling arrangement for your bathroom or
something like that. Now, beside and above
149
00:12:18,829 --> 00:12:23,930
are are two possible operators and you
might already think about associativity,
150
00:12:23,930 --> 00:12:28,990
but really the more fundamental one is
overlay. You put two images on top of each
151
00:12:28,990 --> 00:12:33,279
other. And so again, overlay has the right
type. An image goes in, another image goes
152
00:12:33,279 --> 00:12:38,199
in, and an image comes out. And if we take
the gold star and the pentagon and put
153
00:12:38,199 --> 00:12:46,089
them on top of each other, then it looks
like this. And we can then formulate an
154
00:12:46,089 --> 00:12:51,559
associativity property. It might not quite
look the same because I wrote overlay in
155
00:12:51,559 --> 00:12:56,314
front rather than between the operators.
We could also write it between. But just
156
00:12:56,314 --> 00:13:00,250
to show you that it's the same idea. So,
it doesn't really matter if we first take
157
00:13:00,250 --> 00:13:05,209
two images, A and B and superimpose those
two and then put those two on top of C or
158
00:13:05,209 --> 00:13:13,319
if we do it in another order. Does that
make sense so far? Ok. No? Do you have a
159
00:13:13,319 --> 00:13:16,390
question?
Anonymous: Mumbling
160
00:13:16,390 --> 00:13:20,970
Mike: Yeah, so ahh, good point, good
point. So this implies that there must be
161
00:13:20,970 --> 00:13:24,910
some kind of, that there's probably some
notion of transparency involved. Yes. Yes,
162
00:13:24,910 --> 00:13:29,480
there is. But then you have associativity.
And really what it means. Very good
163
00:13:29,480 --> 00:13:38,399
question. So, if you think of this image
in terms of the color at certain
164
00:13:38,399 --> 00:13:43,600
coordinates, Right, Well, you need to
think about how to combine those two
165
00:13:43,600 --> 00:13:48,670
colors that are in the constituent images.
And you can imagine that there also has to
166
00:13:48,670 --> 00:13:53,329
be a combination operation for the colour.
And that also needs to be associative as a
167
00:13:53,329 --> 00:13:58,600
prerequisite for the for the overlay
operation to be associative. Does that
168
00:13:58,600 --> 00:14:07,669
make sense now? Thank you. Good question.
Great question. So anyway, so since this
169
00:14:07,669 --> 00:14:11,589
associativity property is something that
is not just restricted to numbers, as we
170
00:14:11,589 --> 00:14:15,229
may have learned in school, it really
makes sense to get. And that means that
171
00:14:15,229 --> 00:14:18,979
when we talk about associativity, we
always have to name two things. We have to
172
00:14:18,979 --> 00:14:22,679
say what set we're operating on and what
the operation is. And the combination
173
00:14:22,679 --> 00:14:28,369
those two things has a name in mathematics
and it's not the best name, but it's
174
00:14:28,369 --> 00:14:34,040
called a semigroup. Right. And, but, you
know, if you drop it in certain circles,
175
00:14:34,040 --> 00:14:39,100
they'll think that you're an expert on
mathematics, you might try that. So, just
176
00:14:39,100 --> 00:14:43,029
to go over that: So, you have some subset
S, and that S might be Image, it might be
177
00:14:43,029 --> 00:14:46,640
the natural numbers or something like
that. And we have an operation that I'm
178
00:14:46,640 --> 00:14:53,570
just gonna call circle here, then take any
a, b and c from that set S. We can use
179
00:14:53,570 --> 00:14:57,769
circle as an operator and we have that
associativity property and for that circle
180
00:14:57,769 --> 00:15:02,389
you can put in overlay, you can put in
beside, you can put in above, you can put
181
00:15:02,389 --> 00:15:08,899
in + you can put in times or you can put
in the list concatenation operator, the
182
00:15:08,899 --> 00:15:15,759
++. Okay? And that associativity is great.
It's really my favorite property because
183
00:15:15,759 --> 00:15:20,119
it means when we have a whole lot of
things that we combine, we can
184
00:15:20,119 --> 00:15:24,560
parenthesise in any way we want. We will
get the same result no matter which way we
185
00:15:24,560 --> 00:15:28,839
parenthesise them. And that really means,
we can leave out the parentheses when we
186
00:15:28,839 --> 00:15:33,070
write an expression that involves only the
circle operator if it's associative, if we
187
00:15:33,070 --> 00:15:36,839
can just leave out all the parentheses
because the parentheses don't matter. And
188
00:15:36,839 --> 00:15:41,889
that makes it well, that makes it
instantly easier to read, I think. Also it
189
00:15:41,889 --> 00:15:46,870
has practical uses. So if you do big data
processing associativity means that if you
190
00:15:46,870 --> 00:15:52,040
have large datasets that span several
machines or several hard drives or several
191
00:15:52,040 --> 00:15:58,679
data sources, and you're combining them
and you have an associative combination
192
00:15:58,679 --> 00:16:03,259
operation, it just means you can rearrange
that combination operation according to
193
00:16:03,259 --> 00:16:07,579
the load in your compute cluster. And that
makes it a very useful property when
194
00:16:07,579 --> 00:16:12,220
you're doing big data processing in sort
of MapReduce based frameworks. But, I
195
00:16:12,220 --> 00:16:16,410
mean, that's a practical application, but
I think it's much more useful,
196
00:16:16,410 --> 00:16:21,220
associativity is much more useful when you
use it for designing your domain model.
197
00:16:21,220 --> 00:16:25,749
And I talked in the beginning how, well,
you want to avoid always adding more
198
00:16:25,749 --> 00:16:29,559
database columns. And one way of doing
that is to view your domain model, not as
199
00:16:29,559 --> 00:16:33,220
something that has more and more
properties, but your domain model as
200
00:16:33,220 --> 00:16:37,439
building blocks that you combine into a
larger building blocks the same way that
201
00:16:37,439 --> 00:16:42,749
we combine images from simpler images. So
here's one of the great papers from
202
00:16:42,749 --> 00:16:46,439
functional programing, one of my two or
three favorites from Brent Yorgey. And
203
00:16:46,439 --> 00:16:50,219
it's called "Something Something: Theme
and Variations". And you can see that it
204
00:16:50,219 --> 00:16:54,940
is about images. And these images get
superimposed with an operation that is
205
00:16:54,940 --> 00:17:00,589
just like overlay and that is, that title
is eminently googleable. Now, it has a
206
00:17:00,589 --> 00:17:04,540
funny word there. It says, it doesn't say
semigroup, could say"Semigroup: Theme and
207
00:17:04,540 --> 00:17:08,640
Variation", it says "Monoid: Theme and
Variation". And a monoid, well, it's also
208
00:17:08,640 --> 00:17:12,360
not something, even though it sounds kind
of fancy, it's actually not much more
209
00:17:12,360 --> 00:17:16,400
complicated than a semigroup. It's a
semigroup. And also the semigroup has a
210
00:17:16,400 --> 00:17:20,186
special element called the neutral
element. And whenever we combine something
211
00:17:20,186 --> 00:17:24,070
with a neutral element, it doesn't matter
if we do it in front or at the back, we
212
00:17:24,070 --> 00:17:27,790
get the same thing back. So, of course,
the neutral element with respect to
213
00:17:27,790 --> 00:17:33,660
numbers, in addition, would be zero. The
neutral element with respect to lists and
214
00:17:33,660 --> 00:17:38,540
and concatenation would be the empty list.
I always hear several voices. That's
215
00:17:38,540 --> 00:17:42,505
wonderful. Thank you. And the same thing
for the overlay and beside and above, you
216
00:17:42,505 --> 00:17:46,400
can imagine that you have just an empty
image that has only, that consists only of
217
00:17:46,400 --> 00:17:51,530
transparency, that can work as the neutral
element. So all of these things that I
218
00:17:51,530 --> 00:17:55,310
showed you that are associative, they're
not just associative. They're not just
219
00:17:55,310 --> 00:18:00,740
semigroups, they're also monoids. And so,
as I said, as long as you remember
220
00:18:00,740 --> 00:18:04,510
associativity, that's the important thing.
But often you also find a monoid, and
221
00:18:04,510 --> 00:18:07,910
monoids in the wild they're just
everywhere. We've seen them for numbers
222
00:18:07,910 --> 00:18:12,501
and lists and images, music forms a
natural monoid. You can you can describe
223
00:18:12,501 --> 00:18:19,660
musical structure with monoid operations.
You can treat animations, the time axis.
224
00:18:19,660 --> 00:18:24,670
You can define monoidal combination of
animations. A famous example in functional
225
00:18:24,670 --> 00:18:28,630
programming is with financial contracts.
If you were here last year for a talk of
226
00:18:28,630 --> 00:18:32,270
mine, we talked about semiconductor-
fabrication routes, which sounds very
227
00:18:32,270 --> 00:18:37,121
concrete, but also they form a monoid. The
properties themselves that we'll see for a
228
00:18:37,121 --> 00:18:40,690
monoid are all kinds of things. They're
everywhere around you. And these are
229
00:18:40,690 --> 00:18:44,820
really the key towards making flexible
domain models because in almost any domain
230
00:18:44,820 --> 00:18:48,880
model you can find a monoid just by
looking for building blocks and for ways
231
00:18:48,880 --> 00:18:54,330
of combining those building blocks into a
larger building blocks. So let me get
232
00:18:54,330 --> 00:19:00,100
back. So I said, well, you can use
associativity or you can use this monoid
233
00:19:00,100 --> 00:19:04,260
thing to guide your design. And I haven't
really made that concrete yet. And so I
234
00:19:04,260 --> 00:19:09,370
stole a couple of pictures from Brent's
paper. So you remember the beside and the
235
00:19:09,370 --> 00:19:14,380
above operations. And those are fine for
arranging things sort of in the vertical
236
00:19:14,380 --> 00:19:19,900
and the horizontal axis. The way that they
work is, they make, they put a bounding
237
00:19:19,900 --> 00:19:23,930
box around every picture and then they
arrange the bounding boxes either beside
238
00:19:23,930 --> 00:19:28,291
each other or above each other. So it's a
slightly more involved thought. And that
239
00:19:28,291 --> 00:19:32,200
works great when you're, when, you know,
your picture is, happens to be a square
240
00:19:32,200 --> 00:19:36,670
that's aligned with the axes. It doesn't
work so well if your picture is rotated,
241
00:19:36,670 --> 00:19:40,300
right. Because the bounding box, the
bounding box then is too big. And if you
242
00:19:40,300 --> 00:19:44,260
want to attach anything about, just about
in any direction, then there's going to be
243
00:19:44,260 --> 00:19:49,860
a gap in your picture. And so beside and
above are not particularly good operations
244
00:19:49,860 --> 00:19:54,790
as the basis for an image library. The
overlay operation is much better. But that
245
00:19:54,790 --> 00:19:59,100
leaves open the question how you can
arrange pictures, several pictures so that
246
00:19:59,100 --> 00:20:04,940
they are beside each other or that they
just touch. And Brent came up with this
247
00:20:04,940 --> 00:20:12,010
idea of an envelope, technical idea. So
the idea is that, well, if you give me, so
248
00:20:12,010 --> 00:20:17,520
the red dot there, that's the origin. If
you give me a vector starting at the
249
00:20:17,520 --> 00:20:22,580
origin, I will tell you how far you have
to go along that vector so that I can draw
250
00:20:22,580 --> 00:20:27,470
that blue perpendicular line that's just
outside the shape. And that's called an
251
00:20:27,470 --> 00:20:33,960
envelope. And envelopes are wonderful. So
if you ship each picture not just with
252
00:20:33,960 --> 00:20:38,400
sort of the visuals that you see, but also
with a function that describes the
253
00:20:38,400 --> 00:20:43,680
envelope, then you can use that envelope
to arrange pictures both in the horizontal
254
00:20:43,680 --> 00:20:48,470
and the vertical, but also in the diagonal
by just drawing vectors so that they
255
00:20:48,470 --> 00:20:52,580
touch. So, that's a slightly more
complicated idea. Does it make sense? And
256
00:20:52,580 --> 00:20:57,160
Brent goes through the motions of using
that inspiration from the monoid that he
257
00:20:57,160 --> 00:21:02,810
is getting. He's saying "Everything must
be a monoid! Absolutely.", and uses that
258
00:21:02,810 --> 00:21:06,170
as a guiding principle through the
library. So I'm not going to go into
259
00:21:06,170 --> 00:21:11,380
technical detail on how that works, but
it's a very pleasing paper to read on
260
00:21:11,380 --> 00:21:17,560
that. And it results in a beautiful
library that's great fun to use. So that
261
00:21:17,560 --> 00:21:22,050
means, though, that you also have to find
a monoidal combination operation for the
262
00:21:22,050 --> 00:21:25,670
envelopes. You can't just, we've already
seen how we can combine the pictures
263
00:21:25,670 --> 00:21:29,351
themselves, but we also need to combine
the envelopes. And fortunately, that's
264
00:21:29,351 --> 00:21:33,300
pretty easy. If somebody sets a vector in
a certain direction, then that envelope is
265
00:21:33,300 --> 00:21:37,040
just a maximum. Those two pictures, right,
if you combine that ellipse and that
266
00:21:37,040 --> 00:21:41,260
square, you can see that I'm just going to
have to go to the maximum of those two
267
00:21:41,260 --> 00:21:46,700
numbers in order to just be outside the
composite shape that that comes up
268
00:21:46,700 --> 00:21:52,930
superimposing those two things. So that's
great. Now, I sort of introduced these
269
00:21:52,930 --> 00:21:56,920
properties as a mathematical thing, right.
I said, well, there's this fancy, fancy
270
00:21:56,920 --> 00:22:02,830
upside down operator says for all images
and we might say for all images. Now, we
271
00:22:02,830 --> 00:22:08,810
can also formulate these properties as
code. And that's really where additional
272
00:22:08,810 --> 00:22:12,240
magic is. So, for example, the
associativity property, well, there's not
273
00:22:12,240 --> 00:22:15,610
much of a difference except that the
image1 and image2, they are now in
274
00:22:15,610 --> 00:22:19,640
typewriter font. So we could put those in
the program. But there's still that
275
00:22:19,640 --> 00:22:24,130
mathematical stuff on top. But in a
functional language, in a lot of other
276
00:22:24,130 --> 00:22:28,340
languages too by now, we could also put
the top line and translate that into code.
277
00:22:28,340 --> 00:22:32,190
And it might look like this. So that's
what it looks like in Idris. So, it's not
278
00:22:32,190 --> 00:22:37,380
quite the same, but maybe we recognize the
structure. So, we say, well, there's a
279
00:22:37,380 --> 00:22:41,370
property called and the property is just
called overlayAssociative. So we give it a
280
00:22:41,370 --> 00:22:48,100
name. So, Idris is an ASCII language,
still so, primarily. So, we say just
281
00:22:48,100 --> 00:22:52,760
forAll there instead of the upside down
all. And then it says arbTriple arbImage
282
00:22:52,760 --> 00:22:58,930
arbImage arbImage. And that means for all
arbitrary triples of arbitrary images and
283
00:22:58,930 --> 00:23:02,180
other arbitrary image and another
arbitrary image, so, triples, three
284
00:23:02,180 --> 00:23:08,430
things. And we're going to call those
three images image1, image2 and image3.
285
00:23:08,430 --> 00:23:15,150
That funky backslash there, that's a
Lambda in Idris. And then the overlay prop
286
00:23:15,150 --> 00:23:21,460
means that while, if we overlay one way
and we overlay another way, according to
287
00:23:21,460 --> 00:23:26,260
associativity, we get the same result. Do
you recognize that structure? Right. That
288
00:23:26,260 --> 00:23:31,060
it's the same thing. So that we're writing
structurally the same thing that we wrote
289
00:23:31,060 --> 00:23:36,790
in mathematical notation. Now as a piece
of code. And now the great thing is once
290
00:23:36,790 --> 00:23:40,801
we've written it as a piece of code, we
can manipulate it in a program.
291
00:23:40,801 --> 00:23:44,570
So, one way, there is different ways
of manipulating it. But one of the most
292
00:23:44,570 --> 00:23:48,140
useful ones is, again by another great
researcher in functional programming,
293
00:23:48,140 --> 00:23:52,000
John Hughes, came up with something called
QuickCheck. So if there's another thing
294
00:23:52,000 --> 00:23:55,870
you take away from this talk is: google
QuickCheck. And whatever language you use,
295
00:23:55,870 --> 00:24:00,530
it doesn't have to be Idris. In fact, I
had to hack together a QuickCheck for this
296
00:24:00,530 --> 00:24:04,400
talk, but basically any other language is
going to have a QuickCheck, whether that
297
00:24:04,400 --> 00:24:08,730
language be a functional language or
whether it's Java or Python or R or
298
00:24:08,730 --> 00:24:15,490
something like that. You can always get a
QuickCheck for that. And I'm going to try
299
00:24:15,490 --> 00:24:22,560
and demonstrate this QuickCheck thing not
by thinking about the design so much, but
300
00:24:22,560 --> 00:24:28,090
by demonstrating a property of something
that's very error prone. So, here's this
301
00:24:28,090 --> 00:24:35,230
idea, we want to have a representation for
sets of natural numbers. And we're going
302
00:24:35,230 --> 00:24:40,800
to represent those sets of natural numbers
by a list of intervals. So, by a list of
303
00:24:40,800 --> 00:24:44,991
ranges, if you will, between two numbers.
Now, I'll try to explain that. So, up
304
00:24:44,991 --> 00:24:49,510
there at the top, it has a type
definition. It says, ISet, interval set,
305
00:24:49,510 --> 00:24:56,650
is a type. And that type is defined to be
just a synonym for a list of pairs of
306
00:24:56,650 --> 00:25:00,380
natural numbers. That's what those round
parentheses with a comma in the middle
307
00:25:00,380 --> 00:25:05,340
mean. OK. And just to see what that means
is, there's a function there. I haven't,
308
00:25:05,340 --> 00:25:10,000
I've lighted the definition, but what's
important about it is its type signature.
309
00:25:10,000 --> 00:25:15,610
It takes an interval set and it produces a
list of all of the members of that set.
310
00:25:15,610 --> 00:25:20,920
And you can see sort of a demo thing here
that I typed in before the talk. So, if I
311
00:25:20,920 --> 00:25:27,260
apply iToList so that, the brackets there
they just mean the list, and we feed in a
312
00:25:27,260 --> 00:25:32,240
list of intervals and those intervals are
from zero to three, from five to seven,
313
00:25:32,240 --> 00:25:36,860
and from nine to ten, respectively.
They're all inclusive. And you can see
314
00:25:36,860 --> 00:25:40,460
down there is a list of all of the
members. So, the first interval is from 0
315
00:25:40,460 --> 00:25:47,040
to 3. So, it has the numbers 0, 1, 2 and
3. The next one goes from five to seven.
316
00:25:47,040 --> 00:25:51,490
So it has the three numbers 5, 6, 7. And
the last one goes from nine to ten. So it
317
00:25:51,490 --> 00:25:56,580
has the two numbers, 9 and 10 there. Does
that make any sense again? Slightly more
318
00:25:56,580 --> 00:26:05,760
complicated example. So let's see. So, of
course, well, not of course, but the way
319
00:26:05,760 --> 00:26:09,521
we want to do it, the way I want to do it
is, I want to have the interval set
320
00:26:09,521 --> 00:26:15,870
structured in a certain way. I don't just
want any list of any pair of numbers to
321
00:26:15,870 --> 00:26:19,660
denote an interval set. And therefore,
here is a function that describes what it
322
00:26:19,660 --> 00:26:25,610
needs to be a valid interval set. Right.
So, for example, we don't really, in order
323
00:26:25,610 --> 00:26:29,430
to have efficient processing, we don't
really want two intervals in one interval
324
00:26:29,430 --> 00:26:33,120
set to overlap. Right. We want them to be
disjoint and we also want them to be
325
00:26:33,120 --> 00:26:37,620
ordered so we can have efficient
operations for certain things. Right. And
326
00:26:37,620 --> 00:26:41,410
so, let's go through this. So, there is an
isValid function that just tells you
327
00:26:41,410 --> 00:26:46,230
whether that interval set is valid or not.
It says, well, if that set, and there's
328
00:26:46,230 --> 00:26:50,140
three different cases here, which is why
there's three different equations, in the
329
00:26:50,140 --> 00:26:54,660
first equation says the empty interval
set, the empty brackets mean the empty
330
00:26:54,660 --> 00:26:58,750
list, and if the intervals, the list
representing the interval set is empty,
331
00:26:58,750 --> 00:27:04,070
then we're going to say True. Empty set -
perfectly fine. The next case says, our
332
00:27:04,070 --> 00:27:08,520
interval set consists only of a single
interval and that single interval goes
333
00:27:08,520 --> 00:27:14,210
from low to high. Well, we kind of
interpret that there, but, and, well, that
334
00:27:14,210 --> 00:27:18,360
interval set is valid, if low comes in
front of high. Right, they shouldn't be
335
00:27:18,360 --> 00:27:23,870
the other way around. So, does that make
sense? Somebody in, can you nod at the
336
00:27:23,870 --> 00:27:29,010
back, a little bit? You're still there?
OK. Thank you. Great. So, then it becomes
337
00:27:29,010 --> 00:27:33,190
a little bit more complicated and it says,
well, this is the third case, when there's
338
00:27:33,190 --> 00:27:38,670
at least two intervals in the interval
set. And those two intervals are, the
339
00:27:38,670 --> 00:27:44,900
first one goes from lo1 to hi1. The second
one goes from lo2 to hi2. So, those ::,
340
00:27:44,900 --> 00:27:48,810
they separate the first element of a list
from the rest. And then there's the rest
341
00:27:48,810 --> 00:27:54,140
of the list. And it says, well, again, we
want the interval to be ordered so that
342
00:27:54,140 --> 00:27:59,380
the lower numbers are on the left. That's
where it says lo1 is less or equal hi1.
343
00:27:59,380 --> 00:28:03,730
And then it says, well, that there should
be a gap between two consecutive
344
00:28:03,730 --> 00:28:09,010
intervals. Otherwise, they should be one
interval, which is why the high from one
345
00:28:09,010 --> 00:28:14,750
interval should be separated from the low
of the next interval by at least one. And
346
00:28:14,750 --> 00:28:18,930
then we're going to say, well, also we
want all the rest of the list, including
347
00:28:18,930 --> 00:28:26,740
lo2 and hi2 to be valid too. So far so
good? OK, so this is probably, well, the
348
00:28:26,740 --> 00:28:32,370
second most complicated piece of code. So,
anyway, so, here's, so, we might imagine a
349
00:28:32,370 --> 00:28:37,341
union function. And the union function,
guess what, it forms a monoid, with
350
00:28:37,341 --> 00:28:43,300
respect to interval sets. So, it takes,
two internal sets go in and another one
351
00:28:43,300 --> 00:28:47,510
comes out. And if you've written that kind
of thing before, you might notice it's
352
00:28:47,510 --> 00:28:52,790
probably a little tricky with that fancy
validity condition that's there. So how
353
00:28:52,790 --> 00:28:58,250
can we get this right? Well, what we do is
we write down properties. Of course, we
354
00:28:58,250 --> 00:29:02,770
could write down associativity. I'll leave
that as an exercise. Another one is just
355
00:29:02,770 --> 00:29:08,740
very simple. Just a very simple property
that says for all pairs of two arbitrary
356
00:29:08,740 --> 00:29:13,890
interval sets, we want the union of those
two interval sets to be valid, a valid
357
00:29:13,890 --> 00:29:21,300
data structure. We want the union function
to preserve validity. OK? Makes sense? So
358
00:29:21,300 --> 00:29:26,580
here's another property that says, well, I
already gave you this function or I told
359
00:29:26,580 --> 00:29:32,710
you that there is this function iToList,
which just gives us a list of elements of
360
00:29:32,710 --> 00:29:36,860
an interval set. And what we can do is, we
can use sort of that representation,
361
00:29:36,860 --> 00:29:40,710
that's also a representation for sets. We
can use that representation sort of as a
362
00:29:40,710 --> 00:29:44,640
model and say, well, if we take the
unions, you see there for all pairs,
363
00:29:44,640 --> 00:29:49,770
again, of arbitrary interval sets, we take
the union. It says iUnion, iset1 and
364
00:29:49,770 --> 00:29:55,570
iset2. And we convert that to a list. And,
we could also do, we could instead convert
365
00:29:55,570 --> 00:30:00,760
each individual set to a list and then
just merge those two lists. And that
366
00:30:00,760 --> 00:30:06,390
should yield the same result. So, in a
way, we're just giving a very simple model
367
00:30:06,390 --> 00:30:10,330
for our interval sets, right, and that
would, so those two criteria would be kind
368
00:30:10,330 --> 00:30:15,940
of nice to have in order to get our
implementation correct. And I already got
369
00:30:15,940 --> 00:30:20,440
started before the talk on this. Looks
like this. No. Doesn't look like this.
370
00:30:20,440 --> 00:30:29,390
We'll get to that later. But like this.
So, here's what I came up with. So, you
371
00:30:29,390 --> 00:30:33,290
see there is that, while there's all this
other code there, ignore that. But there
372
00:30:33,290 --> 00:30:36,961
is iUnion says ISet -> ISet -> ISet, do
you see that? And then, there's two
373
00:30:36,961 --> 00:30:41,120
equations that say, well, the first set is
empty, then I'm just going to give you the
374
00:30:41,120 --> 00:30:44,620
second one. And if the second one is
empty, I'm just going to give you the
375
00:30:44,620 --> 00:30:49,510
first one. Right? Classic things when you
have union or concatenation operations or
376
00:30:49,510 --> 00:30:54,200
something like that. And now you can see
the third case. It gets tricky, right?
377
00:30:54,200 --> 00:30:58,350
Again, you don't need, I mean, main thing
is you need to understand it's tricky.
378
00:30:58,350 --> 00:31:05,840
Well, the third one is such that, well,
says so that both have at least one
379
00:31:05,840 --> 00:31:11,210
element and that element is in the
interval lo1 and hi1 in the first case and
380
00:31:11,210 --> 00:31:15,130
lo2 and hi2 in the second case. And then
there's the rest. And I already put in a
381
00:31:15,130 --> 00:31:20,570
little bit of code, and I said, well if
lo1 comes after hi1 (means hi2), then we
382
00:31:20,570 --> 00:31:25,350
want to start with lo2 to hi2 and then
continue with the union. In the other
383
00:31:25,350 --> 00:31:29,350
case, if lo2 comes after hi2 (means hi1),
then we're gonna start with lo1 and hi1.
384
00:31:29,350 --> 00:31:33,809
And in the other case, it means, that no
interval comes before the other, and
385
00:31:33,809 --> 00:31:38,800
therefore we need to merge the two
intervals at the beginning. Does that make
386
00:31:38,800 --> 00:31:43,980
remote sense? Right. Don't worry. We'll
get back on solid track. So, we just take
387
00:31:43,980 --> 00:31:47,850
the minimum of those two intervals and
maximum of those two intervals and we do
388
00:31:47,850 --> 00:31:52,090
this. Now, the great thing is, I told you
about this tool by John Hughes called
389
00:31:52,090 --> 00:31:58,740
QuickCheck. And the great thing is, we can
load this into Idris. And then here comes
390
00:31:58,740 --> 00:32:10,920
a REPL, and we can say, I hope I'm doing
this right. So, we want QuickCheck, and we
391
00:32:10,920 --> 00:32:16,860
want, what was it called? It was called
prop_unionCorrect. I hope I'm doing this
392
00:32:16,860 --> 00:32:21,840
right. And, well, very small font. But you
can see here it says "100 tests". And that
393
00:32:21,840 --> 00:32:27,100
is what QuickCheck does, as, it takes your
code version off the property and
394
00:32:27,100 --> 00:32:32,130
automatically generates a lot of tests for
them. And that is super effective at
395
00:32:32,130 --> 00:32:36,250
weeding out bugs. So it says, well, the
thing that you wrote is correct. It always
396
00:32:36,250 --> 00:32:43,550
produces interval sets that when you take
the list, it gives you the right result.
397
00:32:43,550 --> 00:32:50,360
But there was that other criterion called
unionValid. And there it says, and this is
398
00:32:50,360 --> 00:32:54,520
really the better part, of course, of
QuickCheck is, when it fails, it says it's
399
00:32:54,520 --> 00:33:00,500
falsifiable. It says there is a counter
example. And so, here it says, I did nine
400
00:33:00,500 --> 00:33:05,070
tests, I generated nine random tests, and
I found one where the result is not valid.
401
00:33:05,070 --> 00:33:11,360
And the great thing is that we can then go
and cut and paste this example. So I could
402
00:33:11,360 --> 00:33:17,810
say iUnion, this, remove the comma in the
middle, and call this. And well, what
403
00:33:17,810 --> 00:33:23,780
happens here is, what we can see is, we
can see 2 and 4, 1 and 1, and 3 and 5, and
404
00:33:23,780 --> 00:33:27,600
what's not valid about. So, by the way,
this is randomized. So, this always goes
405
00:33:27,600 --> 00:33:31,860
differently. So I have to look at it, too.
So, then it says, well, those two
406
00:33:31,860 --> 00:33:35,260
intervals, they should really just be
merged and they should just be one
407
00:33:35,260 --> 00:33:43,711
interval. Right? And so, it didn't do that
correctly. And the reason for that, maybe
408
00:33:43,711 --> 00:33:55,010
you saw it. So, and, what happened is that
it ran into one of those two cases here
409
00:33:55,010 --> 00:33:59,630
where it says if lo1 greater than hi2 or
lo2 greater than hi1. Remember that I told
410
00:33:59,630 --> 00:34:04,630
you there needs to be a gap of at least
one between them. Remember? And here's an
411
00:34:04,630 --> 00:34:11,030
off-by-one error that says, well. So this
says, they can, lo1 greater than hi2 says
412
00:34:11,030 --> 00:34:14,610
they can still be right next to each
other. Right? And this is what happened
413
00:34:14,610 --> 00:34:19,760
here. We need to make sure that there is
that gap in here. So, I can fix it like
414
00:34:19,760 --> 00:34:38,060
this. Loaded again. Oh, no. There's still
a counterexample. So, and we can try that
415
00:34:38,060 --> 00:34:42,940
out, so, and that's great. We get test
cases that sort of show where the bugs
416
00:34:42,940 --> 00:34:51,470
are. And in this case, well, what happened
here? They still overlap. And what
417
00:34:51,470 --> 00:34:59,711
happened here? So can you see it? So, you
can see that the first two intervals, they
418
00:34:59,711 --> 00:35:05,440
must run into that last case. Right.
Because they overlap. Zero is the interval
419
00:35:05,440 --> 00:35:12,920
from 0 to 3 and the interval from 0 to 5.
They overlap. So we need to get to that
420
00:35:12,920 --> 00:35:18,850
case. And so it merges them. And then it
went and and somehow didn't merge it with
421
00:35:18,850 --> 00:35:30,400
the 6 and the 7 that's there. And, so,
well, if you look at it. So it must have
422
00:35:30,400 --> 00:35:38,800
done this. And and what it did is, it then
went on with the rest there. Let's have
423
00:35:38,800 --> 00:35:58,820
one more look. What actually happened? So
there it is. So, it merged those and then
424
00:35:58,820 --> 00:36:02,640
you can see that it went into a symmetry
problem here. Well, maybe you don't see.
425
00:36:02,640 --> 00:36:08,190
But, you know, this is tricky stuff. I
couldn't do this by myself. So you can see
426
00:36:08,190 --> 00:36:18,780
here that it just tacks the result onto
iset1Rest, whereas the maximum of hi1 and
427
00:36:18,780 --> 00:36:25,890
hi2 could, might violate the consistency
criteria if it's the wrong one, and then
428
00:36:25,890 --> 00:36:30,270
it runs into one of the other cases. Now
I've never seen this tricky one. Does it
429
00:36:30,270 --> 00:36:34,110
make sense? But, can you see that it
should be symmetrical? The last one. Can
430
00:36:34,110 --> 00:36:39,960
you see it? OK, so we'll try and make it
symmetrical. Do it like this. So we'll
431
00:36:39,960 --> 00:36:51,640
say, well, if so, this only works. So if
hi1 is less than hi2. So we really need to
432
00:36:51,640 --> 00:36:59,480
make sure, then it is perfectly. And then
the maximum of those two numbers is hi1.
433
00:36:59,480 --> 00:37:05,380
Does that make sense? And so the max of
those two numbers is hi1 and then it's
434
00:37:05,380 --> 00:37:12,100
perfectly valid to tack it onto iset1Rest.
In the other case, hi2 is greater and we
435
00:37:12,100 --> 00:37:18,480
need to go and do something different and
rip this out here. Stick it in front here
436
00:37:18,480 --> 00:37:35,950
and then. And then. And now it's
symmetrical. OK. So, load this. And, ahh!
437
00:37:35,950 --> 00:37:40,619
It has passed the test. OK, live great.
applause
438
00:37:40,619 --> 00:37:47,100
Thank you. I did practice getting it
correct, right. But you can, you know,
439
00:37:47,100 --> 00:37:51,340
this kind of stuff. It always gets me. I
mean, you know, with old age especially,
440
00:37:51,340 --> 00:37:55,530
this kind of stuff, it always drives the
sweat on my forehead, right? You know,
441
00:37:55,530 --> 00:37:59,340
there's off-by-one. There is, you know, I
don't know how many cases there need to
442
00:37:59,340 --> 00:38:03,140
be. And QuickCheck is the kind of thing
that weeds out the bugs. And even though
443
00:38:03,140 --> 00:38:07,060
it weeds out the bugs in a different order
each time, it always weeds them all out.
444
00:38:07,060 --> 00:38:11,770
OK. So it's a great tool. Now, I recommend
that you try that. It generates tests from
445
00:38:11,770 --> 00:38:18,869
properties. OK, where are we? So let me
let me give you a couple of real world
446
00:38:18,869 --> 00:38:23,030
examples. So if you're using X windows,
there's a there's a tiling, a window
447
00:38:23,030 --> 00:38:26,890
manager, xmonad. It's already a couple of
years old and they don't do much
448
00:38:26,890 --> 00:38:33,330
development on it anymore. That's because
it's correct. Right. laughter Right. And
449
00:38:33,330 --> 00:38:39,280
why is it correct? Well, it's because they
wrote down a lot of properties for the
450
00:38:39,280 --> 00:38:45,690
geometry and the tiling algorithms and
verified them using QuickCheck. And so I
451
00:38:45,690 --> 00:38:49,120
sort of loosely translated. So, Don
Stewart, one of the authors of xmonad
452
00:38:49,120 --> 00:38:53,340
graciously wrote a couple of blog posts on
a simplified version of xmonad and I
453
00:38:53,340 --> 00:39:01,580
translated them into Idris. So, here's a
very simple idea of just a stacking window
454
00:39:01,580 --> 00:39:05,280
manager. So, it doesn't do geometry, it
just has stacks of windows and it has
455
00:39:05,280 --> 00:39:10,460
several workspaces. In each workspace is a
stack of windows. So here's a data type
456
00:39:10,460 --> 00:39:17,029
called a StackSet, its parameterized by a
type called window. We'll see later why
457
00:39:17,029 --> 00:39:20,930
there's a type parameter and why it just
doesn't say what the windows are. And then
458
00:39:20,930 --> 00:39:25,140
it says there's a constructor StackSet
and there's two fields in there. One is
459
00:39:25,140 --> 00:39:34,071
called "current", that's the number of the
workspace that's currently
460
00:39:34,071 --> 00:39:37,460
active. And then there's "stacks",
which is a map from the number of the
461
00:39:37,460 --> 00:39:44,070
workspace to the stack of, to the list of
windows that sit in that workspace. Again,
462
00:39:44,070 --> 00:39:48,270
so here, really the technicalities are
not particularly important, but there is a
463
00:39:48,270 --> 00:39:54,090
bunch of operations that operate on this
window manager configuration. And again,
464
00:39:54,090 --> 00:39:58,300
here, really the details aren't important.
So you could create an empty stack set.
465
00:39:58,300 --> 00:40:03,350
You could say, well, you know, I have the
number of a window that I want to get to
466
00:40:03,350 --> 00:40:07,270
the front. And please make me, please
rotate me, the stack set around so that I
467
00:40:07,270 --> 00:40:12,990
can see it. "peek" means, you know, maybe
I can get the topmost window that the user
468
00:40:12,990 --> 00:40:17,070
is currently looking at. "rotate" means
I'm just going to rotate the workspaces
469
00:40:17,070 --> 00:40:21,030
around in either left or right direction.
That's what that ordering argument. "push"
470
00:40:21,030 --> 00:40:25,660
is, I push a new window onto the current
workspace. "insert" means insert a window
471
00:40:25,660 --> 00:40:31,630
into one of the other workspaces. "delete"
means I delete a window. "shift" means,
472
00:40:31,630 --> 00:40:35,720
also means I shift something with the
windows. Not really important what they
473
00:40:35,720 --> 00:40:42,520
do. But you can imagine again, just as we
did with the interval sets is validity
474
00:40:42,520 --> 00:40:46,820
criterion or an invariant that should hold
for these operations. And it's very
475
00:40:46,820 --> 00:40:51,390
simple. Well, it says well, if you have a
stack set with some windows in it, I'm
476
00:40:51,390 --> 00:40:55,320
just going to tell you whether that stack
set is consistent. And by doing that, I'm
477
00:40:55,320 --> 00:41:01,800
just going to say, well, the current, the
number of the current stack
478
00:41:01,800 --> 00:41:07,280
should not be higher than the number of
window stacks that there are. Right. So,
479
00:41:07,280 --> 00:41:11,051
the number of stacks that there are. And
the other one, that just says a window
480
00:41:11,051 --> 00:41:18,280
should not be in several of the
workspaces. Right? And then I can go and
481
00:41:18,280 --> 00:41:22,910
maybe with this definition, all
those function definitions aren't very
482
00:41:22,910 --> 00:41:26,950
complicated. But, I can go and write a
whole bunch of properties. And if you just
483
00:41:26,950 --> 00:41:30,800
understand, well, maybe the second one,
"prop_view_I", you understand all of them.
484
00:41:30,800 --> 00:41:35,770
It just says, well, for all pairs of a
natural number and a stack set that are a
485
00:41:35,770 --> 00:41:39,970
"stackIndex" and "stackSet", I want, if I
call the "view" function, which is one of
486
00:41:39,970 --> 00:41:45,020
the operations, I want the view function
to produce a consistent stack set. And
487
00:41:45,020 --> 00:41:50,220
then it goes on to do all of that for all
the other ones. At the bottom here, you
488
00:41:50,220 --> 00:41:53,929
can see some prerequisites that need to
hold for the property so that invariant
489
00:41:53,929 --> 00:41:59,360
only needs to hold if the window, if the
number of the window is actually smaller
490
00:41:59,360 --> 00:42:05,119
than the size of the stack set. Otherwise,
I think the function just returns what
491
00:42:05,119 --> 00:42:09,850
would go with it, what went in there, So
that's a very, that's just a very
492
00:42:09,850 --> 00:42:13,901
efficient way to invent properties, to
think of some invariant that shall hold in
493
00:42:13,901 --> 00:42:17,490
your data structure. And if you know
Idris, you can sometimes encode that in
494
00:42:17,490 --> 00:42:21,150
the types, but often that's kind of
tedious. And you can just write it down as
495
00:42:21,150 --> 00:42:25,750
a property and then have QuickCheck check
it for you. And it's not particularly
496
00:42:25,750 --> 00:42:29,790
exciting for the simple definition, but
you can imagine that the actual definition
497
00:42:29,790 --> 00:42:33,710
when you have tiling window management
going on is much more complicated than the
498
00:42:33,710 --> 00:42:38,420
one that you just saw. But you can keep
those same properties, right? There still
499
00:42:38,420 --> 00:42:42,170
needs to be some consistency invariant
that, if you have tilings, the windows
500
00:42:42,170 --> 00:42:46,150
don't overlap, and things like that. That
should be obvious. Write those properties
501
00:42:46,150 --> 00:42:51,452
down, check them using QuickCheck and that
will weed out a lot of the bugs.
502
00:42:51,452 --> 00:42:55,900
Here's an example from our
practice. We, couple months ago, we were
503
00:42:55,900 --> 00:43:03,220
tasked with migrating a giant Visual Basic
6 application. It had a password checking
504
00:43:03,220 --> 00:43:07,120
function there. You can see here a Visual
Basic 6 type signature. And the property
505
00:43:07,120 --> 00:43:14,690
that we wrote was, well, if we create the
hash from the password and we compare it
506
00:43:14,690 --> 00:43:19,590
with the hash that's in the database, then
they should all come out the same. And to
507
00:43:19,590 --> 00:43:23,609
our surprise, that function, that test,
that property, failed when we ran it for
508
00:43:23,609 --> 00:43:29,410
QuickCheck and we had to correct it
because that password hash is restricted
509
00:43:29,410 --> 00:43:36,600
to 11 characters by some restriction in
the database schema. And so that means
510
00:43:36,600 --> 00:43:40,330
that you can use QuickCheck not just to
sort of check the correctness of things
511
00:43:40,330 --> 00:43:44,490
that you already know, but to actually
develop a model for what goes on in your
512
00:43:44,490 --> 00:43:50,780
software, which you don't always know very
well. So that's what we did there. Another
513
00:43:50,780 --> 00:43:57,090
example is, we wrote, for a large
industrial client, we needed to write a
514
00:43:57,090 --> 00:44:01,200
synchronization application. So when you
had two mobile devices and they would sort
515
00:44:01,200 --> 00:44:05,770
of meet as strangers, they would exchange
data and they all needed to look at the
516
00:44:05,770 --> 00:44:10,940
same sort of device configuration data.
And we didn't want them to exchange all
517
00:44:10,940 --> 00:44:15,230
the data every single time. We just wanted
to exchange them, the data blocks that the
518
00:44:15,230 --> 00:44:20,840
other side was missing. And again, there's
great algorithms for this based on Merkle
519
00:44:20,840 --> 00:44:24,470
trees. They're pretty complicated. You
have to do a lot of bit fiddling with
520
00:44:24,470 --> 00:44:28,950
that. But fortunately, the property for
that is pretty easy to write. So here's
521
00:44:28,950 --> 00:44:34,430
the property that says, well, so the
synchronization algorithm works on sets of
522
00:44:34,430 --> 00:44:39,910
blocks, whatever a block is. So you can
see the property here for all pairs of
523
00:44:39,910 --> 00:44:45,130
sets of blocks and more sets of blocks. So
they're called bs1 and bs2. Block set one
524
00:44:45,130 --> 00:44:53,030
and block set two. What we can do is, we
want, if we union those two, then we get
525
00:44:53,030 --> 00:44:57,270
all the blocks in the system. We call that
all, or we can call the synchronization
526
00:44:57,270 --> 00:45:05,490
algorithm and that will give us two new
block sets, block set bs1' and bs2'. And
527
00:45:05,490 --> 00:45:10,940
those block sets are the ones that get
transferred to the other side. OK. And the
528
00:45:10,940 --> 00:45:16,150
criterion here just says if we take the
ones that we have, if we union them with
529
00:45:16,150 --> 00:45:20,190
the ones that we get, we should get all of
them. That should be all of them. And that
530
00:45:20,190 --> 00:45:24,530
should be the same for both sides. And
also, we want the algorithm to be
531
00:45:24,530 --> 00:45:28,850
efficient so we don't want it to transfer
blocks. So we want to make sure that the
532
00:45:28,850 --> 00:45:32,880
blocks that we have and the blocks that we
get are disjoint. That they don't have any
533
00:45:32,880 --> 00:45:37,770
elements in common. Otherwise, we could
make that algorithm trivially correct by
534
00:45:37,770 --> 00:45:42,420
just transferring all the blocks every
single time. And I can tell you, I
535
00:45:42,420 --> 00:45:46,060
sweated. You know, I sweated one or two
weeks over this algorithm and it was
536
00:45:46,060 --> 00:45:52,790
really hard to write. But this one test
weeded out all the bugs that I found along
537
00:45:52,790 --> 00:45:58,730
the way. So that is just super, super
effective. John Hughes has a couple of
538
00:45:58,730 --> 00:46:02,859
papers on hard bugs that he found. So he
found a bug in a distributed database
539
00:46:02,859 --> 00:46:09,600
called mnesia. And that bug was dependent
on opening the database, closing it and
540
00:46:09,600 --> 00:46:14,650
opening it again. So this is not the kind
of bug that you find by just writing a
541
00:46:14,650 --> 00:46:19,710
bunch of smart unit tests. Right? So, if
you did anything shorter in the beginning,
542
00:46:19,710 --> 00:46:24,980
so if you just open the file and then did
some lookups there, that would not
543
00:46:24,980 --> 00:46:29,320
manifest the bug. You really needed to
close and then open again. Have you
544
00:46:29,320 --> 00:46:33,830
turned, have you tried turning it off and
on again? But then the database breaks in
545
00:46:33,830 --> 00:46:39,000
this case. And here's another example
called The Mysteries of Dropbox. So you
546
00:46:39,000 --> 00:46:45,940
can imagine that with Dropbox you really
want certain properties to hold. Right?
547
00:46:45,940 --> 00:46:50,030
And it turns out they didn't. They never
worried about writing properties down. But
548
00:46:50,030 --> 00:46:54,700
John Hughes did it and found a couple of
bugs. So here's one. It's kind of hard to
549
00:46:54,700 --> 00:47:00,711
read where it says client 1 writes a into
a file that was previously empty. So that
550
00:47:00,711 --> 00:47:07,170
funky turnstile there is empty. So writes
a into a file and then deletes the file
551
00:47:07,170 --> 00:47:12,220
and another client writes, replaces, sees
the a in the file replaces it with a b.
552
00:47:12,220 --> 00:47:18,340
And then client 1 goes and writes c into
the file that it previously thought to be
553
00:47:18,340 --> 00:47:23,190
empty. And then unfortunately, even though
you can imagine that you should see either
554
00:47:23,190 --> 00:47:28,660
b or c in that file, but Dropbox deleted
it. So I think they fixed that bug now.
555
00:47:28,660 --> 00:47:37,130
But. so you go. So it goes. Oscar Wikstrom
has a couple of great, pretty recent blog
556
00:47:37,130 --> 00:47:44,740
posts on properties in a screencasts
editor that I highly recommend. So this is
557
00:47:44,740 --> 00:47:51,200
a great tool for finding bugs, but it's
not the same as having a proof. Right? So,
558
00:47:51,200 --> 00:47:56,150
you can still imagine that you can find
very subtle bugs that are not covered by
559
00:47:56,150 --> 00:48:00,040
QuickCheck. QuickCheck just randomizes,
just generates randomized tests. So, that
560
00:48:00,040 --> 00:48:06,510
is not the same thing as making sure that
there aren't any bugs. So the great thing
561
00:48:06,510 --> 00:48:10,960
about Idris and the reason I chose it for
this talk is that Idris allows you to not
562
00:48:10,960 --> 00:48:15,530
just encode properties in the language. It
also allows you to encode proofs in the
563
00:48:15,530 --> 00:48:20,720
language. So here is the associative
property for the list concatenation
564
00:48:20,720 --> 00:48:24,640
operation. And if you look at the top that
has the definition of that function from
565
00:48:24,640 --> 00:48:30,100
the Idris standard library, it says ++, in
goes a list, in goes another list, out
566
00:48:30,100 --> 00:48:35,730
comes a list. Then it says, well, if you
concatenate the empty list with any list,
567
00:48:35,730 --> 00:48:41,080
that is just that list "right". Do you
see that? The second one says, well, if we
568
00:48:41,080 --> 00:48:45,510
concatenate a list that starts with the
element x and goes on with xs, then we
569
00:48:45,510 --> 00:48:50,830
just sort of pull the x in front and
concatenate the rest with "right". So
570
00:48:50,830 --> 00:48:54,890
that's a classic recursive definition of
list concatenation in functional
571
00:48:54,890 --> 00:48:59,060
programming. And now here's something
really strange in Idris. Here's the type
572
00:48:59,060 --> 00:49:04,010
declaration for a definition, again in the
standard library, called appendAssoc. And
573
00:49:04,010 --> 00:49:09,180
it says, if you have a list a, you have a
list b, and you have a list c, and in the
574
00:49:09,180 --> 00:49:14,960
type it says, oh, the associative property
should hold. Right. And so this is a
575
00:49:14,960 --> 00:49:20,980
statement of that property. That's
wonderful. It's not the same as a proof.
576
00:49:20,980 --> 00:49:29,690
So, but writing proofs, who loved that in
math? Oh. Oh, you're good! I didn't. I'm
577
00:49:29,690 --> 00:49:33,900
sorry. So. So the great thing about Idris
is, it helps you write down the proofs.
578
00:49:33,900 --> 00:49:38,730
I'll show you how that works just really,
really briefly. So here's that. So here's
579
00:49:38,730 --> 00:49:42,690
just what I showed you on that slide. So I
can load that in there and it says, well,
580
00:49:42,690 --> 00:49:46,560
you're not done. You didn't write a proof
for that property, but in Idris, you can
581
00:49:46,560 --> 00:49:50,900
just push a bunch of buttons. Now, I love
that. So I can push one button and it
582
00:49:50,900 --> 00:49:55,360
says, oh, well, you should write a proof
of that form. You have lists a, b and c.
583
00:49:55,360 --> 00:49:59,380
Well, now and I can push another button
that says, well, you're doing this on
584
00:49:59,380 --> 00:50:04,800
lists and if you're writing anything on
lists, you always need to distinguish
585
00:50:04,800 --> 00:50:09,000
between the two cases of the empty list
and the list that consists of the first
586
00:50:09,000 --> 00:50:14,590
element x and further element xs. And then
it says, well, write down something, but
587
00:50:14,590 --> 00:50:18,710
then I can tell Idris: Well, I'm too lazy.
I'm not going to write anything so I can
588
00:50:18,710 --> 00:50:23,060
just push a button. And Idris wrote this
so you can see me, but I didn't type this
589
00:50:23,060 --> 00:50:29,359
right. I just pushed a button and it says,
Refl. What is Refl? What could that be?
590
00:50:29,359 --> 00:50:35,810
Well, you can ask it what Refl is. It says
Refl. Oh, you can see here, landing here.
591
00:50:35,810 --> 00:50:39,530
Refl is just a proof sort of a built in
proof that says that if two things are
592
00:50:39,530 --> 00:50:44,720
true, two things are equal, if they're
identical, if they're the same. Right. And
593
00:50:44,720 --> 00:50:48,590
that kind of makes sense in the first
equation, because the first equation of
594
00:50:48,590 --> 00:50:53,420
appendAssoc corresponds to the first
equation of ++. Can you see that, how it
595
00:50:53,420 --> 00:51:01,420
corresponds? Can you see that? The first
list is empty. Can you see that? Can you
596
00:51:01,420 --> 00:51:05,180
see how the first list is empty with the
first equation of appendAssoc and the
597
00:51:05,180 --> 00:51:13,700
first list is empty up there with ++. Can
you see that? OK. And then it just says,
598
00:51:13,700 --> 00:51:18,501
well, then obviously. Well, not quite
obviously, but then sort of the the way
599
00:51:18,501 --> 00:51:22,650
that the definition works, it comes out
just right. So what's really important is
600
00:51:22,650 --> 00:51:27,740
that Idris accepts that proof with the
first. The second one is slightly more
601
00:51:27,740 --> 00:51:35,960
tricky. But again, we can get help because
we know that appendAssoc is this recursive
602
00:51:35,960 --> 00:51:39,480
function. It recurses on the first
argument. So we're just going to do the
603
00:51:39,480 --> 00:51:47,281
same thing in the proof. And you can tell
Idris that it should use that, that it
604
00:51:47,281 --> 00:51:52,440
should use that fact, if you will. So
here's the recursive call. And again, I'm
605
00:51:52,440 --> 00:51:56,440
too lazy to push a button. But if I push
that button, it also puts in Refl and
606
00:51:56,440 --> 00:52:02,260
there's loads. So this it might be a
mystery to you how it works, but this is a
607
00:52:02,260 --> 00:52:06,980
proof of the associate of property of the
list concatenation in Idris. And since
608
00:52:06,980 --> 00:52:12,190
Idris helps you write it, it's kind of fun
to do that. Oddly enough, even for
609
00:52:12,190 --> 00:52:16,090
somebody who doesn't, usually who doesn't
usually doesn't enjoy proofs. So the way
610
00:52:16,090 --> 00:52:22,070
that you program in Idris, we haven't done
that a lot in this talk, is that you put a
611
00:52:22,070 --> 00:52:26,280
lot of information in the types and the
more information you put in the types, the
612
00:52:26,280 --> 00:52:30,320
better Idris will get at figuring out the
correct definition. And you don't have to
613
00:52:30,320 --> 00:52:38,890
do it by yourself. OK. So that's really
nice. OK. So we got that and and sort of
614
00:52:38,890 --> 00:52:43,790
these kinds of proof assisting systems
such as Idris have been used in a lot of
615
00:52:43,790 --> 00:52:48,990
real world systems. One one prominent
example is SEL4, a version of the L4
616
00:52:48,990 --> 00:52:52,810
micro kernel, has a long history, but
important properties of that kernel have
617
00:52:52,810 --> 00:52:57,410
been verified. It runs in the security
enclave on iOS and even though it's
618
00:52:57,410 --> 00:53:01,280
written in C, it provably does not have
buffer overflows or a lot of the nasty
619
00:53:01,280 --> 00:53:06,161
things that are responsible for a lot of
security exploits. Compcert is another
620
00:53:06,161 --> 00:53:11,310
example, which is a verified, I should
mention this has been verified with the
621
00:53:11,310 --> 00:53:15,690
help of a tool called Isabel. Also, great
fun to use. There's a project called
622
00:53:15,690 --> 00:53:20,390
called Compcert, which is a verified C
compiler, which is important for a lot of
623
00:53:20,390 --> 00:53:24,360
certified software where, you know, the
source code might be certified. But how do
624
00:53:24,360 --> 00:53:29,610
you know that the compiler generates
correct code? And you know, because it's
625
00:53:29,610 --> 00:53:34,430
been proven to be correct. And even there,
you can shoot, you can cheat sometimes. So
626
00:53:34,430 --> 00:53:38,050
for example, register allocators, very
complicated, very hard to prove right.
627
00:53:38,050 --> 00:53:41,710
But what you can do is you can write a
checker that the register allocator did
628
00:53:41,710 --> 00:53:47,810
its job, did its job well and you can
verify the checker. And so you can cheat a
629
00:53:47,810 --> 00:53:52,150
bit. So there's tools for that. We've seen
Idris and there's a number of other tools
630
00:53:52,150 --> 00:53:56,510
and they're getting more and more mature.
And they're great fun, really. They really
631
00:53:56,510 --> 00:54:01,540
are great fun. But, you know, going back,
switching down a gear a little bit,
632
00:54:01,540 --> 00:54:05,130
there's lots of useful properties that you
can look for in your programs. So
633
00:54:05,130 --> 00:54:09,970
commutativity might be useful that you can
switch the two arguments for an operation.
634
00:54:09,970 --> 00:54:13,790
Also, if you have relations, you might
remember that from some math class,
635
00:54:13,790 --> 00:54:17,580
there's some properties here like
reflexivity, symmetry, antisymmetry and
636
00:54:17,580 --> 00:54:24,320
transitivity. Reflexivity says that a is
always related to a. Symmetry says if it's
637
00:54:24,320 --> 00:54:28,920
one way, if a and b are one way related,
they need to be related the other way too.
638
00:54:28,920 --> 00:54:32,840
Antisymmetry intuitively would seem kind
of the opposite. That doesn't make sense.
639
00:54:32,840 --> 00:54:38,770
It's just says: if two things are related
in both both ways around. So, for example,
640
00:54:38,770 --> 00:54:43,369
you know, orders like less or equal are
antisymmetrical, then they must be the
641
00:54:43,369 --> 00:54:47,820
same. And transitivity just says that you
can form chains of your relation. So those
642
00:54:47,820 --> 00:54:53,260
are a little dictionary of useful
properties that you can look for. Let me
643
00:54:53,260 --> 00:54:58,300
close with one fancy property that you've
probably seen somewhere and that property
644
00:54:58,300 --> 00:55:03,810
is called Functor. And you might have seen
in your programing language, in your list
645
00:55:03,810 --> 00:55:08,540
library or in your stream library. There's
a function called map, right? And you
646
00:55:08,540 --> 00:55:13,910
know, even Java has that and has had it
for many years. And what map does is, if
647
00:55:13,910 --> 00:55:18,670
you have some, you know, in Java, for
example, it says "Stream", or it might be
648
00:55:18,670 --> 00:55:22,640
"List", right. It says, well, if I have a
list of As, I can apply a function to each
649
00:55:22,640 --> 00:55:26,980
element of that list. But you can
generalize that, it doesn't have to be
650
00:55:26,980 --> 00:55:31,190
lists. It could be an Optional of As, for
example. You could also apply a function
651
00:55:31,190 --> 00:55:35,320
to the value that's in there. So, you can
generalize that notion, and then it's a
652
00:55:35,320 --> 00:55:42,690
functor. And, of course, in Idris, you can
write down equations for functors. And,
653
00:55:42,690 --> 00:55:47,660
please ignore the technicalities here,
(stammers) but, if you sort of pick out
654
00:55:47,660 --> 00:55:52,198
where it says "functorIdentity",
the middle row says g v equals to v,
655
00:55:52,198 --> 00:55:56,000
which means g is the identity
function. When you feed in v, you always
656
00:55:56,000 --> 00:56:00,640
get v back. And when you use map with the
identity function, you apply the identity
657
00:56:00,640 --> 00:56:05,530
function on each element of your list or
whatever it is. Then then you always get
658
00:56:05,530 --> 00:56:10,990
back the same list. And here just says you
get function composition. So if you apply
659
00:56:10,990 --> 00:56:14,930
one function and then another function and
you do that either inside or outside the
660
00:56:14,930 --> 00:56:18,490
map, you should also get the same results.
So there's also just as there is
661
00:56:18,490 --> 00:56:24,230
associativity with monoids, with functors.
There's these laws and you might think,
662
00:56:24,230 --> 00:56:27,999
well, where would I look for a functor?
I've never seen a functor except for the
663
00:56:27,999 --> 00:56:31,959
ones on streams. A couple weeks ago in a
training, somebody said, well, you always
664
00:56:31,959 --> 00:56:38,790
start with that animal example. Shouldn't
you look for a functor there? And I was
665
00:56:38,790 --> 00:56:42,840
kind of, you know, sweat broke out on my
forehead, I was like, where's that gonna
666
00:56:42,840 --> 00:56:47,330
go? But, we came up with this. So, if you
go back, you can see that this is obvious.
667
00:56:47,330 --> 00:56:52,219
So, what you need for functors is, you
need a type parameter. Right. And so you
668
00:56:52,219 --> 00:56:55,740
just look for a place to stick a type
parameter, any place here at all. And
669
00:56:55,740 --> 00:56:59,439
if you look at Dillo and Parrot, they
both prominently have this weight thing.
670
00:56:59,439 --> 00:57:03,200
Right. And so that seems more important
than the other two properties, which are
671
00:57:03,200 --> 00:57:09,589
specific to particular kind of animal. And
so the weight, the thing to do is just to,
672
00:57:09,589 --> 00:57:14,060
well, you can see I replaced upper case
Weight by lower case weight and made that
673
00:57:14,060 --> 00:57:19,810
into a type parameter, and, I can then
provide a functor implementation down
674
00:57:19,810 --> 00:57:23,581
there. And you might think, what is that
good for? Well, I don't know. Well, one
675
00:57:23,581 --> 00:57:27,110
thing that you could do is you could
provide a different representation for
676
00:57:27,110 --> 00:57:31,073
weights. Another thing that you could do,
if you look at the type for runOverAnimal,
677
00:57:31,073 --> 00:57:35,820
it says animal weight -> animal weight and
weight is a type variable. What that type
678
00:57:35,820 --> 00:57:41,120
signature tells you is that runOverAnimal
does not know what weight is. And that
679
00:57:41,120 --> 00:57:45,630
means that the weight cannot change as a
result of that function. And you see that
680
00:57:45,630 --> 00:57:49,650
in the type signature you get immediate,
small benefit, but you get a benefit even
681
00:57:49,650 --> 00:57:55,590
with silly examples such as this one. And
that really brings me to the end. So in
682
00:57:55,590 --> 00:58:00,480
your software, in your domain model, look
for a Combinator, look for a function that
683
00:58:00,480 --> 00:58:05,800
will combine two things into a bigger
thing. See if you can make that thing
684
00:58:05,800 --> 00:58:09,609
associative and look for a neutral
element. And very often you will find one;
685
00:58:09,609 --> 00:58:13,490
make it a monoid, you know, say monoid a
couple of times. You'll remember it.
686
00:58:13,490 --> 00:58:17,461
You'll remember it. Generally, write
properties for the things, for the
687
00:58:17,461 --> 00:58:23,390
operations in your software, test those
properties using QuickCheck. You know, if
688
00:58:23,390 --> 00:58:27,770
you feel like you have a lot of time,
prove them correct. Find the functor. If
689
00:58:27,770 --> 00:58:32,390
you found, if you found the monoid, you
know, find the functor next. You know, and
690
00:58:32,390 --> 00:58:35,960
it takes, it might take time. I'm
very old. As you noticed at the beginning.
691
00:58:35,960 --> 00:58:42,190
So. So it gets easier over the years and
it will just seem like a regular staple of
692
00:58:42,190 --> 00:58:47,100
your of your arsenal when you program. And
of course, when the important properties
693
00:58:47,100 --> 00:58:51,070
in your program have either been written
down, if they've been tested with
694
00:58:51,070 --> 00:58:54,840
QuickCheck or even proven, then you can
sleep much more soundly than maybe you
695
00:58:54,840 --> 00:59:04,820
currently can. Thank you very much.
applause
696
00:59:04,820 --> 00:59:10,841
Herald: Thank you, Mike. So I see we have
three minutes for questions. Maybe that's
697
00:59:10,841 --> 00:59:16,119
two or three questions. If you have any
come to the microphones, please. Do we
698
00:59:16,119 --> 00:59:25,950
have a question from the Internet? No, not
yet. So, microphone two. Right.
699
00:59:25,950 --> 00:59:34,050
Question: Hi. So, QuickCheck generated
hundred tests. Yes. What can we say about
700
00:59:34,050 --> 00:59:38,580
the quality of this test? Could we say
your program was correct using thoese
701
00:59:38,580 --> 00:59:42,561
tests? Are these tests good?
Answer: Yeah. Very good question. So what
702
00:59:42,561 --> 00:59:46,380
would you say about the quality of the
tests? And indeed, if you really do serve
703
00:59:46,380 --> 00:59:50,760
industrial strength applications, a quick
QuickCheck comes with a bunch of tools
704
00:59:50,760 --> 00:59:54,880
that let you look, for example, at the
distribution of the individual example
705
00:59:54,880 --> 01:00:02,250
rated and while you didn't quite see me do
that, but I mean, for your domain objects,
706
01:00:02,250 --> 01:00:06,150
you will typically write generators that
will generate those examples and you can
707
01:00:06,150 --> 01:00:09,990
reason about the distribution of those.
And you absolutely should do that because
708
01:00:09,990 --> 01:00:15,420
otherwise you might miss large areas of
your test space. So, but there
709
01:00:15,420 --> 01:00:20,390
are tools and they help you do that. But
even if you don't do that, you know, it's,
710
01:00:20,390 --> 01:00:24,260
you find a lot of, I found a lot of
bugs in my software even without worrying
711
01:00:24,260 --> 01:00:28,560
about that. But if you go beyond that,
look at the distribution thing.
712
01:00:28,560 --> 01:00:34,360
Herald: Thank you, next one, please.
Number two.
713
01:00:34,360 --> 01:00:40,660
Q: Let's say I've hacked a program, for
example, in Java or C# or whatever. How do
714
01:00:40,660 --> 01:00:50,080
I, how do I apply what I learned so
far? So, where do I start when I have
715
01:00:50,080 --> 01:00:57,690
already completed the C# program with,
yeah, how do I apply QuickCheck on that?
716
01:00:57,690 --> 01:01:01,230
A: So, just pragmatically because it's
written in C#? That's the question?
717
01:01:01,230 --> 01:01:03,720
Q: Yes
A: So, well, I have to be very concrete
718
01:01:03,720 --> 01:01:07,839
here, I mean, so, if you can think
properties, right, one way to do, so,
719
01:01:07,839 --> 01:01:14,790
for example, so, C# you can link with F#
and there is a QuickCheck version for F#
720
01:01:14,790 --> 01:01:19,089
called "FsCheck". And FsCheck, actually,
even though it's itself written in F#, you
721
01:01:19,089 --> 01:01:23,460
can also use it from C#. So, you have two
options. You can write your tests in a
722
01:01:23,460 --> 01:01:28,491
slightly more awkward fashion in C#, or
you could just link your code with F# test
723
01:01:28,491 --> 01:01:35,330
suite and write it down there. And there
is a fairly reasonable Java QuickCheck, I
724
01:01:35,330 --> 01:01:39,530
hear. Another idea would be to use the
slightly more fancier, the slightly
725
01:01:39,530 --> 01:01:45,150
fancier QuickChecks that exist for Scala
and Enclosure. I'm sure there's one for
726
01:01:45,150 --> 01:01:49,220
Kotlin as well, and link that against your
Java code. Does that answer your question?
727
01:01:49,220 --> 01:01:54,510
Q: So whatever language I use, I have to
find out what the correct implementation
728
01:01:54,510 --> 01:01:56,890
of QuickCkeck?
A: Yeah. Yeah. But as I said, I mean
729
01:01:56,890 --> 01:02:00,800
usually, a fun thing I do in training is,
I chat "QuickCheck" and somebody calls on
730
01:02:00,800 --> 01:02:04,679
the language, you know. Quick, QuickCheck
PHP or something like that. And there is
731
01:02:04,679 --> 01:02:07,989
one, sure enough, I didn't know about before.
Q: Thank you.
732
01:02:07,990 --> 01:02:12,140
Herald: All right. Thank you. And thank
you, Mike, again, for showing us a way to
733
01:02:12,140 --> 01:02:15,251
sleeping soundly.
A: Thank you.
734
01:02:15,251 --> 01:02:20,649
applause
735
01:02:20,649 --> 01:02:27,109
postroll music
736
01:02:27,109 --> 01:02:45,600
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!