Math for hackers

Edit subtitles

0:00 - 0:22

ChaosWest TV intro music
0:26 - 0:30

Herald: Good afternoon, everybody, and
welcome back to Chaos TV West.
0:31 - 0:37

We are going to see a prerecorded
talk by Michael Sperber
0:37 - 0:40

about mathematics for hackers.
0:40 - 0:45

Michael is… once was a mathematics
student, but never finished his degree
0:45 - 0:48

and is in the process of
revisiting this old love of his.
0:50 - 0:53

Dieser Vortrag wird auch
simultan übersetzt in Deutsch.
0:53 - 0:56

And, well, enjoy.
0:56 - 0:57

Michael Sperber: Welcome!
0:57 - 1:00

Well, I'm here to get you all
excited about math for hackers.
1:00 - 1:04

Now, you can have an entire hacker's
life without having encountered
1:04 - 1:09

much of mathematics. After all, this
conference and this community to a large
1:09 - 1:13

degree really is about finding unexpected
behavior in software such as Heartbleed.
1:13 - 1:18

I mean, there's zillions of examples, but
Heartbleed is an example where, you know,
1:18 - 1:22

the OpenSSL API exposed secret memory
through a regular API call. That was
1:22 - 1:28

certainly unexpected. Now, you might
consider this unexpected behavior to be
1:28 - 1:31

just an inevitable part of software
development because it's so complex,
1:31 - 1:35

right? Things that you didn't expect
happen due to all the complications and
1:35 - 1:40

the complexity. On the other hand, with
critical infrastructure, such as OpenSSL,
1:40 - 1:46

you really want certain aspects of that to
really run reliably, correctly. And so
1:46 - 1:49

really, that unexpected behavior comes
down to a correctness issue. And,
1:49 - 1:54

of course, that's where mathematics might
come in because mathematics deals in
1:54 - 1:59

correctness a lot of the time in the form
of proofs. Now, you can certainly apply
1:59 - 2:06

proofs to software systems to great
benefit. But of course, proofs are often
2:06 - 2:11

tedious and a lot of work. They involve a
great deal of mathematical intuition.
2:11 - 2:14

You might not feel like that today, and
you don't have to, because I will talk
2:14 - 2:20

about something else today. I will instead
talk about mathematics as a language.
2:20 - 2:23

Right. This is not a new idea,
it's hundreds of years old.
2:23 - 2:26

Here's a quote from Galileo
that said, well,
2:26 - 2:32

the world really can only be understood
in terms of the language of mathematics.
2:32 - 2:35

And, you know, a lot of
mathematics concerns itself with finding
2:35 - 2:40

formal methods for the world around us.
Now I want to emphasize the fact that
2:40 - 2:45

really mathematics doesn't really model
the world around us directly. It really
2:45 - 2:47

does that through the detour through our
perception, right?
2:47 - 2:51

Mathematics really is
a fundamentally human construct
2:51 - 2:55

that our mind uses to understand
the world around us.
2:55 - 2:59

And in that way, it's also a social
construct because, well, we don't really
2:59 - 3:04

need mathematics, if not for communication
between different people.
3:04 - 3:10

And that's really where, you know, the
value of mathematics for software comes in
3:10 - 3:15

because software also is an image, sort of.
Our perceptions perceptions of the world
3:15 - 3:20

around us, a lot of the time. At least
useful software is. So, one particularly
3:20 - 3:24

useful part of mathematics for hackers,
for software construction, is, well, first
3:24 - 3:29

of all, the language of equations. And
here's a couple of equations, right?
3:29 - 3:33

x plus two equals six, that's very simple,
specifies the value for x.
3:33 - 3:37

This next equation, well, it has...
so x is a variable
3:37 - 3:40

and the next equation has two variables,
has x and y.
3:40 - 3:44

So it has two x squared plus
two y minus two equals zero.
3:44 - 3:49

That's a quadratic equation. Might even
have, you know, several solutions. And
3:49 - 3:54

finally, there's an equation that involves
a function whose definition is unknown.
3:54 - 3:58

It just says, well,
f(5) plus f(7) equals 70.
3:58 - 4:01

Now, you know,
when it comes to functions,
4:01 - 4:04

well, those are things that also
exist in software. So we might
4:04 - 4:09

conceivably use the language of equations
involving functions to describe functions
4:09 - 4:14

in our software. So we might take that
equation to describe that piece of
4:14 - 4:18

software written over on the right-hand
side. And, well, you know, we could try to
4:18 - 4:22

kind of work through it. Right? You can
see there's a function, there's a variable
4:22 - 4:27

that's external to that function. So it
goes across several invocations of it,
4:27 - 4:30

called y. And it's initialized to 7.
4:30 - 4:34

And maybe if we call f(5),
x is going to be five.
4:34 - 4:37

Y'know, try to go through it in your mind.
4:37 - 4:41

z is going to stash
the old value of y, which is seven,
4:41 - 4:44

and then y is going to be set to x.
4:44 - 4:48

So it's not going to be five. And then we
multiply x with the old value of y, so we
4:48 - 4:53

multiply 5 with 7 getting 35.
So f(5) might return 35.
4:53 - 4:59

And with f(7), we then after that call f(7)
y is now 5,
4:59 - 5:04

we stash the old value
in z, so it's now 5
5:04 - 5:06

x is 7,
and it returns 35 again.
5:06 - 5:10

Only the product, the multiplication,
works in the other order.
5:10 - 5:14

And the sum of that is 70, so you might
say, well, that equation describes that
5:14 - 5:20

function over there. But really, we had to
go through a lot of detail and sort of
5:20 - 5:24

tracing through the program to figure that
out. And also, it only works with that
5:24 - 5:30

initial value of y and it only works if we
call those functions from left to right.
5:30 - 5:34

But the language of equation doesn't say
anything like that, that we really need to
5:34 - 5:38

respect order or something like that.
Really, f(5), in mathematics, should
5:38 - 5:43

always return the same value. Otherwise,
you know, it becomes much harder to really
5:43 - 5:48

consistently apply the language of
equations to programs. Therefore,
5:48 - 5:52

Really, that's I guess the first hint
that mathematics gives you,
5:52 - 5:56

is that you should write functions that
always return the same value given the
5:56 - 6:03

same input. But, I digress.
Another aspect of the language of
6:03 - 6:07

equations is a concept called
substitution, that you probably know. So
6:07 - 6:13

imagine that first equation says x equals
to 7. And then there's a formula that
6:13 - 6:17

has that variable occurring. And in the
context of that first equation, what we
6:17 - 6:22

can do is you can kind of plug in that
equation, we can substitute every
6:22 - 6:29

occurrence of x with the number 7 in
this case and then use that to figure out
6:29 - 6:33

the final result of that formula. That's
also something that fundamentally only
6:33 - 6:40

works, you know, if x doesn't change over
time. If your functions really return the
6:40 - 6:44

same value every time you call them with
the same input. Now, the substitution
6:44 - 6:50

principle is not just for equations of the
form where you say x is a certain number.
6:50 - 6:54

You might also have an equation that
expresses x in terms of another variable,
6:54 - 6:59

such as z in this case. And you know, that
first equation means, well, I can plug in
6:59 - 7:07

2z every time x occurs in that formula and
use that to do further evaluation or
7:07 - 7:12

simplification of that formula. And what's
more, it also works the other way around,
7:12 - 7:17

right? I can work. I can sort of read that
equation in the second line, also from
7:17 - 7:23

right to left. So order does not matter.
So that's a fundamental principle of a
7:23 - 7:27

branch of mathematical algebra, right,
that we have variables and that we have
7:27 - 7:33

that substitution principle and that we
have equations. So well, here's a well
7:33 - 7:38

known algebraic formula, the binomial
formula, or one of them, at least.
7:40 - 7:44

You might've learned that at high school.
But today I want to look at a couple of
7:44 - 7:51

more slightly more fundamental equations
that might be part might have been part of
7:51 - 7:56

your school algebra class. So I want to
point out that all of these equations have
7:56 - 7:59

names, and you might try to think of them
right now, but I'll try to go through them.
7:59 - 8:04

You might think of the names and
then I'll resolve when I'm done. So the
8:04 - 8:08

first equation says when you multiply
three numbers, it doesn't matter whether
8:08 - 8:12

you will first multiply the first and the
second one and then the result with the
8:12 - 8:18

third one, or whether you first multiply
the second and the third one and then
8:18 - 8:22

multiply with the first (I'm going from
right to left here). The second equation
8:22 - 8:27

says you can swap the two arguments of a
multiplication, and the third one says
8:27 - 8:31

that you can kind of multiply out a sum in
the sense that,
8:31 - 8:36

when you have
a times the sum of b and c,
8:36 - 8:42

you can multiply a individually,
with each of the of the sums there.
8:42 - 8:46

Now here are the names:
The first one says, well, it doesn't
8:46 - 8:52

matter which way we associate the
multiplication with these three numbers,
8:52 - 8:54

so it's called "associativity".
8:54 - 9:00

The second one? Well, it comes
from a verb that says "to swap".
9:00 - 9:02

So it's called "commutativity".
9:02 - 9:07

And the third one is about distributing
a multiplication over an addition.
9:07 - 9:09

So it's called "distributivity".
9:09 - 9:14

And so these are all very valuable
and useful equations, but
9:14 - 9:17

really the most useful one is the first
one, associativity.
9:17 - 9:21

And associativity is something that
works not just for plus.
9:21 - 9:26

It also works for multiplication,
for example.
9:26 - 9:30

So really, when you talk about
associativity, you don't talk about it in
9:30 - 9:36

isolation. You have to say what operation
you mean is associative or is not
9:36 - 9:41

associative; in this case, "plus" or
"times". Moreover, associativity is
9:41 - 9:45

something that's not restricted to
numbers. So, for example, here are two
9:45 - 9:51

Boolean equations that specify
associativity for a Boolean "or" or "and",
9:51 - 9:56

so with the "or" operation you can take
three Boolean values, it doesn't matter
9:56 - 9:59

which way you "or" them up,
you always get the same result.
9:59 - 10:03

And the same works for
the Boolean "and" operation.
10:03 - 10:05

While these are kind of primitive values,
10:05 - 10:08

but you can also have
values that have structure.
10:08 - 10:11

So, for example, you might
program with a list data structure.
10:11 - 10:16

So you have a list with elements,
a1, a2, and so on, until aN.
10:16 - 10:23

And you might take two lists, where the
second one has elements b1, b2, ..., bm
10:23 - 10:28

and you concatenate them together. I
wrote this with that funny bow tie symbol
10:28 - 10:33

there, concatenation. So you concatenate
these two lists, and that operation,
10:33 - 10:39

that concatenates any two lists is,
if you think about it, also associative.
10:39 - 10:44

Now, what value does it have
to know that an operation is associative?
10:44 - 10:48

So here's a classic example where that's
valuable. One way to think about,
10:48 - 10:53

associativity as well: you can move the
parentheses around, right? And if you can
10:53 - 11:00

move the parentheses around, it doesn't
matter where they are. And that means that
11:00 - 11:04

you can leave them out entirely if you
have sort of a chain of the same kind of
11:04 - 11:10

operation. So imagine you have a large
parallel computation that's structured in
11:10 - 11:15

the following way: you have individual
computations a, b, c, d and so on until z,
11:15 - 11:18

and they're all independent of one
another. And to get the result of your
11:18 - 11:23

overall computation, you all combine them
with that operation that's written as that
11:23 - 11:28

black rhomboid. I'm just going to say
diamond from now. So the diamond
11:28 - 11:35

operation. And so the combination works
with a diamond operation. But if you know
11:35 - 11:39

that the diamond operation is associative,
it means you can sort of bunch up your
11:39 - 11:46

"abc"s any way that you like. And so, for
example, you might say, Well, if I've got
11:46 - 11:51

four machines working on this computation,
I can split my individual computations in
11:51 - 11:56

four bunches. Combine, you know, diamond,
the first bunch, the second bunch, the
11:56 - 12:00

third bunch and the fourth bunch, and then
combine the results of those and I get the
12:00 - 12:06

same results. It doesn't really matter if
it's five bunches or six or seven, or even
12:06 - 12:10

if it's irregular, right? And so that's
the freedom that associativity gives you
12:10 - 12:17

in distributing your computation. And
this, what what I just told you, this is a
12:17 - 12:21

classic big data abstraction for large
scale parallel computations called
12:21 - 12:28

MapReduce. But I really want to talk about
something more fundamental today, which is
12:28 - 12:33

how we think about the things that happen
in our software. So here is an excerpt
12:33 - 12:38

from the Java API documentation and ir
draws an oval, so it has to do with
12:38 - 12:44

graphics, with images. And you notice,
right the draw over operation, it has
12:44 - 12:51

returned type void and the documentation
there talks about it in terms of pixels.
12:51 - 12:58

Now, really, we think of an oval as this
thing, right, as an object or a shape or
12:58 - 13:02

something like that, right, as an entity
in its own right. But if you look at the
13:02 - 13:06

documentation there, you can see that
drawOval doesn't create an object that
13:06 - 13:12

corresponds to the notion or mind has of
an oval, right? It just flips the colors
13:12 - 13:18

of a bunch of pixels in some imagined
canvas that drawOval paints on. So that's
13:18 - 13:22

very different, right? And then the notion
of having a thing that's an oval is really
13:22 - 13:27

something that our mind reconstructs from
those pixels. But that gives us a very
13:27 - 13:33

limited sort of API for dealing with
images. And you can do that differently in
13:33 - 13:37

a way that that is amenable to algebra and
mathematics. And well, you can find that
13:37 - 13:43

in several places. One that's particularly
accessible is the Racket Image Teachpack
13:43 - 13:46

library that comes with the racket system.
So I put a download link down there,
13:46 - 13:52

generally a great system to play with. And
so this is how it works: Well, Racket is a
13:52 - 13:57

lisp like language, so the syntax works of
a different from Java or C in that when
13:57 - 14:03

you call a function, the parentheses go,
around the function call and the arguments
14:03 - 14:08

and the arguments are separated by white
space rather than commas. So first thing
14:08 - 14:12

is you call a function called circle, you
pass it three arguments: 50, solid, and
14:12 - 14:17

red, and it returns a red circle. Well, as
you might imagine. But really, I want you
14:17 - 14:23

to understand that it doesn't flip pixels
or something like that. It really returns
14:23 - 14:30

an object representing the circle. And
it's only the racket API that then
14:30 - 14:36

displays that object as an actual picture
that we can view. But it could also do
14:36 - 14:41

something else with that picture: save it
to a file, send it to somebody over the
14:41 - 14:47

network or something like that. So well,
first function returns a circle. Second
14:47 - 14:52

function is also square function that
returns a square. You can see I can put
14:52 - 14:56

something, I can put outline instead of
solid to say that I want just the outline
14:56 - 15:01

of a square or there's this blue star at
the bottom returned by the star function.
15:01 - 15:04

So these three functions are kind of
primitive in the way that they create
15:04 - 15:09

images out of thin air from a bunch of
numbers and strings. And here's another
15:09 - 15:16

such function called right triangle that
creates a triangle out of thin air. Now
15:16 - 15:21

there's three more operations here that
don't create something out of thin air. So
15:21 - 15:24

there's the rotate function, for example:
it takes an image. An existing image - it
15:24 - 15:28

has to exist before you call "rotate" and
rotates it, in this case by 10 degrees,
15:28 - 15:33

just a little bit. And then there's two
other functions, flip-vertical, which
15:33 - 15:37

flips an image vertically and flip-
horizontal, which flips it horizontally,
15:37 - 15:43

so creates a mirror image, if you will.
Now, these two functions, they transform
15:43 - 15:48

an image in some way. And what's important
to understand is an image object goes in
15:48 - 15:54

and they returned an image object also.
And that image object that's returned, you
15:54 - 15:59

can pass it to other operations. So, for
example, that's what happens here. You
15:59 - 16:04

can, for example, rotate that triangle
again, as we did on the slide before. And
16:04 - 16:08

you can pass that result, you can stick it
into flip-vertical to then flip the
16:08 - 16:14

rotated image. Now what you see here is
particular examples. So "what happens to
16:14 - 16:20

the triangle when you first rotate and
then flip?" But some of these things can
16:20 - 16:26

be generalized to general equations that
hold for all images. So here's two very
16:26 - 16:30

simple examples. The first one says, Well,
if I flip an image and I flip it again, I
16:30 - 16:36

get the same image out again. And then I
think hopefully plays to your visual
16:36 - 16:40

imagination or intuition. The second
equation says, Well, if you flip
16:40 - 16:44

horizontally and then vertically or you do
it the other way around your first flip
16:44 - 16:47

vertically and then horizontally, you get
the same result. And again that should
16:47 - 16:53

conform to your visual intuition. So those
are traditional equations that
16:53 - 17:00

characterize how images work and how image
construction works. And we can get some
17:00 - 17:06

particularly useful equations from these
three operations that are on this slide
17:06 - 17:11

here. Now what they do is, they don't just
take one image and output one image the
17:11 - 17:17

way that rotate and the flips do, but they
take two images each and return one image
17:17 - 17:21

that results from combining them. So the
first function, "beside", takes two images
17:21 - 17:27

and sticks them together horizontally,
gives you the resulting image. "above"
17:27 - 17:30

puts them above each other. Gives you the
result. An "overlay" puts them on top of
17:30 - 17:37

each other. And these are operations. You
know, they have the same form as "plus" or
17:37 - 17:43

list concatenation or multiplication or
"or" or "and" in the sense that "or" takes
17:43 - 17:47

two Booleans and returns a Boolean. "plus"
takes two numbers, returns a number.
17:47 - 17:51

"beside" takes two images, returns an
image. Same thing for "above" and
17:51 - 17:55

"overlay". And whenever you have that, you
have a function that takes two things and
17:55 - 17:59

returns one thing, you can think about
associativity and you should. And in fact,
17:59 - 18:04

these functions, if you kind of work again
for your visual intuition, you figure out
18:04 - 18:09

that "beside", "above" and "overlay" are
all associative operations. So, with
18:09 - 18:13

"overlay", for example, should be
especially intuitive and that if you have
18:13 - 18:16

three images on top of each other, it
should not matter whether you first kind
18:16 - 18:21

of staple the top ones together and then
the bottom one to the bottom, or whether
18:21 - 18:24

you first staple together the bottom two
and then put the top one on top. In each
18:24 - 18:34

case, you have layers in the same
sequence. Now what can you do knowing that
18:34 - 18:37

you have an associative operation? We
already talked about parallelization.
18:37 - 18:43

Generally, you can do optimizations based
on some of the other equations, for
18:43 - 18:47

example, the equation that says, "flip-
horizontal twice doesn't do anything". You
18:47 - 18:52

can use that to optimize, right? When you
see somebody doing a flip twice, you just
18:52 - 18:58

throw out those operations. You don't have
to compute the actual flip. You can also
18:58 - 19:02

imagine computations where you can cache
intermediate results in the same way that
19:02 - 19:07

you kind of process that parallel
computation in a tree-shape, and it gives
19:07 - 19:13

you great flexibility in that caching and
recombining the cached values. So you can
19:13 - 19:18

generally just use that for performance.
You could also turn those equations into
19:18 - 19:22

what's called property based testing.
Right. You could stick in random values
19:22 - 19:27

for the variables and test whether
equations are actually fulfilled. And that
19:27 - 19:31

might help you debug or ensure the
correctness of your library. And finally,
19:31 - 19:36

if you really need something to be
correct, then you could provide for the
19:36 - 19:41

proofs giving you added assurance using a
technique called induction, because you
19:41 - 19:45

combine two things into a thing that you
kind of have to walk that construction
19:45 - 19:51

backwards. But again, this talk is not
really concerned with proof so much. But
19:51 - 19:56

again, I mentioned that earlier, I really
want to talk about the modeling aspect
19:56 - 20:01

that this mathematical way of thinking
about it suggests, right? Again, remember
20:01 - 20:07

of the Java drawOval function that just
flips the color of a bunch of pixels. And
20:07 - 20:11

here is the picture. But really, we view
this picture - our minds views this
20:11 - 20:17

picture as a bunch of concentric circles,
and each circle is a thing in its own
20:17 - 20:25

right. Now, I don't know what's intuitive
to you, but apparently it was intuitive to
20:25 - 20:33

some people to create that first function
drawOval. And that mathematical intuition
20:33 - 20:37

really tells you, well, make pictures,
make images into objects and then create
20:37 - 20:44

abstractions based upon them. So we can
use mathematics to kind of create you
20:44 - 20:48

create equations after the fact. But
really, mathematics gives us great
20:48 - 20:53

suggestions as what the library, what the
API should look like in the first place
20:53 - 20:58

that. Well, I often see people doing
something else, so that intuition might
20:58 - 21:03

have something to do with that
mathematical understanding. So let me talk
21:03 - 21:07

a little bit more about that language of
mathematics. Now that we figured out the
21:07 - 21:13

importance of associativity or equations
in general. So, people working in algebra,
21:13 - 21:19

they always talk about structures that
have three pieces, right? One piece is a
21:19 - 21:26

mathematical set. So just a collection of
a bunch of things, of objects. Then they
21:26 - 21:30

have operations on those sets and then
they have equations involving those
21:30 - 21:33

operations. On the right hand side, you
can see what we know already. So we have
21:33 - 21:37

some set "M"; it could be anything, could
be Booleans, could be numbers, could be
21:37 - 21:42

images. And then you have an Operation
Diamond that combines two of those M's
21:42 - 21:47

into one M. We here's a little bit funny
notation, right? With a cross there. What
21:47 - 21:52

it says is there's two inputs right cross,
cross, cross for each input. And so the
21:52 - 21:56

diamond operation has one M as an input,
another M as an input. And then there's
21:56 - 22:02

the arrow and that says that there's also
an M as the output. And then finally,
22:02 - 22:06

there's that associativity equation that
we've already seen quite a few times
22:06 - 22:11

today. And those three things packaged up
together, the set and the operation on
22:11 - 22:16

that set (in this case just a single
operation) and equations (in this case,
22:16 - 22:20

associativity), this is all called a semi
group. There are other algebraic
22:20 - 22:26

structures that we'll see. Now this
translates directly to programing. Your
22:26 - 22:32

set "M" in mathematics might just be a type
in your program. And these semi groups, they
22:32 - 22:34

occur in lots of places. So all of those
things that we've already seen with
22:34 - 22:38

associativity, you could just kind of
shorten your language and say, all of
22:38 - 22:43

those things are semi groups. So, for
example, at the top, it says: if you take
22:43 - 22:47

the mathematical set of real numbers,
(that's that funny "R", that hollow "R"),
22:47 - 22:54

the real numbers, with the plus operation,
(it takes two numbers from the "R" set and
22:54 - 22:58

produces one number), of course, has
associativity and therefore it's a semi
22:58 - 23:04

group. Lists, with concatenation, fulfill
the associativity equation. Therefore,
23:04 - 23:09

they form a semi group. And, for example,
the "overlay" operation from the images,
23:09 - 23:14

just as "beside" and "above" do, has that
same shape, right? Combines two images
23:14 - 23:20

into one image and fulfills associativity,
Therefore, images and "overlay" form a
23:20 - 23:26

semi group. Now, a semi group is not the
only algebraic structure, but it's a
23:26 - 23:30

fairly humble one. It can't do a whole lot
and therefore kind of mathematicians build
23:30 - 23:35

up from them. They take a simple algebraic
structure and they add more operations or
23:35 - 23:39

something else to it. So, for example, the
next step up from a semi group is
23:39 - 23:46

something called a monoid. So you take a
semi group and you add something, and I'll
23:46 - 23:50

try to explain what that is. So again, you
have to set you have a binary operation,
23:50 - 23:55

that diamond operation. And you also say
that in that system, there has to be a
23:55 - 24:01

special element. And you have to know what
it is. And that element is called
24:01 - 24:06

"bottom". That's what that line says, you
know, that upside-down tee, that's what I
24:06 - 24:11

call "bottom". And it also has to be part
of that set M. And now, as usual because
24:11 - 24:16

every monoid is a semi group, the
associativity equation holds. And the
24:16 - 24:20

bottom line explains about the identity
element about the bottom element, where it
24:20 - 24:26

says, Well, if you use the diamond
operator to combine "a" with the identity
24:26 - 24:31

or identity with "a", you get back "a". In
German, you often use the word "neutral
24:31 - 24:38

element". So it's neutral with with
respect to the binary operation that you
24:38 - 24:42

have in your semi group. And hopefully
that appeals to your intuition as it does
24:42 - 24:48

to mine. And if you take an algebra class
at the university level, you might learn
24:48 - 24:53

about more involved algebraic structures,
such as groups and rings and fields. And
24:53 - 24:57

every time you go a step up, right, a
monoid is a semi group plus a neutral
24:57 - 25:03

element. A group is a monoid with even
more operations and more equations. A Ring
25:03 - 25:11

even involves a another operation and more
laws. Same thing for a field. So, you just
25:11 - 25:15

add a bunch of stuff as you go up the
ladder. But for hacking, really? For
25:15 - 25:20

computing, for writing software, the two
most important ones from that stack here
25:20 - 25:23

are semi group and monoid and you don't
really have to worry about the rest unless
25:23 - 25:29

you really write mathematically inclined
software that involves those algebraic
25:29 - 25:33

instructions. So we've seen a bunch of
semi groups. Have we seen a bunch of
25:33 - 25:36

monoids? Yeah. Well, essentially all the
semi groups that we've already seen are
25:36 - 25:43

also monoids. So for example, the plus
operation on the numbers; zero is the
25:43 - 25:46

identity there: you add zero to a number,
you get back that same number. With
25:46 - 25:52

multiplication, well you multiply a number
by one, you get back that same number. If
25:52 - 25:55

you have a list you concatenate with it
the empty list, doesn't matter whether
25:55 - 26:01

it's left or right, well, sure enough, you
get the same list. Now the bottom three
26:01 - 26:05

examples might look a little bit funny
because they suggest that there's an
26:05 - 26:10

identity for this those operations, but
they all are the empty image, so they're
26:10 - 26:16

invisible. And so that points that out to
you. And monoids - I can't really over
26:16 - 26:22

emphasize this - monoids are everywhere
around us. So, a well-known software
26:22 - 26:27

engineer asked me a while ago and said:
Well, your mathematical structure is all
26:27 - 26:32

good and well, but what about really
concrete software things? You know, we use
26:32 - 26:35

shopping carts. How could you possibly use
mathematical structure for a shopping
26:35 - 26:41

cart? Therefore as well, a shopping cart -
you might have a shopping cart associated
26:41 - 26:46

with your account at some shop. Right? But
on another day you might go shopping, you
26:46 - 26:52

forget to log into your account. But you
can usually still fill a shopping cart.
26:52 - 26:58

Now, once you checkout that shopping cart,
you buy the stuff in it, it will usually
26:58 - 27:01

ask you to log in. Now there's two
shopping carts, one inside your account
27:01 - 27:07

and the one that you just filled. They
have to be combined into one. And really,
27:07 - 27:10

it's worthwhile to think about what
properties that combination operation
27:10 - 27:16

should have. But there's many other
examplesfor monoids. I have some concrete
27:16 - 27:21

examples from applications that I've
worked on. A famous paper in functional
27:21 - 27:23

programing, for example, is about
financial contracts - sure ehough, they
27:23 - 27:29

form a monoid. Recently, I talked to a
company that makes configurable flower
27:29 - 27:33

bouquets, so you might have well, you
might combine two flowers, two sets of
27:33 - 27:38

flowers into one to form a bouquet. You
know, all kinds of things. Monoliths
27:38 - 27:44

really are everywhere around us. So much
that, my friend Conal Elliott has elevated
27:44 - 27:48

this to a general design principle,
whenever you write software for a
27:48 - 27:54

particular domain, you should look for
mathematical algebraic structure. And
27:54 - 27:57

because monoids are so common, one thing
that you can always try is look for binary
27:57 - 28:02

operation on your objects. Check of the
associative law holds. Look for an
28:02 - 28:08

identity. And if you don't find that
binary operation, OK, but it's not
28:08 - 28:13

associative, you know, that's maybe a sign
that you should try to make it
28:13 - 28:17

associative. Doesn't always work, but
frequently does. And and usually gives you
28:17 - 28:20

some improvement in the process. Same
thing for identity: If there isn't one,
28:20 - 28:25

you might make one up, trusting that you
might need it later. Conal calles this the
28:25 - 28:30

Tao check. Something that I really like.
It's whether the software model that
28:30 - 28:35

you're creating aligns with mathematics.
And that means, with the deeper structure
28:35 - 28:39

of the universe that humanity has
discovered in the hundreds of years that
28:39 - 28:44

were there before computing. Monoids are
also useful in differentiation from things
28:44 - 28:47

that are not monoids. So, for example, the
maximum operation that gives you the
28:47 - 28:52

bigger of two numbers does not form a
monoid. It gives you a semi group. It's
28:52 - 28:58

associative, but well, there's no
identity, right? Thinking about it, you
28:58 - 29:02

might think, well, but can't I take minus
infinity or something like that? But of
29:02 - 29:07

course, that's not a real number. But it
maybe gives you a hint as to how you could
29:07 - 29:12

complete that structure if you really need
a monoid. I'll get back to that in a
29:12 - 29:17

second, but first I want to talk about a
general principle in that, it's not just
29:17 - 29:22

discovering monoids in the domain of
objects around you, you can also construct
29:22 - 29:28

monoids systematically. Specifically, you
can create larger monoids out of smaller
29:28 - 29:32

ones. So this is probably the most
complicated slide that I have. So don't
29:32 - 29:35

worry, it'll be over in the second, but
we'll try to go through it line by line.
29:35 - 29:40

So imagine you have two momoids. The first
one is called M1, the second one is called
29:40 - 29:45

M2, and each has their own diamond
combination operation. The first one is
29:45 - 29:50

called Diamond 1, the second one is called
Diamond 2. The first one takes two M1s,
29:50 - 29:57

gives you back an M1. The second one takes
two M2s and gives you back an M2. Now,
29:57 - 30:03

using those two momoids, you can create a
larger monoid by taking pairs of elements
30:03 - 30:08

from M1 and M2. That's what those funny
cross means. It's something called the
30:08 - 30:14

Cartesian product; out of two sets, it
forms the set of pairs where there's one
30:14 - 30:19

from each of those two sets in there. And
in order to make that a monoid, we have to
30:19 - 30:24

define the binary operation. I'm just
going to call it diamond. You can see the
30:24 - 30:28

diamond is defined in such a way that one
pair out of M1 and M2 goes in, another pair
30:28 - 30:34

out of M1 and M2 goes in and it returns
another pair out of M1 and M2. And the way
30:34 - 30:40

it's defined is, you take two pairs and
you combine the first elements from both
30:40 - 30:45

pairs using the M1 monoid and the Diamond
one operation. And you combine the second
30:45 - 30:51

elements of both pairs using the diamond
operation from the M2 monoid. And sure
30:51 - 30:56

enough, that gives you an operation that's
associative. I didn't put that on the
30:56 - 30:59

slide, but it also gives you an identity
with a pair that consists of the
30:59 - 31:08

individual identities from M1 and M2. You
know, that works. Also, sometimes you have
31:08 - 31:13

a semi group lying around, but you really
need a monoid. I mean, a semi group that's
31:13 - 31:16

not a momoid, but you really need a
monoid. Here's a little construction that
31:16 - 31:21

will build a monoid from a semi group. So
again, we start with the set. We start
31:21 - 31:26

with a binary operation on that set called
diamond, and then we create another set
31:26 - 31:30

that I call M-bottom (again, that funny
upside-down tee that you see there). And
31:30 - 31:35

we create it by adding artificially one
new element to it, right? It can be in
31:35 - 31:45

there before. And then we define a new
operation, Diamond-bottom, that operates
31:45 - 31:50

on that set. And and we just define it in
such a way that,if either side of that
31:50 - 31:55

operation is the identity, the bottom
element, we just have it return the other
31:55 - 32:01

side. So, "a, Diamond-bottom, bottom"
returns, you "a", and the same thing the
32:01 - 32:07

other way around. And whenever that
operation is called on two things that are
32:07 - 32:13

not bottom, we just call the original
operation. And sure enough, that gives you
32:13 - 32:16

a monoid. [glitch] that the thing before,
the M before, was just a semi group, and
32:16 - 32:20

you might have seen that construction when
programing. So, for example, the Java
32:20 - 32:25

library has something called an option,
and that is exactly that. And again, when
32:25 - 32:29

you're defining abstractions like that,
the mathematical notion of the
32:29 - 32:34

mathematical equations can provide a guide
as to how that API should behave if it's
32:34 - 32:39

supposed to be reasonable. If you're still
bear with me and have a little bit of
32:39 - 32:45

patience left, here's another concept it's
a little bit more abstract called the
32:45 - 32:49

homomorphism. And that's useful and
thinking about APIs, at least APIs that
32:49 - 32:55

combine in this way via something like a
monoid. So look at the first equation: It
32:55 - 33:00

says, well, if you take an overlay of two
images, a and b, - more like this - right,
33:00 - 33:04

and you rotate it, that is the same as
taking one image, rotating it, take
33:04 - 33:10

another image, rotating it and slapping
them together after the fact. And what
33:10 - 33:15

that means is really, "rotate" here is
what's called a homomorphism that can kind
33:15 - 33:22

of slip inside the overlay operation. Or
another way to talk about it is, commutes
33:22 - 33:27

or can trade places with the overlay
operation in that, on the left hand side
33:27 - 33:31

of the equation, the rotate is outside and
the overlay is inside, and on the other
33:31 - 33:35

side of the equation, it's exactly the
other way around. With the next equation
33:35 - 33:39

there flip-vertical of beside, [noise
while indicating the flip], right, and
33:39 - 33:45

then the idea is, well, you flip one, you
flip one and then do "beside". That's
33:45 - 33:48

certainly eminently reasonable, but if you
think about it, it might depend on the
33:48 - 33:52

alignment that you choose for your images
as you put them side to side, right, you
33:52 - 33:58

might decide to align them by the top half
like this. And then you might come to the
33:58 - 34:02

conclusion that that equation does not
hold. So, but again, it provides a guide.
34:02 - 34:06

It probably should hold - you should
design the "beside" operation in such a
34:06 - 34:11

way that it works. Now the equation at the
bottom really does not work, right? There
34:11 - 34:19

it is. But if you have five minutes after
this talk, or even now, you might think
34:19 - 34:23

about an equation that's very similar to
that one, that does hold and that gives
34:23 - 34:29

you some insight. That's also a useful
notion when you're designing APIs. Here's
34:29 - 34:36

another useful notion in that, Well, with
images, there's the visual aspect, right,
34:36 - 34:40

that if you perceive images by looking at
a certain pair of coordinates and
34:40 - 34:45

perceiving a certain color. And that color
might be, you might represent it by a
34:45 - 34:49

value and that value might come from
different types. And that gives you
34:49 - 34:54

different notions of images. So, for
example, the image of bools well, so it
34:54 - 34:59

means each coordinate can only be on or
off, or black and white. So it gives you
34:59 - 35:03

black and white picture. The next one,
where there's just a number there might
35:03 - 35:13

give you some some notion of grayscale. Or
you would, of course, then as in the
35:13 - 35:16

bottom two examples, you could stick RGB
triples in there to give you arbitrary
35:16 - 35:20

colors or even an alpha channel or
something like that. So when you have a
35:20 - 35:25

type such as image that has a parameter,
very frequently you can add a little bit
35:25 - 35:29

of structure, what's called a map
operation that you might have seen (not
35:29 - 35:34

going to worry about the details there),
but you might get a structure called a
35:34 - 35:42

functor. And the word functor comes from a
particularly abstract branch of algebra
35:42 - 35:45

called category theory. But even, you
know, the mathematicians sometimes call
35:45 - 35:52

that abstract nonsense. But, despite the
term, it really is something that's very
35:52 - 35:57

useful and very insightful and it gives
you a lot of insight into the structure of
35:57 - 36:02

things. And category theory frequently
describes relationships between things
36:02 - 36:07

through diagrams that are quite beautiful.
So here's a characterization of the idea
36:07 - 36:15

of a monoid. You don't have to understand
the diagram, but maybe, after this and
36:15 - 36:18

after you've mastered monoids for a while,
you'll have look at category theory and
36:18 - 36:23

enjoy it. You get more complicated
diagrams such as this one in more involved
36:23 - 36:30

abstractions such as, I mentioned functor.
There's also the infamous monad and these
36:30 - 36:36

are things that are practical, eminently
practical for creating higher order and
36:36 - 36:41

generally higher level abstractions.
Again, not for this talk. That's a
36:41 - 36:47

different talk. But I hope I've instilled
in you the sense that mathematics, at
36:47 - 36:51

least this kind of simple mathematics, is
useful. If you felt that it hasn't been
36:51 - 36:55

simple, that it's been hard, I want to
point out that, well, it might be hard to
36:55 - 37:01

understand everything that's been on the
slides on this talk, but it's not due to
37:01 - 37:06

the fact that it's complicated. It's not.
There's not a lot of moving parts or
37:06 - 37:12

something like that, but it is abstract,
and the human mind needs some time. It
37:12 - 37:15

finds it difficult to deal with
abstraction, and therefore you might -
37:15 - 37:19

also, if you're interested in this stuff -
you might give your mind some time to get
37:19 - 37:23

adjusted to those mathematical concepts
and maybe look at it again a couple of
37:23 - 37:29

weeks. Just have a brief look. You know,
maybe look for a monoid or just a binary
37:29 - 37:34

operation, look for associativity and by
and by, that stuff will become more
37:34 - 37:39

familiar and commonplace to you, and
you'll find it easier to deal with it. And
37:39 - 37:43

once that happens, I think you'll find
that very satisfying. So that's pretty
37:43 - 37:49

much it. If you want to read up some more
detail on this, I put in four references
37:49 - 37:52

that are linked from the slides that you
hopefully download from somewhere. But you
37:52 - 38:00

can also just search for those titles and
you'll find those papers. So first one is
38:00 - 38:04

a paper by Brent Yorgey that really takes
that idea of using monoids to structure an
38:04 - 38:11

image library to the limit. That's just a
great paper. Then there's a book by Sandy
38:11 - 38:18

Maguire on algebra-driven design, applies
that idea to different practical settings.
38:18 - 38:22

Conal Elliot gave a great foundational
type talk on something called denotational
38:22 - 38:25

design. So that's a video. Also has a
paper, but I really recommend you look at
38:25 - 38:30

the video. And the classic paper that
introduced this notion of dealing with
38:30 - 38:34

images by combining them is by Peter
Henderson, something called functional
38:34 - 38:41

geometry. And I hope you'll enjoy looking
at that stuff as much as I will. And I
38:41 - 38:52

hope you enjoy the Congress. I hope to see
you around. Well, have a good time. Bye.
38:52 - 38:59

Herald: So thank you for this interesting
talk, Michael. And there are a few
38:59 - 39:05

questions from the internet already. And
for those who haven't asked questions yet,
39:05 - 39:13

if you use the hashtag #rc3cwtv on either
Twitter, Mastodon or the proper IRC
39:13 - 39:18

channel, we can still include them in this
Q&A. And the first question I would like
39:18 - 39:26

to ask you, Michael, is maybe a strange
one, but you mentioned this concept of
39:26 - 39:33

homomorphisms of the monoids. And I was
wondering whether we could do the reverse
39:33 - 39:40

rounds. Could could you use these
mathematics to visualize APIs?
39:40 - 39:49

Michael: That's an excellent question. So
let me think about it for a moment..., so
39:49 - 39:58

with a monoid... So yeah, so the answer
is, first of all, yes. So with the monoid,
39:58 - 40:02

one way to think about a monoid is not
just this abstract thing that's going on
40:02 - 40:08

right, but since, any value in the monoid,
can kind of be built up by this operation
40:08 - 40:14

that's sitting in the monoid, you could
also just represent a monoid not just by
40:14 - 40:23

some abstraction, but you can build what's
called a free structure, and that is tree
40:23 - 40:27

structured and that you could visualize.
So that's one half of the answer. And on
40:27 - 40:31

the last slide, I had to reference the
very first reference on a paper by Brent
40:31 - 40:37

Yorgey shows how that works. And it's
generally a paper on visualization. So
40:37 - 40:41

that would be one half of the answer. The
other half is that I remember one
40:41 - 40:45

industrial project that I've worked on
where people wanted - so there was kind of
40:45 - 40:51

a complicated data flow problem, and the
requirement of the customer was, well,
40:51 - 40:57

people want to assemble a configuration
for a scheduling problem in semiconductor
40:57 - 41:01

fabrication, if you will. And they said,
Well, we want we just want a bunch of
41:01 - 41:07

boxes and arrows, and people should be
able to arrange that visually to specify a
41:07 - 41:12

configuration. And we ended up looking for
the right formalism to do that. In fact,
41:12 - 41:17

there is one in category theory. And
category theory generally is diagrams, but
41:17 - 41:21

there is a concept in Category theory
which just happened to fit, and we found
41:21 - 41:29

it by going from the customer requirement
to the mathematics. And that gave us a
41:29 - 41:34

naturally and very pleasant and actually
mathematically precise visualization form
41:34 - 41:39

for that. So I'd argue it's a great tool
to think about visualization. Hope that
41:39 - 41:44

helps a little bit.
H: OK, so we can finally look forward to
41:44 - 41:49

this, this kind of Neuromancer-like
landscapes while hacking.
41:49 - 41:53

Michael: Well, yeah, it feels like that
sometimes, yes.
41:53 - 41:56

H: Another question was from a viewer who
had never seen such sexy algebra in their
41:56 - 42:02

life: Whether you're willing to submit a
proposal next year to go into a little
42:02 - 42:09

more in-depth continuation of this talk.
M: Sure. I'm very happy to.
42:09 - 42:14

H: That's a short answer to a relatively
long...
42:14 - 42:18

M: OK. Can't say you haven't been warned,
but yeah, I'll try.
42:18 - 42:25

H: Yeah, and also it needs to be accepted
by the content team, obviously. Another
42:25 - 42:29

question that is about something you said
you wouldn't talk about, and that is about
42:29 - 42:34

formal verification of algorithms M:
Yeah, but none the less might be useful
42:34 - 42:40

for the audience to know if they're
seeking formal verification for
42:40 - 42:44

algorhithms. What avenues you would
suggest to people, especially people
42:44 - 42:48

without much of a formal background in
mathematics, to start looking at, as in,
42:48 - 42:54

placed for further learning.
M: Yes. Yeah, that's that's an excellent
42:54 - 42:59

question. So I think one place that I'd
start with, and I think there have been
42:59 - 43:02

talks about this at earlier iterations the
of the Congress that I'm sure you can
43:02 - 43:07

find. There is, in fact, there's a new, if
you will, family of programing languages
43:07 - 43:11

that are based on strong types called
dependent types. And they're very
43:11 - 43:16

expressive and they allow you to express
mathematical properties. Now, in the old
43:16 - 43:20

days, we would write down a mathematical
property, of a program that we wanted to
43:20 - 43:25

have, and then we would use - if things
are good, we would use a proof assistant
43:25 - 43:31

to find the proof, and that would be a
very cumbersome process that you would
43:31 - 43:36

have to do. Now, with this new family of
programing languages, Idris and Agda are
43:36 - 43:43

two prominent examples, what you can do is
you can give the type and you can push a
43:43 - 43:46

button (essentially - it's not quite so
easy, but there's actually IDE support
43:46 - 43:50

where you can push a couple of buttons and
it will generate a program that has that
43:50 - 43:56

type). And that program is guaranteed to
be correct, right? Because with the
43:56 - 44:00

combination of the type and the program
itself, there is the proof right there
44:00 - 44:05

sitting there. And that, at least for
simpler things, takes away a lot of the
44:05 - 44:13

burden, at least at the burden of getting
into this kind of tool. It's still kind of
44:13 - 44:16

intricate work to put together more
complicated proofs, but to get into that
44:16 - 44:20

medium, it's just great fun and it's a
very satisfying activity. And you don't
44:20 - 44:23

have that impression that you sometimes
have with proof assistants that you're
44:23 - 44:28

kind of fiddling in the dark with a
screwdriver. So, it it's a concrete place
44:28 - 44:31

that you're looking for. I would I would
recommend looking at a programing language
44:31 - 44:35

called Idris and maybe look at some of the
Congress talks that we've had about in the
44:35 - 44:41

past.
H: Yeah. Having been to at least one of
44:41 - 44:46

those Congress talks, I also would ask a
follow up question to that: Does the new
44:46 - 44:52

typing capability that, for example, Rust
provides, rise to this level that you
44:52 - 44:55

mentioned with Idris and other languages,
or is that not advanced enough for that
44:55 - 44:59

yet?
M: I'm not an expert on rust. I haven't
44:59 - 45:06

seen IDE support on that level. I'm pretty
sure the type - I mean Rust has a type
45:06 - 45:10

system that has a specific kind of
expressive capability that allows you to
45:10 - 45:15

express memory safety. Right? And then
because, your type-correct program, if
45:15 - 45:19

you're not using the unsafe features, is
guaranteed, is kind of proven to be
45:19 - 45:27

correct and to adhere to those to those
memory safety criteria. But I have not
45:27 - 45:32

seen the general expressivity that
language is like Idris gives you in Rust.
45:32 - 45:39

Maybe there's a way to do it, but I'm I'm
reasonably confident it's not as practical
45:39 - 45:44

and not as easy to get into.
H: Yeah, and also it's very off-topic
45:44 - 45:49

since you said you wouldn't be talking
about this anyway right now. So we're sort
45:49 - 46:01

of veering. One of the other questions is,
will you be around in the 2D world
46:01 - 46:04

sometime to talk directly to people from
the audience?
46:04 - 46:10

M: Yeah, sure, we'll do that right after
the talk. Maybe after a little lunch
46:10 - 46:12

break.
H: And then what nickname will you be and
46:12 - 46:15

where will you be hanging out?
M: My Twitter handle, which was also being
46:15 - 46:21

called "sperbsen".
H: We have this 2D rc3 world, this kind of
46:21 - 46:27

online adventure thing where the people
visiting are walking around with their
46:27 - 46:31

avatars and, well you haven't been there
yet, I take it from your...
46:31 - 46:34

M: No, no, no. I have been there. I'm just
saying my avatar there has the same name
46:34 - 46:37

as my Twitter handle.
H: I'm sorry. I'm sorry for interrupting
46:37 - 46:40

you..
M: So it's called "sperbsen" "Sperber" was
46:40 - 46:44

not available anymore.
H: I apologize.
46:44 - 46:53

M: So I'll be there.
H: And I think this... There's another
46:53 - 46:59

question, and it's: Could something like a
Turing machine have been constructed with
46:59 - 47:07

the mathematics of 2000 years ago?
M: So I got to Turing machine and I got
47:07 - 47:09

mathematics as of two thousand years ago,
but I lost...
47:09 - 47:13

H: Could something like a Turing machine
be constructed with the mathematics of two
47:13 - 47:18

thousand years ago. This may be a bit of a
speculative question.
47:18 - 47:28

M: I don't see anything that wasn't there.
I don't see why not. I'm not 100 percent
47:28 - 47:35

sure what exactly the mathematics of 2000
years ago was, right. You know, my basic
47:35 - 47:40

knowledge goes back a couple of hundred
years, but the basic mechanism of a Turing
47:40 - 47:44

machine is very simple. It's not my
preferred mechanism to talk about
47:44 - 47:50

mathematics and hacking, but, sure, why
not?
47:50 - 47:58

H: And the, I think, final question is
[...] start teaching Haskell in teaching?
47:58 - 48:03

M: Well, that's a great question. It kind
of goes, I'm doing a workshop later on
48:03 - 48:07

teaching programing. I think teaching
programing using Haskell is a terrible
48:07 - 48:15

idea. So obviously, you all know, I'm a
big fan of functional programming. I
48:15 - 48:18

actually do a lot of Haskell in my daily
life. and I've given talks using and on
48:18 - 48:22

Haskell in the past. But Haskell is a
programing language coming out of the
48:22 - 48:25

research community and used for
professional programing. If you really
48:25 - 48:29

want to do an effective introduction to
programing, you're well off using
48:29 - 48:34

specialized programing languages for
learning. That's what I will advocate
48:34 - 48:39

later today at 4pm Middle European Time.
H: OK, thank you. And this concludes the
48:39 - 48:48

session, thanks for being here.
M: Thank you. Thanks for having me.
48:48 - 49:06

postroll music
49:06 - 49:09

Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!

Title:: Math for hackers
Description:: more » « less
Video Language:: English
Duration:: 49:12

	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	alcuna edited English subtitles for Math for hackers
	verbrannt_c3subs edited English subtitles for Math for hackers

Show all

English subtitles

Revisions

Revision 11 Edited

alcuna

Math for hackers

Revisions

Our website uses cookies

Operating cookies (Required)