-
JERRY D'ANTONIO: Good afternoon everyone.
-
Hope you guys had a good lunch.
-
Hopefully you're not gonna fall asleep on
me.
-
I'll do my best to keep that from happening.
-
As the slide here says, my name is Jerry
-
D'Antonio. I work for a company called VHT,
formerly
-
Virtual Hold Technology, and we are an Erlang
and
-
Ruby shop out of Akron, Ohio. And, again as
-
the slide says, I'm here to talk to you
-
guys about concurrency.
-
So, yesterday I was monitoring the Tweets
for the
-
conference, and somebody sent out a Tweet
that I
-
thought was very interesting, and it asked
the question,
-
it said something to the effect of, RubyConf
should
-
give me a reason why I want to use
-
Ruby in 2014.
-
Now, I assume, like the rest of you guys,
-
I really love Ruby. And I think there's a
-
million reasons why people would want to continue
using
-
Ruby in the future. Unfortunately, when the
question comes
-
to concurrency, selling Ruby is a little bit
harder
-
of a sell, all right.
-
Now, I'm not talking here about interpreter
issues. I'm
-
not talking about the global interpreter lock.
I'm not talking
-
about any of that kind of stuff. I'm talking
-
about abstraction. All right, let me give
you a
-
brief bit of history.
-
Years ago, I used to work in banking systems,
-
and we wrote highly performant, highly concurrent
systems in
-
C++. Now, if you've ever had to deal with
-
concurrency in a language like C++, you realize
that
-
it is full of a lot of pain and
-
agony. You spawn a bunch of threads, and a
-
bunch of low-level locking, you know,
in terms
-
of kernel-level objects to try and synchronize
those
-
threads. And it's very easy to get wrong.
It
-
was so easy to get wrong that we actually
-
have whole categories of bugs named after
common concurrency
-
errors, all right.
-
So like most people who do that work, eventually
-
I had to get out of it because it
-
was just too painful, right. But five, six
years
-
ago I discovered Ruby. I've been using Ruby
ever
-
since, and I love Ruby.
-
Unfortunately, the concurrency tools that
Ruby provides to us
-
are pretty much the same things that I was
-
using fifteen years ago in C++. We have
-
Thread.new, we have Mutex, where we can lock and synchronize,
-
and we can do all this low-level stuff which
-
is every bit as painful, right.
-
Now, if you look at what's going on in
-
other languages, with respect to concurrency,
today we have
-
this thing called asynchronous concurrency,
k. Rather than trying
-
to spawn a bunch of different threads and
place
-
a bunch of locks on a bunch of things
-
and get a bunch of contention, instead, we
send
-
operations off onto different threads or different
processes. They
-
do their thing, and then we coordinate those,
right.
-
And if you look around at a lot of
-
the languages not called Ruby, you see a lot
-
of really cool things going on, right. Languages
like
-
Erlang and Clojure and Scala and, even JavaScript
and
-
Java and C# are doing some very interesting
things
-
with respect to concurrency.
-
Now, Ruby, being the great language it is,
we
-
can actually use these same abstractions in
Ruby if
-
we take the time to build them, K. They
-
don't exist in our standard library right
now. They
-
don't exist in the, the language itself. But
we
-
can still build them and we can still use
-
them.
-
So my goal today is to give you a
-
survey of some of the asynchronous concurrency
techniques that
-
are being used in other languages. And show
you
-
how you can use those in Ruby today.
-
This is going to be an incredibly code-heavy
presentation,
-
all right. Pretty much every slide in here
is
-
either a picture or code, right. Now, there's
a
-
lot of stuff to cover and I'm not gonna
-
be able to go over everything in detail, so
-
my goal here is this: the
-
presentation, in its entirety, with extensive
notes and
-
all of the source code samples, is up on
-
GitHub.
-
So you can go out there, you can clone
-
the repo, you can pull this down. You can
-
run all the code I'm gonna show you. So
-
as I go through this, I'm gonna ask that
-
you focus on the concepts that we're gonna
talk
-
about. Cause these concepts are things that
are independent
-
of any particular programming languages, and
they are concepts
-
that will allow you, once you understand them,
to
-
start thinking about your code differently
and start solving
-
problems differently.
-
And so that's really my hope is that after
-
you leave here today, you'll have some new
ways
-
that you can think about code in terms of
-
concurrency, and you'll have an interest in
going out
-
and starting to write more concurrent code,
K.
-
Most of the code we're gonna look at today
-
is gonna be from a gem that I put
-
together called concurrent-ruby. It's MIT
License, opensource, available on
-
GitHub. It's something that we use at VHT
today,
-
because we had some very specific needs we
wanted
-
to fulfill.
-
And I'm gonna show you that gem, because it's
-
the one I know the best. But it is
-
by no means the best or canonical or
-
right way to do it. These concepts, like I
-
said, are concepts that are independent of
any language
-
and can be implemented in many different ways.
I'm
-
just gonna show you one possible way to use
-
these particular ideas within Ruby.
-
All right. So that being said, let's go ahead
-
and move forward.
-
In order to do this, we're gonna need a
-
crash test dummy, all right. When writing
code examples
-
of concurrent code, often times we throw in
these
-
random sleep statements, and we say, something
important happened
-
here, all right. It's kind of fake when we
-
do that. It sort of gets the point across,
-
but it's not a really good example.
-
So for this presentation, I created a class
that
-
we're gonna use as our crash test dummy in
-
most of the examples that we're gonna go over
-
today. And I put this class together on purpose,
-
not to show good object-oriented design, because
it's a
-
really crappy design, but this class will
express a
-
couple of ideas that are very important to
us
-
when writing concurrent code.
-
So let me show you the crash test dummy
-
we're gonna use, K.
-
This is a very simple class. It does one
-
simple thing. When you create this, an instance
of
-
this, you're gonna pass a name of a company
-
in. Yahoo, Microsoft, Apple, Google - whatever.
When you
-
call update on this, it's gonna go out to
-
this Yahoo API, it's gonna retrieve information
about what
-
ticker symbols that company trades under,
on different
-
stock exchanges around the world, that data's
gonna come
-
back with some Ajax wrapper stuff around it,
so
-
we're gonna strip that off, and then we're
gonna
-
update this internal member variable with
that data. Now.
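A sketch of that crash test dummy can make the shape concrete. The class and method names here are assumptions (the speaker's actual code is in his repo), and the blocking HTTP call is injected as a lambda so the sketch is self-contained. It is deliberately bad for concurrency, for the reasons discussed next:

```ruby
require 'json'

# Deliberately dangerous for concurrency: #update does blocking IO,
# mutates the object in place, and attr_reader leaks a mutable array.
# Names are illustrative, not the speaker's actual code.
class Ticker
  attr_reader :symbols                  # hands out a mutable reference!

  def initialize(company, fetch:)
    @company = company
    @symbols = []
    @fetch   = fetch                    # stands in for the blocking HTTP call
  end

  def update
    raw  = @fetch.call(@company)        # blocking IO in the real class
    json = raw[raw.index('(') + 1...raw.rindex(')')]  # strip the JSONP wrapper
    @symbols = JSON.parse(json)         # in-place mutation of shared state
  end
end

fake = ->(_company) { 'cb([{"exchange":"NYSE","symbol":"VHT"}])' }
t = Ticker.new('VHT', fetch: fake)
t.update
t.symbols.first['symbol']   #=> "VHT"
```

Every hazard the talk lists is visible here: the IO, the mutation on `update`, and the leaked mutable array behind `attr_reader`.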
-
Here's a couple things to keep in mind. How
-
many people here have heard that shared, mutable
data
-
is bad? All right. Whatever you were told
is
-
a lie. It is ten times worse
-
than that, right. Shared mutable data in concurrent
programming
-
is really bad.
-
This thing is fat with shared mutable data.
All
-
right, first this thing goes out and it performs
-
blocking IO, right. That's good for us in
respect
-
to concurrency, because blocking IO is one
reason why
-
we want to write concurrent code.
-
But it then goes, and the object itself mutates
-
when we go and we update it, which means,
-
now, this thing, if we share it across threads,
-
is shared mutable data. Even worse, this has
an
-
internal member variable which is an array
of hashes,
-
which we expose through an attribute reader.
Which means
-
we now are passing a reference to a mutable
-
object outside of this thing, potentially
across threads.
-
So this thing is very, very dangerous, and
that's
-
why we're gonna use this as our example, because
-
we're gonna show different ways that we can
use
-
this thing in a concurrent environment, and
hopefully not
-
keep ourselves up late trying to debug all
kinds
-
of weird bugs.
-
So, with that, let's talk about the first
concurrency
-
object that we're gonna look at. It's called
Future.
-
How many people have heard of Future in terms
-
of asynchronous concurrency? Cool.
-
A Future is a general term to describe any
-
particular operation that gets started and
returns a result
-
at some point in the future. OK, so it's
-
a class of different types of, of asynchronous
concurrency
-
objects.
-
Future also, very specifically, is one of
the two
-
core concurrency abstractions in the Clojure
programming language. For
-
those of you not familiar with Clojure, it's
a
-
Lisp-like language run on the JVM, which
is designed
-
specifically to be concurrency friendly.
-
So here's how a future works - very simple.
-
It's probably the simplest and most pure asynchronous
concurrency
-
abstraction. Here's how it works.
-
You create a future and you give it some
-
operation. At that point, the runtime schedules
that operation
-
as soon as possible, OK. A future has three
-
states. It can be pending, which is what happens
-
on creation. It's not done yet. Once the operation
-
completes, it can be either fulfilled or rejected,
K.
-
If the operation completes successfully, it
becomes fulfilled. If
-
the operation throws an exception, that
exception gets
-
swallowed, and the state becomes
rejected.
-
At that point, you can then retrieve either
the
-
value for the successful operation, or you
can retrieve
-
the reason for the rejection, which would
be the,
-
the exception that was thrown. So very simple.
Very
-
straight-forward. You basically throw this
thing off on another
-
thread, let the runtime schedule it, do important
stuff,
-
and then later on you come back and you
-
ask, what was the result of that operation?
And
-
if it blows up, your program doesn't blow
up,
-
it just tells you that your operation blew
up.
-
So that's great.
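The future described above can be sketched in a few lines of plain Ruby. This is not the gem's implementation, just the concept: the block runs on its own thread, the state moves from :pending to :fulfilled or :rejected, and a raised exception is captured as the reason rather than re-raised.

```ruby
# A minimal, illustrative future. The operation is scheduled at
# creation; later you come back and ask what happened.
class TinyFuture
  attr_reader :state, :value, :reason

  def initialize(&block)
    @state = :pending
    @thread = Thread.new do
      begin
        @value = block.call
        @state = :fulfilled
      rescue => ex
        @reason = ex            # the exception is swallowed here...
        @state  = :rejected     # ...and reported as the reason
      end
    end
  end

  def wait                      # block until the operation finishes
    @thread.join
    self
  end
end

good = TinyFuture.new { 6 * 7 }.wait
good.state        #=> :fulfilled
good.value        #=> 42

bad = TinyFuture.new { raise ArgumentError, 'boom' }.wait
bad.state         #=> :rejected
bad.reason.class  #=> ArgumentError
```

Note how the failing operation never blows up the program; you just get told that it blew up.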
-
It's very simple, and it's a very easy way
-
to start adding concurrency to your programs.
Now, how
-
many people are JavaScript programmers here?
Right. So you
-
guys are all familiar with call-backs, right?
Whenever you're
-
dealing with asynchronous concurrency, there
are two ways that
-
you can retrieve the results of the asynchronous
operation.
-
One, as we see here, is query the object,
-
and say, what happened? The other way is to
-
attach a call-back, right. This is the JavaScript
way
-
of doing things - we attach call-backs. Ruby,
it
-
turns out, has a very serviceable call-back
mechanism built
-
into the standard library.
-
How many people here have heard of the observable
-
module? Right. There you go. The observable
module, which
-
we know is based upon the Gang of Four
observer pattern,
-
actually can work as a very fine call-back
mechanism
-
for any kind of asynchronous object.
-
So in this particular library, I've tried
to make
-
it as consistent as possible and dependent
only on
-
the Ruby standard library. So in this case,
this
-
future class implementation is observable.
So you can attach
-
an observer to that, and then when the operation
-
completes, the observer will be called,
it will be
-
given the time that the operation finished,
it'll be
-
given the value of the operation or the exception
-
that was thrown. Very simple.
-
Now, I want to take a step away from
-
future for a second, and talk about a very
-
important concept that is not unique to future
but
-
applies to all asynchronous concurrency, all
right.
-
There is a code smell on this screen. Right.
-
I'll even give you a hint to where it's
-
at. It is up here on line eighteen. Now
-
there's code smell. Think about it for a second,
-
and I'll give you the answer what it is,
-
all right.
-
All advanced asynchronous concurrency abstractions
try as much as
-
possible to hide the details of concurrency
from us.
-
They try and hide the locking and the threading
-
and all this other stuff that has to go
-
on. But what we cannot do is change the
-
actual nature of concurrent operations. And
when we're dealing
-
with concurrency, the order of operations
is non-deterministic, OK.
-
I'm sure you've heard that term before. It's
non-deterministic.
-
We cannot guarantee in what order things happen. So
So
-
line eighteen right here, which looks fairly
innocuous, is
-
very interesting because, once we create this
future on
-
line seventeen, that block is scheduled for
operation, and
-
we have no control over when that thing occurs
-
with respect to anything else.
-
So it is theoretically possible that that
operation could
-
complete before we add the observer on line
eighteen,
-
K. That's just the nature of non-determinism.
-
Now, in this particular case, right, this
particular future
-
implementation is aware of that, and the add_observer
method
-
here behaves in a way that you would expect
-
it to, despite that non-determinism. But the
take-away from
-
this, and this'll apply to everything we talk
about
-
today, is concurrency is non-deterministic,
and that can never
-
change, so always when using these concurrent
abstractions, in
-
any language, keep that non-determinism in
mind.
-
OK. So that's a future. We're gonna refer
back
-
to this a lot because you're gonna see the
-
API of this future several times.
-
Let's talk about another very interesting
abstraction that comes
-
out of Clojure. Let's talk about the Agent.
Clojure
-
has two core concurrency abstractions: future
and agent, right.
-
Clojure is the only language I know
-
of that does something like agent. It's very
fascinating,
-
and that's why I like to talk about it.
-
I'll give you an example. Let's say you're
writing
-
a video game, and that video game is old-school
-
arcade-style video game, and there's a score,
all right,
-
and you've got all kinds of threads running
around
-
doing different things, and each one of those
threads
-
wants to update the score.
-
And the way we would do that in the
-
old-school days was to put some kind of
lock
-
around that score, and every thread that wanted
to
-
change the score would have to obtain that
lock
-
and it would have to block until it got
-
that lock and then it would update the score.
-
So it happens that you have all these threads
-
that want to be doing these different things,
but
-
every time they have to update the score,
they
-
have to block and go into contention with
one another.
-
It's very inefficient, right. So the agent,
from Clojure,
-
turns that on its head and says, rather than
-
putting the lock on the, the value that we
-
want to change, let's instead queue up the
operations
-
against that value and do it sans-locking.
-
So here's how the agent works.
-
Right, and it's really fascinating.
-
Create an agent and give it an initial value.
-
The value can be anything, and in the score
-
example it would be like zero. K, then, when a
-
thread wants to change that value, rather
than getting
-
the current value, it instead throws an operation
at
-
the agent, which is running on its own thread,
-
and it says, perform this operation against
that value.
-
The agent itself then queues up all of these
-
different operations, runs them one at a time,
in
-
order, so there's no contention amongst those
operations. When
-
an operation runs, it has complete access
to that
-
value. Another great thing about it is that the
operation
-
we're sending doesn't have to have top-level
knowledge of
-
what the value is.
-
When that block runs, the agent gives the
block
-
the current value, and the block returns what
the
-
new value is, right. And, much like many of
-
the things in this library, this particular
implementation of
-
agent also supports the observable interface.
So we can
-
hang an observer off of this.
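The queue-the-operations idea can be sketched in plain Ruby. This is not Clojure's agent or the gem's implementation, just the concept: operations queue up and run one at a time on the agent's own thread, each receiving the current value and returning the new one, so senders never block on a lock. The `await` method is a small extra added here so the example can synchronize deterministically.

```ruby
# A minimal, illustrative agent.
class TinyAgent
  def initialize(initial)
    @value = initial
    @queue = Queue.new                 # stdlib thread-safe queue
    @thread = Thread.new do
      loop do
        op = @queue.pop                # run queued operations in order;
        @value = op.call(@value)       # each gets the current value and
      end                              # returns the new value
    end
  end

  def post(&block)                     # fire-and-forget; returns at once
    @queue << block
    self
  end

  def value                            # a snapshot of the value right now
    @value
  end

  def await                            # wait for everything queued so far
    done = Queue.new
    @queue << ->(v) { done << true; v }
    done.pop
    self
  end
end

score = TinyAgent.new(0)
10.times { score.post { |v| v + 100 } }  # e.g. worker threads scoring points
score.await.value  #=> 1000
```

All ten updates run in order on the agent's thread; none of the posting threads ever blocked on a lock.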
-
So now let's go back to that video game
-
score example. In that case, every thread,
we create
-
an agent, set its initial value to zero, and
-
every thread that wants to then update the
score
-
can throw a block operation at that threa-
at
-
that agent. At any point in time, anything
can
-
retrieve the value of the agent. That retrieval,
though,
-
will get you the value at that time, irrespective
-
if things are still queued up, right.
-
But, we can then hang an observer off of
-
that and make it the observer's responsibility
to update
-
the score on the screen, and now we can
-
take that, that video game score type scenario
and
-
we can run that using agent with absolutely
zero
-
locking and zero blocking of our worker threads.
-
And it's a very fascinating approach. Like
I said,
-
the idea for this comes from the language
Clojure.
-
Now, again I'm gonna take a step aside. We
-
took a step aside a minute ago and talked
-
about non-determinism. I'm gonna take a step
aside and
-
talk about another concept that's very important
in concurrent
-
programming.
-
This code right here is horribly, horribly
broken. All
-
right. The agent does exactly what it's supposed
to
-
do, but the way I use the agent in
-
this code is ridiculously broken and bad.
Take a
-
second to look at it and think about it
-
and I'll tell you what the problem is.
-
K. I gave you a hint earlier on, during
-
the introduction. I said there's something
that we really
-
need to worry about that's really, really,
really bad.
-
And that's mutation, K.
-
The value that we gave to that agent is
-
an array, right. And that is an array of
-
hashes. And those are mutable data structures.
So whenever
-
we retrieve the value from that agent, we
are
-
retrieving a reference to a mutable data structure.
-
So although that agent will queue up those
operations
-
on one thread to make sure they don't compete
-
with one another, any thread that has a handle
-
to that agent and wants to get that value
-
is gonna be retrieving a mutable reference,
right. And
-
that's really bad. That can lead to a lot
-
of pain and agony. Don't do that.
-
So here's the question - how do we avoid
-
doing that? Cause, let's be honest. We're
working in
-
Ruby. A lot of the languages that are concurrency-friendly
-
are things like Clojure and Erlang and Haskel
that
-
have immutable variables. Ruby does not have
immutable variables.
-
Problem.
-
So how do we solve this problem? In this
-
case, we solve it with a hamster.
-
Did anybody see that coming?
-
In 2000, a gentleman named Phil Bagwell started
doing
-
some research into something called a trie.
All right,
-
a trie is a high-performance tree-like data
structure which
-
every node can have up to two hundred and fifty-six
-
other nodes hanging off of it.
-
By having two-hundred and fifty-six nodes
off of every
-
node, we can have one million nodes in a
-
trie with only three levels deep, which gives
us
-
incredibly fast access to those nodes.
-
He then followed that up with something called
an
-
ideal hash tree. Has anybody heard of an ideal
-
hash tree? All right, an ideal hash tree is
-
a very high-performance data structure which
allows you to
-
create immutable data structures, but have
them perform very
-
fast, because what you're doing is in the
tree,
-
you're storing indexes to the objects,
and
-
when you copy things, you are now copying
the
-
indexes, not the objects themselves.
-
An ideal hash tree is the underlying engine
that
-
makes Clojure work.
-
Clojure gets its immutable variables with
high performance list
-
comprehensions because the underlying engine
is this thing called
-
an ideal hash tree. Well, fortunately for
us, a
-
couple of Rubyists went and read the same
papers
-
by Phil Bagwell and had the same ideas, and
-
they created a library for us that provides
us
-
with high performance immutable data structures,
and that particular
-
library is called Hamster, k.
-
Hamster's an opensource gem. It provides threadsafe,
immutable,
-
high-performance data structures. So now what
we've done in
-
this case is we've replaced that example of
the
-
agent we used before, getting rid of the array,
-
which is not safe, putting in our Hamster
vector
-
instead. Now every time we operate in that
vector
-
we of course have to replace the vector, cause
-
it's an immutable data
structure.
-
But really all we're doing is internally just
replacing
-
the indexing, and it's doing it very fast.
-
So now we've created a thread safe immutable
data
-
structure inside of our agent, and now we
get
-
the behavior that we wanted with absolute
and complete
-
safety, k.
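Leaving Hamster's structural sharing aside, the discipline it enforces can be sketched in plain Ruby: every update produces a new frozen value instead of mutating the old one. Hamster just makes this pattern fast; the contract is the same.

```ruby
# The replace-don't-mutate discipline, in plain Ruby. Readers can
# never mutate what they were handed, because every value is frozen
# and every "update" returns a brand-new value.
v0  = [].freeze
add = ->(vec, item) { (vec + [item]).freeze }   # returns a NEW frozen array

v1 = add.call(v0, { symbol: 'YHOO' })
v2 = add.call(v1, { symbol: 'AAPL' })

v0.empty?    #=> true   (the original is untouched)
v2.size      #=> 2
v1.frozen?   #=> true
```

Doing this with plain arrays copies the whole array on every update; an ideal hash tree gets the same safety while copying only indexes.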
-
So the point to understand is in Ruby, we
-
have mutable variables. Mutable variables
are bad in concurrency,
-
so when you are using variables, passing variables
across
-
threads, make sure you, whenever possible,
make them immutable.
-
Now, the second example is an important one
for
-
comparison.
-
This, the second example is from a library
called
-
thread_safe. It's written by one of the people
from
-
the JRuby core team, and it provides thread-safe
implementations
-
of Ruby's hash and array. It is a really,
-
really great library that does really, really
great work,
-
and it's something you should definitely know
about if
-
you're doing concurrent code, because thread
safety is important.
-
But notice, with that, we still have the same
-
problem we had before, cause it's still mutable,
K.
-
We're still passing mutable references. So, you
definitely should
-
get to know thread_safe, you definitely should
get to
-
know Hamster. You should definitely be aware
of thread-safety,
-
but remember, whenever possible, immutability
is the best.
-
So. So there's JavaScript programmers. How
many JavaScript programmers
-
in here are familiar with promises? All right.
-
So a promise is a contract between you and
-
something that happens on another thread.
A promise is
-
a very popular data struct- or excuse me,
concurrency
-
abstraction in JavaScript, and it's defined
by two specifications:
-
Promises/A and Promises/A+.
-
Promise is basically a future, right. It's
part of
-
that general class of future. We send this
-
thing off with the promise, and it promises
to
-
us that it will get us a value at
-
some point.
-
Promises as expressed in JavaScript are very
different than
-
the future that we saw earlier in one special
-
way. Promises are chainable, K. The future
we looked
-
at earlier was a one-shot deal. Send it out
-
there, it does its thing, it returns a value
-
and it becomes immutable at that point. Done.
-
Promises are chainable. A promise can beget
a promise
-
can beget a promise can beget a promise. They're
-
not only chainable, but you can make trees
out
-
of them as well, K. In order to make
-
that work, there are actually some error-handling
semantics
-
built in there, too. You can say, when this
-
happens, rescue this, on error this and so
on
-
and so forth.
-
So a promise is very much like a future.
-
In this implementation, a promise supports
all the same
-
methods we saw earlier on future. The idea
of
-
state being pending, fulfilled, rejected.
Value, reason and so
-
forth. The difference is, however, the chainability
of this.
-
There's greater internal complexity to make that
happen, but
-
you can then use that in a very similar
-
way, especially for chaining.
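The chaining idea can be sketched in plain Ruby. This is not the gem's implementation and is far simpler than the Promises/A+ spec; it only shows how `then` lets a promise beget a promise, with a rejection propagating down the chain.

```ruby
# A minimal, illustrative chainable promise. #then returns a new
# promise whose block receives the parent's value once it resolves.
class TinyPromise
  attr_reader :state, :value, :reason

  def initialize(parent = nil, &block)
    @state = :pending
    @thread = Thread.new do
      begin
        input = nil
        if parent
          parent.wait
          raise parent.reason if parent.state == :rejected  # propagate failure
          input = parent.value
        end
        @value = block.call(input)
        @state = :fulfilled
      rescue => ex
        @reason = ex
        @state  = :rejected
      end
    end
  end

  def then(&block)            # a promise begets a promise
    TinyPromise.new(self, &block)
  end

  def wait
    @thread.join
    self
  end
end

chain = TinyPromise.new { 10 }.then { |v| v * 2 }.then { |v| v + 1 }
chain.wait.value   #=> 21

broken = TinyPromise.new { raise 'boom' }.then { |v| v * 2 }
broken.wait.state  #=> :rejected
```

Trees fall out of this for free: calling `then` twice on the same promise gives it two children.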
-
In this particular case, promise does not
implement observable,
-
and the reason why is that in this case,
-
the call-back mechanism is built into promise
through the,
-
the chaining, right. So promise, like I said
this
-
particular implementation is very true to
the Promises/A and
-
Promises/A+ specifications from JavaScript, but
it's a Ruby-ish
-
library.
-
OK. How many people know what cron is? That
-
should be everybody in the room. So one of
-
the things that we oftentimes want to do is
-
we want to have a task or something that
-
occurs at a very specific time. If you're
in
-
Rails land, there are gems that allow us to
-
do this, right. But if you're outside of Rails
-
land, not quite so easy.
-
So looking at Java, of all languages, Java
provides
-
this really cool abstraction that allows us
to handle
-
this thing where we want to have something
happen
-
at a certain time. Now of course because it's
-
Java, it has a really stupid name. It's called
-
the scheduled executor service.
-
But it's actually a really cool abstraction.
And so
-
because I'm not a Java person I'm gonna call
-
it schedule task. So this implementation schedule_task
is based
-
upon Java's scheduled executor service, and
it does basically
-
the same thing. You create this thing, you
pass
-
it a block. And you say, I want this
-
operation to occur at a certain time, right.
-
And then you can say either this many seconds
-
from now, or at this specific time. Right.
Then
-
it just goes off and it does that, right.
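A plain-Ruby sketch of that idea, not the gem's implementation: a future-like object that fires after a delay, and, unlike a plain future, can be cancelled while it is still pending.

```ruby
# A minimal, illustrative scheduled task.
class TinyScheduledTask
  attr_reader :state, :value

  def initialize(delay, &block)
    @state = :pending
    @lock  = Mutex.new
    @thread = Thread.new do
      sleep delay
      run = @lock.synchronize do
        if @state == :pending
          @state = :processing
          true
        else
          false                       # it was cancelled while we slept
        end
      end
      if run
        @value = block.call
        @state = :fulfilled
      end
    end
  end

  def cancel
    @lock.synchronize do
      return false unless @state == :pending   # too late once it's running
      @state = :cancelled
      true
    end
  end

  def wait; @thread.join; self; end
end

task = TinyScheduledTask.new(0.05) { :fired }
task.wait.value    #=> :fired

doomed = TinyScheduledTask.new(60) { :never }
doomed.cancel      #=> true
doomed.state       #=> :cancelled
```

The mutex around the state transition is what makes cancellation safe: cancel and fire can race, but only one of them wins.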
-
This, for consistency, supports
the same
-
kind of interface that we saw in future earlier,
-
cause I'm, I'm not that smart and I like
-
things to work the same so I can remember
-
them.
-
So it provides us with a state that's pending
-
and fulfilled and rejected, and provides us
value and
-
reason and so forth. And this can go ahead
-
and go on and make that operation occur at
-
a specific time. Now, the astute among you
might
-
be saying, all right, Jerry, that's really
cool, but
-
how is that different from just basically
creating a
-
future and making the thing go to sleep, you
-
know, when you first create it?
-
Well, it really isn't. I mean, I could literally
-
do the same thing by creating a future and
-
having the first line of the block I passed
-
the future be sleep. There's two reasons why
I
-
don't.
-
One is cancelability, right. The intent
of a future
-
is to say go and do this right now.
-
The intent of a scheduled task is go and
-
do this later. So a scheduled task can be
-
canceled. You can't cancel a future. Once
you set
-
that thing in motion it's done, right. It's
just
-
gonna work.
-
This allows us to cancel it. There's another
reason.
-
It's more important, and again, this is something
that,
-
it transcends this particular implementation
but is true of
-
concurrency abstractions in general, K.
-
And it's the idea of intent. You'll notice
as
-
we've gone through this that we've worked
very hard
-
to decouple our business logic from our concurrency
logic.
-
It's done on purpose, all right. If you've
ever
-
tried to test concurrent business logic, you
found out
-
that it's probably very hard and painful,
all right.
-
When we test, we set things at a known
-
state, we change that state and verify that
the
-
new state is what it's supposed to be. Concurrency
-
is non-deterministic. It is very hard in a
concurrent
-
environment to create a known state. That's
the whole
-
problem.
-
So if you decouple your business logic from
your
-
concurrency logic, you can test the business
logic in
-
a way that's not concurrent, make sure that
your
-
business logic does exactly what it's supposed
to do
-
- our crash test dummy example being that.
Make
-
sure it does what it's supposed to do.
-
Then you can take a concurrency abstraction
that is
-
tested and that has defined behavior and does
a
-
specific thing, and you can put the two together
-
with a very minimal intersection, and now
your testing
-
of concurrency becomes minimal, because you
just have to
-
test that intersection, right.
-
So when we do that, now we have code
-
that very clearly expresses intent. When I'm
looking at
-
code and you see something that says schedule
task,
-
that expresses intent. It has meaning. It
tells you
-
that there are certain concurrent behaviors
going on.
-
You see something called future. That expresses
intent. You
-
see something called agent, that expresses
intent. So although
-
in this case we could have simulated schedule
task
-
with future, or in fact we could have simulated
-
all of these things with something called
actor that
-
we're gonna look at later on. By having an
-
abstraction that does one thing very well
we better
-
express intent and we allow ourselves to optimize
that
-
abstraction for the thing it needs to do.
-
So, schedule_task, in this case, looks very
much like
-
future, but it has that scheduling.
-
Now for all of us who like to use
-
cron - oh, and also, in this particular
case,
-
this implementation schedule_task does, of
course, support observer as
-
well, so that we can have that call-back type
-
ability, right. Again, I'm not very bright.
I like
-
my things to work consistently so I can remember
-
how they go. So this is observable
as
-
well.
-
So getting back to that cron example, we
have
-
another reason to use cron, and that's repetition.
We want
-
something to happen over and over and over
and
-
over again. Whether it's every five seconds
or every
-
minute or every ten minutes or whatever, K.
-
Java provides us with a really cool abstraction
to
-
do that too. And unlike scheduled executor
service, the
-
abstraction for this in Java actually has
a name
-
that's not entirely stupid. It's called a
timer_task, right.
-
So a timer_task is simply this. It says, here's
-
an operation I want you to perform. I want
-
to perform at this particular interval, five
seconds- however
-
many seconds, right - and if that tasks takes
-
longer than a certain timeout value, kill
it.
-
Now this is broken, K. You know, I notice
-
though, one thing different about this from
the things
-
we saw before is this is a run method
-
that we're calling on here, on line nine.
-
Remember that when we come back to that in
-
a minute, K.
-
So I create this timer task, we give it
-
the timer values, we send this thing off and
-
just say just go do this thing over and
-
over and over again, K. Shouldn't surprise
you by
-
now that, in this particular implementation,
cause, you know,
-
we like consistency, it also supports observability
as well,
-
K. So we can have this timer task go,
-
we can run it, we can attach an observer
-
to it, and every time that it occurs, we
-
can have the observer or observers respond
to that
-
in a call-back like fashion, K.
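The repetition idea can be sketched in plain Ruby, too. This is not the gem's implementation (no per-run timeout, no observers); it shows the repetition, the stop switch, and the detail that the block receives the task itself so the running code can shut its own task down.

```ruby
# A minimal, illustrative timer task: run a block over and over at a
# fixed interval until told to stop.
class TinyTimerTask
  def initialize(interval, &block)
    @interval = interval
    @block    = block
    @running  = false
  end

  def run
    @running = true
    @thread = Thread.new do
      while @running
        @block.call(self)    # hand the task to the block, in place of self
        sleep @interval
      end
    end
    self
  end

  def stop
    @running = false
    self
  end

  def join; @thread && @thread.join; self; end
end

ticks = Queue.new
task  = TinyTimerTask.new(0.01) do |t|
  ticks << :tick
  t.stop if ticks.size >= 3   # the block stops its own task
end
task.run.join
ticks.size   #=> 3
```

Passing the task into the block is exactly the kind of tweak that lets the executing code change its own life cycle.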
-
Now, one of my co-workers is using this in a
-
project he's working on, and he came to me and
-
said, how do I stop this thing once it's
-
started? I'm like, what do you mean, it's
supposed
-
to go forever.
-
I said, you can call stop on it. He said,
-
but I don't wanna call stop
-
from the main thread. He said,
what
-
happens if within the block that's running,
there's something
-
occurs, and I want to, based upon that logic,
-
change the execution whenever I want to shut
this
-
thing down?
-
So I thought, it's a good use case I
-
hadn't thought of, but it's a very smart one.
So,
-
based upon that, he and I sat down. We
-
paired and created small changes and said,
you know
-
what, let's just, inside that block, every
time it
-
executes, let's pass a reference to the task
itself.
-
Basically, self. Right. Within the block,
self isn't gonna
-
be what we want it to be, so we'll
-
pass that task in. Now, that scheduled task
has
-
the ability to change its own life cycle within
-
the block by changing its own timer values
or
-
by making or stopping itself if necessary.
OK.
-
Now last, a really important topic we're gonna
talk
-
about, well, not the last topic, but really,
a
-
big topic. Let me ask this question: How many
-
people have heard of the actor model for concurrency?
-
OK.
-
Good. How many people have heard - so, Always
-
Sunny in Philadelphia for those of you who
don't
-
know.
-
How many people here have heard that Erlang
implements
-
the actor model for concurrency? A few? All
right.
-
So actor model is sort of a big deal
-
these days. Now here's the interesting - now,
I've
-
been doing this a long time, nearly twenty
years,
-
and if there's one thing I've learned in twenty
-
years of being a programmer, is if there's
anything
-
that programmers want to talk about, apparently
it's also
-
worth getting into ridiculous flame wars over.
-
And actor, the actor model for concurrency
is the
-
same thing. K. This is surprisingly controversial
in some
-
circles, K. There are some people who think
that
-
the actor model can only do concurrency and
everything
-
else should just go away, and there are people
-
who think that they're completely wrong, right.
-
Not gonna weigh in on that debate. Just understand
that
-
these debates exist. There's also a debate about what,
about what,
-
exactly, an actor is. So here's the thing:
-
the actor model was first proposed in 1973,
right.
-
A gentleman named Carl Hewitt and his associates working
working
-
at the MIT Artificial Intelligence lab, published
a paper
-
called The Actor Model for Concurrency.
-
Well, it shouldn't surprise you that in the
past
-
forty years, a lot has changed with respect
to
-
programming. What they really described in
this paper was
-
a pattern, right. They weren't, they weren't
using the
-
term pattern the same way back then. So they
-
don't call it a pattern. But they said, we've
-
seen a bunch of things that behave a certain
-
way, and we're gonna document the way these
things
-
behave.
-
And anything that behaves this way we're gonna
retroactively
-
call an actor. Right. This is also before
the
-
days of object orientation, long before Gang
of Four
-
wrote their book, so we didn't have great
diagramming
-
techniques for creating class and object designs.
-
So they used the only notation that they knew
-
at the time, which was a mathematical notation,
which
-
means this paper, which is very fascinating
to read,
-
has very limited direct applicability
to today. The
-
ideas, forty years later, are still very good,
but
-
the paper, as far as being a blueprint for
-
creating an actor, is very limited.
-
So as a result, today in 2013, there is
-
no single universally agreed upon strict definition
of what
-
an actor is. Additionally, there is no one
canonical
-
implementation of actor. If you look at everything
that
-
claims to be an actor, and things
-
that people claim are actors even when they
don't
-
claim to be actors themselves - they all look
-
very, very different.
-
And so that leads to a lot of debate
-
within the actor community about what really
an actor
-
should look like. K.
-
So for the purposes of this presentation,
I'm going
-
to give you my definition of actor. OK, I'm
-
sure I'm gonna get flamed for it by somebody,
-
but, you know, we have to move forward.
-
So here's my definition of actor. An actor
is
-
an independent single purpose concurrent computational
entity. Again, an
-
independent single purpose concurrent computational
entity that communicates via
-
messages, K. It's gotta, it's gotta do something.
It's
-
got to be an independent computational entity
that does something.
-
A class called actor is not an actor. A
-
class called actor can create an object which
behaves
-
as an actor, but it's gotta be something that
-
does something.
-
It has to be concurrent. All right, that was
-
one of the key things about the original paper.
-
And it has to be single-purpose. Now this
is
-
critical. When Hewitt and his colleagues wrote
this paper,
-
they, their examples were people performing
a play on
-
a stage.
-
Every actor fulfills a role. Every actor plays
that
-
role. There is not a tremendous amount of,
there's
-
no overriding control of those actors. They're
all acting
-
independently. But they coordinate amongst
themselves to do something
-
greater than the sum of its parts.
-
So an actor must perform a role in order
-
to be an actor, K. And one of the
-
key things about the original Hewitt paper was it
was it
-
said, you must communicate via messages,
K. Now
-
they didn't define what a message is, and
that's
-
one of the areas where there's a lot of
-
contention these days is, what constitutes
a message?
-
Now, if you were using Erlang, or using Scala,
-
or some language like that, they have built-in
inter-process
-
messaging systems, right. The bang operator
in Erlang is
-
a way of one process sending a message to
-
another process.
-
A message in that case is pretty cut and
-
dry. In Ruby, we have no similar underlying
communication
-
mechanism to define a message. Therefore in
Ruby it's
-
kind of hard to decide what constitutes a
message.
-
So.
-
The example I'm gonna show you, the example
I
-
like the best, is by no means the right
-
example or the canonical example and I'm sure
a
-
lot of people will think that my choice is
-
not good, and that's fine, but the example
I'm
-
gonna show you is based upon Scala's actor
class
-
for, a trait, excuse me, from the Scala standard
-
library.
-
Now Scala has since deprecated this particular
trait and
-
moved on to the Akka library. But this particular
-
implementation of actor from Scala served
Scala programmers very
-
well for a number of years, and it exhibits
-
all of the things an actor is supposed to
-
exhibit, and it's also very simple. And I'm
a
-
simple guy. I like simple libraries that are
loosely
-
coupled and that give me things that work
really
-
well by themselves.
-
So here's how this works.
-
Straightforward. You extend the actor class.
The actor class
-
there, then gives your object all of the message
-
passing semantics it needs. It gives it the
threading
-
it needs so this thing can run on its
-
own thread, right, and it handles queuing
of the
-
messages. Every time a message comes in, it
calls
-
the act method, which you, in your subclass,
override,
-
in order to give your actor some behavior,
right.
-
In this case, this is a pretty simple example,
-
all it's gonna do is basically echo the message
-
to the screen, K.
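The Scala-inspired design just described can be sketched like this. A hedged illustration, not the real library code: the base class, sentinel, and method names here are simplifications for the example.

```ruby
# Minimal sketch of a Scala-style actor base class: subclass it, override
# #act, and every posted message is handled one at a time on the actor's
# own thread. (Illustrative names; not a library API.)
class Actor
  def initialize
    @mailbox = Queue.new
  end

  def post(message)
    @mailbox << message
  end

  def run!
    @thread = Thread.new do
      loop do
        message = @mailbox.pop
        break if message == :__stop__   # sentinel; a sketch-level simplification
        act(message)
      end
    end
    self
  end

  def stop
    @mailbox << :__stop__   # queued after pending messages, so they drain first
    @thread.join
  end
end

# Echo actor: overriding #act gives the actor its behavior.
class EchoActor < Actor
  attr_reader :seen

  def initialize
    super
    @seen = []
  end

  def act(message)
    @seen << message
    puts "received: #{message}"
  end
end

echo = EchoActor.new.run!
echo.post('hello')
echo.post('world')
echo.stop
```

The mailbox (a thread-safe `Queue`) is what serializes the messages: `act` is only ever called from the actor's own thread, so the subclass never needs a lock.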
-
Straightforward - very simple. Now, the problem
with that,
-
when you give each actor its own thread, or
-
its own process, the problem you run into
is
-
one of contention. Blocking. Right.
-
If your actor performs some lengthy operation
such as
-
blocking IO, you run the risk of having a
-
whole bunch of stuff back up in the queue.
-
So most actor implementations will give you
some ability
-
to pool actors off of some shared mailbox,
right.
-
So that that way you can have a whole
-
bunch of threads running with a whole bunch
of
-
different actors and you can send messages
to one
-
place, K.
-
F#'s MailboxProcessor works this way. Akka works this
-
way. Scala's original actor works this way, Celluloid works
-
this way, right. The idea of a pool. So
-
the way you get a pool out of this
-
particular implementation is you just call
the pool method
-
off of the class, tell it how many
-
things you want, and it'll create a whole bunch
-
of actors that all share one mailbox, and
it'll
-
return the mailbox, right. It's very simple.
-
You can then run each of the things in
-
the pool and you can start sending messages
at
-
it, and all those things in the pool will
-
then start handling those messages, K. It's
very straightforward,
-
K.
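The pooling idea, many workers draining one shared mailbox, can be sketched with plain threads and a `Queue`. The names here are illustrative, not a library API.

```ruby
# Sketch of actor pooling: several worker threads share one mailbox (a
# Queue), so a slow or blocking message handled by one worker doesn't
# back up everything behind it in a single actor's queue.
POOL_SIZE = 4
mailbox = Queue.new
results = Queue.new

pool = POOL_SIZE.times.map do
  Thread.new do
    loop do
      message = mailbox.pop
      break if message == :stop       # one stop sentinel per worker
      results << message * 2          # stand-in for real message handling
    end
  end
end

10.times { |i| mailbox << i }         # everything is sent to one place
POOL_SIZE.times { mailbox << :stop }
pool.each(&:join)
```

`Queue#pop` blocks until a message is available, so idle workers simply wait; whichever worker is free picks up the next message.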
-
And again this is a very Scala-ish way of
-
doing things. Now you'll notice, though,
when
-
you call post and you send the message in
-
there there's no way to then interact with
that
-
message or that result later on, K. That was
-
by design in Scala's actor class, because
the original
-
actor model paper from 1973 said actors only
interact
-
with each other via messages, K.
-
Now, again, a lot's changed in forty years
and
-
that's not necessarily that efficient, and
so sometimes you
-
want to have other ways to interact with that,
-
and so Scala, Martin Odersky being a very smart
smart
-
guy, decided to create other ways to interact
with
-
the actor. So here's two other ways that you
-
can interact with this particular actor when
sending messages.
-
The first one is the post question mark, all
-
right. What that does is it sends a message
-
to the actor and it returns a future object,
-
K. This is a very common paradigm in asynchronous
-
programming where we send something off for
processing and
-
we get back a future. In this case, that
-
future object behaves exactly the same as the
-
futures from the beginning of the talk: pending, fulfilled,
-
rejected, value, reason, K.
-
So when I send this thing off to the
-
actor I get my future object back, I go
-
through my very important stuff and later
on I
-
query that object to see what occurred.
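The "send a message, get back a future" pattern can be sketched with a hand-rolled future. Both `Future` and `post?` here are illustrations of the pattern, not the library's actual classes.

```ruby
# Sketch of the post-? pattern: send a message to an actor and get back a
# future that starts :pending and becomes :fulfilled with a value.
# (Hand-rolled for illustration; not a library API.)
class Future
  attr_reader :state

  def initialize
    @mutex = Mutex.new
    @cond  = ConditionVariable.new
    @state = :pending
  end

  def fulfill(value)
    @mutex.synchronize do
      @value = value
      @state = :fulfilled
      @cond.broadcast       # wake any thread blocked in #value
    end
  end

  def value
    @mutex.synchronize do
      @cond.wait(@mutex) while @state == :pending
      @value
    end
  end
end

def post?(mailbox, message)
  future = Future.new
  mailbox << [message, future]   # the actor fulfills the future when done
  future
end

mailbox = Queue.new
actor = Thread.new do
  loop do
    message, future = mailbox.pop
    break if message == :stop
    future.fulfill(message.upcase)
  end
end

f = post?(mailbox, 'hello')
# ... go do other important work here, then come back for the result ...
answer = f.value               # blocks only if the future is still pending
mailbox << [:stop, nil]
actor.join
```

The caller never touches the actor's internals; the future is the only shared object, and its mutex/condition-variable pair makes the handoff safe.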
-
The second example is post bang, all right.
There
-
are cases where you may want to use an
-
actor in a synchronous capacity, but here's
the problem.
-
An actor's trying to queue up these operations,
so
-
as to prevent locking. So we don't have to
-
lock so that we can have these happen one
-
at a time. If we try and use an
-
actor synchronously and asynchronously at
the same time we
-
run the risk of breaking things very badly.
-
So any case where you might want to use
-
this thing synchronously - and again Scala
provided the
-
same capability - you're gonna call this method
post
-
bang, which will then block and wait for the
-
operation to complete, thus imitating synchronous
behavior. Now the
-
problem with that is when that occurs, there's
no
-
way of knowing what the result is other than
-
the return value of the, of the method.
-
In this case, on success, this will return the
-
result of processing the message, right.
-
So in this particular case, this is one of
-
the few places in this library we're gonna
see
-
any kind of exceptions being raised, all right.
If
-
this thing times out, we're gonna raise a
time
-
out exception, if, for some reason the message
can't
-
be queued, we're gonna get a life cycle exception.
-
If our operation throws an exception, the
actor will
-
then handle that internally the way it handles
all
-
other exceptions, and then reraise that exception
out of
-
this particular method, all right.
-
And that way it allows us to treat this
-
in a very synchronous way, but we're given
a
-
very strong warning that, really, what we're
doing might
-
not be quite the best way to do it
-
and we're gonna get exceptions, so we're gonna
want
-
to wrap that in a rescue block, OK.
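The blocking, timeout-guarded behavior just described can be sketched with a reply queue and Ruby's Timeout module. The names (`post!`, the mailbox protocol) are illustrative, not the library's API.

```ruby
require 'timeout'

# Hedged sketch of the post-! idea: send the message along with a reply
# queue, then block until the actor replies or the timeout expires, raising
# Timeout::Error in the latter case. (Illustrative; not a library API.)
def post!(mailbox, message, seconds)
  reply = Queue.new
  mailbox << [message, reply]
  Timeout.timeout(seconds) { reply.pop }   # block, imitating synchronous behavior
end

# A trivial actor loop that doubles each message and replies.
mailbox = Queue.new
actor = Thread.new do
  loop do
    message, reply = mailbox.pop
    break if message == :stop
    reply << message * 2
  end
end

result = post!(mailbox, 21, 1)   # blocks until the actor has processed 21
mailbox << [:stop, nil]
actor.join
puts result
```

Because the message still flows through the one mailbox, synchronous and asynchronous callers can't interleave badly; the caller just waits for its own reply.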
-
And, at this point it should be no surprise
-
to you that this particular actor implementation
also supports
-
observers, because it's very common in the
case of
-
actor frameworks to provide some sort of call
back
-
against messages being processed successfully.
-
So here we leverage that observer again. And
if
-
you're familiar with actors, you know that
the canonical
-
actor example, the hello world of actors,
is a
-
ping pong example, so for completeness, here
is a
-
ping pong example using this particular actor
implementation.
-
That actually, I took that directly from the
Scala
-
tutorial on actors and rebuilt it using Ruby,
Ruby,
-
K.
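For reference, the ping-pong idea can be sketched with two plain threads, each with its own mailbox. This is a stripped-down illustration of the pattern, not the Scala tutorial's code; all names are made up for the example.

```ruby
# The canonical ping-pong: two players volley an incrementing counter back
# and forth via their mailboxes (Queues) until it passes the limit.
def player(name, inbox, outbox, limit, log)
  Thread.new do
    loop do
      n = inbox.pop
      break if n > limit
      log << "#{name} #{n}"
      outbox << n + 1        # volley the incremented counter back
    end
    outbox << limit + 1      # propagate the stop signal to the other player
  end
end

log      = Queue.new
ping_box = Queue.new
pong_box = Queue.new

ping = player('ping', ping_box, pong_box, 3, log)
pong = player('pong', pong_box, ping_box, 3, log)

ping_box << 0                # serve
[ping, pong].each(&:join)

rally = []
rally << log.pop until log.empty?
puts rally.join(' | ')       # => ping 0 | pong 1 | ping 2 | pong 3
```

The ordering is deterministic even though two threads are involved, because each logged line happens before the message send that wakes the other player.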
-
So last concept. How many people here have
heard
-
about Erlang being a very fault-tolerant language?
-
Nine nines of availability in some cases. How many people
would like
-
to have their Ruby programs have nine nines
of
-
uptime? All right, that should be everybody
in the
-
room.
-
There's nothing magic about Erlang. You probably
have heard
-
of the let-it-fail philosophy of Erlang, all
right. There
-
really, truly is nothing magical about Erlang.
In the
-
language or the virtual machine itself. It's a
-
great language and does some really cool things
but
-
the actual nine-nines fault tolerance
comes from something
-
called the supervisor, all right.
-
How many - how many people have heard of
-
the supervisor in Erlang? OK. This is such
a
-
powerful concept that you see supervisors
all the time
-
in concurrency libraries. You see supervisors
in Akka, you
-
see supervisors in Celluloid, you see supervisors
all the
-
time.
-
So what is a supervisor? The idea in Erlang
-
is, when we create these processes, we send
them
-
out to do things, and if something goes wrong
-
we want to let them crash. Why do we
-
let them crash?
-
We let them crash because we don't want to
-
have anything in some kind of intermediate
state, K.
-
If you're like me you've probably programmed
some kind
-
of wrapper at some point in your life, where
-
you thought I'm gonna put this really great
simple
-
O-O wrapper around a whole bunch of complex
stuff.
-
And I've got connections and all these various
things
-
in there, and then one of those things blows
-
up. Now I've got this mess, and I've got
-
to dig through all this kind of crap and
-
figure out what state is this thing in so
-
I can get the broken pieces back where they
-
need to be.
-
Erlang says no. Don't do that. There's only
two
-
states. Good or bad. If it's good, great.
If
-
it's not, you should just kill it and let
-
it die. Right, and that works only if there's
-
a way to restart it, all right. This philosophy
-
is really good because now it's very easy,
it's
-
very freeing. So I've got this complex thing,
something
-
blows up, I'm just gonna kill everything.
-
All right. But that depends on having something
-
that restarts it. And in Erlang, that's the
supervisor,
-
K. This right here is a functionally complete
implementation
-
of Erlang's supervisor module, K. Erlang's
-
supervisor module provides a lot of really
great capabilities.
-
It provides something called restart strategies,
which allows you
-
to define when one thing dies, what would
you
-
do with the other things. It allows you
-
to define child types that can be permanent, temporary,
-
or transient, each with its own meaning. You can provide
-
a sliding window of intervals. You can say if we get
-
x number of crashes within y period of time,
-
we're gonna shut the whole thing down.
-
And one of the best things about Erlang's
supervisor
-
is something called supervisor trees. Supervisors
can manage supervisors
-
which can manage other supervisors. So if
you look
-
at Erlang's documentation on how to build
fault-tolerant systems,
-
they discuss a bunch of very different tree
structures,
-
K.
-
This particular implementation here is a functionally
complete version
-
of Erlang supervisor, and here's how it works.
You
-
can give this thing anything that supports
three methods.
-
A blocking run method, a runable predicate
method -
-
excuse me, a running predicate method, and a stop
-
method. And then
-
you use the run method to start it and
-
the stop method to shut it down.
-
That's why we looked at a couple of things
-
earlier that had that run method. So what
we're
-
doing in this case, we're gonna create a super-
-
we're gonna create an actor. From that, from
the
-
actor class we create a pool of actors, all
-
right. We're gonna create a couple timer tasks,
which,
-
that have random intervals. We're gonna create
a supervisor.
-
We're gonna tell the supervisor, manage and
monitor all
-
of these things, here, add_worker, add_worker,
add_worker. We then
-
start the supervisor, and it runs, all right.
And
-
at that point it starts up all of those
-
things, and all of those things run, and they
-
all do all the things they want to do
-
and the supervisor monitors them, and if any
of
-
them should crash, the supervisor will restart
them based
-
upon the restart strategy.
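The worker contract (a blocking run, a running? predicate, a stop) and the restart loop just described can be sketched in plain Ruby. This is a bare-bones illustration of the idea, not Erlang's supervisor or the library's implementation; it has no restart strategies or supervision trees, and all names are made up for the example.

```ruby
# Minimal supervisor sketch: it manages workers that respond to #run
# (blocking), #running?, and #stop, and restarts any worker whose thread
# dies. (Illustrative only; real supervisors add restart strategies.)
class Supervisor
  def initialize(check_interval: 0.01)
    @workers = []
    @threads = {}
    @check_interval = check_interval
  end

  def add_worker(worker)
    @workers << worker
  end

  def run!
    @workers.each { |w| start(w) }
    @monitor = Thread.new do
      loop do
        sleep @check_interval
        @threads.each do |worker, thread|
          start(worker) unless thread.alive?   # restart crashed workers
        end
      end
    end
  end

  def stop
    @monitor.kill                # stop monitoring first so nothing restarts
    @workers.each(&:stop)
    @threads.each_value(&:join)
  end

  private

  def start(worker)
    @threads[worker] = Thread.new do
      begin
        worker.run
      rescue StandardError
        # let it fail: the monitor notices the dead thread and restarts it
      end
    end
  end
end

# A worker that crashes on its first run, then behaves.
class FlakyWorker
  attr_reader :starts

  def initialize
    @starts = 0
    @running = false
  end

  def run                               # blocking
    @starts += 1
    @running = true
    raise 'boom' if @starts == 1        # simulated crash
    sleep 0.005 while @running
  ensure
    @running = false
  end

  def running?
    @running
  end

  def stop
    @running = false
  end
end

worker = FlakyWorker.new
supervisor = Supervisor.new
supervisor.add_worker(worker)
supervisor.run!
sleep 0.3                 # give the worker time to crash and be restarted
supervisor.stop
puts "worker was started #{worker.starts} times"
```

Notice the worker doesn't rescue its own crash: it just dies, and the supervisor's monitor loop brings it back. That's the let-it-fail philosophy in miniature.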
-
And if you want, you can have supervisors
monitor
-
supervisors, so that that way, if something's
wrong with
-
the supervisor, it can restart that whole
thing. All
-
right, and thus you can get a supervisor tree
-
and that is how languages like Erlang, and
libraries
-
like Celluloid and Akka and so forth, get
their,
-
their fault-tolerant abilities, by using
supervisors to manage
-
those processes.
-
OK. Now this is a really long presentation
and
-
we don't have a lot of time, so I
-
want to mention two libraries that express
-
some really cool ideas in
-
terms of concurrency. And the first one is
gonna
-
be something called EventMachine. EventMachine is
-
based upon the reactor pattern, right. Reactor
pattern was
-
first documented in 2000.
-
We like EventMachine a lot at VHT.
-
EventMachine's basically like node.js for Ruby, K. Oh,
-
again, all these slides are gonna be up on
-
GitHub as well as all the coding samples.
-
Then the other thing is Celluloid. Celluloid
is a
-
fairly well-known, fairly popular actor-based
library, written in Ruby,
-
all right. It's got a good following, it's
got
-
a very good community. And the Celluloid library
-
has the express intent of making it easy
-
for you to add concurrency to your code.
-
This right here is the original example we
showed
-
at the very beginning of our crash test dummy,
-
with one change. Up at the very top, you
-
see include Celluloid. That makes this object
inherently asynchronous.
-
It becomes something you can create actors
from, K.
-
The Celluloid is a great library for making
your
-
job easy. But this particular implementation
I'm showing you
-
here is horribly broken because it violates
a bunch
-
of Celluloid rules, K.
-
Celluloid is very powerful, but it's very tightly
-
coupled and
-
it has a lot of complexity internally.
-
Because it's providing a lot of auto magic
in
-
order to prevent you from harming yourself
through concurrency.
-
So when you look at Celluloid documentation,
there's a
-
page of gotchas, which describe the, the idiomatic
way
-
in which you need to use Celluloid.
-
So Celluloid is another very powerful library
for doing
-
actors, and for doing supervisors. And, and
- but,
-
using Celluloid properly requires a little
bit of work.
-
So I encourage you to look not only at
-
the library that I put together, but the vent
-
machine and also Celluloid. Make sure when
using each
-
of those libraries that you are aware of the
-
idiosyncrasies of those libraries and how
they work. And
-
remember, you can never escape the underlying
realities of
-
concurrency, which are non-determinism and
shared mutable data.
-
So with that, my final thought is this. All
-
right. My challenge to you is to go out
-
and write code, K. If you've never written
concurrent
-
code before, you should know that writing
good concurrent
-
code is something that requires effort. You
can't learn
-
about it by reading. You have to do it.
-
Concurrent systems don't behave the way non-concurrent
systems do.
-
They have different design patterns that make
them work.
-
They have different ways of testing and debugging
and
-
the only way to learn this is to write
-
the code.
-
Over the past forty-five minutes, we've looked
at a
-
tremendous amount of code that did a lot of
-
very, very powerful things, and we never,
ever once
-
had to type thread dot new. We never once
-
had to type dot synchronize off a mutex object,
-
K.
-
You can go out, using the libraries that we've
-
looked at today, using the code that I've
put
-
up on GitHub, and you can write concurrent
code.
-
So if you think concurrency is important,
which you
-
should, if you think that learning to program
concurrency
-
is good for you at your job and your
-
career, which you should, and if you think
concurrency
-
is something that is going to become just
more
-
important in the near future, which it is,
then
-
you need to go out and write code.
-
So at my GitHub page, you'll find all of the slides,
-
detailed notes, all of the source code in
RB
-
files, and even a Gemfile. So that's my
-
challenge to you. Pull out your computer,
-
open up your favorite editor, git clone, bundle
install,
-
and write concurrent code. And with that,
I'm out
-
of time.
-
Thank you very much. My name is Jerry D'Antonio.