WEBVTT

00:00:24.810 --> 00:00:25.859
HARI KRISHNAN: So, thank you very much

00:00:25.859 --> 00:00:28.099
for being here on a Saturday evening, this
late.

00:00:28.099 --> 00:00:30.430
My talk got pushed to the last, but I

00:00:30.430 --> 00:00:34.540
appreciate you being here, first. My name's
Hari. I

00:00:34.540 --> 00:00:36.910
work at MavenHive. So this is a talk about

00:00:36.910 --> 00:00:43.530
Ruby memory model. So before I start, how
many

00:00:43.530 --> 00:00:46.559
of you have heard about memory model and know

00:00:46.559 --> 00:00:51.909
what it is? Show of hands, please. OK. Let's

00:00:51.909 --> 00:00:55.150
see where this talk goes. So why did

00:00:55.150 --> 00:00:58.839
I come up with this talk topic. So I

00:00:58.839 --> 00:01:01.809
started my career with Java, and I spent a

00:01:01.809 --> 00:01:04.860
lot many years with Java, and Java has a

00:01:04.860 --> 00:01:08.890
very clearly documented memory model. And
it kind of

00:01:08.890 --> 00:01:10.500
gets to you because with all that, you don't

00:01:10.500 --> 00:01:14.049
feel safe enough doing multi-threaded programming
at all. So

00:01:14.049 --> 00:01:17.710
with Ruby, we've always been talking about,
you know,

00:01:17.710 --> 00:01:21.290
doing multi-process for multi-process parallelism,

00:01:21.290 --> 00:01:24.450
rather than multi-threaded parallelism,

00:01:24.450 --> 00:01:28.710
even though the language actually supports,
you know, multi-threading

00:01:28.710 --> 00:01:30.799
semantics. Of course we know it's called single-threaded
and

00:01:30.799 --> 00:01:34.259
all that, but I just got curious, like, what

00:01:34.259 --> 00:01:36.499
is the real memory model behind Ruby, and
I

00:01:36.499 --> 00:01:39.149
just wanted to figure that out. So this talk

00:01:39.149 --> 00:01:42.439
is all about my learnings as I went through,

00:01:42.439 --> 00:01:46.350
like, various literatures, and figured out,
and I tried

00:01:46.350 --> 00:01:48.289
to combine, like, get a gist of the whole

00:01:48.289 --> 00:01:50.509
thing. And cram it into some twenty minutes
so

00:01:50.509 --> 00:01:52.340
that I could, like, probably give you a very

00:01:52.340 --> 00:01:55.600
useful session, like, from which you can further
do

00:01:55.600 --> 00:02:01.069
more digging on this, right. So when I talked

00:02:01.069 --> 00:02:03.420
to my friends about memory model, the first
thing

00:02:03.420 --> 00:02:05.540
that comes up to their mind is probably this

00:02:05.540 --> 00:02:10.139
- heap, heap, non-heap, stack, whatever. I'm
not gonna

00:02:10.139 --> 00:02:14.069
talk about that. I'm not gonna talk about
this

00:02:14.069 --> 00:02:17.450
either. It's not about, you know, optimizing
your memory,

00:02:17.450 --> 00:02:21.040
or searching for memory leaks, or garbage collection.
This talk

00:02:21.040 --> 00:02:23.330
is not about that either. So what the hell

00:02:23.330 --> 00:02:27.370
am I gonna talk about? First, a quick exercise.

00:02:27.370 --> 00:02:31.360
So let's start with this and see where it

00:02:31.360 --> 00:02:35.760
goes. Simple code. Not much to process late
in

00:02:35.760 --> 00:02:38.890
the day. There's a shared variable called
'n', and

00:02:38.890 --> 00:02:42.030
there are thousand threads over that, and
each of

00:02:42.030 --> 00:02:45.379
those threads want to increment that shared
variable hundred

00:02:45.379 --> 00:02:49.379
times, right. And what is the expected output?
I'm

00:02:49.379 --> 00:02:51.200
not gonna question you, I'm just gonna give
it

00:02:51.200 --> 00:02:55.180
away. It's 100,000. It's fairly straightforward
code. I'm sure

00:02:55.180 --> 00:02:57.200
all of you have done this, and it's no

00:02:57.200 --> 00:03:01.680
big deal. So what's the real output? MRI is

00:03:01.680 --> 00:03:05.319
very faithful, it gives you what you expected.
100,000,

00:03:05.319 --> 00:03:08.720
right. So what happens next? I'm running it
on

00:03:08.720 --> 00:03:12.569
Rubinius. This is what you see. And it's always

00:03:12.569 --> 00:03:15.760
going to be a different number every time
you

00:03:15.760 --> 00:03:19.140
run it. And that's JRuby. It gives you a

00:03:19.140 --> 00:03:22.629
lower number. Some of you may be guessing
already,

00:03:22.629 --> 00:03:24.489
and you probably know it, why it gives you

00:03:24.489 --> 00:03:28.159
a lower number. So why all this basic stupid

00:03:28.159 --> 00:03:31.230
code and some stupid counter over here, right?
So

00:03:31.230 --> 00:03:34.189
I just wanted to get a really basic example

00:03:34.189 --> 00:03:36.299
to explain the concept of increment is not
a

00:03:36.299 --> 00:03:40.040
single instruction, right. The reason why
I'm talking about

00:03:40.040 --> 00:03:43.390
this is, I love Ruby because the syntax is

00:03:43.390 --> 00:03:46.629
so terse, and it's so simple, it's so readable,

00:03:46.629 --> 00:03:49.310
right. But it does not mean every single instruction

00:03:49.310 --> 00:03:52.140
on the screen is going to be executed straight

00:03:52.140 --> 00:03:54.810
away, right. So at least, to my junior self,

00:03:54.810 --> 00:03:56.599
this is the first advice I would give, when

00:03:56.599 --> 00:04:00.590
I started, you know, multi-threaded programming.
So at least

00:04:00.590 --> 00:04:05.980
three steps. Load, increment, store, right.
That's, even further,

00:04:05.980 --> 00:04:09.879
really simple piece of code like, you know,
a

00:04:09.879 --> 00:04:12.879
plus equals two, right. So this is what we

00:04:12.879 --> 00:04:15.750
really want to happen. You have a count, you

00:04:15.750 --> 00:04:18.399
load it, you increment it, you store it.
Then

00:04:18.399 --> 00:04:21.019
the next thread comes along. It loads it,
increments

00:04:21.019 --> 00:04:23.220
it, stores it. You have the next result which

00:04:23.220 --> 00:04:25.750
is what you expect, right. But we live in

00:04:25.750 --> 00:04:28.260
a world where threads don't want to be our

00:04:28.260 --> 00:04:31.470
friend. They do this. One guy comes along,
reads

00:04:31.470 --> 00:04:33.920
it, increments it. The other guy also reads
the

00:04:33.920 --> 00:04:37.440
older value, increments it. And both of them
go

00:04:37.440 --> 00:04:40.020
and save the same value, right. So this is

00:04:40.020 --> 00:04:42.120
a classic case of lost update. I'm sure most

00:04:42.120 --> 00:04:44.060
of you have seen it in the database world.
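
NOTE
Editor's sketch, not the code from the slides: the counter exercise described above,
with the increment spelled out as load, increment, store. The helper name racy_count
and its keyword arguments are assumptions made for illustration. The total can never
exceed 100,000, and how far it falls short depends on the Ruby implementation and the run.
```ruby
# 1,000 threads each increment a shared counter 100 times, with no locking.
def racy_count(threads: 1_000, increments: 100)
  n = 0
  workers = Array.new(threads) do
    Thread.new do
      increments.times do
        tmp = n      # load the shared value
        tmp += 1     # increment a private copy
        n = tmp      # store it back; another thread may have stored in between
      end
    end
  end
  workers.each(&:join)
  n
end
total = racy_count
puts "expected 100000, got #{total}"
```
Running the same file under MRI, JRuby and Rubinius is the comparison the talk makes.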

00:04:44.060 --> 00:04:46.770
But this pretty much happens a lot in the

00:04:46.770 --> 00:04:48.860
multi-threading world, right. But why did
it not happen

00:04:48.860 --> 00:04:51.620
with MRI? And why did you see the right

00:04:51.620 --> 00:04:53.190
result? That, I'm sure a lot
of you

00:04:53.190 --> 00:04:55.580
know, but let's park that question
and

00:04:55.580 --> 00:05:00.500
just move a little ahead. So, as you observed

00:05:00.500 --> 00:05:03.770
earlier, a lot of reordering happening in
instructions, right.

00:05:03.770 --> 00:05:07.210
Like, the threads were context-switching,
and they were reordering

00:05:07.210 --> 00:05:11.139
statements. So where does this reordering
happen? Reordering can

00:05:11.139 --> 00:05:14.740
happen at multiple levels. So start from the
top.

00:05:14.740 --> 00:05:18.150
You have the compiler, which can do simple
optimizations

00:05:18.150 --> 00:05:20.780
like look closer?? [00:05:20]. Even that can
change the

00:05:20.780 --> 00:05:23.990
order of your statements in your code, right.
Next,

00:05:23.990 --> 00:05:27.680
when the code gets translated to, you know,
machine-level

00:05:27.680 --> 00:05:30.639
language, goes to core, and your CPU cores
are

00:05:30.639 --> 00:05:34.430
at liberty, again, to reorder them for performance.
And

00:05:34.430 --> 00:05:37.020
next comes the memory system, right. The memory
system

00:05:37.020 --> 00:05:39.669
is like the combined global memory, which
all the

00:05:39.669 --> 00:05:42.490
CPUs can read, and also their individual
caches. But

00:05:42.490 --> 00:05:45.840
why do CPUs have caches? They want to, memory

00:05:45.840 --> 00:05:47.710
is slow, so they want to load, preload all

00:05:47.710 --> 00:05:50.080
the values, prefetch it, keep it in the cache,

00:05:50.080 --> 00:05:52.710
again improve performance. So even the memory
system can

00:05:52.710 --> 00:05:55.940
conspire against you and reorder the loads
and stores

00:05:55.940 --> 00:05:59.380
after the memory registers. And that can cause
reordering,

00:05:59.380 --> 00:06:03.319
right. So this is really, really crazy. Like,
I'm

00:06:03.319 --> 00:06:07.550
a very stupid programmer, who works at the
programming

00:06:07.550 --> 00:06:10.599
language level. I don't really understand
the structure of

00:06:10.599 --> 00:06:13.169
the hardware and things like that. So how
do

00:06:13.169 --> 00:06:15.550
I keep myself abstracted from all this, you
know,

00:06:15.550 --> 00:06:21.550
really crazy stuff? So that's essentially
a memory model.

00:06:21.550 --> 00:06:23.930
So what, what is a memory model? A memory

00:06:23.930 --> 00:06:27.180
model describes the interactions of threads
through memory and

00:06:27.180 --> 00:06:28.970
their shared use of data. So this is straight

00:06:28.970 --> 00:06:30.919
out of Wikipedia, right. So if you just read

00:06:30.919 --> 00:06:34.610
it first, either you're gonna think it's really
simple,

00:06:34.610 --> 00:06:38.069
and probably even looks stupid, but otherwise
you might

00:06:38.069 --> 00:06:40.789
not even understand. So I was the second category.

00:06:40.789 --> 00:06:43.879
So what does this all mean? So when there

00:06:43.879 --> 00:06:48.580
are so many complications with the reordering,
the reads

00:06:48.580 --> 00:06:51.129
and writes of memory and things like that,
as

00:06:51.129 --> 00:06:54.759
a programmer you need certain guarantees from
the programming

00:06:54.759 --> 00:06:56.840
language, and the virtual machine you're working
on top

00:06:56.840 --> 00:07:01.039
of, to say this is how multi-threaded shared,
I

00:07:01.039 --> 00:07:03.979
mean, multi-threaded access to shared memory
is going to

00:07:03.979 --> 00:07:05.940
work. These are the basic guarantees and these
are

00:07:05.940 --> 00:07:09.310
the simple rules of how the system works.
So

00:07:09.310 --> 00:07:13.160
you can reliably write code against that, right.
So

00:07:13.160 --> 00:07:15.139
in, in effect, a memory model is just a

00:07:15.139 --> 00:07:21.479
specification. Any Java programmers here,
in the house? Great.

00:07:21.479 --> 00:07:25.860
So how many of you know about JSR 133?

00:07:25.860 --> 00:07:31.270
The memory model, double-checked locking - OK.
Some

00:07:31.270 --> 00:07:37.280
people. Singleton issue? OK - some more
hands.

00:07:37.280 --> 00:07:39.620
So Java was the first programming language
which came

00:07:39.620 --> 00:07:43.360
up with a concept called memory model, right.
Because,

00:07:43.360 --> 00:07:45.610
the first thing is, write once, run

00:07:45.610 --> 00:07:48.110
anywhere. It had to be predictable across
platforms, across

00:07:48.110 --> 00:07:51.740
reimplementations, and things like that. So
the, there had

00:07:51.740 --> 00:07:54.650
to be a JSR which specified what is the

00:07:54.650 --> 00:07:56.860
memory model that you can code against so that

00:07:56.860 --> 00:08:02.129
your multi-threaded code works predictably,
and deterministically across platforms

00:08:02.129 --> 00:08:08.520
and across virtual machines. Right? So essentially
that's where

00:08:08.520 --> 00:08:11.280
my, you know, whole thing started. I had gone

00:08:11.280 --> 00:08:14.509
through the Java memory model, and was pretty
much

00:08:14.509 --> 00:08:16.960
really happy that someone had taken the pain
to

00:08:16.960 --> 00:08:18.590
write it down in clear terms so that you

00:08:18.590 --> 00:08:25.590
don't have to worry about multi-threading.
Hold on, sorry.

00:08:28.020 --> 00:08:34.669
Sorry about that. Cool. So. Memory model gives
you

00:08:34.669 --> 00:08:40.610
rules at three broad levels. Atomicity, visibility
and ordering.

00:08:40.610 --> 00:08:43.039
So atomicity is as simple as, you know, variable

00:08:43.039 --> 00:08:47.030
assignment. Is a variable assignment an indivisible
unit of

00:08:47.030 --> 00:08:49.520
work, or not? The rules around that, and it

00:08:49.520 --> 00:08:52.370
also talks about rules around, can you assign
hashes,

00:08:52.370 --> 00:08:55.070
and arrays indivisibly and things like that.
These rules

00:08:55.070 --> 00:08:57.670
can change based on every language version,
and things

00:08:57.670 --> 00:09:01.940
like that. Next is visibility. So in that
example

00:09:01.940 --> 00:09:05.040
which you talked about, I mean, we saw two

00:09:05.040 --> 00:09:07.310
threads trying to read the same value. Essentially
they

00:09:07.310 --> 00:09:09.390
are spying on each other. And it was not

00:09:09.390 --> 00:09:11.529
clear at what point the data had to become

00:09:11.529 --> 00:09:14.860
visible to each of those threads. So essentially
visibility

00:09:14.860 --> 00:09:18.240
is about that. And that is ensured through
memory

00:09:18.240 --> 00:09:21.800
barriers and ordering, which is the next thing.
So

00:09:21.800 --> 00:09:25.120
ordering is about how the loads and stores
are

00:09:25.120 --> 00:09:28.600
sequenced, or, you know, let's say you want
to

00:09:28.600 --> 00:09:30.720
write a piece of code, critical section as
you

00:09:30.720 --> 00:09:32.880
call it. And you don't want the compiler to

00:09:32.880 --> 00:09:35.510
do any crazy things to improve performance.
So you

00:09:35.510 --> 00:09:38.140
say, I make it synchronized, and it has to

00:09:38.140 --> 00:09:40.399
behave in a, behave in a nice serial?? [00:09:40]

00:09:40.399 --> 00:09:44.730
manner. So that ?? manner is ensured by ordering.
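
NOTE
Editor's sketch, under stated assumptions: in Ruby the everyday counterpart of the
"synchronized" critical section mentioned here is Mutex#synchronize from the core library.
The transfer helper and the two-entry balances hash are hypothetical names for illustration.
Ruby has no written specification of the ordering a Mutex provides, so read this as common
practice, not as a documented guarantee.
```ruby
# Move an amount between two balances as one indivisible step.
def transfer(lock, balances, from, to, amount)
  # Threads that take the same lock run this block one at a time,
  # so none of them sees the state between the two updates.
  lock.synchronize do
    balances[from] -= amount
    balances[to] += amount
  end
end
lock = Mutex.new
balances = { a: 500, b: 500 }
workers = Array.new(50) do |i|
  Thread.new do
    # Even-numbered threads move money one way, odd-numbered the other.
    20.times { i.even? ? transfer(lock, balances, :a, :b, 1) : transfer(lock, balances, :b, :a, 1) }
  end
end
workers.each(&:join)
puts balances.values.sum
```
The invariant worth watching is the sum of the balances, which only holds if every reader and writer goes through the same lock.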

00:09:44.730 --> 00:09:47.940
Ordering is a really complex area. It talks
about

00:09:47.940 --> 00:09:50.850
causality, logical clocks and all that. I
won't go

00:09:50.850 --> 00:09:54.250
into those details. But I've been worrying
you with

00:09:54.250 --> 00:09:58.070
all this, you know, computer science basics
and all

00:09:58.070 --> 00:10:00.010
this. Why the hell am I talking about it

00:10:00.010 --> 00:10:02.430
in a Ruby conference? Ruby is single-threaded,
anyway. Why

00:10:02.430 --> 00:10:05.640
the hell should I care about it, right? OK.

00:10:05.640 --> 00:10:09.120
Do you really think languages like Ruby are
thread

00:10:09.120 --> 00:10:14.940
safe? Show of hands, anyone? So thread safety,
I'm

00:10:14.940 --> 00:10:18.600
talking only about Ruby - maybe Python. GIL
based

00:10:18.600 --> 00:10:25.600
languages. Are they thread safe? No? OK. In
fact

00:10:25.700 --> 00:10:30.649
they're not. Having single-threaded does not
mean it's thread-safe,

00:10:30.649 --> 00:10:33.670
right. Threads can switch context, and based
on how

00:10:33.670 --> 00:10:36.079
the language has been implemented and how
often the

00:10:36.079 --> 00:10:38.529
threads can switch context, and at what point
they

00:10:38.529 --> 00:10:44.010
can switch, things can go wrong, right. And
another

00:10:44.010 --> 00:10:46.040
pretty popular myth - I don't think many people

00:10:46.040 --> 00:10:49.389
believe it here, in this audience at least.
I

00:10:49.389 --> 00:10:52.440
don't have concurrency problems because I'm
running on single

00:10:52.440 --> 00:10:55.690
core. Not true. Again, threads can switch
context and

00:10:55.690 --> 00:10:58.630
run on the same core and still have dirty

00:10:58.630 --> 00:11:02.800
reads and things like that. So concurrency
is all

00:11:02.800 --> 00:11:05.550
about interleavings, right. Again, goes back
to reordering. I

00:11:05.550 --> 00:11:07.870
think I've been talking about this too often.
And

00:11:07.870 --> 00:11:11.950
let's not, again, worry with that. It's about
interleavings.
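
NOTE
Editor's sketch: a deterministic replay of the lost-update interleaving on a single core.
Two fibers stand in for two threads, and Fiber.yield marks the spot where a context switch
is assumed to happen. The names make_incrementer, first and second are illustrative only.
```ruby
count = 0
make_incrementer = lambda do
  Fiber.new do
    loaded = count        # load
    Fiber.yield           # simulated context switch right after the load
    count = loaded + 1    # increment and store, using the stale value
  end
end
first = make_incrementer.call
second = make_incrementer.call
first.resume    # first loads 0, then is switched out
second.resume   # second also loads 0
first.resume    # first stores 1
second.resume   # second stores 1 again, so one update is lost
puts count      # prints 1, although two increments ran
```
No second core is involved anywhere; the interleaving alone loses the update.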

00:11:11.950 --> 00:11:15.620
We'll leave it at that. So let's, before we

00:11:15.620 --> 00:11:19.240
understand more about, you know, the memory
model and

00:11:19.240 --> 00:11:21.019
what it has to do with Ruby, let's just

00:11:21.019 --> 00:11:25.060
understand a little bit about threading in
Ruby. So

00:11:25.060 --> 00:11:28.100
all of you know, green threads, as of 1.8,

00:11:28.100 --> 00:11:31.430
there was only one OS thread, which was being
being

00:11:31.430 --> 00:11:35.220
multiplexed with multiple Ruby threads, which
were being scheduled

00:11:35.220 --> 00:11:38.980
on it through global interpreter lock. 1.9
comes along,

00:11:38.980 --> 00:11:41.200
there is a one to one mapping between the

00:11:41.200 --> 00:11:43.660
Ruby thread and OS thread, but still the Ruby

00:11:43.660 --> 00:11:46.620
thread cannot use the OS thread unless it
has

00:11:46.620 --> 00:11:50.980
the global VM lock as it's called now. The

00:11:50.980 --> 00:11:55.750
GVL acquire. So does having a Global Interpreter
Lock

00:11:55.750 --> 00:12:00.709
make you thread safe? It depends. It does
make

00:12:00.709 --> 00:12:03.260
you thread safe in a way, but let's see.

00:12:03.260 --> 00:12:05.329
So how does GIL work? This is a very

00:12:05.329 --> 00:12:08.510
simplistic representation of how GIL works.
So you have

00:12:08.510 --> 00:12:12.120
two threads here. One is already holding the
GIL.

00:12:12.120 --> 00:12:15.519
So it's, it's working with the OS thread.
And

00:12:15.519 --> 00:12:18.820
now when there is another thread waiting on
it,

00:12:18.820 --> 00:12:21.190
waiting on the GIL to do its work, it

00:12:21.190 --> 00:12:22.510
sends a, it wakes up the timer thread. Timer

00:12:22.510 --> 00:12:26.790
thread is, again, another Ruby thread. The
timer thread

00:12:26.790 --> 00:12:30.410
now goes and interrupts the thread holding
the GIL,

00:12:30.410 --> 00:12:32.040
and if the GIL, if the thread holding the

00:12:32.040 --> 00:12:34.889
GIL is done with whatever it's doing - I'll

00:12:34.889 --> 00:12:36.550
get to it in a bit - it just

00:12:36.550 --> 00:12:40.320
releases the lock, and now thread two can
take

00:12:40.320 --> 00:12:42.829
over and do its thing. Well this is the

00:12:42.829 --> 00:12:48.329
basic working that at least I understood about
GIL.

00:12:48.329 --> 00:12:50.300
But there are details to this, right. It's
not

00:12:50.300 --> 00:12:57.300
as simple as what we saw. So, when you

00:12:57.779 --> 00:13:00.930
initialize a thread, or create a thread in
Ruby,

00:13:00.930 --> 00:13:03.100
you pass it a block of code. So how

00:13:03.100 --> 00:13:06.240
does that work? You take a block of code,

00:13:06.240 --> 00:13:07.769
you put it inside the thread. What the thread

00:13:07.769 --> 00:13:10.480
does is usually it acquires a GVL and a

00:13:10.480 --> 00:13:14.019
block?? [00:13:11]. It executes the block
of code. It

00:13:14.019 --> 00:13:17.089
releases the, returns and releases the lock,
right. So

00:13:17.089 --> 00:13:19.470
essentially this is how it works. So during
that

00:13:19.470 --> 00:13:21.899
period of execution of the block, no other
thread

00:13:21.899 --> 00:13:24.380
is allowed to work. So that makes you almost

00:13:24.380 --> 00:13:28.110
thread safe, right? But not really. If that's
how

00:13:28.110 --> 00:13:30.600
it's going to work, what if that thread is

00:13:30.600 --> 00:13:33.899
going to hog the GIL, and not allow any

00:13:33.899 --> 00:13:35.760
other thread to work? So there has to be

00:13:35.760 --> 00:13:38.430
some kind of lock fairness, right. So that's
where

00:13:38.430 --> 00:13:41.180
the timer thread comes in and interrupts it.
OK.
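
NOTE
Editor's sketch, not from the talk: why being interrupted between two steps matters even
when only one thread runs at a time. Two threads each check a balance and then withdraw.
The two queues exist only to force the unlucky schedule in which both threads pass the
check before either acts. racy_withdrawals is a hypothetical name.
```ruby
require 'thread' # Queue; already loaded on recent Rubies
def racy_withdrawals
  balance = 100
  checked = Queue.new   # signals that a thread has passed the check
  go = Queue.new        # holds threads back, as if interrupted after the check
  workers = Array.new(2) do
    Thread.new do
      if balance >= 100     # check
        checked << true
        go.pop              # parked here until the main thread releases us
        balance -= 100      # act, based on a check that is now stale
      end
    end
  end
  2.times { checked.pop }   # wait until both threads have passed the check
  2.times { go << true }    # let both carry on
  workers.each(&:join)
  balance
end
puts racy_withdrawals
```
Both withdrawals go through although the balance covered only one. Wrapping the check and the act in one Mutex#synchronize block would close the gap.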

00:13:41.180 --> 00:13:43.130
Does that mean the thread holding the GIL
immediately

00:13:43.130 --> 00:13:45.190
gives it up, and says here you go, you

00:13:45.190 --> 00:13:48.740
can start and work with it? Not really. Again

00:13:48.740 --> 00:13:51.389
the thread holding the GIL will only release
the

00:13:51.389 --> 00:13:53.920
GIL if it is at a context switch

00:13:53.920 --> 00:13:57.019
boundary. What that is, is fairly complicated.
I don't

00:13:57.019 --> 00:13:59.920
want to go into the details. I think people

00:13:59.920 --> 00:14:02.540
who are here know C a lot better than me,

00:14:02.540 --> 00:14:05.110
and are deep C divers really, they can probably

00:14:05.110 --> 00:14:08.670
tell you, you know, how, at what point the GIL

00:14:08.670 --> 00:14:11.040
can get released. If a C thread, a C

00:14:11.040 --> 00:14:13.269
code makes a call to Ruby code, can it

00:14:13.269 --> 00:14:15.449
or can it not release the GIL? All those

00:14:15.449 --> 00:14:18.399
things are there, right. So all these complexities
are

00:14:18.399 --> 00:14:21.360
really, really hard to deal with. I came across

00:14:21.360 --> 00:14:25.139
this blog by Jesse Storimer. It's excellent
and I

00:14:25.139 --> 00:14:27.440
strongly encourage you to go through the two-part
blog

00:14:27.440 --> 00:14:30.990
about, you know, nobody understands GIL. It's
really, really

00:14:30.990 --> 00:14:33.550
important, if you're trying to do any sort
of

00:14:33.550 --> 00:14:39.740
multi-threaded programming in Ruby. So do
you still think

00:14:39.740 --> 00:14:42.740
Ruby is thread safe because it's got GIL?
I'm

00:14:42.740 --> 00:14:48.740
talking about MRI, essentially. So the thing
is, we

00:14:48.740 --> 00:14:51.630
can't depend on GIL, right. GIL is not documented

00:14:51.630 --> 00:14:54.050
anywhere that this is exactly how it works.
This

00:14:54.050 --> 00:14:56.079
is when the timer thread wakes up. These are

00:14:56.079 --> 00:14:59.310
the time slices allotted to the thread acquiring
the

00:14:59.310 --> 00:15:03.190
GVL. There is no documentation around at what
point

00:15:03.190 --> 00:15:04.860
the GIL can be released, can it not be

00:15:04.860 --> 00:15:07.009
released, and things like that. There's no,
it's not

00:15:07.009 --> 00:15:10.259
predictable, and if you depend on it, what
could

00:15:10.259 --> 00:15:13.139
also happen is even within MRI, when you're
moving

00:15:13.139 --> 00:15:15.920
from version to version, if something changes
in GIL,

00:15:15.920 --> 00:15:22.220
your code will behave nondeterministically.
And what about language

00:15:22.220 --> 00:15:25.209
in Ruby implementations that don't even have
a GIL?

00:15:25.209 --> 00:15:27.009
So obviously that's the big problem, right.
If you

00:15:27.009 --> 00:15:29.610
write a gem or something which has to be

00:15:29.610 --> 00:15:32.079
multi-threaded, and if you're depending on
the GIL to

00:15:32.079 --> 00:15:34.769
do its thing to keep you safe, then obviously

00:15:34.769 --> 00:15:38.550
it cannot work on Rubinius and JRuby. Let
that

00:15:38.550 --> 00:15:41.310
alone, even, even if you give that up, even

00:15:41.310 --> 00:15:44.360
with MRI, it's not entirely correct to say
that

00:15:44.360 --> 00:15:47.490
you're thread safe, because there is a GIL
that

00:15:47.490 --> 00:15:52.660
will ensure that only one thread is running.
So

00:15:52.660 --> 00:15:54.610
what did I find out? Ruby really does not

00:15:54.610 --> 00:15:57.350
have a documented memory model. It's pretty
much similar

00:15:57.350 --> 00:16:00.480
to Python. It doesn't have a clearly documented
memory

00:16:00.480 --> 00:16:05.279
model. What is the implication of that? So
as

00:16:05.279 --> 00:16:07.540
I mentioned previously, a memory model is
like a

00:16:07.540 --> 00:16:10.769
specification. This is exactly how the system
has to

00:16:10.769 --> 00:16:14.600
provide a certain minimum guarantee to the
users of

00:16:14.600 --> 00:16:17.730
the language, right, regarding multi threaded
access to shared

00:16:17.730 --> 00:16:22.500
memory. Now, basically if I don't have a written

00:16:22.500 --> 00:16:23.720
down memory model, and I am going to write

00:16:23.720 --> 00:16:26.540
a Ruby implementation tomorrow, I have the
liberty

00:16:26.540 --> 00:16:29.509
to choose whatever memory model I want. So
the

00:16:29.509 --> 00:16:32.889
code, if you're writing against MRI, may not
essentially

00:16:32.889 --> 00:16:36.720
work right on my, you know, my implementation
of

00:16:36.720 --> 00:16:41.339
Ruby. That's the big implication, right. So
Ruby right

00:16:41.339 --> 00:16:45.769
now depends on underlying virtual machines.
Even after ER,

00:16:45.769 --> 00:16:47.699
you have bad code compilations, so even MRI
is

00:16:47.699 --> 00:16:50.839
almost like a VM. So that has no specification

00:16:50.839 --> 00:16:52.959
for a memory model, but it does have something,

00:16:52.959 --> 00:16:55.279
right, internally. If you have to go through
the

00:16:55.279 --> 00:16:58.130
C code and understand. It's not guaranteed
to remain

00:16:58.130 --> 00:17:01.079
the same from version to version, as I understand,

00:17:01.079 --> 00:17:05.069
right. And obviously JRuby and Rubinius, they
depend on

00:17:05.069 --> 00:17:08.260
JVM and LLVM respectively. And they all have
a

00:17:08.260 --> 00:17:11.819
clearly documented memory model. You could
have a read

00:17:11.819 --> 00:17:15.260
at it. And the only thing is, if Ruby

00:17:15.260 --> 00:17:18.079
had an implementation - sorry, a specification
for a

00:17:18.079 --> 00:17:22.220
memory model, it could be, you know, implemented
using

00:17:22.220 --> 00:17:27.599
the constructs available on JVM and LLVM.
But this

00:17:27.599 --> 00:17:29.450
is what we have. We don't have much to

00:17:29.450 --> 00:17:33.200
do. What do we do under the circumstances?
We

00:17:33.200 --> 00:17:36.640
have to engineer our code for thread safety.
We

00:17:36.640 --> 00:17:40.120
can't bask under the safety that, there is
a

00:17:40.120 --> 00:17:42.410
GIL and so it's going to help me keep

00:17:42.410 --> 00:17:44.530
my code thread safe. So even I can write

00:17:44.530 --> 00:17:47.690
multiple, you know, multi threaded code without
actually worrying

00:17:47.690 --> 00:17:51.290
about serious synchronization issues and things
like that. It's

00:17:51.290 --> 00:17:54.500
totally not the right thing to do. I think

00:17:54.500 --> 00:17:57.370
any which way, Ruby is a language I love,

00:17:57.370 --> 00:17:59.710
and I'm sure all of you love, so. And

00:17:59.710 --> 00:18:02.670
it's progressing by leaps and bounds, and
eventually we're

00:18:02.670 --> 00:18:04.840
going to write more and more complex systems
with

00:18:04.840 --> 00:18:09.390
Ruby. And who knows, we might have true parallelism

00:18:09.390 --> 00:18:13.980
very soon, right. So why, still, stay in the

00:18:13.980 --> 00:18:17.210
same mental block that we don't want to write,

00:18:17.210 --> 00:18:20.480
you know, thread safe code that's anyway single
threaded.

00:18:20.480 --> 00:18:22.150
We might as well get into the mindset of

00:18:22.150 --> 00:18:26.130
writing proper thread safe code, and try and
probably

00:18:26.130 --> 00:18:29.500
come up with a memory model, right. But I

00:18:29.500 --> 00:18:31.700
think for now we just start engineering code
for

00:18:31.700 --> 00:18:36.860
thread safety. Simple Mutex, I'm sure all
of you

00:18:36.860 --> 00:18:39.580
know, but it's really, really important for
even a

00:18:39.580 --> 00:18:44.090
stupid operation like a plus equals two. So
simple

00:18:44.090 --> 00:18:46.970
things which are noticed in Ruby code bases
and

00:18:46.970 --> 00:18:50.530
Rails code bases as well, like generally,
is, there

00:18:50.530 --> 00:18:52.920
is like a synchronized, you know, a section
of

00:18:52.920 --> 00:18:56.260
the code has lots of synchronization and everything.
It's

00:18:56.260 --> 00:18:58.530
really safe. But we leave an innocent accessor
lying

00:18:58.530 --> 00:19:00.760
around, and that causes a lot of, you know,

00:19:00.760 --> 00:19:04.360
pain, like debugging those issues. And general
issues like,

00:19:04.360 --> 00:19:08.020
you know, state mutations, inside methods
is really a

00:19:08.020 --> 00:19:10.270
bad idea. So if you're looking for issues
around

00:19:10.270 --> 00:19:12.200
multi threading, this might be a good place
to

00:19:12.200 --> 00:19:14.350
start. So I just listed a few of them

00:19:14.350 --> 00:19:16.310
here. I didn't want to make a really dense

00:19:16.310 --> 00:19:19.210
talk with all the details. You can always
catch

00:19:19.210 --> 00:19:20.940
me offline and I can tell you some of

00:19:20.940 --> 00:19:23.600
my experiences and probably even listen to
you and

00:19:23.600 --> 00:19:25.980
learn from you about some of the issues that

00:19:25.980 --> 00:19:28.820
we can solve by actually writing proper thread
safe

00:19:28.820 --> 00:19:33.080
code in Ruby. I came across a few gems

00:19:33.080 --> 00:19:35.090
which were really, really nice. Both of them
happen

00:19:35.090 --> 00:19:38.680
to be written by headius. The first one is

00:19:38.680 --> 00:19:40.730
atomic. Atomic is almost trying to give you
the

00:19:40.730 --> 00:19:44.970
similar constructs like the java.util.concurrent
package. It

00:19:44.970 --> 00:19:51.300
tries to, it's kind of compatible across MRI,
JRuby,

00:19:51.300 --> 00:19:53.800
and Rubinius, which is also a really nice
thing.
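
NOTE
Editor's sketch of the Mutex and "innocent accessor" advice from a few cues earlier, using
only the core Mutex class and not the atomic or thread_safe gems. SafeCounter is a
hypothetical class name. The point is that the reader takes the same lock as the writer
instead of being a bare attr_reader.
```ruby
class SafeCounter
  def initialize
    @lock = Mutex.new
    @value = 0
  end
  # The write path holds the lock for the whole load, increment, store.
  def increment
    @lock.synchronize { @value += 1 }
  end
  # The read path goes through the same lock as the writer.
  def value
    @lock.synchronize { @value }
  end
end
counter = SafeCounter.new
workers = Array.new(1_000) { Thread.new { 100.times { counter.increment } } }
workers.each(&:join)
puts counter.value
```
With every increment serialized by the lock, the 1,000 by 100 exercise from the start of the talk adds up to 100,000 on any implementation.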

00:19:53.800 --> 00:19:56.560
So you have atomic integers and atomic floats,
which

00:19:56.560 --> 00:19:59.900
do increments actually in an atomic way, which
is

00:19:59.900 --> 00:20:02.460
excellent. And then there is thread_safe library,
which also

00:20:02.460 --> 00:20:04.590
has a few thread safe data structures. I'm
trying

00:20:04.590 --> 00:20:06.570
to play around with these libraries right
now, but

00:20:06.570 --> 00:20:09.150
they may be a good, you know, starting point

00:20:09.150 --> 00:20:10.780
if you are trying to do higher level constructs

00:20:10.780 --> 00:20:15.620
for concurrency. And that's pretty much it.
I'm open

00:20:15.620 --> 00:20:21.820
to take questions. Thank you. And before anything
I

00:20:21.820 --> 00:20:23.420
really would like to thank you all, again
for

00:20:23.420 --> 00:20:27.140
being here for the talk, and thank the GCRC

00:20:27.140 --> 00:20:31.410
organizers, you know, they've done a great
job with

00:20:31.410 --> 00:20:38.410
this conference. A big shout out to them.

00:20:46.470 --> 00:20:46.510
V.O.: Any questions?

00:20:46.510 --> 00:20:46.540
H.K.: Yeah?

00:20:46.540 --> 00:20:46.560
QUESTION: Hey.

00:20:46.560 --> 00:20:46.590
H.K.: Hi.

00:20:46.590 --> 00:20:47.520
QUESTION: If, for example, if a Ruby code
is running

00:20:47.520 --> 00:20:51.530
in the JVM, in JRuby, how does, because none

00:20:51.530 --> 00:20:53.810
of the Ruby code is written in a thread

00:20:53.810 --> 00:20:56.580
safe way. How do, how does it internally manage

00:20:56.580 --> 00:20:58.750
- does it actually, yeah, yesterday Yogi talked
about

00:20:58.750 --> 00:21:00.940
the point that ActiveRecord is not actually
thread safe.

00:21:00.940 --> 00:21:03.520
Can you explain it in detail like in a

00:21:03.520 --> 00:21:04.460
theoretical way?

00:21:04.460 --> 00:21:06.560
H.K.: OK. What is thread safety in

00:21:06.560 --> 00:21:09.010
general, right? Thread safety is about how
the data

00:21:09.010 --> 00:21:13.280
is consistently maintained after multi-threaded
access to that shared

00:21:13.280 --> 00:21:17.130
data, right. So Ruby essentially has a GIL
because

00:21:17.130 --> 00:21:19.620
internal implementations are not thread safe,
right. That's why

00:21:19.620 --> 00:21:22.110
you want to have a GIL to protect you

00:21:22.110 --> 00:21:25.840
from those problems. But as far as JRuby is

00:21:25.840 --> 00:21:29.280
concerned, or Rubinius is concerned, the implementation
itself is

00:21:29.280 --> 00:21:31.930
not written in C. JRuby is written in Java

00:21:31.930 --> 00:21:34.400
again, I mean JRuby itself, and Rubinius is
written

00:21:34.400 --> 00:21:37.660
in Ruby. And some of these actual internal
constructs

00:21:37.660 --> 00:21:40.580
are thread safe when compared to MRI. I haven't

00:21:40.580 --> 00:21:43.190
actually taken a look in detail into the code

00:21:43.190 --> 00:21:47.520
of these code bases, but if they are implemented

00:21:47.520 --> 00:21:50.000
properly, you can be thread safe - internally,
at

00:21:50.000 --> 00:21:53.340
least - so, which means, the base code of

00:21:53.340 --> 00:21:55.720
JRuby itself might be thread safe. It's only
not

00:21:55.720 --> 00:21:58.200
thread safe because the gems on top of it,

00:21:58.200 --> 00:22:01.050
which are trying to run. They may have, like,

00:22:01.050 --> 00:22:04.890
thread safety issues, right. Does that answer
your question,

00:22:04.890 --> 00:22:05.840
like, or- ?

00:22:05.840 --> 00:22:08.200
QUESTION: About thread safety?? [00:22:09].

00:22:08.200 --> 00:22:11.720
H.K.: Sure, sure. So those gems will not work.
That's

00:22:11.720 --> 00:22:13.840
the point. Like what I want to convey here,

00:22:13.840 --> 00:22:16.910
is whatever gems we are authoring, and whatever
code

00:22:16.910 --> 00:22:18.780
we are writing, we might get it - it's

00:22:18.780 --> 00:22:20.240
a good idea to get into the habit of

00:22:20.240 --> 00:22:22.860
writing thread safe code, so that we can actually

00:22:22.860 --> 00:22:25.460
encourage a truly parallel Ruby, right. We
don't, we

00:22:25.460 --> 00:22:27.530
don't have to stay in the same paradigm of

00:22:27.530 --> 00:22:31.520
OK we have to be single threaded.

00:22:31.520 --> 00:22:37.010
QUESTION: So Mutex based thread management
is one way.

00:22:37.010 --> 00:22:40.060
There's also like actors and futures and things
like that.

00:22:40.060 --> 00:22:41.890
And there's a gem called Celluloid-

00:22:41.890 --> 00:22:42.680
H.K.: Yup.

00:22:42.680 --> 00:22:45.040
QUESTION: That, combined with something called
Hamster,

00:22:45.040 --> 00:22:46.390
which makes everything immutable-

00:22:46.390 --> 00:22:46.840
H.K.: Yup.

00:22:46.840 --> 00:22:47.960
QUESTION: Is another way to do it.

00:22:47.960 --> 00:22:48.160
H.K.: Yup.

00:22:48.160 --> 00:22:49.070
QUESTION: Have you done it or like,

00:22:49.070 --> 00:22:49.950
what's your experience with that?

00:22:49.950 --> 00:22:53.130
H.K.: Yeah, I have tried out actors, with
Revactor,

00:22:53.130 --> 00:22:54.330
and lockless concurrency is

00:22:54.330 --> 00:22:56.830
something I definitely agree is a good idea.
But

00:22:56.830 --> 00:23:01.440
I'm specifically talking about, you know,
lock-based concurrency, like,

00:23:01.440 --> 00:23:04.530
Mutex-based concurrency. This area is also
important because it's

00:23:04.530 --> 00:23:07.960
not like shared mutable state is bad. It is,

00:23:07.960 --> 00:23:10.770
it is actually applicable in certain scenarios.
When we

00:23:10.770 --> 00:23:13.360
are working in this particular paradigm, we
still need

00:23:13.360 --> 00:23:19.170
the safety of a memory model. Any other questions?
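
NOTE
A minimal sketch of the lock-based (Mutex) concurrency mentioned in the answer above.
The Counter class and the thread counts are hypothetical, chosen only for illustration;
they do not come from the talk. It shows shared mutable state guarded by a lock.
```ruby
# Hypothetical example: Counter is an illustrative name, not from the talk.
class Counter
  attr_reader :value
  def initialize
    @value = 0
    @lock = Mutex.new
  end
  # The read-modify-write happens while holding the lock, so no update is lost.
  def increment
    @lock.synchronize { @value += 1 }
  end
end
counter = Counter.new
threads = 8.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
puts counter.value
```
With the lock in place the final value is 8 threads times 1,000 increments, on any Ruby implementation.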

00:23:19.170 --> 00:23:26.170
QUESTION: Thanks for the talk Hari. It was
really

00:23:28.200 --> 00:23:28.650
good.

00:23:28.650 --> 00:23:29.550
H.K.: Thanks.

00:23:29.550 --> 00:23:31.140
QUESTION: Is there a way that

00:23:31.140 --> 00:23:35.050
you would recommend to test if you have done

00:23:35.050 --> 00:23:37.850
threading properly or not? I mean, I know,
bugs

00:23:37.850 --> 00:23:38.420
that come out-

00:23:38.420 --> 00:23:38.610
H.K.: Right.

00:23:38.610 --> 00:23:38.980
QUESTION: Like I have

00:23:38.980 --> 00:23:41.680
written bugs that come out of badly written,
you

00:23:41.680 --> 00:23:43.750
know, not thread safe code, as.

00:23:43.750 --> 00:23:44.510
H.K.: So-

00:23:44.510 --> 00:23:47.190
QUESTION: Like, ?? [00:23:46] so, you catch
them.

00:23:47.190 --> 00:23:51.510
H.K.: At least, my opinion, and a lot of people
have

00:23:51.510 --> 00:23:53.960
done research in this area, their opinion
also is

00:23:53.960 --> 00:23:57.600
that it's not possible to write tests against

00:23:57.600 --> 00:24:00.480
multi-threaded code where there is shared data.
Because it's

00:24:00.480 --> 00:24:04.230
nondeterministic and nonrepeatable. The kind
of results you get,

00:24:04.230 --> 00:24:06.920
you can only test it against a heuristic.
For

00:24:06.920 --> 00:24:09.430
example, if you have a deterministic use case
at

00:24:09.430 --> 00:24:11.620
the top level, you can probably test it against

00:24:11.620 --> 00:24:14.490
that. But exact test cases can never be written

00:24:14.490 --> 00:24:16.070
for this.
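
NOTE
One way to read "test it against a heuristic" is to repeat a concurrent scenario many times
and check a deterministic top-level invariant. The sketch below assumes a hypothetical helper,
stress_check, and made-up balances; a passing run raises confidence but proves nothing.
```ruby
# Hypothetical helper: repeats a concurrent scenario and checks an invariant.
# Passing runs only raise confidence; they do not prove thread safety.
def stress_check(runs: 20, workers: 4)
  runs.times do |run|
    balances = { a: 1_000, b: 1_000 }
    lock = Mutex.new
    threads = workers.times.map do
      Thread.new do
        500.times do
          # Move one unit from a to b while holding the lock.
          lock.synchronize do
            balances[:a] -= 1
            balances[:b] += 1
          end
        end
      end
    end
    threads.each(&:join)
    # Invariant: transfers never create or destroy money.
    total = balances[:a] + balances[:b]
    raise "invariant broken on run #{run}: total=#{total}" unless total == 2_000
  end
  true
end
puts stress_check
```
The check targets the invariant (the total), not a particular interleaving, which is what makes it repeatable.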

00:24:16.070 --> 00:24:19.240
V.O.: Any more questions?

00:24:19.240 --> 00:24:26.240
H.K.: Cool. All right. Thank you so much.