HARI KRISHNAN: So, thank you very much for being here on a Saturday evening, this late. My talk got pushed to the last slot, but I appreciate you being here. My name's Hari. I work at MavenHive. This is a talk about the Ruby memory model. Before I start, how many of you have heard about memory models and know what they are? Show of hands, please. OK. Let's see where this talk goes. So why did I come up with this talk topic? I started my career with Java, and I spent many years with Java, and Java has a very clearly documented memory model. And it kind of gets to you, because even with all that, you don't feel safe enough doing multi-threaded programming at all. With Ruby, we've always been talking about doing multi-process parallelism rather than multi-threaded parallelism, even though the language actually supports multi-threading semantics. Of course we know it's effectively single-threaded and all that, but I just got curious: what is the real memory model behind Ruby? I wanted to figure that out. So this talk is all about my learnings as I went through various literature, trying to distill the gist of the whole thing and cram it into some twenty minutes, so that I can give you a useful session from which you can do further digging on this. When I talk to my friends about memory models, the first thing that comes to their mind is probably this - heap, non-heap, stack, whatever. I'm not gonna talk about that. I'm not gonna talk about this either. It's not about optimizing your memory, or chasing memory leaks, or garbage collection. This talk is not about that either. So what the hell am I gonna talk about? First, a quick exercise. Let's start with this and see where it goes. Simple code. Not much to process late in the day.
There's a shared variable called 'n', and there are a thousand threads over it, and each of those threads wants to increment that shared variable a hundred times. What is the expected output? I'm not gonna quiz you, I'll just give it away. It's 100,000. It's fairly straightforward code; I'm sure all of you have written this, and it's no big deal. So what's the real output? MRI is very faithful - it gives you what you expected: 100,000. What happens next? I'm running it on Rubinius. This is what you see. And it's going to be a different number every time you run it. And that's JRuby. It also gives you a lower number. Some of you may be guessing already, and you probably know why it gives you a lower number. So why all this basic code and a stupid counter here? I just wanted a really basic example to explain the concept that an increment is not a single instruction. The reason I'm talking about this is, I love Ruby because the syntax is so terse, so simple, so readable. But that does not mean every single instruction on the screen is going to be executed in one step. So at least, to my junior self, this is the first advice I would give when I started multi-threaded programming: an increment is at least three steps - load, increment, store - even for a really simple piece of code like n += 1. So this is what we really want to happen: you have a count, you load it, you increment it, you store it. Then the next thread comes along. It loads it, increments it, stores it. You get the result you expect. But we live in a world where threads don't want to be our friends. They do this: one thread comes along, reads it, increments it. The other thread also reads the older value and increments it. And both of them go and save the same value. This is a classic case of a lost update.
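The counter from the slides can be sketched roughly like this (the variable names are my guesses, not the actual slide code):

```ruby
# A sketch of the slide's counter: 1000 threads each increment a shared
# variable 100 times. The `n += 1` is NOT atomic - it is a load, an add,
# and a store - so threads can lose updates on JRuby/Rubinius, and the GIL
# only happens to save you on MRI.
n = 0

threads = 1000.times.map do
  Thread.new do
    100.times do
      n += 1  # load n, add 1, store n - a switch can land between any two steps
    end
  end
end

threads.each(&:join)

puts n  # 100_000 only if no update was lost; JRuby/Rubinius typically print less
```

On which interpreters, and how often, the final count falls short is exactly the nondeterminism the talk is about.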
I'm sure most of you have seen it in the database world. But this happens a lot in the multi-threading world too. But why did it not happen with MRI? Why did you see the right result there? I'm sure a lot of you know, but let's park that question and move a little ahead. So, as you observed earlier, there is a lot of reordering happening in instructions - the threads were context-switching, and statements were being reordered. Where does this reordering happen? Reordering can happen at multiple levels. Start from the top. You have the compiler, which can do simple optimizations ?? [00:05:20] - even those can change the order of the statements in your code. Next, when the code gets translated to machine-level instructions and goes to the cores, your CPU cores are at liberty, again, to reorder them for performance. And next comes the memory system. The memory system is the combined global memory, which all the CPUs can read, plus their individual caches. Why do CPUs have caches? Main memory is slow, so they want to preload values and keep them in the cache, again to improve performance. So even the memory system can conspire against you and reorder the loads and stores on their way to memory. And that can cause reordering too. So this is really, really crazy. I'm a simple programmer who works at the programming-language level. I don't really understand the details of the hardware and things like that. So how do I keep myself abstracted from all this really crazy stuff? That's essentially what a memory model is for. So what is a memory model? "A memory model describes the interactions of threads through memory and their shared use of data." That's straight out of Wikipedia.
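The load/increment/store split described above can be seen directly in MRI's own bytecode. `RubyVM::InstructionSequence` is an MRI-only API, and the exact instruction names vary a little between versions, but roughly you get a getlocal (load), an opt_plus (add), and a setlocal (store) for one `n += 1`:

```ruby
# MRI-only: dump the YARV bytecode behind a bare increment. The single line
# `n += 1` compiles to several separate instructions, and a thread switch
# can happen between them - that is the whole lost-update problem.
code = <<~RUBY
  n = 0
  n += 1
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
# Look for the getlocal / putobject / opt_plus / setlocal sequence in the dump.
```

This is only illustrative of MRI; JRuby and Rubinius compile the same Ruby down to entirely different instruction streams.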
If you just read that, either you're gonna think it's really simple, maybe even obvious, or you might not understand it at all. I was in the second category. So what does it all mean? When there are so many complications with reordering and the reads and writes of memory, as a programmer you need certain guarantees from the programming language, and from the virtual machine you're working on top of, that say: this is how multi-threaded access to shared memory is going to work. These are the basic guarantees, and these are the simple rules of how the system works, so you can reliably write code against that. So, in effect, a memory model is just a specification. Any Java programmers here in the house? Great. How many of you know about JSR 133? The memory model, double-checked locking - OK, some people. The singleton issue? OK, some more hands. Java was the first programming language to come up with a concept called a memory model. Because the first promise was "write once, run anywhere" - it had to be predictable across platforms, across implementations, and so on. So there had to be a JSR which specified the memory model that you can code against, so that your multi-threaded code works predictably and deterministically across platforms and across virtual machines. Right? Essentially, that's where this whole thing started for me. I had gone through the Java memory model and was really happy that someone had taken the pain to write it down in clear terms, so that you don't have to worry as much about multi-threading. Hold on, sorry. Sorry about that. Cool. So. A memory model gives you rules at three broad levels: atomicity, visibility and ordering. Atomicity is as simple as, you know, variable assignment: is a variable assignment an indivisible unit of work, or not?
There are rules around that, and it also covers rules around whether you can assign hashes and arrays indivisibly, and things like that. These rules can change from one language version to another. Next is visibility. In that example we talked about, we saw two threads trying to read the same value. Essentially they are spying on each other. And it was not clear at what point the data had to become visible to each of those threads. Visibility is about exactly that, and it is ensured through memory barriers and through ordering, which is the next thing. Ordering is about how the loads and stores are sequenced. Say you want to write a piece of code - a critical section, as we call it - and you don't want the compiler to do any crazy things to improve performance. So you say: I make it synchronized, and it has to behave in a nice serial manner. That serial manner is ensured by ordering. Ordering is a really complex area - it involves causality, logical clocks and all that. I won't go into those details. But I've been boring you with all these computer science basics - why the hell am I talking about this at a Ruby conference? Ruby is single-threaded anyway; why should I care, right? OK. Do you really think languages like Ruby are thread safe? Show of hands, anyone? By thread safety, I'm talking only about Ruby - maybe Python. GIL-based languages. Are they thread safe? No? In fact, they're not. Having a single-threaded interpreter does not mean it's thread-safe. Threads can switch context, and depending on how the language has been implemented - how often the threads can switch context, and at what points they can switch - things can go wrong. And another pretty popular myth - I don't think many people believe it here, in this audience at least: "I don't have concurrency problems because I'm running on a single core." Not true.
Again, threads can switch context and run on the same core and still have dirty reads and things like that. Concurrency is all about interleavings - again, it goes back to reordering. I think I've been repeating this too often, so let's leave it at that: it's about interleavings. Before we understand more about the memory model and what it has to do with Ruby, let's understand a little bit about threading in Ruby. As all of you know, as of 1.8 there were green threads: there was only one OS thread, which was being multiplexed among multiple Ruby threads, which were scheduled onto it through the Global Interpreter Lock. Then 1.9 comes along, and there is a one-to-one mapping between a Ruby thread and an OS thread, but the Ruby thread still cannot use the OS thread unless it has acquired the Global VM Lock, as it's called now - the GVL. So does having a Global Interpreter Lock make you thread safe? It depends. It does make you thread safe in a way, but let's see. How does the GIL work? This is a very simplistic representation. You have two threads here. One is already holding the GIL, so it's working with the OS thread. Now, when there is another thread waiting on the GIL to do its work, it wakes up the timer thread. The timer thread is, again, another thread. The timer thread now goes and interrupts the thread holding the GIL, and if the thread holding the GIL is done with whatever it's doing - I'll get to that in a bit - it releases the lock, and now thread two can take over and do its thing. Well, this is the basic working of the GIL, at least as I understood it. But there are details to this. It's not as simple as what we saw. When you create a thread in Ruby, you pass it a block of code. So how does that work? You take a block of code, you put it inside the thread.
What the thread does is: it acquires the GVL, it executes the block of code, and on return it releases the lock. Essentially, this is how it works. So during the execution of the block, no other thread is allowed to work. That makes you almost thread safe, right? But not really. If that's how it's going to work, what if that thread hogs the GIL and doesn't allow any other thread to work? There has to be some kind of lock fairness. That's where the timer thread comes in and interrupts it. OK, does that mean the thread holding the GIL immediately gives it up and says, here you go, you can start working? Not really. The thread holding the GIL will only release it if it is at a context-switch boundary. What exactly that is, is fairly complicated, and I don't want to go into the details. People here who know C a lot better than me - the deep C divers, really - can probably tell you at what points the GIL can get released. If C code makes a call into Ruby code, can it or can it not release the GIL? All those questions are there. So all these complexities are really, really hard to deal with. I came across a blog by Jesse Storimer. It's excellent, and I strongly encourage you to go through the two-part blog, "Nobody understands the GIL". It's really, really important if you're trying to do any sort of multi-threaded programming in Ruby. So do you still think Ruby is thread safe because it's got a GIL? I'm talking about MRI, essentially. The thing is, we can't depend on the GIL. The GIL is not documented anywhere: this is exactly how it works, this is when the timer thread wakes up, these are the time slices allotted to the thread acquiring the GVL. There is no documentation about at what points the GIL can or cannot be released, and things like that.
It's not predictable, and if you depend on it, what could also happen is that even within MRI, when you're moving from version to version, if something changes in the GIL, your code will behave nondeterministically. And what about Ruby implementations that don't even have a GIL? That's the big problem, right? If you write a gem or something that has to be multi-threaded, and you're depending on the GIL to keep you safe, then obviously it cannot work on Rubinius and JRuby. And leave that alone - even with MRI, it's not entirely correct to say that you're thread safe just because there is a GIL ensuring that only one thread is running at a time. So what did I find out? Ruby really does not have a documented memory model. It's pretty much similar to Python, which doesn't have a clearly documented memory model either. What is the implication of that? As I mentioned earlier, a memory model is like a specification: this is exactly how the system provides a certain minimum guarantee to the users of the language regarding multi-threaded access to shared memory. Now, basically, if there is no written-down memory model and I am going to write a Ruby implementation, I have the liberty to choose whatever memory model I want. So code you wrote against MRI may not work right on my implementation of Ruby. That's the big implication. So Ruby right now depends on the underlying virtual machines. Even MRI, with YARV, compiles to bytecode, so even MRI is almost like a VM. It has no specification for a memory model, but it does have something internally - you'd have to go through the C code to understand it. And it's not guaranteed to remain the same from version to version, as I understand it. And obviously JRuby and Rubinius depend on the JVM and LLVM respectively, and those have clearly documented memory models.
You could have a read of them. The thing is, if Ruby had a specification for a memory model, it could be implemented using the constructs available on the JVM and LLVM. But this is what we have, and there isn't much to go on. What do we do under the circumstances? We have to engineer our code for thread safety. We can't bask in the safety of "there is a GIL, so it's going to keep my code thread safe, so I can write multi-threaded code without actually worrying about serious synchronization issues." That's totally not the right thing to do. Ruby is a language I love, and I'm sure all of you love, and it's progressing by leaps and bounds, and eventually we're going to write more and more complex systems with Ruby. And who knows, we might have true parallelism very soon. So why stay in the same mental block of not wanting to write thread safe code because it's anyway single-threaded? We might as well get into the mindset of writing proper thread safe code, and try to eventually come up with a memory model. But I think for now we just start engineering code for thread safety. A simple Mutex - I'm sure all of you know it, but it's really, really important even for a trivial operation like an increment. And there are simple things you notice in Ruby code bases, and Rails code bases as well: a section of the code has lots of synchronization and everything - it's really safe - but we leave an innocent accessor lying around, and that causes a lot of pain debugging those issues. And in general, mutating state inside methods is really a bad idea. So if you're looking for issues around multi-threading, these might be good places to start. I just listed a few of them here. I didn't want to make a really dense talk with all the details.
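The Mutex fix for the counter from earlier looks like this - a minimal sketch, with names of my own choosing:

```ruby
# The same counter, made thread safe with a Mutex: each load-increment-store
# now happens inside the lock, so no update can be lost - on any
# implementation, GIL or no GIL.
n = 0
lock = Mutex.new

threads = 1000.times.map do
  Thread.new do
    100.times do
      lock.synchronize { n += 1 }  # the three steps are now one critical section
    end
  end
end

threads.each(&:join)
puts n  # => 100000, deterministically, on MRI, JRuby, and Rubinius alike
```

Mutex#synchronize also acts as a memory barrier on the non-GIL implementations, which is what makes the stored value visible to the next thread that takes the lock.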
You can always catch me offline and I can tell you some of my experiences, and probably even listen to you and learn from you about some of the issues that we can solve by actually writing proper thread safe code in Ruby. I came across a few gems which were really, really nice. Both of them happen to be written by headius. The first one is atomic. Atomic is trying to give you constructs similar to Java's java.util.concurrent package. It's compatible across MRI, JRuby, and Rubinius, which is also a really nice thing. So you have atomic integers and atomic floats, which do increments in an actually atomic way, which is excellent. And then there is the thread_safe library, which has a few thread safe data structures. I'm playing around with these libraries right now, but they may be a good starting point if you are trying to build higher-level constructs for concurrency. And that's pretty much it. I'm open to questions. Thank you. And before anything, I really would like to thank you all again for being here for the talk, and to thank the GCRC organizers - they've done a great job with this conference. A big shout-out to them. V.O.: Any questions? H.K.: Yeah? QUESTION: Hey. H.K.: Hi. QUESTION: If, for example, Ruby code is running on the JVM, in JRuby - because none of the Ruby code is written in a thread safe way, how does it internally manage? Yesterday, Yogi talked about the point that ActiveRecord is not actually thread safe. Can you explain it in detail, like, in a theoretical way? H.K.: OK. What is thread safety in general? Thread safety is about how the data is kept consistent under multi-threaded access to that shared data. So Ruby essentially has a GIL because the internal implementation is not thread safe. That's why you want a GIL to protect you from those problems.
But as far as JRuby or Rubinius is concerned, the implementation itself is not written in C. JRuby is written in Java, and Rubinius is largely written in Ruby itself. And some of those internal constructs are thread safe compared to MRI's. I haven't actually taken a detailed look into those code bases, but if they are implemented properly, they can be thread safe - internally, at least - which means the base code of JRuby itself might be thread safe. It's only not thread safe because of the gems trying to run on top of it; they may have thread safety issues. Does that answer your question, or-? QUESTION: About thread safety?? [00:22:09]. H.K.: Sure, sure. So those gems will not work - that's the point. What I want to convey here is: whatever gems we are authoring, and whatever code we are writing, it's a good idea to get into the habit of writing thread safe code, so that we can actually encourage a truly parallel Ruby. We don't have to stay in the same paradigm of "OK, we have to be single-threaded." QUESTION: So Mutex-based thread management is one way. There are also actors and futures and things like that. And there's a gem called Celluloid- H.K.: Yup. QUESTION: That, combined with something called Hamster, which makes everything immutable- H.K.: Yup. QUESTION: Is another way to do it. H.K.: Yup. QUESTION: Have you done it, or, like, what's your experience with that? H.K.: Yeah, I have tried out actors with Revactor, and lockless concurrency is something I definitely agree is a good idea. But I'm specifically talking about lock-based concurrency, like Mutex-based concurrency. This area is also important, because it's not like shared mutable state is always bad - it is actually applicable in certain scenarios. And when we are working in this particular paradigm, we still need the safety of a memory model. Any other questions?
QUESTION: Thanks for the talk, Hari. It was really good. H.K.: Thanks. QUESTION: Is there a way that you would recommend to test whether you have done threading properly or not? I mean, I know bugs that come out- H.K.: Right. QUESTION: Like, I have written bugs that come out of badly written, not thread safe code. H.K.: So- QUESTION: Like, ?? [00:23:46] so, you catch them. H.K.: At least in my opinion - and a lot of people who have done research in this area share this opinion - it's not possible to write exact tests against multi-threaded code where there is shared data, because it's nondeterministic and nonrepeatable. The kind of results you get, you can only test against a heuristic. For example, if you have a deterministic use case at the top level, you can probably test against that. But exact test cases can never be written for this. V.O.: Any more questions? H.K.: Cool. All right. Thank you so much.
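The heuristic testing described in that last answer - asserting a top-level invariant over many repeated runs, rather than trying to pin down one interleaving - might be sketched like this. SafeCounter is my own illustrative class, not something from the talk:

```ruby
# A heuristic stress test: we cannot enumerate thread interleavings, but we
# can repeat a multi-threaded scenario many times and assert an invariant
# that must hold on every trial, whatever the scheduling was.
class SafeCounter
  def initialize
    @n = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @n += 1 }
  end

  def value
    @lock.synchronize { @n }
  end
end

# Each trial produces a different interleaving; the invariant must survive all.
10.times do
  counter = SafeCounter.new
  threads = 50.times.map { Thread.new { 200.times { counter.increment } } }
  threads.each(&:join)
  raise "invariant violated: #{counter.value}" unless counter.value == 50 * 200
end

puts "all trials passed"
```

A passing run doesn't prove the code is thread safe - it only fails to find a counterexample, which is exactly the limitation the answer describes.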