HARI KRISHNAN: So, thank you very much for being here on a Saturday evening, this late. My talk got pushed to the last, but I appreciate you being here, first. My name's Hari. I work at MavenHive. So this is a talk about the Ruby memory model. Before I start, how many of you have heard about memory models and know what they are? Show of hands, please. OK. Let's see where this talk goes.

So why did I come up with this talk topic? I started my career with Java, and I spent many years with Java, and Java has a very clearly documented memory model. And it kind of gets to you, because even with all that, you don't feel safe enough doing multi-threaded programming at all. With Ruby, we've always been talking about, you know, doing multi-process parallelism rather than multi-threaded parallelism, even though the language actually supports multi-threading semantics. Of course we know it's called single-threaded and all that, but I just got curious: what is the real memory model behind Ruby? I just wanted to figure that out. So this talk is all about my learnings as I went through, like, various literature, trying to get a gist of the whole thing and cram it into some twenty minutes, so that I could probably give you a useful session from which you can do further digging on this, right.

So when I talk to my friends about memory models, the first thing that comes to their mind is probably this - heap, non-heap, stack, whatever. I'm not gonna talk about that. I'm not gonna talk about this either. It's not about, you know, optimizing your memory, or chasing memory leaks, or garbage collection. This talk is not about that either. So what the hell am I gonna talk about?

First, a quick exercise. So let's start with this and see where it goes. Simple code, not much to process late in the day. There's a shared variable called 'n', and there are a thousand threads over it, and each of those threads wants to increment that shared variable a hundred times, right.
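The slide code isn't reproduced in the transcript, but a minimal sketch of the exercise being described might look like this (the variable name and structure are assumptions, not the speaker's exact code):

```ruby
# A thousand threads, each incrementing a shared counter a hundred times.
n = 0

threads = 1000.times.map do
  Thread.new do
    100.times { n += 1 }   # n += 1 is not a single, indivisible operation
  end
end

threads.each(&:join)
puts n   # naively expected: 100_000
```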
And what is the expected output? I'm not gonna question you, I'm just gonna give it away. It's 100,000. It's fairly straightforward code. I'm sure all of you have done this, and it's no big deal. So what's the real output? MRI is very faithful, it gives you what you expected: 100,000, right. So what happens next? I'm running it on Rubinius. This is what you see. And it's always going to be a different number every time you run it. And that's JRuby. It gives you a lower number. Some of you may be guessing already, and you probably know why it gives you a lower number.

So why all this basic code and some stupid counter over here, right? I just wanted a really basic example to explain the concept that an increment is not a single instruction, right. The reason why I'm talking about this is, I love Ruby because the syntax is so terse, so simple, so readable, right. But it does not mean every single instruction on the screen is going to be executed straight away, right. So at least, to my junior self, this is the first advice I would give when I started, you know, multi-threaded programming. An increment is at least three steps: load, increment, store, right. That's true even for a really simple piece of code like, you know, a `+=`, right.

So this is what we really want to happen. You have a count, you load it, you increment it, you store it. Then the next thread comes along. It loads it, increments it, stores it. You have the next result, which is what you expect, right. But we live in a world where threads don't want to be our friend. They do this. One guy comes along, reads it, increments it. The other guy also reads the older value, increments it. And both of them go and save the same value, right. So this is a classic case of a lost update. I'm sure most of you have seen it in the database world. But this pretty much happens a lot in the multi-threading world, right. But why did it not happen with MRI? And why did you see the right result?
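As a hedged aside that isn't on the slides: on MRI you can see this decomposition for yourself by disassembling the increment (the exact instruction names vary by Ruby version):

```ruby
# YARV bytecode for a simple increment on MRI 1.9+: roughly "get the local,
# push 1, add, store the local" as separate instructions, any of which a
# thread switch can fall between.
puts RubyVM::InstructionSequence.compile("n = 0; n += 1").disasm
```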
That, I'm sure a lot of you know, but let's park that question and just move a little ahead. So, as you observed earlier, there's a lot of reordering happening in the instructions, right. Like, the threads were context-switching, and they were reordering statements. So where does this reordering happen? Reordering can happen at multiple levels.

Start from the top. You have the compiler, which can do simple optimizations like look closer?? [00:05:20]. Even that can change the order of the statements in your code, right. Next, when the code gets translated to, you know, machine-level language and goes to the cores, your CPU cores are at liberty, again, to reorder instructions for performance. And next comes the memory system, right. The memory system is the combined global memory, which all the CPUs can read, plus their individual caches. Why do CPUs have caches? Memory is slow, so they want to load values and keep them in the cache to, again, improve performance. So even the memory system can conspire against you and reorder the loads and stores, and that can cause reordering, right.

So this is really, really crazy. Like, I'm a very stupid programmer who works at the programming-language level. I don't really understand the structure of the hardware and things like that. So how do I keep myself abstracted from all this, you know, really crazy stuff? That's essentially a memory model. So what is a memory model? A memory model describes the interactions of threads through memory and their shared use of data. This is straight out of Wikipedia, right. If you just read it at first, either you're gonna think it's really simple, probably even stupid, or otherwise you might not understand it at all. I was in the second category. So what does this all mean?
So when there are so many complications with the reordering, the reads and writes of memory and things like that, as a programmer you need certain guarantees from the programming language, and the virtual machine you're working on top of, to say: this is how multi-threaded access to shared memory is going to work. These are the basic guarantees and these are the simple rules of how the system works, so you can reliably write code against that, right. So in effect, a memory model is just a specification.

Any Java programmers here, in the house? Great. So how many of you know about JSR 133? The memory model, double-checked locking? OK, some people. The singleton issue? OK, some more hands. So Java was the first programming language which came up with a concept called a memory model, right. Because the first thing is, write once, run anywhere. It had to be predictable across platforms, across implementations, and things like that. So there had to be a JSR which specified what the memory model is that you can code against, so that your multi-threaded code works predictably and deterministically across platforms and across virtual machines. Right? So essentially that's where my, you know, whole thing started. I had gone through the Java memory model, and was pretty much really happy that someone had taken the pain to write it down in clear terms so that you don't have to worry about multi-threading. Hold on, sorry. Sorry about that. Cool.

So, a memory model gives you rules at three broad levels: atomicity, visibility and ordering. Atomicity is as simple as, you know, variable assignment. Is a variable assignment an indivisible unit of work, or not? The rules around that, and it also talks about rules around whether you can assign hashes and arrays indivisibly and things like that. These rules can change from version to version, and things like that. Next is visibility. So in that example which we talked about, we saw two threads trying to read the same value. Essentially they are spying on each other.
And it was not clear at what point the data had to become visible to each of those threads. So essentially visibility is about that. And that is ensured through memory barriers and ordering, which is the next thing. Ordering is about how the loads and stores are sequenced. Or, you know, let's say you want to write a piece of code, a critical section as you call it, and you don't want the compiler to do any crazy things to improve performance. So you say, I make it synchronized, and it has to behave in a nice serial?? [00:09:40] manner. That manner is ensured by ordering. Ordering is a really complex area. It talks about causality, logical clocks and all that. I won't go into those details.

But I've been bothering you with all these, you know, computer science basics. Why the hell am I talking about this at a Ruby conference? Ruby is single-threaded anyway. Why the hell should I care about it, right? OK. Do you really think languages like Ruby are thread safe? Show of hands, anyone? So thread safety - I'm talking only about Ruby, maybe Python, GIL-based languages. Are they thread safe? No? OK. In fact they're not. Being single-threaded does not mean it's thread safe, right. Threads can switch context, and based on how the language has been implemented, how often the threads can switch context, and at what point they can switch, things can go wrong, right.

And another pretty popular myth - I don't think many people believe it here, in this audience at least: I don't have concurrency problems because I'm running on a single core. Not true. Again, threads can switch context and run on the same core and still have dirty reads and things like that. So concurrency is all about interleavings, right. Again, it goes back to reordering. I think I've been talking about this too often, so let's not worry about that again. It's about interleavings. We'll leave it at that. So before we understand more about, you know, the memory model and what it has to do with Ruby, let's just understand a little bit about threading in Ruby.
So, as all of you know, green threads: as of 1.8, there was only one OS thread, which was being multiplexed across multiple Ruby threads that were scheduled onto it through the Global Interpreter Lock. 1.9 comes along, and there is a one-to-one mapping between a Ruby thread and an OS thread, but the Ruby thread still cannot use the OS thread unless it has acquired the Global VM Lock, the GVL as it's called now. So does having a Global Interpreter Lock make you thread safe? It depends. It does make you thread safe in a way, but let's see.

So how does the GIL work? This is a very simplistic representation of how the GIL works. You have two threads here. One is already holding the GIL, so it's working with the OS thread. Now, when there is another thread waiting on it, waiting on the GIL to do its work, it wakes up the timer thread. The timer thread is, again, another Ruby thread. The timer thread now goes and interrupts the thread holding the GIL, and if the thread holding the GIL is done with whatever it's doing - I'll get to that in a bit - it just releases the lock, and now thread two can take over and do its thing. Well, this is the basic working that at least I understood about the GIL. But there are details to this, right. It's not as simple as what we saw.

So, when you initialize a thread, or create a thread in Ruby, you pass it a block of code. So how does that work? You take a block of code, you put it inside the thread. What the thread does is usually it acquires the GVL and a block?? [00:13:11]. It executes the block of code. It returns and releases the lock, right. So essentially this is how it works. During that period of execution of the block, no other thread is allowed to work. So that makes you almost thread safe, right? But not really. If that's how it's going to work, what if that thread is going to hog the GIL and not allow any other thread to work? So there has to be some kind of lock fairness, right. That's where the timer thread comes in and interrupts it. OK.
Does that mean the thread holding the GIL immediately gives it up, and says here you go, you can start and work with it? Not really. Again, the thread holding the GIL will only release the GIL if it is at a context-switch boundary. What that is, is fairly complicated. I don't want to go into the details. I think people here who know C a lot better than me, and are really deep C divers, can probably tell you, you know, at what points the GIL can get released. If C code makes a call to Ruby code, can it or can it not release the GIL? All those things are there, right. So all these complexities are really, really hard to deal with. I came across this blog by Jesse Storimer. It's excellent, and I strongly encourage you to go through the two-part blog, "Nobody Understands the GIL". It's really, really important if you're trying to do any sort of multi-threaded programming in Ruby.

So do you still think Ruby is thread safe because it's got a GIL? I'm talking about MRI, essentially. The thing is, we can't depend on the GIL, right. It is not documented anywhere that this is exactly how the GIL works, this is when the timer thread wakes up, these are the time slices allotted to the thread acquiring the GVL. There is no documentation around at what point the GIL can be released or not released, and things like that. It's not predictable, and if you depend on it, what could also happen is that even within MRI, when you're moving from version to version, if something changes in the GIL, your code will behave nondeterministically. And what about Ruby implementations that don't even have a GIL? Obviously that's the big problem, right. If you write a gem or something which has to be multi-threaded, and you're depending on the GIL to do its thing and keep you safe, then obviously it cannot work on Rubinius and JRuby. And even if you leave that aside, even with MRI, it's not entirely correct to say that you're thread safe just because there is a GIL that will ensure that only one thread is running.
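To make that last point concrete, here is a hedged sketch (the class and method names are mine, not from the talk) of the kind of check-then-act code the GIL does not protect: MRI only guarantees one thread runs Ruby code at a time, not that it won't switch threads between the check and the assignment.

```ruby
# Hypothetical lazy initialization that is NOT safe just because of the GIL.
# MRI can switch threads between the nil check and the assignment in `||=`,
# so several threads may all see nil and each run the "initialize once" work.
class Config
  def settings
    @settings ||= load_settings   # check-then-act: not atomic
  end

  private

  def load_settings
    sleep 0.01                    # stands in for slow work (file read, HTTP call)
    { loaded_by: Thread.current.object_id }
  end
end

config  = Config.new
threads = 5.times.map { Thread.new { config.settings } }
results = threads.map(&:value)

# More than one distinct hash here means the "initialize once" intent was
# violated, which can readily happen even on MRI, since sleep releases the GIL.
puts results.uniq.length
```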
So what did I find out? Ruby really does not have a documented memory model. It's pretty much like Python; it doesn't have a clearly documented memory model either. What is the implication of that? As I mentioned previously, a memory model is like a specification: this is exactly how the system has to provide a certain minimum guarantee to the users of the language, right, regarding multi-threaded access to shared memory. Now, basically, if there is no written-down memory model and I am going to write a Ruby implementation, I have the liberty to choose whatever memory model I want. So code that you're writing against MRI may not necessarily work right on my, you know, my implementation of Ruby. That's the big implication, right.

So Ruby right now depends on the underlying virtual machines. You have bytecode compilation now, so even MRI is almost like a VM. That has no specification for a memory model, but it does have something internally, right; you would have to go through the C code and understand it. It's not guaranteed to remain the same from version to version, as I understand it, right. And obviously JRuby and Rubinius depend on the JVM and LLVM respectively, and those have clearly documented memory models. You can have a read of them. And the thing is, if Ruby had an implementation - sorry, a specification for a memory model - it could be, you know, implemented using the constructs available on the JVM and LLVM. But this is what we have. We don't have much to do.

What do we do under the circumstances? We have to engineer our code for thread safety. We can't bask in the safety of thinking there is a GIL, so it's going to help keep my code thread safe, so I can write, you know, multi-threaded code without actually worrying about serious synchronization issues and things like that. That's totally not the right thing to do. I think, any which way, Ruby is a language I love, and I'm sure all of you love, so.
And it's progressing by leaps and bounds, and eventually we're going to write more and more complex systems with Ruby. And who knows, we might have true parallelism very soon, right. So why still stay in the same mental block that we don't need to write thread safe code because it's anyway single-threaded? We might as well get into the mindset of writing proper thread safe code, and try and probably come up with a memory model, right. But I think for now we just start engineering code for thread safety.

A simple Mutex, I'm sure all of you know, but it's really, really important for even a trivial operation like a `+=`. There are simple things you notice in Ruby code bases, and Rails code bases as well: generally there is a section of the code that has lots of synchronization and everything, it's really safe, but we leave an innocent accessor lying around, and that causes a lot of, you know, pain debugging those issues. And general issues like, you know, state mutation inside methods is really a bad idea. So if you're looking for issues around multi-threading, this might be a good place to start. I just listed a few of them here. I didn't want to make a really dense talk with all the details. You can always catch me offline and I can tell you some of my experiences, and probably even listen to you and learn from you about some of the issues that we can solve by actually writing proper thread safe code in Ruby.

I came across a few gems which were really, really nice. Both of them happen to be written by headius. The first one is atomic. Atomic is trying to give you constructs similar to the java.util.concurrent package. It's kind of compatible across MRI, JRuby, and Rubinius, which is also a really nice thing. So you have atomic integers and atomic floats, which do increments in an actually atomic way, which is excellent. And then there is the thread_safe library, which also has a few thread safe data structures.
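As a sketch of what that engineering can look like for the counter from the start of the talk, here are two hedged variants: one with a plain Mutex, and one using the atomic gem the talk mentions (its API is summarised from memory here, so check it against the gem's own README):

```ruby
require 'thread'   # Mutex; built in on modern Rubies, the require is harmless

# Variant 1: wrap the non-atomic += in a Mutex so load/increment/store
# happen as one critical section.
n     = 0
mutex = Mutex.new

1000.times.map do
  Thread.new do
    100.times { mutex.synchronize { n += 1 } }
  end
end.each(&:join)

puts n   # 100_000 on MRI, JRuby and Rubinius alike

# Variant 2 (sketch): the atomic gem mentioned in the talk.
#   require 'atomic'
#   counter = Atomic.new(0)
#   counter.update { |v| v + 1 }   # retried internally until it applies atomically
#   counter.value                  # => the current count
```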
I'm trying to play around with these libraries right now, but they may be a good, you know, starting point if you are trying to use higher-level constructs for concurrency. And that's pretty much it. I'm open to taking questions. Thank you. And before anything, I really would like to thank you all again for being here for the talk, and thank the GCRC organizers. You know, they've done a great job with this conference. A big shout out to them.

V.O.: Any questions?

H.K.: Yeah?

QUESTION: Hey.

H.K.: Hi.

QUESTION: If, for example, Ruby code is running in the JVM, in JRuby, how does - because none of the Ruby code is written in a thread safe way - how does it internally manage that? Actually, yeah, yesterday Yogi talked about the point that ActiveRecord is not actually thread safe. Can you explain it in detail, like in a theoretical way?

H.K.: OK. What is thread safety in general, right? Thread safety is about how the data is consistently maintained after multi-threaded access to that shared data, right. So Ruby essentially has a GIL because the internal implementations are not thread safe, right. That's why you want a GIL to protect you from those problems. But as far as JRuby or Rubinius is concerned, the implementation itself is not written in C. JRuby is written in Ruby again, I mean JRuby itself, and Rubinius is written in Ruby. And some of those actual internal constructs are thread safe when compared to MRI. I haven't actually taken a detailed look into those code bases, but if they are implemented properly, you can be thread safe - internally, at least. Which means the base code of JRuby itself might be thread safe. It's only not thread safe because the gems that are trying to run on top of it may have, like, thread safety issues, right. Does that answer your question, or -?

QUESTION: About thread safety?? [00:22:09].

H.K.: Sure, sure. So those gems will not work. That's the point.
Like, what I want to convey here is: whatever gems we are offering, and whatever code we are writing, it's a good idea to get into the habit of writing thread safe code, so that we can actually encourage a truly parallel Ruby, right. We don't have to stay in the same paradigm of "OK, we have to be single-threaded."

QUESTION: So Mutex-based thread management is one way. There are also, like, actors and futures and things like that. And there's a gem called Celluloid -

H.K.: Yup.

QUESTION: That, combined with something called Hamster, which makes everything immutable -

H.K.: Yup.

QUESTION: Is another way to do it.

H.K.: Yup.

QUESTION: Have you done it, or, like, what's your experience with that?

H.K.: Yeah, I have tried out actors, with Revactor, and lockless concurrency is something I definitely agree is a good idea. But I'm specifically talking about, you know, lock-based concurrency, like Mutex-based concurrency. This area is also important because it's not like shared mutable state is bad; it is actually applicable in certain scenarios. When we are working in this particular paradigm, we still need the safety of a memory model. Any other questions?

QUESTION: Thanks for the talk, Hari. It was really good.

H.K.: Thanks.

QUESTION: Is there a way that you would recommend to test if you have done threading properly or not? I mean, I know bugs that come out -

H.K.: Right.

QUESTION: Like, I have written bugs that come out of badly written, you know, not thread safe code.

H.K.: So -

QUESTION: Like, ?? [00:23:46] so, you catch them.

H.K.: At least in my opinion - and a lot of people have done research in this area, and their opinion also is the same - it's not possible to write exact tests against multi-threaded code where there is shared data, because it's nondeterministic and nonrepeatable. The kind of results you get, you can only test against a heuristic.
For example, if you have a deterministic use case at the top level, you can probably test it against that. But exact test cases can never be written for this.

V.O.: Any more questions?

H.K.: Cool. All right. Thank you so much.