HARI KRISHNAN: So, thank you very much for being here on a Saturday evening, this late. My talk got pushed to the last, but I appreciate you being here, first. My name's Hari. I work at MavenHive. So this is a talk about the Ruby memory model. Before I start, how many of you have heard about memory models and know what they are? Show of hands, please. OK. Let's see where this talk goes.

So why did I come up with this talk topic? I started my career with Java, and I spent many years with Java, and Java has a very clearly documented memory model. And it kind of gets to you, because even with all that, you don't feel safe enough doing multi-threaded programming at all. With Ruby, we've always been talking about, you know, doing multi-process parallelism rather than multi-threaded parallelism, even though the language actually supports multi-threading semantics. Of course we know it's called single-threaded and all that, but I just got curious: what is the real memory model behind Ruby? I just wanted to figure that out. So this talk is all about my learnings as I went through, like, various literature, trying to get a gist of the whole thing and cram it into some twenty minutes, so that I could probably give you a useful session from which you can do further digging on this, right.

So when I talk to my friends about memory models, the first thing that comes to their mind is probably this - heap, non-heap, stack, whatever. I'm not gonna talk about that. I'm not gonna talk about this either. It's not about, you know, optimizing your memory, or chasing memory leaks, or garbage collection. This talk is not about that either. So what the hell am I gonna talk about?

First, a quick exercise. So let's start with this and see where it goes. Simple code, not much to process late in the day. There's a shared variable called 'n', and there are a thousand threads over it, and each of those threads wants to increment that shared variable a hundred times, right.
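The slide code isn't reproduced in the transcript, but a minimal sketch of the exercise being described might look like this (the variable name and structure are assumptions, not the speaker's exact code):

```ruby
# A thousand threads, each incrementing a shared counter a hundred times.
n = 0

threads = 1000.times.map do
  Thread.new do
    100.times { n += 1 }   # n += 1 is not a single, indivisible operation
  end
end

threads.each(&:join)
puts n   # naively expected: 100_000
```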
And what is the expected output? I'm not gonna question you, I'm just gonna give it away. It's 100,000. It's fairly straightforward code. I'm sure all of you have done this, and it's no big deal. So what's the real output? MRI is very faithful, it gives you what you expected: 100,000, right. So what happens next? I'm running it on Rubinius. This is what you see. And it's always going to be a different number every time you run it. And that's JRuby. It gives you a lower number. Some of you may be guessing already, and you probably know why it gives you a lower number.

So why all this basic code and some stupid counter over here, right? I just wanted a really basic example to explain the concept that an increment is not a single instruction, right. The reason why I'm talking about this is, I love Ruby because the syntax is so terse, so simple, so readable, right. But it does not mean every single instruction on the screen is going to be executed straight away, right. So at least, to my junior self, this is the first advice I would give when I started, you know, multi-threaded programming. An increment is at least three steps: load, increment, store, right. That's true even for a really simple piece of code like, you know, a `+=`, right.

So this is what we really want to happen. You have a count, you load it, you increment it, you store it. Then the next thread comes along. It loads it, increments it, stores it. You have the next result, which is what you expect, right. But we live in a world where threads don't want to be our friend. They do this. One guy comes along, reads it, increments it. The other guy also reads the older value, increments it. And both of them go and save the same value, right. So this is a classic case of a lost update. I'm sure most of you have seen it in the database world. But this pretty much happens a lot in the multi-threading world, right. But why did it not happen with MRI? And why did you see the right result?
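As a hedged aside that isn't on the slides: on MRI you can see this decomposition for yourself by disassembling the increment (the exact instruction names vary by Ruby version):

```ruby
# YARV bytecode for a simple increment on MRI 1.9+: roughly "get the local,
# push 1, add, store the local" as separate instructions, any of which a
# thread switch can fall between.
puts RubyVM::InstructionSequence.compile("n = 0; n += 1").disasm
```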
That, I'm sure a lot of you know, but let's park that question and just move a little ahead. So, as you observed earlier, there's a lot of reordering happening in the instructions, right. Like, the threads were context-switching, and they were reordering statements. So where does this reordering happen? Reordering can happen at multiple levels.

Start from the top. You have the compiler, which can do simple optimizations like look closer?? [00:05:20]. Even that can change the order of the statements in your code, right. Next, when the code gets translated to, you know, machine-level language and goes to the cores, your CPU cores are at liberty, again, to reorder instructions for performance. And next comes the memory system, right. The memory system is the combined global memory, which all the CPUs can read, plus their individual caches. Why do CPUs have caches? Memory is slow, so they want to load values and keep them in the cache to, again, improve performance. So even the memory system can conspire against you and reorder the loads and stores, and that can cause reordering, right.

So this is really, really crazy. Like, I'm a very stupid programmer who works at the programming-language level. I don't really understand the structure of the hardware and things like that. So how do I keep myself abstracted from all this, you know, really crazy stuff? That's essentially a memory model. So what is a memory model? A memory model describes the interactions of threads through memory and their shared use of data. This is straight out of Wikipedia, right. If you just read it at first, either you're gonna think it's really simple, probably even stupid, or otherwise you might not understand it at all. I was in the second category. So what does this all mean?
So when there are so many complications with the reordering, the reads and writes of memory and things like that, as a programmer you need certain guarantees from the programming language, and the virtual machine you're working on top of, to say: this is how multi-threaded access to shared memory is going to work. These are the basic guarantees and these are the simple rules of how the system works, so you can reliably write code against that, right. So in effect, a memory model is just a specification.

Any Java programmers here, in the house? Great. So how many of you know about JSR 133? The memory model, double-checked locking? OK, some people. The singleton issue? OK, some more hands. So Java was the first programming language which came up with a concept called a memory model, right. Because the first thing is, write once, run anywhere. It had to be predictable across platforms, across implementations, and things like that. So there had to be a JSR which specified what the memory model is that you can code against, so that your multi-threaded code works predictably and deterministically across platforms and across virtual machines. Right? So essentially that's where my, you know, whole thing started. I had gone through the Java memory model, and was pretty much really happy that someone had taken the pain to write it down in clear terms so that you don't have to worry about multi-threading. Hold on, sorry. Sorry about that. Cool.

So, a memory model gives you rules at three broad levels: atomicity, visibility and ordering. Atomicity is as simple as, you know, variable assignment. Is a variable assignment an indivisible unit of work, or not? The rules around that, and it also talks about rules around whether you can assign hashes and arrays indivisibly and things like that. These rules can change from version to version, and things like that. Next is visibility. So in that example which we talked about, we saw two threads trying to read the same value. Essentially they are spying on each other.
And it was not clear at what point the data had to become visible to each of those threads. So essentially visibility is about that. And that is ensured through memory barriers and ordering, which is the next thing. Ordering is about how the loads and stores are sequenced. Or, you know, let's say you want to write a piece of code, a critical section as you call it, and you don't want the compiler to do any crazy things to improve performance. So you say, I make it synchronized, and it has to behave in a nice serial?? [00:09:40] manner. That manner is ensured by ordering. Ordering is a really complex area. It talks about causality, logical clocks and all that. I won't go into those details.

But I've been bothering you with all these, you know, computer science basics. Why the hell am I talking about this at a Ruby conference? Ruby is single-threaded anyway. Why the hell should I care about it, right? OK. Do you really think languages like Ruby are thread safe? Show of hands, anyone? So thread safety - I'm talking only about Ruby, maybe Python, GIL-based languages. Are they thread safe? No? OK. In fact they're not. Being single-threaded does not mean it's thread safe, right. Threads can switch context, and based on how the language has been implemented, how often the threads can switch context, and at what point they can switch, things can go wrong, right.

And another pretty popular myth - I don't think many people believe it here, in this audience at least: I don't have concurrency problems because I'm running on a single core. Not true. Again, threads can switch context and run on the same core and still have dirty reads and things like that. So concurrency is all about interleavings, right. Again, it goes back to reordering. I think I've been talking about this too often, so let's not worry about that again. It's about interleavings. We'll leave it at that. So before we understand more about, you know, the memory model and what it has to do with Ruby, let's just understand a little bit about threading in Ruby.
So, as all of you know, green threads: as of 1.8, there was only one OS thread, which was being multiplexed across multiple Ruby threads that were scheduled onto it through the Global Interpreter Lock. 1.9 comes along, and there is a one-to-one mapping between a Ruby thread and an OS thread, but the Ruby thread still cannot use the OS thread unless it has acquired the Global VM Lock, the GVL as it's called now. So does having a Global Interpreter Lock make you thread safe? It depends. It does make you thread safe in a way, but let's see.

So how does the GIL work? This is a very simplistic representation of how the GIL works. You have two threads here. One is already holding the GIL, so it's working with the OS thread. Now, when there is another thread waiting on it, waiting on the GIL to do its work, it wakes up the timer thread. The timer thread is, again, another Ruby thread. The timer thread now goes and interrupts the thread holding the GIL, and if the thread holding the GIL is done with whatever it's doing - I'll get to that in a bit - it just releases the lock, and now thread two can take over and do its thing. Well, this is the basic working that at least I understood about the GIL. But there are details to this, right. It's not as simple as what we saw.

So, when you initialize a thread, or create a thread in Ruby, you pass it a block of code. So how does that work? You take a block of code, you put it inside the thread. What the thread does is usually it acquires the GVL and a block?? [00:13:11]. It executes the block of code. It returns and releases the lock, right. So essentially this is how it works. During that period of execution of the block, no other thread is allowed to work. So that makes you almost thread safe, right? But not really. If that's how it's going to work, what if that thread is going to hog the GIL and not allow any other thread to work? So there has to be some kind of lock fairness, right. That's where the timer thread comes in and interrupts it. OK.
Does that mean the thread holding the GIL immediately gives it up, and says here you go, you can start and work with it? Not really. Again, the thread holding the GIL will only release the GIL if it is at a context-switch boundary. What that is, is fairly complicated. I don't want to go into the details. I think people here who know C a lot better than me, and are really deep C divers, can probably tell you, you know, at what points the GIL can get released. If C code makes a call to Ruby code, can it or can it not release the GIL? All those things are there, right. So all these complexities are really, really hard to deal with. I came across this blog by Jesse Storimer. It's excellent, and I strongly encourage you to go through the two-part blog, "Nobody Understands the GIL". It's really, really important if you're trying to do any sort of multi-threaded programming in Ruby.

So do you still think Ruby is thread safe because it's got a GIL? I'm talking about MRI, essentially. The thing is, we can't depend on the GIL, right. It is not documented anywhere that this is exactly how the GIL works, this is when the timer thread wakes up, these are the time slices allotted to the thread acquiring the GVL. There is no documentation around at what point the GIL can be released or not released, and things like that. It's not predictable, and if you depend on it, what could also happen is that even within MRI, when you're moving from version to version, if something changes in the GIL, your code will behave nondeterministically. And what about Ruby implementations that don't even have a GIL? Obviously that's the big problem, right. If you write a gem or something which has to be multi-threaded, and you're depending on the GIL to do its thing and keep you safe, then obviously it cannot work on Rubinius and JRuby. And even if you leave that aside, even with MRI, it's not entirely correct to say that you're thread safe just because there is a GIL that will ensure that only one thread is running.
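To make that last point concrete, here is a hedged sketch (the class and method names are mine, not from the talk) of the kind of check-then-act code the GIL does not protect: MRI only guarantees one thread runs Ruby code at a time, not that it won't switch threads between the check and the assignment.

```ruby
# Hypothetical lazy initialization that is NOT safe just because of the GIL.
# MRI can switch threads between the nil check and the assignment in `||=`,
# so several threads may all see nil and each run the "initialize once" work.
class Config
  def settings
    @settings ||= load_settings   # check-then-act: not atomic
  end

  private

  def load_settings
    sleep 0.01                    # stands in for slow work (file read, HTTP call)
    { loaded_by: Thread.current.object_id }
  end
end

config  = Config.new
threads = 5.times.map { Thread.new { config.settings } }
results = threads.map(&:value)

# More than one distinct hash here means the "initialize once" intent was
# violated, which can readily happen even on MRI, since sleep releases the GIL.
puts results.uniq.length
```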
So what did I find out? Ruby really does not have a documented memory model. It's pretty much like Python; it doesn't have a clearly documented memory model either. What is the implication of that? As I mentioned previously, a memory model is like a specification: this is exactly how the system has to provide a certain minimum guarantee to the users of the language, right, regarding multi-threaded access to shared memory. Now, basically, if there is no written-down memory model and I am going to write a Ruby implementation, I have the liberty to choose whatever memory model I want. So code that you're writing against MRI may not necessarily work right on my, you know, my implementation of Ruby. That's the big implication, right.

So Ruby right now depends on the underlying virtual machines. You have bytecode compilation now, so even MRI is almost like a VM. That has no specification for a memory model, but it does have something internally, right; you would have to go through the C code and understand it. It's not guaranteed to remain the same from version to version, as I understand it, right. And obviously JRuby and Rubinius depend on the JVM and LLVM respectively, and those have clearly documented memory models. You can have a read of them. And the thing is, if Ruby had an implementation - sorry, a specification for a memory model - it could be, you know, implemented using the constructs available on the JVM and LLVM. But this is what we have. We don't have much to do.

What do we do under the circumstances? We have to engineer our code for thread safety. We can't bask in the safety of thinking there is a GIL, so it's going to help keep my code thread safe, so I can write, you know, multi-threaded code without actually worrying about serious synchronization issues and things like that. That's totally not the right thing to do. I think, any which way, Ruby is a language I love, and I'm sure all of you love, so.
And it's progressing by leaps and bounds, and eventually we're going to write more and more complex systems with Ruby. And who knows, we might have true parallelism very soon, right. So why still stay in the same mental block that we don't need to write thread safe code because it's anyway single-threaded? We might as well get into the mindset of writing proper thread safe code, and try and probably come up with a memory model, right. But I think for now we just start engineering code for thread safety.

A simple Mutex, I'm sure all of you know, but it's really, really important for even a trivial operation like a `+=`. There are simple things you notice in Ruby code bases, and Rails code bases as well: generally there is a section of the code that has lots of synchronization and everything, it's really safe, but we leave an innocent accessor lying around, and that causes a lot of, you know, pain debugging those issues. And general issues like, you know, state mutation inside methods is really a bad idea. So if you're looking for issues around multi-threading, this might be a good place to start. I just listed a few of them here. I didn't want to make a really dense talk with all the details. You can always catch me offline and I can tell you some of my experiences, and probably even listen to you and learn from you about some of the issues that we can solve by actually writing proper thread safe code in Ruby.

I came across a few gems which were really, really nice. Both of them happen to be written by headius. The first one is atomic. Atomic is trying to give you constructs similar to the java.util.concurrent package. It's kind of compatible across MRI, JRuby, and Rubinius, which is also a really nice thing. So you have atomic integers and atomic floats, which do increments in an actually atomic way, which is excellent. And then there is the thread_safe library, which also has a few thread safe data structures.
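As a sketch of what that engineering can look like for the counter from the start of the talk, here are two hedged variants: one with a plain Mutex, and one using the atomic gem the talk mentions (its API is summarised from memory here, so check it against the gem's own README):

```ruby
require 'thread'   # Mutex; built in on modern Rubies, the require is harmless

# Variant 1: wrap the non-atomic += in a Mutex so load/increment/store
# happen as one critical section.
n     = 0
mutex = Mutex.new

1000.times.map do
  Thread.new do
    100.times { mutex.synchronize { n += 1 } }
  end
end.each(&:join)

puts n   # 100_000 on MRI, JRuby and Rubinius alike

# Variant 2 (sketch): the atomic gem mentioned in the talk.
#   require 'atomic'
#   counter = Atomic.new(0)
#   counter.update { |v| v + 1 }   # retried internally until it applies atomically
#   counter.value                  # => the current count
```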
I'm trying to play around with these libraries right now, but they may be a good, you know, starting point if you are trying to use higher-level constructs for concurrency. And that's pretty much it. I'm open to taking questions. Thank you. And before anything, I really would like to thank you all again for being here for the talk, and thank the GCRC organizers. You know, they've done a great job with this conference. A big shout out to them.

V.O.: Any questions?

H.K.: Yeah?

QUESTION: Hey.

H.K.: Hi.

QUESTION: If, for example, Ruby code is running in the JVM, in JRuby, how does - because none of the Ruby code is written in a thread safe way - how does it internally manage that? Actually, yeah, yesterday Yogi talked about the point that ActiveRecord is not actually thread safe. Can you explain it in detail, like in a theoretical way?

H.K.: OK. What is thread safety in general, right? Thread safety is about how the data is consistently maintained after multi-threaded access to that shared data, right. So Ruby essentially has a GIL because the internal implementations are not thread safe, right. That's why you want a GIL to protect you from those problems. But as far as JRuby or Rubinius is concerned, the implementation itself is not written in C. JRuby is written in Ruby again, I mean JRuby itself, and Rubinius is written in Ruby. And some of those actual internal constructs are thread safe when compared to MRI. I haven't actually taken a detailed look into those code bases, but if they are implemented properly, you can be thread safe - internally, at least. Which means the base code of JRuby itself might be thread safe. It's only not thread safe because the gems that are trying to run on top of it may have, like, thread safety issues, right. Does that answer your question, or -?

QUESTION: About thread safety?? [00:22:09].

H.K.: Sure, sure. So those gems will not work. That's the point.
Like, what I want to convey here is: whatever gems we are offering, and whatever code we are writing, it's a good idea to get into the habit of writing thread safe code, so that we can actually encourage a truly parallel Ruby, right. We don't have to stay in the same paradigm of "OK, we have to be single-threaded."

QUESTION: So Mutex-based thread management is one way. There are also, like, actors and futures and things like that. And there's a gem called Celluloid -

H.K.: Yup.

QUESTION: That, combined with something called Hamster, which makes everything immutable -

H.K.: Yup.

QUESTION: Is another way to do it.

H.K.: Yup.

QUESTION: Have you done it, or, like, what's your experience with that?

H.K.: Yeah, I have tried out actors, with Revactor, and lockless concurrency is something I definitely agree is a good idea. But I'm specifically talking about, you know, lock-based concurrency, like Mutex-based concurrency. This area is also important because it's not like shared mutable state is bad; it is actually applicable in certain scenarios. When we are working in this particular paradigm, we still need the safety of a memory model. Any other questions?

QUESTION: Thanks for the talk, Hari. It was really good.

H.K.: Thanks.

QUESTION: Is there a way that you would recommend to test if you have done threading properly or not? I mean, I know bugs that come out -

H.K.: Right.

QUESTION: Like, I have written bugs that come out of badly written, you know, not thread safe code.

H.K.: So -

QUESTION: Like, ?? [00:23:46] so, you catch them.

H.K.: At least in my opinion - and a lot of people have done research in this area, and their opinion also is the same - it's not possible to write exact tests against multi-threaded code where there is shared data, because it's nondeterministic and nonrepeatable. The kind of results you get, you can only test against a heuristic.
For example, if you have a deterministic use case at the top level, you can probably test it against that. But exact test cases can never be written for this.

V.O.: Any more questions?

H.K.: Cool. All right. Thank you so much.