WEBVTT 00:00:24.810 --> 00:00:25.859 HARI KRISHNAN: So, thank you very much 00:00:25.859 --> 00:00:28.099 for being here on a Saturday evening, this late. 00:00:28.099 --> 00:00:30.430 My talk got pushed to the last, but I 00:00:30.430 --> 00:00:34.540 appreciate you being here, first. My name's Hari. I 00:00:34.540 --> 00:00:36.910 work at MavenHive. So this is a talk about 00:00:36.910 --> 00:00:43.530 the Ruby memory model. So before I start, how many 00:00:43.530 --> 00:00:46.559 of you have heard about a memory model and know 00:00:46.559 --> 00:00:51.909 what it is? Show of hands, please. OK. Let's 00:00:51.909 --> 00:00:55.150 see where this talk goes. So why did 00:00:55.150 --> 00:00:58.839 I come up with this talk topic? So I 00:00:58.839 --> 00:01:01.809 started my career with Java, and I spent a 00:01:01.809 --> 00:01:04.860 great many years with Java, and Java has a 00:01:04.860 --> 00:01:08.890 very clearly documented memory model. And it kind of 00:01:08.890 --> 00:01:10.500 gets to you because without all that, you don't 00:01:10.500 --> 00:01:14.049 feel safe enough doing multi-threaded programming at all. So 00:01:14.049 --> 00:01:17.710 with Ruby, we've always been talking about, you know, 00:01:17.710 --> 00:01:21.290 doing multi-process parallelism, 00:01:21.290 --> 00:01:24.450 rather than multi-threaded parallelism, 00:01:24.450 --> 00:01:28.710 even though the language actually supports, you know, multi-threading 00:01:28.710 --> 00:01:30.799 semantics. Of course we know it's called single-threaded and 00:01:30.799 --> 00:01:34.259 all that, but I just got curious, like, what 00:01:34.259 --> 00:01:36.499 is the real memory model behind Ruby, and I 00:01:36.499 --> 00:01:39.149 just wanted to figure that out. 
So this talk 00:01:39.149 --> 00:01:42.439 is all about my learnings as I went through, 00:01:42.439 --> 00:01:46.350 like, various literature, and figured out, and I tried 00:01:46.350 --> 00:01:48.289 to combine, like, get a gist of the whole 00:01:48.289 --> 00:01:50.509 thing. And cram it into some twenty minutes so 00:01:50.509 --> 00:01:52.340 that I could, like, probably give you a very 00:01:52.340 --> 00:01:55.600 useful session, like, from which you can further do 00:01:55.600 --> 00:02:01.069 more digging on this, right. So when I talked 00:02:01.069 --> 00:02:03.420 to my friends about memory model, the first thing 00:02:03.420 --> 00:02:05.540 that comes to their mind is probably this 00:02:05.540 --> 00:02:10.139 - heap, heap, non-heap, stack, whatever. I'm not gonna 00:02:10.139 --> 00:02:14.069 talk about that. I'm not gonna talk about this 00:02:14.069 --> 00:02:17.450 either. It's not about, you know, optimizing your memory, 00:02:17.450 --> 00:02:21.040 or searching for memory leaks, or garbage collection. This talk 00:02:21.040 --> 00:02:23.330 is not about that either. So what the hell 00:02:23.330 --> 00:02:27.370 am I gonna talk about? First, a quick exercise. 00:02:27.370 --> 00:02:31.360 So let's start with this and see where it 00:02:31.360 --> 00:02:35.760 goes. Simple code. Not much to process late in 00:02:35.760 --> 00:02:38.890 the day. There's a shared variable called 'n', and 00:02:38.890 --> 00:02:42.030 there are a thousand threads over that, and each of 00:02:42.030 --> 00:02:45.379 those threads wants to increment that shared variable a hundred 00:02:45.379 --> 00:02:49.379 times, right. And what is the expected output? I'm 00:02:49.379 --> 00:02:51.200 not gonna question you, I'm just gonna give it 00:02:51.200 --> 00:02:55.180 away. It's 100,000. It's fairly straightforward code. I'm sure 00:02:55.180 --> 00:02:57.200 all of you have done this, and it's no 00:02:57.200 --> 00:03:01.680 big deal. So what's the real output? 
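The slide itself is not part of this transcript. A minimal sketch of the code being described, assuming a local variable `n` shared through the thread blocks and the thread and iteration counts mentioned above, might look like this:

```ruby
# Sketch of the counter exercise described in the talk (not the original slide).
n = 0 # shared variable, captured by every thread's block

threads = Array.new(1000) do
  Thread.new do
    # each thread increments the shared variable a hundred times,
    # with no synchronization at all
    100.times { n += 1 }
  end
end

threads.each(&:join)
puts n
```

No expected output is shown on purpose: the printed number depends on the Ruby implementation and on thread scheduling, and it can be anything up to 100,000.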
MRI is 00:03:01.680 --> 00:03:05.319 very faithful, it gives you what you expected. 100,000, 00:03:05.319 --> 00:03:08.720 right. So what happens next? I'm running it on 00:03:08.720 --> 00:03:12.569 Rubinius. This is what you see. And it's always 00:03:12.569 --> 00:03:15.760 going to be a different number every time you 00:03:15.760 --> 00:03:19.140 run it. And that's JRuby. It gives you a 00:03:19.140 --> 00:03:22.629 lower number. Some of you may be guessing already, 00:03:22.629 --> 00:03:24.489 and you probably know it, why it gives you 00:03:24.489 --> 00:03:28.159 a lower number. So why all this basic stupid 00:03:28.159 --> 00:03:31.230 code and some stupid counter over here, right? So 00:03:31.230 --> 00:03:34.189 I just wanted to get a really basic example 00:03:34.189 --> 00:03:36.299 to explain the concept that increment is not a 00:03:36.299 --> 00:03:40.040 single instruction, right. The reason why I'm talking about 00:03:40.040 --> 00:03:43.390 this is, I love Ruby because the syntax is 00:03:43.390 --> 00:03:46.629 so terse, and it's so simple, it's so readable, 00:03:46.629 --> 00:03:49.310 right. But it does not mean every single instruction 00:03:49.310 --> 00:03:52.140 on the screen is going to be executed straight 00:03:52.140 --> 00:03:54.810 away, right. So at least, to my junior self, 00:03:54.810 --> 00:03:56.599 this is the first advice I would give, when 00:03:56.599 --> 00:04:00.590 I started, you know, multi-threaded programming. So at least 00:04:00.590 --> 00:04:05.980 three steps. Load, increment, store, right. That's even for a 00:04:05.980 --> 00:04:09.879 really simple piece of code like, you know, a 00:04:09.879 --> 00:04:12.879 plus equals two, right. So this is what we 00:04:12.879 --> 00:04:15.750 really want to happen. You have a count, you 00:04:15.750 --> 00:04:18.399 load it, you increment it, you store it. Then 00:04:18.399 --> 00:04:21.019 the next thread comes along. 
It loads it, increments 00:04:21.019 --> 00:04:23.220 it, stores it. You have the next result which 00:04:23.220 --> 00:04:25.750 is what you expect, right. But we live in 00:04:25.750 --> 00:04:28.260 a world where threads don't want to be our 00:04:28.260 --> 00:04:31.470 friend. They do this. One guy comes along, reads 00:04:31.470 --> 00:04:33.920 it, increments it. The other guy also reads the 00:04:33.920 --> 00:04:37.440 older value, increments it. And both of them go 00:04:37.440 --> 00:04:40.020 and save the same value, right. So this is 00:04:40.020 --> 00:04:42.120 a classic case of lost update. I'm sure most 00:04:42.120 --> 00:04:44.060 of you have seen it in the database world. 00:04:44.060 --> 00:04:46.770 But this pretty much happens a lot in the 00:04:46.770 --> 00:04:48.860 multi-threading world, right. But why did it not happen 00:04:48.860 --> 00:04:51.620 with MRI? And why did you see the right 00:04:51.620 --> 00:04:53.190 result? That, I'm sure a lot of you 00:04:53.190 --> 00:04:55.580 know, but let's park that question and 00:04:55.580 --> 00:05:00.500 just move a little ahead. So, as you observed 00:05:00.500 --> 00:05:03.770 earlier, a lot of reordering is happening in instructions, right. 00:05:03.770 --> 00:05:07.210 Like, the threads were context-switching, and they were reordering 00:05:07.210 --> 00:05:11.139 statements. So where does this reordering happen? Reordering can 00:05:11.139 --> 00:05:14.740 happen at multiple levels. So start from the top. 00:05:14.740 --> 00:05:18.150 You have the compiler, which can do simple optimizations 00:05:18.150 --> 00:05:20.780 like look closer?? [00:05:20]. Even that can change the 00:05:20.780 --> 00:05:23.990 order of your statements in your code, right. 
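Going back to the lost update described a moment ago: the interleaving can be forced by hand, which makes it easier to see. This is only an illustrative sketch. It assumes two threads and uses the standard library `Queue` purely as a signal so that the second thread runs its whole load, increment, store while the first one is paused between its load and its store. The names `a_loaded` and `b_done` are made up for the example.

```ruby
# Forcing the "lost update" interleaving with explicit hand-offs.
count = 0
a_loaded = Queue.new # signals that thread A has loaded the value
b_done   = Queue.new # signals that thread B has finished its update

thread_a = Thread.new do
  local = count       # load (sees 0)
  a_loaded << true
  b_done.pop          # paused here while B does load/increment/store
  count = local + 1   # store, based on the stale value
end

thread_b = Thread.new do
  a_loaded.pop        # wait until A has loaded
  local = count       # load (also sees 0)
  count = local + 1   # increment and store
  b_done << true
end

[thread_a, thread_b].each(&:join)
puts count # prints 1, although two increments ran
```

Both threads save the same value, so one of the two increments disappears. In the unsynchronized counter the same interleaving happens by chance rather than by design.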
Next, 00:05:23.990 --> 00:05:27.680 when the code gets translated to, you know, machine-level 00:05:27.680 --> 00:05:30.639 language, it goes to the cores, and your CPU cores are 00:05:30.639 --> 00:05:34.430 at liberty, again, to reorder them for performance. And 00:05:34.430 --> 00:05:37.020 next comes the memory system, right. The memory system 00:05:37.020 --> 00:05:39.669 is like the combined global memory, which all the 00:05:39.669 --> 00:05:42.490 CPUs can read, and also their individual caches. But 00:05:42.490 --> 00:05:45.840 why do CPUs have caches? They want to, memory 00:05:45.840 --> 00:05:47.710 is slow, so they want to load, preload all 00:05:47.710 --> 00:05:50.080 the values, prefetch them, keep them in the cache, 00:05:50.080 --> 00:05:52.710 again improve performance. So even the memory system can 00:05:52.710 --> 00:05:55.940 conspire against you and reorder the loads and stores 00:05:55.940 --> 00:05:59.380 after the memory registers. And that can cause reordering, 00:05:59.380 --> 00:06:03.319 right. So this is really, really crazy. Like, I'm 00:06:03.319 --> 00:06:07.550 a very stupid programmer, who works at the programming 00:06:07.550 --> 00:06:10.599 language level. I don't really understand the structure of 00:06:10.599 --> 00:06:13.169 the hardware and things like that. So how do 00:06:13.169 --> 00:06:15.550 I keep myself abstracted from all this, you know, 00:06:15.550 --> 00:06:21.550 really crazy stuff? So that's essentially a memory model. 00:06:21.550 --> 00:06:23.930 So what, what is a memory model? A memory 00:06:23.930 --> 00:06:27.180 model describes the interactions of threads through memory and 00:06:27.180 --> 00:06:28.970 their shared use of data. So this is straight 00:06:28.970 --> 00:06:30.919 out of Wikipedia, right. 
So if you just read 00:06:30.919 --> 00:06:34.610 it first, either you're gonna think it's really simple, 00:06:34.610 --> 00:06:38.069 and probably even looks stupid, but otherwise you might 00:06:38.069 --> 00:06:40.789 not even understand. So I was the second category. 00:06:40.789 --> 00:06:43.879 So what does this all mean? So when there 00:06:43.879 --> 00:06:48.580 are so many complications with the reordering, the reads 00:06:48.580 --> 00:06:51.129 and writes of memory and things like that, as 00:06:51.129 --> 00:06:54.759 a programmer you need certain guarantees from the programming 00:06:54.759 --> 00:06:56.840 language, and the virtual machine you're working on top 00:06:56.840 --> 00:07:01.039 of, to say this is how multi-threaded shared, I 00:07:01.039 --> 00:07:03.979 mean, multi-threaded access to shared memory is going to 00:07:03.979 --> 00:07:05.940 work. These are the basic guarantees and these are 00:07:05.940 --> 00:07:09.310 the simple rules of how the system works. So 00:07:09.310 --> 00:07:13.160 you can reliably write code against that, right. So 00:07:13.160 --> 00:07:15.139 in, in effect, a memory model is just a 00:07:15.139 --> 00:07:21.479 specification. Any Java programmers here, in the house? Great. 00:07:21.479 --> 00:07:25.860 So how many of you know about JSR 133? 00:07:25.860 --> 00:07:31.270 The memory model, double-checked locking - OK. Some 00:07:31.270 --> 00:07:37.280 people. Singleton issue? OK - some more hands. 00:07:37.280 --> 00:07:39.620 So Java was the first programming language which came 00:07:39.620 --> 00:07:43.360 up with a concept called memory model, right. Because, 00:07:43.360 --> 00:07:45.610 the first thing is, write once, run 00:07:45.610 --> 00:07:48.110 anywhere. It had to be predictable across platforms, across 00:07:48.110 --> 00:07:51.740 reimplementations, and things like that. 
So the, there had 00:07:51.740 --> 00:07:54.650 to be a JSR which specified what is the 00:07:54.650 --> 00:07:56.860 memory model that you can code against so that 00:07:56.860 --> 00:08:02.129 your multi-threaded code works predictably, and deterministically across platforms 00:08:02.129 --> 00:08:08.520 and across virtual machines. Right? So essentially that's where 00:08:08.520 --> 00:08:11.280 my, you know, whole thing started. I had gone 00:08:11.280 --> 00:08:14.509 through the Java memory model, and was pretty much 00:08:14.509 --> 00:08:16.960 really happy that someone had taken the pain to 00:08:16.960 --> 00:08:18.590 write it down in clear terms so that you 00:08:18.590 --> 00:08:25.590 don't have to worry about multi-threading. Hold on, sorry. 00:08:28.020 --> 00:08:34.669 Sorry about that. Cool. So. Memory model gives you 00:08:34.669 --> 00:08:40.610 rules at three broad levels. Atomicity, visibility and ordering. 00:08:40.610 --> 00:08:43.039 So atomicity is as simple as, you know, variable 00:08:43.039 --> 00:08:47.030 assignment. Is a variable assignment an indivisible unit of 00:08:47.030 --> 00:08:49.520 work, or not? The rules around that, and it 00:08:49.520 --> 00:08:52.370 also talks about rules around, can you assign hashes, 00:08:52.370 --> 00:08:55.070 and arrays indivisibly and things like that. These rules 00:08:55.070 --> 00:08:57.670 can change based on every language version, and things 00:08:57.670 --> 00:09:01.940 like that. Next is visibility. So in that example 00:09:01.940 --> 00:09:05.040 which we talked about, I mean, we saw two 00:09:05.040 --> 00:09:07.310 threads trying to read the same value. Essentially they 00:09:07.310 --> 00:09:09.390 are spying on each other. And it was not 00:09:09.390 --> 00:09:11.529 clear at what point the data had to become 00:09:11.529 --> 00:09:14.860 visible to each of those threads. So essentially visibility 00:09:14.860 --> 00:09:18.240 is about that. 
And that is ensured through memory 00:09:18.240 --> 00:09:21.800 barriers and ordering, which is the next thing. So 00:09:21.800 --> 00:09:25.120 ordering is about how the loads and stores are 00:09:25.120 --> 00:09:28.600 sequenced, or, you know, let's say you want to 00:09:28.600 --> 00:09:30.720 write a piece of code, critical section as you 00:09:30.720 --> 00:09:32.880 call it. And you don't want the compiler to 00:09:32.880 --> 00:09:35.510 do any crazy things to improve performance. So you 00:09:35.510 --> 00:09:38.140 say, I make it synchronized, and it has to 00:09:38.140 --> 00:09:40.399 behave in a, behave in a nice serial 00:09:40.399 --> 00:09:44.730 manner. So that serial manner is ensured by ordering. 00:09:44.730 --> 00:09:47.940 Ordering is a really complex area. It talks about 00:09:47.940 --> 00:09:50.850 causality, logical clocks and all that. I won't go 00:09:50.850 --> 00:09:54.250 into those details. But I've been worrying you with 00:09:54.250 --> 00:09:58.070 all this, you know, computer science basics and all 00:09:58.070 --> 00:10:00.010 this. Why the hell am I talking about it 00:10:00.010 --> 00:10:02.430 in a Ruby conference? Ruby is single-threaded, anyway. Why 00:10:02.430 --> 00:10:05.640 the hell should I care about it, right? OK. 00:10:05.640 --> 00:10:09.120 Do you really think languages like Ruby are thread 00:10:09.120 --> 00:10:14.940 safe? Show of hands, anyone? So thread safety, I'm 00:10:14.940 --> 00:10:18.600 talking only about Ruby - maybe Python. GIL based 00:10:18.600 --> 00:10:25.600 languages. Are they thread safe? No? OK. In fact 00:10:25.700 --> 00:10:30.649 they're not. Being single-threaded does not mean it's thread-safe, 00:10:30.649 --> 00:10:33.670 right. 
Threads can switch context, and based on how 00:10:33.670 --> 00:10:36.079 the language has been implemented and how often the 00:10:36.079 --> 00:10:38.529 threads can switch context, and at what point they 00:10:38.529 --> 00:10:44.010 can switch, things can go wrong, right. And another 00:10:44.010 --> 00:10:46.040 pretty popular myth - I don't think many people 00:10:46.040 --> 00:10:49.389 believe it here, in this audience at least. I 00:10:49.389 --> 00:10:52.440 don't have concurrency problems because I'm running on single 00:10:52.440 --> 00:10:55.690 core. Not true. Again, threads can switch context and 00:10:55.690 --> 00:10:58.630 run on the same core and still have dirty 00:10:58.630 --> 00:11:02.800 reads and things like that. So concurrency is all 00:11:02.800 --> 00:11:05.550 about interleavings, right. Again, goes back to reordering. I 00:11:05.550 --> 00:11:07.870 think I've been talking about this too often. And 00:11:07.870 --> 00:11:11.950 let's not, again, worry with that. It's about interleavings. 00:11:11.950 --> 00:11:15.620 We'll leave it at that. So let's, before we 00:11:15.620 --> 00:11:19.240 understand more about, you know, the memory model and 00:11:19.240 --> 00:11:21.019 what it has to do with Ruby, let's just 00:11:21.019 --> 00:11:25.060 understand a little bit about threading in Ruby. So 00:11:25.060 --> 00:11:28.100 all of you know, green threads, as of 1.8, 00:11:28.100 --> 00:11:31.430 there was only one OS thread, which was being 00:11:31.430 --> 00:11:35.220 multiplexed with multiple Ruby threads, which were being scheduled 00:11:35.220 --> 00:11:38.980 on it through the global interpreter lock. 1.9 comes along, 00:11:38.980 --> 00:11:41.200 there is a one to one mapping between the 00:11:41.200 --> 00:11:43.660 Ruby thread and OS thread, but still the Ruby 00:11:43.660 --> 00:11:46.620 thread cannot use the OS thread unless it has 00:11:46.620 --> 00:11:50.980 the global VM lock, as it's called now. 
The 00:11:50.980 --> 00:11:55.750 GVL acquire. So does having a Global Interpreter Lock 00:11:55.750 --> 00:12:00.709 make you thread safe? It depends. It does make 00:12:00.709 --> 00:12:03.260 you thread safe in a way, but let's see. 00:12:03.260 --> 00:12:05.329 So how does GIL work? This is a very 00:12:05.329 --> 00:12:08.510 simplistic representation of how GIL works. So you have 00:12:08.510 --> 00:12:12.120 two threads here. One is already holding the GIL. 00:12:12.120 --> 00:12:15.519 So it's, it's working with the OS thread. And 00:12:15.519 --> 00:12:18.820 now when there is another thread waiting on it, 00:12:18.820 --> 00:12:21.190 waiting on the GIL to do its work, it 00:12:21.190 --> 00:12:22.510 sends a, it wakes up the timer thread. Timer 00:12:22.510 --> 00:12:26.790 thread is, again, another Ruby thread. The timer thread 00:12:26.790 --> 00:12:30.410 now goes and interrupts the thread holding the GIL, 00:12:30.410 --> 00:12:32.040 and if the GIL, if the thread holding the 00:12:32.040 --> 00:12:34.889 GIL is done with whatever it's doing - I'll 00:12:34.889 --> 00:12:36.550 get to it in a bit - it just 00:12:36.550 --> 00:12:40.320 releases the lock, and now thread two can take 00:12:40.320 --> 00:12:42.829 over and do its thing. Well this is the 00:12:42.829 --> 00:12:48.329 basic working that at least I understood about GIL. 00:12:48.329 --> 00:12:50.300 But there are details to this, right. It's not 00:12:50.300 --> 00:12:57.300 as simple as what we saw. So, when you 00:12:57.779 --> 00:13:00.930 initialize a thread, or create a thread in Ruby, 00:13:00.930 --> 00:13:03.100 you pass it a block of code. So how 00:13:03.100 --> 00:13:06.240 does that work? You take a block of code, 00:13:06.240 --> 00:13:07.769 you put it inside the thread. What the thread 00:13:07.769 --> 00:13:10.480 does is usually it acquires the GVL and a 00:13:10.480 --> 00:13:14.019 block?? [00:13:11]. It executes the block of code. 
It 00:13:14.019 --> 00:13:17.089 releases the, returns and releases the lock, right. So 00:13:17.089 --> 00:13:19.470 essentially this is how it works. So during that 00:13:19.470 --> 00:13:21.899 period of execution of the block, no other thread 00:13:21.899 --> 00:13:24.380 is allowed to work. So that makes you almost 00:13:24.380 --> 00:13:28.110 thread safe, right? But not really. If that's how 00:13:28.110 --> 00:13:30.600 it's going to work, what if that thread is 00:13:30.600 --> 00:13:33.899 going to hog the GIL, and not allow any 00:13:33.899 --> 00:13:35.760 other thread to work? So there has to be 00:13:35.760 --> 00:13:38.430 some kind of lock fairness, right. So that's where 00:13:38.430 --> 00:13:41.180 the timer thread comes in and interrupts it. OK. 00:13:41.180 --> 00:13:43.130 Does that mean the thread holding the GIL immediately 00:13:43.130 --> 00:13:45.190 gives it up, and says here you go, you 00:13:45.190 --> 00:13:48.740 can start and work with it? Not really. Again 00:13:48.740 --> 00:13:51.389 the thread holding the GIL will only release the 00:13:51.389 --> 00:13:53.920 GIL if it is at a context switch 00:13:53.920 --> 00:13:57.019 boundary. What that is, is fairly complicated. I don't 00:13:57.019 --> 00:13:59.920 want to go into the details. I think people 00:13:59.920 --> 00:14:02.540 here who know C a lot better than me, 00:14:02.540 --> 00:14:05.110 and are deep C divers really, they can probably 00:14:05.110 --> 00:14:08.670 tell you, you know, how, at what point the GIL 00:14:08.670 --> 00:14:11.040 can get released. If a C thread, a C 00:14:11.040 --> 00:14:13.269 code makes a call to Ruby code, can it 00:14:13.269 --> 00:14:15.449 or can it not release the GIL? All those 00:14:15.449 --> 00:14:18.399 things are there, right. So all these complexities are 00:14:18.399 --> 00:14:21.360 really, really hard to deal with. I came across 00:14:21.360 --> 00:14:25.139 this blog by Jesse Storimer. 
It's excellent and I 00:14:25.139 --> 00:14:27.440 strongly encourage you to go through the two-part blog 00:14:27.440 --> 00:14:30.990 about, you know, Nobody understands the GIL. It's really, really 00:14:30.990 --> 00:14:33.550 important, if you're trying to do any sort of 00:14:33.550 --> 00:14:39.740 multi-threaded programming in Ruby. So do you still think 00:14:39.740 --> 00:14:42.740 Ruby is thread safe because it's got GIL? I'm 00:14:42.740 --> 00:14:48.740 talking about MRI, essentially. So the thing is, we 00:14:48.740 --> 00:14:51.630 can't depend on GIL, right. GIL is not documented 00:14:51.630 --> 00:14:54.050 anywhere that this is exactly how it works. This 00:14:54.050 --> 00:14:56.079 is when the timer thread wakes up. These are 00:14:56.079 --> 00:14:59.310 the time slices allotted to the thread acquiring the 00:14:59.310 --> 00:15:03.190 GVL. There is no documentation around at what point 00:15:03.190 --> 00:15:04.860 the GIL can be released, can it not be 00:15:04.860 --> 00:15:07.009 released, and things like that. There's no, it's not 00:15:07.009 --> 00:15:10.259 predictable, and if you depend on it, what could 00:15:10.259 --> 00:15:13.139 also happen is even within MRI, when you're moving 00:15:13.139 --> 00:15:15.920 from version to version, if something changes in GIL, 00:15:15.920 --> 00:15:22.220 your code will behave nondeterministically. And what about 00:15:22.220 --> 00:15:25.209 Ruby implementations that don't even have a GIL? 00:15:25.209 --> 00:15:27.009 So obviously that's the big problem, right. If you 00:15:27.009 --> 00:15:29.610 write a gem or something which has to be 00:15:29.610 --> 00:15:32.079 multi-threaded, and if you're depending on the GIL to 00:15:32.079 --> 00:15:34.769 do its thing to keep you safe, then obviously 00:15:34.769 --> 00:15:38.550 it cannot work on Rubinius and JRuby. 
Let that 00:15:38.550 --> 00:15:41.310 alone, even, even if you give that up, even 00:15:41.310 --> 00:15:44.360 with MRI, it's not entirely correct to say that 00:15:44.360 --> 00:15:47.490 you're thread safe, because there is a GIL that 00:15:47.490 --> 00:15:52.660 will ensure that only one thread is running. So 00:15:52.660 --> 00:15:54.610 what did I find out? Ruby really does not 00:15:54.610 --> 00:15:57.350 have a documented memory model. It's pretty much similar 00:15:57.350 --> 00:16:00.480 to Python. It doesn't have a clearly documented memory 00:16:00.480 --> 00:16:05.279 model. What is the implication of that? So as 00:16:05.279 --> 00:16:07.540 I mentioned previously, a memory model is like a 00:16:07.540 --> 00:16:10.769 specification. This is exactly how the system has to 00:16:10.769 --> 00:16:14.600 provide a certain minimum guarantee to the users of 00:16:14.600 --> 00:16:17.730 the language, right, regarding multi-threaded access to shared 00:16:17.730 --> 00:16:22.500 memory. Now, basically if I don't have a written 00:16:22.500 --> 00:16:23.720 down memory model, and I am going to write 00:16:23.720 --> 00:16:26.540 a Ruby implementation tomorrow, I have the liberty 00:16:26.540 --> 00:16:29.509 to choose whatever memory model I want. So the 00:16:29.509 --> 00:16:32.889 code, if you're writing against MRI, may not essentially 00:16:32.889 --> 00:16:36.720 work right on my, you know, my implementation of 00:16:36.720 --> 00:16:41.339 Ruby. That's the big implication, right. So Ruby right 00:16:41.339 --> 00:16:45.769 now depends on underlying virtual machines. Even after YARV, 00:16:45.769 --> 00:16:47.699 you have bytecode compilation, so even MRI is 00:16:47.699 --> 00:16:50.839 almost like a VM. So that has no specification 00:16:50.839 --> 00:16:52.959 for a memory model, but it does have something, 00:16:52.959 --> 00:16:55.279 right, internally. If you have to go through the 00:16:55.279 --> 00:16:58.130 C code and understand. 
It's not guaranteed to remain 00:16:58.130 --> 00:17:01.079 the same from version to version, as I understand, 00:17:01.079 --> 00:17:05.069 right. And obviously JRuby and Rubinius, they depend on 00:17:05.069 --> 00:17:08.260 JVM and LLVM respectively. And they all have a 00:17:08.260 --> 00:17:11.819 clearly documented memory model. You could have a read 00:17:11.819 --> 00:17:15.260 of it. And the only thing is, if Ruby 00:17:15.260 --> 00:17:18.079 had an implementation - sorry, a specification for a 00:17:18.079 --> 00:17:22.220 memory model, it could be, you know, implemented using 00:17:22.220 --> 00:17:27.599 the constructs available on JVM and LLVM. But this 00:17:27.599 --> 00:17:29.450 is what we have. We don't have much to 00:17:29.450 --> 00:17:33.200 do. What do we do under the circumstances? We 00:17:33.200 --> 00:17:36.640 have to engineer our code for thread safety. We 00:17:36.640 --> 00:17:40.120 can't bask under the safety that, there is a 00:17:40.120 --> 00:17:42.410 GIL and so it's going to help me keep 00:17:42.410 --> 00:17:44.530 my code thread safe. So even I can write 00:17:44.530 --> 00:17:47.690 multiple, you know, multi-threaded code without actually worrying 00:17:47.690 --> 00:17:51.290 about serious synchronization issues and things like that. It's 00:17:51.290 --> 00:17:54.500 totally not the right thing to do. I think 00:17:54.500 --> 00:17:57.370 any which way, Ruby is a language I love, 00:17:57.370 --> 00:17:59.710 and I'm sure all of you love, so. And 00:17:59.710 --> 00:18:02.670 it's progressing by leaps and bounds, and eventually we're 00:18:02.670 --> 00:18:04.840 going to write more and more complex systems with 00:18:04.840 --> 00:18:09.390 Ruby. And who knows, we might have true parallelism 00:18:09.390 --> 00:18:13.980 very soon, right. 
So why, still, stay in the 00:18:13.980 --> 00:18:17.210 same mental block that we don't want to write, 00:18:17.210 --> 00:18:20.480 you know, thread safe code that's anyway single threaded. 00:18:20.480 --> 00:18:22.150 We might as well get into the mindset of 00:18:22.150 --> 00:18:26.130 writing proper thread safe code, and try and probably 00:18:26.130 --> 00:18:29.500 come up with a memory model, right. But I 00:18:29.500 --> 00:18:31.700 think for now we just start engineering code for 00:18:31.700 --> 00:18:36.860 thread safety. Simple Mutex, I'm sure all of you 00:18:36.860 --> 00:18:39.580 know, but it's really, really important for even a 00:18:39.580 --> 00:18:44.090 stupid operation like a plus equals two. So simple 00:18:44.090 --> 00:18:46.970 things which I've noticed in Ruby code bases and 00:18:46.970 --> 00:18:50.530 Rails code bases as well, like generally, is, there 00:18:50.530 --> 00:18:52.920 is like a synchronized, you know, a section of 00:18:52.920 --> 00:18:56.260 the code has lots of synchronization and everything. It's 00:18:56.260 --> 00:18:58.530 really safe. But we leave an innocent accessor lying 00:18:58.530 --> 00:19:00.760 around, and that causes a lot of, you know, 00:19:00.760 --> 00:19:04.360 pain, like debugging those issues. And general issues like, 00:19:04.360 --> 00:19:08.020 you know, state mutations, inside methods is really a 00:19:08.020 --> 00:19:10.270 bad idea. So if you're looking for issues around 00:19:10.270 --> 00:19:12.200 multi-threading, this might be a good place to 00:19:12.200 --> 00:19:14.350 start. So I just listed a few of them 00:19:14.350 --> 00:19:16.310 here. I didn't want to make a really dense 00:19:16.310 --> 00:19:19.210 talk with all the details. 
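As a rough illustration of the Mutex point, and of the "innocent accessor" problem: the sketch below guards both the update and the read with the same lock. The `Counter` class and its method names are hypothetical, made up for this example. Only `Mutex#synchronize` and `Thread` come from Ruby itself.

```ruby
# Hypothetical counter whose writer AND reader go through one Mutex.
class Counter
  def initialize
    @value = 0
    @lock = Mutex.new
  end

  # load, increment and store happen as one critical section
  def add(amount)
    @lock.synchronize { @value += amount }
  end

  # the reader takes the same lock instead of being a bare attr_reader
  def value
    @lock.synchronize { @value }
  end
end

counter = Counter.new
threads = Array.new(10) do
  Thread.new { 1_000.times { counter.add(2) } }
end
threads.each(&:join)
puts counter.value # prints 20000
```

Because every access to `@value` goes through `@lock`, the result does not depend on a GIL being present, so the same code should behave the same way on an implementation that runs threads in parallel.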
You can always catch 00:19:19.210 --> 00:19:20.940 me offline and I can tell you some of 00:19:20.940 --> 00:19:23.600 my experiences and probably even listen to you and 00:19:23.600 --> 00:19:25.980 learn from you about some of the issues that 00:19:25.980 --> 00:19:28.820 we can solve by actually writing proper thread safe 00:19:28.820 --> 00:19:33.080 code in Ruby. I came across a few gems 00:19:33.080 --> 00:19:35.090 which were really, really nice. Both of them happen 00:19:35.090 --> 00:19:38.680 to be written by headius. The first one is 00:19:38.680 --> 00:19:40.730 atomic. Atomic is almost trying to give you the 00:19:40.730 --> 00:19:44.970 similar constructs like the Java utility concurrent package. It 00:19:44.970 --> 00:19:51.300 tries to, it's kind of compatible across MRI, JRuby, 00:19:51.300 --> 00:19:53.800 and Rubinius, which is also a really nice thing. 00:19:53.800 --> 00:19:56.560 So you have atomic integers and atomic floats, which 00:19:56.560 --> 00:19:59.900 do increments actually in an atomic way, which is 00:19:59.900 --> 00:20:02.460 excellent. And then there is thread_safe library, which also 00:20:02.460 --> 00:20:04.590 has a few thread safe data structures. I'm trying 00:20:04.590 --> 00:20:06.570 to play around with these libraries right now, but 00:20:06.570 --> 00:20:09.150 they may be a good, you know, starting point 00:20:09.150 --> 00:20:10.780 if you are trying to do higher level constructs 00:20:10.780 --> 00:20:15.620 for concurrency. And that's pretty much it. I'm open 00:20:15.620 --> 00:20:21.820 to take questions. Thank you. And before anything I 00:20:21.820 --> 00:20:23.420 really would like to thank you all, again for 00:20:23.420 --> 00:20:27.140 being here for the talk, and thank the GCRC 00:20:27.140 --> 00:20:31.410 organizers, you know, they've done a great job with 00:20:31.410 --> 00:20:38.410 this conference. A big shout out to them. 00:20:46.470 --> 00:20:46.510 V.O.: Any questions? 
00:20:46.510 --> 00:20:46.540 H.K.: Yeah? 00:20:46.540 --> 00:20:46.560 QUESTION: Hey. 00:20:46.560 --> 00:20:46.590 H.K.: Hi. 00:20:46.590 --> 00:20:47.520 QUESTION: If, for example, if a Ruby code is running 00:20:47.520 --> 00:20:51.530 in the JVM, in JRuby, how does, because none 00:20:51.530 --> 00:20:53.810 of the Ruby code is written in a thread 00:20:53.810 --> 00:20:56.580 safe way. How do, how does it internally manage 00:20:56.580 --> 00:20:58.750 - does it actually, yeah, yesterday Yogi talked about 00:20:58.750 --> 00:21:00.940 the point that ActiveRecord is not actually thread safe. 00:21:00.940 --> 00:21:03.520 Can you explain it in detail like in a 00:21:03.520 --> 00:21:04.460 theoretical way? 00:21:04.460 --> 00:21:06.560 H.K.: OK. What is thread safety in 00:21:06.560 --> 00:21:09.010 general, right? Thread safety is about how the data 00:21:09.010 --> 00:21:13.280 is consistently maintained after multi-threaded access to that shared 00:21:13.280 --> 00:21:17.130 data, right. So Ruby essentially has a GIL because 00:21:17.130 --> 00:21:19.620 internal implementations are not thread safe, right. That's why 00:21:19.620 --> 00:21:22.110 you want to have a GIL to protect you 00:21:22.110 --> 00:21:25.840 from those problems. But as far as JRuby is 00:21:25.840 --> 00:21:29.280 concerned, or Rubinius is concerned, the implementation itself is 00:21:29.280 --> 00:21:31.930 not written in C. JRuby is written in Ruby 00:21:31.930 --> 00:21:34.400 again, I mean JRuby itself, and Rubinius is written 00:21:34.400 --> 00:21:37.660 in Ruby. And some of these actual internal constructs 00:21:37.660 --> 00:21:40.580 are thread safe when compared to MRI. 
I haven't 00:21:40.580 --> 00:21:43.190 actually taken a look in detail into the code 00:21:43.190 --> 00:21:47.520 of these code bases, but if they are implemented 00:21:47.520 --> 00:21:50.000 properly, you can be thread safe - internally, at 00:21:50.000 --> 00:21:53.340 least - so, which means, the base code of 00:21:53.340 --> 00:21:55.720 JRuby itself might be thread safe. It's only not 00:21:55.720 --> 00:21:58.200 thread safe because the gems on top of it, 00:21:58.200 --> 00:22:01.050 which are trying to run. They may have, like, 00:22:01.050 --> 00:22:04.890 thread safety issues, right. Does that answer your question, 00:22:04.890 --> 00:22:05.840 like, or- ? 00:22:05.840 --> 00:22:08.200 QUESTION: About thread safety?? [00:22:09]. 00:22:08.200 --> 00:22:11.720 H.K.: Sure, sure. So those gems will not work. That's 00:22:11.720 --> 00:22:13.840 the point. Like what I want to convey here, 00:22:13.840 --> 00:22:16.910 is whatever gems we are authoring, and whatever code 00:22:16.910 --> 00:22:18.780 we are writing, we might get it - it's 00:22:18.780 --> 00:22:20.240 a good idea to get into the habit of 00:22:20.240 --> 00:22:22.860 writing thread safe code, so that we can actually 00:22:22.860 --> 00:22:25.460 encourage a truly parallel Ruby, right. We don't, we 00:22:25.460 --> 00:22:27.530 don't have to stay in the same paradigm of 00:22:27.530 --> 00:22:31.520 OK we have to be single threaded. 00:22:31.520 --> 00:22:37.010 QUESTION: So Mutex based thread management is one way. 00:22:37.010 --> 00:22:40.060 There's also like actors and futures and things like that. 00:22:40.060 --> 00:22:41.890 And there's a gem called Celluloid- 00:22:41.890 --> 00:22:42.680 H.K.: Yup. 00:22:42.680 --> 00:22:45.040 QUESTION: That, combined with something called Hamster, 00:22:45.040 --> 00:22:46.390 which makes everything immutable- 00:22:46.390 --> 00:22:46.840 H.K.: Yup. 00:22:46.840 --> 00:22:47.960 QUESTION: Is another way to do it. 
00:22:47.960 --> 00:22:48.160 H.K.: Yup. 00:22:48.160 --> 00:22:49.070 QUESTION: Have you done it or like, 00:22:49.070 --> 00:22:49.950 what's your experience with that? 00:22:49.950 --> 00:22:53.130 H.K.: Yeah, I have tried out actors, with revactor, 00:22:53.130 --> 00:22:54.330 and lockless concurrency is 00:22:54.330 --> 00:22:56.830 something I definitely agree is a good idea. But 00:22:56.830 --> 00:23:01.440 I'm specifically talking about, you know, lock-based concurrency, like, 00:23:01.440 --> 00:23:04.530 Mutex-based concurrency. This area is also important because it's 00:23:04.530 --> 00:23:07.960 not like thread mutable state is bad. It is, 00:23:07.960 --> 00:23:10.770 it is actually applicable in certain scenarios. When we 00:23:10.770 --> 00:23:13.360 are working in this particular paradigm, we still need 00:23:13.360 --> 00:23:19.170 the safety of a memory model. Any other questions? 00:23:19.170 --> 00:23:26.170 QUESTION: Thanks for the talk Hari. It was really 00:23:28.200 --> 00:23:28.650 good. 00:23:28.650 --> 00:23:29.550 H.K.: Thanks. 00:23:29.550 --> 00:23:31.140 QUESTION: Is there a way that 00:23:31.140 --> 00:23:35.050 you would recommend to test if you have done 00:23:35.050 --> 00:23:37.850 threading properly or not? I mean, I know, bugs 00:23:37.850 --> 00:23:38.420 that come out- 00:23:38.420 --> 00:23:38.610 H.K.: Right. 00:23:38.610 --> 00:23:38.980 QUESTION: Like I have 00:23:38.980 --> 00:23:41.680 written bugs that come out of badly written, you 00:23:41.680 --> 00:23:43.750 know, not thread safe code, as. 00:23:43.750 --> 00:23:44.510 H.K.: So- 00:23:44.510 --> 00:23:47.190 QUESTION: Like, ?? [00:23:46] so, you catch them. 
00:23:47.190 --> 00:23:51.510 H.K.: At least, my opinion, and a lot of people have 00:23:51.510 --> 00:23:53.960 done research in this area, their opinion also is 00:23:53.960 --> 00:23:57.600 that it's not possible to write tests against multi 00:23:57.600 --> 00:24:00.480 threaded code where there is shared data. Because it's 00:24:00.480 --> 00:24:04.230 nondeterministic and nonrepeatable. The kind of results you get, 00:24:04.230 --> 00:24:06.920 you can only test it against a heuristic. For 00:24:06.920 --> 00:24:09.430 example, if you have a deterministic use case at 00:24:09.430 --> 00:24:11.620 the top level, you can probably test it against 00:24:11.620 --> 00:24:14.490 that. But exact test cases can never be written 00:24:14.490 --> 00:24:16.070 for this. 00:24:16.070 --> 00:24:19.240 V.O.: Any more questions? 00:24:19.240 --> 00:24:26.240 H.K.: Cool. All right. Thank you so much.
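NOTE Editor's sketch, not part of the talk. The last answer suggests that shared-data threading code can only be checked against a heuristic, such as a deterministic top-level result. A minimal illustration of that idea, reusing the counter exercise from the start of the talk (1,000 threads, 100 increments each, expected total 100,000), might look like the following. The names SafeCounter and run_once are hypothetical and chosen only for this example; Mutex and Thread are from Ruby's core library. A passing run does not prove the code is thread safe; it only fails to find a problem.

```ruby
# Hypothetical example: a counter whose increment is guarded by a Mutex.
class SafeCounter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  # Only one thread at a time may run the read-modify-write step.
  def increment
    @lock.synchronize { @value += 1 }
  end
end

# Runs the exercise once and returns the final count.
# The value is read only after every worker thread has been joined.
def run_once(threads: 1000, increments: 100)
  counter = SafeCounter.new
  workers = Array.new(threads) do
    Thread.new { increments.times { counter.increment } }
  end
  workers.each(&:join)
  counter.value
end

# Heuristic check: repeat the run and compare against the one result
# that is deterministic at the top level. Repetition raises the chance
# of exposing a race, but it cannot rule one out.
results = Array.new(3) { run_once }
puts results.all? { |total| total == 100_000 }
```

Removing the synchronize call turns this back into the unguarded version from the opening exercise, which may or may not show lost updates depending on the Ruby implementation and the scheduling of a given run.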