HARI KRISHNAN: So, thank you very much for being here on a Saturday evening, this late. My talk got pushed to the last slot, but I appreciate you being here. My name's Hari. I work at MavenHive. This is a talk about the Ruby memory model. Before I start, how many of you have heard about memory models and know what they are? Show of hands, please. OK. Let's see where this talk goes. So why did I come up with this talk topic? I started my career with Java, and I spent many years with Java, and Java has a very clearly documented memory model. And it kind of gets to you, because even with all that, you don't feel safe enough doing multi-threaded programming at all. With Ruby, we've always been talking about doing multi-process parallelism rather than multi-threaded parallelism, even though the language actually supports multi-threading semantics. Of course we know it's effectively single-threaded and all that, but I just got curious: what is the real memory model behind Ruby? I wanted to figure that out. So this talk is all about my learnings as I went through various literature, trying to distill the gist of the whole thing and cram it into some twenty minutes, so that I can give you a useful session from which you can do further digging on this. When I talk to my friends about memory models, the first thing that comes to their mind is probably this - heap, non-heap, stack, whatever. I'm not gonna talk about that. I'm not gonna talk about this either. It's not about optimizing your memory, or chasing memory leaks, or garbage collection. This talk is not about that either. So what the hell am I gonna talk about? First, a quick exercise. Let's start with this and see where it goes. Simple code. Not much to process late in the day.
There's a shared variable called 'n', and there are a thousand threads over it, and each of those threads wants to increment that shared variable a hundred times. What is the expected output? I'm not gonna quiz you, I'll just give it away. It's 100,000. It's fairly straightforward code; I'm sure all of you have written this, and it's no big deal. So what's the real output? MRI is very faithful - it gives you what you expected: 100,000. What happens next? I'm running it on Rubinius. This is what you see. And it's going to be a different number every time you run it. And that's JRuby. It also gives you a lower number. Some of you may be guessing already, and you probably know why it gives you a lower number. So why all this basic code and a stupid counter here? I just wanted a really basic example to explain the concept that an increment is not a single instruction. The reason I'm talking about this is, I love Ruby because the syntax is so terse, so simple, so readable. But that does not mean every single instruction on the screen is going to be executed in one step. So at least, to my junior self, this is the first advice I would give when I started multi-threaded programming: an increment is at least three steps - load, increment, store - even for a really simple piece of code like n += 1. So this is what we really want to happen: you have a count, you load it, you increment it, you store it. Then the next thread comes along. It loads it, increments it, stores it. You get the result you expect. But we live in a world where threads don't want to be our friends. They do this: one thread comes along, reads it, increments it. The other thread also reads the older value and increments it. And both of them go and save the same value. This is a classic case of a lost update.
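The counter from the slides can be sketched roughly like this (the variable names are my guesses, not the actual slide code):

```ruby
# A sketch of the slide's counter: 1000 threads each increment a shared
# variable 100 times. The `n += 1` is NOT atomic - it is a load, an add,
# and a store - so threads can lose updates on JRuby/Rubinius, and the GIL
# only happens to save you on MRI.
n = 0

threads = 1000.times.map do
  Thread.new do
    100.times do
      n += 1  # load n, add 1, store n - a switch can land between any two steps
    end
  end
end

threads.each(&:join)

puts n  # 100_000 only if no update was lost; JRuby/Rubinius typically print less
```

On which interpreters, and how often, the final count falls short is exactly the nondeterminism the talk is about.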
I'm sure most of you have seen it in the database world. But this happens a lot in the multi-threading world too. But why did it not happen with MRI? Why did you see the right result there? I'm sure a lot of you know, but let's park that question and move a little ahead. So, as you observed earlier, there is a lot of reordering happening in instructions - the threads were context-switching, and statements were being reordered. Where does this reordering happen? Reordering can happen at multiple levels. Start from the top. You have the compiler, which can do simple optimizations ?? [00:05:20] - even those can change the order of the statements in your code. Next, when the code gets translated to machine-level instructions and goes to the cores, your CPU cores are at liberty, again, to reorder them for performance. And next comes the memory system. The memory system is the combined global memory, which all the CPUs can read, plus their individual caches. Why do CPUs have caches? Main memory is slow, so they want to preload values and keep them in the cache, again to improve performance. So even the memory system can conspire against you and reorder the loads and stores on their way to memory. And that can cause reordering too. So this is really, really crazy. I'm a simple programmer who works at the programming-language level. I don't really understand the details of the hardware and things like that. So how do I keep myself abstracted from all this really crazy stuff? That's essentially what a memory model is for. So what is a memory model? "A memory model describes the interactions of threads through memory and their shared use of data." That's straight out of Wikipedia.
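The load/increment/store split described above can be seen directly in MRI's own bytecode. `RubyVM::InstructionSequence` is an MRI-only API, and the exact instruction names vary a little between versions, but roughly you get a getlocal (load), an opt_plus (add), and a setlocal (store) for one `n += 1`:

```ruby
# MRI-only: dump the YARV bytecode behind a bare increment. The single line
# `n += 1` compiles to several separate instructions, and a thread switch
# can happen between them - that is the whole lost-update problem.
code = <<~RUBY
  n = 0
  n += 1
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
# Look for the getlocal / putobject / opt_plus / setlocal sequence in the dump.
```

This is only illustrative of MRI; JRuby and Rubinius compile the same Ruby down to entirely different instruction streams.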
If you just read that, either you're gonna think it's really simple, maybe even obvious, or you might not understand it at all. I was in the second category. So what does it all mean? When there are so many complications with reordering and the reads and writes of memory, as a programmer you need certain guarantees from the programming language, and from the virtual machine you're working on top of, that say: this is how multi-threaded access to shared memory is going to work. These are the basic guarantees, and these are the simple rules of how the system works, so you can reliably write code against that. So, in effect, a memory model is just a specification. Any Java programmers here in the house? Great. How many of you know about JSR 133? The memory model, double-checked locking - OK, some people. The singleton issue? OK, some more hands. Java was the first programming language to come up with a concept called a memory model. Because the first promise was "write once, run anywhere" - it had to be predictable across platforms, across implementations, and so on. So there had to be a JSR which specified the memory model that you can code against, so that your multi-threaded code works predictably and deterministically across platforms and across virtual machines. Right? Essentially, that's where this whole thing started for me. I had gone through the Java memory model and was really happy that someone had taken the pain to write it down in clear terms, so that you don't have to worry as much about multi-threading. Hold on, sorry. Sorry about that. Cool. So. A memory model gives you rules at three broad levels: atomicity, visibility and ordering. Atomicity is as simple as, you know, variable assignment: is a variable assignment an indivisible unit of work, or not?
There are rules around that, and it also covers rules around whether you can assign hashes and arrays indivisibly, and things like that. These rules can change from one language version to another. Next is visibility. In that example we talked about, we saw two threads trying to read the same value. Essentially they are spying on each other. And it was not clear at what point the data had to become visible to each of those threads. Visibility is about exactly that, and it is ensured through memory barriers and through ordering, which is the next thing. Ordering is about how the loads and stores are sequenced. Say you want to write a piece of code - a critical section, as we call it - and you don't want the compiler to do any crazy things to improve performance. So you say: I make it synchronized, and it has to behave in a nice serial manner. That serial manner is ensured by ordering. Ordering is a really complex area - it involves causality, logical clocks and all that. I won't go into those details. But I've been boring you with all these computer science basics - why the hell am I talking about this at a Ruby conference? Ruby is single-threaded anyway; why should I care, right? OK. Do you really think languages like Ruby are thread safe? Show of hands, anyone? By thread safety, I'm talking only about Ruby - maybe Python. GIL-based languages. Are they thread safe? No? In fact, they're not. Having a single-threaded interpreter does not mean it's thread-safe. Threads can switch context, and depending on how the language has been implemented - how often the threads can switch context, and at what points they can switch - things can go wrong. And another pretty popular myth - I don't think many people believe it here, in this audience at least: "I don't have concurrency problems because I'm running on a single core." Not true.
Again, threads can switch context and run on the same core and still have dirty reads and things like that. Concurrency is all about interleavings - again, it goes back to reordering. I think I've been repeating this too often, so let's leave it at that: it's about interleavings. Before we understand more about the memory model and what it has to do with Ruby, let's understand a little bit about threading in Ruby. As all of you know, as of 1.8 there were green threads: there was only one OS thread, which was being multiplexed among multiple Ruby threads, which were scheduled onto it through the Global Interpreter Lock. Then 1.9 comes along, and there is a one-to-one mapping between a Ruby thread and an OS thread, but the Ruby thread still cannot use the OS thread unless it has acquired the Global VM Lock, as it's called now - the GVL. So does having a Global Interpreter Lock make you thread safe? It depends. It does make you thread safe in a way, but let's see. How does the GIL work? This is a very simplistic representation. You have two threads here. One is already holding the GIL, so it's working with the OS thread. Now, when there is another thread waiting on the GIL to do its work, it wakes up the timer thread. The timer thread is, again, another thread. The timer thread now goes and interrupts the thread holding the GIL, and if the thread holding the GIL is done with whatever it's doing - I'll get to that in a bit - it releases the lock, and now thread two can take over and do its thing. Well, this is the basic working of the GIL, at least as I understood it. But there are details to this. It's not as simple as what we saw. When you create a thread in Ruby, you pass it a block of code. So how does that work? You take a block of code, you put it inside the thread.
What the thread does is: it acquires the GVL, it executes the block of code, and on return it releases the lock. Essentially, this is how it works. So during the execution of the block, no other thread is allowed to work. That makes you almost thread safe, right? But not really. If that's how it's going to work, what if that thread hogs the GIL and doesn't allow any other thread to work? There has to be some kind of lock fairness. That's where the timer thread comes in and interrupts it. OK, does that mean the thread holding the GIL immediately gives it up and says, here you go, you can start working? Not really. The thread holding the GIL will only release it if it is at a context-switch boundary. What exactly that is, is fairly complicated, and I don't want to go into the details. People here who know C a lot better than me - the deep C divers, really - can probably tell you at what points the GIL can get released. If C code makes a call into Ruby code, can it or can it not release the GIL? All those questions are there. So all these complexities are really, really hard to deal with. I came across a blog by Jesse Storimer. It's excellent, and I strongly encourage you to go through the two-part blog, "Nobody understands the GIL". It's really, really important if you're trying to do any sort of multi-threaded programming in Ruby. So do you still think Ruby is thread safe because it's got a GIL? I'm talking about MRI, essentially. The thing is, we can't depend on the GIL. The GIL is not documented anywhere: this is exactly how it works, this is when the timer thread wakes up, these are the time slices allotted to the thread acquiring the GVL. There is no documentation about at what points the GIL can or cannot be released, and things like that.
It's not predictable, and if you depend on it, what could also happen is that even within MRI, when you're moving from version to version, if something changes in the GIL, your code will behave nondeterministically. And what about Ruby implementations that don't even have a GIL? That's the big problem, right? If you write a gem or something that has to be multi-threaded, and you're depending on the GIL to keep you safe, then obviously it cannot work on Rubinius and JRuby. And leave that alone - even with MRI, it's not entirely correct to say that you're thread safe just because there is a GIL ensuring that only one thread is running at a time. So what did I find out? Ruby really does not have a documented memory model. It's pretty much similar to Python, which doesn't have a clearly documented memory model either. What is the implication of that? As I mentioned earlier, a memory model is like a specification: this is exactly how the system provides a certain minimum guarantee to the users of the language regarding multi-threaded access to shared memory. Now, basically, if there is no written-down memory model and I am going to write a Ruby implementation, I have the liberty to choose whatever memory model I want. So code you wrote against MRI may not work right on my implementation of Ruby. That's the big implication. So Ruby right now depends on the underlying virtual machines. Even MRI, with YARV, compiles to bytecode, so even MRI is almost like a VM. It has no specification for a memory model, but it does have something internally - you'd have to go through the C code to understand it. And it's not guaranteed to remain the same from version to version, as I understand it. And obviously JRuby and Rubinius depend on the JVM and LLVM respectively, and those have clearly documented memory models.
You could have a read of them. The thing is, if Ruby had a specification for a memory model, it could be implemented using the constructs available on the JVM and LLVM. But this is what we have, and there isn't much to go on. What do we do under the circumstances? We have to engineer our code for thread safety. We can't bask in the safety of "there is a GIL, so it's going to keep my code thread safe, so I can write multi-threaded code without actually worrying about serious synchronization issues." That's totally not the right thing to do. Ruby is a language I love, and I'm sure all of you love, and it's progressing by leaps and bounds, and eventually we're going to write more and more complex systems with Ruby. And who knows, we might have true parallelism very soon. So why stay in the same mental block of not wanting to write thread safe code because it's anyway single-threaded? We might as well get into the mindset of writing proper thread safe code, and try to eventually come up with a memory model. But I think for now we just start engineering code for thread safety. A simple Mutex - I'm sure all of you know it, but it's really, really important even for a trivial operation like an increment. And there are simple things you notice in Ruby code bases, and Rails code bases as well: a section of the code has lots of synchronization and everything - it's really safe - but we leave an innocent accessor lying around, and that causes a lot of pain debugging those issues. And in general, mutating state inside methods is really a bad idea. So if you're looking for issues around multi-threading, these might be good places to start. I just listed a few of them here. I didn't want to make a really dense talk with all the details.
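The Mutex fix for the counter from earlier looks like this - a minimal sketch, with names of my own choosing:

```ruby
# The same counter, made thread safe with a Mutex: each load-increment-store
# now happens inside the lock, so no update can be lost - on any
# implementation, GIL or no GIL.
n = 0
lock = Mutex.new

threads = 1000.times.map do
  Thread.new do
    100.times do
      lock.synchronize { n += 1 }  # the three steps are now one critical section
    end
  end
end

threads.each(&:join)
puts n  # => 100000, deterministically, on MRI, JRuby, and Rubinius alike
```

Mutex#synchronize also acts as a memory barrier on the non-GIL implementations, which is what makes the stored value visible to the next thread that takes the lock.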
You can always catch me offline and I can tell you some of my experiences, and probably even listen to you and learn from you about some of the issues that we can solve by actually writing proper thread safe code in Ruby. I came across a few gems which were really, really nice. Both of them happen to be written by headius. The first one is atomic. Atomic is trying to give you constructs similar to Java's java.util.concurrent package. It's compatible across MRI, JRuby, and Rubinius, which is also a really nice thing. So you have atomic integers and atomic floats, which do increments in an actually atomic way, which is excellent. And then there is the thread_safe library, which has a few thread safe data structures. I'm playing around with these libraries right now, but they may be a good starting point if you are trying to build higher-level constructs for concurrency. And that's pretty much it. I'm open to questions. Thank you. And before anything, I really would like to thank you all again for being here for the talk, and to thank the GCRC organizers - they've done a great job with this conference. A big shout-out to them. V.O.: Any questions? H.K.: Yeah? QUESTION: Hey. H.K.: Hi. QUESTION: If, for example, Ruby code is running on the JVM, in JRuby - because none of the Ruby code is written in a thread safe way, how does it internally manage? Yesterday, Yogi talked about the point that ActiveRecord is not actually thread safe. Can you explain it in detail, like, in a theoretical way? H.K.: OK. What is thread safety in general? Thread safety is about how the data is kept consistent under multi-threaded access to that shared data. So Ruby essentially has a GIL because the internal implementation is not thread safe. That's why you want a GIL to protect you from those problems.
But as far as JRuby or Rubinius is concerned, the implementation itself is not written in C. JRuby is written in Java, and Rubinius is largely written in Ruby itself. And some of those internal constructs are thread safe compared to MRI's. I haven't actually taken a detailed look into those code bases, but if they are implemented properly, they can be thread safe - internally, at least - which means the base code of JRuby itself might be thread safe. It's only not thread safe because of the gems trying to run on top of it; they may have thread safety issues. Does that answer your question, or-? QUESTION: About thread safety?? [00:22:09]. H.K.: Sure, sure. So those gems will not work - that's the point. What I want to convey here is: whatever gems we are authoring, and whatever code we are writing, it's a good idea to get into the habit of writing thread safe code, so that we can actually encourage a truly parallel Ruby. We don't have to stay in the same paradigm of "OK, we have to be single-threaded." QUESTION: So Mutex-based thread management is one way. There are also actors and futures and things like that. And there's a gem called Celluloid- H.K.: Yup. QUESTION: That, combined with something called Hamster, which makes everything immutable- H.K.: Yup. QUESTION: Is another way to do it. H.K.: Yup. QUESTION: Have you done it, or, like, what's your experience with that? H.K.: Yeah, I have tried out actors with Revactor, and lockless concurrency is something I definitely agree is a good idea. But I'm specifically talking about lock-based concurrency, like Mutex-based concurrency. This area is also important, because it's not like shared mutable state is always bad - it is actually applicable in certain scenarios. And when we are working in this particular paradigm, we still need the safety of a memory model. Any other questions?
QUESTION: Thanks for the talk, Hari. It was really good. H.K.: Thanks. QUESTION: Is there a way that you would recommend to test whether you have done threading properly or not? I mean, I know bugs that come out- H.K.: Right. QUESTION: Like, I have written bugs that come out of badly written, not thread safe code. H.K.: So- QUESTION: Like, ?? [00:23:46] so, you catch them. H.K.: At least in my opinion - and a lot of people who have done research in this area share this opinion - it's not possible to write exact tests against multi-threaded code where there is shared data, because it's nondeterministic and nonrepeatable. The kind of results you get, you can only test against a heuristic. For example, if you have a deterministic use case at the top level, you can probably test against that. But exact test cases can never be written for this. V.O.: Any more questions? H.K.: Cool. All right. Thank you so much.
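The heuristic testing described in that last answer - asserting a top-level invariant over many repeated runs, rather than trying to pin down one interleaving - might be sketched like this. SafeCounter is my own illustrative class, not something from the talk:

```ruby
# A heuristic stress test: we cannot enumerate thread interleavings, but we
# can repeat a multi-threaded scenario many times and assert an invariant
# that must hold on every trial, whatever the scheduling was.
class SafeCounter
  def initialize
    @n = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @n += 1 }
  end

  def value
    @lock.synchronize { @n }
  end
end

# Each trial produces a different interleaving; the invariant must survive all.
10.times do
  counter = SafeCounter.new
  threads = 50.times.map { Thread.new { 200.times { counter.increment } } }
  threads.each(&:join)
  raise "invariant violated: #{counter.value}" unless counter.value == 50 * 200
end

puts "all trials passed"
```

A passing run doesn't prove the code is thread safe - it only fails to find a counterexample, which is exactly the limitation the answer describes.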