Garden City Ruby 2014 - Native Extensions Served 3 Ways by Tejas Dinkar

  • 0:00 - 0:00
    Yeah, so, hi everyone.
  • 0:00 - 0:00
    Like Swanand said, I'm also part of the team
    that's organized this,
  • 0:00 - 0:01
    so I want to just say it's really awesome
    to see, like,
  • 0:01 - 0:01
    this huge a crowd on a Friday morning in Bangalore.
  • 0:01 - 0:01
    So, what I'm speaking about today is native
    extensions in Ruby.
  • 0:01 - 0:01
    The title of my talk is Native Extensions
    Served 3 Ways.
  • 0:01 - 0:01
    About myself: I'm Tejas, Tejas Dinkar.
  • 0:02 - 0:02
    I am a partner at Nilenso software, which
    is an employee-owned
  • 0:02 - 0:02
    collective and it's a really fun place to
    work.
  • 0:02 - 0:02
    If you want to know more, catch me tomorrow.
  • 0:02 - 0:02
    I'm on Twitter. I am tdinkar, and on GitHub
    I am GJA.
  • 0:02 - 0:02
    You can find most of my open source contributions
    over there.
  • 0:02 - 0:03
    So, about my talk, this is actually a pretty
    technical talk,
  • 0:03 - 0:03
    so expect to see lots of code. I hope to have
    five minutes for questions.
  • 0:03 - 0:03
    I'll try to beat ?? and beg for it (?? @ 00:01:18).
  • 0:03 - 0:03
    But if I don't have time, just catch me in
    the hallway
  • 0:03 - 0:03
    and I can try to answer whatever I can.
  • 0:04 - 0:04
    So I'll mostly be covering C extensions, FFI
    and Swig.
  • 0:04 - 0:04
    Let's talk about first why you would ever
    want to build
  • 0:04 - 0:04
    a native extension for Ruby. There's a bunch
    of different reasons.
  • 0:04 - 0:04
    Number one is to maybe integrate with new
    libraries.
  • 0:04 - 0:04
    Say like a new database has come out, like,
    for example LibDrizzle,
  • 0:04 - 0:05
    which is a new library that came out to work
    with
  • 0:05 - 0:05
    MySQL and the Drizzle database.
  • 0:05 - 0:05
    You might want to port that over to Ruby.
  • 0:05 - 0:05
    You might want to improve performance of critical
    code.
  • 0:05 - 0:05
    There's different ways of doing this, of course.
  • 0:06 - 0:06
    You could try doing JRuby, look at different
    caches,
  • 0:06 - 0:06
    but sometimes you have an algorithm that
    you
  • 0:06 - 0:06
    just want to implement in native code,
  • 0:06 - 0:06
    or there's already a great library that implements
    it.
  • 0:06 - 0:06
    Someone has given me an example of CSAT,
  • 0:06 - 0:07
    which I think Bundler could have actually
    used,
  • 0:07 - 0:07
    or could use, to kind of resolve gem dependencies,
  • 0:07 - 0:07
    and that's written in C++.
  • 0:07 - 0:07
    Sometimes you wanna just move that there to
  • 0:07 - 0:07
    improve performance, and of course you want
    to
  • 0:08 - 0:08
    write code that works across different
    languages,
  • 0:08 - 0:08
    and in general it's a lot of fun.
  • 0:08 - 0:08
    You know like, real hackers program in C
  • 0:08 - 0:08
    and all that stuff.
  • 0:08 - 0:08
    So you could just feel super elite by doing
    this.
  • 0:08 - 0:09
    So before we talk about native extensions,
  • 0:09 - 0:09
    I'm just gonna take a small segue and ask-
  • 0:09 - 0:09
    So let's talk about Python a little bit.
  • 0:09 - 0:09
    Like how many people in the house are like,
    are Pythonistas?
  • 0:09 - 0:09
    How many big Python fans?
  • 0:10 - 0:10
    Well, that's actually surprisingly small.
  • 0:10 - 0:10
    I thought there would be more.
  • 0:10 - 0:10
    So I've figured out the best way to write
    Python code,
  • 0:10 - 0:10
    and I'm gonna tell you this right now and-
  • 0:10 - 0:10
    Yes, of course, I am trolling you.
  • 0:10 - 0:11
    So over here I have a Python interpreter open
  • 0:11 - 0:11
    and I'm gonna say import Ruby.
  • 0:11 - 0:11
    Yeah, that seems innocent enough.
  • 0:11 - 0:11
    I'm gonna say Ruby dot eval foobar dot size.
  • 0:11 - 0:11
    Hmm, that seems to return six.
  • 0:12 - 0:12
    So that seems to work. Let's try something
    more complicated.
  • 0:12 - 0:12
    I'm gonna define a method, factorial, and,
    def factorial,
  • 0:12 - 0:12
    partial, N equals zero, partial, factorial,
    N minus one,
  • 0:12 - 0:12
    et cetera.
  • 0:12 - 0:12
    Yes it's still recursive, I know.
  • 0:12 - 0:13
    And finally I'm gonna call it,
  • 0:13 - 0:13
    and I'm gonna say Ruby dot eval factorial
    five.
  • 0:13 - 0:13
    Yet again you'll see that I actually get the
    result, right.
  • 0:13 - 0:13
    So wow, what's this? Have I implemented Python
    in Ruby?
  • 0:13 - 0:13
    Let's look at the code which actually makes
    this happen.
  • 0:14 - 0:14
    How can I actually make this work?
  • 0:14 - 0:14
    Well, surprisingly, it's about a dozen lines
    of code.
  • 0:14 - 0:14
    I know this might be a bit hard to read from
    the back,
  • 0:14 - 0:14
    so I'm just gonna read out the interesting
    bits line by line.
  • 0:14 - 0:14
    So it's about a dozen lines of code,
  • 0:14 - 0:15
    but let's remove all the Python stuff and
    let's just
  • 0:15 - 0:15
    look at the Ruby parts of this. It's even
    a lot less,
  • 0:15 - 0:15
    yeah, so let's go through this one by one.
  • 0:15 - 0:15
    I start off by including Ruby dot H.
  • 0:15 - 0:15
    Ruby dot H, most of you who are familiar
  • 0:16 - 0:16
    with C and C++ know, dot H files are header files.
  • 0:16 - 0:16
    They basically have the definitions of various
  • 0:16 - 0:16
    constructs that Ruby or, like, whatever,
  • 0:16 - 0:16
    that your library exposes, so that you can,
  • 0:16 - 0:16
    so that your compiler knows what definitions
    exist in the first place.
  • 0:16 - 0:17
    So any Ruby extension that you write,
  • 0:17 - 0:17
    you'll need to include Ruby dot H,
  • 0:17 - 0:17
    and most of the time this is completely sufficient.
  • 0:17 - 0:17
    You don't need to include anything else.
  • 0:17 - 0:17
    So let's start looking at the actual code.
  • 0:18 - 0:18
    I have a method called Python Ruby eval.
  • 0:18 - 0:18
    Yeah, and it accepts one parameter,
  • 0:18 - 0:18
    which is a string. OK, nothing complicated
    here.
  • 0:18 - 0:18
    What I do next is I take that string and
  • 0:18 - 0:18
    I pass it to a function called R-B eval string,
  • 0:18 - 0:19
    right, which stands for Ruby eval string.
  • 0:19 - 0:19
    Nothing really magical over here.
  • 0:19 - 0:19
    All I'm doing is I'm calling Ruby's eval,
  • 0:19 - 0:19
    and you'll notice that the R-B eval returns
    an object of the type value.
  • 0:19 - 0:19
    This value is actually very important.
  • 0:20 - 0:20
    The same way in Ruby every single object inherits
    from object,
  • 0:20 - 0:20
    the corresponding construct in the CRuby extensions
    is the value object.
  • 0:20 - 0:20
    Value object is used to represent every single
    Ruby object
  • 0:20 - 0:20
    from nil to true to false to every single
    custom object.
  • 0:20 - 0:20
    Fine, so I have this value object. What am
    I gonna do next?
  • 0:20 - 0:21
    Well, what I'm gonna do is I'm gonna
  • 0:21 - 0:21
    switch on the type of the object.
  • 0:21 - 0:21
    Yeah, there's a macro called type, and if
    it's a fix num,
  • 0:21 - 0:21
    fix nums are numbers in Ruby,
  • 0:21 - 0:21
    I'm gonna call fix num to int,
  • 0:22 - 0:22
    to just convert that to an integer.
  • 0:22 - 0:22
    If it's a string I'm gonna call string value
    pointer,
  • 0:22 - 0:22
    which converts from a Ruby string into a C
    string.
  • 0:22 - 0:22
    And if it's some other type, like if it's
    an array, yeah,
  • 0:22 - 0:22
    I could have implemented this,
  • 0:22 - 0:23
    but yeah I'm too lazy right now,
  • 0:23 - 0:23
    so I'm just gonna say return none,
  • 0:23 - 0:23
    which is the Python version of nil.
  • 0:23 - 0:23
    So yeah that's pretty much it.
  • 0:23 - 0:23
    That's all it really took to kind of build
  • 0:24 - 0:24
    a small kind of Ruby module which would extend
  • 0:24 - 0:24
    some of the functionality of Ruby into Python.
  • 0:24 - 0:24
    Let's look at something that's like another
    construct.
  • 0:24 - 0:24
    I don't want to build all my code
  • 0:24 - 0:24
    and I don't want to eval all of it.
  • 0:24 - 0:25
    Let's say I want to require a file.
  • 0:25 - 0:25
    It's pretty much the same.
  • 0:25 - 0:25
    All I need to do is, I accept the file
  • 0:25 - 0:25
    and I do R-B require on that file, yeah,
  • 0:25 - 0:25
    so in general, yay!
  • 0:26 - 0:26
    Actually in that twelve lines of code
  • 0:26 - 0:26
    you really have built your first Ruby extension
  • 0:26 - 0:26
    and your first Python extension.
  • 0:26 - 0:26
    So what I'm really trying to call out,
  • 0:26 - 0:26
    is it really is very simple, like,
  • 0:26 - 0:27
    as Ruby developers we always have a lot of
    fears,
  • 0:27 - 0:27
    like, oh this very simple thing in Ruby.
  • 0:27 - 0:27
    How could I even do it in a C extension?
  • 0:27 - 0:27
    It turns out that the Ruby C extensions are
    great,
  • 0:27 - 0:27
    because they expose almost everything
  • 0:28 - 0:28
    you would ever want to do in Ruby,
  • 0:28 - 0:28
    it exposes the same thing in C.
  • 0:28 - 0:28
    So let's look at, what are,
  • 0:28 - 0:28
    why aren't there more,
  • 0:28 - 0:28
    why aren't we doing it more?
  • 0:28 - 0:29
    Well, the biggest common fear every time
  • 0:29 - 0:29
    somebody mentions C to a Ruby or in general
  • 0:29 - 0:29
    high-level programmer is, what about memory allocation?
  • 0:29 - 0:29
    Like how do I handle this?
  • 0:29 - 0:29
    Well, as it turns out, it's really not as
    difficult
  • 0:30 - 0:30
    as you might think, and since you are still
    programming
  • 0:30 - 0:30
    in the Ruby world, you actually have
  • 0:30 - 0:30
    a lot of things that can actually help you.
  • 0:30 - 0:30
    In particular there are two, there are these
    two macros, right.
  • 0:30 - 0:30
    The first one basically takes a C pointer
    and stuffs it inside a Ruby object.
  • 0:30 - 0:31
    You just tell it which class you want that
    Ruby object
  • 0:31 - 0:31
    to be in and it will magically be created
    with your pointer.
  • 0:31 - 0:31
    And the last, the second one gets the pointer.
  • 0:31 - 0:31
    What actually happens internally is that your
    memory
  • 0:31 - 0:31
    that's allocated has been tied to this Ruby
    object,
  • 0:32 - 0:32
    and when this Ruby object gets garbage collected,
  • 0:32 - 0:32
    so does your pointer.
  • 0:32 - 0:32
    So in many ways you're basically just re-using
  • 0:32 - 0:32
    the Ruby's GC to build a, you know,
  • 0:32 - 0:32
    to manage your native code as well.
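
The idea of tying native memory to a Ruby object can also be seen from the Ruby side. The talk does not name the two macros (in the C API this role is commonly played by `Data_Wrap_Struct` and `Data_Get_Struct`, which is an assumption here), so the following is only an analogous sketch using Fiddle from the standard library rather than the C macros themselves. The buffer size and contents are made up for illustration.

```ruby
require 'fiddle'

# Allocate 16 bytes of native memory. Passing Fiddle::RUBY_FREE as the
# free function ties the buffer to the Ruby object: when `buffer` is
# garbage collected, the native memory is released along with it.
buffer = Fiddle::Pointer.malloc(16, Fiddle::RUBY_FREE)

# Write into the native memory, then read it back through the Ruby wrapper.
buffer[0, 6] = "hello\0"
greeting = buffer[0, 5]

puts greeting
```

Nothing in this sketch calls `free` by hand, which is the point the talk is making: Ruby's GC does the bookkeeping for the native allocation.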
  • 0:32 - 0:33
    Right, no batteries included is the next big
    fear.
  • 0:33 - 0:33
    But just keep in mind that since you are
  • 0:33 - 0:33
    programming in the Ruby extension,
  • 0:33 - 0:33
    in Ruby extensions with C,
  • 0:33 - 0:33
    you actually have access to every single
  • 0:34 - 0:34
    basic functionality that Ruby can provide
    you.
  • 0:34 - 0:34
    There are methods to manipulate arrays,
  • 0:34 - 0:34
    strings, hashes, you name it.
  • 0:34 - 0:34
    It's all very easy to manipulate even in the
    C extension.
  • 0:34 - 0:34
    And of course portability.
  • 0:34 - 0:35
    I have no idea what this comic is about,
  • 0:35 - 0:35
    it was on Geek, but it was the first thing
  • 0:35 - 0:35
    that I found for portability.
  • 0:35 - 0:35
    So most C extensions work only in MRI,
  • 0:35 - 0:35
    except they sort of work in Rubinius.
  • 0:36 - 0:36
    Like Rubinius has tried to make the sort of
    API compatible,
  • 0:36 - 0:36
    but it sometimes works, it sometimes doesn't.
  • 0:36 - 0:36
    So basically all you can trust if you're writing
  • 0:36 - 0:36
    against the C API is that your gem is gonna work
    in MRI.
  • 0:36 - 0:36
    So what about, what if you do want to-
  • 0:36 - 0:37
    OK, so, the last concern I always see is,
  • 0:37 - 0:37
    how do I even get started?
  • 0:37 - 0:37
    What are the best practices,
  • 0:37 - 0:37
    how do I build a C extension for this?
  • 0:37 - 0:37
    I've always found that the Ruby source code
  • 0:38 - 0:38
    itself is probably the best documentation
  • 0:38 - 0:38
    for how to build a C extension.
  • 0:38 - 0:38
    It's actually very simple,
  • 0:38 - 0:38
    very easy to understand, very nice to read.
  • 0:38 - 0:38
    So over here I've actually,
  • 0:38 - 0:39
    I'm actually showing you string dot C,
  • 0:39 - 0:39
    and I'm gonna walk through a few lines of
    this code now, OK.
  • 0:39 - 0:39
    So the first line is a method called init
    string, right.
  • 0:39 - 0:39
    This is the equivalent of main for your Ruby
    extension.
  • 0:39 - 0:39
    Whenever your gem is required it is gonna
    call this function.
  • 0:40 - 0:40
    So if there was a gem called string,
  • 0:40 - 0:40
    and I said require string,
  • 0:40 - 0:40
    it would call this method init underscore
    string, yeah.
  • 0:40 - 0:40
    The first thing I'm gonna do over there
  • 0:40 - 0:40
    is I'm gonna say R-B define class.
  • 0:40 - 0:41
    I'm gonna define a class called string,
  • 0:41 - 0:41
    which inherits from object, right.
  • 0:41 - 0:41
    That's exactly equivalent to class string,
  • 0:41 - 0:41
    less than sign object. Nothing complicated
    there.
  • 0:41 - 0:41
    I'm storing this in a variable called
  • 0:42 - 0:42
    R-B underscore C string, yeah,
  • 0:42 - 0:42
    and then I'm gonna define a method
  • 0:42 - 0:42
    on R-B underscore C string called E-Q-L question
    mark, right.
  • 0:42 - 0:42
    What I'm gonna tell it is that this method,
  • 0:42 - 0:42
    when somebody calls E-Q-L question mark on
    any string,
  • 0:42 - 0:43
    call this C function, which is
  • 0:43 - 0:43
    R-B underscore S-T-R underscore equal, right.
  • 0:43 - 0:43
    Still nothing complicated over there.
  • 0:43 - 0:43
    And the last thing says that I expect one
  • 0:43 - 0:43
    extra parameter to be there.
  • 0:44 - 0:44
    Self is always passed, but I want one extra
    parameter.
  • 0:44 - 0:44
    So those are the four simple parameters to
    this.
  • 0:44 - 0:44
    There is your class name, your function name,
  • 0:44 - 0:44
    the C function to call, and the number of
    parameters.
  • 0:44 - 0:44
    Still nothing really complicated.
  • 0:44 - 0:45
    Let's look at the actual implementation of
    the function.
  • 0:45 - 0:45
    Really simple. If self is equal to S-T-R
    2, return true.
  • 0:45 - 0:45
    Yes, they're the same object, because the
    two of them
  • 0:45 - 0:45
    are the same object ID. They have the same
    object ID.
  • 0:45 - 0:45
    They're actually the same object.
  • 0:46 - 0:46
    Similarly, the second one is not a string.
  • 0:46 - 0:46
    Return false, simple enough.
  • 0:46 - 0:46
    And the last line kind of delegates
  • 0:46 - 0:46
    to the old Ruby equal,
  • 0:46 - 0:46
    which will do the algorithm most of us learned
  • 0:46 - 0:47
    in high school, where you compile,
  • 0:47 - 0:47
    compare a string by a string, character by
    character
  • 0:47 - 0:47
    to figure out, are these two strings equal?
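
The walkthrough above can be mirrored in plain Ruby. This is a sketch of the steps as described in the talk, not the actual contents of string.c; `SketchString` is a hypothetical stand-in so that the real `String` class is left alone.

```ruby
# Hypothetical stand-in class. In the C version the class is created with
# rb_define_class and inherits from Object in the same way.
class SketchString < Object
  attr_reader :bytes

  def initialize(text)
    @bytes = text.bytes
  end

  # Takes one argument besides self, matching the "1" passed as the
  # last parameter when the method is defined from C.
  def eql?(other)
    return true if equal?(other)                   # same object, same object ID
    return false unless other.is_a?(SketchString)  # the second one is not a string
    return false unless bytes.size == other.bytes.size

    # Otherwise compare character by character.
    bytes.each_with_index.all? { |byte, index| byte == other.bytes[index] }
  end
end

first = SketchString.new("ruby")
puts first.eql?(SketchString.new("ruby"))
puts first.eql?("ruby")
```

The early returns are the cheap checks; only when both fail does the comparison walk the contents.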
  • 0:47 - 0:47
    So as you can see there's really nothing
  • 0:47 - 0:47
    very complicated in building a C extension.
  • 0:48 - 0:48
    And most of the time your architecture sort
    of looks like this.
  • 0:48 - 0:48
    You have the Ruby, you have the native code
  • 0:48 - 0:48
    on the left. This is the code you kind of
    want to run,
  • 0:48 - 0:48
    and you have Ruby code on the right,
  • 0:48 - 0:48
    and this is the code that you want to consume,
  • 0:48 - 0:49
    that code that you've somehow built.
  • 0:49 - 0:49
    In between, in purple, is a Ruby-aware native
    code.
  • 0:49 - 0:49
    And why do I say Ruby-aware native code?
  • 0:49 - 0:49
    Because you've still written this as native
    code.
  • 0:49 - 0:49
    It's still written in C.
  • 0:50 - 0:50
    It's still compiled down to a dot S-O file
    or a dot Dylib file on Mac.
  • 0:50 - 0:50
    But it's Ruby-aware.
  • 0:50 - 0:50
    It knows how things work in Ruby.
  • 0:50 - 0:50
    Compared to this FFI kind of does the opposite.
  • 0:50 - 0:50
    Instead of Ruby-aware native code,
  • 0:50 - 0:51
    what you have is native-aware Ruby code, right.
  • 0:51 - 0:51
    So what this means is with FFI you're basically
  • 0:51 - 0:51
    working purely in Ruby, which somehow understands
  • 0:51 - 0:51
    how the native architecture of the system-
  • 0:51 - 0:51
    OK, so FFI is a Ruby D-S-L.
  • 0:52 - 0:52
    It's really easy to implement.
  • 0:52 - 0:52
    It's even easier than MRI, like than the C
    extension.
  • 0:52 - 0:52
    It actually works across all Ruby implementations.
  • 0:52 - 0:52
    I would actually say that all the Ruby implementors
  • 0:52 - 0:52
    got together one day and said, how can we
    make something
  • 0:52 - 0:53
    that'll make it easy for us to integrate with
    libraries,
  • 0:53 - 0:53
    so it works on JRuby, it works on MRI,
  • 0:53 - 0:53
    it even works on MacRuby, MagLev, Rubinius,
  • 0:53 - 0:53
    you name it, it works.
  • 0:53 - 0:53
    And it basically converts to and from C primitives
    for you directly.
  • 0:54 - 0:54
    So let's look at an example.
  • 0:54 - 0:54
    I'm just taking the example straight out of
    GitHub.
  • 0:54 - 0:54
    This one's not complicated at all. All I'm
    doing is I'm saying require FFI.
  • 0:54 - 0:54
    I'm saying this is an FFI library,
  • 0:54 - 0:54
    my module is FFI library,
  • 0:54 - 0:55
    and I'm saying it attaches to lib C.
  • 0:55 - 0:55
    And by lib C that doesn't imply
  • 0:55 - 0:55
    that the library is written in C.
  • 0:55 - 0:55
    What that means is this is the C standard
  • 0:55 - 0:55
    library that you want to connect to.
  • 0:56 - 0:56
    And I'm creating a method called puts and
  • 0:56 - 0:56
    I'm saying puts takes one argument - it's
    a string,
  • 0:56 - 0:56
    and it returns one argument, which is a,
  • 0:56 - 0:56
    one value which is an integer.
  • 0:56 - 0:56
    Really nothing complicated over there.
  • 0:56 - 0:57
    This creates a static class method,
  • 0:57 - 0:57
    or a module method on this module,
  • 0:57 - 0:57
    so I can say my lib dot puts, hello world,
    using lib C.
  • 0:57 - 0:57
    It's very, very easy to attach to a function
    using FFI.
  • 0:57 - 0:57
    And let's quickly look at another example.
  • 0:58 - 0:58
    This one I'm attaching pow, which takes two
    doubles
  • 0:58 - 0:58
    and returns a double, and you can see it in
    action over there.
  • 0:58 - 0:58
    It works.
  • 0:58 - 0:58
    And here I'm attaching to lib dot M, which
    is the math library.
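
The slides use the ffi gem, which is a third-party dependency. As a sketch that runs on a stock Ruby, the same `pow` binding is shown here with `Fiddle::Importer` from the standard library instead; this is a swapped-in technique, not the code from the slide. The module name `MathLib` is hypothetical, and the sketch assumes Linux or macOS, where the running Ruby process already has the math library loaded.

```ruby
require 'fiddle/import'

module MathLib
  extend Fiddle::Importer

  # Assumption: a handle to the current process is enough to find pow,
  # because the Ruby interpreter itself links against the math library.
  dlload Fiddle.dlopen(nil)

  # pow takes two doubles and returns a double, as described in the talk.
  extern 'double pow(double, double)'
end

result = MathLib.pow(2.0, 10.0)
puts result
```

As in the talk's example, the binding becomes a module method, and the conversion between Ruby floats and C doubles happens automatically.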
  • 0:58 - 0:58
    So actually FFI supports a lot of built-in
    types.
  • 0:58 - 0:59
    It supports integers, characters,
  • 0:59 - 0:59
    and for everything else,
  • 0:59 - 0:59
    every single pointer type that you would use
    in C,
  • 0:59 - 0:59
    it supports, well, a pointer.
  • 0:59 - 0:59
    So in general FFI is probably your best solution
    for everything.
  • 1:00 - 1:00
    If you're trying to build a new gem and
  • 1:00 - 1:00
    you want people to use your gem, and
  • 1:00 - 1:00
    you're not just doing it for fun,
  • 1:00 - 1:00
    you probably want to build it using FFI.
  • 1:00 - 1:00
    It also lets you do your modeling in Ruby,
  • 1:00 - 1:01
    which means the deployment's also a little
    bit easier.
  • 1:01 - 1:01
    You don't need to struggle with Make files
  • 1:01 - 1:01
    and other stuff to kind of build that extension.
  • 1:01 - 1:01
    Unfortunately one piece of misinformation
    that
  • 1:01 - 1:01
    seems to be out there is that FFI,
  • 1:02 - 1:02
    if you build it with FFI,
  • 1:02 - 1:02
    you do not need to worry about garbage collection.
  • 1:02 - 1:02
    I'll show you an example in the next slide,
  • 1:02 - 1:02
    and unfortunately with FFI there is no
  • 1:02 - 1:02
    C++ support without wrapping.
  • 1:02 - 1:03
    So you could see over here that these
  • 1:03 - 1:03
    functions that we attached to are all static
    functions in C.
  • 1:03 - 1:03
    They kind of are not attached to any object.
  • 1:03 - 1:03
    They take fixed number of parameters so
  • 1:03 - 1:03
    that's not possible to wrap C++ functions
    directly.
  • 1:04 - 1:04
    You could write a thin shim, which kind of
  • 1:04 - 1:04
    takes static functions which you can use to
    call your C++ functions.
  • 1:04 - 1:04
    But it still starts getting to be more effort
    and you need to write that in C or C++.
  • 1:04 - 1:04
    So you do still have to worry about the garbage
    collection,
  • 1:04 - 1:04
    however, with FFI, and I'll show you a quick
    example where it matters.
  • 1:04 - 1:05
    Most people will write code that looks
  • 1:05 - 1:05
    something like this and not worry about it.
  • 1:05 - 1:05
    So I have a def run query which will crash,
  • 1:05 - 1:05
    and say D-B connection is equal to my
  • 1:05 - 1:05
    FFI module database connection local host,
  • 1:06 - 1:06
    and my FFI module database query I'm passing
    is a D-B connection
  • 1:06 - 1:06
    and I'm saying select star from users.
  • 1:06 - 1:06
    This will probably work most of the time.
  • 1:06 - 1:06
    But in reality that D-B connection will eventually
    get GC'd.
  • 1:06 - 1:06
    And internally in C your cursor will probably
    hold
  • 1:06 - 1:07
    a pointer to your D-B connection,
  • 1:07 - 1:07
    even though this has not been exposed to you
    via the API.
  • 1:07 - 1:07
    So when your D-B connection gets GC'd,
  • 1:07 - 1:07
    or like when the Ruby object gets GC'd,
  • 1:07 - 1:07
    the pointer is gonna get GC'd in memory,
  • 1:08 - 1:08
    and then when your cursor tries to access
    the D-B connection,
  • 1:08 - 1:08
    it will crash.
  • 1:08 - 1:08
    Yeah, so the standard pattern for solving
    something
  • 1:08 - 1:08
    like this is to make sure that these
  • 1:08 - 1:08
    two objects are aware of each other in the
    Ruby world.
  • 1:08 - 1:09
    The most, in general what I've seen happen
    a lot
  • 1:09 - 1:09
    is you save the database cursor and you kind
    of just say,
  • 1:09 - 1:09
    cursor dot D-B connection is equal to the
    other connection,
  • 1:09 - 1:09
    so that this has a pointer in Ruby as well
  • 1:09 - 1:09
    which corresponds to the C pointer.
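
The pattern described here can be sketched in plain Ruby. The class and method names below are hypothetical; in real code each wrapper would hold a native pointer obtained through FFI, which is left out so the sketch runs on its own.

```ruby
# Hypothetical wrapper around a native database connection.
class Connection
  attr_reader :host

  def initialize(host)
    @host = host
  end
end

# Hypothetical wrapper around a native cursor. The native cursor would
# point at the native connection, so the Ruby cursor keeps a Ruby
# reference to the connection wrapper as well.
class Cursor
  attr_reader :connection, :sql

  def initialize(connection, sql)
    @connection = connection # keeps the connection reachable for the GC
    @sql = sql
  end
end

def run_query(host, sql)
  connection = Connection.new(host)
  # Without the reference stored in Cursor, `connection` would become
  # unreachable as soon as this method returns.
  Cursor.new(connection, sql)
end

cursor = run_query("localhost", "SELECT * FROM users")
GC.start
puts cursor.connection.host
```

As long as the cursor is alive, the GC can still reach the connection through it, so the native side never ends up holding a dangling pointer.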
  • 1:10 - 1:10
    So it's not as if you can just blindly
  • 1:10 - 1:10
    take the library and, just looking at the
    APIs, do this.
  • 1:10 - 1:10
    Although, granted, with the very primitive
    types,
  • 1:10 - 1:10
    when you're looking at things on the left
    side,
  • 1:10 - 1:10
    the characters, the strings, you're less likely
    to fall,
  • 1:10 - 1:11
    face big memory problems.
  • 1:11 - 1:11
    So that's mostly all I'm gonna speak about
    FFI.
  • 1:11 - 1:11
    If you look at the progression we've made,
  • 1:11 - 1:11
    we've kind of, we've started with gems that
    work in MRI.
  • 1:11 - 1:11
    We've moved onto gems that work in all Ruby
    and (00:18:15 - ??),
  • 1:12 - 1:12
    now let's talk about gems that work on all
    programming languages.
  • 1:12 - 1:12
    And soon we'll talk about taking over the
    world.
  • 1:12 - 1:12
    So SWIG is sort of the answer to that.
  • 1:12 - 1:12
    SWIG stands for the simplified wrapper and
    interface generator -
  • 1:12 - 1:12
    which is a big tongue-twister.
  • 1:12 - 1:13
    Basically what SWIG does is it lets you
  • 1:13 - 1:13
    annotate your C and C++ header files.
  • 1:13 - 1:13
    The architecture is sort of like this -
  • 1:13 - 1:13
    there's native code over there, there's some
    magic in between,
  • 1:13 - 1:13
    and then magically you get Ruby code and Python
    code out of it.
  • 1:14 - 1:14
    Let's look a little bit about how this magic
    works.
  • 1:14 - 1:14
    So FFI, sorry, SWIG works off an interface
    file.
  • 1:14 - 1:14
    What it basically is is it's an annotated
    header file
  • 1:14 - 1:14
    and it auto-generates code to make it work
    in your various languages.
  • 1:14 - 1:14
    And how it auto-generates that code depends
    on every single language.
  • 1:14 - 1:15
    So for Ruby it builds a C extension.
  • 1:15 - 1:15
    So maybe it won't work in JRuby.
  • 1:15 - 1:15
    But for Ruby it actually generates C code
    which will call your library.
  • 1:15 - 1:15
    For Python it's actually a C and a dot py
    file.
  • 1:15 - 1:15
    For Java it builds a JNI interface.
  • 1:16 - 1:16
    And of course you still will have the same
    GC problem
  • 1:16 - 1:16
    that you had while we were discussing FFI.
  • 1:16 - 1:16
    But in general SWIG actually works pretty
    well.
  • 1:16 - 1:16
    There are a couple of Ruby gems out there
  • 1:16 - 1:16
    that are built using SWIG.
  • 1:16 - 1:17
    I've seen it actually used in practice for
  • 1:17 - 1:17
    like a large company which had an algorithm
  • 1:17 - 1:17
    it wanted to share across different programming
    languages.
  • 1:17 - 1:17
    They had a Python, a Java, and a Ruby sort
    of front-end,
  • 1:17 - 1:17
    so what they did was build their code in C++,
  • 1:18 - 1:18
    exposed it via SWIG and were able to use it
    in
  • 1:18 - 1:18
    all these three different languages. It's
    really simple.
  • 1:18 - 1:18
    So over here we have a class called rectangle
  • 1:18 - 1:18
    which has a length and a breadth
  • 1:18 - 1:18
    and a constructor and an int called area -
  • 1:18 - 1:19
    and I'm sure most of you would know the implementation
    of this.
  • 1:19 - 1:19
    All I need to do is add some junk at the top
    and the bottom and yeah,
  • 1:19 - 1:19
    no, that's it, and magically it will kind
    of work.
  • 1:19 - 1:19
    So once I've said this is a module called shapes,
  • 1:19 - 1:19
    that translates to Ruby directly to a name
    space,
  • 1:20 - 1:20
    I just need to say require shapes.
  • 1:20 - 1:20
    Rectangle equals shapes dot rectangle dot
    new, and so on and so forth.
  • 1:20 - 1:20
    So with SWIG it's very easy to quickly kind
  • 1:20 - 1:20
    of generate interfaces across multiple different
  • 1:20 - 1:20
    languages really fast, and you can do this if
    you are,
  • 1:20 - 1:21
    especially if you're the maintainer of the
    actual native library.
  • 1:21 - 1:21
    So if you are the maintainer of the
  • 1:21 - 1:21
    actual native library I would recommend going
    with SWIG.
  • 1:21 - 1:21
    There are other options as well.
  • 1:21 - 1:21
    Ruby has no shortage of ways to include native
    things.
  • 1:22 - 1:22
    I think an old one which has been around
  • 1:22 - 1:22
    for a long time is dynamic load that is basically
  • 1:22 - 1:22
    the port of C's DL open into Ruby. It's been
    around since forever,
  • 1:22 - 1:22
    but I've heard a lot of reports of it
  • 1:22 - 1:22
    being really buggy and in general I think
    both that and Fiddle,
  • 1:22 - 1:23
    which is now - Fiddle is actually coming in
    Ruby 2 or Ruby 2.1 -
  • 1:23 - 1:23
    that is again another way of introducing native
    libraries.
  • 1:23 - 1:23
    Both of these kind of work in concept
  • 1:23 - 1:23
    very similar to how FFI works,
  • 1:23 - 1:23
    so I'm not gonna spend a lot of time covering
    them.
  • 1:24 - 1:24
    I think Fiddle may start becoming
  • 1:24 - 1:24
    popular soon as more and more people start
    using Ruby 2.
  • 1:24 - 1:24
    I don't know if FFI and Fiddle are someday
  • 1:24 - 1:24
    going to start merging together,
  • 1:24 - 1:24
    but in general these are the other two options.
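
For a sense of what the lower-level Fiddle API looks like, here is a minimal sketch that binds the C `strlen` function by hand. It assumes Linux or macOS, where opening a handle to the current process is enough to find C standard library symbols; the variable names are illustrative.

```ruby
require 'fiddle'

# A handle to the current process, which already has the C library loaded.
process = Fiddle.dlopen(nil)

# Describe the function by hand: its address, argument types, return type.
strlen = Fiddle::Function.new(
  process['strlen'],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

length = strlen.call("native")
puts length
```

Compared with a declarative binding, the signature is spelled out with type constants, but the concept is the same: Ruby code that is aware of the native calling convention.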
  • 1:24 - 1:25
    So, TL;DR.
  • 1:25 - 1:25
    Native extensions are fun, really easy to
    build.
  • 1:25 - 1:25
    The three big tools which are C extensions,
    FFI and SWIG.
  • 1:25 - 1:25
    You probably want to choose FFI if you don't
    maintain the library,
  • 1:25 - 1:25
    even if it's too easy to write the code for
    it,
  • 1:26 - 1:26
    but SWIG may be better if you actually maintain
  • 1:26 - 1:26
    the library and you want to expose it to a
    number of people.
  • 1:26 - 1:26
    OK, thank you. I think I actually have time
    for questions.
  • 1:26 - 1:26
    How much time do I have for questions?
  • 1:26 - 1:26
    V.O.: Ten minutes.
  • 1:26 - 1:27
    T.D.: OK, so does anyone have any questions?
  • 1:27 - 1:27
    Yes?
  • 1:27 - 1:27
    QUERANT: So when you actually write the native
    code,
  • 1:27 - 1:27
    right, do you have to take it off during GIL
    acquir-
  • 1:27 - 1:27
    acquiring GIL and using it yourself?
  • 1:28 - 1:28
    T.D.: So actually that's kind of an interesting
    question.
  • 1:28 - 1:28
    I think that like when it actually calls the
  • 1:28 - 1:28
    native extensions that's all the code you
    write,
  • 1:28 - 1:28
    would be considered a single Ruby call.
  • 1:28 - 1:28
    So I'm not actually sure if the GIL is held
    for the entire time.
  • 1:28 - 1:29
    I think by default the GIL would be held
  • 1:29 - 1:29
    for the entire time your code is being executed.
  • 1:29 - 1:29
    QUERANT: OK
  • 1:29 - 1:29
    T.D.: Unless you say do something to create
    a thread and move out.
  • 1:29 - 1:29
    QUERANT: Right, but when you're writing,
  • 1:30 - 1:30
    especially things like the database connection-
  • 1:30 - 1:30
    T.D.: Right.
  • 1:30 - 1:30
    QUERANT: Right, when you have that kind of-
  • 1:30 - 1:30
    T.D.: OK, the database connection.
  • 1:30 - 1:30
    QUERANT: Yeah, if you have- anyway,
  • 1:30 - 1:31
    when you have that kind of a code, right,
  • 1:31 - 1:31
    it's not very- you would assume that somebody
    might
  • 1:31 - 1:31
    have done a thread dot new and, you know,
  • 1:31 - 1:31
    gone ahead and called these lines of code.
  • 1:31 - 1:31
    T.D.: Right.
  • 1:32 - 1:32
    QUERANT: Which means, like, if you haven't
    taken
  • 1:32 - 1:32
    the global interpreter lock yourself,
  • 1:32 - 1:32
    then the chances are the same problem that
    you
  • 1:32 - 1:32
    said with GC might occur.
  • 1:32 - 1:32
    You might get pre-empted and
  • 1:32 - 1:33
    then horrible things might have happened.
  • 1:33 - 1:33
    So, does that mean every line,
  • 1:33 - 1:33
    every native extension that you write,
  • 1:33 - 1:33
    needs to take GIL because it's
  • 1:33 - 1:33
    an obscure case of someone doing it in a new
    thread?
  • 1:34 - 1:34
    T.D.: So, actually, so, in theory, I think
    yes.
  • 1:34 - 1:34
    When the code, so, OK, so most native extensions...
  • 1:34 - 1:34
    let me just go all the way back.
  • 1:34 - 1:34
    QUERANT: So go to the C code.
  • 1:34 - 1:34
    T.D.: Yeah, this one, yeah?
  • 1:34 - 1:35
    QUERANT: Right.
  • 1:35 - 1:35
    T.D.: Right, so what Ruby recalls automatically.
  • 1:35 - 1:35
    When I say string will equal some other string,
  • 1:35 - 1:35
    and as long as the code is in this method,
  • 1:35 - 1:35
    you will be holding the GIL.
  • 1:36 - 1:36
    I don't think anyone else can execute code
    during this time, unless...
  • 1:36 - 1:36
    Actually, I'll need to get back to you.
  • 1:36 - 1:36
    QUERANT: All right.
  • 1:36 - 1:36
    T.D.: Let me check.
  • 1:36 - 1:36
    QUERANT: Yeah, I mean, sure. Yeah, this was
    something that I-
  • 1:36 - 1:37
    T.D.: OK, sure.
  • 1:37 - 1:37
    QUERANT: Yeah, OK. Yeah, so, like on the same
    note, actually,
  • 1:37 - 1:37
    I just want to add, your C extension,
  • 1:37 - 1:37
    you only acquire the GIL if, in your extension,
  • 1:37 - 1:37
    you're going to run some kind of long-running
    thing.
  • 1:38 - 1:38
    But you don't want to,
  • 1:38 - 1:38
    you don't want the control to return to Ruby.
  • 1:38 - 1:38
    For example, let's say you,
  • 1:38 - 1:38
    your C extension takes an image,
  • 1:38 - 1:38
    and it does some image processing, actually,
  • 1:38 - 1:39
    and you don't want, you just want the C extension
  • 1:39 - 1:39
    to write the file to the disk and call it
    a day.
  • 1:39 - 1:39
    You don't want to return that something to-
  • 1:39 - 1:39
    Then you acquire a GIL in your extension
  • 1:39 - 1:39
    and then that thread will run completely separate
  • 1:40 - 1:40
    from your- which Ruby VM is it running at?
  • 1:40 - 1:40
    Actually, so that's the only time when
  • 1:40 - 1:40
    you will acquire a lock, if any,
  • 1:40 - 1:40
    if you're passing any data back to Ruby.
  • 1:40 - 1:40
    Or anyway, you pass it back, control,
  • 1:40 - 1:41
    the control back to this thing,
  • 1:41 - 1:41
    then you don't want to acquire the GIL yourself
    actually.
  • 1:41 - 1:41
    There are constructs for that, but generally
    not recommended.
  • 1:41 - 1:41
    T.D.: Yeah, so I believe what he's saying
    is correct.
  • 1:41 - 1:41
    I was slightly mistaken, I think the GIL is
    only acquired
  • 1:42 - 1:42
    when you enter the function.
  • 1:42 - 1:42
    As soon as you enter the function the GIL
    is released,
  • 1:42 - 1:42
    unless I'm mistaken, that's correct, right?
  • 1:42 - 1:42
    QUERANT: That's correct.
  • 1:42 - 1:42
    T.D.: Yeah. So, yeah. So I guess if you actually
    call
  • 1:42 - 1:43
    anything that's a Ruby construct from here,
  • 1:43 - 1:43
    you can actually call a method from within
    your C function body.
  • 1:43 - 1:43
    I think at that point you'll need to re-acquire
    the global interpreter lock.
  • 1:43 - 1:43
    But you're correct that the GIL is only taken
    when you enter the method.
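
The exchange above goes back and forth on when the lock is held. As far as MRI's C API goes, a C method keeps the GIL for its whole body unless it explicitly releases it (the C-level hook is `rb_thread_call_without_gvl` in `ruby/thread.h`, not shown here). The effect of releasing it can be observed from plain Ruby, because built-in blocking calls such as `sleep` give up the lock while they wait. The following is a minimal sketch of that behaviour only, not code from the talk; the thread count and wait time are arbitrary choices.

```ruby
# Four threads each block in Kernel#sleep. Because sleep releases the GIL
# while waiting, the waits overlap instead of running one after another.
wait = 0.2
thread_count = 4

start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = thread_count.times.map { Thread.new { sleep wait } }
threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

# If the lock were held for the whole wait, this would take roughly
# thread_count * wait seconds; with the lock released it stays close to wait.
puts format("%d threads sleeping %.1fs each took %.2fs in total",
            thread_count, wait, elapsed)
```

A long-running C extension function that never releases the lock would be expected to behave like the serial case, which is why releasing it is usually reserved for work that touches no Ruby objects.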
  • 1:43 - 1:43
    V.O.: We have time for more questions.
  • 1:44 - 1:44
    QUERANT: Hey Tejas, how do you test native
    extensions?
  • 1:44 - 1:44
    T.D.: OK, so, it's like I said.
  • 1:44 - 1:44
    The architecture of most of your things is
    sort of like this,
  • 1:44 - 1:44
    where you have native code, Ruby-aware native
    code,
  • 1:44 - 1:44
    and your actual Ruby code. Presumably you're
    using something
  • 1:44 - 1:45
    like Google Test or something to test your native
    code,
  • 1:45 - 1:45
    and to test your actual Ruby code depending
    on what your library is.
  • 1:45 - 1:45
    It will vary drastically.
  • 1:45 - 1:45
    So for example, if you're writing something
  • 1:45 - 1:45
    that connects to a database,
  • 1:46 - 1:46
    you may want to actually stub out the things
    that actually call.
  • 1:46 - 1:46
    Say if you're implementing with FFI or even
    with a native extension,
  • 1:46 - 1:46
    if you're making something like a database
    call
  • 1:46 - 1:46
    you may actually want to mock out or stub
    out the
  • 1:46 - 1:46
    actual implementation that connects the two.
  • 1:46 - 1:47
    But if you're doing something that's maybe
    not so intensive,
  • 1:47 - 1:47
    maybe something like a JSON parsing library,
  • 1:47 - 1:47
    what I would recommend at this level is actually
    writing an integration test,
  • 1:47 - 1:47
    actually parse some JSON and make sure it
    actually
  • 1:47 - 1:47
    returns the actual, you know, representation
    that you expected.
  • 1:48 - 1:48
    So the answer to how do you test is actually
    it varies,
  • 1:48 - 1:48
    very, very drastically, and I've seen like
    different
  • 1:48 - 1:48
    maturities of tests across like, all of them.
    Does that answer your question?
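
The two testing styles described in this answer can be sketched roughly as follows. This is an illustration under stated assumptions rather than anything shown in the talk: `UserStore` and `StubDriver` are hypothetical names standing in for a Ruby wrapper and the native database driver it would normally call, and the integration-style check uses Ruby's bundled `json` library, which on MRI typically loads a C extension parser.

```ruby
require "json"

# Hypothetical Ruby layer that sits on top of a native database driver.
# In a real gem, the driver's `query` would be written in C or bound via FFI.
class UserStore
  def initialize(driver)
    @driver = driver
  end

  # Extracts the names from whatever rows the driver returns.
  def names
    @driver.query("SELECT name FROM users").map { |row| row["name"] }
  end
end

# Stub that replaces the native driver so no database connection is needed.
class StubDriver
  def query(_sql)
    [{ "name" => "alice" }, { "name" => "bob" }]
  end
end

# Style 1: stub out the expensive native call and test the Ruby layer alone.
stubbed_names = UserStore.new(StubDriver.new).names

# Style 2: integration-style check that runs the real parser end to end and
# compares the result with the Ruby representation we expect.
parsed = JSON.parse('{"talk": "Native Extensions", "ways": 3, "tools": ["C", "FFI", "SWIG"]}')
expected = { "talk" => "Native Extensions", "ways" => 3, "tools" => ["C", "FFI", "SWIG"] }

puts "stubbed names: #{stubbed_names.inspect}"
puts "parser output matches: #{parsed == expected}"
```

Passing the driver in through the constructor is one design choice that makes the stub easy to swap in; the native code itself would still need its own tests at the C level.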
  • 1:48 - 1:48
    QUERANT: Maybe, yeah.
  • 1:48 - 1:48
    T.D.: Anything else?
  • 1:48 - 1:49
    OK then, I guess I-
  • 1:49 - 1:49
    That's it, like, thank you very much. And
    yeah.
Duration:
28:47
