Garden City Ruby 2014 - Native Extensions Served 3 Ways by Tejas Dinkar

0:00 - 0:00

Yeah, so, hi everyone.
0:00 - 0:00

Like Swanand said, I'm also part of the team
that's organized this,
0:00 - 0:01

so I want to just say it's really awesome
to see, like,
0:01 - 0:01

such a huge crowd on a Friday morning in Bangalore.
0:01 - 0:01

So, what I'm speaking about today is native
extensions in Ruby.
0:01 - 0:01

The title of my talk is Native Extensions
Served 3 Ways.
0:01 - 0:01

About myself: I'm Tejas, Tejas Dinkar.
0:02 - 0:02

I am a partner at Nilenso Software, which
is an employee-owned
0:02 - 0:02

collective, and it's a really fun place to
work.
0:02 - 0:02

If you want to know more, catch me tomorrow.
0:02 - 0:02

I'm on Twitter. I am tdinkar, and on GitHub
I am GJA.
0:02 - 0:02

You can find most of my open source contributions
over there.
0:02 - 0:03

So, about my talk, this is actually a pretty
technical talk,
0:03 - 0:03

so expect to see lots of code. I hope to have
five minutes for questions.
0:03 - 0:03

I'll try to beat ?? and beg for it (?? @ 00:01:18).
0:03 - 0:03

But if I don't have time, just catch me in
the hallway
0:03 - 0:03

and I can try to answer whatever I can.
0:04 - 0:04

So I'll mostly be covering C extensions, FFI
and SWIG.
0:04 - 0:04

Let's talk about first why you would ever
want to build
0:04 - 0:04

a native extension for Ruby. There's a bunch
of different reasons.
0:04 - 0:04

Number one is to maybe integrate with new
libraries.
0:04 - 0:04

Say like a new database has come out, like,
for example LibDrizzle,
0:04 - 0:05

which is a new library that came out to work
with
0:05 - 0:05

MySQL in the Drizzle database.
0:05 - 0:05

You might want to port that over to Ruby.
0:05 - 0:05

You might want to improve performance of critical
code.
0:05 - 0:05

There's different ways of doing this, of course.
0:06 - 0:06

You could try doing JRuby, look at different
caches,
0:06 - 0:06

but sometimes you have an algorithm that
you
0:06 - 0:06

just want to implement in native code,
0:06 - 0:06

or there's already a great library that implements
it.
0:06 - 0:06

Someone has given me an example of CSAT,
0:06 - 0:07

which I think Bundler could have actually
used,
0:07 - 0:07

or could use, to kind of resolve gem dependencies,
0:07 - 0:07

and that's written in C++.
0:07 - 0:07

Sometimes you wanna just move that there to
0:07 - 0:07

improve performance, and of course you want
to
0:08 - 0:08

write code that works across different
languages,
0:08 - 0:08

and in general it's a lot of fun.
0:08 - 0:08

You know like, real hackers program in C
0:08 - 0:08

and all that stuff.
0:08 - 0:08

So you could just feel super elite by doing
this.
0:08 - 0:09

So before we talk about native extensions,
0:09 - 0:09

I'm just gonna take a small segue and ask-
0:09 - 0:09

So let's talk about Python a little bit.
0:09 - 0:09

Like how many people in the house are like,
are Pythonistas?
0:09 - 0:09

How many big Python fans?
0:10 - 0:10

Well, that's actually surprisingly small.
0:10 - 0:10

I thought there would be more.
0:10 - 0:10

So I've figured out the best way to write
Python code,
0:10 - 0:10

and I'm gonna tell you this right now and-
0:10 - 0:10

Yes, of course, I am trolling you.
0:10 - 0:11

So over here I have a Python interpreter open
0:11 - 0:11

and I'm gonna say import Ruby.
0:11 - 0:11

Yeah, that seems innocent enough.
0:11 - 0:11

I'm gonna say Ruby dot eval foobar dot size.
0:11 - 0:11

Hmm, that seems to return six.
0:12 - 0:12

So that seems to work. Let's try something
more complicated.
0:12 - 0:12

I'm gonna define a method, factorial, and,
def factorial,
0:12 - 0:12

partial, N equals zero, partial, factorial,
N minus one,
0:12 - 0:12

et cetera.
0:12 - 0:12

Yes it's still recursive, I know.
0:12 - 0:13

And finally I'm gonna call it,
0:13 - 0:13

and I'm gonna say Ruby dot eval factorial
five.
0:13 - 0:13

Yet again you'll see that I actually get the
result, right.
0:13 - 0:13

So wow, what's this? Have I implemented Python
in Ruby?
0:13 - 0:13

Let's look at the code which actually makes
this happen.
0:14 - 0:14

How can I actually make this work?
0:14 - 0:14

Well, surprisingly, it's about a dozen lines
of code.
0:14 - 0:14

I know this might be a bit hard to read from
the back,
0:14 - 0:14

so I'm just gonna read out the interesting
bits line by line.
0:14 - 0:14

So it's about a dozen lines of code,
0:14 - 0:15

but let's remove all the Python stuff and
let's just
0:15 - 0:15

look at the Ruby parts of this. It's even
a lot less,
0:15 - 0:15

yeah, so let's go through this one by one.
0:15 - 0:15

I start off by including Ruby dot H.
0:15 - 0:15

Ruby dot H, as most of you who are familiar
0:16 - 0:16

with C and C++ will know, dot H files are header files.
0:16 - 0:16

They basically have the definitions of various
0:16 - 0:16

constructs that Ruby or, like, whatever,
0:16 - 0:16

that your library exposes, so that you can,
0:16 - 0:16

so that your compiler knows what definitions
exist in the first place.
0:16 - 0:17

So any Ruby extension that you write,
0:17 - 0:17

you'll need to include Ruby dot H,
0:17 - 0:17

and most of the time this is completely sufficient.
0:17 - 0:17

You don't need to include anything else.
0:17 - 0:17

So let's start looking at the actual code.
0:18 - 0:18

I have a method called Python Ruby eval.
0:18 - 0:18

Yeah, and it accepts one parameter,
0:18 - 0:18

which is a string. OK, nothing complicated
here.
0:18 - 0:18

What I do next is I take that string and
0:18 - 0:18

I pass it to a function called R-B eval string,
0:18 - 0:19

right, which stands for Ruby eval string.
0:19 - 0:19

Nothing really magical over here.
0:19 - 0:19

All I'm doing is I'm calling Ruby's eval,
0:19 - 0:19

and you'll notice that R-B eval string returns
an object of the type VALUE.
0:19 - 0:19

This value is actually very important.
0:20 - 0:20

The same way in Ruby every single object inherits
from object,
0:20 - 0:20

the corresponding construct in the CRuby extensions
is the value object.
0:20 - 0:20

Value object is used to represent every single
Ruby object
0:20 - 0:20

from nil to true to false to every single
custom object.
0:20 - 0:20

Fine, so I have this value object. What am
I gonna do next?
0:20 - 0:21

Well, what I'm gonna do is I'm gonna
0:21 - 0:21

switch on the type of the object.
0:21 - 0:21

Yeah, there's a macro called type, and if
it's a fix num,
0:21 - 0:21

fix nums are numbers in Ruby,
0:21 - 0:21

I'm gonna call fix num to int,
0:22 - 0:22

to just convert that to an integer.
0:22 - 0:22

If it's a string I'm gonna call string value
pointer,
0:22 - 0:22

which converts from a Ruby string into a C
string.
0:22 - 0:22

And if it's some other type, like if it's
an array, yeah,
0:22 - 0:22

I could have implemented this,
0:22 - 0:23

but yeah I'm too lazy right now,
0:23 - 0:23

so I'm just gonna say return none,
0:23 - 0:23

which is the Python version of nil.
0:23 - 0:23

So yeah that's pretty much it.
0:23 - 0:23

That's all it really took to kind of build
0:24 - 0:24

a small kind of Ruby module which would extend
0:24 - 0:24

some of the functionality of Ruby into Python.
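
The C from the slide is not captured in this transcript. As a rough Ruby-level paraphrase of the steps just described (evaluate a string, then switch on the type of the result), a minimal sketch might look like the following. The method name `eval_and_convert` is made up for illustration, plain `eval` stands in for the C function `rb_eval_string`, and `nil` stands in for Python's `None`.

```ruby
# Ruby-level paraphrase of the wrapper described in the talk.
# `eval_and_convert` is a hypothetical name, not the function on the slide.
def eval_and_convert(source)
  # Evaluate the string at the top level, roughly what rb_eval_string does.
  value = eval(source, TOPLEVEL_BINDING)

  # Switch on the type of the result, as the C code does with the TYPE macro.
  case value
  when Integer then value   # the number branch: hand back a plain integer
  when String  then value   # the string branch: hand back the string contents
  else nil                  # arrays and everything else are not handled
  end
end

p eval_and_convert('"foobar".size')
p eval_and_convert('"foo" + "bar"')
p eval_and_convert('[1, 2, 3]')
```

The last call returns `nil` because arrays fall through to the unhandled branch, mirroring the "return none" case in the talk.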
0:24 - 0:24

Let's look at something that's like another
construct.
0:24 - 0:24

I don't want to build all my code
0:24 - 0:24

and I don't want to eval all of it.
0:24 - 0:25

Let's say I want to require a file.
0:25 - 0:25

It's pretty much the same.
0:25 - 0:25

All I need to do is, I accept the file
0:25 - 0:25

and I do R-B require on that file, yeah,
0:25 - 0:25

so in general, yay!
0:26 - 0:26

Actually in those twelve lines of code
0:26 - 0:26

you really have built your first Ruby extension
0:26 - 0:26

and your first Python extension.
0:26 - 0:26

So what I'm really trying to call out,
0:26 - 0:26

is it really is very simple, like,
0:26 - 0:27

as Ruby developers we always have a lot of
fears,
0:27 - 0:27

like, oh this very simple thing in Ruby.
0:27 - 0:27

How could I even do it in a C extension?
0:27 - 0:27

It turns out that the Ruby C extensions are
great,
0:27 - 0:27

because they expose almost everything
0:28 - 0:28

you would ever want to do in Ruby,
0:28 - 0:28

it exposes the same thing in C.
0:28 - 0:28

So let's look at, what are,
0:28 - 0:28

why aren't there more,
0:28 - 0:28

why aren't we doing it more?
0:28 - 0:29

Well, the biggest common fear every time
0:29 - 0:29

somebody mentions C to a Ruby or, in general,
0:29 - 0:29

high-level programmer is: what about memory allocation?
0:29 - 0:29

Like how do I handle this?
0:29 - 0:29

Well, as it turns out, it's really not as
difficult
0:30 - 0:30

as you might think, and since you are still
programming
0:30 - 0:30

in the Ruby world, you actually have
0:30 - 0:30

a lot of things that can actually help you.
0:30 - 0:30

In particular there are two, there are these
two macros, right.
0:30 - 0:30

The first one basically takes a C pointer
and stuffs it inside a Ruby object.
0:30 - 0:31

You just tell it which class you want that
Ruby object
0:31 - 0:31

to be in and it will magically be created
with your pointer.
0:31 - 0:31

And the last, the second one gets the pointer.
0:31 - 0:31

What actually happens internally is that your
memory
0:31 - 0:31

that's allocated has been tied to this Ruby
object,
0:32 - 0:32

and when this Ruby object gets garbage collected,
0:32 - 0:32

so does your pointer.
0:32 - 0:32

So in many ways you're basically just re-using
0:32 - 0:32

the Ruby's GC to build a, you know,
0:32 - 0:32

to manage your native code as well.
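
The two macros are C-only and are not named in the transcript, so they are not reproduced here. The same idea, native memory whose lifetime is tied to a Ruby object, can be sketched from Ruby using the bundled Fiddle library. This is an analogy under that assumption, not the code from the slide.

```ruby
require 'fiddle'

# Allocate 16 bytes of native memory and tie them to a Ruby object.
# Fiddle::RUBY_FREE is the free function that runs when the Ruby-side
# Fiddle::Pointer object is garbage collected.
buffer = Fiddle::Pointer.malloc(16, Fiddle::RUBY_FREE)

# Write into the native memory and read it back through the wrapper object.
buffer[0, 5] = 'hello'
puts buffer[0, 5]
puts buffer.size
```

No explicit free call appears anywhere: as long as `buffer` is reachable the memory stays valid, and once it is collected the memory is released with it.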
0:32 - 0:33

Right, no batteries included is the next big
fear.
0:33 - 0:33

But just keep in mind that since you are
0:33 - 0:33

programming in the Ruby extension,
0:33 - 0:33

in Ruby extensions with C,
0:33 - 0:33

you actually have access to every single
0:34 - 0:34

basic functionality that Ruby can provide
you.
0:34 - 0:34

There are methods to manipulate arrays,
0:34 - 0:34

strings, hashes, you name it.
0:34 - 0:34

It's all very easy to manipulate even in the
C extension.
0:34 - 0:34

And of course portability.
0:34 - 0:35

I have no idea what this comic is about,
0:35 - 0:35

it was on Geek, but it was the first thing
0:35 - 0:35

that I found for portability.
0:35 - 0:35

So most C extensions work only in MRI,
0:35 - 0:35

except they sort of work in Rubinius.
0:36 - 0:36

Like Rubinius has tried to make the sort of
API compatible,
0:36 - 0:36

but it sometimes works, it sometimes doesn't.
0:36 - 0:36

So basically all you can trust, if you're writing
0:36 - 0:36

against the C API, is that your gem is gonna work
in MRI.
0:36 - 0:36

So what about, what if you do want to-
0:36 - 0:37

OK, so, the last concern I always see is,
0:37 - 0:37

how do I even get started?
0:37 - 0:37

What are the best practices,
0:37 - 0:37

how do I build a C extension for this?
0:37 - 0:37

I've always found that the Ruby source code
0:38 - 0:38

itself is probably the best documentation
0:38 - 0:38

for how to build a C extension.
0:38 - 0:38

It's actually very simple,
0:38 - 0:38

very easy to understand, very nice to read.
0:38 - 0:38

So over here I've actually,
0:38 - 0:39

I'm actually showing you string dot C,
0:39 - 0:39

and I'm gonna walk through a few lines of
this code now, OK.
0:39 - 0:39

So the first line is a method called init
string, right.
0:39 - 0:39

This is the equivalent of main for your Ruby
extension.
0:39 - 0:39

Whenever your gem is required it is gonna
call this function.
0:40 - 0:40

So if there was a gem called string,
0:40 - 0:40

and I said require string,
0:40 - 0:40

it would call this method init underscore
string, yeah.
0:40 - 0:40

The first thing I'm gonna do over there
0:40 - 0:40

is I'm gonna say R-B define class.
0:40 - 0:41

I'm gonna define a class called string,
0:41 - 0:41

which inherits from object, right.
0:41 - 0:41

That's exactly equivalent to class string,
0:41 - 0:41

less than sign object. Nothing complicated
there.
0:41 - 0:41

I'm storing this in a variable called
0:42 - 0:42

R-B underscore C string, yeah,
0:42 - 0:42

and then I'm gonna define a method
0:42 - 0:42

on R-B underscore C string called E-Q-L question
mark, right.
0:42 - 0:42

What I'm gonna tell it is that this method,
0:42 - 0:42

when somebody calls E-Q-L question mark on
any string,
0:42 - 0:43

call this C function, which is
0:43 - 0:43

R-B underscore S-T-R underscore equal, right.
0:43 - 0:43

Still nothing complicated over there.
0:43 - 0:43

And the last thing says that I expect one
0:43 - 0:43

extra parameter to be there.
0:44 - 0:44

Self is always passed, but I want one extra
parameter.
0:44 - 0:44

So those are the four simple parameters to
this.
0:44 - 0:44

There is your class name, your method name,
0:44 - 0:44

the C function to call, and the number of
parameters.
0:44 - 0:44

Still nothing really complicated.
0:44 - 0:45

Let's look at the actual implementation of
the function.
0:45 - 0:45

Really simple. If self is equal to S-T-R 2,
return true.
0:45 - 0:45

Yes, they're the same object, because the
two of them
0:45 - 0:45

are the same object ID. They have the same
object ID.
0:45 - 0:45

They're actually the same object.
0:46 - 0:46

Similarly, if the second one is not a string,
0:46 - 0:46

return false, simple enough.
0:46 - 0:46

And the last line kind of delegates
0:46 - 0:46

to the old Ruby equal,
0:46 - 0:46

which will do the algorithm most of us learned
0:46 - 0:47

in high school, where you compile,
0:47 - 0:47

compare a string by a string, character by
character
0:47 - 0:47

to figure out, are these two strings equal?
0:47 - 0:47

So as you can see there's really nothing
0:47 - 0:47

very complicated in building a C extension.
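
The walkthrough above maps fairly directly onto plain Ruby. The sketch below paraphrases it at the Ruby level: defining a class that inherits from Object, defining `eql?` with one extra argument, and the three steps of the comparison. `MyString` is a hypothetical class used so the real String is left alone, and the byte comparison is a simplification of whatever the real C helper does.

```ruby
# Roughly: rb_define_class("MyString", rb_cObject)
class MyString < Object
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Roughly: rb_define_method(klass, "eql?", c_function, 1).
  # Self is implicit, and exactly one extra argument is expected.
  def eql?(other)
    return true if equal?(other)               # same object, so trivially equal
    return false unless other.is_a?(MyString)  # not our type, so not equal
    text.bytes == other.text.bytes             # otherwise compare byte by byte
  end
end

a = MyString.new('ruby')
p a.eql?(a)
p a.eql?(MyString.new('ruby'))
p a.eql?('ruby')
```

The last call is false because a plain String is not a `MyString`, which is the "second one is not a string" branch from the talk.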
0:48 - 0:48

And most of the time your architecture sort
of looks like this.
0:48 - 0:48

You have the Ruby, you have the native code
0:48 - 0:48

on the left. This is the code you kind of
want to run,
0:48 - 0:48

and you have Ruby code on the right,
0:48 - 0:48

and this is the code that you want to consume,
0:48 - 0:49

that code that you've somehow built.
0:49 - 0:49

In between, in purple, is a Ruby-aware native
code.
0:49 - 0:49

And why do I say Ruby-aware native code?
0:49 - 0:49

Because you've still written this as native
code.
0:49 - 0:49

It's still written in C.
0:50 - 0:50

It's still compiled down to a dot S-O file
or a dot Dylib file on Mac.
0:50 - 0:50

But it's Ruby-aware.
0:50 - 0:50

It knows how things work in Ruby.
0:50 - 0:50

Compared to this FFI kind of does the opposite.
0:50 - 0:50

Instead of Ruby-aware native code,
0:50 - 0:51

what you have is native-aware Ruby code, right.
0:51 - 0:51

So what this means is with FFI you're basically
0:51 - 0:51

working purely in Ruby, which somehow understands
0:51 - 0:51

how the native architecture of the system-
0:51 - 0:51

OK, so FFI is a Ruby D-S-L.
0:52 - 0:52

It's really easy to implement.
0:52 - 0:52

It's even easier than MRI, like than the C
extension.
0:52 - 0:52

It actually works across all Ruby implementations.
0:52 - 0:52

I would actually say that all the Ruby implementors
0:52 - 0:52

got together one day and said, how can we
make something
0:52 - 0:53

that'll make it easy for us to integrate with
libraries,
0:53 - 0:53

so it works on JRuby, it works on MRI,
0:53 - 0:53

it even works on MacRuby, Mac L?, Rubinius,
0:53 - 0:53

you name it, it works.
0:53 - 0:53

And it basically converts to and from C primitives
for you directly.
0:54 - 0:54

So let's look at an example.
0:54 - 0:54

I'm just taking the example straight out of
GitHub.
0:54 - 0:54

This one's not complicated at all. All I'm
doing is I'm saying require FFI.
0:54 - 0:54

I'm saying this is an FFI library,
0:54 - 0:54

my module is FFI library,
0:54 - 0:55

and I'm saying it attaches to lib C.
0:55 - 0:55

And by lib C that doesn't imply
0:55 - 0:55

that the library is written in C.
0:55 - 0:55

What that means is this is the C standard
0:55 - 0:55

library that you want to connect to.
0:56 - 0:56

And I'm creating a method called puts and
0:56 - 0:56

I'm saying puts takes one argument - it's
a string,
0:56 - 0:56

and it returns one argument, which is a,
0:56 - 0:56

one value which is an integer.
0:56 - 0:56

Really nothing complicated over there.
0:56 - 0:57

This creates a static class method,
0:57 - 0:57

or a module method on this module,
0:57 - 0:57

so I can say my lib dot puts, hello world,
using lib C.
0:57 - 0:57

It's very, very easy to attach to a function
using FFI.
0:57 - 0:57

And let's quickly look at another example.
0:58 - 0:58

This one I'm attaching pow, which takes two
doubles
0:58 - 0:58

and returns a double, and you can see it in
action over there.
0:58 - 0:58

It works.
0:58 - 0:58

And here I'm attaching to lib dot M, which
is the math library.
0:58 - 0:58

So actually FFI supports a lot of built-in
types.
0:58 - 0:59

It supports integers, characters,
0:59 - 0:59

and for everything else,
0:59 - 0:59

every single pointer type that you would use
in C,
0:59 - 0:59

it supports, well, a pointer.
0:59 - 0:59

So in general FFI is probably your best solution
for everything.
1:00 - 1:00

If you're trying to build a new gem and
1:00 - 1:00

you want people to use your gem, and
1:00 - 1:00

you're not just doing it for fun,
1:00 - 1:00

you probably want to build it using FFI.
1:00 - 1:00

It also lets you do your modeling in Ruby,
1:00 - 1:01

which means the deployment's also a little
bit easier.
1:01 - 1:01

You don't need to struggle with Makefiles
1:01 - 1:01

and other stuff to kind of build that extension.
1:01 - 1:01

Unfortunately one piece of misinformation
that
1:01 - 1:01

seems to be out there is that FFI,
1:02 - 1:02

if you build it with FFI,
1:02 - 1:02

you do not need to worry about garbage collection.
1:02 - 1:02

I'll show you an example in the next slide,
1:02 - 1:02

and unfortunately with FFI there is no
1:02 - 1:02

C++ support without wrapping.
1:02 - 1:03

So you could see over here that these
1:03 - 1:03

functions that we attached to are all static
functions in C.
1:03 - 1:03

They kind of are not attached to any object.
1:03 - 1:03

They take a fixed number of parameters, so
1:03 - 1:03

it's not possible to wrap C++ functions
directly.
1:04 - 1:04

You could write a thin shim, which kind of
1:04 - 1:04

takes static functions which you can use to
call your C++ functions.
1:04 - 1:04

But it still starts getting to be more effort
and you need to write that in C or C++.
1:04 - 1:04

So you do still have to worry about the garbage
collection,
1:04 - 1:04

however, with FFI, and I'll show you a quick
example where it matters.
1:04 - 1:05

Most people will write code that looks
1:05 - 1:05

something like this and not worry about it.
1:05 - 1:05

So I have a def run query which will crash,
1:05 - 1:05

and say D-B connection is equal to my
1:05 - 1:05

FFI module database connection local host,
1:06 - 1:06

and to my FFI module database query I'm passing
that D-B connection
1:06 - 1:06

and I'm saying select star from users.
1:06 - 1:06

This will probably work most of the time.
1:06 - 1:06

But in reality that D-B connection will eventually
get GC'd.
1:06 - 1:06

And internally in C your cursor will probably
hold
1:06 - 1:07

a pointer to your D-B connection,
1:07 - 1:07

even though this has not been exposed to you
via the API.
1:07 - 1:07

So when your D-B connection gets GC'd,
1:07 - 1:07

or like when the Ruby object gets GC'd,
1:07 - 1:07

the pointer is gonna get GC'd in memory,
1:08 - 1:08

and then when your cursor tries to access
the D-B connection,
1:08 - 1:08

it will crash.
1:08 - 1:08

Yeah, so the standard pattern for solving
something
1:08 - 1:08

like this is to make sure that these
1:08 - 1:08

two objects are aware of each other in the
Ruby world.
1:08 - 1:09

The most, in general what I've seen happen
a lot
1:09 - 1:09

is you save the database cursor and you kind
of just say,
1:09 - 1:09

cursor dot D-B connection is equal to the
other connection,
1:09 - 1:09

so that this has a pointer in Ruby as well
1:09 - 1:09

which corresponds to the C pointer.
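
The pattern can be sketched in plain Ruby. `Connection` and `Cursor` below are hypothetical stand-ins for FFI-backed wrapper objects, and no native library is loaded. The only point is the extra Ruby-side reference from the cursor to its connection.

```ruby
# Hypothetical wrapper around a native connection handle.
class Connection
  attr_reader :host

  def initialize(host)
    @host = host
  end
end

# Hypothetical wrapper around a native cursor handle.
class Cursor
  # Holding the connection here keeps it reachable for as long as the
  # cursor is reachable, so the GC cannot collect the connection first.
  attr_accessor :db_connection
  attr_reader :sql

  def initialize(sql)
    @sql = sql
  end
end

def run_query(sql)
  db_connection = Connection.new('localhost')
  cursor = Cursor.new(sql)
  cursor.db_connection = db_connection   # mirror the hidden C pointer in Ruby
  cursor
end

cursor = run_query('SELECT * FROM users')
puts cursor.db_connection.host
```

Without the assignment inside `run_query`, the connection would have no remaining references once the method returns, which is the situation the talk warns about.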
1:10 - 1:10

So it's not as if you can just blindly
1:10 - 1:10

take the library and, just looking at the
APIs, do this.
1:10 - 1:10

Although, granted, with the very primitive
types,
1:10 - 1:10

when you're looking at things on the left
side,
1:10 - 1:10

the characters, the strings, you're less likely
to fall,
1:10 - 1:11

face big memory problems.
1:11 - 1:11

So that's mostly all I'm gonna speak about
FFI.
1:11 - 1:11

If you look at the progression we've made,
1:11 - 1:11

we've kind of, we've started with gems that
work in MRI.
1:11 - 1:11

We've moved onto gems that work in all Ruby
and (00:18:15 - ??),
1:12 - 1:12

now let's talk about gems that work on all
programming languages.
1:12 - 1:12

And soon we'll talk about taking over the
world.
1:12 - 1:12

So SWIG is sort of the answer to that.
1:12 - 1:12

SWIG stands for the Simplified Wrapper and
interface generator -
1:12 - 1:12

which is a big tongue-twister.
1:12 - 1:13

Basically what SWIG does is it lets you
1:13 - 1:13

annotate your C and C++ header files.
1:13 - 1:13

The architecture is sort of like this -
1:13 - 1:13

there's native code over there, there's some
magic in between,
1:13 - 1:13

and then magically you get Ruby code and Python
code out of it.
1:14 - 1:14

Let's look a little bit at how this magic
works.
1:14 - 1:14

So FFI, sorry, SWIG works off an interface
file.
1:14 - 1:14

What it basically is is it's an annotated
header file
1:14 - 1:14

and it auto-generates code to make it work
in your various languages.
1:14 - 1:14

And how it auto-generates that code depends
on every single language.
1:14 - 1:15

So for Ruby it builds a C extension.
1:15 - 1:15

So maybe it won't work in JRuby.
1:15 - 1:15

But for Ruby it actually generates C code
which will call your library.
1:15 - 1:15

For Python it's actually a C and a dot py
file.
1:15 - 1:15

For Java it builds a JNI interface.
1:16 - 1:16

And of course you still will have the same
GC problem
1:16 - 1:16

that you had while we were discussing FFI.
1:16 - 1:16

But in general SWIG actually works pretty
well.
1:16 - 1:16

There are a couple of Ruby gems out there
1:16 - 1:16

that are built using SWIG.
1:16 - 1:17

I've seen it actually used in practice for
1:17 - 1:17

like a large company which had an algorithm
1:17 - 1:17

it wanted to share across different programming
languages.
1:17 - 1:17

They had a Python, a Java, and a Ruby sort
of front-end,
1:17 - 1:17

so what they did was build their code in C++,
1:18 - 1:18

exposed it via SWIG and were able to use it
in
1:18 - 1:18

all these three different languages. It's
really simple.
1:18 - 1:18

So over here we have a class called rectangle
1:18 - 1:18

which has a length and a breadth
1:18 - 1:18

and a constructor and an int called area -
1:18 - 1:19

and I'm sure most of you would know the implementation
of this.
1:19 - 1:19

All I need to do is add some junk at the top
and the bottom and yeah,
1:19 - 1:19

no, that's it, and magically it will kind
of work.
1:19 - 1:19

So once I've said this is a module called shapes,
1:19 - 1:19

that translates in Ruby directly to a namespace,
1:20 - 1:20

I just need to say require shapes.
1:20 - 1:20

Rectangle equals shapes dot rectangle dot
new, and so on and so forth.
1:20 - 1:20

So with SWIG it's very easy to quickly kind
1:20 - 1:20

of generate interfaces across multiple different
1:20 - 1:20

languages really fast, and you can do this if
you are,
1:20 - 1:21

especially if you're the maintainer of the
actual native library.
1:21 - 1:21

So if you are the maintainer of the
1:21 - 1:21

actual native library I would recommend going
with SWIG.
1:21 - 1:21

There are other options as well.
1:21 - 1:21

Ruby has no shortage of ways to include native
things.
1:22 - 1:22

I think an old one which has been around
1:22 - 1:22

for a long time is dynamic load that is basically
1:22 - 1:22

the port of C's DL open into Ruby. It's been
around since forever,
1:22 - 1:22

but I've heard a lot of reports of it
1:22 - 1:22

being really buggy and in general I think
both that and Fiddle,
1:22 - 1:23

which is now - Fiddle is actually coming in
Ruby 2 or Ruby 2.1 -
1:23 - 1:23

that is again another way of introducing native
libraries.
1:23 - 1:23

Both of these kind of work in concept
1:23 - 1:23

very similar to how FFI works,
1:23 - 1:23

so I'm not gonna spend a lot of time covering
them.
1:24 - 1:24

I think Fiddle may start becoming
1:24 - 1:24

popular soon as more and more people start
using Ruby 2.
1:24 - 1:24

I don't know if FFI and Fiddle are someday
1:24 - 1:24

going to start merging together,
1:24 - 1:24

but in general these are the other two options.
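
For comparison with the FFI examples, here is a minimal sketch of attaching to `pow` with Fiddle, which ships with Ruby. It assumes the C math library is already loaded into the Ruby process, so that `Fiddle.dlopen(nil)` can find the symbol. On platforms where that does not hold, the library would need to be opened by its file name instead.

```ruby
require 'fiddle'

# Open a handle on the libraries already loaded into this process.
handle = Fiddle.dlopen(nil)

# Describe the C signature: double pow(double, double).
pow = Fiddle::Function.new(
  handle['pow'],
  [Fiddle::TYPE_DOUBLE, Fiddle::TYPE_DOUBLE],
  Fiddle::TYPE_DOUBLE
)

puts pow.call(2.0, 10.0)
```

The shape is the same as with FFI: name the library, name the function, and declare the argument and return types so values can be converted in both directions.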
1:24 - 1:25

So, TL;DR.
1:25 - 1:25

Native extensions are fun, really easy to
build.
1:25 - 1:25

There are three big tools, which are C extensions,
FFI and SWIG.
1:25 - 1:25

You probably want to choose FFI if you don't
maintain the library,
1:25 - 1:25

even if it's too easy to write the code for
it,
1:26 - 1:26

but SWIG may be better if you actually maintain
1:26 - 1:26

the library and you want to expose it to a
number of people.
1:26 - 1:26

OK, thank you. I think I actually have time
for questions.
1:26 - 1:26

How much time do I have for questions?
1:26 - 1:26

V.O.: Ten minutes.
1:26 - 1:27

T.D.: OK, so does anyone have any questions?
1:27 - 1:27

Yes?
1:27 - 1:27

QUERANT: So when you actually write the native
code,
1:27 - 1:27

right, do you have to take care of
acquir-
1:27 - 1:27

acquiring the GIL and using it yourself?
1:28 - 1:28

T.D.: So actually that's kind of an interesting
question.
1:28 - 1:28

I think that like when it actually calls the
1:28 - 1:28

native extensions that's all the code you
write,
1:28 - 1:28

would be considered a single Ruby call.
1:28 - 1:28

So I'm not actually sure if the GIL is held
for the entire time.
1:28 - 1:29

I think by default the GIL would be held
1:29 - 1:29

for the entire time your code is being executed.
1:29 - 1:29

QUERANT: OK
1:29 - 1:29

T.D.: Unless you say do something to create
a thread and move out.
1:29 - 1:29

QUERANT: Right, but when you're writing,
1:30 - 1:30

especially things like the database connection-
1:30 - 1:30

T.D.: Right.
1:30 - 1:30

QUERANT: Right, when you have that kind of-
1:30 - 1:30

T.D.: OK, the database connection.
1:30 - 1:30

QUERANT: Yeah, if you have- anyway,
1:30 - 1:31

when you have that kind of a code, right,
1:31 - 1:31

it's not very- you would assume that somebody
might
1:31 - 1:31

have done a thread dot new and, you know,
1:31 - 1:31

gone ahead and called these lines of code.
1:31 - 1:31

T.D.: Right.
1:32 - 1:32

QUERANT: Which means, like, if you haven't
taken
1:32 - 1:32

the global interpreter lock yourself,
1:32 - 1:32

then the chances are the same problem that
you
1:32 - 1:32

said with GC might occur.
1:32 - 1:32

You might get pre-empted and
1:32 - 1:33

then horrible things might have happened.
1:33 - 1:33

So, does that mean every line,
1:33 - 1:33

every native extension that you write,
1:33 - 1:33

needs to take GIL because it's
1:33 - 1:33

an obscure case of someone doing it in a new
thread?
1:34 - 1:34

T.D.: So, actually, so, in theory, I think
yes.
1:34 - 1:34

When the code, so, OK, so most native extensions...
1:34 - 1:34

let me just go all the way back.
1:34 - 1:34

QUERANT: So go to the C code.
1:34 - 1:34

T.D.: Yeah, this one, yeah?
1:34 - 1:35

QUERANT: Right.
1:35 - 1:35

T.D.: Right, so what Ruby calls automatically.
1:35 - 1:35

When I say string will equal some other string,
1:35 - 1:35

and as long as the code is in this method,
1:35 - 1:35

you will be holding the GIL.
1:36 - 1:36

I don't think anyone else can execute code
during this time, unless...
1:36 - 1:36

Actually, I'll need to get back to you.
1:36 - 1:36

QUERANT: All right.
1:36 - 1:36

T.D.: Let me check.
1:36 - 1:36

QUERANT: Yeah, I mean, sure. Yeah, this was
something that I-
1:36 - 1:37

T.D.: OK, sure.
1:37 - 1:37

QUERANT: Yeah, OK. Yeah, so, like on the same
note, actually,
1:37 - 1:37

I just want to add, your C extension,
1:37 - 1:37

you only acquire the GIL if, in your extension,
1:37 - 1:37

you're going to run some long-running
thing.
1:38 - 1:38

But you don't want to,
1:38 - 1:38

you don't want the control to return to Ruby.
1:38 - 1:38

For example, let's say you,
1:38 - 1:38

your C extension takes an image,
1:38 - 1:38

and it does some image processing, actually,
1:38 - 1:39

and you don't want, you just want the C extension
1:39 - 1:39

to write the file to the disk and call it
a day.
1:39 - 1:39

You don't want to return that something to-
1:39 - 1:39

Then you acquire a GIL in your extension
1:39 - 1:39

and then that thread will run completely separate
1:40 - 1:40

from your- whichever Ruby VM it is running in.
1:40 - 1:40

Actually, so that's the only time when
1:40 - 1:40

you will acquire a lock, if any,
1:40 - 1:40

if you're passing any data back to Ruby.
1:40 - 1:40

Or anyway, you pass it back, control,
1:40 - 1:41

the control back to this thing,
1:41 - 1:41

then you don't want to acquire the GIL yourself
actually.
1:41 - 1:41

There are constructs for that, but generally
not recommended.
1:41 - 1:41

T.D.: Yeah, so I believe what he's saying
is correct.
1:41 - 1:41

I was slightly mistaken, I think the GIL is
only acquired
1:42 - 1:42

when you enter the function.
1:42 - 1:42

As soon as you enter the function the GIL
is released,
1:42 - 1:42

unless I'm mistaken, that's correct, right?
1:42 - 1:42

QUERANT: That's correct.
1:42 - 1:42

T.D.: Yeah. So, yeah. So I guess if you actually
call
1:42 - 1:43

anything that's a Ruby construct from here,
1:43 - 1:43

you can actually call a method from within
your C function body.
1:43 - 1:43

I think at that point you'll need to re-acquire
the global interpreter lock.
1:43 - 1:43

But you're correct that GIL is only caught
when you enter the method.
1:43 - 1:43

V.O.: We have time for more questions.
1:44 - 1:44

QUERANT: Hey Tejas, how do you test native
extensions?
1:44 - 1:44

T.D.: OK, so, it's like I said.
1:44 - 1:44

The architecture of most of your things is
sort of like this,
1:44 - 1:44

where you have native code, Ruby-aware native
code,
1:44 - 1:44

and your actual Ruby code. Presumably you're
doing something
1:44 - 1:45

like Google Test or something, to test your native
code,
1:45 - 1:45

and to test your actual Ruby code depending
on what your library is.
1:45 - 1:45

It will vary drastically.
1:45 - 1:45

So for example, if you're writing something
1:45 - 1:45

that connects to a database,
1:46 - 1:46

you may want to actually stub out the things
that actually call.
1:46 - 1:46

Say if you're implementing with FFI or even
with a native extension,
1:46 - 1:46

if you're making something like a database
call
1:46 - 1:46

you may actually want to mock out or stub
out the
1:46 - 1:46

actual implementation that connects the two.
1:46 - 1:47

But if you're doing something that's maybe
not so intensive,
1:47 - 1:47

maybe something like a JSON parsing library,
1:47 - 1:47

what I would recommend at this level is actually
writing an integration test,
1:47 - 1:47

actually parse some JSON and make sure it
actually
1:47 - 1:47

returns the actual, you know, representation
that you expected.
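
A minimal sketch of that kind of integration test, using the `json` library that ships with Ruby. Nothing is stubbed: real input goes through the whole parser and the result is compared with the expected Ruby structure. The sample document and the plain `raise` in place of a test framework are illustrative choices, not something shown in the talk.

```ruby
require 'json'

# Real input for the parser, and the Ruby structure we expect back.
input    = '{"name": "ruby", "tags": ["c", "ffi"], "count": 3, "ok": true}'
expected = { 'name' => 'ruby', 'tags' => %w[c ffi], 'count' => 3, 'ok' => true }

# Exercise the parser end to end and compare the whole result.
actual = JSON.parse(input)
raise "unexpected parse result: #{actual.inspect}" unless actual == expected

puts 'JSON integration check passed'
```

The same check works regardless of whether the parser underneath is native or pure Ruby, which is what makes it useful at this level.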
1:48 - 1:48

So the answer to how do you test is actually
it varies,
1:48 - 1:48

very, very drastically, and I've seen like
different
1:48 - 1:48

maturities of tests across like, all of them.
Does that answer your question?
1:48 - 1:48

QUERANT: Maybe, yeah.
1:48 - 1:48

T.D.: Anything else?
1:48 - 1:49

OK then, I guess I-
1:49 - 1:49

That's it, like, thank you very much. And
yeah.

Duration:: 28:47