(Theme Music)

This talk is about how Bundler works

How does Bundler work?

This is an interesting question.

We'll talk about it for a while.

This talk is a brief history of

dependency management in Ruby,

a discussion of how libraries and

shared code works now and in the past

because how it works now is directly a result

of how it used to work in the past

and trying to fix problems that happened then.

Before we get started, let me introduce myself:

My name is Andre Arko, I'm @indirect on all social media

that's my avatar, maybe you've seen me on

a webpage somewhere. As my day job

I work at Cloud City Development doing

Ruby, Rails, Ember, and web consulting.

We do web and mobile development

and I mostly do architectural consulting

and Senior Developer pairing and training.

Talk to me if your company is interested.

I also founded Ruby Together, a non-profit,

it's like npm, Inc. without the venture capital.

Ruby Together is a trade association that

takes money from companies and people who

use Ruby and Bundler and RubyGems and all

of the public infrastructure that Rubyists use

and pays for developers to work on that

so that RubyGems.org stays up, and so

people can have gems, which is pretty cool.

As part of my work for Ruby Together I work as

lead of the Bundler team. I've been working on

Bundler since before 1.0 came out, and I've

been team lead for the last four years.

Using Ruby code written by other developers,

nowadays this is actually really easy,

you add a line to your Gemfile,

you go to your terminal and run

bundle install, and you start using it.

Pretty cool, that's really easy.
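
For reference, a Gemfile with one such line might look like the sketch below. The gem name is only an example; `source` and `gem` come from Bundler's Gemfile DSL, so this file is read by Bundler rather than run directly.

```ruby
# Gemfile (illustrative; "rack" is just an example gem)
source "https://rubygems.org"

gem "rack"
```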

The thing that I've noticed, talking to people

who use Bundler and think it's awesome

is that, it's not actually clear what just happened.

Based on the text printed out by bundle install

it seems like something got downloaded

and something got installed, but it's not clear.

It's not clear what got downloaded or

installed, or where it happened.

What exactly happened there?

Nobody is really sure.

How does just putting a line in your Gemfile

mean you can just start using somebody else's code?

To explain that, we'll need a little bit of history.

We're going to go back in time.

I'm going to give you a tour from the beginning of sharing

code in Ruby up until now.

And hopefully by the end of it you'll understand

why things work the way they do now.

I'm going to start talking about require,

which came with the very first version of Ruby ever, in 1994.

And then talk about setup.rb from 2000,

and then RubyGems from 2003, and Bundler from 2009.

And that's what we're still using today.

The require method has been around since

1994, with the very first version of Ruby.

What I should say is that it's been there

since at least 1997, since that's

the oldest version-controlled Ruby we have.

It was probably there before that though.

Require can be broken down into even

smaller concepts. Using code from

a file is basically the same as inserting

that code and having Ruby run it

as if you'd just written it in the file.

It's actually possible to implement it yourself,

with a one-line function.

This function says: I have a file

name and I want to require it,

and you read the file into memory

as a string, and you pass the

string to eval, and Ruby runs it

and it's just like you typed that code yourself.
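
A minimal sketch of the one-line version described here. The name `naive_require` is made up so it does not clash with the real `require`, and evaluating against `TOPLEVEL_BINDING` is an assumption chosen so the code behaves as if it were typed at the top level.

```ruby
# A naive require: read the file into a string and hand it to eval.
# TOPLEVEL_BINDING runs the code as if it were written at the top level of
# the program; passing the filename gives nicer error messages.
def naive_require(filename)
  eval(File.read(filename), TOPLEVEL_BINDING, filename)
end
```

Calling `naive_require` twice on the same file runs that file twice, which is the problem discussed next.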

There are problems with this.

Require doesn't work this way in real life.

I'm sure it's totally fine that this will

run that same piece of code over and over

if you require it over and over, you like

having lots and lots of constants that keep

getting redefined, I'm sure it's totally fine.

Working around that is pretty straightforward.

Just keep track of what you've required in an Array

and not require something again if it's been required.

As you can see here,

you set up an Array, you check

to see if the Array already contains

the filename that just got passed in,

and if it hasn't been required,

do the same thing we did before,

read the file in, pass it to eval,

and then add it to the array,

so it's not required again later.
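
A sketch of the version with bookkeeping, under the same assumptions as before. `REQUIRED_FILES` and `require_once` are hypothetical names standing in for what Ruby does internally.

```ruby
# Hypothetical stand-in for Ruby's own list of already-required files.
REQUIRED_FILES = []

def require_once(filename)
  # Skip files we have already evaluated.
  return false if REQUIRED_FILES.include?(filename)

  eval(File.read(filename), TOPLEVEL_BINDING, filename)
  REQUIRED_FILES << filename
  true
end
```

Like the real `require`, this returns `true` the first time a file is loaded and `false` on later calls.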

In fact, this is exactly what Ruby does,

but written in C not in Ruby.

There is a $LOADED_FEATURES global variable,

and it's an Array, and it contains a list

of all the required files.

If you want to know if you've required something yet,

check the LOADED_FEATURES array.
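
In real Ruby the variable is spelled `$LOADED_FEATURES`. A small sketch of checking it, using `json` from the standard library purely as an example:

```ruby
require "json"                     # loads json.rb unless something already did
loaded_again = require "json"      # false: the file is already in $LOADED_FEATURES

puts loaded_again.inspect
# Entries are absolute paths to the files that have been required.
puts $LOADED_FEATURES.select { |path| path.end_with?("/json.rb") }
```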

There is one more problem with this,

it only works when you pass in absolute paths.

I'm sure you don't mind typing the

full path from wherever you are to

exactly wherever the file you want to require is.

I'm sure that's fine too.

The easiest way to allow requires that

aren't absolute is to just treat

all requires as if they're relative

to the path where you started

the Ruby program. And that's easy,

but that doesn't help a lot if you

want to require Ruby files from different places.

Say you have a folder full of a library

you wrote and a folder full of an application you wrote

and you want to use a library from the app, you can't,

because writing relative paths from wherever

you started the Ruby program would be terrible.

Instead we create an Array that holds the

list of paths we want to load

Ruby files from. In a burst of creativity

I'm just going to call that variable the

LOAD_PATH, and here's an implementation.

If you put something in the LOAD_PATH Array,

you can then pass a relative path to any directory

that's in the LOAD_PATH Array, and

it will look for the file.

If you require "foo", it will look for a file

named "foo" inside any of the LOAD_PATH directories,

and the first one we find, searching

the LOAD_PATH in order from first to last,

we will require that one.
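
A sketch of the load path lookup described above. `MY_LOAD_PATH` and `require_from_load_path` are hypothetical names, and appending `.rb` to the name is an assumption made to mirror how the real `require` is usually called.

```ruby
# Hypothetical stand-in for Ruby's load path: a list of directories.
MY_LOAD_PATH = []

def require_from_load_path(name)
  name += ".rb" unless name.end_with?(".rb") # assumption: allow require "foo"

  # Search the directories in order; the first match wins.
  MY_LOAD_PATH.each do |dir|
    candidate = File.join(dir, name)
    next unless File.file?(candidate)

    eval(File.read(candidate), TOPLEVEL_BINDING, candidate)
    return candidate
  end

  raise LoadError, "cannot find #{name} in MY_LOAD_PATH"
end
```

If two directories both contain `foo.rb`, the one listed earlier in `MY_LOAD_PATH` is the one that gets loaded.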

Coincidentally, this is exactly what

Ruby does, there is a global variable

named $LOAD_PATH, and if you put

a string that contains a path to a directory

in it, Ruby will look in that directory

whenever you require something for a file

with that name.

You can totally use the LOAD_PATH to require

files from somewhere else while you're working with them.
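
A small runnable sketch of using the real `$LOAD_PATH`. The directory, file, and module names are invented for the demonstration; a temporary directory plays the role of the library folder.

```ruby
require "tmpdir"

Dir.mktmpdir do |dir|
  # A throwaway "library" directory containing a single file.
  File.write(File.join(dir, "load_path_demo_greeter.rb"), <<~RUBY)
    module LoadPathDemoGreeter
      def self.hello
        "hello from the load path"
      end
    end
  RUBY

  $LOAD_PATH.unshift(dir)            # search this directory first
  require "load_path_demo_greeter"   # found via $LOAD_PATH, no absolute path needed
  puts LoadPathDemoGreeter.hello
  $LOAD_PATH.delete(dir)             # tidy up after the demonstration
end
```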

Of course, the LOAD_PATH and LOADED_FEATURES

can both be combined, but that didn't

fit on a single slide, so I'll leave that

as an exercise to the listener.

It's pretty straightforward to be honest.

Load paths are pretty cool.

They allow us to load Ruby directories

even if they're spread across multiple places.

At this point, we could even

automatically add, at the start of every script,

the directory that holds the standard library,

to the load path, and then all of the

files that are part of the Ruby standard library,

like Net::HTTP, Set, the cool things that

come with Ruby, could just be available for

require automatically and you wouldn't have

to worry about putting them in the

load path yourself. That's exactly

what Ruby does, the standard library

starts on the load path when Ruby starts.
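
A quick way to see this on your own machine, as a sketch. It assumes a typical Ruby installation where `net/http.rb` lives in the standard library directory.

```ruby
require "rbconfig"

# Where this Ruby installation keeps its standard library.
puts RbConfig::CONFIG["rubylibdir"]

# Find the load path entry that provides net/http.
stdlib_entry = $LOAD_PATH.find { |dir| File.file?(File.join(dir, "net", "http.rb")) }
puts stdlib_entry
```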

It's pretty great. This was cool, and

for several years, this was enough.

People just added things to the load path.

Or wrote scripts that added things to the

load path before requiring things before their

actual script happened.

The thing that got tedious about just having

load paths, is that if you want to get code from

someone else, you have to find that code,

download it, put it somewhere, remember where,

put it in the load path, and then require it.

This was tedious.

Setup.rb happened next.

Around the year 2000 everyone is still

installing shared Ruby code by hand.

That wasn't so much fun.

A Japanese Ruby developer, Minero Aoki,

wrote setup.rb, and amazingly,

even though this was created in 2000,

setup.rb is still around on the Internet.

The website for this developer is,

i.loveruby.net, which is pretty cool,

and you can even download setup.rb, but

to be honest, it hasn't been updated since 2005,

so I'm not sure it's super helpful to you.

How did setup.rb work?

At its core it mimicked the classic

UNIX installation pattern,

downloading a piece of software,

decompressing it, and then running

configure, make, and make install,

so setup.rb kind of copied that for Ruby.

You would run ruby setup.rb setup,

ruby setup.rb config, ruby setup.rb install.

setup.rb would copy all the Ruby files,

there was a specific directory structure,

kind of like a Gem today, with

library files, and bin files you could run as programs,

and support files, and setup.rb would

copy all of those files into a directory

that was already in the load path, called

site_ruby, and that held the Ruby files

you had installed that were specific

to your computer.

After setup.rb, using Ruby libraries

was much easier than it had been.

You could find a library online,

download it, you had to untar it by hand,

and run ruby setup.rb all by hand,

but then it was all installed, and no more

manual copying, no more having to

manage all these files.

Everything was in the load path,

you could just require it after setup.rb ran.

After a little while, some of the

shortcomings of this scheme became apparent, too.

There were no versions for any libraries,

and after you run setup.rb there's not even

a way to tell what version you have, unless

you write it down, or the library author

was really nice, and put the version into

the code somehow. There was no way

to uninstall; everything was thrown into

the same directory. You'd run setup.rb

for 5 different Ruby libraries and

now all of their files are in one directory.

Good luck figuring out which ones belong to which.

If you delete the wrong one, too bad.

Upgrading was super fun, if there was

a new version of the library, which

good luck finding that out, you

have to remember the website you got

it from in the first place.

I hope you wrote all these down.

I hope you've written down every

website you've ever downloaded Ruby from.

You have to go back to that website,

remember which version you have, which

as I said before, there's nothing there unless

you wrote it down.

And then you have to download the

tarball with the new version, and

decompress it, and CD into it and

run ruby setup.rb on it all, and

hope that the new version didn't delete

any files because the old files are still there.

This was tedious, it was really tedious.

People frequently had no idea what

was actually happening with their libraries.

It was not uncommon for people to be like

"Oh this doesn't work, I'll just fix it

in my site_ruby directory, ok everything

is great now"

Super awesome.

At some point, some people were like

this is not great. What if you could just

gem install. That would be cool.

And so in 2003, RubyGems came to the rescue.

And fixed all of the problems with setup.rb

that were known. You could check

to see if a library existed by running gem list,

install a gem by gem install, uninstall gems.

RubyGems kept each of these libraries in different directories.

You knew which libraries you had, and how to uninstall

and install new versions, all with one command.

No having to find it on the internet somewhere,

download, and unpack it, setup.rb it.

And RubyGems had another super cool trick

up its sleeve -- versions.

RubyGems actually kept each version of each

gem in a different place. You could install

multiple versions of the same library.

And they could all be in your Ruby

because they didn't all go into one giant folder,

they went into their own separate folders.

Folders for rails 4.1, 4.2, and 5.0.

To make this work, because require doesn't

support versioning, inherently,

RubyGems added a gem method that

lets you say, I need version 1.0 of rack,

and RubyGems will check to make sure it's

installed, put that directory, just the one

with rack 1.0 into your load path.

So when you run require "rack" you'll

get rack 1.0, it's pretty cool.

Calling the gem method told RubyGems

you wanted to manipulate the load path to

load exactly the version you knew your

code wanted to talk to.
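
A hedged sketch of using the `gem` method. `Kernel#gem` and `Gem::LoadError` are real RubyGems APIs; the helper name `require_exact_version` is made up, and whether rack 1.0 is installed depends on your machine.

```ruby
require "rubygems"

# Activate one exact version of a gem, then require a file from it.
# Returns false instead of raising when that version cannot be activated.
def require_exact_version(gem_name, version, feature = gem_name)
  gem gem_name, version # puts that version's directory on $LOAD_PATH
  require feature
  true
rescue Gem::LoadError
  false
end

# Example call: true only if rack 1.0 is installed and can be activated.
# require_exact_version("rack", "1.0")
```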

It was pretty useful.

RubyGems also has a way to support

versioning even in commands that

come with gems. The rack gem

comes with the rackup command, and

if you have multiple versions of rack installed,

the rackup command could run any of those versions.

RubyGems defaults to the newest version you have

installed, hoping the newest is the right one.

But if it's not, RubyGems checks the first

argument to the command for something

with underscores on either side,

it takes that as the version number

that you want to use.

In the above example, we're running

rackup from rack version 1.2.2, and only 1.2.2.

If you don't have that version installed, RubyGems will

make you install that version first.
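
The underscore convention can be sketched roughly as follows. This is an illustration of the idea, not the exact wrapper code RubyGems generates; `split_version_argument` is a hypothetical helper, while `Gem::Version.correct?` is a real RubyGems method.

```ruby
require "rubygems"

# Split an argument list into [version requirement, remaining arguments].
# A first argument like "_1.2.2_" selects that exact version; otherwise any
# version is acceptable and RubyGems would pick the newest one.
def split_version_argument(argv)
  version = argv.first.to_s[/\A_(.*)_\z/, 1]

  if version && Gem::Version.correct?(version)
    [version, argv.drop(1)]
  else
    [">= 0", argv]
  end
end
```

With this sketch, `split_version_argument(["_1.2.2_", "-p", "3000"])` gives `["1.2.2", ["-p", "3000"]]`, so the command would run with version 1.2.2 and the remaining arguments.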

RubyGems was really, really successful.

Ruby grew in popularity a lot, but RubyGems

made sharing Ruby code grow a lot.

Present day we have 100,000 gems, with 1,000,000 versions.

That's a lot of shared Ruby code.

You probably knew this was coming,

but as cool as RubyGems is, it still had

some problems. If you have multiple

applications that all use RubyGems to load

their dependencies, this can be problematic.

It's hard to coordinate across multiple applications

because, each installation of Ruby itself just has

a set of gems. If you ran gem install, now

there are all these gems.

If one developer runs gem install "foo" and

starts using "foo" in their application,

commits that code and checks it in,

and the next person checks it out

and tries to run the application,

it's going to explode, because it doesn't

know what foo is, you need to fix that.

It led to an era of pure manual dependency management.

Start a new job, hooray!

This literally happened to me in 2008.

New job, welcome to the team, here's

your cool new laptop, we

expect you to have the application

running by next week.

It actually took me only 3 and a half days,

working overtime on this. It was amazing.

[Audience Laughs]

To figure out which gems to run gem install,

I looked in the README and there

was a list.

And I installed all of them.

But clearly there were some that people

forgot to put in the README, and

then it kind of worked, but I wasn't

able to get images working. And then

some other developer was like,

you need to install imagemagick,

this was before homebrew. It was terrifying.

To try and fix this problem,

do we just put the gems in the README?

How do we know if we have

written everything in the README?

"I don't know? Try it?"

Of course, you'd need a new machine

to try it on, because after 3 years

of using Ruby you generally have

installed every gem, and you have

no idea what's important and what's not.

It's terrible.

People started to work on tools to help this problem.

Rails added config.gem, this is the Rails 2.x era.

You would put all the gems you need in application.rb

This was super helpful if you needed

to know for sure this was the

master list of all the gems

you needed in your application, but

you could only access that list when

Rails was already loaded.

It was pretty bad.

Because RubyGems automatically uses

the newest version of each gem, just having

an older version installed, didn't mean it

would be used. And if you install

some gem a month after the other person did,

maybe there's a new version? You would

just get the new version automatically.

This is also totally a real-life experience that happened to me in 2009.

Debug a production server that just randomly throws exceptions.

For three days.

The other production servers are fine.

We can't reproduce this problem on a single developer laptop.

What is going on? This is so weird.

After 3 days I finally thought to look at

the output from gem list for the entire production

machine and I was like, oh this production

server has gem version 1.1.3 and every

other production server and developer laptop has 1.1.4.

That was the problem.

There was a bug and only that server

had this problem.

And then, like I was saying, about Rails versions,

you could gem install rails, be happy,

make a new app, run your server,

everything is great. And then

you switch to another application

that already existed, didn't get

written to use that version of rails,

got written to use some older version of rails.

You're like, "Okay, let's go!"

"Boom", because you didn't have

the right version of Rails.

If you put your rails version in the rails config

rails would complain you had the wrong version,

but rails had to be successfully started up to

tell you that you had the wrong version, so

it didn't actually help.

Ultimately, it was a significant part of my job

to figure this shit out by hand, and it sucked.

Depending on what you did on your team,

some people on my team at the time spent

a quarter or a third of their time

doing nothing but figuring out and fixing

dependency management issues.

And I felt really, really bad for them.

Sometimes it was me and I felt really bad for me.

Then there's one more, even after,

you've done all of this by hand management,

there's one more problem that RubyGems has

that is another reason why bundler was created.

Activation errors: they happen in RubyGems

when you load an application and start by

requiring gems. RubyGems will load the newest

versions of those gems that it can.

Sometimes a gem's dependencies need

other gems, that need other gems,

and you'll get the newest version of the

child gem. And later you'll say, I also

need this gem, but that gem won't work with the other.

So how common can this be really?

Unfortunately, it was super common.

Not like happens to you every day common,

but like happens to you two or three times a year

and when it does you basically tear

all your hair out, delete your entire

ruby install, uninstall and reinstall all your gems,

because figuring out exactly which combo

of installed gems was causing this

problem was a total nightmare.

This is a real-life activation error.

I salvaged this from a presentation I gave in 2010

about why Bundler exists.

This is a rails app, it's loading, and

rails of course depends on ActionPack, this

was the Rails 2.3 era, ActionPack depends on Rack,

Rack is a gem that helps Rails talk to web servers.

And thin, which is a web server, also depends on rack.

So, rack is how rails talks to thin, how thin

talks to rails, but there's a problem.

thin is perfectly happy to use rack 1.1, which makes some

changes to how rack works.

ActionPack is not happy to use rack 1.1, and

can only use rack 1.0. And so

when you run your server, it loads thin

first because thin is the server.

And thin gets to work trying to load the rails app

and your rails app says "I can't use that rack, sorry"

The reason this happens is runtime resolution.

RubyGems figures out which versions

of which gems it should load

after RubyGems is already running.

You say, "Hey I need a thing", and

it's like "Okay, this version might work".

And if later on you say,

"I need a thing that doesn't work with things you've already done"

RubyGems just has to be like, can't fix that.
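
A toy simulation of runtime resolution, to make the failure concrete. The installed versions, the requirement strings, and the `activate` helper are all assumptions modeled on the rack example above; only `Gem::Requirement` and `Gem::Version` are real RubyGems classes.

```ruby
require "rubygems"

# Hypothetical installed versions and activation state, for illustration only.
INSTALLED_VERSIONS = { "rack" => %w[1.0.0 1.1.0] }
ACTIVATED = {}

def activate(name, requirement)
  req = Gem::Requirement.new(requirement)

  # Once a version is activated, later requests must accept it or fail.
  if ACTIVATED.key?(name)
    return ACTIVATED[name] if req.satisfied_by?(Gem::Version.new(ACTIVATED[name]))

    raise "can't activate #{name} (#{requirement}), already activated #{name}-#{ACTIVATED[name]}"
  end

  # Otherwise pick the newest installed version that matches.
  newest = INSTALLED_VERSIONS.fetch(name, [])
                             .map { |v| Gem::Version.new(v) }
                             .select { |v| req.satisfied_by?(v) }
                             .max
  raise "no installed version of #{name} matches #{requirement}" unless newest

  ACTIVATED[name] = newest.to_s
end
```

If thin asks first with `activate("rack", ">= 1.0.0")`, it gets 1.1.0. When ActionPack later asks with `activate("rack", "~> 1.0.0")`, the call raises, even though rack 1.0.0 is installed and would have satisfied both.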

The fix for this problem is to figure out all the versions

before you run your application.

You have to know the versions you're going

to use are all versions that can work together.

Resolving things at install time,

knowing you're installing versions that work together.

How do we make sure all the versions we're

installing work together?

That's actually where Bundler comes in.

Before Bundler, the process of figuring out

which gems would work together

was done entirely by hand and it

consisted of gem uninstall,

gem install a slightly older version, does rails start up yet?

Repeat the process.

When the exceptions stopped, you knew you'd won.

Unsurprisingly, computers are faster at this than people.

Computers are also good and accurate at trying

many, many, many options until one works.

This is what Bundler does.

Bundler figures out the entire list of every gem

and every version of every gem that

you need, but that also all

work together with one another.

This is called Dependency Graph Resolution,

and there's an entire academic literature about this.

It's kind of a well-known hard problem, it's

part of the set of problems called NP-complete,

and the totally fantastic thing, and I say

this as a person who has to fix Bundler

when it doesn't work, in theory, you can construct

a set of gems in a gemfile such that

it is not possible to find a set of gems that

work together until after the heat death of the universe.

[Audience Laughs]

Most of the time we don't have that long to wait.

We use a lot of tricks, shortcuts, and heuristics

to figure out which gems to try first and

hopefully finish before you've drunk

that cup of coffee or whatever.

We have a large built-up set of tricks over the years

and most Gemfiles resolve in less than 10 seconds.

Which is pretty cool, considering the upper bound

on that is practically infinity.
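
To give a feel for the problem, here is a toy backtracking resolver. It is a sketch only, not Bundler's actual resolver: the index data is invented and abbreviated, and `resolve` is a hypothetical helper with none of the tricks or heuristics mentioned above.

```ruby
require "rubygems"

# Hypothetical index: name => { version => { dependency name => requirement } }
TOY_INDEX = {
  "rails"      => { "2.3.5" => { "actionpack" => "= 2.3.5" } },
  "actionpack" => { "2.3.5" => { "rack" => "~> 1.0.0" } },
  "thin"       => { "1.2.7" => { "rack" => ">= 1.0.0" } },
  "rack"       => { "1.1.0" => {}, "1.0.1" => {} },
}

# requirements is a list of [name, requirement] pairs still to satisfy;
# chosen maps each gem name picked so far to its version.
def resolve(requirements, chosen = {})
  return chosen if requirements.empty?

  (name, requirement), *rest = requirements
  req = Gem::Requirement.new(requirement)

  # Already picked a version of this gem: it must satisfy the new requirement.
  if chosen.key?(name)
    return req.satisfied_by?(Gem::Version.new(chosen[name])) ? resolve(rest, chosen) : nil
  end

  # Try candidates newest first, backtracking when a choice leads to a dead end.
  candidates = TOY_INDEX.fetch(name, {}).keys.sort_by { |v| Gem::Version.new(v) }.reverse
  candidates.each do |version|
    next unless req.satisfied_by?(Gem::Version.new(version))

    result = resolve(rest + TOY_INDEX[name][version].to_a, chosen.merge(name => version))
    return result if result
  end

  nil
end
```

Resolving `[["thin", ">= 0"], ["rails", "= 2.3.5"]]` first tries rack 1.1.0, finds that actionpack cannot use it, backs up, and settles on rack 1.0.1. Each extra gem multiplies the combinations to try, which is where the worst-case running time comes from.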

After finding versions that work together

because this problem was really hard,

and we don't want to do this over and over.

Bundler writes down the exact versions of every gem

that did all work together, so they can be reused

by other people who are also interested in running

your application. That file is called Gemfile.lock.

It shows which gems need to be installed,

the versions to install, and as a bonus

the lock file is what makes it possible

to install the exact same version of every

gem on every machine that's running this application.

That means when you develop on your laptop

you get whatever version of the gem that was

newest when you were developing, because you ran

bundle install and got the newest version by default.

Because of the lock file, when you put

that on your production server, you're guaranteed

to have the same versions. And you won't

have to spend 3 days figuring out why

that production server doesn't quite

work all of the time.

It's pretty great.
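
A sketch of what the lock file records. The sample below is an abbreviated, illustrative lockfile rather than output from a real project, and `locked_versions` is a hypothetical helper that relies only on the indentation of the spec lines.

```ruby
# An abbreviated, illustrative lockfile; a real one lists every gem.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      actionpack (2.3.5)
        rack (~> 1.0.0)
      rack (1.0.1)

  DEPENDENCIES
    actionpack
LOCK

# Collect the exact versions: spec lines are indented four spaces,
# while the dependency lines beneath them are indented six.
def locked_versions(lockfile)
  lockfile.scan(/^ {4}(\S+) \(([^)]+)\)$/).to_h
end

puts locked_versions(SAMPLE_LOCKFILE).inspect
```

Every machine that installs from the same lock file ends up with the same name and version pairs.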

Fundamentally, the core of Bundler consists of two steps.

bundle install, and bundle exec.

The steps for bundle install are simple.

They're totally understandable in plain English.

It fits on a single slide, which is great.

I edited this slide for ten minutes deleting words.

So the steps are:

1. Read the Gemfile

2. Ask RubyGems.org for a list of all the gems we need

3. Find versions of those gems that are both allowed by the Gemfile and work together

4. Once found, write all those down in the lock and install them all.

And that's how bundle install works.

bundle install uses RubyGems under the covers

to do the installation, and so every

bundle is its own little isolated RubyGems install.

Every application has its own RubyGems thanks to Bundler.

The next step is bundle exec.

This is how we use that application's dedicated RubyGems

instead of the one with whatever in it

because you ran gem install last year.

The way bundle exec works is:

1. Reads the Gemfile, and lock if it's there.

2a. Use locked gems if possible OR

2b. Find versions that work together like install would.

except bundle exec doesn't do any installing.

3. Deletes any existing gems in the LOAD_PATH

4. Adds the exact gem at the exact version to the load path.

That's it. That's all bundle exec does.
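
Steps 3 and 4 can be sketched as a pure function over a load path. This is an illustration, not Bundler's code: `locked_load_path` is a hypothetical helper, and the `gems/<name>-<version>/lib` directory layout is an assumption about a typical RubyGems install.

```ruby
# Build the load path an application should run with, given the locked versions.
def locked_load_path(load_path, gem_home, locked_versions)
  gems_dir = File.join(gem_home, "gems")

  # Step 3: drop any gem directories that are already on the load path.
  without_gems = load_path.reject { |dir| dir.start_with?(gems_dir + "/") }

  # Step 4: add exactly the locked version of each gem.
  locked_dirs = locked_versions.map do |name, version|
    File.join(gems_dir, "#{name}-#{version}", "lib")
  end

  locked_dirs + without_gems
end
```

The standard library entries are kept, while a stray `rack-1.1.0` directory is replaced by whichever rack version the lock file names.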

Once all the gems work together, and

their exact versions are in the load path

your application is happy. There are no

activation errors, all your requires succeed, I hope.

Everything is pretty great.

As I think I promised in the abstract for this talk,

here's a bundle exec removing pro tip.

I don't really like typing bundle exec, I find it

really annoying, but bundler provides a way

to not have to type it all the time.

And it's to create programs that map to

the RubyGems installation that

belongs to that application.

You can use the binstubs command,

bundle binstubs [some gem]

and it will create, in the bin directory,

a program for that gem, that only

runs the exact version that belongs to

that application. So if you have

rspec in your rails app, you can have

bin/rspec that will only load the rspec

for your app. This way another application can have

bin/rspec refer to rspec 3, and this application

can have rspec 2. Rails has started to do this.

Rails 4 ships with bin/rails bin/rake that are scoped

so when you run bin/rails, you get the exact

rails version for this application and not another one.

When you run bin/rake you get the exact version of rake.

Pretty cool, no more bundle exec.

If everyone did this, you can check in these binstubs

so you can take bin/rspec, put it in git,

and it'll be mapped to that application forever,

so no one would have to bundle exec

ever again if everyone did this.

Now we bundle install, all our gems

show up. We have versions

dedicated for individual applications.

But, as you probably sensed a problem

going through history, that wasn't actually

the end. There are still problems

that show up after bundler came out.

The biggest problem that was left was

running bundle install, took forever.

If you lived a long way from the United States

it took a really long time.

I talked to some developers in South Africa

when I went there to give a talk

and they told me about how running

bundle install means they literally get

up to start making a cup of coffee

that they can finish before bundle install does.

To try and speed things up, bundler 1.1

created a completely different

way to get information from rubygems about gems.

And that sped things up by 50%, a big win.

We keep working on this, bundler 1.9 just

came out this month. There's a bunch more

improvements we're working on.

If you're interested in following along with that,

the Bundler website has news announcements

at bundler.io, and on Twitter we're also @bundlerio.

Having said all of this, if you use Bundler,

I would totally love to have your help working on it.

It's an open source project.

We've dedicated a lot of time to making it easy

for people who don't know how to do open source

to help with Bundler, and to start working on Bundler,

and to get into open source that way.

It's a project at Github.com/bundler/bundler.

If you're interested but don't know where to start

email the bundler team at team@bundler.io

and we'll get you set up.

On the other hand, if you

have a job that means you have money,

but not time, join Ruby Together, and give

us money, and we'll work on Bundler, and it'll be

better. As RubyTogether grows, we will also be

tackling bigger community issues.

We want to add easy to use gem mirrors so you

don't have to go all the way to rubygems.org

for your office or data center, we want to

add better public benchmarks. There's a project

called ruby-bench that's starting to do that,

and we'd really like to expand it.

There's a bunch of other things

that RubyTogether is working on that are cool

If you want Bundler or RubyTogether stickers

I have a giant pile, so find me later.

That's it.

[Audience Applause]