
How does Bundler do its magic!?

Every Rubyist knows Bundler, the -now prominent- dependency management tool for Ruby programs. Ever since I came across Bundler, I’ve been wondering how it does its magic behind the scenes.

I stumbled upon a great article by Pat Shaughnessy, “How does Bundler bundle?”, which explains how, given a list of dependencies in the Gemfile, Bundler resolves the exact list of gems and their dependencies (as well as their exact versions) to install and enable. The article answered the first half of the question: “How does Bundler bundle?”. But it left me wondering about the second half: “How does Bundler constrain your bundle’s code to requiring only the proper set of gems listed in the Gemfile and their dependencies, as resolved by Bundler?”

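In other words: how do Bundler.setup and Bundler.require work, how does `bundle exec` work, and how do Bundler’s binstubs work?

After a little investigation, I think I have answers to these questions: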

1. How do Bundler.setup and Bundler.require work?

To understand them, we first need to understand how Ruby’s `Kernel#require` method works. This is the method you call whenever you require a gem or a file.

The core Ruby library now actually includes two implementations of `Kernel#require`:
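Before looking at the implementations, here is a small sketch of the three outcomes a `require` call can have (they are spelled out in the list that follows). The file name `plain_require_demo` and the helper `try_require` are made up for this illustration, and the script runs under whatever `require` your Ruby has installed, which for these three cases should behave the same way.

```ruby
require 'tmpdir'

# Hypothetical helper: requires `name` and reports what happened
# instead of raising.
def try_require(name)
  require(name) ? :loaded : :already_loaded
rescue LoadError
  :load_error
end

PLAIN_REQUIRE_OUTCOMES = []

Dir.mktmpdir do |dir|
  # A throwaway library file that only defines a marker constant.
  File.write(File.join(dir, 'plain_require_demo.rb'), "PLAIN_REQUIRE_DEMO = true\n")

  # The directory is not on $LOAD_PATH yet, so the lookup fails.
  PLAIN_REQUIRE_OUTCOMES << try_require('plain_require_demo')

  $LOAD_PATH.unshift(dir)
  # Now the file is found through $LOAD_PATH and loaded.
  PLAIN_REQUIRE_OUTCOMES << try_require('plain_require_demo')
  # A second call finds it already loaded and returns false.
  PLAIN_REQUIRE_OUTCOMES << try_require('plain_require_demo')

  $LOAD_PATH.delete(dir)
end

p PLAIN_REQUIRE_OUTCOMES # => [:load_error, :loaded, :already_loaded]
```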

  • The first is the plain old `Kernel#require`. This is how `require` used to work in Ruby 1.8 and earlier.

    When we `require` a file/gem (via a non-absolute path), it simply searches through the directories listed in the $LOAD_PATH ($:) global variable, and

    • If the file is already loaded, `require` returns false.
    • Else if the file is found in one of the directories in the $LOAD_PATH, `require` loads it.
    • Else if the file can’t be found, `require` raises a LoadError.

    This implementation didn’t modify the $LOAD_PATH, or try to search in the directories of installed gems whose paths were not listed in $LOAD_PATH.

  • The second is the `Kernel#require` implementation included in RubyGems.

    This is a patched replacement for the plain old `Kernel#require`. Note: Back in Ruby 1.8, we had to `require 'rubygems'` to enable RubyGems for the subsequent code; the implementation of `Kernel#require` would then be replaced by this new one.

    In Ruby 1.9+, however, RubyGems ships with Ruby and is enabled by default, so there’s no need to `require 'rubygems'`; it works out of the box.

    This `Kernel#require` implementation’s source can be found here. We can simply view its documentation by running `$ ri Kernel#require`, which reads:

    When RubyGems is required, Kernel#require is replaced with our own which is capable of loading gems on demand.
    
    When you call require 'x', this is what happens:
      * If the file can be loaded from the existing Ruby loadpath, it is.
      * Otherwise, installed gems are searched for a file that matches. If it's found in gem 'y', that gem is activated (added to the loadpath).
    
    The normal require functionality of returning false if that file has already been loaded is preserved.
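The lookup order in that documentation can be modelled roughly as follows. This is a toy sketch, not RubyGems’ actual code: `resolve_feature`, the `installed_gems` hash (gem name to lib directory) and the `load_path` array are hypothetical stand-ins for the real gem index and `$LOAD_PATH`, and the gem used in the demo is made up.

```ruby
require 'fileutils'
require 'tmpdir'

# Toy model of a gem-aware require lookup. Returns the path that would be
# loaded, and "activates" a gem by adding its lib/ directory to load_path.
def resolve_feature(feature, load_path, installed_gems)
  file = "#{feature}.rb"

  # 1. If the file can be found on the existing load path, use it.
  hit = load_path.find { |dir| File.file?(File.join(dir, file)) }
  return File.join(hit, file) if hit

  # 2. Otherwise search the installed gems for a matching file.
  name, lib_dir = installed_gems.find { |_, dir| File.file?(File.join(dir, file)) }
  raise LoadError, "cannot load such file -- #{feature}" unless name

  # 3. Activate the gem: its lib/ directory joins the load path.
  load_path.unshift(lib_dir)
  File.join(lib_dir, file)
end

Dir.mktmpdir do |root|
  lib_dir = File.join(root, 'toy_gem-1.0', 'lib')
  FileUtils.mkdir_p(lib_dir)
  File.write(File.join(lib_dir, 'toy_gem.rb'), "# empty on purpose\n")

  load_path = []
  gems = { 'toy_gem' => lib_dir }

  resolve_feature('toy_gem', load_path, gems)
  puts load_path == [lib_dir] # the gem was activated on demand
end
```

A second call for the same feature would be answered by step 1, because the gem’s lib/ directory is on the load path by then.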
        

So -in short- the version of `require <file>` that we all use now works by first searching for the file in all the directories listed in $LOAD_PATH. If it doesn’t find it there, it searches the lib/ directories of the gems installed by RubyGems in your Ruby installation. Once the file is found, the lib/ dir of the gem containing it is added to the $LOAD_PATH, and the file is loaded.

So, back to our first question:

“How do Bundler.setup and Bundler.require work so as to run your code in the context of the bundle?”

After Bundler resolves the exact gem versions that we want to be activated/enabled (as explained in Pat’s article), Bundler.setup() (or `require 'bundler/setup'`) simply collects the paths to these gem versions’ lib/ directories and prepends them all to the $LOAD_PATH global array, so that any subsequent call to `require '<file>'` looks first in the lib/ directories enabled by Bundler.

Any subsequent call to `require '<file>'` will thus simply find the needed file in the $LOAD_PATH, since it’s now ready with the correct gem versions. There will be no need to search for new directories to add to the load path, and if the <file> isn’t found in the $LOAD_PATH prepared by Bundler, a LoadError will be raised.

In order to prevent `Kernel#require` from looking further in the other installed gems, Bundler also disables the patched `Kernel#require` by reverting to the original implementation.

So for your code to properly use Bundler, it has to call `Bundler.setup(*groups)` (or simply `require 'bundler/setup'`) at the very beginning.

Bundler.require, on the other hand, is just a convenience method that auto-requires all the gems in your bundle, without you having to make lots of calls to `require`. Bundler.require also implicitly calls Bundler.setup, so you don’t have to call Bundler.setup before calling Bundler.require.
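To make the load-path half of this concrete, here is a toy model, not Bundler’s real code. `prepend_bundle_paths` and `resolved_lib_dirs` are hypothetical names standing in for what Bundler does with its resolved specs, and `bundle_demo_gem` is a made-up gem “installed” in two versions inside a temp directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Toy model: put the lib/ directory of every resolved gem version
# at the front of the load path.
def prepend_bundle_paths(resolved_lib_dirs, load_path = $LOAD_PATH)
  load_path.unshift(*resolved_lib_dirs)
end

Dir.mktmpdir do |root|
  # Two "installed" versions of the same made-up gem.
  lib_dirs = %w[1.0 2.0].to_h do |version|
    dir = File.join(root, "bundle_demo_gem-#{version}", 'lib')
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, 'bundle_demo_gem.rb'),
               "BUNDLE_DEMO_GEM_VERSION = '#{version}'\n")
    [version, dir]
  end

  # Pretend the bundle resolved to 1.0, even though 2.0 is installed too.
  prepend_bundle_paths([lib_dirs['1.0']])

  require 'bundle_demo_gem'
  puts BUNDLE_DEMO_GEM_VERSION # => 1.0

  $LOAD_PATH.delete(lib_dirs['1.0'])
end
```

Only the pinned version’s directory is ever placed on the load path, so the later `require` has exactly one candidate to find.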

Moving on ..

2. How does `bundle exec` work?

So, what if you want to call a Ruby program/command/executable that doesn’t use Bundler (i.e. doesn’t call Bundler.setup at the beginning of its execution), or you’re just not sure whether it does? For example:

  • If you run `rake db:migrate`, you’re asking Rake (which doesn’t necessarily use Bundler to fetch dependencies) to run a couple of Rails tasks (that consume Rails code and its dependencies). You’ll then need to run it as `bundle exec rake db:migrate` to make sure it uses the proper Rails dependencies for your project.
  • On the other hand, when creating a new Rails project, you do not need (though it won’t hurt) to run `bundle exec rails new <project>`, since the `rails` CLI script already calls Bundler.setup.

The details of how `bundle exec <command>` works are better explained in the command’s manpage, but they can be simplified into a couple of main points:

  1. It adds the -rbundler/setup Ruby command-line option (equivalent to `require 'bundler/setup'`, i.e. Bundler.setup) to the RUBYOPT environment variable, so that when it executes any Ruby <command> in a sub shell/process, Bundler will do its magic (as explained above).
  2. It invokes <command> through `Kernel#exec` (code here).
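The RUBYOPT mechanism can be tried out without Bundler at all. The sketch below is an illustration under assumptions: `env_with_preload` is a hypothetical helper, a throwaway `preload_demo.rb` stands in for bundler/setup, and the command is run as a child process (rather than through `Kernel#exec`, which would replace the current process) so that its output can be inspected. It also assumes the temp directory path contains no spaces.

```ruby
require 'rbconfig'
require 'tmpdir'

# Builds an environment in which every Ruby process requires `preload`
# before running its own code, keeping any RUBYOPT that is already set.
def env_with_preload(preload, current = ENV['RUBYOPT'])
  { 'RUBYOPT' => ["-r#{preload}", current].compact.join(' ') }
end

Dir.mktmpdir do |dir|
  # Stand-in for bundler/setup: a file that only leaves a marker behind.
  preload = File.join(dir, 'preload_demo.rb')
  File.write(preload, "$preloaded = true\n")

  # The child script never requires anything itself; it just reports
  # whether the marker was set before it started.
  child = [RbConfig.ruby, '-e', 'print $preloaded.inspect']
  output = IO.popen(env_with_preload(preload), child, &:read)
  puts output
end
```

If the preload ran, the child reports the marker even though its own code never asked for that file, which is the same trick that lets `bundle exec` set up the bundle for a program that knows nothing about Bundler.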

Moving on ..

3. How do Bundler’s binstubs work?

Now to the last point:

“How does Bundler’s `bundle install --binstubs` generate wrapper commands/executables that -when executed directly- emulate `bundle exec <command/executable>`?”

First, Mislav Marohnić brilliantly explains the concept of binstubs in an article that’s part of rbenv’s documentation (read that first, really).

Bundler simply creates a minimal wrapper that includes the following code:

require 'bundler/setup'
load Gem.bin_path(<gem-name>, <command-name>)

We’re now familiar with the first line: it’s the basic `Bundler.setup` magic (explained above). The second line simply loads the Ruby code contained in the command’s executable from the appropriate gem, and executes it. All of this happens after the appropriate lib/ directories have been added to the $LOAD_PATH, and `Kernel#require` has been unpatched to prevent loading files from other gems not included in the bundle.

That would be it 🙂

If you found a mistake -very probable- or have a comment, please feel free to leave one.
