Performance implications of block&capture helpers

Posted by Jonathan

Recently there have again been some articles on block and capture helpers in Rails (e.g. this one).

A block/capture lets you write nicer looking helper functions in Rails, especially if you want to render some HTML before and after a given piece of code. Typical use cases are styled blocks (HTML) that should surround your items.

Assume we have projects that should be displayed on an overview page. You want to render a nice box with the project title and the project description for each project. Often you end up with partials like this:

<div class="box">
  <span class="box_title"><%=h project.title %></span>
  <div class="box_body">
    <%=h project.description %>
  </div>
</div>

When you want to reuse this partial for models other than a project, and sometimes render different HTML in the box_body, you typically create a helper function. The two most common approaches are a start/end pair of helpers and a block/capture helper.

The idea of the start/end helper combination is that you call one helper to create/print all the HTML before your item rendering, render your item, and then call another helper to close all the tags and finish the box:

# application_helper.rb
def start_box(title)
  out = "<div class='box'>"
  out += "<span class='box_title'>#{h(title)}</span>"
  out += "<div class='box_body'>"
  out
end

def end_box
  "</div></div>"
end

# view

<%= start_box(project.title) %>
  <%=h project.description %>
  <!-- add custom HTML here -->
<%= end_box %>

This works fine but is a bit inelegant. Worse, you can forget the end_box call and nothing would complain; your HTML would just be broken.

A much nicer looking solution is the block/capture helper:

# application_helper.rb
def box(title, &block)
  out = "<div class='box'>"
  out += "<span class='box_title'>#{h(title)}</span>"
  out += "<div class='box_body'>"
  out << capture(&block) if block_given?
  out += "</div></div>"
  block ? concat(out, block.binding) : out
end

# view
<% box(project.title) do %>
  <%=h project.description %>
  <!-- more custom HTML here -->
<% end %>

So we are using a block here to pass in our box content. The block is created with the do/end syntax and passed as an argument to the helper function. The helper creates some output, evaluates the block, and then appends the closing HTML. With the block syntax we can never forget to "close" a box, because the Ruby interpreter would complain about the missing end keyword. Further, the helper method can choose not to render anything at all, which is especially useful for administrative links.
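The "choose not to render anything" case can be sketched in plain Ruby, without Rails. This is only an illustration: the `admin_only` helper and its `admin` flag are hypothetical names, and the block is read with `yield` instead of Rails' `capture`.

```ruby
# Plain-Ruby sketch, not Rails code: `admin_only` and the `admin` flag
# are hypothetical names used for illustration only.
def admin_only(admin)
  # For non-admins the block is never evaluated and nothing is rendered.
  return "" unless admin
  "<div class='admin_links'>#{yield}</div>"
end

admin_html = admin_only(true)  { "<a href='/projects/1/edit'>Edit</a>" }
guest_html = admin_only(false) { "<a href='/projects/1/edit'>Edit</a>" }

puts admin_html
puts guest_html.empty?
```

With a start/end pair the caller would have to wrap both helper calls and the content in the same condition; here the decision lives in one place.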

I'm a big fan of the block/capture helper and prefer its syntax over the start/end way any time. I just want to point out a disadvantage of the block/capture way that you should be aware of.

During a recent performance analysis for a client I profiled their root page, an overview page with many HTML-boxed elements. I noticed that it rendered really slowly, especially when there were many items on the page. After a bit of digging into the rendering, we found that their block/capture helpers were eating 70-80% of the rendering time. The problem is that creating a block (aka a closure), storing its binding (scope and surrounding variables), and then passing this closure around is expensive compared with the "pure" string output helper.
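To get a rough feeling for the closure overhead outside of Rails, a plain-Ruby micro-benchmark along these lines can help. It is only a sketch under several assumptions: the helpers are simplified copies of start_box/end_box and box without HTML escaping, `block.call` stands in for Rails' capture/concat, and the iteration count is arbitrary. The result depends heavily on the Ruby version, and newer interpreters may optimize block passing, so it says nothing definite about a real view.

```ruby
require 'benchmark'

# Simplified string-returning helpers (no escaping, no Rails).
def start_box(title)
  "<div class='box'><span class='box_title'>#{title}</span><div class='box_body'>"
end

def end_box
  "</div></div>"
end

# Simplified block-taking helper. The explicit &block parameter may force
# Ruby to create a Proc object, depending on the Ruby version.
def box(title, &block)
  out = "<div class='box'><span class='box_title'>#{title}</span><div class='box_body'>"
  out << block.call if block
  out << "</div></div>"
end

iterations = 100_000 # arbitrary, chosen only to get measurable timings

Benchmark.bm(10) do |bm|
  bm.report("start/end") { iterations.times { start_box("t") + "body" + end_box } }
  bm.report("block")     { iterations.times { box("t") { "body" } } }
end
```

Both variants build the same string, so any difference in the timings comes from how the body is handed to the helper.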

How much more expensive is shown by this benchmark. I created a new test Rails project with two actions. Each action displays a project; one uses the start/end helper and the other one the block/capture helper. On the X-axis you see the number of helper calls inside the view and on the Y-axis you see the number of requests per second (as reported by ruby script/performance/request -n 1000 -b).

So the block/capture helper style is a lot slower than the simple start/end helper. But it only matters if you use it a lot on a page. With 250 calls on a page, the block/capture style helper delivers only 20% of the requests per second that the start/end style helper can deliver. 250 calls may seem like a lot, but in my case the page displayed 50 boxed items and had several other boxed elements (e.g. login, stats, ads). If you then add a lot of variables to the closure's scope (which need to be included in the binding), the page rendering can get really slow.

I'm not arguing against block/capture in general; I really like the resulting syntax and flexibility. But as is often the case with syntactic sugar and nicer-looking code, you trade it for performance. Most of the time this should not matter, but when it does, it shows!

DISCLAIMER: Those numbers are not statistically valid and you should not try to read too much into them. Use them as a hint where to look for slow rendering.

Comments

  1. Oscar Del Ben, May 25, 2008 @ 08:08 PM

    Nice. Thanks for sharing. I think that most of the time we can resolve this by caching the fragment (if possible).

  2. Hugo Barauna, May 26, 2008 @ 11:23 AM

    Good to know, thanks! I would like to know if you have some kind of tool (or script) that can automate the creation of a benchmark graph like yours. Or better, what tools did you use to generate those benchmarks and the graph?

    Thanks!

  3. Jonathan, May 26, 2008 @ 12:55 PM

    @Hugo:

    I used script/performance/request to get the numbers and Excel/Numbers to create the graph. If I were to do a longer performance run, I would write a script that calls script/performance/request multiple times with different parameters.

  4. ernst, May 26, 2008 @ 02:45 PM

    i’m incapable of getting script/performance/request to give me some output on linux. problem must be the open command (default ‘open s x%x’, which would only work reasonably on osx – but my app resides on a linux server and i want to test it there). i’ve tried things like “less s”, “lynx %s x%x” and “echo %s” and combinations of them, but i’m unable to get output. can you help me? thanks

  5. Jonathan, May 26, 2008 @ 04:42 PM

    @ernst:

    The profile/benchmark results are written to RAILS_ROOT/tmp/, you can just open them there with your editor/browser of choice.

  6. ernst, May 26, 2008 @ 08:31 PM

    thanks very much! :)

  7. whwi, June 09, 2008 @ 09:12 AM

    hi

    i get these errors from the script with rails 2.0.2 and rails 2.1 :-( can you help me?

    ruby script/performance/request -n 1000 -b
    /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:43:in `read': No such file or directory - -b (Errno::ENOENT)
    from /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:43:in `define_run_method'
    from /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:16:in `initialize'
    from /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:87:in `new'
    from /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:87:in `run'
    from /usr/lib/ruby/gems/1.8/gems/actionpack-2.1.0/lib/action_controller/request_profiler.rb:83:in `run'
    from /usr/lib/ruby/gems/1.8/gems/rails-2.1.0/lib/commands/performance/request.rb:6
    from /usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
    from /usr/local/lib/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
    from script/performance/request:3

    i tested it with ruby script/performance/request -n 1000 -b http://localhost:3000/ and mongrel running on port 3000 but got the same problem

    best regards

  8. Jonathan, June 10, 2008 @ 08:14 PM

    You need to supply a performance script, see http://railscasts.com/episodes/98