SimplyStored and CouchDB

Posted by Jonathan

Yesterday I gave a presentation about CouchDB and SimplyStored, our convenience Ruby library, at the Ruby User Group Berlin.

There is a recording of the presentation at ustream.tv.

Mathias and I wrote SimplyStored in order to easily interact with Ruby objects serialized in CouchDB. We use CouchDB as the main data store for Scalarium and so far it has been great. But it is a bit cumbersome to write all those map and reduce functions yourself.

SimplyStored generates the JavaScript map&reduce functions for handling associations or dynamic finders for you.
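To give a feel for what gets generated, here is a rough sketch of a helper that builds a map function for a simple by-property finder. It is illustrative only: the helper name `map_function_for` and the `ruby_class` document field are assumptions made for this example, and the views SimplyStored actually emits may look different.

```ruby
# Hypothetical helper, not part of SimplyStored: builds the JavaScript
# source of a CouchDB map function that indexes documents of one class
# by a single property, the kind of view a find_by_<property> needs.
# Assumes each document stores its class name in a 'ruby_class' field.
def map_function_for(class_name, property)
  <<-JS
function(doc) {
  if (doc['ruby_class'] == '#{class_name}' && doc['#{property}'] != null) {
    emit(doc['#{property}'], null);
  }
}
  JS
end

puts map_function_for('User', :age)
```

Writing one of these by hand for every finder and association is exactly the repetitive work a library can take over.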

SimplyStored offers:

  • Models
  • Associations
  • Callbacks
  • Validations
  • Dynamic finders
  • S3 attachments
  • Paranoid delete

    class User
      include SimplyStored::Couch

      property :login
      property :age
      property :accepted_terms_of_service, :type => :boolean
      property :last_login, :type => Time
    end

    user = User.new(:login => 'Bert', 
                    :age => 12, 
                    :accepted_terms_of_service => true, 
                    :last_login => Time.now)
    user.save

    User.find_by_age(12).login
    # => 'Bert'

    User.all
    # => [user]

    class Post
      include SimplyStored::Couch

      property :title
      property :body

      belongs_to :user
    end

    class User
      has_many :posts
    end

    post = Post.create(:title => 'My first post', 
                       :body => 'SimplyStored is so nice!', 
                       :user => user)

    user.posts
    # => [post]

    Post.find_all_by_title_and_user_id('My first post', user.id).first.body
    # => 'SimplyStored is so nice!'

    post.destroy

    user.posts(:force_reload => true)
    # => []
  

The code is open source and available on GitHub: SimplyStored example code

Another thing I talked about is RockingChair. RockingChair is an in-memory CouchDB implementation that understands all of SimplyStored's functionality. We use it to speed up our tests and be able to run them in parallel.

Amazon EC2 HighMemory Instances

Posted by Jonathan

Very very nice: 34.2 GB RAM and 68.4 GB RAM instances on EC2

# free -m
             total       used       free     shared    buffers     cached
Mem:         70007       2205      67801          0         28        595
-/+ buffers/cache:       1581      68425
Swap:            0          0          0

And

# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
stepping	: 5
cpu MHz		: 2666.760
cache size	: 8192 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca popcnt lahf_lm
bogomips	: 5336.34
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

[..]

We just included support for those machines in Scalarium, our EC2 cluster management platform.

Webistrano/Capistrano problem with git

Posted by Jonathan

Recently I helped a friend debug a problem when deploying with Webistrano/Capistrano.

He was using a git repository and used SSH keys for authentication. Every time he tried to deploy he got this error:

 executing locally: "git ls-remote ssh://repo.example.com/git/myproject.git HEAD"
*** Could not save revision: Unable to resolve revision for 'HEAD' on repository 'ssh://repo.example.com/git/myproject.git'.

When running this command manually as the Webistrano user, everything worked fine.

We checked the usual suspects: the SSH key, the permissions on the SSH dirs/files, user, firewall & co. Everything seemed correct and worked when we ran the command by hand.

After a bit of tinkering I had the Eureka moment: the git command was not in $PATH when running under Passenger!

Git was installed and worked when we logged in as the Webistrano user. But when Passenger runs Webistrano, it doesn't load your shell config files. So if git is not in a standard location like /usr/bin or /bin, Capistrano (which Webistrano calls to do the actual deployment) will not find it.

In our case git was installed in /usr/local/bin and thus not in the default path. We ended up symlinking it into /usr/bin and everything worked like a charm.

I just committed a fix to Capistrano to make debugging such errors easier in the future. Capistrano will now check every local command it executes and see whether the executable is in $PATH. So with the latest version on GitHub the error message would have looked like this:

 executing locally: "git ls-remote ssh://repo.example.com/git/myproject.git HEAD"
*** executable 'git' not present or not in $PATH on the local system!
*** Could not save revision: Unable to resolve revision for 'HEAD' on repository 'ssh://repo.example.com/git/myproject.git'.

So if you are running any shell commands under Passenger remember that it doesn't use a full login-shell.
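If you want to guard your own scripts against the same problem, a check along these lines can help. This is a minimal sketch in plain Ruby, not the code that went into Capistrano; the method name `executable_in_path?` is made up for this example.

```ruby
# Returns true if an executable called +name+ exists in one of the
# directories listed in +path+ (defaults to the current process's $PATH).
def executable_in_path?(name, path = ENV['PATH'])
  path.to_s.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Warn early instead of failing later with a confusing error.
unless executable_in_path?('git')
  warn "*** executable 'git' not present or not in $PATH on the local system!"
end
```

Running such a check from inside the Passenger-hosted application, rather than from your login shell, shows the $PATH the application really sees.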

Scotland on Rails 2009 Slides

Posted by Jonathan

I know, it is a bit late, but here are my slides from Scotland on Rails:


The slides are also available as a PDF download: Advanced Deployment

Scotland on Rails was again a great conference. A very interesting crowd in a very nice city. I'm looking forward to next year!

Webistrano 1.4 released

Posted by Jonathan

I just released Webistrano 1.4. Webistrano is a tool for managing Capistrano deployments and offers a rich web UI. It lets you manage projects with their stages and keep track of who deployed which version to which servers.

Webistrano 1.4 brings many new features that make deployment easier. The most prominent are:

  • Recipe versioning - recipes are now versioned so that you can keep track of changes
  • Project cloning - you can now create a template project and clone it over and over again
  • Array parameters - support for arrays as values for configuration parameters
  • CAS-auth support - Single Sign-On support by delegating authentication to a CAS server. See the documentation
  • Enhanced UI - nicer overviews of deployments and many small fixes
  • Cancel deployments - a running deployment can now be canceled by Webistrano. The running Capistrano instance will be killed so use this feature with care
  • Track deployed revisions - Webistrano will track which revision was deployed. This way you always know which version is running where
  • Updated packages - Rails 2.1 and Capistrano 2.5.0

Apart from that some smaller enhancements and fixes went into the 1.4 release. See the CHANGELOG for a complete list.

Further, there is now a Webistrano mailing list at GoogleGroups.

Go get Webistrano from the project homepage as a download or check out the source:

Download: webistrano-1.4.zip (3.4 MB)

# Development version:

svn co http://labs.peritor.com/svn/webistrano/trunk

# Stable version:

svn co http://labs.peritor.com/svn/webistrano/branches/1.4

RailsConf Europe 2008

Posted by Jonathan

Day two of RailsConf Europe 2008 is over and so are my two sessions.

On tutorial day Mathias and I did a 4h workshop on deploying and monitoring Rails applications. The tutorial went really well, apart from the AirportExpress base station not coping with 100 laptops connecting to it. In the practical part we had a FreeBSD server with 40 virtual machines running and helped the audience deploy an example application with git or svn and Mongrel or mod_rails.

On day two I held my Security on Rails session, in which I went over the various attacks on Rails applications and the countermeasures against them. This session was also well received and I hope I could educate people a bit about WebAppSecurity.

The slides are available as PDF here: Security on Rails (PDF) Deploying and Monitoring Rails (PDF)

Further, you can find both presentations at slideshare.


If you attended one of my sessions, I encourage you to rate them at the conference site.

So far my expectations have been met and I was able to catch up with a lot of people. I'm looking forward to day three of RailsConf Europe!

Gem permissions

Posted by Jonathan

Lately I've seen several people struggling with an error message like this:

`gem_original_require': no such file to load -- sqlite3/database

This is for the SQLite3 gem but I've seen it also with Capistrano.

The problem is that some files have incorrect permissions in the author's repository and those incorrect permissions are carried over by the gem packaging.

The next versions of the SQLite3 gem and Capistrano will solve those problems. The short-term fix is just:

$ sudo chmod -R a+r /usr/lib/ruby/gems/1.8/gems
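If you would rather see which files are affected before changing permissions, a small script can list them. This is a sketch under my own naming (`files_not_world_readable` is not part of RubyGems); it only inspects the mode bits and does not change anything.

```ruby
require 'find'

# Returns a sorted list of regular files below +root+ that lack the
# world-readable permission bit.
def files_not_world_readable(root)
  unreadable = []
  Find.find(root) do |path|
    next unless File.file?(path)
    unreadable << path unless File.world_readable?(path)
  end
  unreadable.sort
end

# The gem directory from the chmod command above; adjust for your system.
gem_dir = '/usr/lib/ruby/gems/1.8/gems'
puts files_not_world_readable(gem_dir) if File.directory?(gem_dir)
```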


why I love the windows experience

Posted by Jonathan

Ok, this is not Microsoft's fault, but it is a perfect example of the Windows experience.

While working on a nasty IE JavaScript bug in my Windows VM I kept getting an annoying popup saying: "SUN Java - Update available".

After clicking it away several times, I thought "heck, let's install it so it keeps quiet" and clicked on the icon. After waiting several minutes for the installer to load data and watching the nice progress bar for a while, I got this wonderful message:

Warning: Operating system not supported! - Thank you for choosing Java (TM) ...

So the VM is running XP SP1 and Sun's update requires SP2. When I clicked away the failed installation, I got a nice error window saying:

Installation failed due to user aborting

Thanks for annoying me, keeping me from work for 15 minutes, and in the end blaming me for not being able to install ...

Nice new edge feature: test/do declaration style testing

Posted by Jonathan

Rails 2 introduced ActiveSupport::TestCase and friends, RoR's enhancement of Test::Unit.

Those extra classes made testing Rails controllers easier and removed the need for cluttered setup methods. Today DHH committed a new feature to ActiveSupport::TestCase (by Jay Fields) that allows Rails tests to match up with RSpec's and Shoulda's nicer declaration style test naming: test/do declaration style testing.

In plain Test::Unit each test would be a method named 'test_' followed by the name of the test:

def test_email_format_is_validated
  ...
end

def test_invalid_credit_card_number_throws_exception
  ...
end

This works ok but is a bit clumsy and gets ugly with long method names. In edge you can now write the tests like this:

test 'email format is validated' do
  ...
end

test 'invalid credit card number throws exception' do
  ...
end

What happens in the background is that ActiveSupport::TestCase will just generate the test_email_format_is_validated method for you. What is missing is a nice integration with the test runner.
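The mechanism is easy to picture. The following is a simplified sketch of such a macro in plain Ruby, not the actual ActiveSupport::TestCase code; the module name `DeclarativeTests` and the class `ExampleTest` are made up for illustration.

```ruby
# A class-level `test` macro: turns a description string into a
# test_* method name and defines that method from the given block.
module DeclarativeTests
  def test(name, &block)
    method_name = "test_#{name.gsub(/\s+/, '_')}".to_sym
    # Refuse duplicate descriptions so one test cannot silently replace another.
    raise "#{method_name} is already defined" if method_defined?(method_name)
    define_method(method_name, &block)
  end
end

class ExampleTest
  extend DeclarativeTests

  test 'email format is validated' do
    :ran
  end
end

# The description became an ordinary instance method.
puts ExampleTest.instance_methods(false).inspect
```

Because the result is an ordinary method with a test_ prefix, an unmodified test runner can pick it up like any hand-written test.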

This brings Rails developers that envy RSpec's and Shoulda's declarative style to the same level. RSpec&co can still do more tricks but most developers I know really just lust for the it 'should do as I want it to' do ... end syntax and don't really care about the a.should == b.

Performance implications of block&capture helpers

Posted by Jonathan

Recently there have again been some articles on block and capture helpers in Rails (e.g. this one).

A block/capture lets you write nicer looking helper functions in Rails, especially if you want to render some HTML before and after a given piece of code. Typical use cases are styled blocks (HTML) that should surround your items.

Assume we have projects that should be displayed on an overview page. You want to render a nice box with the project title and the project description for each project. Often you end up with partials like this:

<div class="box">
  <span class="box_title"><%=h project.title %></span>
  <div class="box_body">
    <%=h project.description %>
  </div>
</div>

When you want to reuse this partial for models other than a project and, further, sometimes render different HTML in the box_body, you create a helper function. The two most common approaches are a start/end combination of helpers and a block/capture helper.

The idea of the start/end helper combination is that you call one helper to create/print all the HTML before your item rendering, render your item, and then call another helper to close all the tags and finish the box:

# application_helper.rb
def start_box(title)
  out = "<div class='box'>"
  out += "<span class='box_title'>#{h(title)}</span>"
  out += "<div class='box_body'>"
  out
end

def end_box
  "</div></div>"
end

# view

<%= start_box(project.title) %>
  <%=h project.description %>
  <!-- add custom HTML here -->
<%= end_box %>

This works fine but is a bit inelegant. Further, you can forget the end_box call and nothing would complain. Your HTML would just be broken.

A much nicer looking solution is the block/capture helper:

# application_helper.rb
def box(title, &block)
  out = "<div class='box'>"
  out += "<span class='box_title'>#{h(title)}</span>"
  out += "<div class='box_body'>"
  out << capture(&block) if block_given?
  out += "</div></div>"
  block ? concat(out, block.binding) : out
end

# view
<% box(project.title) do %>
  <%=h project.description %>
  <!-- more custom HTML here -->
<% end %>

So we are using a block here to pass our box content. The block is created with the do/end style and then passed as an argument to the helper function. The helper function creates some output, evaluates the block, and then continues to print some HTML strings. By using the block syntax we can never forget to "close" a box, as the Ruby interpreter would complain about the missing end keyword. Further, in the helper method we can choose to not render anything. This technique is especially useful for administrative links.

I'm a big fan of the block/capture helper and favor its syntax any time above the start/end way. I just wanted to post here about a disadvantage of the block/capture way that you should be aware of.

During a recent performance analysis for a client I profiled their root page, which is an overview page with many HTML-boxed elements. I noticed that it rendered really slowly, especially if you had many items on the overview page. After a bit of digging into the rendering, we found out that their block/capture helpers were eating 70-80% of rendering time. The problem is that creating a block (aka a closure), storing its binding (scope and surrounding variables) and then passing around this closure is expensive compared with the "pure" string output helper.

How much more expensive can be shown by this benchmark. I created a new test Rails project with two actions. Each action displays a project; one uses the start/end helper and the other one the block/capture helper. On the X-axis you see the number of helper calls inside the view and on the Y-axis you see the number of requests per second (as reported by ruby script/performance/request -n 1000 -b).

So the block/capture helper style is a lot slower than the simple start/end helper. But it only matters if you use it a lot on a page. With 250 calls on a page, the block/capture style helper has only 20% of the requests per second that the start/end style helper can deliver. 250 calls may seem like a lot but in my case the page displayed 50 boxed items and the page had several other boxed content (e.g. login, stats, ads). If you then add a lot of variables in the closures scope (that need to be included in the binding), the page rendering can get really slow.
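If you want to get a rough feeling for the difference on your own machine, the two helper styles can be compared outside of Rails. The sketch below is plain Ruby under several assumptions: `h` is a simple stand-in for Rails' escaping helper, `seconds_for` is a made-up timing method, and there is no `capture`/`concat` or template binding involved, so it only isolates the cost of creating and calling a block. The numbers will not match a real Rails page.

```ruby
# Stand-in for Rails' h helper (assumption: simple HTML escaping is enough here).
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&#39;' }

def h(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# start/end style: two helpers that just return strings.
def start_box(title)
  "<div class='box'><span class='box_title'>#{h(title)}</span><div class='box_body'>"
end

def end_box
  "</div></div>"
end

# Block style: the body arrives as a Proc that has to be created and called.
def box(title, &block)
  out = start_box(title)
  out << block.call if block
  out << end_box
end

# Runs the given block +iterations+ times and returns the elapsed seconds.
def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

body = 'A project description'
calls = 20_000
plain = seconds_for(calls) { start_box('Project') + body + end_box }
blocky = seconds_for(calls) { box('Project') { body } }
puts format('start/end: %.4fs, block: %.4fs', plain, blocky)
```

Both styles produce the same HTML, so any gap you see comes from the call style alone. For real decisions, profile the actual page.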

I'm not arguing in general against the block/capture, I really like the resulting syntax and flexibility. But as often with syntactic sugar and nicer-looking code, you trade it for performance. Most of the time this should not matter, but when it does, it shows!

DISCLAIMER: Those numbers are not statistically valid and you should not try to read too much into them. Use them as a hint where to look for slow rendering.

Webistrano 1.3 released

Posted by Jonathan

I'm proud to announce Webistrano 1.3!

Webistrano is a Web UI for managing Capistrano deployments. It lets you manage projects and their stages like test, production, and staging with different settings. Those stages can then be deployed with Capistrano through Webistrano.

Version 1.3 adds several shiny new features to Webistrano that make deployment easier:

  • Better Git support through Capistrano 2.2
  • Support for Phusion Passenger / mod_rails
  • Ability to temporarily disable hosts for a deployment
  • A command-line interface with script/deploy
  • A simple permission system

The complete changelog is available through the Webistrano project site.

One frequently requested feature is the ability to temporarily disable a host for a deployment. This is helpful when you want to deploy a stage without changing the stage configuration even if one or more hosts are down.

Another scenario is when you want to execute a task only on a limited set of servers.

The script/deploy command is a nice little gem, especially useful if you want to script Webistrano:

$ ruby script/deploy 
Usage: deploy [options] project stage
    -h, --help                       This message
    -e, --environment=ENV            RAILS_ENV for Webistrano (default: production)
    -u, --username=NAME              Webistrano username to use (default: admin)
    -t, --task=NAME                  Capistrano task to invoke (default: deploy)
    -d, --description=TEXT           Deployment comment for Webistrano records

Further, Webistrano now offers built-in tasks for managing mod_rails deployments. It will override the default deploy tasks and ask for the necessary configuration entries so that using mod_rails becomes even easier.

Upgrading from previous releases is very easy, see the Upgrading wiki page.

Webistrano 1.3 can be downloaded here. Webistrano is BSD-licensed and the project site is open for everybody. Please see the project page for more documentation and screenshots. There are even some screencasts.

back home again

Posted by Jonathan

After a week on the road, I'm finally back home again. Last week I presented at RubyFools Copenhagen and Scotland on Rails and therefore traveled a lot.

This was my first time in Copenhagen and it seemed like a very nice city. The conference venue was very nice, a brand new university building with a big hall. My talk about Rails on AWS was well received. I added some information about the new EC2 features like elastic IPs and availability zones. A video of the presentation should hopefully be available soon. After the sessions I spent some time chatting with Matz about Ruby/JRuby/Rubinius and his work in Japan.

Next I traveled to Edinburgh for Scotland on Rails. Edinburgh feels like my second home town as I studied and worked there for a while. The conference was very well organized and had a different Ruby crew there. Most of the RubyFools Copenhagen folks went straight to RubyFools Oslo, so there were many new faces in Edinburgh.

I gave a presentation about Rails Patterns: typical problems of real-life Rails production sites and solutions/patterns. Afterwards I had a couple of nice conversations with other developers about their experiences with similar situations.

Apart from the great conference, I had a chance to spend some time in Edinburgh and catch up with some people there.

My slides can be found here:

RubyFools Copenhagen: Rails on AWS (PDF)

Scotland on Rails: Rails Patterns (PDF)

Asset Packer reminder

Posted by Jonathan

If you ever got an error like this while testing your code in production

ActionView::TemplateError (private method `chomp' called for nil:NilClass) in
...
<%= stylesheet_link_merged :styles %>
<%= javascript_include_merged :libs %>

then you forgot to create your packed JavaScript and CSS assets:

rake asset:packager:build_all

Remote cache pitfalls

Posted by Jonathan

Just a small note for people using Capistrano/Webistrano and the remote_cache deployment strategy.

set :deploy_via, :remote_cache

The remote_cache strategy creates a cached-copy directory in your #{deploy_to}/shared directory. In contrast to the default deployment strategy, it checks out the code only once. After the initial checkout, subsequent deployments do an `svn up` and copy the result over to #{deploy_to}/releases/.

With remote_cache your deployments are usually a bit faster, but there is a catch. If you ever change the repository variable, e.g. because you switch to another tag or move the stable branch, your deployments will either fail or not update completely.

This is because the new deployment runs

$ svn up -rYOUR_REV http://svn.example.com/svn/branches/my_new_branch

on the cached copy. With a different branch or tag than the one the `svn checkout` command was executed with, this will not work. In order to fix it, just delete the cached-copy directory. It will be re-created on the next deployment.

$ rm -rf /path/to/deploy/shared/cached-copy

Web 2.0 Expo Berlin

Posted by Jonathan

I'm just back from today's Web 2.0 Expo sessions and I'm not sure I will attend tomorrow. Many have written about this before, but the creative, social atmosphere is missing due to the labyrinthine conference halls. Boy, I'm happy I haven't spent > 1.000 Euros on this. No real food, a lot of product presentations, not enough room for socializing and too many suits for my taste.

Still, I had some nice conversations and met some interesting people.

I again did a session on scaling with Amazon EC2 and S3; the slides can be found here.

This time I also talked a bit about how we use S3 and EC2 at Peritor to drive our webmail portal product, PeritorMail.

Also nice: the AWS announcement of S3 being available in EU data centers. Now I'm only waiting for EC2 in the EU...