Scaling Rails with Apache 2.2, mod_proxy_balancer and Mongrel

Posted by Jonathan

Until this week we used Lighttpd and FastCGI for MeinProf.de. The setup was nearly the same as the one described in the must-read series Scaling Rails (1, 2, 3, 4) on poocs.net.

We used this setup from day one but always had small issues with Lighttpd: it crashed every couple of days. Nothing dramatic; we had a script that monitored Lighttpd and restarted it when necessary. During the last few weeks, however, Lighttpd started to crash once a day and lately even once an hour. This was unacceptable, and as we knew we were about to get some serious press coverage in Germany, we looked for alternatives.

43people and Basecamp use Apache 1.3 and FastCGI, so this seemed like a good alternative: just switch the web server and we would be done. Unfortunately Apache 1.3 cannot load-balance FastCGI requests, and there is very little documentation on Apache 1.3 with remote FastCGI processes. Apache 2.0 is no better and has problems with mod_fastcgi. We needed remote FastCGI listeners because our hardware is quite old: we have many slow machines rather than a few fast ones that could handle the load with local FastCGI.

Enter Mongrel.

Mongrel is a fast HTTP library and server for Ruby that is intended for hosting Ruby web applications of any kind using plain HTTP rather than FastCGI or SCGI. It is framework agnostic and already supports Ruby On Rails, Og+Nitro, and Camping frameworks.

With Mongrel your application server becomes a web server that speaks HTTP, so you “only” need to load-balance and proxy normal HTTP requests to it. Mongrel was stable during our tests, so we looked for an HTTP proxy solution. Apache has always had mod_proxy and could therefore proxy HTTP requests, but we needed to load-balance them. There are extra packages for this kind of thing, such as Balance, but we wanted something more integrated and didn’t want to introduce more components.

Enter Apache 2.2 and mod_proxy_balancer.

Apache 2.2 introduced a new proxy module, mod_proxy_balancer. This module does exactly that: it balances proxied requests. You can define a cluster of backend servers and use this cluster in your mod_proxy statement instead of a single backend server.

With this setup we use Apache 2.2 to handle all incoming requests. Apache 2.2 uses mod_proxy to forward the incoming HTTP requests to the mod_proxy_balancer cluster. The cluster consists of several Mongrel processes on each application server (each of which is now also an internal web server), and the balancer distributes the requests among them.

mod_proxy_balancer is more configurable than Lighttpd’s mod_fastcgi. For example, you can specify load factors or routes for each cluster member. See the documentation for details.
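
As a rough illustration of these options, here is a minimal sketch based on the mod_proxy documentation rather than on our production configuration. The cluster name examplecluster, the route names and the choice of lbmethod are assumptions made only for this example.

```apache
# Hypothetical cluster, for illustration only.
<Proxy balancer://examplecluster>
  # loadfactor is a relative weight: this member should receive roughly
  # twice the share of a member with loadfactor=1.
  BalancerMember http://192.168.0.11:3000 loadfactor=2 route=fast
  BalancerMember http://192.168.0.12:3000 loadfactor=1 route=slow

  # route only matters in combination with the stickysession parameter;
  # without sticky sessions it has no effect on balancing.

  # Balance by bytes transferred instead of by request count (the default).
  ProxySet lbmethod=bytraffic
</Proxy>
```

With the default lbmethod=byrequests the weights apply to the number of requests; with bytraffic they apply to the amount of data transferred.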

Our httpd.conf looks like this:

First you define the cluster and tell it which members it is composed of.

<Proxy balancer://meinprofcluster>
  # application server 1 (two Mongrel processes)
  BalancerMember http://192.168.0.1:3000
  BalancerMember http://192.168.0.1:3001

  # application server 2, the fastest machine so double the load
  BalancerMember http://192.168.0.11:3000 loadfactor=2
  BalancerMember http://192.168.0.11:3001 loadfactor=2

  # application server 3
  BalancerMember http://192.168.0.12:3000
  BalancerMember http://192.168.0.12:3001

  # application server 4
  BalancerMember http://192.168.0.13:3000
  BalancerMember http://192.168.0.13:3001
</Proxy>

Then you proxy the location or virtual host to the cluster:

<VirtualHost *:80>
  ServerAdmin info@meinprof.de
  ServerName www.meinprof.de
  ServerAlias meinprof.de
  ProxyPass / balancer://meinprofcluster/
  ProxyPassReverse / balancer://meinprofcluster/
  ErrorLog /var/log/www/www.meinprof.de/apache_error_log
  CustomLog /var/log/www/www.meinprof.de/apache_access_log combined
</VirtualHost>

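The slash at the end of the balancer URL in the ProxyPass directive is very important. The rest of the request path is appended to the target as-is, so because the proxied path / ends in a slash, the target has to end in one too; without it a request for /foo would be mapped to balancer://meinprofclusterfoo.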

Mongrel itself is started on the cluster nodes like this:

# mongrel_rails start -d -e production -p 3000
# mongrel_rails start -d -e production -p 3001

Another nice feature of mod_proxy_balancer is the balancer-manager. It is a web interface to the configuration of the mod_proxy_balancer cluster through which you can query or edit your cluster nodes without the need to restart Apache.

In order to use balancer-manager include this in your configuration:

<Location /balancer-manager>
  SetHandler balancer-manager
</Location>

Of course you should protect this location with Apache’s Require valid-user or Allow from directives.
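
A minimal sketch of such a restriction, using Apache 2.2's Order/Deny/Allow access control. It assumes that administrators connect from the internal 192.168.0.x network used for the cluster above; adjust the address to your own setup.

```apache
<Location /balancer-manager>
  SetHandler balancer-manager

  # Refuse everyone by default, then allow only the internal network.
  # 192.168.0 is a partial address and matches 192.168.0.*
  # (an assumption for this example).
  Order Deny,Allow
  Deny from all
  Allow from 192.168.0
</Location>
```

Using Require valid-user instead (or in addition) would also need the usual authentication directives such as AuthType, AuthName and AuthUserFile, which are left out here.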

So far this solution has proven much more stable (at least on FreeBSD) and was able to handle our peak traffic of 350,000 page requests per day. In practice we use up to 8 Mongrel processes on each cluster node, and it seems that Apache is now the bottleneck rather than our application servers as before. The next step for us is to introduce another web server that handles incoming HTTP requests and has its own Mongrel cluster.

Comments

  1. Roger Wilco, April 22, 2006 @ 12:47 AM
    Just out of curiosity: Did you also evaluate lighttpd's mod_proxy before switching to Apache+mod_proxy_balancer? If yes, how did it compete?
  2. Jonathan, April 22, 2006 @ 01:10 AM
    No, because we had the stability problems with Lighttpd. We wouldn't have switched to Apache 2.2 if Lighttpd had been stable on FreeBSD. But Lighttpd was certainly faster on the HTTP requests.
  3. Dan Kubb, April 22, 2006 @ 01:13 AM
    Are you using the worker MPM? Did you test the new Event MPM out? For static content I'd highly encourage using mod_expires and mod_headers to set the Expires and Cache-Control headers (respectively). You could even use mod_cache so that the front-end server doesn't need to even talk to Mongrel to handle requests for images, style sheets and javascript files. Also I was curious if your Rails apps are sending the Expires and Cache-Control headers in the responses? Most modern browsers and caches (like Apache's mod_cache) can use these headers to cache the responses for the time period you set in the headers. This saves Mongrel the expense of having to generate the same response more than once in a given time period. This could work really well for anything that isn't customized for each user. Despite all that caching, some requests will eventually make it to Mongrel, and supporting Conditional GET can make a big difference in the perceived and actual speed of your applications.
  4. Jonathan, April 22, 2006 @ 01:17 AM
    Just to add to the stability issue. I also have FreeBSD machines which run Lighttpd with smaller load without any problems.
  5. Zed A. Shaw, April 22, 2006 @ 01:53 AM
    Roger Wilco, turns out lighttpd's mod_proxy support is total crap. It has frequent errors, doesn't compensate for backends going down, doesn't balance properly in certain modes, and is just generally bit rotted. Jan is supposedly fixing it, but given Jonathan's stated crashes I wouldn't hold my breath. Yes, I'm a bit fed up with lighttpd right now.
  6. Jonathan, April 22, 2006 @ 04:00 PM
    We haven't deployed mod_cache or mod_expires yet but we are playing with them. I will update the article after further testing.
  7. Justin, April 22, 2006 @ 06:12 PM
    I believe Zed mentions on the Mongrel site that there are performance issues when running Mongrel on Mac OS X and FreeBSD. Given that you're running on FreeBSD, have you experienced any of the (relative) slower performance running Mongrel on FreeBSD?
  8. James D, April 23, 2006 @ 04:40 AM
    Out of interest, how do you start your mongrels on boot? Cheers
  9. Jonathan, April 23, 2006 @ 02:50 PM
    @James: We boot Mongrel on the BSDs through /etc/rc.local. On Linux we just start them by hand. @Justin: We didn't compare Mongrel on Linux vs. FreeBSD. But we tried FreeBSD with the sendfile gem and couldn't measure a real performance improvement.
  10. Pavel, April 24, 2006 @ 03:52 PM
    Did you try pound [http://www.apsis.ch/pound/]? On my site it handles over 80M req/day easily (3x P4 (pound) -> 2x dual AMD 270 with Apache 2.0 + mod_fastcgi)
  11. Corey, April 26, 2006 @ 06:08 PM
    Pavel -- you should send the Pound people a note. They claim: "The largest volume reported to date is a site with an average of about 5M requests per day, peaking at over 400 requests/sec." Sounds like you've bested that by a bit.
  12. Jeremy, April 27, 2006 @ 06:56 PM
    Thanks for the great article! How long are your mongrel processes living? Do you have anything in place to monitor them?
  13. Jonathan, April 27, 2006 @ 10:53 PM
    At the moment we have no problems with the mongrel processes. This setup is a lot more stable than the Lighttpd/FastCGI combination. As we have only a few application servers we monitor them by hand.
  14. Jean-Philippe, April 28, 2006 @ 05:11 PM
    Nice figures. What tool did you use to produce them?
  15. Jonathan, April 28, 2006 @ 05:22 PM
    I used OmniGraffle for the Mac for the figures.
  16. Delhi Designing India Site Web, April 29, 2006 @ 11:56 PM
    Thanks for the write-up :)
  17. Will Green, May 03, 2006 @ 10:31 PM
    Looks cool enough, but how do I get Apache to serve the static stuff (images/css/js) but proxy the rest, without using a VirtualHost? Can't wrap it all in a Location block, because ProxyPass directives in a Location block ignore the first argument to ProxyPass (the url), and use the url in the Location block instead. So, I think I'd need to first use mod_alias to pull the images, stylesheets, and javascripts directories into Apache's namespace, then prevent those from being proxied, then proxy everything else. How does the performance of static content served by Apache compare to that served by Mongrel?
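
On the static-content question raised in the last comment, one possible approach outside a VirtualHost is sketched below. It is untested against the setup described in this article. The path /var/www/example/public is a hypothetical location for the Rails application's public directory, and the one-week expiry is an arbitrary example value.

```apache
# Map the static directories into Apache's URL space (mod_alias).
Alias /images      /var/www/example/public/images
Alias /stylesheets /var/www/example/public/stylesheets
Alias /javascripts /var/www/example/public/javascripts

<Directory /var/www/example/public>
  Order Allow,Deny
  Allow from all

  # Optional: let browsers and caches keep static files for a while.
  <IfModule mod_expires.c>
    ExpiresActive On
    ExpiresDefault "access plus 1 week"
  </IfModule>
</Directory>

# "!" excludes a path from proxying. The exclusions have to come before
# the catch-all rule, because the first matching ProxyPass wins.
ProxyPass /images !
ProxyPass /stylesheets !
ProxyPass /javascripts !

# Everything else goes to the Mongrel cluster.
ProxyPass / balancer://meinprofcluster/
ProxyPassReverse / balancer://meinprofcluster/
```

The ProxyPass directives are used at server level here instead of inside a Location block, which avoids the problem described in the comment where the first argument to ProxyPass is ignored.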