Shinji Kuwayama

26 Jun, 2009

Rolling mongrel restarts

Posted by: Shinji Kuwayama In: Rails | Tech Tips

A lot of us are using haproxy to manage mongrel clusters; it’s sweet, because haproxy is smart enough to “skip” unresponsive mongrels and make sure users get served active ones.

One problem you may have seen comes up during a deploy: when all the mongrels are restarted at once, haproxy has no responsive mongrel left to hand the latest request to. In that situation, your users get the dreaded 503 error.

Here’s a cool workaround: have Capistrano restart one mongrel at a time, so that haproxy always has something to hang its hat on:

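namespace :mongrel do
  desc <<-DESC
  Rolling restart: 1 mongrel at a time.
  DESC
  task :rolling_restart do
    original_hosts = ENV['HOSTS']
    begin
      # One pass per mongrel port; adjust the range to match your cluster.
      (6000..6004).each do |port|
        # Clear HOSTS so find_servers sees every app server again.
        ENV['HOSTS'] = ""
        find_servers(:roles => :app).each do |server|
          # Restrict the next command to this one server.
          ENV['HOSTS'] = "#{server.host}:#{server.port}"
          puts "Restarting #{port} on #{server.host}:#{server.port}..."
          # "; true" keeps the task going even if monit reports a failure.
          sudo "/usr/bin/monit restart mongrel_thepoint_#{port} ; true"
          puts "Sleeping 30 seconds before the next mongrel."
          sleep 30
        end
      end
    ensure
      # Put HOSTS back so later tasks are not pinned to the last server.
      ENV['HOSTS'] = original_hosts
    end
  end
end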

namespace :deploy do
  desc <<-DESC
    Deploy, with a rolling restart
  DESC
  task :rolling do
    update
    mongrel.rolling_restart
  end
end

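There are three things you'd want to adjust here for your particular environment -- the port range of the mongrels in your cluster, the number of seconds to wait between each restart request, and the name of the monit service. (In this example: ports 6000 through 6004, i.e. 5 mongrels per server; 30 seconds; and mongrel_thepoint_ followed by the port.)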
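To get a feel for how those numbers interact, here is a minimal sketch in plain Ruby (no Capistrano needed). The helper name rolling_restart_plan and the host names app1 and app2 are made up for illustration; the only thing taken from the task above is the ordering, which walks port by port and, within each port, server by server, sleeping after every restart.

```ruby
# Hypothetical helper, not part of Capistrano: lists the restarts in the
# same order as the rolling_restart task (ports outermost, servers inside)
# and adds up the time spent sleeping.
def rolling_restart_plan(hosts, ports, delay)
  steps = ports.flat_map { |port| hosts.map { |host| [host, port] } }
  { :steps => steps, :total_sleep => steps.size * delay }
end

# Assumed example cluster: two app servers, five mongrels each, 30s delay.
plan = rolling_restart_plan(%w[app1 app2], 6000..6004, 30)
plan[:steps].each { |host, port| puts "restart mongrel #{port} on #{host}" }
puts "total sleep: #{plan[:total_sleep]} seconds"
```

With two servers this comes to 10 restarts and 300 seconds of sleeping, so the delay you pick is multiplied by every mongrel in the whole cluster, not just by the mongrels on one box.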

There's a catch here -- this won't work if you have migrations to run, because you probably don't want stale mongrels hitting your newly-migrated database.

For now, we're just using this when there are no migrations, and using a conventional deploy:long when there are.
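That choice could be made automatically. The following is only a sketch, under the assumption that you can get the list of migration file names from the previous release and from the new one (for example by listing db/migrate in each); the helper names pending_migrations and deploy_task_for are hypothetical, and the file names in the usage lines are invented.

```ruby
# Hypothetical helpers, not Capistrano API. Both take plain arrays of
# migration file names, one for the previous release and one for the new one.
def pending_migrations(previous_files, current_files)
  (current_files - previous_files).sort
end

# Pick the rolling deploy only when the new release adds no migrations.
def deploy_task_for(previous_files, current_files)
  if pending_migrations(previous_files, current_files).empty?
    "deploy:rolling"
  else
    "deploy:long"
  end
end

# Invented file names, for illustration only.
old_release = ["001_create_users.rb", "002_add_email.rb"]
new_release = old_release + ["003_add_avatar.rb"]
puts deploy_task_for(old_release, old_release)
puts deploy_task_for(old_release, new_release)
```

This only automates the decision we already make by hand; it does nothing about the underlying problem of stale mongrels running against a migrated database.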

Can anyone suggest a way to adapt this approach in a way that could accommodate migrations?

Credit is due to Engine Yard, of course, for pointing us in the right direction. Thanks krutten!

1 Response to "Rolling mongrel restarts"

1 | Dustin Anderson

October 20th, 2009 at 9:11 pm

This isn’t a fully baked solution, but I’m thinking that if you have a master-master database setup, you could start by taking one of the databases offline, run migrations on it, and then start taking mongrels offline and restarting them in a new “pool”. You’d have to programmatically make it so the “new pool” of mongrels (already restarted) were waiting to serve requests, and were already hooked up to the migrated database.

You’d have to wait until requests dropped and then at some point you’d hold requests (it would have to be short enough so they wouldn’t time out) and then as soon as all the old requests were processed on the old pool, you could start handling the requests from the new pool.

You could then merge the changes that the newly migrated database hasn’t seen from the old database. I dunno… seems sketchy. You’d have to make sure all those requests don’t conflict with anything. I think you’re screwed dude.

About

Shinji Kuwayama is a Rails developer in Chicago, Illinois.
