A lot of us are using haproxy to manage mongrel clusters; it’s sweet, because haproxy is smart enough to “skip” unresponsive mongrels and make sure users get served active ones.
One problem you may have seen crops up during a deploy: when all the mongrels are restarted at once, haproxy has no healthy backend left to send the latest request to. In that situation, users get the dreaded 503 error.
Here’s a cool workaround. Have Capistrano restart one mongrel at a time, so that haproxy always has something to hang its hat on:
namespace :mongrel do
  desc <<-DESC
    Rolling restart: 1 mongrel at a time.
  DESC
  task :rolling_restart do
    # One pass per mongrel port; adjust the range to match your cluster.
    for i in 6000..6004 do
      # Clear HOSTS so find_servers is not limited to the previous server.
      ENV['HOSTS'] = ""
      find_servers(:roles => :app).each do |server|
        # Limit the sudo call below to this one server.
        ENV['HOSTS'] = "#{server.host}:#{server.port}"
        puts "Restarting #{i} on #{server.host}:#{server.port}..."
        # "; true" keeps a failed monit call from aborting the task.
        sudo "/usr/bin/monit restart mongrel_thepoint_#{i} ; true"
        puts "Sleeping 30 seconds before the next mongrel."
        sleep 30
      end
    end
  end
end
namespace :deploy do
  desc <<-DESC
    Deploy, with a rolling restart
  DESC
  task :rolling do
    update
    mongrel.rolling_restart
  end
end
There are two things you'd want to adjust here for your particular environment -- the number of mongrels in your cluster, and the number of seconds to wait between each restart request. (In this example, 5 and 30, respectively.)
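To make those two knobs easier to see, here is a minimal plain-Ruby sketch of the same loop with the port range and the pause pulled out as parameters. The helper names (rolling_restart_steps, run_rolling_restart), the app: and sleeper: arguments, and the example hostnames are made up for illustration and are not part of Capistrano. Unlike the recipe above, this sketch skips the pause after the final restart.

```ruby
# Hypothetical helper (not a Capistrano API): builds the ordered list of
# restart steps, one per (port, host) pair, in the same order as the recipe.
def rolling_restart_steps(hosts, ports, app: "thepoint")
  ports.flat_map do |port|
    hosts.map do |host|
      { host: host,
        command: "/usr/bin/monit restart mongrel_#{app}_#{port} ; true" }
    end
  end
end

# Hypothetical runner: hands each step to the caller's block, then pauses
# before the next one. The sleeper is injectable so the pause can be stubbed.
def run_rolling_restart(steps, pause: 30, sleeper: method(:sleep))
  steps.each_with_index do |step, index|
    yield step[:host], step[:command]
    # No need to wait once the last mongrel has been restarted.
    sleeper.call(pause) if index < steps.size - 1
  end
  steps.size
end
```

With two app servers and ports 6000..6004, rolling_restart_steps(%w[app1.example.com app2.example.com], 6000..6004) returns ten steps. Inside a real recipe, the block passed to run_rolling_restart would be where the sudo call goes.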
There's a catch here -- this won't work if you have migrations to run, because you probably don't want stale mongrels hitting your newly-migrated database.
For now, we're just using this when there are no migrations, and using a conventional deploy:long when there are.
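One way to take the guesswork out of that choice is to check for new migration files before picking a task. The sketch below is only an illustration under some big assumptions: both release trees are readable from wherever the script runs, a new file under db/migrate is a good enough signal that a migration is pending, and the helper names (pending_migrations, deploy_task_for) are invented here rather than taken from Capistrano or Rails.

```ruby
# Hypothetical helper: names of migration files that exist in the new
# release but not in the current one. Assumes the standard Rails layout
# (db/migrate/*.rb) and that both directories are locally readable.
def pending_migrations(current_release, new_release)
  list = lambda do |root|
    Dir.glob(File.join(root, "db", "migrate", "*.rb")).map { |path| File.basename(path) }
  end
  (list.call(new_release) - list.call(current_release)).sort
end

# Hypothetical helper: picks the task name to run. "deploy:rolling" and
# "deploy:long" are the two tasks mentioned in this post.
def deploy_task_for(current_release, new_release)
  pending_migrations(current_release, new_release).empty? ? "deploy:rolling" : "deploy:long"
end
```

This only automates the decision we already make by hand; it doesn't solve the stale-mongrel problem itself.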
Can anyone suggest a way to adapt this approach to accommodate migrations?
Credit is due to Engine Yard, of course, for pointing us in the right direction. Thanks krutten!