Daily Archives: July 18, 2012

cron DevOps Linux

What I Wish Some Had Told Me About Writing Cron Jobs

Much like Doc Brown and Marty McFly, cron and I go way back. It is without doubt one of thing single most valuable tools you can use in linux system management. Though what I’ve learned over the years is that it can be hard to write jobs that reliably produce the results I want. I wish when I started writing these jobs executed by cron in the 1990s someone had told me a few of these things.
 

Don’t Allow Two Copies of Your Job to Run at Once

 
A common problem with cron jobs is that the cron daemon will launch new jobs while the old job is running. Sometimes this doesn’t cause a problem, but generally you expect only one job to launch at a time. If you’re using cron to control jobs that launch every 5 or 10 minutes, but only want one to run at a time its useful to implement some type of locking. A simple a method is to use something like this:

You can get more complicate using flock, or other atomic locking mechanisms. For most purposes this is good enough.
 

Sleep for a Bit

 
Ever have a cron job overload a whole server tier because logs rotate at 4am? Or, got a complaint from someone that you were overloading there application by having 200 server contact them at once? A quick fix is to have the job sleep for a random time after being launched. For example:

This does a good job of spreading the load out for expensive jobs, and avoid thundering herd problems. I generally pick an interval long enough so that my servers will be distributed throughout the period, but still meets my goal. For example, I might spread an expensive once a day job over an hour, but a job that runs ever 5 minutes may only be spread over 90 seconds. Lastly, this should only be used for things that you can except a loose time window around.
 

Log It


I’ll be the first to admit I do this all the time. I hate getting emails from cron, but in general you should avoid doing this. When everything is working this isn’t a big deal, but when something goes wrong, you’ve thrown away all the data that told you what happened. So, redirect to a log file, and overwrite or rotate that file.

Hopefully these tips help you out, and solve some of your cron pains.