[I originally posted this over at the Pythian blog. If you're not following it, you should! Way more content, by far smarter people than lil ol' me.]
It's holiday season in many parts of the world, but it's not all parties and egg-nog. Caretakers of critical IT systems often have significant work to do as those systems roll over into a new year. In some cases, we're just monitoring to make sure nothing unusual happens. In other cases, however, we're making the unusual happen: running those batch jobs and backups that only fire once per year.
So what happens when you update the crontab on a critical system to accommodate year-end processing? What happens when, despite all your diligence and devotion to human reliability guidelines, you perform a simple slip, and instead of typing
crontab -e, you type
crontab -r? Well, the documentation tells you what happens:
-r Remove the current crontab. -e Edit the current crontab
(Incidentally, if there isn't a rule of software design that requires you to LOOK DOWN AT YOUR KEYBOARD WHEN CHOOSING FLAGS FOR YOUR UTILITIES, there damn well should be. Seriously, who thought that combinations of options was a good idea???)
What happens next is up to you. You can:
- Pray that you typed crontab -l first, and can recover the listing from your terminal logs
- Search feverishly through email, old trouble tickets, and other internal documentation to reconstruct the contents of the crontab
- Reach out to your sysadmin (assuming that's not you, and that the sysadmin isn't already on holiday) for help in restoring the contents from backup, and won't that be a fun conversation?
- Restore from a backup that you've been making yourself, all along.
I'm sure that most people would pick option #4, and it's very easy to implement. You can give yourself the gift of "crontab insurance" by adding a simple line to the beginning of each of your crontabs:
01 00 * * * /usr/bin/crontab -l > ~/.crons/cronbak.`date +\%u` 2>&1
You'll also need to create the .crons directory first, and double-check the path to the crontab executable on your system. But the result should be a one-week rotation of crontab backups, named cronbak.[1-7]. If you can't be certain that you'll notice something's wrong with your crontab in 1 week, you can easily get additional "insurance" by changing %u to %d (day of month) or even %j (day of year).
While this may seem like an inconsequential thing, it's not. This little backup technique has saved me and other members of my team significant amounts of work and precious client time. I have personally had to execute 3 of the four solutions listed above (thankfully, not #3...yet ), and having this backup on hand is by far the most preferable, stress-free option. No matter how good we are at what we do, it's important to remember that even maximum human reliability is not infallibility. Take the time to do this simple thing; someday you'll be glad you did.