Month: January 2012

Dealing with Outages

No matter what service you’re building, at some point you can expect to have an outage. Even if your software is designed and scaled perfectly, one of your service providers may let you down, leading to a flurry of calls from customers. Plus, the internet has many natural enemies (rodents, backhoes, and bullets),…

January 30, 2012
Operations On-Call Go-Bag

A go-bag (or bug-out bag) is commonly discussed in terms of emergency preparedness; it’s a bag containing the things you would need to survive for 72 hours. For survival, you would be looking at things like food, water, a medical kit, and some basic tools. The idea being that you may have to leave…

January 26, 2012
Trouble Tickets: Annoying, but Useful

If you work in operations, you probably have used a ticketing system or two. They are common across the industry, and every organization has its own particular workflow. In my younger days I loathed them, since they seemed to be an impediment to me doing my job. Today, I’d describe myself as a reluctant fan.…

January 24, 2012
Two Quick Chef Gotchas

Configuration management is a hot topic these days. Chef is one of the more popular choices, and does a fairly good job helping you maintain consistent configuration across your environment. That said, it isn’t foolproof. I’ve outlined two common scenarios in which you might introduce a configuration issue. Removing a File, Package, User, or…

January 23, 2012
Physical Infrastructure for the Win

Tired of cloud infrastructure performance? Wishing you could get a couple of SSDs to solve your IOPS issues in EC2? Trying to reduce your operating expenses, and OK with the capital expense? There are plenty of reasons to consider physical or managed hardware, but managing it presents its own challenges. In order to…

January 21, 2012