Outage

  • January 8, 2009
  • Get a horse! That’s what snide carriage riders would say when passing one of those newfangled automobiles broken down by the side of the road. That was many years ago and, despite the catcalls, it was the buggy whip that was retired as the car became king.

    The history of technology is littered with little failures that happened as each technology matured. Edison and Westinghouse had a nasty battle over DC vs. AC power, and railroad tycoons refused to standardize on track gauge while overbuilding capacity. Where such problems did not exist, there were ruthless moguls and early monopolies like Standard Oil.

    Whenever a new technology infrastructure is introduced, opportunities for failure are widespread. There are so many moving parts that it’s hard to get everything right, and until the little things are perfected you get glitches. My favorite story in this line is the evolution of Standard Time. We take time zones for granted, but it took the railroads to standardize the way we measure time.

    Until railroads became prominent, every town set its clocks by the sun, which made it impossible to publish a schedule. A traveler on a train moving east or west would experience the same time distortion a coast-to-coast air traveler experiences even now. Today, my watch and cell phone reset themselves, so the only thing I notice may be a bit of fatigue.

    I think we had one of those get-a-horse/new infrastructure moments the other day when Salesforce.com went down for 38 minutes. The official reason given was a memory allocation error, whatever that means. For users it was an inconvenience, like being too early for a train because your watch was set to local time.

    Let me go way out on a limb and say that this will not be the last time this system has an outage. I don’t care, really. Last time I looked, even the phone system had an expected outage, something like 40 seconds per year, but I haven’t looked for a while.

    The most interesting question for me is what an outage means for the future of the technology. There was a lot of hysteria the other day about the significance of the outage, and my friend Paul Greenberg did a good job of putting it into perspective.

    To those who worry publicly about the viability of cloud computing, and of Salesforce.com in particular, I might ask if you’ve ever had a flight run late, lost some ice cream due to a brownout or dropped a cell phone call. I know the answer. It’s all part of both modern living and our reliance on infrastructures that are tremendously complex and maybe not that old, as in seasoned.

    Through it all, Salesforce’s customer site for tracking incidents, www.trust.salesforce.com, did a good job of not hiding the problem even if it wasn’t phenomenally helpful in explaining it. And I think the point of the exercise is not what happened but how it was handled and rectified. Having procedures in place and a commitment to transparency are the most important things here. They train a customer to believe that, no matter what, a problem will be dealt with effectively. So if you are still hysterical about the outage, get over it. Or get a horse.
