Wednesday, February 20, 2008

Amazon's service failure provides cautionary Anywhere lessons

Saskatchewan Shelf Cloud (Credit: Jeff Kerr and apod.nasa.gov)

For those who started their long weekend early last week, Amazon's storage 'cloud' service was offline for about three hours on Friday. Taken together with the similarly long Blackberry outage earlier in the week, I think there are some lessons worth noting:

  • Outages that seem important to you aren't important to service providers. Too many people assume that by outsourcing their technology challenges, they'll be getting world class service and risk management in return. Based on the quotes of Amazon and Research In Motion executives, those assumptions are misplaced. RIM co-CEO Jim Balsillie dismissed the Blackberry outage as "an intermittent delay, a couple of hours. It's old news." Amazon at least admitted that the downtime was unacceptable, but only did that after customers spent hours searching for the cause of the problem.
  • Cloud services don't guarantee anything. No matter how good those service level agreements sound when you sign them, when the service is down, you're down as well. And if you look very carefully at most of those service level agreements, the penalties for not providing the service are limited to what you are paying that month. That's cold comfort when your business's revenue goes to zero for an unknown period of time.
  • Anywhere services need more than commodity service. Many Web 2.0 startups have staked their future on the hope that cloud computing is "good enough" to propel their business models. But the more consumers get used to Anywhere services -- ones that anyone can use on any device on any network -- the more they will be disappointed by garden variety, commodity service. Those companies aspiring to be the next Google should remember that Google started out by building its own massively-redundant infrastructure in closets at Stanford University rather than just piggybacking on university resources. Anywhere reliability and scale will require more than formless cloud infrastructures to work 24 hours a day, seven days a week.
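To make the service level agreement point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a 30-day billing month and a penalty capped at that month's fee, as described above. The fee and hourly revenue figures are hypothetical placeholders, not numbers from Amazon or any real contract.

```python
# Rough sketch: how little an SLA credit covers when revenue stops.
# Assumption: a 30-day billing month (720 hours).
HOURS_PER_MONTH = 30 * 24


def availability(downtime_hours, period_hours=HOURS_PER_MONTH):
    """Fraction of the period the service was up."""
    return 1.0 - downtime_hours / period_hours


def uncovered_loss(downtime_hours, revenue_per_hour, monthly_fee):
    """Revenue lost during the outage minus the largest possible credit.

    Assumption: the credit is capped at one month's fee, and revenue
    drops to zero for the whole outage.
    """
    lost_revenue = downtime_hours * revenue_per_hour
    max_credit = monthly_fee
    return max(0.0, lost_revenue - max_credit)


if __name__ == "__main__":
    outage_hours = 3            # roughly the length of Friday's outage
    fee = 500.0                 # hypothetical monthly bill
    revenue_per_hour = 2000.0   # hypothetical revenue rate

    print(f"Monthly availability: {availability(outage_hours):.3%}")
    print(f"Loss not covered by the SLA: "
          f"${uncovered_loss(outage_hours, revenue_per_hour, fee):,.2f}")
```

Even with a full refund of the month's bill, a three-hour outage still leaves the service above 99.5% availability for the month, while most of the lost revenue stays with the customer.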

One final note: one company I already consider a great Anywhere company is FedEx. While some may argue that it isn't in the Anywhere information business, many of its executives would disagree strenuously, noting that the information they collect on packages and deliveries is just as valuable as the packages themselves. I remember one of the CIOs of FedEx commenting, "Our data center is a lot like Noah's Ark: we have two of everything." And the company's circa 1996 thinking about contingency planning and reliability of service, as documented by Wired Magazine, is a great example for companies today to consider:

Behind one of these straitlaced corporate citadels, a low-slung building squats buried under a vine-covered earthworks, shielded by walls of thick concrete. Formally known as the Global Operations Center, it serves as a subterranean command facility for the entire FedEx distribution and delivery system. Employees call it "the Bunker."

The lighting in the Bunker is subdued, and a hushed intensity crackles through the climate-controlled air. On the walls, giant flat-panel projection screens display real-time weather maps of the continental United States, while workstations around the periphery stand equipped with banks of computer terminals and heavy black telephones. A team in the back of the room specializes in domestic operations, and another behind it focuses on surface transportation. Up front is the international unit; a bevy of flight crew dispatchers are positioned off to the left, and there's a handful of meteorologists tucked off in a dark corner.

"It's pretty quiet here now," explains Bunker manager Pete Gwaltney. "But come midnight, the place will be a whole lot busier. At peak periods, we operate in five-minute decision cycles."

Gwaltney's job is to keep the FedEx distribution network running smoothly despite the inevitable grind of glitches and failures that plague any complex mechanical system. But as he nonchalantly puts it, "This company spends lots of money preparing for contingencies."

To demonstrate the point, he explains how FedEx launches an empty jet freighter each night from Portland, Oregon, bound for Memphis. The jet tracks a course that brings it close to several FedEx terminal airports so that if one of the jets parked on the ground suffers a sudden mechanical failure, the empty freighter can swoop down and pick up the stricken plane's cargo.

The image of that empty FedEx jet streaking through the night reminds me of the old "doomsday" bombers that were kept aloft and on alert during the Cold War. "Jeez," I remark. "It's like Strategic Air Command around here." Gwaltney smiles, as if the same thought crossed his mind a long, long time ago. "Actually," he says, "it's more like Strategic Freight Command."

That's what I think of as the gold standard for Anywhere services. And for those companies that think they can bet their futures and investors' money on cloud-based, best-effort services and still compete with companies that think like FedEx: good luck with that. You'll need it.

