For the London Free Press – May 9, 2011
Recent outages at Amazon and Sony’s PlayStation Network have left businesses and consumers without service for lengthy periods of time.
The tech press is full of articles suggesting the bloom is off the rose for cloud services, and that cloud providers are in denial about the risks. These articles call on online providers to take financial responsibility and offer more than token service credits.
These outages have done more than just prevent gamers from playing. Services provided by Foursquare, Hootsuite, Discovr, the New York Times and others were affected by the “Amazonpocalypse”. Other businesses using Amazon were barely affected, because they had designed their use with disaster prevention in mind.
One reason cloud services are inexpensive is that they come with no guarantees, and no liability on the part of the provider. That’s not meant to suggest online providers aren’t motivated to keep their services running. It’s bad for business if they don’t. But some are better than others, and problems can occur despite provider efforts.
If users expect financial responsibility and compensation for their losses in a failure, they can expect to pay more.
Online service provider user agreements contain limitation clauses that deny liability if the services don’t work. At most, there might be a refund of service fees proportionate to the amount of downtime. If users want more, they can expect to pay for the provider’s insurance to back up the liability. And in practice, most users opt not to pay more for liability protection.
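To see how small a proportionate refund can be, here is a rough sketch of the arithmetic. The function name, the 720-hour month and the dollar figures are illustrative assumptions, not terms taken from any real provider's agreement.

```python
def prorated_credit(monthly_fee: float, downtime_hours: float,
                    hours_in_month: float = 720.0) -> float:
    """Return a refund proportional to downtime, never more than the fee.

    Hypothetical formula for illustration only; a real agreement defines
    its own credit schedule.
    """
    if monthly_fee < 0 or downtime_hours < 0 or hours_in_month <= 0:
        raise ValueError("fee and hours must be non-negative, month length positive")
    # Fraction of the month the service was down, capped at the whole month.
    fraction = min(downtime_hours / hours_in_month, 1.0)
    return round(monthly_fee * fraction, 2)


# A $100-a-month service that is down for 36 hours of a 720-hour month.
credit = prorated_credit(100.0, 36.0)
print(credit)  # 5.0
```

Under these assumed numbers, a day and a half of downtime earns a $5 credit, which bears no relation to what the outage may have cost the customer's business.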
Anyone using online or cloud services needs to first consider how crucial the services are to them. What will the effect be if the service is disrupted for a short or long period of time, or if their online data is lost?
If such disruptions would have serious effects, then the user must take steps to control those risks.
For the risk of losing data, it might be as simple as keeping local backups, or keeping a mirrored copy at a different service provider at a different location.
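As a minimal sketch of that idea, the snippet below copies a file to more than one location and checks each copy against the original. The directories stand in for "a local backup" and "a second provider"; in practice the second location would be reached through that provider's own tools, which are not shown here. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path) -> str:
    """Compute a SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_file(source: Path, destinations: Iterable[Path]) -> List[Path]:
    """Copy source into each destination directory and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)
        # A backup that was never checked is only a hope, so compare checksums.
        if sha256_of(target) != expected:
            raise IOError(f"copy at {target} does not match the original")
        copies.append(target)
    return copies
```

The checksum comparison is the point of the example: a copy is only useful if someone has confirmed it matches the original.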
To keep the service operating continuously, users should take a close look at how the service is provided, and plan their use in a way designed to survive failure.
In other words, assume things will fail, plan around that, and test to ensure the plan works.
Amazon, for example, has several “availability zones”. Amazon customers who were able to switch between them suffered only minor issues.
Another approach is to use multiple service providers based in different locations.
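The "assume things will fail, plan around that, and test" advice can be sketched in a few lines. This is an illustration under stated assumptions, not how Amazon or any particular provider exposes its zones: the zone names and the two stand-in functions are hypothetical, and a real system would make network calls where these functions simply return or raise.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every configured zone or provider has failed."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each provider in order and return (name, result) from the first that works."""
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # any failure means: move on to the next one
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for a failed zone and a healthy one.
def failing_zone() -> str:
    raise ConnectionError("zone unavailable")


def healthy_zone() -> str:
    return "ok"


# Test the plan: simulate an outage in the first zone and confirm the switch.
used, result = call_with_failover([("zone-a", failing_zone), ("zone-b", healthy_zone)])
print(used, result)  # zone-b ok
```

The last two lines matter as much as the function itself. Deliberately breaking the first choice and checking that the second takes over is what turns a failover plan into one that has actually been tested.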