Amazon Cloud Burst – Cloud Computing Still a Good Strategy?

On Thursday, April 21, 2011, Amazon Web Services, which many companies use to host their websites, databases, and other corporate computing systems, went down for 11 hours.  That is bad and very visible, especially for the pioneer of “cloud computing”.  But does Amazon going down mean we should shy away from the cloud?

Not at all. 

Every experienced IT professional has lived through downtime, whether from hardware, software, telecom, fiber cuts, HVAC outages, or human error.  If you haven’t, you have been very lucky or have not been in IT long enough.  Technology will break and humans will make errors; what matters is how fast you get back online.  It also matters whether you’ve lost data.  Amazon didn’t lose any data; they lost connectivity for a bit.  Many firms do lose data in an outage, and that’s a million times worse than losing connectivity.

When a cloud provider like Amazon goes down, there’s usually a perfect storm: three or more things have to go wrong at the same time.  And no matter how bullet-proof you think you are, I can come up with scenarios that will bring you down.  I can remember losing a data volume and having an email server get corrupted, and it was stressful when things went down.  But for the most part, the sun still came up the next day and the outage was not life threatening.  Smaller firms have much less redundancy, and for them that 11-hour Amazon outage might have been a 2-3 day outage.  Recall that Google has had an outage too, and I know of very large corporations that had outages and tried to keep them out of the papers.  It happens.  Now, if it keeps happening, maybe it’s time to find a new cloud provider, but ditching the cloud because of this one visible outage would be a bad decision.  So what will you do: stay away from cloud providers so that you can run on your own infrastructure, which is less protected from outages?  That would be throwing the baby out with the bath water.

Outage protection is like a big insurance policy.  How much redundant equipment, software, networking, and staff can you afford in order to keep from going down?  Then you have to add some sanity to the business case.  You don’t want to spend $1000 to protect $100.  Figure out how much revenue per day your firm generates from the systems you deploy.  Then figure out your single points of failure and how much it would cost to eliminate them.  Now you’ve got a good benchmark for how much to spend.  If you generate $1MM per day, then it might make sense to spend an extra $200,000 per year to improve your operational effectiveness.  On the flip side, if your firm makes $500,000 per year in revenue, you are not going to spend an additional $200,000 to add redundancy.
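If you want to run that sanity check on your own numbers, here is a rough sketch in Python.  It is only a back-of-the-envelope model, not a real risk calculation: the function names are made up for illustration, and the number of outage days the extra spending would avoid each year is an assumption you have to supply yourself.

```python
def revenue_per_day(annual_revenue, days_per_year=365):
    """Average revenue per day, assuming revenue is spread evenly over the year."""
    return annual_revenue / days_per_year


def redundancy_worth_it(daily_revenue, outage_days_avoided, annual_cost):
    """Return True if the revenue protected exceeds what the redundancy costs.

    outage_days_avoided is an assumption: your own estimate of how many
    days of downtime per year the extra spending would prevent.
    """
    revenue_protected = daily_revenue * outage_days_avoided
    return revenue_protected > annual_cost


if __name__ == "__main__":
    # Firm generating $1MM per day, assuming one outage day avoided per year.
    print(redundancy_worth_it(1_000_000, 1, 200_000))

    # Firm making $500,000 per year, same assumption and same $200,000 cost.
    print(redundancy_worth_it(revenue_per_day(500_000), 1, 200_000))
```

With one avoided outage day per year assumed, the first firm protects far more revenue than the $200,000 it spends, while the second firm would be spending $200,000 to protect well under $2,000.  Change the assumption and the answer can change, which is the whole point of doing the math before you buy.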

And guess what?  As you systematically remove your single points of failure, you start looking for cost-effective ways to add this redundancy.  When you reach the end of this rainbow, your datacenter and operational plan will look very similar to those of a cloud provider like Amazon.

 

 
