Amazon S3 and the Changing Storage Market
I noticed Nicholas Carr’s short article, Why S3 Failed, about the root cause of the February 15th outage of Amazon’s hosted storage service. The root cause (a surge in authorization requests) is not the interesting part of this story. Rather, the interesting part is the ensuing discussion about the pros and cons of using a hosted storage solution versus keeping storage in-house.
In a comment to Carr’s posting, Marc Farley — a long-time author and analyst in the storage solutions space — pointed out:
Nick, storage customers tend to be far more conservative about risk than your average web services customers. Paranoid skepticism about storage creates different market dynamics. In other words, it ain’t cost-driven. Yesterday I wrote about it on my blog.
It makes sense to me that storage solutions — whether hosted or in-house — are (or should be) subject to a higher level of “paranoid skepticism” than other types of Internet services. For users and companies, it’s all about the data. Service disruptions are annoying, but data loss is terrifying.
However, I think Marc misses several points. First, S3 has a lot going for it beyond being “cheap”. The primary attraction of S3 is its programmatic accessibility and its ability to handle random reading and writing of files (rather than bulk update and restore). This makes it appealing to incorporate into many online and desktop applications.
Second, Marc says in his blog entry:
If you expect something like 5 nines, then I would suggest that S3’s problems today cast a different shade of cloud over those expectations.
I say that the Amazon S3 Service Level Agreement, more than this week’s outage, should be the instrument that “casts a different shade of cloud” over expectations that S3 will provide 5 nines [1] availability. The SLA only commits to 3 nines [2] in any billing period. Businesses and individuals requiring higher availability must look elsewhere or provide their own redundancy (and read Joel’s insights about the realities of high availability).
Marc’s not entirely wrong, though. In this segment, he mostly hits the nail on the head:
The bottom line question is whether you think you can do better on your own?
That’s close to right. Expressed more fully, the bottom line is whether you think you can either do it better for the same cost or do it cheaper with the same level of service. But I digress.
I imagine that Marc’s professional history is largely devoted to working with large institutions and enterprises because those have been the traditional customers for professional storage solutions. With that perspective, Marc goes on to say:
Today, most storage customers think so and I believe most will continue to think so for a long time, despite what market analyst thinkers like Nick Carr believe.
Traditional storage solution customers probably can do better on their own than by using Amazon S3, but what Nick Carr realizes and Marc Farley doesn’t is that Amazon S3 isn’t aimed at that market.
I don’t think that Marc fully appreciates the dramatic market shift that is taking place today in the storage services space. I would be shocked if a significant proportion of Amazon S3’s customer base was traditional storage solution customers. I expect that most S3 customers are small businesses, startups, sole proprietorships, and individuals. Customers in this market don’t need petabytes of storage with carrier-grade availability (at least initially) and they almost certainly don’t have the technical expertise in-house (or the money to contract for that expertise) to build their own storage solution. What these customers do need is a storage solution with low up-front costs, pay-for-what-you-use billing, and access to lots more storage on short notice in case they’re an overnight success.
Thus, Marc might be surprised by Friday’s Wired article: Customers Shrug Off S3 Service Failure, but Nicholas Carr and Wired’s CTO understand:
For startups (and even for our own startups), it is a calculated risk to put all eggs in the EC2/S3 basket. Considering the cost savings overall, today’s glitch may have been acceptable for startups that use S3, like Twitter, given the bigger picture.
Indeed.
[1] “Five nines” is 99.999% availability, which allows only 25.9 seconds a month or 5.26 minutes a year of unplanned downtime.
[2] “Three nines” is 99.9% availability, which allows 43.2 minutes a month or 8.76 hours a year of unplanned downtime.
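For anyone who wants to check the downtime figures in the footnotes, here is a minimal Python sketch of the arithmetic. It assumes a 30-day month and a 365-day year, which is what the footnote numbers appear to be based on; the function name downtime_budget_seconds is my own and purely illustrative.

```python
def downtime_budget_seconds(availability_percent: float, period_days: float) -> float:
    """Return the seconds of downtime allowed over period_days
    at the given availability percentage (e.g. 99.9)."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    seconds_in_period = period_days * 24 * 60 * 60
    return unavailable_fraction * seconds_in_period


if __name__ == "__main__":
    # Assumption: a "month" is 30 days and a "year" is 365 days.
    for label, percent in (("five nines", 99.999), ("three nines", 99.9)):
        per_month = downtime_budget_seconds(percent, 30)
        per_year = downtime_budget_seconds(percent, 365)
        print(f"{label} ({percent}%): "
              f"{per_month / 60:.2f} minutes a month, "
              f"{per_year / 3600:.2f} hours a year")
```

The same function works for any other target, such as 99.99%, so it is an easy way to see what an SLA number really permits before relying on it.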