One of the weaknesses inherent in cloud-based services was displayed once again on February 28 when Gmail service was lost to some 500,000 users, or as Google expressed it, 0.29% of the total user base (I love the way that small percentages are used in an attempt to disguise the impact of the outage – or maybe it’s a way to make those affected believe that they are truly special). Later reports put the figure for affected users closer to 0.02%, but that’s still a figure counted in the tens of thousands of users who woke up to find that their email was unavailable.
Any service or application can suffer a problem that renders it inaccessible to users. Normally the problem is a temporary hiccup, service is restored quickly, and users may not even realize that an outage occurred. What was curious in this case was that users lost data, in that when they signed back into Gmail all of their messages, contacts, and other information were unavailable. I assume that access to the data has been subsequently restored for all users as I’ve heard no further reports to indicate otherwise.
According to some reports, the problem appears to have been caused by the installation of some new code that had an unplanned and unforeseen effect on user mailboxes. This kind of thing can happen with any software but what’s interesting here is the feeling of helplessness that it generated for users.
When a service runs in the cloud, you really have no idea where your data is held or who is maintaining it for you. In addition, you have no control over changes that the provider who runs the service wishes to make. Most of the time, changes flow smoothly as new hardware is added to take the load of new users or software is upgraded or patched. As a user, I can’t say that I have ever been affected by losing access to Gmail, but maybe I have been lucky.
But when things go wrong with a major cloud service, they go wrong for lots of users. And unlike when you depend on an application that runs in-house on your company’s own servers, it’s hard to find someone to report the problem to or shout at to relieve some of your frustration. The cloud is an amorphous blob in many respects and service providers occupy a place somewhere in the blob that’s often hard to reach, especially when things go wrong.
Gmail is a free service and you can argue that the value of something free is just that – zero charge. You can argue that a properly managed commercial hosted email service that runs in the cloud wouldn’t experience such an outage and that even if problems occur, the framework of commercial contracts and Service Level Agreements that connect companies and service providers will ensure that everyone knows about the problem and the steps that are being taken to resolve it ASAP.
All of this is certainly true, but I wonder what will happen when an outage affects the mailboxes of a major company, such as one of the marquee names that providers are trumpeting as they sign these companies up to move them from on-premises to cloud implementations. The relationship between a company and a cloud provider is emphatically not the same as that which exists in a traditional outsourcing arrangement where the client has a direct connection to service and account managers that they can contact if problems arise. Indeed, if problems start to escalate, clients are usually able to reach executives of the outsourcing provider to emphasize the effect of the outage on company operations and to encourage a faster resolution. Often these interactions do exactly nothing to resolve a problem, apart, that is, from allowing the customer to blow off some steam at the provider. I guess there’s some value in that. I certainly can attest to the unique experience of having a customer CIO tearing into me as the company I worked for struggled to restore a satisfactory level of service for an outsourced Exchange 2007 deployment. I didn’t particularly enjoy the encounter but it seemed to make the customer feel better.
But when you’re in the cloud, you’re just one voice within a very large group. Even large companies that might have contracted for 50,000 or more seats will be dwarfed by the sheer number of mailboxes that cloud services such as Google Apps or Microsoft BPOS (soon to be Office 365) support. And when you’re just one voice it’s hard to be heard amongst the cries of pain provoked by any service outage, even if one of the mailboxes that’s affected belongs to a company’s CEO.
I wonder if some of the companies that are so eager to embrace the potential of the cloud fully realize the potential downside of the arrangement. There are benefits to be achieved such as faster access to newer technology, releasing IT staff from mundane activities like server maintenance to allow them to focus on higher-value activities, and a potential decrease in operational cost that is much beloved by CIOs. Nothing in life is ever all upside and much of the dark side of the cloud is still an unexplored area that we’ll really only discover as we work through future outages. Gmail has had its outages and Office 365 will have its outages – and the screams will be heard in Hades.
Brilliant metaphor, Tony. Cheers, Noel