Office 365 support case resolved – thankfully!

On April 29, I reported the poor support experience I had received as a result of the upgrade of my Office 365 tenant domain from the Wave 14 release to Wave 15. Essentially, a support call reported on April 8 had produced zero progress, despite many messages to and fro between me and Microsoft’s Office 365 support team. All in all, it was a tiring and frustrating time.

Four hours after posting the report, I was contacted by a UK-based Microsoft escalation engineer. Coincidences do happen, but in this case I think that the public protest had the desired effect on Microsoft’s bunged-up support processes. In fact, it’s depressing that publishing a blog post produced an escalation because it points to a problem in the support process. Normal customers who don’t blog won’t get the same response. It is probable that my visibility within the Exchange community as someone who writes extensively on the topic also assisted in the escalation process.

The good news is that at 22:30 on April 30 Outlook informed me that it had to restart because of a change made by the administrator (in fact, Outlook forced me to restart 3 times, for a reason that I haven’t quite figured out). I logged into my tenant and discovered that OWA used the Wave 15 interface and that all the administrative functions worked as expected. ActiveSync and EWS clients connected flawlessly to the upgraded service. The problem was solved 22 days after being first reported.

Joy! Something might have happened…

What have I learned from the experience? Here are some thoughts:

Microsoft front-line staff are just a filter. No surprise here because all major support organizations use front-line staff to filter incoming calls, solve the most obvious problems (and some that are not), and pass a certain percentage to second-level support via an escalation process. What surprised me about this case was how long Microsoft allowed the call to remain at the first level despite frequent communication back and forth with me. I asked repeatedly for updates but nothing happened. Clearly the internal escalation process did not function properly.

Microsoft escalation engineers know their stuff (at least, the person I dealt with did). Once the case was escalated, things happened more quickly (as you’d expect). The focus was sharper, the questions more pertinent, and action occurred. Tools such as those described in KB2598970 collected information from my workstation to help detect the source of the problem. Communications were restrained and content-rich. All in all, a much better experience.

Expect a delay if something has to change in the datacenter. Second-level support can go only so far with massive cloud systems. Their role seems to be to investigate problems, collect information, and then figure out what needs to be done. In this case a change needed to be made to my tenant domain. Unlike what might happen in an on-premises situation, senior support staff cannot take action on user accounts (or their equivalents) because Office 365 is, by necessity, an extremely locked-down environment where only specific people can interact with user data under controlled conditions. The upshot is that some delay is built into the system to have information fed back to the datacenter team and for them to respond. I like this because it shows that Microsoft is serious about protecting customer data – no shortcuts are taken to solve problems that might compromise data.

The service keeps on running even when back-end migration problems happen. I reported the problem on April 8 and it was resolved on April 30. Sounds bad. But all clients continued to function properly and access Exchange, Lync, and SharePoint during this period. An end user would not have known that anything was wrong. I think that this must be the situation with many Office 365 issues because if something really does go wrong then huge numbers of people are affected. In this case, a partial migration had resulted in a Wave 15 administration front-end attempting to talk to Wave 14 servers at the back-end. The different protocols involved caused the error. As it turns out, I’m told that the problem originated when my tenant subscription was changed last year and that this has uncovered a problem that Microsoft will now fix.

Document everything. This advice is often given to people who experience the joys of reporting a problem to support. You have to know and record your facts because you will be asked about them. Facts help identify where the problem might lie and how it might be solved. Write everything down, including the details of the interactions with the support team (time, date, and duration) as you might need to use this data to force an escalation.

The bottom line is that my Office 365 tenant domain is now back to full health. I am genuinely surprised that it took so long for Microsoft to solve the problem but am glad that things eventually worked out. It’s just a pity that it took so long to resolve and that escalation only happened after the incident was exposed to the full glare of publicity.

I doubt that many other tenant domains will be in the same situation. Office 365 has not really been around long enough for many companies to switch subscription types and Microsoft is now aware of the issue and will fix it. But I sure hope that the folks who run Office 365 support take action to improve their escalation processes so that other customers do not experience the same kind of extended case resolution as occurred here.

Follow Tony @12Knocksinna

Update 2 May: I was called this morning by a Microsoft customer support manager to discuss the problem and how Microsoft worked as the issue unfolded. I thought that the discussion was very open and helpful, which is always a good thing.

4 responses to “Office 365 support case resolved – thankfully!”

rnair

May 1, 2013

terrific report – appreciate it as always Tony… 🙂

Thank you,
Ratish Nair
MSExchangeGuru.com

Andrew Mazurek

May 1, 2013

Well done. Tony, this is why I started to blog left, right and so on… I am looking into a hybrid deployment to enhance our Exchange, but have some second thoughts. Imagine IT folks who pushed a cloud solution facing this problem. This is why controlling your own environment is not always a bad idea. The solution may be a private cloud, as I am in charge of what, when and how.
I understand that the impact was negligible, but what if something more serious happens, like a firmware update accident? Quality control should improve to prevent this kind of issue. For my users, 3 restarts of Outlook is at least 2 too many. Unless we stop giving vendors a pass…

Jake Zimmer

September 14, 2013

Our email has been down for 8 days now and we are definitely feeling the pain of the Office 365 support process. I understand the importance of a locked-down backend, but there should be enough engineers on staff who can fix the backend issues with minimal red tape in a timely fashion. 8 days without email is 7 days and 23 hours too long to be down. There’s a reason that we pay Microsoft $2400/month to host our email: we don’t want to stress over downtime, etc.

1. Tony Redmond (“Thoughts of an Idle Mind”)
  
  September 14, 2013
  
  Hi Jake,
  
  It’s sad to hear about your experience with Office 365 support. Can you give any other details that might help us understand what’s going on?
  
  TR