On July 13, Microsoft took the decision to withdraw Exchange 2010 SP1 RU4 (the fourth roll-up update for SP1). As you’re aware, a roll-up update (RU) is a regular release of the patches and other fixes that Microsoft has accumulated for a version of Exchange. RU4 was released on June 22, 2011 so it hasn’t been available long. Microsoft released the prior update (RU3) on March 8, 2011 but customers soon encountered problems with Blackberry devices sending duplicate messages. Microsoft then re-released RU3 on April 6, 2011.
In the case of RU4, the problem is a tad surprising because it occurs during a fundamental operation – moving items around. It’s the kind of thing that you’d really expect a QA group to pick up:
“A small number of customers have reported when the Outlook client is used to move or copy a folder that subfolders and content for the moved folder are deleted. After investigation we have determined that the folder and item contents do not appear in the destination folder as expected but may be recovered from the Recoverable Items folder (what was previously known as Dumpster in older versions of Exchange) from the original folder. This behavior occurs due to a customer requested change in SP1 RU4 which allowed deleted Public Folders to be recovered. Outlook and Exchange are not correctly processing the folder move and copy operations causing the folder contents to appear to be deleted.”
It requires a pretty serious event for a development group to publicly withdraw software. To have to withdraw roll-up updates twice in quick succession seems to indicate that Microsoft has a real problem in its quality control or release process, and raises the question of whether customers should have confidence in future patches or other software released for Exchange.
I think that this is a somewhat simplistic view of the situation. Here’s why. First, Exchange is a very complex product that spans over 21 million lines of code. Although I am sure that the development process is well honed after some sixteen or seventeen years of building Exchange, things are becoming more complex all the time as the development group now has to create code to serve the twin platforms of on-premises and cloud (Exchange Online).
Some insight into the complexity that Microsoft development groups deal with might be gained from the excellent books written by Steve McConnell about Microsoft development practices based on his experience of shipping several products, including Rapid Development: Taming Wild Software Schedules, Code Complete: A Practical Handbook of Software Construction and Software Estimation: Demystifying the Black Art (Best Practices). In “Code Complete”, McConnell mentions that there might be 10-20 code defects per 1,000 lines. I believe that this is an old number, based on early releases of products such as Excel, and that the rate has likely decreased with the introduction of automated code checking tools and better software development frameworks. Even so, it’s still probable that every 1,000 lines of code has one or two defects lurking. I can’t believe that the Exchange code base includes over 21,000 bugs, but I bet that Microsoft has a substantial database of known bugs, potential problems, customer requests for enhancements, and other reasons why code might need to be changed in the future. It’s just the nature of complex software.
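To make the back-of-the-envelope arithmetic concrete, here is a minimal sketch that applies the defect rates quoted above to a code base of 21 million lines. The function and variable names are my own for illustration, and the output is only what those rates would imply, not a measurement of Exchange.

```python
# Rough defect estimate using only the figures quoted in the post.
LINES_OF_CODE = 21_000_000  # "over 21 million lines of code"


def estimated_defects(lines: int, defects_per_kloc: float) -> int:
    """Return the defect count implied by a defects-per-1,000-lines rate."""
    return round(lines / 1000 * defects_per_kloc)


# (low, high) defects per 1,000 lines for each scenario mentioned in the text.
RATES = {
    "older figure (10-20 per 1,000 lines)": (10, 20),
    "optimistic guess (1-2 per 1,000 lines)": (1, 2),
}

if __name__ == "__main__":
    for label, (low, high) in RATES.items():
        low_count = estimated_defects(LINES_OF_CODE, low)
        high_count = estimated_defects(LINES_OF_CODE, high)
        print(f"{label}: {low_count:,} to {high_count:,} defects")
```

Even at one defect per 1,000 lines, the implied total is the 21,000 figure mentioned above, which is why the real number is best treated as unknown but unlikely to be zero.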
It’s also important to realize that a specific defect might never be exposed in the normal course of events, might only appear in very specific circumstances, or become a knock-on effect as a result of code changed elsewhere including a Microsoft or non-Microsoft client. I doubt that we will ever get to zero code defects in commercial software so we’re always going to have to cope with patches and service packs for Exchange, Windows, SharePoint et al.
Second, given that we deal with a complex software environment, it makes sense to protect production systems by never deploying roll-up updates, service packs or indeed new versions without testing in a realistic environment that adequately mimics the production workload. In this context, testing doesn’t mean just checking that the software will install. It means testing Exchange on the Windows build used in production accessed by all the clients (and versions) that you use and alongside all third-party software products that interact with Exchange. In short, it’s not a quick and simple process.
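One way to see why this is not a quick process is to enumerate the combinations that a test plan has to cover. The sketch below is only an illustration: the Windows build, client, and third-party product names are placeholders that I have assumed, and you would substitute whatever actually runs in your production environment.

```python
from itertools import product


def build_test_matrix(windows_builds, clients, third_party):
    """Return every combination of OS build, client, and third-party product."""
    return [
        {"windows": w, "client": c, "third_party": t}
        for w, c, t in product(windows_builds, clients, third_party)
    ]


# Placeholder inventory (assumed for illustration, not taken from any real site).
WINDOWS_BUILDS = ["Windows Server 2008 R2 SP1"]
CLIENTS = ["Outlook 2007", "Outlook 2010", "Outlook Web App"]
THIRD_PARTY = ["mobile device server", "anti-virus scanner"]

if __name__ == "__main__":
    matrix = build_test_matrix(WINDOWS_BUILDS, CLIENTS, THIRD_PARTY)
    print(f"{len(matrix)} combinations to validate")
    for combo in matrix:
        print(combo)
```

Every client version or third-party product that you add multiplies the number of combinations, which is why a realistic test cycle takes time.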
If you rush to deploy software as soon as it’s released by Microsoft, you run the risk of encountering a problem that impacts users. For example, if you had deployed RU3 without testing, you’d have had to explain to Blackberry users why they were seeing duplicate messages. In the case of RU4, you might have run into the situation where Outlook users reported that they had “lost” data when they moved or copied folders. Both situations underline the importance of testing before deployment.
The third factor to consider is the maturity demonstrated by the Exchange development group in quickly acknowledging the problem and taking the necessary action to withdraw the software, even if it exposed Exchange to the ridicule of some commentators. I think this behavior shows a certain dedication to the installed base. I am not impressed that Microsoft has had to withdraw two roll-up updates in quick succession, but the disappointment is somewhat mitigated by their fast action and open communication. I also expect that this situation has served as a wake-up call to the QA and support folks, who hopefully will do better with future releases.
And for the rest of us, it’s a great reminder that software like Exchange is general-purpose in that it’s created by engineers who have zero visibility of the many varied ways that Exchange is deployed in the field. For that reason alone, you should protect yourself against software bugs by testing, testing, and more testing before anything is deployed.
Microsoft plans to fix the problem in Exchange 2010 RU5, which is expected to be available sometime in August. Microsoft has an interim update (KB2581545) that can be applied if you have already deployed RU4 (but remember the requirement for testing). You can contact Microsoft support to get the interim update.
Update July 28: Microsoft has re-released RU4. See my commentary on WindowsITPro.com.
Tony, I basically agree with what you said. However, why can’t Microsoft test the RU accordingly? A move item operation is nothing esoteric. And to fuel the fire: 3 of my customers had cases open with PSS about the exact same issue – long before RU2(!) was released. All of the cases were closed with “not reproducible”. So maybe they (PSS, the PG, etc.) should rethink their processes and *really* listen to customers instead of denying the existence of severe issues…
I agree that it’s disappointing that the Exchange QA and release teams seem to have fallen down on the job by not testing the fix that was made to allow deleted public folders to be recovered. They will probably say that this was a totally unexpected side effect that resulted from some unusual interaction between Exchange and Outlook. Whatever excuse is proffered, it’s still a bad situation. My point is that I hope that the problems with RU3 and RU4 will make Microsoft change the way that it deals with roll-up updates: first by doing better internal testing, and second by releasing early versions of updates as betas that can be tested by the Exchange community. In this case, the logic that many eyes will probably detect any lurking problems seems unanswerable.
How do customers typically test a new version of Exchange? Are they simply doing manual testing of features? Are there third party tools which will allow you to test a new version of Exchange using automated test tools?
Testing is usually done by deploying the new version of Exchange on virtual servers that mimic the production environment. In other words, the same versions of O/S, Exchange, and third party software and clients are run together to validate that the combination doesn’t exhibit any problems. Some companies have test scripts that they use to validate the new configuration. Others have a small group of people use the new software as their work platform for a couple of weeks. I don’t know of any software that will automate a full realistic test of Exchange.
BTW, nice to see you again (virtually) Jonathan…
Nice to talk to you again as well!
What sort of scripting do companies do? Do they do Outlook only client scripting or do they attempt to do OWA scripting as well?
Do you think the folder move problem could have been found using a VSTO script or do you think it would require some sort of GUI based test?
I’m afraid I am not a scripting guru… My observation is that mostly it is Outlook that is tested and often it is done by the support desk checking out features manually.
You seem to be on both Microsoft’s side as well as their approach to this issue… Any thought on running for office?? I mean political office, not Microsoft Office!!!!
My point is that there is usually some goodness and badness in these situations. Obviously, Microsoft has a horrible quality problem in RU4 that should have been detected and fixed before the RU was released. On the other hand, they have fessed up to the problem and taken the appropriate action to both inform customers and provide an interim fix while they figure out what went wrong and how they will fix the issue permanently. My experience as a VP for ten years is that some other large companies might not have done so well in the same situation.
I wonder if Microsoft has its “head in the cloud” these days to the point where they have shifted too many resources away from development and testing of their on-premises releases?
My other observation: how could a customer be expected to find a problem like this in their test cycle before deployment if Microsoft couldn’t? It would be very inefficient for the customer to be expected to unit test every possible email-related scenario. This is Microsoft’s job. The customer test cycle should only concern itself with how the update deploys on their hardware and software stack and how it interacts with 3rd party software that Microsoft is not responsible for.
We didn’t go to RU3 before it was pulled, thankfully, but our normal testing would not flag up any problems with Blackberries, simply because our test environment doesn’t include BES. I doubt many people’s do.
Even though BES Express is cheap/free, you’ll normally be running the test environment in a virtualised or compartmented environment which may not have access to the outside world that BES needs.