The debate began with sheer denials, mostly on the basis that it didn’t seem to make sense for someone to attempt to second-guess the Exchange development engineers who have been working on this problem for many years. As the erudite Boris Lokhvitsky remarked: “In your car, do you have the desire to modify the combustion sequence or rearrange the valves in the engine so that it would run faster?”
In fact, Exchange 2013 evolved the failover criteria used by Exchange 2010 to take account of server health when Active Manager decides which target server should host a failing database, a process known as BCSS, or “best copy and server selection.”
But after a while, the esteemed Scott Schnoll weighed in to say that there is a way because Exchange accommodates a method called an Active Manager Extension, part of the third-party replication (TPR) API that exists in both Exchange 2010 and Exchange 2013. The TPR allows storage vendors to write their own continuous replication code and then stitch it together with the rest of the DAG components so that everything works together seamlessly. At least, that’s the theory.
TechNet says: “Exchange 2013 also includes a third-party replication API that enables organizations to use third-party synchronous replication solutions instead of the built-in continuous replication feature. Microsoft supports third-party solutions that use this API, provided that the solution provides the necessary functionality to replace all native continuous replication functionality that’s disabled as a result of using the API. Solutions are supported only when the API is used within a DAG to manage and activate mailbox database copies.”
On the surface, TPR seems like a wonderful idea. But the sad fact is that only EMC has ever implemented TPR in a solution called “Zero-Loss Protection for Exchange”, where they distinguish between “Native Database Availability Groups” (the normal kind) and “Synchronous Database Availability Groups” (the kind you’d use with an EMC CLARiiON SAN). The EMC Replication Enabler for Exchange is the component that leverages TPR.
I’m sure that EMC was very excited when Microsoft told them about the TPR because it must have seemed like a great way for EMC to defend their SAN installed base at a time when Microsoft was telling customers that they were engineering Exchange to exploit low-cost storage solutions. Since then the evidence is that not many people have actually used EMC’s solution and no other storage company appears to have been too interested in taking on the cost to develop and maintain their own replication solution for a DAG.
Indeed, given the hype around JBOD-type storage for Exchange, especially in the two years since Microsoft shipped Exchange 2013, anyone who proposed building a third-party replication solution for expensive SANs might be regarded as a candidate for lying down in a cool dark room until the idea passed. Even EMC is quiet on the topic of using their code with Exchange 2013 and I imagine that the Replication Enabler is heading to the great byte wastebasket soon, if it hasn’t already reached there.
So Scott was right in his assertion that there is a way for someone to affect the way that Active Manager handles database failovers. You simply have to crack open your favorite IDE and write the code to leverage TPR. Simple. Just like that. Or maybe not. The bad news is that your code has a limited future because Microsoft announced their intention to deprecate the API at the recent Ignite conference. It seems that Exchange 2016 will be the last version to support DIY DAG failovers.
As for me, I think I’ll let the Exchange developers take care of how replication happens inside DAGs. It just seems easier all round.
Follow Tony @12Knocksinna