Thoughts about having a dedicated archive server in a DAG


An article about the deployment of archive mailboxes, posted on SearchExchange by Brian Posey, a well-known and respected writer on Exchange, attracted my attention. While Brian makes some good points, I think the general thrust of his argument is flawed. Here’s why.

His central point is that organizations can create mailbox databases that contain only archive mailboxes and place them on low-cost storage attached to a low-end server. He says that administrators are “concerned with how much storage space the archives would consume” and that’s why they’d like to use low-cost storage.

I don’t have much argument with these statements. I do, however, argue that you should not use SATA drives with an archive server. First, while these drives are cheap, they tend to fail more often than their SAS counterparts, and most administrators won’t want to be constantly monitoring archive servers to detect and fix failed drives. I think it makes more sense to invest a little more money and deploy SAS drives in a RAID-1 array to host archive mailboxes. Something like the HP MDS 600 would do the job nicely, but there are many other good, reasonably-priced SAS-based direct-attached storage modules available today. The extra hardware investment is justified by the lower operational cost (and the question has to be asked: if your archives are worth anything, shouldn’t they be protected by reasonable storage?). If you do decide to use JBOD, be sure to follow the advice from the Exchange development group: deploy at least three database copies to protect your data and use high-quality 7,200 rpm disks.

Brian then goes on to say that low-end hardware “can be problematic if you introduce DAGs (Database Availability Groups)”. His statement that DAGs don’t distinguish between a database that contains archive mailboxes and any other mailbox database is absolutely true. This is part of the charm of Exchange 2010 – you can mix and match mailbox types to meet the precise needs of your organization. But Brian then says that creating an archive database in a DAG causes problems because “DAGs require each database copy to use the same storage path, which can be an issue if the archive server is using SATA storage.”

Hmmm… again, he’s absolutely on the money when he states that each database copy in a DAG requires the same storage path pointing to the database files and transaction logs (think E:\ARCHIVE\DB1-ARCHIVE.EDB). I guess this might be an issue if you run out of drive letters on your server, but there are ways around that problem (mount points, anyone?). See TechNet for more details about Exchange 2010 storage.
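
As a rough sketch (the server name and folder paths here are purely hypothetical), a volume mounted under a folder lets every DAG member present the same database path without consuming a drive letter:

# The archive volume is mounted under C:\ExchangeVolumes\Archive1 on each DAG member
New-MailboxDatabase -Name 'DB1-ARCHIVE' -Server 'ExServer1' -EdbFilePath 'C:\ExchangeVolumes\Archive1\DB1-ARCHIVE.EDB' -LogFolderPath 'C:\ExchangeVolumes\Archive1\Logs'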

The biggest argument I have is with the statement that “You’ll also need to provide enough storage space for each server in the DAG to store a copy of the archive database.” This is factually incorrect. You absolutely need to create sufficient copies of a database within a DAG to provide resilience against failure, but every member doesn’t need one. In most cases, three database copies are sufficient and I expect that this will be the optimum arrangement for archive databases (the more paranoid amongst us might favour four copies). I think that any DAG that includes a dedicated archive server will have five or more member servers, so with three copies at least two of those servers wouldn’t have to hold copies of the archive databases. Note that it’s a well-known tenet of DAG design that DAGs with more member servers are more flexible and resilient against failure than those with fewer members. In other words, given the choice between deploying two DAGs of three members each and one of six, I’d always go with the larger DAG.
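
To make the idea concrete, here’s a minimal sketch (the database and server names are invented) of placing an archive database on just three members of a larger DAG:

# DB1-ARCHIVE is active on ExServer1; add passive copies on two other DAG members only
Add-MailboxDatabaseCopy -Identity 'DB1-ARCHIVE' -MailboxServer 'ExServer2' -ActivationPreference 2
Add-MailboxDatabaseCopy -Identity 'DB1-ARCHIVE' -MailboxServer 'ExServer3' -ActivationPreference 3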

The article ends by recommending that you consider creating a low-end DAG to host archive mailboxes. I don’t go along with this advice because I think it makes more sense to incorporate a dedicated archive server (if that’s what you want to use) alongside the other servers within a DAG so that there is a single entity to be managed. I just don’t see the advantage of a separate DAG dedicated solely to user archives. I prefer the flexibility of being able to distribute a couple of passive copies of the archive databases across other DAG members so that these copies are available and can be automatically activated if something goes wrong with the archive server.

Please don’t get the idea that I condemn Brian for advancing his ideas in his article. We’re at the start of defining best practice for the most efficient and effective deployment tactics for entities such as dedicated archive servers, and I expect that many will support his perspective. That’s OK too – the important thing is that we debate the issues and gradually come to a view of what best practice actually is, based on hard knowledge of the operational realities encountered in real-life deployments. For example, the question of RTO/RPO for archive data needs to be discussed in a deployment scenario so that everyone understands how quickly archive data can be restored if the need occurs. Another important point to discuss is the purpose of the archive database. After all, now that Exchange 2010 supports massive (in comparative terms to what we have had before) mailboxes of 25GB or larger, maybe you don’t need archives at all: you can keep everything in the regular database and use retention policies to help users keep swelling folders under control. Of course, you might end up with a massive OST as well, but that’s another day’s work!
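
If you do go down the no-archive route, here is a hedged sketch of what such a retention policy might look like (the tag, policy, and mailbox names are invented for illustration):

# Remove items older than two years and clear Deleted Items after 30 days, then assign the policy
New-RetentionPolicyTag 'DPT-Delete-730-Days' -Type All -AgeLimitForRetention 730 -RetentionAction DeleteAndAllowRecovery
New-RetentionPolicyTag 'Deleted Items 30 Days' -Type DeletedItems -AgeLimitForRetention 30 -RetentionAction DeleteAndAllowRecovery
New-RetentionPolicy 'Standard User Retention' -RetentionPolicyTagLinks 'DPT-Delete-730-Days','Deleted Items 30 Days'
Set-Mailbox -Identity 'TRedmond' -RetentionPolicy 'Standard User Retention'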

Let the discussion begin!

–  Tony

Need more information about Exchange 2010 archives? Read all about the topic in Microsoft Exchange Server 2010 Inside Out, also available at Amazon.co.uk. The book is also available in a Kindle edition. Other e-book formats for the book are available from the O’Reilly web site. Or if you’d like to discuss the topic in person, come to an Exchange 2010 Maestro training event and debate the finer points of this and other interesting areas of Exchange with Paul Robichaux and myself.


Off to Spring Connections and an interesting video


This video posted on the Seattle Pi site is an interesting walk down memory lane as it wanders from the introduction of products such as Word to Exchange to SharePoint Portal Server to today’s fascination with all things to do with the cloud. Of course, it’s a marketing video, but even so, it’s interesting to see how you can take all the products and stitch them together into a story that makes sense. After watching it, there’s no way that you could be left in any doubt that some seer at Microsoft was guiding the whole thing from development on 3.5 inch floppy disks to CD to DVD to the cloud. It all makes sense now…

With that thought, I shall head off to Spring Connections 2011 in Orlando and see those of you who plan to attend there. I’m speaking at 8am on Tuesday next, March 29 in the Tuscany D+E rooms at the JW Marriott Resort. Apart from this keynote, be sure to attend Mark Minasi’s whimsically titled keynote “Cloud Computing: A (Lapsed) Economist’s View” at 8am on Monday.

Other well-known speakers about topics related to Exchange include Jim McBee, Michael B. Smith, Kevin Laahs, Kieran McCorry, and Mike Crowley. Should be a good event!

Now off to the airport…

– Tony


The spectrum of technical writing


I am often asked what the difference is between writing a blog post and an article that’s published in a magazine such as Windows IT Pro. In response, I think that a spectrum of technical writing exists that goes something like this:

  • The most basic (and often very useful) contribution that people make is in an online forum such as the TechNet forum for Exchange 2010. While many forums have moderators to keep everyone focused on the topic at hand, few exert much quality control over the contributions. You therefore accept whatever information you glean from these sites at face value. You might recognize a contributor, in which case you’d assign greater weight to the content, but many contributions are repetitive, recycle information from elsewhere, or are just plain wrong.
  • Blogs represent the next level up in the spectrum. A blog is a very personal expression of someone’s opinion on a topic. Hopefully, if it’s a technology topic, the author will do some research before they post. This happens in good blogs and you can see the quality of the material when it does happen. However, it’s important to remember that humans are fallible and that the content of very few blogs is reviewed from an editorial or technical perspective. The most notable exception is provided by blogs written by product teams such as the Exchange development group, where I have every reason to believe that posts are carefully scrutinized before publication. This doesn’t imply that mistakes don’t happen; it just means that mistakes occur less often because many eyes have probably looked the content over before it appears in public.
  • Magazine articles, at least those published by reputable companies, represent the next level. Articles differ from blog posts from development teams because they are usually independent and therefore don’t include the same kind of Kool-Aid content that invariably sneaks in when someone working for a development group discusses their product. Articles are similar to blog posts from development groups because they go through an editorial cycle intended to improve the flow and content of the article and to eliminate obvious mistakes before publication. The better the magazine, the more work they put into the editorial cycle – this is one of the primary factors that attracted me to write for Windows NT Magazine (the predecessor of Windows IT Pro). To give you an insight, before Windows IT Pro publishes an article, it will be edited by a copy editor and reviewed at least once by a technical reviewer, who is responsible for the technical accuracy of the material. Content that isn’t quite on the money may also be looked at by a separate reviewer. In addition, article submissions that come into the magazine are often looked at by contributing editors to check that the level of content will be attractive to readers and is aligned with the kind of material that the magazine publishes. The point here is that good publishing teams take extraordinary care to ensure that compelling technical content appears in their magazine.
  • White papers are usually generated by technology vendors, often in an attempt to explain the finer details of how their products work. These documents are usually worth perusing as they contain all manner of detail that may not be available elsewhere. And while white papers do receive a great deal of editorial attention as they are drafted and prepared for publication, some are afflicted by the need to pay attention to the paymaster. In other words, if a company sponsors a white paper, the author may not possess full freedom to express their true opinion about the technology. Good companies allow authors the independence of their thoughts (often because the company’s technology is pretty good) whereas not-so-good companies put out white papers that are pretty dark and useful only when serving as bin liners.
  • Books represent the outer limit of my technical writing spectrum. These are usually the work of a single author (writing good technical books in a team is very difficult and often a recipe for ending friendships) and take many months to produce. My Exchange 2010 Inside Out book spans 400,000 words and took me 15 months to write. It is unsurpassed as a door stop or aid to sleep. Books require a huge amount of help from series editors, copy editors, technical reviewers, indexers, artists, and other professionals to get the job done. And while books allow authors the undoubted luxury of being able to cover topics in depth, the sad facts are that a) content starts to age as soon as you write it and will be outdated by service packs and new releases, and b) you’ll always leave out something that someone is interested in.

The spectrum of technical writing evolves over time. I’m sure that some would consider Twitter a viable medium, and new ways to communicate will evolve in the future. e-Books that can be quickly developed and revised to keep the content aligned with the current state of technology are definitely a promising platform for the future, one that may lead to the demise of printed books.

Then again, this is a blog post so it’s personal, could be full of mistakes, and is highly open to debate (see description above).

– Tony

Microsoft Exchange Server 2010 Inside Out is also available at Amazon.co.uk and in a Kindle edition. Other e-book formats for the book are available from the O’Reilly web site. Alternatively, to listen to over 18 hours (not including the labs) of detailed technical information about Exchange 2010 over three days, come along to the Exchange 2010 Maestro Seminars that will run in San Diego (May), London (June), and Greenwich, CT (October).


Exchange 2010, mailbox imports, and LocalSystem


Anything posted on the Internet has a reasonably long half-life, a fact that I was reminded of when I researched some information about the Windows LocalSystem account and discovered an article that I wrote for Windows IT Pro magazine about Exchange 2000.

In a nutshell, the article discussed a major change that Microsoft made in Exchange 2000 to replace the “service account” as the basis for running the set of Exchange services (such as the Information Store and Directory Services) that collectively form the product. The first generation of Exchange (versions 4.0 through 5.5) ran on Windows NT 3.51 and Windows NT 4.0 and didn’t use the Active Directory. It’s fair to say that these versions were not really designed to cope with the needs of major enterprises and reflected some of the thinking that found its way into PC LAN-type products at the time. When you installed Exchange, you created a service account, a highly privileged account that Exchange used to run its services. The service account was often called “ADMIN”. All manner of things could go wrong with this account that wreaked havoc on Exchange, including the inadvertent changing of the service account’s password, something that was guaranteed to bring everything to a crashing halt followed by some pretty pointed inquiries as to who changed the ****ing password!

Looking back, we now realize just what a fundamental transformation occurred in Exchange 2000 as Microsoft introduced many of the structures that still persist today and allow Exchange to cope with the needs of the largest enterprise, or indeed, very large hosted environments. Replacing Exchange’s own X.500-based Directory Services with Active Directory is often cited as the biggest change, but I think that moving to use LocalSystem as the basis for running the suite of Exchange services is possibly equally important. Just have a look at the set of services that run on an Exchange 2010 server today and reflect on the fact that all run under LocalSystem.
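
If you’re curious, a quick check (the server name here is hypothetical) is to ask WMI which account each Exchange service logs on as:

# List the Exchange services and the account they run under (StartName shows LocalSystem)
Get-WmiObject Win32_Service -ComputerName ExServer1 | Where-Object {$_.Name -like 'MSExchange*'} | Format-Table Name, StartName -AutoSize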

Why would I even think about LocalSystem at this point? After all, it’s been around since Exchange 2000 and any of the initial glitches have been eradicated over Exchange 2000, 2003, 2007, and 2010 and all the various service packs and fixes that Microsoft has released since 1999.

My interest was reignited by a discussion about the Microsoft Exchange Mailbox Replication Service (MRS). Exchange 2010 SP1 extended the functionality of MRS to include the ability to process mailbox import and export requests. These requests are created with the New-MailboxImportRequest and New-MailboxExportRequest cmdlets and these cmdlets either read from (import) or write to (export) PST files held in a network file share. For more details, see my Windows IT Pro article on the topic.
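
For illustration (the mailbox name and file share are invented), the requests look something like this:

# Import a PST into a mailbox and export another; MRS picks the requests up and does the work
New-MailboxImportRequest -Mailbox 'Tony Redmond' -FilePath '\\FileServer1\PSTImports\Tony.pst'
New-MailboxExportRequest -Mailbox 'Tony Redmond' -FilePath '\\FileServer1\PSTExports\Tony.pst'
# Check progress
Get-MailboxImportRequest | Get-MailboxImportRequestStatistics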

[Screen shot: Log-on properties of the Mailbox Replication Service]

As you can see from the screen shot, MRS runs under the context of the LocalSystem account. Getting back to our story, I have been pretty public about my dislike of PST files as I regard the file format to be tremendously insecure. Any MAPI client can open a PST file and even if you “protect” a PST with a password, anyone who wants to get into the file can do so easily with a password cracker utility. With this fact in mind, it is clear that the PSTs used for import and export operations must be protected against people who might browse network shares looking for interesting data. At the same time, the network share has to be accessible to MRS.

The advice given to administrators is that the network share should be locked down with read-write access granted to the Exchange Trusted Subsystem. This is usually sufficient to allow MRS to manipulate the PSTs – unless MRS is running on the same server that hosts the network share. In that case MRS runs under the context of LocalSystem, and if the network share is not accessible to LocalSystem, you’ll see a file access error when you attempt to create the new import or export request. The error message will be similar to this:

Unable to open PST file '\\ExServer1\Imports\Test1.pst'. Error details: Access to the path '\\ExServer1\Imports\Test1.pst' is denied.; Microsoft.Exchange.MailboxReplicationService.RemotePermanentException: Access to the path '\\ExServer1\Imports\Test1.pst' is denied.

The solution is to make sure that the network share is accessible by the SYSTEM account as this will allow MRS to open the files using its LocalSystem credentials.
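
Here is a hedged sketch of the fix, run on the server that hosts the share (the folder path and domain name are hypothetical):

# Grant the local SYSTEM account modify rights on the folder behind the share
icacls D:\PSTImports /grant "NT AUTHORITY\SYSTEM:(OI)(CI)M"
# The Exchange Trusted Subsystem group still needs access for remote MRS instances
icacls D:\PSTImports /grant "CONTOSO\Exchange Trusted Subsystem:(OI)(CI)M"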

I must admit that I didn’t meet this problem myself. It came about in a frustrating manner for a colleague who found that attempts to create some MRS import requests were successful while others failed. In figuring out what happened, it’s important to realize that MRS runs on every Client Access Server and that multiple MRS instances may operate within an Active Directory site. The failed MRS imports were those for the MRS instance running on the same server that hosted the network share while the successful imports were processed by an MRS instance running on a different server. You would never see this problem if you host PSTs in a network share on a dedicated file share server that didn’t run the Exchange 2010 CAS role, but you could in other circumstances – unless you make sure that LocalSystem has read-write access to the network share.

– Tony

Need some more information about Exchange 2010 SP1? If so, check out Microsoft Exchange Server 2010 Inside Out, also available at Amazon.co.uk. The book is also available in a Kindle edition. Other e-book formats for the book are available from the O’Reilly web site. Alternatively, come along to the Exchange 2010 Maestro Seminars that will run in San Diego (May), London (June), and Greenwich, CT (October).


Chariot derailed at Lansdowne Road


Seeing English white shirts coming out for a rugby international creates a special feeling for both opposition and spectators. Unless the game is played at Twickenham, the feeling is not affection. It’s not hatred either, even in Scotland when centuries of repression and Braveheart fever are at their height, but it’s definitely more of a steely determination to compete and put bodies on the line.

I first discovered this particular feeling about England rugby teams when I refereed an under-19 international in Glasgow in March 1997. Apart from being woken by the late, great Bill McLaren, who wanted to know all about my background so that he could inform listeners even more comprehensively during his TV match commentary, my special memory of the game was the way that the Scots suddenly added a yard of pace and a degree of aggression in all contact situations. As I recall, Scotland won the match… and Bill McLaren gave me some Hawick Balls, so it was a good day all round.

And so the Saturday of the last weekend of the 2011 Six Nations found us in Lansdowne Road in Dublin, aka the Aviva Stadium. This was England’s first visit to the newly-rebuilt stadium and they came seeking a Grand Slam and championship. I’d already seen England in action twice at Twickenham as TMO for the England v Italy and England v Scotland games and expected a calm, collected, and aggressive performance. Boy, I was wrong on all counts.

England didn’t play badly. They didn’t play at all and were lucky to get away with a 24-8 loss. Or rather, Ireland produced their best performance of the championship to deny England space and to compete magnificently in all the contact situations. English players were tackled by two or three Irish players, they never made ground at ruck and maul, and the scrum was solid. The Irish backs provided the cutting edge and demonstrated just how good a collective unit they can be. By comparison, the English backs were ponderous and laboured and generated just the one chance, which died a death when they passed the ball to Ireland. Not even the arrival of the celebrated Jonny Wilkinson could do anything to ignite any sort of rhythm in England’s ranks. The Irish backline, meanwhile, was sharp and created all manner of chances, and it was nice to see Brian O’Driscoll cap an excellent performance with a trademark one-handed pick-up of a low pass before scoring a record 25th championship try.

“Swing low, sweet chariot” is England’s rugby anthem. Lots of fans had come across to celebrate the Grand Slam and championship and while England ended up winning the Six Nations championship on points difference (France really screwed up against Italy last week – a win there with a reasonable point difference might have made the championship tighter), all their fans got in Dublin was the chance to go on an epic pub crawl to drown their sorrows. The English team looked quite happy with themselves when they received the trophy at the Four Seasons hotel in Dublin after France had beaten Wales in Paris, but I suspect that they’ll be less content over the next few days when the nature of today’s defeat sinks in and the natural euphoria surrounding the award of a trophy fades.

Now that the 2011 Six Nations is out of the way, the business end of the club season comes into focus with the quarter-finals of the Heineken and Amlin cups in three weeks’ time. I’ll be in Barcelona for the Perpignan vs. Toulon game at the Olympic stadium. It should be a tasty encounter and I’m looking forward to it already!

– Tony


Be careful with VM snapshots of Exchange 2010 servers


Those who are considering virtualizing production Exchange 2010 servers should read the fine print contained in the TechNet article “Exchange 2010 System Requirements”. In particular, this text is crucial:

“Some hypervisors include features for taking snapshots of virtual machines. Virtual machine snapshots capture the state of a virtual machine while it’s running. This feature enables you to take multiple snapshots of a virtual machine and then revert the virtual machine to any of the previous states by applying a snapshot to the virtual machine. However, virtual machine snapshots aren’t application aware, and using them can have unintended and unexpected consequences for a server application that maintains state data, such as Exchange. As a result, making virtual machine snapshots of an Exchange guest virtual machine isn’t supported.”

Eeek! In a nutshell, this means that all support bets are off if you take a snapshot of a running Exchange 2010 server with VMware or Hyper-V and then attempt to revert to the state of the server contained in the snapshot. Don’t expect sympathy from Microsoft support if you ring up to report that things don’t work so well after you’ve used a snapshot to go back to a known system configuration.

In practice, snapshots are fantastic in a lab environment as they allow you to deploy Exchange servers quickly and to go back to a known state if the need arises (and you can assume that errors that render a server unusable occur more often in a lab). In production, snapshots can work pretty well for Exchange 2010 servers that are largely stateless. If you have dedicated CAS or Hub Transport servers, you’ll probably not run into many difficulties if you need to revert to a snapshot of a previous configuration. You might screw up the transport dumpster a tad, but you won’t notice this unless you run into a more serious problem and require Exchange to replay some messages that should be in the dumpster… if the messages aren’t there, you might lose them unless they can be found in another dumpster.
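
If you want a feel for how much message data the transport dumpster should be holding on your behalf, a minimal check of the organization settings looks like this:

# Review the transport dumpster limits for the organization
Get-TransportConfig | Format-List MaxDumpsterSizePerDatabase, MaxDumpsterTime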

Things are far more problematic with mailbox servers, especially those that operate within a Database Availability Group (DAG). These servers are super-stateful and may be communicating in all manner of mysterious ways, including block-mode replication. Because this is the case, it’s extremely likely that reverting to a previous snapshot of a running and loaded mailbox server will be a sorrowful event. You might run into problems such as the database copies on the server being unrecognized within the DAG, being forced to reseed database copies, or even having the server fail to rejoin the domain or cluster for one reason or another (expired computer password, etc.). All in all, it’s a messy place to be.

Because of the potential for problems it’s best to avoid taking snapshots of running Exchange 2010 mailbox servers. For sure, you can take snapshots of inactive servers (for example, shut the computer down after installing a new service pack and then take a snapshot) but even so, don’t assume that these snapshots can be used to bring a reconstituted server back into production without encountering some glitches along the way.

Problems after reverting to a snapshot are not the only thing to be aware of with Exchange 2010 mailbox servers. You shouldn’t use features like vMotion to move DAG members to other hosts as this can also cause the DAG to have a severe headache. Microsoft’s perspective appears to be that customers should use the high availability features built into Exchange 2010 and not attempt to change the underlying platform in ways that the DAG will not be aware of. This post provides a good overview of the issues involved with vMotion.

My preference is to use physical computers for mailbox servers. I’ll cheerfully virtualize the rest, including such esoteric components as load balancers, but given the choice, I’ll always go with the comfort factor that a well-specified physical mailbox server delivers. This is largely a matter of personal choice allied to a suspicion that problems are easier to sort out when things go wrong on a physical box.

Everyone is rightly interested in virtualization because of its potential to increase the utilization of hardware. But the fine print has a nasty habit of catching people who let their enthusiasm run ahead of the capabilities of technology. All the more reason to conduct realistic operational tests of any new server product before bringing it into production so that you know how to deal with different kinds of server outages on both physical and virtual platforms.

– Tony

For more information about Exchange 2010 and the many cool features included in this release, see Microsoft Exchange Server 2010 Inside Out, also available at Amazon.co.uk. The book is also available in a Kindle edition.


Varying replication mode within a DAG


Log shipping is a well-known method for data replication between Exchange servers (and other computers). It made its first appearance in Exchange in the Local Continuous Replication (LCR) and Cluster Continuous Replication (CCR) features of Exchange 2007, was added to with Standby Continuous Replication (SCR) in Exchange 2007 SP1, and is used to distribute replicated data within a Database Availability Group (DAG) in Exchange 2010 and Exchange 2010 SP1.

The basic arrangement for log shipping is simple: a transaction log file is generated on a source server and is either pulled by the target server (Exchange 2007) or pushed to the servers that contain database copies (Exchange 2010). In either case, it is the Microsoft Exchange Replication Service that is responsible for transferring data. The difference between the two methods is accounted for by the fact that Exchange 2007 only supports a single database copy for LCR, CCR, or SCR while Exchange 2010 supports up to sixteen database copies within a DAG.

The problem with depending on files is that losing one can lead to data loss. An Exchange transaction log holds 1MB of data composed of interleaved transactional steps generated by client activity. A single complete transaction, for example the creation of a new mail message, is composed of several steps as the new item is initiated, populated with data, and finally committed. If you lose a transaction log, all of the transactions in the log are obviously unavailable, and while the affected items might be relatively unimportant (who will miss yet another auto-reply?), they might equally be ultra-critical, such as a message from the CEO about an important acquisition. It therefore makes sense to minimize the risk of losing any transactional data in whatever way is possible, which is the logic for the introduction of block-mode replication in Exchange 2010 SP1.

Shipping complete transaction logs around is referred to as file-mode replication. Exchange 2010 servers always commence replication in this mode. However, from SP1 onwards, DAG member servers are able to switch into block-mode replication if replication is proceeding smoothly within the DAG and no copy or replay queues are accumulating on the DAG members.

Block-mode replication means that the server that holds the active copy of a database will push data to the servers that hold the passive copies of the database as soon as data for a new transaction is written into the log buffer. The log buffer is an in-memory cache that holds current transaction data. After 1MB of data is accumulated, the log buffer is flushed to create a transaction log. Obviously, this process still continues as it’s critical to continue to capture transactions in a way that they can be replayed should servers crash and memory be erased.

Switching between modes is automatic and is managed by a component called the log copier that monitors the copy and replay queue lengths as transaction logs are generated. If queues start to build, the log copier will switch back into file-mode replication and remain in that state until conditions ease and the queues clear.

How do you know what’s happening on a server? The following PowerShell command interrogates the Windows Performance Monitor counters that are maintained by the MsExchangeRepl process.

Get-Counter -ComputerName ExServer1 -Counter "\MSExchange Replication(*)\Continuous replication - block mode Active"

Timestamp               CounterSamples
---------               --------------
3/16/2011 10:18:11 AM   \\exserver1\msexchange replication(db2)\continuous replication - block mode active : 0
                        \\exserver1\msexchange replication(db1)\continuous replication - block mode active : 0
                        \\exserver1\msexchange replication(db4)\continuous replication - block mode active : 0
                        \\exserver1\msexchange replication(db3)\continuous replication - block mode active : 0
                        \\exserver1\msexchange replication(_total)\continuous replication - block mode active : 0

We can see that a separate counter is maintained for each database on the server plus an overall counter. In this case, we can see that there are four databases (DB1, DB2, DB3, and DB4). The value of each counter is 0 (zero), so we know that this server is currently operating in file-mode replication for each of these databases. A value of 1 (one) indicates block-mode replication is active. Of course, you can also look at these counters through Performance Monitor, but that’s pretty boring as the values don’t change that often.

Another method is to use the Get-WMIObject cmdlet to interrogate the same data. In this example (a modified version of code taken from MSDN), we want to report any instance of a database on a specified server (ExServer1) where block-mode replication is currently active.

Get-WMIObject -ComputerName ExServer1 Win32_PerfRawData_MSExchangeReplication_MSExchangeReplication | Where-Object {$_.ContinuousReplicationBlockModeActive -eq "1"} | Where-Object {$_.Name -ne "_total"} | Format-Table Name, ContinuousReplicationBlockModeActive -AutoSize

Name ContinuousReplicationBlockModeActive
---- ------------------------------------
db2                                     1
db1                                     1
db4                                     1
db3                                     1

It is possible that you’d never see block-mode replication in action. Running two virtualized Exchange servers on a laptop, for example, is an exercise in slow disk I/O, and queues form rapidly during heavy activity such as mailbox moves or mailbox imports. The same might be true for stressed mailbox servers. In these circumstances Exchange will play safe and remain in file-mode replication. It’s also possible that block-mode replication will be possible to one server and not another, again because one of the servers is stressed and copy or replay queues have accumulated there.

The point about block-mode replication is that it enables data to be transferred from the active database to its passive copies much faster than if the DAG has to wait for complete transaction logs. In the case of heavily loaded servers that are generating multiple transaction logs every second, the difference might be relatively small in time as measured by humans, but every millisecond counts in a crash.

When data is transferred from the active server to a server holding a passive copy, it is stored in the log buffer on the receiving server and becomes part of the transaction stream that will be processed by that server. Another improvement in SP1 is that if a crash occurs during block-mode replication that prevents the contents of a complete transaction log from being received, the receiving server is able to close off the incomplete log and use its contents during the activation process to bring the selected database copy as close to up-to-date as possible.

All in all, this is very nice work and evidence of growing maturity in Exchange high availability technology.

– Tony

For more information about how things work within a DAG, see chapter 8 of Microsoft Exchange Server 2010 Inside Out, also available at Amazon.co.uk and in a Kindle edition. Other e-book formats for the book are available from the O’Reilly web site.


Communications failure in Twickenham


On March 13 I was back at Twickenham to be the TV Match Official (TMO) for the England v Scotland Six Nations game. This is the oldest international fixture and was first played in 1871, so there’s a bit of history and whenever the two teams come together you can be sure of a dogfight.

The game ended 22-16 for England but the scoreline disguises some minor panic for the communications technicians. It all started in the 59th minute when referee Romain Poite pulled up with a calf strain. He was replaced by assistant referee Jerome Garces, also of France, and Andrew Small took over as the second assistant referee. This was the first time that a referee had been replaced in the middle of a Six Nations international and, while the protocol for such a replacement is well understood and the switch was performed flawlessly, the communications equipment started to malfunction immediately afterwards. As normal, I was located in the TV Director’s outside broadcast unit and there was no communication between this unit and the referee. Other problems existed on the field as the assistant referees couldn’t hear the referee either.

Fortunately England decided to change four players the next time that the ball went out of play and the technicians were able to change the on-field radios and batteries. However, while this restored communications from the referee and between the referee and his assistants, he couldn’t hear me. Losing communications isn’t a real issue unless a decision has to be referred to the TMO.

Scotland duly went over the line in the 74th minute. There was some doubt about whether the ball was grounded correctly as it squirted away from the scorer, Max Evans, immediately afterwards. Jerome stopped play and referred the decision to the TMO. I could hear the question (“try or no try”), but couldn’t respond. All hell broke loose in the outside broadcast unit as the BBC attempted to restore the link, but in the meantime a decision to award the try was made – the only problem was how to tell the referee. The BBC director told the floor manager (the person responsible for running coverage from the sideline) to tell the referee, but Jerome didn’t know who this person was and quite correctly declined to accept a decision from him.

Before the game, Andrew Pearce, the number 5 official, and I had exchanged mobile phone numbers, so I called him to relay the decision. (All international and high-profile rugby matches have two replacement officials on the sideline; most of the time they control substitutions and temporary suspensions, or, as in this case, the number 4 goes on if an official is injured.) Jerome was happy to accept the information from Andrew and the try was awarded. While it was good to make the right decision, it was disappointing that we had to depend on a mobile phone – and I was lucky that we both used O2 as I’ve been told since that other phone providers don’t have a reliable signal around Twickenham.

Despite valiant efforts from Scotland, England closed the game out afterwards and now look forward to coming to Dublin on March 19 when they’ll attempt to win the Grand Slam. We’ll just have to see about that!

– Tony


Preparing for Spring Connections and first impressions of Office 365


Earlier this week Brian K. Winstead, the author of the Exchange and Outlook blog on http://www.windowsitpro.com, contacted me to ask about the keynote that I’ll be giving at the Spring 2011 Exchange Connections event at the JW Marriott Resort hotel in Orlando from March 27 to 30. The resulting conversation is available online.

I’ve said before that 2011 will be a year of migration. According to this blog post by Ian Hameroff of Exchange Product Management, Microsoft believes that 60% of the Exchange 2003/2007 installed base will upgrade this year. The choice facing many companies is whether to proceed with an on-premises or a hosted deployment, including the option to use Office 365, and this is one of the topics that I plan to talk about in Orlando. I’ve spoken to a lot of people about Office 365 and have had some exposure to its predecessor, the variant of Exchange Online that’s currently available as part of the mouthful called Microsoft Business Productivity Online Services (BPOS). Of course, this version is based on Exchange 2007 rather than Exchange 2010, which is what Office 365 uses, so it’s not the same. I was therefore keen to get hold of an Office 365 account and hoped that Microsoft would accept my application to be part of the Office 365 beta. Alas, this didn’t happen. To be fair, Microsoft was swamped with applications to join the beta. Even though I write about the technology as a contributing editor to Windows IT Pro magazine and am a current Exchange MVP, my application was duly dispatched to the wastebasket or filed under “maybe, some day, perhaps”.

In any case, InfoWorld columnist and fellow MVP J. Peter Bruzzese received access to Office 365 (he reports some of his recent musings about Office 365 here). I guess InfoWorld carries more weight with Microsoft PR than Windows IT Pro. So be it. Peter was kind enough to offer me an Office 365 account from his test domain and I accepted the offer with thanks. I’ve been using my 25GB Office 365 mailbox, complete with an online archive, for a week or so since. Not long enough to learn everything about a product that is still in beta, but certainly long enough to arrive at some early conclusions.

So here’s the thing about Office 365 – it is totally boring for anyone who’s been trained as an Exchange administrator. All the fun (???) of setting up and running servers is removed because Microsoft does it all for you and hides the interesting technical detail behind a boring web-based administration console that could be managed by my grandmother. And that’s the point of a utility email service. It has to be boring, and robust, and dependable, just like any other utility service. Think of electricity or water – would you want to have administrative control over their delivery to your house? The answer is a resounding “No” – all you need is a meter to tell you how much of the utility you’re using, which is roughly the equivalent of the Office 365 administration option that tells you how many users you have in your domain and what licences are assigned (or in Office 365 parlance, the “plans” used by each user; a plan dictates what functionality is available to the user). Apart from checking the meter from time to time (or ignoring it until a utility bill arrives), the only interaction you have with a utility is to connect a new device. You might be brave and change a plug for an electric device before you plug it into the socket, but that’s about it. The equivalent in an Office 365 world is to set up a new user, an operation that is much easier than rewiring a plug.

There is some new administration work to do in a co-existence scenario when part of the company uses on-premises Exchange servers and some users connect to Office 365. Federation and directory synchronization are two critical activities to master here. But I suspect that the administration effort will peak sharply at the time when users first migrate to Office 365 due to the need to establish high-fidelity connectivity between the on-premises and cloud environments and to move mailbox data.

Companies should not underestimate the effort required to migrate users or that required to ensure that the right networking and operational configurations are in place to support Office 365. This is especially so when users want to move large mailboxes from on-premises servers to cloud-based servers, unless of course your company possesses ultra-wide network connections to transport all the data across the Internet to Microsoft’s datacenters. Of course, asking users to clean out mailboxes before the mailboxes are moved is often an act of total futility as has been proved in previous migrations over many years. Users are simply too busy to do this kind of housekeeping and anyway, why do you need to do it when storage is cheap?

After the initial burst of activity (which could last several months for a large company), I suspect that the migration and interoperability workload will decline to allow administrators to concentrate on other more productive activities.

From a user perspective, Office 365 delivers very much the same Outlook Web App (OWA) experience as Exchange 2010 SP1 (including selectable themes). The only issue I encountered was with the Chrome browser – OWA didn’t seem to want to allow me to save or send a new message. If you elect not to use a web browser, then you can choose an RPC-over-HTTP (aka Outlook Anywhere) connection for Outlook. It took about ten seconds to configure a connection between Outlook 2010 and Office 365 and, as far as I could tell, the subsequent experience was exactly the same as when connected to an on-premises Exchange 2010 server, which is exactly what you’d expect.

I also connected my iPhone to my Office 365 mailbox. Curiously, I wasn’t able to use the inbuilt Microsoft Exchange-type connection and had to revert to using an IMAP setup. Fortunately Microsoft has published all the necessary settings via ECP and I was able to plug the settings for IMAP and SMTP into the iPhone and then synchronize.

Administrators get a web interface that’s a modified version of the on-premises Exchange 2010 ECP plus a separate Office 365 web interface that’s used to configure other non-Exchange options. The Office 365 version of ECP allows for options such as selecting the plan associated with a mailbox and displays some mailbox data, such as current mailbox size, that is accessed through EMC in an on-premises deployment. If you know ECP for Exchange 2010 you won’t find much different here, nor is there anything different about connecting a mobile device to an Office 365 mailbox.

Office 365 therefore delivers exactly what it says on the box: a utility email service. This will be exactly what many companies need. I can’t, for instance, see why a new start-up company would deploy on-premises servers unless it had a really good reason for doing so (such as being in the business of developing add-on software for Exchange). On the other hand, the utilitarian nature of Office 365 cannot deliver the flexibility or custom-built environment that is possible with an on-premises deployment, and I suspect that there are many large companies that will find Office 365 to be a compelling vision that they cannot use at this point. The situation is likely to change over time as company requirements evolve and Microsoft builds out Office 365 and its successor products so that they can respond to the needs voiced by customers.

One interesting question that I can’t find an answer to yet is how Microsoft plans to migrate BPOS accounts to Office 365 (“move mailbox” is not a complete answer!). As covered here, creating and using new Office 365 mailboxes is straightforward, but given the careful planning and enormous effort that companies dedicate to migrating from Exchange 2007 to Exchange 2010, a huge amount of work must be going on behind the scenes for Microsoft to prepare to move the millions of BPOS mailboxes they currently support. Like any migration, this has to be done with as little disruption to end users as possible, and that’s where careful planning and solid execution pay big benefits. Some insight into the technical details of this transition would be compelling information. Maybe I shouldn’t care, because when you’re using a utility you don’t have any exposure to what goes on behind the scenes and couldn’t care less as long as the service stays running. But the technologist in me just wonders – and this is the first time that Microsoft has had to cope with the migration challenge for millions of mailboxes!

All of this is interesting stuff and I look forward to debating the topic of Office 365 amongst others with those who attend Spring Connections. And I am sure that we will have similar debates at the Exchange 2010 Maestro events later on this year!

Until then,

– Tony


Oooh… that’s some copy queue length!


The Database Availability Group (DAG) is probably the best Exchange 2010 feature from both an impact and a technology perspective. Microsoft took a really good decision by not limiting the DAG to the enterprise edition of Exchange, even if the standard edition only allows you to mount five databases on a server (counting both active and passive copies) and you still require Windows Server 2008 Enterprise edition (because of the dependency on Windows Failover Clustering). The reason why it’s such a good decision is that it allows DAGs to be deployed to protect Exchange 2010 deployments from the very smallest to the very largest, which can’t be a bad thing.

Of course, another good thing about the DAG is that it hides much of the complexity involved behind the Exchange Management Console (EMC). Sure, you can plunge into mano-a-mano combat with the finer points of the DAG through the command line but there’s no good reason to do this unless you really must tune something that cannot be reached through EMC, such as sorting out networks automatically generated when a DAG is formed – the rule is that you can only have one MAPI network while several replication networks are supported.

MAPI networks are used to connect DAG member servers to network services such as Active Directory and for connections between a mailbox server and the CAS. These networks are registered in DNS, use the default gateway, and are enabled for Microsoft Networks file and print sharing. Replication networks are used to transfer transaction logs between servers; they are not registered in DNS, do not use the default gateway, and have Microsoft Networks file and print sharing disabled. A server can survive with just the MAPI network (a single NIC) as Exchange will route all traffic across this network, but if the MAPI network fails it forces a server failover within the DAG.

The Add-DatabaseAvailabilityGroupServer cmdlet (run manually or by the EMC Create New DAG wizard) attempts to make things easy for administrators by scanning for network cards known to the cluster service when a server joins a DAG. (Note that if you add a NIC to a server after it’s in a DAG or change the subnets used by the DAG, you can force Exchange to discover the NIC by running the Set-DatabaseAvailabilityGroup cmdlet with the -DiscoverNetworks parameter.) For example:

Set-DatabaseAvailabilityGroup -Identity DAG1 -DiscoverNetworks

During enumeration, each NIC is assigned its own DAG network entry. If a server has multiple NICs, it’s possible that the enumeration and creation of DAG networks will end up with some entries that have a single endpoint, and that just won’t work. The DAG will continue to function because Exchange contains fall-back logic that will instruct the DAG to use the only network that’s registered in DNS, meaning that all traffic will be routed across the MAPI network. However, it’s not good to persist with this situation and an administrator will have to sort it out by collapsing the networks so that the various subnets are allocated correctly to the right DAG networks and the superfluous networks are removed. This kind of complex one-off work is best done with EMS as you couldn’t expect the EMC UI to cater for every possible situation that might arise. See MSDN for more information about the creation of DAG networks.
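
As a hedged sketch of what that EMS clean-up might look like (the DAG, network, and subnet values are invented for illustration):

# Collapse both replication subnets into a single DAG network
Set-DatabaseAvailabilityGroupNetwork -Identity 'DAG1\DAGNetwork02' -Subnets 10.0.1.0/24,10.0.2.0/24 -ReplicationEnabled:$true
# Remove the superfluous network left over from enumeration
Remove-DatabaseAvailabilityGroupNetwork -Identity 'DAG1\DAGNetwork03'
# Check the end result
Get-DatabaseAvailabilityGroupNetwork -Identity DAG1 | Format-List Name, Subnets, ReplicationEnabled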

In any case, this post is not about the finer points of DAG networks. In fact, I just wanted to share an unusual hiccup that EMC displayed the other day.

[Screen shot: EMC displays a huge copy queue length for a database]

Even a hugely active database is hardly likely to create so many transaction logs that it would accumulate such a massive copy queue length (9,223,372,038,654,775 is the figure shown in the screen shot). Of course, this is the result of a transient glitch that EMC took seriously – a quick refresh exposed the true state of affairs and the copy queue length came up with the expected zero value. But isn’t it nice to know that Exchange 2010 is designed to deal with such a long queue – or at least that EMC is ready to display the good news about such a queue to an administrator!
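
If you prefer to double-check from EMS rather than trust the EMC display, something like this (the database and server names are hypothetical) reports the real queue lengths:

# Report the copy and replay queues for a database copy
Get-MailboxDatabaseCopyStatus -Identity 'DB1\ExServer2' | Format-List Name, Status, CopyQueueLength, ReplayQueueLength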

Where would we be without a glitch to make the day go by faster?

– Tony
