The meaning of events 4113 and 4114 to Exchange 2010 SP1


One of the less-discussed updates in Exchange 2010 SP1 is the addition of database redundancy monitoring by the Microsoft Exchange Replication service. Monitoring applies to databases in a Database Availability Group (DAG) and is intended to ensure that each replicated database has sufficient healthy copies to make it redundant. In this context, sufficient means at least two copies, and a healthy copy is one that is online and has no problems copying or replaying transaction logs.
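To make the threshold concrete, here is a rough sketch of the "at least two healthy copies" rule. It is not the logic of CheckDatabaseRedundancy.ps1. The function name `Test-DatabaseRedundant`, the sample copy names, and the assumption that a status of `Mounted` or `Healthy` is what counts as a healthy copy are all mine, used only for illustration.

```powershell
# Simplified model of the rule described above, NOT the shipped script's logic.
# Assumption: a copy is "healthy" when its status is Mounted (active copy)
# or Healthy (passive copy that is copying and replaying logs normally).
function Test-DatabaseRedundant {
    param(
        [Parameter(Mandatory=$true)] [object[]] $Copies,
        [int] $MinimumHealthyCopies = 2
    )
    $healthy = @($Copies | Where-Object { @('Mounted', 'Healthy') -contains $_.Status })
    return ($healthy.Count -ge $MinimumHealthyCopies)
}

# Hypothetical sample data standing in for real copy status output.
$oneCopy = @(
    New-Object PSObject -Property @{ Name = 'DB2\EXSERVER1'; Status = 'Mounted' }
)
$twoCopies = @(
    New-Object PSObject -Property @{ Name = 'DB2\EXSERVER1'; Status = 'Mounted' }
    New-Object PSObject -Property @{ Name = 'DB2\EXSERVER2'; Status = 'Healthy' }
)
$oneFailed = @(
    New-Object PSObject -Property @{ Name = 'DB2\EXSERVER1'; Status = 'Mounted' }
    New-Object PSObject -Property @{ Name = 'DB2\EXSERVER2'; Status = 'Failed' }
)

Test-DatabaseRedundant -Copies $oneCopy     # False: only the active copy exists
Test-DatabaseRedundant -Copies $twoCopies   # True: two healthy copies
Test-DatabaseRedundant -Copies $oneFailed   # False: second copy is not healthy
```

On a real server the input would more likely come from `Get-MailboxDatabaseCopyStatus`, whose output includes a `Status` property along with copy and replay queue lengths.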

Exchange 2010 SP1 contains a script called CheckDatabaseRedundancy.ps1 in the default \Scripts folder. You can run the script interactively to check a database and it is also possible to incorporate the script into Microsoft SCOM monitoring. See the Exchange 2010 SP1 CHM for details.

What’s interesting to me is that the same script is run hourly by the Microsoft Exchange Replication service to perform a redundancy health check for replicated databases. If Exchange finds that a replicated database doesn’t have sufficient healthy copies, the Microsoft Exchange Replication service (MSExchangeRepl) logs event 4113 as shown below.

Event 4113 logged for a non-redundant database

In this instance I had a database (DB2) that had just been added to the DAG, and no database copies apart from the active one had yet been generated. The information captured in the event reports that only one copy exists, which is what I expected. The Replication service will continue to check and log event 4113 for the database every twenty minutes until it reaches a state where sufficient redundancy exists to accommodate an outage and allow a failed copy of the database to fail over to a healthy copy.
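If you would rather not click through Event Viewer, something along these lines should list the recent redundancy events. It is a sketch and assumes that the events land in the Application log under the MSExchangeRepl source; adjust the filter if your server logs them elsewhere.

```powershell
# Filter for the two redundancy events.
# Assumption: they are written to the Application log by the MSExchangeRepl source.
$filter = @{
    LogName      = 'Application'
    ProviderName = 'MSExchangeRepl'
    Id           = 4113, 4114
}

# Get-WinEvent only exists on Windows, so guard the call.
if (Get-Command Get-WinEvent -ErrorAction SilentlyContinue) {
    Get-WinEvent -FilterHashtable $filter -MaxEvents 20 -ErrorAction SilentlyContinue |
        Sort-Object TimeCreated |
        Select-Object TimeCreated, Id, LevelDisplayName
}
```

Run on the mailbox server, this gives a quick view of whether a database is flipping between 4113 and 4114 over time.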

Event 4114 logged for a database deemed to be redundant

Event 4114 is logged for replicated databases that pass the redundancy health check. As you can see from the screenshot above, database DB2 has passed the health check because two healthy copies are available. If you scroll down through the event details, you can see other information such as the replay and copy queue lengths for the copies.

Of course, you can debate the threshold that Microsoft has selected for database redundancy. In production environments, two copies is probably not sufficient to ensure the kind of redundancy that many companies demand, but it is enough to ensure that the DAG can cope with an outage that affects the disk that holds the database (or its transaction logs) or the server on which the database is mounted. As such, two copies is redundant enough for the purpose of illustration and initial deployment if not for long-term protection.

One interesting point is that Exchange 2010 uses the Windows 2008 Task Scheduler to run CheckDatabaseRedundancy.ps1 to check database copies. Exchange 2010 SP1 therefore introduced a dependency on the Task Scheduler: if this service is disabled on a server, you won’t be able to install the mailbox role. Not many people disable the Task Scheduler, so the problem isn’t encountered often, but there are instances where overly zealous Windows build engineers disable anything and everything before handing a server over to the Exchange deployment team, who then see the problem illustrated below. The issue is obvious (“The Task Scheduler Service is not running”), but you might just wonder why this is a problem for Exchange. Now you know… To be fair to the Microsoft engineers, they know about this issue and are working out how to make it go away, probably by implementing the dependency check as part of adding a mailbox server to a DAG, which seems like the right place to test for a service that has to be running to validate database redundancy.
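A quick pre-flight check before running setup might look like the sketch below. The helper name `Test-ServiceRunning` is mine; `Schedule` is the Windows service name for the Task Scheduler. This only reports whether the service is running and does not reproduce whatever test Exchange setup performs.

```powershell
# Hypothetical helper: true only when a service object exists and is running.
function Test-ServiceRunning {
    param($Service)
    return (($null -ne $Service) -and ($Service.Status -eq 'Running'))
}

# Get-Service is only available on Windows, so guard the call.
if (Get-Command Get-Service -ErrorAction SilentlyContinue) {
    # 'Schedule' is the service name of the Windows Task Scheduler.
    $scheduler = Get-Service -Name 'Schedule' -ErrorAction SilentlyContinue
    if (Test-ServiceRunning $scheduler) {
        'Task Scheduler is running.'
    } else {
        'Task Scheduler is missing or not running; expect the mailbox role install to fail.'
    }
}
```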

Whoops - Mailbox Role won't install because the Task Scheduler is disabled

In any case, the point is that enabling a redundancy check for database copies is a nice example of proactive monitoring that has found its way into Exchange 2010 SP1, something that may just make the life of administrators a little easier, especially as they tease out the complexities of DAGs under the stresses and strains of real-life production environments.

– Tony

Find out more interesting details about Exchange 2010 SP1 with a copy of my Microsoft Exchange Server 2010 Inside Out book and continue to learn more about the mysteries of the Exchange 2010 Information Store!


About Tony Redmond

Lead author for the Office 365 for IT Pros eBook and writer about all aspects of the Office 365 ecosystem.

7 Responses to The meaning of events 4113 and 4114 to Exchange 2010 SP1

  1. elgibaly says:

    You can check my post and see my situation:

    http://elgibaly.wordpress.com/2010/11/19/exchange-2010-database-redundancy-health-check-failed/

    Please let me know what you think.

    Mahmoud

  2. Ken Merrigan says:

    Last week I renamed the first mail database created on each of our two mailbox servers (which contain only the arbitration mailboxes) from the assigned “Mailbox Database” random number to “MBDBEXCH# Do Not Use”. The purpose was to stop our admins from adding user mailboxes to these databases. Since I renamed them I receive Event ID 4113 for the databases on each mailbox server every 15 – 30 minutes. We do not plan on adding these databases to our DAG and want them to exist as single copies containing only the arbitration mailboxes on local storage. Is there any way to get rid of these messages? Thank you.

  3. Tony Redmond says:

    Hi Ken,

    I think this is by design and I don’t know of any way to get rid of the messages. Exchange is nagging you to tell you that your databases are not redundant. It seems like the creation of a DAG within a site causes this to happen… that’s my theory, not based on absolute fact.

    TR

  4. Ken Merrigan says:

    Thank you for the response, Tony. I guess I will have to live with it. What is interesting about this, as I mentioned above, is that these messages did not start until I changed the name of the database. We installed this environment in mid-November, and besides the two mailbox databases created by Exchange during the MB Server install, we added two MB databases for testing, created a DAG, and added the two test databases for replication. The environment ran this way for nearly three weeks without a single Event ID 4113 message. When I renamed the original databases on December 2, the messages started. Anyway, just an FYI. Thanks for responding, and I enjoy reading your informative articles in Windows IT Pro Magazine and on other websites.

    KM

  5. Alberto says:

    Thanks Tony, this explains the error.

  6. David Ray says:

    Ken (or more likely other interested folks as nearly 2 yrs have passed),
    The reason you started to get those 4113 errors after you renamed the database is that the CheckDatabaseRedundancy.ps1 script is configured to ignore any database named “Mailbox Database ” followed by 10 digits. Line 25 (of the version I am looking at from SP2) sets a parameter:
    $SkipDatabasesRegex = "^Mailbox Database \d{10}$",
    That is a regular expression string that matches “Mailbox Database ” and 10 digits with nothing else at the beginning or end of the name.
    On two of my DAG servers I have a database in which I save mailboxes of disabled users. I do not have these databases replicated. I got tired of always having 4113 errors warning of the impending doom of these databases. The database names start with MDB2 followed by more digits and all the other databases start with MDB1 followed by more digits. So I modified the string above like so:
    $SkipDatabasesRegex = "^Mailbox Database \d{10}$|^MDB2",
    This now matches on the original string OR any string starting with MDB2. The script uses this later to filter the list of databases being tested. Now I don’t get 4113 on these non-replicated databases anymore.

    /David

    • STGdb says:

      @David Ray

      Thanks – I have a setup where I don’t replicate all of my DBs either and that alert was getting annoying. Your post led me in the right direction. With a little modification of your code it fixed the issue.
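The regular expression change David Ray describes in the comments can be tried in isolation before anyone edits the script. The sketch below only exercises the two patterns quoted in his comment against some made-up database names; it does not call CheckDatabaseRedundancy.ps1, and the names in `$names` are hypothetical.

```powershell
# The two patterns quoted in the comment above.
$original = '^Mailbox Database \d{10}$'
$modified = '^Mailbox Database \d{10}$|^MDB2'

# Hypothetical database names for illustration.
$names = @(
    'Mailbox Database 1234567890',   # default-style name: skipped by both patterns
    'MBDBEXCH1 Do Not Use',          # renamed default database: skipped by neither
    'MDB201',                        # skipped only by the modified pattern
    'MDB101'                         # skipped by neither
)

# -match is case-insensitive by default in PowerShell.
$skippedByOriginal = @($names | Where-Object { $_ -match $original })
$skippedByModified = @($names | Where-Object { $_ -match $modified })

'Skipped by original pattern: ' + ($skippedByOriginal -join ', ')
'Skipped by modified pattern: ' + ($skippedByModified -join ', ')
```

Because `|` has the lowest precedence in a regular expression, the modified pattern reads as "the whole default name, or anything starting with MDB2". It also shows why renaming a default database away from the “Mailbox Database” plus ten digits form makes it visible to the check.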
