Datacenter Activation Coordination (DAC) mode is a property setting for a database availability group (DAG). DAC mode is specifically designed for DAGs with three or more members that are extended to two Active Directory sites. DAC mode is disabled by default and should only be enabled for DAGs with three or more DAG members that have been deployed in a multi-datacenter configuration. DAC mode shouldn’t be enabled for:

  • 2 member DAGs where each member is in a different Active Directory site
  • 2-16 member DAGs where all members are in the same Active Directory site

Confusion Over the DAG Scenario 

Consider a scenario where the first datacenter contains two DAG members and the witness server, and the second datacenter contains two other DAG members. Suppose the first datacenter loses power and you activate the DAG in the second datacenter (for example, by activating the alternate file share witness in the second datacenter). If the first datacenter is then restored without network connectivity to the second datacenter, the DAG may enter a split brain syndrome.

Answer

After reading the above paragraph I went into a mode where I stopped thinking and just kept saying: WHAT THE HELL IS HAPPENING WITH THE DAG, and what is this alternate FSW?

Thanks to a guy in MCS (Kashif Awan) who helped me understand this issue.

First, let us understand the problem.

I have 2 DAG nodes in Site A and 2 DAG nodes in Site B. As this is an even number of nodes, we need a file share witness (FSW). So I created an FSW in Site A. So far so good.

Note: With four nodes plus the FSW there are five votes, so the cluster needs a majority of 3 votes to stay up. Site A holds 3 of them (two nodes and the FSW) and Site B holds only 2, so if Site A goes down, my cluster is down and out because the majority is gone.
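The vote arithmetic behind this note can be sketched in a few lines of PowerShell. The function name Get-QuorumMajority is made up for illustration (it is not an Exchange or failover cluster cmdlet), and the sketch assumes every node and the witness carries exactly one vote.

```powershell
# Hypothetical helper, not a built-in cmdlet: computes how many votes
# a cluster needs to keep quorum, assuming one vote per node and one
# extra vote for the file share witness.
function Get-QuorumMajority {
    param(
        [int]$NodeCount,
        [bool]$HasWitness
    )
    # Total votes = nodes plus the witness (if there is one).
    $totalVotes = $NodeCount + [int]$HasWitness
    # A majority is strictly more than half of all votes.
    return [int][math]::Floor([double]$totalVotes / 2) + 1
}

# Two nodes in Site A + two nodes in Site B + the FSW in Site A
Get-QuorumMajority -NodeCount 4 -HasWitness $true
```

For the four-node DAG with a witness this returns 3. With Site A gone, Site B can only contribute 2 votes, which is why the cluster stops.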

So let us now assume Site A goes down. Our majority is gone, and thus my cluster is down as well. (Ugh, no email means no work, and you lose your job as well :p)

So make sure you have a plan ahead of time: in case the whole site goes down, you should already have picked a server in Site B and created a shared folder on it beforehand (the alternate FSW). Now you simply need to point the DAG in Site B to the alternate FSW, and your cluster will be up and running. [Trust me, it isn't that easy, lols]
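As a sketch of that advance planning, the alternate witness can be recorded on the DAG object before any disaster happens. DAG2 is the DAG name used in the example later in this post; the server name HUB-B1 and the directory C:\DAG2-AltFSW are placeholders I made up, so substitute your own.

```powershell
# Placeholder names: HUB-B1 (a server in Site B) and C:\DAG2-AltFSW
# are assumptions for illustration only.
Set-DatabaseAvailabilityGroup -Identity DAG2 `
    -AlternateWitnessServer HUB-B1 `
    -AlternateWitnessDirectory C:\DAG2-AltFSW

# Confirm what the DAG object now records for both witnesses.
Get-DatabaseAvailabilityGroup -Identity DAG2 |
    Format-List Name, WitnessServer, AlternateWitnessServer, AlternateWitnessDirectory
```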

Here is where the tricky part comes in. You call 911 for help :p, take out your backup tapes and start restoring, and you manage to get Site A up in 2 days. In the meantime, Site B was receiving all the emails. Now, if we bring Site A up while the WAN LINK IS STILL DOWN, the problem would be

SPLIT BRAIN syndrome.

The DAG servers in Site A think they still have an FSW in Site A, and as soon as they are restored they try to mount the databases, because they have the majority in their LOCAL site, WHILE the servers in Site B are already up and running and have all the new emails. (This is a problem.)

Let us understand DACP (the Datacenter Activation Coordination Protocol, which runs behind DAC)

DACP was created to address this issue. Active Manager stores a bit in memory (either a 0 or a 1) that tells the DAG whether it’s allowed to mount local databases that are assigned as active on the server. When a DAG is running in DAC mode (which should be any multi-datacenter DAG with three or more members), each time Active Manager starts up the bit is set to 0, meaning it isn’t allowed to mount databases. Because it’s in DAC mode, the server now contacts all the other servers in the DAG to ask:

Does any server in the DAG have this bit set to 1? If one does, the server that is starting up is allowed to mount its databases, and it sets its own bit to 1 as well. But in Site A none of the servers have their bit set to 1, because all of them have just recovered from a failure and none has mounted its databases. With the WAN link down they cannot reach Site B either, so they never find a server with the bit set to 1, and the databases in Site A stay dismounted.
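The check described above can be pictured with a toy model. This is only my simplified illustration of the rule as explained in this post, not Exchange's actual implementation; the function name Test-DacpMountAllowed and its parameter are invented.

```powershell
# Toy model of the DACP decision (hypothetical function, not part of
# Exchange): a server that has just started holds bit 0 and may mount
# only if at least one DAG member it can reach reports bit 1.
function Test-DacpMountAllowed {
    param(
        # DACP bits reported by the DAG members this server can reach.
        [int[]]$ReachablePeerBits
    )
    return ($ReachablePeerBits -contains 1)
}

# A restored Site A server, WAN link down: it reaches only its Site A
# neighbour, which also just restarted and therefore holds bit 0.
Test-DacpMountAllowed -ReachablePeerBits @(0)

# Same server once the WAN link is back: the Site B members, which have
# databases mounted, answer with bit 1.
Test-DacpMountAllowed -ReachablePeerBits @(0, 1, 1)
```

The first call returns False (no mount, so no split brain) and the second returns True.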

Thus you can clearly see how beneficial it is to enable DAC mode for a DAG that spans multiple datacenters.

Set-DatabaseAvailabilityGroup -Identity DAG2 -DatacenterActivationMode DagOnly

In the preceding example, a DAG named DAG2, which is a cross-site DAG with three or more members, is enabled for DAC mode.
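To double-check the result, the setting can be read back from the DAG object. A small sketch, assuming the same DAG name DAG2:

```powershell
# Read the DAC setting back; after the change above it should show
# DagOnly instead of the default value Off.
Get-DatabaseAvailabilityGroup -Identity DAG2 |
    Format-List Name, DatacenterActivationMode
```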

Now aren’t you saying: come on, Fazal, how do I save my job and get things up ASAP?

Calm down, here are the steps :)

Before we start, we should understand the following.

In a high availability configuration, automatic recovery is initiated by the system, and the failure typically leaves the messaging system in a fully functional state. By contrast, a datacenter failure is considered to be a disaster recovery event, and recovery must be manually performed and completed for the client service to be restored, and for the outage to end. The process you perform is referred to as a datacenter switchover. As with many disaster recovery scenarios, prior planning and preparation for a datacenter switchover can simplify the recovery process and reduce the duration of the outage.

There are four basic steps that you complete to perform a datacenter switchover, after making the initial decision to activate the second datacenter:

1. Terminate a partially running datacenter: This step involves terminating Mailbox and Unified Messaging services in the primary datacenter, if any services are still running.

2. Validate and confirm the prerequisites for the second datacenter: This step can be performed in parallel with step 1 because validation of the health of the infrastructure dependencies in the second datacenter is largely independent of the first datacenter services.

3. Activate the Mailbox servers: This step begins the process of activating the second datacenter. This step can be performed in parallel with step 4 because the Microsoft Exchange services can handle database outages and recover. Activating the Mailbox servers involves a process of marking the failed servers from the primary datacenter as unavailable followed by activation of the servers in the second datacenter. The activation process for Mailbox servers depends on whether the DAG is in datacenter activation coordination (DAC) mode.

Use the Stop-DatabaseAvailabilityGroup cmdlet to stop a member of a database availability group (DAG) or to stop an entire Active Directory site. Stop-DatabaseAvailabilityGroup is used during a datacenter switchover. This cmdlet is used to mark the members of the DAG in a failed datacenter as stopped.

Note: The Stop-DatabaseAvailabilityGroup cmdlet can be run against a DAG only when the DAG is configured with a DatacenterActivationMode value of DagOnly.

Use the Restore-DatabaseAvailabilityGroup cmdlet to activate database availability group (DAG) member servers in a secondary or standby datacenter. This process is typically performed after the failure or deactivation of the active DAG member servers in a primary production datacenter.

The Restore-DatabaseAvailabilityGroup cmdlet performs several operations that affect the structure and membership of the DAG. This task will:

  • Force quorum for the DAG to enable the surviving members of the DAG to start and provide service;
  • Change the DAG witness from the primary witness server to an alternate witness server; and
  • Remove any failed members from the DAG.

4. Activate the other server roles: This involves using the URL mapping information and the Domain Name System (DNS) change methodology to perform all required DNS updates.

Users should have access to services after steps 3 and 4 are completed.

Now let's look at things in detail.

Let us first look at the datacenter that is down and out.

Two scenarios could occur here.

1) What if only some servers are down in Site A and you need to perform the datacenter switchover?

Then you need to set the DAG state to Stopped on the remaining servers. Stopped is a state of Active Manager that prevents databases from mounting, and Active Manager on each server in the failed datacenter is put into this state by using the Stop-DatabaseAvailabilityGroup cmdlet.

2) What if the whole site goes down and no cmdlets can be run on its servers?

In this scenario:

The second datacenter must now be updated to represent which primary datacenter servers are stopped. This is done by running the same Stop-DatabaseAvailabilityGroup command with the ConfigurationOnly parameter, using the same ActiveDirectorySite parameter and specifying the name of the Active Directory site in the failed primary datacenter. The purpose of this step is to inform the servers in the second datacenter about which Mailbox servers are available to use when restoring service.
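The two scenarios could look roughly like this in the Exchange Management Shell. The Active Directory site name SiteA is a placeholder for the failed primary site, and DAG2 is the DAG from the earlier example.

```powershell
# Scenario 1: some Site A servers still respond. Mark the members in
# the failed site as stopped so Active Manager will not mount there.
Stop-DatabaseAvailabilityGroup -Identity DAG2 -ActiveDirectorySite "SiteA"

# Scenario 2: Site A is completely unreachable. Run this from a server
# in the second datacenter; ConfigurationOnly updates the configuration
# without trying to contact the failed servers.
Stop-DatabaseAvailabilityGroup -Identity DAG2 -ActiveDirectorySite "SiteA" -ConfigurationOnly

# Check which members the DAG now considers started and stopped.
Get-DatabaseAvailabilityGroup -Identity DAG2 -Status |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```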

When the DAG is in DAC mode, the steps to complete activation of the mailbox servers in the second datacenter are as follows:

  1. The Cluster service must be stopped on each DAG member in the second datacenter. You can use the Stop-Service cmdlet to stop the service (for example, Stop-Service ClusSvc), or use net stop clussvc from an elevated command prompt.
  2. The Mailbox servers in the standby datacenter are then activated by using the Restore-DatabaseAvailabilityGroup cmdlet. The Active Directory site of the standby datacenter is passed to the Restore-DatabaseAvailabilityGroup cmdlet to identify which servers to use to restore service. If this command succeeds, the quorum criteria are shrunk to the servers in the standby datacenter. If the number of servers in that datacenter is an even number, the DAG will switch to using the alternate witness server as identified by the setting on the DAG object.
  3. The databases can now be activated. Depending on the specific configuration used by the organization, this may not be automatic. If the servers in the standby datacenter have an activation blocked setting, the system won’t do an automatic failover from the primary datacenter to the standby datacenter of any database. If no failover restrictions are present for any of the database copies in the standby datacenter, the system will activate copies in the second datacenter assuming they are healthy. If databases are configured with an activation blocked setting that requires explicit manual action, there are two choices for action:
    1. Clear the setting that blocks activation. This will make the system return to its default behavior, which is to activate any available copy.
    2. Leave the setting unchanged and use the Move-ActiveMailboxDatabase cmdlet to complete the database activation in the second datacenter. To complete this step using the Move-ActiveMailboxDatabase cmdlet when activation blocked is set, you must explicitly identify the target of the move.
  4. The last step is to review all error and warning messages from the tasks. Any indicated warnings should be followed up and corrected. The task design model for these commands is to only fail if they can’t achieve the fundamental goal of their design. For example, the Restore-DatabaseAvailabilityGroup cmdlet will fail if it can’t shrink the quorum of the DAG to allow a server in the second datacenter to be restarted for servicing without causing a quorum outage. However, each task’s output is also used to identify the issues that require administrator follow-up. You’re strongly encouraged to save all task output and review it for follow-up actions
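Put together, the steps above might look something like the following sketch. The names are all placeholders (site SiteB, server MBX-B1, database DB1, log path C:\Logs), and the "clear the block" line assumes activation was blocked at the server level; a block set on an individual database copy would be cleared differently.

```powershell
# Step 4 preparation: record all task output for later review.
Start-Transcript -Path C:\Logs\dag2-switchover.txt

# Step 1: stop the Cluster service. Run this locally on each DAG
# member in the standby datacenter.
Stop-Service ClusSvc

# Step 2: activate the standby datacenter's Mailbox servers.
Restore-DatabaseAvailabilityGroup -Identity DAG2 -ActiveDirectorySite "SiteB"

# Step 3, choice 1: clear a server-level activation block ...
Set-MailboxServer -Identity MBX-B1 -DatabaseCopyAutoActivationPolicy Unrestricted

# Step 3, choice 2: ... or leave the block in place and activate the
# copy explicitly, naming the target server.
Move-ActiveMailboxDatabase -Identity DB1 -ActivateOnServer MBX-B1

# See where each copy ended up and whether it is healthy.
Get-MailboxDatabaseCopyStatus -Server MBX-B1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength

Stop-Transcript
```

In practice you would pick only one of the two choices in step 3, depending on whether you want the system to go back to activating copies automatically.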

So far so good. Now let us assume that our primary datacenter is back up and running.

1. As part of the datacenter switchover process, the Mailbox servers in the primary datacenter were put into a stopped state, so we now need to put them back into a started state. (If the DAG is in DAC mode, you can reincorporate the DAG members in the primary site by using the Start-DatabaseAvailabilityGroup cmdlet.)

2. After the Mailbox servers in the primary datacenter have been incorporated into the DAG, they will need some time to synchronize their database copies. Depending on the nature of the failure, the length of the outage, and actions taken by an administrator during the outage, this may require reseeding the database copies.

3. After a majority of the databases are in a healthy state in the primary datacenter, the failback outage can be scheduled. The failback consists of the following steps:

i) The DAG was configured to use an alternate witness server. The DAG must be reconfigured to use a witness server in the primary datacenter (by using the Set-DatabaseAvailabilityGroup cmdlet).

ii) The databases being reactivated in the primary datacenter should be dismounted in the second datacenter.

iii) After the databases have been dismounted, the Client Access server URLs should be moved from the second datacenter to the primary datacenter.

iv) Because each database in the primary datacenter is in a healthy state, it can be activated in the primary datacenter by performing database switchovers.

v) After each database is moved to the primary datacenter, it can be mounted.
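A rough sketch of the failback for a single database is shown below. Every name in it is a placeholder of mine (site SiteA, servers MBX-A1 and HUB-A1, database DB1, directory C:\DAG2-FSW), and the reseed lines are only needed if a copy cannot catch up on its own.

```powershell
# Step 1: bring the primary datacenter's members back into the DAG.
Start-DatabaseAvailabilityGroup -Identity DAG2 -ActiveDirectorySite "SiteA"

# Step 2: watch the copies resynchronize.
Get-MailboxDatabaseCopyStatus -Server MBX-A1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength

# Step 2, only if needed: reseed a copy that cannot catch up.
Suspend-MailboxDatabaseCopy -Identity "DB1\MBX-A1"
Update-MailboxDatabaseCopy -Identity "DB1\MBX-A1" -DeleteExistingFiles

# Step 3 i): point the DAG back at a witness in the primary datacenter.
Set-DatabaseAvailabilityGroup -Identity DAG2 -WitnessServer HUB-A1 -WitnessDirectory C:\DAG2-FSW

# Step 3 ii): dismount the database in the second datacenter.
Dismount-Database -Identity DB1

# Step 3 iii) is the Client Access URL / DNS change and is not shown.

# Step 3 iv): switch the database over to the primary datacenter.
Move-ActiveMailboxDatabase -Identity DB1 -ActivateOnServer MBX-A1

# Step 3 v): mount it there.
Mount-Database -Identity DB1
```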
