Adaptive Replication Mode in GSLB

Note: In NSX Advanced Load Balancer version 21.1.3, this feature is under tech preview.

Overview

Avi GSLB comprises leader and follower sites. Every federated object (GSLB configuration) must be configured on the leader site. The leader site eventually replicates the object to the peer follower sites.

Whenever a GSLB configuration change is made on the GSLB leader site, it is propagated to the active follower sites in one of the following modes:

  1. Continuous Replication mode (Default mode)
  2. Manual Replication mode
  3. Adaptive Replication mode

Continuous Replication Mode

The leader site initiates configuration replication to all the GSLB follower sites automatically and immediately whenever a configuration change is made on the leader site. This method of replication is called the continuous replication method.

Manual Replication Mode

Manual mode requires user intervention to replicate the GSLB configuration from the leader to the follower sites. The admin creates manual checkpoints on Avi Vantage. The Avi GSLB leader site replicates all federated objects to the peer sites up to the checkpoint.

For more details on GSLB canary update, refer to the GSLB Canary Update guide.

Adaptive Replication Mode

Starting with Avi Vantage version 21.1.3, adaptive replication mode is supported.

In adaptive replication mode, whenever a GSLB configuration change is made on the Avi GSLB leader site, it is processed on the leader site first. Only if it is applied on the leader site successfully, that is, without any issues, is the configuration change propagated to the other active GSLB follower sites.

In adaptive mode, Avi Vantage looks for local feedback (on the leader site) before replicating to the other active follower sites. That is, federated objects (GSLB configuration changes) are first applied on the leader, and the resulting feedback is used to make the replication decision.
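
The gating behavior described above can be sketched as follows. This is an illustrative model only, not Avi's implementation: the names FederatedChange, Follower, and replicate_adaptive are hypothetical, the local feedback is reduced to a success flag plus a reason string, and the sketch assumes that changes queued behind a failed one remain pending.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FederatedChange:
    """Hypothetical stand-in for one federated GSLB config change."""
    name: str
    version: int


@dataclass
class Follower:
    """Hypothetical active follower site that records what it received."""
    name: str
    received: List[str] = field(default_factory=list)


def replicate_adaptive(
    changes: List[FederatedChange],
    apply_on_leader: Callable[[FederatedChange], Tuple[bool, str]],
    followers: List[Follower],
) -> Tuple[str, int, str]:
    """Apply each change on the leader first; replicate only on success.

    Returns (status, pending_count, reason). Replication stalls at the
    first change that fails locally (assumption: later changes stay pending).
    """
    for index, change in enumerate(changes):
        ok, reason = apply_on_leader(change)
        if not ok:
            # Faulty object: neither it nor anything after it reaches followers.
            return ("Sync Stalled", len(changes) - index, reason)
        for site in followers:
            site.received.append(change.name)
    return ("In Sync", 0, "")
```

Calling replicate_adaptive with a feedback function that rejects one change leaves the followers holding only the changes that preceded it, which is the property the adaptive mode is meant to provide.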

Use Case

DNS is a mission-critical service, and it is very important to maximize its uptime. Because GSLB configuration is propagated to all sites and locations, a small error can cause issues in multiple data centers and eventually result in application failure. Adaptive replication prevents such issues because a faulty configuration object is not replicated to the follower sites.

During peak traffic and major events such as Black Friday and Cyber Monday, the ability to change the configuration with complete confidence plays an important role.

With adaptive replication mode, if a federated config object causes an issue on the local site, it is not replicated to the peer follower sites and replication stalls. When replication stalls, adaptive replication generates an event stating that replication is currently stalled, along with the reason and a possible recommendation.

Currently, adaptive replication generates events in two cases:

  • When configured domains/subdomains are not hosted by an existing and enabled DNS virtual service.

  • When any federated config version causes replication to stall.

The following is an example of an event. In this example, the event was raised because the configured GSLB domains, com and local, were not hosted by an enabled DNS virtual service. Also, the DNS virtual services were disabled.

Configuration Steps to enable Adaptive Replication Mode

The following are the steps to enable adaptive replication mode:

  1. Navigate to Infrastructure > GSLB. Click the pencil icon.

  2. Select the Adaptive option in Replication Mode.

  3. Click Save.
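
For automation, the same change could in principle be made through the Controller REST API instead of the UI. The sketch below only builds the updated object locally and sends nothing. The field names replication_policy and replication_mode and the value REPLICATION_MODE_ADAPTIVE are assumptions about the Gslb object schema and are not taken from this article, so verify them against the API reference for your release before use.

```python
import copy
import json

# Assumed enum value; confirm against the Gslb object schema for your release.
ADAPTIVE = "REPLICATION_MODE_ADAPTIVE"


def with_adaptive_replication(gslb_obj: dict) -> dict:
    """Return a copy of a Gslb object dict with replication mode set to adaptive.

    The input is left untouched so the caller can diff old and new
    before sending the update to the leader site.
    """
    updated = copy.deepcopy(gslb_obj)
    policy = updated.setdefault("replication_policy", {})  # assumed field name
    policy["replication_mode"] = ADAPTIVE                  # assumed field name
    return updated


if __name__ == "__main__":
    # Hypothetical current object, as it might be fetched from the leader.
    current = {
        "name": "Default",
        "replication_policy": {"replication_mode": "REPLICATION_MODE_CONTINUOUS"},
    }
    print(json.dumps(with_adaptive_replication(current), indent=2))
```

Returning a modified copy rather than editing in place keeps the original object available for comparison or rollback.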

Adaptive Replication Status

The GSLB site configuration screen shows the replication status. Hovering the mouse over the replication status opens a pop-up that shows details based on the replication state. The following parameters are displayed:

  1. Replication status: The common statuses are Sync In Progress, In Sync, and Sync Stalled.

  2. Number of pending objects: The number of GSLB config objects yet to be replicated.

  3. Reason: The possible cause of the replication issue.

  4. Recommendation: Hints or a recommended way to resolve the replication issue. If the cause of the replication issue cannot be determined, the recommendation displays the message Contact VMware support team.
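
The four parameters above can be modeled as a small record, for example when post-processing status information in a script. This is a minimal sketch under stated assumptions: the class ReplicationStatus and its summary method are hypothetical and do not correspond to any Avi API object. Only the status strings and the fallback message come from this article.

```python
from dataclasses import dataclass
from typing import Optional

# Fallback text shown when the cause cannot be determined (from the article).
FALLBACK_RECOMMENDATION = "Contact VMware support team"


@dataclass
class ReplicationStatus:
    """Hypothetical container for the four fields shown in the pop-up."""
    status: str                      # "Sync In Progress", "In Sync", or "Sync Stalled"
    pending_objects: int             # GSLB config objects yet to be replicated
    reason: Optional[str] = None
    recommendation: Optional[str] = None

    def summary(self) -> str:
        """Render the fields as text; reason and recommendation only when stalled."""
        lines = [
            f"Replication status: {self.status}",
            f"Pending objects: {self.pending_objects}",
        ]
        if self.status == "Sync Stalled":
            lines.append(f"Reason: {self.reason or 'Unknown'}")
            lines.append(
                f"Recommendation: {self.recommendation or FALLBACK_RECOMMENDATION}"
            )
        return "\n".join(lines)
```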

Example:

The following are the sample screens:


Notes:

Adaptive replication has the following mandatory requirements:

  • All GSLB configured domains/subdomains must be placed on an existing and enabled DNS virtual service on the leader.

  • All the DNS virtual services involved in GSLB domains/subdomains must be on an adaptive-compatible version (21.1.3 or later) on the leader.
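
The two requirements above lend themselves to a simple pre-check before switching modes. The following is a hedged sketch, not an Avi tool: DnsVirtualService and check_prerequisites are hypothetical names, the inventory is assumed to be gathered by some other means, and version strings are assumed to be plain dotted numbers such as 21.1.3 (patch suffixes are not handled).

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_VERSION = (21, 1, 3)  # adaptive-compatible version named in the requirements


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '21.1.3' into (21, 1, 3) so versions compare as tuples."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class DnsVirtualService:
    """Hypothetical view of a DNS virtual service on the leader."""
    name: str
    enabled: bool
    version: str
    domains: List[str]


def check_prerequisites(
    gslb_domains: List[str], dns_vses: List[DnsVirtualService]
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for domain in gslb_domains:
        hosts = [vs for vs in dns_vses if domain in vs.domains]
        # Requirement 1: at least one existing and enabled DNS VS hosts the domain.
        if not any(vs.enabled for vs in hosts):
            problems.append(
                f"{domain}: not hosted by an existing and enabled DNS virtual service"
            )
        # Requirement 2: every involved DNS VS is on 21.1.3 or later.
        for vs in hosts:
            if parse_version(vs.version) < MIN_VERSION:
                problems.append(f"{domain}: {vs.name} runs {vs.version}, below 21.1.3")
    return problems
```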

Document Revision History

Date                 Change Summary
December 20, 2021    New KB added for 21.1.3