Avi GSLB Service Health Monitors

A GSLB service is the representation of a global application deployed at multiple sites. The GSLB service configuration defines the FQDN of the application, the backing virtual services at the various sites, and the priorities or ratios governing selection of a particular virtual service at any given time. The configuration also defines the health-monitoring methods by which unhealthy components can be identified so that the best alternatives may be selected.

Prerequisite Reading

GSLB Service Health Monitoring

There are two categories of GSLB service health monitoring:

  • control-plane
  • data-plane

One or both can be applied on a per-application basis.

GSLB Member Healthchecks

Control-Plane-Based Global Application Health Monitoring

Independent of Avi GSLB, every Avi Controller cluster routinely performs local health checks to collect the health scores and performance metrics of virtual services under its direct control. In addition, if GslbService.controller_health_status_enabled is True, active GSLB sites will also periodically query the Avi Controllers at all the other sites specified within the GSLB site configuration (both active and passive sites). 

Note: Control-plane health monitoring does not apply to virtual services configured on a third-party load balancer VIP or standalone servers.

In the figure below, only the active Controller in DC1 is shown collecting health information from the 3 other Controllers. DC1’s Controller passes a coalesced picture of health status to its local DNS (solid arrow). In reality, the active Controllers in DC2 and AWS do the same, updating their respective local DNS virtual services with control-plane-based health status.

Control plane healthcheck

Data-Plane-Based Global Application Health Monitoring

In contrast to control-plane-based health monitoring, no site’s Controller cluster is queried. Instead, health checks go directly to participating services, i.e., to the data plane. At an active site, an SE hosting a GSLB DNS virtual service performs periodic health checks against all GSLB pool members (including the virtual services local to it). A dedicated Service Engine can be configured to perform these health checks. Active monitors generate synthetic traffic from the DNS Service Engine to mark a virtual service (pool member) up or down, based on its response. The below diagram shows the DNS in DC1 (the only active site) performing this function against its local virtual service (VS-A1), as well as VS-A2, VS-A3 and VS-A4.

The object used for this is called the GslbHealthMonitor. Ping, TCP, UDP, DNS, and HTTP(S) health monitors are supported (as mentioned in the Configuring Health Monitoring section below). Additionally, a custom monitor can be configured as required.

For more information on health monitors, refer to Health Monitors on Avi Vantage.

Data plane healthcheck

Localized Data-Plane-Based Global Application Health Monitoring

In the below figure, DC3 is transformed into an active site via deployment of an Avi DNS SE. This nominally adds four additional data-plane health checks, from DC3’s Avi DNS SE to each of the four member virtual services.

Data plane healthcheck

Now let’s focus attention on the DC1 DNS check of VS-A4 as well as the DC3 DNS check of VS-A1. Suppose we wish or need to avoid these checks for one or both of the following reasons:

  • The firewalls protecting DC1 and DC3 have been configured to permit the Avi Controllers to communicate, but they block direct access to VS-A4 (from DC1) and to VS-A1 (from DC3).
  • We want to scale the performance of each Avi DNS SE by minimizing the number of health checks each must perform. Since each member VS is already being data-plane checked by a local DNS, we consider it wasteful for remote DNSes to replicate those checks.

We can achieve the optimization depicted in the figure below by configuring VS-A1 and VS-A4 for localized data-plane health checks. Two data-plane health checks are eliminated. Instead, each DNS SE gets the health information it needs by querying the remote site’s Avi Controller.

localized data-plane health check

This hybrid approach, which combines control- and data-plane health checking, is enabled for a global service on an individual member VS basis. The only restriction is that the member VS runs on an active Avi site.
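The effect on the number of probes can be sketched in a few lines of Python. This is only an illustration of the counting argument above, not Avi code: the function `planned_probes` is hypothetical, and the placement of VS-A2 in DC2 and VS-A3 in AWS is an assumption made for the example (the text only places VS-A1 in DC1 and VS-A4 in DC3).

```python
from typing import Dict, List, Set, Tuple


def planned_probes(
    dns_sites: List[str],
    member_site: Dict[str, str],
    localized: Set[str],
) -> List[Tuple[str, str]]:
    """List the (DNS site, member) data-plane probes that would be sent.

    A member flagged as localized is probed only by the DNS at its own
    site; every other member is probed by every active DNS site.
    """
    probes = []
    for dns in dns_sites:
        for member, site in member_site.items():
            if member in localized and site != dns:
                continue  # remote DNS relies on the member's Controller instead
            probes.append((dns, member))
    return probes


# Hypothetical placement: VS-A2 and VS-A3 sites are assumed for illustration.
members = {"VS-A1": "DC1", "VS-A2": "DC2", "VS-A3": "AWS", "VS-A4": "DC3"}
active_dns = ["DC1", "DC3"]

all_probes = planned_probes(active_dns, members, set())
optimized = planned_probes(active_dns, members, {"VS-A1", "VS-A4"})
print(len(all_probes), len(optimized))
```

Under these assumptions the two active DNS sites send eight probes without localization and six with it, which matches the two eliminated checks described above.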

Enabling Data-Plane-Based Global Application Health Monitoring in the Avi UI

A GSLB site optimizes health checks by identifying other GSLB sites as health-monitoring proxies. In the below figure, the pull-down menu of the Health Monitor Proxy field offers three Avi GSLB sites capable of performing local checks for the GSLB.

enabling-localized-health-checks.png

Interaction with the GSLB DNS

When it is determined that a GSLB service pool member (i.e., some participating virtual service) is down, one of four standard responses is returned by the GSLB DNS. In the GslbService object, set the GslbService.down_response parameter to select one of these four:

  • GSLB_SERVICE_DOWN_RESPONSE_NONE – the default option, simply drops the request.
  • GSLB_SERVICE_DOWN_RESPONSE_FALLBACK_IP – respond with a single preset fallback IP address, which typically would point to a server that’s serving a “sorry” page.
  • GSLB_SERVICE_DOWN_RESPONSE_ALL_RECORDS – return all IP addresses of all members of all pools.
  • GSLB_SERVICE_DOWN_RESPONSE_EMPTY – return an empty DNS response; can be used to make the client retry in certain cases.
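The four options above can be modelled as a small dispatch function. The sketch below is illustrative only: the function name `down_response_answer`, the `fallback_ip` argument, and the use of `None` to stand for a dropped request are assumptions made for this example and are not part of the Avi API. Only the four option names come from the list above.

```python
from typing import List, Optional


def down_response_answer(
    down_response: str,
    all_member_ips: List[str],
    fallback_ip: Optional[str] = None,
) -> Optional[List[str]]:
    """Return the addresses to answer with when the down response applies.

    None stands for "drop the request" and an empty list stands for an
    empty DNS response; both conventions are assumptions of this sketch.
    """
    if down_response == "GSLB_SERVICE_DOWN_RESPONSE_NONE":
        return None
    if down_response == "GSLB_SERVICE_DOWN_RESPONSE_FALLBACK_IP":
        if fallback_ip is None:
            raise ValueError("fallback_ip is required for FALLBACK_IP")
        return [fallback_ip]
    if down_response == "GSLB_SERVICE_DOWN_RESPONSE_ALL_RECORDS":
        return list(all_member_ips)
    if down_response == "GSLB_SERVICE_DOWN_RESPONSE_EMPTY":
        return []
    raise ValueError(f"unknown down_response: {down_response}")
```

For example, `down_response_answer("GSLB_SERVICE_DOWN_RESPONSE_FALLBACK_IP", ips, "192.0.2.10")` would return a single-element list holding the address of the "sorry" server (the address here is a documentation placeholder).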

Options and Combinations for GSLB Service Health Monitoring

  • Control-plane health checking only – Active data-plane health monitors are not configured for this mode. All active Controllers are configured to coalesce health status collected locally with the statistics collected from remote Controllers. Coalesced stats are then passed from each active Controller (cluster) to its local DNS. This method is only available for members implemented as Avi Vantage virtual services.

  • Data-plane health checking only – Set GslbService.controller_health_status_enabled to false. Each GSLB DNS performs health checks on all GSLB member virtual services (including those hosted on external sites).
  • Both control and data-plane health checking – For a member virtual service to be marked UP, both control and data health should report UP. If the control-plane health check is failing due to a remote Controller being down or inaccessible, but data-plane health checking is still possible, then it alone determines status of the member virtual service.
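The way the two sources combine can be expressed as a short function. This is a minimal sketch of the rules in the list above, not the product's implementation: `member_is_up` is a hypothetical name, `None` is used to mean "this source is not configured or cannot be reached", and treating a member with no information at all as DOWN is an assumption of the example.

```python
from typing import Optional


def member_is_up(
    control_plane_up: Optional[bool],
    data_plane_up: Optional[bool],
) -> bool:
    """Combine control-plane and data-plane health for one member.

    None means the source is unavailable, e.g. the remote Controller is
    down or inaccessible, or no data-plane monitor is configured.
    """
    available = [s for s in (control_plane_up, data_plane_up) if s is not None]
    if not available:
        # Assumption of this sketch: no health information means DOWN.
        return False
    # Every source that can report must report UP.
    return all(available)
```

With both sources available the member is UP only if both report UP; if the control plane is unavailable, the data-plane result alone decides.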

Optimizing Health Checking

Setting data-plane health monitor scope appropriately

GslbService.health_monitor_scope – An optional parameter that takes one of two values. By default, it is set to GSLB_SERVICE_HEALTH_MONITOR_ALL_MEMBERS, in which case DNS SEs actively probe pool members at both Avi and external sites. Alternatively, it can be set to GSLB_SERVICE_HEALTH_MONITOR_ONLY_NON_AVI_MEMBERS. In that case DNS SEs probe only the external members, since data-plane probing is the only way their status can be collected, while health checking of Avi members is offloaded from the DNS SEs to the Avi Controllers local to those GSLB pool members.
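As a rough illustration of the two scope values, the following sketch selects which pool members a DNS SE would actively probe. The function `members_probed_by_dns_se` and the boolean "hosted on an Avi site" flag are hypothetical constructs for this example; only the two scope names come from the text.

```python
from typing import Dict, List

ALL_MEMBERS = "GSLB_SERVICE_HEALTH_MONITOR_ALL_MEMBERS"
ONLY_NON_AVI = "GSLB_SERVICE_HEALTH_MONITOR_ONLY_NON_AVI_MEMBERS"


def members_probed_by_dns_se(scope: str, members: Dict[str, bool]) -> List[str]:
    """Return the member names a DNS SE probes under the given scope.

    `members` maps a member name to True if it is hosted on an Avi site
    and False if it is an external (non-Avi) member.
    """
    if scope == ALL_MEMBERS:
        return sorted(members)
    if scope == ONLY_NON_AVI:
        return sorted(name for name, is_avi in members.items() if not is_avi)
    raise ValueError(f"unknown health_monitor_scope: {scope}")
```

The member names passed in are whatever the caller chooses; with the narrower scope, only the entries flagged as external remain in the probe list.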

Limiting the number of active sites

When a large fraction of GSLB sites are configured to be active, the load on Controllers and the networks interconnecting them can be excessive. For example, consider two deployments, each with 10 Avi sites. One has 5 active Controllers, the other just 2. Each regularly scheduled remote-site health check from an active Controller collects health status from 9 remote sites. Compare the throughput consumed when 5 Controllers probe 9 sites each, versus when just 2 Controllers probe 9 sites. That’s 45 remote-site collections per unit of time compared to just 18. The latter is considerably more throughput-efficient, while still delivering reasonable GSLB DNS redundancy for HA.
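The arithmetic behind this comparison is simple enough to check directly. The helper below is a hypothetical illustration that assumes, as in the example above, that every active Controller polls every other site once per collection interval.

```python
def remote_collections_per_interval(total_sites: int, active_sites: int) -> int:
    """Number of remote-site health collections per interval.

    Assumes each active Controller queries all other sites once.
    """
    if total_sites < 1 or not 0 <= active_sites <= total_sites:
        raise ValueError("need 0 <= active_sites <= total_sites and total_sites >= 1")
    return active_sites * (total_sites - 1)


print(remote_collections_per_interval(10, 5))  # 45
print(remote_collections_per_interval(10, 2))  # 18
```

The load grows linearly with the number of active sites, so each additional active Controller in a 10-site deployment adds 9 collections per interval.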

Configuring GSLB Health Monitors using Avi UI

The below specifications apply to the examples shown in this section using Avi UI.

  • The customer has multiple data centers.
  • Two global applications spanning both are going to be deployed. Each will require health monitors to be configured.
  • view.sales.avi.local will run in US-West and rely on US-Central as a disaster recovery site.
  • pay.sales.avi.local will run at US-West as well as US-East to achieve both high availability and optimal user experience.

Configure a data-plane health monitor for GSLB service

This operation can only be performed by an authorized user logged into the GSLB leader Controller.

The Templates > Profiles > Global Health Monitor tab shows the five pre-existing system-standard monitors:

system-standard GSLB health monitors

Rather than modifying a system-standard monitor, the better practice is to define a brand new monitor by clicking Create. The applicable defaults are populated into the editor window and can then be modified as desired. Refer to the New GSLB Health Monitor editor window below.

our-GLSB-TCP-monitor

  • Successful Checks – The number of consecutive successful health checks before a virtual service is marked UP.
  • Failed Checks – The number of consecutive failed health checks before a virtual service is marked DOWN.
  • Send Interval – The number of seconds between health checks to a given virtual service.
  • Receive Timeout – A valid response from the server is expected within this number of seconds. It must be less than the send interval. If server status is regularly flapping between UP and DOWN, consider increasing this value.
  • Is Federated? – This option helps define the object’s replication scope. If enabled, the object is replicated across the federation. Otherwise, it is visible only within the Controller cluster and its associated Service Engines.
    is_federated is set to True only when GSLB is turned on. A federated health monitor is used for GSLB purposes; a regular health monitor is not federated. This implies that a GSLB service cannot be associated with a regular health monitor, because the GSLB service is a federated object while the health monitor is not. Conversely, a pool cannot be associated with a federated health monitor, because the pool is not a federated object.
  • Health Monitor Port – Regardless of what port the associated virtual services use, this monitor will direct its health checks to port 80. A monitor port is mandatory for HTTP(S), TCP, UDP and external health monitors.
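The interplay of the Successful Checks and Failed Checks counters, and the constraint between Receive Timeout and Send Interval, can be sketched as follows. This is an illustrative model rather than the Service Engine's actual logic: the class `MemberHealth`, the function `validate_timing`, and the choice to start a member in the DOWN state are all assumptions of the example.

```python
from dataclasses import dataclass


def validate_timing(send_interval: int, receive_timeout: int) -> None:
    """Reject settings where the timeout is not shorter than the interval."""
    if receive_timeout >= send_interval:
        raise ValueError("receive_timeout must be less than send_interval")


@dataclass
class MemberHealth:
    successful_checks: int  # consecutive successes needed to mark UP
    failed_checks: int      # consecutive failures needed to mark DOWN
    up: bool = False        # assumption: a member starts out DOWN
    successes: int = 0
    failures: int = 0

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the resulting UP/DOWN state."""
        if probe_ok:
            self.successes += 1
            self.failures = 0
            if not self.up and self.successes >= self.successful_checks:
                self.up = True
        else:
            self.failures += 1
            self.successes = 0
            if self.up and self.failures >= self.failed_checks:
                self.up = False
        return self.up


member = MemberHealth(successful_checks=2, failed_checks=3)
states = [member.record(ok) for ok in (True, True, False, False, False)]
print(states)
```

Because a single success or failure resets the opposite counter, a member that alternates between good and bad probes never changes state, which is the damping effect the two thresholds are meant to provide.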

Clicking on Save in the new GSLB Health Monitor editor completes the custom-monitor creation.

Note: Starting with Avi Vantage release 18.2.5, health monitoring is supported for a DNS virtual service configured in active/standby mode.

Configuring GSLB health monitors using Avi CLI

Log in to the leader node (10.10.25.10) and use the configure gslbhealthmonitor <monitor name> command to provide the type and port number for the monitor.

: > configure gslbhealthmonitor global-http-hm
: gslbhealthmonitor> type health_monitor_http
: gslbhealthmonitor> monitor_port 80
: gslbhealthmonitor> save