External Health Monitor

Overview

This article covers the configuration specific to the external health monitor type. Refer to the Overview of Health Monitors article for general monitor information, implementation, and other monitor types.

The external monitor type allows scripts to be written to provide highly customized and granular health checks. The scripts may be Linux shell, Python, or Perl, and can run tools such as wget, netcat, curl, snmpget, mysql-client, or dig. External monitors have constrained access to resources such as CPU and memory, which ensures normal functioning of the Avi Service Engines. As with any custom scripting, thoroughly validate the long-term stability of the implemented script before pointing it at production servers.

Errors generated from the script may be viewed in the output of the Operations > Events log.

Avi Vantage includes three sample scripts via the System-Xternal Perl, Python and Shell monitors.

Note: Avi Vantage supports IPv6 external health monitors.

Creating External Health Monitor

To create the external health monitor:

  1. From the NSX Advanced Load Balancer UI, navigate to Templates > Profiles > Health Monitors.

  2. Click Create to open the CREATE HEALTH MONITOR screen.

  3. Under the General tab, enter the basic information about the health monitor. Note: Select External as the Type to view the settings specific to the external health monitor.

  4. Configure the external health monitor settings.

  5. Configure Role-Based Access Control (RBAC).

  6. Click Save to complete the External Health Monitor creation.

Configuring General Settings

Under the General tab of the CREATE HEALTH MONITOR screen, configure the following:

  1. Enter a unique Name for the monitor.

  2. Enter a Description.

  3. Select External as the Type of Health Monitor.
    Note: Once the Type of Monitor is selected, options specific to the external health monitor type are displayed.

  4. Select the option Is Federated? to replicate the object across the federation. When this option is not selected, the object is visible only within the Controller cluster and its associated SEs. This option is enabled only when GSLB is activated. A federated health monitor is used for GSLB purposes, whereas a regular health monitor is not. A GSLB service cannot be associated with a regular health monitor because the GSLB service is a federated object while the health monitor is not. Conversely, a pool cannot be associated with a federated health monitor because the pool is not a federated object.

  5. Enter the Send Interval value (in seconds). This value determines how frequently the health monitor initiates an active check of a server. The frequency range is 1 to 3600.

  6. Enter the Receive Timeout value (in seconds). The server must return a valid response to the health monitor within the specified time limit. The receive timeout range is 1 to 2400 or the send interval value minus 1 second.
    Note: If the status of a server continually flips between up and down, this may indicate that the receive timeout is too aggressive for the server.

  7. Enter Successful Checks. This is the number of consecutive health checks that must succeed before NSX Advanced Load Balancer marks a down server as up. The minimum is 1, and the maximum is 50.

  8. Enter Failed Checks. This is the number of consecutive health checks that must fail before NSX Advanced Load Balancer marks a server as down. The minimum is 1, and the maximum is 50.
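
The way Successful Checks and Failed Checks interact can be sketched in Python. This is a simplified model for reasoning about the thresholds, not the Service Engine's implementation; the class name ServerHealth and its record method are hypothetical, and the starting state (down) is an assumption.

```python
class ServerHealth:
    """Tracks up/down state from a stream of health check results."""

    def __init__(self, successful_checks=2, failed_checks=2, initially_up=False):
        # Both thresholds share the documented range of 1 to 50.
        for value in (successful_checks, failed_checks):
            if not 1 <= value <= 50:
                raise ValueError("thresholds must be between 1 and 50")
        self.successful_checks = successful_checks
        self.failed_checks = failed_checks
        self.up = initially_up
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, check_passed):
        """Record one check result and return the resulting state (True = up)."""
        if check_passed == self.up:
            # The result agrees with the current state, so any opposing streak ends.
            self._streak = 0
        else:
            self._streak += 1
            needed = self.successful_checks if check_passed else self.failed_checks
            if self._streak >= needed:
                self.up = check_passed
                self._streak = 0
        return self.up
```

With successful_checks=2 and failed_checks=3, a down server needs two passes in a row to come up, and a single pass in the middle of a run of failures resets the failure count.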

Configuring External Health Monitor Settings

As a best practice, clean up any temporary files created by scripts.
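
One way to follow this practice in a Python monitor is to let the standard tempfile module delete the file automatically. The sketch below is illustrative only; the function name check_with_scratch_file and the "200 OK" match are assumptions chosen to mirror the sample scripts later in this article.

```python
import os
import tempfile


def check_with_scratch_file(response_text):
    """Stage response_text in a scratch file, test it, and confirm cleanup.

    Returns (matched, leftover): whether "200 OK" was found, and whether the
    scratch file still exists after the check.
    """
    # NamedTemporaryFile deletes the file when the with-block exits,
    # even if the body raises an exception.
    with tempfile.NamedTemporaryFile(mode="w+", prefix="hm_", suffix=".out") as scratch:
        scratch.write(response_text)
        scratch.seek(0)
        matched = "200 OK" in scratch.read()
        path = scratch.name
    leftover = os.path.exists(path)
    return matched, leftover
```

Because the file is removed on exit from the with-block, repeated health checks do not accumulate files on the Service Engine.
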
While building an external monitor, manually test that the commands work as intended. To test a command from an SE, it may be necessary to switch to the proper namespace or tenant. The production external monitor uses the proper tenant automatically. To manually switch tenants when testing a command from the SE CLI, follow the commands in the article Manually Validate Server Health.

To configure external health monitor settings:

  1. To input the script, do either one of the following:
    • Click IMPORT FILE and select the required file. The script from the file is automatically pasted to the text box.
    • Copy the script and paste it in the text box available.
  2. In the Script Parameters field, enter any optional arguments to apply. These strings are passed in as arguments to the script, such as $1 = server IP, $2 = server port.

  3. Enter the port to override the port defined in the server pool. To use the server port, enter 0.

  4. In the Script Variables field, enter the custom environment variables to be fed into the script to simplify re-usability. For instance, a script that authenticates to the server may have a variable set to USER=test.
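
As a rough illustration of how these inputs reach a script, the Python sketch below reads the server IP and port from positional arguments and falls back to the IP and PORT environment variables used by the sample scripts in this article. The function name read_monitor_inputs is hypothetical, and USER is simply the example variable mentioned above.

```python
def read_monitor_inputs(argv, environ):
    """Collect the server address and a custom variable for a health check."""
    # Positional arguments: $1 = server IP, $2 = server port.
    ip = argv[1] if len(argv) > 1 else environ.get("IP", "")
    port = int(argv[2]) if len(argv) > 2 else int(environ.get("PORT", "0"))
    # Script Variables arrive as environment variables, for example USER=test.
    user = environ.get("USER", "")
    return {"ip": ip, "port": port, "user": user}
```

In a real script this would be called as read_monitor_inputs(sys.argv, os.environ). Passing both in as parameters keeps the function easy to test from a workstation before uploading the script.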


Configuring RBAC

Under the Role-Based Access Control (RBAC) section, configure labels to control access to the health monitor based on the defined roles:

  1. Click Add.
  2. Enter the Key and the corresponding values.

See Granular RBAC for more information.

Sample Scripts

MySQL Example Script

#!/bin/bash
# Run a trivial query against the pool member.
mysql --host="$IP" --user=root --password='s3cret!' -e "select 1"

SharePoint Example Script

#!/bin/bash
#curl http://$IP:$PORT/Shared%20Documents/10m.dat -I -L --ntlm -u $USER:$PASS > /run/hmuser/$HM_NAME.out 2>/dev/null
curl http://$IP:$PORT/Shared%20Documents/10m.dat -I -L --ntlm -u $USER:$PASS | grep "200 OK"

PostgreSQL Example Script

Example 1:

In this example, the script makes the Avi Service Engine query the database. On a successful response, the Avi Service Engine marks the server up; otherwise, it marks the server down.

#!/bin/bash
#exporting username's password
export PGPASSWORD='password123'
psql -U aviuser -h $IP -p $PORT -d aviuser -c "SELECT * FROM employees"

Example 2:

In this example, the script makes the Avi Service Engine query the database, parse the response for the cell at the given row and column, and compare that cell to the given string. If the cell matches, the server is marked up; otherwise, the server is marked down.

#!/bin/bash
#example script for
#string match to cell present at row,column of query response
row=2
column=2
match_string="bob"
#exporting username's password
export PGPASSWORD='password123'
response="$(psql --field-separator=' ' -t --no-align -U aviuser -h $IP -p $PORT -d aviuser -c "SELECT * FROM employees")"
str="$(awk -v r="$row" -v c="$column" 'FNR == r {print $c}' <<< "$response")"
if [ "$str" = "$match_string" ]; then
    echo "Matched"
fi
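
The row and column match in Example 2 can also be expressed in Python, which may be easier to extend than the awk one-liner. This is a sketch of the matching step only, assuming whitespace-separated fields as in the shell version; the function name cell_matches and the sample data in the usage note are hypothetical.

```python
def cell_matches(response, row, column, expected):
    """Return True when the 1-based (row, column) cell of response equals expected."""
    lines = response.splitlines()
    if not 1 <= row <= len(lines):
        return False  # requested row is outside the query output
    fields = lines[row - 1].split()
    if not 1 <= column <= len(fields):
        return False  # requested column is outside that row
    return fields[column - 1] == expected
```

For a response of "1 alice 30" on the first line and "2 bob 41" on the second, cell_matches(response, 2, 2, "bob") is true, so a script would print a message such as Matched.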

RADIUS Example Script

The following example performs an Access-Request using PAP authentication against the RADIUS pool member and checks for an Access-Accept response.


#!/usr/bin/python3 
import os 
import radius 
try: 
    r = radius.Radius(os.environ['RAD_SECRET'], 
                      os.environ['IP'], 
                      port=int(os.environ['PORT']), 
                      timeout=int(os.environ['RAD_TIMEOUT'])) 
    if r.authenticate(os.environ['RAD_USERNAME'], os.environ['RAD_PASSWORD']): 
        print('Access Accepted') 
except: 
    pass 

RAD_SECRET, RAD_TIMEOUT, RAD_USERNAME and RAD_PASSWORD can be passed in the health monitor script variables, for example:


RAD_SECRET=foo123 RAD_USERNAME=avihealth RAD_PASSWORD=bar123 RAD_TIMEOUT=1

Applications such as curl can use different syntax for IPv4 and IPv6 addresses. External health monitor scripts must account for these differences, as the IPv6 examples in this section show.

Using Domain Names

Starting with Avi Vantage 21.1.3, to resolve domain names, DNS Resolution on Service Engine should be configured.


EXT_HM=exthm.example.com
curl http://$EXT_HM:8123/path/to/resource | grep "200 OK"

Shell Script Example for IPv6 Support

#!/bin/bash
#curl -v $IP:$PORT >/run/hmuser/$HM_NAME.$IP.$PORT.out
# IPv6 literals contain a colon and must be wrapped in brackets.
if [[ $IP =~ : ]]; then
    curl -v "[$IP]:$PORT"
else
    curl -v "$IP:$PORT"
fi

Perl Script Example for IPv6 Support

#!/usr/bin/perl -w
my $ip= $ARGV[0];
my $port = $ARGV[1];
my $curl_out;
if ($ip =~ /:/) {
    $curl_out = `curl -v "[$ip]":"$port" 2>&1`;
} else {
    $curl_out = `curl -v "$ip":"$port" 2>&1`;
}
if (index($curl_out, "200 OK") != -1) {
    print "Server is up";
}
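
The same bracket rule can be applied in a Python monitor. The sketch below uses the standard ipaddress module instead of searching for a colon, so malformed addresses are rejected as well; the function name host_port is hypothetical.

```python
import ipaddress


def host_port(ip, port):
    """Format ip and port for use in a URL, bracketing IPv6 literals."""
    address = ipaddress.ip_address(ip)  # raises ValueError for malformed input
    if address.version == 6:
        return f"[{ip}]:{port}"
    return f"{ip}:{port}"
```

For example, host_port("2001:db8::1", 8080) yields [2001:db8::1]:8080, while an IPv4 address is returned without brackets.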

Handling Errors

The external health monitor logs error messages that explicitly mention the cause of failure. For example:

  • Unexpected response code, received: [int] expected: [int]
  • Unexpected redirect URL: [str]
  • Application server down
  • Springboard application unavailable

NSX Advanced Load Balancer introduces the tag ext_hm_usr_err_msg to display a specific custom error message, as required. The external health monitor script returns the response output, and if this data contains the ext_hm_usr_err_msg tag, the server is marked down with the reason External HM failed with error.

Consider the following example to understand how the error is handled in NSX Advanced Load Balancer. It is an external health monitor written in Python that sets up an HTTP connection.

#!/usr/bin/python3
import sys
import http.client
try:
    conn = http.client.HTTPConnection(sys.argv[1]+':'+sys.argv[2])
    conn.request("HEAD", "/index.html")
except Exception as e: 
    print("ext_hm_usr_err_msg: Http get request Failed with " + str(e))
    exit()

r1 = conn.getresponse()
if r1.status == 200:
    print(r1.status, r1.reason)
else:
    print("ext_hm_usr_err_msg:"+str(r1.status)+","+r1.reason)

There are two possible outcomes:

  • If the HTTP connection is not established, the error is reported as External HM failed with error. The response string carries the reason, for example, connection refused.
  • If the connection is established and NSX Advanced Load Balancer receives a response whose status is not 200, an error is still generated.

The custom script can be modified, as required.
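
To reason about what a script's output will lead to, the tag handling can be modeled in a few lines of Python. This is only a mental model, not the Service Engine's code: the function name classify_output is hypothetical, and the rules that untagged non-empty output counts as success and empty output counts as failure are assumptions inferred from the sample scripts in this article.

```python
TAG = "ext_hm_usr_err_msg"


def classify_output(stdout):
    """Return (is_up, reason) for one run of an external monitor script."""
    for line in stdout.splitlines():
        if TAG in line:
            # Text after the tag (and an optional colon) is the custom reason.
            reason = line.split(TAG, 1)[1].lstrip(": ").strip()
            return False, reason
    if stdout.strip():
        return True, ""  # assumption: untagged output means the check passed
    return False, "no output"  # assumption: silence means the check failed
```

Running sample outputs through a helper like this during development makes it easier to confirm that every failure path in a script prints the tag.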

When the connection fails, the ext_hm_usr_err_msg tag is displayed with the error. Here, the error is HTTP get request failed with (Errno 111) Connection refused.

When the server responds with a status other than 200, for example 404, the server is marked down with the reason 404, Not found.

List of SE Packages

Scripting Languages

  • Bash (shell script)
  • Perl
  • Python

Linux Packages (apt)

  • curl
  • snmp
  • dnsutils
  • libpython2.7
  • python-dev
  • mysql-client
  • nmap
  • freetds-dev
  • freetds-bin
  • ldapsearch
  • postgresql-client

Python Packages (pip)

  • pymssql
  • cx_Oracle (and related libraries for Oracle Database 12c) — Avi Vantage 17.1.3 onwards
  • py-radius — Avi Vantage 17.2.5 onwards

NTP Health Monitor Example using netcat program

nc -zuv pool.ntp.org 123 2>&1 | grep "(ntp) open"

The sample configuration for using a native Perl script is as follows:


#!/usr/bin/perl 
# ntpdate.pl

# this code will query a ntp server for the local time and display
# it.  it is intended to show how to use a NTP server as a time
# source for a simple network connected device.

# 
# For better clock management see the official NTP info at:
# http://www.eecis.udel.edu/~ntp/
#

# written by Tim Hogard (thogard@abnormal.com)
# Thu Sep 26 13:35:41 EAST 2002
# this code is in the public domain.
# it can be found here http://www.abnormal.com/~thogard/ntp/

$HOSTNAME=shift;
$HOSTNAME="192.168.1.254" unless $HOSTNAME ;	# our NTP server
$PORTNO=123;			# NTP is port 123
$MAXLEN=1024;			# check our buffers

use Socket;

#we use the system call to open a UDP socket
socket(SOCKET, PF_INET, SOCK_DGRAM, getprotobyname("udp")) or die "socket: $!";

#convert hostname to ipaddress if needed
$ipaddr   = inet_aton($HOSTNAME);
$portaddr = sockaddr_in($PORTNO, $ipaddr);

# build a message.  Our message is all zeros except for a one in the protocol version field
# $msg in binary is 00 001 000 00000000 ....  or in C msg[]={010,0,0,0,0,0,0,0,0,...}
#it should be a total of 48 bytes long
$MSG="\010"."\0"x47;

Note: The ntpdate or ntpq programs are not packaged in the Service Engine, and hence cannot be used at this point in time.
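
Since Python is available on the Service Engine, an NTP check can also be sketched with the standard socket and struct modules. The sketch below is not a translation of the Perl script above: it sends a version 3 client-mode request (first byte 0x1B) rather than the version 1 message described there, and the function names are hypothetical.

```python
import socket
import struct
from datetime import datetime, timezone

NTP_TO_UNIX = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request():
    """Return a 48-byte request: leap indicator 0, version 3, mode 3 (client)."""
    return bytes([0x1B]) + bytes(47)


def transmit_time(packet):
    """Extract the server's transmit timestamp (whole seconds) from a reply."""
    if len(packet) < 48:
        raise ValueError("NTP reply is shorter than 48 bytes")
    # The transmit timestamp starts at byte 40; its first 4 bytes are seconds.
    seconds = struct.unpack("!I", packet[40:44])[0]
    return datetime.fromtimestamp(seconds - NTP_TO_UNIX, tz=timezone.utc)


def query_ntp(host, port=123, timeout=2.0):
    """Send one request over UDP and return the server's transmit time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, port))
        reply, _ = sock.recvfrom(1024)
    return transmit_time(reply)
```

A monitor script built on this would call query_ntp with the server IP and print the result on success. A timeout raises an exception, which the script can catch and report in whatever way suits the monitor.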

Upgrade to Python 3

Starting with Avi Vantage release 20.1.1, the Avi Controller and Service Engines use Python 3.

External Python health monitors should be converted to Python 3 syntax as part of the upgrade procedure.

Steps Prior to the Upgrade

Before initiating the upgrade to Avi Vantage release 20.1.1:

  1. Identify the external health monitors that use Python.

  2. Remove the health monitors, or replace them with a non-Python health monitor.

  3. Ensure that the health monitor script is modified to Python 3 syntax.

After this, upgrade to Avi Vantage release 20.1.1.
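
When reviewing existing scripts, a quick scan for a few Python 2-only constructs can help decide which monitors need conversion. The sketch below is a heuristic with hypothetical names. It only recognizes print statements, the old except syntax, and the httplib and urllib2 imports, so a clean result does not prove that a script is valid Python 3.

```python
import re

# Heuristic patterns for a few common Python 2-only constructs; not exhaustive.
PY2_PATTERNS = {
    "print statement": re.compile(r"^\s*print\s+[^(\s=]"),
    "old except syntax": re.compile(r"^\s*except\s+[\w.]+\s*,\s*\w+\s*:"),
    "httplib import": re.compile(r"^\s*(import|from)\s+httplib\b"),
    "urllib2 import": re.compile(r"^\s*(import|from)\s+urllib2\b"),
}


def find_python2_constructs(source):
    """Return (line_number, label) pairs for lines that look like Python 2."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PY2_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings
```

Each finding points at a line to rewrite, for example print "up" becomes print("up"), except Exception, e: becomes except Exception as e:, and httplib becomes http.client.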

Steps Post Upgrade

After upgrading to Avi Vantage release 20.1.1:

  1. Replace the existing (Python 2.7) health monitor script with the Python 3 script.

  2. Re-apply the health monitor to the required pools, and remove the temporary non-Python health monitor (if configured).