I will take you through the steps to properly configure an Azure Load Balancer with SQL Server on Azure VMs (IaaS), a Windows Server Failover Cluster (WSFC), an Availability Group (AG), and its listener endpoint.
The following link shows Microsoft's recommended practice for configuring an Internal Load Balancer (ILB) for the Availability Group listener.
Create & configure the load balancer
There are two types of load balancers: internal load balancers, which balance traffic within a virtual network (VNet), and external load balancers, which balance traffic to and from an internet-facing endpoint.
In other words, an internal load balancer's default role is to evenly distribute incoming traffic across a group of Azure VMs in its backend pool inside a virtual network.
Something similar to the following picture.
The picture is from Load Balancer insights, a pre-built monitoring dashboard in the Azure portal.
Is this what we are looking for in our Availability Group (AG) / Internal Load Balancer (ILB) configuration?
Not really!
That is exactly what a load balancer does: it evenly distributes incoming internal and external traffic across Azure virtual machines.
However, if you want to achieve high availability with a Windows failover cluster and connect your application through the Availability Group listener, you need to configure the Internal Load Balancer a bit differently.
This is what we are looking for.
There are four really important parts of the Load Balancer configuration:
Frontend IP configuration
The Internal Load Balancer's frontend IP is the private IP address of your Azure Load Balancer; this is the connection point for your application. For a SQL Server Always On AG configuration, the load balancer's frontend IP is the IP address of the AG listener.
If you are trying to troubleshoot connectivity to your newly deployed load balancer, bear in mind that the ICMP protocol is not permitted through the Azure Load Balancer, so ping.exe will not work. To test connectivity, you should do a port ping instead.
You can use other tools, such as PsPing, Nmap, and telnet, to test connectivity to a specific TCP port. PsPing example: PSPing -4 -t addressyouwanttotest:1433
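If PsPing is not available, PowerShell's built-in Test-NetConnection cmdlet can perform the same TCP-level check. A minimal sketch; the hostname below is a placeholder for your listener or frontend address:

```shell
# TCP "port ping" against the SQL listener port (ICMP is blocked by the ILB).
# Replace the hostname with your own listener/frontend address.
Test-NetConnection -ComputerName sqllistener.contoso.local -Port 1433
```

The output's TcpTestSucceeded field tells you whether the port is reachable, independent of ICMP.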
MS documentation: Create the load balancer and configure the IP address
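The frontend configuration described above can be sketched with the Azure CLI. This is an illustrative example, not the only way to do it; the resource group, VNet, subnet, names, and IP address are all placeholders you would replace with your own values (the private IP should match the planned AG listener IP):

```shell
# Create an internal Standard load balancer whose frontend private IP
# will serve as the AG listener address (all names/IPs are placeholders).
az network lb create \
  --resource-group MyResourceGroup \
  --name sqlILB \
  --sku Standard \
  --vnet-name MyVNet \
  --subnet MySubnet \
  --frontend-ip-name sqlAGFrontEnd \
  --private-ip-address 10.0.0.50 \
  --backend-pool-name sqlAGBackendPool
```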
Backend pool
The next important part of the Azure Load Balancer is the backend pool. The load balancer allows incoming traffic/requests to flow from the frontend IP to the group of VMs belonging to the backend pool.
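Adding a VM to the backend pool can be sketched by associating its NIC's IP configuration with the pool. A hedged example; the NIC, ipconfig, pool, and resource names are assumptions for illustration, and you would repeat this for each SQL Server VM in the cluster:

```shell
# Add one SQL Server VM's NIC to the load balancer's backend pool
# (repeat for each replica; all names are placeholders).
az network nic ip-config address-pool add \
  --resource-group MyResourceGroup \
  --nic-name sqlvm1-nic \
  --ip-config-name ipconfig1 \
  --lb-name sqlILB \
  --address-pool sqlAGBackendPool
```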
Health probes
The health probe status determines which SQL Server instance currently owns the availability group listener. The health probe's role is simply to tell the ILB which server it should redirect the incoming requests/traffic to. To accomplish this, you specify a port (for example 59999, or any other port not in use by another service); the ILB uses this port to check the health status of each SQL Server instance.
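A sketch of creating such a probe with the Azure CLI, assuming the names and port 59999 from the example above (all names are placeholders). Note that the cluster side must also be configured so the listener's IP resource answers on the same probe port:

```shell
# Create a TCP health probe on port 59999 (placeholder names).
az network lb probe create \
  --resource-group MyResourceGroup \
  --lb-name sqlILB \
  --name SQLAlwaysOnHealthProbe \
  --protocol tcp \
  --port 59999 \
  --interval 5

# On a cluster node, point the listener's IP resource at the same probe
# port (PowerShell; the resource name is a placeholder):
#   Get-ClusterResource "AG listener IP resource" |
#     Set-ClusterParameter -Multiple @{ ProbePort = 59999 }
```

Only the replica that currently owns the listener will answer on the probe port, which is exactly what lets the ILB find the primary.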
Load balancing rule
A load balancing rule distributes incoming requests to the backend pool instances whose health probe status is healthy. With the Floating IP feature enabled, those requests/traffic are mapped to the frontend IP address of the load balancer.
Floating IP is an important part of the load balancing rule and has to be enabled for our Availability Group listener/ILB configuration. Without Floating IP enabled, we get traditional load balancing behavior: traffic is balanced across all healthy virtual machines in the backend pool, regardless of which SQL Server instance currently owns the availability group listener.
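The rule with Floating IP enabled can be sketched as follows, reusing the placeholder names from earlier and assuming the default SQL port 1433:

```shell
# Load balancing rule for port 1433 with Floating IP (Direct Server Return)
# enabled, tied to the health probe (all names are placeholders).
az network lb rule create \
  --resource-group MyResourceGroup \
  --lb-name sqlILB \
  --name SQLAlwaysOnListenerRule \
  --protocol tcp \
  --frontend-port 1433 \
  --backend-port 1433 \
  --frontend-ip-name sqlAGFrontEnd \
  --backend-pool-name sqlAGBackendPool \
  --probe-name SQLAlwaysOnHealthProbe \
  --floating-ip true
```

With --floating-ip true, traffic is delivered with the frontend (listener) IP preserved, so only the replica that owns the listener address actually accepts it.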
You need to be sure that nothing else listens on the Health Probe port you choose during ILB configuration.
If you misconfigure the port, you might end up with all VMs in the backend pool marked available, or all of them unavailable, as in the following example.
If we run the following CMD command on the primary replica, you will notice that there is a process listening on this port.
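A check along these lines can be done with netstat, assuming the example probe port 59999 from above:

```shell
REM On the primary replica: list listeners on the probe port and show
REM the owning PID in the last column (port is a placeholder).
netstat -ano | findstr :59999
```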
To see the process behind this port, we can use Command Prompt, or simply open Task Manager and look for the PID; it will be the Resource Hosting Subsystem (RHS) process.
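From Command Prompt, the PID reported by netstat can be resolved to a process name with tasklist. The PID below is a placeholder; on the primary replica it should resolve to rhs.exe:

```shell
REM Resolve the PID from the netstat output to a process name
REM (4688 is a placeholder PID).
tasklist /fi "PID eq 4688"
```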
It’s really important to choose an available port for your health probe. If you choose a port already used by another system process, both replicas may end up listening on it at the same time, not just the primary. On the secondary replica, you should not find any process listening on the health probe port.
After a failover you will have a similar situation: only one SQL Server instance (the primary at that time) should have rhs.exe listening on this port.
I’ve found this series of blog posts very helpful for getting familiar with SQL Server services on the Azure platform.