MSSQL 2014 AlwaysOn Availability Group Cluster & Gratuitous ARP (GARP) Issue

MSSQL 2014 AlwaysOn cluster running on Windows 2012 R2 doesn’t send Gratuitous ARP (GARP) packets by default!

I have recently come across gratuitous arp (GARP) issues while working on Microsoft SQL 2014 AlwaysOn Availability Group cluster setup. I experienced the following –

  1. MSSQL 2014 AlwaysOn cluster with AlwaysOn Availability Group (AG) setup was done as per best practices and experts recommendations; all cluster related services were running OK without any issue.
  2. clients sitting on the same IP network/same VLAN were able to connect to the AlwaysOn AG listener Virtual IP (VIP) address immediately after a cluster failover happen from Node-A to Node-B and vice versa.
  3. however, clients sitting on different IP subnets were NOT able to connect to the VIP immediately after a cluster failover.
  4. clients sitting on different IP subnets waited for 20MIN to get connect to the VIP.
  5. this 20minutes is MAC address lifetime on the ethernet switch (I use Juniper EX-series switches) where the servers are connected (connected to physical Hypervisor).
  6. on the network layer the switch “ARP table” was showing previously learnt MAC address for the AG Listener VIP; the switch didn’t updated MAC address after a cluster failover triggered. The switch flushed out the old MAC and re-learnt the new correct MAC address after the MAC age time (20min) expired on the switch.

I was looking for a solution and found “GARP Reply” needs to be enabled on the Juniper EX switch manually – I have done that but still NO improvement!

Also looked at Microsoft KB documents and forums – people are saying GARP needs to be turned on the network switch which I have DONE already without any success.

After doing further digging inside I found that the Windows 2012 R2 servers were not sending any GARP packets so the switch was not updating the ARP table although it is configured to work with GARP.

To get this working – Windows server registry object “ArpRetryCount” needs to be added; Microsoft said the following about this –

“Determines how many times TCP sends an Address Request Packet for its own address when the service is installed. This is known as a gratuitous Address Request Packet. TCP sends a gratuitous Address Request Packet to determine whether the IP address to which it is assigned is already in use on the network.”

Add the registry entry as following –

-HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
-REG_DWORD > ArpRetryCount
-Value is between 0-3 (use value 3)

0 – dont send garp
1 – send garp once only
2 – send garp twice
3 – send garp three times (Default Value – actually not present on Windows 2012 R2)

To enable “GARP reply” on Juniper EX & SRX platform – user the following command –

#set interface interface_name/number gratuitous-arp-reply

The interface can be a physical interface, logical interface, interface group, SVI or IRB.

To enable GARP on Cisco IOS – use interface command “ip gratuitous-arps“.

References:
https://technet.microsoft.com/en-us/library/cc957526.aspx
http://www.juniper.net/techpubs/en_US/junos13.2/topics/usage-guidelines/interfaces-configuring-gratuitous-arp.html
http://www.cisco.com/web/techdoc/dc/reference/cli/nxos/commands/l3/ip_arp_gratuitous.html