VMware ESXi Host Memory Management, Monitoring, Alert Notification – Part 1

When it comes to VMware memory monitoring, there are two things to monitor: (i) ESXi host memory and (ii) VM memory. There are a bunch of memory-related terms and calculations in this space. This post covers host memory monitoring:

- understanding physical memory usage monitoring
- the right memory counter to monitor and alert on for an ESXi host
- the right gauge (thresholds) for memory monitoring and alert notification on an ESXi host

I will also set up a Nagios check plugin to monitor the above, with performance data for graphing (Part 2).

Before moving forward, let's have a look at the Mem.MinFreePct setting. It controls how much host memory should be kept free and when the hypervisor should kick off advanced memory reclamation techniques such as ballooning, compression and swapping.

[Figure: Mem.MinFreePct under Configuration > Advanced Settings > Mem]

Based on free host memory and the reclamation techniques in use, there are four (4) different states of host memory utilization:

State name | Memory reclamation technique | Good or bad | Note
High | Transparent Page Sharing (TPS) is always running in this state. This is the default behaviour. | Good: this is normal | Defined by the Mem.MinFreePct setting. Disabling TPS is not recommended.
Soft | The host activates memory ballooning. | Not good enough | 64% of Mem.MinFreePct. Physical memory is close to maxing out. If the host cannot return to the previous state by itself, take action to free up more memory.
Hard | The host starts memory compression and hypervisor-level swapping. | Bad: memory under stress | 32% of Mem.MinFreePct. Free up memory by migrating VMs to other hosts or by upgrading memory.
Low | The host no longer serves any pages to VMs. | Very bad: fix it ASAP | 16% of Mem.MinFreePct. This protects the host VMkernel layer from a Purple Screen of Death.

Prior to ESXi 5.x, Mem.MinFreePct (the High state threshold) was set to 6% by default. This means the host always keeps 6% of total physical memory free before activating advanced memory reclamation techniques. For example, an ESXi 4.x host with 64GB of memory needs at least 3.84GB free to stay in the High (normal) state.

Starting from ESXi 5.x, the default is no longer a flat 6%, because high-memory servers (512GB/768GB) are becoming common these days. 6% of 512GB is 30.72GB, which is a huge amount of memory to keep free.
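To put numbers on the flat rule, here is a minimal Python sketch of the arithmetic above. The function name `flat_min_free_gb` is only illustrative and is not part of any VMware tool.

```python
def flat_min_free_gb(total_gb, pct=6.0):
    """Free memory (in GB) kept in reserve under a flat percentage rule."""
    return total_gb * pct / 100.0


# Host sizes taken from the examples in the text.
for total_gb in (64, 512, 768):
    print(f"{total_gb}GB host: {flat_min_free_gb(total_gb):.2f}GB kept free")
```

The reserve grows linearly with host size, which is why a flat percentage becomes wasteful on large hosts.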

The new calculation is a sliding scale, as follows:

Free memory threshold | Range | Calculation / note
6% | First 0GB to 4GB | 6% of 4GB
4% | From 4GB to 12GB (12-4=8) | 4% of 8GB
2% | From 12GB to 28GB (28-12=16) | 2% of 16GB
1% | Remaining memory above 28GB | 1% of the remainder, i.e. 36GB if the total size is 64GB (64-28=36), or 68GB if the total size is 96GB (96-28=68)

Based on the above, on a system with 128GB of memory, the minimum free memory required to be in the "High" state is calculated as follows:

i. 6% of the first 4GB: 245.76MB (range 0-4GB)
ii. 4% of the next 8GB: 327.68MB (range 4-12GB)
iii. 2% of the next 16GB: 327.68MB (range 12-28GB)
iv. 1% of the remaining 100GB: 1024MB (range 28-128GB)
v. Total: 1925.12MB (245.76+327.68+327.68+1024).
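The tiered calculation above can be sketched in Python as follows. This is only an illustration of the arithmetic described in this post, not VMware's actual implementation; the names `TIERS` and `min_free_mb` are made up for the example.

```python
# (tier size in MB, percent kept free); None means "all remaining memory".
TIERS = [
    (4 * 1024, 6.0),   # first 0-4GB
    (8 * 1024, 4.0),   # 4-12GB
    (16 * 1024, 2.0),  # 12-28GB
    (None, 1.0),       # everything above 28GB
]


def min_free_mb(total_mb):
    """Minimum free memory (MB) for the High state under the sliding scale."""
    remaining = total_mb
    reserved = 0.0
    for size_mb, pct in TIERS:
        if remaining <= 0:
            break
        # Take either the whole tier or whatever memory is left.
        chunk = remaining if size_mb is None else min(remaining, size_mb)
        reserved += chunk * pct / 100.0
        remaining -= chunk
    return reserved


print(round(min_free_mb(128 * 1024), 2))  # 128GB host from the example: 1925.12
```

The same function gives about 1269.76MB for a 64GB host, compared with 3.84GB under the old flat 6% rule.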


Based on the above, we can set up monitoring and alert notification for a 128GB host as follows:

Mem state | Min free mem | Monitoring action | Calculation
High | 1925.12MB | No action required | Based on the above
Soft | 1232.0768MB | Warning alert | 64% of Mem.MinFreePct
Hard | 616.0384MB | Critical alert | 32% of Mem.MinFreePct
Low | 308.0192MB | Critical alert | 16% of Mem.MinFreePct
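The per-state values can be derived from the High-state figure with a few lines of Python. This is a rough sketch: `STATE_FRACTIONS` and `state_thresholds_mb` are illustrative names, and the percentages are the ones quoted in this post.

```python
# Fraction of the Mem.MinFreePct amount that defines each state.
STATE_FRACTIONS = {"high": 1.00, "soft": 0.64, "hard": 0.32, "low": 0.16}


def state_thresholds_mb(min_free_mb):
    """Map each memory state to its minimum free memory in MB."""
    return {state: min_free_mb * frac for state, frac in STATE_FRACTIONS.items()}


# 1925.12MB is the High-state value worked out for the 128GB host.
for state, mb in state_thresholds_mb(1925.12).items():
    print(f"{state:<5} {mb:.4f}MB")
```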

Also, in the "Hard" state the memory performance counter "Swap used" will be greater than 0. This condition should trigger an alarm as well.
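Putting the thresholds and the swap condition together, the alert decision could look roughly like the sketch below. It rests on two assumptions that are mine rather than VMware's: a state applies once free memory drops below the minimum of the state above it, and any non-zero "Swap used" is treated as critical. The function name `alert_level` is hypothetical.

```python
def alert_level(free_mb, swap_used_mb, min_free_mb):
    """Return 'OK', 'WARNING' or 'CRITICAL' for an ESXi host.

    free_mb      -- current free host memory in MB
    swap_used_mb -- value of the "Swap used" counter in MB
    min_free_mb  -- High-state minimum free memory (e.g. 1925.12 for 128GB)
    """
    # Assumption: any hypervisor swapping is alarming on its own.
    if swap_used_mb > 0:
        return "CRITICAL"
    if free_mb >= min_free_mb:          # High state: no action required
        return "OK"
    if free_mb >= 0.64 * min_free_mb:   # Soft state: ballooning
        return "WARNING"
    return "CRITICAL"                   # Hard or Low state


print(alert_level(free_mb=4096, swap_used_mb=0, min_free_mb=1925.12))
print(alert_level(free_mb=1500, swap_used_mb=0, min_free_mb=1925.12))
print(alert_level(free_mb=700, swap_used_mb=0, min_free_mb=1925.12))
```

The three return values map directly onto the OK, WARNING and CRITICAL states used by Nagios, which keeps the check logic easy to reuse in a plugin.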


[Figure: esxtop, memory in high state]

References:
http://blogs.vmware.com/vsphere/2012/05/memminfreepct-sliding-scale-function.html