Thursday, March 28, 2013

VMware Distributed Power Management (DPM)

One of the features the Enterprise license brings to VMware vSphere is Distributed Power Management (DPM). This feature saves power by dynamically adjusting cluster capacity to match the current workload. DPM powers hosts in the cluster off and on based on each host's load average. Before a host is powered off, its VMs are automatically consolidated onto the remaining hosts.

For powering ESXi hosts back on, DPM uses one of the following technologies: iLO, IPMI or WoL. WoL packets (magic packets) are sent over the vMotion interface by another host in the cluster. DPM puts hosts into so-called "standby" by actually placing the host in the ACPI S5 power state. In this state power consumption is minimal, no user-mode or system code runs, and the system's context is not preserved. Some components keep running so the host can be started again. This state is also called "soft off".
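The magic packet itself has a simple, well-known structure: 6 bytes of 0xFF followed by the target NIC's MAC address repeated 16 times, usually sent as a UDP broadcast. The sketch below builds and sends such a packet using only the Python standard library (the MAC address shown in the usage note is a made-up example, not one from this article):

```python
import socket

def build_magic_packet(mac: str) -> bytes:
    """A WoL magic packet: 6 bytes of 0xFF followed by the target
    MAC address repeated 16 times (102 bytes in total)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac: str,
                      broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    # Sent as a UDP broadcast; in a DPM cluster the equivalent frame
    # is emitted on the vMotion network by a powered-on peer host.
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, port))
```

For example, `send_magic_packet("00:50:56:ab:cd:ef")` would wake a standby host whose NIC has that (hypothetical) MAC address, provided WoL is enabled in the NIC and BIOS.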

DPM evaluates CPU and memory utilization for each host in the cluster and tries to keep each host within a specific range. By default, the utilization range is 45% to 81%. The range is computed from two settings:
DemandCapacityRatioTarget = utilization target of the host - by default it is 63%
DemandCapacityRatioToleranceHost = variation around utilization target - by default it is 18%
Utilization range = DemandCapacityRatioTarget +/- DemandCapacityRatioToleranceHost
The defaults can be changed in DRS - Advanced settings: DemandCapacityRatioTarget accepts values between 40% and 90%, and DemandCapacityRatioToleranceHost between 10% and 40%.
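The computation above can be sketched in a few lines of Python, including the bounds the advanced settings accept (the function name is mine, not a vSphere API):

```python
def dpm_utilization_range(target: float = 63.0, tolerance: float = 18.0):
    """Return (low, high) utilization bounds:
    DemandCapacityRatioTarget +/- DemandCapacityRatioToleranceHost."""
    if not 40 <= target <= 90:
        raise ValueError("DemandCapacityRatioTarget must be between 40 and 90")
    if not 10 <= tolerance <= 40:
        raise ValueError("DemandCapacityRatioToleranceHost must be between 10 and 40")
    return (target - tolerance, target + tolerance)
```

With the defaults, `dpm_utilization_range()` returns `(45.0, 81.0)`, the 45%-81% band mentioned above.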

DPM uses two time intervals for evaluating power-on and power-off recommendations. For power-on the period is 300 seconds (5 minutes), while for power-off the period is 2400 seconds (40 minutes). This means DPM treats responding to increased load as more important than saving power. However, it also means that a sudden increase in load will only be considered by DPM after 5 minutes and will be resolved once the host boots up, which may add another 5 to 10 minutes. The values can be changed by setting the parameters VmDemandHistorySecsHostOn (default 300 seconds) and VmDemandHistorySecsHostOff (default 2400 seconds) to a value between 0 and 3600 seconds.
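To illustrate why the asymmetric windows make DPM react faster to rising load, here is a deliberately simplified decision sketch. It is not VMware's actual algorithm (DPM's real cost/benefit analysis is far more involved); it just averages recent utilization samples over a short window for power-on and a long window for power-off:

```python
from statistics import mean

def dpm_action(samples, low=45, high=81,
               on_window=300, off_window=2400, sample_interval=20):
    """samples: per-interval cluster utilization percentages, newest last.
    Power-on looks at a short (300 s) history, power-off at a long
    (2400 s) one, so a load spike triggers power-on much sooner than
    a lull triggers power-off. Illustrative only."""
    n_on = on_window // sample_interval    # e.g. 15 samples
    n_off = off_window // sample_interval  # e.g. 120 samples
    if len(samples) >= n_on and mean(samples[-n_on:]) > high:
        return "power-on host"
    if len(samples) >= n_off and mean(samples[-n_off:]) < low:
        return "power-off host"
    return "no action"
```

With 20-second samples, 5 minutes of sustained high load (15 samples above 81%) already yields a power-on recommendation, while a power-off requires 40 minutes of history below 45%.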

DPM will ensure that at least one host is left running. The settings MinPoweredOnCpuCapacity (default 1 MHz) and MinPoweredOnMemCapacity (default 1 MB) control how many hosts are left running. The default values guarantee a minimum of one host stays up and can be changed. For example, if the cluster's hosts are configured with 24 GHz and 128 GB each, setting the parameters to 24001 MHz and 131073 MB will keep 2 hosts running at all times. Even with the default values, when HA is enabled on the cluster, DPM will leave 2 hosts powered on to provide failover resources in case one host fails.
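The arithmetic behind the example can be made explicit: each reserved capacity is divided by a single host's capacity and rounded up, and the larger of the two results wins. A small sketch, assuming identical hosts (24 GHz / 24000 MHz and 128 GB / 131072 MB, as in the example above):

```python
import math

def hosts_kept_powered_on(min_cpu_mhz=1, min_mem_mb=1,
                          host_cpu_mhz=24000, host_mem_mb=131072):
    """Minimum number of identical hosts DPM must leave powered on to
    satisfy MinPoweredOnCpuCapacity and MinPoweredOnMemCapacity.
    Illustrative helper, not a vSphere API."""
    by_cpu = math.ceil(min_cpu_mhz / host_cpu_mhz)
    by_mem = math.ceil(min_mem_mb / host_mem_mb)
    return max(by_cpu, by_mem, 1)
```

With the defaults this returns 1; with 24001 MHz and 131073 MB it returns 2, because one extra MHz (or MB) beyond a single host's capacity forces a second host to stay up.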

For more details about how DPM selects hosts for power-off and performs its cost/benefit analysis, I strongly recommend the book VMware vSphere Clustering Deepdive and the white paper VMware Distributed Power Management: Concepts and Usage.

Configuring DPM

Before enabling DPM on a cluster, it is best to test that hosts can be powered off and started. This can be done from the vSphere client by putting a host in standby. First, check that the network adapter supports WoL:

Put the host in standby and confirm the migration of all powered-off and suspended VMs to other hosts in the cluster:

After the host is powered off, bring it back from the vSphere client:

After all hosts in the cluster have been tested, one last check can be done by going to Cluster - vSphere DRS - Host Options and verifying that each host succeeded in exiting standby:

Next, enable DPM and choose one of the three power management policies:

If DPM behavior needs to be controlled, a very good idea is to have it enabled during non-business hours and disabled during business hours. Set power management to Automatic and add two scheduled tasks of type Change cluster power settings in vSphere: one to enable DPM in the evening (19:00) and another to disable DPM in the morning (07:00). This way, during business hours all hosts run at full capacity, and during the night the data center takes full advantage of the power savings.
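The schedule above is just a time-of-day window that wraps past midnight. A small sketch of the intended logic (the function is illustrative; in practice the two vSphere scheduled tasks do this for you):

```python
from datetime import time

def dpm_should_be_enabled(now: time,
                          enable_at: time = time(19, 0),
                          disable_at: time = time(7, 0)) -> bool:
    """True during the overnight window: from enable_at (19:00)
    until disable_at (07:00) the next morning."""
    # The window crosses midnight, so it is the union of two ranges.
    return now >= enable_at or now < disable_at
```

At 20:00 or 06:30 DPM would be enabled; at noon, with every host needed for the business-hours load, it would be disabled.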
