Thursday, May 24, 2012

PowerCLI - Change virtual NIC device connectivity

More on manipulating VM NICs using PowerCLI. The connectivity of a VM's NIC can be controlled with the following flags:
connected - whether the NIC is currently connected to the port group (valid only for powered-on VMs)
startConnected - whether the device will be connected when the VM starts
allowGuestControl - whether the guest OS is allowed to connect or disconnect the device

You can do this with the vSphere Client, but when more than one VM is involved the following code is better:

$vmlist = Import-Csv "H:\Scripts\vmlist.txt"

foreach ($row in $vmlist) {
    foreach ($netif in Get-NetworkAdapter $row.VMname) {
        if ($netif.NetworkName -ne "pg_vm_admin") {
            Set-NetworkAdapter -NetworkAdapter $netif -StartConnected:$false -Confirm:$false
        }
    }
}

The code takes a list of powered-off VMs and, for each one, gets its network adapters. If a network adapter is connected to any network except the administration one, it is set to start disconnected. After that I could power on the VMs to see what I was dealing with.

Another use is to check the power state of the VM and, if the VM is on, disconnect it from the network:

$vmlist = Import-Csv "H:\Scripts\vmlist.txt"

foreach ($row in $vmlist) {
    $vm = Get-VM $row.VMname

    if ($vm.PowerState -eq "PoweredOn") {
        foreach ($netif in Get-NetworkAdapter $vm) {
            if ($netif.NetworkName -ne "pg_vm_admin") {
                Set-NetworkAdapter -NetworkAdapter $netif -Connected:$false -Confirm:$false
            }
        }
    }
}

Wednesday, May 23, 2012

NFS traffic rate limiting on Juniper switches

"And now for something completely different"... 

The configuration of the ESXi infrastructure I worked with is a bit tricky: the VMs are hosted on an NFS filer, but the same VMs also mount NFS exports from that same filer. Much like in the picture below:

The physical bandwidth between the access switch where the ESXi hosts are connected and the core switches where the filer is connected is limited. So I had a lot of ESXi servers, and a lot more VMs, competing over the same physical links. Luckily, I had control over the access switches - Junipers. The next step was elementary: ensure enough bandwidth for ESXi and give some to the VMs - rate limiting using firewall filters in JunOS.

And this is how it was done. First, a policer was created for NFS traffic coming from the VMs (guest OS) which limits the allocated bandwidth - in my case 200 Mbit/s with a burst size of 10 MB. When the limit is exceeded, packets are discarded. This achieves two things: a decent 200 Mbit/s of bandwidth is ensured for the VMs, and small files (up to 10 MB) are transferred to the filer very fast (no limits). When the VMs demand a lot of resources, the policer steps in and ensures that the critical ESXi vmk traffic gets its share.

[edit firewall]
set policer policer-NFS-1 if-exceeding bandwidth-limit 200m burst-size-limit 10m
set policer policer-NFS-1 then discard
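Conceptually, a policer like this is a token bucket: tokens refill at the configured rate, the bucket holds at most the burst size, and packets that find enough tokens pass while the rest are discarded. Here is a toy Python model of those rate/burst semantics (a sketch for illustration only - not how JunOS implements policing):

```python
# Toy single-rate policer (token bucket) illustrating the semantics of a
# bandwidth-limit/burst-size-limit policer: traffic within the rate passes,
# bursts are absorbed up to the bucket size, excess is discarded.
class Policer:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8      # token refill rate, in bytes per second
        self.capacity = burst_bytes   # bucket size in bytes (the burst limit)
        self.tokens = burst_bytes     # start with a full bucket
        self.last = 0.0               # timestamp of the last packet seen

    def offer(self, now, packet_bytes):
        """Return True if the packet conforms, False if it would be discarded."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False

# 200 Mbit/s with a 10 MB burst, as in the policer above
p = Policer(rate_bps=200_000_000, burst_bytes=10_000_000)
print(p.offer(0.0, 10_000_000))  # a full 10 MB burst fits in the bucket -> True
print(p.offer(0.0, 1500))        # bucket now empty, no time elapsed -> False
print(p.offer(1.0, 1500))        # one second later the bucket has refilled -> True
```

This is why small files go through at line rate: as long as they fit in the 10 MB bucket, the policer never kicks in.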

Then, the firewall filter is created. The filter matches all traffic that goes to the IP address of the Filer and applies the policer to it:

[edit firewall family inet]
set filter limit-vlan100-NFS term term-1 from destination-address <filer-ip>/32
set filter limit-vlan100-NFS term term-1 then policer policer-NFS-1
set filter limit-vlan100-NFS term term-default then accept

Last, the firewall filter is applied to the interface - in this case VLAN 100:

set interfaces vlan unit 100 family inet filter input limit-vlan100-NFS

The downside is that firewall filtering adds some load on the switch CPUs. Care should be taken when implementing such solutions (as always).

Monday, May 21, 2012

Virtual port group details

I know there are tools that get all this info, but not in the form I needed, and when and where I needed it - during vmnic migration and port group reconfiguration. The idea was to redistribute traffic across different vmnics while keeping failover at the vSwitch level. To check the status of the port groups on each host, I used the following:


$report = @()
Get-VMHost | foreach {
    $hostName = $_.Name
    foreach ($vsw in Get-VirtualSwitch -VMHost $_) {
        foreach ($vpg in Get-VirtualPortGroup -VirtualSwitch $vsw) {
            $vpgnicteaming = Get-NicTeamingPolicy -VirtualPortGroup $vpg
            $row = "" | Select hostName, vpgName, vpgInherit, vpgAN, vpgSN, vpgUN, vpgFailover, vpgLB

            $row.hostName = $hostName
            $row.vpgName = $vpgnicteaming.VirtualPortGroup.Name
            $row.vpgInherit = $vpgnicteaming.IsFailoverOrderInherited
            $row.vpgAN = $vpgnicteaming.ActiveNic
            $row.vpgSN = $vpgnicteaming.StandbyNic
            $row.vpgUN = $vpgnicteaming.UnusedNic
            $row.vpgFailover = $vpgnicteaming.NetworkFailoverDetectionPolicy
            $row.vpgLB = $vpgnicteaming.LoadBalancingPolicy

            $report += $row
        }
    }
}
$report

What I am looking for is, for each virtual port group, the active, standby and unused interfaces, and the failover and load balancing policies in place. After running the script I got the following listing (reduced here):

hostName    : testbox.esxi
vpgName     : VM Network
vpgInherit  : True
vpgAN       : {vmnic0}
vpgSN       :
vpgUN       :
vpgFailover : LinkStatus
vpgLB       : LoadBalanceSrcId

hostName    : testbox.esxi
vpgName     : vmk_vMotion
vpgInherit  : True
vpgAN       : {vmnic1}
vpgSN       :
vpgUN       :
vpgFailover : LinkStatus
vpgLB       : LoadBalanceSrcId

My setup separates traffic at the vmnic level - vmnic0 is used for VM traffic, while vmnic1 is used for vMotion. Both vmnic0 and vmnic1 are attached to the same vSwitch.

Thursday, May 17, 2012

Automate VM deployment from templates

The title is a little pretentious, but the next piece of PowerCLI does the job well enough. The only things I have to do are have the templates ready and modify the input file.
The script takes a csv file as input, parses each row, and initializes the variables used by the New-VM PowerCLI cmdlet:

$csv = Import-Csv "deploy_from_template.csv"
foreach ($row in $csv) {
    $vmname = $row.VmName
    $respool = $row.ResPool
    $location = $row.Location
    $datastore = $row.Datastore
    $template = $row.Template
    Write-Host "deploying $vmname"
    New-VM -Name $vmname -ResourcePool $respool -Location $location -Datastore $datastore -Template $template -RunAsync
}

-RunAsync allows the script to move on to the next line without waiting for the current task to end. So use it wisely.

The csv file has the following structure - the column names are the ones the script reads - and you can put any other parameters in it and modify the script accordingly:

VmName,ResPool,Location,Datastore,Template

Tuesday, May 15, 2012

PowerCLI - ESXi network interfaces

I added new gigabit NICs to all the ESXi servers, and before making any modifications to the traffic flows I wanted to verify that the network was up and running. So, the following script came in handy:

### Get pnics and their link speed

Get-VMHost | foreach {
    $row = $_.Name + " "
    $pnics = $_.NetworkInfo.ExtensionData2.NetworkInfo.Pnic
    foreach ($p in $pnics) {
        $row += $p.Device + " " + $p.LinkSpeed.SpeedMB + " "
    }
    $row
}

After finishing all the reconfigurations of ESXi networking - mainly redistributing traffic flows onto the newly installed NICs - I felt the need to take a quick look over the deeds just done. The idea was to check that the ESXi hosts had the correct vmk interfaces and the correct IP addresses. So, another small bit of scripting:

###Get vmk interfaces and IP addresses for all hosts
foreach ($vmh in Get-VMHost) {
    $row = ""
    $row += $vmh.Name + " "
    foreach ($i in $vmh.NetworkInfo.ExtensionData2.NetworkConfig.Vnic) {
        $vmk = $i.Device
        $row += $vmk + " "
        $vmkip = $i.Spec.Ip.IpAddress
        $row += $vmkip + " "
        $vmksubnet = $i.Spec.Ip.SubnetMask
        $row += $vmksubnet + " "
    }
    $row
}

Monday, May 14, 2012

Data does shrink from one storage to another

The transfer of mailboxes from EMC Centera to NetApp ended up with an interesting result: the data occupied 5 GB less on the NetApp filer. No dedup or compression, so block size was the main suspect. After digging around a bit, here is the proof:

- Centera's UxFS has a default block size of 8 kB (and yes, we use the default)
- NetApp's WAFL has a fixed block size of 4 kB
- out of 4.4 million files, 30% are under 4 kB - about 1.3 million files

Now, let's try a simple estimation. Each of the 1.3 million small files occupies a full 8 kB block on EMC; when moved to NetApp, it gets a single 4 kB block, saving 4 kB per file. So the freed-up space is roughly 1.3 million x 4 kB ~ 4.96 GB - right in line with the 5 GB observed.
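The estimation above can be sanity-checked with a few lines of Python (the helper function is mine, just for the arithmetic):

```python
# Back-of-the-envelope check of the space freed by moving small files
# from an 8 KiB-block filesystem (UxFS) to a 4 KiB-block one (WAFL).
KIB = 1024

def freed_space_gib(n_small_files, old_block_kib=8, new_block_kib=4):
    """Each file smaller than the new block size saves the difference
    between one old-filesystem block and one new-filesystem block."""
    saved_bytes = n_small_files * (old_block_kib - new_block_kib) * KIB
    return saved_bytes / (1024 ** 3)

print(round(freed_space_gib(1_300_000), 2))  # prints 4.96
```

Close enough to the 5 GB difference seen on the filer; the files between 4 kB and 8 kB account for the rest.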

Sunday, May 13, 2012

Rsync cyrus volumes from EMC to NetAPP

A couple of weeks ago I was faced with the following problem: clean up the EMC filer that hosts one of our client's cyrus e-mail mailboxes. Putting it all on paper, I got the following:
- there were more than 3 million files and folders
- the only available protocol is NFS
- the bandwidth is 1 Gbit
- the source filer is EMC, the destination is NetApp

The chosen option was a copy-paste operation from one filer to the other, with a service interruption to avoid synchronization problems. Two questions were raised: how long would it take, and which tool should be used?

After some parking lot meetings and discussions, the shortlist of tools included cp, rsync and cpio. First things first: speed tests. I generated a workload of 50000 files with the following composition: 40% @ 1 kB, 20% @ 2 kB, 20% @ 3 kB and 20% @ 10 kB, using for and dd:

### first 20000 files of 1 kB; repeat with adjusted seq ranges and bs for the other sizes
for i in $(seq 1 1 20000); do dd if=/dev/zero of=/tmp/FILER/vol99/$i count=1 bs=1024 2>/dev/null; done

Next, the tests, using time command:
### cp
time  cp -a vol99/ /tmp/DEST/vol99

### rsync
time rsync -a vol99/ /tmp/DEST/vol99/ 

### cpio 
time find . -print -depth | cpio -padmuv /tmp/DEST/vol99/

cp -a does the same thing as cp -dpR - preserves links, permissions and is recursive. rsync -a does the same as rsync -rlptgoD and I was mainly interested to ensure recursive transfer, preservation of permissions, modification times, groups and owners. cpio -padmu ensures creation of directories where needed, preservation of mtime.

The fastest was cp, and it was considered the baseline. rsync was 7% slower, while cpio was 46% slower (because of the big find piped into it). Although the obvious choice seemed to be cp, rsync was chosen - mostly for its ability to resume interrupted transfers and verify checksums (which was never used). The speed tests also allowed me to approximate the transfer time of the real data at around 100 minutes, so I could go ahead with the formalities on the client side.

When rsync was tested against real data (around 400K files and directories), it produced a bit of a surprise. It took more than 15 minutes just to build its file list, and then it started a very slow transfer - after 1 hour it had not finished one volume. And we had 5. The main cause was the partition structure used by cyrus imap for the mail store and the large number of files. To get an idea: vol1 has 20 imap partitions, cyrus1 to cyrus20, and each cyrusN partition holds the mailbox structure from A to Z. The first rsync test was:
nohup rsync -a /SRC/vol1/ /DEST/vol1/ &
and it got stuck building the file list and transferred very slowly. The idea was to have a fast transfer, so for the second test rsync was run in parallel for each cyrus partition from 1 to 20:
nohup rsync -a /SRC/vol1/cyrus1/ /DEST/vol1/cyrus1/ &
nohup rsync -a /SRC/vol1/cyrus2/ /DEST/vol1/cyrus2/ &
...
nohup rsync -a /SRC/vol1/cyrus20/ /DEST/vol1/cyrus20/ &

This time it worked just as expected. In fact, during the operation the parallel rsyncs managed to raise CPU load on a NetApp 6280 by 10-15%.

One more thing: rsync's trailing-slash syntax is a bit tricky. If the slash at the end of the source directory is omitted, rsync creates the directory itself inside the destination, and you end up with a shifted folder structure: /DEST/vol99/cyrus20/cyrus20/.