Sunday, December 18, 2022

vCenter Server 8 Upgrade - Unknown Host Error

I recently upgraded my vCenter Server from 7.0.3 to 8.0 and ran into the following error during the process: Error in method invocation [Errno 1] Unknown host

The error is caused by using an IP address instead of an FQDN, as shown in the image below. It appears in stage 2, after the VCSA VM has been deployed in the environment.

To avoid it, just use FQDNs everywhere. This error was first reported for upgrades from 6.7 to 7.0, but in my lab it also showed up on the upgrade from 7.0 to 8.0.
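
A quick way to double-check the values before starting the upgrade is to flag anything that parses as an IP address. The snippet below is only a minimal sketch: the function name Test-IsIpLiteral, the field names and the sample values are all made up for illustration and are not taken from the upgrade wizard.

```powershell
# Hypothetical helper: returns $true when the value parses as an IP address literal.
function Test-IsIpLiteral {
    param([Parameter(Mandatory)][string]$Value)
    # -as yields $null when the string cannot be converted to an IPAddress
    return $null -ne ($Value -as [System.Net.IPAddress])
}

# Placeholder values standing in for what would be typed into the upgrade wizard.
$wizardInputs = [ordered]@{
    'Source vCenter Server' = '192.168.1.10'
    'Target ESXi host'      = 'esx01.lab.local'
}

# Collect the fields that still hold an IP address instead of an FQDN.
$ipFields = @($wizardInputs.GetEnumerator() |
    Where-Object { Test-IsIpLiteral -Value $_.Value } |
    ForEach-Object { $_.Key })

foreach ($field in $ipFields) {
    Write-Warning "$field is set to an IP address; use the FQDN instead."
}
```

With the sample values above, only the first field is flagged. The check says nothing about whether the FQDN actually resolves, so forward and reverse DNS still need to be verified separately.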

Thursday, December 8, 2022

VMUG Leader Summit 2022 - How The Times Have Changed

It's been 32 months since the last summit took place, and how the times have changed. I'm not sure anyone in February 2020 would have dared to imagine the shifts that were to come a few weeks later and that would keep on coming throughout the next two and a half years. And the changes were not only social and political, but also technological. The status quo of human society was pretty much ignored and we had to learn, adapt and change.

This week in Lisbon I had the opportunity to meet again the great community of people from all corners of the world called VMUG. The people in it are what make this whole idea great. In 2020 it was the social part that impressed me too. Technologies come and go, and trends may shift (that word again) along the way. The people involved in the process are the most important.

Speaking of people, we had (as always) a number of select guests from VMware to talk to us. From Joe Baguley's (CTO EMEA) talk about the skills gap and how AI and robots are slowly making it to the masses and into your house, to Duncan Epping's journey throughout VMware and the idea that everyone wants to be something, but very few want to make the effort to become that something, it all revolved around the changes that we are living through and need to face. And it is scary, especially since the buzzword, without any relation to the event, was AI, and more exactly ChatGPT, an OpenAI project that was trained to interact with humans in conversational natural language. The ease of use and accessibility of this AI model are amazing and it will change a lot. But it is still only a tool in our hands, and the way we use it will make a difference. Only two years ago a talk on this subject would have brought a lot of smiles.

I am a techie, not really great with words, so let me wrap it up. As humans we need certainty, connection and contribution. VMUG is a community that brings all of this and more. And it does it through its members, the people that change, learn and grow along with the community.

Thursday, December 1, 2022

What I've Learned From Using Instant Clones in vSphere

Instant clone is a technology that creates a powered-on VM using another running VM as its source. An instant clone VM shares memory and disk state with its source VM. Once it is powered on, the instant clone is a fully manageable, independent vCenter Server object. The clones can be customized and have unique MAC addresses and UUIDs. This makes the technology very appealing for use cases where a large number of VMs need to be created in a short time from a controlled point in time - think about VDI.

My use case was on-demand labs generated from the same lab template(s). A lab template is made of 3 to 6 VMs of different sizes running interdependent applications. Users log in to a web app and then request one or more new labs from the available templates. The web app then starts lab provisioning in the background for all the requests via vCenter Server.

Using full clones would have meant a higher load on the systems and also a longer wait for a lab to be ready - the boot time of all the VMs in the cloned lab plus the time for services to start in the guest OS of each VM. Additionally, there was no information on how many labs would be requested at a time. There were also multiple source lab templates, giving a worst-case scenario of tens to hundreds of VMs being requested within a minute. I chose instant clones as the way forward.

When using instant clone there are 2 provisioning workflows: running source VM and frozen source VM, as seen in the picture below, taken from the Understanding Clones in vSphere 7 performance study published by VMware.

In the running source VM workflow, a temporary stun is initiated so that the VM can be checkpointed and the delta disks created. Then the source returns to its running state. Each new instant clone depends on the shared delta disk, potentially hitting the vSphere limit of 255. These delta disks are redo logs and are not tied to the snapshot chain, hence they are not visible in the UI. The limit for a supported snapshot chain in vSphere is still 32. If the limit is hit, cloning will fail as described in KB article 67186. To avoid this limitation, you could use the frozen source VM provisioning workflow, in which the source is frozen and no longer running, and the delta disks are only created for the child VMs.

Since the lab templates were actually running different services that did not cope very well with being frozen for longer periods of time, I used the running source VM workflow. To create the clones I borrowed and adapted the code from William Lam's instant clone PowerCLI module (thank you!). He also has some very good articles on the technology.

What I did not realize at the time is that this would impact the performance of the labs once the number of delta disks increased. The cloned labs were temporary by nature and were removed after a specific run time. However, the delta disks on the source VMs were not cleaned up and just kept increasing, which in the end impacted user experience. So I needed to introduce a cleanup mechanism.

The simplest way to clean up the source VMs was based on an idea that I got from Veeam Snapshot Hunter: create a snapshot of the lab template VMs (source VMs) and then immediately initiate a delete all command. This cleans up all the delta disks from the source VMs. The PowerCLI script runs nightly as a scheduled job.

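# Name pattern matching the lab template (source) VMs
$labPrefix = "lab-1-*"
$vms = Get-VM -Name $labPrefix
foreach ($vm in $vms) {
    $snapTime = Get-Date -Format "MM/dd/yyyy HH:mm"
    $description = $vm.Name + " " + $snapTime
    # Take a temporary snapshot (including memory) of the running source VM
    New-Snapshot -VM $vm -Name "delta disk cleanup" -Description $description -Memory:$true -Confirm:$false
    # Remove it right away, together with any child snapshots
    Get-Snapshot -VM $vm -Name "delta disk cleanup" | Remove-Snapshot -RemoveChildren -Confirm:$false
}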

The plan is to test the Vim.VirtualMachine.PromoteDisk(unlink=True) method in the future.

A few takeaways:
- instant clone is a very fast cloning technology and it also optimizes resource usage (memory, disk)
- if the number of cloned VMs from the same source is very large (> 200), use the frozen source VM workflow
- when using the running source VM workflow, make sure to include a cleanup mechanism for the delta disks
- time synchronization in the source VMs is very important (as always)
- if you need full performance, use full clones