
Thursday, January 5, 2023

Managing vSphere VM Templates with Packer

Packer is an open source tool developed by HashiCorp that lets you create identical machine images from a single source configuration. It helps in implementing and managing golden images across your organization. I will be using Packer only in a vSphere environment and will not be using its multi-platform support. The use case I am looking at is managing VM templates by applying infrastructure-as-code concepts.

The workflow I am implementing uses base VM templates made of a basic OS installation, VMware Tools and network connectivity. These base templates do not need any management except for periodic updates/patches. The base templates are then customized into project-specific templates using Packer. The process installs any given project customization, such as additional users, software packages or devices, and creates a new template to be used as the source for prod deployment. Packer will not replace a configuration management tool, but it will reduce the time to deploy and configure the prod (or running) instances. It is faster to have a prepped template than to wait for packages to install on each of your instances during prod deployment. The diagram below exemplifies the intended process:

In this workflow, Packer plays a crucial role, allowing fast and repeatable automation of the VM templates based on specific requirements. All credentials are kept in a dedicated secrets manager, HashiCorp Vault. I will not go into details about Vault; just keep in mind that it stores any credentials used by Packer. A new set of templates results at the end of the customization process, and these are used to run the prod instances.

Packer also ensures that any changes to the base VM template are tracked, written in a human-readable format, and can be repeated in any other infrastructure. Let's look at a simple example where we modify a CentOS 7 base template. For our project we will use the following folder structure:


There are 3 files:
  • variables.pkr.hcl - keeps all variable definitions
  • tmpl-linux.auto.pkrvars.hcl - keeps the initialized input variables and is loaded automatically at run time; this way only this file needs to change when moving to another environment
  • tmpl-linux.pkr.hcl - main Packer file 
Packer uses the HashiCorp Configuration Language (HCL). Let's look at the variables.pkr.hcl file contents:

variable "vcenter_server" {
  type        = string
  description = "FQDN or IP address of the vCenter Server instance"
}

variable "build_user" {
  type        = string
  description = "user name for build account"
}

locals {
    timestamp = regex_replace(timestamp(), "[- TZ:]", "") 
}

local "linux_user_pass" {
  expression = vault("/kv/data/linux_workshop", "${var.ssh_user}")
  sensitive  = true
}

local "build_user_pass" {
  expression = vault("/kv/data/build_user", "${var.build_user}")
  sensitive  = true
}

There are 2 types of variables - input variables and local variables. Input variables can be initialized from a default value, the command line, the environment or variable files (we are using the auto.pkrvars.hcl file for this). Local variables cannot be overridden at run time and can be viewed as a kind of constant. In the example above the real number of variables has been truncated to keep it readable. You can see input variables such as "vcenter_server" and "build_user". There are also 2 local variables - "timestamp", which is calculated from a function and used in our case in the note field of the VM, and "build_user_pass", which keeps the password for our build user and takes its value from the Vault secrets manager. The "build_user_pass" variable is marked as sensitive, which hides it from the output.
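
For reference, an input variable such as "vcenter_server" could also be set at run time from the command line or from the environment instead of the variable file; a couple of illustrative invocations (the value here is just a placeholder):

packer build -var "vcenter_server=vcsa.mylab.local" .
PKR_VAR_vcenter_server="vcsa.mylab.local" packer build .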

Next, let's look at the variable initialization file tmpl-linux.auto.pkrvars.hcl:

vcenter_server = "vcsa.mylab.local"
build_user = "build_user@vsphere.local"

We chose to initialize the variables from a separate file. In it, we just assign values to our input variables. If we need to modify any variable, this is the only place where we make the change, which makes it easier to manage. Again, for ease of reading, the file has been truncated.

Time to see what the tmpl-linux.pkr.hcl file contains. In the customization we'll apply to our template we are looking at two things:
  • add a new disk to the target image
  • install software packages in the target image

We'll look at each section of the Packer file. First we define the required plugins - in our case vsphere. This is also where you can make sure that a certain version is loaded.

packer {
  required_version = ">= 1.8.5"
  required_plugins {
    vsphere = {
      version = ">= v1.1.1"
      source  = "github.com/hashicorp/vsphere"
    }
  }
}

Next we define the source block, which holds the configuration needed by the builder plugin (the vsphere plugin loaded above).

source "vsphere-clone" "linux-vm-1" {
  
  # vcenter server connection
  vcenter_server      = "${var.vcenter_server}"
  insecure_connection = "true"
  username            = "${var.build_user}"
  password            = local.build_user_pass

  # virtual infrastructure where we build the templates
  datacenter          = "${var.datacenter}"
  host                = "${var.vsphere_host}"
  datastore           = "${var.datastore}"
  folder              = "Templates/${var.lab_name}"

  # source template name 
  template            = "${var.src_vm_template}"

  # build process connectivity 
  communicator        = "ssh"
  ssh_username        = "${var.ssh_user}"
  ssh_password        = local.linux_user_pass

  # target image name and VM notes  
vm_name = "tmpl-${var.lab_name}-${var.new_vm_template}" notes = "build with packer \n version ${local.timestamp} " # target image hardware changes disk_controller_type = ["pvscsi"] storage { disk_size = var.extra_disk_size disk_thin_provisioned = true disk_controller_index = 0 } convert_to_template = true }

In the source block we let the builder plugin know how to connect to vCenter Server, which virtual infrastructure to use (datastores, hosts), which source template we clone, how to connect to it and what configuration the target template should have. At the end we instruct the plugin to convert the newly created image to a VM template. Notice the communicator defined as "ssh". Communicators instruct Packer how to upload and execute scripts in the target image. Packer supports three: none, ssh and winrm. Please mind that some builders have their own communicators, such as the Docker builder.

With the current configuration we can actually define our build process. We've already accomplished half of our customization - adding the new disk is defined in the source block. In the build block we place the configuration needed by our build plugin. We will use a shell provisioner to install two packages - htop and tree. In my example, the shell provisioner is sufficient to do the job - it basically runs "yum install" in the target image. However, I would recommend using a proper configuration management tool such as Ansible instead of directly running commands.

build {
  sources = ["source.vsphere-clone.linux-vm-1"]

  provisioner "shell" {
    execute_command = "echo '${local.linux_user_pass}' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    inline = ["yum install tree htop -y"]
  }

}


Notice execute_command - this customizes how the inline commands are run, and we use it to pass the sudo password. The password itself is taken from the local variable, which is initialized with the value kept in the Vault secrets manager (as defined in variables.pkr.hcl).

The only thing left to do is to validate your configuration and run the build process.

packer validate .

packer build .

Please note that variable files in this post have been truncated for ease of reading. If you intend to use this example, you would need to fill in the missing variables and initialize them according to your environment. 

Thursday, December 1, 2022

What I've Learned From Using Instant Clones in vSphere

Instant clone is a technology for creating a powered-on VM using another running VM as its source. An instant clone VM shares memory and disk state with its source VM. Once it is powered on, the instant clone is a fully manageable, independent vCenter Server object. The clones can be customized and get unique MAC addresses and UUIDs. This makes the technology very appealing for use cases where large numbers of VMs need to be created in a short time from a controlled point in time - think VDI.

My use case was on-demand labs generated from the same lab template(s). A lab template is made of 3 to 6 VMs of different sizes running interdependent applications. Users log in to a web app and request one or more new labs from the available templates. The web app then starts lab provisioning in the background for all the requests via vCenter Server.


Using full clones would have meant a higher load on the systems and also a longer time to wait for a lab to be ready - the boot time of all the VMs in the cloned lab plus the time for services to start in the guest OS of each VM. Additionally, there was no information on how many labs would be requested at a time. There were also multiple source lab templates, with a worst-case scenario of tens to hundreds of VMs being requested within a minute. I chose instant clones as the way forward.

When using instant clone there are 2 provisioning workflows: running source VM and frozen source VM, as seen in the picture below taken from Understanding Clones in vSphere 7 performance study published by VMware.

In the running source VM workflow, the source is briefly stunned to checkpoint the VM and create the delta disks, then it returns to its running state. Each new instant clone adds a dependency on the shared delta disk, potentially hitting the vSphere limit of 255. These delta disks are redo logs and are not tied to the snapshot chain, hence they are not visible in the UI. The limit for a supported snapshot chain in vSphere is still 32. If the limit is hit, cloning will fail as described in KB article 67186. To avoid this limitation, you could use the frozen source VM provisioning workflow, in which the source is frozen (no longer running) and the delta disks are created only for the child VMs.

Since the lab templates were actually running different services that did not cope very well with being frozen for longer periods of time, I used the running source VM workflow. To create the clones I borrowed and adapted the code from William Lam's instant clone PowerCLI module found here (thank you!). He also has some very good articles on the technology.

What I did not realize at the time was that this would impact the performance of the labs once the number of delta disks increased. The cloned labs were temporary by nature and removed after a specific run time. However, the delta disks on the source VMs were not cleaned up and just kept increasing, which in the end impacted the user experience. So I needed to introduce a cleanup mechanism.
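
Since these redo logs are not visible in the UI, one way to keep an eye on how deep the disk chains of a source VM have grown is to read the VM's layoutEx data through PowerCLI. A minimal sketch, assuming an existing vCenter Server connection (the VM name is a placeholder):

# count the backing files behind each virtual disk of the source VM
$srcVm = Get-VM -Name "lab-1-template-db"
$srcVm.ExtensionData.LayoutEx.Disk | ForEach-Object {
    Write-Host "Disk key $($_.Key): $($_.Chain.Count) backing files in the chain"
}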

The simplest way to clean up the source VMs was based on an idea I got from Veeam's Snapshot Hunter: create a snapshot on the lab template VMs (the source VMs) and then immediately initiate a delete all command. This cleans up all the delta disks from the source VMs. The PowerCLI script would run nightly as a scheduled job.

$labPrefix = "lab-1-*"
$vms = Get-VM -Name $labPrefix
foreach ($vm in $vms) {
    $snapTime = get-date -Format "MM/dd/yyyy HH:mm"
    $description = $vm.Name + " " + $snapTime
    New-Snapshot -VM $vm -Name "delta disk cleanup" -Description $description -Memory:$true -Confirm:$false
    Get-Snapshot -VM $vm -Name "delta disk cleanup" |  Remove-Snapshot -RemoveChildren -Confirm:$false    
}

The plan is to test the VirtualMachine.PromoteDisks_Task(unlink=true) method in the future.
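
As a reference, a minimal PowerCLI sketch of what that call might look like - untested, and the VM name is a placeholder:

# unlink = $true copies/consolidates the shared delta disks; $null means all disks of the VM
$srcVm = Get-VM -Name "lab-1-template-db"
$srcVm.ExtensionData.PromoteDisks_Task($true, $null)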

A few takeaway points:
- instant clone is a very fast cloning technology and it also optimizes resource usage (memory, disk)
- if the number of VMs cloned from the same source is very large (> 200), use the frozen source VM workflow
- when using a running source VM, make sure to include a cleanup mechanism for the delta disks
- time synchronization in the source VMs is very important (as always)
- if you need full performance, use full clones

Thursday, August 26, 2021

VMworld 2021 - Sessions to watch


This year is going to be the second one in a row when I don't get to do my favorite autumn activity: go to VMworld in Barcelona. But I do get to do part of it - attend the virtual VMworld 2021. And to make it as close as possible to the real experience, I will most probably add some red Spanish wine and jamon on the side. 

As for the sessions I am looking forward to attending, here are a few of my choices:

VMware vSAN – Dynamic Volumes for Traditional and Modern Applications [MCL1084]

I've been involved recently in projects with Tanzu and vSAN, and this session with Duncan Epping and Cormac Hogan is the place to go to see how vSAN continues to evolve, to learn about new features and integration with Tanzu, and to hear some of the best practices.

The Future of VM Provisioning – Enabling VM Lifecycle Through Kubernetes [APP1564]

A session about what I think is one of the game changers introduced by VMware this year: including VM-based workloads in modern applications and using Kubernetes APIs to deploy, configure and manage them. I've been working with the VM Service since its official release in May and also wrote a small blog post about it earlier this month.

What's New in vSphere [APP1205]

This is one of the sessions I have never missed. vSphere is still one of the fundamental technologies for all other transformations. I am interested in finding out the latest capabilities, the customer challenges and real-world customer successes.

Automation Showdown: Imperative vs Declarative [CODE2786]

There is no way to miss Luc Dekens and Kyle Ruddy taking on the hot topic of imperative versus declarative infrastructure, explaining when and how you can and should use each approach, with practical examples.

Achieving Happiness: The Quest for Something New [IC1484]

I had the honor to meet Amanda Blevins at the VMUG Leaders Summit right before the world decided to close. Her presentation wowed the crowd and was one of the highest rated. So this is something that shouldn't be missed, especially since the pandemic has been around for 18 months and we all need to achieve some happiness.

There are hundreds of sessions and the areas they touch are so diverse that you can find your pick regardless of your interest in AI, application modernization, Kubernetes, security, networking, personal development or plain old virtualization. See you at VMworld 2021!

Thursday, December 26, 2019

Check ESXi MTU settings with PowerCLI

Sometimes the simplest tasks can be time consuming in large environments. This time it is about MTU settings on ESXi hosts.

First let's see a bit about MTU (Maximum Transmission Unit). It is a setting that defines the largest protocol data unit that can be sent across a network (largest packet or frame). The default setting is 1500 bytes. A bigger MTU increases performance for particular use cases, such as transmitting large amounts of data over Ethernet, so it has traditionally been set to larger values (9000 bytes) for accessing iSCSI and NFS storage. For a vSphere environment this means it could (and should in some cases) be increased for almost all types of traffic: vMotion, vSAN, Provisioning, FT, iSCSI, NFS, VXLAN, FCoE.
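
For reference, raising the MTU on the vSphere side can also be done with PowerCLI; a small sketch, where the switch, host and vmkernel adapter names are placeholders:

# set jumbo frames on a distributed switch and on a vmkernel adapter
Get-VDSwitch -Name "dvs-Data1" | Set-VDSwitch -Mtu 9000 -Confirm:$false
Get-VMHost -Name "esx-01.rio.lab" | Get-VMHostNetworkAdapter -VMKernel -Name "vmk1" |
    Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false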

Let's take the use case of accessing a NFS datastore, as seen in the picture below:

The biggest challenge with MTU is having the environment properly configured end-to-end. This means that when you want your ESXi host to use a large MTU for accessing an NFS datastore, you need to make sure that the distributed virtual switches, physical network interfaces, vmkernel portgroups, physical switches (at system level and per port) and filers are all configured with the proper MTU. What happens in our example when some elements are configured with the default MTU (1500)? If the vmkernel portgroup is set to 1500, you will see no performance benefit at all. If one of the physical switches is configured with 1500 bytes, you will get fragmentation of the packets (performance degradation).

Hoping this short theoretical intro to MTU was helpful, I will jump ahead to the topic: checking ESXi MTU with PowerCLI. Checking physical switches and storage devices is not treated in the present article.

At ESXi level we need to check 3 settings: distributed virtual switches, physical network interfaces (vmnics used for uplinks) and vmkernel portgroups. To accomplish this we make use of two different PowerCLI cmdlets: Get-EsxCli and Get-VMHostNetworkAdapter.

The beauty of Get-EsxCli is that it exposes esxcli commands and you can access the host through the vCenter Server connection (no root or direct login to the ESXi host is required). The not-so-nice part is that you have to use esxcli syntax in PowerCLI, as you will soon see.

Main checks

We will first look at the script checks. Please keep in mind that $h is a variable initialized with the Get-VMHost cmdlet.
  • distributed virtual switch - will get the switch name, configured MTU and used uplinks; the loop ensures all dvswitches are checked
(Get-EsxCli -VMHost $h -V2).network.vswitch.dvs.vmware.list.Invoke() | foreach {
    Write-Host "DVSName  $($_.Name) MTU  $($_.MTU) UPLINKS  $($_.Uplinks)"
}

  • vmnics - check configured MTU, admin and link status for each interface (there is no issue in having unused nics configured differently) 
(Get-EsxCli -VMHost $h -V2).network.nic.list.Invoke() |  foreach {
    Write-Host "NIC:"$_.Name "MTU:"$_.MTU "Admin:"$_.AdminStatus "Link:"$_.LinkStatus
}


  • vmkernel portgroups - check configured MTU and IP address
$vmks = $h | Get-VMHostNetworkAdapter | Where { $_.GetType().Name -eq "HostVMKernelVirtualNicImpl" }
foreach ($v in $vmks) {
    Write-Host "VMK $($v.Name) MTU $($v.MTU) IP $($v.IP)"
    }


The script

Putting it all together, we'll add the three checks in a foreach loop. The loop iterates through all the clusters and within each cluster through all the hosts. The script creates one log file per cluster containing all the hosts in that cluster and their details:


foreach ($cls in Get-Cluster) {
    $fileName = $cls.Name + ".log"
    Write-Host "# CLUSTER $($cls)" -ForegroundColor Yellow
    foreach ($h in $cls |   Get-VMHost) {
        Write-Host "$($h)" -ForegroundColor Yellow
        Add-Content -Path $fileName -Value "$($h)"

        (Get-EsxCli -VMHost $h -V2).network.vswitch.dvs.vmware.list.Invoke() | foreach {
            Write-Host "DVSName  $($_.Name) MTU  $($_.MTU) UPLINKS  $($_.Uplinks)"
            Add-Content -Path $fileName -Value "DVSName $($_.Name) MTU $($_.MTU) UPLINKS $($_.Uplinks)"
        }
        (Get-EsxCli -VMHost $h -V2).network.nic.list.Invoke() |  foreach {
            Write-Host "NIC:"$_.Name "MTU:"$_.MTU "Admin:"$_.AdminStatus "Link:"$_.LinkStatus
            Add-Content -Path $fileName -Value "NIC: $($_.Name) MTU: $($_.MTU) Admin: $($_.AdminStatus) Link: $($_.LinkStatus)"
        }
        $vmks = $h | Get-VMHostNetworkAdapter | Where { $_.GetType().Name -eq "HostVMKernelVirtualNicImpl" }
        foreach ($v in $vmks) {
            Write-Host "VMK $($v.Name) MTU $($v.MTU) IP $($v.IP)"
            Add-Content -Path $fileName -Value "VMK $($v.Name) MTU $($v.MTU) IP $($v.IP)"
         }
    }
}

Opening one of the log files, you will see output similar to the one below:
esx-01.rio.lab
DVSName dvs-Data1 MTU 9000 UPLINKS vmnic3 vmnic2 vmnic1 vmnic0
NIC: vmnic0 MTU: 9000 Admin: Up Link: Up
NIC: vmnic1 MTU: 9000 Admin: Up Link: Up
NIC: vmnic2 MTU: 9000 Admin: Up Link: Up
NIC: vmnic3 MTU: 9000 Admin: Up Link: Up
VMK vmk0 MTU 9000 IP 192.168.10.11
VMK vmk1 MTU 9000 IP 192.168.20.11
VMK vmk2 MTU 9000 IP 192.168.30.11

In this case everything looks good at the ESXi level. The easy part is over, so start digging into the physical switches' CLI and any other equipment along the path to ensure end-to-end MTU consistency.
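
One quick sanity check of the full path from the host itself is an oversized ping with the don't-fragment bit set, again through Get-EsxCli. A sketch only - the target IP and vmkernel interface below are placeholders:

# 8972 bytes of payload + 28 bytes of IP/ICMP headers = 9000 bytes on the wire
(Get-EsxCli -VMHost $h -V2).network.diag.ping.Invoke(@{
    host      = "192.168.30.1"
    interface = "vmk2"
    size      = 8972
    df        = $true
    count     = 3
})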

Wednesday, June 19, 2019

NSX-T Part 2 - Going Through Changes

The pop reference in the title came, in my mind, from the metal ballad world, but as I found out it can just as easily be hip-hop. However, this is where any reference to the musical world stops, as the following article focuses on some of the major changes that NSX-T brings in.

NSX-T is definitely a game changer. The latest release (version 2.4) brings a lot of new functionality, getting close to parity with its vSphere counterpart. Coming from an NSX-v background, I am interested in the main differentiators that NSX-T is bringing into the game.

1. Cross platform support and independence from vCenter Server. 

The two come together, since supporting multiple platforms required decoupling NSX Manager from vCenter Server. Besides vSphere, NSX-T can also support non-ESXi hypervisors such as KVM, and I expect to see more coming in the near future. Lastly, NSX-T supports bare metal workloads and native public clouds (see integrating AWS EC2 instances - youtube video here).

I will add here the NSX-T Container Plugin (NCP), which provides integration with container orchestration platforms (Kubernetes) and container-based platform-as-a-service offerings (OpenShift, Pivotal Cloud Foundry).

That's a big step from vSphere environments. 

2. Encapsulation protocol

At the core of any network virtualization technology is an encapsulation protocol that allows sending L2 frames over L3. NSX-v uses VXLAN, while Geneve is the protocol used by NSX-T. At the time of writing, Generic Network Virtualization Encapsulation (Geneve) is still an IETF draft, although on the standards track. According to people much wiser than me who were involved in the specification design for Geneve, what the new protocol brings to the table is the best of other protocols such as VXLAN, NVGRE and STT.

One of the main advantages of Geneve is that it uses a variable-length header, which allows including as many options as necessary without being limited to the 24-bit identifier of VXLAN and NVGRE or always carrying a 64-bit STT-style header.

3. N-VDS

NSX-T virtual distributed switches are called N-VDS and they are independent of vCenter Server. For this reason they come in 2 flavors (actually 3) depending on the platform:
- ESXi - NSX's version of vswitch which is implemented as an opaque switch in vSphere
- KVM - VMware's version of OpenvSwitch
- cloud and bare metal - NSX agent

An opaque switch is a network created and managed by a separate entity outside of vSphere - in our case, the logical networks created and managed by NSX-T. They appear in vCenter Server as opaque networks and can be used as backing for VMs. Although not a new thing, they are different from the NSX-v vswitches. This means that installing NSX-T even in a vSphere-only environment will still bring in the opaque networks instead of the NSX-v logical switches.

4. Routing 

Differences are also introduced at the routing level, where NSX-T brings in a two-tiered model. A very interesting blog article is here and I will briefly use a picture from it:

In a very short explanation:

  • Tier-0  logical router provides a gateway service to the physical world 
  • Tier-1 logical router provides services for tenants (multi-tenancy) and cloud management platforms 
Both tiers are not always needed - depending on the services required, a single tier can be implemented.


In a way, NSX-v could provide the same tiered model using Edge Services Gateways (ESGs), but in certain designs the routing paths were not optimal. NSX-T delivers optimized routing, simplicity and more architectural flexibility. Also, with the new model, the Distributed Logical Router (DLR) control VM has been removed.


I will let you decide which of the changes briefly presented above are more important and I will just list them below:
  • standalone (vCenter independent) 
  • multi hypervisor support
  • container integration
  • GENEVE
  • NVDS and OpenvSwitches
  • multi tiered optimized routing
However, if you are moving around the network virtualization world and haven't picked up on NSX-T, maybe it's time to start. 

Tuesday, June 11, 2019

Docker Containers and Backup - A Practical Example Using vSphere Storage for Docker

A few months ago I started looking into containers trying to understand both the technology and how it will actually relate to one of the questions I started hearing recently "Can you backup containers?".

I do not want to discourage anyone from reading the post (as it is interesting), but going further requires a basic understanding of containers and Docker technology. This post will not explain what containers are. It focuses on one aspect of Docker containers - persistent storage. Contrary to popular belief, containers can have and may need persistent storage. Docker volumes and volume plugins are the technologies for it.

Docker volumes are used to persist data outside the container's writable layer - in this case, on the file system of the Docker host. Volume plugins extend the capabilities of Docker volumes across different hosts and different environments: for example, instead of writing container data to the host's filesystem, the container writes data to an AWS EBS volume, Azure Blob storage or a vSphere VMFS datastore.

Let's take a step down from the abstract world. We have a dockerized application: a voting application. It uses a PostgreSQL database to keep the results of the votes. The PostgreSQL DB needs a place to keep its data. We want that place to be outside the Docker host storage, and since we are running in a vSphere environment, we'll use vSphere Storage for Docker. Putting it all in a picture looks like this (for simplicity, only the PostgreSQL container is represented):


We'll start with the Docker host (the VM running on top of vSphere). Docker Engine is installed on the VM and it runs containers and creates volumes. The DB runs in the container and needs some storage. Let's take a 2 step approach:

First, create the volume. Docker Engine, using the vSphere Storage for Docker plugin (vDVS plugin and vDVS VIB), creates a virtual disk (vmdk) on the ESXi host's datastore and maps it back to the Docker volume. Now we have a permanent storage space that we can use.

Second step: the same Docker engine presents the volume to the container and mounts it as a file system mount point in the container. 

This makes it possible for the DB running inside the container to write to the vmdk on the vSphere datastore (without knowing it does so, of course). Pretty cool.

The vmdk is automatically attached to the Docker host (the VM). Moreover, when the vmdk is created from the Docker command line, it can be given any attribute that applies to a vmdk. This means it can be created as:
  • independent persistent or dependent (very important since this affects the ability to snapshot the vmdk or not)
  • thick (eager or lazy zeroed) or thin
  • read only 
It can also be assigned a vSAN policy. The vmdk will persist data for the container and across the container's lifetime. The container can be destroyed, but the vmdk will keep existing on the datastore.

Let's recap: we are using a Docker volume plugin to present vSphere datastore storage space to an application running within a Docker container. Or shorter, the PostgreSQL DB running within the container writes data to vmdk. 

Going back to the question - can we back up the container? Since the container itself is actually a runtime instance of a Docker image (a template), it does not contain any persistent data. The only data that we need is actually written to the vmdk. In this case, the answer is yes. We can back it up the same way we back up any vSphere VM - we will actually back up the vmdk attached to the Docker host itself.

Probably the hardest question to answer when talking about containers is what data to protect, as the container itself is just a runtime instance. By design, containers are ephemeral and immutable. The writable space of a container can be either directly in memory (tmpfs) or on a Docker volume. If we need data to persist across container lifecycles, we need to use volumes. Volumes can be implemented by a multitude of storage technologies, and this complicates the backup process. Container images represent the template from which containers are launched; they are also persistent data that could be a source for backup.


Steps for installing the vSphere Docker Volume Service and testing volumes
  • prerequisites: 
    • Docker host already exists and it has access to Docker Hub
    • download the VIB file (got mine from here)
  • log on to the ESXi host, transfer and install the VIB

esxcli software vib install -v /tmp/VMWare_bootbank_esx-vmdkops-service_0.21.2.8b7dc30-0.0.1.vib


  • restart hostd and check the module has been loaded
/etc/init.d/hostd restart
ps -c | grep vmdk


  • log on to the Docker host and install the plugin
sudo  docker plugin install --grant-all-permissions --alias vsphere vmware/vsphere-storage-for-docker:latest

  • create a volume and inspect it - by default the vmdk is created as independent persistent, which does not allow snapshots to be taken; add the option (-o attach-as=persistent) for dependent vmdks

sudo docker volume create --driver=vsphere --name=dockerVol -o size=1gb
sudo docker volume inspect dockerVol
sudo docker volume create --driver=vsphere --name=dockerVolPersistent -o size=1gb -o attach-as=persistent



  • go to the vSphere client, to the datastore where the Docker host is configured, and check for a new folder dockvols and for the VMDK of the volume created earlier


  • since the volumes are not used by any container, they are not attached to the Docker host VM. Create a container and attach the dependent volume to it

sudo docker container run --rm -d --name devtest --mount source=dockerVolPersistent,target=/vmdk alpine sleep 1d

Lastly, create a backup job with the Docker host as source, exclude other disks and run it.

Thursday, March 7, 2019

vCenter Server Restore with Veeam Backup & Replication

Recently I went through the process of testing a vCenter Server Appliance restore in the most unfortunate case, when the actual vCenter Server is no longer there. Since the tests were being done for a prod appliance, it was decided to restore it without connectivity to the network. Let's see how this went.

Test scenario
  • distributed switches only
  • VCSA
  • Simple restore test: put VCSA back in production using a standalone host connected to VBR
Since vCenter is "gone", first thing to do is to directly attach a standalone ESXi host to VBR. The host will be used for restores (this is a good argument for network team's "why do you need connectivity to ESXi, you have vCenter Server"). The process is simple, open VBR console go to Backup Infrastructure and add ESXi host. 

You will need to type in the hostname or IP and the root account. Since vCenter Server was not actually gone, we had to use the IP instead of the FQDN, because the host was already registered through the vCenter Server connection with its FQDN.
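
As a side note, this step could also be done from the Veeam PowerShell console; something like the following should register the standalone host (a sketch - the IP and credentials are placeholders):

# register a standalone ESXi host in Veeam Backup & Replication
Add-VBRESXi -Name "192.168.10.11" -User "root" -Password "EsxiRootPassword"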

Next, start an entire VM restore.


During the restore wizard, select the point in time (by default last one), then select Restore to a different location or with different settings:


Make sure to select the standalone host:

Leave the default Resource Pool and datastores, but check that the selected datastore has sufficient space. Leave the default folder; however, if you still have the source VM, change the restored VM's name:

Select the network to connect to. Actually, disconnect the network of the restored VM - that was the scenario, right? Since the purpose of this article is not to make you go through the same experience we had, let's not disconnect it. You will see why immediately:

Keep defaults for the next screens and start the restore (without automatically powering on the VM after restore). 


A few minutes later the VM is restored and connected to the distributed port group. 

We started by testing a disconnected restored VM, but during the article we didn't disconnect it. And here is why: when we initially disconnected the network of the restored VM, we got an error right after the VM was registered with the host and the restore failed. 


The same error was received when trying to connect to a distributed portgroup configured with ephemeral binding. The logs show that the restore process actually tries to modify the network configuration of an existing VM, and that makes it fail when VBR is connected directly to the host. When the portgroup is not changed for the restored VM, the restore process skips updating the network configuration. Of course, updating works with a standard switch portgroup.


In short, the following restore scenarios will work when restoring directly through a standalone host:
  • restore VCSA to the same distributed port group to which the source VM is connected
  • restore VCSA to a standard portgroup



Wednesday, January 23, 2019

Role Based Access for VMware in Veeam Backup & Replication 9.5 Update 4 - Advanced Backup Features and Recovery

In the previous post we looked at how to configure the integration between Veeam and vCenter Server and how to create a simple backup job in the self service portal.

In this post we'll go further and look at the advanced configuration options available for backup jobs, as well as at the recovery features.

1. VM restore
The user restores the VMs they are in charge of. Log in to the self service portal (https://em-address:9443/backup) and go to the VMs tab. Select the VM you want to restore and press Restore. Two choices are given - overwrite and keep original VM.

Once the restore point has been selected and you have chosen whether or not to power on the VM, the wizard starts the restore process to the production environment. Choosing "keep" will result in a new VM being created with the name originalVMname_restoredDateTime.
In this case pay attention not to power it on after the restore, as it is connected to the same network as the original VM. If you choose "overwrite", powering it on may be a good idea.

2. Guest file system indexing and file restore

When enabled, VBR will create an index of the files and folders on the VM guest OS during backup. Guest file indexing allows you to search for VM guest OS files inside VM backups and perform 1-click restores in the self service portal. Initially the indexes are created on the VBR server, but they are later moved to the Enterprise Manager server.

Let's take the job created initially and add guest indexing to it. Log on to the self service portal, go to Jobs, edit the job and, on the Guest Processing tab, tick the box "Enable guest file system indexing".

Add user credentials and, if different credentials are needed for the VMs in the job, press Customize Credentials. Lastly, define what to index. By default everything is indexed except system folders (Program Files, Temp, Windows). You can select to index only specific folders or to index all folders.

Save the configuration and run the backup job. Once the job runs successfully, go to the Files tab in the self service portal and type in the name of the file you want to find. In my case, the indexed VM is running IIS, so I will search for the iisstart.htm file.

After finding the file, we can restore it directly to the production VM, either overwriting the original file or keeping it. The file can also be downloaded to the local computer in an archive named FLR_DATE_HOUR.

If we have more files to restore, we can add the file to a Restore list and execute the restore on multiple files at a time:

File level restore is possible without indexing, but it needs to mount the backup and let you manually search for the file in the file system.

3. Application item restore

While the Veeam Explorers allow for very good granularity and flexibility regarding supported applications, the self service portal is restricted to MSSQL and Oracle. On the Items tab of the self service portal, select the database, the location where to restore it (original or alternate) and the point in time.


We'll restore the DB on the same server, but with a different name. For this reason, choose restore to another location and press Restore. Specify the server name and the credentials for connecting to the admin share and the MSSQL DB.

Specify the MDF and log file names - change them if the same location as the original DB is being used:

Press Finish and watch the progress in the status bar:

When it is finished, you can connect to the MSSQL server and check that the DB has been created:


The new integration with vCenter Server offers a powerful tool for delegating job management to users other than backup administrators.


Tuesday, January 22, 2019

Role Based Access for VMware in Veeam Backup & Replication 9.5 Update 4

One of the cool features that Veeam Backup & Replication 9.5 Update 4 comes with is integration with vCenter Server role based access. What does it mean? It allows delegating permissions to users and groups in Veeam Enterprise Manager based on their permissions in vCenter Server.

A user or group of users is now able to monitor and control the backup and restore of their own VMs in a vSphere environment based on a predefined policy. A policy can be defined through vSphere tags, through a role in vCenter Server, or as granularly as a single permission in vCenter Server. Delegation is done through the self service portal in Enterprise Manager.

The cool thing is that the integration actually extends vCenter Server access control by adding vSphere tags as a control mechanism. For example, if the DBAs want to do their own backups and restores, just assign a tag to the VMs and create the policy in Enterprise Manager. It's that simple.

Since my environment uses tags, we will test the following scenario: all developers will have access to development VMs which are tagged with "SLA3" vSphere tags.

First, make sure tags exist in vCenter Server and are assigned to the VMs in scope.
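
If the tags do not exist yet, creating and assigning them can also be scripted with PowerCLI; a small sketch, where the category name and the VM name filter are assumptions for this example:

# create an SLA category and the SLA3 tag, then assign it to the development VMs
New-TagCategory -Name "SLA" -Cardinality Single -EntityType VirtualMachine
New-Tag -Name "SLA3" -Category "SLA"
$tag = Get-Tag -Name "SLA3"
foreach ($vm in Get-VM -Name "dev-*") {
    New-TagAssignment -Tag $tag -Entity $vm
}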

Next, you need to install and configure VBR and Enterprise Manager (EM). This is not in the scope of the current article.

Once EM is installed and configured, log in to the EM portal (https://em-address:9443/) and go to Configuration. Check that VBR and vCenter Server are available and reachable.

On the Self Service tab you will see the default configuration for the Domain Users group (my lab is AD integrated).

For the test we will create a new configuration.

Let's look at the Delegation Mode - the mechanism we use to define access:

By default, it uses a VM privilege and the selected privilege is VirtualMachine.Interact.Backup, but you can choose any privilege available in vCenter Server. If you need more flexibility, you can define roles in vCenter Server and base the delegation on those roles (a set of privileges applied to an object). Finally, you can use vSphere tags and allow access based on the tags. Once the preferred method of delegation is chosen, it applies to all self service configurations, so be careful when changing the method.
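
For the role-based option, the role itself is defined in vCenter Server; with PowerCLI that could look something like this (the role name is made up, the privilege ID is the one mentioned above):

# create a vCenter Server role containing the backup privilege used for delegation
$priv = Get-VIPrivilege -Id "VirtualMachine.Interact.Backup"
New-VIRole -Name "Veeam Self-Service Backup" -Privilege $priv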

Now open the default self service configuration and let's take a look at it. We can see it assigns a repository and a quota on that repository. The quota can be global (for all users in the group) or individual (per user).


You also define the defaults for advanced job settings such as compression, deduplication, scheduling of full backups and so on. The settings can be copied from the Veeam defaults or from an existing job.

There are 4 job scheduling options available, ranging from full access to the scheduler to no access at all. We will use the default one (full access to job scheduling). Choose wisely what you want your users to be able to do.

The vSphere tags drop-down list appeared because I chose vSphere tags as the Delegation Mode, but it is left empty for now.

Let's create a new self service configuration for developers group. Press Add and then:
1. select the Type - user or group and search it in AD
2. select the repository and define the quota
3. select job scheduling options
4. select the vSphere tag
5. configure the advanced settings

Press Save and open the self service portal (https://em-address:9443/backup/). Log in with one of the users from the group (Developers in my case). Since the user is a member of 2 configurations, select which configuration to log on with:

Once logged in, the portal displays 5 tabs: Dashboards, Jobs, VMs, Files and Items



Go to Jobs and create a new backup job. A process similar to the VBR console job creation will start. First give the job a name and a description and decide how many restore points to keep:

Then add the VMs you want to back up. Only VMs with the SLA3 tag will be displayed.

If required, enable application aware processing and add credentials for guest processing:

Schedule the job (in this configuration you are allowed):

Lastly, enable notifications (if you want to be alerted about the status of the job)

Now that the job has been created, you can run it:


Meanwhile, we can take a look at how things look in VBR console

You'll notice the running job is named with the following convention: Domain\GroupName_Domain\UserName_JobName. So, all is good and running smoothly.

Back in the self service portal, pressing on the job statistics we can see what happened:


There you go - we just had a user define a backup job and back up their own VM using a simple vSphere tag and no other settings at either the vCenter Server or VBR level. Next time we'll take a look at the restore options.