Tuesday, August 27, 2019

Create vCenter Server Roles Using PowerCLI - Applied to Veeam Backup & Replication

Security is important, and having a minimal set of permissions is a requirement, not an option. With this in mind (and having been asked a few times by customers), I put together a short script that creates the vCenter Server roles required by the Veeam Backup & Replication service account and the Veeam ONE service account. The two accounts have different requirements, with Veeam ONE being the more restrictive of the two, as it needs mostly read-only access.

The script itself is pretty straightforward; the more time-consuming part is getting the privilege lists. So here you are:


And now for the actual scripting part:


$role = "Veeam Backup Server role"
$rolePrivilegesFile = "veeam_vc_privileges.txt"
$vCenterServer = "your-vcenter-server-FQDN"
Connect-VIServer -server $vCenterServer
# read the privilege IDs from the file, one ID per line
$roleIds = @(Get-Content $rolePrivilegesFile)
# create the role with the privilege objects matching those IDs
New-VIRole -Name $role -Privilege (Get-VIPrivilege -Server $vCenterServer -Id $roleIds) -Server $vCenterServer

The script will create a new vCenter Server role assigning it privileges from the file given as input.
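If the privileges file has been edited by hand, it may contain blank lines, stray whitespace or duplicate entries, all of which would be passed on as privilege IDs. Below is a minimal sketch of a cleanup step that could run before the role is created. The function name Get-CleanPrivilegeId and the "#" comment convention are my own assumptions (PowerCLI does not define them), and the sample IDs only stand in for the real file content so that the snippet runs without a vCenter Server connection.

```powershell
# Hypothetical helper: normalizes a list of privilege IDs read from a text file.
function Get-CleanPrivilegeId {
    param(
        [string[]]$Line
    )
    $Line |
        ForEach-Object { $_.Trim() } |                       # drop surrounding whitespace
        Where-Object { $_ -and -not $_.StartsWith('#') } |   # skip blank lines and '#' comments
        Select-Object -Unique                                # keep the first occurrence of each ID
}

# Sample input standing in for: Get-Content $rolePrivilegesFile
$sample = @(
    'System.View',
    '  Datastore.Browse  ',
    '',
    '# snapshot privileges',
    'VirtualMachine.State.CreateSnapshot',
    'System.View'
)
$roleIds = @(Get-CleanPrivilegeId -Line $sample)
$roleIds
```

With a live connection, the helper would take the output of Get-Content $rolePrivilegesFile as input, and the resulting $roleIds would be passed to Get-VIPrivilege -Id exactly as in the script above.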

If you ever need to retrieve the privileges of an existing role from vCenter Server, the next piece of code will help (thanks to the VMware Communities):

$role = "VBR Role"
Get-VIPrivilege -Role $role | Select @{N="Privilege Name";E={$_.Name}},@{N="Privilege ID";E={$_.ID}}

You will use the privilege ID format for creating the new role.

Saturday, July 27, 2019

Veeam Replication - Automatically Fix Detected Invalid Snapshot Configuration

This one comes directly from the field. It is about that moment when something goes wrong at a client's site and you need to fix it. In this case, replicas went wrong. An unreliable link between the two sites and some other networking issues at the DR site (yes, it's always the network's fault) caused the replica VMs to end up in a corrupted state. If you want to keep the replica VMs that are already in the DR site, then the fix is pretty simple: remove the snapshots and remap the vmdks that are still pointing to delta files. Doing this manually is cumbersome, especially since some VMs can have multiple disks and we are talking about tens or hundreds of VMs. Luckily, some things can be scripted.

We will present some of the modules in the script, since the whole script is published here on GitHub. For ease of readability we will remove some of the try-catch blocks and logging messages from the original script and comment here only the logical part.

The script takes the following input parameters: the vCenter Server where the VM replicas are, the backup server hostname, the replica job status, the replica job failure message and the replica suffix. The replica suffix is important since the script uses it to find the replica VMs.

$vbrServer = "vbr1"
$vcServer = "vc1"
$status = "Failed"
$reason = "Detected an invalid snapshot configuration."
$replicaSuffix = "_replica"

The script needs to be executed from a place where both vCenter Server and the Veeam backup server are reachable, and where the PowerCLI module is imported as well as the Veeam PowerShell snap-in.

Add-PSSnapIn -Name VeeamPSSnapin
Connect-VBRServer -Server $vbrServer
Connect-VIServer -Server $vcServer

Next, the script will get all VMs for which the replication job has failed ($status) with the given reason ($reason). For this, we use the Get-VBRJob cmdlet and the FindLastSession() and GetTaskSessions() methods. Once a VM in a replica job matches the chosen criteria, it is added to an array ($vmList).

$vmList = @()
# get failed replica VM names
$jobs = Get-VBRJob | Where-Object {$_.JobType -eq "Replica"}
foreach ($job in $jobs)
{
    $session = $job.FindLastSession()
    # skip jobs that have never run
    if (!$session) { continue }
    $tasks = $session.GetTaskSessions()
    $tasks | ForEach-Object {
        # $reason is escaped so that it is matched literally, not as a regular expression
        if (($_.Status -eq $status) -and ($_.Info.Reason -match [regex]::Escape($reason))) {
            $vmList += $_.Name
        }
    }
}

Once we have the list of VMs whose replica failed, it's time to get dirty. The VM name we are looking for is made of the original VM name (from the array) and the replica suffix.

$replicaName = $_ + $replicaSuffix

First, delete the snapshots. For this we use PowerCLI cmdlets: Get-VM, Get-Snapshot, Remove-Snapshot.


$replica = Get-VM -Name $replicaName -ea Stop
$replica | Get-Snapshot -ea Stop | Sort-Object -Property Created | Select -First 1 | Remove-Snapshot -RemoveChildren -Confirm:$false -ea Stop

Next, remap the disks if they are still pointing to the delta file. In order to do that, we get all the disks of the replica VM (Get-HardDisk) and check whether the file name of each disk contains the characters specific to a delta file ("-0000"). This is how we determine whether it's a delta disk or a normal disk. The delta disk name is then parsed to generate the source disk name (by removing the delta characters from the vmdk name). Once this is done, it's just a matter of reattaching the source disk to the VM (Remove-HardDisk, New-HardDisk).

# get disks for a replica VM
$disk = $replica |  Get-HardDisk -ea Stop
# process each disk 
$disk | foreach {
    $diskPath = $_.Filename
    # check if disk is delta file
    if ($diskPath -Match "-0000") {
        # for each delta file parse the original vmdk name
        # (the delta suffix "-00000N.vmdk" is 12 characters long)
        $sourceDisk = $diskPath.substring(0,$diskPath.length-12) + ".vmdk"
        # get the datastore where the delta is 
        $datastore = Get-Datastore -Id $_.ExtensionData.Backing.Datastore
        # check the original vmdk still exists on that datastore
        if (Get-HardDisk -Datastore $datastore.Name -DatastorePath $sourceDisk) {
            # remove delta
            Remove-HardDisk -HardDisk $_ -Confirm:$false -ea stop
            # attach original disk
            $newDisk = New-HardDisk -VM $replica -DiskPath $sourceDisk -ea Stop
        } else {
            Write-Host "WARN Could not find $($sourceDisk) on $($datastore.Name) "
            Add-Content -Path  $logFile -Value "WARN: Could not find $($sourceDisk) on $($datastore.Name) "
        }
    }
}
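The fixed-length substring works because a delta file name ends in a dash, six digits and ".vmdk". As a sketch of a slightly more defensive alternative, the base disk path can be derived with a regular expression that matches only that exact ending. The function name Get-BaseDiskPath and the sample paths are assumptions made for illustration; they are not part of the published script.

```powershell
# Hypothetical helper: maps a delta disk path to the path of its base vmdk.
# Returns $null when the path does not end like a delta disk.
function Get-BaseDiskPath {
    param(
        [Parameter(Mandatory = $true)][string]$DiskPath
    )
    $deltaPattern = '-\d{6}\.vmdk$'
    if ($DiskPath -match $deltaPattern) {
        return ($DiskPath -replace $deltaPattern, '.vmdk')
    }
    return $null
}

# delta disk: prints "[ds1] vm1_replica/vm1_replica.vmdk"
Get-BaseDiskPath -DiskPath '[ds1] vm1_replica/vm1_replica-000002.vmdk'
# base disk: prints nothing, since $null is returned
Get-BaseDiskPath -DiskPath '[ds1] vm1_replica/vm1_replica.vmdk'
```

Like the "-0000" check, this is only a naming heuristic: a base disk whose own name happens to end in a dash and six digits would also match, so checking that the source vmdk exists on the datastore remains necessary.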

The last thing to do is to consolidate the disks. We run Get-VM and check whether consolidation is needed. If it is, we just run the ConsolidateVMDisks_Task() method.


$vm = Get-VM -Name $replicaName -ea Stop
if ($vm.ExtensionData.Runtime.ConsolidationNeeded) {
    $vm.ExtensionData.ConsolidateVMDisks_Task()
}

Now the replica VMs are reusable. Some manual work remains, though: you need to map the replica VM in the replication job and run the job.

Sunday, July 21, 2019

A Few Thoughts on VMUG Romania Meeting

I am one of the VMware User Group (VMUG) leaders in Romania (co-leader). Basically, I do event organization: getting speakers, sponsors, location, catering and promotion on social media. Sometimes I present at the meetings, but luckily enough, for the past year our small and energetic community has had other presenters from the community willing to spend their time and share their knowledge. Luckily for me, since I got more time to organize the event.

The last VMUG (July 18th) was a special one for me (hopefully for the community also). We got to organize our biggest event to date and had maybe the best speaker lineup so far, both from sponsors and from the community. We had the chance to meet Joe Baguley, VP and CTO, EMEA at VMware, and have him as keynote speaker. A nice surprise was to meet and listen to Aylin Sali, CTO at Runecast and one of the co-founders (the Romanian one). As for the other guys, since we had met before there were no surprises (at least not during business hours :-) ), just the same inspiring technical leaders that I knew. It was a fun and thrilling event with lots of ideas, networking and knowledge sharing. For this I thank you all: speakers, sponsors, co-leaders. A special thank you goes to the 4 community members that flew in from Timisoara and Cluj (the Cluj guys for the second time already) to be with us on that day.

What's next? The bar has been raised a bit (blame the organizers), but this is a great opportunity to do better. Autumn will come with a pre-VMworld event and more great speakers. We have also been planning for a while to do an event outside Bucharest. Lastly, my dream is to organize a Balkan UserCon. As you can see, there is a lot of work ahead and a lot of room to grow.

Until next VMUG, I will leave you with the pictures (thanks Titi).

Wednesday, June 19, 2019

NSX-T Part 2 - Going Through Changes

The pop reference in the title came to my mind from the metal ballad world, but as I found out it can just as easily be hip-hop. However, this is where any reference to the musical world stops, as this article focuses on some of the major changes brought in by NSX-T.

NSX-T is definitely a game changer. The latest release (version 2.4) brings in a lot of new functionality, getting to parity with its vSphere counterpart, NSX-v. Coming from an NSX-v background, I am interested in the main differentiators that NSX-T is bringing into the game.

1. Cross platform support and independence from vCenter Server. 

The two come together, since supporting multiple platforms required decoupling NSX Manager from vCenter Server. Besides supporting vSphere, NSX-T also supports non-ESXi hypervisors such as KVM. I expect to see more coming in the near future. Lastly, NSX-T supports bare metal workloads and native public clouds (see integrating AWS EC2 instances - YouTube video here).

I will add here the NSX-T container plugin (NCP) that provides integration with container orchestration platforms (Kubernetes) and container based platform as a service (OpenShift, Pivotal Cloud Foundry). 

That's a big step from vSphere environments. 

2. Encapsulation protocol

At the core of any network virtualization technology is an encapsulation protocol that allows L2 frames to be sent over L3. NSX-v uses VXLAN. Geneve is the protocol used by NSX-T. At the time of writing, Generic Network Virtualization Encapsulation (Geneve) is still an IETF draft, although on the standards track. According to people much wiser than me who were involved in the specification design for Geneve, what the new protocol brings to the table is the best from other protocols such as VXLAN, NVGRE and STT.

One of the main advantages of Geneve is that it uses a variable-length header, which allows it to carry as many options as necessary without being limited to the fixed header with a 24-bit identifier of VXLAN and NVGRE, or having to carry a 64-bit STT-style identifier all the time.
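To make the variable-length idea more concrete, here is a minimal sketch that builds the fixed 8-byte part of a Geneve header. The field layout follows the Geneve specification as I understand it: 2 bits of version, a 6-bit option length counted in 4-byte multiples, the O and C flags, a 16-bit protocol type, a 24-bit VNI and a reserved byte. The function name New-GeneveBaseHeader, its parameters and the VNI value are assumptions made for this illustration, and no actual options are encoded.

```powershell
# Hypothetical helper: builds the fixed 8-byte Geneve base header as a byte array.
function New-GeneveBaseHeader {
    param(
        [ValidateRange(0, 16777215)][int]$Vni,          # 24-bit Virtual Network Identifier
        [ValidateRange(0, 252)][int]$OptionBytes = 0,   # total length of the options that follow
        [int]$ProtocolType = 0x6558                     # Transparent Ethernet Bridging (L2 frame payload)
    )
    if ($OptionBytes % 4 -ne 0) {
        throw 'Geneve options are counted in multiples of 4 bytes.'
    }
    $optLen = [int]($OptionBytes / 4)   # value of the 6-bit Opt Len field
    [byte[]]$header = @(
        $optLen,                               # Ver (0) + Opt Len
        0,                                     # O and C flags + reserved bits, all cleared
        (($ProtocolType -shr 8) -band 0xFF),   # protocol type, high byte
        ($ProtocolType -band 0xFF),            # protocol type, low byte
        (($Vni -shr 16) -band 0xFF),           # VNI, 3 bytes in network order
        (($Vni -shr 8) -band 0xFF),
        ($Vni -band 0xFF),
        0                                      # reserved
    )
    return ,$header
}

$h = New-GeneveBaseHeader -Vni 5001 -OptionBytes 8
($h | ForEach-Object { $_.ToString('X2') }) -join ' '
# prints: 02 00 65 58 00 13 89 00
```

The first byte is what matters here: it announces how many bytes of options follow the base header (8 in this example), so metadata can be added per packet without changing the base format.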

3. N-VDS

NSX-T virtual distributed switches are called N-VDS and they are independent of vCenter Server. For this reason they come in 2 flavors (actually 3) depending on the platform:
- ESXi - NSX's version of vswitch which is implemented as an opaque switch in vSphere
- KVM - VMware's version of OpenvSwitch
- cloud and bare metal - NSX agent

An opaque switch is a network created and managed by a separate entity outside of vSphere. In our case, these are the logical networks that are created and managed by NSX-T. They appear in vCenter Server as opaque networks and can be used as backing for VMs. Although not a new thing, they are different from the NSX-v vswitches. This means that installing NSX-T in a vSphere-only environment will still bring in the opaque networks instead of the NSX-v logical switches.

4. Routing 

Differences are also introduced at the routing level, where NSX-T brings in a two-tiered model. A very interesting blog article is here and I will briefly use a picture from it:

In a very short explanation:

  • Tier-0  logical router provides a gateway service to the physical world 
  • Tier-1 logical router provides services for tenants (multi-tenancy) and cloud management platforms 
Both tiers are not always needed: depending on the services required, only one tier can be implemented.


In a way, NSX-v could provide the same tiered model using Edge Services Gateways (ESGs), but in certain designs the routing paths were not optimal. NSX-T delivers optimized routing, simplicity and more architectural flexibility. Also, with the new model, the Distributed Logical Router (DLR) control VM has been removed.


I will let you decide which of the changes briefly presented above are more important and I will just list them below:
  • standalone (vCenter independent) 
  • multi hypervisor support
  • container integration
  • GENEVE
  • N-VDS and OpenvSwitches
  • multi tiered optimized routing
However, if you are moving around the network virtualization world and haven't picked up on NSX-T, maybe it's time to start. 

Tuesday, June 11, 2019

Docker Containers and Backup - A Practical Example Using vSphere Storage for Docker

A few months ago I started looking into containers, trying to understand both the technology and how it relates to one of the questions I started hearing recently: "Can you back up containers?"

I do not want to discourage anyone from reading the post (as it is interesting), but from here on a basic understanding of containers and Docker technology is required. This post will not explain what containers are. It focuses on one aspect of Docker containers: persistent storage. Contrary to popular belief, containers can have and may need persistent storage. Docker volumes and volume plugins are the technologies for it.

Docker volumes are used to persist data outside the container's writable layer, by default on the file system of the Docker host. Volume plugins extend the capabilities of Docker volumes across different hosts and across different environments: for example, instead of writing container data to the host's filesystem, the container will write data to an AWS EBS volume, Azure Blob storage or a vSphere VMFS datastore.

Let's take a step down from the abstract world. We have a dockerized application: a voting application. It uses a PostgreSQL database to keep the results of the votes. The PostgreSQL DB needs a place to keep its data. We want that place to be outside the Docker host storage and, since we are running in a vSphere environment, we'll use vSphere Storage for Docker. Putting it all in a picture would look like this (for simplicity, only the PostgreSQL container is represented):


We'll start with the Docker host (the VM running on top of vSphere). Docker Engine is installed on the VM and it runs containers and creates volumes. The DB runs in the container and needs some storage. Let's take a 2 step approach:

First, create the volume. Docker Engine, using the vSphere Storage for Docker plugin (vDVS plugin and vDVS vib), creates a virtual disk (vmdk) on the ESXi host's datastore and maps it back to the Docker volume. Now we have a permanent storage space that we can use.

Second step: the same Docker engine presents the volume to the container and mounts it as a file system mount point in the container. 

This makes it possible for the DB running inside the container to write in the vmdk from the vSphere datastore (of course, without knowing it does so). Pretty cool.

The vmdk is automatically attached to the Docker host (the VM). Moreover, when the vmdk is created from the Docker command line, it can be given attributes that apply to any vmdk. This means it can be created as:
  • independent persistent or dependent (very important since this affects the ability to snapshot the vmdk or not)
  • thick (eager or lazy zeroed) or thin
  • read only 
It can also be assigned a VSAN policy. The vmdk will persist data for the container and across container lifetime. The container can be destroyed, but the vmdk will keep existing on the datastore. 

Let's recap: we are using a Docker volume plugin to present vSphere datastore storage space to an application running within a Docker container. Or shorter, the PostgreSQL DB running within the container writes data to vmdk. 

Going back to the question: can we back up the container? Since the container itself is actually the runtime instance of a Docker image (a template), it does not contain any persistent data. The only data that we need is actually written to the vmdk. In this case, the answer is yes. We can back it up in the same way we can back up any vSphere VM. We will actually back up the vmdk attached to the Docker host itself.

Probably the hardest question to answer when talking about containers is what data to protect, as the container itself is just a runtime instance. By design, containers are ephemeral and immutable. The writable space of the container can be either directly in memory (tmpfs) or on a Docker volume. If we need data to persist across container lifecycles, we need to use volumes. The volumes can be implemented by a multitude of storage technologies and this complicates the backup process. Container images represent the template from which containers are launched. They are also persistent data that could be a source for backup.


Steps for installing the vSphere Docker Volume Service and testing volumes
  • prerequisites: 
    • Docker host already exists and it has access to Docker Hub
    • download the VIB file (got mine from here)
  • log on to the ESXi host, transfer and install the vib

esxcli software vib install -v /tmp/VMWare_bootbank_esx-vmdkops-service_0.21.2.8b7dc30-0.0.1.vib


  • restart hostd and check the module has been loaded
/etc/init.d/hostd restart
ps -c | grep vmdk


  • log on to the Docker host and install the plugin
sudo  docker plugin install --grant-all-permissions --alias vsphere vmware/vsphere-storage-for-docker:latest

  • create a volume and inspect it. By default the vmdk is created as independent persistent, which does not allow snapshots to be taken; add the option -o attach-as=persistent to get a dependent vmdk

sudo docker volume create --driver=vsphere --name=dockerVol -o size=1gb
sudo docker volume inspect dockerVol
sudo docker volume create --driver=vsphere --name=dockerVolPersistent -o size=1gb -o attach-as=persistent



  • in the vSphere client, go to the datastore where the Docker host is configured and check for a new folder named dockvols and for the VMDKs of the volumes created earlier


  • since the volumes are not used by any container, they are not attached to the Docker host VM. Create a container and attach the dependent volume to it

sudo docker container run --rm -d --name devtest --mount source=dockerVolPersistent,target=/vmdk alpine sleep 1d

Lastly, create a backup job with the Docker host as source, exclude other disks and run it.

Thursday, March 7, 2019

vCenter Server Restore with Veeam Backup & Replication

Recently I went through the process of testing a vCenter Server appliance restore in the most unfortunate case, when the actual vCenter Server is not there. Since the tests were being done for a production appliance, it was decided to restore it without connectivity to the network. Let's see how it went.

Test scenario
  • distributed switches only
  • VCSA
  • Simple restore test: put VCSA back in production using a standalone host connected to VBR
Since vCenter is "gone", the first thing to do is to directly attach a standalone ESXi host to VBR. The host will be used for restores (this is a good answer to the network team's "why do you need connectivity to ESXi, you have vCenter Server"). The process is simple: open the VBR console, go to Backup Infrastructure and add the ESXi host.

You will need to type in the hostname or IP address and the root account. Since vCenter Server was not actually gone, we had to use the IP address instead of the FQDN, because the host was already known to VBR by its FQDN through the vCenter Server connection.

Next, start an entire VM restore:


During the restore wizard, select the point in time (by default last one), then select Restore to a different location or with different settings:


Make sure to select the standalone host:

Leave the default resource pool and datastores, but check that the selected datastore has sufficient space. Leave the default folder; however, if you still have the source VM, change the restored VM's name:

Select the network to connect to. Actually disconnect the network of the restored VM. That was the scenario, right? Since the purpose of this article is not to make you go through the same experience we had, let's not disconnect it. And you will see why immediately:

Keep defaults for the next screens and start the restore (without automatically powering on the VM after restore). 


A few minutes later the VM is restored and connected to the distributed port group. 

We started by testing a disconnected restored VM, but during the article we didn't disconnect it. And here is why: when we initially disconnected the network of the restored VM, we got an error right after the VM was registered with the host and the restore failed. 


The same error was received when trying to connect to a distributed port group configured with ephemeral binding. The logs show that the restore process actually tries to modify the network configuration of an existing VM, and that makes it fail when VBR is connected directly to the console. When the port group is not changed for the restored VM, the restore process skips updating the network configuration. Of course, updating works with a standard switch port group.


In short, the following restore scenarios will work when restoring directly through a standalone host:
  • restore VCSA to the same distributed port group to which the source VM is connected
  • restore VCSA to a standard portgroup



Tuesday, February 19, 2019

Running Veeam PowerShell Scripts in Non-Interactive Mode - Credentials

This time we get back to some PowerShell basics and how to run scripts in non-interactive mode.

Veeam Backup & Replication and scripting go hand in hand really well. And the first thing you do when running a script is to connect to the backup server:

Connect-VBRServer -Server $ServerName -User $User -Password $Pass

Placing the user and password in clear text in a script may not be the best option. We could use a PSCredential object to hold the username and password.

$PSCredential = Get-Credential
Connect-VBRServer -Server $ServerName -Credential $PSCredential

However, getting the credentials into the object implies using the Get-Credential cmdlet, which is interactive and will prompt for the user and password. This makes it hard to run the script in non-interactive mode.

To make it non-interactive, we need the password saved somewhere. To save an encrypted password we can use the ConvertFrom-SecureString cmdlet and pipe the output to a file:

(Get-Credential).Password | ConvertFrom-SecureString | Set-Content $encPassFileName

Since the username could be stored in the script itself, we retrieve only the secure string for the password, pipe it to the encrypting cmdlet and then output it to a file. Opening the output file will list something similar to:

000d08c9ddf0115d1118c7a00c04fc297eb01000

Now we need to create the PSCredential object:

$password = Get-Content $encPassFileName | ConvertTo-SecureString 
$psCredential = New-Object System.Management.Automation.PsCredential($username,$password)

First, we load the content of the file storing the encrypted password and convert it to a secure string (PSCredential objects use secure strings). Next, we create the object with the username and the password. The generated object can be used to connect to the VBR server:

Connect-VBRServer -Server $ServerName -Credential $PSCredential

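By default, ConvertFrom-SecureString encryption is done using the Windows Data Protection API (DPAPI), which ties the encrypted value to the user account and the computer where it was created. For this reason the file cannot be moved to another computer, or read by another account, and used from there. On each computer from where the script is being run, the password must be encrypted separately, under the account that will run the script.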
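If the same password file does have to be used from several computers or accounts, ConvertFrom-SecureString and ConvertTo-SecureString also accept a -Key parameter, which encrypts with AES and a key you supply instead of DPAPI. The snippet below is only a sketch under assumptions: the password literal, the account name and the variable names are placeholders, and the whole round trip happens in memory rather than through a file. Keep in mind that the key then becomes the secret: anyone who can read it can decrypt the password, so it has to be stored somewhere protected.

```powershell
# Generate a random 256-bit key (-Key accepts 16, 24 or 32 bytes)
$key = New-Object byte[] 32
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($key)

# Placeholder password; in practice this would come from (Get-Credential).Password
$securePassword = ConvertTo-SecureString 'Example-Passw0rd' -AsPlainText -Force

# Encrypt with the key instead of DPAPI; this string is what would be written to the file
$encrypted = $securePassword | ConvertFrom-SecureString -Key $key

# On the target computer: decrypt with the same key and build the credential object
$password = $encrypted | ConvertTo-SecureString -Key $key
$username = 'DOMAIN\svc_veeam'   # placeholder account name
$psCredential = New-Object System.Management.Automation.PSCredential($username, $password)
```

The resulting $psCredential can be passed to Connect-VBRServer -Credential in the same way as before.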