Saturday, March 11, 2023

Clear DNS Cache on VCSA after ESXi IP Address Update

I've recently had to do some changes and modify the ESXi management IP address. Once the ESXi host has been put to maintenance mode and removed from vCenter Server inventory I've updated the DNS server records: A and PTR records. Checking DNS resolution works, I've tried to re-add servers to vCenter Server using their FQDN, but it errored with no route to host. This is because of the DNS client cache on VCSA. 

To solve it fast and not wait too long you need to ssh to VCSA appliance and run the following commands: 

systemctl restart dnsmasq

systemctl restart systemd-resolved

Once the services are restarted you can add again the ESXi hosts using the FQDN.

Wednesday, February 22, 2023

A Look at Veeam Backup & Replication v12 NAS Backup - Creating file share backup job

In the previous post "A Look at Veeam Backup & Replication v12 NAS Backup - Initial Configuration" we discussed about NAS backup architecture and the initial configuration needed to setup infrastructure. It is time to continue with the creation of the backup  job and its options


Give the backup job a name

Select the shares to protect

If  you want to exclude files from being backed up, go to Advanced

Select the repository (we use here S3 compatible on premises minio)

Advanced settings for the backup job will let you specify how many versions to protect

Specify how to process ACLs for files and folders

Define compression and encryption 

If you want to plan periodic backup file maintenance, you can do it here

You can run scripts pre and post job execution, for example if you want to create a snapshot of the file share before the job runs

In case you would like to get warnings in the job about skipped files, configure here

Because we use S3 compatible, we are asked about helper appliance. We will skip it.

On archival page we selected another S3 compatible storage, this time in AWS. We will AWS S3 bucket to hold a copy of the primary storage and also to move there any files older than 3 months

Here you can filter out files that are sent to acrhive

Finally, schedule the backup job

And run it


Wednesday, February 15, 2023

A Look at Veeam Backup & Replication v12 NAS Backup - Initial Configuration

NAS backup was introduced back in v10. At the time I published a small article about how to backup a VSAN based file share using Veeam Backup & Replication (VBR) that you can read here. A lot of things changed since v10 in terms of features added to NAS backup. The new release will add support for direct to object storage backup and immutability just to name a couple.

The architecture stayed fundamentally the same since the release and its main components can be seen in the following diagram : 


File share - storage device where our protected data (files) resides. 

File proxy - backup proxy that runs the data mover service, reads data from the file share and sends it to the backup repository.

Cache proxy - dedicated component in NAS backup architecture to keep metadata about the protected sources. The cache holds temporary data and is used to reduce the load on the source during backups. 

Backup repository  - storage location for the backups 

Backup server - controls and coordinates jobs and resource allocation

In a large environment the components will be sized and distributed accordingly on different machines. The lab setup is a bit different: All Veeam components on the same machine. As source we have installed a True NAS VM  serving 2 NFS file shares 




To populate the file shares we've used a simple bash script similar to the code below. To make it faster, we've used multiple scripts that we ran in parallel. 

#!/bin/bash

for i in {1..100000}
do
    head -c 2K /dev/urandom > /mnt/filer/lots_of_files/2K_file_$i
done

The lab setup has limitations coming from ethernet connectivity between hosts, Intel NUC resources (SCSI interfaces of the local datastores, CPU and RAM) and resources allocated to the VMs. 


VBR Configuration
We will start preparing VBR for NAS backups. Add the required Veeam roles: NAS backup proxy and cache repo. Then we will connect the NFS shares to the server. For brevity of the post, steps in the wizard where no changes are done will be skipped.  

NAS Backup Proxy
In our scenario with all in one VBR deployment, the NAS backup proxy role will be installed by default on the backup server.


In case you would add the proxy role(s) on different machines, then you would need to start the "Add proxy" wizard from "Backup Infrastructure"

Select "General purpose backup proxy" 

Then choose your server to which you want to assign NAS proxy role. If the server you want is not in the list, press Add new to start adding it as a managed server. We've only lowered the number of max concurrent tasks due to lab resources availability. Normally you keep default settings. 


Next follow the wizard's screens and install the proxy role. 


Add Cache Repository

Cache repo role should be as close as possible to proxy and source. It holds metadata in memory, hence having a fast disk is not mandatory. In lab we install it on backup server VM, but in a prod environment you should move it to another machine to avoid resource competition (or size you backup server to accommodate the cache repo). To create a cache repo, go to Backup Repositories and start Add Backup Repository wizard and follow the wizard. No special configuration is needed. 

Select direct attached storage

Select repo OS
Give the repo a name

Select the managed server where to install the cache repo and disk location for the repo (press Populate button). It needs only a few GB of free space. If the server you want is not in the list, press Add new to start adding it as a managed server. 

Choose the folder for the repo

Use the default settings until the end of the wizard. It will install the transport service, 


Add file share
We are using NFS shares served from a TrueNAS VM. We'll go to "Inventory" and start "Add File  Share"





Enter the NFS path to the share. Here you can also select advanced settings on how to process the file share - directly from the filer or using a snapshot of the share


Next select to use any available proxy or specific one. In case you have a distributed architecture with multiple sites, you would select here the proxies closer to you share. Also, the cache repo and how aggressive you want the backup job to be are configured on the same step.

Finally, install Veeam components and finish adding the share.


Backup repository 
From the point of view of a backup repository there is no difference in v12 between a NAS job and a VM image one. Any available repository can be used now including direct to object storage and hardened repository. Just pick your favorite.  

Now you are ready to create the backup job - in this post A Look at Veeam Backup & Replication v12 NAS Backup - Creating file share backup job

Wednesday, February 1, 2023

Five Reasons to Monitor Your Infrastructure with Veeam ONE

 Veeam ONE (VONE) is the monitoring tool for your backup environment and not only. You can use it as well to gain visibility into your virtual infrastructure. And when you think at the features it provides out of the box such as proactive alerting, monitoring, reporting, capacity planning, chargeback, intelligent automation and diagnostics you understand the value it brings to your environment. But what got me so excited about Veeam ONE? Well, it was this:



It is a screen shot from VONE monitoring my lab environment. It's not the fact that there are alarms that got my attention. It is actually what each of those alarms is describing.

Let's look closer. Veeam ONE is monitoring in this case Veeam Backup & Replication server from my personal lab. Upon a quick look at the screenshot you will notice there are 5 different issue displayed. Some are errors, others just warnings. In both cases, these would prove to be critical if ignored in a production environment.

1. Backup repository connection failure 

My scale out backup repository has a capacity tier in Google Cloud Storage. The S3 bucket is not accessible anymore. This alarm triggers by default when the repo is not accessible for more than 5 minutes. 

2. Backup job state 

This alarm is looking at the state of all backup jobs. The backup job has ended with a error because the vSphere tags where deleted from vCenter Server. So no more backups for those VMs. 

3. Suspicious incremental backup size

This is one of the out of the box alarms that can help you in case of a ransomware attacks. It looks at changes in size between incremental backups and it triggers to let you know you should further investigate what's happening. 

4. Job disabled

There are disabled backup jobs. This can be on purpose or by mistake. In any situation, as a backup admin you would like to know if there are any and which they are. The predefined time before the alarm is triggered is 12 hours. More, this alarm has remediation actions. You just need to press "Approve Action" enter a comment and Veeam ONE will enable back the job. How cool is that!


Once the action is executed successfully and the job enabled in VBR, the status changes to let you know


5. Immutability state

Seems that even if I am using S3 compatible storage, immutability flag has not been set. In today's cybersecurity context, this is one small configuration that should be applied to all your repositories. Keep your backups protected from any type of modification. 

Conclusion 

These are only 5 alarms out of hundreds in Veeam ONE that help you keep your IT infrastructure operating securely. The alarms are all coming out of the box, but you can customize them and create your own. As I stated above, any of the issues highlighted by the alarms will prove critical in case of a situation arising. So it's better to catch and solve them pro-actively. 





Thursday, January 5, 2023

Managing vSphere VM Templates with Packer

Packer is an open source tool developed by HashiCorp that lets you create identical images using the same source. It helps in implementing and managing golden images across your organization. I will be using Packer in a vSphere environment only and not be using its multi platform support. The use case I am looking at is managing VM templates applying infrastructure as code concepts. 

The workflow I am implementing is using base VM templates made of basic OS installation, VMware tools and networking connectivity. These base templates do not need any management except for periodic updates/patches. The base VMs then are customized into project specific templates using Packer. The process installs any given project customization such as additional users, software packages, devices and creates a new template to be used as the source for prod deployment. Packer will not replace a configuration management tool, but it will reduce the time to deploy and configure the prod (or running) instances. It is faster to have a prepped template than to wait for packages to install on each of your instances during prod deployment. The diagram below exemplifies the intended process:  

In this workflow, Packer plays a crucial role allowing for fast and repeatable automation of the VM templates based on specific requirements. All credentials are kept in a dedicated secrets manager, called Vault. I will not enter into details about Vault, just keep it in mind as it is used to store any credentials used by Packer. A new set of templates results at the end of the customization process and these are used to run the prod instances.

Packer will also ensure that any changes to the base VM template are tracked and can be repeated in any other infrastructure while they are written in a human readable format. Let's look at a simple example where we modify a CentOS 7 base template. For our project we will use the the following folder structure: 


There are 3 files:
  • variables.pkr.hcl - keeps all variable definitions
  • tmpl-linux.auto.pkrvars.hcl - keeps the initialized input variables and it will be loaded during run; this allows to only change this file when moving to another environment
  • tmpl-linux.pkr.hcl - main Packer file 
Packer  uses HashiCorp Configuration Language (HCL). Let's look at variables.pkr.hcl file contents:

variable "vcenter_server" {
  type        = string
  description = "FQDN or IP address of the vCenter Server instance"
}

variable "build_user" {
  type        = string
  description = "user name for build account"
}

locals {
    timestamp = regex_replace(timestamp(), "[- TZ:]", "") 
}

local "linux_user_pass" {
  expression = vault("/kv/data/linux_workshop", "${var.ssh_user}")
  sensitive  = true
}

local "build_user_pass" {
  expression = vault("/kv/data/build_user", "${var.build_user}")
  sensitive  = true
}

There are 2 types of variables - input variables and local variables. Input variables need to be initialized from a default value, command line, environment or variable files (we are using auto.pkvras.hcl file for this). Local variables cannot be overridden at run time and can viewed as some kind of constants. In the example above the real number of variables has been truncated to keep it readable. You can see the input variables such as "vcenter_server" and "build_user". There are also 2 local variables - "timestamp" which is calculated from a function and used in our case in the note field of the VM and "build_user_pass" which keeps the password for our build user and it takes this value from Vault secrets manager. The "build_user_pass" is marked as sensitive which will hide it from the output.

Next, let's look the variable initialization file tmpl-linux.auto.pkrvars.hcl

vcenter_server = "vcsa.mylab.local"
build_user = "build_user@vsphere.local"

We chose to initialize the variables from a separate file. In it, we just assign values to our input variables. If we need to modify any variable this is the only place where we make the changes which makes it easier to manage. Again, for ease of reading the file has been truncated.

Time to see what the tmpl-linux.pkr.hcl file contains. In the customization we'll apply to our template we are looking at two things:
  • add a new disk to the target image
  • install software packages in the target image

We'll look at each section in the packer fil. First we define the required plugins - in our case vsphere. You can make sure that a certain version is loaded. 

packer {
  required_version = ">= 1.8.5"
  required_plugins {
    vsphere = {
      version = ">= v1.1.1"
      source  = "github.com/hashicorp/vsphere"
    }
  }
}

Next we define the source block which has the configuration needed by the builder plugin (vsphere plugin loaded above).

source "vsphere-clone" "linux-vm-1" {
  
  # vcenter server connection
  vcenter_server      = "${var.vcenter_server}"
  insecure_connection = "true"
  username            = "${var.build_user}"
  password            = local.build_user_pass

  # virtual infrastructure where we build the templates
  datacenter          = "${var.datacenter}"
  host                = "${var.vsphere_host}"
  datastore           = "${var.datastore}"
  folder              = "Templates/${var.lab_name}"

  # source template name 
  template            = "${var.src_vm_template}"

  # build process connectivity 
  communicator        = "ssh"
  ssh_username        = "${var.ssh_user}"
  ssh_password        = local.linux_user_pass

  # target image name and VM notes  
vm_name = "tmpl-${var.lab_name}-${var.new_vm_template}" notes = "build with packer \n version ${local.timestamp} " # target image hardware changes disk_controller_type = ["pvscsi"] storage { disk_size = var.extra_disk_size disk_thin_provisioned = true disk_controller_index = 0 } convert_to_template = true }

In the source we let the build plugin know how to connect to vCenter Server, what virtual infrastructure to use (datastores, hosts), what is the source template that we will use, how to connect to it and what is the configuration for the target template that we build. At the end we instruct the plugin to convert the newly created image to a VM template. Notice the communicator defined as "ssh". Communicators instruct Packer how to upload and execute scripts in the target image. It supports: none, ssh and winrm. Please mind some builders have their own communicators, such as Docker builder. 

With the current configuration we can actually define our build process. We've already accomplished half of our customization - adding the new disk is defined in the source block. In the build block we place the configuration that is needed by our build plugin. We will actually use a shell provisioner to install two packages - htop and tree. In my example, shell provisioner is sufficient to do the job, basically run in the target image "yum install". However I would recommend using a proper configuration management tool such as Ansible instead of directly running commands. 

build {
  sources = ["source.vsphere-clone.linux-vm-1"]

  provisioner "shell" {
    execute_command = "echo '${local.linux_user_pass}' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    inline = ["yum install tree htop -y"]
  }

}


Notice execute_command - this is a customization of the command we want to run (yum) and we use it to send the sudo password. The password itself is take from the local variable which is initialized with the value from kept in Vault secrets manager (as defined in variables.pkr.hcl).

The only thing left to do is to validate your configuration and run the build process.

packer validate .

packer build .

Please note that variable files in this post have been truncated for ease of reading. If you intend to use this example, you would need to fill in the missing variables and initialize them according to your environment. 

Sunday, December 18, 2022

vCenter Server 8 Upgrade - Unknown Host Error

 I've recently upgraded my vCenter Server 7.0.3 to 8.0 and during the process I've encountered the following error: Error in method invocation [Errno 1] Unknown host

The error is related to using IP address instead of FQDN as shown in the image below . It will appear in stage 2, after VCSA VM is deployed in the environment. 



To avoid it, just use FQDN everywhere. This error was first mentioned on upgrades from 6.7 to 7.0, but somehow in my lab got ported to upgrade from 7.0 to 8.0.  

Thursday, December 8, 2022

VMUG Leader Summit 2022 - How The Times Have Changed

It's been 32 months since the last summit took place and how the times have changed. Not sure if anyone would've dared in February 2020 to imagine the shifts to come a few weeks later and that will keep on coming throughout the next 2 and a half years. And it's not only socially and politically, but also technologically. The status quo of the human society was pretty much ignored and we had to learn, adapt and change. 

This week in Lisbon I had the opportunity to meet again the great community of people from all the corners of the world called VMUG. And the people in it is what makes this whole idea great. In 2020 it was the social part that impressed me too. Technologies come and change, trends may shift (again this word) on the way. The people involved in the process are the most important. 

Talking about people we had (as always) a number of select guests from VMware to talk to us. From Joe Baguley's (CTO EMEA) talk about skills gap and how AI and robots are slowly making it to the masses and into your house, to Duncan Epping's journey throughout VMware and the idea that everyone wants to be something, but very few want to make the effort to become that something, it all evolved around the changes that we are living and need to face. And it is scary, especially since the buzz word without any relation to the event, was AI, and more exactly ChatGPT, an OpenAI project that was trained to interact with humans in a conversation like natural language. The ease of usage and accessibility of this AI model is amazing and it will change a lot. But it is still only a tool in our hands. And the way we use it will make a difference. Only two years ago a talk around this subject would've brought a lot of smiles. 

I am a techie, not really great with words so let me wrap it up. As humans we need certainty, connection, contribution. VMUG is a community that brings all of this and more. And it does it through its members, the people that change, learn and grow along with the community.