Monday, April 20, 2020

Tips & Tricks to Install and Upgrade vRA 8.x in a Small Lab

I started building my vRA 8 environment in the home lab and, even though the process was a pretty smooth one, working with limited resources presented challenges that I hope this article will help you overcome easily.

For me it was a 2 step process: install vRA 8.0.1, then upgrade to 8.1 within a week. So I will treat each step independently. Some of the challenges of installing 8.0.1 will certainly apply to a direct installation of 8.1.

My home lab is made of ESXi hosts running minimal hardware (4 cores and 32 GB of RAM per host). The requirements for vRA 8 are as follows:
- VMware Identity Manager (vIDM) - minimum 2 vCPU / 6 GB RAM
- vRealize Lifecycle Manager (vRLCM) - minimum 2 vCPU / 6 GB RAM
- vRealize Automation (vRA) 8.0.1 - 8 vCPU / 32 GB RAM
- vRealize Automation (vRA) 8.1 - 12 vCPU / 40 GB RAM

As you can easily see, vRA 8 shouldn't actually be installed on a 32 GB ESXi host. And if I hadn't started with 8.0.1, I don't think I would have even tried to install 8.1. However, I did start with 8.0.1. I also hadn't read the system requirements in the beginning.


Installation of vRA 8.0.1
  • vRA certificate 

If you have a greenfield environment like I did, then the first thing to install is vRLCM using the easy installer. At the "Identity Manager Configuration" step make sure to select "Install New VMware Identity Manager". This will deploy both appliances. Now you can log in to vRLCM and create a new environment with vRA 8.

vRA 8 needs a certificate, even if it's self-signed. Go first to Locker - Certificate and generate a new certificate.
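If you would rather generate the certificate yourself and import it into the Locker, a minimal self-signed sketch with openssl looks like this (requires OpenSSL 1.1.1+ for -addext; the hostname and IP are just placeholders for my lab):

    # self-signed certificate with a SAN entry, valid for 2 years
    openssl req -x509 -newkey rsa:2048 -nodes -days 730 \
      -keyout vra.key -out vra.crt \
      -subj "/CN=vra.mylab.com" \
      -addext "subjectAltName=DNS:vra.mylab.com,IP:192.168.100.20"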

Next, fill in all the required information: hostname, IP addresses, passwords, the usual stuff. The precheck should also execute successfully. Before launching the deployment, you can save the configuration as a JSON file. I recommend doing so, as it may come in handy if you ever want to automate this install.

  • vRA VM configuration downgrade
The deployment is a multi-stage process. In the first stage it deploys the actual VM from OVF and tries to power it on. In my case it failed because it tried to start an 8 vCPU / 32 GB VM.

Open vSphere Client and change the settings of the VM - I used 4 vCPU and 30 GB of RAM. I did try with 24 GB, but that ended up with containers not being scheduled due to lack of resources.
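If you prefer the command line over the vSphere Client, the downgrade can be scripted too; a rough sketch with govc (the VM name vra-va is a placeholder, memory is given in MB):

    # power off, downsize to 4 vCPU / 30 GB, power back on
    govc vm.power -off vra-va
    govc vm.change -vm vra-va -c 4 -m 30720
    govc vm.power -on vra-va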

In this case I think 30 GB is a decent compromise. Once you have modified the VM, go back to vRLCM and restart the task (Retry request). Take care not to delete the already existing VM. At this point you only have to wait until all 8 stages are finished.
  • vRO expired password
Unfortunately, two hours later I had another error, this time during the vravainitializecluster step. This issue is specific to 8.0.1 and it does not happen every time, so you may not see it.


To confirm it, SSH to the vRA VM, open the log file indicated in the error (/var/log/deploy.log) and look for a database connection error. Also run the following command to check the status of the vRO containers: kubectl -n prelude get pods. If the vRO container is in CrashLoopBackOff, a quick search for "vro container CrashLoopBackOff" will get you to the KB on new installs of vRealize Orchestrator 8.x failing to install due to a POD STATUS of 'CrashLoopBackOff' (76870). The error is caused by an expired password. Apply the steps in the KB and restart the deployment; it picks up where it left off and soon vRA is installed and running.
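In general, when a pod misbehaves during the install, its events and the log of the crashed container tell you why; a quick sketch (the pod name is a placeholder taken from the get pods output):

    # events reveal scheduling problems such as insufficient memory
    kubectl -n prelude describe pod <vro-pod-name>
    # logs from the previous, crashed container instance
    kubectl -n prelude logs <vro-pod-name> --previous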

I am curious how a direct install of vRA 8.1 would actually work given the resource constraints. But even if it doesn't work, there is a way to get there.

A few days later 8.1 was released so it was time to upgrade.


Upgrade to vRA 8.1

For a step by step article on upgrading to 8.1 you may look here. As stated above, I will only focus on the small hiccups.


  • Binary mapping

You need to first upgrade vRLCM and vIDM. Once these two steps are done (again, pretty straightforward thanks to Lifecycle Manager) you will upgrade vRA. I've downloaded the updaterepo ISO file from the VMware site and uploaded it to /data on vRLCM (using WinSCP). Then I created a new binary mapping by going to Settings - Binary Mappings and adding the binary.
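Any SCP client will do for the upload; a hypothetical one-liner (the ISO file name is illustrative):

    scp vra-updaterepo-8.1.0.iso root@vrlcm.mylab.com:/data/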

  • Precheck ignore
You can start the upgrade using the vRLCM repository. The precheck will fail because this time there is a VM, it looks at its configuration, and it does not like seeing 4 CPU and 30 GB of RAM.

Do like I did: ignore the errors and start the upgrade. About one hour later it should finish successfully.

  • Snapshot 
Do not forget that the upgrade process takes a snapshot of the VM, which you need to delete afterwards.
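If you prefer cleaning up from the command line, a short govc sketch (VM and snapshot names are placeholders):

    # list the snapshots, then remove the one left behind by the upgrade
    govc snapshot.tree -vm vra-va
    govc snapshot.remove -vm vra-va <snapshot-name>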




Now I am running vRA 8.1 in my home lab and it works decently, at least the Cloud Assembly part that I've been playing with so far. I do understand that the resources are required for good reasons, but a 12 core / 64 GB host is not easy to get. In this case, running it with reduced performance is better than not running it at all. There are obvious impacts on the services, for example the time it takes to boot up; it's impressive that it gets there in the end.
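If you want to see that struggle live, you can watch the pods slowly coming up after a reboot (assuming SSH access to the appliance):

    watch -n 10 "kubectl -n prelude get pods"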





Thursday, April 16, 2020

Veeam Backup & Replication - Changed Block Tracking Reset

Changed Block Tracking (CBT) is a feature that allows tracking of changed disk sectors. Tracking for virtual disks is done in the virtualization layer. CBT is exposed through vSphere APIs for Data Protection (VADP) to 3rd party applications.

VBR uses this feature to track changes between incremental backups and make those backups faster. Instead of reading the whole vmdk, it asks for and receives only the blocks changed since the last incremental backup. How do you know CBT is enabled and used?

From the VMware point of view, you will see for each vmdk an additional file named vmdk_name-ctk.vmdk.
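If you would rather check from the command line than browse the datastore, a quick sketch with govc (the VM and datastore names are placeholders):

    # CBT shows up as ctkEnabled in the VM's advanced settings
    govc vm.info -e myvm | grep -i ctkenabled
    # and as -ctk.vmdk files sitting next to the virtual disks
    govc datastore.ls -ds DATASTORE-1 myvm | grep ctk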

From the Veeam point of view, you will see CBT next to the disk details in the backup job statistics. Let's look at what actually happens in the backup job.

CBT on, active full backup, read 20 GB

CBT on, incremental backup, read 11 MB

CBT off, incremental backup, read 20 GB


This is CBT in action. The first picture is an active full, so the whole disk has been read. The second one is an incremental where only the changes since the last backup have been read. When CBT was disabled, the second incremental also read 20 GB. Now extrapolate this to hundreds of VMs to understand the importance of CBT: less data read means not only a shorter backup window, but also less load on the production storage.

Luckily, CBT is enabled by default for all newly created backup jobs. It can be found in the vSphere integration tab, in Storage > Advanced. 


CBT is not supported for physical mode RDMs, or if a VM already has snapshots when CBT is activated for the first time. Sometimes CBT can get corrupted and the only way to solve it is to reset CBT, which is not easily done. A new feature in VBR v10 allows you to automatically reset CBT on all VMs after an active full backup is executed. Remembering how many times I heard support guys advising a CBT reset, I think this is a cool feature to add. The CBT reset action is caught in the backup stats window.
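A manual reset boils down to "turn CBT off, power cycle, turn it back on"; a rough govc sketch in the spirit of VMware's KB guidance (VM name and disk key are placeholders, and the VM should be powered off for the reconfiguration to stick):

    # with the VM powered off, disable CBT at the VM and disk level
    govc vm.power -off myvm
    govc vm.change -vm myvm -e ctkEnabled=false -e scsi0:0.ctkEnabled=false
    govc vm.power -on myvm
    # re-enable later with the same command and true values;
    # the next backup then reads the full disk and starts tracking fresh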


The downside is that the active full backup will take a bit longer, but you will be protected against potential CBT corruptions.


Monday, April 6, 2020

Upgrading vCSA 6.7 to vCSA 7.0

First things first: back up vCSA 6.7. Use a backup solution or do a vCenter backup from VAMI. It's never bad to have one, even though you will see that it can be skipped.

Next, mount the ISO and start the installer UI. The UI presents the already known options: install, upgrade, migrate, restore. My target is to upgrade an existing lab environment, so I am choosing upgrade. Before going further into the post, I want to clarify the upgrade process. It is not an in-place upgrade; it is actually a migration, and you will end up with a new VM running vCSA 7.0 alongside the old vCSA 6.7.

I don't intend to describe the step by step upgrade since there are already a few good blog posts and the process itself is pretty straightforward. I will, however, highlight some things I came across during the upgrade.

  • you will end up with 2 VMs - a powered off old vCSA 6.7 and a powered on vCSA 7.0
  • in the end, vCSA 7.0 will preserve the FQDN and IP of vCSA 6.7
  • a temporary IP address is required for vCSA 7.0 to be used during the data migration from vCSA 6.7 (that moment in time when both VMs are up and running)
  • it's a 2 stage process: first a new vCSA VM is deployed, then the data is migrated
  • migration offers the possibility to choose how much data you actually want transferred (the more you choose, the longer it takes)
    • configuration and inventory 
    • configuration, inventory, tasks and events 
    • configuration, inventory, tasks, events and performance metrics

  • the whole process took almost 2 hours (in my lab); expect longer times for real environments

During the upgrade itself I got a couple of warnings at pre-check and a couple of notifications at the end; the rest was smooth and uneventful. The warning was about not having DRS enabled on my cluster, which was fine because I had a single node cluster where vCSA was running. The notifications are about TLS 1.0 and 1.1 being disabled and Auto-Deploy needing an update.



I did try to upgrade using the CLI installer; however, there are some issues with the upgrade templates and their schema in the GA version (15843807), and it kept failing during the JSON template precheck. I will come back to this topic once I figure it out.

Saturday, April 4, 2020

Veeam Backup and Replication v10 - General Options

I finally got v10 up and running in my lab. I started randomly clicking around the interface and noticed a couple of changes in the General Options menu. I think it's one of the most ignored parts of Veeam Backup & Replication, even though it holds a lot of important settings. So I decided to pay it the proper respect and write a post about it. Just a heads up: if you are interested in VBR notifications, alerts, audits and reports, then this is where you should look.

I/O Control
By default it is disabled. However, if you are running in a production environment where backups may overlap with business hours, this should be enabled. It ensures backups are not going to impact the production datastores by imposing a latency threshold and throttling down backup jobs. The thresholds should be set according to the underlying physical storage and the current production latency.


Notifications
This tab is enabled by default. Besides license and update related notifications, there are 3 settings regarding notifications on space usage:

  • warn when backup storage space is less than 10%
  • warn when production storage free space is less than 10% - this will cause your backup jobs to finish with a warning
  • skip VM processing when free disk space is below 5% - in this case a backup job will fail for the VMs on those datastores

As you can see, these are not just simple notifications; they also determine the result of backup jobs. You can tweak the default settings, but it is highly recommended to leave enough free space on datastores for snapshots to grow during a backup. Otherwise you may run into an out-of-space situation.

Security
This is used to change the VBR SSL certificate (the default is self-signed), to set up trust for discovered Linux hosts, and it also has a new setting that appeared in v10: the audit logs location.

The audit logs are used to track file-level restore activities. By changing the location to a centralized repository or even a WORM (write once read many) device, you can ensure a proper audit trail and archival. In case you run a file-level restore, the following information is logged in a CSV format file: Time, User, SID, Operation, Result, Object.


    29.03.2020 18:35:51Z, MYLAB\Administrator, S-1-5-21-AAAAAAA-XXXXXXX-NNNNNNN-500, Restore, Success, C:\Users\Administrator.MYLAB\Desktop\order_form_7393 VMUG Feb 2019.pdf

However, there is a way around it - you can open a file directly from the backup in the explorer and read its content. This activity will not be logged.

E-mail Settings and SNMP Settings
These two tabs are dedicated to reporting over e-mail and alerting over SNMP. When enabling e-mail reporting you may choose the level of granularity. By default you will receive e-mails for any job status: success, warning or fail.


For SNMP you may configure up to 5 different receivers, as well as their community string and listening port.

Session history
Session history configures the number of sessions to display and the retention period. Up to 9.5 U4 the default retention period was 53 weeks. With v10 this changed to 13 weeks.

From an operational point of view, the most important settings are those related to I/O control, notifications and e-mail/SNMP, while from an auditing point of view, the audit log and the history retention period matter most.

Thursday, April 2, 2020

vCenter Server Appliance 7.0 Command Line Installer

One of my favorite features in vSphere is the command line install of vCenter Server. It first appeared with vCenter Server 6.0. It is based on a JSON file input and can be used to do a fresh install of vCSA 7.0, upgrade an existing vCSA 6.5 or 6.7 installation to 7.0, or migrate a Windows vCenter Server 6.5 or 6.7 to vCSA 7.0.

The installer can be run from Windows, Linux or Mac. To access it, you need the vCSA ISO file; Windows users will locate the folder vcsa-cli-installer\win32. JSON templates are found in the templates folder. You need to modify the JSON template that fits your use case. I will do a fresh install of vCSA 7.0 in my lab, so I will be using the template embedded_vCSA_on_VC.json, which deploys the new vCSA inside an existing vCenter Server. The template is very well commented; however, I will post here an example of what a simple configuration looks like. Please be aware that this is just a snippet of the actual template and some parts have been left out for ease of reading.


        "new_vcsa": {
            "vc": {
                "hostname": "vcsa67.mylab.com",
                "username": "administrator@mylab.com",
                "password": "",
                "deployment_network": "VM Network",
                "datacenter": [
                    "VDC-1"
                ],
                "datastore": "DATASTORE-1",
                "target": [
                    "CLUSTER-1"
                ]
            },
            "appliance": {
                "thin_disk_mode": true,
                "deployment_option": "small",
                "name": "vcsa70"
            },
            "network": {
                "ip_family": "ipv4",
                "mode": "static",
                "system_name": "vcsa70.mylab.com",
                "ip": "192.168.100.1",
                "prefix": "24",
                "gateway": "192.168.100.254",
                "dns_servers": [
                    "192.168.1.10"
                ]
            },
            "os": {
                "password": "",
                "ntp_servers": "0.ro.pool.ntp.org",
                "ssh_enable": true
            },
            "sso": {
                "password": "",
                "domain_name": "vsphere.local"
            }
        }
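Before feeding the finished file to the installer, a quick local syntax check catches stray commas (any JSON linter will do; python is usually at hand):

    python -m json.tool embedded_vCSA_on_VC.json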
    

As you can see, once you create the template it can be reused many times. What for, you may ask; one answer is nested labs. If you are unsure what size the vCSA should be, the installer will tell you:

    .\vcsa-deploy.exe --supported-deployment-sizes

The installer takes several parameters besides the JSON file:

    .\vcsa-deploy.exe install --accept-eula [--verify-template-only|--precheck-only] [file_path_to_json]

If you want to automatically accept the SSL certificate thumbprint, you can add the --no-ssl-certificate-verification parameter.

As seen above, the installer comes with 2 options that let you check everything is fine before actually starting the install:
• verify-template-only - runs a JSON file verification to validate the structure and input parameters (e.g. password strength, IP address, netmask). The final check result is displayed along with the path to the log file, which contains all the details. For example, if you typed an invalid IP address, the following message is displayed in the log file:

    2020-03-27 19:44:06,232 - vCSACliInstallLogger - ERROR - The value '192.268.100.1' of the key 'ip' in section 'new_vcsa', subsection 'network' is invalid. Correct the value and rerun the script.

• precheck-only - does a dry run of the installer. This time it connects to the vCenter Server and checks that the environment values are actually correct: for example, that you don't have another VM with the same name and that the vCenter objects (datacenter, datastore, cluster or host) exist. It also does a ping test to validate that the IP/FQDN entered for the new vCSA is available.

    ================ [FAILED] Task: PrecheckTask: Running prechecks. execution
    Error message: ApplianceName: A virtual machine with the name 'vcsa70' already
    exists on the target ESXi host or cluster. Choose a different name for the
    vCenter Server Appliance (case-insensitive).

Of course, you don't have to run both checks, or even any check if you are confident enough. For me, precheck-only helped since I didn't understand how to fill in the JSON file the first time (I will blame it on the language barrier). One very important aspect of the installation is to have DNS records set up and working. If you don't, even if the prechecks and the actual install work, the first boot of vCSA will most likely fail.
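Putting it all together, the flow I ended up with looks roughly like this (the path to the JSON file is illustrative):

    .\vcsa-deploy.exe install --accept-eula --verify-template-only C:\lab\embedded_vCSA_on_VC.json
    .\vcsa-deploy.exe install --accept-eula --precheck-only C:\lab\embedded_vCSA_on_VC.json
    .\vcsa-deploy.exe install --accept-eula --no-ssl-certificate-verification C:\lab\embedded_vCSA_on_VC.json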

Having everything set up and checked, you just run the install command and that's it. I like the CLI installer because it is simple, powerful and repeatable. No more filling in fields in a GUI and waiting for progress screens.