Sysadmin Stories: April 2023

Sunday, April 23, 2023

A Quick Look At Terraform Provider for Ansible

Terraform Provider for Ansible v1.0.0 has been release recently and while reading a couple of articles about it I actually wanted to see how it work end to end.

We're going to look in this article at an use case where we provision cloud infrastructure with Terraform and then use Ansible to configure that infrastructure.

To be more specific, in our scenario we are looking at achieving the following

1. use Terraform to deploy an infrastructure in Google Cloud: VPC, VM instance with an external IP address and firewall rule to allow access to the instance

2. automatically and transparently update Ansible inventory file

3. automatically configure the newly provisioned VM instance with Ansible

We use Terraform provider for Ansible and Ansible Terraform collection. From the collection we will be using the inventory plugin. Everything is run from a management machine installed with Terraform, Ansible and the Ansible collection (for installation please see the GitHub project linked above).

We will orchestrate everything from Terraform. We'll use Ansible provider to place the newly created VM instance to a specific Ansible group called "nginx_hosts" and execute Ansible commands to update the inventory and run the playbook that installs nginx.

For simplicity we use a flat structure with a single Terraform configuration file, an Ansible inventory file and an Ansible playbook.

We start by looking at the Ansible files.

inventory.yml contains only one line that references the collection inventory plugin:

plugin: cloud.terraform.terraform_provider

This way we make sure the inventory file is actually created dynamically based on the Terraform state file.

nginx_install.yml is the playbook that installs nginx on the VM instance. It's a very simple playbook that checks the latest version is installed and that it is started. We will be using Ubuntu for our Linux distribution.

---
- hosts: nginx_hosts
  tasks:
    - name: ensure nginx is at the latest version
      apt: name=nginx state=latest update_cache=true
      become: true
    - name: start nginx
      service:
          name: nginx
          state: started

Based on the code written so far, if we add any host to the group named "nginx_hosts" running the playbook will ensure latest version of nginx is installed. We have no knowledge of IP addresses or the hostnames of those hosts. We actually have no idea if there are any hosts in the group.

The Ansible hosts that we want to configure are created using Terraform. For simplicity there is only one flat file - main.tf file. We start by defining the Ansible provider.

terraform {
  required_providers {
    ansible = {
      source  = "ansible/ansible"
      version = "1.0.0"
    }
  }
}

Next we define the variables. We are using Google Cloud provider and we need some variables to configure it and deploy the resources. We are using a user_id to generate unique resource name for each deployment. We add GCP provider variables (region, AZ, project) and variables for the network.

variable "user_id" {
  type = string
  description = "unique id used to create resources"  
  default = "tfansible001"
}

variable "gcp_region" {
  type = string
  description = "Google Cloud region where to deploy the resources"  
  default = "europe-west4"
}

variable "gcp_zone" {
  type = string
  description = "Google Cloud availability zone where to deploy resources"  
  default = "europe-west4-a"
}

variable "gcp_project" {
  type = string
  description = "Google Cloud project name where to deploy resources" 
  default = "your-project"
}

variable "networks" {
  description = "list of VPC names and subnets"
  type        = map(any)
  default = {
    web = "192.168.0.1/24"
  }
}

variable "fwl_allowed_tcp_ports" {
  type        = map(any)
  description = "list of firewall ports to open for each VPC"
  default = {
    web = ["22", "80", "443"]
  }
}

We need also variables for Ansible provider resources: ansible user that can connect and configure the instance, the path the the ssh key file and the path to python executable. In case you use just want to test this, you can use your Google Cloud user.

variable "ansible_user" {
  type = string
  description = "Ansible user used to connect to the instance"
  default = "ansible_user"
}

variable "ansible_ssh_key" {
  type = string
  description = "ssh key file to use for ansible_user"
  default = "path_to_ssh_key_for_ansible_user"
}

variable "ansible_python" {
  type = string
  description = "path to python executable"
  default = "/usr/bin/python3"
}

Then we configure the Google Cloud provider. Note that in Terraform it is not mandatory to define a provider with requried_provider block. Also note that for Ansible provider there is no configuration.

provider "google" {
  region  = var.gcp_region
  zone    = var.gcp_zone
  project = var.gcp_project
}

Time to create the resources. We start with the VPC, subnet and firewall rules. The code iterates through the map object defined in variables section:

resource "google_compute_network" "main" {
  for_each                = var.networks
  name                    = "vpc-${each.key}-${var.user_id}"
  auto_create_subnetworks = "false"
}

resource "google_compute_subnetwork" "main" {
  for_each                 = var.networks
  name                     = "subnet-${each.key}-${var.user_id}"
  ip_cidr_range            = each.value
  network                  = google_compute_network.main[each.key].id
  private_ip_google_access = "true"
}

resource "google_compute_firewall" "allow" {
  for_each = var.fwl_allowed_tcp_ports
  name     = "allow-${each.key}"
  network  = google_compute_network.main[each.key].name

  allow {
    protocol = "tcp"
    ports    = each.value
  }

  source_ranges = ["0.0.0.0/0"]

  depends_on = [
    google_compute_network.main
  ]
}

Then we deploy the VM instance and we inject the ssh key using VM metadata. Again, ansible_user could be your Google user if you are using this for a quick test.

resource "google_compute_instance" "web" {
  name         = "web-vm-${var.user_id}"
  machine_type = "e2-medium"

  boot_disk {
    initialize_params {
      image = "projects/ubuntu-os-cloud/global/images/ubuntu-2210-kinetic-amd64-v20230125"
    }
  }

  network_interface {
    network    = google_compute_network.main["web"].self_link
    subnetwork = google_compute_subnetwork.main["web"].self_link
    access_config {}
  }

  metadata = {
    "ssh-keys" = <<EOT
      ansible_user:ssh-rsa AAAAB3NzaC1y...
     EOT
  }

}

So far we have the infrastructure deployed. We now need to configure the VM instance. We will configure a resource of type ansible_host. The resource will be used to dynamically update the Ansible inventory.

resource "time_sleep" "wait_20_seconds" {
  depends_on      = [google_compute_instance.web]
  create_duration = "20s"
}

resource "ansible_host" "gcp_instance" {
  name   = google_compute_instance.web.network_interface.0.access_config.0.nat_ip
  groups = ["nginx_hosts"]
  variables = {
    ansible_user                 = "${var.ansible_user}",
    ansible_ssh_private_key_file = "${var.ansible_ssh_key}",
    ansible_python_interpreter   = "${var.ansible_python}"
  }

  depends_on = [time_sleep.wait_20_seconds]
}

We've added a sleep time to make sure the VM instance is powered on and services are running. Please note that we add the public IP of the VM instance, whatever that is, as the host name in Ansible. The host is added to "nginx_hosts" group. We also let Ansible know what user, ssh key and python version to use.

Last thing to do is to update Ansible inventory and run the playbook. We will use terraform_data resources to execute Ansible command line.

resource "terraform_data" "ansible_inventory" {
  provisioner "local-exec" {
    command = "ansible-inventory -i inventory.yml --graph --vars"
  }
  depends_on = [ansible_host.gcp_instance]
}

resource "terraform_data" "ansible_playbook" {
  provisioner "local-exec" {
    command = "ansible-playbook -i inventory.yml nginx_install.yml"
  }
  depends_on = [terraform_data.ansible_inventory]
}

And that's it. Once you update the code above with your information and run terraform apply, it will automatically deploy a Google Cloud VM instance and configure it with Ansible. All transparent and dynamic, all driven from Terraform.

In this article you've seen how to use Terraform to deploy a cloud VM instance and automatically and transparently configure it with Ansible.

Monday, April 17, 2023

Moving Backups to Hardened Linux Repositories

It's not enough to have a backup of your data. You need to make sure that you will be able to recover from that backup when the time comes. And one of the best ways to make sure you can do it, is to make protect your backups from being modified intentionally or unintentionally.

In Veeam Backup & Replication, a hardened repository is using a Linux server to provide immutability for your backups. The feature was first released in version 11. Let's see what makes the hardened repository special, how it protects your backups from changes and how easy is to actually start using it

Immutable file attribute

Linux file system allows setting special attributes to its files. One of these attributes is immutable attribute. As long as it is set on a file, that file cannot be modified by any user, not even root. More, root user is the only user that can actually set and unset the immutable attribute on a specific file. You can do it using lsattr and chattr commands in Linux as seen in the below screenshot:

Veeam hardened repo uses exactly the same mechanism of making backup files immutable.

Isolate Linux processes

To run a successful repository, Veeam needs several functionalities: to receive data from proxies, to open and close firewall ports, to set and unset immutability as per retention policy. In order to harden the repository, Veeam implements these functionalities as separate Linux processes.

The processes that sets and unsets immutable attribute on the backup files is called veeamimmureposvc and it needs to run with root privileges, as root user is the only user that can modify immutable attribute.

veeamtransport --run-service is the Data Mover service performing data receiving, processing and storing. Because it is a service exposed on the network, it is running under a standard Linux user. In case of a breach, the service will give access only to a standard user with limited privileges. The Linux user under which this service runs must not be allowed to elevate its privileges.

A third process takes care of dynamically opening and closing firewall ports: veeamtransport --run-environmentsvc and this one is also running with elevated privileges.

The following screen shot shows the three main services that are part of a hardened repository.

Single use credentials

Another layer of protection is added through the way the credentials are handled within the backup server.

To add the Linux repo to the backup server you need to specify Linux credentials. These credentials are only used during the initial configuration process and they are never stored in backup server's credential manager. Temporary privilege elevation may be needed during the repository configuration for deployment and installation of Veeam processes. After the configuration process finishes, all elevated privileges must be revoked from the user.

Additional repository features - fast clone

This one is not a security related feature, but it comes in as a great add on to the hardened repository.

In case you formatted your file system with XFS file system and you have a supported Linux distribution (see this user guide page for more details), Veeam will use fast clone to reduce used disk space on the repository and increase the speed of synthetic backups and transformations. Fast clone works by referencing existing data blocks on the repository instead of copying the data blocks between files.

Using the hardened repository

For new backup jobs, just point them to your hardened repository. In case you have existing backups then you need to migrate those to your new repo. With v12 comes a new feature that allows to move any backup from an existing repository to another one. Simply select your backup, right click it and you will see that now you can "move backup"

Let's look at moving backups from a Windows NTFS repository to our hardened Linux repo. We start with an empty repository configured with a service account called veeambackup

The first backup chain is for an unencrypted backup job. The backup job is configured to use a standard Windows repository. There are 2 full restore points in the backup chain. Each restore point is 960 MB and the total size on disk is 1.87 GB. We use "move backup" to send the the backup chain to the hardened repository:

Once the move processes finished, the backup job has been updated to point to the new repository. Let's check what happened on the Linux hardened repo.

Find the backup chain in our repo:

Check the immutability flag:

The restore points are set as immutable. The metadata file is not since this file is modified during each backup operation. Trying to delete any of the restore points will fail:

We can also check that XFS fast clone is working by looking at the used space on the repo which is less that the sum of the 2 full backups:

In this post we've looked at the features of hardened repository and how they work. To implement a hardened repository in your environment follow the steps in the user guide