Thursday, December 10, 2020

NSX Load Balancer - Redirecting Traffic to Maintenance Page

In this post we'll look at two situations in which we need to redirect vRealize Automation (vRA) traffic to a maintenance page. The type of traffic doesn't really matter as long as it goes through the load balancer, but to keep the post less abstract we'll use vRA 7.x. The use cases are:

  • vRA services are down (for example, the IaaS manager pool is gone) - in this case it helps to redirect traffic from the vRA login portal to a "sorry server"
  • scheduled maintenance window (for patching) - you need vRA working normally, but you don't want anyone else to log in and start playing around

For both cases we'll use simple application rules on the NSX load balancer (assuming, of course, that the services actually sit behind an NSX load balancer). In a highly available architecture, every vRA service is behind a load balancer. For simplicity we'll look only at the vRA appliances; the same approach can easily be extrapolated to the other services.



When a user connects to the vRA portal, the request goes to the virtual IP assigned to the load balancer virtual server. The virtual server (VRA Appliance Virtual Server) has an associated pool of servers (VRA Appliance pool) to which it can direct the traffic. The blue path represents the normal situation, in which the user reaches the portal on the vRA appliances. The green path does not exist yet and is the subject of this post: when all servers in the VRA Appliance pool are down, we want to redirect the user to another page. For this we need a few additional elements.

First we need a VM that runs an HTTP server and can serve a simple HTML page, called "Sorry Server" in the diagram above. We installed Apache, enabled SSL, and created under the document root a directory structure that mirrors the vRA login URL, so that a custom index.html page is served (in the example below the document root is /var/www/html):

/var/www/html/vcac/org/[orgName]/index.html

At the NSX level we add the "sorry server" to a new pool called "vra-maintenance-pool". We also create an application rule that checks the availability of the vRA appliances. Application rules are written in HAProxy syntax and are used to manipulate traffic on the load balancer. The rule is simple: an access control list (acl) checks whether any servers are up in the vRA appliance pool. If no server is up, the acl evaluates to true and we switch to another backend pool - the maintenance one:

# true when no server in the vRA appliance pool is up
acl vra-appliance-down nbsrv(vra-appliance-pool) eq 0
# use pool "vra-maintenance-pool" if the appliance pool is down
use_backend vra-maintenance-pool if vra-appliance-down

The rule is then linked to the virtual server of the vRA appliances. Whenever a request reaches the virtual server, the rule is evaluated, and if vra-appliance-pool is down, users are sent to the maintenance page. You can extend the rule to send users to the maintenance pool in other situations that render vRA unusable, such as the IaaS manager servers or other IaaS services being down.
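As a rough sketch of such an extension, the rule below also checks a second pool. The pool name "iaas-manager-pool" is hypothetical and has to match whatever the IaaS manager pool is called in your NSX configuration. It also assumes a health monitor is attached to each pool, since nbsrv only counts servers the load balancer currently considers up.

```
# true when no server in the vRA appliance pool is up
acl vra-appliance-down nbsrv(vra-appliance-pool) eq 0
# true when no server in the (hypothetical) IaaS manager pool is up
acl iaas-manager-down nbsrv(iaas-manager-pool) eq 0
# send users to the maintenance page if either pool is down
use_backend vra-maintenance-pool if vra-appliance-down or iaas-manager-down
```

The "or" keyword joins the two conditions, so a single use_backend line covers both failure cases.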

Another use for application rules is restricting access to vRA during scheduled maintenance. In this case the rule uses an ACL that matches the source IP of the request, so that only the listed IP addresses can reach the vRA virtual servers.

# allow only vra components and management server
acl allowed-servers src 192.168.1.1 192.168.10.10 192.168.20.10
# send everything else to maintenance page
use_backend vra-maintenance-pool if !allowed-servers

Traffic is sent to the maintenance pool whenever it comes from a source other than the vRA components themselves or the management server. Happy patching!
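One variation on the maintenance rule, in case the people doing the patching sit in a dedicated management network: the src match in HAProxy syntax also accepts networks in CIDR notation, so a whole subnet can be allowed instead of single hosts. The subnet below is a made-up placeholder, not taken from the setup above.

```
# allow a (hypothetical) management subnet plus the individual vRA component addresses
acl allowed-servers src 192.168.1.0/24 192.168.10.10 192.168.20.10
# send everything else to maintenance page
use_backend vra-maintenance-pool if !allowed-servers
```

Keep the allowed list as small as possible during the maintenance window and remove the rule from the virtual server once patching is done.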

