First look at HCX Network Extension HA feature

The High Availability feature for HCX Network Extension appliances was introduced in version 4.3. This was a big deal: previously, anyone who needed HA for the appliances extending L2 networks to the cloud (a basic resiliency requirement) and wanted to stay on VMware's stack had to deploy a pair of NSX-T standalone Edges on-prem, leverage NSX-T on the cloud side, and set up an L2VPN.

VMware documentation mentions the following important prerequisites to consider before deploying HCX NE HA:

  • Network Extension HA requires the HCX Enterprise license.
  • Network Extension High Availability protects against one Network Extension appliance failure in a HA group.
  • Network Extension HA operates without preemption, with no automatic failback of an appliance pair to the Active role.
  • Network Extension HA Standby appliances are assigned IP addresses from the Network Profile IP pool.
  • The Network Extension appliances selected for HA activation must have no networks extended over them.

Another interesting aspect of NE in HA mode is the upgrade process:

In-Service upgrade is not available for Network Extension High Availability (HA) groups. HA groups use the failover process to complete the upgrade. In this case, the Standby pair is upgraded first. After the Standby upgrade finishes, a switchover occurs and the Standby pair takes on the Active role. At that point, the previously Active pair is upgraded and takes on the Standby role.

Let’s take a look at this new feature. In the HCX UI 4.3+, in the Interconnect -> Service Mesh -> View Appliances view, there is a new option called ACTIVATE HIGH AVAILABILITY.

First, you need to have a pair of deployed NE appliances; the option won't work when there is no eligible partner for HA.

The system also checks whether there are extended networks on the appliance you select for HA.

It can be a challenge to enable HA in an environment where networks are already extended. In most cases this would require downtime, because we would have to unextend the existing networks on the NE appliances for the duration of the HA configuration.

The system also checks whether there are eligible NE appliances to activate the HA feature, and we get a button that is a shortcut to edit the Service Mesh and add more appliances. This can also be a challenge if we don't have a sufficient number of free IPs in our Network Profile's IP pool: for two additional NE appliances on-prem and two on the cloud side, we need one management IP and one uplink IP for each of them.

Once the appliances are deployed, you only need to press the Activate HA button. Everything is configured for us, much like vSphere HA. The new HA appliances form an HA group with a specific UUID.

When HA is enabled, we can monitor its health in the HA Management tab. In the example below, we have 2 pairs: us-east-NE-I2 (on-prem) with us-east-NE-R2 (the cloud side) and us-east-NE-I3 (on-prem) with us-east-NE-R3 (the cloud side). Right after creation, the first pair is ACTIVE and the second is STANDBY.

Also, in the Appliances view we can quickly check which NE appliance is ACTIVE and which one is STANDBY.

NE appliances with HA enabled can coexist with other NE appliances. This means we can still use single NE appliances for less critical workloads and HA groups only for selected networks.

During HA group creation, a VM/Host rule is created in vCenter to make sure NE appliances in the same HA group won't run on the same host.
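Out of curiosity, this rule can also be verified outside the UI. Below is a minimal pyVmomi sketch (the vCenter address and credentials are placeholders, not from my lab) that prints the VM anti-affinity rules of every cluster, including the one created for an NE HA group:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()                      # lab only; use verified certs in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

clusters = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.ClusterComputeResource], True)
for cluster in clusters.view:
    for rule in cluster.configurationEx.rule or []:
        # anti-affinity rules keep the listed VMs on different hosts
        if isinstance(rule, vim.cluster.AntiAffinityRuleSpec):
            members = ", ".join(vm.name for vm in rule.vm)
            print(f"{cluster.name}: {rule.name} (enabled={rule.enabled}) -> {members}")
clusters.Destroy()
Disconnect(si)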

After the HA group is deployed, we can simulate a failure of one NE.

In this example I am using two sites: Sydney represents on-prem and Ashburn is my cloud side.

The test network (192.168.99.1/24) is extended over this new HA group from on-prem (NE-I2 and NE-I3) to the cloud (NE-R2 and NE-R3).

There is a VM test_aga_vm_2 (192.168.99.107) on-prem connected to the test subnet, and on the cloud side there is a VM test_aga_vm (192.168.99.103) on the extended network L2E_test. As the VMs sit on opposite sides, we can keep a ping running between them to test connectivity during the NE failover.
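Instead of eyeballing the ping output, a small script can count the lost packets for us. Here is a quick sketch (it assumes Linux ping flags and uses the cloud-side test VM as the target):

import subprocess, time

target = "192.168.99.103"          # test_aga_vm on the extended network
sent = lost = 0
try:
    while True:
        sent += 1
        ok = subprocess.run(["ping", "-c", "1", "-W", "1", target],
                            stdout=subprocess.DEVNULL).returncode == 0
        if not ok:
            lost += 1
            print(f"{time.strftime('%H:%M:%S')} lost a packet ({lost}/{sent})")
        time.sleep(1)
except KeyboardInterrupt:
    print(f"sent={sent} lost={lost}")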

If you look closely at the NE appliances, there is nothing HA-specific about them: no dedicated HA subnet, only the management, uplink and extended networks.

On the on-prem side, there is a test network attached to NE-I2, same for NE-I3.

On the cloud side, there is the L2E_test network attached to NE-R2, same for NE-R3.

Now, to force a failover, I am shutting down an appliance on the on-prem side: NE-I2.
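The same hard power-off could also be scripted. A hedged pyVmomi sketch (the vCenter address and credentials are placeholders) of powering off the NE-I2 appliance VM:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()                      # lab only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

vms = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
ne_i2 = next(vm for vm in vms.view if vm.name.endswith("NE-I2"))   # the appliance to "fail"
vms.Destroy()

WaitForTask(ne_i2.PowerOffVM_Task())    # hard power-off, like pulling the plug
Disconnect(si)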

What happens next is a failover of the first pair of appliances: NE-I3 and NE-R3 become ACTIVE. It is always the whole pair that changes state, so when NE-I2 goes down, NE-R2 fails over together with it and the NE-I3/NE-R3 pair takes over the Active role.

There is a small packet loss (as expected), and it also takes some time for the HA view in the HCX UI to show the updated status of the appliances. The communication path between the on-prem NE and the NE on GCVE was restored and pings were successful, but the UI still showed a DEGRADED state for the HA appliances for some time.

I lost only 5 pings during the failover; it was pretty quick.

After a few refreshes the UI shows a new state: HEALTHY, with NE-I2 and NE-R2 in the STANDBY role and NE-I3 and NE-R3 in the ACTIVE role.

There is also an interesting option in the UI to view the HA activity timeline, where we can quickly check the HA history and the states of the appliances.

If we don’t want to power off the appliances, we can use the “manual failover” option to test HA.

This one seems to be more graceful as I lost only 1 ping during this test.

Maintenance Mode of ESXis with Zerto VRAs installed in a public cloud

For Zerto users, maintenance work on ESXis has always been a challenge. Zerto’s Virtual Replication Appliances (VRAs) are pinned to their dedicated hosts, so when we put a host in Maintenance Mode, the VRAs can’t be auto-evacuated. VRAs shouldn’t even be evacuated, as they can only work on their dedicated hosts. If the vSphere cluster is on-prem, we have many options to address this issue: we can power off the VRAs, delete them, force-migrate them, etc. We can stop and resume replications to make sure our data stays consistent.

But in a public cloud with a shared responsibility model, the cloud provider is responsible for all maintenance work on ESXis, yet they don’t have access to your Zerto application. Imagine a situation where the cloud provider needs to replace or upgrade an ESXi node that is still operational and wants to evacuate this host. With a VRA pinned to the host, this evacuation won’t work for them. The provider probably also won’t want to power off the VRA appliance, because they know it would break your replications. This situation can seriously delay any maintenance work on an ESXi in a public cloud.

What can you do as a Zerto admin if you are using a public cloud as your replication target?

It turns out Zerto offers a very nice feature called Workload Automation. You can enable it in the Site Settings of Zerto Virtual Manager (ZVM).

Workload Automation can detect when a host is entering MM and can “evacuate” (that is, power off) its VRA in such situations. It can also detect when a host exits MM and bring the VRA back to an operational state. Thanks to this feature, a cloud provider can perform any maintenance work on your hosts without breaking your Zerto setup.

There are also other very useful options. When a new node is added to a cluster (due to an auto-scaling policy or a node replacement), Zerto will detect this and install its VRA there. When a node is removed from a cluster, Zerto will remove it from its inventory.

I ran a simple test to check how it works, using a 4-node vSAN cluster. When I put one of the hosts, esxi-793, in MM, I noticed Zerto shut down the VRA appliance on this host.
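For reference, the same test can be scripted. Below is a hedged pyVmomi sketch (the vCenter address and credentials are placeholders, and the "Z-VRA" naming prefix is an assumption about typical Zerto VRA names) that puts the host into MM and polls the power state of its VRA, so you can watch Workload Automation react:

import ssl, time
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()                      # lab only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

hosts = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
host = next(h for h in hosts.view if h.name.startswith("esxi-793"))
hosts.Destroy()

# Zerto VRAs are typically named "Z-VRA-<host>"; adjust the prefix to your environment.
vra = next(vm for vm in host.vm if vm.name.startswith("Z-VRA"))

# Ask the host to enter Maintenance Mode (on a vSAN cluster you may also want to pass
# a maintenanceSpec with the desired vSAN decommission mode), then watch what happens.
host.EnterMaintenanceMode_Task(timeout=0)
for _ in range(60):
    print(f"inMaintenanceMode={host.runtime.inMaintenanceMode} "
          f"VRA={vra.runtime.powerState}")
    time.sleep(10)
Disconnect(si)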

A new alert was raised in the ZVM UI console that one of its VRA appliances had been powered off.

I also noticed Zerto powered off not only the VRA but also its helper appliance, the VRAH.

When I took esxi-793 out of MM, Zerto detected it correctly and powered the appliances back on.

It seems Zerto Workload Automation is a must-have option to keep ON when you are running Zerto in a public cloud and you don’t want to delay maintenance work on your hosts.

vSAN Health History, something I have been waiting for a long time

This is a small but very useful feature that comes with vSAN 7.0 U2. In previous vSAN releases, when you troubleshot vSAN issues (especially intermittent ones), vSAN Health showed only the current status. If an issue occurred from time to time, you had to check the logs. Now you can enable Health History in the vCenter UI.

By default, the historical health data is kept for 30 days back, which is sufficient in most cases.

After selecting the timeframe, you will see which checks failed and when it happened.

I noticed that for backplane maintenance operations like data move, we get an Info icon, not a warning.

So now we can retrieve a historical snapshot of what happened with our vSAN cluster in a certain timeframe.

Migrating VMs from VMware on-prem (or GCVE) to Google Compute Engine

When we want to migrate our existing (on-prem) VMware workloads to the cloud, we have two options. One is a lift-and-shift migration to a VMware-as-a-Service solution like GCVE, where VMware HCX can be used to provide high-performance migrations with minimal or no downtime. The other is to convert VMs from the VMware format to a cloud-native one like Google Compute Engine. This article briefly covers the second approach.

The VM format conversion sounds scary, but it is actually a very easy process if you are using the Google Migrate for Compute (M4C) service.

Imagine you have a VM (or a bunch of them) that runs in a vSphere environment (on-prem or on GCVE) and uses an OS supported by M4C. In a matter of, let’s say, an hour (depending on the VM size and its data churn, and assuming you have connectivity between your on-prem VMware cluster and the VPC where your M4C service is running, or a Private Service Connection to your GCVE cluster), you can have it running on Google Cloud.

In my example, I will use test-aga-vm, which runs Ubuntu.

What you need to do is set up your M4C environment following Google’s official documentation. One of the steps is to enable the Migrate for Compute API, after which you will find the M4C dashboard under “Compute Engine”.

When everything is set on the GCP side (APIs and permissions), the next step is to deploy the M4C appliance on your vSphere cluster. The link to the most recent OVA version is included in the documentation.

The most important part when deploying the OVA is to place it in a network segment that can access googleapis.com and your DNS. The M4C appliance has to be able to resolve your vCenter FQDN as well as the FQDNs of all ESXis in your cluster. It is OK to run it on a vSAN cluster.
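Before deploying, it may be worth running a small preflight check from the segment you plan to use. Below is a simple sketch (the FQDNs are placeholders for your own vCenter and ESXi names) that verifies DNS resolution and HTTPS reachability towards googleapis.com:

import socket

# FQDNs below are placeholders; replace them with your vCenter and ESXi names.
names = ["vcsa-599.example.local", "esxi-01.example.local", "googleapis.com"]
for name in names:
    try:
        print(f"{name} -> {socket.gethostbyname(name)}")
    except socket.gaierror as err:
        print(f"{name} does NOT resolve: {err}")

# Basic TCP 443 reachability check towards the Google APIs endpoint.
with socket.create_connection(("www.googleapis.com", 443), timeout=5):
    print("TCP 443 to www.googleapis.com is open")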

When the M4C appliance is ready, the only way to SSH to it is with its SSH private key. The SSH public key has to be provided when the appliance is deployed.

After the M4C appliance is powered on, it has to be registered in your project. The registration process looks like this:

admin@migrate-appliance:/m4c/OSS$ m4c register
Please enter vCenter host address: xxx.us-east4.gve.goog
vCenter server SSL certificate fingerprint is xxx Do you approve? [Y/n]Y
Please enter vCenter account name to be used by this appliance: solution-user-05@gve.local
Please enter vCenter account password:xxx
vSphere credentials verified

Please visit this URL to authorize this application: https://accounts.google.com/o/xxx
Enter the authorization code: xxx
This Migrate Connector was registered to Source xxx in Project yyy
Please select project:
xxx
List is longer than 10, truncating list. Please select or type project.
xxx

Please select region:
1. asia-east1
2. asia-south1
...
List is longer than 10, truncating list. Please select or type region.
us-east4

Please supply new vSphere source name (vSphere source format must be only lowercase letters, digits, and hyphens and have a length between 6 and 63) : vcsa-599
Creating new source…

Please select service account: ("new" to create)
1. new
2. xxx
new


Please supply new service account name (service account format must be only lowercase letters, digits, and hyphens and have a length between 6 and 30): migration
Waiting for the Migrate Connector to become active. This may take several minutes…

Registration completed

If you want to check the status of the appliance, use m4c status.

After a successful registration you will find that new service accounts have been created to support migrations and OS customisations.

After a successful registration you will also see, in the Source tab, a list of all VMs under the registered vCenter.

You can now start a VM replication. After an initial sync you can test-clone your VM (to a sandbox VPC, for example) or do a cut-over. For those two actions you need to provide Migration Details for the VM: the M4C service needs to know where you want your replica to be created, which machine type you want to use when you do a test clone or cut-over, and how frequent the replication cycle should be. It looks like this service could be used not only for migrations but also for DR.

If you want to use a service account, this is where you configure it:

A cut-over will never delete the original VM on the vSphere side; it will only power it off. The other activity I observed during replication and cut-over was VM snapshots. M4C will not reconfigure the VM that runs on vSphere, so in case a cut-over fails, you can always power the original VM back on.

When my cut-over task completed, I could SSH into my migrated VM from Cloud Shell to evaluate it.

M4C performs a lot of OS adaptations when converting the format from VMware to Compute Engine, like uninstalling VMware Tools, configuring the NIC to use DHCP, installing Google packages, etc. The new VM will have new IP addresses provided by your VPC. It can also use different CPU and RAM parameters than the ones configured on vSphere, so a migration is also a good moment to evaluate and resize your VM. M4C additionally offers a VM utilization report that you can run for a longer period on VMs still running on vSphere, to right-size them for migration.

HCX Mobility Agent aka “dummy host”

I bet many administrators have been surprised by a new ESXi appearing on the hosts list in the vCenter UI after an HCX service mesh was created for the first time. For every HCX service mesh we create, one such ESXi “host” is deployed.

Of course it is not a “real” host; it looks a little bit like a nested one, and its name always seems to be the management IP of its HCX Interconnect (IX) appliance. I think of it as the IX appliance’s alter ego 😉

New host on the Hosts list

It does have its own “VMware Mobility Agent Basic” license for 2 CPUs.

Mobility Agent License

But it is not a typical nested ESXi; it’s a dedicated VMware Mobility Platform.

It has its own local VMFS datastore called ma-ds with a total “capacity” of 500 TB, plus 1 TB of RAM 😉 Fortunately, nothing is really taken from our physical resources.

In my case, the service mesh Interconnect (IX) appliance has 172.16.4.2 as its management address, hence the host was named “172.16.4.2”.

The IX appliance is a VM with many network interfaces; the most important ones are the HCX Management, HCX vMotion and HCX Uplink interfaces. The Uplink interface is in fact the only one used to communicate with the target side; the Management and vMotion interfaces are used locally. The CIDRs for those interfaces (and many other settings) are defined during Network Profile creation in the HCX UI.

So what does this new dummy Host do?

It acts as a proxy for vMotion tasks between two paired HCX sites and allows for long-distance, cross-vCenter vMotion. It is configured when the following service option is enabled: vMotion Migration service.

The vMotion Migration service provides zero-downtime, bi-directional Virtual Machine mobility. The service is deployed as an embedded function on the HCX-WAN-IX virtual appliance.

Configuring a service mesh with vMotion service

For HCX vMotion, we don’t need direct connectivity between the source and target vCenters or their vMotion networks. The source IX appliance’s task is to trick the source ESXi into believing the destination of the vMotion task is local: the source ESXi “thinks” the target ESXi for the vMotion task is its local Mobility Agent host. The target IX appliance’s task is to trick the target ESXi into believing the vMotion task is local: the target ESXi “thinks” the source ESXi for the vMotion task is its local Mobility Agent host.

What source and target sides think is going on
vMotion from Source ESXi host to Source MA host
vMotion from Target MA Host to Target ESXi

What is really going on is transparent to both the source and target sides. The IX appliance on the source side, acting as the receiving end of the vMotion task, transfers the VM data over the HCX Uplink interface (IPsec tunnel) to the target IX, which acts as the initiator of the vMotion task on the destination side.

What is really going on

This explains why there is a requirement for the IX appliance to be able to communicate with the ESXi hosts over their vMotion networks.

https://ports.vmware.com/home/VMware-HCX
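If you want to sanity-check that requirement yourself, here is a small sketch (the vMotion VMkernel addresses are placeholders) that tests TCP 8000, the vMotion port, from a machine on the same segment as the IX appliance’s vMotion interface:

import socket

esxi_vmotion_ips = ["172.16.5.11", "172.16.5.12"]    # placeholder vMotion VMkernel addresses
for ip in esxi_vmotion_ips:
    try:
        with socket.create_connection((ip, 8000), timeout=3):
            print(f"{ip}:8000 reachable")
    except OSError as err:
        print(f"{ip}:8000 NOT reachable ({err})")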