Migrating VMs from VMware on-prem (or GCVE) to Google Compute Engine

When we want to migrate our existing (on-prem) VMware workloads to the cloud, we have two options. The first is a lift-and-shift migration to a VMware-as-a-service offering like GCVE, where VMware HCX can be used to keep migrations fast and downtime minimal (or avoid downtime entirely). The other option is to convert the VMs from the VMware format to a cloud-native one, such as Google Compute Engine. This article briefly covers the second approach.

The VM format conversion sounds scary, but it is actually a very easy process if you use the Google Migrate for Compute (M4C) service.

Imagine you have a VM (or a bunch of them) running in a vSphere environment (on-prem or on GCVE) with an OS supported by M4C. In, let's say, an hour (depending on the VM size and its data churn, and assuming you have connectivity between your on-prem VMware cluster and the VPC where your M4C service is running, or a Private Service Connection to your GCVE cluster), you can have it running on Google Cloud.

In my example I will use test-aga-vm, which runs Ubuntu.

What you need to do is set up your M4C environment following the official Google documentation. One of the steps is enabling the Migrate for Compute API, after which you will find your M4C dashboard under “Compute Engine”.
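If you prefer the CLI to the console, the API can also be enabled with gcloud. A minimal sketch, assuming gcloud is already authenticated and my-landing-project is a placeholder for your project ID (the API identifier below is the VM Migration API):

gcloud services enable vmmigration.googleapis.com --project=my-landing-project
gcloud services list --enabled --project=my-landing-project | grep vmmigration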

When everything is set up on the GCP side (APIs and permissions), the next step is to deploy the M4C appliance on your vSphere cluster. A link to the most recent OVA version is included in the documentation.

The most important part of deploying the OVA is placing it in a network segment that can reach googleapis.com and your DNS. The M4C appliance must be able to resolve your vCenter FQDN as well as the FQDNs of all ESXi hosts in your cluster. It is fine to run it on a vSAN cluster.

When the M4C appliance is ready, the only way to SSH into it is with its SSH private key; the corresponding public key has to be provided when the appliance is deployed.
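A minimal sketch of that flow, assuming a fresh key pair is generated just for the appliance and 192.0.2.10 is a placeholder for its management IP; the content of the public key file is what goes into the OVA deployment form:

ssh-keygen -t rsa -b 4096 -f ~/.ssh/m4c-appliance -C "m4c-appliance"
cat ~/.ssh/m4c-appliance.pub   # paste this value into the OVA deployment wizard
ssh -i ~/.ssh/m4c-appliance admin@192.0.2.10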

After the M4C appliance is powered on, it has to be registered in your project. The registration process looks like this:

admin@migrate-appliance:/m4c/OSS$ m4c register
Please enter vCenter host address: xxx.us-east4.gve.goog
vCenter server SSL certificate fingerprint is xxx Do you approve? [Y/n]Y
Please enter vCenter account name to be used by this appliance: solution-user-05@gve.local
Please enter vCenter account password:xxx
vSphere credentials verified

Please visit this URL to authorize this application: https://accounts.google.com/o/xxx
Enter the authorization code: xxx
This Migrate Connector was registered to Source xxx in Project yyy
Please select project:
xxx
List is longer than 10, truncating list. Please select or type project.
xxx

Please select region:
1. asia-east1
2. asia-south1
...
List is longer than 10, truncating list. Please select or type region.
us-east4

Please supply new vSphere source name (vSphere source format must be only lowercase letters, digits, and hyphens and have a length between 6 and 63) : vcsa-599
Creating new source…

Please select service account: ("new" to create)
1. new
2. xxx
new


Please supply new service account name (service account format must be only lowercase letters, digits, and hyphens and have a length between 6 and 30): migration
Waiting for the Migrate Connector to become active. This may take several minutes…

Registration completed

If you want to check the status of the appliance, use m4c status.

After a successful registration you will find that new service accounts have been created to support migrations and OS customisations.

You will also see, in the Source tab, a list of all VMs under the registered vCenter.

You can start a VM replication now. After the initial sync you can test-clone your VM (to a sandbox VPC, for example) or do a cut-over. For those two actions you need to provide Migration Details for the VM: the M4C service needs to know where you want the replica to be created, which machine type to use for the test clone or cut-over, and how frequent the replication cycle should be. It looks like this service could be used not only for migrations but also as a DR solution.
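When picking the target machine type, it can help to first list what is available in the target zone. A minimal sketch, assuming us-east4-a as the target zone and the e2-standard family as a starting point (both are placeholders):

gcloud compute machine-types list --zones=us-east4-a --filter="name~e2-standard"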

If you want the migrated VM to run as a specific service account, this is also where you configure it.

A cut-over will never delete the original VM on the vSphere side; it will only power it off. The other activity I observed during replication and cut-over was VM snapshotting. M4C will not reconfigure a VM that runs on vSphere, so in case a cut-over fails, you can always power the original VM back on.

When my cut-over task completed, I could SSH into the migrated VM from Cloud Shell to evaluate it.
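A minimal sketch of that check from Cloud Shell, assuming the migrated VM kept the name test-aga-vm and landed in zone us-east4-a of my-landing-project (zone and project are placeholders):

gcloud compute instances list --filter="name=test-aga-vm" --project=my-landing-project
gcloud compute ssh test-aga-vm --zone=us-east4-a --project=my-landing-project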

M4C does a lot of OS adaptations when converting the format from VMware to Compute Engine, such as uninstalling VMware Tools, configuring the NIC to use DHCP, installing Google packages, etc. The new VM will have new IP addresses provided by your VPC. It can also use different CPU and RAM parameters than the ones configured in vSphere, so a migration is also a good moment to evaluate and resize your VM. M4C also offers a VM utilization report that you can run over a longer period on VMs still running in vSphere to right-size them for the migration.
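Once inside the guest, a couple of quick checks confirm those adaptations. A minimal sketch for an Ubuntu guest, assuming VMware Tools was delivered as the open-vm-tools package and the standard GCE guest environment (google-guest-agent) was installed during conversion:

dpkg -l | grep open-vm-tools          # should come back empty after the migration
systemctl status google-guest-agent   # the Google guest agent should be installed and running
ip addr show                          # the NIC now gets its address from the VPC via DHCP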

HCX Mobility Agent aka “dummy host”

I bet many administrators have been surprised by a new ESXi host appearing on the hosts list in the vCenter UI after an HCX service mesh was created for the first time. And for every HCX service mesh we create, one such ESXi “host” is deployed.

Of course it is not a “real” host; it looks a little bit like a nested one, and its name always seems to be the management IP of its HCX Interconnect (IX) appliance. I think of it as the IX appliance’s alter ego 😉

New host on the Hosts list

It does have its own “VMware Mobility Agent Basic” license for 2 CPUs.

Mobility Agent License

But it is not a typical nested ESXi; it’s a dedicated VMware Mobility Platform.

It has its own local VMFS datastore called ma-ds with a total “capacity” of 500 TB, plus 1 TB of RAM 😉 Fortunately, nothing is actually taken from our physical resources.

In my case the HCX service mesh Interconnect (IX) appliance has 172.16.4.2 as its management address, hence the host was named “172.16.4.2”.

The IX appliance is a VM with many network interfaces; the most important ones are the HCX Management, HCX vMotion and HCX Uplink interfaces. The Uplink interface is in fact the only one used to communicate with the target side; the Management and vMotion interfaces are used locally. The CIDRs for those interfaces (and many other settings) are defined during Network Profile creation in the HCX UI.

So what does this new dummy Host do?

Its job is to act as a proxy for vMotion tasks between the two paired HCX sites, enabling long-distance, cross-vCenter vMotion. It is configured when the following service option is enabled: vMotion Migration service.

The vMotion Migration service provides zero-downtime, bi-directional Virtual Machine mobility. The service is deployed as an embedded function on the HCX-WAN-IX virtual appliance.

Configuring a service mesh with vMotion service

For HCX vMotion, we don’t need direct connectivity between the source and target vCenters or their vMotion networks. The source IX appliance’s task is to trick the source ESXi into believing the destination ESXi for the vMotion task is local: the source ESXi “thinks” the target of the vMotion is its local Mobility Agent host. Likewise, the target IX appliance tricks the target ESXi into believing the vMotion task is local: the target ESXi “thinks” the source of the vMotion is its local Mobility Agent host.

What source and target sides think is going on
vMotion from Source ESXi host to Source MA host
vMotion from Target MA Host to Target ESXi

What is really going on is transparent to both the source and the target side. The IX appliance on the source side, acting as the receiving end of the vMotion task, transfers the VM data over the HCX Uplink interface (IPsec tunnel) to the target IX, which acts as the initiator of the vMotion task on the destination side.

What is really going on

This explains the requirement for the IX appliance to be able to communicate with the ESXi hosts over their vMotion networks.
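A quick way to validate that requirement is a vmkping from an ESXi host towards the IX appliance’s vMotion address, sourced from the vMotion-enabled VMkernel interface. A minimal sketch, assuming vmk1 is the vMotion vmknic and 172.16.5.2 is the IX vMotion IP (both are placeholders):

esxcli network ip interface ipv4 get   # identify the vMotion vmknic and its IP
vmkping -I vmk1 172.16.5.2             # ping the IX vMotion interface over that vmknic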

https://ports.vmware.com/home/VMware-HCX

VM snapshots on a vSAN datastore and their SPBM policy

When we create a snapshot of a VM on a vSAN datastore, delta disks (where all new writes go) inherit the storage policy from the base disk.

Our VM Test_VM_123 uses the vSAN RAID-1 (mirror) policy, which means there are at least two copies of the VMDK plus a witness. After a snapshot is taken, we see the same policy applied to the delta disk.

But if we want to change the storage policy for Test_VM_123, we can change it only for the VM Home object and the base disk. There is no option to change the policy of the “snapshot”/delta disk.

After the base disk’s policy was changed to FTT=0 / RAID-0 (stripe width 4), we see that the delta disk retained its FTT=1 policy.

This behaviour is described in VMware KB 70797, “Modifying storage policy rules on Virtual Machine running on snapshot in vSAN Data-store”. In order to keep all storage policies consistent across VM disks, it is recommended to consolidate all snapshots before making SPBM policy changes to a VM.
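A minimal sketch of that consolidation step from the ESXi shell, assuming shell access to the host where the VM is registered (the Delete All Snapshots action in the vSphere Client achieves the same result):

vim-cmd vmsvc/getallvms | grep Test_VM_123   # note the Vmid in the first column
vim-cmd vmsvc/snapshot.removeall 42          # 42 is a placeholder Vmid; removes all snapshots and consolidates the deltas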

Direct file (.ova or .iso) upload to a vSAN datastore

Uploading files directly to vSAN is not the best option; there are many better ways to move data to vSAN clusters. For .iso files you can use the native vSAN NFS (file) service. For moving VMs we have Storage vMotion, HCX, or the Cross vCenter Workload Migration Utility fling (which is also included in the vSphere 7.0 Update 1c release). The Move-VM PowerCLI cmdlet also works between vCenters that don’t share SSO, and vSphere Replication or a restore from backup could be options as well. Those methods are well documented and use supported APIs.

But if there is a corner case where a file needs to be put on the vsanDatastore, this is also possible, though not for all file sizes. The system will not allow us to upload to the root path directly; we have to create a folder on the datastore first.

The first issue you will probably see when trying to upload a file is the following:

Opening the recommended URL and logging into the ESXi host directly should be sufficient to authorize us to upload files to the vsanDatastore.

I tested it with some smaller files, up to 255 GB, on a vSAN 7.0 U1 cluster, and the upload was successful:

but adding another file in this folder failed:

Uploading a >255 GB file into a new folder on the vsanDatastore also failed. What I could find in the ESXi vmkernel.log was the following:

write to large_file.ova (...) 1048576 bytes failed: No space left on device 

'cb954c60-5416-7dfa-6d87-1c34da607660': [rt 1] No Space - did not find enough resources after second pass! (needed: 1, found: 0)

It looks like uploading files to the vsanDatastore directly bypasses the logic that stripes objects larger than 255 GB into smaller components. Why is that?

Looking at the file path, you can determine its object UUID, which in my case was: cb954c60-5416-7dfa-6d87-1c34da607660
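A minimal sketch of how that UUID shows up from the ESXi shell, assuming the default datastore name vsanDatastore and a folder created via the datastore browser (the friendly folder name is just a symlink to the UUID-named namespace directory):

ls -l /vmfs/volumes/vsanDatastore/
# ...  my-upload-folder -> cb954c60-5416-7dfa-6d87-1c34da607660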

You can use the following command to query this object on the ESXi host directly:

esxcli vsan debug object list -u cb954c60-5416-7dfa-6d87-1c34da607660

Now we have the answer. The object type of our direct upload is vmnamespace, which is like a container with a fixed size of 255.00 GB, and that is the maximum amount of data we can place in it. By default it uses the vSAN Default Storage Policy (FTT=1, mirror in this cluster).

How to run basic performance tests for HCX uplink interface

I believe the built-in HCX perftest tool should be used on every freshly deployed HCX Service Mesh before we start migrating VMs between sites. Although the test is just a benchmark (it uses iperf3 and is single-threaded), it gives us an idea of how fast VM migration will be and what can be expected in production. Testing with the HCX perftest tool is easier than with native iperf3 because we don’t have to provide/remember any IP addresses of the appliances on-prem and in the cloud ;-).

To start the test we have to ssh to HCX manager as admin and select the IX appliance we want to test:

>ccli

>list

> go x -> select your service mesh appliance

> perftest -> to check available options:

Available Commands:
  all           perftest uplink, ipsec, wanopt and site in one command
  ipsec         iperf3 perf testing against ipsec tunnels
  perf          iperf3 perf testing
  reachability  Ping remote peers to test reachability.
  site          iperf3 perf testing between sites
  status        Query the test status.
  uplink        iperf3 perf testing against uplink
  wanopt        tcpperf testing against WANOPT tunnels

Available flags are:

Flags:
  -h, --help               help for uplink
  -i, --interval uint32    Interval in second to report. Default is 1 second. (default 1)
  -m, --msgsize uint32     TCP maximum segment size to send.
  -P, --parallel uint32    Number of parallel streams. Default is 1. (default 1)
  -p, --port uint32        Listen port on server side. Default is 4500. -p 22 also allowed. (default 4500)
  -T, --runtimeout uint32  Individual test duration in second. Default is 1 minute. (default 60)
  -t, --timeout uint32     Total timeout in seconds. Default 10 min. (default 600)
  -v, --verbose            Show details during testing if set

PERFTEST SITE: GENERAL TUNNEL CHECK

>perftest site
++++++++++ StartTest ++++++++++

---------- Site-0 [192.0.2.33 >>> 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-30.00 sec 13.8 GBytes 3.96 Gbits/sec 365 sender
[ 4] 0.00-30.00 sec 13.8 GBytes 3.95 Gbits/sec receiver
Done

---------- Site-0 [192.0.2.33 <<< 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-30.00 sec 14.8 GBytes 4.24 Gbits/sec 167 sender
[ 4] 0.00-30.00 sec 14.8 GBytes 4.23 Gbits/sec receiver
Done

The native iperf3 commands used for this test, with default values:

iperf3 -c 192.0.2.34 -i 1 -p 9000 -P 1 -t 30

iperf3 -s -p 9000 -B 192.0.2.33

PERFTEST IPSEC: TEST INSIDE IPSEC

> perftest ipsec
++++++++++ StartTest ++++++++++

---------- Ipsec-0 [t_0, 192.0.2.37 >>> 192.0.2.45] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-30.00 sec 3.40 GBytes 973 Mbits/sec 0 sender
[ 4] 0.00-30.00 sec 3.39 GBytes 972 Mbits/sec receiver
Done

---------- Ipsec-0 [t_0, 192.0.2.37 <<< 192.0.2.45] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-30.00 sec 3.40 GBytes 974 Mbits/sec 0 sender
[ 4] 0.00-30.00 sec 3.40 GBytes 973 Mbits/sec receiver
Done

---------- Ipsec-1 [t_0, 192.0.2.38 >>> 192.0.2.46] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-30.00 sec 3.40 GBytes 973 Mbits/sec 0 sender
[ 4] 0.00-30.00 sec 3.40 GBytes 973 Mbits/sec receiver
Done

---------- Ipsec-1 [t_1, 192.0.2.38 <<< 192.0.2.46] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-30.00 sec 3.40 GBytes 974 Mbits/sec 0 sender
[ 4] 0.00-30.00 sec 3.40 GBytes 973 Mbits/sec receiver
Done

---------- Ipsec-2 [t_2, 192.0.2.39 >>> 192.0.2.47] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-30.00 sec 3.39 GBytes 971 Mbits/sec 0 sender
[ 4] 0.00-30.00 sec 3.39 GBytes 970 Mbits/sec receiver
Done

---------- Ipsec-2 [t_2, 192.0.2.39 <<< 192.0.2.47] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-30.00 sec 3.39 GBytes 971 Mbits/sec 1181 sender
[ 4] 0.00-30.00 sec 3.39 GBytes 970 Mbits/sec receiver
Done

The native iperf3 commands used for this test, with default values:

iperf3 -c 192.0.2.45 -i 1 -p 9000 -P 1 -t 30

iperf3 -s -p 9000 -B 192.0.2.37

PERFTEST UPLINK: UPLINK INTERFACE CHECK

> perftest uplink

Testing uplink reachability…
Uplink-0 round trip time:
rtt min/avg/max/mdev = 66.734/67.081/68.135/0.578 ms

Uplink native throughput test is initiated from LOCAL site.
++++++++++ StartTest ++++++++++

---------- Uplink-0 [te_0, a.a.a.a >>> b.b.b.b] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 5.20 GBytes 745 Mbits/sec 5116 sender
[ 4] 0.00-60.00 sec 5.20 GBytes 744 Mbits/sec receiver
Done
---------- Uplink-0 [te_0, a.a.a.a <<< b.b.b.b] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 4.55 GBytes 652 Mbits/sec 6961 sender
[ 4] 0.00-60.00 sec 4.55 GBytes 651 Mbits/sec receiver
Done

The native iperf3 commands used for this test, with default values:

iperf3 -c a.a.a.a -i 1 -p 4500 -P 1 -B b.b.b.b -t 60

iperf3 -c a.a.a.a -i 1 -p 4500 -P 1 -B b.b.b.b -t 60

Keep in mind that this is the only test that uses TCP port 4500 by default. If you have only UDP port 4500 open (which is the standard HCX uplink requirement), the test will fail. You will probably see something like this:

"Command error occurs: Error calling peer [a.a.a.a.a:9445]: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp b.b.b.b:9445: connect: connection refused"

PERFTEST ALL: ALL TESTS COMBINED

This test will run iperf for uplink, ipsec, wanopt and site.

>perftest all
========== PERFTEST ALL STARTING ==========
== WanOpt is Present ==
== TOTAL # of TESTs : 11 ==
== ESTIMATED TEST DURATION : 12 minutes ==
-T option to change individual test duration [default 60 sec]
-k option to skip 'perftest uplink' if tcp port 4500 or 22 not opened
== Are you ready to start ?? [y/n]:

USEFUL FLAGS

You can use more streams to saturate the pipe (-P), but keep in mind the test uses a single thread.

>perftest site -P 2
++++++++++ StartTest ++++++++++

---------- Site-0 [ 192.0.2.33 >>> 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 16.8 GBytes 2.40 Gbits/sec 1498 sender
[ 4] 0.00-60.00 sec 16.8 GBytes 2.40 Gbits/sec receiver
[ 6] 0.00-60.00 sec 16.4 GBytes 2.35 Gbits/sec 1815 sender
[ 6] 0.00-60.00 sec 16.4 GBytes 2.35 Gbits/sec receiver
[SUM] 0.00-60.00 sec 33.2 GBytes 4.76 Gbits/sec 3313 sender
[SUM] 0.00-60.00 sec 33.2 GBytes 4.75 Gbits/sec receiver
Done
---------- Site-0 [ 192.0.2.33 <<< 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-60.00 sec 19.0 GBytes 2.72 Gbits/sec 937 sender
[ 4] 0.00-60.00 sec 19.0 GBytes 2.72 Gbits/sec receiver
[ 6] 0.00-60.00 sec 19.5 GBytes 2.80 Gbits/sec 806 sender
[ 6] 0.00-60.00 sec 19.5 GBytes 2.79 Gbits/sec receiver
[SUM] 0.00-60.00 sec 38.5 GBytes 5.52 Gbits/sec 1743 sender
[SUM] 0.00-60.00 sec 38.5 GBytes 5.51 Gbits/sec receiver
Done

>perftest site -P 4
++++++++++ StartTest ++++++++++

---------- Site-0 [ 192.0.2.33 >>> 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 9.22 GBytes 1.32 Gbits/sec 2108 sender
[ 4] 0.00-60.00 sec 9.21 GBytes 1.32 Gbits/sec receiver
[ 6] 0.00-60.00 sec 9.13 GBytes 1.31 Gbits/sec 2194 sender
[ 6] 0.00-60.00 sec 9.12 GBytes 1.31 Gbits/sec receiver
[ 8] 0.00-60.00 sec 9.20 GBytes 1.32 Gbits/sec 2288 sender
[ 8] 0.00-60.00 sec 9.19 GBytes 1.32 Gbits/sec receiver
[ 10] 0.00-60.00 sec 8.71 GBytes 1.25 Gbits/sec 2396 sender
[ 10] 0.00-60.00 sec 8.70 GBytes 1.25 Gbits/sec receiver
[SUM] 0.00-60.00 sec 36.3 GBytes 5.19 Gbits/sec 8986 sender
[SUM] 0.00-60.00 sec 36.2 GBytes 5.19 Gbits/sec receiver
Done
---------- Site-0 [ 192.0.2.33 <<< 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-60.00 sec 10.2 GBytes 1.45 Gbits/sec 2071 sender
[ 4] 0.00-60.00 sec 10.1 GBytes 1.45 Gbits/sec receiver
[ 6] 0.00-60.00 sec 10.0 GBytes 1.43 Gbits/sec 1932 sender
[ 6] 0.00-60.00 sec 10.0 GBytes 1.43 Gbits/sec receiver
[ 8] 0.00-60.00 sec 10.2 GBytes 1.47 Gbits/sec 2149 sender
[ 8] 0.00-60.00 sec 10.2 GBytes 1.47 Gbits/sec receiver
[ 10] 0.00-60.00 sec 10.3 GBytes 1.47 Gbits/sec 2366 sender
[ 10] 0.00-60.00 sec 10.3 GBytes 1.47 Gbits/sec receiver
[SUM] 0.00-60.00 sec 40.7 GBytes 5.83 Gbits/sec 8518 sender
[SUM] 0.00-60.00 sec 40.7 GBytes 5.82 Gbits/sec receiver
Done

You can change the segment size (-m) to test which MTU works best and identify any MTU mismatch issues. You can also modify the MTU setting in the HCX Network Profile used for the Uplink.

> perftest site -m 1390
++++++++++ StartTest ++++++++++

---------- Site-0 [ 192.0.2.33 >>> 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 30.6 GBytes 4.37 Gbits/sec 518 sender
[ 4] 0.00-60.00 sec 30.5 GBytes 4.37 Gbits/sec receiver
Done
---------- Site-0 [192.0.2.33 <<< 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-60.00 sec 31.1 GBytes 4.46 Gbits/sec 270 sender
[ 4] 0.00-60.00 sec 31.1 GBytes 4.45 Gbits/sec receiver
Done

> perftest site -m 9000
++++++++++ StartTest ++++++++++

---------- Site-0 [ 192.0.2.33 >>> 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
server workload started
[ 4] 0.00-60.00 sec 29.4 GBytes 4.21 Gbits/sec 341 sender
[ 4] 0.00-60.00 sec 29.4 GBytes 4.20 Gbits/sec receiver
Done
---------- Site-0 [ 192.0.2.33 <<< 192.0.2.34] ----------
Duration Transfer Bandwidth Retransmit
[ 4] 0.00-60.00 sec 29.3 GBytes 4.19 Gbits/sec 307 sender
[ 4] 0.00-60.00 sec 29.2 GBytes 4.19 Gbits/sec receiver
Done