When I started working regularly with VMware Cloud on AWS (VMC) mid-way through 2019, HCX hadn’t officially been switched over to the Service Mesh method of deployment. At some point before the end of the summer, HCX multi-site service mesh became the preferred way to deploy HCX within VMC. Eventually, the older HCX components method will be removed entirely from the HCX console. There wasn’t a ton of information regarding HCX service mesh, so a lot of lab time and deployment experience helped me get up to speed with everything that is required to be successful in standing up a multi-site service mesh. I’d like to share my experience and other info that can help shed some light for those who may be new to the world of VMC and HCX.
What is VMware HCX?
Starting at the ground floor, HCX (formerly known as Hybrid Cloud Extension) is a VMware service that is the “Swiss Army knife” of site-to-site workload migrations. It can perform a number of different functions, most notably:
- Migrations (Bulk and vMotion)
- Layer 2 Network Extensions
- Disaster Recovery / VM Replication
HCX Advanced comes baked into VMC and provides great tools and flexibility for customers to address connectivity and workload migrations required by their specific use case. HCX can also be used outside of VMC, with the main goal being the pairing of distinctly separate sites with each other to easily facilitate connectivity for migrations between them.
HCX itself is made up of four components, each with its own virtual appliance(s) within a VMware data center at each side of a site pairing: HCX manager, hybrid interconnect, WAN optimization and network extension. Click here for a deeper dive into HCX itself, and also check out an HCX overview from Cloud Field Day 5.
Where does a service mesh come into play?
The growth of container orchestration and microservices has made service mesh a buzzworthy IT term. A service mesh is basically a dedicated infrastructure layer that facilitates network transport between application services. It is a way to decouple the networking components required for application communication from the application itself. We aren’t necessarily dealing with application services in HCX, so the naming is certainly a bit…interesting. It could be argued that maybe “service mesh” shouldn’t be part of the name, but at the end of the day this newer method of HCX deployment definitely makes the service more automated and easier to change or scale as needed.
From the VMC side of things, deploying HCX is as easy as it has always been. You simply navigate to the Add Ons tab in the VMC console and click to activate HCX. VMC automates the deployment of the HCX Cloud manager into your SDDC. The “legacy” method of deploying HCX on-prem was pretty familiar to any VMware admin. It simply consisted of deploying an .ova file for the HCX Enterprise manager as well as each additional component that would be needed for that particular deployment.
With HCX service mesh, deploying the HCX Enterprise manager appliance is still the same manual process. Once that gets deployed and registered to vCenter, the HCX plugin becomes available within the vSphere client. Launching HCX from vSphere is where the service mesh configuration takes place. Let’s dig into the details.
HCX Service Mesh Components
Service mesh components all live within the Infrastructure -> Interconnect portion of the HCX plugin. From there you are able to configure the components that are required to build a service mesh:
Compute Profiles – A compute profile defines a couple of things. First it allows you to configure where the HCX appliances will be deployed in your data center. It also defines which portion of your VMware data center you want to be accessible to the HCX service itself.
Site Pairs – Creates a connection between two HCX sites. It only requires an HCX manager appliance at each site.
Network Profiles – Defines a network and a range of IP addresses that HCX can assign to the virtual appliances it deploys.
Service Mesh – The service mesh itself is deployed by selecting a compute profile on both sides of an existing site pair. Once created, HCX will automate the deployment of a hybrid interconnect appliance that is (optionally) paired with a WAN accelerator appliance. It will also deploy one or more L2 extension appliances depending on your configuration.
The simplest infrastructure component is the network profile. This is where you define a network that can be used as an IP pool for appliances that HCX deploys when a service mesh is created. There is a nifty calculator that helps you size the pool based on the environment you intend to deploy. Network profiles are used later when creating a Compute Profile and a Service Mesh, and you can optionally create a network profile within those wizards.
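To make the IP pool idea concrete, here is a minimal sketch that counts the addresses in a range and compares that against the number of appliances you expect HCX to deploy. This is not the calculator built into HCX; the function names are hypothetical, and the appliance count is an input you have to work out for your own design.

```python
import ipaddress


def pool_addresses(start: str, end: str) -> int:
    """Count the addresses in an inclusive range such as 10.0.0.10-10.0.0.20."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    if last < first:
        raise ValueError("range end precedes range start")
    return int(last) - int(first) + 1


def pool_is_large_enough(start: str, end: str, appliances_needed: int) -> bool:
    """True if the range holds at least one address per planned appliance.

    appliances_needed is an assumption supplied by the caller, not a
    value read from HCX.
    """
    return pool_addresses(start, end) >= appliances_needed
```

For example, `pool_is_large_enough("10.0.0.10", "10.0.0.12", 4)` reports that a three-address range is too small for four appliances.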
Site pairing is also rather straightforward. It is a way to link two HCX managers across two different sites, and it is performed by entering the URL of the remote HCX manager along with credentials. Sites are typically paired across a WAN link, so firewall rules usually come into play. For HCX communication, TCP 443 and UDP 500 and 4500 are the only requirements across the WAN.
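When reviewing firewall rules ahead of a site pairing, it can help to write the port requirements down as data and compare them against what is actually allowed. The sketch below does only that. The rule format, a list of (protocol, port) tuples, is an assumption for illustration and does not match any particular firewall's export format.

```python
# WAN requirements as listed in this post: TCP 443 plus UDP 500 and 4500.
REQUIRED_WAN_PORTS = {("tcp", 443), ("udp", 500), ("udp", 4500)}


def missing_wan_rules(allowed):
    """Return the required (protocol, port) pairs not present in `allowed`.

    `allowed` is any iterable of (protocol, port) tuples, a hypothetical
    simplification of a real firewall rule set.
    """
    return sorted(REQUIRED_WAN_PORTS - set(allowed))
```

An empty list back from `missing_wan_rules(...)` means every port listed above is accounted for in the rules you passed in. It does not prove the traffic actually flows end to end.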
Tackling the Compute Profile
The most complex component is certainly the Compute Profile, especially for the first few times going through the configuration process. There are a number of steps to click through to fully define a compute profile. Each step draws up a somewhat confusing diagram that updates itself as you add components. Personally, I feel like the visuals can be a bit distracting and don’t necessarily help to understand the configuration. Maybe it is just me, but this space could be used to paint a better picture in future versions. Thankfully the initial screen within the wizard does give a good overview of the specific configuration options:
The create compute profile wizard steps you through a series of configuration options as you continue through it:
- Services to be enabled – allows you to choose which HCX services you want available to your compute profile (WAN opt, bulk migration, network extension, etc.).
- Service Resources – which part of your VMware data center do you want to give access to HCX services.
- Deployment Resources – where do you want HCX to deploy the appliances it needs to function, choose a host/cluster and datastore.
- Management Network Profile – which Network Profile can HCX use to provide the appliances it deploys with an IP address that can communicate with local vCenter and ESXi management.
- Uplink Network Profile – which Network Profile can HCX use to give IPs to appliances that can communicate with corresponding appliances on the remote site (can be same as mgmt profile).
- vMotion Network Profile – which Network Profile can HCX use to facilitate vMotion (should be same as local vMotion network).
- vSphere Replication Network Profile – which Network Profile can HCX use to reach the replication interface of ESXi hosts (typically same as mgmt network).
- Distributed Switches for Network Extensions (NE) – previous caveats to network extensions still apply. There is a one-to-one relationship between one HCX NE appliance and one vDS. This allows you to select which vDS will be available for HCX to deploy NE appliances for.
Once these settings are all configured, the wizard presents you with a screen to review the WAN/LAN connections that will be active based on the configuration. The last step shows the final configuration and allows you to create your compute profile. Smaller environments may only have one compute profile, but if your on-prem data center has multiple compute clusters with multiple vDS spread across the environment, it certainly may make sense to split things across multiple compute profiles (think test/dev or dmz/lan, etc.).
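One way to keep the wizard steps straight is to write a compute profile down as a plain data structure before clicking through it. The sketch below mirrors the steps listed above. The class, its field names and the service name strings are all hypothetical and are not part of any HCX API.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeProfile:
    """Planning worksheet for one compute profile (illustrative names only)."""
    name: str
    services: list             # e.g. ["bulk-migration", "network-extension"]
    service_resources: list    # clusters HCX services may access
    deployment_cluster: str    # where HCX deploys its appliances
    deployment_datastore: str
    management_profile: str    # network profile names
    uplink_profile: str
    vmotion_profile: str
    replication_profile: str
    distributed_switches: list = field(default_factory=list)

    def problems(self):
        """Return a list of obvious gaps in the plan."""
        issues = []
        if not self.services:
            issues.append("no services enabled")
        if "network-extension" in self.services and not self.distributed_switches:
            issues.append("network extension enabled but no vDS selected")
        return issues
```

Filling in one of these per planned profile (test/dev, dmz/lan and so on) makes it easier to spot a missing vDS or network profile before you are halfway through the wizard.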
Putting it all together with a service mesh
Once there is at least one site pairing and one on-prem compute profile created, an HCX service mesh can be built. NOTE: for VMC, activating HCX automatically creates a compute profile for the VMC SDDC.
The first step is to select the paired sites, in this case one on-prem site as the local and VMC as the remote:
After the sites themselves are selected, you must select the compute profile at each site that will be used to define the service mesh. Uplink Network Profiles can optionally be created next. This is a very important step in that it defines how HCX will communicate across the WAN. From the VMC side there are three network profiles you can choose:
mgmt-app-network: pre-populated with private IPs, should be used with IPSec VPN
externalNetwork: pre-populated with public IPs, should be used with public WAN
directConnectNetwork: must be configured manually, define an IP range to be used for Dx that is NON-overlapping with any other network in the environment (VMC including mgmt, AWS native, on-prem). Dx will automatically advertise this new network via BGP.
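Because the directConnectNetwork range must not overlap anything else, it is worth checking a candidate range against the networks you already know about before entering it. This is a minimal sketch using Python's standard `ipaddress` module. The function name and the example CIDRs are made up for illustration.

```python
import ipaddress


def overlapping_networks(candidate: str, existing: list) -> list:
    """Return every CIDR in `existing` that overlaps `candidate`."""
    cand = ipaddress.ip_network(candidate)
    return [net for net in existing if cand.overlaps(ipaddress.ip_network(net))]


# Example inventory: VMC management, AWS native and on-prem ranges (made up).
known = ["10.2.0.0/16", "172.31.0.0/16", "192.168.10.0/24"]
conflicts = overlapping_networks("10.2.50.0/24", known)
```

Here `conflicts` contains `"10.2.0.0/16"`, so that candidate range would need to be changed. The check is only as good as the inventory of existing networks you feed it.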
The next step is to configure the NE appliance scale. As mentioned earlier, an NE appliance has a one-to-one mapping with a vDS. Each appliance can also extend a maximum of 8 networks (vDS portgroups). There are also throughput limits to consider, but generally you can divide the number of networks you plan to extend by 8, round up, and come up with the number of NE appliances you will need per vDS. The great thing about HCX service mesh is you can easily edit this value once the service mesh is up and HCX will automate the changes for you.
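The divide-by-8 rule of thumb can be written out as a small calculation. This sketch only covers the 8-networks-per-appliance limit described above. It ignores the throughput limits, and the vDS names are placeholders.

```python
import math

# Maximum extended networks per NE appliance, per the limit described above.
NETWORKS_PER_NE_APPLIANCE = 8


def ne_appliances_needed(networks_per_vds: dict) -> dict:
    """Map each vDS name to the NE appliance count its extended networks need.

    A vDS with no networks to extend is left out, since it needs no appliance.
    """
    return {
        vds: math.ceil(count / NETWORKS_PER_NE_APPLIANCE)
        for vds, count in networks_per_vds.items()
        if count > 0
    }
```

For example, 20 networks on one vDS and 8 on another works out to 3 and 1 appliances respectively: `ne_appliances_needed({"vds-prod": 20, "vds-dmz": 8})` returns `{"vds-prod": 3, "vds-dmz": 1}`.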
If you are using the WAN optimization service, the next step allows you to set a bandwidth limit in Mb/s for migrations across all uplinks. The service mesh is then given a name and clicking the finish button begins the process of building the service mesh.
Once the build process starts, you can navigate back to HCX under service mesh and view the tasks to see what is being done and how far along the process is. HCX will deploy new interconnect (IX), WAN and NE appliances as needed at each site based on the compute profile settings. Once those appliances are deployed, they will eventually come up and attempt to negotiate with their peers. The fun part with HCX is always checking for the tunnel status at the end of the process. If things come up green, then you are ready to rock n’ roll with HCX! If not, check firewall rules and security appliances to make sure nothing is being blocked / filtered that would stop the tunnels from negotiating properly.
HCX is a very interesting tool with many valuable capabilities. VMC definitely adds a ton of value by including HCX, and it is a service that is constantly being updated and improved. The recent change to the service mesh model of deployment for HCX begins to make it a bit easier to consume. The VMC side has always been easily automated, and providing a vSphere plugin that only requires one manual appliance deployment starts down that path for on-prem sites. The on-prem service mesh configuration is still a bit clunky and can be confusing, so taking time to explore how it can fit into your environment on the front end will help make the process smoother. The great benefit to this new method is that once the service mesh is up, it is rather easy to make changes or to even tear down and rebuild if necessary. I wasn’t initially a huge fan of the change, but seeing the end result and benefit once a service mesh is deployed, I have hope that the front end configuration will eventually make for an easier-to-consume experience.