Recently, I’ve been spending time on reviewing new functionality inside of VMware vCloud Director (vCD) 9.7, specifically Edge Clusters. Edge Clusters provides distinct capabilities to control tenant Edge placement while achieving a higher level of availability. While Edges are a distinct function of NSX to control traffic that ingresses/egresses out of NSX, vCD can provide a significant level of additional functionality.
Abhinav Mishra and I have spent some time writing about the rationale, implementation, migration, and design decisions in regards to Edge Clusters in version 9.7. Below are the links to each of these respective blog posts:
Currently, I am working on some overall design content for Edge Clusters inside of VMware vCloud Director 9.7. However, I wanted to share a post on providing a step by step guide on establishing an Edge Cluster inside of vCD. I will much more to share on our corporate blog shortly, but this should start some thoughtful discussions.
Quick Intro to Edge Clusters
So what’s the deal with Edge Clusters? Edge Clusters now allow a provider to discrete control of tenant Edge placement. Previously, this was rather limited and only controlled at the Provider Virtual Data Center (pVDC) layer. With Edge Clusters, we now can establish this on a per-oVDC basis. In essence, the three main value points of Edge Clusters:
Consumption of dedicated Edge Clusters for North/South
traffic – optimized traffic flow while minimizing the span of Layer 2 broadcast
Provide a higher level of availability to Edge
nodes that can distinctly fail between two clusters.
Ability to balance organization Edge services
between multiple Edge Clusters – I do not have to use the “same” Primary and
Secondary Edge Cluster for every org VDC. This can be configured on a per
Below is a overall high level design of Edge Clusters from a physical and logical layer –
The intent of these videos is to discuss setting up Cross-VDC networking in vCloud Director but also have a live chat on items we’ve learned along the way with working with it. Quite frankly, it was an open discussion between the team on the inner workings on vCD/NSX and what our development team has done in the backend.
In the first video, we discuss the pre-requisites before we can start configuring vCloud Director for Cross-VDC networking. In essence, the assumption is cross-vCenter NSX has already been established and we have the primary and secondary NSX managers registered.
Next, we review the concept of creating a Datacenter group and what are the different egress options. This is very important as it explicitly controls how traffic exits the overlay environment.
Here, we discuss how BGP weights control our active/passive egress points and what vCD automates in the backend. The key is this is all done without provider/tenant configuration – vCD automates this process.
As a final wrap-up of the BGP weights, we review creation of the stretched networks inside of vCloud Director along with operational management inside of the vCD H5 UI.
Last of all, we demonstrate testing of Cross-VDC and failover of my “Daniel-App” between the two sites. What’s interesting is the ability to migrate egress points without any loss of connectivity. Unintended failover is managed by BGP weights, which the default timer is 60 seconds and could be revised if required.
As stated before, this shows the requirement of having a mirror Edge configuration, especially for NAT configuration and failover testing between sites.
This was a fun experience with the team while reviewing and having open discussions on Cross-VDC networking. We are hoping these are valuable for those of you that are interested in bringing this as a new service inside of vCloud Director.
Recently, I received a request from one of our aggregators regarding how Equal Cost Multipathing (ECMP) is metered within the VMware Cloud Provider Program (VCPP), specifically Tom Fojta’s recommendation on architecting Provider-managed NSX Edges and Distributed Logical Router (DLR) in ECMP mode, specifically this diagram from the Architecting a VMware vCloud Director Solution –
As shown in the diagram above – How does Usage Meter handle bill these tenant virtual machines (VMs) when we have a provider NSX architecture that utilizes ECMP?
For you TL;DR readers – any VM connected to a Tenant Edge / direct network that has ECMP enabled northbound, NSX Advanced will be charged for said VM. Read on if you want to learn how this is done.
First off, let’s talk about why this matters. Per the Usage Meter Product Detection whitepaper (this can be found on VMware Partner Central), we can see how Usage Meter detects specific NSX features based on the pattern of usage. Regarding dynamic ECMP, it is metered by the “Edge gateway” which could be a little ambiguous. If one utilizes ECMP, they would be metered for NSX Advanced within VCPP.
One of the scenarios from the whitepaper does show ECMP-enabled Edges but not an Edge that is abstracted away from the provider environment –
My initial reaction was that Usage Meter would not look at the northbound provider configuration and the interconnectivity to vCloud Director. However, I was not confident and wanted to verify this explicit configuration and expected metering. Let’s review my findings.
In the above diagram, we can see I created a similar Provider managed NSX configuration with ECMP enabled from the DLR to the two Provider Edges with dynamic routing enabled (BGP). From there, I expose a LIF/Logical Switch named “ECMP-External-Network” to vCloud Director that is then exposed to my T2 organization as a new External Network.
From there, I created a dedicated Tenant Edge named “T2-ECMP-ESG” that will be attached to this newly created network along with a VM named “T2-ECMP-VM.” The goal is to verify how T2-ECMP-VM and T2-TestVM are metered by Usage Meter with this newly created Tenant Edge.
My Edges are setup for BGP and reporting the correct routes from the southbound connected DLR (and tenant Edges) –
From the DLR, we can see that I have two active paths to my Provider Edges (Provider-Edge-1 and 2) –
Last of all, my T2-ECMP-ESG is operational and attached to the newly created ECMP External Network –
Last of all, I have my VM’s created and powered on (remember, Usage Meter will only meter powered on VM’s). We can see T2-ECMP-VM is attached to a org routed network from T2-ECMP-ESG named “T2-ECMP-Network” –
Let’s work from the north to south – start with the Provider Edges and show how Usage Meter detects and bills.
Note – I have vROps Enterprise in my lab environment, so we will see Usage Meter picking up vROps registration and placing it in the appropriate bundle.
Provider Edges / DLR
As expected, the Provider Edges and DLR are detected along with registration to vROps. By design, NSX Edges are charged for the Advanced SP Bundle as they are metered as a management component (minimum Advanced bundle / 7-point). However, in my case, we see detection, and then registration to vROps Enterprise. Therefore, since it’s a bundle ID (BND) of 12, this is correlated to Advanced Bundle with Management (10-point) –
Tenant Edge – T2-ECMP-ESG
Just like the Provider Edges and DLR, we see T2-ECMP-ESG register to UM along with vROps Enterprise registration. Same billing model as above.
Tenant VM – T2-TestVM
I would not expect any change to this VM, but wanted to showcase that having a separate Edge with standard networking (i.e. no ECMP) will bill based off the NSX SP Base level. As expected, T2-TestVM was handled by Usage Meter just as anticipated – we can see registration, NSX SP Base usage, along with registration to vROps Enterprise –
Tenant VM – T2-ECMP-VM
Finally, let’s take a look at my T2-ECMP-VM – as discussed before, this is wired to a Tenant Edge that is connected to the ECMP-enabled DLR via an External Network.
We see initial registration, registration to vROps Enterprise, then NSX Advanced usage! This would be metered at Advanced Bundle with Networking and Management due to the NSX Advanced usage (12-point).
Summary of Findings
Here’s what we learned:
Edges/DLR Control VM’s are not charged for NSX usage since UM handles them as a management component. If you are using vROps, it will place it in the most cost effective bundle.
Utilizing ECMP at the provider-level DOES impact any southbound connected VM from a billing perspective, even if an Edge sits in between the ECMP enabled configuration and the tenant VM. Per the findings, NSX Advanced will be metered.
Therefore, be aware of any NSX provider architecture and the use of NSX specific features.
Again, this shows the logic inside of Usage Meter and how it relates to metering for tenant workloads. Cheers!