Recently, I’ve been reviewing new functionality in VMware vCloud Director (vCD) 9.7, specifically Edge Clusters. Edge Clusters provide distinct capabilities to control tenant Edge placement while achieving a higher level of availability. While Edges are a distinct function of NSX that controls traffic entering and leaving NSX, vCD can provide a significant level of additional functionality.
Abhinav Mishra and I have spent some time writing about the rationale, implementation, migration, and design decisions for Edge Clusters in version 9.7. Below are the links to each of these blog posts:
Currently, I am working on some overall design content for Edge Clusters in VMware vCloud Director 9.7. In the meantime, I wanted to share a step-by-step guide on establishing an Edge Cluster inside of vCD. I will have much more to share on our corporate blog shortly, but this should start some thoughtful discussions.
Quick Intro to Edge Clusters
So what’s the deal with Edge Clusters? Edge Clusters now give a provider discrete control of tenant Edge placement. Previously, this was rather limited and only controlled at the Provider Virtual Data Center (pVDC) layer. With Edge Clusters, we can now establish this on a per-org VDC (oVDC) basis. In essence, these are the three main value points of Edge Clusters:
Consumption of dedicated Edge Clusters for North/South traffic: optimized traffic flow while minimizing the span of Layer 2 broadcast.
Provide a higher level of availability to Edge nodes, which can distinctly fail over between two clusters.
Ability to balance organization Edge services between multiple Edge Clusters: I do not have to use the “same” Primary and Secondary Edge Cluster for every org VDC. This can be configured on a per-org VDC basis.
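To make the per-org VDC idea concrete, here is a minimal Python sketch of the placement model. It is not the vCD API: the class, field, org VDC, and cluster names are all hypothetical, and it assumes the active Edge node lands on the primary cluster while the HA standby lands on the secondary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EdgeClusterAssignment:
    """Hypothetical per-org VDC Edge Cluster assignment (not a vCD object)."""
    org_vdc: str
    primary: str
    secondary: Optional[str] = None  # optional second cluster for the HA standby


def place_edge(assignment: EdgeClusterAssignment, ha_enabled: bool) -> dict:
    """Return where the active and standby Edge nodes would land."""
    placement = {"active": assignment.primary}
    if ha_enabled:
        # Assumption: without a secondary cluster, the standby shares the primary.
        placement["standby"] = assignment.secondary or assignment.primary
    return placement


# Two org VDCs balance Edge services by swapping primary and secondary.
assignments = [
    EdgeClusterAssignment("orgvdc-a", primary="edge-cluster-1", secondary="edge-cluster-2"),
    EdgeClusterAssignment("orgvdc-b", primary="edge-cluster-2", secondary="edge-cluster-1"),
]

for a in assignments:
    print(a.org_vdc, place_edge(a, ha_enabled=True))
```

Swapping the primary and secondary between the two org VDCs is what spreads the active Edges across both clusters while still keeping each active/standby pair on separate clusters.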
Below is an overall high-level design of Edge Clusters from a physical and logical layer:
I get this question quite a bit due to the new vCloud Director 9.5 Cross-VDC networking functionality – does vCloud Availability for Cloud-to-Cloud 1.5 (C2C) work with stretched networking inside of Cross-VDC networking?
The answer is: yes!
This is another great addition for recoverability considerations as one could fail over between vCloud Director instances without modifying the guest OS IP address. Furthermore, based on the application architecture, one could have active-active applications and ensure replication/failover in the event of a disaster.
Let’s go through my example high-level design I’ve worked up in my lab –
In the above diagram, we can see I have two active vCloud Director instances, Site-A and Site-B. I have two organizations, “Daniel” that resides in Site-A along with “Daniel-B” that resides in Site-B.
C2C is deployed on each site in the combined model and I have multi-site pairing completed so I can easily manage this between my two sites –
Within my Cross-VDC networking setup, I currently have my active egress set to Site-A as depicted in the diagram above.
Last of all, I ran a protected workflow from Site-A to Site-B for my Stretched-vApp-VM –
From there, one can either migrate or fail over the workload without any guest OS IP changes. I am going to do a video shortly, but here’s a short GIF I created that shows how easy it is to fail over between my Site-A and Site-B:
After failover, I can then access Stretched-vApp-VM from the new NAT address on Site-B.
An Organization Administrator could also configure active/active or active/passive egress points for additional resiliency. This provides further capability inside of vCloud Director, especially with stretched networking and a complementary availability solution.
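The key property here is that the guest IP stays fixed while the externally reachable NAT address follows the site. A minimal Python sketch of that behavior follows; it models the idea only and does not call vCloud Director or C2C. The IP addresses are illustrative documentation ranges and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StretchedVM:
    name: str
    guest_ip: str  # address on the stretched network; never changes
    site: str      # site currently running the VM


# Hypothetical per-site NAT addresses (illustrative documentation ranges).
SITE_NAT = {"Site-A": "192.0.2.10", "Site-B": "198.51.100.10"}


def fail_over(vm: StretchedVM, target_site: str) -> StretchedVM:
    """Move the VM to the target site, keeping the guest IP untouched."""
    return StretchedVM(name=vm.name, guest_ip=vm.guest_ip, site=target_site)


def external_address(vm: StretchedVM) -> str:
    """Address used to reach the VM from outside: the NAT of its current site."""
    return SITE_NAT[vm.site]


vm = StretchedVM("Stretched-vApp-VM", guest_ip="192.168.100.10", site="Site-A")
recovered = fail_over(vm, "Site-B")
print(recovered.guest_ip, external_address(vm), "->", external_address(recovered))
```

Only the lookup in `SITE_NAT` changes after failover, which mirrors why nothing inside the guest OS needs to be reconfigured.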
Recently, I received a request from one of our aggregators regarding how Equal Cost Multipathing (ECMP) is metered within the VMware Cloud Provider Program (VCPP). The question centered on Tom Fojta’s recommendation for architecting Provider-managed NSX Edges and a Distributed Logical Router (DLR) in ECMP mode, as shown in this diagram from Architecting a VMware vCloud Director Solution:
As shown in the diagram above, how does Usage Meter bill these tenant virtual machines (VMs) when we have a provider NSX architecture that utilizes ECMP?
For you TL;DR readers: any VM connected to a Tenant Edge or direct network that has ECMP enabled northbound will be charged for NSX Advanced. Read on if you want to learn how this is done.
First off, let’s talk about why this matters. Per the Usage Meter Product Detection whitepaper (this can be found on VMware Partner Central), we can see how Usage Meter detects specific NSX features based on the pattern of usage. Regarding dynamic ECMP, it is metered by the “Edge gateway” which could be a little ambiguous. If one utilizes ECMP, they would be metered for NSX Advanced within VCPP.
One of the scenarios from the whitepaper does show ECMP-enabled Edges but not an Edge that is abstracted away from the provider environment –
My initial reaction was that Usage Meter would not look at the northbound provider configuration and the interconnectivity to vCloud Director. However, I was not confident and wanted to verify this explicit configuration and expected metering. Let’s review my findings.
In the above diagram, we can see I created a similar Provider managed NSX configuration with ECMP enabled from the DLR to the two Provider Edges with dynamic routing enabled (BGP). From there, I expose a LIF/Logical Switch named “ECMP-External-Network” to vCloud Director that is then exposed to my T2 organization as a new External Network.
From there, I created a dedicated Tenant Edge named “T2-ECMP-ESG” that will be attached to this newly created network along with a VM named “T2-ECMP-VM.” The goal is to verify how T2-ECMP-VM and T2-TestVM are metered by Usage Meter with this newly created Tenant Edge.
My Edges are set up for BGP and are reporting the correct routes from the southbound connected DLR (and tenant Edges):
From the DLR, we can see that I have two active paths to my Provider Edges (Provider-Edge-1 and 2) –
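For readers new to ECMP, the idea of two active paths can be sketched in a few lines of Python. This is a generic illustration of flow-based path selection, not NSX's actual hashing algorithm; the hash inputs and the IP addresses are assumptions made for the example.

```python
import hashlib

NEXT_HOPS = ["Provider-Edge-1", "Provider-Edge-2"]  # equal-cost paths from the DLR


def pick_next_hop(src_ip: str, dst_ip: str, next_hops=NEXT_HOPS) -> str:
    """Pick a next hop by hashing the flow, so one flow stays on one path."""
    digest = hashlib.sha256(f"{src_ip}->{dst_ip}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical flows from one tenant address to many destinations.
flows = [("10.0.0.10", f"203.0.113.{i}") for i in range(1, 51)]
used = {pick_next_hop(src, dst) for src, dst in flows}
print(sorted(used))
```

Each individual flow always maps to the same Provider Edge, while the set of flows as a whole spreads across both.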
Next, my T2-ECMP-ESG is operational and attached to the newly created ECMP External Network:
Last of all, I have my VMs created and powered on (remember, Usage Meter will only meter powered-on VMs). We can see T2-ECMP-VM is attached to an org routed network from T2-ECMP-ESG named “T2-ECMP-Network”:
Let’s work from the north to south – start with the Provider Edges and show how Usage Meter detects and bills.
Note – I have vROps Enterprise in my lab environment, so we will see Usage Meter picking up vROps registration and placing it in the appropriate bundle.
Provider Edges / DLR
As expected, the Provider Edges and DLR are detected along with registration to vROps. By design, NSX Edges are charged for the Advanced SP Bundle as they are metered as a management component (minimum Advanced bundle / 7-point). However, in my case, we see detection, and then registration to vROps Enterprise. Therefore, since it’s a bundle ID (BND) of 12, this is correlated to Advanced Bundle with Management (10-point) –
Tenant Edge – T2-ECMP-ESG
Just like the Provider Edges and DLR, we see T2-ECMP-ESG register to UM along with vROps Enterprise registration. Same billing model as above.
Tenant VM – T2-TestVM
I would not expect any change to this VM, but wanted to showcase that having a separate Edge with standard networking (i.e. no ECMP) will bill based off the NSX SP Base level. As expected, T2-TestVM was handled by Usage Meter just as anticipated – we can see registration, NSX SP Base usage, along with registration to vROps Enterprise –
Tenant VM – T2-ECMP-VM
Finally, let’s take a look at my T2-ECMP-VM – as discussed before, this is wired to a Tenant Edge that is connected to the ECMP-enabled DLR via an External Network.
We see initial registration, registration to vROps Enterprise, then NSX Advanced usage! This would be metered at Advanced Bundle with Networking and Management due to the NSX Advanced usage (12-point).
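The bundle outcomes seen so far can be summarized as "pick the cheapest bundle that covers what was detected." Here is a rough Python sketch of that rule. It is not Usage Meter's code: the bundle names and point values come from this post, but the `covers` sets and the `REQUIRES` mapping are my own simplification, and the list contains only the bundles named here, so other combinations may not match real VCPP pricing.

```python
# Bundles and point values as named in this post; the "covers" sets are a
# simplification for this sketch, not an official VCPP definition.
BUNDLES = [
    ("Advanced Bundle", 7, set()),
    ("Advanced Bundle with Management", 10, {"management"}),
    ("Advanced Bundle with Networking and Management", 12, {"management", "networking"}),
]

# Assumed mapping from detected usage to the add-on it requires.
REQUIRES = {"vROps Enterprise": "management", "NSX Advanced": "networking"}


def cheapest_bundle(detected):
    """Return (name, points) of the lowest-point bundle covering the detections."""
    needed = {REQUIRES[d] for d in detected if d in REQUIRES}
    name, points, _ = min(
        (b for b in BUNDLES if needed <= b[2]), key=lambda b: b[1]
    )
    return name, points


print(cheapest_bundle(["vROps Enterprise"]))                  # Edges / DLR case
print(cheapest_bundle(["vROps Enterprise", "NSX Advanced"]))  # T2-ECMP-VM case
```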
Summary of Findings
Here’s what we learned:
Edge and DLR Control VMs are not charged for NSX usage since UM handles them as management components. If you are using vROps, UM will place them in the most cost-effective bundle.
Utilizing ECMP at the provider-level DOES impact any southbound connected VM from a billing perspective, even if an Edge sits in between the ECMP enabled configuration and the tenant VM. Per the findings, NSX Advanced will be metered.
Therefore, be aware of any NSX provider architecture and the use of NSX specific features.
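The observed behavior can be modeled as a northbound walk from the VM: if any hop on the way up has ECMP enabled, the VM is flagged for NSX Advanced. The Python sketch below is a toy model of that outcome, not Usage Meter's actual detection logic. The ECMP path uses the names from my lab, while "T2-ESG" and "Standard-External-Network" are hypothetical names for the non-ECMP path.

```python
# Northbound links, child -> parent. "T2-ESG" and "Standard-External-Network"
# are hypothetical names for the non-ECMP path.
NORTHBOUND = {
    "T2-ECMP-VM": "T2-ECMP-Network",
    "T2-ECMP-Network": "T2-ECMP-ESG",
    "T2-ECMP-ESG": "ECMP-External-Network",
    "ECMP-External-Network": "DLR",
    "T2-TestVM": "T2-ESG",
    "T2-ESG": "Standard-External-Network",
}

ECMP_ENABLED = {"DLR"}  # components with ECMP turned on


def nsx_level(vm: str) -> str:
    """Walk northbound from the VM; any ECMP-enabled hop means NSX Advanced."""
    node = vm
    seen = set()  # guards against accidental loops in the map
    while node in NORTHBOUND and node not in seen:
        seen.add(node)
        node = NORTHBOUND[node]
        if node in ECMP_ENABLED:
            return "NSX Advanced"
    return "NSX SP Base"


for vm in ("T2-ECMP-VM", "T2-TestVM"):
    print(vm, "->", nsx_level(vm))
```

Note that the Tenant Edge sitting between the VM and the DLR does not stop the walk, which matches the finding that an intermediate Edge does not shield the VM from NSX Advanced metering.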
Again, this shows the logic inside of Usage Meter and how it relates to metering for tenant workloads. Cheers!