Overview of VMware vCloud Availability 3.0 – Introduction, Roles, Deployment Process

In this series of blog posts, I will be discussing the new release of VMware vCloud Availability 3.0 (vCAv). This is a very exciting release for the VMware Cloud Provider team. vCAv 3.0 will be out shortly (by end of our fiscal quarter) and I want to provide my insight into this platform. I will be focusing on the following points –

  1. Introduction to vCAv 3.0
  2. High-Level Architecture
  3. vCAv 3.0 Service Roles
  4. Deployment Process
  5. Deployment Approach for Provider
  6. Deployment for Tenant (On-Premises)
  7. Protection Workflow
  8. Resources

Introduction to vCAv 3.0

First off, let’s discuss what vCAv provides from a functionality perspective. vCAv is what I like to call “functional convergence.” In the past, we had many different products that provided some level of availability or migration capability –

In my opinion, this was a duplication of appliances and could be confusing to customers. The team has made a significant investment in vCAv 3.0 to simplify the architecture. Here’s where we are today –

As a result, there is no need for multiple tools for migration, DR to/from the Cloud, or DR between Clouds. We now have a single solution that covers all of the above.

vCAv 3.0 Key Functionality

So what does vCAv 3.0 provide?

  1. Simple, Unified Architecture
    1. There is a single OVF for the Provider and the Tenant. On the Provider side, each role can be easily deployed using the vSphere Client or CLI (see the ovftool sketch after this list).
    2. Deployment is very intuitive and scalable – each role can be quickly deployed in a matter of minutes.
    3. On-Premises Appliance is unified and provides vCenter UI integration for management.
  2. On-Premises Migration and Protection
    1. The On-Premises appliance provides the same UI experience as when connected to the Cloud instance, but built into vCenter.
    2. Migration and/or protection of workloads can be done with a few clicks.
    3. Allows for protection/migration to and from vCloud instances.
  3. Cloud to Cloud Migration and Protection
    1. Very similar in behavior to C2C 1.5 – we can protect workloads (vApps and VMs) between vCD instances.
  4. Network Mapping and Re-Addressing
    1. One of the new additions is the ability to re-address and map networks for protected workloads, enabling faster recovery at the destination site.
    2. This maps to existing vCD Org VDC network constructs such as static IP pools.
  5. Scale
    1. As discussed before, the team is aware of scalability requirements for Providers. For this version, here are the stated guidelines:
      1. 300 tenants with active protections paired to a vCloud Director instance
      2. 20 vCAv Replicator instances per vCAv instance
      3. 500 active protections per vCAv Replicator instance
      4. 9,500 active protections across tenants to a Cloud
      5. 5TB protected VM size (contingent upon Cloud storage)
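To give a feel for the CLI option, here is a minimal ovftool sketch for deploying one of the Provider roles from the single OVA. This is an illustration only – the OVA file name, deployment option value, and inventory paths are placeholders, so substitute the actual file and your environment’s values.

ovftool --acceptAllEulas --name=vcav-replicator-01 \
  --deploymentOption=replicator \
  --datastore=Datastore01 --network="Management" \
  vcloud-availability-provider.ova \
  'vi://administrator%40vsphere.local@vcenter.corp.local/DC1/host/Cluster1'

OVF properties such as the root password and network settings can be supplied at deploy time with --prop:key=value flags.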

Business Value

Direct and native vCloud Director integration for providers and tenants – combined with self-service capability, this delivers a unique experience that meets both DRaaS and migration requirements.


Ease of Operationalization – I’ve done several deployments during the development process and it’s one of the easiest VMware Cloud Provider solutions to deploy. Once we review the roles and concepts, anyone should be able to operationalize this with ease.

Cost-Effective Approach – vCAv 3.0 will be part of the VMware Cloud Provider Program (VCPP). This is based on the monthly consumption of points, which is a very cost-effective solution that can be modeled and productized for DRaaS and migration offerings.

High-Level Architecture

In the following diagram, we can see how this all comes together with a single, on-premises vCenter along with two vCD instances. One can pair this up with vCD multi-site federation capability.

Moreover, this pairs very well with existing vCD services such as Cross-VDC Networking or L2VPN connectivity between on-premises and organization VDCs.

vCAv 3.0 Service Roles

Let’s review the distinct service roles of vCAv 3.0.

Provider

On the Provider side, we have the following –

  1. Cloud Replication Management
    1. This is a logical entity that makes up the core of vCAv 3.0.
    2. vCloud Availability Portal – the user interface for the tenant and provider. All UI configuration ingresses through this service and is applied to all necessary connected components.
    3. vCloud Availability vApp Replication Manager – communicates directly with vCD and understands tenancy constructs such as organizations, vApps, etc. Also responsible for enabling protections or migrations.
    4. vCloud Availability Replication Manager – understands vCenter and ESXi concepts and will interoperate between the replicators and protected vCenters.
  2. vCloud Availability Replicator – lightweight node responsible for executing host-based replication from a specific host. Typically, you deploy one Replicator per vCenter.
  3. vCloud Availability Tunnel – this is the tunneling service that is responsible for providing secure connectivity between on-premises vCenter(s) and connected vCD instances.

Each of these roles can be deployed separately or in a combined virtual appliance. For production deployments (which we will review later), the recommendation is standalone deployments for each role.

Tenant / On-Premises

On the tenant side, we have a single appliance that combines all of the roles –

  1. vCloud Availability Replicator – just like on the Provider side, the Replicator is responsible for executing the host-based replication (HBR) process
  2. vCloud Availability Tunnel – provides secure connectivity between on-premises and vCloud environment. All traffic securely ingresses and egresses through this service.
  3. vCloud Availability Plugin – this plugin provides local vCenter UI management with the same experience as connecting to the vCAv Cloud environment.

Deployment Process

While this blog series will cover the Provider and On-Premises sides in further detail, here are the steps to execute for a successful deployment.

Provider:

  1. Deployment of Cloud Replication Management (CRM) Instance
  2. Deploy vCAv Replicator(s)
  3. Deploy vCAv Tunnel
  4. Configuration of CRM instance and start of site wizard
  5. Configuration of Replicator
  6. Configuration of Tunnel
  7. Validation

Tenant:

  1. Deploy vCAv On-Premises Appliance
  2. Start configuration wizard
  3. Connect to vCAv Cloud Tunnel
  4. Configuration of local placement
  5. Validation

Next up, I will review the Provider deployment process in further detail while providing the step-by-step procedures. Stay tuned!

-Daniel

VMware vCloud Director – Installation of PostgreSQL and Migration from Oracle

In one of my lab instances, I still have Oracle running as my backend vCloud Director database. In this post, I am going to document the steps it takes to install PostgreSQL 10 and migrate away from Oracle.

Preparation

First, I take a snapshot of my vCD instance – always back up before making any type of database change! 🙂

Next, my system is a little dated, so I am running a yum update to get all of the latest binaries before we install PgSQL.

I am also running RHEL, so your steps may differ based on your distribution.

Installing and Starting PostgreSQL 10

My esteemed colleague, Sean Smith, wrote a nice post on setting up an all-in-one vCD appliance here so I am going to borrow his steps on installing PgSQL 10.

Get the RPM and start the install –

rpm -Uvh https://yum.postgresql.org/10/redhat/rhel-7-x86_64/pgdg-centos10-10-2.noarch.rpm
yum install postgresql10-server postgresql10

Let’s initialize the database –

service postgresql-10 initdb

From there, use chkconfig to enable it to start on boot –

chkconfig postgresql-10 on

Ready to start!

service postgresql-10 start

Before we can make the authentication change, we need to set a password for the default postgres account (ignore my super-weak password).

Switch user to postgres and type in psql followed by a \password to set it –

su - postgres
psql
\password

Type \q to quit psql and go back to the root console.

Finally, we need to allow authentication to the database. I am going to allow full access to local and remote logins to the database.

Edit the /var/lib/pgsql/10/data/pg_hba.conf file and modify this line –

local all all  peer

to

local all all md5

Then add this line below it –

host all all 0.0.0.0/0 md5

Now, edit the postgresql.conf file and remove the # from the listen_addresses line –
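Since I am also allowing remote logins in pg_hba.conf above, the uncommented line needs to listen beyond localhost, so it ends up as –

listen_addresses = '*'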

Finally, restart the postgresql-10 service –
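service postgresql-10 restart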

We are now ready for the next step which is creating the new vCloud database that we can move over to.

Setting up the new vCloud database on PostgreSQL 10

We are now ready to create our new database and prepare it for the migration.

First, let’s switch user over to the postgres account and enter psql –

su - postgres
psql

We need to create the vcloud account with a password –

create user vcloud with password 'vcloudpass';

Now I’m ready to create my vcloud database. I already have my vcloud user account on the system, so no need to create that again. Following these instructions from the VMware master docs.

create database vcloud owner vcloud;

Finally, alter the role so the database owner can log in:

alter role vcloud with login;

From here, one can set up SSL for secured communication. Since this is my lab, I’m going to skip over that configuration.

Stopping the vCD Instance and Migrating

Let’s stop the vCD service –
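On my Linux cell, that is the following (the service name may vary slightly depending on how vCD was installed) –

service vmware-vcd stop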

Now we can follow the instructions here in the documentation on using the cell-management-tool for dbmigrate –

cell-management-tool is under /opt/vmware/vcloud-director/bin –

Now we are ready to run the cell-management-tool dbmigrate command. For me, this was my configuration – it will differ based on your setup.

./cell-management-tool dbmigrate -dbhost vcd-01a.corp.local -dbport 5432 -dbuser vcloud -dbname vcloud -dbpassword vcloudpass

Processing….

Awesome!

Now we’re ready to run the reconfigure-database command, and boom! Complete.

/opt/vmware/vcloud-director/bin/cell-management-tool reconfigure-database -dbhost vcd-01a.corp.local -dbport 5432 -dbuser vcloud -dbname vcloud -dbpassword vcloudpass -dbtype postgres

Let’s start back up vCD….
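service vmware-vcd start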

We are back up and running!

Lessons Learned

  1. While this was not a difficult task, every distribution is different – Sean’s post, which covers the installation and setup of PostgreSQL 10, is a good reference.
  2. The cell-management-tool works great for database migrations to PostgreSQL-10.
  3. Note that I did not set up SSL communication, which requires further steps. Sean did a great job covering the steps here.
  4. Test, test, test before you do this in production.

Thanks!

-Daniel

How does VMware vCloud Availability for Cloud-to-Cloud 1.5 interoperate with vCD Cross-VDC Networking?

I get this question quite a bit due to the new vCloud Director 9.5 Cross-VDC networking functionality – does vCloud Availability for Cloud-to-Cloud 1.5 (C2C) work with stretched networking inside of Cross-VDC networking?

The answer is: yes!

This is another great addition for recoverability considerations as one could fail over between vCloud Director instances without modifying the guest OS IP address. Furthermore, based on the application architecture, one could have active-active applications and ensure replication/failover in the event of a disaster.

Let’s go through my example high-level design I’ve worked up in my lab –

Example Cross-VDC setup with vCloud Availability for Cloud-to-Cloud 1.5

In the above diagram, we can see I have two active vCloud Director instances, Site-A and Site-B. I have two organizations: “Daniel,” which resides in Site-A, and “Daniel-B,” which resides in Site-B.

C2C is deployed on each site in the combined model and I have multi-site pairing completed so I can easily manage this between my two sites –

Within my Cross-VDC networking setup, I currently have my active egress setup to Site-A as depicted in the diagram above.

Last of all, I ran a protection workflow from Site-A to Site-B for my Stretched-vApp-VM –

From there, one can either migrate or fail over the workload without any guest OS IP changes. I am going to do a video shortly, but here’s a short GIF I created that shows the ease of failing over between my Site-A and Site-B –


After failover, I can then access Stretched-vApp-VM from the new NAT address on Site-B.

An Organization Administrator could also configure active/active or active/passive egress points for additional resiliency. This provides further capability inside of vCloud Director, especially with stretched networking and a complementary availability solution.

Thanks!

-Daniel

How is ECMP metered when configured in the Provider architecture within the VMware Cloud Provider Program?

Recently, I received a request from one of our aggregators regarding how Equal Cost Multipathing (ECMP) is metered within the VMware Cloud Provider Program (VCPP) – specifically, Tom Fojta’s recommendation on architecting Provider-managed NSX Edges and a Distributed Logical Router (DLR) in ECMP mode, as illustrated in this diagram from the Architecting a VMware vCloud Director Solution document –

As shown in the diagram above – how does Usage Meter bill these tenant virtual machines (VMs) when we have a provider NSX architecture that utilizes ECMP?

For the TL;DR readers – any VM connected to a Tenant Edge or direct network that has ECMP enabled northbound will be charged for NSX Advanced. Read on if you want to learn how this is done.

First off, let’s talk about why this matters. Per the Usage Meter Product Detection whitepaper (which can be found on VMware Partner Central), we can see how Usage Meter detects specific NSX features based on the pattern of usage. Dynamic ECMP is metered per “Edge gateway,” which could be a little ambiguous. If one utilizes ECMP, they will be metered for NSX Advanced within VCPP.

One of the scenarios from the whitepaper does show ECMP-enabled Edges but not an Edge that is abstracted away from the provider environment – 

My initial reaction was that Usage Meter would not look at the northbound provider configuration and the interconnectivity to vCloud Director. However, I was not confident and wanted to verify this explicit configuration and expected metering. Let’s review my findings. 

Lab Setup

In the above diagram, we can see I created a similar Provider-managed NSX configuration with ECMP enabled from the DLR to the two Provider Edges, with dynamic routing (BGP) enabled. From there, I expose a LIF/Logical Switch named “ECMP-External-Network” to vCloud Director, which is then presented to my T2 organization as a new External Network.

From there, I created a dedicated Tenant Edge named “T2-ECMP-ESG” that will be attached to this newly created network along with a VM named “T2-ECMP-VM.” The goal is to verify how T2-ECMP-VM and T2-TestVM are metered by Usage Meter with this newly created Tenant Edge. 

Lab Configuration

My Edges are set up for BGP and reporting the correct routes from the southbound connected DLR (and tenant Edges) –

From the DLR, we can see that I have two active paths to my Provider Edges (Provider-Edge-1 and 2) – 
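If you want to verify this from the DLR (or Edge) console yourself, the standard NSX Edge CLI commands below show the routing table and BGP-learned routes – with ECMP active, the same prefix is listed with multiple next hops. I am omitting the output here since it will differ per environment:

show ip route
show ip bgp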

Last of all, my T2-ECMP-ESG is operational and attached to the newly created ECMP External Network – 

Finally, I have my VMs created and powered on (remember, Usage Meter will only meter powered-on VMs). We can see T2-ECMP-VM is attached to an org routed network from T2-ECMP-ESG named “T2-ECMP-Network” –

Findings

Let’s work from north to south – starting with the Provider Edges and how Usage Meter detects and bills them.

Note – I have vROps Enterprise in my lab environment, so we will see Usage Meter picking up vROps registration and placing it in the appropriate bundle.

Provider Edges / DLR

As expected, the Provider Edges and DLR are detected along with registration to vROps. By design, NSX Edges are metered as management components and charged at a minimum of the Advanced SP Bundle (7-point). However, in my case, we see detection and then registration to vROps Enterprise; since the bundle ID (BND) is 12, this correlates to the Advanced Bundle with Management (10-point) –


Tenant Edge – T2-ECMP-ESG

Just like the Provider Edges and DLR, we see T2-ECMP-ESG register to UM along with vROps Enterprise registration. Same billing model as above. 

Tenant VM – T2-TestVM

I would not expect any change to this VM, but I wanted to showcase that a separate Edge with standard networking (i.e. no ECMP) will be billed at the NSX SP Base level. As expected, T2-TestVM was handled by Usage Meter just as anticipated – we can see registration, NSX SP Base usage, along with registration to vROps Enterprise –

Tenant VM – T2-ECMP-VM

Finally, let’s take a look at my T2-ECMP-VM – as discussed before, this is wired to a Tenant Edge that is connected to the ECMP-enabled DLR via an External Network. 

We see initial registration, registration to vROps Enterprise, then NSX Advanced usage! This would be metered at Advanced Bundle with Networking and Management due to the NSX Advanced usage (12-point). 

Summary of Findings

Here’s what we learned:

  1. Edges/DLR Control VMs are not charged for NSX usage since UM handles them as management components. If you are using vROps, it will place them in the most cost-effective bundle.
  2. Utilizing ECMP at the provider level DOES impact any southbound connected VM from a billing perspective, even if an Edge sits between the ECMP-enabled configuration and the tenant VM. Per the findings, NSX Advanced will be metered.
  3. Therefore, be aware of any NSX provider architecture and the use of NSX-specific features.

Again, this shows the logic inside of Usage Meter and how it relates to metering for tenant workloads. Cheers!

-Daniel