Article

Serverless Deployment of VMware Cloud on AWS

If you’re familiar with AHEAD, you’ll know that it’s pretty common for us to do briefings on industry trends and the art of what’s possible. Many times, these conversations include live demos of a technology our customers might be interested in to solve their unique business problems. We’ve done this for many years, and AHEAD has invested a lot of time and money in our lab.

But today, AHEAD’s lab has to adjust to things like on-demand billing just like everyone else’s. We do a lot of cloud work for our customers, and showing them things like VMware Cloud on AWS (VMC) is part of the cost of doing business for us. Luckily, we can use our automation and cloud talent to lower our own lab costs and still provide our customers the experience they need to make decisions for their business. Here’s an example of how we used automation in conjunction with AWS native services and the VMware Cloud on AWS API to provision our VMC lab.

The Requirements

The requirements were simple. Build an on-demand VMC environment that we can use to demonstrate the platform’s capabilities to our customers or to troubleshoot issues, all while minimizing our costs.

The Design Decisions

Several decisions had to be made about how to deploy VMC. The first was where the deployment scripts should run. Should we spin up a vRealize Orchestrator appliance or a Jenkins container? Those would work, but we’d then need to keep track of those resources in our on-prem cloud, which requires backups, patching, etc. So we decided that since VMC lives within an AWS account anyway, why not just put the deployment scripts in an AWS Lambda function within that account? Yeah, that seems simple enough.

We next had to decide what language to use for the build. PowerShell was the first choice, especially since VMware already provides VMC modules. Unfortunately, those modules were too big to fit within a Lambda function’s deployment package limits, so we resorted to Python 3.7 and native REST calls to the VMC API.
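
To give a sense of what those native REST calls look like, here’s a rough sketch of the build trigger. The endpoints and field names follow the public VMC on AWS API documentation as we understood it, but the org ID, SDDC name, and region are placeholders rather than our actual values, and the sketch assumes the requests library is packaged with the function.

```python
# Sketch: exchange a CSP refresh token for an access token, then kick off an
# SDDC build with a native REST call. Placeholder values throughout.
import requests

CSP_AUTH_URL = "https://console.cloud.vmware.com/csp/gateway/am/api/auth/api-tokens/authorize"
VMC_API_BASE = "https://vmc.vmware.com/vmc/api"


def get_access_token(refresh_token):
    """Trade a CSP refresh token for a short-lived access token."""
    resp = requests.post(CSP_AUTH_URL, params={"refresh_token": refresh_token})
    resp.raise_for_status()
    return resp.json()["access_token"]


def create_sddc(access_token, org_id):
    """Trigger a single-host SDDC build and return the provisioning task info."""
    body = {
        "name": "ahead-demo-sddc",  # placeholder SDDC name
        "provider": "AWS",
        "region": "US_WEST_2",      # placeholder region
        "num_hosts": 1,
    }
    resp = requests.post(
        f"{VMC_API_BASE}/orgs/{org_id}/sddcs",
        headers={"csp-auth-token": access_token},
        json=body,
    )
    resp.raise_for_status()
    return resp.json()  # includes the IDs the status checker polls later
```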

We realized that some configuration would need to happen after the VMC environment was built, such as opening a firewall rule to allow access to the vCenter instance. Unfortunately, our Lambda function couldn’t trigger the build and then loop for the roughly two hours it takes to stand up VMC from scratch, since Lambda caps a function’s run time at 15 minutes. That posed a bit of a problem, but we solved it by using multiple Lambda functions and storing state data in AWS SSM Parameter Store. This way, our build function could trigger the deployment, log some important info to Parameter Store, and quit. Then, a second status checker function would run on a 5-minute schedule through Amazon CloudWatch.
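
Here’s a simplified sketch of that hand-off using boto3 and Parameter Store. The parameter name and the JSON layout are illustrative, not the exact ones our functions use.

```python
# Sketch of the state hand-off between the build function and the status checker.
import json

import boto3

ssm = boto3.client("ssm")
STATE_PARAM = "/vmc-lab/deployment-state"  # illustrative parameter name


def save_deployment_state(sddc_id, org_id):
    """Build function: record what was kicked off, then exit."""
    ssm.put_parameter(
        Name=STATE_PARAM,
        Value=json.dumps({"sddc_id": sddc_id, "org_id": org_id, "status": "DEPLOYING"}),
        Type="String",
        Overwrite=True,
    )


def load_deployment_state():
    """Status checker: read the state back on its 5-minute schedule."""
    param = ssm.get_parameter(Name=STATE_PARAM)
    return json.loads(param["Parameter"]["Value"])
```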

Now, once the status checker found that the VMC environment had been deployed and was in a READY status, we wanted to trigger another set of configurations to make our lab more useful. The first configuration we built was to open a firewall port to the vCenter instance deployed in VMC. This would allow our engineers to log in to vCenter immediately once the VMC build was complete.

We could have used another polling function here, or the status checker could have called the configuration function directly, but we thought this was a good opportunity to leverage Amazon Simple Queue Service (SQS) as a message bus. When the status checker identifies a READY VMC, it puts a message on the SQS queue, which triggers our VMC configuration function.
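
A sketch of that hand-off is below. The queue URL is a placeholder, and the firewall helper is stubbed out because the real work happens against the VMC/NSX API once you have the SDDC details.

```python
# Sketch: the status checker drops a message on the queue when the SDDC reports
# READY, and the configuration function consumes it as an SQS-triggered Lambda.
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/vmc-ready"  # placeholder


def notify_ready(sddc_id):
    """Status checker: publish a READY event for the configuration function."""
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"sddc_id": sddc_id, "status": "READY"}),
    )


def open_vcenter_firewall_rule(sddc_id):
    """Placeholder: in the real function this calls the VMC/NSX API to add a
    management gateway rule allowing HTTPS access to vCenter."""
    ...


def configure_handler(event, context):
    """Configuration function: invoked by Lambda's SQS trigger."""
    for record in event["Records"]:
        message = json.loads(record["body"])
        open_vcenter_firewall_rule(message["sddc_id"])
```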

Next was deciding what to do about VMC running for too long. Let’s face it, if we build the automation to deploy VMC but do nothing about the removal process, we’ll still be stuck with an environment that is always provisioned. Even I have forgotten to remove cloud resources when I was done with them, and they can rack up a large bill if you aren’t paying attention.

We tackled this by pushing a destroy date, set two days in the future, to Parameter Store during the build. Another Lambda function was set on a one-hour schedule to check for VMC builds that should be terminated.
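
Here’s roughly what that looks like. The parameter name is illustrative again, and the actual SDDC delete call is stubbed out.

```python
# Sketch of the auto-teardown: the build function records a destroy date two
# days out, and an hourly function tears down anything past its date.
from datetime import datetime, timedelta, timezone

import boto3

ssm = boto3.client("ssm")
DESTROY_PARAM = "/vmc-lab/destroy-after"  # illustrative parameter name


def record_destroy_date():
    """Build function: schedule the teardown two days in the future."""
    destroy_after = datetime.now(timezone.utc) + timedelta(days=2)
    ssm.put_parameter(
        Name=DESTROY_PARAM,
        Value=destroy_after.isoformat(),
        Type="String",
        Overwrite=True,
    )


def destroy_handler(event, context):
    """Cleanup function: runs on a one-hour CloudWatch schedule."""
    destroy_after = datetime.fromisoformat(
        ssm.get_parameter(Name=DESTROY_PARAM)["Parameter"]["Value"]
    )
    if datetime.now(timezone.utc) >= destroy_after:
        # In the real function this issues a DELETE against the SDDC via the
        # VMC API and publishes the "destroyed" SNS notification.
        ...
```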

So far, so good. The next concern was making sure that the people responsible for paying the bills knew when a new VMC was deployed and destroyed. This was easy enough to do by sending an email notification at build and destroy time through Amazon Simple Notification Service (SNS).
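
The notification itself is a single boto3 call; the topic ARN below is a placeholder for the SNS topic the billing owners subscribe to.

```python
# Sketch of the billing notification sent at build and destroy time.
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:vmc-lab-billing"  # placeholder


def notify_billing(action, sddc_name):
    """Email the bill owners; action is 'deployed' or 'destroyed'."""
    billing = "started" if action == "deployed" else "stopped"
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject=f"VMC lab {action}: {sddc_name}",
        Message=f"The SDDC '{sddc_name}' was {action}, so VMC billing has {billing}.",
    )
```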

Whenever a user deployed a VMC environment, and whenever the destroy function removed one, those responsible for the VMC bills received an alert that billing had started or stopped. This made it easy to keep an eye on how much the VMC environment was being used.

Along with notifications, we anticipated that at some point we would need to do some troubleshooting. To keep things simple, we used Amazon CloudWatch to capture what’s happening in our Lambda functions: CloudWatch metrics show the performance of the functions and any errors, while CloudWatch Logs holds the detailed output from the functions themselves.
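
There isn’t much code behind that, since anything a Lambda function writes through Python’s standard logging module lands in its CloudWatch Logs log group automatically. A minimal sketch:

```python
# Minimal sketch: standard logging inside a Lambda handler ends up in
# CloudWatch Logs, which is all the troubleshooting detail we needed.
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    logger.info("Status check started with event: %s", event)
    try:
        pass  # poll the VMC API, update Parameter Store, etc.
    except Exception:
        logger.exception("Status check failed")  # surfaces the stack trace in the logs
        raise
```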

Lastly, we had to decide how people would request this new service. To keep things simple, we put the “Build” function in AWS Service Catalog. This allowed us to set permissions and simplify the deployment process so that users only needed to launch a product from the service catalog instead of knowing how to execute a Lambda function. And just for good measure, we also set up an AWS IoT Button to kick off the deployment, because what’s cooler than building your cloud with the literal touch of a button?

The Results

After coding up the Lambda functions and using this for a bit, we can now safely deploy VMC in a single-node configuration and be sure that it will be destroyed before it racks up a big bill. In the end, we can see the status of our functions in CloudWatch, we get notified when a new VMC is deployed, and we feel pretty secure about our lab environment.

Now, of course, there are other ways to do this. We could’ve front-ended the deployment with ServiceNow instead of AWS Service Catalog. We could’ve used Splunk for monitoring our logs. We could’ve removed the need for an SQS queue. We could’ve done all of those things, but this is also a good example of how different cloud services can be fit together to build an app. The way we built an application to deploy VMC is very similar to how customers might want to build apps for their own use cases, so automating our own VMC build doubles as an example of how our customers can automate their own solutions with cloud services.

Need to Talk VMware?

To see just how your business could leverage VMC, you can schedule a briefing with AHEAD experts and test-drive the technology while reviewing the specific details of your IT environment. In the meantime, check out our other resources on VMC in The LAB and reach out with any questions. As a VMware Premier Solution Provider, we can help you with any VMware solution in your stack.