Branch Deployments at Wistia

At Wistia, multiple teams need access to staging at the same time, sometimes for extended periods to gather stakeholder input. This creates contention for the staging environment across the organization.
Branch deployments solve this contention.

Dave Jachimiak

Engineering

What is a Branch Deployment?

A branch deployment is a deployment of a Git branch to a staging cluster. It’s given a unique URL and exists on staging along with many other branch deployments at the same time. Branch deployments free us from the constraints of a single staging environment, allowing us to give access to stakeholders (like product managers, designers, and QA) for extended periods of time.

“Giving PRs a staging link has been IMMENSELY helpful as a designer. It’s helpful to see it live because it allows me to fiddle with the css and give better, more accurate feedback on changes.”

Inspired by GitHub’s review lab, we built branch deployments by extending our internal infrastructure/PaaS. We automatically create a branch deployment in our staging cluster whenever a pull request (PR) is opened by an engineer.

How it Works

Examples below will refer to wistia.staging, which is a fictitious domain.

Background

We use a combination of technologies to achieve branch deployments: our dub CLI, Jenkins, our deploy-api, and Kubernetes.

dub CLI

dub is a CLI we built for all sorts of internal engineering tasks like setting up and tearing down VMs, managing deployments, and tailing production logs.

dub reads from service configuration files and translates them into Kubernetes resources. One of the things to configure is the set of public hosts for web apps. These describe the domains and paths to be associated with those applications. During deploys, dub creates Kubernetes ingresses out of those public hosts, which send incoming traffic matching the configured domains and paths to the associated web apps.
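As a rough illustration of that translation, the sketch below turns a list of public hosts into a Kubernetes Ingress manifest. The config shape (`domain` and `path` keys), the function name, and the default service port are assumptions made for this example; dub's real configuration format isn't shown in this post.

```python
def ingress_from_public_hosts(service, namespace, public_hosts, port=80):
    """Build a networking.k8s.io/v1 Ingress manifest (as a dict) for one web app.

    `public_hosts` is a hypothetical config shape: a list of
    {"domain": ..., "path": ...} entries. It is not dub's actual format.
    """
    rules = []
    for host in public_hosts:
        rules.append({
            "host": host["domain"],
            "http": {
                "paths": [{
                    # Default to routing everything under the domain.
                    "path": host.get("path", "/"),
                    "pathType": "Prefix",
                    "backend": {
                        "service": {"name": service, "port": {"number": port}},
                    },
                }],
            },
        })
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": service, "namespace": namespace},
        "spec": {"rules": rules},
    }
```

Each public host becomes one ingress rule, so traffic for that domain and path is routed to the service named in the backend.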

Jenkins Cluster

Our Jenkins cluster receives commands from the deploy-api to run build-test-publish and deploy jobs for each service. Build-test-publish jobs build Docker images, run tests against containers of the images, and publish the images to our image repository if the tests pass.

The cluster posts back information about the build and deploy state to the deploy-api.

deploy-api

deploy-api is a proxy to our Jenkins cluster. Clients hit this proxy to kick off build-test-publish and deploy jobs in Jenkins.

We found that it’s useful to manage builds and deploys in a single service. Deploying depends on images. Having the deploy-api manage the conditional publishing of images allows it to tell API clients which commits are available to deploy. It also allows us to return the reasons why certain commits failed to publish, such as test failures, and are therefore impossible to deploy.
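To make that idea concrete, here is a minimal sketch of how a single service could track which commits are deployable and why others are not. The class names, method names, and failure strings are hypothetical; this is not the deploy-api's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BuildRecord:
    sha: str
    published: bool
    failure_reason: Optional[str] = None


@dataclass
class BuildRegistry:
    """Hypothetical in-memory record of build-test-publish results."""

    records: Dict[str, BuildRecord] = field(default_factory=dict)

    def record_result(self, sha, tests_passed, failure_reason=None):
        # An image is published only if its tests passed.
        self.records[sha] = BuildRecord(
            sha=sha,
            published=tests_passed,
            failure_reason=None if tests_passed else (failure_reason or "unknown"),
        )

    def deployable_commits(self) -> List[str]:
        return [sha for sha, r in self.records.items() if r.published]

    def why_not_deployable(self, sha) -> Optional[str]:
        record = self.records.get(sha)
        if record is None:
            return "no build recorded"
        return record.failure_reason  # None means the commit is deployable
```

Because the same registry sees both the publish result and the deploy request, a client asking to deploy an unpublished commit can be told the reason right away.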

Kubernetes

We shifted most of our services to Kubernetes in 2017. Branch deployments leverage Kubernetes namespaces so that they can be easily updated and deleted. They also rely heavily on Kubernetes ingresses for HTTP routing.

Overview of Branch Deployment Flow

When an engineer opens a pull request (PR), GitHub sends a webhook to our deploy-api to kick off a build on our continuous integration (CI) servers.

The deploy-api waits until the CI build for the PR’s latest commit succeeds. Then it tells our Jenkins cluster to deploy that commit to its own Kubernetes namespace, with a special subdomain in the ingresses.

Once the branch deployment is rolled out, the deploy-api posts links to the branch deployment in a PR comment on GitHub. The URLs have the special subdomain that points to the branch’s live code.

We update the branch deployment’s live code when subsequent commits for the same branch are pushed to GitHub and those commits’ builds pass. Updating branch deployments is simply a matter of updating the related Kubernetes deployment with the SHAs of the new Docker builds.
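The flow above can be sketched as a small decision function that maps incoming events to actions. The event types, dictionary keys, and action names are all assumptions made for this example; they are not the deploy-api's real webhook handling.

```python
def next_action(event, existing_deployments):
    """Decide what to do for an event about a branch.

    `event` is a dict with hypothetical keys: "type", "branch", "sha".
    `existing_deployments` is the set of branches that already have a
    branch deployment. Returns a tuple of (action, branch, sha).
    """
    kind, branch, sha = event["type"], event["branch"], event.get("sha")
    if kind in ("pr_opened", "commit_pushed"):
        # A new PR or a new commit always starts with a CI build.
        return ("start_build", branch, sha)
    if kind == "build_passed":
        # Only commits with passing builds reach the staging cluster.
        if branch in existing_deployments:
            return ("update_deployment", branch, sha)
        return ("create_deployment", branch, sha)
    if kind == "build_failed":
        return ("skip_deploy", branch, sha)
    return ("ignore", branch, sha)
```

The first passing build for a branch creates its deployment, and later passing builds only update the existing one with the new commit.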

“Branch deployments reduce the amount of time and effort it takes to get code into staging, and removes the need to organize the team around a single staging environment. Our engineers never block each other due to limited staging availability, and the feature branches are always available for stakeholders to review. I can’t imagine going back to a time before we had branch deployments.”

Branch deployments stick around until either the PR is closed or the branch becomes stale, meaning it has received no new commits or other activity for a week.
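That retention rule is simple enough to express in a few lines. The function name and arguments below are hypothetical, and treating exactly seven days of inactivity as stale is an assumption.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=7)  # "a week", per the rule described above


def should_delete(pr_closed, last_activity, now):
    """Return True when a branch deployment should be torn down."""
    if pr_closed:
        return True
    # No commits or other activity for a week means the branch is stale.
    return now - last_activity >= STALE_AFTER
```

A periodic job could run this check over every live branch deployment and queue the ones that return True for deletion.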

Subdomains

For Most Services

Most of our services act on static subdomains, or don’t act on them at all. Since we have a TLS certificate for *.wistia.staging, we can prepend any valid subdomain string, like a normalized branch name, to wistia.staging, for these services. For example, if there exists a branch for the blog service called davej/fix-ui-bug, the domains for the blog service’s ingresses become blog-davej-fix-ui-bug.wistia.staging.
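Here is one way the branch-to-domain mapping could look. The exact normalization rules (lowercasing, replacing runs of other characters with a hyphen, trimming to the 63-character DNS label limit) are assumptions; the post only gives the single example above.

```python
import re

BASE_DOMAIN = "wistia.staging"  # fictitious domain used throughout this post


def normalize_branch(branch):
    """Turn a Git branch name into a string that is safe inside a DNS label."""
    label = re.sub(r"[^a-z0-9]+", "-", branch.lower())
    return label.strip("-")


def branch_host(service, branch):
    # A DNS label may be at most 63 characters, so trim long names and
    # drop any hyphen left dangling at the end by the cut.
    label = f"{service}-{normalize_branch(branch)}"[:63].rstrip("-")
    return f"{label}.{BASE_DOMAIN}"


print(branch_host("blog", "davej/fix-ui-bug"))
# blog-davej-fix-ui-bug.wistia.staging
```

Note that trimming can make two very long branch names collide, so a real implementation would likely need a uniqueness check or a short hash suffix.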

For Services With Dynamic Subdomains

Services with application code that reacts to dynamic subdomains should be handled differently. This is the case for our main customer application, the Wistia app. The Wistia app uses account names as the subdomain to point to the user’s account. For example, Wistia’s home account is at home.wistia.com. The staging equivalent, without branch deployments in mind, is home.wistia.staging.

For these services, the deploy-api reserves 100 subdomains in the form branch-#{n}. The deploy-api leases them to Wistia app branch deployments. For example, if branch-1 and branch-2 are already leased to unrelated branch deployments, and I open a PR for my branch, then the deploy-api leases branch-3 to my branch, and I’ll be able to see my code at *.branch-3.wistia.staging. In turn, I could reach the home account at home.branch-3.wistia.staging.
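The leasing scheme can be sketched as a small pool. The class and method names are hypothetical, and handing out the lowest-numbered free slot is an assumption that happens to match the example above; the deploy-api's real allocation policy isn't described here.

```python
class SubdomainPool:
    """Leases reserved subdomains (branch-1 .. branch-N) to branches."""

    def __init__(self, size=100):
        self.size = size
        self.leases = {}  # slot number -> branch name

    def lease(self, branch):
        # Reuse the existing lease if this branch already has one.
        for n, owner in self.leases.items():
            if owner == branch:
                return f"branch-{n}"
        # Otherwise hand out the lowest-numbered free slot.
        for n in range(1, self.size + 1):
            if n not in self.leases:
                self.leases[n] = branch
                return f"branch-{n}"
        raise RuntimeError("all reserved subdomains are leased")

    def release(self, branch):
        # Called when the branch deployment is deleted.
        for n, owner in list(self.leases.items()):
            if owner == branch:
                del self.leases[n]


pool = SubdomainPool()
pool.lease("unrelated-1")
pool.lease("unrelated-2")
print(pool.lease("my-branch"))  # branch-3
```

Releasing a lease when a deployment is deleted keeps the fixed pool of 100 subdomains from running out.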

We do all of this because a wildcard in a TLS certificate only handles one level of subdomain, and double wildcards like *.*.wistia.staging aren’t supported by certificate hostname matching. For example, a TLS cert for *.wistia.staging can handle a fully qualified domain name like blog-davej-fix-ui-bug.wistia.staging (one level of subdomain), but not home.davej-fix-ui-bug.wistia.staging (two levels of subdomain).
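The single-level rule is easy to see in a simplified matcher. This is an illustrative sketch only: real TLS libraries implement more detailed rules (partial-label wildcards, internationalized names, and so on), and the function name is made up for this example.

```python
def wildcard_matches(pattern, hostname):
    """Check a hostname against a certificate name, where a leading "*"
    stands in for exactly one DNS label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard never spans a dot, so the label counts must be equal.
    if len(pattern_labels) != len(host_labels):
        return False
    for i, (p, h) in enumerate(zip(pattern_labels, host_labels)):
        if i == 0 and p == "*":
            if not h:
                return False  # the wildcard needs a non-empty label
            continue
        if p != h:
            return False
    return True


print(wildcard_matches("*.wistia.staging", "blog-davej-fix-ui-bug.wistia.staging"))  # True
print(wildcard_matches("*.wistia.staging", "home.davej-fix-ui-bug.wistia.staging"))  # False
print(wildcard_matches("*.branch-3.wistia.staging", "home.branch-3.wistia.staging"))  # True
```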

So we made TLS certificates for wildcard domains with those set-aside subdomains — *.branch-1.wistia.staging, for example. We attach those certs to the AWS load balancers that handle our staging traffic.

Deleting Branch Deployments

To delete a branch deployment, the deploy-api sends a message to the Jenkins cluster to run a job that deletes the deployment’s Kubernetes namespace. Easy peasy.

Considerations

Databases

Ideally, each deploy would get its own database, seeded with the data it needs for testing the branch. Since this would be more infrastructure work, we decided to see how using a shared database would go before building out that functionality. So far, shared databases have worked well for us; the relative stability of our data models may let us dodge this concern for now.

Cluster resources

Having multiple branches deployed at the same time simply means that there are more containers running in the staging cluster. That means our staging cluster needs more compute resources (CPU and RAM) than it did before. Compute resource requirements are relatively trivial for most of our services, but the most heavily developed one, the Wistia app, uses quite a bit. We do spend more money to keep up with the new compute resource requirements, but we feel the benefits of branch deployments significantly outweigh the costs.

“Without branch deployments there is no way that we would be able to develop, test, and release at the speed we do. No longer worrying about the logistics of staging or coordinating its use has been a gift to the entire engineering team. It’s allowed me to spend time focusing on the details of my code instead of thinking about how and when I can deploy to the staging environment.”

Manual deployments

We initially thought we would allow branch deployments to be kicked off manually, either through a Slack bot or our dub CLI. But after the infrastructure team collected feedback from application engineers, it was clear that an automated approach was more desirable.

We imagine that we’ll want to enable manual branch deployments in the future, but we’re satisfied with PR-triggered ones for now.


Branch deployments enable a lot of fun and interesting work here at Wistia. Interested in joining us? We’re hiring! Check out our engineering listings on our jobs page.