Every Branch Gets a Stack: A How-To Guide

June 10, 2020

Turbocharge your team’s development workflow with this strategy that provides quick feedback in a collaborative, low-risk environment

Last year SingleStone set out to build its first SaaS solution. We decided to do things differently and lean into a branch strategy. We’ve built plenty of custom software in our 23-year history, but always for clients. In those projects there are always pre-existing factors that guide design decisions: specific languages, tooling, and processes. Sometimes, there are existing CI workflows in place that dictate how our contributions are incorporated.

In building a SaaS solution that we would run ourselves, we had more latitude than usual to decide which tools and languages we would use, and how we would build, test, and release our code.

An environment for every branch

One decision we made early on was that every branch of code we developed would get built and deployed—in its own environment—before it was merged. Building a complete application stack for every branch provides so many benefits:

  • It shortens the feedback cycle. By building stacks automatically you can find out whether your code works soon after pushing it. It’s easier, faster, and less expensive to fix bugs earlier in the development process.
  • It keeps the master branch stable. By only merging code that is proven to work, we ensure that any commit on the master branch is stable, deployable code.
  • It makes feature demos easy. We always strive to demo from the master branch but sometimes a feature isn’t quite ready to merge. By deploying each branch we can still show off the feature running in a production-like environment.
  • It eliminates differences between development and production environments. We’re using AWS to run our application, so every application stack uses the same ECS, ECR, Secrets Manager, RDS, and ALB services. We still use Docker Compose locally for development, but as soon as it’s viable we have CI deploy to the cloud.
  • It improves collaboration. If a developer is having an issue on a branch or needs design input, it’s easy for the team to swarm around a live environment. Since each application stack gets its own unique URL, getting someone’s input is as easy as sending them a link.

This approach works equally well for virtual servers, containers, or serverless functions.

See my colleague George Hatzikotelis’s post on 7 Quality Assurance Principles for more insight—especially point number 7: push-button releases.

How it works

Now that the benefits are obvious…how do you actually do this?

Infrastructure (as code, obviously)

First, all of your infrastructure needs to be defined in code. For AWS, that means CloudFormation, a script using the CDK or Boto3, or Terraform. If you’re deploying to Kubernetes, look at tools like kustomize and Helm to parameterize your infrastructure configuration. This infrastructure code is exactly the same as what you use to deploy to production. It needs to capture at least two parameters: Git branch name and environment.

The branch name is used by the pipeline to pull the correct deployment artifacts and as a prefix for cloud resource names. It also becomes the subdomain in your feature branch environment’s fully qualified domain name (FQDN).

The environment parameter (dev/qa/prod or similar) is used to specify resource allocation via conditional mappings inside your infrastructure code. Feature branch environments are considered dev and as a result receive fewer compute resources to reduce costs. Redundancy options like multi-AZ databases are likewise skipped.

Pipeline smarts

With the infrastructure defined, the next step is to define a CI/CD pipeline that takes specific actions according to the triggering event. For this article, we’re focused on what actions the pipeline takes for pull requests. Specifically, when:

  • A pull request is opened.
  • A commit is pushed to a branch that has an open PR.
  • The pull request is not marked as a “work in progress,” which would indicate that the branch is not yet ready for an environment to be built. For GitLab, that means the merge request title is prefixed with “WIP”. For GitHub, you could implement similar logic in the pipeline to check for the presence of a “WIP” label. (If you’re using GitHub Actions, something like WIP could fit the bill.) A sketch of the GitLab check follows this list.
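
For GitLab CI, a minimal sketch of that WIP check is a hidden job template that the deploy and test jobs can extend. It reuses the only/except style from the example later in this post; the template name itself is just an illustration.

.SkipIfWIP:
  only:
    refs:
      - merge_requests
  except:
    variables:
      # Skip any job that extends this template while the MR title starts with "WIP"
      - $CI_MERGE_REQUEST_TITLE =~ /^WIP/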

Pull requests should run through the following pipeline stages:

Unit tests & linting

Any test failures or linter findings should halt the pipeline. Have these jobs run in parallel to keep the pipeline fast.
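
In GitLab CI, for instance, jobs assigned to the same stage run in parallel automatically. A rough sketch, assuming a Node.js project (the stage name, image, and npm scripts are placeholders for whatever your project uses):

Test:Unit:
  stage: Verify
  image: $BUILD_CONTAINER
  script:
    - npm ci
    - npm test        # any test failure fails the job and halts the pipeline
  only:
    refs:
      - merge_requests

Lint:
  stage: Verify
  image: $BUILD_CONTAINER
  script:
    - npm ci
    - npm run lint    # linter findings fail the job as well
  only:
    refs:
      - merge_requests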

Build

The resulting artifact could be a binary package or a container image. In either case, it should be published to an artifact repository or container registry. The artifact’s name should include the branch name and the abbreviated commit SHA. In the case of a container image, this would be the image tag. For example, if the new-user-signup branch of ExampleCo’s Doodad app were built, the image name would be exampleco/doodad:new-user-signup-bfda0317. Appending the short SHA to the artifact name makes it easier to distinguish between multiple commits to the same branch. If your application has multiple artifacts—perhaps you build a “frontend” container image and a “backend” container image—have the pipeline build them in parallel.
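
Because jobs in the same GitLab CI stage run in parallel, splitting the build into one job per artifact parallelizes it automatically. A sketch along those lines (the service names and compose wiring are illustrative, following the fuller build job shown later in this post):

Build:Frontend:
  stage: Package
  image: $BUILD_CONTAINER
  variables:
    TAG: $CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
  script:
    - docker-compose build frontend
    - docker-compose push frontend
  only:
    refs:
      - merge_requests

Build:Backend:
  stage: Package
  image: $BUILD_CONTAINER
  variables:
    TAG: $CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
  script:
    - docker-compose build backend
    - docker-compose push backend
  only:
    refs:
      - merge_requests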

Deploy

This is where the pipeline deploys a live application stack using the parameterized infrastructure code and the artifact built in the previous stage. The key is to have the pipeline pass the necessary parameter values (branch name and environment) to the infrastructure templates. The logic in the infrastructure template does the rest: deploying the correct artifacts and sizing the resources appropriately. This includes cloud-specific services that can’t be replicated locally, such as load balancers, managed databases, and cloud networking constructs like security groups.

One important aspect of deploy is configuring DNS and HTTPS. You’ll need to have a hosted DNS zone already set up and the ability to dynamically generate CA-signed certificates. AWS’ Route 53 and Certificate Manager services are an obvious choice, but you could accomplish something similar with another DNS provider and Let’s Encrypt. On Kubernetes, CoreDNS and cert-manager, or various network and ingress frameworks can be used to achieve the same result. When a feature branch environment is deployed, the infrastructure code should include logic to create a subdomain in the hosted zone and attach a certificate to the application’s public interface (which may be a load balancer). The result is that the application running in the feature branch environment can be reached at a secure, named endpoint such as https://new-user-signup.internal.example.com.
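
On AWS, the certificate attachment can be as simple as an HTTPS listener on the load balancer. This sketch assumes a base stack that issued a wildcard certificate for the zone and exported its ARN as WildcardCertificateArn (you could also issue a per-branch ACM certificate instead); the LoadBalancer and TargetGroup references mirror the fuller CloudFormation example later in this post.

  # Inside the Resources section of the branch stack template
  HttpsListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref LoadBalancer
      Port: 443
      Protocol: HTTPS
      Certificates:
        # e.g. a certificate covering *.internal.example.com
        - CertificateArn: !ImportValue WildcardCertificateArn
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref TargetGroup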

Finally, the deploy stage should output that endpoint name somewhere visible to the team. Some CI platforms provide a way to feed this URL back into the pull request (such as GitLab Review Apps). But a simple bot could also perform this task by posting the URL into the pull request comments.
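
On GitLab, a bare-bones version of that bot can be a pipeline job that calls the merge request notes API. This sketch assumes a GITLAB_API_TOKEN CI/CD variable holding a token with API scope; the rest uses GitLab’s predefined variables.

Notify:EnvironmentUrl:
  stage: Deploy to Non-Production
  image: $BUILD_CONTAINER
  script:
    - |
      # Post the review environment URL as a comment on the merge request
      curl --request POST \
        --header "PRIVATE-TOKEN: $GITLAB_API_TOKEN" \
        --data-urlencode "body=Review environment: https://$CI_COMMIT_REF_SLUG.myapp.com" \
        "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes"
  only:
    refs:
      - merge_requests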

Integration tests

With a live environment, the pipeline can now trigger any automated tests against a running application. We used Cypress, but there are many options. If this area is new to you, see George’s companion post on automated testing for ideas on how to get started.
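
As a rough sketch, the corresponding GitLab CI job can point the test runner at the branch environment’s URL. The image tag, stage name, and URL pattern below are assumptions; adjust them for your own test setup.

Test:Integration:
  stage: Integration Test
  image: cypress/included:4.8.0   # pin to the Cypress version your project uses
  variables:
    # Run the test suite against this branch's live environment
    CYPRESS_BASE_URL: https://$CI_COMMIT_REF_SLUG.myapp.com
  script:
    - cypress run
  only:
    refs:
      - merge_requests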

(Re)Load test data

This is an optional step, but loading test data into the application makes it feel much more “real” when a human logs in to evaluate the feature branch environment. Providing a manually triggered pipeline job that wipes the database and reloads the test data gives manual testers the flexibility to make whatever changes they need, knowing that they can always wipe the slate clean.
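
In GitLab CI, that can be a when: manual job that runs your seed script against the branch environment. The seed-data script and its flags below are placeholders for whatever mechanism your application uses to load fixtures.

Reload:TestData:
  stage: Deploy to Non-Production
  image: $BUILD_CONTAINER
  variables:
    BRANCH_NAME: $CI_COMMIT_REF_SLUG
  script:
    - |
      echo "Reloading test data for $BRANCH_NAME..."
      # Placeholder: wipe the branch database and load known-good fixtures
      ./scripts/seed-data.sh --stack myapp-$BRANCH_NAME --reset
  when: manual   # runs only when a human clicks "play" in the pipeline
  only:
    refs:
      - merge_requests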

Teardown

When the pull request is closed (either because it was merged or because it was discarded), the final pipeline stage should terminate the infrastructure stack. In addition to being an automated final step, a manually triggered job with the same effect allows the team to turn off any feature branch environments that they don’t need running—thereby saving money.

Any additional commits to the feature branch will re-run the pipeline, and the latest commit will be deployed.

Example implementation

Here’s how it all comes together, in this case using CloudFormation and GitLab CI to build and deploy a containerized application. The approach stays the same regardless of which tools and app stack you use.

First, you’ll want your pipeline to build your application and publish the resulting artifact whenever there is a pull request (“merge request” in GitLab parlance). If that succeeds, the resulting build should be deployed.

Build:MergeRequest:
  stage: Package
  image: $BUILD_CONTAINER
  variables:
    # The docker-compose config will tag the image with $TAG if it is set,
    # resulting in something like myorg/myapp:mybranch-a1b2c3d.
    # An example docker-compose snippet...
    #   myapp:
    #     image: "${REGISTRY_IMAGE-local}/myapp:${TAG-latest}"
    # See https://docs.docker.com/compose/compose-file/#variable-substitution
    # for details.
    TAG: $CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
  script:
    - |
      echo "Building Docker image - myorg/myapp"
      docker-compose build myapp
      docker-compose push myapp
  only:  # GitLab CI has since deprecated `only` in favor of rules
    refs:
      - merge_requests

Deploy:Review:
  stage: Deploy to Non-Production
  image: $BUILD_CONTAINER
  variables:
    BRANCH_NAME: $CI_COMMIT_REF_SLUG
    ENVIRONMENT: test
    VERSION: $CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
  environment:
    name: MyApp-Review/$CI_COMMIT_REF_SLUG
    url: https://$CI_COMMIT_REF_SLUG.myapp.com
    on_stop: Stop Review
  script:
    - |
      echo "Deploying branch $BRANCH_NAME..."
      aws cloudformation deploy \
        --region $REGION \
        --stack-name myapp-$BRANCH_NAME \
        --template-file ./every_branch_gets_a_stack_cfn.yaml \
        --parameter-overrides \
          Environment=$ENVIRONMENT \
          BranchName=$BRANCH_NAME \
          Version=$VERSION \
        --no-fail-on-empty-changeset
  only:
    refs:
      - merge_requests
  except:
    variables:
      - $CI_MERGE_REQUEST_LABELS =~ /no-deploy/

Stop Review:
  extends: .Stop
  stage: Deploy to Non-Production
  environment:
    name: MyApp-Review/$CI_COMMIT_REF_SLUG
    action: stop

Your infrastructure code should have the ability to determine whether it’s deploying a branch environment and, if so, create resources accordingly.

This CloudFormation example snippet accomplishes that by:

  1. Capturing the branch name, target environment, and artifact version to deploy.
  2. Including conditional logic to alter behaviors based on the branch name.
  3. Allocating resources based on the target environment, opting for a scaled-down non-production deployment to reduce costs. For databases, that means smaller instances and no clustering in non-production.
  4. Creating resources—especially named resources—differently based on the branch name.

Parameters:

  BranchName:
    Description: The name of the Git branch (or ref) from which to deploy
    Type: String
    Default: master

  Environment:
    Description: The environment to deploy
    Type: String
    Default: test

  Version:
    Description: The version of the container image to deploy
    Type: String
    Default: latest


Conditions:

  # If the BranchName != 'master', MasterBranch condition == False
  MasterBranch: !Equals [ 'master', !Ref 'BranchName' ]


Mappings:

  EnvironmentMap:
    release:
      TaskDesiredCount: 8
      TargetGroupDeregistrationDelay: 60
      TaskDefinitionCpu: 2048
      TaskDefinitionMemory: 4096
    test:
      TaskDesiredCount: 2
      TargetGroupDeregistrationDelay: 0
      TaskDefinitionCpu: 1024
      TaskDefinitionMemory: 2048


Resources:

# ...snip...

  LoadBalancerAliasRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      Type: A
      Name: !Sub
        - '${RecordPrefix}${HostedZoneName}'
        -
          HostedZoneName: !ImportValue HostedZoneName
          RecordPrefix:
            Fn::If: [ MasterBranch, '', !Sub '${BranchName}.' ]  # Set DNS subdomain based on branch name
      AliasTarget:
        DNSName: !GetAtt LoadBalancer.DNSName
        EvaluateTargetHealth: False
        HostedZoneId: !GetAtt LoadBalancer.CanonicalHostedZoneID
      HostedZoneId: !ImportValue HostedZoneId

# ...snip...

  DatabaseCluster:
    Type: AWS::RDS::DBCluster
    Properties:
      AvailabilityZones:
        - !Select
          - 0
          - Fn::GetAZs: !Ref 'AWS::Region'
        - !Select
          - 1
          - Fn::GetAZs: !Ref 'AWS::Region'
        - !Select
          - 2
          - Fn::GetAZs: !Ref 'AWS::Region'
      BackupRetentionPeriod: !If [ MasterBranch, 35, 3 ]
      DatabaseName: mydatabase
      DBSubnetGroupName: !Ref DBSubnetGroup
      DeletionProtection: !If [ MasterBranch, True, False ]  # Only protect non-ephemeral environments
      EnableCloudwatchLogsExports:
        - error
        - general
        - slowquery
        - audit
      Engine: aurora
      MasterUsername: !Join ['', ['{{resolve:secretsmanager:', !Ref DatabaseMasterSecret, ':SecretString:username}}' ]]
      MasterUserPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref DatabaseMasterSecret, ':SecretString:password}}' ]]
      StorageEncrypted: True
      VpcSecurityGroupIds:
        - !Ref DatabaseSecurityGroup

  DatabaseInstance1:
    Type: AWS::RDS::DBInstance
    Properties:
      AllowMajorVersionUpgrade: False
      AutoMinorVersionUpgrade: False
      DBClusterIdentifier: !Ref DatabaseCluster
      DBInstanceClass: !If [ MasterBranch, db.r5.large, db.t3.medium ]  # Use smaller instances for ephemeral environments
      DBSubnetGroupName: !Ref DBSubnetGroup
      Engine: aurora

  DatabaseInstance2:
    Type: AWS::RDS::DBInstance
    Condition: MasterBranch  # Only make this a multi-node DB cluster if on `master` branch; destined for demo or production environments
    Properties:
      AllowMajorVersionUpgrade: False
      AutoMinorVersionUpgrade: False
      DBClusterIdentifier: !Ref DatabaseCluster
      DBInstanceClass: !If [ MasterBranch, db.r5.large, db.t3.medium ]
      DBSubnetGroupName: !Ref DBSubnetGroup
      Engine: aurora

# ...snip...

Considerations

If you’ve read this far, you can see the merits of this approach. But what else should you consider when adopting this strategy?

First, pay attention to infrastructure costs if you’re running these feature branch stacks in a public cloud. You’ll be running a complete instance of your application for every open pull request. You can keep costs low by keeping feature branch stacks running only as long as they’re needed, turning them off during non-working hours, ensuring the pipeline destroys those stacks once pull requests are closed, and running scaled-down, non-redundant infrastructure. Our team found that the marginal increase in cost was far outweighed by the increase in productivity.

Also, if you don’t always need manual reviews, you can extend your pipeline to build a stack, run the automated integration tests, capture the results, and then tear down the stack. If you want the ability to toggle this behavior on for some—but not all—pull requests, most CI platforms expose the labels applied to the pull request when the pipeline is triggered. Have your pipeline skip the manual review jobs if a “skip-manual” label is applied to the pull request, as in the sketch below.
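
Following the same label-matching pattern as the no-deploy label in the example above, a sketch of that guard might look like the following; the job shown is a stand-in for whichever jobs you consider part of manual review.

Review:ManualCheck:
  stage: Deploy to Non-Production
  image: $BUILD_CONTAINER
  script:
    - echo "Stack is ready for manual review at https://$CI_COMMIT_REF_SLUG.myapp.com"
  when: manual
  only:
    refs:
      - merge_requests
  except:
    variables:
      # Drop the manual review gate when the "skip-manual" label is applied
      - $CI_MERGE_REQUEST_LABELS =~ /skip-manual/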

Crucial question: how are you going to keep this secure?

The short answer is that you’ll bake security in at multiple layers, the same way you keep your production environment secure. Here are a few steps to take.

  • Network security. If you have a VPN or other direct connection to your cloud network, have your infrastructure code apply firewall rules to feature branch stacks that block inbound Internet access. That will keep external actors out of your test systems. If you don’t have a direct connection, have the firewall rules allow inbound traffic only from your office network. (A sketch follows this list.)
  • User access. Your application running in a feature branch stack should behave just like the instance in production, and that includes user authentication and authorization. The first thing our application did when accessed was prompt for credentials. We maintained separate LDAP directories and IdPs for our test environments (e.g. “ExampleCo”) and for our internal tooling. Keeping that separation helped ensure that a configuration oversight, like assigning a dummy/test user to the wrong group, wouldn’t inadvertently grant access to our AWS account or CI/CD platform.
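
For the network security piece on AWS, the load balancer’s security group can restrict ingress for anything that isn’t the master branch. A sketch using the MasterBranch condition from the CloudFormation example above (the OfficeCidr parameter and the imported VpcId are assumptions):

  # Inside the Resources section of the branch stack template
  LoadBalancerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Ingress rules for the application load balancer
      VpcId: !ImportValue VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          # Feature branch stacks accept HTTPS only from the office network;
          # the master branch stack is reachable from the Internet.
          CidrIp: !If [ MasterBranch, '0.0.0.0/0', !Ref OfficeCidr ]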

Used individually, the items above will improve the security posture of your feature branch environments. Be cautious if your feature branch environments are not firewalled off from the Internet and you rely only on managing user access: even if your login flow is flawless, other parts of your publicly exposed test application may have vulnerabilities you aren’t aware of. Always practice defense in depth.

What are you waiting for?

It’s well understood that catching and fixing issues early in the development cycle is less costly and time-consuming than addressing them after a release has gone live. This article lays out a blueprint for improving your team’s development process by providing quick feedback in a collaborative, low-risk environment. Build better software, faster.

Are you ensuring quality in your application before going to production? Or are you just hoping for the best every time you merge a pull request? Either way, we’d love to hear your story.


Chris Belyea

Technical Director (DevOps)
Chris Belyea is SingleStone’s Cloud and DevOps Technical Director. Chris guides clients through Cloud and DevOps transformations, including cloud strategy, migrating workloads to the cloud, automating infrastructure deployment, creating CI/CD pipelines, and improving automated configuration management.
