Voltos

The perfect Continuous Integration setup for Heroku

It is a tale as old as time (well, not really): you add a new feature to your app, you run the tests and they all come back green. You git push master, the build automatically runs, and comes back green. You deploy to production and… it all breaks. Because there was some underlying system package missing, or a version mismatch, or any of a hundred other reasons.

The solution is obviously to make the dev and test environments more similar to the production environment. Keeping that gap as small as possible is one of the tenets of the twelve-factor app: dev/prod parity.

Historically that's been easier said than done on Heroku.

With our product Voltos, we need a very high degree of confidence in our test infrastructure. We need to minimise any incidental risk that infrastructure introduces, so that what we've tested is exactly what runs in production. And with relatively high (security-related) version churn in underlying system dependencies like OpenSSL over the past few years, the risk that our test infrastructure is running a different version of such a library from our production system isn't acceptable.

The problem with PaaS CI

So much of the value of running on a PaaS like Heroku is that someone else is helping take care of things for you. When there's a critical OpenSSL issue Heroku have usually started rolling out patches across the fleet before most people would have had time to respond.

This freedom to focus on your own product comes with an obvious cost: you're not the one who controls the base image.

Compound that with the fact that, for the most part, you've not been able to run your own CI setup on Heroku and have instead had to pick a vendor from the Add-on/Elements Marketplace. That has been great for convenience but widens the gap from the dev/prod parity ideal. You've now got one third-party vendor managing your production infrastructure, and a different third party where you're trying to run some approximation of what you've got in production.

Not to mention it almost certainly doesn't use any custom buildpacks or other Heroku extensions that affect exactly how your application runs.

The dream Heroku CI setup

I spent over 3 years trying to politely cajole various CI providers into supporting a different approach. Something that more accurately replicated a production Heroku environment. Something that actually ran within a dyno. That way it would use all the same buildpacks. It could talk to Heroku Postgres just like my app did. I was free to test any other add-ons if I felt it appropriate in an integration test as opposed to mocking literally everything. Nobody was able to fit it into the way they ran their CI services though.

Well… that is until I met Buildkite.

Let's go fly a kite

The Buildkite approach to CI is beautiful in its simplicity: you run an agent that sits waiting for work; when it gets some, it runs the various scripts you define in the Buildkite UI and returns the status of those commands back to Buildkite. All of the handling of webhooks, fanning out work to parallel workers (if you need it), reporting status back, etc. is taken care of. The one thing they don't do is run your infrastructure for you.

Which is perfect. Heroku do that for me.

Some minor tweaking of the default agent behaviour, wrap it up in a buildpack, and it's ready to run our tests on Heroku for us whenever we deploy.

Here's how we're using it with parts of Voltos:

Setting up Heroku

  1. Set up a Heroku Pipeline so that you create a test/staging app and link it to your production app.
  2. Add the Heroku GitHub Integration to your staging app so that Heroku will check out any changes to your master branch automatically. Make sure you do not check the Wait for CI to pass before deploy checkbox, otherwise your code will never get deployed to this branch.
  3. If you have some test-specific external dependencies, such as PhantomJS, you'll need to add them via additional buildpacks (e.g., heroku buildpacks:add https://github.com/stomita/heroku-buildpack-phantomjs).
  4. You'll need to tell Bundler to ignore only the development dependencies now (so that test dependencies will be installed), and that you want to run the app in test mode: heroku config:set BUNDLE_WITHOUT=development RACK_ENV=test RAILS_ENV=test.
  5. I've enforced an environment/config variable of APP_NAME as a requirement to make targeting individual Buildkite agents/apps on Heroku easier (a subject for another time). For now just make sure you set the app name: heroku config:set APP_NAME=your-app-name.
  6. Install the Buildkite Agent Buildpack as the very last buildpack on your app: heroku buildpacks:add https://github.com/gluio/heroku-buildkite-agent.
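
Steps 3 to 6 can be collected into one small helper so they're repeatable across apps. This is only a sketch: configure_staging_app and its argument are names I've made up for illustration, it assumes the Heroku CLI is installed and logged in, and it assumes you want the PhantomJS buildpack from step 3 (drop that line if you don't).

```shell
#!/bin/sh
# Sketch only: configure_staging_app is a hypothetical helper name,
# not part of the Heroku CLI or the Buildkite buildpack.
# Usage: configure_staging_app <heroku-app-name>
configure_staging_app() {
  app="$1"
  if [ -z "$app" ]; then
    echo "usage: configure_staging_app <heroku-app-name>" >&2
    return 1
  fi

  # Step 3: test-only external dependencies (PhantomJS in this example).
  heroku buildpacks:add https://github.com/stomita/heroku-buildpack-phantomjs --app "$app" || return 1

  # Steps 4 and 5: install test dependencies, run in test mode, set APP_NAME.
  heroku config:set BUNDLE_WITHOUT=development RACK_ENV=test RAILS_ENV=test APP_NAME="$app" --app "$app" || return 1

  # Step 6: the Buildkite agent buildpack goes last.
  heroku buildpacks:add https://github.com/gluio/heroku-buildkite-agent --app "$app"
}
```

Calling configure_staging_app your-app-name then runs the three commands in order and stops at the first one that fails.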

Setting up your local app

There's some minor customisation required in your app to make the transition as seamless as possible. You'll need to create scripts for each step you want Buildkite to run (e.g., setting up the database, running the tests). This is because Buildkite normally assumes it will be a long-running agent that will continue to get work, run it, return status, and then wait for new work.

We don't want to do that on Heroku.

Instead we want to fetch work, run it, and if it fails we want to kill the agent and then return an error to the Heroku build process so that it aborts too. That means two minor hacks, one for any regular step we run and then a slightly different approach for whatever we know to be the last step in a pipeline.

Here's the script I use for setting up the database (bin/migrate_database):

#!/bin/sh
# Run the migrations. On failure, kill the agent and fail the Heroku build.
echo "Running database migration"
bundle exec rake db:migrate
if [ $? -ne 0 ]; then
  pkill -KILL buildkite-agent
  exit 1
else
  exit 0
fi

You'll see it's just the regular bundle exec rake db:migrate from Rails. But then I check the return code for the previous command. If it's not equal to zero, kill the agent and exit 1 to abort the build process. Otherwise exit 0 and continue the build.

The script to run our tests (bin/run_tests) is the last step in our pipeline. It looks like:

#!/bin/sh
# Last step in the pipeline: always stop the agent, pass or fail.
echo "Running tests"
bundle exec rake spec
if [ $? -ne 0 ]; then
  pkill -TERM buildkite-agent
  exit 1
else
  pkill -TERM buildkite-agent
  exit 0
fi

The only substantial difference here is that the agent will always be killed, irrespective of the return status of the previous command. Because in this instance we're done with the agent and need it to shut down and not try and receive any more work.
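
If you end up with more than a couple of steps, the two patterns above could be folded into one shared helper. This is a rough sketch rather than what we actually run: run_ci_step is a hypothetical name, and it simply mirrors the signals used in the two scripts above (KILL when a regular step fails, TERM after the final step regardless of outcome).

```shell
#!/bin/sh
# Sketch only: run_ci_step is a hypothetical helper, not part of Buildkite.
# Usage: run_ci_step <step|final> <command> [args...]
run_ci_step() {
  mode="$1"
  shift

  # Run the wrapped command and remember its exit status.
  if "$@"; then
    rc=0
  else
    rc=$?
  fi

  if [ "$mode" = "final" ]; then
    # Last step: we're done with the agent whether the step passed or not.
    pkill -TERM buildkite-agent
  elif [ "$rc" -ne 0 ]; then
    # Regular step failed: stop the agent so no further work is picked up.
    pkill -KILL buildkite-agent
  fi

  # Report failure to the caller so the Heroku build aborts too.
  if [ "$rc" -ne 0 ]; then
    return 1
  fi
  return 0
}
```

A step script would then source this helper and finish with something like run_ci_step final bundle exec rake spec followed by exit $?.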

Setting up Buildkite

  1. Create a Buildkite account.
  2. Create a new pipeline.
  3. Add the steps for your process (e.g., bin/migrate_database and bin/run_tests) and save the pipeline.
  4. Go to the Agents page, reveal & copy your agent token, and add it to your Heroku app: heroku config:set BUILDKITE_AGENT_TOKEN=8b8b9ccad.

You're done

Push a change to master and watch GitHub notify both Heroku and Buildkite of the change. Buildkite will queue up the test work while Heroku is busy checking out the code, resolving dependencies, and building the app. Once that's done the Buildkite Agent will start, fetch the instructions on what it needs to run, and submit the results back.

Advanced steps

Really fast production deployments

From here you can use heroku pipelines:promote, part of the Heroku Pipelines feature, to move the release on your staging app directly onto your production one in a fraction of a second. That gives you really fast manual deployments to production at a time you control.
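
As a rough sketch of what that looks like from the command line: the function names and the pipeline and app names passed to them are placeholders I've invented, and it assumes the Heroku CLI is installed and logged in. The first helper links a staging and a production app into a pipeline; the second promotes whatever is currently released on staging.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Usage: create_pipeline <pipeline-name> <staging-app> <production-app>
create_pipeline() {
  if [ "$#" -ne 3 ]; then
    echo "usage: create_pipeline <pipeline> <staging-app> <production-app>" >&2
    return 1
  fi
  # Create the pipeline with the staging app, then attach production.
  heroku pipelines:create "$1" --app "$2" --stage staging || return 1
  heroku pipelines:add "$1" --app "$3" --stage production
}

# Usage: promote_staging_release <staging-app>
promote_staging_release() {
  if [ -z "$1" ]; then
    echo "usage: promote_staging_release <staging-app>" >&2
    return 1
  fi
  # Copies the current staging release to the next stage in the pipeline.
  heroku pipelines:promote --app "$1"
}
```

Once the staging build is green, promote_staging_release your-staging-app is the only command needed for a production deploy.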

Auto-deployment of green builds

The alternative approach we've taken is to enable the GitHub integration on our production Heroku app, but check the Wait for CI to pass before deploy checkbox for that app. Now the production app will wait for the staging app to run the tests and make sure everything is green, before re-doing the build on production but this time excluding the test dependencies.

Glenn Gillen

Co-founder of Voltos. I'm also an advisor to, and investor in, early-stage tech startups such as StackShare, Stamplay, GrapheneDB, Fossa, and Polybit. Ex-Heroku, ran Heroku Add-ons & Ecosystem.