After writing my previous post about cleaning up my Continuous Deployment solution for deploying this site, I decided to experiment with AWS’s CodePipeline to set up my own home-grown pipeline to build and deploy the site instead of relying on CodeShip.

Infrastructure setup

For all its features, AWS doesn’t make it easy to set up and tweak things in a secure fashion. The recommended way of creating AWS resources is by using STS, IAM Roles and Policies to set up permissions in the most restrictive way possible. But that approach makes it really hard to iteratively create something when you aren’t quite sure about what you need to create and how all the parts are going to fit together. And then once you’re done setting everything up, you tend to lose track of all the little bits and pieces you created along the way that maybe didn’t end up fitting into the final product.

Enter Terraform. You create a few config files specifying all the resources you need and Terraform goes and creates them for you. It keeps track of everything it created, which makes incremental changes easy. And if you really mess things up or the project doesn’t work out, you can run one command to destroy every single resource that was created.
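To make that concrete, here’s a minimal sketch of what a Terraform config looks like. The bucket name and region here are placeholders, not the ones I actually use:

provider "aws" {
  region = "us-west-2"
}

# An S3 bucket configured for static website hosting.
resource "aws_s3_bucket" "site" {
  bucket = "example-blog-bucket"
  acl    = "public-read"

  website {
    index_document = "index.html"
    error_document = "404.html"
  }
}

You run terraform init once, then terraform plan to see what would change and terraform apply to create it. terraform destroy tears everything down again.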

So, I used a few blog posts (several turn up when you search for “s3 static website terraform github” on your favorite search engine) and a few existing terraform modules as a reference to set up a Terraform configuration that spins up an AWS CodePipeline which:

  1. Uses GitHub webhooks to listen for commits pushed to my blog’s (private) repository
  2. Uses AWS CodeBuild with a golang:1.12 base image to build the site. In my buildspec.yml I put commands to go get Hugo as the INSTALL step, ran hugo -v as the build command and then... nothing (there’s a sketch of that buildspec right after this list). I’ll get back to that ominous “nothing” shortly.
  3. Uses Amazon S3 as a Deployment Provider in the Deploy step. More details here.
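For reference, my first buildspec.yml looked roughly like this (reconstructed from memory, so treat it as a sketch rather than the exact file):

version: 0.2

phases:
  install:
    commands:
      - go get -v github.com/spf13/hugo
  build:
    commands:
      - hugo -v

Note what’s missing at the end of that file. That omission is the “nothing” I mentioned above.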

Everything seemed mostly fine. I ran into a couple of permissions issues that turned out to be caused by a misconfigured encryption_key field in my Terraform configuration, but then I hit a bigger hurdle. The last step for deploying to S3 would just fail with an extremely unhelpful error: InternalError: Error reference code: blah blah blah (the blah’s are mine). With no other information to go on, I created a post on the AWS forums and, when I hadn’t received a response a couple of days later, destroyed all the Terraform-managed resources. But then I stumbled across a random buildspec.yml and noticed that it had an artifacts section listing a bunch of files. So I took a look at the buildspec reference, and sure enough, here’s what it says:

artifacts: Optional sequence. Represents information about where CodeBuild can find the build output and how CodeBuild prepares it for uploading to the Amazon S3 output bucket.

No shit, Sherlock. Couldn’t you have told me that the deployment step couldn’t find any artifacts to upload instead of throwing a useless InternalError? Anyway, I added an artifacts section to my buildspec, and the pipeline turned green! It was pushing the generated static files to my final destination bucket.
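For anyone hitting the same wall, the fix amounted to appending something like this to buildspec.yml (Hugo writes its output to the public directory by default):

artifacts:
  files:
    - '**/*'
  base-directory: public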

Gotta go fast!

Just one small issue at this point: downloading Hugo for each build took its own sweet time, on the order of a couple of minutes. That isn’t a big deal when it comes to getting a blog post out, but being an engineer I prefer faster over slower (with some non-software exceptions, of course). So I did some searching and came across this EXCELLENT post about creating custom Docker images, storing them in AWS ECR and using them in your CodeBuild step instead of building up the environment from scratch each time (which is essentially what the INSTALL step does: it starts from the base image and modifies it on every build). So I created a custom Docker image using a simple Dockerfile that downloads and installs Hugo. That took my entire CodeBuild step from a few minutes down to ~16 seconds.
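In Terraform, the change boiled down to pointing the CodeBuild environment at the ECR image. Here’s a rough sketch of the relevant resource, with a placeholder account ID, region and role (the CodeBuild service role also needs permission to pull from ECR):

resource "aws_codebuild_project" "blog" {
  name         = "build-blog"
  service_role = aws_iam_role.codebuild.arn # assumes an IAM role defined elsewhere

  artifacts {
    type = "CODEPIPELINE"
  }

  source {
    type      = "CODEPIPELINE"
    buildspec = "buildspec.yml"
  }

  environment {
    compute_type = "BUILD_GENERAL1_SMALL"
    type         = "LINUX_CONTAINER"
    # Placeholder ECR URL for the custom Hugo image.
    image        = "123456789012.dkr.ecr.us-west-2.amazonaws.com/hugo-alpine:latest"
  }
}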

Size is everything

…and don’t let anyone tell you otherwise.

Despite using golang:1.12-alpine as the base image, the resulting Docker image was still pretty big because it contained all the Hugo source, the Go toolchain and intermediate build artifacts. Enter Docker multi-stage builds! This feature lets you create one image that produces some build artifacts, which can then be copied into a second image, and the final image is just the SECOND one, with none of the intermediate crap from the first! Here’s what my Dockerfile looks like now:

# Stage 1: build Hugo from source.
FROM golang:1.12-alpine as builder

RUN apk add --no-cache git
RUN go get -v github.com/spf13/hugo

# Stage 2: copy only the built binary into a minimal image.
FROM alpine:latest
WORKDIR /go/bin
COPY --from=builder /go/bin/hugo .
ENV PATH=$PATH:/go/bin

Pretty self-explanatory, but the tl;dr version is that it copies the built Hugo binary from the first image to /go/bin/hugo and adds /go/bin to the PATH.

Free HTTPS and CDN

Cloudflare is awesome. I discovered a while ago that they offer a free CDN (content delivery network) and free DDoS protection, and configured the DNS at my hostname provider at the time to use Cloudflare’s nameservers.

When I first started this porting effort, I set up a CDN using AWS CloudFront with HTTPS but quickly got overwhelmed. AWS’s pricing is pretty flexible, but the large number of tables describing in-region and cross-region data transfers made it hard for me to figure out how much the whole shebang would cost, especially given that my website doesn’t really get a lot of traffic as far as I know. It would probably still have come to only pennies in the end, but why spend money if you don’t need to? In addition to the features listed above, Cloudflare also has a nice analytics dashboard, so I wouldn’t need to set up third-party analytics like Google Analytics. Additionally, Terraform has a Cloudflare provider.

So instead of sticking with an AWS-only solution, I tore down the CloudFront resources and decided to use Cloudflare for CDN, DDoS protection and HTTPS everywhere. It didn’t take a lot to get it up and running: I just had to use their API to fetch all my DNS records, import them into Terraform resources and then manually add zone settings overrides for other miscellaneous settings. The Cloudflare configuration in Terraform isn’t quite as neat and structured as the one for AWS, but it gets the job done.
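The Cloudflare side of the configuration ends up being mostly a pile of cloudflare_record resources plus a zone settings override. Here’s a rough sketch; the domain, record values and settings are placeholders, and depending on the provider version the resources may want zone_id instead of a zone name:

resource "cloudflare_record" "www" {
  domain  = "example.com"
  name    = "www"
  type    = "CNAME"
  value   = "example-blog-bucket.s3-website-us-west-2.amazonaws.com"
  proxied = true
}

resource "cloudflare_zone_settings_override" "site" {
  name = "example.com"

  settings {
    always_use_https = "on"
    ssl              = "flexible"
  }
}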

Free for all

After putting so much effort into this solution, I wanted to make it available to others in an easy-to-use fashion. ECR doesn’t support public images that can be shared with others at the moment, and I didn’t want users to have to manually build an image and upload it to ECR before terraforming everything else. Fortunately, CodeBuild does support pulling images from Docker Hub, so I uploaded my build image there under the name hugo-alpine and configured CodeBuild to use that instead of my ECR one.

I moved all of my Terraform configuration into a module with configurable variables for secrets and other user-specific values (like the GitHub username, repo, etc). Terraform unfortunately doesn’t work quite right if modules define their own providers, so the user needs to configure each provider themselves in order to use my module, but I’m mostly happy with the result. The final module is available here: https://github.com/ameyp/terraform-aws-cloudflare-static. I hope it works for you. Issues are welcome, as are pull requests!
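Using the module looks roughly like this. The variable names below are illustrative, so check the module’s README for the real ones:

provider "aws" {
  region = "us-west-2"
}

provider "cloudflare" {
  # Configure with your own Cloudflare credentials.
}

module "static_site" {
  source = "github.com/ameyp/terraform-aws-cloudflare-static"

  # Illustrative variables; the module defines its own set.
  github_user = "your-github-username"
  github_repo = "your-blog-repo"
  domain      = "example.com"
}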