Notes on Continuous Integration and multi-architecture Docker Images

I spent most of the intervening week in a haze (mostly pondering a few personal matters), but decided to scratch a few personal itches regarding my DevOps pipeline, most of which have to do with the lack of updated versions of the Docker base images I rely on, not for Intel/AMD CPUs but for ARM machines. And since I like dealing with lower-level stuff as a distraction from work, I decided to rebuild everything as multiarch Docker images.

Basic Technique

Docker has massively simplified cross-compiling. Six or seven years back, when I was doing custom Android builds, I had the massive displeasure of dealing with cross-compiling toolchains, which were fast but required going through build scripts with a fine-tooth comb and masking or overriding all sorts of things.

These days, as long as you have a fast enough CPU, all you need to do is load up QEMU and run the native compiler on your machine. And on modern Linux systems, Docker makes that trivial: you register QEMU as a handler for non-native binaries, and qemu-user-static then runs the ARM compiler inside the non-native containers themselves.

It all boils down to five simple steps:

  • Register QEMU as a binary loader inside the build system
  • Inject the qemu-user-static binary (the amd64 one) inside a vanilla alpine image for each architecture you need to target (because QEMU expects the binary to be inside the Docker filesystem)
  • Use those images as a base for your builds
  • Push them to Docker Hub
  • Build a virtual image manifest that references each architecture and push that as well
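Stripped of the Makefile plumbing that comes later, the five steps above expand into a handful of docker commands. Here's a dry-run sketch that just prints them instead of running them (the image name and tag scheme are placeholders, not the real repository):

```shell
#!/bin/sh
# Dry-run sketch: print the per-architecture build/push commands,
# then the single manifest that ties them together.
IMAGE=example/alpine-python
ARCHS="amd64 arm32v6 arm32v7"

for arch in $ARCHS; do
    echo "docker build --build-arg BASE=local/qemu-$arch -t $IMAGE:3.6-$arch 3.6"
    echo "docker push $IMAGE:3.6-$arch"
done

# One manifest references every per-architecture tag at once.
tags=""
for arch in $ARCHS; do tags="$tags $IMAGE:3.6-$arch"; done
echo "docker manifest create --amend $IMAGE:latest$tags"
echo "docker manifest push $IMAGE:latest"
```

The point being that nothing here is architecture-specific except the tag suffix and the QEMU-enabled base image.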

Continuous Integration

All of it is currently working on my alpine-python repository, which is the image I decided to start out with. The CI for it is maintained via Travis CI (which precedes my use of Azure DevOps by several years), so I’m going to use its .travis.yml file as an example:

sudo: required
services:
  - docker # this tells Travis to start the service

before_install:
  - docker --version  # document the version Travis is using (currently 17.x, which is way too old)
  - echo '{"experimental":true}' | sudo tee /etc/docker/daemon.json # try to use manifests
  - mkdir -p $HOME/.docker
  - echo '{"experimental":"enabled"}' | tee $HOME/.docker/config.json # needed to use docker manifest
  - sudo service docker restart

install:
  - docker run --rm --privileged multiarch/qemu-user-static:register --reset # setup QEMU for cross-builds

script: # Since I want to move this to Azure DevOps, all the smarts are in a Makefile
  - make deps # this builds the base QEMU-enabled images
  - make 2.7  # now build each Python image I need (each line builds six variants)
  - make 3.6

after_success: # these can fail without breaking the build
  - docker login -u $DOCKER_USERNAME -p $DOCKER_PASSWORD
  - make push
  - make manifest # doesn't work on Travis since they are only using Docker 17.x

As I point out above, the really smart bits that select the right base image and do the actual builds are inside the Makefile, and are nothing more than sleight of hand, with careful handling of build arguments.

For the sake of brevity, I’m only going to show a few of the Makefile targets, but you can check out the whole thing at your leisure:

# Register QEMU into the host and grab the latest stable binary
deps:
	-docker run --rm --privileged multiarch/qemu-user-static:register
	-mkdir -p tmp
	cd tmp && \
	curl -L -o qemu-arm-static.tar.gz https://github.com/multiarch/qemu-user-static/releases/download/v3.0.0/qemu-arm-static.tar.gz && \
	tar xzf qemu-arm-static.tar.gz && \
	cp qemu-arm-static ../qemu/
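The same release page carries one tarball per target architecture, so extending deps beyond 32-bit ARM is just a different binary name. A quick sketch that prints the download URLs (the aarch64 name is my assumption from the release naming scheme, so check it against the actual release assets):

```shell
#!/bin/sh
# Print the per-architecture tarball URLs for the pinned QEMU release.
VERSION=v3.0.0
BASE_URL=https://github.com/multiarch/qemu-user-static/releases/download
for qemu_arch in arm aarch64; do
    echo "$BASE_URL/$VERSION/qemu-$qemu_arch-static.tar.gz"
done
```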

# Build a QEMU-enabled image for older Raspberry Pis using the Dockerfile in the qemu folder
qemu-arm32v6:
	docker build \
		--build-arg ARCH=arm32v6 \
		--build-arg BASE=arm32v6/$(ALPINE_VERSION) \
		-t local/qemu-arm32v6 qemu
        
# Use that image to build a Python 3.6 image using the Dockerfile in the 3.6 folder:
3.6-arm32v6:
	docker build \
		--build-arg BASE=local/qemu-arm32v6 \
		-t $(IMAGE_NAME):3.6-arm32v6 3.6
        
# Build and push a manifest with multiple architectures:
manifest:
	docker manifest create --amend \
		$(IMAGE_NAME):latest \
		$(IMAGE_NAME):3.6-amd64 \
		$(IMAGE_NAME):3.6-arm32v6 \
		$(IMAGE_NAME):3.6-arm32v7
	docker manifest push $(IMAGE_NAME):latest
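One wrinkle the target above glosses over: docker manifest create only picks up the OS and architecture from each image’s config, so for 32-bit ARM you typically have to annotate the v6/v7 variant yourself. A hedged sketch of the extra recipe lines, between the create and the push (same Makefile variables):

```makefile
	docker manifest annotate $(IMAGE_NAME):latest \
		$(IMAGE_NAME):3.6-arm32v6 --os linux --arch arm --variant v6
	docker manifest annotate $(IMAGE_NAME):latest \
		$(IMAGE_NAME):3.6-arm32v7 --os linux --arch arm --variant v7
```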

The qemu Dockerfile is simplicity itself (and building for other architectures is just a matter of replacing the binary name, which can also be passed on as a build-arg):

ARG BASE 
FROM ${BASE}
COPY qemu-arm-static /usr/bin/qemu-arm-static
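Since the binary name can be passed on as a build-arg, the Dockerfile itself can be parameterized too. A sketch (the QEMU_ARCH arg is hypothetical, not in the repo; note that an ARG used after FROM has to be declared after it):

```dockerfile
ARG BASE
FROM ${BASE}
ARG QEMU_ARCH=arm
COPY qemu-${QEMU_ARCH}-static /usr/bin/qemu-${QEMU_ARCH}-static
```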

Next Steps

Right now, all that’s needed is to build up from this (I’m currently rebuilding some NodeJS images, in a fitting reversal of the “turtles all the way down” allegory).

But doing most of the logic inside a Makefile makes it very easy to move across CI systems. It can be a bit verbose and repetitive, but I can always assume I have GNU Make available and just do this:

ARCHS?=amd64 arm32v6 arm32v7
all:
	$(foreach var, $(ARCHS), $(MAKE) build-$(var);)

build-%:
	docker build \
		--build-arg BASE=local/qemu-$* \
		-t $(IMAGE_NAME):$* dockerfile_folder

…but it smacks of premature optimization, so I’m going to work my way through a few images in an easily debuggable fashion first and then give something like that a go.

Obviously, all of the above works in Azure DevOps just fine, and I’ll add an example YAML for that once I start building a few of the “upper stack” images there.

I’m in no rush to move my existing CI across; remember, if it isn’t broken, don’t fix it, and it’s always a good idea to compare tools and focus on the core techniques.

Update: Well, I couldn’t just leave this alone, now could I? The repo is now fully cleaned up and the Makefile duly optimized. Even better, I’ve tested the same approach on a similar set of NodeJS images (although in that case the build time is so long that Travis times out, which means I will be moving those to Azure DevOps next weekend), and expect to re-use this for another five or six images I maintain.