Your Pi Pulled the Wrong Image Again
You spent an evening containerizing your app, pushed the image, pulled it on your Raspberry Pi 5, and got this:
    exec /usr/local/bin/app: exec format error

Classic. Your laptop is x86_64, your Pi is arm64, and Docker handed the Pi a binary built for the wrong architecture. The container started just fine on your laptop, so obviously it’s the Pi’s fault. It’s not.
Here’s the thing: in 2026 multi-architecture builds aren’t optional anymore. Apple Silicon Macs run arm64. AWS Graviton instances are arm64 and they’re cheaper per vCPU than x86. Your homelab is probably a mix of a Pi, a Rockchip board, maybe an old Intel NUC. If you’re pushing a single-arch image, you’re writing off half your hardware and your CI bills in one shot.
The fix is a multi-arch image — one image reference that resolves to the right binary for whatever pulls it. Docker’s buildx plugin and QEMU make this surprisingly achievable. Let’s do it.
What a Multi-Arch Image Actually Is
Before we build anything, a quick anatomy lesson because the terminology trips people up.
A manifest list (also called an OCI image index) is a pointer. When you push myrepo/myapp:latest as a multi-arch image, the registry stores something like this:
    $ docker manifest inspect myrepo/myapp:latest
    {
      "schemaVersion": 2,
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "manifests": [
        {
          "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
          "digest": "sha256:abc123...",
          "platform": { "architecture": "amd64", "os": "linux" }
        },
        {
          "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
          "digest": "sha256:def456...",
          "platform": { "architecture": "arm64", "variant": "v8", "os": "linux" }
        }
      ]
    }

The tag latest doesn’t point at an image; it points at a list of images. When your Pi runs docker pull, it reads the manifest list, matches its own architecture, and pulls the right one. Your x86 machine does the same. Same tag, right binary, no exec format error.
This is what you’re building. Two (or more) actual images, stitched together under one tag.
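To make the matching step concrete, here is a rough sketch of the lookup a client does before it picks an entry from the list. The function name platform_for_machine is made up for illustration, and the mapping only covers the common Linux cases; the real client logic also considers variants and fallbacks.

```shell
#!/bin/sh
# Map a kernel machine name (as printed by `uname -m`) to the platform
# string used by manifest list entries. Hypothetical helper; it only
# covers the common Linux cases.
platform_for_machine() {
    case "$1" in
        x86_64|amd64)   echo "linux/amd64" ;;
        aarch64|arm64)  echo "linux/arm64" ;;
        armv7l)         echo "linux/arm/v7" ;;
        *)              echo "unknown"; return 1 ;;
    esac
}

# Example: platform_for_machine "$(uname -m)"
```

On a Pi running a 64-bit OS, uname -m reports aarch64, so the arm64 entry is the one that gets pulled.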
Approach 1: QEMU Emulated Builds (Easy, Slow)
QEMU lets your x86 CPU pretend to be an ARM CPU using software translation. It’s like running an ARM interpreter — not fast, but it works and requires exactly zero extra hardware.
Install buildx and QEMU
buildx ships with Docker Desktop. If you’re on Docker Engine (Linux), check:
    $ docker buildx version
    github.com/docker/buildx v0.15.1 ...

If it’s missing, install the plugin:

    $ sudo apt install docker-buildx-plugin

Then register QEMU binary formats with the kernel:

    $ docker run --privileged --rm tonistiigi/binfmt --install all

This sets up /proc/sys/fs/binfmt_misc so the kernel knows to hand arm64 binaries to QEMU instead of rejecting them with exec format error. The registration doesn’t survive a reboot, so you need to do this once per boot (or set it up as a systemd unit if you’re running a builder host).
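Since the registration disappears on reboot, a small check before each build can save a confusing failure. This is a minimal sketch: binfmt_enabled is a hypothetical helper, and the entry name qemu-aarch64 is an assumption about how the arm64 handler is named on your host, so list /proc/sys/fs/binfmt_misc to confirm.

```shell
#!/bin/sh
# Succeed if a binfmt_misc entry exists and its first line reads "enabled".
# $1: entry name (assumed to be "qemu-aarch64" for arm64 emulation)
# $2: binfmt_misc directory (defaults to the usual mount point)
binfmt_enabled() {
    entry="${2:-/proc/sys/fs/binfmt_misc}/$1"
    [ -r "$entry" ] && head -n 1 "$entry" | grep -qx "enabled"
}

# Example: re-register only when the handler is missing.
#   binfmt_enabled qemu-aarch64 ||
#       docker run --privileged --rm tonistiigi/binfmt --install all
```

The directory is a parameter only so the function can be exercised against a scratch directory; in normal use you pass just the entry name.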
Create a buildx builder
The default builder uses the docker driver, which can’t produce multi-arch images unless Docker’s containerd image store is enabled. Create a builder that uses the docker-container driver instead (it’s what buildx create gives you by default):

    $ docker buildx create --name multiarch --use
    $ docker buildx inspect --bootstrap

The --use flag makes it the active builder. inspect --bootstrap starts the BuildKit container in the background and confirms available platforms. You should see linux/amd64, linux/arm64, linux/arm/v7, and a bunch more.
Build and push
    $ docker buildx build \
        --platform linux/amd64,linux/arm64 \
        --tag myrepo/myapp:latest \
        --push \
        .

That’s it. --push is required here: multi-arch builds can’t be loaded into your local Docker daemon (a manifest list has nowhere to land locally). The image goes straight to the registry.
BuildKit spins up two builds in parallel: one native amd64 pass, one arm64 pass running through QEMU. The arm64 pass is slow. Like, “make a coffee” slow for non-trivial images. A Rust compile that takes 90 seconds natively might take 12 minutes under QEMU. We’ll fix that in approach 2, but for small images or scripted languages, emulation is perfectly fine.
A Dockerfile that plays nice with multi-arch
Most Dockerfiles just work. A few things to watch:
    # Use a multi-arch base image (alpine is fine here)
    FROM alpine:3.20

    # This is fine: apk resolves the right arch automatically
    RUN apk add --no-cache curl

    # Copy your binary, but make sure you're building it for the right arch
    # in your build stage (see below)
    COPY app /usr/local/bin/app
    RUN chmod +x /usr/local/bin/app

    CMD ["app"]

If you’re compiling a Go binary:
    # Run the build stage on the build machine's native architecture
    FROM --platform=$BUILDPLATFORM golang:1.22-alpine AS builder

    WORKDIR /app
    COPY . .

    # TARGETOS and TARGETARCH are set by BuildKit; declare them to use them
    ARG TARGETOS
    ARG TARGETARCH
    RUN GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -o app .

    FROM alpine:3.20
    COPY --from=builder /app/app /usr/local/bin/app
    CMD ["app"]

TARGETOS and TARGETARCH are automatically injected by BuildKit when building for a specific platform. You don’t set them; you only declare them with ARG. The --platform=$BUILDPLATFORM on the builder stage matters: without it, the builder stage itself runs as an arm64 container under QEMU and the compile is emulated. With it, the Go toolchain runs natively and cross-compiles: set the env vars, build, done. No QEMU needed for the compile step itself, which is why Go multi-arch images build fast even with emulation.
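BuildKit also exposes the full platform string as TARGETPLATFORM, plus TARGETVARIANT for things like arm/v7. If a RUN step has to pick a prebuilt download by architecture, the decomposition looks roughly like this. split_platform is a hypothetical helper written for illustration, not something BuildKit provides.

```shell
#!/bin/sh
# Split a platform string such as "linux/arm/v7" into its parts.
# Prints: os=<os> arch=<arch> variant=<variant> (variant may be empty).
split_platform() {
    os="${1%%/*}"          # text before the first slash
    rest="${1#*/}"         # text after the first slash
    arch="${rest%%/*}"     # architecture, without any variant suffix
    case "$rest" in
        */*) variant="${rest#*/}" ;;
        *)   variant="" ;;
    esac
    echo "os=$os arch=$arch variant=$variant"
}

# Example: split_platform "linux/arm/v7"
```

In a real Dockerfile you rarely need to parse anything, because TARGETOS, TARGETARCH, and TARGETVARIANT already arrive split; the sketch just shows how the three relate to the --platform values you pass on the command line.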
For Rust or C/C++, cross-compilation is more involved (you need the right linker and sysroot). In those cases, native builders (approach 2) save your sanity.
Approach 2: Native Multi-Node Builders (Fast)
Instead of emulating ARM on your laptop, you point BuildKit at actual ARM hardware. Each platform builds natively, no emulation penalty. If you have an arm64 machine anywhere in your lab — a Pi, an Oracle Cloud free tier ARM VM, an AWS Graviton instance — you can do this.
Set up the builder nodes
On your x86 build host, create a builder that spans both machines:
    # Add the local (amd64) node
    $ docker buildx create \
        --name multiarch-native \
        --node native-amd64 \
        --platform linux/amd64 \
        --use

    # Add the remote arm64 node (over SSH)
    $ docker buildx create \
        --name multiarch-native \
        --append \
        --node native-arm64 \
        --platform linux/arm64 \
        --driver-opt env.BUILDKIT_STEP_LOG_MAX_SIZE=10485760 \
        ssh://user@your-arm-host

Buildx connects to the Docker daemon on the remote node over SSH and runs the arm64 build there natively, so the remote host needs Docker installed and a user who can reach it over SSH. Your amd64 build runs locally. Both run in parallel. Fast.
Inspect to confirm both nodes are up:
    $ docker buildx inspect multiarch-native
    Name:   multiarch-native
    Driver: docker-container

    Nodes:
    Name:      native-amd64
    Endpoint:  unix:///var/run/docker.sock
    Platforms: linux/amd64

    Name:      native-arm64
    Endpoint:  ssh://user@your-arm-host
    Platforms: linux/arm64

The build command is identical:

    $ docker buildx build \
        --platform linux/amd64,linux/arm64 \
        --tag myrepo/myapp:latest \
        --push \
        .

BuildKit dispatches the amd64 layers to the local node and the arm64 layers to the SSH node. Results merge into a single manifest list at the registry. Your Rust compile that took 12 minutes under QEMU now takes 90 seconds because it’s actually running on ARM silicon.
CI Patterns
GitHub Actions
    name: Build Multi-Arch Image

    on:
      push:
        branches: [main]
      pull_request:

    env:
      REGISTRY: ghcr.io
      IMAGE_NAME: ${{ github.repository }}

    jobs:
      build:
        runs-on: ubuntu-latest
        permissions:
          contents: read
          packages: write

        steps:
          - name: Checkout
            uses: actions/checkout@v4

          - name: Set up QEMU
            uses: docker/setup-qemu-action@v3

          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v3

          - name: Log in to registry
            uses: docker/login-action@v3
            with:
              registry: ${{ env.REGISTRY }}
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}

          - name: Build and push
            uses: docker/build-push-action@v6
            with:
              context: .
              platforms: linux/amd64,linux/arm64
              push: ${{ github.event_name != 'pull_request' }}
              tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
              cache-from: type=gha
              cache-to: type=gha,mode=max

The setup-qemu-action handles the binfmt registration. setup-buildx-action creates the multi-arch builder. The cache lines are the important part: without them, every run re-downloads base layers and re-compiles from scratch.
GitLab CI / Forgejo Actions
The pattern is nearly identical — Forgejo Actions is GitHub Actions-compatible, so the YAML above works with minor tweaks for registry auth. GitLab’s equivalent:
    build:
      image: docker:27
      services:
        - docker:27-dind
      variables:
        DOCKER_BUILDKIT: "1"
      before_script:
        - docker run --privileged --rm tonistiigi/binfmt --install all
        - docker buildx create --use
        - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
      script:
        - >
          docker buildx build
          --platform linux/amd64,linux/arm64
          --tag "$CI_REGISTRY_IMAGE:latest"
          --push
          --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:buildcache
          --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:buildcache,mode=max
          .

Cache Strategy
Cache misses are the number-one reason multi-arch CI builds are slow. The two main strategies:
GitHub Actions cache (type=gha): BuildKit exports layer data to the GHA cache API. Fast to write, free within GHA limits, wiped if unused for 7 days. Good default for public repos.
Registry cache (type=registry): BuildKit pushes a cache manifest to your registry alongside the real image. Survives across runners, works in self-hosted CI, and you control the retention. The manual example below uses myrepo/myapp:buildcache as the cache tag. That’s a real tag in your registry, but it holds cache layers rather than a runnable image, and with mode=max it can take up real storage.
    # Manual build with registry cache
    $ docker buildx build \
        --platform linux/amd64,linux/arm64 \
        --tag myrepo/myapp:latest \
        --cache-from type=registry,ref=myrepo/myapp:buildcache \
        --cache-to type=registry,ref=myrepo/myapp:buildcache,mode=max \
        --push \
        .

mode=max exports all intermediate layers, not just the final image. More storage, much better cache hit rate on changed layers deep in the build. If you’re watching your registry storage, mode=min exports only the final layers: cheaper but fewer hits.
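If you run this from more than one script, it can help to generate the two cache flags from a single reference so they never drift apart. A minimal sketch, assuming a hypothetical helper named registry_cache_flags; the type=registry, ref=, and mode= options are the same ones used in the command above.

```shell
#!/bin/sh
# Print the --cache-from/--cache-to flags for a registry cache.
# $1: cache reference (for example myrepo/myapp:buildcache)
# $2: cache mode, "max" or "min" (defaults to "max")
registry_cache_flags() {
    ref="$1"
    mode="${2:-max}"
    case "$mode" in
        max|min) ;;
        *) echo "unsupported cache mode: $mode" >&2; return 1 ;;
    esac
    echo "--cache-from type=registry,ref=$ref --cache-to type=registry,ref=$ref,mode=$mode"
}

# Example (word splitting of the output is intentional):
#   docker buildx build --platform linux/amd64,linux/arm64 \
#       $(registry_cache_flags myrepo/myapp:buildcache) \
#       --tag myrepo/myapp:latest --push .
```

Switching a storage-constrained registry to mode=min then becomes a one-argument change instead of an edit in every pipeline.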
Multi-Arch Base Image Traps
Not every base image has an arm64 build. Check before you commit to a base.
Alpine vs glibc: Alpine uses musl libc. Most Go and static binaries don’t care. But if you’re using a C library or compiling against glibc-linked system packages, Alpine on arm64 can produce subtle runtime failures that only surface on the Pi. A Debian slim image (debian:bookworm-slim, for example) is heavier but consistent. When in doubt, test on actual hardware.
Missing arm builds: Some images on Docker Hub only publish amd64. tonistiigi/xx, ghcr.io/linuxserver/*, and official language images (golang, python, node, rust) have solid multi-arch support. Third-party images are a grab bag — always check the Tags tab on Docker Hub and look for the architecture badges.
Platform-specific packages: Some apt or apk packages have different names or aren’t available on arm64. If your build fails only on arm64, check whether you’re hitting a package availability gap.
A quick sanity check before committing to a base:
    $ docker manifest inspect python:3.12-slim | grep architecture

If you see both amd64 and arm64 in the output, you’re good.
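Eyeballing grep output is fine for one image, but a script can do the same check for you. Here is a rough sketch that reads the manifest JSON on stdin and fails if any architecture you ask for is missing. manifest_has_archs is a hypothetical name, and the grep pattern assumes the "architecture": "..." layout shown in the manifest output earlier.

```shell
#!/bin/sh
# Read manifest-list JSON on stdin and succeed only if every architecture
# passed as an argument appears in it.
manifest_has_archs() {
    json=$(cat)
    for want in "$@"; do
        printf '%s\n' "$json" |
            grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$want\"" || return 1
    done
}

# Example:
#   docker manifest inspect python:3.12-slim | manifest_has_archs amd64 arm64
```

Because the pattern includes the closing quote, asking for arm will not be satisfied by an arm64 entry.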
Verify Your Manifest List
After pushing, confirm the manifest list is actually there:
    $ docker buildx imagetools inspect myrepo/myapp:latest
    Name:      docker.io/myrepo/myapp:latest
    MediaType: application/vnd.oci.image.index.v1+json
    Digest:    sha256:aaabbb...

    Manifests:
      Name:      docker.io/myrepo/myapp:latest@sha256:abc123...
      MediaType: application/vnd.oci.image.manifest.v1+json
      Platform:  linux/amd64

      Name:      docker.io/myrepo/myapp:latest@sha256:def456...
      MediaType: application/vnd.oci.image.manifest.v1+json
      Platform:  linux/arm64

Two manifests under one tag. Your Pi will pull sha256:def456, your NUC will pull sha256:abc123. Both call it myrepo/myapp:latest. That’s the whole thing.
Which Approach Should You Use
QEMU emulation is the right call when:
- You’re on a laptop or a single build host with no ARM hardware
- Your image uses a scripted language (Python, Node, Ruby) or cross-compiles cleanly (Go, Rust with cross-rs)
- Build time is acceptable — under 10-15 minutes for your image
Native multi-node is worth the setup when:
- You’re compiling C, C++, or Rust with no cross-compilation setup
- Your CI builds are hitting the 30-minute mark under emulation
- You already have an arm64 machine sitting idle in your lab
For most homelab images, QEMU + registry cache gets you there without maintaining SSH access to a second build node. For anything compile-heavy going into production, the native approach pays for itself the first week.
Your Pi deserves better than exec format error. It took you 30 seconds to set up QEMU. No excuses.