Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

docker builds broken on armv7 #9403

Closed
richvdh opened this issue Feb 13, 2021 · 24 comments
Closed

docker builds broken on armv7 #9403

richvdh opened this issue Feb 13, 2021 · 24 comments
Labels
X-Release-Blocker Must be resolved before making a release

Comments

@richvdh
Copy link
Member

richvdh commented Feb 13, 2021

Our Docker build is currently broken on armv7, since python-cryptography now requires a rust compiler to build:

From https://app.circleci.com/pipelines/github/matrix-org/synapse/19720/workflows/47040f65-8516-4995-b250-9cae722e4efc/jobs/25662:

#25 5590.   Building wheel for cryptography (PEP 517): started
#25 5605.   Building wheel for cryptography (PEP 517): finished with status 'error'
#25 5605.   ERROR: Command errored out with exit status 1:
#25 5605.    command: /usr/local/bin/python /usr/local/lib/python3.8/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmpyl06y7df
#25 5605.        cwd: /tmp/pip-install-mmoy6v11/cryptography_468fe58643ff4d02b4439954d410409c
...
#25 5605.   generating cffi module 'build/temp.linux-armv7l-3.8/_openssl.c'
#25 5605.   running build_rust
#25 5605.   
#25 5605.       =============================DEBUG ASSISTANCE=============================
#25 5605.       If you are seeing a compilation error please try the following steps to
#25 5605.       successfully install cryptography:
#25 5605.       1) Upgrade to the latest pip and try again. This will fix errors for most
#25 5605.          users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
#25 5605.       2) Read https://cryptography.io/en/latest/installation.html for specific
#25 5605.          instructions for your platform.
#25 5605.       3) Check our frequently asked questions for more information:
#25 5605.          https://cryptography.io/en/latest/faq.html
#25 5605.       4) Ensure you have a recent Rust toolchain installed:
#25 5605.          https://cryptography.io/en/latest/installation.html#rust
#25 5605.       5) If you are experiencing issues with Rust for *this release only* you may
#25 5605.          set the environment variable `CRYPTOGRAPHY_DONT_BUILD_RUST=1`.
#25 5605.       =============================DEBUG ASSISTANCE=============================
#25 5605.   
#25 5605.   error: Can not find Rust compiler
#25 5605.   ----------------------------------------
#25 5605.   ERROR: Failed building wheel for cryptography

amd64 and arm64 are fine, because we use precompiled wheels from PyPI.

(see also pyca/cryptography#5771)

@richvdh
Copy link
Member Author

richvdh commented Feb 13, 2021

The immediate fix is presumably either to pin cryptography or set CRYPTOGRAPHY_DONT_BUILD_RUST=1, but longer term I guess we need to install a (modern) rust compiler in the dockerfile.

@richvdh richvdh changed the title docker builds broken on non-x86 architectures docker builds broken on armv7 Feb 13, 2021
@richvdh richvdh added the X-Release-Blocker Must be resolved before making a release label Feb 13, 2021
@callahad
Copy link
Contributor

...or drop ARMv7 Docker images? Do we have any usage stats?

@callahad
Copy link
Contributor

Hm. Apparently Raspberry Pis still ship with a 32-bit (ARMv7) kernel, even if the host processor is 64-bit, so we do need to get armv7 builds working again.

@richvdh
Copy link
Member Author

richvdh commented Feb 14, 2021

if our goal is simply targetting raspberry pis, we could plausibly pull wheels from https://www.piwheels.org/.

@callahad
Copy link
Contributor

callahad commented Feb 15, 2021

Okay, let's:

  • Pin cryptography to 3.3.2 to solve our immediate problems for 1.27 and 1.28 and close this bug.
  • File a separate bug to unpin cryptography and add rustc to our build Dockerfiles for 1.29+, labeled as T-Task and P1.

Of note, our builds use the python:3.8-slim image, which is based on Debian 10 (buster). This ships with Rust 1.41, which has issues building cryptography. It sounds like those are being worked out upstream. I have an ARM box, so I'm happy to have the second issue assigned to me as I can probably iterate on it more quickly.

@callahad
Copy link
Contributor

Cryptography 3.4.5 has reduced its minimum supported Rust version to 1.41, so we should be fine on that front.

@callahad
Copy link
Contributor

callahad commented Feb 15, 2021

Actually, apt install build-essential libssl-dev libffi-dev rustc is sufficient to get pip install --no-binary :all: cryptography==3.4.5 to work inside a python:3.8-slim image; rustc is the only item missing from our current Dockerfile.

...so I suspect we can just jam rustc in there and we're good to go.

callahad added a commit to callahad/synapse that referenced this issue Feb 15, 2021
This is needed to build the cryptography library, since it does not
provide wheels for ARMv7.

Fixes matrix-org#9403

Signed-off-by: Dan Callahan <danc@element.io>
callahad added a commit that referenced this issue Feb 15, 2021
This is needed to build the cryptography library, since it does not
provide wheels for ARMv7.

Fixes #9403

Signed-off-by: Dan Callahan <danc@element.io>
@clokep
Copy link
Member

clokep commented Feb 16, 2021

I think this is fixed?

@clokep clokep closed this as completed Feb 16, 2021
@callahad
Copy link
Contributor

Oh how I wish it were.

Building Docker images on native ARMv7 hosts should work just fine.

Cross-building with qemu fails, which means normal docker buildx as used in our CircleCI configuration fails.

Erik traced it to an upstream bug in qemu which has been known for a couple of years.

Erik also proposed a patch to our Dockerfile with the caveat "but that breaks standard docker builds :/" However, it does seem to work fine for me locally for both buildx and normal builds.

...but do we want to carry this debt, versus just dropping ARMv7l?

diff --git a/docker/Dockerfile b/docker/Dockerfile
index d619ee08e..289bc3c74 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -14,7 +14,35 @@
 ARG PYTHON_VERSION=3.8
 
 ###
-### Stage 0: builder
+### Stage: prebuilder
+###
+### New versions of cryptography require rust to be installed, but due to
+### https://bugs.launchpad.net/qemu/+bug/1805913 downloading cargo registries
+### and dependencies doesn't work on multiarch builds. We fix this by
+### prebuilding `cryptography` first in the native arch, and then copying over
+### the registry and running cargo in offline mode.
+###
+FROM --platform=$BUILDPLATFORM docker.io/python:${PYTHON_VERSION}-slim as prebuilder
+
+RUN apt-get update && apt-get install -y \
+        build-essential \
+        libffi-dev \
+        libssl-dev \
+        rustc \
+        zlib1g-dev \
+        && rm -rf /var/lib/apt/lists/*
+
+RUN mkdir /root/.cargo
+ENV CARGO_HOME=/root/.cargo
+
+RUN pip download --no-binary :all: --no-deps cryptography
+
+RUN tar -xf cryptography*.tar.gz --wildcards cryptography*/src/rust/
+
+RUN cd cryptography*/src/rust && cargo fetch
+
+###
+### Stage: builder
 ###
 FROM docker.io/python:${PYTHON_VERSION}-slim as builder
 
@@ -32,6 +60,14 @@ RUN apt-get update && apt-get install -y \
     zlib1g-dev \
  && rm -rf /var/lib/apt/lists/*
 
+COPY --from=prebuilder /root/.cargo /root/.cargo
+
+ENV CARGO_NET_OFFLINE=true
+ENV CARGO_HOME=/root/.cargo
+
+RUN pip install --prefix="/install" --no-warn-script-location \
+        cryptography
+
 # Build dependencies that are not available as wheels, to speed up rebuilds
 RUN pip install --prefix="/install" --no-warn-script-location \
         cryptography \
@@ -57,7 +93,7 @@ RUN pip install --prefix="/install" --no-warn-script-location \
         /synapse[all]
 
 ###
-### Stage 1: runtime
+### Stage: runtime
 ###
 
 FROM docker.io/python:${PYTHON_VERSION}-slim

@callahad callahad reopened this Feb 16, 2021
@callahad
Copy link
Contributor

callahad commented Feb 17, 2021

Okay, so normal builds are broken unless you enable BuildKit.

Which is fine (BuildKit is awesome!), but also this is feeling like a lot of complexity and technical debt to support a platform.

@callahad
Copy link
Contributor

I'm fine with either accepting the patch as-is, or dropping ARMv7 altogether.

Raspberry Pi users should be able to pip install matrix-synapse and have things Just Work since piwheels.org does produce up-to-date cryptography wheels for all current Pi models. They should also be able to build their own Docker images locally, if that's a preferred deployment target.

@ShadowJonathan
Copy link
Contributor

ShadowJonathan commented Feb 17, 2021

I vote for keeping docker support for ARMv7, but knocking it down from a release-blocker (for normal releases, but not for security releases), and making that explicit.

I think the future for raspberry-pi-based homeservers are other implementations than synapse (as synapse is increasingly aimed at large deployments, and it's requirements start to aim more towards that), but i think for the time being we should at least consider supporting them, just on a backburner.

Also, couldn't we add piwheels to the docker build for ARMv7 to avoid building cryptography on there? Seems to make sense to utilise that source anyways.

@callahad
Copy link
Contributor

Alternative patch from Erik which doesn't require BuildKit

diff --git a/docker/Dockerfile b/docker/Dockerfile
index d619ee08e..ce404662c 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -12,11 +12,13 @@
 #
 
 ARG PYTHON_VERSION=3.8
+ARG BASE_IMAGE=docker.io/python:${PYTHON_VERSION}-slim
+ARG CARGO_NET_OFFLINE=false
 
 ###
 ### Stage 0: builder
 ###
-FROM docker.io/python:${PYTHON_VERSION}-slim as builder
+FROM ${BASE_IMAGE} as builder
 
 # install the OS build deps
 RUN apt-get update && apt-get install -y \
@@ -32,6 +34,8 @@ RUN apt-get update && apt-get install -y \
     zlib1g-dev \
  && rm -rf /var/lib/apt/lists/*
 
+ENV CARGO_NET_OFFLINE=${CARGO_NET_OFFLINE}
+
 # Build dependencies that are not available as wheels, to speed up rebuilds
 RUN pip install --prefix="/install" --no-warn-script-location \
         cryptography \

And then you'd build a separate base image (see below) and use it by passing--build-arg BASE_IMAGE=cargo_cache --build-arg CARGO_NET_OFFLINE=true when doing multiarch builds in CI.

diff --git a/docker/Dockerfile-cargo-cache b/docker/Dockerfile-cargo-cache
new file mode 100644
index 000000000..b4673fff5
--- /dev/null
+++ b/docker/Dockerfile-cargo-cache
@@ -0,0 +1,21 @@
+# A docker file that caches the cargo index for the cryptography deps. This is
+# mainly useful for multi-arch builds where fetching the index from the internet
+# fails for 32bit archs built on 64 bit platforms.
+
+ARG PYTHON_VERSION=3.8
+
+FROM --platform=$BUILDPLATFORM docker.io/python:${PYTHON_VERSION}-slim as builder
+
+RUN apt-get update && apt-get install -y \
+    rustc \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN pip download --no-binary cryptography --no-deps cryptography
+
+RUN tar -xf cryptography*.tar.gz --wildcards cryptography*/src/rust/
+
+RUN cd cryptography*/src/rust && cargo fetch
+
+FROM docker.io/python:${PYTHON_VERSION}-slim
+
+COPY --from=builder /root/.cargo /root/.cargo

So we'll have to decide on a path forward, but it seems like consensus is along the lines of making ARM a tier 2 platform: best effort, but we'll ship releases like 1.27 without them.

@TR-SLimey
Copy link

Ah, so there will be no arm64 images for 1.27? :(

@callahad
Copy link
Contributor

Probably. Our CI is effectively set up as:

  1. Build amd64 and publish to Docker Hub
  2. Run a three-way multi-arch build for amd64, arm/v7, and arm64; and publish to Docker Hub, overwriting the label from above

So if either ARM architecture fails, we don't publish any ARM images. And our current assumption is that there's no easy way to modify existing Docker Hub listings to add additional architectures... so to optionally publish arm64 even when v7 fails, we'd need a three stage build process:

  1. Build [amd64] and publish if successful
  2. Build [amd64, arm64] and publish if successful
  3. Build [amd64, arm64, arm/v7] and publish if successful

Since all the ARM stuff is happening under qemu, it takes a very long time to build. So that's unlikely to be a way forward.

If we're missing something fundamental about how Docker handle multi-arch, please do let us know :)

@TR-SLimey
Copy link

TR-SLimey commented Feb 17, 2021

Ah, then I might have a solution! :D

As opposed to building and re-building the images, you could simply build and publish each image separately and tag each as {{whatever}}-linux-amd64, {{whatever}}-linux-arm64 and {{whatever}}-linux-armv7 and then simply publish a manifest with the {{whatever}} tag and pointing to each of the above. If any fail, those would just be skipped out.

Only thing with that is, the final manifest would need to be published either after everything is built, thereby slowing down the release of the amd64 image (only theoretically, it would still exist under the -linux-amd64 tag), or every time a release is built which feels kinda hacky.

Not sure if the above was understandable enough, so to hopefully clarify, I can point you towards a CI file I had set up earlier for one of my projects: https://github.com/TR-SLimey/DroneExternalConfig/blob/master/drone.yml

Notice how in the above, the ARM build is commented out but everything works just fine. Of course that works on Drone CI and not Circle CI which I think is what you're using, but maybe you can figure something out for Circle?

Edit: In case the Drone plugin docs are of any use to you, here they are also: Manifest, Docker Build.

@callahad
Copy link
Contributor

All that said, in trying to do some reading on multi-arch docker, it looks like the docker manifest command does have an --amend option for docker manifest create, which seems like we could probably do a one-shot, best-case build and include any architectures that built successfully.

A few links in my "read later" pile:

@callahad
Copy link
Contributor

In chat Rich brought up the question of what we should do with the latest tag when we can't build every arch. E.g., if armv7 is missing from the 1.27.0 latest, then users on RPis who docker run matrixdotorg/synapse won't automatically get 1.26.0, they'll get an error.

@TR-SLimey
Copy link

In chat Rich brought up the question of what we should do with the latest tag when we can't build every arch. E.g., if armv7 is missing from the 1.27.0 latest, then users on RPis who docker run matrixdotorg/synapse won't automatically get 1.26.0, they'll get an error.

Maybe you could create more tags? latest-amd64, latest-arm64 and latest-armv7 would each guarantee the latest build available for their respective architectures whilst latest is simply a best effort to include all architectures but with no guarantees.

@clokep
Copy link
Member

clokep commented Feb 18, 2021

Note that due to the difficulty in getting these to build reliably we've dropped the ARM/v7 build of Synapse for the moment (as of Synapse v1.27.0).

@scottwallacesh
Copy link

Docker Hub is showing latest as pushed 23 days ago. Did latest not get updated with all this armv7 shenanigans?

@callahad
Copy link
Contributor

Unfortunately, our build scripts only update latest when all included arches build, so updating latest would break ARM64 in addition to ARMv7... and we expect to restore ARM64 images with our next release, so we'd rather keep things working, even if outdated, for those users until we can ship 1.28.

Fortunately, the delayed release of 1.27 means 1.28 (with a proper latest for both amd64 and arm64) should arrive very early next week.

@callahad
Copy link
Contributor

The decision to drop ARMv7 builds was discussed in the 1.27 release announcement. I'll go ahead and close this issue now. Expect a formal release of 1.28 (with ARM64) images in the next 48 hours.

@reyqn
Copy link

reyqn commented Dec 13, 2022

Just in case, synapse doesn't build on piwheels anymore, which was one of the solutions provided when dropping armv7 support. The remaining solutions being:

  • building ourselves (kind of a hassle)
  • using a aarch64 image (not always possible).

We can still apparently use https://hub.docker.com/r/asonix/synapse which I haven't tried (as a side note, the official image boasts armv7 support, when it's actually deprecated)

This doesn't seem like a reasonably painless experience to start using matrix.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
X-Release-Blocker Must be resolved before making a release
Projects
None yet
Development

No branches or pull requests

7 participants