Speed up Python docker builds using pre-compiled python
Our Python docker builds are slow... a typical CI test run for Python
takes ~21 minutes, of which the first ~10 minutes are spent building the
image.

This results in slower local development, slow CI test suites, and slow deployments.

The main culprit is `pyenv install` which under the covers downloads the
python source and then compiles it locally. Profiling showed that the
download was quick, so even though `pyenv` supports `aria2c`, there's
not much to be gained there. Unfortunately, a quick look at the `pyenv`
issue tracker showed [there's no way to pass pre-compiled artifacts to
`pyenv`](https://github.com/orgs/pyenv/discussions/1872).

For a long time we've bandied about the idea of switching from `pyenv`
to downloading pre-compiled Pythons. In fact, the GitHub Actions team
publishes pre-compiled builds that we could use here:
https://github.com/actions/python-versions/releases
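
As a rough illustration of how a download URL could be located in that project's `versions-manifest.json`, here is a minimal Ruby sketch. It assumes each manifest entry has a `version` and a `files` list whose items carry `platform`, `arch`, `platform_version` and `download_url`, which are the fields the `jq` filter in this commit's Dockerfile selects on. The sample data, the `example.invalid` URLs and the `sample_manifest` / `find_download_url` helpers are hypothetical.

```ruby
require "json"

# Hypothetical, trimmed-down sample shaped like versions-manifest.json.
def sample_manifest
  <<~JSON
    [
      {
        "version": "3.11.4",
        "files": [
          {"platform": "linux", "arch": "x64", "platform_version": "20.04",
           "download_url": "https://example.invalid/python-3.11.4-linux-20.04-x64.tar.gz"},
          {"platform": "linux", "arch": "x64", "platform_version": "22.04",
           "download_url": "https://example.invalid/python-3.11.4-linux-22.04-x64.tar.gz"},
          {"platform": "win32", "arch": "x64",
           "download_url": "https://example.invalid/python-3.11.4-win32-x64.zip"}
        ]
      }
    ]
  JSON
end

# Returns the download URL for an exact version/platform match, or nil if none.
def find_download_url(manifest_json, version:, platform: "linux", arch: "x64",
                      platform_version: "22.04")
  release = JSON.parse(manifest_json).find { |entry| entry["version"] == version }
  return nil unless release

  file = release["files"].find do |f|
    f["platform"] == platform && f["arch"] == arch &&
      f["platform_version"] == platform_version
  end
  file && file["download_url"]
end

puts find_download_url(sample_manifest, version: "3.11.4")
```

Matching on the exact version string mirrors the `.version == "$PY_3_11"` style of selection, so a version missing from the manifest yields `nil` rather than a near match.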

However, we use `pyenv local` + `pyenv exec` throughout our Ruby code
for switching to different Python versions. So we thought that it'd take
a week or more to fully migrate away from `pyenv`.

Today I had to rebuild the python image multiple times, and got so
annoyed that I decided to poke at it a bit.

It turns out that `pyenv` is simply a shim layer, and as long as
`/usr/local/.pyenv/versions/<x.y.z>/bin` exists, it will happily pass
commands to anything in that folder.
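
A minimal Ruby sketch of that layout trick, under stated assumptions: it builds a throwaway directory tree instead of touching a real pyenv install, the `python3` file is an empty stand-in, and the `link_into_pyenv` and `demo_pyenv_layout` helper names are hypothetical. It does not invoke pyenv; it only shows that a symlinked `versions/<x.y.z>` entry exposes the `bin` directory of an install that lives elsewhere.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: expose an existing install dir as <pyenv_root>/versions/<version>.
def link_into_pyenv(install_dir, pyenv_root, version)
  versions_dir = File.join(pyenv_root, "versions")
  FileUtils.mkdir_p(versions_dir)
  link = File.join(versions_dir, version)
  File.symlink(install_dir, link) unless File.symlink?(link)
  link
end

# Builds a fake install plus a fake pyenv root, then checks that the path under
# versions/ resolves to the same file as the path under the install dir.
def demo_pyenv_layout
  Dir.mktmpdir do |tmp|
    # Stand-in for /opt/hostedtoolcache/Python/<x.y.z>/x64
    install_dir = File.join(tmp, "hostedtoolcache", "Python", "3.11.4", "x64")
    FileUtils.mkdir_p(File.join(install_dir, "bin"))
    fake_python = File.join(install_dir, "bin", "python3")
    File.write(fake_python, "")

    # Stand-in for /usr/local/.pyenv
    link = link_into_pyenv(install_dir, File.join(tmp, ".pyenv"), "3.11.4")

    via_pyenv = File.join(link, "bin", "python3")
    File.realpath(via_pyenv) == File.realpath(fake_python)
  end
end

puts demo_pyenv_layout
```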

So I was able to come up with an intermediate solution that speeds the
builds up drastically without requiring a large code refactor.

Running this locally results in the Python download/install/build
step going from ~500 seconds all the way down to ~75 seconds, a savings
of 7 minutes. Given that a full CI run of the python test suite
previously took ~21 minutes, this cuts it by 1/3.
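
The arithmetic behind those figures, as a small Ruby sketch. The defaults are the approximate timings quoted above; the `build_savings` name is only illustrative.

```ruby
# Approximate timings from this commit message: seconds for the build step,
# minutes for the full CI run.
def build_savings(before_s: 500, after_s: 75, ci_run_min: 21)
  saved_s = before_s - after_s
  saved_min = saved_s / 60.0
  {
    saved_seconds: saved_s,
    saved_minutes: saved_min.round(1),
    share_of_ci_run: (saved_min / ci_run_min).round(2)
  }
end

p build_savings
```

With the default inputs this works out to 425 seconds saved, which is about 7 minutes, or roughly a third of the ~21-minute run.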
jeffwidman committed Aug 30, 2023
1 parent 90e20e1 commit d2eb235
Showing 2 changed files with 67 additions and 14 deletions.
74 changes: 61 additions & 13 deletions python/Dockerfile
@@ -16,6 +16,8 @@ USER root

 # Install *only* the apt packages required for this builder image to build Python.
 # C-libs needed by users to build their Python packages should be installed down below in the final docker image.
+# TODO: not all these packages may be needed now that we've switched from `pyenv install` which compiled from source to
+# downloading / copying pre-compiled python
 RUN apt-get update \
 && apt-get upgrade -y \
 && apt-get install -y --no-install-recommends \
@@ -32,43 +34,89 @@ RUN apt-get update \
 tk-dev \
 xz-utils \
 zlib1g-dev \
+# jq isn't necessary to build Python from source, it's used below for locating the download URLs
+jq \
 && rm -rf /var/lib/apt/lists/*

 COPY --chown=dependabot:dependabot python/helpers /opt/python/helpers
 USER root
+# TODO: Now that switched from `pyenv install` which compiled from source to downloading / copying a pre-compiled python
+# we could entirely drop pyenv if we change our ruby code that calls `pyenv exec` to track which version of python to
+# call and uses the full python paths.
 ENV PYENV_ROOT=/usr/local/.pyenv \
 PATH="/usr/local/.pyenv/bin:$PATH"
 RUN mkdir -p "$PYENV_ROOT" && chown dependabot:dependabot "$PYENV_ROOT"
 USER dependabot
 ENV DEPENDABOT_NATIVE_HELPERS_PATH="/opt"
 RUN git -c advice.detachedHead=false clone https://github.com/pyenv/pyenv.git --branch $PYENV_VERSION --single-branch --depth=1 /usr/local/.pyenv

+# We used to use `pyenv install 3.x.y` but it's really slow because it compiles from source (~500s). So instead, we hack
+# around that by downloading a pre-compiled version, then symlinking the `bin` folder to where pyenv expects it.
+# In the future, we should consider dropping pyenv completely, as it's mostly used here for legacy reasons.
+
 FROM python-core as python-3.8
-RUN pyenv install $PY_3_8 \
+RUN mkdir -p /opt/hostedtoolcache/Python/$PY_3_8/x64/ \
+&& cd /opt/hostedtoolcache/Python/$PY_3_8/x64/ \
+&& curl -L https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json \
+| jq 'map(select(.version == '\"$PY_3_8\"')) [].files | map(select(.platform == "linux" and .arch == "x64" and .platform_version == "22.04"))[] | .download_url' \
+| xargs -i -- curl -L {} \
+| tar xz \
+&& rm build_output.txt Python-$PY_3_8.tgz tools_structure.txt setup.sh \
+&& mkdir /usr/local/.pyenv/versions/ \
+&& ln -s /opt/hostedtoolcache/Python/$PY_3_8/x64 /usr/local/.pyenv/versions/$PY_3_8 \
 && bash /opt/python/helpers/build $PY_3_8 \
-&& cd /usr/local/.pyenv \
-&& tar czf 3.8.tar.gz versions/$PY_3_8
+&& cd /opt/hostedtoolcache/Python \
+&& tar czf $PY_3_8.tar.gz $PY_3_8

 FROM python-core as python-3.9
-RUN pyenv install $PY_3_9 \
+RUN mkdir -p /opt/hostedtoolcache/Python/$PY_3_9/x64/ \
+&& cd /opt/hostedtoolcache/Python/$PY_3_9/x64/ \
+&& curl -L https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json \
+| jq 'map(select(.version == '\"$PY_3_9\"')) [].files | map(select(.platform == "linux" and .arch == "x64" and .platform_version == "22.04"))[] | .download_url' \
+| xargs -i -- curl -L {} \
+| tar xz \
+&& rm build_output.txt Python-$PY_3_9.tgz tools_structure.txt setup.sh \
+&& mkdir /usr/local/.pyenv/versions/ \
+&& ln -s /opt/hostedtoolcache/Python/$PY_3_9/x64 /usr/local/.pyenv/versions/$PY_3_9 \
 && bash /opt/python/helpers/build $PY_3_9 \
-&& cd /usr/local/.pyenv \
-&& tar czf 3.9.tar.gz versions/$PY_3_9
+&& cd /opt/hostedtoolcache/Python \
+&& tar czf $PY_3_9.tar.gz $PY_3_9

 FROM python-core as python-3.10
-RUN pyenv install $PY_3_10 \
+RUN mkdir -p /opt/hostedtoolcache/Python/$PY_3_10/x64/ \
+&& cd /opt/hostedtoolcache/Python/$PY_3_10/x64/ \
+&& curl -L https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json \
+| jq 'map(select(.version == '\"$PY_3_10\"')) [].files | map(select(.platform == "linux" and .arch == "x64" and .platform_version == "22.04"))[] | .download_url ' \
+| xargs -i -- curl -L {} \
+| tar xz \
+&& rm build_output.txt Python-$PY_3_10.tgz tools_structure.txt setup.sh \
+&& mkdir /usr/local/.pyenv/versions/ \
+&& ln -s /opt/hostedtoolcache/Python/$PY_3_10/x64 /usr/local/.pyenv/versions/$PY_3_10 \
 && bash /opt/python/helpers/build $PY_3_10 \
-&& cd /usr/local/.pyenv \
-&& tar czf 3.10.tar.gz versions/$PY_3_10
+&& cd /opt/hostedtoolcache/Python \
+&& tar czf $PY_3_10.tar.gz $PY_3_10

 FROM python-core
-RUN pyenv install $PY_3_11 \
+# The pre-compiled Python expects to be installed to this dir
+RUN mkdir -p /opt/hostedtoolcache/Python/$PY_3_11/x64/ \
+&& cd /opt/hostedtoolcache/Python/$PY_3_11/x64/ \
+# TODO: Add support for arm64 on Ubuntu whenever actions/python-versions adds support for it. Currently not available.
+&& curl -L https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json \
+| jq 'map(select(.version == '\"$PY_3_11\"')) [].files | map(select(.platform == "linux" and .arch == "x64" and .platform_version == "22.04"))[] | .download_url' \
+| xargs -i -- curl -L {} \
+| tar xz \
+# These files are part of the actions/python-versions install wrapper and aren't necessary.
+&& rm build_output.txt Python-$PY_3_11.tgz tools_structure.txt setup.sh \
+# pyenv expects the python installation files in the `versions` folder, but the pre-compiled python3 / pip3
+# expect to reside in the /opt/hostedtoolcache/Python/x.y.z/x64 dir, so need a symlink to make them play nice.
+&& mkdir /usr/local/.pyenv/versions/ \
+&& ln -s /opt/hostedtoolcache/Python/$PY_3_11/x64 /usr/local/.pyenv/versions/$PY_3_11 \
 && pyenv global $PY_3_11 \
 && bash /opt/python/helpers/build $PY_3_11

-COPY --from=python-3.10 /usr/local/.pyenv/3.10.tar.gz /usr/local/.pyenv/3.10.tar.gz
-COPY --from=python-3.9 /usr/local/.pyenv/3.9.tar.gz /usr/local/.pyenv/3.9.tar.gz
-COPY --from=python-3.8 /usr/local/.pyenv/3.8.tar.gz /usr/local/.pyenv/3.8.tar.gz
+COPY --from=python-3.8 /opt/hostedtoolcache/Python/$PY_3_8.tar.gz /opt/hostedtoolcache/Python/$PY_3_8.tar.gz
+COPY --from=python-3.9 /opt/hostedtoolcache/Python/$PY_3_9.tar.gz /opt/hostedtoolcache/Python/$PY_3_9.tar.gz
+COPY --from=python-3.10 /opt/hostedtoolcache/Python/$PY_3_10.tar.gz /opt/hostedtoolcache/Python/$PY_3_10.tar.gz

 # Install C-libs needed to build users' Python packages. Please document why each package is needed.
 USER root
7 changes: 6 additions & 1 deletion python/lib/dependabot/python/language_version_manager.rb
@@ -23,7 +23,12 @@ def install_required_python
 return if SharedHelpers.run_shell_command("pyenv versions").include?(" #{python_major_minor}.")

 SharedHelpers.run_shell_command(
-"tar xzf /usr/local/.pyenv/#{python_major_minor}.tar.gz -C /usr/local/.pyenv/"
+"tar xzf /opt/hostedtoolcache/Python/#{python_version}.tar.gz -C /opt/hostedtoolcache/Python/"
 )
+# pyenv expects the python installation files in the `versions` folder, but the pre-compiled python3 / pip3
+# expect to reside in the /opt/hostedtoolcache/Python/x.y.z/x64 dir, so need a symlink to make them play nice.
+SharedHelpers.run_shell_command(
+"ln -s /opt/hostedtoolcache/Python/#{python_version}/x64 /usr/local/.pyenv/versions/#{python_version}"
+)
 end
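
A hedged sketch of the restore step in the Ruby change above, separated from `SharedHelpers` so it can be run on its own. The `restore_commands` helper, its keyword arguments and the `3.9.18` version string are hypothetical; the default paths are the ones used in this commit. It only builds the two shell commands (extract the archived install, then symlink it into pyenv's `versions` folder) and does not execute them.

```ruby
# Hypothetical helper, not part of dependabot-core: returns the shell commands
# that would restore an archived pre-compiled Python and link it into pyenv.
def restore_commands(python_version,
                     toolcache: "/opt/hostedtoolcache/Python",
                     pyenv_root: "/usr/local/.pyenv")
  [
    # Unpack <toolcache>/<x.y.z>.tar.gz back into <toolcache>/<x.y.z>/
    "tar xzf #{toolcache}/#{python_version}.tar.gz -C #{toolcache}/",
    # Point pyenv's versions/<x.y.z> at the unpacked x64 install
    "ln -s #{toolcache}/#{python_version}/x64 #{pyenv_root}/versions/#{python_version}"
  ]
end

restore_commands("3.9.18").each { |cmd| puts cmd }
```

Note that both commands are keyed on the full `x.y.z` version, whereas the removed line used the major.minor form for the archive name.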
