Manylinux2 #152

Closed
wants to merge 9 commits into from

markrwilliams

Initial draft of a manylinux2 Docker image based on CentOS 6.9.

There's a lengthy comment that attempts to justify patching glibc.

markrwilliams and others added 3 commits January 26, 2018 22:34
CentOS 6.9 includes a new enough OpenSSL (1.0.1e) to allow curl to
connect to HTTPS URLs.

The official 32-bit Docker image does not include the util-linux-ng
package that contains linux32.  Without setting a 32-bit personality,
yum will download packages for the host's architecture.

Python 2.6 is included, however, so include a Python implementation of
setarch(8) to enable bootstrapping.
@trishankatdatadog
Member

trishankatdatadog commented Feb 1, 2018

Very nice work, @markrwilliams! Out of curiosity: why CentOS 6, not 7? The latter has a longer lifetime, which means we will need to visit manylinux3 much later in the future, but I'm guessing the former was chosen for maximal backwards-compatibility?

@markrwilliams
Author

@trishankatdatadog Looks like my email(s) have made it to distutils-sig: https://mail.python.org/pipermail/distutils-sig/2018-February/031944.html

I'll be happy to answer any questions there!

@njsmith
Member

njsmith commented Feb 3, 2018

@trishankatdatadog the answer is pretty simple – stuff built on centos 6 runs on centos 7 but not vice-versa, and people still run centos 6/rhel 6, so if you want your wheels to work everywhere then you have to build them on centos 6.
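
One way to see why this holds is to look at the versioned glibc symbols a binary references: something built against a newer glibc can pick up symbol versions that an older glibc does not provide. A minimal sketch, assuming GNU grep, sed, and sort are available; `max_glibc_version` is a hypothetical helper name and is not part of the manylinux scripts:

```shell
# Hypothetical helper: read `objdump -T` output on stdin and print the
# highest GLIBC_x.y symbol version it mentions.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed 's/^GLIBC_//' \
        | sort -u -V \
        | tail -n 1
}
```

For example, `objdump -T some_extension.so | max_glibc_version` (with `some_extension.so` standing in for a real file) prints the newest glibc symbol version the file refers to. A system whose glibc is older than that version will generally fail to load it.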

Member

@njsmith left a comment

We need the PEP to be accepted before we can do much here, but here are a few comments.

We should figure out how we want to handle the transition, too... I guess given how unmaintainable the manylinux1 image is rapidly becoming, maybe we'll want to push that off into a branch and declare it unmaintained as soon as this is ready?

check_sha256sum epel-release-5-4.noarch.rpm $EPEL_RPM_HASH
# https://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
cp $MY_DIR/epel-release-6-8.noarch.rpm .
check_sha256sum epel-release-6-8.noarch.rpm $EPEL_RPM_HASH

# Dev toolset (for LLVM and other projects requiring C++11 support)
curl -fsSLO http://people.centos.org/tru/devtools-2/devtools-2.repo
check_sha256sum devtools-2.repo $DEVTOOLS_HASH
mv devtools-2.repo /etc/yum.repos.d/devtools-2.repo
Member

We should be switching to devtoolset-7. That could potentially be a followup PR though...

Author

Yes! devtoolset-2 is why this has the same version of gcc as manylinux1, about which you were rightly suspicious.

I think the change should be done in this PR, because upgrading to devtoolset-7 will bump the version of gcc et al. to 7.2.1, but I've advertised 4.8.2 in the PEP. Installing devtoolset-7 doesn't imply an upgrade to glibc, so I don't think it will result in incompatible executables, but I'll verify this with some additional tests.

Author

Unfortunately devtoolset-7 hasn't been built for i686. I can build it from scratch, much like the glibc patching, but I have concerns that it might inadvertently bump the libgcc version and thus require a change in the PEP. I'm still investigating, though.

Member

Building devtoolset-7 by hand seems like a lot of work. 32-bit linux users are extremely rare at this point – a quick check of bigquery says that in 2018, 0.7% of numpy's manylinux downloads were i686, and it's 0.2% for lxml, 0.8% for cryptography. (I'm spot-checking individual projects b/c I know they have both kinds of wheels, to avoid biases from projects that only have one or the other). If i686 has to use an earlier devtoolset it's probably fine, or if maintenance becomes too much of a hassle we could even drop support entirely.

Do you happen to know if any of the newer devtoolset releases target i686?


SELECT
  COUNT(*) AS downloads,
  REGEXP_EXTRACT(file.filename, r'.*-(manylinux.*)\.whl') AS manylinux_variant
FROM
  TABLE_DATE_RANGE([the-psf:pypi.downloads], TIMESTAMP("20180101"), CURRENT_TIMESTAMP())
WHERE
  file.project = 'cryptography'
GROUP BY
  manylinux_variant
ORDER BY
  downloads DESC
LIMIT
  1000

Copy link

Updating to devtoolset 7 will be nice. Do you anticipate a change to the C++ ABI version? To the best of my knowledge, devtoolset 7 is still forcing you to use the old C++ ABI and statically linking in part of libstdc++ so that CentOS 6 need not ship newer libstdc++.

Author

@njsmith It looks like it's not possible to build 32-bit artifacts on a 64-bit CentOS 6 image with devtoolset-7 either. What if we ship devtoolset-7 in the 64-bit image and leave the i686 image with devtoolset-2 from tru's repository? The compiled output should be identical between the two devtoolsets, except that devtoolset-2 probably produces slower code. That way we don't have to disadvantage projects like lepton with relatively popular 32-bit manylinux wheels or projects like cryptography and numpy whose manylinux downloads consist almost entirely of 64-bit wheels.

Member

devtoolset-2 also can't build all the things devtoolset-7 can – this is conda's issue, because they care about modern C++ support.

If you look at the spreadsheet again, I don't think I'd say lepton has "popular 32-bit manylinux wheels" – yeah, it's almost 30% of their manylinux downloads in 2018, but in absolute terms it's 2 downloads :-). (Versus ~15k for cryptography, ~7k for numpy, ~20k for cffi, etc.)

Anyway, I don't object to keeping 32-bit support around; it's just a question of whether the extra maintenance cost is worthwhile. I did just check and I don't see 32-bit builds for any of the newer devtoolset compilers either. If you want to do the work to set that up, and it's not too ugly to maintain, then go for it. Or an alternative would be to announce that we don't have the resources to maintain that, and see if anyone cares enough to step up...

Author

It's true that lepton isn't popular, but they get value out of manylinux and I don't want to cast them adrift if possible.

It's also true that none of the devtoolsets are built for i686 for some reason. I suspect that they're part of RHEL's paid offering.

I think it may be easy (in terms of build scripts, not computing resources) to build an i686 version of devtoolset-7 with mock.

But the easiest thing I can think of is to leave the i686 image with an older version of devtoolset and install devtoolset-7 on the x86_64 image. Most users would get a modern C++; people who don't need modern C++ but do want to build 32-bit artifacts -- in other words, the current users of manylinux1's i686 image -- could still do their thing; and anybody else could submit a PR :)

It's true that this would result in inconsistent build environments, but I don't think it will result in meaningfully inconsistent artifacts because of how the devtoolsets link stuff in.

Contributor

cryptography is down to 0.4% in the last 30 days for i686 wheels btw.

@@ -35,20 +35,11 @@ SQLITE_AUTOCONF_HASH=d7dd516775005ad87a57f428b6f86afd206cb341722927f104d3f0cf65f
# Dependencies for compiling Python that we want to remove from
# the final image after compiling Python
# GPG installed to verify signatures on Python source tarballs.
PYTHON_COMPILE_DEPS="zlib-devel bzip2-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel gpg"
PYTHON_COMPILE_DEPS="zlib-devel bzip2-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel"

# Libraries that are allowed as part of the manylinux1 profile
MANYLINUX1_DEPS="glibc-devel libstdc++-devel glib2-devel libX11-devel libXext-devel libXrender-devel mesa-libGL-devel libICE-devel libSM-devel ncurses-devel"
Member

This needs updating.

@@ -152,7 +144,7 @@ ln -s $PY36_BIN/auditwheel /usr/local/bin/auditwheel
# final image
yum -y erase wireless-tools gtk2 libX11 hicolor-icon-theme \
avahi freetype bitstream-vera-fonts \
${PYTHON_COMPILE_DEPS} > /dev/null 2>&1
${PYTHON_COMPILE_DEPS} #> /dev/null 2>&1
Member

The redirection is there because at some point Travis started freaking out at our logs being too long. Not sure if we want to comment it out or not...

Member

Ah, yeah, in fact it looks like the x86-64 build is failing because we need to hide the glibc build logs. (Did you know that Travis will kill any job that generates too many logs?)
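
A middle ground between always discarding the output and always printing it is to capture it and only show the end of the log when a command fails. The following is a rough sketch of that idea, not what the build scripts do; `quiet_run` is a hypothetical name, and the 50-line tail is an arbitrary choice:

```shell
# Hypothetical helper: run a command with its output captured in a
# temporary log, and print only the tail of that log if it fails.
quiet_run() {
    log=$(mktemp)
    status=0
    "$@" > "$log" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "Command failed with status $status; last 50 log lines:" >&2
        tail -n 50 "$log" >&2
    fi
    rm -f "$log"
    return "$status"
}
```

It could be used as `quiet_run yum -y erase ${PYTHON_COMPILE_DEPS}`. The command's exit status is passed through unchanged, so a failing step still fails the build.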

Author

markrwilliams commented Feb 3, 2018

To be fair, 4 MB is a lot of logs. Seems fair of Travis to assume some test fell into a loop.

What do you think about creating a new base image just for the patched glibc? I was reluctant to do it at first because I was worried introducing another intermediary would make it harder to reproduce the image. At the same time, we already rely on centos:6.9 staying available in Docker Hub, so maybe depending on centos-no-vsyscall instead isn't so bad? And, if Travis stops supporting vsyscall, then the x86_64 job will cease to build unless we switch to such a base image.

Also, it looks like Travis includes a Docker that's new enough to support multi-stage builds. I'll make the x86_64 Dockerfile build glibc in its own stage so the final image is a little smaller.

Author

I'll also mute rpmbuild for now - we can always move to another base image later.

Member

Making a base image for just that seems reasonable, if only to speed up the build by skipping rebuilding glibc all the time.

It looks like the docker build --squash option might make optimizing this stuff much simpler.

@markrwilliams
Author

@njsmith Now Travis is killing the job because it's not outputting enough. Looks like a new base image will be necessary.

What's the best way to build and deploy that base image? Its Dockerfile should be in this repo, so it will be easy for interested parties to reproduce themselves, but we won't be able to build and upload it to quay.io with Travis, which seems bad.

@markrwilliams
Author

What if I introduce a Dockerfile that builds the RPMs, asserts that each matches a corresponding hard-coded SHA-256 digest, then uploads them to some host from which the manylinux2 image can install them? If the manylinux2 image checks that the SHA-256 digests of the retrieved RPMs match the values from the builder, then we'd have an audit trail that might ameliorate reproducibility concerns introduced by removing the glibc rebuilding from the manylinux2 image.
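
The digest check described here might look roughly like the following on the consuming side. This is only a sketch: `verify_sha256` is a hypothetical name (the build scripts already use their own `check_sha256sum` helper), and it assumes `sha256sum` from coreutils is available.

```shell
# Hypothetical helper: fail unless FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
    file="$1"
    expected="$2"
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$actual" != "$expected" ]; then
        echo "SHA-256 mismatch for $file: expected $expected, got $actual" >&2
        return 1
    fi
}
```

The builder and the manylinux2 image would each call this with the same hard-coded digest for every RPM, so a changed or corrupted package stops the build on either side.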

The Dockerfile and supporting scripts for this live in docker/glibc/.
This cannot be built on Travis because it takes too long and emits too
many log messages.  Later patches should address reproducibility.
@markrwilliams
Author

I've moved patching glibc into its own Dockerfile rather than generate RPMs for installation into the x86_64 manylinux2 image for two reasons:

  1. The new centos-6.9-no-vsyscall image can be run on any host, regardless of vsyscall settings, so our use of Travis to build manylinux images should remain safe;
  2. People interested in reproducing the build can just use Docker instead of some other script; for example, we can eventually modify the manylinux2 image to depend on the digest of this image instead of a tag, so that Docker performs the audit instead of sha256sum.

I think 2) should assuage concerns about reproducibility, even though we won't be able to build the base image with Travis.

@njsmith
Member

njsmith commented Feb 6, 2018

If we can't use Travis to build the base image, then that's unfortunate, but not something we can fix. We probably still want to keep the base image Dockerfile and patches somewhere here, so even if someone has to perform any rebuilds manually, then at least they can do that. (And hopefully we can convince RH to apply the patch themselves, and drop our special base image entirely.)

(OTOH, is the problem with building the base image just the log output? If it otherwise works on Travis, we could use a ticker process that just prints "Still working..." every 30 seconds to keep Travis from killing the job.)
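
A ticker along those lines could be sketched as below. `run_with_ticker` is a hypothetical name, and this only addresses the no-output timeout, not the limit on total log size:

```shell
# Hypothetical helper: run a command while a background loop prints a
# heartbeat line every INTERVAL seconds.
run_with_ticker() {
    interval="$1"
    shift
    ( while sleep "$interval"; do echo "Still working..."; done ) &
    ticker_pid=$!
    status=0
    "$@" || status=$?
    # Stop the heartbeat and reap it before returning the command's status.
    kill "$ticker_pid" 2>/dev/null || true
    wait "$ticker_pid" 2>/dev/null || true
    return "$status"
}
```

Usage would be `run_with_ticker 30` followed by the long-running build command. The wrapped command's exit status is returned unchanged.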

@kaczmarj

The issue with Travis killing the job can probably be resolved with travis_wait.

For example,

travis_wait 30 docker build --rm -t quay.io/pypa/manylinux1_$PLATFORM:$TRAVIS_COMMIT -f docker/Dockerfile-$PLATFORM docker/

will allow the docker build command to take up to 30 minutes.

Jakub Kaczmarzyk and others added 2 commits February 21, 2018 11:04
Travis-CI will kill a job if it does not produce output for 10 minutes. `travis_wait N` will allow a job to run without producing output for up to `N` minutes.
fix: allow `docker build` to take up to 40 minutes
@kaczmarj

It seems like the x86_64 build failed because of a curl HTTP error. Can the build be restarted?

It might be useful to include --retry 5 in the curl commands to retry.
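
As a sketch of that suggestion, the existing `curl -fsSLO` invocations could go through a small wrapper. `fetch_with_retry` is a hypothetical name; `--retry` only retries errors that curl considers transient, such as timeouts and some 5xx responses:

```shell
# Hypothetical wrapper: download a URL into the current directory,
# retrying up to five times on transient errors.
fetch_with_retry() {
    curl -fsSLO --retry 5 "$1"
}
```

A call such as `fetch_with_retry http://people.centos.org/tru/devtools-2/devtools-2.repo` would then replace the bare `curl` line, with the checksum verification that follows left as it is.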

@ncoghlan
Member

PEP 571 has been accepted (defining the CentOS 6/glibc 2.12 based manylinux2010 ABI baseline), so this PR (or a new PR based on it) can move forward again now :)

@ncoghlan
Member

One thing I'll note: while building for manylinux1 is definitely becoming a hassle, it would still be desirable to default to that until installers start accepting manylinux2010 wheels.

That means it would be desirable to add manylinux2010 build environments, without losing the manylinux1 environments. Then the default choice of build environment can change later, once PyPI metrics indicate broad use of newer installer versions that include manylinux2010 support.

@njsmith
Member

njsmith commented Apr 14, 2018 via email

@ncoghlan
Member

ncoghlan commented Apr 14, 2018

The draft PR changes the existing Dockerfile entries, rather than (for example) adding a separate subdirectory for manylinux2010 with new image definition files in it. I'd expect the outcome of doing that to be that the 3rd party projects mentioned in #179 would implicitly start publishing manylinux2010 build environment images instead of manylinux1 images.

There are a variety of approaches we could take to manage that baseline evolution, but updating the existing files in-place isn't one of them.

@markrwilliams
Author

@ncoghlan I took that approach because the existing manylinux1 Docker images are increasingly hard to build and run. @reaperhulk fixed one issue and @geofft has an approach for another, but they're going to continue to bit rot. Are we prepared to shoulder the ongoing maintenance burden for manylinux1? If not, why keep it in master? It will remain in the Git tree in perpetuity.

What if we tag the current head of master in preparation for this landing?

@njsmith
Member

njsmith commented Apr 14, 2018

We could also keep the manylinux1 and manylinux2010 build scripts in separate branches. I guess it depends on how much shared code we think there'll be, and the extra CI cost of rebuilding both of them on every commit even if it only touched one of the variants. Hopefully the manylinux2010 scripts will be substantially simpler...

@markrwilliams
Author

markrwilliams commented Apr 14, 2018

> We could also keep the manylinux1 and manylinux2010 build scripts in separate branches. I guess it depends on how much shared code we think there'll be

@njsmith I'd like manylinux2010 to move all of build.sh to its Dockerfiles so that it can take advantage of layer caching. As you've pointed out before, --squash should address concerns around extraneous layers.

If manylinux2010 does this, there won't be much to share with manylinux1.

@njsmith
Member

njsmith commented Apr 14, 2018

> move all of build.sh to its Dockerfiles

...I have complicated feelings about this, but I guess we can defer that discussion for now :-). It does serve as an example that the two sets of Dockerfiles might be diverging quite a bit.

@ncoghlan
Member

The core of the problem is that it's going to be (at least) months before manylinux2010 wheels are actually usable for a significant proportion of users (since they're going to need a sufficiently recent version of pip for it to accept them), and there's going to be an extended period where publishers have to make a choice between "easier builds targeting a more modern baseline" and "building artifacts that users can actually install".

The manylinux1 images being so hard to maintain and use means it makes sense to push for a relatively aggressive migration timeline, but even "by the end of 2018" would likely be quite ambitious.

@njsmith
Member

njsmith commented Apr 15, 2018

The images themselves aren't going anywhere. The manylinux1 builds have been broken for extended periods, and probably will be again, but it doesn't really affect most people because they're fetching the pre-built images.

Actually, that's probably the winning argument: we should maintain the manylinux1 and manylinux2010 images in different branches so that when one is broken, it doesn't block merging fixes to the other.

@dolang

dolang commented Apr 15, 2018

Hey,

I wasn't aware this was still moving forward and as I've recently started to try building wheels for the Kivy project, I've actually worked on this myself in the meantime.

I would've published it yesterday, but then all of a sudden my build broke due to the new pip. That's fixed now, too. So if anyone's interested, the scripts in my forked repo actually build the images:

https://github.com/dolang/manylinux

See also: https://mail.python.org/pipermail/wheel-builders/2018-April/000321.html

@ncoghlan
Member

Checking the CI build projects already identified in the tracking issue, it looks like they'll all be fine as long as the images are clearly separated:

Given that, I think a maintenance branch model is likely to work pretty well for this repo: have master track whatever the latest version of the manylinux spec happens to be, and create maintenance branches for the old images if they end up needing to be rebuilt for some reason.

@ofek

ofek commented Aug 27, 2018

Any progress on this?

@safijari

What work is remaining for this? How can I help?

@takluyver
Member

From the tracking issue, manylinux2010 support is now in auditwheel, warehouse and (the next release of) pip. So I think that means that the docker images are the last piece needed to allow practical distribution of manylinux2010 wheels.

(I'd offer to help, but I don't think I've got the time or relevant knowledge to finish it off, so it would be an empty gesture. I'm grateful to everyone working on this, and I'm not demanding that you give up more of your free time. :-)

@henriquegemignani

I have a branch where I picked this branch and rebased it on top of master:
https://github.com/henriquegemignani/manylinux/tree/manylinux2

There's a second branch where I installed devtoolset-7 from SCL, as commented here
https://github.com/henriquegemignani/manylinux/tree/manylinux2-devtoolset-7

I've published a devtoolset-7 x86-64 image to Docker Hub: https://hub.docker.com/r/henriquegemignani/manylinux/
Here's a project I manage that uses this image for building and publishing manylinux2010 wheels:
https://travis-ci.com/henriquegemignani/py-nod/jobs/162157921
https://pypi.org/project/nod/#files

@takluyver
Member

Thanks @henriquegemignani - do you want to make a pull request from your branch so people can easily see the changes?

(See also #182 which is/was working towards manylinux2010 support as well)

@dralley

dralley commented Jan 4, 2019

@henriquegemignani @dolang As an onlooker, I'm deeply confused as to which PR is the "real" and most up to date one. This PR is linked by the tracking issue, but there's also #252 and #182 which both (I think?) forked this branch, rebased and did other miscellaneous things to it. I have no idea where the development is supposed to be happening or what branch to use if I wanted to try out building a manylinux2010 wheel (as I've got a package which cannot be built on manylinux1 due to a too-old glibc version -- I want to ship compiled wheels but I'm currently limited to sdists due to this issue).

Can someone explain the current state of this feature? If this PR is obsoleted, could the tracking issue be updated to point to a new one? #179

@mayeut mayeut mentioned this pull request Mar 31, 2019
3 tasks
@mayeut
Member

mayeut commented Apr 12, 2019

Superseded by #279 that's been merged-in.
@markrwilliams, I think all your patches were included by @dolang then me but if not, feel free to open another PR or tell me so that I can include them. Thanks a lot for your work!

@mayeut mayeut closed this Apr 12, 2019