Manylinux2 #152
CentOS 6.9 includes a new enough OpenSSL (1.0.1e) to allow curl to connect to HTTPS URLs. The official 32-bit Docker image does not include the util-linux-ng package that contains linux32. Without setting a 32-bit personality, yum will download packages for the host's architecture. Python 2.6 is included, however, so include a Python implementation of setarch(8) to enable bootstrapping.
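For readers unfamiliar with what a setarch(8) replacement has to do, here is a minimal sketch. It is not the script from this PR: the function names are hypothetical, it assumes a Linux host whose libc exposes `personality(2)`, and it is written for a current Python rather than the Python 2.6 shipped in the image. `PER_LINUX32` is the value from `<linux/personality.h>`.

```python
import ctypes
import os

PER_LINUX32 = 0x0008         # 32-bit personality, from <linux/personality.h>
QUERY_PERSONA = 0xFFFFFFFF   # passing this value only reports the current persona


def set_personality(persona):
    """Call personality(2); return the previous persona or raise OSError."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int
    previous = libc.personality(persona)
    if previous == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return previous


def linux32(argv):
    """Switch to the 32-bit personality, then replace this process with argv.

    Hypothetical helper: after the switch, uname(2) reports i686, so a tool
    such as yum picks 32-bit packages.
    """
    set_personality(PER_LINUX32)
    os.execvp(argv[0], argv)
```

Something like `linux32(["yum", "-y", "install", "util-linux-ng"])` would then run yum under the 32-bit personality; the exact bootstrap command is an assumption here.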
Very nice work, @markrwilliams! Out of curiosity: why CentOS 6, not 7? The latter has a longer lifetime, which means we will need to visit manylinux3 much later in the future, but I'm guessing the former was chosen for maximal backwards-compatibility?
@trishankatdatadog Looks like my email(s) have made it to distutils-sig: https://mail.python.org/pipermail/distutils-sig/2018-February/031944.html I'll be happy to answer any questions there!
@trishankatdatadog the answer is pretty simple – stuff built on centos 6 runs on centos 7 but not vice-versa, and people still run centos 6/rhel 6, so if you want your wheels to work everywhere then you have to build them on centos 6.
We need the PEP to be accepted before we can do much here, but here's a few comments.
We should figure out how we want to handle the transition, too... I guess given how unmaintainable the manylinux1 image is rapidly becoming, maybe we'll want to push that off into a branch and declare it unmaintained as soon as this is ready?
check_sha256sum epel-release-5-4.noarch.rpm $EPEL_RPM_HASH
# https://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
cp $MY_DIR/epel-release-6-8.noarch.rpm .
check_sha256sum epel-release-6-8.noarch.rpm $EPEL_RPM_HASH

# Dev toolset (for LLVM and other projects requiring C++11 support)
curl -fsSLO http://people.centos.org/tru/devtools-2/devtools-2.repo
check_sha256sum devtools-2.repo $DEVTOOLS_HASH
mv devtools-2.repo /etc/yum.repos.d/devtools-2.repo
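The definition of `check_sha256sum` is not shown in this hunk. As a rough illustration of what such a check does, here is a sketch in Python; the function names and the error behaviour are assumptions, not the build scripts' actual implementation.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sha256sum(path, expected_hex):
    """Raise ValueError unless the file's digest matches the pinned value."""
    actual = sha256_of(path)
    if actual != expected_hex.strip().lower():
        raise ValueError(
            "SHA-256 mismatch for %s: expected %s, got %s"
            % (path, expected_hex, actual)
        )
```

Pinning the digest next to the download means a changed or tampered file fails the build instead of silently ending up in the image.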
We should be switching to devtoolset-7. That could potentially be a followup PR though...
Yes! devtoolset-2 is why this has the same version of gcc as manylinux1, about which you were rightly suspicious.
I think the change should be done in this PR, because upgrading to devtoolset-7 will bump the version of gcc et al. to 7.2.1, but I've advertised 4.8.2 in the PEP. Installing devtoolset-7 doesn't imply an upgrade to glibc, so I don't think it will result in incompatible executables, but I'll verify this with some additional tests.
Unfortunately devtoolset-7 hasn't been built for i686. I can build it from scratch, much like the glibc patching, but I have concerns that it might inadvertently bump the libgcc version and thus require a change in the PEP. I'm still investigating, though.
Building devtoolset-7 by hand seems like a lot of work. 32-bit linux users are extremely rare at this point – a quick check of bigquery says that in 2018, 0.7% of numpy's manylinux downloads were i686, and it's 0.2% for lxml, 0.8% for cryptography. (I'm spot-checking individual projects b/c I know they have both kinds of wheels, to avoid biases from projects that only have one or the other). If i686 has to use an earlier devtoolset it's probably fine, or if maintenance becomes too much of a hassle we could even drop support entirely.
Do you happen to know if any of the newer devtoolset releases target i686?
SELECT
COUNT(*) AS downloads,
REGEXP_EXTRACT(file.filename, r'.*-(manylinux.*)\.whl') AS manylinux_variant
FROM
TABLE_DATE_RANGE([the-psf:pypi.downloads], TIMESTAMP("20180101"), CURRENT_TIMESTAMP())
WHERE
file.project = 'cryptography'
GROUP BY
manylinux_variant
ORDER BY
downloads DESC
LIMIT
1000
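The query groups downloads by the platform tag captured from the wheel filename. A small Python sketch of the same extraction and the share calculation follows; it reuses the regular expression from the query, while the function names and sample filenames are made up for illustration.

```python
import re
from collections import Counter

# Same pattern as the REGEXP_EXTRACT call in the query above.
VARIANT_RE = re.compile(r".*-(manylinux.*)\.whl")


def manylinux_variant(filename):
    """Return the manylinux platform tag of a wheel filename, or None."""
    match = VARIANT_RE.match(filename)
    return match.group(1) if match else None


def variant_shares(filenames):
    """Map each manylinux variant to its fraction of manylinux downloads."""
    counts = Counter(
        variant
        for variant in (manylinux_variant(name) for name in filenames)
        if variant is not None
    )
    total = sum(counts.values())
    return {variant: count / total for variant, count in counts.items()}
```

Files that are not manylinux wheels (sdists, other platforms) are left out of the denominator, which matches quoting i686 as a percentage of manylinux downloads rather than of all downloads.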
Updating to devtoolset 7 will be nice. Do you anticipate a change to the C++ ABI version? To the best of my knowledge, devtoolset 7 is still forcing you to use the old C++ ABI and statically linking in part of libstdc++ so that CentOS 6 need not ship newer libstdc++.
@markrwilliams looks like the SCL repo is GPG-signed, should be OK: https://copr.fedorainfracloud.org/coprs/rhscl/centos-release-scl/repo/epel-6/rhscl-centos-release-scl-epel-6.repo
sudo yum install centos-release-scl
@njsmith It looks like it's not possible to build 32-bit artifacts on a 64-bit CentOS 6 image with devtoolset-7 either. What if we ship devtoolset-7 in the 64-bit image and leave the i686 image with devtoolset-2 from tru's repository? The compiled output should be identical between the two devtoolsets, except that devtoolset-2 probably produces slower code. That way we don't have to disadvantage projects like lepton with relatively popular 32-bit manylinux wheels or projects like cryptography and numpy whose manylinux downloads consist almost entirely of 64-bit wheels.
devtoolset-2 also can't build all the things devtoolset-7 can – this is conda's issue, because they care about modern C++ support.
If you look at the spreadsheet again, I don't think I'd say lepton has "popular 32-bit manylinux wheels" – yeah, it's almost 30% of their manylinux downloads in 2018, but in absolute terms it's 2 downloads :-). (Versus ~15k for cryptography, ~7k for numpy, ~20k for cffi, etc.)
Anyway, I don't object to keeping 32-bit support around; it's just a question of whether the extra maintenance cost is worthwhile. I did just check and I don't see 32-bit builds for any of the newer devtoolset compilers either. If you want to do the work to set that up, and it's not too ugly to maintain, then go for it. Or an alternative would be to announce that we don't have the resources to maintain that, and see if anyone cares enough to step up...
It's true that lepton isn't popular, but they get value out of manylinux and I don't want to cast them adrift if possible.
It's also true that none of the devtoolsets are built for i686 for some reason. I suspect that they're part of RHEL's paid offering.
I think it may be easy (in terms of build scripts, not computing resources) to build an i686 version of devtoolset-7 with mock.
But the easiest thing I can think of is to leave the i686 image with an older version of devtoolset and install devtoolset-7 on the x86_64 image. Most users would get a modern C++; people who don't need modern C++ but do want to build 32-bit artifacts -- in other words, the current users of manylinux1's i686 image -- could still do their thing; and anybody else could submit a PR :)
It's true that this would result in inconsistent build environments, but I don't think it will result in meaningfully inconsistent artifacts because of how the devtoolsets link stuff in.
cryptography is down to 0.4% in the last 30 days for i686 wheels btw.
@@ -35,20 +35,11 @@ SQLITE_AUTOCONF_HASH=d7dd516775005ad87a57f428b6f86afd206cb341722927f104d3f0cf65f
# Dependencies for compiling Python that we want to remove from
# the final image after compiling Python
# GPG installed to verify signatures on Python source tarballs.
PYTHON_COMPILE_DEPS="zlib-devel bzip2-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel gpg"
PYTHON_COMPILE_DEPS="zlib-devel bzip2-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel"

# Libraries that are allowed as part of the manylinux1 profile
MANYLINUX1_DEPS="glibc-devel libstdc++-devel glib2-devel libX11-devel libXext-devel libXrender-devel mesa-libGL-devel libICE-devel libSM-devel ncurses-devel"
This needs updating.
docker/build_scripts/build.sh
Outdated
@@ -152,7 +144,7 @@ ln -s $PY36_BIN/auditwheel /usr/local/bin/auditwheel
# final image
yum -y erase wireless-tools gtk2 libX11 hicolor-icon-theme \
avahi freetype bitstream-vera-fonts \
${PYTHON_COMPILE_DEPS} > /dev/null 2>&1
${PYTHON_COMPILE_DEPS} #> /dev/null 2>&1
The redirection is there because at some point Travis started freaking out at our logs being too long. Not sure if we want to comment it out or not...
Ah, yeah, in fact it looks like the x86-64 build is failing because we need to hide the glibc build logs. (Did you know that Travis will kill any job that generates too much log output?)
To be fair, 4 MB is a lot of logs. Seems fair of Travis to assume some test fell into a loop.
What do you think about creating a new base image just for the patched glibc? I was reluctant to do it at first because I was worried introducing another intermediary would make it harder to reproduce the image. At the same time, we already rely on centos:6.9 staying available in Docker Hub, so maybe depending on centos-no-vsyscall instead isn't so bad? And, if Travis stops supporting vsyscall, then the x86_64 job will cease to build unless we switch to such a base image.
Also, it looks like Travis includes a Docker that's new enough to support multi-stage builds. I'll make the x86_64 Dockerfile build glibc in its own stage so the final image is a little smaller.
I'll also mute rpmbuild for now - we can always move to another base image later.
Making a base image for just that seems reasonable, if only to speed up the build by skipping rebuilding glibc all the time.
It looks like the docker build --squash option might make optimizing this stuff much simpler.
MAINTAINER is also deprecated, so that's replaced with LABEL.
@njsmith Now Travis is killing the job because it's not outputting enough. Looks like a new base image will be necessary. What's the best way to build and deploy that base image? Its Dockerfile should be in this repo, so it will be easy for interested parties to reproduce themselves, but we won't be able to build and upload it to quay.io with Travis, which seems bad.
What if I introduce a Dockerfile that builds the RPMs, asserts that each matches a corresponding hard-coded SHA-256 digest, then uploads them to some host from which the |
The Dockerfile and supporting scripts for this live in docker/glibc/. This cannot be built on Travis because it takes too long and emits too many log messages. Later patches should address reproducibility.
3bdb01e
to
8b61bcb
Compare
I've moved patching
I think 2) should assuage concerns about reproducibility, even though we won't be able to build the base image with Travis. |
If we can't use Travis to build the base image, then that's unfortunate, but not something we can fix. We probably still want to keep the base image Dockerfile and patches somewhere here, so even if someone has to perform any rebuilds manually, then at least they can do that. (And hopefully we can convince RH to apply the patch themselves, and drop our special base image entirely.) (OTOH, is the problem with building the base image just the log output? If it otherwise works on Travis, we could use a ticker process that just prints "Still working..." every 30 seconds to keep Travis from killing the job.)
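The ticker idea can be sketched in a few lines of Python. This is only an illustration of the approach, not something from the PR: `run_with_heartbeat` is a hypothetical name, and the wrapped command is whatever long-running, quiet build step needs to stay alive on CI.

```python
import subprocess
import threading


def run_with_heartbeat(cmd, interval=30.0, message="Still working..."):
    """Run cmd, printing a heartbeat line every `interval` seconds.

    Returns the command's exit status.
    """
    done = threading.Event()

    def tick():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop prints until the command has finished.
        while not done.wait(interval):
            print(message, flush=True)

    ticker = threading.Thread(target=tick, daemon=True)
    ticker.start()
    try:
        return subprocess.call(cmd)
    finally:
        done.set()
        ticker.join()
```

A call such as `run_with_heartbeat(["docker", "build", "."])` would keep emitting output while the build's own logs are redirected elsewhere; the command shown is only an example.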
The issue with Travis killing the job can probably be resolved with For example,
will allow the |
Travis-CI will kill a job if it does not produce output for 10 minutes. `travis_wait N` will allow a job to run without producing output for up to `N` minutes.
fix: allow `docker build` to take up to 40 minutes
It seems like the x86_64 build failed because of a It might be useful to include |
PEP 571 has been accepted (defining the CentOS 6/glibc 2.12 based |
One thing I'll note: while building for That means it would be desirable to add manylinux2010 build environments, without losing the manylinux1 environments. Then the default choice of build environment can change later, once PyPI metrics indicate broad use of newer installer versions that include |
We don't have much say over what environment projects use -- there isn't
really a "default". We can make suggestions of course.
…On Fri, Apr 13, 2018 at 6:41 PM, Nick Coghlan ***@***.***> wrote:
One thing I'll note: while building for manylinux1 is definitely becoming
a hassle, it would still be desirable to default to that until installers
start accepting manylinux2010 wheels.
That means it would be desirable to *add* manylinux2010 build
environments, without losing the manylinux1 environments. Then the default
choice of build environment can change later, once PyPI metrics indicate
broad use of newer installer versions that include manylinux2010 support.
--
Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>
The draft PR changes the existing There are a variety of approaches we could take to manage that baseline evolution, but updating the existing files in-place isn't one of them. |
@ncoghlan I took that approach because the existing What if we tag the current head of master in preparation for this landing? |
We could also keep the |
@njsmith I'd like If |
...I have complicated feelings about this, but I guess we can defer that discussion for now :-). It does serve as an example that the two sets of Dockerfiles might be diverging quite a bit. |
The core of the problem is that it's going to be (at least) months before The |
The images themselves aren't going anywhere. The manylinux1 builds have been broken for extended periods, and probably will be again, but it doesn't really affect most people because they're fetching the pre-built images. Actually, that's probably the winning argument: we should maintain the manylinux1 and manylinux2010 images in different branches so that when one is broken, it doesn't block merging fixes to the other.
Hey, I wasn't aware this was still moving forward and as I've recently started to try building wheels for the Kivy project, I've actually worked on this myself in the meantime. I would've published it yesterday, but then all of a sudden my build broke due to the new pip. That's fixed now, too. So if anyone's interested, the scripts in my forked repo actually build the images: https://github.com/dolang/manylinux See also: https://mail.python.org/pipermail/wheel-builders/2018-April/000321.html
Checking the CI build projects already identified in the tracking issue, it looks like they'll all be fine as long as the images are clearly separated:
Given that, I think a maintenance branch model is likely to work pretty well for this repo: have |
Any progress on this?
What work is remaining for this? How can I help?
From the tracking issue, manylinux2010 support is now in auditwheel, warehouse and (the next release of) pip. So I think that means that the docker images are the last piece needed to allow practical distribution of manylinux2010 wheels. (I'd offer to help, but I don't think I've got the time or relevant knowledge to finish it off, so it would be an empty gesture. I'm grateful to everyone working on this, and I'm not demanding that you give up more of your free time. :-)
I have a branch where I picked this branch and rebased it on top of master: There's a second branch where I installed devtoolset-7 from SCL, as commented here I've published a devtoolset-7 x86-64 image to Docker Hub: https://hub.docker.com/r/henriquegemignani/manylinux/
Thanks @henriquegemignani - do you want to make a pull request from your branch so people can easily see the changes? (See also #182 which is/was working towards manylinux2010 support as well)
@henriquegemignani @dolang As an onlooker, I'm deeply confused as to which PR is the "real" and most up to date one. This PR is linked by the tracking issue, but there's also #252 and #182 which both (I think?) forked this branch, rebased and did other miscellaneous things to it. I have no idea where the development is supposed to be happening or what branch to use if I wanted to try out building a manylinux2010 wheel (as I've got a package which cannot be built on manylinux1 due to a too-old glibc version -- I want to ship compiled wheels but I'm currently limited to sdists due to this issue). Can someone explain the current state of this feature? If this PR is obsoleted, could the tracking issue be updated to point to a new one? #179
Superseded by #279 that's been merged-in.
Initial draft of a manylinux2 Docker image based on CentOS 6.9. There's a lengthy comment that attempts to justify patching glibc.