-
-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify ABI for pantsbuild.pants wheel and build with both UCS2 and UCS4 #7235
Specify ABI for pantsbuild.pants wheel and build with both UCS2 and UCS4 #7235
Conversation
This is going to be necessary to release Py3 with abi3. The line however results in the issue that this PR is going to aim to fix: now Python 2 will be built with abi `cp27m` or `cp27mu`, whereas earlier it was `none`. To test, try running `./pants setup-py --run="bdist_wheel --python-tag cp27 --plat-name=linux_x86_64" src/python/pants:pants-packaged` followed by `ls -l dist/pantsbuild.pants-1.14.0rc0/dist/`. Note also that ext_modules is deprecated in favor of distutils.Extension. We use this now as a temporary workaround until we add support for Extension.
We must now build pantsbuild.pants with both unicode versions for Py2. So, we introduce Py2 to do this. We use Pyenv to install interpreter with the relevant encoding where necessary. See https://stackoverflow.com/questions/38928942/build-python-as-ucs-4-via-pyenv.
Our native code strongly assumes UTF-8 everywhere... I don't really know much about python publishing, but I would assume that means we should only be publishing a |
I think we're all a bit light on the details here. Suffice it to say that a 2.7 interpreter built with ucs2 still handles encoding and decoding utf-8 strings, at least for simple test cases. I'm pretty sure we need to support both. The need comes from the fact that we are a bad python citizen. You cannot grab our sdist and build, we have to have burned the shared library you need or you're out of luck and are forced to clone our repo and hand-build. |
Yes, agreed with @jsirois that UCS2 surprisingly works even though we assume UTF-8 everywhere. We know this because Python 2 on macOS defaults to UCS2 when using system Python or installed via Pyenv—the user must go out of their way to set |
I can't reproduce the same failure locally. John was suspicious why for the problematic dependencies the sdist isn't being used to build the wheel when the bdist is not released for cp27mu. Hopefully this provides some insight.
The folder path is py27, not py2!
The shard was failing when trying to build cryptography from an sdist because it could not find openssl. So, we now explicitly install it with Brew and modify the env vars to expose it. This is identical to how we install Py3 on OSX.
Make it more explicit how shard is configured / which wheel building config it has. Whereas for most shards we specify if they run with Py36 vs Py27 in parantheses, it is actually very important we make explicit the wheel building config, as it impacts which wheels we end up producing.
PEX_VERBOSE only impacts runtime output. -vvv... impacts build time output.
We now have to set up this env var 4 distinct times. With pantsbuild#7235, we will have to again use the env var in order to install Python 2 with UCS4. So, this code needs to be de-duplicated. We keep the name generic to any Python interpreter because it's really about Pyenv, not Python 3.
For Linux UCS2, the build was failing due to gcc complaining it could not find any files. This reproduced locally when running `./pants3 setup-py --run="bdist_wheel --py-limited-api cp36" src/python/pants:pants-packaged` on OSX. John suggested and gave the code snippet to pass a dummy file so this no longer happens. Thanks John!
We can't use the PEX from AWS because the Python versions do not match up.
This reverts commit edf81ef.
This reverts commit 28b9e8b.
Realized this is a better design while working on pantsbuild#7261.
Change how we handle env var to not use PEX when first bootstrapping, then use it in the followup release.sh command.
Use the {osx,linux}_config images rather than {osx,linux}_test_config images.
They don't exist apparently.
Now that we don't it in the base_build_wheels_env, we don't need to set this. This was actually causing a failure. ./pants only checks if the env var is set, and not what its value is.
- OSX UCS2 shard no longer was setting RUN_PANTS_FROM_PEX anywhere - Linux UCS4 had its before_script entry being override by travis_image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for banging away at this Eric! This was just a quick skim while I had a minute.
@@ -0,0 +1,77 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this script was helpful when you were debugging this PR. You're also sure it's worth its weight going forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This script is meant above all to act as an integration test / check for regressions. The most likely scenario where it will catch us from a regression is that we change how we handle interpreter constraints or resolve $PY
, and that that change results in this no longer working as expected. Other scenarios could be changes made to .travis.yml
or the Dockerfile breaking things.
This is why I productionized it. The original was an inline command in .travis.yml
with a series of &&
and ;
statements, and then I realized this would be useful to have permanently as a sanity check and that we can even use it to verify abi3
is setup correctly when building Py3 wheels.
At the cost of this safety / regression test, we have more code and about 1 extra second to each of the wheel building shards. Output from running script with time
on my mac:
$ time ./build-support/bin/check_pants_pex_abi.sh cp27m
0.47s user 0.94s system 93% cpu 1.508 total
Fine with deleting it if you all prefer but I think we get more benefit than harm here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the above I am in favor of keeping this script until we have a test that exercises actually resolving each of the built wheels and running them on a python interpreter with the given abi.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excited for this!
@@ -381,41 +372,113 @@ cargo_audit: &cargo_audit | |||
# Build wheels | |||
# ------------------------------------------------------------------------- | |||
|
|||
# N.B. With Python 2, we must build pantsbuild.pants with both UCS2 and UCS4 to provide full |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good comment!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can there be a test verifying that all of these wheels we're building can actually be resolved and used correctly with all of the abis? If it requires having a python interpreter with that abi, is that something we can set up another docker image(s) to do?
@@ -0,0 +1,77 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the above I am in favor of keeping this script until we have a test that exercises actually resolving each of the built wheels and running them on a python interpreter with the given abi.
Much more readable and less hacky now! I was only using Bash because I had started with Bash and because almost all of our scripts are in Bash so I thought there was precedent. But Python is a more natural fit.
We really want to check two things: 1) that the interpreter is available on the PATH, and 2) that they are using python2.7, as the script does not (yet) work with Python 3. Rather than muddling both these two tests into one command, instead split them out into distinct checks with very explicit error messages. This will also make it easier to ensure they are using only 2.7 or 3.6 to release, once that PR goes through, as we don't want to allow <2.7, 3.0<=x<=3.5, or 3.7. Thanks Benjy!
Explain why we need dummy.c and why we set ext_modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great reviews! This is ready for re-review.
We repeat this code a couple times and it's pretty finicky. Duplicate it out. Props to Danny for finding a way to get this working with multiline strings!
Danny found a way to make this work when the template is used for a key entry (i.e. with -) and is not linked to an anchor!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feeling really positive about the templating and the python scripting here!
@@ -231,14 +216,15 @@ base_linux_build_engine: &base_linux_build_engine | |||
<<: *travis_docker_image | |||
stage: *bootstrap | |||
script: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit is awesome!! Especially allowing parameterizing the image name! Thanks a ton for figuring this out!!!
# that we correctly distinguish when using Python 2 between interpreters built with UCS2 vs. UCS4 | ||
# (see https://www.python.org/dev/peps/pep-0513/#ucs-2-vs-ucs-4-builds). | ||
# TODO(7344): the tuple syntax for ext_modules is deprecated. Use Extension once we support it. | ||
ext_modules=[('native_engine', {'sources': ['src/pants/dummy.c']})], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an annoying workaround with a lot of extra work required just to set the ABI, but if it continues to work then it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely a lot of extra work 😁 As highlighted in the updated PR description, getting us to mark the new ABI is actually not much work—it's just this file and dummy.c
. The real work is adding the new shards to build Linux UCS2 and OSX UCS4 to ensure we don't lose support for any Pants users.
A lot of these changes are also changes that would be good to make anyways, such as the changes to release.sh
.
--
One step closer to Python 3 release! This PR does a substantial portion of the prework.
Also fix some release-related documentation.
…omments (#7351) ### Problem #7235 introduced an initial deduplication of launching docker images, notably allowing the image name to be parametrized. While working on #7197, it became evident how frequently the same code was being repeated for calling `docker run`. The only thing that changes between these invocations is what goes inside `sh -c "my bash command"`. This duplication makes our Travis code harder to understand, along with the usual arguments for DRY. ### Solution Introduce a new `docker_run_image` mustache template that takes the parameters `$docker_image_name` and `$docker_run_command`. Also changed: * rename `launch_docker_image` -> `docker_build_image` * remove script entry for `&travis_docker_image`, as it gets overridden every time by its caller so is unnecessary and confusing. * introduce Mustache comments to some of our templates. These don't get copied over into `.travis.yml`, which is good for reducing noise, while still allowing us to explain the concept in the template. ### Result CI should perform the same with the benefit of a shorter and better abstracted `travis.yml.mustache`.
…7393) ### Problem After #7235, we began building wheels for multiple ABIs and platforms, but when building a pex, we were allowing `pex` to choose which ABI to use in the output `pex`, which could result in an unusable output pex: see #7383. ### Solution Explicitly include all ABIs when constructing a pex with `-p`. ### Result Released and nightly pexes will be larger, but much more compatible. Fixes #7383.
Problem
We should be marking the ABI (application binary interface) for the
pantsbuild.pants
wheel because it uses native code. Currently, we mark the ABI asnone
, which is incorrect per https://www.python.org/dev/peps/pep-0513/#ucs-2-vs-ucs-4-builds.In particular, in Python 2, Python may be installed with either UCS2 (UTF-16) or UCS4 (UTF-8). We should be marking the wheel as either
cp27m
for UCS2 orcp27mu
for UCS4.As a result of marking the ABI, we must now produce more wheels. macOS defaults to UCS2. For Linux, "ucs4 is much more widespread among Linux CPython distributions." We do not want to rely on these assumptions, however, when releasing, as some users may not have these default unicode settings. So, instead we must release
pantsbuild.pants
as both acp27m
andcp27mu
wheel, and rely on Pip to resolve which the user should use.Solution
At a high level, this PR does two things:
none
.cp27m
andcp27mu
wheel for both Linux and OSX. See https://travis-ci.org/pantsbuild/pants/builds/503639333 for the end result of this.To setup the new shards, we use Pyenv to install new versions of Python 2 with the appropriate unicode settings, thanks to the env var
PYTHON_CONFIGURE_OPTS=--enable-unicode=ucs{2,4}
(https://stackoverflow.com/a/38930764).Because both the OSX UCS4 shard and Linux UCS2 shard already have Python 2.7 installed, we must install a Python 2.7.x version different than what is already there, and use
PANTS_PYTHON_SETUP_INTERPRETER_CONSTRAINTS
to ensure Pants and PEX are using the exact interpreter we want. For this reason, this PR was blocked by #7285 to propagate interpreter constraints to PEX.Specifically, we make these changes to achieve these two high level goals:
src/python/pants/BUILD
so thatbdist_wheel
knows it needs to mark the ABI. This achieves goal 1.release.sh
to allow pre-setting$PY
and to also setPANTS_PYTHON_SETUP_INTERPRETER_CONSTRAINTS
in order to use the specific Python interpreter we are targeting.travis_ci_py27_ucs2/Dockerfile
to get Python 2 w/ UCS2 onto Linux..travis.yml
env
entries to get OpenSSL and Homebrew-installed Python working properly, along with launching a Docker image, in order to avoid duplication:env_osx_with_pyenv.mustache
,launch_docker_image.mustache
, andgenerate_travis_yml.py
.travis.yml.mustache
to set up the 4 distinct wheel building shards. Similar to how we created a Dockerfile to use Pyenv to install Python 2 with UCS2 on Linux, we use Pyenv to install Python 2 with UCS4 on OSX. We also move the wheel building shards below unit tests.Ensuring the correct abi is used
We need to ensure the
pants.pex
used byrelease.sh
has the correct abi for its dependencies, and thatrelease.sh
is using the correct Python interpreter. We introduce a new scriptcheck_pants_pex_abi.py
that inspects the pex'sPEX-INFO
to ensure the targeted abi was used.An even better test would test the result of
release.sh
to ensure the builtpantsbuild.pants
wheel has the correct ABI and can be consumed properly. Currentlyrelease.sh
verifies the wheel is valid, but it does not enforce which ABI it was built with. This could be a good followup PR.Result
We now properly mark the ABI for Python 2. Beyond the new script
check_pants_pex_abi.sh
proving this, we performed a run of this PR with verbose PEX logging turned on: https://travis-ci.org/pantsbuild/pants/builds/503639333. Inspecting the logs for the wheel building shards and searching forUsing the current platform
proves the 4 wheel building shards are using the correct interpreter and abi.In addition to correctness for Python 2, this unblocks releasing Python 3 wheels (#7197).
Note this should have no significant impact on the end user, as Pip will resolve to the current ABI for their interpreter. It will change the name of our
pantsbuild.pants
wheel and will prevent using that wheel with an interpreter that uses a different UCS setting, but all users should be able to pull down whichever wheel they need as we provide wheels for both UCS2 and UCS4 on both OSX and Linux.Downside: wheel building explosion
We currently are building more wheels than necessary. For wheels that are universal / platform-independent, we only need them to be built once, but we build them every time. See #7258.
This PR adds two new shards so adds ~30 unnecessary core wheels we build, in addition to 3rd party wheels that are universal / platform-independent.