Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify ABI for pantsbuild.pants wheel and build with both UCS2 and UCS4 #7235

Merged

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Feb 12, 2019

Problem

We should be marking the ABI (application binary interface) for the pantsbuild.pants wheel because it uses native code. Currently, we mark the ABI as none, which is incorrect per https://www.python.org/dev/peps/pep-0513/#ucs-2-vs-ucs-4-builds.

In particular, in Python 2, Python may be installed with either UCS2 (UTF-16) or UCS4 (UTF-8). We should be marking the wheel as either cp27m for UCS2 or cp27mu for UCS4.

As a result of marking the ABI, we must now produce more wheels. macOS defaults to UCS2. For Linux, "ucs4 is much more widespread among Linux CPython distributions." We do not want to rely on these assumptions, however, when releasing, as some users may not have these default unicode settings. So, instead we must release pantsbuild.pants as both a cp27m and cp27mu wheel, and rely on Pip to resolve which the user should use.

Solution

At a high level, this PR does two things:

  1. Marks that the ABI should be specified, rather than none.
  2. Sets up 4 Travis shards so that we build both a cp27m and cp27mu wheel for both Linux and OSX. See https://travis-ci.org/pantsbuild/pants/builds/503639333 for the end result of this.

To setup the new shards, we use Pyenv to install new versions of Python 2 with the appropriate unicode settings, thanks to the env var PYTHON_CONFIGURE_OPTS=--enable-unicode=ucs{2,4} (https://stackoverflow.com/a/38930764).

Because both the OSX UCS4 shard and Linux UCS2 shard already have Python 2.7 installed, we must install a Python 2.7.x version different than what is already there, and use PANTS_PYTHON_SETUP_INTERPRETER_CONSTRAINTS to ensure Pants and PEX are using the exact interpreter we want. For this reason, this PR was blocked by #7285 to propagate interpreter constraints to PEX.

Specifically, we make these changes to achieve these two high level goals:

  1. Modify src/python/pants/BUILD so that bdist_wheel knows it needs to mark the ABI. This achieves goal 1.
  2. Change release.sh to allow pre-setting $PY and to also set PANTS_PYTHON_SETUP_INTERPRETER_CONSTRAINTS in order to use the specific Python interpreter we are targeting.
  3. Create travis_ci_py27_ucs2/Dockerfile to get Python 2 w/ UCS2 onto Linux.
  4. Extract out .travis.yml env entries to get OpenSSL and Homebrew-installed Python working properly, along with launching a Docker image, in order to avoid duplication: env_osx_with_pyenv.mustache, launch_docker_image.mustache, and generate_travis_yml.py.
  5. Modify travis.yml.mustache to set up the 4 distinct wheel building shards. Similar to how we created a Dockerfile to use Pyenv to install Python 2 with UCS2 on Linux, we use Pyenv to install Python 2 with UCS4 on OSX. We also move the wheel building shards below unit tests.

Ensuring the correct abi is used

We need to ensure the pants.pex used by release.sh has the correct abi for its dependencies, and that release.sh is using the correct Python interpreter. We introduce a new script check_pants_pex_abi.py that inspects the pex's PEX-INFO to ensure the targeted abi was used.

An even better test would test the result of release.sh to ensure the built pantsbuild.pants wheel has the correct ABI and can be consumed properly. Currently release.sh verifies the wheel is valid, but it does not enforce which ABI it was built with. This could be a good followup PR.

Result

We now properly mark the ABI for Python 2. Beyond the new script check_pants_pex_abi.sh proving this, we performed a run of this PR with verbose PEX logging turned on: https://travis-ci.org/pantsbuild/pants/builds/503639333. Inspecting the logs for the wheel building shards and searching for Using the current platform proves the 4 wheel building shards are using the correct interpreter and abi.

In addition to correctness for Python 2, this unblocks releasing Python 3 wheels (#7197).

Note this should have no significant impact on the end user, as Pip will resolve to the current ABI for their interpreter. It will change the name of our pantsbuild.pants wheel and will prevent using that wheel with an interpreter that uses a different UCS setting, but all users should be able to pull down whichever wheel they need as we provide wheels for both UCS2 and UCS4 on both OSX and Linux.

Downside: wheel building explosion

We currently are building more wheels than necessary. For wheels that are universal / platform-independent, we only need them to be built once, but we build them every time. See #7258.

This PR adds two new shards so adds ~30 unnecessary core wheels we build, in addition to 3rd party wheels that are universal / platform-independent.

This is going to be necessary to release Py3 with abi3.

The line however results in the issue that this PR is going to aim to fix: now Python 2 will be built with abi `cp27m` or `cp27mu`, whereas earlier it was `none`. To test, try running `./pants setup-py --run="bdist_wheel --python-tag cp27 --plat-name=linux_x86_64" src/python/pants:pants-packaged` followed by `ls -l dist/pantsbuild.pants-1.14.0rc0/dist/`.

Note also that ext_modules is deprecated in favor of distutils.Extension. We use this now as a temporary workaround until we add support for Extension.
We must now build pantsbuild.pants with both unicode versions for Py2. So, we introduce Py2 to do this.

We use Pyenv to install interpreter with the relevant encoding where necessary. See https://stackoverflow.com/questions/38928942/build-python-as-ucs-4-via-pyenv.
@illicitonion
Copy link
Contributor

Our native code strongly assumes UTF-8 everywhere... I don't really know much about python publishing, but I would assume that means we should only be publishing a cp27mu wheel?

@jsirois
Copy link
Contributor

jsirois commented Feb 12, 2019

I think we're all a bit light on the details here. Suffice it to say that a 2.7 interpreter built with ucs2 still handles encoding and decoding utf-8 strings, at least for simple test cases. I'm pretty sure we need to support both. The need comes from the fact that we are a bad python citizen. You cannot grab our sdist and build, we have to have burned the shared library you need or you're out of luck and are forced to clone our repo and hand-build.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented Feb 12, 2019

Yes, agreed with @jsirois that UCS2 surprisingly works even though we assume UTF-8 everywhere. We know this because Python 2 on macOS defaults to UCS2 when using system Python or installed via Pyenv—the user must go out of their way to set PYTHON_CONFIGURE_OPTS=--enable-unicode=ucs4. Everything works as intended with macOS, otherwise our integration tests would fail on macOS and our macOS users (such as all of Foursquare) would have discovered unicode issues the past few years.

I can't reproduce the same failure locally. John was suspicious why for the problematic dependencies the sdist isn't being used to build the wheel when the bdist is not released for cp27mu. Hopefully this provides some insight.
The folder path is py27, not py2!
The shard was failing when trying to build cryptography from an sdist because it could not find openssl. So, we now explicitly install it with Brew and modify the env vars to expose it. This is identical to how we install Py3 on OSX.
Make it more explicit how shard is configured / which wheel building config it has. Whereas for most shards we specify if they run with Py36 vs Py27 in parantheses, it is actually very important we make explicit the wheel building config, as it impacts which wheels we end up producing.
PEX_VERBOSE only impacts runtime output. -vvv... impacts build time output.
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Feb 19, 2019
We now have to set up this env var 4 distinct times. With pantsbuild#7235, we will have to again use the env var in order to install Python 2 with UCS4.

So, this code needs to be de-duplicated. We keep the name generic to any Python interpreter because it's really about Pyenv, not Python 3.
For Linux UCS2, the build was failing due to gcc complaining it could not find any files. This reproduced locally when running `./pants3 setup-py --run="bdist_wheel --py-limited-api cp36" src/python/pants:pants-packaged` on OSX.

John suggested and gave the code snippet to pass a dummy file so this no longer happens. Thanks John!
We can't use the PEX from AWS because the Python versions do not match up.
This reverts commit edf81ef.
Realized this is a better design while working on pantsbuild#7261.
Change how we handle env var to not use PEX when first bootstrapping, then use it in the followup release.sh command.
Use the {osx,linux}_config images rather than {osx,linux}_test_config images.
They don't exist apparently.
Now that we don't it in the base_build_wheels_env, we don't need to set this.

This was actually causing a failure. ./pants only checks if the env var is set, and not what its value is.
- OSX UCS2 shard no longer was setting RUN_PANTS_FROM_PEX anywhere
- Linux UCS4 had its before_script entry being override by travis_image.
Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for banging away at this Eric! This was just a quick skim while I had a minute.

@@ -0,0 +1,77 @@
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this script was helpful when you were debugging this PR. You're also sure it's worth its weight going forward?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script is meant above all to act as an integration test / check for regressions. The most likely scenario where it will catch us from a regression is that we change how we handle interpreter constraints or resolve $PY, and that that change results in this no longer working as expected. Other scenarios could be changes made to .travis.yml or the Dockerfile breaking things.

This is why I productionized it. The original was an inline command in .travis.yml with a series of && and ; statements, and then I realized this would be useful to have permanently as a sanity check and that we can even use it to verify abi3 is setup correctly when building Py3 wheels.

At the cost of this safety / regression test, we have more code and about 1 extra second to each of the wheel building shards. Output from running script with time on my mac:

$ time ./build-support/bin/check_pants_pex_abi.sh cp27m
0.47s user 0.94s system 93% cpu 1.508 total

Fine with deleting it if you all prefer but I think we get more benefit than harm here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the above I am in favor of keeping this script until we have a test that exercises actually resolving each of the built wheels and running them on a python interpreter with the given abi.

.travis.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited for this!

src/python/pants/BUILD Show resolved Hide resolved
build-support/bin/release.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
@@ -381,41 +372,113 @@ cargo_audit: &cargo_audit
# Build wheels
# -------------------------------------------------------------------------

# N.B. With Python 2, we must build pantsbuild.pants with both UCS2 and UCS4 to provide full
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good comment!

build-support/docker/travis_ci_py27_ucs2/Dockerfile Outdated Show resolved Hide resolved
Copy link
Contributor

@cosmicexplorer cosmicexplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there be a test verifying that all of these wheels we're building can actually be resolved and used correctly with all of the abis? If it requires having a python interpreter with that abi, is that something we can set up another docker image(s) to do?

.travis.yml Outdated Show resolved Hide resolved
src/python/pants/BUILD Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.sh Outdated Show resolved Hide resolved
@@ -0,0 +1,77 @@
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the above I am in favor of keeping this script until we have a test that exercises actually resolving each of the built wheels and running them on a python interpreter with the given abi.

Much more readable and less hacky now! I was only using Bash because I had started with Bash and because almost all of our scripts are in Bash so I thought there was precedent. But Python is a more natural fit.
We really want to check two things: 1) that the interpreter is available on the PATH, and 2) that they are using python2.7, as the script does not (yet) work with Python 3.

Rather than muddling both these two tests into one command, instead split them out into distinct checks with very explicit error messages.

This will also make it easier to ensure they are using only 2.7 or 3.6 to release, once that PR goes through, as we don't want to allow <2.7, 3.0<=x<=3.5, or 3.7.

Thanks Benjy!
Explain why we need dummy.c and why we set ext_modules.
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great reviews! This is ready for re-review.

src/python/pants/BUILD Show resolved Hide resolved
build-support/bin/release.sh Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.py Show resolved Hide resolved
We repeat this code a couple times and it's pretty finicky. Duplicate it out.

Props to Danny for finding a way to get this working with multiline strings!
Danny found a way to make this work when the template is used for a key entry (i.e. with -) and is not linked to an anchor!
Copy link
Contributor

@cosmicexplorer cosmicexplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling really positive about the templating and the python scripting here!

@@ -231,14 +216,15 @@ base_linux_build_engine: &base_linux_build_engine
<<: *travis_docker_image
stage: *bootstrap
script:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit is awesome!! Especially allowing parameterizing the image name! Thanks a ton for figuring this out!!!

build-support/travis/travis.yml.mustache Outdated Show resolved Hide resolved
build-support/bin/check_pants_pex_abi.py Outdated Show resolved Hide resolved
# that we correctly distinguish when using Python 2 between interpreters built with UCS2 vs. UCS4
# (see https://www.python.org/dev/peps/pep-0513/#ucs-2-vs-ucs-4-builds).
# TODO(7344): the tuple syntax for ext_modules is deprecated. Use Extension once we support it.
ext_modules=[('native_engine', {'sources': ['src/pants/dummy.c']})],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like an annoying workaround with a lot of extra work required just to set the ABI, but if it continues to work then it should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely a lot of extra work 😁 As highlighted in the updated PR description, getting us to mark the new ABI is actually not much work—it's just this file and dummy.c. The real work is adding the new shards to build Linux UCS2 and OSX UCS4 to ensure we don't lose support for any Pants users.

A lot of these changes are also changes that would be good to make anyways, such as the changes to release.sh.

--

One step closer to Python 3 release! This PR does a substantial portion of the prework.

build-support/bin/check_pants_pex_abi.py Show resolved Hide resolved
@Eric-Arellano Eric-Arellano merged commit 7b2225e into pantsbuild:master Mar 9, 2019
@Eric-Arellano Eric-Arellano deleted the py2-wheels-abi-specified branch March 9, 2019 16:09
benjyw added a commit to benjyw/pants that referenced this pull request Mar 10, 2019
benjyw added a commit to benjyw/pants that referenced this pull request Mar 10, 2019
Also fix some release-related documentation.
benjyw added a commit that referenced this pull request Mar 10, 2019
Also fix some release-related documentation.
Eric-Arellano added a commit that referenced this pull request Mar 11, 2019
…omments (#7351)

### Problem
#7235 introduced an initial deduplication of launching docker images, notably allowing the image name to be parametrized. While working on #7197, it became evident how frequently the same code was being repeated for calling `docker run`. The only thing that changes between these invocations is what goes inside `sh -c "my bash command"`.

This duplication makes our Travis code harder to understand, along with the usual arguments for DRY.

### Solution
Introduce a new `docker_run_image` mustache template that takes the parameters `$docker_image_name` and `$docker_run_command`.

Also changed:
* rename `launch_docker_image` -> `docker_build_image`
* remove script entry for `&travis_docker_image`, as it gets overridden every time by its caller so is unnecessary and confusing.
* introduce Mustache comments to some of our templates. These don't get copied over into `.travis.yml`, which is good for reducing noise, while still allowing us to explain the concept in the template. 

### Result
CI should perform the same with the benefit of a shorter and better abstracted `travis.yml.mustache`.
stuhood pushed a commit that referenced this pull request Mar 15, 2019
…7393)

### Problem

After #7235, we began building wheels for multiple ABIs and platforms, but when building a pex, we were allowing `pex` to choose which ABI to use in the output `pex`, which could result in an unusable output pex: see #7383.  

### Solution

Explicitly include all ABIs when constructing a pex with `-p`.

### Result

Released and nightly pexes will be larger, but much more compatible. Fixes #7383.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants