-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: don't require system Python to perform bootstrapping #1929
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Well, it's about my EOD, so I thought I'd post the latest laugh/cry/wtf moment from fixing CI failures. It's kind of an amusing regression that actually looks like a promising fix of another undesirable behavior that has long been a plague. The new bootstrap doesn't suffer from the Why? Because the e.g. But, ironically, fixing that causes another problem. Boo! Without that entry, the runfiles root is then the first sys.path entry. One of our tests performs But, good news: because we can perform more arbitrary sys.path manipulations, we can manipulate sys.path as we need. Stated another way: we can implement |
6f58637
to
0251e5b
Compare
This is a pretty major, but surprisingly non-invasive, overhaul of how binaries are started. It fixes several issues and lays ground work for future improvements. In brief: * A system Python is no longer needed to perform bootstrapping. * Errors due to `PYTHONPATH` exceeding environment variable size limits is no longer an issue. * Coverage integration is now cleaner and more direct. * The zipapp `__main__.py` entry point generation is separate from the Bazel binary bootstrap generation. * Self-executable zips now have actual bootstrap logic. The way all of this is accomplished is using a two stage bootstrap process. The first stage is responsible for locating the interpreter, and the second stage is responsible for configuring the runtime environment (e.g. import paths). This allows the first stage to be relatively simple (basically find a file in runfiles), so implementing it in cross-platform shell is feasible. The second stage, because it's running under the desired interpreter, can then do things like setting up import paths, and use the `runpy` module to call the program's real main. This also fixes the issue of long `PYTHONPATH` environment variables causing an error. Instead of passing the import paths using an environment variable, they are embedded into the second stage bootstrap, which can then add them to sys.path. This also switches from running coverage as a subprocess to using its APIs directly. This is possible because of the second stage bootstrap, which can rely on `import coverage` occurring in the correct environment. This new bootstrap method is disabled by default. It can be enabled by setting `--@rules_python//python/config_settings:bootstrap_impl=two_stage`. Once the new APIs are released, a subsequent release will make it the default. This is to allow easier upgrades for people defining their own toolchains.
This is a pretty major, but surprisingly not that invasive, overhaul of how binaries
are started. It fixes several issues and lays ground work for future improvements.
In brief:
PYTHONPATH
exceeding environment variable size limits is nolonger an issue.
__main__.py
entry point generation is separate from the Bazelbinary bootstrap generation.
The way all of this is accomplished is using a two stage bootstrap process. The first
stage is responsible for locating the interpreter, and the second stage is responsible
for configuring the runtime environment (e.g. import paths). This allows the first
stage to be relatively simple (basically find a file in runfiles), so implementing it
in cross-platform shell is feasible. The second stage, because it's running under the
desired interpreter, can then do things like setting up import paths, and use the
runpy
module to call the program's real main.This also fixes the issue of long
PYTHONPATH
environment variables causing an error.Instead of passing the import paths using an environment variable, they are embedded
into the second stage bootstrap, which can then add them to sys.path.
This also switches from running coverage as a subprocess to using its APIs directly.
This is possible because of the second stage bootstrap, which can rely on
import coverage
occurring in the correct environment.This new bootstrap method is disabled by default. It can be enabled by setting
--@rules_python//python/config_settings:bootstrap_impl=two_stage
. Once the new APIsare released, a subsequent release will make it the default. This is to allow easier
upgrades for people defining their own toolchains.
The two-stage bootstrap ignores errors during lcov report generation, which
partially addresses #1434
Fixes #691