Skip to content

Commit

Permalink
fix(precompiling)!: make binary-level precompile opt-in/opt-opt work (#…
Browse files Browse the repository at this point in the history
…2243)

This makes binary-level opt-in/opt-out of precompiling work as intended.
Previously,
when `pyc_collection=include_pyc` was set on a binary, only transitive
libraries that
had explicitly enabled precompiling were being included (which was moot
anyways --
libraries put their files in runfiles, so no matter what, their files
were included).

The intent was that, when a binary set `pyc_collection=include_pyc`,
then precompiled
files would be used for all its transitive dependencies (unless they
had, at the
target-level, disabled precompiling). Conversely, if
`pyc_collection=disabled` was set,
the precompiled files would not be used (unless a target had, at the
target level,
enabled precompiling).

To make it work as desired, the basic fix is to make it so that
libraries have a place to
put the implicit pyc files (the ones automatically generated), and have
the binaries
include those when requested. The net effect is a library has 4 sets of
files it produces:
* required py files: py source files that should always go into the
binary's runfiles
* required pyc files: precompiled pyc files that should always go into
the binary's
  runfiles (e.g., when a library sets `precompile=enabled` directly).
* implicit pyc files: precompiled pyc files for a library that are
always generated, but
  it's up to the binary if they go into the runfiles
* implicit pyc source files: the source py file for an implicit pyc
file. When a binary
*doesn't* include the implicit pyc file, it must include the source py
file (otherwise
  none of the library's code ends up included).

Similarly, in order to allow a binary to decide what files are used,
libraries must
stop putting the py/pyc files into runfiles themselves. While this is
potentially
a breaking change, I found that, within Google, there was no reliance on
this behavior,
so should be safe enough. That said, I added `--add_srcs_to_runfiles` to
restore
the previous behavior to aid in transitioning.

**BREAKING CHANGES**
1. `py_library` no longer puts its srcs into runfiles directly.
2. Removed `--precompile_add_to_runfiles`
3. Removed `--pyc_collection`
4. `precompile=if_generated_source` removed
5. `precompile_source_retention=omit_if_generated_source` removed

Though 2 through 5 are technically breaking changes, I don't think
precompiling
was very usable anyways, so usages of those flags/values is rare.

Fixes #2212
  • Loading branch information
rickeylev authored Oct 11, 2024
1 parent f6361e4 commit a3cdab5
Show file tree
Hide file tree
Showing 17 changed files with 746 additions and 344 deletions.
20 changes: 18 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,14 @@ A brief description of the categories of changes:
[x.x.x]: https://github.com/bazelbuild/rules_python/releases/tag/x.x.x

### Changed
* **BREAKING** `py_library` no longer puts its source files or generated pyc
files in runfiles; it's the responsibility of consumers (e.g. binaries) to
populate runfiles with the necessary files. Adding source files to runfiles
can be temporarily restored by setting {obj}`--add_srcs_to_runfiles=enabled`,
but this flag will be removed in a subsequent releases.
* {obj}`PyInfo.transitive_sources` is now added to runfiles. These files are
`.py` files that are required to be added to runfiles by downstream binaries
(or equivalent).
* (toolchains) `py_runtime.implementation_name` now defaults to `cpython`
(previously it defaulted to None).

Expand All @@ -50,6 +58,8 @@ A brief description of the categories of changes:
* (bzlmod) In hybrid bzlmod with WORKSPACE builds,
`python_register_toolchains(register_toolchains=True)` is respected
([#1675](https://github.com/bazelbuild/rules_python/issues/1675)).
* (precompiling) The {obj}`pyc_collection` attribute now correctly
enables (or disables) using pyc files from targets transitively

### Added
* (py_wheel) Now supports `compress = (True|False)` to allow disabling
Expand All @@ -66,14 +76,20 @@ A brief description of the categories of changes:
* `3.10 -> 3.10.15`
* `3.11 -> 3.11.10`
* `3.12 -> 3.12.7`
[20241008]: https://github.com/indygreg/python-build-standalone/releases/tag/20241008
* (coverage) Add support for python 3.13 and bump `coverage.py` to 7.6.1.
* (bzlmod) Add support for `download_only` flag to disable usage of `sdists`
when {bzl:attr}`pip.parse.experimental_index_url` is set.
* (api) PyInfo fields: {obj}`PyInfo.transitive_implicit_pyc_files`,
{obj}`PyInfo.transitive_implicit_pyc_source_files`.

[20241008]: https://github.com/indygreg/python-build-standalone/releases/tag/20241008

### Removed
* Nothing yet
* (precompiling) {obj}`--precompile_add_to_runfiles` has been removed.
* (precompiling) {obj}`--pyc_collection` has been removed. The `pyc_collection`
attribute now bases its default on {obj}`--precompile`.
* (precompiling) The {obj}`precompile=if_generated_source` value has been removed.
* (precompiling) The {obj}`precompile_source_retention=omit_if_generated_source` value has been removed.

## [0.36.0] - 2024-09-24

Expand Down
63 changes: 25 additions & 38 deletions docs/api/rules_python/python/config_settings/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,25 @@

# //python/config_settings

:::{bzl:flag} add_srcs_to_runfiles
Determines if the `srcs` of targets are added to their runfiles.

More specifically, the sources added to runfiles are the `.py` files in `srcs`.
If precompiling is performed, it is the `.py` files that are kept according
to {obj}`precompile_source_retention`.

Values:
* `auto`: (default) Automatically decide the effective value; the current
behavior is `disabled`.
* `disabled`: Don't add `srcs` to a target's runfiles.
* `enabled`: Add `srcs` to a target's runfiles.
::::{versionadded} 0.37.0
::::
::::{deprecated} 0.37.0
This is a transition flag and will be removed in a subsequent release.
::::
:::

:::{bzl:flag} python_version
Determines the default hermetic Python toolchain version. This can be set to
one of the values that `rules_python` maintains.
Expand Down Expand Up @@ -42,12 +61,8 @@ Values:

* `auto`: (default) Automatically decide the effective value based on environment,
target platform, etc.
* `enabled`: Compile Python source files at build time. Note that
{bzl:obj}`--precompile_add_to_runfiles` affects how the compiled files are included into
a downstream binary.
* `enabled`: Compile Python source files at build time.
* `disabled`: Don't compile Python source files at build time.
* `if_generated_source`: Compile Python source files, but only if they're a
generated file.
* `force_enabled`: Like `enabled`, except overrides target-level setting. This
is mostly useful for development, testing enabling precompilation more
broadly, or as an escape hatch if build-time compiling is not available.
Expand All @@ -56,6 +71,9 @@ Values:
broadly, or as an escape hatch if build-time compiling is not available.
:::{versionadded} 0.33.0
:::
:::{versionchanged} 0.37.0
The `if_generated_source` value was removed
:::
::::

::::{bzl:flag} precompile_source_retention
Expand All @@ -73,45 +91,14 @@ Values:
target platform, etc.
* `keep_source`: Include the original Python source.
* `omit_source`: Don't include the orignal py source.
* `omit_if_generated_source`: Keep the original source if it's a regular source
file, but omit it if it's a generated file.

:::{versionadded} 0.33.0
:::
:::{versionadded} 0.36.0
The `auto` value
:::
::::

::::{bzl:flag} precompile_add_to_runfiles
Determines if a target adds its compiled files to its runfiles.

When a target compiles its files, but doesn't add them to its own runfiles, it
relies on a downstream target to retrieve them from
{bzl:obj}`PyInfo.transitive_pyc_files`

Values:
* `always`: Always include the compiled files in the target's runfiles.
* `decided_elsewhere`: Don't include the compiled files in the target's
runfiles; they are still added to {bzl:obj}`PyInfo.transitive_pyc_files`. See
also: {bzl:obj}`py_binary.pyc_collection` attribute. This is useful for allowing
incrementally enabling precompilation on a per-binary basis.
:::{versionadded} 0.33.0
:::
::::

::::{bzl:flag} pyc_collection
Determine if `py_binary` collects transitive pyc files.

:::{note}
This flag is overridden by the target level `pyc_collection` attribute.
:::

Values:
* `include_pyc`: Include `PyInfo.transitive_pyc_files` as part of the binary.
* `disabled`: Don't include `PyInfo.transitive_pyc_files` as part of the binary.
:::{versionadded} 0.33.0
:::
:::{versionchanged} 0.37.0
The `omit_if_generated_source` value was removed
::::

::::{bzl:flag} py_linux_libc
Expand Down
36 changes: 21 additions & 15 deletions docs/precompiling.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,24 +20,24 @@ While precompiling helps runtime performance, it has two main costs:

## Binary-level opt-in

Because of the costs of precompiling, it may not be feasible to globally enable it
for your repo for everything. For example, some binaries may be
particularly large, and doubling the number of runfiles isn't doable.
Binary-level opt-in allows enabling precompiling on a per-target basic. This is
useful for situations such as:

If this is the case, there's an alternative way to more selectively and
incrementally control precompiling on a per-binry basis.
* Globally enabling precompiling in your `.bazelrc` isn't feasible. This may
be because some targets don't work with precompiling, e.g. because they're too
big.
* Enabling precompiling for build tools (exec config targets) separately from
target-config programs.

To use this approach, the two basic steps are:
1. Disable pyc files from being automatically added to runfiles:
{bzl:obj}`--@rules_python//python/config_settings:precompile_add_to_runfiles=decided_elsewhere`,
2. Set the `pyc_collection` attribute on the binaries/tests that should or should
not use precompiling.
To use this approach, set the {bzl:attr}`pyc_collection` attribute on the
binaries/tests that should or should not use precompiling. Then change the
{bzl:flag}`--precompile` default.

The default for the `pyc_collection` attribute is controlled by the flag
{bzl:obj}`--@rules_python//python/config_settings:pyc_collection`, so you
The default for the {bzl:attr}`pyc_collection` attribute is controlled by the flag
{bzl:obj}`--@rules_python//python/config_settings:precompile`, so you
can use an opt-in or opt-out approach by setting its value:
* targets must opt-out: `--@rules_python//python/config_settings:pyc_collection=include_pyc`
* targets must opt-in: `--@rules_python//python/config_settings:pyc_collection=disabled`
* targets must opt-out: `--@rules_python//python/config_settings:precompile=enabled`
* targets must opt-in: `--@rules_python//python/config_settings:precompile=disabled`

## Advanced precompiler customization

Expand All @@ -48,7 +48,7 @@ not work as well for remote execution builds. To customize the precompiler, two
mechanisms are available:

* The exec tools toolchain allows customizing the precompiler binary used with
the `precompiler` attribute. Arbitrary binaries are supported.
the {bzl:attr}`precompiler` attribute. Arbitrary binaries are supported.
* The execution requirements can be customized using
`--@rules_python//tools/precompiler:execution_requirements`. This is a list
flag that can be repeated. Each entry is a key=value that is added to the
Expand Down Expand Up @@ -92,3 +92,9 @@ Note that any execution requirements values can be specified in the flag.
`foo.cpython-39.opt-2.pyc`). This works fine (it's all byte code), but also
means the interpreter `-O` argument can't be used -- doing so will cause the
interpreter to look for the non-existent `opt-N` named files.
* Targets with the same source files and different exec properites will result
in action conflicts. This most commonly occurs when a `py_binary` and
`py_library` have the same source files. To fix, modify both targets so
they have the same exec properties. If this is difficult because unsupported
exec groups end up being passed to the Python rules, please file an issue
to have those exec groups added to the Python rules.
27 changes: 9 additions & 18 deletions python/config_settings/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,11 @@ load("@bazel_skylib//rules:common_settings.bzl", "string_flag")
load("@pythons_hub//:versions.bzl", "DEFAULT_PYTHON_VERSION", "MINOR_MAPPING", "PYTHON_VERSIONS")
load(
"//python/private:flags.bzl",
"AddSrcsToRunfilesFlag",
"BootstrapImplFlag",
"ExecToolsToolchainFlag",
"PrecompileAddToRunfilesFlag",
"PrecompileFlag",
"PrecompileSourceRetentionFlag",
"PycCollectionFlag",
)
load(
"//python/private/pypi:flags.bzl",
Expand All @@ -33,6 +32,14 @@ construct_config_settings(
versions = PYTHON_VERSIONS,
)

string_flag(
name = "add_srcs_to_runfiles",
build_setting_default = AddSrcsToRunfilesFlag.AUTO,
values = AddSrcsToRunfilesFlag.flag_values(),
# NOTE: Only public because it is dependency of public rules.
visibility = ["//visibility:public"],
)

string_flag(
name = "exec_tools_toolchain",
build_setting_default = ExecToolsToolchainFlag.DISABLED,
Expand Down Expand Up @@ -68,22 +75,6 @@ string_flag(
visibility = ["//visibility:public"],
)

string_flag(
name = "precompile_add_to_runfiles",
build_setting_default = PrecompileAddToRunfilesFlag.ALWAYS,
values = sorted(PrecompileAddToRunfilesFlag.__members__.values()),
# NOTE: Only public because it's an implicit dependency
visibility = ["//visibility:public"],
)

string_flag(
name = "pyc_collection",
build_setting_default = PycCollectionFlag.DISABLED,
values = sorted(PycCollectionFlag.__members__.values()),
# NOTE: Only public because it's an implicit dependency
visibility = ["//visibility:public"],
)

string_flag(
name = "bootstrap_impl",
build_setting_default = BootstrapImplFlag.SYSTEM_PYTHON,
Expand Down
4 changes: 4 additions & 0 deletions python/private/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -297,6 +297,10 @@ bzl_library(
name = "py_package_bzl",
srcs = ["py_package.bzl"],
visibility = ["//:__subpackages__"],
deps = [
":builders_bzl",
":py_info_bzl",
],
)

bzl_library(
Expand Down
49 changes: 28 additions & 21 deletions python/private/common/attributes.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,26 @@ load(

_PackageSpecificationInfo = getattr(py_internal, "PackageSpecificationInfo", None)

# Due to how the common exec_properties attribute works, rules must add exec
# groups even if they don't actually use them. This is due to two interactions:
# 1. Rules give an error if users pass an unsupported exec group.
# 2. exec_properties is configurable, so macro-code can't always filter out
# exec group names that aren't supported by the rule.
# The net effect is, if a user passes exec_properties to a macro, and the macro
# invokes two rules, the macro can't always ensure each rule is only passed
# valid exec groups, and is thus liable to cause an error.
#
# NOTE: These are no-op/empty exec groups. If a rule *does* support an exec
# group and needs custom settings, it should merge this dict with one that
# overrides the supported key.
REQUIRED_EXEC_GROUPS = {
# py_binary may invoke C++ linking, or py rules may be used in combination
# with cc rules (e.g. within the same macro), so support that exec group.
# This exec group is defined by rules_cc for the cc rules.
"cpp_link": exec_group(),
"py_precompile": exec_group(),
}

_STAMP_VALUES = [-1, 0, 1]

def _precompile_attr_get_effective_value(ctx):
Expand All @@ -50,7 +70,6 @@ def _precompile_attr_get_effective_value(ctx):
if precompile not in (
PrecompileAttr.ENABLED,
PrecompileAttr.DISABLED,
PrecompileAttr.IF_GENERATED_SOURCE,
):
fail("Unexpected final precompile value: {}".format(repr(precompile)))

Expand All @@ -60,14 +79,10 @@ def _precompile_attr_get_effective_value(ctx):
PrecompileAttr = enum(
# Determine the effective value from --precompile
INHERIT = "inherit",
# Compile Python source files at build time. Note that
# --precompile_add_to_runfiles affects how the compiled files are included
# into a downstream binary.
# Compile Python source files at build time.
ENABLED = "enabled",
# Don't compile Python source files at build time.
DISABLED = "disabled",
# Compile Python source files, but only if they're a generated file.
IF_GENERATED_SOURCE = "if_generated_source",
get_effective_value = _precompile_attr_get_effective_value,
)

Expand All @@ -90,7 +105,6 @@ def _precompile_source_retention_get_effective_value(ctx):
if attr_value not in (
PrecompileSourceRetentionAttr.KEEP_SOURCE,
PrecompileSourceRetentionAttr.OMIT_SOURCE,
PrecompileSourceRetentionAttr.OMIT_IF_GENERATED_SOURCE,
):
fail("Unexpected final precompile_source_retention value: {}".format(repr(attr_value)))
return attr_value
Expand All @@ -100,14 +114,17 @@ PrecompileSourceRetentionAttr = enum(
INHERIT = "inherit",
KEEP_SOURCE = "keep_source",
OMIT_SOURCE = "omit_source",
OMIT_IF_GENERATED_SOURCE = "omit_if_generated_source",
get_effective_value = _precompile_source_retention_get_effective_value,
)

def _pyc_collection_attr_is_pyc_collection_enabled(ctx):
pyc_collection = ctx.attr.pyc_collection
if pyc_collection == PycCollectionAttr.INHERIT:
pyc_collection = ctx.attr._pyc_collection_flag[BuildSettingInfo].value
precompile_flag = PrecompileFlag.get_effective_value(ctx)
if precompile_flag in (PrecompileFlag.ENABLED, PrecompileFlag.FORCE_ENABLED):
pyc_collection = PycCollectionAttr.INCLUDE_PYC
else:
pyc_collection = PycCollectionAttr.DISABLED

if pyc_collection not in (PycCollectionAttr.INCLUDE_PYC, PycCollectionAttr.DISABLED):
fail("Unexpected final pyc_collection value: {}".format(repr(pyc_collection)))
Expand Down Expand Up @@ -283,13 +300,9 @@ Whether py source files **for this target** should be precompiled.
Values:
* `inherit`: Determine the value from the {flag}`--precompile` flag.
* `enabled`: Compile Python source files at build time. Note that
--precompile_add_to_runfiles affects how the compiled files are included into
a downstream binary.
* `inherit`: Allow the downstream binary decide if precompiled files are used.
* `enabled`: Compile Python source files at build time.
* `disabled`: Don't compile Python source files at build time.
* `if_generated_source`: Compile Python source files, but only if they're a
generated file.
:::{seealso}
Expand Down Expand Up @@ -344,8 +357,6 @@ in the resulting output or not. Valid values are:
* `inherit`: Inherit the value from the {flag}`--precompile_source_retention` flag.
* `keep_source`: Include the original Python source.
* `omit_source`: Don't include the original py source.
* `omit_if_generated_source`: Keep the original source if it's a regular source
file, but omit it if it's a generated file.
""",
),
# Required attribute, but details vary by rule.
Expand All @@ -357,10 +368,6 @@ in the resulting output or not. Valid values are:
# Required attribute, but the details vary by rule.
# Use create_srcs_version_attr to create one.
"srcs_version": None,
"_precompile_add_to_runfiles_flag": attr.label(
default = "//python/config_settings:precompile_add_to_runfiles",
providers = [BuildSettingInfo],
),
"_precompile_flag": attr.label(
default = "//python/config_settings:precompile",
providers = [BuildSettingInfo],
Expand Down
Loading

0 comments on commit a3cdab5

Please sign in to comment.