Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow root module to customize toolchains #2081

Open
1 of 7 tasks
rickeylev opened this issue Jul 20, 2024 · 1 comment
Open
1 of 7 tasks

Allow root module to customize toolchains #2081

rickeylev opened this issue Jul 20, 2024 · 1 comment
Labels
type: bzlmod bzlmod work type: feature request type: toolchain Related to the toolchains provided by rules_python

Comments

@rickeylev
Copy link
Collaborator

rickeylev commented Jul 20, 2024

We've had several requests where users want to modify something about the toolchain definitions that are generated for the hermetic runtimes. Typically these are small tweaks. Allowing other modules to tweak toolchains is a no-go, but allowing the root module to change things is reasonable.

The "simple" option for customizing toolchains is to add args to python_register_toolchains and python.toolchain.

The two main issues I see with adding args is:

  1. bzlmod APIs tend to be "written in stone". They happen very early in the build process, so we have limited options for evolving the API. In this case, this is somewhat tempered by only the root module being allowed to use it, but it still makes me uneasy.
  2. Toolchains tend to have unstable APIs. Most parts of them are just implementation details for the rules. For example, adding a single arg to customize the bootstrap template made sense last year, but now there are multiple template files. Similarly, the stub_shebang attribute is slated for removal with the introduce of --bootstrap_impl=script.

Adding args to python.toolchain is, I think, a bad idea. Having something else, e.g. python.root_config might make a bit more sense. That said, args on python.toolchain() make it easy to associate something to a particular toolchain, while python.root_config would need some arg to specify which toolchain it wanted to restrict the setting to.

This issue is to collect the use cases and try and figure out some options.

Things people want to change:

  • Overriding bootstrap template

    • Example: feat: Allow custom bootstrap template in python_repository #2032
      • Goal: They want to implement a separate bootstrap
      • The PR proposes adding args to e.g. python_register_toolchains().
      • Issues: (1) Would need to be exposed via bzmod apis. (2) There are 4 bootstrap files: python_bootstrap_template (legacy system python bootstrap), stage1 script bootstrap, stage2 bootstrap, and zip main bootstrap. Plus the launcher exe thing for windows.
      • Proposal: Indirect the references to the templates using flags.
  • Customizing stub_shebang: https://bazelbuild.slack.com/archives/CA306CEV6/p1721464127008299

    • Goal: They want to set -S PYTHONNOUSERSITE=1 for all their Python invocations
    • Proposal 1: Have a flag override for stub_shebang
    • Proposal 2: Add interpreter_args and make env populate values into the bootstrap.
    • Proposal 3: Have flags for "default interpreter args" and "default interpreter env"
  • The register_coverage_tool arg

    • Goal: Having coverage included as part of the toolchain is unnecessary if a user test provides the coverage library itself
  • Control whether local or hermetic runtimes are used

    • Goal: Provide a way for the root module to decide whether a local or hermetic toolchain is used.
    • Basically a local=True|False arg when defining a toolchain. It decides whether the hermetic runtime repo rule or local runtime repo rule is used under the hood.
    • Proposal: Have a flag that acts as a constraint for matching toolchains.
  • Control whether runtimes are in-build or platform runtimes

    • Goal: A platform runtime is cheaper to setup; while not compatible with RBE or sandboxing, that's fine if you aren't using RBE or sandboxing.
    • This mostly applies to local runtimes, but could also apply to hermetic.
    • Basically, have an inbuild=True|False arg when defining a toolchain. It controls whether the underlying py_runtime() targets generated use interpreter= or interpreter_path=
  • Control constraints of toolchains

    • Goal: Allow registering both local and hermetic runtimes. Using constraints, local or platform toolchains can be used for local invocations, and hermetic for RBE.
  • Control what toolchain versions are available

    • Goal: An infrastructure team wants to ensure only e.g. Python 3.11 (or a small set of proscribed versions) is usable at their company.
@aignas
Copy link
Collaborator

aignas commented Aug 8, 2024

Wanted to create a separate issue for this but realized that this is already there. We could just create a python.override tag class for the majority of the things listed above, but for starters we could start with customising the URLs for the python toolchain override.

Currently the TOOL_VERSIONS used by the extension cannot be overridden via MODULE.bazel. The only way users have right now is to use the bazel_downloader config, but that may give hard to debug errors and still does not allow users to add Python builds for extra platforms that are not present on the indygreg website.

Initial solution could be a python.override tag class that can override the URLs/structs for python toolchains that are used. The design of the API should allow for future extension for different overrides:

  • restricting allowed python versions
  • setting the x.y->x.y.z mapping
  • disable registering all versions
  • etc

@aignas aignas modified the milestones: v1.0.0, Bzlmod GA Aug 8, 2024
@aignas aignas added type: feature request type: bzlmod bzlmod work type: toolchain Related to the toolchains provided by rules_python labels Aug 8, 2024
github-merge-queue bot pushed a commit that referenced this issue Sep 9, 2024
This value has been hardcoded for a long time, so let's add the
`patch_strip`
attribute to the `WORKSPACE` setups now so that we add less tech debt in
the
`bzlmod` world later and need to migrate `bzlmod` users from the hard
coded
value to something that can be configured.

This should be backwards compatible because of the default `int` value
of `1`
for the `patch_strip` attribute.

Summary:
- feat: add `patch_strip` to `python_repository`.
- feat: add the default value of patch_strip to get_release_info.
- refactor: handle patch_strip key in `TOOL_VERSIONS`.

Work towards #2081.

---------

Co-authored-by: Richard Levasseur <richardlev@gmail.com>
@aignas aignas mentioned this issue Sep 10, 2024
10 tasks
github-merge-queue bot pushed a commit that referenced this issue Sep 11, 2024
With this PR we get rudimentary unit tests for the `python` bzlmod
extension
which allows us to unit test the override behaviour that will become
more
complex soon.

Summary:
- refactor: inline the python_register_toolchains
- refactor: use toolchain_info to call python_register_toolchains
- refactor: move the registration out of the main loop
- refactor: split the parsing of the modules to a separate function
- test(bzlmod): add python.toolchain module parsing tests

Work towards #2081

---------

Co-authored-by: Richard Levasseur <richardlev@gmail.com>
aignas added a commit to aignas/rules_python that referenced this issue Sep 13, 2024
aignas added a commit to aignas/rules_python that referenced this issue Sep 13, 2024
github-merge-queue bot pushed a commit that referenced this issue Sep 15, 2024
… in full_version (#2219)

This PR just makes the `MINOR_MAPPING` overridable and explicit in
many macros/rules that we own. Even though technically new API is
exposed, I am not sure if it is possible to use it and I am not sure
if we should advertise it.

Explicit minor_mapping results in easier wiring of `python.override`
`bzlmod` extension tag class planned for #2081.
github-merge-queue bot pushed a commit that referenced this issue Sep 15, 2024
This `semver` function may turn out to be useful in validating
the input for the `python.*override` tag classes to be added in
a followup PR. Because this is a refactor of an existing code and
adding tests, I decided to split it out.

For a POC see #2151, work towards #2081.
aignas added a commit to aignas/rules_python that referenced this issue Sep 15, 2024
Before this PR the users could not override how the hermetic toolchain
is downloaded when in `bzlmod`. However, the same APIs would be
available to users using `WORKSPACE`. With this we also allow root
modules to restrict which toolchain versions the non-root modules, which
may be helpful when optimizing the CI runtimes so that we don't waste
time downloading multiple `micro` versions of the same `3.X` python
version, which most of the times have identical behavior.

Work towards bazelbuild#2081 and this should be one of the last items that are
blocking bazelbuild#1361 from the API point of view.
github-merge-queue bot pushed a commit that referenced this issue Sep 17, 2024
This is a different take on #2205.

Summary:
- Remove version constraints from the
`//python/config_settings:python_version`.
- Remove `is_python_config_setting` macro and use a different
implementation to
  retain the same functionality.

This in general will improve the situation on how the toolchains are
registered
and hopefully is a step towards resolving issues for #2081.
aignas added a commit to aignas/rules_python that referenced this issue Sep 17, 2024
Before this PR the users could not override how the hermetic toolchain
is downloaded when in `bzlmod`. However, the same APIs would be
available to users using `WORKSPACE`. With this we also allow root
modules to restrict which toolchain versions the non-root modules, which
may be helpful when optimizing the CI runtimes so that we don't waste
time downloading multiple `micro` versions of the same `3.X` python
version, which most of the times have identical behavior.

Whilst at it, tweak the `semver` implementation to allow for testing of
absence of values in the original input.

Work towards bazelbuild#2081 and this should be one of the last items that are
blocking bazelbuild#1361 from the API point of view.
github-merge-queue bot pushed a commit that referenced this issue Sep 19, 2024
…separate files (#2232)

This makes the dependency management in WORKSPACE much easier to do.

Summary:
- refactor: split out the py_repositories call to a separate file
- refactor: split out the python_repository rule to a separate file
- refactor: split out the standalone interpreter utility function
- refactor: split out the python_register_toolchains function
- refactor: rename the remaining file

Work towards #2081.
github-merge-queue bot pushed a commit that referenced this issue Sep 22, 2024
Before this PR the users could not override how the hermetic toolchain
is downloaded when in `bzlmod`. However, the same APIs would be
available to users using `WORKSPACE`. With this we also allow root
modules to restrict which toolchain versions the non-root modules, which
may be helpful when optimizing the CI runtimes so that we don't waste
time downloading multiple `micro` versions of the same `3.X` python
version, which most of the times have identical behavior.

Whilst at it, tweak the `semver` implementation to allow for testing of
absence of values in the original input.

Work towards #2081 and this should be one of the last items that are
blocking #1361 from the API point of view.

Replaces #2151.

---------

Co-authored-by: Richard Levasseur <richardlev@gmail.com>
@aignas aignas removed this from the Bzlmod GA milestone Sep 24, 2024
github-merge-queue bot pushed a commit that referenced this issue Oct 7, 2024
With this change we set the default value of `--python_version` when
the `python.toolchain` is used in `bzlmod` and we generate the
appropriate config settings based on the registered toolchains and
given overrides by the root module.

This means that we expect the `--python_version` to be always set to
a non-empty value under `bzlmod` and we can cleanup code which was
handling `//conditions:default` case. This also means that we can
in theory drop the requirement for `python_version` in `pip.parse`
and just setup dependencies for all packages that we find in the
`requirements.txt` file and move on. This is left as future work
by myself or anyone willing to contribute. We can also start reusing
the same `whl_library` instance for multi-platform packages because
the python version flag is always set - this will simplify the layout
and makes the feature non-experimental anymore under bzlmod.

Summary:
* Add `@pythons_hub` to the `WORKSPACE` with dummy data to make
  pythons_hub work.
* Add `MINOR_MAPPING` and `PYTHON_VERSIONS` to pythons_hub to
  generate the config settings.
* Remove handling of the default version in `pypi` code under bzlmod.

Work towards #2081, #260, #1708

---------

Co-authored-by: Richard Levasseur <rlevasseur@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bzlmod bzlmod work type: feature request type: toolchain Related to the toolchains provided by rules_python
Projects
None yet
Development

No branches or pull requests

2 participants