Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support (restrictive) load in MODULE.bazel #17880

Closed
joca-bt opened this issue Mar 24, 2023 · 39 comments
Closed

Support (restrictive) load in MODULE.bazel #17880

joca-bt opened this issue Mar 24, 2023 · 39 comments
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@joca-bt
Copy link

joca-bt commented Mar 24, 2023

Description of the feature request:

Currently, the MODULE.bazel file must contain everything needed by the project within it, and cannot call load statements to help make it easier to navigate.

One situation in which it would be helpful to use load is when configuring external package managers, e.g. like with rules_jvm_external. On a Java project it is not unusual to depend on several external jars. Putting that configuration in MODULE.bazel can be considered as polluting the file and makes it more difficult to maintain.

This was originally opened in bazel-contrib/rules_jvm_external#854 and that issue provides further details on a specific example with rules_jvm_external.

Having the ability to use load to load constants or macros from other files would help us maintain a more organized MODULE.bazel.

What underlying problem are you trying to solve with this feature?

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

6.0.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@fmeum
Copy link
Collaborator

fmeum commented Mar 24, 2023

I don't think I'm exaggerating that much with the following claim: The availability of and reliance on load in WORKSPACE was a major contributor to its subpar performance.

Module extensions can accept label attributes on their tags and collect dependency information from the files pointed to by these labels. Without any changes to Bazel itself, rules_jvm_external could allow for the following usage where artifacts.json is some other file in your repo with e.g. an array of Maven coordinates:

maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")
maven.install(artifacts = "//:artifacts.json")

This would keep your MODULE.bazel file clean and your external Java dependencies separated from all the others while still allowing for fast discovery of the entire dependency tree.

@ShreeM01 ShreeM01 added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label Mar 24, 2023
@Wyverald
Copy link
Member

What @fmeum said; although, we might explore allowing load from the current repository for the root module only, to better support large monorepos that will never be used as a dependency.

@Wyverald Wyverald added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) area-Bzlmod Bzlmod-specific PRs, issues, and feature requests and removed untriaged labels Mar 27, 2023
@shs96c
Copy link
Contributor

shs96c commented May 11, 2023

Would there be an option of using load in a local repository, but when we upload to something like the BCR, having the load statements inlined so there's just the single MODULE.bazel file without any external dependencies?

@Wyverald Wyverald added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Jun 2, 2023
@Wyverald
Copy link
Member

Wyverald commented Jun 2, 2023

See this comment in #14632 for a previous abandonded attempt.

We don't currently have the cycles to work on this. But if anyone is interested in picking up the work and writing a design for it (addressing the concerns in the comment above), we'd be happy to review PRs.

@fmeum
Copy link
Collaborator

fmeum commented Jun 2, 2023

We could:

  • allow load in the root module only;
  • restrict load to .bzl files in the root module (including transitive loads);
  • do not add any of the MODULE.bazel globals to the environment of the loaded .bzl files. Instead users will have to pass extension proxies into macros explicitly or load plain Starlark data. Edit: Possibly even better, prohibit calling the globals (but not proxy tags) in loaded files, which gives better error messages and prevents passing them into macros.

The last restriction makes it less likely for two different loaded files to define conflicting bazel_deps. Typical usages could look like this:

load("//:go.bzl", "declare_go_overrides", "GO_REPOS")
 
go_deps = use_extension(...)
go_deps.from_file(go_mod = "//:go.mod")
declare_go_overrides(go_deps)
use_repo(go_deps, *GO_REPOS)

@sluongng
Copy link
Contributor

sluongng commented Jun 5, 2023

One situation in which it would be helpful to use load is when configuring external package managers, e.g. like with rules_jvm_external. On a Java project it is not unusual to depend on several external jars. Putting that configuration in MODULE.bazel can pollute the file and make it more difficult to maintain.

The pain point of maintaining a large MODULE.bazel file is completely valid. But the solution of allowing load statements in there could be the wrong route to turn.

Looking into the current Go ecosystem, I don't see anyone asking for the ability to "load / import" within go.mod / go.sum files. The reason is go mod tidy provides everything you need to manage a go.mod file in a Go project: populating the content of the file, sorting, formatting entries, etc... An average go developer would not even look into a go.mod file as the tool did a great job at managing it. As a result, it's very rare to hear that the file is too big to manage.

So is it possible for us to solve this problem with better automation tooling than relying on dynamic template loading? For example: perhaps a bazel/gazelle mod tidy could help us populate the MODULE.bazel file automatically by scanning through the existing project. The convention of how to arrange and sort the content of MODULE.bazel could also be reinforced there to avoid style-related arguments.

@ashi009
Copy link
Contributor

ashi009 commented Jun 5, 2023

One situation in which it would be helpful to use load is when configuring external package managers, e.g. like with rules_jvm_external. On a Java project it is not unusual to depend on several external jars. Putting that configuration in MODULE.bazel can pollute the file and make it more difficult to maintain.

The pain point of maintaining a large MODULE.bazel file is completely valid. But the solution of allowing load statements in there could be the wrong route to turn.

Looking into the current Go ecosystem, I don't see anyone asking for the ability to "load / import" within go.mod / go.sum files. The reason is go mod tidy provides everything you need to manage a go.mod file in a Go project: populating the content of the file, sorting, formatting entries, etc... An average go developer would not even look into a go.mod file as the tool did a great job at managing it. As a result, it's very rare to hear that the file is too big to manage.

So is it possible for us to solve this problem with better automation tooling than relying on dynamic template loading? For example: perhaps a bazel/gazelle mod tidy could help us populate the MODULE.bazel file automatically by scanning through the existing project. The convention of how to arrange and sort the content of MODULE.bazel could also be reinforced there to avoid style-related arguments.

The reason why go is allowed to do so is because the source of truth of dependencies relies in the source tree, therefore it's okay to let tool generate it.

But for module.bazel, it's already the source of truth, and there are lots of human written parts in it, eg. repo override configurations, etc. Automation is not entirely feasible .

@sluongng
Copy link
Contributor

sluongng commented Jun 5, 2023

@ashi009 In my mind, a command such as bazel mod tidy should solve several things:

  1. Update MODULE.bazel with appropriate configs using current source tree.
    It's possible that somewhere in the source tree, a user would add to their BUILD file a load statement referring to an external rule. bazel mod tidy should be able to detect such usage and automatically it into MODULE.bazel

  2. Consistent formatting of MODULE.bazel.
    There should be an extensible set of config grouping in MODULE.bazel, one for each popular languages. The order and format of these groups within MODULE.bazel file should be deterministic so that we could run bazel mod tidy to format it over time.

  3. Similar to go.mod's replace directive, there should be an override section for folks to customize. However, I think the goal should be to keep this section small and limited. More extensive re-configuration and patching could/should be done by using a different BzlMod Registry instead of using MODULE.bazel file. The custom registry could be an in-tree directory managed by source control so the setup could be relatively simple. You could also use multiple registries overlaying each others thus re-using Bazel central registry is possible.

I think at minimum, we should defer the addition of load / import within MODULE.bazel after BzlMod has been out for a while. It would let us direct the solutions toward (1) better automation tooling to manage the file and (2) alternative registries as much as possible. After the initial GA of BzlMod, we should be able to identify problems that could not be solved using either of the above approaches and add load / import if needed.

The reverse, however, is not feasible. Once load / import is added, we will have folks depending on that behavior, and taking it away will cause heavy churn/migration.

@ashi009
Copy link
Contributor

ashi009 commented Jun 5, 2023

@sluongng Totally agree on the first two. However, not true for the third. As at most of the time we are overriding the extension generated repos.

In a relatively large mono repo, the number of repos that need a second touch is a lot. Our current non-bzlmod rules_go generated bzl file with lots of patches and custom gazelle directives is over 17k loc.

I don't see how this gonna play out in a single module.bazel file with the help of any tool. Even that is feasible, reviewing that file is also challenging -- how do we delegate a portion of a file to some reviewer while other portion to some others?

Also a note on mirroring the bcr, it's not a valid solution for most of the users, it adds too much toil for who ever maintaining that mirror -- just thinking about pulling upstream releases and resolving merge conflicts.

@Wyverald
Copy link
Member

Wyverald commented Jun 5, 2023

Update MODULE.bazel with appropriate configs using current source tree.
It's possible that somewhere in the source tree, a user would add to their BUILD file a load statement referring to an external rule. bazel mod tidy should be able to detect such usage and automatically it into MODULE.bazel

I'm not sure this is feasible. Unlike Go where the import path fully identifies the module, in Bazel/Bzlmod you just see people using @rules_foo//:something and you've no idea whether that's actually the rules_foo you have on the BCR. Maybe they intended to import the module company_rules_foo as rules_foo, for example.

This is even less tractable if you consider module extensions and repos generated from them.

@cgrindel
Copy link

cgrindel commented Jun 5, 2023

@ashi009 Just a quick clarification. With the custom registry solution, you don't need to mirror the entire BCR. The custom registry only needs to contain the customized modules.

@sluongng
Copy link
Contributor

sluongng commented Jun 5, 2023

Also a note on mirroring the bcr, it's not a valid solution for most of the users, it adds too much toil for who ever maintaining that mirror -- just thinking about pulling upstream releases and resolving merge conflicts.

@ashi009 I could be wrong, so please do correct me if I am. Here, I am not suggesting mirror-ing. I am suggesting using multiple registries on top of each others:

# .bazelrc
build --registry=./internal/registry/
build --registry=https://bcr.bazel.build

So you would only need to maintain your internal module customizations.

I'm not sure this is feasible. Unlike Go where the import path fully identifies the module, in Bazel/Bzlmod you just see people using @rules_foo//:something and you've no idea whether that's actually the rules_foo you have on the BCR. Maybe they intended to import the module company_rules_foo as rules_foo, for example.

This is even less tractable if you consider module extensions and repos generated from them.

@Wyverald I think a best-effort lookup from the existing registries should be sufficient.
In case a repo is missing, simply provide the users with a warning/error message. go mod tidy does the same when a referenced module could not be found. This is also why enterprises often operate their own internal Go Proxy pointing to internal modules.

For module extension, we could use the approach existing in gazelle, which is to provide directives in the form of comments. These directives will help guide the automation tool to map a repo name to a group/path inside MODULE.bazel. After a while, we could identify some common directives and upgrade support for them to the registry level.

@cgrindel
Copy link

cgrindel commented Jun 5, 2023

The custom registry could be an in-tree directory managed by source control so the setup could be relatively simple.

I think @sluongng idea for an in-repo registry could be the answer. It satisfies two important criteria in my mind:

  1. Keep the customizations for the external dependencies in the repo. (Spreading them out to different repos does not help the problem.)
  2. Compartmentalizes the changes by external dependency. (Keeps things clean.)

@Wyverald I have not dug into the Bazel code for the registry yet. At first blush, it seems that if the on-disk custom registry had the correct layout, it would just be a matter of saying to Bazel "look over here" instead of "download from there". Am I thinking about this correctly?

@Wyverald
Copy link
Member

Wyverald commented Jun 5, 2023

I have not dug into the Bazel code for the registry yet. At first blush, it seems that if the on-disk custom registry had the correct layout, it would just be a matter of saying to Bazel "look over here" instead of "download from there". Am I thinking about this correctly?

Indeed. The only caveat is that you'd need to use a local_path type for your source.json files: https://bazel.build/external/registry#:~:text=The%20type%20can%20be%20changed%20to%20use%20a%20local%20path

@sluongng
Copy link
Contributor

sluongng commented Jun 5, 2023

@Wyverald The doc is giving me the impression that

When source.json type is local_path, then the registry must be hosted locally.

It does not imply the reverse which I think you are implying though:

When a registry is hosted locally, all the source.json in it must have type local_path

Am I understanding this wrongly? Because I was expecting that archive type would still work with a local registry.

@Wyverald
Copy link
Member

Wyverald commented Jun 5, 2023

You're right -- I didn't try to imply that you couldn't use the archive type with local registries. I just had the impression that Chuck was asking for a no-download, just-use-this-directory registry, but looking back now I might have misinterpreted his comment.

Wyverald added a commit that referenced this issue Mar 20, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.
Wyverald added a commit that referenced this issue Mar 20, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.
Wyverald added a commit that referenced this issue Mar 20, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.
Wyverald added a commit that referenced this issue Mar 20, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.
Wyverald added a commit that referenced this issue Mar 20, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.
copybara-service bot pushed a commit that referenced this issue Mar 26, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards #17880.

Closes #21749.

PiperOrigin-RevId: 619261497
Change-Id: Ibff3a311e1788f7d8089ca194933cfb5fcc53ee0
Wyverald added a commit that referenced this issue Mar 29, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. TODO: add more context

RELNOTES: Added a new `module_import()` directive to MODULE.bazel files. TODO: elaborate a bit more

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 4, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. TODO: add more context

RELNOTES: Added a new `include()` directive to MODULE.bazel files. TODO: elaborate a bit more

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 17, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. TODO: add more context

RELNOTES: Added a new `include()` directive to MODULE.bazel files. TODO: elaborate a bit more

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 17, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. TODO: add more context

RELNOTES: Added a new `include()` directive to MODULE.bazel files. TODO: elaborate a bit more

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 17, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. This directive can only be used by root modules; only files in the main repo may be included; variable bindings are only visible in the file they occur in, not in any included or including files. See the docs for `include()` (in `ModuleFileGlobals.java`) for more details.

RELNOTES: Added a new `include()` directive to MODULE.bazel files.

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 19, 2024
This new directive allows the root module to divide its MODULE.bazel into multiple segments. This directive can only be used by root modules; only files in the main repo may be included; variable bindings are only visible in the file they occur in, not in any included or including files. See the docs for `include()` (in `ModuleFileGlobals.java`) for more details.

RELNOTES: Added a new `include()` directive to MODULE.bazel files.

Fixes #17880.
Wyverald added a commit that referenced this issue Apr 22, 2024
They must end in `.MODULE.bazel`.

Follow-up for #17880
copybara-service bot pushed a commit that referenced this issue Apr 22, 2024
They must end in `.MODULE.bazel`.

Follow-up for #17880

Closes #22075.

PiperOrigin-RevId: 627136756
Change-Id: If9b1797f2e7ddc1aebd929646329e832288bfd8a
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Apr 25, 2024
ModuleFileGlobals used to serve as both the definition of Starlark methods available in MODULE.bazel files, *and* the state tracker for a module being built. This commit brings it more in line with other Starlark evaluation contexts where there's a simple object that just holds methods, and the actual state is stored in the StarlarkThread. ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It paves the way for us to allow root module MODULE.bazel files to import other files.

Work towards bazelbuild#17880.

Closes bazelbuild#21749.

PiperOrigin-RevId: 619261497
Change-Id: Ibff3a311e1788f7d8089ca194933cfb5fcc53ee0
github-merge-queue bot pushed a commit that referenced this issue Apr 25, 2024
ModuleFileGlobals used to serve as both the definition of Starlark
methods available in MODULE.bazel files, *and* the state tracker for a
module being built. This commit brings it more in line with other
Starlark evaluation contexts where there's a simple object that just
holds methods, and the actual state is stored in the StarlarkThread.
ModuleThreadContext is that state object.

This commit is a pure mechanical refactor with no behavioral changes. It
paves the way for us to allow root module MODULE.bazel files to import
other files.

Work towards #17880.

Closes #21749.

PiperOrigin-RevId: 619261497
Change-Id: Ibff3a311e1788f7d8089ca194933cfb5fcc53ee0

Commit
9facaaa

Co-authored-by: Xdng Yng <wyverald@gmail.com>
Wyverald added a commit that referenced this issue Apr 30, 2024
This new directive allows the root module to divide its `MODULE.bazel` into multiple segments. This directive can only be used by root modules; only files in the main repo may be included; variable bindings are only visible in the file they occur in, not in any included or including files. See the docs for `include()` (in `ModuleFileGlobals.java`) for more details.

In follow-ups, we'll need to address:
1. Enforcing the loaded files to have some sort of naming format (tentatively `foo.MODULE.bazel` where `foo` is anything)
2. Making `bazel mod tidy` work with included files

RELNOTES: Added a new `include()` directive to `MODULE.bazel` files.

Fixes #17880.

Closes #21855.

PiperOrigin-RevId: 627034184
Change-Id: Ifc2f616cf0791445daeeac9ca5ec4478e83382aa
Wyverald added a commit that referenced this issue Apr 30, 2024
They must end in `.MODULE.bazel`.

Follow-up for #17880

Closes #22075.

PiperOrigin-RevId: 627136756
Change-Id: If9b1797f2e7ddc1aebd929646329e832288bfd8a
Kila2 pushed a commit to Kila2/bazel that referenced this issue May 13, 2024
This new directive allows the root module to divide its `MODULE.bazel` into multiple segments. This directive can only be used by root modules; only files in the main repo may be included; variable bindings are only visible in the file they occur in, not in any included or including files. See the docs for `include()` (in `ModuleFileGlobals.java`) for more details.

In follow-ups, we'll need to address:
1. Enforcing the loaded files to have some sort of naming format (tentatively `foo.MODULE.bazel` where `foo` is anything)
2. Making `bazel mod tidy` work with included files

RELNOTES: Added a new `include()` directive to `MODULE.bazel` files.

Fixes bazelbuild#17880.

Closes bazelbuild#21855.

PiperOrigin-RevId: 627034184
Change-Id: Ifc2f616cf0791445daeeac9ca5ec4478e83382aa
Kila2 pushed a commit to Kila2/bazel that referenced this issue May 13, 2024
They must end in `.MODULE.bazel`.

Follow-up for bazelbuild#17880

Closes bazelbuild#22075.

PiperOrigin-RevId: 627136756
Change-Id: If9b1797f2e7ddc1aebd929646329e832288bfd8a
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.2.0 RC1. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.2.0rc1. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.