Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Toolchainize //scala:toolchain_type #1633

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

mbland
Copy link
Contributor

@mbland mbland commented Oct 23, 2024

Description

Moves the toolchain targets for //scala:toolchain_type to a new @io_bazel_rules_scala_toolchains repository as a step towards Bzlmodification. Part of #1482.

Motivation

Instantiating toolchains in their own repository enables module extensions to define the the repositories required by those toolchains within the extension's own namespace. Bzlmod users can then register the toolchains from this repository without having to import all the toolchains' dependencies into their own namespace via use_repo().


The scala_toolchains_repo() macro wraps the underlying repository rule and assigns it the standard name io_bazel_rules_scala_toolchains. Right now it's only instantiating the main Scala toolchain via the default scala = True parameter. Future changes will expand this macro and repository rule with more boolean parameters to instantiate other toolchains, specifically:

  • scalatest
  • junit
  • specs2
  • twitter_scrooge
  • jmh
  • scala_proto and scala_proto_enable_all_options
  • testing (includes all of scalatest, junit, and specs2)
  • scalafmt

WORKSPACE users will now have to import and call the scala_toolchains_repo() macro to instantiate
@io_bazel_rules_scala_toolchains.

load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load(
    "//scala:scala.bzl",
    "rules_scala_setup",
    "rules_scala_toolchain_deps_repositories",
    "scala_toolchains_repo",
)

rules_scala_setup()

rules_scala_toolchain_deps_repositories()

scala_toolchains_repo()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")

Or, they can use the new single scala_toolchains() macro to elide most of the calls:

load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load("//scala:scala.bzl", "scala_toolchains")

scala_toolchains()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")

This is what the corresponding MODULE.bazel setup would look like:

module(name = "rules_scala", version = "7.0.0")

scala_config = use_extension(
    "//scala/extensions:config.bzl", "scala_config"
)
scala_config.settings(scala_version = "2.13.14")

scala_deps = use_extension("//scala/extensions:deps.bzl", "scala_deps")
scala_deps.toolchains()

The register_toolchains() call in WORKSPACE isn't strictly required at this point, but is recommended. All the WORKSPACE files in this repo already registered their required toolchains using existing macros. However, they've been updated to use register_toolchains(...) instead.

These register_toolchains() calls maintain the order of previous toolchain registrations to avoid the following breakages:

  • test_compilation_fails_with_plus_one_deps_undefined from test/shell/test_compilation.sh depends on //scala:unused_dependency_checker_error_toolchain resolving first. This toolchains sets the scala_toolchain() parameters dependency_tracking_method = "ast-plus" and unused_dependency_checker_mode = "error", and the @io_bazel_rules_scala_toolchains//scala toolchains don't.

  • test_scala_binary_allows_opt_in_to_use_of_argument_file_in_runner_for_improved_performance from test/shell/test_scala_binary.sh depends on the use_argument_file_in_runner parameter of scala_toolchain being False. This is the default, but the @io_bazel_rules_scala_toolchains//scala toolchains explicitly set this to True instead.

In the Bzlmod case, the register_toolchains() call isn't necessary at all. This is because @io_bazel_rules_scala_toolchains includes one package per set of toolchains, and the rules_scala MODULE.bazel calls register_toolchains("@io_bazel_rules_scala_toolchains//...:all"). This will automatically register all configured rules_scala toolchains, while allowing users to override any of them using register_toolchains() in their own MODULE.bazel files.

Technically, the scala_deps.toolchains() call isn't required when only using the default scala = True parameter; the rules_scala MODULE.bazel will instantiate this automatically, too.

@mbland mbland force-pushed the bzlmod-toolchain-scala branch 2 times, most recently from 85924ce to bd2364a Compare October 25, 2024 23:36
@mbland
Copy link
Contributor Author

mbland commented Oct 25, 2024

I've rebased this PR onto master after #1635 landed, updated it substantially with two new commits, and adjusted the PR description.

The first new commit introduces the new scala_toolchains() macro, which will be the common funnel point between WORKSPACE and Bzlmod. However, that first commit doesn't add the macro to existing WORKSPACE calls. This demonstrates that the changes to rules_scala_setup and scala_repositories are compatible with existing API usage.

The second commit does replace as many WORKSPACE calls as possible with scala_toolchains() and register_toolchains(). This shows that the new API produces equivalent results, and proves a clear path towards adding the Bzlmod layer.

If you'd like to see the end state I have in mind for this macro, the scala_toolchains_repo() macro/rule, and the Bzlmod extension, check out the equivalent files in my Bzlmod working branch:

I've tested that all the following work with both the first commit and the second:

Once this change lands, toolchainizing the remaining toolchains should prove very straightforward. Then Bzlmod is possibly only PR away after that.

@simuons
Copy link
Collaborator

simuons commented Nov 4, 2024

Hi, @mbland, I just want to let you know that I'm taking a look at this. Whole last week I was away, sorry for delay. Give me couple of days on this. I want to understand how final api in bzlmod would look like.

@mbland
Copy link
Contributor Author

mbland commented Nov 4, 2024

@simuons No worries. Thanks for the heads-up.

FWIW, now that I've got Bzlmod and WORKSPACE working together in my Bzlmod working branch, you can compare each WORKSPACE/MODULE.bazel pair in that branch. The repos under examples/ are probably the best representation of what I'd expect most users to end up with. The dt_patches/test_dt_patches_user_srcjar/MODULE.bazel file shows how it looks to register compiler srcjars.

Copy link
Collaborator

@simuons simuons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @mbland, thanks for PR. Please take a look at comments/questions.

Sorry it's taking too long, past two weeks were busy and PR itself is not trivial (at least for me).

dt_patched_compiler_setup(scala_version, scala_compiler_srcjar)
dt_patched_compiler_setup(scala_version, srcjars.get(scala_version))

compiler_sources_repo(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates compiler_sources_repo below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh! Merge fail. Fixed. (Or, will be once I get through all the comments and push an update.)

@@ -4,10 +4,11 @@ load("@io_bazel_rules_scala//scala:scala_cross_version.bzl", "version_suffix")
def scala_register_toolchains():
for scala_version in SCALA_VERSIONS:
native.register_toolchains(
"@io_bazel_rules_scala//scala:toolchain" + version_suffix(scala_version),
"@io_bazel_rules_scala_toolchains//scala:toolchain" +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't @io_bazel_rules_scala_toolchains//scala guaranteed to have a toolchain for each configured scala version? If so then maybe this macros should do
register_toolchains("@io_bazel_rules_scala_toolchains//...:all") as in workspaces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, you're right. Done.

@@ -165,8 +170,26 @@ def rules_scala_setup(scala_compiler_srcjar = None):
url = "https://github.com/bazelbuild/rules_proto/releases/download/6.0.2/rules_proto-6.0.2.tar.gz",
)

if setup_compiler_sources:
srcs = {version: scala_compiler_srcjar for version in SCALA_VERSIONS}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this slipped with previous PRs but for me it looks like same compiler sources might be used for multiple scala versions. Probably separate issue need to be open.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I noticed that, but this technically maintains the previous behavior. The new _setup_scala_compiler_sources function underlying it doesn't have this problem. That implementation could technically replace this one; let me know what you'd prefer, but that would be a breaking change (or maybe just a bugfix?).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I shouldn't bring this up. Let's leave this as is. It will either go away with bzlmod or will be fixed in a separate PR.

if load_dep_rules:
rules_scala_setup()
# When `WORKSPACE` goes away, so can this case.
rules_scala_setup(setup_compiler_sources = False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be better to split rules_scala_setup into two macros: load_rules_dependencies and setup_scala_compiler_sources.
So that rules_scala_setup would call both of these macros. And here load_rules_dependencies would be called conditionally and setup_scala_compiler_sources unconditionally.
It looks a bit odd that half of the macro is disabled and that disabled part is called immediately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree, but that would be another break from the existing API. But if you definitely want it now, I'll do it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make myself clear. What I meant was

def rules_scala_setup(scala_compiler_srcjar = None):
  load_rules_dependencies()
  setup_scala_compiler_sources()

So that existing clients of rules_scala_setup won't break

And here have

if load_dep_rules:
   load_rules_dependencies()
setup_scala_compiler_sources(scala_compiler_srcjars)

This would add two new macros to the api but it would be easier to follow the code (at least for me). Unless I completely misunderstood what you mean by "break from the existing API"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're so right. I wasn't reading closely enough or thinking clearly enough before. This is definitely better, and doesn't break the existing API at all. Done.

fetch_sources = False,
validate_scala_version = True,
scala_compiler_srcjars = {},
scala = True):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a case when someone would like to call this macro with scala = False? Maybe this macro could be simplified by removing this arg along with num_toolchains and conditionals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time. I thought that would be within the realm of possibility. But now, it does seem silly. Removed.

@simuons
Copy link
Collaborator

simuons commented Nov 8, 2024

I just want to share my ideas/concerns/struggles I've got while looking at this PR (reason it took longer than needed to review). By no means these are blockers or something that needs to be addressed in this PR:

  1. Number of internal repositories keeps growing. Maybe these could be merged in bzlmod (probably not possible in workspace). For example I have a feeling that scala_config repository should be merged into toolchains as those are closely related. Maybe this is not an issue at all.
  2. There are too many scala toolchains. Comment about toolchains ordering is concerning. Issue is that features orthogonal to scala version are configured on toolchain. Ideally there should be single toolchain per scala version. Maybe I'll continue exploration on Configure rules behaviour/toolchain attributes via build settings #1570

Maybe it's a wrong place/time for this, but I just wanted to share so I won't forget it. Or maybe I just want to get second opinion on this :)

@mbland
Copy link
Contributor Author

mbland commented Nov 8, 2024

I just want to share my ideas/concerns/struggles I've got while looking at this PR (reason it took longer than needed to review). By no means these are blockers or something that needs to be addressed in this PR:

No worries! It's important to get this right.

BTW, I'm replying now as I run the full test suite again. Will push my updates and ping when that's all done.

  1. Number of internal repositories keeps growing. Maybe these could be merged in bzlmod (probably not possible in workspace). For example I have a feeling that scala_config repository should be merged into toolchains as those are closely related. Maybe this is not an issue at all.

I've wondered about this, but haven't reached a conclusion. scala_config defines constants that are then imported everywhere. It may require a significant refactor to accommodate a different schema.

  1. There are too many scala toolchains. Comment about toolchains ordering is concerning. Issue is that features orthogonal to scala version are configured on toolchain. Ideally there should be single toolchain per scala version. Maybe I'll continue exploration on Configure rules behaviour/toolchain attributes via build settings #1570

Yeah, I haven't thought about that specifically, but I see what you mean, like the unused deps toolchain. That said, I don't think this change introduces any new toolchains from what was already present; it just repackages them to enable a proper Bzlmod API.

Maybe it's a wrong place/time for this, but I just wanted to share so I won't forget it. Or maybe I just want to get second opinion on this :)

No worries! Good to start thinking these through as the notion arises.

mbland added a commit to mbland/rules_scala that referenced this pull request Nov 8, 2024
- Removes an extraneous `compiler_sources_repo` call.

- `scala_toolchains` will now _always_ create the main Scala toolchain
  repository (there's no longer an option to turn it off).

- Registers `@io_bazel_rules_scala_toolchains//...:all` in
  `scala_register_toolchains.
@mbland
Copy link
Contributor Author

mbland commented Nov 8, 2024

OK, everything's updated, tests are all passing, and I've pushed the changes to the repo. Ready for the next round!

mbland added a commit to mbland/rules_scala that referenced this pull request Nov 11, 2024
- Removes an extraneous `compiler_sources_repo` call.

- `scala_toolchains` will now _always_ create the main Scala toolchain
  repository (there's no longer an option to turn it off).

- Registers `@io_bazel_rules_scala_toolchains//...:all` in
  `scala_register_toolchains.
@simuons
Copy link
Collaborator

simuons commented Nov 12, 2024

Hi, @mbland, overall looks good. Let's settle on rules_scala_setup question and I think we are good to go.

mbland added a commit to mbland/rules_scala that referenced this pull request Nov 12, 2024
Requested by @simuons in bazelbuild#1633 to make `rules_scala_setup` and
`scala_repositories` more readable while maintaining the existing APIs.
@mbland
Copy link
Contributor Author

mbland commented Nov 12, 2024

@simuons In addition to extracting load_rules_dependencies, I just noticed that I could restore all those setup_scala_toolchains() calls in the WORKSPACE files, since we updated that macro to call native.register_toolchains("@io_bazel_rules_scala_toolchains//...:all").

If you want, I'm happy to update this PR accordingly, do it in another PR, or let it go (since all these updates are internal, existing users probably won't notice, and we're introducing the scala_toolchains macro that'll require a documentation update anyway).

Moves the `toolchain` targets for `//scala:toolchain_type` to a new
`@io_bazel_rules_scala_toolchains` repository as a step towards
Bzlmodification. Part of bazelbuild#1482.

Instantiating toolchains in their own repository enables module
extensions to define the the repositories required by those toolchains
within the extension's own namespace. Bzlmod users can then register the
toolchains from this repository without having to import all the
toolchains' dependencies into their own namespace via `use_repo()`.

---

The `scala_toolchains_repo()` macro wraps the underlying repository rule
and assigns it the standard name `io_bazel_rules_scala_toolchains`.
Right now it's only instantiating the main Scala toolchain via the
default `scala = True` parameter. Future changes will expand this
macro and repository rule with more boolean parameters to instantiate
other toolchains, specifically:

- `scalatest`
- `junit`
- `specs2`
- `twitter_scrooge`
- `jmh`
- `scala_proto` and `scala_proto_enable_all_options`
- `testing` (includes all of `scalatest`, `junit`, and `specs2`)
- `scalafmt`

---

`WORKSPACE` users will now have to import and call the
`scala_toolchains_repo()` macro to instantiate
`@io_bazel_rules_scala_toolchains`.

```py
load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")

scala_config()

load(
    "//scala:scala.bzl",
    "rules_scala_setup",
    "rules_scala_toolchain_deps_repositories",
    "scala_toolchains_repo",
)

rules_scala_setup()

rules_scala_toolchain_deps_repositories()

scala_toolchains_repo()

register_toolchains("@io_bazel_rules_scala_toolchains//...:all")
```

This is what the corresponding `MODULE.bazel` setup would look like:

```py
module(name = "rules_scala", version = "7.0.0")

scala_config = use_extension(
    "//scala/extensions:config.bzl", "scala_config"
)
scala_config.settings(scala_version = "2.13.14")

scala_deps = use_extension("//scala/extensions:deps.bzl", "scala_deps")
scala_deps.toolchains()
```

The `register_toolchains()` call in `WORKSPACE` isn't strictly required
at this point, but is recommended. However, all the `WORKSPACE` files in
this repo already register their required toolchains using existing
macros, which have been updated in this change.

In fact, calling `register_toolchains()` right after
`scala_toolchains_repo()` as shown above breaks two tests that depend on
the existing `WORKSPACE` toolchain registration:

- `test_compilation_fails_with_plus_one_deps_undefined` from
  `test/shell/test_compilation.sh` depends on
  `scala_register_unused_deps_toolchains()` setting up its toolchain to
  resolve first. `//scala:unused_dependency_checker_error_toolchain`
  sets the `scala_toolchain()` parameters `dependency_tracking_method =
  "ast-plus"` and `unused_dependency_checker_mode = "error"`, and the
  `@io_bazel_rules_scala_toolchains//scala` toolchains don't.

- `test_scala_binary_allows_opt_in_to_use_of_argument_file_in_runner_for_improved_performance`
  from `test/shell/test_scala_binary.sh` depends on the
  `use_argument_file_in_runner` parameter of `scala_toolchain` being
  `False`. This is the default, but the
  `@io_bazel_rules_scala_toolchains//scala` toolchains explicitly set
  this to `True` instead.

In the Bzlmod case, the `register_toolchains()` call isn't necessary at
all. This is because `@io_bazel_rules_scala_toolchains` includes one
package per set of toolchains, and the rules_scala `MODULE.bazel` calls
`register_toolchains("@io_bazel_rules_scala_toolchains//...:all")`. This
will automatically register all configured rules_scala toolchains, while
allowing users to override any of them using `register_toolchains()` in
their own `MODULE.bazel` files.

Technically, the `scala_deps.toolchains()` call isn't required when only
using the default `scala = True` parameter; the rules_scala
`MODULE.bazel` will instantiate this automatically, too.
Extracted a single `scala_toolchains` macro to share between `WORKSPACE`
and the `deps.bzl` module extension. This will make it easier to ensure
`WORKSPACE` compatibility, and to add a Bazel module extension as a thin
layer on top. Part of bazelbuild#1482.

This change includes updates to `rules_scala_setup` and
`scala_repositories` to support this, while preserving compatibility
with existing `WORKSPACE` calls. The next commit will replace many
existing `WORKSPACE` calls with `scala_toolchains`.
Also added `@io_bazel_rules_scala_toolchains//...:all` to
`register_toolchains()` calls everywhere, even when not specifically
necessary. This proves the mechanism is safe and works with `WORKSPACE`
now, and will make future updates to consolidate other toolchains less
noisy.
Should've been included in the previous commit. All tests still pass.
Missed this one in third_party/test/proto earlier. Also removed
unnecessary `USE_BAZEL_VERSION` expression in
test_scala_proto_library.sh. If `USE_BAZEL_VERSION` is set, it takes
precedence to begin with, and third_party/test/proto/.bazelversion
exists now after bazelbuild#1629.
This turns out to be helpful when adapting
`test_version/WORKSPACE.template` to the toolchainized version of
`twitter_scrooge` for testing. In this case, we want `twitter_scooge` to
be in its own customized repo, separate from the one generated by
`scala_toolchains`.

This seems like it might be generally useful when writing module
extensions for alternative toolchains.
- Removes an extraneous `compiler_sources_repo` call.

- `scala_toolchains` will now _always_ create the main Scala toolchain
  repository (there's no longer an option to turn it off).

- Registers `@io_bazel_rules_scala_toolchains//...:all` in
  `scala_register_toolchains.
Requested by @simuons in bazelbuild#1633 to make `rules_scala_setup` and
`scala_repositories` more readable while maintaining the existing APIs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants