Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Toolchain deps infra #1072

Merged
merged 12 commits into from
Aug 3, 2020
Merged

Conversation

liucijus
Copy link
Collaborator

This is the first actual PR from Deps Toolchains Infrastructure (#1067)

  • Adds reusable infra to develop toolchains for deps
  • Adds docs
  • Minimizes deps required for minimal Scala setup

Opinionated decisions (welcome to discuss!):

  • common deps are added on existing scala toolchain (proposal had a separate deps toolchain) - this is done for simplicity
  • guava (moved to scrooge setup) and commons io were removed. The reusability gain was minimal, but has added more complexity for users. cc @blorente

Question: maybe we also want to migrate ScalacProvider to DepsInfo map?

N.B. This is a breaking change for users who do not use repositories macros

@liucijus liucijus requested a review from ittaiz as a code owner July 21, 2020 14:40
@ittaiz
Copy link
Member

ittaiz commented Jul 22, 2020

Thanks for this! I think there's a lot of sense in having a unified mechanism.
Not sure if there are cons from moving ScalacProvider itself.
WDYT about sending a commit that moves it and we can understand from it if there are cons?

@@ -0,0 +1,153 @@
# Developing Toolchains for Deps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@katre is there a chance you can take a look at this toolchains pattern?
I think this might be valuable for rulesets in general and not only rules_scala.
If you think we're abusing toolchains or just misunderstanding it we'd also love to know that.
One concrete question is whether using a map here is the best solution?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over the PR and the docs, and I'm afraid I don't fully understand what's being implemented. You are adding dependencies to the toolchain that will then also be provided to other, non-toolchain-aware rules, via the normal transitive dependencies? Can you give a concrete example of where this would be useful?

As for the map, I am also confused about why it is a map from a label (which seems to always be a declare_deps_provider) to a string (which is a unique provider id?). Why this direction and not a map of provider id to label?

Lastly, if you are updating your toolchains, have you seen the design for Toolchain Transitions, including the migration steps? This will allow you to declare some toolchain dependencies as being in the target config (for the target being built with the toolchain), and others in the exec config, which may be useful with these deps.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over the PR and the docs, and I'm afraid I don't fully understand what's being implemented. You are adding dependencies to the toolchain that will then also be provided to other, non-toolchain-aware rules, via the normal transitive dependencies? Can you give a concrete example of where this would be useful?

Example: scalac target is java_binary which isn't aware of scala_toolchain.

As for the map, I am also confused about why it is a map from a label (which seems to always be a declare_deps_provider) to a string (which is a unique provider id?). Why this direction and not a map of provider id to label?

I haven't found a type for string to label mapping (https://docs.bazel.build/versions/master/skylark/lib/attr.html). I don't like having inverted dict as a compromise, but I failed to find a better alternative.

Lastly, if you are updating your toolchains, have you seen the design for Toolchain Transitions, including the migration steps? This will allow you to declare some toolchain dependencies as being in the target config (for the target being built with the toolchain), and others in the exec config, which may be useful with these deps.

I've looked at toolchain transitions, but I do not see how it fits into our case. Are there good examples to learn from?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If scalac is a java_binary, why doesn't it have a complete set of deps? Can it only be built as a dependency of a scala target? Or have I misunderstood the build graph?

Instead of a string_to_label dict, the typical approach is to use a label_list, and when you get the DepInfo providers (is that the right provider from this attribute?) from the attribute, each DepsInfo has the provider id. Are you ever actually using the same target as different provider ids in different toolchains?

Toolchain transitions are new, and we're still in the migration period. The basic idea is that it allows your toolchain to separate out the tool dependencies (like scalac), which need to run on the execution platform, from your library dependencies (which are actually linked into your final target), and which need to be in the target configuration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move provider id to DepsInfo. I don't like this approach, but it is semantically more correct. Mapping ids to labels feels more declarative and would be good for mapping the same targets more than once, but inverted map does not allow the same label twice anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I agree that the previous labe->id map was odd. Feel free to reach out to the Starlark API team on bazel-users and ask about this, but I suspect the answer will be that the label_list system you have is more "bazel-y".

In general, I really like this approach for solving the problem of adding transitive deps to things that consume scala targets. I think linking future toolchain authors to your docs will definitely be useful. Would it be possible for the author to also write a blog post or some other article with a high-level discussion to point other rule authors to? I realize this is even more work, but it could really help the community.

@liucijus
Copy link
Collaborator Author

I have migrated scalac_provider_attr to dep_providers, but come to conclusion that leaving concept of ScalacProvider is quite useful abstraction, so I left it inside phases.

@ittaiz
Copy link
Member

ittaiz commented Jul 24, 2020

have migrated scalac_provider_attr to dep_providers, but come to conclusion that leaving concept of ScalacProvider is quite useful abstraction, so I left it inside phases.

Can you elaborate on why?

@liucijus
Copy link
Collaborator Author

have migrated scalac_provider_attr to dep_providers, but come to conclusion that leaving concept of ScalacProvider is quite useful abstraction, so I left it inside phases.

Can you elaborate on why?

Mostly to have less changes in current phases. Also scala deps have "special" meaning in scala rules which are design around them. Maybe this needs to be redesigned but I would prefer to do it outside toolchains scope, which is already quite big.

@ittaiz
Copy link
Member

ittaiz commented Jul 25, 2020

To make sure I understand- the problem is only priorities and breaking down tasks? I'm treating scalac provider as another validation of your design and want to understand if we got negative signals

@liucijus
Copy link
Collaborator Author

I have refactored dep_providers to label list

@liucijus
Copy link
Collaborator Author

To make sure I understand- the problem is only priorities and breaking down tasks? I'm treating scalac provider as another validation of your design and want to understand if we got negative signals

The question is if we want to change current phases. Now there's single phase which brings three depsets with ScalacProvider. Maybe it makes sense to split it into three new phases per depset and get rid of ScalacProvider. I didn't do it as I have no strong opinion about it and it was very easy to hook into existing phase. In general it's no a complicated change, but will touch quite a few places in the current code.

Copy link
Member

@katre katre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tagging me in, this is really interesting to see.

"default_repl_classpath": attr.label_list(allow_files = True),
"default_macro_classpath": attr.label_list(allow_files = True),
"deps": attr.label_list(allow_files = True),
"depset_id": attr.string(mandatory = True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this is actually somehow a depset, I would suggest renaming.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will rename it

@@ -0,0 +1,153 @@
# Developing Toolchains for Deps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I agree that the previous labe->id map was odd. Feel free to reach out to the Starlark API team on bazel-users and ask about this, but I suspect the answer will be that the label_list system you have is more "bazel-y".

In general, I really like this approach for solving the problem of adding transitive deps to things that consume scala targets. I think linking future toolchain authors to your docs will definitely be useful. Would it be possible for the author to also write a blog post or some other article with a high-level discussion to point other rule authors to? I realize this is even more work, but it could really help the community.

declare_deps_toolchain(
name = "my_toolchain_impl",
dep_providers = {
":my_compile_deps_provider": "compile_deps",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is outdated

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching, I will update docs

@ittaiz
Copy link
Member

ittaiz commented Jul 31, 2020

seems as lint fails

@liucijus liucijus force-pushed the toolchain-deps-infra branch from 07f87cf to 8e6375a Compare August 3, 2020 15:05
@liucijus
Copy link
Collaborator Author

liucijus commented Aug 3, 2020

@ittaiz, I have fixed lint error and issues mentioned in the comments

@ittaiz ittaiz merged commit eabb1d2 into bazelbuild:master Aug 3, 2020
@liucijus liucijus deleted the toolchain-deps-infra branch August 4, 2020 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants