Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On reconsidering the compiler args API #396

Closed
artem-zinnatullin opened this issue Nov 17, 2020 · 18 comments
Closed

On reconsidering the compiler args API #396

artem-zinnatullin opened this issue Nov 17, 2020 · 18 comments

Comments

@artem-zinnatullin
Copy link

artem-zinnatullin commented Nov 17, 2020

We wanted to raise an issue about the design approach rules_kotlin currently uses for compiler args before a release of the rules is cut.

Over the last few weeks we’ve seen many PRs adding each Kotlin compiler arg as a separate Starlark rule attribute:

After internal discussion at Lyft, we (@artem-zinnatullin, @Bencodes and @keith) all agreed that this is a questionable approach for a number of reasons and we’d like to raise an issue to reconsider this before it’s too late:

  • Versioning: args get added and removed to/from upstream kotlinc, tracking each of them as a Starlark attribute makes rules_kotlin tightly coupled with kotlinc versions, yet rules_kotlin allows overriding kotlinc version/binary.
  • Delaying the users: if rules supported list of free args, users wouldn’t have to wait on upstream for someone to submit a PR, for someone to review it and then for a new stable release cut with the new args
  • Naming: some of the args have different names in the Starlark attributes, there are debates in the PRs on how to name them, all of that doesn’t change the actual kotlinc arg name yet adds complexity to the discoverability of what means what in the Bazel configuration vs kotlinc docs
  • Complexity of the rules: tracking each arg as a Starlark attribute, adding docs to them, trying to identify incompatible combinations adds complexity to the rules_kotlin with little benefit: kotlinc is still the ultimate validator that might change its validation logic between versions. Moreover, rules_kotlin can still inspect the free args list if that’s a requirement.

Most other rules use free compiler args instead of separate Starlark attributes:

We do think that there are few things that are more complex in the form of free compiler args, notably: compiler plugins and friends (friendly modules for the internal visibility modifier).

They should probably be kept as separate Starlark attributes, similarly to how we pass deps as Bazel labels, rather than free compiler args.

It’s also worth noting, that there were few similar discussions in PRs/issues, so we’re linking previous art for reference:

With all this in mind, we’d like to ask maintainers and contributors of the rules_kotlin to reconsider current design of compiler args API and have a peaceful, open minded discussion before a stable release is cut.

Thanks.

GitHub SEO tags: compiler args, compiler flags, free args, kotlinc args, kotlinc flags.

cc-ing few people, who have been seen in similar discussions (sorry for spam, just not sure everyone monitors new issues):

@cgruber
Copy link
Collaborator

cgruber commented Nov 17, 2020

These are some valid concerns, and there are definite tradeoffs to the explicit approach we're presently taking. To an extent we're extending a long-standing policy in these rules, from before I was the maintainer, largely because we never got to the point of having enough dedicated users who are also experienced enough with bazel to hash this out. I think the discussion is timely, and it isn't to late to discuss and set direction here. Given that we had quite a bit of change, little of that are going into the legacy-1.4.0-release. The big batches of new change are all going in under a forthcoming 1.5.0 alpha, which is NOT stable, and we can figure all of this out.

It seems to me that we have a few major dimensions to choose. The present rules occupy a position on each of these, but we can consider each dimension somewhat separately.

  • Scope of the various compiler args
    • toolchain-wide
    • per-target
  • Flexibility of compiler args
    • free-form
    • locked down and limited
    • something stricter than freeform, but maybe extensible
  • Are all compiler arguments to be treated in the same system?
    • Some compiler args make sense repo-wide, some not
    • Are compiler args and plugins separate systems? Assuming they are, can we free-form without resulting in people free-forming conflicting flag sets with what are generated from plugin definitions.
    • Is there a place for strictness in the default mechanism, but with a "void the warranty" hedge-feature that lets developers crack the seal and do what they need to while the more robust stricter structure is working its way through approval.

Plus a few other questions:
- To what extent does our having a flaky release schedule and best-effort nature impact these questions. That is, if we tightened things up so we released frequently, with unstable and stable versions fast enough, would that reduce the draw for free-form?

Some of the above is asked in @artem-zinnatullin's post - namely do we separate plugins and friends (I'd say yes - I don't think we want manually managed classpaths anywhere near bazel) but that really impacts the shame of how we expose flags.

The existing approach is intended to be conservative - it's easier to loosen later than tighten later. But now that the question is posed, let's see if we can't find the "right" APIs here.

@cgruber
Copy link
Collaborator

cgruber commented Nov 17, 2020

I'm sure I've missed some dimensions - I just want to push us to try to separate concerns cleanly.

@artem-zinnatullin
Copy link
Author

Thanks for listing important parts of the equation, I'm a bit lacking on the energy levels to support full-blown discussion, so I kind of expect @Bencodes and @keith to back me up here, but I have some thoughts on "Scope of the various compiler args":

I think we should support both toolchain-wide and per-target args, similarly to how rules_java does it: https://docs.bazel.build/versions/master/be/java.html#java_toolchain.javacopts and https://docs.bazel.build/versions/master/be/java.html#java_library.javacopts

If users start complaining about args concatenation, we can add an attribute to override the args rather than concatting target-level with toolchain-level args, but I think starting with what rules_java is good enough, macros and multi-toolchain already give a lot of flexibility.

@cgruber
Copy link
Collaborator

cgruber commented Nov 17, 2020

One thing to keep in mind is that at least Corbin and I saw a lot of misuse of the free availability of java args in the google codebase, and that's definitely something we want to pay attention to. That's not an argument so much as a caution, and whatever we decide should decidedly take a position on not only best uses, but how we plan to push users towards those best uses (that doesn't rely on docs and good intentions).

This becomes especially big when you start having to maintain large megarepos/monorepos. Doing migrations when people have done per-target flags becomes a disaster. So having flexibility is fine, but ideally piping it through some sort of option-rule so you can have managed sets of options and do refactoring and discovery.

@cgruber
Copy link
Collaborator

cgruber commented Nov 17, 2020

(And the above comment is one source of the abundance of caution that informed the conservative approach - there are ways to open up the flexibility, but in managed way - I'd like to find those)

@Bencodes
Copy link
Collaborator

One thing to keep in mind is that at least Corbin and I saw a lot of misuse of the free availability of java args in the google codebase, and that's definitely something we want to pay attention to.

This is definitely something that we'd like to prevent in our codebase as well, but I think by solving this in the rules we are making a lot of assumptions about the uses cases that people are going to have for rules_kotlin. I'd much rather see people defining macro abstractions for these rules where they enforce the compiler flags that make sense for their codebase (or blocking them).

Some of the above is asked in @artem-zinnatullin's post - namely do we separate plugins and friends (I'd say yes - I don't think we want manually managed classpaths anywhere near bazel) but that really impacts the shame of how we expose flags.

I definitely agree that given the complexity of plugins and friends, we probably want to provide dedicated APIs for those.

@jeffzoch
Copy link
Contributor

I think there is room for compromise as I dont see all options as equal when it comes to kotlinc. For example, -X options (advanced options) change very frequently and are not as stable (nor as common) as flags like -warn. Perhaps we can stabilize all the "stable" flags as we are now but allow free-form -X flags?

@restingbull
Copy link
Collaborator

restingbull commented Nov 17, 2020

First, let me mention the concerns around versioning. This is a larger issue that hasn't been addressed at this time. Upgrading kotlinc 1.2 -> 1.3 required worker code changes, and it appears 1.4 will require some subtler changes in the worker (kapt now handles long arg encoding, for example) Ideally, rules_kotlin would use versioned toolchains rather than arbitrary binaries that may or may not work with the kotlin worker. (I haven't tried using the 1.2 K2JVMCompiler with the current worker. It might work. It might not.)

That said, here is the breakdown of the reasoning for different methods of expressing compiler flags.

The compiler options have the following design goals:

  • rules development: how much do compiler flags add additional burden rules_koltin maintenance development?
  • repository maintenance: how hard is it to make large scale code changes (LSC) across one or more large code bases.
  • "principal of the least surprise": a subjective measurement of "WTF?" per minute when attempting to accomplish day to day development tasks.

Additionally, I am in the midst of reworking the worker for performance. Given the addition of multiplex workers, bringing kotlin performance inline with Gradle appears possible. This means many of the -X flags can be considered potentially problematic -- they provide a surprisingly large amount of control over the compilation process and resulting bytecode.

The design for options is not finalized (hence the limitation to the toolchain, which makes api changes less painful) and still subject to evolution. Depending on the presented use cases, I expect there will be some cleanup before an official announcement. The number of PRs that appear for compiler options is indicative of an unmet need in the rules -- I expect, once we have accumulated the list of flags that are actually used, to drop off until the next version.

Known Designs and Trade-offs

Compiler Flags via Options Rule

Compiler flags are translated into a kt_*_options rule. Currently, these rules are unversioned and not language restricted (beyond javac vs kotlinc).

Prior Art: https://kotlinlang.org/docs/reference/using-gradle.html#compiler-options

Benefits

Development
LSC

These become more relevant when options are allowed on specific rules.

  • Provides a buildozer friendly interface for adding and removing options at scale.
  • Visibility and platform constraints can be used to provide "blessed" flag sets.
  • Auditing flag usage is query friendly.
Least Surprise aka end user
  • Properly written documentation can provide guidance about best practices. It should link to the compiler documentation where applicable.
  • Conflicting/nonsensical option combinations, for example -nowarn and -Werror, can be simplified
  • All supported flags are immediately visible. Given that many of the flags are experimental, or inappropriate (like -Xuse-javac, which is rendered useless by using the Bazel java toolchains.)
  • Flag format errors are discovered at analysis time

Detriments

Development
  • New flags must be directly introduced, validated, and maintained
LSC
  • insert detriments as discovered
Least Surprise aka end user benefits
  • New flags take longer to be made available
  • Compiler binary and options can be out of sync
  • Current implementation does not distinguish between JS, JVM, and Native

Compiler Flags via Attributes on kt_toolchain

Prior Art: https://docs.bazel.build/versions/master/be/java.html#java_library.javacopts

Given the number attributes already on kt_toolchain, this was not completely considered. It offers many of the benefits above, including making the connection between kotlinc and the options more direct.

The detriments are most in starlark code maintenance and difficult to read documentation (mixes compiler configuration with the rest of the toolchain dependencies.)

Compiler options via string_list or string

Benefits

Development
  • Minimal maintenance for flags support
  • Delegate flag validation to kotlinc
LSC
  • insert benefits here as discovered
Least Surprise aka End User:
  • Single source of documentation for compiler plugins.
  • New flags are available upon compiler update.

Detriments

Development
  • Inappropriate flags must be validated and readable error messages returned to avoid strange compiler failures. This is necessary as flag and worker interactions can be problematic.
  • Maintain a flag parser when using the richer K2JVMFlags api.
  • Merging toolchain flags and rule flags can produce obscure error messages from the compiler.
  • Certain flags may affect silently performance, which can result in difficult to diagnose bug reports.
LSC
  • Invalid flags are discovered at compile time, rather than analysis or even pre-analysis.
  • Auditing flag usage requires both queries and string processing.
  • Modifying flags requires string manipulation via regex or or custom buildozer tooling
  • Backsliding (new changes introduced during a migration) is difficult to prevent.
  • Organizing sets of configuration requires macros or similar machinery, whic adds to complexity of managing a build system.
Least Surprise aka End User
  • Nonsensical flag combinations are not reported on unless the compiler complains.
  • Incorrect flags are mixed in with other compile errors
  • Deviates from the Bazel standard of passing free text to the compiler.

I expect that folks have additional proposals -- I leaned on expediency of a constrained approach that can be loosened up. As it stands, any rule implementing the providers can be used.

@restingbull
Copy link
Collaborator

As a side note -- Macro abstractions are a terrible long term strategy for repository management.

  • They don't work well with builddozer (all you need is a macro inside a macro, or some string manipulation) and can have ugly performance. We had an issue with macros that took 30 seconds of analysis time by globbing directories, then globbing more directories.
  • They don't have constraints for environments or visibility. You get a select, and you can't look inside it.
  • They can really, really obfuscate a BUILD file. There is a very real reason that bazel query --output build exists -- it;s the joy of looking a build file and seeing one macro invocation. That produces 3k targets.

They are fine in moderation, but not as a maintenance strategy.

@artem-zinnatullin
Copy link
Author

Abstracting flags from the command line allows for dramatic changes to the underlying worker structure (e.g. #390 or adopting the strategies used in the kotlin compiler daemon.) In some cases, a flag may be rendered irrelevant due to breaking a compilation into phases. It is conceivable there may be a class of flags that should never be used -- friend or compiler plugins related ones come to mind

@restingbull just to be clear, do you mean that some combinations of compiler args can't be used against same instance of kotlinc persistent worker?

If that's the case, I think inspecting the free compiler args list can still do the trick and I don't see how it's different from inspecting particular Starlark attributes.. 🤔

Sorry if I'm asking wrong questions or suggesting wrong things, everything I post is with good intention and my best understanding of the problem

@restingbull
Copy link
Collaborator

@artem-zinnatullin some flags can't be used at all -- -Xuse-javac would really muck up some of the compilation caching (also disables the ir backend), not mention it's useless if there are no .java files. Considering the proliferation of experimental and possible bad flag combinations, K2JVMCompiler is a real beast. Unlike javac, kotlinc is changing fast. It's not quire mature.

Let's look at the requirements for validating and merging flags:

Free text

Validation of disallowed flags
If it is done Starlark offers startwith, endswith, and in for string introspection. This is inherently messy and imprecise, and will most likely result in an inclusion list rather than exclusion list, leading to trial and error or voluminous documentation.

Done in the worker would have an additional flag parsing step, and happens during compilation. This is a subpar user experience, as folks blame the rules not flags in the toolchain.

Merging toolchain and rule flags
Would need to be done in Starlark, as delegating to the worker results in passing extra-arguments and a poor error reporting (errors would appear as compiler errors not configuration errors. If macros are used, finding the contributing function could difficult. This is ancedotal evidence from maintaining the android rules.)

Given the rich nature of some flags, the parsing logic will be non-trivial.

Leaning on kotlinc for validation of conflicting flags can producing confusing messages, or worse, no messages and unexpected behaviour..

Structured starlark

Validation
Unwanted flags/combinations can't be set.

Merging
Conditionals for each type, expanded into appropriate types. Errors reported to the appropriate attribute, linked to a rule.

What are the benefits to free text args? I attempted to cover the all the upsides I could think of, but I'm biased.

The primary use-case for arbitrary untyped arguments, that I can think of, is experimentation with alpha and beta compilers. Given that, with some restructuring and a separate worker, we could roll an experimental toolchain that acts a wrapper for the vanilla kotlinc. It wouldn't have compilation optimizations, but would allow a greater latitude. A "void the warranty" feature as Christian calls it.

@restingbull
Copy link
Collaborator

(point of process: I will not check this thread till Friday, due to other commitments.)

@jeffzoch
Copy link
Contributor

Another alternative I thought of: What if we continued down the path we are going down but allowed an escape hatch? If there was an "unsafe_arg" list where you could enable compiler flags using a free list of strings for those flags which arent yet in this repo it would allow the flexibility many want while still being slightly more auditable. Its not perfect by any stretch but at least forces people to contend with the inherent unsafety these args present.

@artem-zinnatullin
Copy link
Author

-Xuse-javac would really muck up some of the compilation caching (also disables the ir backend), not mention it's useless if there are no .java files

Well, there are args like these that mess with build systems in every other compiler, in my opinion if user consciously opted in to use such an arg — it's on them to troubleshoot it. We can add docs warning users about few args like this.

@cgruber
Copy link
Collaborator

cgruber commented Nov 18, 2020

Given that dagger configuration has been brought up, it occurs to me that some subset of the issue here relates to annotation processing arguments, which despite being passed in as compiler args, are a different beast (and also aren't adequately supported in rules_kotlin at present). But I would like to factor those out of this discussion, because:

  1. They are string-to-string pass-through properties consumed ad-hoc by annotation processors, much like System.properties.
  2. We need to carefully manage the weird interop cases that come up in mixed codebases. (e.g. people using javacopts to specify them in java_plugin, but also specifying them per-target)
  3. We would have to parse them out of compilation flags anyway as they're parameters to kapt, not kotlinc, and mixing these

I think whatever we decide here, we should separately figure out the right mechanism for annotation processing options, and factor it out of the general mechanism.

@restingbull
Copy link
Collaborator

restingbull commented Nov 20, 2020

Seems like this issue has gone quiet. I would like to wrap up the discussion, as it blocks quite a few prs. Additionally, the "issue gone quiet" has been a blocker for development. See the "friends" discussions.

The following issues to be addressed separately from compiler options:

I would add:

That leaves:

  • Usability
    • Naming
    • Complexity
    • Repository maintenance
    • Validation
    • Documentation
  • Maintainability
  • Extensibility

I have attempted to address these in #396 (comment).

There was some discussion around Maintainability: #396 (comment) through #396 (comment).
The two points of view are:

  • Invalid and negatively performance impacting flags are the responsibility of the user, and presumably alleviated by documentation and project support.
    • Other bazel projects have followed this approach.
  • Remove the possibility of defining those flags.

Without additional feedback on the advantage of free-form flags for the stated design principals, it is not currently a viable proposal.

This brings the conversation to designing a system that allows voiding the warranty.

Given the issues of Maintainability and the areas for optimization (the kotlin compiler optimization principals focus on caching, which will require stricter control over configuration), an experimental toolchain with vanilla worker for kotlinc appears to be an appropriate solution:

  • compiler flags would be definable only on the toolchain.
  • no validation of the flags would be performed.
  • compiler would handle .java and .kt compilation (as opposed to the production toolchain which may, or may not)
  • worker is expected backwards/forwards compatible within reason. Production toolchain will handle these via a TBD versioning system.
  • many support requests will probably end with: use the production toolchain.

Finally, as this keeps resurfacing, the java rules use JavaBuilder. JavaBuilder is not a vanilla javac and is maintained by a team. A lot of work goes into migrating options and cleaning up Google3 when java is upgraded. (Anecdotally, it was considered to never upgrade the java version after pushing to Java 8, due to the investment of the migration.) rules_kotlin doesn't have the luxuries of a controlled repo and a dedicated team. It is very much worth noting that Jetbrains does not provide raw flag access in the Gradle rules. While we are in Bazel land, many developers are coming to Kotlin via Android Studio and Gradle -- it behooves us to consider the cases of familiarity and usability when designing APIs.

@restingbull
Copy link
Collaborator

Point of process: without additional conversation, I will close this issue on 11-29-20.

This is not intended to stifle conversation: rather, keep it moving.

@ZacSweers
Copy link

The lack of resolution on this issue is disappointing to see, as the whole reason KGP offers a freeCompilerArgs outlet is because it's untenable to maintain an exhaustive list of all available options at a given kotlinc release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants