-
-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Code Quality Tool: toml-based backend templating #20270
base: main
Are you sure you want to change the base?
Code Quality Tool: toml-based backend templating #20270
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
err_msg = ( | ||
f"Entrypoint {name} in {backend_module} backend template " | ||
f"must accept {list(kwargs)} as keyword arguments: {e!r}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be improved by using inspect.signature(entrypoint)
to also present the actual keyword args expected.
Would probably also be helpful to include kwargs["backend_package_alias"]
in the error message, to help identify which templating section this was for, as the backend module could be in multiple of them.
Co-authored-by: Andreas Stenius <git@astekk.se>
@kaos I am going to try to begin adding tests here. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
description_of_origin=f"the configured `target` for Code quality tool {request.description_of_origin}", | ||
), | ||
) | ||
target = wrapped_target.target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file are only to give a better error message when a user mistypes the address of a code quality tool target.
scope=backend_package_alias, | ||
) | ||
|
||
return CodeQualityToolBackend(rule_builder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main function of a backend template module. It returns an object with a similar interface as the register
module objects normally used to load backends.
template = d.pop("template", None) | ||
if not template: | ||
raise ValueError('"template" is a required key for a templated backend') | ||
return cls(template=cast(str, template), kwargs=FrozenDict(d)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Located in a separate file to avoid circular imports between options and extension loading. This stage of parsing logic is owned by bootstrap options.
) | ||
module = backend_generator_module.generate(backend_package, cfg.kwargs) | ||
module_source = f"{module} generated from {cfg.template}" | ||
return module, module_source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two functions return objects with the same an interface: module
is something with zero-arg callable attributes like target_types
, rules
and module_source
is a string describing the module for use in error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really cool stuff!
Happy with how this has turned out. Only thing still nagging me is the terminology. But maybe that's just me 🤷🏽 I would still prefer to see the templating bit as a generic way to "configure" any backend.. that from a user perspective all backends are equal, and some may take additional configuration.
Just to clarify my thoughts to make it easier to agree/disagree:
[GLOBAL]
backend_packages.add = [
"some.regular.backend",
"another-backend-variant",
"yet.one.more",
]
[GLOBAL.backend_package_config.another-backend-variant]
module = "another.backend" # or maybe `package = ..` or `register = ..` 🤷🏽
flavour = "variant"
sauce = "medium"
The benefit of still having also the configured backends in the regular backend_packages
list is to make it easier to manage the list as usual, to enable/disable backends--while the configuration for them could always be present.
@kaos , the way the PR is currently one does have to refer to the configured backends still in I think it will be fine to change the [GLOBAL]
backend_packages.add = [
"pants.backend.python",
# the below are configured/templated backends and must still be referred to in the backend_packages list.
"flake8",
"markdownlint",
]
[GLOBAL.templated_backends.flake8]
module = "pants.backend.experimental.adhoc.code_quality_tool_backend_template"
# not template = and not package = because the string refers to a module not a python package.
goal = "lint"
target = "//:flake8_linter"
name = "Flake8"
[GLOBAL.templated_backends.markdownlint]
module = "pants.backend.experimental.adhoc.code_quality_tool_backend_template"
goal = "lint"
target = "//:markdownlint_linter"
name = "Markdown Lint" What do you think? I can make the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why this file is placed here, as it becomes the module name used in the configuration to invoke it. Would however suggest moving the implementation to the regular place, and importing it from there (so we still have the experimental name as the interface to use in the configuration.
from pants.backend.adhoc.code_quality_tool_backend_template import generate as generate
(the as generate
part is to tell linters that we import it with the purpose of re-exporting it)
The motivation being that we then have the full history also after it graduates from experimental.
Yep, and I like that. 👍🏽
Yea, I like
Do you have any alternative suggestions for the I like the So: [GLOBAL.templated_backends.markdownlint]
module = "pants.backend.experimental.adhoc.code_quality_tool"
... When loading backends, pants would load |
I'll dive into this this week, but damn. Looks absolutely fantastic on the surface. You should feel proud, this is very well done. |
@kaos This comment takes your suggestion and see how to tie it together in a self-consistent way of talking/thinking about backends. What you are proposing is an interesting formulation. Current proposal: two types of backendsWhat is in this PR at the time of this writing. There are two types of backends: regular and templated. We refer to their definition in code differently. One has a package with a Alternate proposal: unification of templated and regular backendsThis is following your suggestion. We can propose that all backends are "generated"/"templated". The only difference with regular backends is that
# equivalent to the implicit generate defined in every classic backend package
def generate(backend_package_alias, kwargs) -> RegisterProtocol:
assert not kwargs, f"Backend package {__name__} does not take configuration args"
return sys.modules[__name__] In this view the I'd like to wait for another reviewer/maintainer to chime in as well and if they are +1 with moving to this direction, I'll go ahead and impl. |
Can you share a snippet of what that would look like in the toml? |
Sorry, @thejcannon , what I mean by:
Is just a conceptual change, but does not require any change in toml. Its just saying that internally we can manage regular backends as though they too are "generated" via an implicit |
This looks awesome, but I've got a conceptual worry that nags at me. These are definitely not packages in any sense that overlaps with existing usage. I'm not even sure if they'd count as backends? Conceptually, they're jazzed up code generators. I don't say that to denigrate the work done - it's awesome, and Pants is unforgivingly verbose so I'm all for simplifying the process. But there's certainly a lack of smarts that separates them from regular backends. I'm worrying that by equating them we'll (a) make teaching Pants harder, (b) have a mismatch in expectation of what they'll do, and (c) potentially set us up for a lot of work to meet those expectations? I'm wondering if there's a "missing" component in Pants here which lives between static configuration ( As a more general point about this whole series, I'm worried how this would interact e.g. with code generators, or native Python extensions. Can that work? How does it work? What about Rust, which requires both rustup + toolchain before it can run, and the partioning has to happen on "package" or workspace level? On the other hand -- the way Pants handles source files + dependencies is quite inflexible, so this is a good use-case for finding those limits. I'm just wondering whether we can then translate that back into gains for general "non-trivial" file based runs. |
(Just a drive-by thought) The "Environments" support parses the root |
Not just the root |
Yes, we can attempt to always check for
I agree they're not packages. What if the global option was However, not being a package but rather a placeholder for a configured name that in turn points at a module to use to generate the backend to load--they indeed can be seen as jazzed up code generators, but not generic and always produce the same thing; the backend to load. So the new concept being that mixed with the regular backend packages there can now also be "backend aliases". As these aliases are named freely, it'll be up to the user to pick names that makes sense to them so I don't see this as a big problem, outweighed by the intuitive effect of being able to manipulate one single list for all "backends" to use. |
What smarts are lacking? A templated backend can do precisely the same thing as a regular backend, so I fail to see what expectations there may be that we wouldn't meet? |
oh, and thanks for reviewing and raising questions, it's precisely what we need to settle this :) 🙏🏽 |
Another shift, to stay closer to packages in [GLOBAL]
backend_packages.add = [
"pants.backend.python",
# the below are configured/templated backends and must still be referred to in the backend_packages list.
"pants.backend.experimental.adhoc.code_quality_tool",
]
[[GLOBAL.templated_backends]]
package = "pants.backend.experimental.adhoc.code_quality_tool"
goal = "lint"
target = "//:flake8_linter"
name = "Flake8"
[[GLOBAL.templated_backends]]
package = "pants.backend.experimental.adhoc.code_quality_tool"
goal = "lint"
target = "//:markdownlint_linter"
name = "Markdown Lint" So, when loading backends, we check if there are any configurations for the package in The main difference to previous is that we don't use the names in the I kind of like this idea, as it keeps the |
Very much +1 that we should be careful to have coherent meanings for the words
I agree that they are code generators. In this case a kind of code generator for a pants plugin, creating a bunch of types and rules relating them to goals and known union bases. As to:
The only changed introduced here or, for that matter, in the original PR - is a pattern of dynamically generating pants types and rules during the process of backend-loading. Andreas, the issue with: [GLOBAL]
backend_packages.add = [
"pants.backend.python",
# the below are configured/templated backends and must still be referred to in the backend_packages list.
"pants.backend.experimental.adhoc.code_quality_tool",
] is that backend order matters for things like We can also not do thisI'd very much like us to leave on the table the option of not introducing this new bit of pants.toml [GLOBAL]
pythonpath = ["%(buildroot)s/pants-plugins"]
backend_packages.add = [
"pants.backend.python",
# code_quality_tool's target and rules are defined in the adhoc backend
"pants.backend.experimental.adhoc",
# thin in-repo plugin activates a particular target as a linter
"flake8_linter_plugin",
] and an in-repo plugin pants-plugins/flake8_linter_plugin/register.py from pants.backend.adhoc.code_quality_tool import CodeQualityToolRuleBuilder
def rules():
cfg = CodeQualityToolRuleBuilder(
goal="lint", target="//:flake8_tool", name="Flake8", scope="flake8_tool"
)
return cfg.rules() This is already supported and does not require introducing any new concepts, bits of syntax or overloading of syntax, and the user retains full control over things like ordering. On the other hand, users do seem to have apprehension to go into writing their own in-repo plugins, and maybe that will also happen for such a microscopic one. |
Ah, thank you for keeping that top of mind. Good point, I thought that would be solved by having the I still think this issue with order being significant is a dumb issue to have and ought to be resolved.. but until then we'll have to deal with it. |
I mentioned a few cases where I think it'd be hard to use this efficiently. By smarts I mean it cannot use any deeper knowledge about the language to run correctly -- it only runs on a glob of files (and whatever smarts are used to generate that glob). And then there's partitioning better to improve caching, and so on. That's fine on efficient tools, but this would suck horrendously for something like And while it's true it can do the same things, especially with more configuration options, at some point the configuration to get exactly what's needed (again: see clippy) would likely be less maintainable, definitely less shareable, and still very verbose.
Sure. I think my concern is mostly that I put a lot of emphasis on the "smarts" side, where this makes tradeoffs that makes it unfit for a bunch of use-cases. I'll concede that maybe that isn't part of the definition, semantically. I do however also maintain that there's a semantic difference between a dedicated plugin with multiple backends compared to this. That's maybe less important the package-vs-non-package distinction.
I'm aware, it's just my general line of reasoning around whether these count as "backends", as in -- are there footguns here that warrants keeping them more or less separate from current backends.
I much prefer the separate-but-equal approach. How big is the risk here w.r.t. ordering? Is it something we can solve in different ways? It seems like a footgun in general -- though I guess with the split approach it's unsolvable without other mechanics.
If I had to rank my ideal workflows it'd probably be "WORKSPACE-style"/BUILD, followed by pants.toml, followed by this dummy backend model. That to me seems like sacrificing both shareability1 and simplicity. Footnotes
|
Hi all, I'm wondering what the status of this is? |
@benjyw The PR is itself in a good state, but there is debate about whether this is the right move or whether what is offered here will provide value greater than its weight. I can pick it back up if the community can see a path forward. It was otherwise, in my opinion, starting to fall in the "when in doubt, leave it out" category. |
This demonstrates a version of the suggestion in #17729 (comment)
The PR does two things:
Overview/Demo
It introduces the notion within bootstrapping that backends can be templated out from a package:
Working example
See
https://github.com/gauthamnair/example-pants-byo-tool/tree/global-backends-option-demo for a repo with a working example of using this functionality to activate code quality tool without needing an in-repo plugin.
Backend templating
We adjust the meaning of
backend_packages
to include either:register.py
with entrypoints that take no arguments[GLOBAL.templated_backends]
. See below how they get loaded.Backends are loaded in order, as before.
A templated backend is configured with toml following this shape:
To load it, pants will import the python module specified in the
template
arg. It will call thegenerate
function on it to produce a register-module-like object:Then pants will look for and call the usual backend registration entry points but on
obj
:obj.rules()
,obj.target_types()
, etc.Code Quality Tool
PR makes these changes:
pants.backend.experimental.adhoc.code_quality_tool_backend_template
. The backends it stamps out are self-contained and does not require theadhoc
backend to be separately added tobackend_packages