Meta issue: plugin system #283
Comments
Strongly agree. Will write up some thoughts on this later! |
The first big decision is: should plugins be written in Rust? Or in Python? I believe that either could be possible (though I haven't scoped out the work at all), e.g., using pyo3. It may even be easier to support plugins in Python given that loading code dynamically is much easier in a scripted language... However, I'm partial to requiring plugins be written in Rust. It will lead to a more cohesive codebase, allow us to maintain a focus on performance, and avoid requiring extensive cross-language FFI. I'm open to being convinced otherwise here though. Here are a few relevant resources on implementing a plugin system for Rust:
One of the main challenges seems to be the lack of ABI stability in Rust. In many of the above write-ups, the authors discuss how both the plugins and the calling library need to be built with the same version of Rust in order to be compatible, which feels like a tall order. (From that perspective, one thing that's interesting to me is: could we compile plugins to WASM?) |
I think that instead of Python-based plugins, it is better to provide some kind of query language that makes simple plugins very easy to write, like https://github.com/hchasestevens/astpath does. In my opinion, any complex stuff should be in Rust. This way it can reuse existing APIs and be fast. But I don't know how many Python developers actually know Rust 🤔 I think another way of dealing with it is to ask existing flake8 plugin authors about their preferred way of writing plugins. Their feedback would be very valuable! |
Interesting, Fixit / LibCST has something kind of like that too. It's not quite a distinct query language, but it's effectively a DSL (in Python) to pattern-match against AST patterns. |
My 2c. Then you could have most flake8 plugins available through (lightly related to #414) |
Another idea: we could build a plug-in system atop https://github.com/ast-grep/ast-grep. This would allow users to express lint rules in YAML or via a simple DSL. |
(That tool is itself built atop tree-sitter.) |
I’m coming to this from a position of having written a flake8 plugin for a very specific need at work, and as part of a larger project. This is not something generic, so it’d never make sense as a built-in feature in ruff. I’d love a way to write plugins for ruff in Python, mostly because it’s convenient for someone familiar with Python and not so much with Rust, but also because it would be nice to keep a project in pure Python even while interacting with Rust. The specific plugin in my case, oida, was first written as a standalone thing before I discovered how easy it was to add it as a plugin to flake8. It also uses LibCST, for its ability to round-trip code, which we need for codemodding. If it were possible to expose a similar AST-based Python interface for plugins, that would be awesome. I also have use cases where I’d like to do auto-fixing, which would also be nice to support. The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or AST nodes plus the path to the file, so I can locate it in relation to other files on the system. I understand that writing plugins in Python would be slower than writing them in Rust, but I think that tradeoff would be very much acceptable in many cases. |
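As an illustration of why such a plugin needs both the import node and the file path, here is a rough sketch that resolves a relative `from ... import` to its absolute module name using the standard library `ast` module. The function name `absolute_module`, the project layout, and the assumption that every directory under the root is a package are all hypothetical; this is not an existing Ruff or flake8 interface.

```python
import ast
from pathlib import PurePosixPath


def absolute_module(file_path: str, root: str, node: ast.ImportFrom) -> str:
    """Resolve the module named by a (possibly relative) `from ... import` node."""
    if node.level == 0:
        # Already absolute.
        return node.module or ""
    # Package containing the file, as dotted parts relative to the project root.
    package = list(PurePosixPath(file_path).relative_to(root).parent.parts)
    # level=1 means the current package; each extra level climbs one package up.
    climb = node.level - 1
    if climb > len(package):
        raise ValueError("relative import climbs above the project root")
    base = package[: len(package) - climb]
    if node.module:
        base.append(node.module)
    return ".".join(base)


tree = ast.parse("from ..models import User\nfrom . import helpers\n")
for stmt in tree.body:
    if isinstance(stmt, ast.ImportFrom):
        print(absolute_module("src/app/views/detail.py", "src", stmt))
```

The same information is enough to go the other way (absolute to relative), which is why the file path has to be part of whatever interface a plugin receives.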
Very helpful and all makes sense. Maybe just as another data point for the thread: when I was at Spring Discovery, we wrote a few Flake8 plugins to enforce highly codebase-specific rules. For example:
|
So in that light, I think there are different categories of plugins:
|
I think most of those "custom" plugins / checks could be built atop something like |
(Separately: this could arguably make sense to include in Ruff directly.) |
Yeah, I wouldn’t really be able to implement any of Oida using ast-grep, as all the rules depend on the context of the project. I use in-process caching to keep that state ready between files in the current flake8 plugin btw, forgot to mention that above, so the flake8 interface isn’t ideal for that kind of plugin.
I guess some rules could be, again not for my specific case. What we’re considering at work is to enforce relative imports within a Django app and use absolute imports for everything else. Our structure will be |
You asked for feedback from other flake8 plugin authors, so:
|
Thank you @peterjc! Really appreciate your engagement here as a plugin author! (Regarding RST: it looks like there's at least one Rust crate for parsing RST, though it doesn't look super popular.) |
Is this possible, or supported currently? https://github.com/adamchainz/flake8-tidy-imports |
@ofek - Not currently supported but it’s a pretty small surface area so should be easy to add some time in the next few days. |
Thanks! I've been enforcing absolute imports recently (except in tests) https://github.com/pypa/hatch/blob/b0911bb0eaa8d331c24eda940b97bf244ecd5ac3/.flake8#L8-L11 After that I'll switch over, and make new projects generated by Hatch use this. |
Sweet! The banned relative import rule I can definitely do today. |
@ofek -- You can use it in Hatch by adding this to your pyproject.toml:
[tool.ruff]
select = [
"B",
"C",
"E",
"F",
"W",
# Ruff doesn't have this, but it does have E722.
# "B001",
"B003",
"B006",
"B007",
# These don't exist in newer flake8-bugbear versions IIUC.
# "B301",
# "B305",
# "B306",
# "B902",
"Q000",
"Q001",
"Q002",
"Q003",
"I252",
]
ignore = [
"B027",
# "E203",
# "E722",
# "W503",
]
line-length = 120
# tests can use relative imports
per-file-ignores = {"tests/*" = ["I252"], "tests/**/*" = ["I252"]}
[tool.ruff.flake8-tidy-imports]
ban-relative-imports = "all"
Let me know if it works, or doesn't! :) |
Thank you!!! pypa/hatch#607 |
@charliermarsh You wrote somewhere that libcst is significantly slower than the current ast implementation in ruff (can't find it right now). Do you know why? Is it because it's a cst or is it because the classes it exposes are Python "compatible"? I'm asking because I've started looking into pyo3 and from what I see the only way to expose an ast to a Python plugin would be to make the ast classes Python classes in pyo3. If that's what's slow with libcst I guess there's not really any point in investigating that route too much, but if we could make that fast enough I guess it could be one way to make plugins work. That doesn't resolve auto-fixing, but as I suggested in another thread I think maybe doing auto-fixing on the token level could be made to work. Maybe an interface like this:
import ast

def visit_Import(node: ast.Import, tokens: list[str]) -> list[str]:
    # Check ast (or tokens) for violations and return updated tokens
    return ["import", " ", "foo"]
Or maybe have tokens as an attribute on the ast nodes 🤔 |
@ljodal - This was all based on LibCST as a Rust crate, with no Python FFI -- so I think it's just the CST and parser, and not anything to do with the serialization. (I also hacked in some RustPython vs. LibCST benchmarks into the existing LibCST |
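A rough sketch of how a host could drive a token-level `visit_Import` hook like the one proposed above. Everything here is an assumption for illustration: the naive regex tokenizer, the helper names `split_tokens` and `apply_fixes`, the example rule, and the restriction to single-line, ASCII-only import statements (so that `col_offset` values can be used as string indices).

```python
import ast
import re


def split_tokens(segment: str) -> list[str]:
    # Naive lossless split: runs of whitespace, runs of word characters, or a
    # single punctuation character. "".join(tokens) reproduces the input.
    return re.findall(r"\s+|\w+|[^\w\s]", segment)


def visit_Import(node: ast.Import, tokens: list[str]) -> list[str]:
    """Example rule: rewrite the redundant alias `import foo as foo`."""
    if len(node.names) == 1 and node.names[0].asname == node.names[0].name:
        return ["import", " ", node.names[0].name]
    return tokens  # no violation: hand the tokens back unchanged


def apply_fixes(source: str) -> str:
    lines = source.splitlines(keepends=True)
    imports = [
        n
        for n in ast.walk(ast.parse(source))
        # Assumption: only single-line import statements are handled.
        if isinstance(n, ast.Import) and n.lineno == n.end_lineno
    ]
    # Splice from the end of the file backwards so earlier offsets stay valid.
    for node in sorted(imports, key=lambda n: (n.lineno, n.col_offset), reverse=True):
        line = lines[node.lineno - 1]
        start, end = node.col_offset, node.end_col_offset
        new_tokens = visit_Import(node, split_tokens(line[start:end]))
        lines[node.lineno - 1] = line[:start] + "".join(new_tokens) + line[end:]
    return "".join(lines)


print(apply_fixes("import os as os\nimport sys\n"))
```

Because the hook only ever returns replacement tokens for its own node, the host keeps control over how edits are spliced, which is one way conflicting fixes from different plugins could be detected.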
@ljodal - I don't have great intuition for whether the PyO3 FFI would add much overhead and what the performance impact would be. I think it's worth exploring! |
Aight, then I'll continue investigating :) I haven't written any rust before, so it's slow going (thinking of doing advent of code in rust to get a kickstart). My plan was to use the Python ASDL definitions to generate AST classes, but it's been years since last I touched compilers so I'll have to see how I go about the tokenization and conversion to ast |
It sounds like the jury is still out when it comes to creating custom rules, is that correct? I've seen custom linting rules be a valuable tool when modularizing a monolith, with rules very customized for the codebase you're working in. As far as I can tell, Ruff does not allow you to develop custom rules at this time, so we'd have to run another linter alongside Ruff for that ability. We just switched our codebase to using Ruff, and are also looking to start modularizing. I'm trying to figure out if I need to choose a second tool alongside Ruff for customizations. |
I think the main point of contention is not whether it should be allowed but rather how. IMO if people want to author a plugin in python they should probably use a python-based tool (e.g. I strongly suspect that a DSL would be sufficient for 80+% of the kind of use cases people are describing where a repo has rules very specific to their code base that wouldn't be sufficiently broadly applicable to upstream. Especially nice about that is that such custom rules could just be included in the |
We've been working with Biome to integrate GritQL as an extension/plugin system and I'd love to offer the same for Ruff. The problem space is similar and I think GritQL provides a few advantages:
Here are a few examples of how @charliermarsh's earlier custom suggestions could be implemented directly:
|
I'm looking to write a few fitness functions by extending ruff linters. I think that would be ideal, and I'd like to use Since ruff does not support plugins, I'm writing these functions as tests that run on CI using |
@charliermarsh hi! do you think that ruff will consider a plugin system in the short-medium term? |
Thank you, @morgante, for offering your support to help us build a GritQL-based plugin system. GritQL is undoubtedly at the top of my mind when it comes to designing a plugin system for Ruff, and I'm following the work in the Biome repository from a distance (but I must admit, not very closely). It will probably be a while before we evaluate solutions for a plugin system because we're currently in the middle of rewriting Ruff's compiler infrastructure to support multifile analysis (and more ;)). But I'll come back to your offer when we're ready to explore Ruff plugins. |
One flake8 plugin that would be very useful for ruff to cover is pydoclint. Even support for external plugins, e.g. ones invoked as shell commands, would be very useful; speed does not have to be the top priority. In time plugin authors might rewrite them in Rust, but to start with, a way to hook in external ones would be very useful. |
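A minimal sketch of what "external plugins called as shell commands" could look like from the host side: run a command and parse diagnostics from its standard output. It assumes the external tool prints one finding per line in flake8's default `path:line:col: CODE message` format; the names `Diagnostic`, `parse_output` and `run_external` are hypothetical, and the "tool" in the usage line is just a Python one-liner standing in for a real checker.

```python
import re
import subprocess
import sys
from typing import NamedTuple


class Diagnostic(NamedTuple):
    path: str
    line: int
    column: int
    code: str
    message: str


# Assumed output format: path:line:col: CODE message
LINE_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<column>\d+): (?P<code>\S+) (?P<message>.*)$"
)


def parse_output(output: str) -> list[Diagnostic]:
    diagnostics = []
    for raw in output.splitlines():
        m = LINE_RE.match(raw)
        if m:  # silently skip lines that are not diagnostics
            diagnostics.append(
                Diagnostic(m["path"], int(m["line"]), int(m["column"]), m["code"], m["message"])
            )
    return diagnostics


def run_external(command: list[str]) -> list[Diagnostic]:
    # check=False: linters usually exit non-zero when they report findings.
    proc = subprocess.run(command, capture_output=True, text=True, check=False)
    return parse_output(proc.stdout)


# Stand-in for a real external checker.
fake_tool = [sys.executable, "-c", "print('a.py:3:1: X100 example message')"]
print(run_external(fake_tool))
```

The cost of this approach is one process launch per tool (and a re-parse of every file inside that tool), which is the performance concern raised elsewhere in the thread.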
I am far from convinced that there's value in having It sort of sounds to me that what some people are looking for is a way to run a bunch of arbitrary checkers on python files in a repository, and they don't really care whether those commands are integrated into the same executable or not. If that's what you're looking for, maybe look at something like |
A new blog post investigating Rust plugin systems, probably helpful! https://benw.is/posts/plugins-with-rust-and-wasi |
FYI, using external processes (like a rust-based standalone binary or a python-based |
Members of my team have been agitating for Ruff, but without support for custom rules it's a tough sell. We need to be able to implement bespoke rules to support our own coding style - stuff like forbidding top-level statements not guarded by |
Or (and I don't necessarily recommend it), Ruff + Flake8 (using the external config option for rules handled by Flake8). |
If they want to monetise what they’ve built (and add this as a paid feature) then Astral might welcome this sort of feedback, but otherwise it feels a little off-key for a free tool with a permissive open-source license. |
I have the same problem as @Dreamsorcerer for my team. |
Along with a plugin system, I would like to share an idea that would help beginners who don't know how to write lint rules against an AST: "lint using regex". In JS, ESLint has a plugin system, and we used to write custom rules for our solutions. We wrote some rules following the official rule docs, but regex-based rules allowed everyone on our team to write lint rules without any additional learning. The regex plugin that allowed beginner devs to write lint rules: https://www.npmjs.com/package/eslint-plugin-regex |
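For a sense of how small a regex-based rule engine can be, here is a sketch in Python. It is not how eslint-plugin-regex is implemented; the rule codes, patterns, and the `lint` function are invented for illustration.

```python
import re
from typing import NamedTuple


class RegexRule(NamedTuple):
    code: str
    pattern: re.Pattern[str]
    message: str


# Hypothetical rules a team might define in configuration.
RULES = [
    RegexRule("RX001", re.compile(r"\bprint\("), "print() call found"),
    RegexRule("RX002", re.compile(r"#\s*TODO\b"), "TODO comment found"),
]


def lint(source: str, rules: list[RegexRule] = RULES) -> list[tuple[int, str, str]]:
    """Return (line number, rule code, message) for every matching line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in rules:
            if rule.pattern.search(line):
                findings.append((lineno, rule.code, rule.message))
    return findings


print(lint("x = 1\nprint(x)  # TODO remove\n"))
```

The obvious limitation is that a regex cannot tell code from string literals or comments, so a rule like `RX001` would also fire on the text `"print("` inside a string; that is the gap AST-based rules close.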
I had a look at ast-grep and it's pretty fantastic. It is fast, it is written in rust and provides multiple high-level APIs, including both python and YAML. But it has one giant problem: it doesn't preserve comments, making it pretty useless for real-world code manipulation. Would it be possible to publish ruff's python ast parser as a crate, which ast-grep could be modified to use, and then ruff could use ast-grep's high level APIs for its plugin system? |
This switches the Ruff subsystem to download and run it via the artifacts published to GitHub releases. Ruff is published to PyPI and thus runnable as a PEX, which is what Pants currently does... but under the hood ruff is a single binary, and the PyPI package is for convenience. The package leads to a bit of extra overhead (e.g. have to build a `VenvPex` before being able to run it). It is also fiddly to change the version of a Python tool, requiring building a resolve and using `python_requirement`s.
By downloading from GitHub releases, we can:
- more easily support multiple ruff versions and allow users to pin to one/switch between them with a `version = "..."` (this PR demonstrates this, by including both 0.6.4 as in `main`, and 0.4.9 as in 2.23, i.e. if someone upgrades to 2.24 and wants to stick with 0.4.9, they can just add `version = "0.4.9"`, no need to fiddle with `known_versions`)
- eliminate any Python/pex overhead by invoking the binary directly
- side-step fiddly details like interpreter constraints (#21235 (comment))
Potential issues:
- If Ruff adds plugins via Python in future, maybe we will want it installed in a venv so that the venv can include those plugins... astral-sh/ruff#283 seems to be inconclusive about the direction, so I'm inclined to not worry and deal with it in future, if it happens.
This PR does:
- Switches the ruff subsystem to be a `TemplatedExternalTool` subclass, not `PythonToolBase`
- Plumbs through the requisite changes, including: setting up some `default_known_versions` for the `main` and 2.23 versions (0.6.4 and 0.4.9 respectively), changing how the tool is installed, removing metadata about interpreter constraints that is no longer relevant or computable
- Eases the upgrade by adding deprecated instances of the fields provided by `PythonToolBase`: people may've customised the Ruff version with `install_from_resolve`. If the field was just removed, running any pants command will fail on start-up with `Invalid option 'install_from_resolve' under [ruff] in path/to/pants.toml`, which isn't very helpful. By having the fields exist, we ensure the user gets a descriptive warning about what to do.
NB. the backend is labelled experimental, so I haven't tried to preserve behaviour, just focused on ensuring we can show a removal error message to them.
It hasn't been mentioned explicitly in this thread but a plugin system gives the opportunity to lint non-Python languages that have been embedded in a Python file. This does show up in Python files somewhat often (rST in docstrings, SQL string literals) but implementing those formatters is clearly out of scope for ruff itself. It still makes sense to have ruff as the host for these linters, though: you'd get to take advantage of ruff's configuration (file exclude rules), you could piggyback on the Even when you're just linting a string literal you may want to expose the AST to that linter. A SQL statement might want to know what level of indentation a string literal was declared at to match vertical alignment:
could be linted/formatted into
But maybe there is also opportunity to have a simpler "if this string literal is annotated for my linter ( I see a bit of discussion in this thread over forking vs in-process processing: if you are providing an AST or AST-lite API you can let the plugin decide if they want to fork or not. It would be unfortunate for performance if they had to fork but it is entirely possible that a plugin might want to use an existing linter and be practically constrained to fork to run it. |
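To illustrate what "exposing the AST to an embedded-language linter" might involve, here is a small sketch that walks a Python file with the standard library `ast` module and yields each string literal a plugin claims, together with the line and column where the literal starts (the indentation information mentioned above). The `EmbeddedSnippet` type, the `wants` callback, and the `looks_like_sql` heuristic are hypothetical and not part of any Ruff API.

```python
import ast
from typing import Callable, Iterator, NamedTuple


class EmbeddedSnippet(NamedTuple):
    text: str
    line: int    # 1-based line where the literal starts
    column: int  # 0-based column where the literal starts


def embedded_snippets(source: str, wants: Callable[[str], bool]) -> Iterator[EmbeddedSnippet]:
    """Yield string literals that the plugin-supplied predicate `wants` accepts."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str) and wants(node.value):
            yield EmbeddedSnippet(node.value, node.lineno, node.col_offset)


def looks_like_sql(text: str) -> bool:
    # Hypothetical heuristic; a real plugin might rely on an explicit annotation.
    return text.lstrip().upper().startswith("SELECT")


SOURCE = '''def query():
    sql = """SELECT id
             FROM users"""
    return sql
'''

for snippet in embedded_snippets(SOURCE, looks_like_sql):
    print(snippet.line, snippet.column, repr(snippet.text))
```

With the start column in hand, the plugin can decide for itself whether to format in-process or to fork out to an existing linter, as discussed above.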
@dgilmanAIDENTIFIED supporting embedded languages that are commonly used with Python is a long-term goal. See #8237 |
Maybe it's a bit too naive of an idea - but wouldn't the embedding also immediately solve the issue of needing Python linting/formatting plugins to be written in Rust? I.e. if you have a |
That sort of implies that there's some universal representation for ASTs, which isn't the case. |
I think that the main reason why Flake8 is so popular is its plugin system.
You can find plugins for every possible type of problem and tool.
Right now docs state:
I propose designing and implementing a plugin API.
This way ruff can compete with flake8 in terms of adoption and usability.
Plugin API
I think that there are some flake8 problems that should be fixed and also there are some unique challenges that should be addressed.
flake8 suffers from a problem when you install some tool and it has a flake8 plugin definition. This plugin is automatically enabled due to how setuptools hooks work. I think that all rules must be explicit. So, eslint's explicit plugins: looks like a better way.
The [flake8] section can cause conflicts between plugins. Probably, the way eslint does that is better.
Please, share your ideas and concerns.
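A sketch of the two design points above, under stated assumptions: plugins are enabled only when listed explicitly, and each plugin reads settings from its own namespace rather than from one shared section. The config keys `plugins` and `plugin-settings`, the `INSTALLED` set, and `resolve_plugins` are all hypothetical; nothing here reflects an actual Ruff design.

```python
from typing import Any

# Hypothetical: plugins discoverable in the environment (e.g. via entry points).
# Being installed does NOT enable a plugin by itself.
INSTALLED = {"tidy-imports", "docstrings", "bugbear"}


def resolve_plugins(config: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return {plugin name: its own settings} for explicitly enabled plugins only."""
    enabled = config.get("plugins", [])
    unknown = [name for name in enabled if name not in INSTALLED]
    if unknown:
        raise ValueError(f"plugins enabled but not installed: {unknown}")
    settings = config.get("plugin-settings", {})
    # Each plugin only sees the settings under its own name, so two plugins
    # can use the same option name without conflict.
    return {name: dict(settings.get(name, {})) for name in enabled}


config = {
    "plugins": ["tidy-imports"],
    "plugin-settings": {
        "tidy-imports": {"ban-relative-imports": "all"},
        "bugbear": {"extend-immutable-calls": []},  # installed but not enabled
    },
}
print(resolve_plugins(config))
```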