Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ruff server: Resolve configuration for each document individually #10950

Merged
merged 11 commits into from
Apr 16, 2024

Conversation

snowsignal
Copy link
Contributor

@snowsignal snowsignal commented Apr 15, 2024

Summary

Configuration is no longer the property of a workspace but rather of individual documents. Just like the Ruff CLI, each document is configured based on the 'nearest' project configuration. See the Ruff documentation for more details.

To reduce the amount of times we resolve configuration for a file, we have an index for each workspace that stores a reference-counted pointer to a configuration for a given folder. If another file in the same folder is opened, the configuration is simply re-used rather than us re-resolving it.

Guide for reviewing

The first commit is just the restructuring work, which adds some noise to the diff. If you want to quickly understand what's actually changed, I recommend looking at the two commits that come after it. f7c073d makes configuration a property of DocumentController/DocumentRef, moving it out of Workspace, and it also sets up the ConfigurationIndex, though it doesn't implement its key function, get_or_insert. In the commit after it, fc35618, we implement get_or_insert.

Test Plan

The best way to test this would be to ensure that the behavior matches the Ruff CLI. Open a project with multiple configuration files (or add them yourself), and then introduce problems in certain files that won't show due to their configuration. Add those same problems to a section of the project where those rules are run. Confirm that the lint rules are run as expected with ruff check. Then, open your editor and confirm that the diagnostics shown match the CLI output.

As an example - I have a workspace with two separate folders, pandas and scipy. I created a pyproject.toml file in pandas/pandas/io and a ruff.toml file in pandas/pandas/api. I changed the select and preview settings in the sub-folder configuration files and confirmed that these were reflected in the diagnostics. I also confirmed that this did not change the diagnostics for the scipy folder whatsoever.

@snowsignal snowsignal added the server Related to the LSP server label Apr 15, 2024
Copy link
Contributor

github-actions bot commented Apr 15, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the laziness of the current design but I think has the downside (as far as I understand) that we deserialize and store a resolved configuration for each directory. That's unlikely a huge memory issue. It's also unclear to me if the ConfigurationIndex gets correctly invalidated when:

  • I open a file in a directory that inherits the configuration from the parent directory (the configuration is now stored in the index)
  • I close the file
  • I change the parent configuration (that reloads the configurations of all paths with open documents)
  • I reopen the file (I suspect it reuses the old configuration)

I think there's a similar case where creating a new configuration in a specific directory must correctly propagate to all sub-directories). The easiest fix is to simply prune all sub-directory entries, but maybe we can do better?

I wonder if we should instead traverse the workspace eagerly and discovers all configurations or model the index in a way that it tracks whether a path has its own configuration or inherits from the parent (finding the configuration is then the same as finding the first entry where the configuration is not inherited).

crates/ruff_server/src/session/workspace/configuration.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/session/workspace/configuration.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/session/workspace/configuration.rs Outdated Show resolved Hide resolved
@dhruvmanila
Copy link
Member

It's not uncommon for a language server to spend some time initially to setup the workspace. This could potentially include building up the configuration index as suggested by Micha but I'm ok with the lazy approach as well.

Thank you for providing the test plan. Can you expand it by listing down any/all the scenarios that you've tested against? This way I can make sure to not repeat it and come up with any other possible project structure to test this change against.

@snowsignal
Copy link
Contributor Author

It's also unclear to me if the ConfigurationIndex gets correctly invalidated when:

  • I open a file in a directory that inherits the configuration from the parent directory (the configuration is now stored in the index)
  • I close the file
  • I change the parent configuration (that reloads the configurations of all paths with open documents)
  • I reopen the file (I suspect it reuses the old configuration)

It should be correctly invalidated - any time a configuration file gets changed, we clear the entire configuration index before we start the configuration reload (see this line)

@snowsignal
Copy link
Contributor Author

snowsignal commented Apr 15, 2024

I wonder if we should instead traverse the workspace eagerly and discovers all configurations or model the index in a way that it tracks whether a path has its own configuration or inherits from the parent (finding the configuration is then the same as finding the first entry where the configuration is not inherited).

It's not uncommon for a language server to spend some time initially to setup the workspace. This could potentially include building up the configuration index as suggested by Micha but I'm ok with the lazy approach as well.

I'm cool with the eager approach! It would definitely be more efficient.

@snowsignal snowsignal force-pushed the jane/server/per-document-config branch from 1de380a to a746423 Compare April 16, 2024 02:04
@snowsignal
Copy link
Contributor Author

Thank you for providing the test plan. Can you expand it by listing down any/all the scenarios that you've tested against? This way I can make sure to not repeat it and come up with any other possible project structure to test this change against.

I've added them to the test plan!

@snowsignal snowsignal requested a review from MichaReiser April 16, 2024 02:30
@snowsignal snowsignal enabled auto-merge (squash) April 16, 2024 18:08
@snowsignal snowsignal merged commit cffc555 into main Apr 16, 2024
17 checks passed
@snowsignal snowsignal deleted the jane/server/per-document-config branch April 16, 2024 18:15
@snowsignal snowsignal added this to the Ruff Server: Beta milestone Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
server Related to the LSP server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants