Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust-analyzer discoverConfig integration #3073

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

bobozaur
Copy link

@bobozaur bobozaur commented Dec 10, 2024

Adds a target that can be used for project auto-discovery by using the discoverConfig settings as described in the rust-analyzer user manual.

Unlike the gen_rust_project target, this can be used for dynamic project discovery, and passing {arg} to discoverConfig.command can split big repositories into multiple, smaller workspaces that rust-analyzer switches between as needed. Large repositories can make it OOM.

At amo, we've used a similar implementation for a while with great success, which is why we figured we might upstream it. The changes also include two additional output groups to ensure that proc-macros and build script targets are built, as rust-analyzer depends on these to provide complete IDE support.

Additionally, the PR makes use of the output_base value in bazel invocations. We found it helpful to have tools such as rust-analyzer and clippy run on a separate bazel server than the one used for building. And a config_group argument was added to provide the ability to provide a config group to bazel invocations.

An attempt to get codelens actions to work was done as well, particularly around tests and binaries. They seem to work, but I'm not 100% sure whether the approach taken is the right one.

Closes #2755 .

Copy link

google-cla bot commented Dec 10, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@bobozaur bobozaur force-pushed the rust-analyzer-discover-config branch 4 times, most recently from d8ef7e4 to 80a45c1 Compare December 10, 2024 17:50
@bobozaur bobozaur force-pushed the rust-analyzer-discover-config branch from 267aa69 to ea06599 Compare December 13, 2024 09:19
@bobozaur
Copy link
Author

@sam-mccall I noticed you've been working on things adjacent to this in other recent PRs. Just adding you here for visibility reasons. Any advice you can offer regarding the PR is more than welcome.

@sam-mccall
Copy link
Contributor

@sam-mccall I noticed you've been working on things adjacent to this in other recent PRs. Just adding you here for visibility reasons. Any advice you can offer regarding the PR is more than welcome.

Thanks! Yes I had a draft version running locally that I'd planned to clean up and send, but I'm happy you got there first.
I'm not an owner here, but happy to provide feedback and test this out.

This PR combines a few logically-separate changes. I get it - there are a bunch of details that all need to be right for this to work end-to-end. (I ran into some overlapping-but-different set of these myself). I'll try to understand them all and provide high-level comments first, and then let you figure out whether to split them out and land the more "obvious" stuff first. (e.g. having a separate output-base and injecting blaze configuration are useful, but different users will have different needs here).

(I did spot this last friday and started to review, but have been unwell this week...)

Copy link
Contributor

@sam-mccall sam-mccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff, really thorough job! Some parts of the protocol (logging & runnables) were totally new to me.

High level:

  • starlark changes (proc macros & build scripts) are good. Related: rust_analyzer: generate more reqired sources #3031 includes some other generated files.
  • A separate binary for discover sounds good to me (I had this as a flag on gen_rust_project, but it's messy)
  • this is background infra that should "do what I mean", as such we want fewer flags and more detection.
  • some things we will eventually want should inform flags/detection:
    • become a standalone binary (not invoked through blaze run)
    • automatically select an output_base
  • a BlazeSpec (argv0, workspace, execroot, ...) would be a useful abstraction, replacing Config and long param lists
  • clearer split between main (config + protocol) and library (build+query+describe) would make the code easier to follow.

Let me know what you think - as I said I'm not an owner here. I can nag one to get you a stamp :-) but also feel free to get a second opinion instead.

If you're interested, this could be a few PRs - might go quicker, but totally up to you:

  • proc macros + build scripts
  • refactor current bin+lib to make lib reusable
  • add discover tool
  • runnables

build_info = build_info,
))

return [
rust_analyzer_info,
OutputGroupInfo(rust_analyzer_crate_spec = rust_analyzer_info.crate_specs),
OutputGroupInfo(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rust_analyzer_build_info_out_dirs is really about ensuring generated sources are available, and should be named as such: rust_analyzer_srcs. See https://github.com/bazelbuild/rules_rust/pull/3031/files. These don't need to be distinct output groups as from the caller's point of view it doesn't make sense to want one without the other.

Having three output groups does make sense to me though: crate specs are the bare minimum and I can imagine producing the rest asynchronously, and proc macros are optional (we use a toolchain that doesn't even have the proc-macro-srv...)

"deps": "List[String]: IDs of direct dependency crates",
"env": "Dict[String: String]: Environment variables, used for the `env!` macro",
"id": "String: Arbitrary unique ID for this crate",
"proc_macro_dylib_path": "File: compiled shared library output of proc-macro rule",
"proc_macro_dylib": "File: compiled shared library output of proc-macro rule",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "if this is a proc-macro target, the shared-library output"?

(wasn't clear to me which macro, and "rule" is wrong here I think)

@@ -162,17 +162,21 @@ RustAnalyzerInfo = provider(
"cfgs": "List[String]: features or other compilation `--cfg` settings",
"crate": "CrateInfo: Crate information.",
"crate_specs": "Depset[File]: transitive closure of OutputGroupInfo files",
"proc_macro_dylibs": "Depset[File]: transitive closure of OutputGroupInfo files",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should have clear names and comments (the comment on crate_specs is bad; maybe fix while here?)

@@ -45,6 +45,8 @@ def write_rust_analyzer_spec_file(ctx, attrs, owner, base_info):
RustAnalyzerInfo: Info with the embedded spec file.
"""
crate_spec = ctx.actions.declare_file("{}.rust_analyzer_crate_spec.json".format(owner.name))
proc_macro_dylibs = [base_info.proc_macro_dylib] if base_info.proc_macro_dylib else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is just to write the file, everything with the RustAnalyzerInfo should be a pure passthrough.

(This function is ugly and there's probably a way to get rid of it, but failing that it should at least stay trivial)

@@ -221,6 +247,13 @@ def _create_single_crate(ctx, attrs, info):
crate["root_module"] = path_prefix + info.crate.root.path
crate["source"] = {"exclude_dirs": [], "include_dirs": []}

# Store build system related info only for local crates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment echoes the code, can you say why instead?

To me it's not obvious why, but I guess it's to prevent blaze test runnables from showing up for other crates? But is "local" the right choice there, rather than "top-level" i.e. targets from the selected blaze package?

@@ -108,6 +223,31 @@ pub fn generate_rust_project(
sysroot: Some(sysroot.into()),
sysroot_src: Some(sysroot_src.into()),
crates: Vec::new(),
runnables: vec![
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(no action needed) runnables is not documented in the rust-project.json spec, we should fix that.

@@ -108,6 +223,31 @@ pub fn generate_rust_project(
sysroot: Some(sysroot.into()),
sysroot_src: Some(sysroot_src.into()),
crates: Vec::new(),
runnables: vec![
Runnable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the intended use of build?

If it's to get diagnostics (which RunnableKind::Check suggests), it's not clear to me if we should pass --config, --output_base etc as we're a tool, or not pass them as we're acting as a simple proxy for the user.

I'm fine with either answer but let's say why in a comment.

Part of this is: are we expecting --config to be something like --config=generate_simplified_sources_for_rust_analyzer (should be ignored) or --config=macos (should be passed)?

.replace("__OUTPUT_BASE__", output_base)
.replace("__WORKSPACE__", workspace);
let rust_project_content = serde_json::to_string_pretty(rust_project)?;
let rust_project_content = normalize_project_string(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the layering is unclear: here you're calling normalize_project_string within the rust_project module, but in the other case (discover) the caller is expected to invoke it.

This normalization is messy :-(
As-is, I think the cleanest solution is to consistently call the normalize function from the top level, right before doing the IO (and not have the IO buried inside libraries).

.env_remove("BAZELISK_SKIP_WRAPPER")
.env_remove("BUILD_WORKING_DIRECTORY")
.env_remove("BUILD_WORKSPACE_DIRECTORY")
.arg(format!("--output_base={output_base}"))
.arg("build")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(no action needed) When we're running in the background, we want to be as forgiving as possible if the user's current state is broken. So eventually we should have --nocheck_visibility here and possibly elsewhere.

@@ -1,62 +1,153 @@
mod aquery;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general the layering/responsibility split between main, this lib, and its submodules feels pretty unclear. (This is true before your changes, but it's more load-bearing after them!)

I'd suggest main should be responsible for flag parsing, protocol (serialization, deserialization, io), and most things that are inherently tool-specific. I don't see a clear reason that we want discover_rust_project and write_rust_project as different library entry points.

@bobozaur
Copy link
Author

@sam-mccall Sorry for the delay and thank you for the thorough review! I'm not going to have access to my laptop for a few more days but I plan to address all of this after Christmas. Happy holidays :)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integration with rust-analyzer's automatic project discovery
2 participants