@moabo3li
Contributor

Fix cargo locate-project --workspace performance issue

cargo locate-project --workspace was unnecessarily loading and validating all workspace members, causing slowdowns in large workspaces (>1s for 1500+ members).
Changed to use find_workspace_root() instead of args.workspace(), which only parses the minimal manifests needed to locate the workspace root path without loading all members.

Fixes #15107

@rustbot rustbot added A-cli Area: Command-line interface, option parsing, etc. Command-locate-project S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 20, 2025
@rustbot
Collaborator

rustbot commented Dec 20, 2025

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@moabo3li moabo3li force-pushed the locate_project_workspace branch from af4174a to e6bd4cb Compare December 20, 2025 14:33
Member

We might also want to add some extra failing cases:

  • Run from a nested directory of a package. The package sets package.workspace to point to another workspace rather than the workspace in its parent directory.
  • Run from a nested directory of a package. The outer workspace manifest doesn't have that package as a member.

The current implementation may not take this into account; see #15107 (comment).

Contributor Author

Done

@moabo3li moabo3li force-pushed the locate_project_workspace branch from e6bd4cb to 6667ca2 Compare December 20, 2025 20:09
@moabo3li moabo3li changed the title from "refactor(locations): streamline workspace root retrieval in exec function" to "Fix cargo locate-project --workspace performance issue" Dec 20, 2025
name = "nested"
version = "0.0.0"
[package.workspace]

Member

This test setup is wrong.

package.workspace should be a valid workspace path

https://doc.rust-lang.org/cargo/reference/manifest.html#the-workspace-field

Contributor Author

Done

Member

Done in what way?

I expected that to point to a valid workspace root, and that affects what this command returns.

Contributor Author

I’m sorry if I’m missing something, but I don’t understand why this is not considered valid.

nested/Cargo.toml has workspace = "..", which resolves to the parent directory.
The parent Cargo.toml contains a [workspace] section with members = ["nested"], which is a valid workspace root.

Therefore, workspace = ".." points to a valid workspace path according to the documentation.

Is there something I’m misunderstanding?

In this test:

workspace = ".." points to the parent directory

The parent workspace lists ["nested"] in members, which includes this package

Given this, the cargo metadata command returns [ROOT]/foo/Cargo.toml (the parent workspace root).

Member

I was reading the wrong diff. Sorry.

#16423 (comment)

In that comment I was looking for failure cases where package.workspace points to some other workspace, not the parent one. With that we can validate the case where the workspace manifest and the package manifest disagree with each other; the command should return the workspace the package manifest belongs to.

This test itself is nice to have to validate that package.workspace works, but it is not the case I was looking for.

Member

BTW, by failing I didn't mean they really need to fail. They are edge cases where Cargo might want to point to the right workspace rather than the wrong workspace.

Contributor Author

I added these tests, plus one more to validate cases like nested/*. If the tests and solution look good, I’ll add them along with the commits made before the fix, as usual.

.with_stdout_data(
str![[r#"
{
"root": "[ROOT]/foo/Cargo.toml"

Member

Shouldn't this point to the non-member package?

Contributor Author

With find_workspace_root, it returns the parent workspace root, even though the package is not a member

Before the change, it fails with the error "current package believes it's in a workspace when it's not".

Member

With find_workspace_root, it returns the parent workspace root, even though the package is not a member

Is this the desired behavior?

Please see #15107 (comment): we still want a minimal verification.

Contributor Author

Done

@moabo3li moabo3li force-pushed the locate_project_workspace branch 2 times, most recently from f04c198 to f928e9e Compare December 22, 2025 14:32
@rustbot rustbot added the A-workspaces Area: workspaces label Dec 22, 2025
.with_stdout_data(
str![[r#"
{
"root": "[ROOT]/foo/inner-workspace/Cargo.toml"

Member

This test doesn't make much sense to me. The inner-workspace/pkg/Cargo.toml will always point to the inner-workspace/Cargo.toml because it is the immediate parent directory that contains a workspace manifest, regardless of whether there is an outer workspace manifest.

To avoid the parent directory probing in your code path, you might want to change workspace = ".." to point to a workspace manifest that is in a sibling directory of the package manifest.

Contributor Author

Done

}
match self.members {
Some(ref members) => members.iter().any(|mem| {
if mem.contains('*') {

Member

This is not the correct way to handle glob syntax. See this function for the right approach (we might want to reuse it, or refactor so it can be reused):

/// Returns expanded paths along with the glob that they were expanded from.
/// The glob is `None` if the path matched exactly.
#[tracing::instrument(skip_all)]
fn members_paths<'g>(
    &self,
    globs: &'g [String],
) -> CargoResult<Vec<(PathBuf, Option<&'g str>)>> {
    let mut expanded_list = Vec::new();
    for glob in globs {
        let pathbuf = self.root_dir.join(glob);
        let expanded_paths = Self::expand_member_path(&pathbuf)?;
        // If glob does not find any valid paths, then put the original
        // path in the expanded list to maintain backwards compatibility.
        if expanded_paths.is_empty() {
            expanded_list.push((pathbuf, None));
        } else {
            let used_glob_pattern = expanded_paths.len() > 1 || expanded_paths[0] != pathbuf;
            let glob = used_glob_pattern.then_some(glob.as_str());
            // Some OS can create system support files anywhere.
            // (e.g. macOS creates `.DS_Store` file if you visit a directory using Finder.)
            // Such files can be reported as a member path unexpectedly.
            // Check and filter out non-directory paths to prevent pushing such accidental unwanted path
            // as a member.
            for expanded_path in expanded_paths {
                if expanded_path.is_dir() {
                    expanded_list.push((expanded_path, glob));
                }
            }
        }
    }
    Ok(expanded_list)
}

Contributor Author

Done

/// - The path is the workspace root manifest itself, or
/// - No explicit `members` list is specified (implicit membership), or
/// - The path matches one of the `members` patterns
fn is_path_member(&self, manifest_path: &Path) -> bool {

Member

I think this is missing an important check: path dependencies.

@epage what is your thought on this, as it may provide a false result?

Contributor Author

Has there been any update or thoughts regarding the path dependencies check?

Member

Briefly discussed this with epage. We would lean towards not having Cargo produce false results.

Path dependencies are implicitly workspace members if they reside in the workspace directory (see the doc for the complete definition). To determine whether a package is a path dependency of any workspace, we need to load the manifest file, which undermines the performance improvement we want here for #15107.

Ed has proposed a solution: if a package is explicitly listed as a member (in workspace.members), it is fine to skip loading the manifest, as Cargo knows it belongs to the workspace. For a package where it is unknown whether it belongs to any workspace in its ancestor directories, Cargo needs to fall back to the old way (load everything) to determine that.
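
A rough sketch of that fast-path/slow-path split, for illustration only. It is not the PR's code: `Lookup`, `locate_root`, and the idea that `explicit_members` has already been expanded to absolute member directories are all assumptions made for this example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical outcome of the cheap membership check.
#[derive(Debug, PartialEq)]
enum Lookup {
    /// The package is explicitly listed, so the root is known without a full load.
    Fast(PathBuf),
    /// Membership is unknown (it may still be a path dependency);
    /// the caller must fall back to loading the whole workspace.
    NeedsFullLoad,
}

/// `explicit_members` stands in for the expanded `workspace.members`
/// directories of the candidate workspace rooted at `root_manifest`.
fn locate_root(
    pkg_manifest: &Path,
    root_manifest: &Path,
    explicit_members: &[PathBuf],
) -> Lookup {
    let pkg_dir = pkg_manifest.parent();
    if explicit_members.iter().any(|m| pkg_dir == Some(m.as_path())) {
        Lookup::Fast(root_manifest.to_path_buf())
    } else {
        Lookup::NeedsFullLoad
    }
}
```

With this shape, only the `NeedsFullLoad` branch would pay the cost of the existing full workspace load.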

Contributor Author

I did that and tested it to check the dependencies.

Contributor

The one downside to this change is the error reporting will be conditioned on how the manifest is loaded.

Not saying we shouldn't do my idea but we need to be cognizant of that downside when deciding to move forward with it

Contributor Author

Thanks for the feedback. I’ve applied the discussed changes and tested them.

Let me know if there’s anything else you’d like me to update.

@moabo3li moabo3li force-pushed the locate_project_workspace branch from f928e9e to ae37cf6 Compare December 22, 2025 18:31
@moabo3li moabo3li force-pushed the locate_project_workspace branch from ae37cf6 to e051ddc Compare December 31, 2025 13:10
@moabo3li moabo3li requested a review from weihanglo January 2, 2026 11:35
Comment on lines 47 to 48
// If loading fails (e.g., package is not a member of any workspace),
// treat the package as its own workspace root.

Member

Why the fallback for loading failure? This sounds more like a behavior change than an optimization.

Contributor Author

Ok, I think it should be like the old one:

workspace = args.workspace(gctx)?;
workspace.root_manifest()

This way, we keep the fast path and the slow path for full validation, following the old approach.

Contributor Author

Done

"sibling-workspace/Cargo.toml",
r#"
[workspace]
members = ["../pkg"]

Member

This line is redundant, no? We already have package.workspace = "../sibling-workspace". What do we want to test here?

Member

Also, could we add these tests before the fix like the other tests, so the test snapshot diff will show the behavior change?

Contributor Author

Is it okay to add all the tests in one commit before the fix commit?

Member

yes

Contributor Author

I think I forgot to remove it before. I’ll remove it now.

Contributor Author

Done

p.cargo("locate-project --workspace")
.cwd("not-member")
.with_stderr_data(str![[r#"
[ERROR] current package believes it's in a workspace when it's not:

Member

To avoid this error, we should make not-member a workspace, which requires putting [workspace] in the manifest. I think we can have both tests:

  • fails with this error
  • succeeds when run from a subdirectory of a package that has [workspace].

Contributor Author

Done

.build();

p.cargo("locate-project --workspace")
.cwd("not-member/src")

Member

The only difference between this and workspace_not_a_member is that the cwd is not-member/src? Why test this? And if we need it, we may want to just add another p.cargo("locate-project --workspace") call in workspace_not_a_member.

Contributor Author

This tests that it doesn't get confused by being one level deeper and still identifies the workspace correctly.
I'll add it to workspace_not_a_member and remove this test.

Contributor Author

Done

"Cargo.toml",
r#"
[workspace]
members = ["nested"]

Member

If we already have package.workspace = ".." in nested/Cargo.toml, do we still need workspace.members in the root Cargo.toml?

Contributor Author

Done

/// package and workspace agree on each other:
/// - If the package has an explicit `package.workspace` pointer, it is trusted
/// - Otherwise, the workspace must include the package in its `members` list
pub fn find_workspace_root_with_metadata(

Member

Instead of reimplementing most of the probing logic, I guess it might be possible to reuse find_workspace_root_with_loader with a provided loader?

It might look like

pub fn find_workspace_root_with_metadata() -> CargoResult<Option<PathBuf>> {
    find_workspace_root_with_loader(manifest_path, gctx, |self_path| {
        let source_id = SourceId::for_manifest_path(self_path)?;
        let manifest = read_manifest(self_path, source_id, gctx)?;
        match manifest.workspace_config() {
            // check explicit membership...
        }
    })
}

Contributor Author

Done

/// package and workspace agree on each other:
/// - If the package has an explicit `package.workspace` pointer, it is trusted
/// - Otherwise, the workspace must include the package in its `members` list
pub fn find_workspace_root_with_metadata(

Member

The name find_workspace_root_with_metadata is not ideal. with_metadata is less meaningful to API user. find_workspace_root_with_membership_check might be a better name for the exposed API.

Contributor Author

Done

@moabo3li moabo3li force-pushed the locate_project_workspace branch from e051ddc to 7990081 Compare January 5, 2026 17:20
#[cargo_test]
fn workspace_nested_with_explicit_pointer() {
let p = project()
.file("Cargo.toml", "")

Member

This workspace manifest is not a valid manifest. I would assume this to fail. Having an explicit pointer in the member manifest doesn't mean the workspace manifest agrees with the membership.

Contributor Author

I’ll do it, but what do you mean by fail?
I mean, the test has already passed.

Contributor Author

Done

Member

This #16423 (comment).

An invalid workspace manifest should fail the invocation of cargo locate-project, but I believe that wasn't the intent of this test.

.with_status(101)
.with_stderr_data(str![[r#"
[ERROR] current package believes it's in a workspace when it's not:
...

Member

Why are we omitting the rest of the output here? I feel like it is fine to snapshot the entire stderr, no?

Contributor Author

Done

Member

This should be a valid workspace manifest.

Contributor Author

What do you want to mention here, since the code isn’t shown?

Member

Sorry, I meant .file("sibling-workspace/Cargo.toml", ""). If the workspace manifest isn't valid, cargo locate-project should fail.

WorkspaceConfig::Member {
root: Some(path_to_root),
} => {
// Has explicit `package.workspace` pointer - trust it

Member

No. We should not. We should check if the workspace manifest agrees with it.

So basically we'll always have to read at least two manifests for inspecting a member manifest.
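
To illustrate what "the workspace manifest agrees" could mean, here is a minimal sketch under stated assumptions. `PackageView`, `WorkspaceView`, and `manifests_agree` are hypothetical names, the paths are assumed to be already resolved to absolute, normalized directories, and membership through path dependencies is deliberately ignored.

```rust
use std::path::PathBuf;

/// Hypothetical, already-parsed view of a member package's manifest.
struct PackageView {
    dir: PathBuf,
    /// `package.workspace`, assumed already resolved to an absolute directory.
    workspace_pointer: Option<PathBuf>,
}

/// Hypothetical, already-parsed view of a workspace root manifest.
struct WorkspaceView {
    dir: PathBuf,
    /// `workspace.members`, assumed already expanded to absolute directories.
    members: Vec<PathBuf>,
}

/// Returns true only when both manifests agree on the membership.
fn manifests_agree(pkg: &PackageView, ws: &WorkspaceView) -> bool {
    let package_points_here = match &pkg.workspace_pointer {
        Some(root) => root == &ws.dir,
        // No explicit pointer: the package makes no conflicting claim.
        None => true,
    };
    let workspace_lists_package = ws.members.iter().any(|m| m == &pkg.dir);
    package_points_here && workspace_lists_package
}
```

Both views have to be populated before the check can run, which is why at least two manifests are read.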

};
expanded_members
.iter()
.any(|(member_path, _)| manifest_path.starts_with(member_path))

Member

Instead of starts_with, we probably want to compare the path exactly. Something like:

Suggested change
- .any(|(member_path, _)| manifest_path.starts_with(member_path))
+ .any(|(member_path, _)| manifest_path.parent() == Some(member_path.as_path()))

Also, you might want to check whether member_path is left unnormalized (e.g., ../crates/* expands to ../crates/foo/Cargo.toml without removing ..). There are a couple of paths::normalize_path calls in the module that I am aware of.
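
A small standalone sketch of why the two comparisons differ and why normalization matters. This is illustrative only: `normalize_lexically` is a simplified stand-in written for this example, not Cargo's `paths::normalize_path`, and `is_member_prefix` / `is_member_exact` are hypothetical helper names.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically drops `.` and resolves `..` without touching the filesystem.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                let ends_with_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                if ends_with_normal {
                    // `a/b/..` collapses to `a`.
                    out.pop();
                } else if !out.has_root() {
                    // A relative path may legitimately start with `..`.
                    out.push("..");
                }
                // Otherwise we are at the root and `..` is dropped.
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Prefix match: also accepts manifests nested deeper inside a member directory.
fn is_member_prefix(manifest: &Path, member_dir: &Path) -> bool {
    manifest.starts_with(member_dir)
}

/// Exact match on the manifest's parent directory, after normalizing both sides.
fn is_member_exact(manifest: &Path, member_dir: &Path) -> bool {
    manifest.parent().map(normalize_lexically) == Some(normalize_lexically(member_dir))
}
```

The prefix check accepts a manifest nested below a member directory, while the exact check does not; an unnormalized member path containing `..` only matches once both sides are normalized.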

/// A `false` return does NOT mean the package is definitely not a member -
/// it could still be a member via path dependencies. Callers should fallback
/// to full workspace loading when this returns `false`.
fn is_explicitly_listed_member(&self, manifest_path: &Path) -> bool {

Member

// default-members are allowed to be excluded, but they
// still must be referred to by the original (unfiltered)
// members list. Note that we aren't testing against the
// manifest path, both because `members_paths` doesn't
// include `/Cargo.toml`, and because excluded paths may not
// be crates.

I found this interesting comment. I wonder what would happen with the current implementation in these situations:

  • A member is listed in workspace.default-members but not in workspace.members.
  • A member is listed in both workspace.default-members and workspace.exclude
  • A member is listed in both workspace.members and workspace.exclude

The new implementation should be aligned with the default args.workspace() behavior IMO.

@moabo3li moabo3li force-pushed the locate_project_workspace branch from 7990081 to 2f54485 Compare January 6, 2026 13:01
Labels

A-cli Area: Command-line interface, option parsing, etc. A-workspaces Area: workspaces Command-locate-project S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cargo locate-project --workspace loads all the workspace

5 participants