Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo rm unexpectedly removes patches #12419

Closed
dtolnay opened this issue Jul 31, 2023 · 17 comments · Fixed by #12454
Closed

cargo rm unexpectedly removes patches #12419

dtolnay opened this issue Jul 31, 2023 · 17 comments · Fixed by #12454
Assignees
Labels
C-bug Category: bug Command-remove E-medium Experience: Medium S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review

Comments

@dtolnay
Copy link
Member

dtolnay commented Jul 31, 2023

Problem

When I run cargo rm dependency, I expect Cargo to remove the crate dependency from the current crate's [dependencies], and other related necessary changes, like removing dependency from the right-hand side of features that previously enabled an optional dependency on "dependency" or one of its features "dependency/…".

Cargo does these things, but it also wipes out my patch.crates-io section from the workspace's manifest. I could not figure out how this seems related in any way to removing a single dependency, so I think this is a bug.

Steps

# Cargo.toml

[workspace]
members = ["serde", "serde_derive"]

[patch.crates-io]
serde = { path = "serde" }
# serde/Cargo.toml

[package]
name = "serde"
version = "1.0.0"

[dependencies]
serde_derive = { path = "../serde_derive" }
# serde_derive/Cargo.toml

[package]
name = "serde_derive"
version = "1.0.0"

+serde/src/lib.rs and serde_derive/src/lib.rs which can be empty files.

In the serde directory, run:

$ cargo check  # sanity check--the manifests are valid

$ cargo rm serde_derive

Expected diff:

diff --git a/serde/Cargo.toml b/serde/Cargo.toml
index 759de5d..10027d7 100644
--- a/serde/Cargo.toml
+++ b/serde/Cargo.toml
@@ -3,6 +3,3 @@
 [package]
 name = "serde"
 version = "1.0.0"
-
-[dependencies]
-serde_derive = { path = "../serde_derive" }

Actual diff produced by Cargo:

diff --git a/Cargo.toml b/Cargo.toml
index 583dd2f..60ffc4f 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -2,6 +2,3 @@
 
 [workspace]
 members = ["serde", "serde_derive"]
-
-[patch.crates-io]
-serde = { path = "serde" }
diff --git a/serde/Cargo.toml b/serde/Cargo.toml
index 759de5d..10027d7 100644
--- a/serde/Cargo.toml
+++ b/serde/Cargo.toml
@@ -3,6 +3,3 @@
 [package]
 name = "serde"
 version = "1.0.0"
-
-[dependencies]
-serde_derive = { path = "../serde_derive" }

Possible Solution(s)

No response

Notes

No response

Version

cargo 1.73.0-nightly (7ac9416d8 2023-07-24)
release: 1.73.0-nightly
commit-hash: 7ac9416d82cd4fc5e707c9ec3574d22dff6466e5
commit-date: 2023-07-24
host: x86_64-unknown-linux-gnu
libgit2: 1.6.4 (sys:0.17.2 vendored)
libcurl: 8.1.2-DEV (sys:0.4.63+curl-8.1.2 vendored ssl:OpenSSL/1.1.1u)
ssl: OpenSSL 1.1.1u  30 May 2023
os: Ubuntu 22.04 (jammy) [64-bit]
@dtolnay dtolnay added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels Jul 31, 2023
@epage
Copy link
Contributor

epage commented Jul 31, 2023

If I'm reading the reproduction steps correctly, the serde patch is unused. This is from #11194.

Note the GC process was expanded to clean up not just references to the dependencies just removed, but also references of all dependencies. This was because there's not an easy way to determine which dependencies correspond to the given TOML keys, without either 1) figuring that out before the removal (therefore having to predict the behavior), or 2) returning that information from the remove function (somewhat unorthodox for an op).

Note that this is referencing some of the structure of cargo. This was implemented as remove and gc_workspace functions which adds some burden for connecting the pieces (but not terribly).

This seemed ok because the chance of this seemed low and people could selecting add/revert the relevant parts.

@dtolnay
Copy link
Member Author

dtolnay commented Jul 31, 2023

If I'm reading the reproduction steps correctly, the serde patch is unused.

In serde's repo where I hit this, the patch is definitely used.

$ cargo tree
serde v1.0.179 (/git/serde/serde)
[dev-dependencies]
└── serde_derive v1.0.179 (proc-macro) (/git/serde/serde_derive)
    ├── proc-macro2 v1.0.66
    │   └── unicode-ident v1.0.11
    ├── quote v1.0.32
    │   └── proc-macro2 v1.0.66 (*)
    └── syn v2.0.28
        ├── proc-macro2 v1.0.66 (*)
        ├── quote v1.0.32 (*)
        └── unicode-ident v1.0.11
    [dev-dependencies]
    └── serde v1.0.179 (/git/serde/serde) (*)

serde_derive v1.0.179 (proc-macro) (/git/serde/serde_derive) (*)

serde_derive_internals v0.28.0 (/git/serde/serde_derive_internals)
├── proc-macro2 v1.0.66 (*)
├── quote v1.0.32 (*)
└── syn v2.0.28 (*)

serde_test_suite v0.0.0 (/git/serde/test_suite)
└── serde v1.0.179 (/git/serde/serde) (*)
[dev-dependencies]
├── automod v1.0.12 (proc-macro)
│   ├── proc-macro2 v1.0.66 (*)
│   ├── quote v1.0.32 (*)
│   └── syn v2.0.28 (*)
├── fnv v1.0.7
├── rustversion v1.0.14 (proc-macro)
├── serde v1.0.179 (/git/serde/serde) (*)
├── serde_derive v1.0.179 (proc-macro) (/git/serde/serde_derive) (*)
├── serde_test v1.0.176
│   └── serde v1.0.179 (/git/serde/serde) (*)
└── trybuild v1.0.82
    ├── basic-toml v0.1.4
    │   └── serde v1.0.179 (/git/serde/serde) (*)
    ├── dissimilar v1.0.7
    ├── glob v0.3.1
    ├── once_cell v1.18.0
    ├── serde v1.0.179 (/git/serde/serde) (*)
    ├── serde_derive v1.0.179 (proc-macro)
    ├── serde_json v1.0.104
    │   ├── itoa v1.0.9
    │   ├── ryu v1.0.15
    │   └── serde v1.0.179 (/git/serde/serde) (*)
    └── termcolor v1.2.0

Notice how all of serde_test, trybuild, basic-toml, and serde_json (all from crates.io) depend on the local serde instead of the crates.io serde. That is because of the patch.

@dtolnay
Copy link
Member Author

dtolnay commented Jul 31, 2023

Updated repro:

# serde/Cargo.toml

[package]
name = "serde"
version = "1.0.0"

[dependencies]
serde_derive = { path = "../serde_derive" }

[dev-dependencies]
serde_json = "=1.0.0"

cargo rm serde_derive still incorrectly removes the patch despite the patch being used.

Before:

$ cargo tree
serde v1.0.0 (/git/repro/workspace/serde)
└── serde_derive v1.0.0 (/git/repro/workspace/serde_derive)
[dev-dependencies]
└── serde_json v1.0.0
    ├── dtoa v0.4.8
    ├── itoa v0.3.4
    ├── num-traits v0.1.43
    │   └── num-traits v0.2.16
    │       [build-dependencies]
    │       └── autocfg v1.1.0
    └── serde v1.0.0 (/git/repro/workspace/serde) (*)

After:

$ cargo tree
serde v1.0.0 (/git/repro/workspace/serde)
[dev-dependencies]
└── serde_json v1.0.0
    ├── dtoa v0.4.8
    ├── itoa v0.3.4
    ├── num-traits v0.1.43
    │   └── num-traits v0.2.16
    │       [build-dependencies]
    │       └── autocfg v1.1.0
    └── serde v1.0.179

@dtolnay
Copy link
Member Author

dtolnay commented Jul 31, 2023

Mentioning @cassaundra since you worked on the relevant PR.

@epage
Copy link
Contributor

epage commented Jul 31, 2023

Thanks for clarifying the reproduction case!

Looks like the bug is that we only check workspace member dependencies for a patch applying. That is correct for workspace inheritance but not for patches.

@epage
Copy link
Contributor

epage commented Jul 31, 2023

I wonder if a simple way of resolving this is to generate the lockfile (resolve_ws) before we run gc_workspace and then we can look to see if the patched result is present in what was resolved.

@epage epage added S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review E-medium Experience: Medium and removed S-triage Status: This issue is waiting on initial triage. labels Jul 31, 2023
@alcolmenar
Copy link
Contributor

@epage hey, I'm a first time contributor here. I was wondering if I could claim this issue? Also, if it's ok, are there any glaring gotchas that you see and I should be on the look out for here?

@epage
Copy link
Contributor

epage commented Aug 4, 2023

I'd start with a test that demonstrates the bug, making this its own commit. This way when the commit with the fix will show the change in behavior, making it easier for a reviewer to see that this has the intended affect.

The start of the code in question is here. For the APIs you'll be working with resolve_ws is a good start.

You are also welcome to come to office hours to talk it over.

@alcolmenar
Copy link
Contributor

@epage awesome! thank you for pointing me in the right direction. I'll try working through it.

@rustbot claim

@alcolmenar
Copy link
Contributor

@epage Looks like the source is that the gc_workspace function removes patches where no workspace members have an explicit dependency for the patched crate. This doesn't appear to be the way it should work since the patch can technically override a crate that is a nested dependency of an explicit dependency. Seems like we can't reliably determine if a patch is used until after the workspace has been resolved.

I'm thinking I could break out gc'ing patches to a separate function and run it after resolve_ws using its unused_patches result to determine what to remove. If something is removed, then run resolve_ws again. What do you think?

Additionally, I think there's a bug in unused_patches for path dependencies

pub fn register_used_patches(&mut self, patches: &[Summary]) {

This marks the package unused only if the graph doesn't contain it, but since the patch is a member it'll always be in the graph.

@epage
Copy link
Contributor

epage commented Aug 7, 2023

I'm thinking I could break out gc'ing patches to a separate function and run it after resolve_ws using its unused_patches result to determine what to remove. If something is removed, then run resolve_ws again. What do you think?

Ah, I had overlooked that we already track used patches and can reuse that, nice! I was assuming we'd just walk resolve_wss graph.

This marks the package unused only if the graph doesn't contain it, but since the patch is a member it'll always be in the graph.

I'm not quite following what the problem is.

@alcolmenar
Copy link
Contributor

I'm not quite following what the problem is.

Commenting out the gc patch code, Resolve's unused_patches field from resolve_ws isn't being populated with the patch in the failing example even though it's unused. It seems like there a bug in the code which adds the patch to the unused field

@epage
Copy link
Contributor

epage commented Aug 7, 2023

If I'm understanding correctly, your proposed fix would mean we never GC patches for workspace members? I can live with that, at least for now. This is a minority case (workspace member that is never depended on in the workspace) and people might want to keep that around.

I'd simplify the test case to the minimum number of variable which is just a transitive dependency that is being patched.

@alcolmenar
Copy link
Contributor

alcolmenar commented Aug 7, 2023

I'm proposing the following:

  1. Fix the issue where unused patches are not being added to the resolve properly for path dependencies
  2. Move patch gc logic to a separate function to be executed after resolve_ws
  3. Use resolve_ws result's unused_patches in gc logic function to automatically remove unused patches
  4. Re-execute resolve_ws if something was removed in previous step

This'll automatically remove unused patches but leave ones that have an actual nested dependency. In its current form, patches are being removed if it isn't overriding a workspace member's dependency.

I do wonder if we should be automatically removing unused patches without an explicit flag from the user. Maybe a flag could be added that'll enable this? --gc-unused-patches?

@epage
Copy link
Contributor

epage commented Aug 7, 2023

Fix the issue where unused patches are not being added to the resolve properly for path dependencies

If I'm understanding correctly, this is unrelated to the current issue and should be handled separately. Feel free to create an issue and investigate it further (I'd want input from others more involved with the resolver to even decide whether this is a bug or not) but I don't think this should be part of the discussion fro this issue, in the same PR, or investigation of it blocking progress on this issue.

@epage
Copy link
Contributor

epage commented Aug 7, 2023

I do wonder if we should be automatically removing unused patches without an explicit flag from the user. Maybe a flag could be added that'll enable this? --gc-unused-patches?

That is a separate discussion as well

@alcolmenar
Copy link
Contributor

alcolmenar commented Aug 7, 2023

Ok, fair enough. I was thinking they should be separate discussions as well. My proposal then is:

  1. Move patch gc logic to a separate function to be executed after resolve_ws
  2. Walk resolve_ws's graph to determine if a patch is used and remove the patch if it isn't
  3. Re-execute resolve_ws if something was removed in previous step

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug Command-remove E-medium Experience: Medium S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants