Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo updating index even with lockfile presence #1710

Closed
aidanhs opened this issue Jun 12, 2015 · 23 comments
Closed

Cargo updating index even with lockfile presence #1710

aidanhs opened this issue Jun 12, 2015 · 23 comments

Comments

@aidanhs
Copy link
Member

aidanhs commented Jun 12, 2015

The snapshot version of cargo used for compiling cargo itself doesn't exhibit this. Not sure if it was an intentional change...

~/Desktop/rust/rust-ptrace $ cargo --version
cargo 0.3.0-nightly (ae8b752 2015-06-12) (built 2015-06-11)
~/Desktop/rust/rust-ptrace $ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
^C
~/Desktop/rust/rust-ptrace $ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
^C
~/Desktop/rust/rust-ptrace $ ../cargo/target/snapshot/cargo/bin/cargo --version
cargo 0.0.1-pre-nightly (84d6d2c 2015-03-31) (built 2015-04-01)
~/Desktop/rust/rust-ptrace $ ../cargo/target/snapshot/cargo/bin/cargo build
   Compiling rustc-serialize v0.3.14
   Compiling rand v0.3.8
^C
~/Desktop/rust/rust-ptrace $ ../cargo/target/snapshot/cargo/bin/cargo build
   Compiling rustc-serialize v0.3.14
   Compiling rand v0.3.8
^C
@aidanhs
Copy link
Member Author

aidanhs commented Jun 12, 2015

I can obviously reproduce so I'm doing digging. Just raised in case someone has immediate thoughts.

@aidanhs
Copy link
Member Author

aidanhs commented Jun 12, 2015

With

 $ git diff
diff --git a/src/cargo/lib.rs b/src/cargo/lib.rs
index e5c5724..61b401f 100644
--- a/src/cargo/lib.rs
+++ b/src/cargo/lib.rs
@@ -1,4 +1,3 @@
-#![deny(unused)]
 #![feature(file_type, dir_entry_ext)]
 #![cfg_attr(test, deny(warnings))]

diff --git a/src/cargo/sources/registry.rs b/src/cargo/sources/registry.rs
index 6dfdfcf..e23ae92 100644
--- a/src/cargo/sources/registry.rs
+++ b/src/cargo/sources/registry.rs
@@ -451,6 +451,7 @@ impl<'cfg> RegistrySource<'cfg> {

         try!(self.config.shell().status("Updating",
              format!("registry `{}`", self.source_id.url())));
+        panic!("oh no");
         let repo = try!(self.open());

         // git fetch origin
@@ -478,6 +479,7 @@ impl<'cfg> Registry for RegistrySource<'cfg> {
         // theory the registry is known to contain this version. If, however, we
         // come back with no summaries, then our registry may need to be
         // updated, so we fall back to performing a lazy update.
+        println!("{:?} {}", dep, dep.name());
         if dep.source_id().precise().is_some() {
             let mut summaries = try!(self.summaries(dep.name())).iter().map(|s| {
                 s.0.clone()

I get

$ ../cargo/target/x86_64-unknown-linux-gnu/debug/cargo build
Dependency { name: "bitflags", source_id: SourceId { inner: SourceIdInner { url: Url { scheme: "https", scheme_data: Relative(RelativeSchemeData { username: "", password: None, host: Domain("github.com"), port: None, default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, canonical_url: Url { scheme: "https", scheme_data: Relative(RelativeSchemeData { username: "", password: None, host: Domain("github.com"), port: Some(443), default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, kind: Registry, precise: Some("locked") } }, req: VersionReq { predicates: [Predicate { op: Ex, major: 0, minor: Some(1), patch: Some(1) }] }, specified_req: Some("0.1"), kind: Normal, only_match_name: false, optional: false, default_features: true, features: [], only_for_platform: None } bitflags
    Updating registry `https://github.com/rust-lang/crates.io-index`
thread '<main>' panicked at 'oh no', src/cargo/sources/registry.rs:454

But

~/.cargo/registry/index $ ls
github.com-1ecc6299db9ec823
~/.cargo/registry/index $ cat github.com-1ecc6299db9ec823/bi/tf/bitflags 
{"name":"bitflags","vers":"0.1.0","deps":[],"cksum":"1385e296b345ba9678db8b9363e55f5225e952878c75c6737262d84b1c9cb4db","features":{},"yanked":false}
{"name":"bitflags","vers":"0.1.1","deps":[],"cksum":"2a6577517ecd0ee0934f48a7295a89aaef3e6dfafeac404f94c0b3448518ddfe","features":{},"yanked":false}
{"name":"bitflags","vers":"0.2.0","deps":[],"cksum":"6aade0405f0d91bb6e5ce16199bb6c11da4bb0940247120e832b382ddd37b28b","features":{},"yanked":true}
{"name":"bitflags","vers":"0.2.1","deps":[],"cksum":"a41f80ec2e140d19e789764fdf22d0f2da98fe7e55d26f99db59cb3d2605d327","features":{},"yanked":false}

It's not changed since then, so it wouldn't be getting anything new anyway...

~/.cargo/registry/index $ curl https://raw.githubusercontent.com/rust-lang/crates.io-index/master/bi/tf/bitflags
{"name":"bitflags","vers":"0.1.0","deps":[],"cksum":"1385e296b345ba9678db8b9363e55f5225e952878c75c6737262d84b1c9cb4db","features":{},"yanked":false}
{"name":"bitflags","vers":"0.1.1","deps":[],"cksum":"2a6577517ecd0ee0934f48a7295a89aaef3e6dfafeac404f94c0b3448518ddfe","features":{},"yanked":false}
{"name":"bitflags","vers":"0.2.0","deps":[],"cksum":"6aade0405f0d91bb6e5ce16199bb6c11da4bb0940247120e832b382ddd37b28b","features":{},"yanked":true}
{"name":"bitflags","vers":"0.2.1","deps":[],"cksum":"a41f80ec2e140d19e789764fdf22d0f2da98fe7e55d26f99db59cb3d2605d327","features":{},"yanked":false}

I'm going to do a git bisect...

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

Bisect definitely identifies e5a870b as the culprit...

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

I reverted just the change to source.rs, but no joy. Apparently just upgrading rust-url somehow causes the issue.

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

Ah! It's looking at a different registry folder.

  • the registry folder used to clone is partially determined by a hash of SourceId
  • the hash of SourceId is determined by a hash of the url
  • the hash of the url has changed for some reason...

Previously:

SourceId { inner: SourceIdInner { url: Url { scheme: "https", scheme_data: Relative(RelativeSchemeData { username: "", password: None, host: Domain("gith
ub.com"), port: None, default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, canonical_url: Url { scheme: "https", scheme_data: Relative(RelativeSch
emeData { username: "", password: None, host: Domain("github.com"), port: Some(443), default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, kind: Re
gistry, precise: Some("locked") } }

short hash: 1ecc6299db9ec823

Now:

SourceId { inner: SourceIdInner { url: Url { scheme: "https", scheme_data: Relative(RelativeSchemeData { username: "", password: None, host: Domain("gith
ub.com"), port: None, default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, canonical_url: Url { scheme: "https", scheme_data: Relative(RelativeSch
emeData { username: "", password: None, host: Domain("github.com"), port: Some(443), default_port: Some(443), path: ["rust-lang", "crates.io-index"] }), query: None, fragment: None }, kind: Re
gistry, precise: Some("locked") } }

short hash: 0a35038f75765ae4

I see no difference in sourceid.

Has deriving PartialOrd in url changed the hash?

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

I get different hash values between url version 0.2.34 and 0.2.35.

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

Raised servo/rust-url#117

@aidanhs
Copy link
Member Author

aidanhs commented Jun 13, 2015

Closing, not really a cargo issue, was just annoying to hit when I had no internet.

@aidanhs aidanhs closed this as completed Jun 13, 2015
@alexcrichton
Copy link
Member

@aidanhs hm it should definitely not be the case that a cargo instance should continue updating after it's updated once, but it is possible for cargo to require an update across cargo versions (due to the underlying hashes change, like you found). If you keep running into this though please feel free to reopen!

@aidanhs
Copy link
Member Author

aidanhs commented Nov 4, 2015

/work/dayerproj/tar-rs $ git checkout -
Switched to branch 'aphs-file-entries'
Your branch is up-to-date with 'mine/aphs-file-entries'.
/work/dayerproj/tar-rs $ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling tar v0.3.1 (file:///work/dayerproj/tar-rs)
/work/dayerproj/tar-rs $ git checkout -
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
/work/dayerproj/tar-rs $ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling tar v0.3.1 (file:///work/dayerproj/tar-rs)

Just when I was hoping to do some rust programming offline...

@aidanhs aidanhs reopened this Nov 4, 2015
@aidanhs
Copy link
Member Author

aidanhs commented Nov 4, 2015

I've got two machines with this version of cargo.

$ cargo --version
cargo 0.7.0-nightly (8acff56 2015-10-31)

One exhibits the annoying updating behaviour, one doesn't.

Broken machine:

$ RUST_LOG=debug cargo build
DEBUG:cargo::build: executing; cmd=cargo-build; args=["cargo", "build"]
DEBUG:cargo::ops::cargo_compile: compile; manifest-path=/work/dayerproj/tar-rs/Cargo.toml
DEBUG:cargo::ops::cargo_compile: loaded package; package=tar v0.3.1 (file:///work/dayerproj/tar-rs)
DEBUG:cargo::ops::cargo_compile: loaded config; configs={}
DEBUG:cargo::core::registry: load/missing  file:///work/dayerproj/tar-rs
DEBUG:cargo::core::registry: load/missing  registry https://github.com/rust-lang/crates.io-index
DEBUG:cargo::core::registry: load/mismatch registry https://github.com/rust-lang/crates.io-index
    Updating registry `https://github.com/rust-lang/crates.io-index`
^C

Good machine:

$ RUST_LOG=debug cargo build
DEBUG:cargo::build: executing; cmd=cargo-build; args=["cargo", "build"]
DEBUG:cargo::ops::cargo_compile: compile; manifest-path=/space/ahobsons/rust/tar-rs/Cargo.toml
DEBUG:cargo::ops::cargo_compile: loaded package; package=tar v0.3.1 (file:///space/ahobsons/rust/tar-rs)
DEBUG:cargo::ops::cargo_compile: loaded config; configs={"registry": {"token": pzbdF0qCqRGFXmlod4y0tq7KAes1AD8K (from /home/ahobsons/.cargo/config)}}
DEBUG:cargo::core::registry: load/missing  file:///space/ahobsons/rust/tar-rs
DEBUG:cargo::core::registry: load/missing  registry https://github.com/rust-lang/crates.io-index
DEBUG:cargo::core::registry: load/match    registry https://github.com/rust-lang/crates.io-index
[...]

Good machine after nuking the cargo config

$ RUST_LOG=debug cargo build                                                                                                                
DEBUG:cargo::build: executing; cmd=cargo-build; args=["cargo", "build"]
DEBUG:cargo::ops::cargo_compile: compile; manifest-path=/space/ahobsons/rust/tar-rs/Cargo.toml
DEBUG:cargo::ops::cargo_compile: loaded package; package=tar v0.3.1 (file:///space/ahobsons/rust/tar-rs)
DEBUG:cargo::ops::cargo_compile: loaded config; configs={}
DEBUG:cargo::core::registry: load/missing  file:///space/ahobsons/rust/tar-rs
DEBUG:cargo::core::registry: load/missing  registry https://github.com/rust-lang/crates.io-index
DEBUG:cargo::core::registry: load/match    registry https://github.com/rust-lang/crates.io-index
[...]

So something to do with the mismatch.
The indexes compared as identical:

~/.cargo/registry $ find index/ -name .git -prune -o -type f -print | sort | xargs sha1sum | sha1sum                                                                    
13b813c4150d043bf78d6171fde94765e2dcfadc  -

Edit: yes I've reset my API token after being an idiot and pasting it above

@alexcrichton
Copy link
Member

Do you know of perhaps any major differences between the two machines? It looks like the logs aren't too too useful here but if I gave you a build with extra logging would you be able to repro again?

@aidanhs
Copy link
Member Author

aidanhs commented Nov 7, 2015

One's inside a docker container (the broken one), the other is an actual machine.

Yes, I can consistently repro. Don't worry about a build with extra logging, I'll start tracking it down when I have reliably internet again.

@alexcrichton
Copy link
Member

OK, thanks for taking a look @aidanhs!

@aidanhs
Copy link
Member Author

aidanhs commented Nov 12, 2015

21:21 < aidanhs> anyone know what could be causing cargo to constantly try to update the registry?
21:21 < aidanhs> it works as expected on one machine (no update when switching branches) but the same cargo version does try and update every time on a different computer
Day changed to 04 Nov 2015
11:45 < mbrubeck> aidanhs: Maybe differences in the system clock or filesystem timestamps between the computers?  Also, does either have any path overrides in .cargo/config?
Day changed to 05 Nov 2015
08:15 < larsberg> aidanhs: a paths override (like mbrubeck mentioned) will hit the network every time if it or any of its deps have a "*" semver requirement, in my experience
/work/dayerproj/tar-rs $ stat .cargo ../.cargo ../../.cargo ../../../.cargo
stat: cannot stat '.cargo': No such file or directory
stat: cannot stat '../.cargo': No such file or directory
stat: cannot stat '../../.cargo': No such file or directory
stat: cannot stat '../../../.cargo': No such file or directory

@aidanhs
Copy link
Member Author

aidanhs commented Nov 12, 2015

$ cargo new --bin tmp
$ cd tmp/
$ echo 'libc = "0.1"' >> Cargo.toml
$ git add .
$ git commit -m commit1
[...]
$ git tag 1
$ sed -i 's/^libc.*/libc = "0.2"/' Cargo.toml
$ git add Cargo.toml
$ git commit -m commit2
[...]
$ git tag 2
$ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling libc v0.2.2
   Compiling tmp v0.1.0 (file:///tmp/tmp.T6N8TTAGmw/tmp)
$ git checkout 1
[...]
$ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling libc v0.1.12
   Compiling tmp v0.1.0 (file:///tmp/tmp.T6N8TTAGmw/tmp)
$ git checkout 2
[...]
$ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling tmp v0.1.0 (file:///tmp/tmp.T6N8TTAGmw/tmp)

Even though it already has libc downloaded (and compiled by the third time we do the build) it updates the index because it needs to change the Cargo.lock, due to Cargo.toml changing.
It was working on another machine because that clone hadn't been updated in a while and didn't have the altered Cargo.toml.

@aidanhs
Copy link
Member Author

aidanhs commented Nov 12, 2015

This is a case where the stars do align (#1882 (comment)) but cargo doesn't realise. Closing in favour of that issue.

@aidanhs aidanhs closed this as completed Nov 12, 2015
@aidanhs
Copy link
Member Author

aidanhs commented May 23, 2016

Ok, back to the originally reported issue this time.

hm it should definitely not be the case that a cargo instance should continue updating after it's updated once, but it is possible for cargo to require an update across cargo versions (due to the underlying hashes change, like you found). If you keep running into this though please feel free to reopen!

I've run into it again (culprit is a rust-url upgrade, like last time), and now I have two sets of registry index folders. Anything with 88ac128001ac3a9a in the name is presumably going to sit around forever now?

$ ls -l ~/.cargo/registry/*  
/myhome/.cargo/registry/cache:
total 16
drwxr-xr-x 2 user usergroup  4096 May 23 11:08 github.com-1ecc6299db9ec823
drwxr-xr-x 2 user usergroup 12288 May  4 16:34 github.com-88ac128001ac3a9a

/myhome/.cargo/registry/index:
total 24
drwxr-xr-x 451 user usergroup 12288 May 23 11:08 github.com-1ecc6299db9ec823
drwxr-xr-x 447 user usergroup 12288 Apr 29 14:45 github.com-88ac128001ac3a9a

/myhome/.cargo/registry/src:
total 16
drwxr-xr-x 131 user usergroup  4096 May 23 11:08 github.com-1ecc6299db9ec823
drwxr-xr-x 205 user usergroup 12288 May  4 16:34 github.com-88ac128001ac3a9a
$ du -sh ~/.cargo/registry/*/*
23M     /myhome/.cargo/registry/cache/github.com-1ecc6299db9ec823
28M     /myhome/.cargo/registry/cache/github.com-88ac128001ac3a9a
65M     /myhome/.cargo/registry/index/github.com-1ecc6299db9ec823
65M     /myhome/.cargo/registry/index/github.com-88ac128001ac3a9a
107M    /myhome/.cargo/registry/src/github.com-1ecc6299db9ec823
146M    /myhome/.cargo/registry/src/github.com-88ac128001ac3a9a

It seems like it might be worthwhile to take the recommendation from the rust-url issue so that the hash value doesn't keep changing:

The serialization of URLs should be much more stable (since it affect interoperability and has a spec) so I’d recommend basing your hashing on that rather than the built-in Hash trait.

To avoid allocating a serialized string, you could have a specialized std::fmt::Write implementation that does the hashing.

@alexcrichton
Copy link
Member

That's a good point! Although I believe the Hash impl for Url now does that by default, and we should be updated to that version as well. So in that sense we're already done!

@aidanhs
Copy link
Member Author

aidanhs commented May 23, 2016

This issue was originally raised because Url moved away from basing it on serialisation!

How would you feel about me bringing the current url hash impl into cargo to make a guarantee about it staying stable?

@alexcrichton
Copy link
Member

I'd personally prefer to stick to Hash as it's more ergonomic, and I suspect that changing the impl may be a breaking change for the url crate at this point?

@aidanhs
Copy link
Member Author

aidanhs commented May 24, 2016

I don't think I expressed myself very well - I'll make a PR for clarity, it'll only be changing a couple of lines I think.

@alexcrichton
Copy link
Member

Sure I guess it can't hurt to see what the changes are.

aidanhs added a commit to aidanhs/cargo that referenced this issue May 24, 2016
bors added a commit that referenced this issue May 24, 2016
…hton

Hash url serialization to ensure hash stability

`as_str` is documented to return the serialization of a url (http://servo.github.io/rust-url/url/struct.Url.html#method.as_str).
The serialization of a url has a spec (https://url.spec.whatwg.org/#url-serializing).

Hashing the serialization of a url insulates cargo against possible decisions to alter how rust-url does hashing (I've observed it happening twice). Note that rust-url happens to do this internally (but is not guaranteed to in future) and so this change doesn't actually have any effect on hash values.

r? @alexcrichton

Closes #1710
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants