Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix spurious rebuilds when switching source paths #1697

Merged
merged 3 commits into from
Jun 8, 2015

Conversation

alexcrichton
Copy link
Member

The method of creating package ids in Cargo means that all sub-crates of a main
repo have the same package id, which encodes the path it came from. This means
that if the "root crate" switches, the package id for all dependencies will
change, causing an alteration in package id hashes, causing recompiles.

This commit alters a few points of hashing to ensure that whenever a package is
being hashed for freshness the source root of the crate is used instead of the
root of the main crate. This cause the hashes to be consistent across compiles,
regardless of the root package.

Closes #1694

The package already has access to the source through the stored manifest's
package ID.
The method of creating package ids in Cargo means that all sub-crates of a main
repo have the same package id, which encodes the path it came from. This means
that if the "root crate" switches, the package id for all dependencies will
change, causing an alteration in package id hashes, causing recompiles.

This commit alters a few points of hashing to ensure that whenever a package is
being hashed for freshness the *source root* of the crate is used instead of the
root of the main crate. This cause the hashes to be consistent across compiles,
regardless of the root package.

Closes rust-lang#1694
@rust-highfive
Copy link

r? @wycats

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

r? @brson

@rust-highfive rust-highfive assigned brson and unassigned wycats Jun 8, 2015
@brson
Copy link
Contributor

brson commented Jun 8, 2015

@bors r+

@bors
Copy link
Contributor

bors commented Jun 8, 2015

📌 Commit ddf3c79 has been approved by brson

@bors
Copy link
Contributor

bors commented Jun 8, 2015

⌛ Testing commit ddf3c79 with merge 1ae683a...

bors added a commit that referenced this pull request Jun 8, 2015
The method of creating package ids in Cargo means that all sub-crates of a main
repo have the same package id, which encodes the path it came from. This means
that if the "root crate" switches, the package id for all dependencies will
change, causing an alteration in package id hashes, causing recompiles.

This commit alters a few points of hashing to ensure that whenever a package is
being hashed for freshness the *source root* of the crate is used instead of the
root of the main crate. This cause the hashes to be consistent across compiles,
regardless of the root package.

Closes #1694
@bors
Copy link
Contributor

bors commented Jun 8, 2015

@bors bors merged commit ddf3c79 into rust-lang:master Jun 8, 2015
alexcrichton added a commit to alexcrichton/cargo that referenced this pull request Jun 12, 2015
The fix in rust-lang#1697 was to alter the `Hash` implementation of `Package` to take
into account the source root instead of just the package id (so packages at the
same source would have the same hash). There's a spot, however, where we
normalize some paths and not others, causing two paths which logically point the
same location end up having different hashes, so the same package ended up being
stored multiple times, later leading to an assertion that there was only one
stored.

This commit alters the behavior of read_packages to keep a hash map of ID =>
Package instead of just a hash set of Package as the deduplication/equality
needs to be based on the ID, not the literal path to the source root.

This specific bug shows up when you have an override and a normal dependency
pointing at the same location (and the override has some inter-dependencies).
Not exactly a normal use case, but it showed up in Servo's build!
bors added a commit that referenced this pull request Jun 12, 2015
The fix in #1697 was to alter the `Hash` implementation of `Package` to take
into account the source root instead of just the package id (so packages at the
same source would have the same hash). There's a spot, however, where we
normalize some paths and not others, causing two paths which logically point the
same location end up having different hashes, so the same package ended up being
stored multiple times, later leading to an assertion that there was only one
stored.

This commit alters the behavior of read_packages to keep a hash map of ID =>
Package instead of just a hash set of Package as the deduplication/equality
needs to be based on the ID, not the literal path to the source root.

This specific bug shows up when you have an override and a normal dependency
pointing at the same location (and the override has some inter-dependencies).
Not exactly a normal use case, but it showed up in Servo's build!
@alexcrichton alexcrichton deleted the fix-path-pkgid branch June 15, 2015 05:05
bors added a commit that referenced this pull request Mar 4, 2016
All crates being compiled by Cargo are identified by a unique `PackageId` instance. This ID incorporates information such as the name, version, and source from where the crate came from. Package ids are allowed to have path sources to depend on local crates on the filesystem. The package id itself encodes the path of where the crate came from.

Historically, however, the "path source" from where these packages are learned had some interesting logic. Specifically one specific source would be able to return many packages within. In other words, a path source would recursively walk crate definitions and the filesystem attempting to find crates. Each crate returned from a source has the same source id, so consequently all packages from one source path would have the same source path id.

This in turn leads to confusing an surprising behavior, for example:

* When crates are compiled the status message indicates the path of the crate root, not the crate being compiled
* When viewed from two different locations (e.g. two different crate roots) the same package would have two different source ids because the id is based on the root location.

This hash mismatch has been [papered over](#1697) in the past to try to fix some spurious recompiles, but it unfortunately [leaked back in](#2279). This is clearly indicative of the "hack" being inappropriate so instead these commits fix the root of the problem.

---

In short, these commits ensure that the package id for a package defined locally has a path that points precisely at that package. This was a relatively invasive change and had ramifications on a few specific portions which now require a bit of extra code to support.

The fundamental change here was to change `PathSource` to be non-recursive by default in terms of what packages it thinks it contains. There are still two recursive use cases, git repositories and path overrides, which are used for backwards compatibility. This meant, however, that the packaging step for crate no longer has knowledge of other crates in a repository to filter out files from. Some specific logic was added to assist in discovering a git repository as well as filtering out sibling packages.

Another ramification of this patch, however, is that special care needs to be taken when decoding a lockfile. We now need all path dependencies in the lockfile to point precisely at where the path dependency came from, and this information is not encoded in the lock file. The decoding support was altered to do a simple probe of the filesystem to recursively walk path dependencies to ensure that we can match up packages in a lock file to where they're found on the filesystem.

Overall, however, this commit closes #1697 and also addresses servo/servo#9794 where this issue was originally reported.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants