-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Renaming the project root dir invalidates compilation cache #3273
Comments
I believe Cargo stores absolute path names in various places of its crate cache, which is likely leading to this. |
That sounds like a likely source. In what cases wouldn't it be preferable to work with relative paths (to the directory containing Cargo.toml)? |
Ideally, yeah, although I don't think this would be easy to do at all in Cargo, the absolute paths are plumbed pretty far throughout the whole system. If possible I'd recommend getting the same directory working on CI (like Travis) |
Dear @alexcrichton, |
The problem here is that we store everything as an absolute path in memory, and then that gets hashed and/or serialized to disk. Changing the in-memory representation is likely to be difficult, so is changing how it's serialized/hashed. Unfortunately the fix here will probably be hard. |
Isn't it a matter of changing the places that pass the absolute path to the hashing function, so that they pass the relative path instead? Are there a lot of those? |
Patches are of course always welcome! I probably don't personally have time to work on this, but I can help out if needed. |
I might not have time this week, either, but I'd love to have a deeper understanding of what changes are involved and how much work it would be. I've done some searching around the repo to figure out where this hash is generated - if you could point in the right direction of the relevant files/code bits, that would super helpful! |
A |
I took a look at the source code and I believe these are the relevant places. I tested it locally by hard-coding fixed strings/paths at these places, but a real implementation would replace these absolute paths with relative ones that do not change when the whole directory is moved. However, that only solves the first fingerprint. cargo will then check the fingerprint of |
I believe Fingerprints being expected at absolute paths happens here -- https://github.com/alubbe/cargo/blob/f6500c6b6480c89492773bff6d41c4abf6c5ee15/src/cargo/ops/cargo_rustc/fingerprint.rs#L672-L687 Which... makes me think that this may almost not ever work? |
In intelliJ there's also a problem of project local paths. It's solved in a hackish, but simple and transparent way: in memory, all paths are absolute. During serialization, any string which contains the absolute path to the project is replaced with |
@matklad that seems like a great solution! |
IIUC we can re-use that approach so that not only is the absolute path replaced before serialization, but also before hashing (in the places I mentioned before). |
One related issue is that build scripts may capture an absolute path, and are not currently scheduled to re-run if the project root directory changes. I've just hit this. |
Avoid rebuilding a project when cwd changes This commit is targeted at solving a use case which typically comes up during CI builds -- the `target` directory is cached between builds but the cwd of the build changes over time. For example the following scenario can happen: 1. A project is compiled at `/projects/a`. 2. The `target` directory is cached. 3. A new build is started in `/projects/b`. 4. The previous `target` directory is restored to `/projects/b`. 5. The build start, and Cargo rebuilds everything. The last piece of behavior is indeed unfortunate! Cargo's internal hashing currently isn't that resilient to changing cwd and this PR aims to help improve the situation! The first point of too-much-hashing came up with `Target::src_path`. Each `Target` was hashed and stored for all compilations, and the `src_path` field was an absolute path on the filesystem to the file that needed to be compiled. This path then changed over time when cwd changed, but otherwise everything else remained the same! This commit updates the handling of the `src_path` field to simply ignore it when hashing. Instead the path we actually pass to rustc is later calculated and then passed to the fingerprint calculation. The next problem this fixes is that the dep info files were augmented after creation to have the cwd of the compiler at the time to find the files at a later date. This, unfortunately, would cause issues if the cwd itself changed. Instead the cwd is now left out of dep-info files (they're no longer augmented) and instead the cwd is recalculated when parsing the dep info later. The final problem that this commit fixes is actually an existing issue in Cargo today. Right now you can actually execute `cargo build` from anywhere in a project and Cargo will execute the build. Unfortunately though the argument to rustc was actually different depending on what directory you were in (the compiler was invoked with a path relative to cwd). This path ends up being used for metadata like debuginfo which means that different directories would cause different artifacts to be created, but Cargo wouldn't rerun the compiler! To fix this issue the matter of cwd is now entirely excluded from compilation command lines. Instead rustc is unconditionally invoked with a relative path *if* the path is underneath the workspace root, and otherwise it's invoked as an absolute path (in which case the cwd doesn't matter). Once all these fixes were added up it means that now we can have projects where if you move the entire directory Cargo won't rebuild the original source! Note that this may be a bit of a breaking change, however. This means that the paths in error messages for cargo will no longer be unconditionally relative to the current working directory, but rather relative to the root of the workspace itself. Unfortunately this is moreso of a feature right now rather than a bug, so it may be one that we just have to stomach. Closes #3273
I was surprised to learn that renaming the project root dir invalidates the compilation cache. This is something that happens during every CI and deployment run, making it run quite a bit slower when none of the rust code was actually changed. Is this intentional / is there a way to keep the cache fresh when renaming the directory?
The behaviour can be reproduced as follows (stable 1.12.1):
The text was updated successfully, but these errors were encountered: