consider caching file hashes in the sccache server #758
From a cursory search there appear to be several concurrent hashtable crates out there; one of those might be suitable. Given that this is a cache, there might be some special-purpose data structure that would work better, I don't know.

Would you need to cache by (filename, mtime) for this to be correct? I assume your concern is mostly the time spent hashing rlib / rmeta files, right? The existing cargo fingerprinting probably makes that less of a concern. A data structure that allowed lock-free reads ought to keep the fast path fast, and given that the file hashing code is already async, the write path could be something like:
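A minimal std-only sketch of that shape, assuming a hypothetical `HashCache` type keyed by (path, mtime) — a real implementation would likely use a concurrent map crate and the existing async hashing code rather than the placeholder closure here:

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::RwLock;
use std::time::SystemTime;

/// Hypothetical cache keyed by (path, mtime). An entry for a stale mtime
/// is simply never hit again; eviction is left out of this sketch.
struct HashCache {
    map: RwLock<HashMap<(PathBuf, SystemTime), String>>,
}

impl HashCache {
    fn new() -> Self {
        HashCache { map: RwLock::new(HashMap::new()) }
    }

    /// Fast path: a shared read lock only.
    fn get(&self, key: &(PathBuf, SystemTime)) -> Option<String> {
        self.map.read().unwrap().get(key).cloned()
    }

    /// Slow path: compute the digest (stubbed via a closure here, async
    /// file hashing in reality) and publish it under a write lock.
    fn get_or_insert_with<F: FnOnce() -> String>(
        &self,
        key: (PathBuf, SystemTime),
        hash: F,
    ) -> String {
        if let Some(h) = self.get(&key) {
            return h;
        }
        let h = hash();
        self.map.write().unwrap().entry(key).or_insert(h).clone()
    }
}

fn main() {
    let cache = HashCache::new();
    let key = (PathBuf::from("libfoo.rlib"), SystemTime::UNIX_EPOCH);
    let first = cache.get_or_insert_with(key.clone(), || "digest-1".to_string());
    // The second lookup hits the cache; the hashing closure never runs.
    let second = cache.get_or_insert_with(key, || unreachable!());
    assert_eq!(first, second);
    println!("cached digest: {}", second);
}
```

With a lock-free or sharded map the read path would avoid even the shared lock; the overall structure stays the same.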
Honestly if you get updated to the latest
It can't be entirely correct, since there are ways to modify files without updating mtime. But yeah, on Windows that's probably a decent heuristic. On Unix systems
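To make the (filename, mtime) idea concrete, here is a small sketch of building such a key from a single stat call — `stat_key` is a hypothetical helper, and pairing mtime with the file length is an assumption to make the heuristic slightly harder to fool, not something the discussion above specifies:

```rust
use std::fs;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical cache key: (mtime, length), both from one metadata call.
/// This is only a heuristic -- a tool that rewrites a file and then
/// restores its timestamp produces the same key for different contents.
fn stat_key(path: &Path) -> std::io::Result<(SystemTime, u64)> {
    let meta = fs::metadata(path)?;
    Ok((meta.modified()?, meta.len()))
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("sccache_stat_key_demo.txt");
    fs::write(&path, b"hello")?;
    let k1 = stat_key(&path)?;
    let k2 = stat_key(&path)?;
    // An untouched file yields a stable key, so its hash can be reused.
    assert_eq!(k1, k2);
    fs::remove_file(&path)?;
    Ok(())
}
```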
...so that, particularly for Rust compilations, we don't waste a bunch of time re-hashing the same files over and over again.
I don't know how much this really helps, because it probably hurts a little bit (?) on single-shot builds where the server doesn't stay up very long (e.g. Firefox automation builds), though maybe not having to touch the disk (or the kernel's file cache, or whatever) is a win overall. I'm also unsure how easy it is to arrange things so you don't wind up badly serializing the whole process on your hash cache; maybe the locking overhead would really not be that large.
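One common way to keep a shared cache from serializing everything is to shard it across several independent locks. The sketch below is one such approach, not necessarily what sccache would do; `ShardedCache` and its shard count are made up for illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical sharded cache: N independent mutexes, so concurrent
/// compilations hashing different files rarely contend on the same lock.
struct ShardedCache {
    shards: Vec<Mutex<HashMap<String, String>>>,
}

impl ShardedCache {
    fn new(n: usize) -> Self {
        ShardedCache {
            shards: (0..n).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    /// Pick a shard by hashing the key.
    fn shard(&self, key: &str) -> &Mutex<HashMap<String, String>> {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        &self.shards[(h.finish() as usize) % self.shards.len()]
    }

    fn insert(&self, key: String, val: String) {
        self.shard(&key).lock().unwrap().insert(key, val);
    }

    fn get(&self, key: &str) -> Option<String> {
        self.shard(key).lock().unwrap().get(key).cloned()
    }
}

fn main() {
    let cache = ShardedCache::new(16);
    cache.insert("a.rs".to_string(), "digest-a".to_string());
    assert_eq!(cache.get("a.rs"), Some("digest-a".to_string()));
    assert_eq!(cache.get("b.rs"), None);
}
```

Whether the extra structure pays off depends on exactly the contention question raised above; for a short-lived server a single mutex may well be fine.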