Feature request: in Base.stale_cachefile, for determining cachefile freshness, use a hash of the source file contents (instead of comparing mtimes) #45541

Closed
DilumAluthge opened this issue Jun 1, 2022 · 2 comments · Fixed by #49866
Labels
compiler:precompilation Precompilation of modules packages Package management and loading

Comments

@DilumAluthge
Member

DilumAluthge commented Jun 1, 2022

Currently, Base.stale_cachefile decides whether a .ji cachefile is fresh relative to its .jl source files by comparing two mtimes for each source file: ftime_req (the mtime of the source file as recorded in the header of the cachefile) and ftime (the current mtime of the source file).

The advantage of this approach is that it is fast. The disadvantage is that it needs workarounds for environments that round, truncate, or otherwise alter mtimes. For example:

julia/base/loading.jl

Lines 2052 to 2054 in 0a55a8e

# Issue #13606: compensate for Docker images rounding mtimes
# Issue #20837: compensate for GlusterFS truncating mtimes to microseconds
# The `ftime != 1.0` condition below provides compatibility with Nix mtime.
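
To make the workarounds concrete, here is a minimal sketch of the kind of comparison those comments describe. It is an illustration and not a verbatim copy of `base/loading.jl`: the function name `mtime_is_stale` is hypothetical, and using `floor` for the rounding case is an assumption. Only `ftime` and `ftime_req` come from the description above.

```julia
# Hypothetical helper (not the actual Base code): returns true when the
# current mtime `ftime` does not match the recorded mtime `ftime_req`
# under any of the tolerated distortions.
function mtime_is_stale(ftime::Float64, ftime_req::Float64)
    return ftime != ftime_req &&                    # exact match
           ftime != floor(ftime_req) &&             # mtime rounded down to whole seconds (assumed form of the Docker case)
           ftime != trunc(ftime_req, digits=6) &&   # mtime truncated to microseconds (GlusterFS case)
           ftime != 1.0                             # fixed mtime of 1.0 (Nix case)
end

recorded = 1.0e9 + 0.5
mtime_is_stale(recorded, recorded)        # same mtime, treated as fresh
mtime_is_stale(1.0e9, recorded)           # whole-second mtime, still treated as fresh
mtime_is_stale(recorded + 2.0, recorded)  # genuinely different mtime, treated as stale
```

Each extra condition widens the set of mtimes that count as "unchanged", which is why every new environment quirk needs another clause.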

Instead of using mtime, we could compute a hash of the contents of each source file and invalidate the cachefile if and only if a hash has changed.

The disadvantage of using hashes is that it would be slower than comparing mtimes, because every source file has to be read. The advantage is that the mtime workarounds become unnecessary, and a cachefile is invalidated only when the contents of a source file have actually been modified.
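
A rough sketch of what a content-based check could look like follows. The helper names `content_hash` and `is_stale_by_hash` are hypothetical, and CRC32c from the standard library is only an assumed choice of checksum for illustration; the proposal above does not name a specific hash.

```julia
using CRC32c  # standard library checksum, chosen here only for illustration

# Hypothetical helper: checksum of a file's full contents.
content_hash(path::AbstractString) = open(crc32c, path)

# Hypothetical check: the source is stale iff its contents no longer
# match the checksum recorded when the cachefile was written.
is_stale_by_hash(path::AbstractString, recorded::UInt32) =
    content_hash(path) != recorded

mktempdir() do dir
    src = joinpath(dir, "Example.jl")
    write(src, "module Example end\n")
    recorded = content_hash(src)

    touch(src)  # changes the mtime but not the contents
    @assert !is_stale_by_hash(src, recorded)

    write(src, "module Example; f() = 1; end\n")  # changes the contents
    @assert is_stale_by_hash(src, recorded)
end
```

In this sketch a `touch` leaves the file fresh, which is the behaviour an mtime comparison cannot provide without special cases.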

@DilumAluthge DilumAluthge added packages Package management and loading compiler:precompilation Precompilation of modules labels Jun 2, 2022
@IanButterworth
Member

If this moves to using hashes, perhaps do a file size check first, so that a size mismatch can fail fast without hashing the file?
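
A minimal sketch of that fail-fast ordering, assuming the cachefile header would record both a size and a checksum for each source file. The function name `is_stale_size_then_hash` and the use of CRC32c are assumptions for illustration only.

```julia
using CRC32c  # assumed checksum, for illustration only

# Hypothetical check: compare the cheap file size first and only read
# the file to compute a checksum when the sizes agree.
function is_stale_size_then_hash(path::AbstractString,
                                 recorded_size::Integer,
                                 recorded_hash::UInt32)
    filesize(path) != recorded_size && return true  # fail fast, no read needed
    return open(crc32c, path) != recorded_hash      # sizes match, compare contents
end

mktempdir() do dir
    src = joinpath(dir, "Example.jl")
    write(src, "f() = 1\n")
    sz, h = filesize(src), open(crc32c, src)

    @assert !is_stale_size_then_hash(src, sz, h)

    write(src, "f() = 2\n")  # same size, different contents: caught by the checksum
    @assert is_stale_size_then_hash(src, sz, h)

    write(src, "f() = 1 + 1\n")  # different size: caught without hashing
    @assert is_stale_size_then_hash(src, sz, h)
end
```

The size comparison only needs file metadata, so the common case of an edited file that changed length avoids reading the file at all.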

@Keno
Member

Keno commented Jun 2, 2022

Also consider making the hashing compatible with fs-verity, so if that's enabled on a file, you can just pick up the hash from there without having to read it.
