-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Precompilation cache files are invalidated when the mtime
s of the Julia installation change
#50918
Comments
I believe this mtime check should be being skipped Line 3124 in d99f249
|
EDIT: I am able to reproduce the issue on Julia 1.9.2. |
mtime
s of the Julia installation changemtime
s of the Julia installation change
EDIT: I am able to reproduce the issue on Julia 1.9.2. |
Yeah we don't ship cache files for stdlibs in 1.8. |
I am able to reproduce this issue on Julia 1.9.2. The details for the 1.9.2 tarball that I used: JULIA_MAJ=1
JULIA_MIN=9
JULIA_PATCH=2
JULIA_OS="mac"
JULIA_ARCH="x64"
JULIA_PLATFORM=mac64
JULIA_TARBALL_SHA256=a2e8eb31a89b26e4a99349303aeff8e8ee780144bbdb1f7eda6f41024d42cadb After I Here are the relevant log messages: (click to expand)
|
mtime
s of the Julia installation changemtime
s of the Julia installation change
In the logs for 1.9.2, I do not see any occurrences of the And I have |
I can reproduce this issue on Julia master (specifically, commit fd38d50). The log messages are different (e.g. a lot of detailed log messages explaining why the different pkgimages were rejected, etc.), but the overall bug is still the same - after |
Here are the relevant log messages for Julia master (specifically, commit fd38d50). Click to expand:
|
Okay, I think that #50919 fixes this (against Julia master), but I don't know enough about this part of the codebase to be able to say for sure. Here is a relevant log snippet from me testing #50919 locally. AFAICT, with #50919, invocations of Click to expand:
|
Here's one of the things that I don't understand. In all of my manual testing, it seems that But in the current Julia master, the "skipping mtime check for stdlib" check is only performed if So I don't really understand what causes us to go down one branch versus the other for the stdlib files. |
Yes, problem is the implementation is buried in the changes there. |
@fatteneder You wrote |
Ups, sorry, yes I meant #49866. |
@fatteneder Would you be able to extract the "file hashing" functionality from #49866 out into a separate PR? I'd like to get a sense of how big that one specific change is. |
Below is a patch I extracted from #49866 which should contain the relevant changes, but did not test in isolation. patchdiff --git a/base/loading.jl b/base/loading.jl
index 6ab2a1cd53..0be7a18f83 100644
--- a/base/loading.jl
+++ b/base/loading.jl
@@ -1653,7 +1653,7 @@ const include_callbacks = Any[]
# used to optionally track dependencies when requiring a module:
const _concrete_dependencies = Pair{PkgId,UInt128}[] # these dependency versions are "set in stone", and the process should try to avoid invalidating them
-const _require_dependencies = Any[] # a list of (mod, path, mtime) tuples that are the file dependencies of the module currently being precompiled
+const _require_dependencies = Any[] # a list of (mod, path, fsize, hash) tuples that are the file dependencies of the module currently being precompiled
const _track_dependencies = Ref(false) # set this to true to track the list of file dependencies
function _include_dependency(mod::Module, _path::AbstractString)
prev = source_path(nothing)
@@ -1664,7 +1664,12 @@ function _include_dependency(mod::Module, _path::AbstractString)
end
if _track_dependencies[]
@lock require_lock begin
- push!(_require_dependencies, (mod, path, mtime(path)))
+ fsize, hash = if isfile(path)
+ UInt64(filesize(path)), open(_crc32c, path, "r")
+ else
+ UInt64(0), UInt32(0)
+ end
+ push!(_require_dependencies, (mod, path, fsize, hash))
end
end
return path, prev
@@ -1779,7 +1784,7 @@ function __require(into::Module, mod::Symbol)
end
uuidkey, env = uuidkey_env
if _track_dependencies[]
- push!(_require_dependencies, (into, binpack(uuidkey), 0.0))
+ push!(_require_dependencies, (into, binpack(uuidkey), UInt64(0), UInt32(0)))
end
return _require_prelocked(uuidkey, env)
finally
@@ -2497,7 +2502,8 @@ end
struct CacheHeaderIncludes
id::PkgId
filename::String
- mtime::Float64
+ fsize::UInt64
+ hash::UInt32
modpath::Vector{String} # seemingly not needed in Base, but used by Revise
end
@@ -2525,8 +2531,10 @@ function parse_cache_header(f::IO)
end
depname = String(read(f, n2))
totbytes -= n2
- mtime = read(f, Float64)
+ fsize = read(f, UInt64)
totbytes -= 8
+ hash = read(f, UInt32)
+ totbytes -= 4
n1 = read(f, Int32)
totbytes -= 4
# map ids to keys
@@ -2616,7 +2624,7 @@ end
function cache_dependencies(f::IO)
_, (includes, _), modules, _... = parse_cache_header(f)
- return modules, map(chi -> (chi.filename, chi.mtime), includes) # return just filename and mtime
+ return modules, map(chi -> chi.filename, includes) # return just filename
end
function cache_dependencies(cachefile::String)
@@ -2967,7 +2975,7 @@ end
if build_id != UInt128(0)
id_build = (UInt128(checksum) << 64) | id.second
if id_build != build_id
- @debug "Ignoring cache file $cachefile for $modkey ($((UUID(id_build)))) since it is does not provide desired build_id ($((UUID(build_id))))"
+ @debug "Ignoring cache file $cachefile for $modkey ($((UUID(id_build)))) since it does not provide desired build_id ($((UUID(build_id))))"
return true
end
end
@@ -3005,13 +3013,13 @@ end
# check if this file is going to provide one of our concrete dependencies
# or if it provides a version that conflicts with our concrete dependencies
# or neither
- skip_timecheck = false
+ skip_hashcheck = false
for (req_key, req_build_id) in _concrete_dependencies
build_id = get(modules, req_key, UInt64(0))
if build_id !== UInt64(0)
build_id |= UInt128(checksum) << 64
if build_id === req_build_id
- skip_timecheck = true
+ skip_hashcheck = true
break
end
@debug "Rejecting cache file $cachefile because it provides the wrong build_id (got $((UUID(build_id)))) for $req_key (want $(UUID(req_build_id)))"
@@ -3019,42 +3027,40 @@ end
end
end
- # now check if this file is fresh relative to its source files
- if !skip_timecheck
+ # now check if this file's content hash has changed relative to its source files
+ if !skip_hashcheck
if !samefile(includes[1].filename, modpath) && !samefile(fixup_stdlib_path(includes[1].filename), modpath)
@debug "Rejecting cache file $cachefile because it is for file $(includes[1].filename) not file $modpath"
return true # cache file was compiled from a different path
end
for (modkey, req_modkey) in requires
# verify that `require(modkey, name(req_modkey))` ==> `req_modkey`
- if identify_package(modkey, req_modkey.name) != req_modkey
- @debug "Rejecting cache file $cachefile because uuid mapping for $modkey => $req_modkey has changed"
+ pkg = identify_package(modkey, req_modkey.name)
+ if pkg != req_modkey
+ @debug "Rejecting cache file $cachefile because uuid mapping for $modkey => $req_modkey has changed, expected $modkey => $pkg"
return true
end
end
for chi in includes
- f, ftime_req = chi.filename, chi.mtime
+ f, fsize_req, hash_req = chi.filename, chi.fsize, chi.hash
if !ispath(f)
_f = fixup_stdlib_path(f)
if isfile(_f) && startswith(_f, Sys.STDLIB)
# mtime is changed by extraction
- @debug "Skipping mtime check for file $f used by $cachefile, since it is a stdlib"
+ @debug "Skipping hash check for file $f used by $cachefile, since it is a stdlib"
continue
end
@debug "Rejecting stale cache file $cachefile because file $f does not exist"
return true
end
- ftime = mtime(f)
- is_stale = ( ftime != ftime_req ) &&
- ( ftime != floor(ftime_req) ) && # Issue #13606, PR #13613: compensate for Docker images rounding mtimes
- ( ftime != ceil(ftime_req) ) && # PR: #47433 Compensate for CirceCI's truncating of timestamps in its caching
- ( ftime != trunc(ftime_req, digits=6) ) && # Issue #20837, PR #20840: compensate for GlusterFS truncating mtimes to microseconds
- ( ftime != 1.0 ) && # PR #43090: provide compatibility with Nix mtime.
- !( 0 < (ftime_req - ftime) < 1e-6 ) # PR #45552: Compensate for Windows tar giving mtimes that may be incorrect by up to one microsecond
- if is_stale
- @debug "Rejecting stale cache file $cachefile (mtime $ftime_req) because file $f (mtime $ftime) has changed"
+ fsize = filesize(f)
+ if fsize != fsize_req
+ @debug "Rejecting stale cache file $cachefile (file size $fsize_req) because file $f (file size $fsize) has changed"
return true
end
+ hash = open(_crc32c, f, "r")
+ if hash != hash_req
+ @debug "Rejecting stale cache file $cachefile (hash $hash_req) because file $f (hash $hash) has changed"
end
end
diff --git a/base/sysimg.jl b/base/sysimg.jl
index 04de951dc0..285719d08b 100644
--- a/base/sysimg.jl
+++ b/base/sysimg.jl
@@ -90,8 +90,9 @@ let
print_time(stdlib, tt)
end
for dep in Base._require_dependencies
- dep[3] == 0.0 && continue
- push!(Base._included_files, dep[1:2])
+ mod, path, fsize = dep[1], dep[2], dep[3]
+ fsize == 0 && continue
+ push!(Base._included_files, (mod, path))
end
empty!(Base._require_dependencies)
Base._track_dependencies[] = false
diff --git a/src/staticdata.c b/src/staticdata.c
index c05422fd10..58656fc7d0 100644
--- a/src/staticdata.c
+++ b/src/staticdata.c
@@ -2678,7 +2678,7 @@ static void jl_write_header_for_incremental(ios_t *f, jl_array_t *worklist, jl_a
write_uint8(f, jl_cache_flags());
// write description of contents (name, uuid, buildid)
write_worklist_for_header(f, worklist);
- // Determine unique (module, abspath, mtime) dependencies for the files defining modules in the worklist
+ // Determine unique (module, abspath, hash, fsize) dependencies for the files defining modules in the worklist
// (see Base._require_dependencies). These get stored in `udeps` and written to the ji-file header.
// Also write Preferences.
// last word of the dependency list is the end of the data / start of the srctextpos
diff --git a/src/staticdata_utils.c b/src/staticdata_utils.c
index bf1a830b60..1ff59929bd 100644
--- a/src/staticdata_utils.c
+++ b/src/staticdata_utils.c
@@ -1,11 +1,6 @@
// inverse of backedges graph (caller=>callees hash)
jl_array_t *edges_map JL_GLOBALLY_ROOTED = NULL; // rooted for the duration of our uses of this
-static void write_float64(ios_t *s, double x) JL_NOTSAFEPOINT
-{
- write_uint64(s, *((uint64_t*)&x));
-}
-
// Decide if `t` must be new, because it points to something new.
// If it is new, the object (in particular, the super field) might not be entirely
// valid for the cache, so we want to finish transforming it before attempting
@@ -719,7 +714,8 @@ static int64_t write_dependency_list(ios_t *s, jl_array_t* worklist, jl_array_t
size_t slen = jl_string_len(dep);
write_int32(s, slen);
ios_write(s, jl_string_data(dep), slen);
- write_float64(s, jl_unbox_float64(jl_fieldref(deptuple, 2))); // mtime
+ write_uint64(s, jl_unbox_uint64(jl_fieldref(deptuple, 2))); // fsize
+ write_uint32(s, jl_unbox_uint32(jl_fieldref(deptuple, 3))); // hash
jl_module_t *depmod = (jl_module_t*)jl_fieldref(deptuple, 0); // evaluating module
jl_module_t *depmod_top = depmod;
while (depmod_top->parent != jl_main_module && depmod_top->parent != depmod_top)
diff --git a/test/precompile.jl b/test/precompile.jl
index 31ceb49475..942f6f5e5a 100644
--- a/test/precompile.jl
+++ b/test/precompile.jl
@@ -374,7 +374,7 @@ precompile_test_harness(false) do dir
@test string(Base.Docs.doc(Foo.Bar)) == "Bar module\n"
modules, (deps, requires), required_modules, _... = Base.parse_cache_header(cachefile)
- discard_module = mod_fl_mt -> (mod_fl_mt.filename, mod_fl_mt.mtime)
+ discard_module = mod_fl_mt -> mod_fl_mt.filename
@test modules == [ Base.PkgId(Foo) => Base.module_build_id(Foo) % UInt64 ]
@test map(x -> x.filename, deps) == [ Foo_file, joinpath(dir, "foo.jl"), joinpath(dir, "bar.jl") ]
@test requires == [ Base.PkgId(Foo) => Base.PkgId(string(FooBase_module)),
@@ -545,12 +545,12 @@ precompile_test_harness(false) do dir
fb_uuid = Base.module_build_id(FooBar)
sleep(2); touch(FooBar_file)
insert!(DEPOT_PATH, 1, dir2)
- @test Base.stale_cachefile(FooBar_file, joinpath(cachedir, "FooBar.ji")) === true
+ @test Base.stale_cachefile(FooBar_file, joinpath(cachedir, "FooBar.ji")) isa Tsc
@eval using FooBar1
@test !isfile(joinpath(cachedir2, "FooBar.ji"))
@test !isfile(joinpath(cachedir, "FooBar1.ji"))
@test isfile(joinpath(cachedir2, "FooBar1.ji"))
- @test Base.stale_cachefile(FooBar_file, joinpath(cachedir, "FooBar.ji")) === true
+ @test Base.stale_cachefile(FooBar_file, joinpath(cachedir, "FooBar.ji")) isa Tsc
@test Base.stale_cachefile(FooBar1_file, joinpath(cachedir2, "FooBar1.ji")) isa Tsc
@test fb_uuid == Base.module_build_id(FooBar)
fb_uuid1 = Base.module_build_id(FooBar1)
|
Should be fixed now on master. Close? |
Fixed by #49866 |
Reproduced on:
MWE
Here is an MWE for Julia 1.8.5. Run the following commands in Bash.
Click to expand:
Expected behavior
No precompilation occurs.
Observed behavior
Precompilation occurs for the first
import HTTP
, because themtime
has changed for one of the.jl
source files in the stdlib.Here are some examples of the observed log messages:
Motivation
I have found this to be a problem in CI environments where the Julia installation may be coming from one place (e.g. baked into a Docker image, or maybe available via the GitHub Actions toolcache), but the Julia depot
.julia
(including the precompilation cache at.julia/compiled
) is cached via a different mechanism.The text was updated successfully, but these errors were encountered: