Towards a deterministic julia build #34753

Open
nico202 opened this issue Feb 13, 2020 · 6 comments
Labels
building Build system, or building Julia or its dependencies

Comments

@nico202
Contributor

nico202 commented Feb 13, 2020

TLDR: I got a deterministic julia build

Hello everybody!

This is a followup from issue #25900.

Background: reproducible builds are important both for trusting that binary artifacts match a given source code and from a scientific point of view. Two distributions that pay special attention to reproducible builds are Guix System and NixOS.

Currently, julia is not reproducible on either system.

After many failed attempts, I successfully produced a deterministic result with relatively few patches. Those patches are available on my fork, based on the v1.4.0-rc1 release.
They cannot be applied as-is, mainly because I'm not sure every corner case will work and because the last few patches disable some precompilation (identifying why the current precompile_script breaks determinism is a work in progress). Still, I see this as a success, so I'll describe the patches so that we can discuss better solutions.

A few notes on the build environment: I'm building it with guix. Guix builds in a clean chroot environment with, among other things, an empty and isolated /tmp dir, the variable SOURCE_DATE_EPOCH set to 1, and ASLR disabled.

  • SOURCE_DATE_EPOCH: described here, with even more details here. The idea behind it is quite simple: allow the current time to be set from an environment variable shared between all tools, so that the output is predictable. Many tools and compilers already support it. My implementation in the following patches was just a quick hack to get it working; where and how to use it is open to discussion.
  • Address Space Layout Randomization: must be disabled (echo 0 | sudo tee /proc/sys/kernel/randomize_va_space)
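
As a rough illustration of the SOURCE_DATE_EPOCH convention described above, here is a minimal Python sketch. It is not the patch itself, which touches Julia and C code. The helper name `build_time` is hypothetical; the only thing taken from the convention is that the variable holds an integer number of seconds since the Unix epoch.

```python
import os
import time


def build_time(environ=os.environ):
    """Return the timestamp a build tool should embed in its output.

    If SOURCE_DATE_EPOCH is set, use it (integer seconds since the Unix
    epoch); otherwise fall back to the real clock. `build_time` is a
    hypothetical helper name, not part of Julia or Guix.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


# With the variable pinned (the Guix build sets it to 1), every call agrees.
pinned = {"SOURCE_DATE_EPOCH": "1"}
print(build_time(pinned))
```

Any code path that would otherwise embed the wall-clock time or a file's mtime could go through a helper of this shape, so that two builds with the same pinned value embed the same number.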

Description of the commits (file name links to the commit):

  1. base/loading.jl: Do not store mtime() in the precompiled file. Solved by reading the SOURCE_DATE_EPOCH env variable, if set, and using it instead of the file's mtime. Tom McLaughlin's solution is just to skip the check, but that's not enough (the files are still different).
  2. src/support/timefuncs.c: Again, support SOURCE_DATE_EPOCH.
  3. src/module.c: do not store hrtime() in precompiled modules. There's a backup counter "in case hrtime is not incrementing". Can hrtime just be dropped? Otherwise, if modules are compiled in different sessions (hence same mcounter -> same build_id), can a hash of the content, or something else deterministic, be used instead? Here I just identified this as a point that needs to be addressed, so hopefully somebody with more knowledge can propose a real solution.
  4. contrib/generate_precompile.jl: here, mktemp() and mktempdir() are called. The problem is that the current directory gets stored in the precompile cache (because of calls like `push!(DEPOT_PATH, prec_path)`). Maybe we can check for an env variable and decide what to do based on it (use a random name or a static name)?
  5. base/Base.jl: srand. I don't think I need to add anything :D Maybe initialize it with the current time() (so that SOURCE_DATE_EPOCH is used and a deterministic result is obtained)?
  6. Base.jl, sysimg.jl: time_ns() gets included somehow.
  7. src/codegen.cpp: I don't get how storing the time needed to load the file can be useful, but I'm probably missing something.
  8. contrib/generate_precompile.jl: the other three patches prepare a bare-minimum precompile cache. The cache is 101 MB, whereas the full precompile cache is 148 MB (but not deterministic yet). Only ~1000 precompile statements are used (compared to ~4000 in the shipped version), but I got to this point to prove that it can be done. Now more testing is needed to add more statements.
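
Item 3 asks whether something deterministic, such as a hash of the content, could replace hrtime() in the build id. The Python sketch below shows the general idea only. The function name `content_build_id`, the use of SHA-256, and the truncation to 64 bits are all illustrative assumptions, not what src/module.c does.

```python
import hashlib


def content_build_id(name, serialized_body):
    """Derive a 64-bit identifier from a module's name and serialized bytes.

    Hypothetical helper: the same inputs always give the same id, with no
    dependence on the clock or on a per-session counter.
    """
    digest = hashlib.sha256()
    digest.update(name.encode("utf-8"))
    digest.update(b"\0")  # separator so ("ab", b"c") differs from ("a", b"bc")
    digest.update(serialized_body)
    return int.from_bytes(digest.digest()[:8], "little")


# Two "sessions" compiling identical content get the same id.
print(content_build_id("Example", b"module body") ==
      content_build_id("Example", b"module body"))
```

One consequence of this design is that two modules with identical name and content share an id, whereas a time-based id tells them apart. Whether that is acceptable is exactly the open question raised in item 3.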

To recap:

  1. Current julia release does not build deterministically
  2. With my set of (drastic) patches, I get reproducible builds (also for external modules like Compat.jl and HTTP.jl)
  3. More work, and some collaboration with somebody who has a better understanding of julia internals, is needed
  4. I think the result will justify the effort
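
One simple way to check a claim like point 2 is to build twice and compare the outputs file by file. The following Python sketch assumes two build trees on disk; the function names `file_digest` and `differing_files` are hypothetical and are not part of any Guix or Julia tooling.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(build_a, build_b):
    """List relative paths whose presence or content differs between two trees."""
    root_a, root_b = Path(build_a), Path(build_b)
    files_a = {p.relative_to(root_a) for p in root_a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(root_b) for p in root_b.rglob("*") if p.is_file()}
    different = files_a ^ files_b  # files present in only one tree
    for rel in files_a & files_b:
        if file_digest(root_a / rel) != file_digest(root_b / rel):
            different.add(rel)
    return sorted(str(rel) for rel in different)
```

An empty result from `differing_files` means the two trees are bit-for-bit identical. A non-empty result names the files to inspect first.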

What do you think?

Thanks, Nicolò

@KristofferC
Member

> contrib/generate_precompile.jl: here, mktemp() and mktempdir() are called. The problem is that the current directory gets stored in the precompile cache (because of calls like push!(DEPOT_PATH, prec_path)). Maybe we can check for an env variable and decide what to do (use a random name/static name based on that)?

But they get emptied later? Isn't that enough? nico202@ae2929a#diff-5dfe4463e5341531d80109dacf106768R77-R78.

@nico202
Contributor Author

nico202 commented Feb 14, 2020

> > contrib/generate_precompile.jl: here, mktemp() and mktempdir() are called. The problem is that the current directory gets stored in the precompile cache (because of calls like push!(DEPOT_PATH, prec_path)). Maybe we can check for an env variable and decide what to do (use a random name/static name based on that)?
>
> But they get emptied later? Isn't that enough? nico202@ae2929a#diff-5dfe4463e5341531d80109dacf106768R77-R78.

Nope! The code precompiled with the using keyword (using __PackagePrecompilationStatementModule) includes the package path.

strings /gnu/store/pv5xqa12x09v0dfx2zg7nmnqbhhgrkn1-julia-1.4.0/lib/julia/sys.so | grep "/tmp/jl"

/tmp/jl_precompile
/tmp/jl_precompile_file
/tmp/jl_precompile/__PackagePrecompilationStatementModule/src/__PackagePrecompilationStatementModule.jl
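
The same kind of check can be scripted. The Python sketch below roughly mimics `strings FILE | grep PREFIX` on a byte string. The function name `embedded_paths` is hypothetical, and the sample bytes are synthetic, not real sys.so content.

```python
import re


def embedded_paths(data, prefix=b"/tmp/jl"):
    """Return printable ASCII strings in `data` that contain `prefix`.

    Runs of at least four printable characters are extracted (similar to
    the default behaviour of `strings`), then filtered by `prefix`.
    """
    runs = re.findall(rb"[\x20-\x7e]{4,}", data)
    return [run.decode("ascii") for run in runs if prefix in run]


# Synthetic bytes standing in for a system image.
blob = b"\x00\x01/tmp/jl_precompile\x00\x7fELF\x00/usr/lib/julia\x00"
print(embedded_paths(blob))
```

For a real image, the bytes would come from reading the file in binary mode. Any hit shows that a temporary path leaked into the build output.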

@thomasjm

This is awesome! Sorry to hear that my patch to base/loading.jl was not perfect...

It seems to me that a few of these patches are simple and non-controversial, while others will take more discussion. What about opening individual PRs for the easy ones first? I'd love to see forward progress on this, since it would pave the way for Julia packages in Nix.

@nico202
Contributor Author

nico202 commented Nov 15, 2020

Hi @thomasjm!
Sure, I'll try to resume the work in the coming days (rebase on master, check whether the build is still deterministic, and so on) and then I'll send one PR at a time.

@StefanKarpinski
Member

Let us know how it goes and if you need help with anything. Very sympathetic to making deterministic builds possible!

@nico202
Contributor Author

nico202 commented Nov 26, 2020

I've reached a point I can't get past by myself.
I was able to build a smaller sys.so image from the current master branch that is deterministic on one machine, but the result is different on another machine. After a lot of digging, my understanding is that module serialization is the serialization of a hash table (htable.inc) whose key is derived by taking the inthash (hashing.c) of the module's pointer (toplevel.c). The value of the pointer depends on tls (thread local storage), whose position is different on my two machines.
I guess the best solution would be to "sort the htable" (e.g. by module name) before serializing or, better, to use b->name->hash as the identifier.
Would it be possible @JeffBezanson? (I'm tagging you as it seems you did most of the work on module.c)
Thanks, Nicolò
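
The effect described in this comment can be shown with a toy model. The Python sketch below is a deliberate simplification: the bucket index is the address modulo the table size rather than Julia's inthash, and the addresses and function names are made up. It only illustrates why address-keyed ordering varies between machines while name-sorted ordering does not.

```python
def serialize_by_address(modules, table_size=8):
    """Emit module names in bucket order of a table keyed by simulated address.

    `modules` maps name -> simulated pointer value. The bucket index
    (address modulo table_size) stands in for a hash of the pointer.
    """
    ordered = sorted(modules.items(),
                     key=lambda item: (item[1] % table_size, item[1]))
    return [name for name, _addr in ordered]


def serialize_by_name(modules):
    """Emit module names sorted by name, ignoring addresses entirely."""
    return sorted(modules)


# The same three modules, loaded at different (made-up) addresses.
machine_a = {"Base": 0x1008, "Core": 0x1011, "Main": 0x1023}
machine_b = {"Base": 0x2017, "Core": 0x2002, "Main": 0x2024}

print(serialize_by_address(machine_a), serialize_by_address(machine_b))
print(serialize_by_name(machine_a), serialize_by_name(machine_b))
```

In this model the address-keyed order differs between the two machines, while sorting by name gives the same sequence on both, which is the property the proposed fix is after.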

@brenhinkeller brenhinkeller added the building Build system, or building Julia or its dependencies label Nov 20, 2022