"Rejecting cache file" on a heterogeneous cluster, leading to repeated precompilation #48579
Comments
Could probably benefit from better docs, but package images support multiversioning. So there are two strategies.
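For context, here is a minimal sketch of what that multiversioning looks like in practice, assuming the semicolon-separated `JULIA_CPU_TARGET` syntax from the Julia documentation; the specific targets below are only illustrative:

```sh
# Semicolon-separated target list: the first entry is the required baseline,
# and each later entry bakes an additional, more specialised clone of the
# compiled code into the same package image ("multiversioning").
export JULIA_CPU_TARGET="generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"
```

The variable has to be set before the package images are precompiled, since it controls which targets get compiled into them.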
I think …
Unfortunately, I don't know enough about setting CPU targets. Does the following option make sense? `JULIA_CPU_TARGET="generic;skylake-avx512,clone_all;znver2,clone_all"` Edit: unfortunately, this didn't seem to work.
Can you elaborate?
My bad, this seems to be resolved by setting the environment variable …
Only having …
As Kristoffer said, you are basically caching the oldest x86_64 code; by default Julia compiles for the native CPU. For multi-CPU environments I would recommend …
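Since the concrete recommendation here is cut off, the following is only a sketch of the general shape of such a setting; the two target names are placeholders for whatever the login and compute nodes actually are, not values confirmed in this thread:

```sh
# Set in a shell startup file or the Slurm job script so that *both* node
# types see the same value before any precompilation runs.  LOGIN_TARGET and
# COMPUTE_TARGET are placeholders for the actual node microarchitectures.
LOGIN_TARGET=skylake-avx512
COMPUTE_TARGET=znver2
export JULIA_CPU_TARGET="generic;${LOGIN_TARGET},clone_all;${COMPUTE_TARGET},clone_all"
```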
I am using a freshly downloaded nightly on a Slurm cluster and encounter repeated cache invalidation, which leads to repeated precompilation.
The login node and the compute node have different CPU microarchitectures.
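One quick way to see what each node identifies as is to ask Julia for its CPU name on both machines; the names in the comment below are only examples:

```sh
# Run once on the login node and once on the compute node; the reported
# names (e.g. "skylake-avx512" vs. "znver2") are what the JULIA_CPU_TARGET
# entries would have to cover.
julia -e 'println(Sys.CPU_NAME)'
```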
I start by deleting my `.julia` directory to avoid clashes: `rm -rf /scratch/user/.julia`
After this, on the login node, I generate a simple package with `FillArrays.jl` as the only dependency and instantiate it there. So far, so good: the package clearly doesn't precompile twice.
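For reference, the login-node setup amounts to roughly the following; the package name `TestPkg` and the scratch path are hypothetical stand-ins:

```sh
cd /scratch/user
julia -e 'using Pkg
          Pkg.generate("TestPkg")     # trivial package, name is hypothetical
          Pkg.activate("TestPkg")
          Pkg.add("FillArrays")       # the only dependency
          Pkg.precompile()'           # first round of precompilation
```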
Now, I drop to the compute node and find that the package precompiles again. If I then go back to the login node and try to precompile once more, the cache file is rejected again and the package goes through yet another round of precompilation.
Every time I switch between the login node and the compute node, the package requires a fresh round of precompilation, which can be quite time-consuming. I wonder whether it would be possible to save two sets of cache files such that one doesn't invalidate the other?
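In case a shared `JULIA_CPU_TARGET` alone doesn't cover this, one hypothetical way to get two independent sets of cache files is to give each microarchitecture its own depot; this is only a workaround sketch, and it costs disk space because packages are installed once per depot:

```sh
# Key the depot on the node's CPU name so the login and compute nodes never
# invalidate each other's compile cache.  $HOME is just an example location.
export JULIA_DEPOT_PATH="$HOME/.julia-$(julia -e 'print(Sys.CPU_NAME)')"
```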