occasional error:FileNotFound with empty zig cache on MacOS #18763
Comments
Now that we have a proper bug report upstream, we can apply the workaround in the toolchain repository behind the scenes. Thanks @nadavwe! Zig upstream issue: ziglang/zig#18763
I'd be curious what the stderr output will be on version |
While it is not clear why artifacts are occasionally missing for which the manifest remains intact, regardless of how it happens, the cache should be at least somewhat resilient to kills, crashes, power loss, etc. affecting compilation processes anyway. Workaround ziglang#18763
I have written a workaround. It is not expected to prevent this error from occurring, since that would require actually knowing how it happens, but it should at least reduce the mitigation to simply "try again", with no cache wipe required. |
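To illustrate the idea of a cache that tolerates a missing artifact behind an intact manifest, here is a minimal Python sketch. It is not Zig's implementation: the function name `lookup`, the `h/<digest>.txt` manifest location, and the `o/<digest>/` artifact directory are assumptions made for the example only. The point is that a manifest hit whose artifact is gone is downgraded to a miss, so the next run rebuilds instead of failing.

```python
import os


def lookup(cache_root, digest, artifact_name):
    """Return the artifact path on a usable hit, or None to signal a miss.

    Hypothetical layout (assumed for this sketch only):
      <cache_root>/h/<digest>.txt           manifest
      <cache_root>/o/<digest>/<artifact>    build output
    """
    manifest = os.path.join(cache_root, "h", digest + ".txt")
    artifact = os.path.join(cache_root, "o", digest, artifact_name)

    if not os.path.exists(manifest):
        return None  # ordinary miss: nothing was ever cached

    if not os.path.exists(artifact):
        # Manifest is intact but the artifact is gone (crash, kill,
        # external cleanup). Drop the stale manifest and report a miss
        # so the caller rebuilds instead of raising FileNotFound.
        os.remove(manifest)
        return None

    return artifact
```

With this shape, the recovery step after an external deletion is simply running the build again, which matches the "try again" mitigation described above.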
It is very unlikely that there were multiple invocations of
I suggest I reword the error message to instruct the user to:
... and then we wait for an undetermined amount of time. How does that sound? |
On the assumption that this only happens when zig-cache is in |
That could be. We don't control when users clean their cache. |
Since there was movement on the upstream issue (ziglang/zig#18763), we can now gather more information. Not removing the cache directory, asking users to collaborate.
A detail question; is the error message
|
@mikdusan On the version of zig that the reproduction happened the output was just |
I'm speculating that an external event such as factory macos cleaning /tmp periodically is the culprit. To simulate this removal:
Thus we are getting an impossible hit on I'm certain it is not a goal to handle sporadic deletions of the zig-cache tree by external actors... but shouldn't compiler_rt properly re-institute itself in the cache given such a removal? |
I am planning to release
Could be. If you can help me with instructions on how to check it, I would pass it on to affected-colleagues-of-the-past to confirm or deny that this exists on their machine. |
I checked on macos 11 and macos 14, tmp cleanup is enabled by default. |
Zig doesn't implement GC on the cache yet, so I understand the desire to use /tmp, but the bottom line is that it's not supported to race file deletion from the zig cache while using the compiler. As it stands, the user must only delete the zig-cache directory when they have independently ensured that no compiler processes will run from start to finish of the deletion operation. Also they must fully delete the directory; running the compiler after a partial deletion is not supported. |
Cache purging by MacOS sounds like a more and more plausible explanation. Any tips on what would be a better place to store zig cache on MacOS? It needs to be an absolute path, because of bazel limitations. |
I understand the wrapper goal is to set both local/global cache to the same value.
edit: if it's desirable to have a sep tree, perhaps |
The correct place to put a non-user cache is |
here's a find:
but getting this value would require adding |
This path will need to appear in This option only accepts full paths, without variable expansion. Since this option violates hermeticity, Bazel treats it quite strictly (and I don't presume any PRs would be accepted to, say, allow environment variables there).
I don't need the cache to be cleared on every reboot.
I will confirm both today and, voilà, we may have a resolution to this.
Ideally it should be in |
Yeah that really doesn't leave much choice, as the only directories you can reasonably expect to be writable on linux are:
I should also note that there is configuration on macOS (at least the older versions) for excluding patterns from the tmp cleaning, but obviously fewer user configuration requirements would be better. |
MacOS has a cronjob that deletes files older than 3 days from /tmp. That's not good for zig cache: ziglang/zig#18763 Switch to /var/tmp, which does not seem to be randomly wiped at runtime. This breaking change means the next hermetic_cc_toolchain will be v3.0.0.
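To check whether a cache directory is exposed to this kind of age-based cleanup, one could list the files a cleaner with a 3-day threshold would consider old. The sketch below is only an approximation under stated assumptions: the helper name `stale_files` is hypothetical, and it uses modification time as a stand-in, while the real macOS cleanup job may use different criteria (for example access time).

```python
import os
import time


def stale_files(root, max_age_days=3, now=None):
    """List files under `root` whose mtime is older than `max_age_days`.

    Assumption: mtime approximates what a periodic /tmp cleaner looks at;
    the actual macOS job may use other timestamps or rules.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except FileNotFoundError:
                continue  # file vanished while we were walking the tree
            if mtime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Running it against the cache directory shows which artifacts could disappear while their manifests, if touched more recently, would survive.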
Thanks everyone and sorry for such an alarm. I now have high confidence that moving cache to This was truly a long-time and multi-person effort to find! |
No need to apologize. Maybe there is still room for improving the error message here: it could have saved time if it clearly indicated that the problem occurred due to files missing from the global cache directory. I want to be careful not to offer misleading hints, but any clarifications that could have led to the diagnosis more quickly would make sense to add to the error reporting. |
Note that after the recent linker changes, the status quo with lld and the self-hosted linker on master is:
|
My only suggestion would be:
-error: unexpected error: parsing input file failed with error FileNotFound
+error: unable to parse input file: FileNotFound
 note: while parsing ~/.cache/zig/o/7a704696d970e90cbff9db1a66fc9583/libc.a
but that has nothing to do with this issue. So I think there is nothing else to do. |
hermetic_cc_toolchain users on MacOS occasionally see error: FileNotFound when compiling a zig program for the first time:
If the user receives such an error, subsequent runs of zig build fail with the same error. The only known mitigation is to wipe the zig cache and try again; then it always succeeds.
A few comments:
Attaching an archive of /tmp/zig-cache from the machine that just got this error.
If there is anything more I can provide, let me know; I will do my best to instruct the users to do so.
Files and versions
0.12.0-dev.2127+fcc0c5ddc
zig-wrapper.zig is this.
cc @jacobly0, to whom this might be of interest; he worked on cache races before.