[Bug]: copy_directory copies .DS_Store leading to remote cache misses #887
Comments
Interesting one. I could see adding some blanket ignores to the copy_directory rule to avoid copying files such as
In this case I am patching |
Interesting find. I have noticed more cache misses on macOS than expected, but have not yet drilled into why myself. "Ideally I think copy_directory() should probably ignore .DS_Store files automatically since it’s very unlikely they’re contributing to the build." I do agree with this.
@gregmagolan any more thoughts or updates on this? I keep seeing cache misses on macOS even when the source has not changed. I haven't checked what is generating those misses, but it could plausibly be .DS_Store.
Yes, this looks like something we can fix. The principled fix is to stop using copy_directory for the lifecycle hook actions and instead untar the tarball, so that only the tarball is the input. In the meantime, we could add a feature to the copy_directory rule to filter out files. We control the Go implementation, so it would be relatively easy to add that feature to bazel-lib and then consume it in rules_js. I have no time for this right now as I'm in scramble mode on multiple fronts, but I would be happy to review PRs with the changes. FYSA @jbedard
It seems better to address this at the I suppose perhaps |
The problematic case with the npm packages is unique as it takes a source directory as an input. This means that everything in the directory which isn't managed by Bazel ends up being copied into the output TreeArtifact, including any Source directory inputs are pretty rare from what I've seen in the wild. We used them here in rules_js since enumerating each file of an extracted npm package as an individual file input won't work if file names contain spaces or other special characters that Bazel doesn't like. Other uses of copy_directory that take a TreeArtifact instead of a source directory would not suffer from this issue. It's just source directory inputs that have the problem here, and I don't think copy_directory with source directory inputs is a common case outside of rules_js npm packages. Either way, we should solve this in both spots, since copy_directory should do the right thing with source directory inputs and extracting a tar instead of copying a directory is more efficient for lifecycle hooks.
Thanks for elaborating. I see what you mean and agree with your points.
What happened?
.DS_Store is a file sometimes added to a directory on macOS by the operating system (it has to do with file system indexing, I think). Sometimes these .DS_Store files sneak into npm package repositories (generated by rules_js) and are then copied with this action: https://github.com/aspect-build/rules_js/blob/d0ff155c73e3c7fee5d72485e00775bca1fde10a/npm/private/npm_package_store.bzl#L222-L232
I’m doing some debugging of remote cache misses on macOS, and when I look at the execution log, I’m seeing .DS_Store appearing in the diff. Example from one of the execution log diffs I generated following “Debugging Remote Cache Hits for Remote Execution”:
I know this is copy_directory related since when I go to the log file this is from and look for the command I see this snippet:
How could I configure copy_directory() to ignore .DS_Store files if they’re present, to prevent remote cache misses? Ideally I think copy_directory() should probably ignore .DS_Store files automatically since it’s very unlikely they’re contributing to the build.
I haven’t proven this is the incompatibility between the two macOS machines that is causing the remote cache miss, but figured it may be worth addressing whether or not it is the cause.
Version
Development (host) and target OS/architectures: macOS arm64
Output of bazel --version:
Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file:
aspect_bazel_lib v1.42.3
rules_nodejs v5.8.4
aspect_rules_js v1.42.3
Language(s) and/or frameworks involved: Node.js, JavaScript, pnpm