flaky test: bzlmod integration test on MacOS #1261
Comments
It looks like all the files that are reported missing have the format:
That is, the missing file is always in pip's pycache directory and always has some integer as the suffix. I can't tell what the integer is, though. Perhaps a hash? Some sample numbers are 4313755696, 4422117616, 4409426928, 4422117616, 4571366992, 4325443600, 4494442544. They aren't timestamps: read as seconds since the epoch, they would fall in the year 2112-ish. This also only happens on Macs.
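A quick way to check the "not timestamps" observation is to decode the sample suffixes as seconds since the Unix epoch. The sketch below does that and also tests whether every value is a multiple of 16. The alignment check is an added assumption, not something from the thread: 16-byte alignment would be consistent with the suffix being a memory address (for example the result of `id()`), but it does not prove it.

```python
from datetime import datetime, timedelta, timezone

# Sample suffixes quoted in the comment above.
SUFFIXES = [4313755696, 4422117616, 4409426928, 4422117616,
            4571366992, 4325443600, 4494442544]

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_year(seconds):
    """Year the value would denote if it were seconds since the Unix epoch."""
    return (EPOCH + timedelta(seconds=seconds)).year


# Years the suffixes would map to if they were timestamps.
years = [as_year(n) for n in SUFFIXES]

# True if every suffix is a multiple of 16 (typical of allocator alignment).
all_aligned = all(n % 16 == 0 for n in SUFFIXES)

if __name__ == "__main__":
    print(sorted(set(years)))
    print(all_aligned)
```

If the years all land roughly a century in the future, a timestamp interpretation is implausible, which matches the reasoning in the comment.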
All the failures are for "Middleman _middlemen/requirements.update-runfiles", too. This is the generated So, maybe this isn't flaky hosts after all. That there is a pycache directory strikes me as odd; maybe some Python is running, adding files, and then a glob is picking them up? And maybe it's happening during a build, or only turns into an error due to timing or ordering?
New theory: the root bug (the temp pyc.N files being inputs) has been present for a while, but it wasn't until Bazel CI enabled remote caching back in May (thus causing the machine-specific pyc.N files to be listed as inputs that other machines see) that it triggered more regularly. I wouldn't be surprised if there was a race bug, too: something creates pyc.N files, meanwhile something else globs the directory, then the pyc.N creator cleans up, and the globber has one of its inputs go missing.
I think the fix is to modify the files that are excluded from the whl_library directories. It already excludes I think these spots: And probably also this one, for good measure: https://github.com/bazelbuild/rules_python/blob/main/python/repositories.bzl#L221
The pyc.N files are somewhat expected. They're temp files created by Python during pyc creation and use
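The suspected race can be reproduced deterministically in a small sketch. This is not Bazel or rules_python code; it only mimics the ordering described in the comment, and the file names (`mod.cpython-311.pyc` and its numeric suffix) are hypothetical.

```python
import glob
import os
import tempfile


def simulate_glob_race():
    """Return names a glob listed that no longer exist once the writer finishes."""
    with tempfile.TemporaryDirectory() as root:
        cache = os.path.join(root, "__pycache__")
        os.makedirs(cache)
        final = os.path.join(cache, "mod.cpython-311.pyc")
        temp = final + ".4313755696"  # hypothetical pyc.N-style temp name

        # Step 1: the pyc writer creates its temporary file.
        with open(temp, "wb") as f:
            f.write(b"partial pyc bytes")

        # Step 2: something else globs the directory and records its inputs.
        inputs = glob.glob(os.path.join(cache, "*"))

        # Step 3: the writer finishes and renames the temp file into place.
        os.replace(temp, final)

        # Step 4: the recorded inputs are checked; the temp file is gone.
        return [os.path.basename(p) for p in inputs if not os.path.exists(p)]


if __name__ == "__main__":
    print(simulate_glob_race())
```

In a real build the steps run in different threads or processes, so the window between steps 2 and 3 is only hit occasionally, which would explain why the test looks flaky rather than failing every time.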
We ignore pyc files most everywhere (because they aren't deterministic), but part of the pyc creation process involves creating temporary files named `*.pyc.NNN`. Though these are supposed to be temporary files nobody sees, they seem to get picked up by a glob somewhere, somehow. I'm unable to figure out how that is happening, but ignoring them in the glob expressions should also suffice. Fixes bazelbuild#1261
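The effect of excluding the temporary files can be illustrated with plain pattern matching. The sketch below uses Python's `fnmatchcase` as a stand-in for a glob `exclude` list; the patterns and paths are assumptions for illustration, and Bazel's `glob()` has its own matching semantics that this does not reproduce.

```python
from fnmatch import fnmatchcase

# Assumed exclude patterns: regular pyc files plus the *.pyc.NNN temp files.
EXCLUDES = ["*.pyc", "*.pyc.*"]


def filter_inputs(paths, excludes=EXCLUDES):
    """Keep only paths that match none of the exclude patterns."""
    return [p for p in paths
            if not any(fnmatchcase(p, pat) for pat in excludes)]


# Hypothetical file listing, including one temp file with a numeric suffix.
paths = [
    "site-packages/pip/__init__.py",
    "site-packages/pip/__pycache__/__init__.cpython-311.pyc",
    "site-packages/pip/__pycache__/__init__.cpython-311.pyc.4422117616",
]
kept = filter_inputs(paths)

if __name__ == "__main__":
    print(kept)
```

With the extra pattern in place, a temp file that appears and disappears mid-build never becomes a declared input, so its later absence cannot raise a missing-file error.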
Thanks for detailing the investigation here.
… 'master' Update rules_python to the latest version. The version contains a fix for bazelbuild/rules_python#1261, which we also observe from time to time: https://dash.sf1-idx1.dfinity.network/invocation/8093f9cb-633f-4c6f-a9d4-a8f899dd47bc See merge request dfinity-lab/public/ic!14948
Part of the pyc compilation process is to create a temporary file named `<name>.pyc.NNNN`, where `NNNN` is an integer suffix (not a timestamp; CPython appears to derive it from `id()` of the path string). Once the pyc is entirely written, this file is renamed to the regular pyc file name. These files only exist for brief periods of time, but it's possible for different threads/processes to see the temporary files when computing the glob() values. Later, since the file is gone, an error is raised about the file missing. PR bazelbuild#1266 mostly fixed this issue, except that the glob exclude for an interpreter runtime's files was behind the `ignore_root_user_error` flag, which meant it wasn't always applied. This changes it to always be applied, which should eliminate the failures. Fixes bazelbuild#1261 Work towards bazelbuild#1520
…1541) Part of the pyc compilation process is to create a temporary file named `<name>.pyc.NNNN`, where `NNNN` is an integer suffix (not a timestamp; CPython appears to derive it from `id()` of the path string). Once the pyc is entirely written, this file is renamed to the regular pyc file name. These files only exist for brief periods of time, but it's possible for different threads/processes to see the temporary files when computing the glob() values. Later, since the file is gone, an error is raised about the file missing. PR #1266 mostly fixed this issue, except that the exclude for the `.pyc.NNNN` files for an interpreter runtime's files was behind the `ignore_root_user_error` flag, which meant it wasn't always applied. This changes it to always be applied, which should eliminate the failures due to the missing NNNN files. Fixes #1261 Work towards #1520
This is a tracking bug to collect info on the "bzlmod integration test on MacOS" and hopefully help figure out why it's failing. The tests usually pass after a couple of retries.
The error is typically something about the "Middleman" missing pyc files.
Example error:
Current theories:
Issue with the build machines (wouldn't be the first time)
hosts with failures:
hosts with successes:
Failing builds
https://buildkite.com/bazel/rules-python-python/builds/5036#01889b48-56c1-448c-bbac-cf37dad80b84
https://buildkite.com/bazel/rules-python-python/builds/5037#01889b60-d830-4827-ae25-a9c37e7c8ce8
https://buildkite.com/bazel/rules-python-python/builds/5029#
https://buildkite.com/bazel/rules-python-python/builds/5047#01889c11-eae2-4402-9ba5-81620511ea18
https://buildkite.com/bazel/rules-python-python/builds/5050#01889c3a-1c95-4f53-a538-0c7434773a5b
https://buildkite.com/bazel/rules-python-python/builds/5047#01889c11-eae2-4402-9ba5-81620511ea18
https://buildkite.com/bazel/rules-python-python/builds/5066#0188a254-986a-4bdc-a972-220424b5349a
https://buildkite.com/bazel/rules-python-python/builds/5067#0188a258-ed2a-455a-b5fe-3148f93d1231
https://buildkite.com/bazel/rules-python-python/builds/5084#0188b31f-dbab-4e48-9726-fe872304785c
Successful builds
https://buildkite.com/bazel/rules-python-python/builds/5039#01889b82-d0cc-4133-8712-cc11392043d0
https://buildkite.com/bazel/rules-python-python/builds/5029#01889bb5-4ed4-42c9-9816-4be86d9da129
https://buildkite.com/bazel/rules-python-python/builds/5067#_
https://buildkite.com/bazel/rules-python-python/builds/5084#0188b327-c85e-4f27-ac14-db9697815997