forked from facebook/watchman
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix hangs in watchman oss tests on ubuntu by excluding liblzma-dev #2
Draft
ahornby
wants to merge
1
commit into
main
Choose a base branch
from
watchman_ubuntu_fixes
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ahornby
force-pushed
the
watchman_ubuntu_fixes
branch
from
November 25, 2024 00:18
2a4b97a
to
93083d3
Compare
Summary: watchman oss tests on ubuntu are [hanging for over 2 hours on github](https://github.com/facebook/watchman/actions/runs/11989130985/job/33425158244), and locally show errors. One clue was that tests worked locally on centos stream 9 and fedora 40, pointing to one or more of the ubuntu system packages being the trigger. Turns out that having the ubuntu xz aka liblzma-dev system packages linked is triggering the issue on both ubuntu 22.04 and 24.04. Disabled system packages for xz and for glog and libunwind that also bring in the xz system packages on ubuntu. Test Plan: Local run: ``` /build/fbcode_builder/getdeps.py --allow-system-packages test watchman --src-dir=. --filter watchman.integration.test_bulkstat.TestBulkStatCliJson --retry 0 ``` Before, fails ``` Test project /home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman Start 110: test_py::watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_off Start 111: test_py::watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on 1/2 Test facebook#111: test_py::watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on ....***Failed 60.82 sec test_bulkstat_on (watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on) ... 2024-11-24T18:33:28,230: [cli] failed to identify PDU: fill_buffer: Connection reset by peer ERROR Watchman logs: CLI logs Server logs 2024-11-24T18:33:28,401: [listener] Watchman 0.0.0 <no build info set> starting up on ubuntu-24.04.local by command /usr/bin/python3 /home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on 2024-11-24T18:33:28,403: [sanitychecks] starting sanityCheckThread 2024-11-24T18:33:28,439: [unix-listener] accepted client:stm=0x7fd388000ce0 2024-11-24T18:33:28,439: [client=1:stm=140546496531680:pid=539033] dispatch_command: version 2024-11-24T18:33:28,439: [client=1:stm=140546496531680:pid=539033] dispatch_command: version (completed) 2024-11-24T18:33:28,439: [client=1:stm=140546496531680:pid=539033] dispatch_command: version 2024-11-24T18:33:28,439: [client=1:stm=140546496531680:pid=539033] dispatch_command: version (completed) 2024-11-24T18:33:28,440: [client=1:stm=140546496531680:pid=539033] dispatch_command: get-pid 2024-11-24T18:33:28,440: [client=1:stm=140546496531680:pid=539033] dispatch_command: get-pid (completed) 2024-11-24T18:33:28,440: [NOT_CONN:client=1:stm=140546496531680:pid=539033] client_delete 1 ====================================================================== ERROR: test_bulkstat_on (watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/test_bulkstat.py", line 26, in test_bulkstat_on self.assertFileList(root, ["foo", "bar"]) File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 389, in assertFileList st, res = self.waitFor( ^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 301, in waitFor return self._waitForCheck(cond, lambda res: res, timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 289, in _waitForCheck res = cond() ^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 391, in <lambda> self.getFileList(root, cursor=cursor, relativeRoot=relativeRoot), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 339, in getFileList res = self.watchmanCommand("query", root, expr) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/watchman/integration/lib/WatchmanTestCase.py", line 271, in watchmanCommand return client.query(*args) ^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/pywatchman/__init__.py", line 1143, in query res = self.receive() ^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/pywatchman/__init__.py", line 1051, in receive result = self.recvConn.receive() ^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/pywatchman/__init__.py", line 804, in receive line = self.transport.readLine() ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/pywatchman/__init__.py", line 244, in readLine b = self.readBytes(4096) ^^^^^^^^^^^^^^^^^^^^ File "/home/alex/local/tmp/ubuntu-24.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/build/watchman/watchman/integration/test_py/pywatchman/__init__.py", line 657, in readBytes raise WatchmanError("EOF on CLI process transport") pywatchman.WatchmanError: EOF on CLI process transport, while executing ('query', '/home/alex/local/tmp/ubuntu-24.04/watchmantestu1xdmt7d/watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on.cli.json.p/tmpmlimtbp5', {'expression': ['exists'], 'fields': ['name']}) ---------------------------------------------------------------------- Ran 1 test in 60.590s ``` After, works ``` 1/2 Test facebook#111: test_py::watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_on .... Passed 0.52 sec 2/2 Test facebook#110: test_py::watchman.integration.test_bulkstat.TestBulkStatCliJson.test_bulkstat_off ... Passed 0.52 sec 100% tests passed, 0 tests failed out of 2 ``` After, all tests work locally on ubuntu 22.04 (although 2 intermittently fail): ``` 441/441 Test facebook#337: test_py::watchman.integration.test_sock_perms.TestSockPerms.test_invalid_sock_access ......................................... Passed 120.20 sec 100% tests passed, 0 tests failed out of 441 Total Test time (real) = 123.80 sec ``` Example remaining intermittent fails: ``` 24-11-24T20:13:54,749: [client=4:stm=140224373984480:pid=718930] dispatch_command: query (completed) 2024-11-24T20:13:54,750: [NOT_CONN:client=4:stm=140224373984480:pid=718930] client_delete 4 tcache_thread_shutdown(): unaligned tcache chunk detected *** Aborted at 1732479234 (Unix time, try 'date -d @1732479234') *** *** Signal 6 (SIGABRT) (0x3e8000af610) received by PID 718352 (pthread TID 0x7f88db7fe640) (linux TID 718939) (maybe from PID 718352, UID 1000) (code: -6), stack trace: *** @ 000000000082b9d2 _ZN5folly10symbolizer12_GLOBAL__N_113signalHandlerEiP9siginfo_tPv /home/alex/local/tmp/ubuntu-22.04/fbcode_builder_getdeps-ZhomeZalexZlocalZwatchmanZbuildZfbcode_builder/repos/github.com-facebook-folly.git/folly/debugging/symbolizer/SignalHandler.cpp:453 @ 000000000004251f (unknown) @ 00000000000969fc pthread_kill @ 0000000000042475 raise @ 00000000000287f2 abort @ 0000000000089675 (unknown) @ 00000000000a0cfb (unknown) @ 00000000000a56c3 (unknown) @ 000000000009494e (unknown) @ 000000000012684f (unknown) ERROR ... 99% tests passed, 2 tests failed out of 441 Total Test time (real) = 120.17 sec The following tests FAILED: 140 - test_py::watchman.integration.test_content_hash.TestContentHashCliJson.test_cacheLimit (Failed) 141 - test_py::watchman.integration.test_content_hash.TestContentHashCliJson.test_contentHash (Failed) Errors while running CTest Command '['/usr/bin/ctest', '--output-on-failure', '-j', '31', '--rerun-failed']' returned non-zero exit status 8. !! Failed ```
ahornby
force-pushed
the
watchman_ubuntu_fixes
branch
from
November 25, 2024 00:19
93083d3
to
bea5d05
Compare
ahornby
changed the title
fix watchman oss tests on ubuntu
fix hangs in watchman oss tests on ubuntu by excluding liblzma-dev
Nov 25, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary:
watchman oss tests on ubuntu are hanging for over 2 hours on github, and locally show errors.
One clue was that tests worked locally on centos stream 9 and fedora 40, pointing to one or more of the ubuntu system packages being the trigger.
Turns out that having the ubuntu xz aka liblzma-dev system packages linked is triggering the issue on both ubuntu 22.04 and 24.04. Disabled system packages for xz and for glog and libunwind that also bring in the xz system packages on ubuntu.
Test Plan:
Local run:
Before, fails
After, works
After, all tests work locally on ubuntu 22.04 (although 2 intermittently fail):
Example remaining intermittent fails: