-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spawn_test: prolong termination time to be more tolerant. #1483
Conversation
@tchaikov Hi, kefu, I think I've found the reason. My ArchLinux is hosted on a machine with kernel 5.4, which supports |
@balusch hi Jianyong, thanks for your excellent analysis. that explains the symptom in your environment! the test failures are not related to this change:
|
tests/unit/spawn_test.cc
Outdated
BOOST_CHECK_LE(ms, 10); | ||
// sleep should be terminated in 10ms. | ||
// pidfd_open(2) may fail and thus p.wait() falls back to | ||
// waitpid(2) with backoff(at least 20ms). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, please add a space before " (at".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tests/unit/spawn_test.cc
Outdated
// sleep should be terminated in 10ms. | ||
// pidfd_open(2) may fail and thus p.wait() falls back to | ||
// waitpid(2) with backoff(at least 20ms). | ||
// here we allow one backoff to be more tolerant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// here we allow one backoff to be more tolerant. | |
// the minimal backoff is added to 10ms, so the test can pass on | |
// older kernels as well. |
spawn_test still fails with this change: Is the stall_detector failure below just a fluke? |
the spawn_test is a known one and is not fixed by this change. i was looking at it, but no clues so far. probably i should drop it. the stall_detector one might be a flaky test. let's note it down just in case. |
0d2b708
to
307d02f
Compare
pidfd_open(2) can fail on old kernels due to several reasons (e.g. syscall not supported, O_NONBLOCK flags not supported), in which case process:wait() falls back to waitpid(2) with backoff (at least 20ms), which may make it longer than 10ms. Here we add the minimal backoff to 10ms so that the test can pass on old kernels. Signed-off-by: Jianyong Chen <baluschch@gmail.com>
307d02f
to
26b4d55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@scylladb/seastar-maint could you help merge this one as well? |
pidfd_open(2) can fail on old kernels due to several reasons (e.g.
syscall not supported, O_NONBLOCK flags not supported), in which
case process:wait() falls back to waitpid(2) with backoff (at least
20ms), which may make it longer than 10ms.
Here we add the minimal backoff to 10ms so that the test can pass
on old kernels.
Signed-off-by: Jianyong Chen baluschch@gmail.com