Fix race in libzfs_run_process_impl #16801
I see the waitpid() man page example code (https://linux.die.net/man/2/waitpid) is a little different from the way we do things in
diff --git a/lib/libzfs/libzfs_util.c b/lib/libzfs/libzfs_util.c
index 1f7e7b0e6..951feb1a0 100644
--- a/lib/libzfs/libzfs_util.c
+++ b/lib/libzfs/libzfs_util.c
@@ -963,12 +963,14 @@ libzfs_run_process_impl(const char *path, char *argv[], char *env[], int flags,
} else if (pid > 0) {
/* Parent process */
int status;
-
- while ((error = waitpid(pid, &status, 0)) == -1 &&
- errno == EINTR)
- ;
- if (error < 0 || !WIFEXITED(status))
- return (-1);
+ do {
+ error = waitpid(pid, &status, WUNTRACED | WCONTINUED);
+ if (error == -1)
+ return (-1);
+ if (WIFEXITED(status) || WIFSIGNALED(status) ||
+ WIFSTOPPED(status) || WIFCONTINUED(status))
+ return (-1);
+ } while (!WIFEXITED(status) && !WIFSIGNALED(status));
if (lines != NULL) {
close(link[1]); |
@tonyhutter I don't think it would improve the issue at hand.
This code would return an error if the child exited before the parent had a chance to check it, the same as the current code. While this kind of check is correct for many cases (i.e., when a child exiting so fast is not expected), for this specific operation (replacing a disk with an empty prepare script) it is not. This is how I understand it, at least.
@shodanshok I think you misunderstand how it works. There should be no race. Please see the "Notes" section of the man page. Besides I am not sure it is correct to check |
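To illustrate the "no race" point, here is a minimal sketch (not code from the patch). It assumes a hypothetical helper, `wait_for_fast_child`, whose child exits well before the parent calls waitpid(). Since an exited child stays around as a zombie until it is reaped, a wait on its specific pid should still return its status, as long as nothing else in the process reaps it first.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libzfs): fork a child that exits
 * immediately with exit_code, let it finish, and only then wait for it.
 * Returns the child's exit status, or -1 on failure.
 */
int
wait_for_fast_child(int exit_code)
{
    struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int status;
    pid_t pid, rc;

    pid = fork();
    if (pid < 0)
        return (-1);
    if (pid == 0)
        _exit(exit_code); /* Child: exit right away. */

    /* Parent: give the child time to exit; it stays a zombie until reaped. */
    (void) nanosleep(&delay, NULL);

    /* Retry only when the wait is interrupted by a signal. */
    while ((rc = waitpid(pid, &status, 0)) == -1 && errno == EINTR)
        ;
    if (rc == -1 || !WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```

Calling `wait_for_fast_child(3)` should return 3 even though the child is long gone by the time the parent waits. That suggests a failure seen in practice has to come from something else collecting the child first.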
@amotin I see what you mean, and I think you are right. Upon further inspection, I suspect the issue is related to the double-wait done via If I am not mistaken, when replacing a disk via
Something seems to go wrong between these two forks/waits. I added some debug
Notice how:
|
Force-pushed from 72cea01 to 80f09a7.
Indeed, the I have updated the patch with a possible solution. Thanks. |
My memories in the area are a bit rusty, but after reading some man pages it seems to make sense.
When replacing a disk, a child process is forked to run a script called zfs_prepare_disk (which can be useful for disk firmware update or health check). The parent then calls waitpid and checks the child error/status code. However, the _reap_children thread (created from zed_exec_process to manage zedlets) also waits for all children with the same PGID and can steal the signal, causing the replace operation to be aborted. As waitpid returns -1, the parent incorrectly assumes that the child process had an error or was killed. This, in turn, leaves the newly added disk in REMOVED or UNAVAIL status rather than completing the replace process. This patch changes the PGID of the child process executing the prepare script, shielding it from the _reap_children thread.
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
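The shielding idea can be sketched as follows. This is an illustration under assumptions, not the patch itself. It assumes the child changes its group with `setpgid(0, 0)`, and it models the reaper as a plain `waitpid(0, ...)` call, which only matches children in the caller's process group. The helper name `run_shielded_child` and the readiness pipe are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that moves into its own process group
 * and exits with exit_code. A stand-in "reaper" call, waitpid(0, ...),
 * then tries to collect any child in the caller's process group.
 * Returns the exit status seen by waitpid(pid, ...), or -1 on failure.
 * *reaper_ret and *reaper_errno record what the group-wide wait saw.
 */
int
run_shielded_child(int exit_code, pid_t *reaper_ret, int *reaper_errno)
{
    int ready[2];
    int status;
    char byte;
    pid_t pid, rc;

    if (pipe(ready) == -1)
        return (-1);

    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return (-1);
    }
    if (pid == 0) {
        /* Child: leave the parent's process group, then report readiness. */
        close(ready[0]);
        (void) setpgid(0, 0);
        if (write(ready[1], "x", 1) != 1)
            _exit(127);
        close(ready[1]);
        _exit(exit_code);
    }

    /* Parent: block until the child has changed its process group. */
    close(ready[1]);
    while (read(ready[0], &byte, 1) == -1 && errno == EINTR)
        ;
    close(ready[0]);

    /* Stand-in reaper: matches only children in the caller's group. */
    errno = 0;
    *reaper_ret = waitpid(0, &status, WNOHANG);
    *reaper_errno = errno;

    /* The targeted wait still collects the child's real status. */
    while ((rc = waitpid(pid, &status, 0)) == -1 && errno == EINTR)
        ;
    if (rc != pid || !WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```

In a process with no other children, the group-wide wait would be expected to fail with ECHILD, while the wait on the specific pid still returns the child's exit code. Without the `setpgid(0, 0)` call, the group-wide wait could collect the child first and the targeted wait would then fail.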
Rebased. EDIT: I missed that rebasing would remove the accepted label, sorry. |
When replacing a disk, a child process is forked to run a script called zfs_prepare_disk (which can be useful for disk firmware update or health check). The parent then calls waitpid and checks the child error/status code. However, the _reap_children thread (created from zed_exec_process to manage zedlets) also waits for all children with the same PGID and can steal the signal, causing the replace operation to be aborted. As waitpid returns -1, the parent incorrectly assumes that the child process had an error or was killed. This, in turn, leaves the newly added disk in REMOVED or UNAVAIL status rather than completing the replace process. This patch changes the PGID of the child process executing the prepare script, shielding it from the _reap_children thread.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes openzfs#16801
When replacing a disk, a child process is forked to run a script called zfs_prepare_disk (which can be useful for disk firmware update or health check). The parent then calls waitpid and checks the child error/status code.
However, the ZED _reap_children thread (created from zed_exec_process to manage zedlets) also waits for all children with the same PGID and can steal the signal, causing the replace operation to be aborted.
As waitpid returns -1, the parent incorrectly assumes that the child process had an error or was killed. This, in turn, leaves the newly added disk in REMOVED or UNAVAIL status rather than completing the replace process.
This patch changes the PGID of the child process executing the prepare script, shielding it from the _reap_children thread.