-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sudo has hangs on FreeBSD when use_pty
is on
#1013
Comments
While I haven't managed to reproduce a complete hang on FreeBSD, I do see sudo-rs busy-waiting an chewing up CPU, especially on a single-core VM. It is possible that is the same underlying problem. As you say, this looks similar to what was fixed in sudo by sudo-project/sudo@0cb3e33. Some kind of interlock is probably required. In ogsudo I used the existing error pipe for this (but changed it to a bi-directional socketpair). Another option is to have the child call sigsuspend() and use a signal from the parent to wake it up after the foreground process group has been setup. |
Using a single-core VM/VPS seems to definitely be required for reproduction. |
On some systems, a short sleep does not always yield the CPU. An explicit yield prevents us from consuming CPU while we wait for the parent to grant the monitor the controlling terminal. Fixes issue trifectatechfoundation#1013.
On some systems, a short sleep does not always yield the CPU. An explicit yield prevents us from consuming CPU while we wait for the parent to grant the monitor the controlling terminal. Fixes issue trifectatechfoundation#1013.
If you yield instead of sleeping the problem goes away. I suppose on FreeBSD a short sleep must not always cause the process to yield the CPU. In my testing #1017 fixes the problem. |
On some systems, a short sleep does not always yield the CPU. An explicit yield prevents us from consuming CPU while we wait for the parent to grant the monitor the controlling terminal. Fixes issue #1013.
I would be interested to know if changing sleeping 1us to simply yielding to the scheduler has any other noticeable impact in a multiprocessor setup, but I'll run some tests tomorrow. In the worst case we could:
If anything bad is discovered or pvdrz still wants to do a more thorough update or our monitor code we can always re-open; but I like the minimality of the change in #1017. |
On a FreeBSD box, run
(where you can replace
true
with another command, my favorite issudo cat somefile | diff - somefile
).You'll get various hangs with sudo-rs if
use_pty
is enabled, which is the default.It looks to me that if the child process finishes to quickly, the monitor process doesn't register this and forgets to reap it (turning the child into a zombie) and enters a tight busywait loop. I also notice in the dev logs that this problem seems related to
setpgid
returning an error code 3 ("Not found"). Although in cases where a process was executed succesfullysetpgid
returns error code 13 ("Permission denied"), which seems like it's not entirely right as well--but maybe that's a tangential issue.I notice that on ogsudo, several bug reports also trickled in when @millert enabled
use_pty
by default as well, and this one comment seems to indicate an identical problem:sudo-project/sudo#258 (comment)
In particular, if I change the busy-wait loop in our code from 1 microsecond to 10 microseconds (in
exec/use_pty/monitor.rs
) as suggested by this comment, the problem also seems to be gone. But @millert addressed this in a different way in0cb3e33444b8acb161935e4e00c337e56cf50448, as mentioned in the above issue thread.And he had several other fixes around that time as well. I think we implemented our
use_pty
code pretty strictly based on sudo's use_pty code, and well before the date of september '23 (which was after our public release), so it makes sense that we need to do an update round.I've pinged @pvdrz to also take a look at this (I hope he has time). This issue is a blocker for a release on FreeBSD, since I think we do need
use_pty
there as well.The text was updated successfully, but these errors were encountered: