Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sudo has hangs on FreeBSD when use_pty is on #1013

Closed
squell opened this issue Mar 2, 2025 · 4 comments · Fixed by #1017
Closed

sudo has hangs on FreeBSD when use_pty is on #1013

squell opened this issue Mar 2, 2025 · 4 comments · Fixed by #1017
Labels
bug Something isn't working freebsd

Comments

@squell
Copy link
Member

squell commented Mar 2, 2025

On a FreeBSD box, run

for x in `seq 10`; do echo "$x "; target/debug/sudo true;done

(where you can replace true with another command, my favorite is sudo cat somefile | diff - somefile).

You'll get various hangs with sudo-rs if use_pty is enabled, which is the default.

It looks to me that if the child process finishes to quickly, the monitor process doesn't register this and forgets to reap it (turning the child into a zombie) and enters a tight busywait loop. I also notice in the dev logs that this problem seems related to setpgid returning an error code 3 ("Not found"). Although in cases where a process was executed succesfully setpgid returns error code 13 ("Permission denied"), which seems like it's not entirely right as well--but maybe that's a tangential issue.

I notice that on ogsudo, several bug reports also trickled in when @millert enabled use_pty by default as well, and this one comment seems to indicate an identical problem:

sudo-project/sudo#258 (comment)

In particular, if I change the busy-wait loop in our code from 1 microsecond to 10 microseconds (in exec/use_pty/monitor.rs) as suggested by this comment, the problem also seems to be gone. But @millert addressed this in a different way in0cb3e33444b8acb161935e4e00c337e56cf50448, as mentioned in the above issue thread.

And he had several other fixes around that time as well. I think we implemented our use_pty code pretty strictly based on sudo's use_pty code, and well before the date of september '23 (which was after our public release), so it makes sense that we need to do an update round.

I've pinged @pvdrz to also take a look at this (I hope he has time). This issue is a blocker for a release on FreeBSD, since I think we do need use_pty there as well.

@squell squell added bug Something isn't working freebsd labels Mar 2, 2025
@millert
Copy link
Contributor

millert commented Mar 3, 2025

While I haven't managed to reproduce a complete hang on FreeBSD, I do see sudo-rs busy-waiting an chewing up CPU, especially on a single-core VM. It is possible that is the same underlying problem. As you say, this looks similar to what was fixed in sudo by sudo-project/sudo@0cb3e33. Some kind of interlock is probably required. In ogsudo I used the existing error pipe for this (but changed it to a bi-directional socketpair). Another option is to have the child call sigsuspend() and use a signal from the parent to wake it up after the foreground process group has been setup.

@squell
Copy link
Member Author

squell commented Mar 5, 2025

Using a single-core VM/VPS seems to definitely be required for reproduction.

millert added a commit to millert/sudo-rs that referenced this issue Mar 5, 2025
On some systems, a short sleep does not always yield the CPU.
An explicit yield prevents us from consuming CPU while we wait
for the parent to grant the monitor the controlling terminal.
Fixes issue trifectatechfoundation#1013.
millert added a commit to millert/sudo-rs that referenced this issue Mar 5, 2025
On some systems, a short sleep does not always yield the CPU.
An explicit yield prevents us from consuming CPU while we wait
for the parent to grant the monitor the controlling terminal.
Fixes issue trifectatechfoundation#1013.
@millert
Copy link
Contributor

millert commented Mar 5, 2025

If you yield instead of sleeping the problem goes away. I suppose on FreeBSD a short sleep must not always cause the process to yield the CPU. In my testing #1017 fixes the problem.

squell pushed a commit that referenced this issue Mar 5, 2025
On some systems, a short sleep does not always yield the CPU.
An explicit yield prevents us from consuming CPU while we wait
for the parent to grant the monitor the controlling terminal.
Fixes issue #1013.
@squell
Copy link
Member Author

squell commented Mar 5, 2025

I would be interested to know if changing sleeping 1us to simply yielding to the scheduler has any other noticeable impact in a multiprocessor setup, but I'll run some tests tomorrow. In the worst case we could:

  • do both a sleep and a yield
  • do a yield on FreeBSD and a sleep on Linux

If anything bad is discovered or pvdrz still wants to do a more thorough update or our monitor code we can always re-open; but I like the minimality of the change in #1017.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working freebsd
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants