Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ongoing: Handle divergences between MacOS vs Ubuntu when writing/appending to redirected file descriptors #283

Open
balupton opened this issue Feb 11, 2025 · 5 comments
Assignees
Labels
ongoing Ongoing efforts of incremental improvements

Comments

@balupton
Copy link
Member

balupton commented Feb 11, 2025

for details see:
https://gist.github.com/balupton/cd779f3a39507f75d5956a67e5543ab8

for the failed Dorothy test:
https://github.com/bevry/dorothy/actions/runs/13254558267/job/36998982191#step:2:502

dorothy-config calls echo-lines-before with >"$temp_filepath" as it is a new file, and echo-lines-after with >>"$temp_filepath"
https://github.com/bevry/dorothy/blob/dev/devilbird/commands/dorothy-config#L234
https://github.com/bevry/dorothy/blob/dev/devilbird/commands/dorothy-config#L265
however, because of differences in ubuntu vs macos, on ubuntu this results in nothing being written with the echo-lines-before, and eventually results in file starting with a )

the cause of this is that eval-lines-before and eval-lines-after output each line individually:

__print_lines "$line"

this is handled by stdinargs.bash, which calls bash.bash:eval_capture
which eval_capture --statusvar=status -- "$@" is doing an errexit compatible form of "$@" >/dev/stdout || status=$?

eval_capture --statusvar=status -- "$@"

the >/dev/stdout is the critical piece, and is leftover as a convenience, as --stdoutvar=... --outputvar=... and --stdoutpipe=... --outputpipe=... all allow easy abilities to store in a variable and/or pipe/redirect the stdout/stderr/both content to different locations
local item cmd=() exit_status_local exit_status_variable='exit_status_local' stdout_variable='' stderr_variable='' output_variable='' stdout_pipe='/dev/stdout' stderr_pipe='/dev/stderr'

eval_capture_wrapper >"$stdout_pipe" 2>"$stderr_pipe"

for what can be done:

buffering

Instead of streaming output, that is to say outputting each line as we have them, which causes each line to go through the redirect flow, which is an expensive operation, as each line in this example would have to go through each tee, whereas with buffering, the flow needs to go through each only once.
https://gist.github.com/balupton/9ceaf968d46378e4bed714a3df128676#file-04-multi-experiments-L12-L20

It allows an easy way to opt-out of this new default with say an introduction of a printf '%s\n' one two three | echo-lines --no-buffer.

It reduces the need for echo-wait everywhere, and as such reduces fragility, surface area, and possible sigpipe failures, where the pipe reader has decided it is done on an earlier than all output, causing further writes to fail to write as there is no reader.

put echo-wait everywhere

This keeps the default behaviour as one that is fragile and divergent between systems, and needs to be explicitly opted-in, which means problems only get fixed retroactively.

use numbered file descriptors instead of /dev/stdout, /dev/stderr

This is good, however, on bash version 4.1 it requires fds between 3-9 of which conflicts could arise, or some way to detect availability. Bash v4.1 provides an easy solution for this:
https://gist.github.com/balupton/cd779f3a39507f75d5956a67e5543ab8#file-03-the-core-issue-L298-L311

for general code, this would mean go through everywhere and replace say >/dev/stdout with >&1 and >/dev/stderr with >&2. Using shopt -o noclobber can enforce this, to prevent mistakes.

for eval_capture the simplest solution for convergence, would be to drop the *pipe=<target> handling, and have them always write to >&1 and >&2, and let the caller sort out the redirections, including those to /dev/null.

If however eval_capture was to add convergence to its continued support for *pipe=<target> handling, it could so like so:

# this fixes where it goes, but say if /dev/file is provided, then it doesn't fix whether it appends or overwrites
#   which would then require say a [--append-stdout] flag or something to handle
#   and as ubuntu seems to discard appending-file-descriptors [exec 3>>"$file"] this is a problem
# as such it still seems that ultimately, eval_capture should only write to [>&1] and [>&2]
#   and leave any further redirection to the caller

if [[ -z $stdout_target || $stdout_target === '/dev/stdout' ]]; then
	stdout_target=1
elif [[ $stdout_target === '/dev/stderr' ]]; then
	stdout_target=2
else
	# a file path, or say /dev/tty
	exec 3>"$stdout_target"
	stdout_target=3
fi
eval "$command" >&${stdout_target}
exec >&{stdout_target}-


if [[ -z $stdout_target ||  $stdout_target === '/dev/stdout' ]]; then
	exec 3>1
elif [[ $stdout_target === '/dev/stderr' ]]; then
	exec 3>2
else
	# a file path, or say /dev/tty
	exec 3>"$stdout_target"
fi
eval "$command" >&3
exec >&3-


if [[ -z $stdout_target ||  $stdout_target === '/dev/stdout' ]]; then
	exec {STDOUT_FD}>1
elif [[ $stdout_target === '/dev/stderr' ]]; then
	exec {STDOUT_FD}>2
else
	# a file path, or say /dev/tty
	exec {STDOUT_FD}>"$stdout_target"
fi
eval "$command" >&${STDOUT_FD}
exec >&{STDOUT_FD}-

the problem with any change here, is that it doesn't handle the situation where the target is an actual file path, in which case there needs to be a --append flag and special handling to differentiate between overwrite and append operations.


as it is not clear yet to me whether the ubuntu handling or the macos handling is the correct expected behaviour, the buffering proposal seems the best one at this point, and is what I will do so I can continue with the release cadence of dorothy.

@balupton balupton added the ongoing Ongoing efforts of incremental improvements label Feb 11, 2025
balupton added a commit that referenced this issue Feb 12, 2025
/improves #283
/fixes https://github.com/bevry/dorothy/actions/runs/13254558267/job/36998982191#step:2:502

update tests in `dorothy-config`, `echo-lines-after`, `echo-lines-before` to correctly detect and report a regression here
@balupton
Copy link
Member Author

balupton commented Feb 12, 2025

I have pushed up a workaround to eval_capture which uses >> instead of >, which is fine, as all the eval_capture targets so far in Dorothy are the special files of /dev/stdout, /dev/stderr, /dev/tty, and /dev/null.

Do note that in the workaround, buffering lines into a single write was not enough, as >> was still discarded if a >/dev/stdout redirect was used. The only solution, which also meant buffering lines was not needed, was to use >>/dev/stdout in the relevant location(s). See the referenced commit for details.

From the debugging and the result of the failed test:
https://github.com/bevry/dorothy/actions/runs/13254558267/job/36998982191#step:2:502

It seems that this could have actually been the cause of this earlier issue:
#228

As the debugging and the fix encountered that and similar results, which are now fixed.

However, even with the fix, considering the response by Lawrence Velázquez:
https://lists.gnu.org/archive/html/help-bash/2025-02/msg00052.html

And that these special device files may not be available on all systems that Dorothy will eventually target:
https://unix.stackexchange.com/a/278361/50703

It seems best that over time, we gradually remove and discourage use of such, while devising an alternative solution. This will be an ongoing issue that will be achieved over the next several months, as there are more important matters, and the surface area so far of where this has caused a detected issue has been resolved.

balupton added a commit that referenced this issue Feb 12, 2025
/improves #283
/fixes https://github.com/bevry/dorothy/actions/runs/13254558267/job/36998982191#step:2:502

update tests in `dorothy-config`, `echo-lines-after`, `echo-lines-before` to correctly detect and report a regression here
@molleweide
Copy link
Collaborator

molleweide commented Feb 12, 2025

saving some articles i find in addition:

safe usage of FDs: https://unix.stackexchange.com/questions/748702/which-file-descriptor-numbers-are-safe-to-use-in-posix-shell-scripts
automatic file descriptor allocation: https://unix.stackexchange.com/questions/405439/bash-use-automatic-file-descriptor-creation-instead-of-fifo
how to find the next available FD in bash: https://stackoverflow.com/questions/41603787/how-to-find-next-available-file-descriptor-in-bash
bash set -x logs to file: https://askubuntu.com/questions/811439/bash-set-x-logs-to-file
how do i open an FD that is externably redirectable: https://stackoverflow.com/questions/10587857/in-bash-how-do-i-open-a-writable-file-descriptor-thats-externally-redirectable?rq=3

avoid directly manipulating FDs in shell:https://www.oilshell.org/blog/2017/08/12.html

I wasn't familiar with these constructs before I started writing a shell, and I suspect many others aren't either.

is it safe to use FD3?: https://unix.stackexchange.com/questions/320510/is-it-safe-to-use-fd-3

this one goes pretty hard it seems:
https://catonmat.net/bash-one-liners-explained-part-three

goldmine >>> how to find the lowest available FD: https://stackoverflow.com/questions/8297415/in-bash-how-to-find-the-lowest-numbered-unused-file-descriptor

@balupton
Copy link
Member Author

balupton commented Feb 16, 2025

avoid directly manipulating FDs in shell:https://www.oilshell.org/blog/2017/08/12.html

I wasn't familiar with these constructs before I started writing a shell, and I suspect many others aren't either.

Dude is only writing trivial scripts if he thinks his alternatives are replacements. He's never had a need for /dev/tty either and probably not familiar with the issue in the OP. His is right for trivial scripts.

But then again, maybe he's right. Will need to prototype say __stdout, __stderr, __tty functions.

@balupton
Copy link
Member Author

balupton commented Feb 16, 2025

This is probably our solution for tty/stderr compat on bash v3 and is the best answer there
https://stackoverflow.com/a/76011352/130638

@balupton
Copy link
Member Author

balupton commented Feb 16, 2025

avoid directly manipulating FDs in shell:https://www.oilshell.org/blog/2017/08/12.html

I wasn't familiar with these constructs before I started writing a shell, and I suspect many others aren't either.

Dude is only writing trivial scripts if he thinks his alternatives are replacements. He's never had a need for /dev/tty either and probably not familiar with the issue in the OP. His is right for trivial scripts.

But then again, maybe he's right. Will need to prototype say __stdout, __stderr, __tty functions.

So locally I've changed all the relevant >/dev/stderr calls with either mostly a __stderr function, and in a few places using the fds or >>/dev/stdeer. Likewise for the other targets. This handles 99% of the use cases.

However for eval-tester and a few other places of actual complexity, we need to do the fd dup thing, such that content can be grabbed, including that of tty.

@balupton balupton added this to the Share Launch [v3] milestone Feb 23, 2025
@balupton balupton self-assigned this Feb 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ongoing Ongoing efforts of incremental improvements
Development

No branches or pull requests

2 participants