Fix pipe hang (issue #319) #327
There is a second hang in the |
It's also likely I'm just hiding the original bug by changing some timing. |
Is there an upstream bug report for this somewhere (e.g. on https://github.com/axboe/liburing)? |
I have to dig deeper. I found a bunch of reports about bugs with pipes, and the commit adding non-blocking splice support for pipes is only from June this year, so it's all very recent (https://lore.kernel.org/all/20220607080619.513187-5-hao.xu@linux.dev/). Basically, setting the pipes to non-blocking fixes the first case, but in the second case the splice call returns EAGAIN, and restarting/handling it is not enough: it ends up hanging at some point. If I make both pipes non-blocking and disable splicing, everything works. I need to isolate things better before opening something upstream, but it all seems a bit dicey and immature. I'm writing C equivalents to make sure we're not missing something. |
As noted in #319 (comment), I can reproduce the behaviour in that C program. |
After reading through this PR and comments, I did this for my application, which solved hangs I was seeing under |
Probably outside the scope of this PR, but shouldn't we make the sockets non-blocking by default too, if they are not already? I was under the impression that the uring and luv backends both set socket fds to be non-blocking by default. Is this not the case? |
So I've started to poke uring with
|
I've run some more tests today and this is where we are:
|
I rebased it on main to fix the CI error.
Is this ready to merge, or does it need splice to be disabled first? |
Apologies, I should have been clearer. It's not ready; I was more checking whether this approach would be OK. I also wanted to hunt down the kernel versions where the patch has been applied and maybe test for them to disable it conditionally, but I didn't get to it. |
This issue turned out to be a kernel bug, which has been fixed in torvalds/linux@46a525e and was reported here: axboe/liburing#665 (comment). We can work around it by making sure pipes are non-blocking. uring treats blocking pipes as unbounded work and hands them to its worker threads; the bug is only triggered in that case, so we force pipes to be non-blocking. If we use splice, the splice call itself needs the worker threads and the bug surfaces again (I verified this with perf probes), so splice is disabled for now. Even with the kernel fix, it's desirable to keep pipes non-blocking to avoid thread pooling. The splice call can return EAGAIN in uring, even with the kernel patched, so handle that for the future. We can tune this further by disabling splice only on unpatched kernels.
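To make the EAGAIN handling concrete, here is a minimal C sketch. It uses the plain `splice(2)` syscall rather than io_uring (under io_uring the same condition would arrive as a negative completion result instead of `errno`), so it only illustrates the retry pattern and is not the code from this PR. The helper names `wait_for` and `splice_retry` are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for splice(2) on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: block in poll(2) until `fd` reports `events`
   (or an error/hangup condition). Returns 0 on success, -1 on error. */
static int wait_for(int fd, short events) {
    struct pollfd p = { .fd = fd, .events = events };
    for (;;) {
        if (poll(&p, 1, -1) >= 0) return 0;
        if (errno != EINTR) return -1;
    }
}

/* Hypothetical helper: move up to `len` bytes from pipe `in` to pipe `out`.
   On EAGAIN, wait for the input to become readable and the output to become
   writable, then try again. Returns the number of bytes moved, 0 on EOF,
   or -1 on error. */
static ssize_t splice_retry(int in, int out, size_t len) {
    for (;;) {
        ssize_t n = splice(in, NULL, out, NULL, len, SPLICE_F_NONBLOCK);
        if (n >= 0) return n;
        if (errno == EINTR) continue;
        if (errno != EAGAIN) return -1;
        /* Either the input is empty or the output is full: wait for both. */
        if (wait_for(in, POLLIN) == -1) return -1;
        if (wait_for(out, POLLOUT) == -1) return -1;
    }
}
```

A caller would loop on `splice_retry` until it returns 0 (end of input) or -1. A fallback to a plain read/write copy on repeated EAGAIN would be another option; which one is better here is a design choice this sketch does not settle.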
This should be good to go, I updated the commit message as well. |
CHANGES:

API changes:

- Unify IO errors as `Eio.Io` (@talex5 ocaml-multicore/eio#378). This makes it easy to catch and log all IO errors if desired. The exception payload gives the type and can be used for matching specific errors. It also allows attaching extra information to exceptions, and various functions were updated to do this.
- Add `Time.Mono` for monotonic clocks (@bikallem @talex5 ocaml-multicore/eio#338). Using the system clock for timeouts, etc can fail if the system time is changed during the wait.
- Allow datagram sockets to be created without a source address (@bikallem @haesbaert ocaml-multicore/eio#360). The kernel will allocate an address in this case. You can also now control the `reuse_addr` and `reuse_port` options.
- Add `File.stat` and improve `Path.load` (@haesbaert @talex5 ocaml-multicore/eio#339). `Path.load` now uses the file size as the initial buffer size.
- Add `Eio_unix.pipe` (@patricoferris ocaml-multicore/eio#350). This replaces `Eio_linux.pipe`.
- Avoid short reads from `getrandom(2)` (@haesbaert ocaml-multicore/eio#344). Guards against buggy user code that might not handle this correctly.
- Rename `Flow.read` to `Flow.single_read` (@talex5 ocaml-multicore/eio#353). This is a low-level function and it is easy to use it incorrectly by ignoring the possibility of short reads.

Bug fixes:

- Eio_luv: Fix non-tail-recursive continue (@talex5 ocaml-multicore/eio#378). Affects the `Socket_of_fd` and `Socketpair` effects.
- Eio_linux: UDP sockets were not created close-on-exec (@talex5 ocaml-multicore/eio#360).
- Eio_linux: work around io_uring non-blocking bug (@haesbaert ocaml-multicore/eio#327 ocaml-multicore/eio#355). The proper fix should be in Linux 6.1.
- `Eio_mock.Backend`: preserve backtraces from `main` (@talex5 ocaml-multicore/eio#349).
- Don't lose backtrace in `Switch.run_internal` (@talex5 ocaml-multicore/eio#369).

Documentation:

- Use a proper HTTP response in the README example (@talex5 ocaml-multicore/eio#377).
- Document that read_dir excludes "." and ".." (@talex5 ocaml-multicore/eio#379).
- Warn about both operations succeeding in `Fiber.first` (@talex5 ocaml-multicore/eio#358, reported by @iitalics).
- Update README for OCaml 5.0.0~beta2 (@talex5 ocaml-multicore/eio#375).

Backend-specific changes:

- Eio_luv: add low-level process support (@patricoferris ocaml-multicore/eio#359). A future release will add Eio_linux support and a cross-platform API for this.
- Expose `Eio_luv.Low_level.Stream.write` (@patricoferris ocaml-multicore/eio#359).
- Expose `Eio_luv.Low_level.get_loop` (@talex5 ocaml-multicore/eio#371). This is needed if you want to create resources directly and then use them with Eio_luv.
- `Eio_linux.Low_level.openfile` is gone (@talex5 ocaml-multicore/eio#378). It was just left-over test code.
It turns out the kernel has to handle blocking file descriptors differently: it needs to do thread-pooling, which is a different (and slower) code path.
It also seems that this path is buggy in some cases. That makes sense, as I believe most people don't test with blocking FDs. I believe we should make all our FDs non-blocking as a precaution.
There is a program listed in issue #319 that reproduces this. I could hang it in under 5 seconds, and it has now been running for 10 minutes with the fix, so kaboom.