-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Signals do not add SA_ONSTACK which may cause linked Go applications to crash. #3520
Comments
I didn't know that we set a signal handler if you did not directly use the |
Signal handlers are lazily installed the first time a (Note that on Unix platforms tokio's process management will also install a signal handler, in case some other part of the code is using processes) |
https://github.com/tokio-rs/tokio/blob/master/tokio/src/runtime/driver.rs#L10 When enable_io is used, process, signal and io handlers are all installed. Unless I'm mistaken this is the what I'm asking, if this could be enabled in a more fine-grained manner. |
This is where the driver is registered, but signal handlers are lazily registered here: https://github.com/tokio-rs/tokio/blob/master/tokio/src/runtime/driver.rs#L10 It seems like something in the application is registering a signal one way or another. If the signal driver is disabled, then that part of the code will panic (although it will help you find out where the registration is happening at least). I'm not very familiar with |
I found this issue after running into the same problem with a Rust library that's being called from Go. I wrote a little test program and was able to confirm that the signal handler for SIGCHLD is installed eagerly when the Here's the main.rs from the test program: #[tokio::main]
async fn main() {
tokio::time::sleep(std::time::Duration::from_secs(600)).await;
} I ran that, and then followed these instructions while the program was sleeping.
The SigCgt above indicates that it is catching signal 17 (SIGCHLD). When I remove the I don't know whether that should be considered a bug or not, or if it was intentional. I could see an argument either way. But I did confirm that with tokio 0.2.25 this signal handler is not installed eagerly, but with 0.3.7 it is. |
I think that fine-grained enable_ would be a useful way to resolve this as it is possible for a crate/package to want to have signal + io enabled, but differing runtimes with or without those features. I have noticed that if any crate in my dependency chain requests signal then it "taints" all the way back to my crate even though I do not have the signal feature enabled. So fine grained enable/access would probably help here anyway. |
Ah yes you're right! I had forgotten that the |
#3521 <<-- my PR does this, but it's rough, likely not ready for merge at all. The names probably aren't great either. Maybe it's a start you can build from or you can advise on how to clean it up? |
Thanks for posting your issue here. Could you elaborate on why omitting the |
Because another part of my application does require signal handling. |
Wouldn't this part of the application turn signal handling back on in the runtime and bring you back to the same issue? I'm not super familiar with |
@ipetkov It's actually worse than that. What it is is that this part of the application sets up one tokio runtime without signal handling in the pam/nsswitch modules. But because we link to another crate which also has tokio, and that does have the signal feature, it taints through and activates it in this part of the application. IE if you have A without signal, and it depends on B that does have signal, A has signal activated. That's why I implemented #3521 because this allows the runtime in A to select what it needs, even though the tokio features have been tainted through the dependency on B. I'm not really invested enough to try to solve the SA_ONSTACK problem, but certainly it has value for tokio in being able to better cooperate with languages like go that do set SA_ONSTACK. |
I'm here because I found out that tokio is installing a global Now this is not a new problem, and as of relatively recently pidfd exists on Linux to fix this since you get a file descriptor that one can just add to epoll etc. Looks like there's an async-pidfd crate. Though, this is also something that the standard library could benefit from...ah and now that I look, there's this tracking issue and some recent attempts to use it. |
I think discussing OK wait, no apparently I'm wrong, Go (naturally now that I think about it) doesn't use But that said, either way it's still process global state; the best long term solution is to get away from that. On Linux, that's pidfds as mentioned above. Unfortunately I need to care about RHEL8 for quite a while, and I'm sure I'm not the only one needing something that works on pre-pidfd Linux (or other Unix I guess). In the case of a multithreaded Tokio runtime, we could perhaps follow Go's model and basically do a thread per child process. Maybe as some sort of opt-in |
See tokio-rs/tokio#3520 and https://gitlab.gnome.org/GNOME/glib/-/issues/1866#note_1310871 This is a workaround for tokio installing a global `SIGCHLD` handler, which GLib *also* does. It's then a Highlander situation. Work around this by using the blocking API in std which internally just uses `waitpid()`.
In case anyone else comes here, one can also work around this by avoiding using Tokio's child process APIs and instead just use the ones from |
I'm curious to understand what may be happening here. Tokio uses the |
Ah, I see. (reads) - that's some nice code. In my case the GLib code happens to run after tokio, and the GLib code is making no attempt to preserve the previous handler. Hmm...that's obviously doable in the same way that signal-hook is. May look at that, but I'd need to wait a while before I could rely on it. And now that I think about it, I could likely work around this by spawning a dummy process before using tokio too. OK so I take my earlier comments back, this issue is then somewhat specific to Go's use of But the above around using pidfd etc. on Linux I think still stands as the right fix at least on Linux. |
The Unix signal APIs are far from ergonomic, though chaining signal handlers is the "expected" behavior one should follow. I find it unfortunate (and frankly surprising) that the GLib will clobber this outright... That said, I have no problems with exploring a |
Re pidfd, I think it makes sense to wait until it lands in |
#6152 adds pidfd support to tokio and use pidfd instead of signal in |
We could still register a signal handler if the fallback is used. Furthermore, |
You are right, this issue is more general than just |
…l os fixes tokio-rs/tokio#3520 Signals do not add SA_ONSTACK which may cause linked Go applications to crash. fixes pact-foundation#68 fixes pact-foundation/pact-go#269 note: std::process unable to read pact-avro-plugin startup message see pact-foundation#68 for detail
…l os fixes tokio-rs/tokio#3520 Signals do not add SA_ONSTACK which may cause linked Go applications to crash. fixes pact-foundation#68 fixes pact-foundation/pact-go#269 note: std::process unable to read pact-avro-plugin startup message see pact-foundation#68 for detail
…l os fixes tokio-rs/tokio#3520 Signals do not add SA_ONSTACK which may cause linked Go applications to crash. fixes pact-foundation#68 fixes pact-foundation/pact-go#269 note: std::process unable to read pact-avro-plugin startup message see pact-foundation#68 for detail
relates to tokio-rs/tokio#3520 Signals do not add SA_ONSTACK which may cause linked Go applications to crash. fixes pact-foundation#68 fixes pact-foundation/pact-go#269 windows plugin shutdowns, terminate child processes
relates to tokio-rs/tokio#3520 Signals do not add SA_ONSTACK which may cause linked Go applications to crash. fixes #68 fixes pact-foundation/pact-go#269 windows plugin shutdowns, terminate child processes
Version
tokio-util v0.6.3 ()
tokio v1.2.0 ()
Platform
Linux pyrite 5.10.12-1-default #1 SMP Sat Jan 30 19:15:49 UTC 2021 (a3c8888) x86_64 x86_64 x86_64 GNU/Linux
Description
When a module built with the tokio runtime is linked or used by an application with go, the related go module crashes. In this situation, the cause is that the nss_kanidm.so module which is built with tokio 1.2.0 is loaded via nsswitch, and this interfers with the signal handling of the docker daemon causing it to crash.
The previous nss_kanidm tokio cargo.toml was:
nss_kanidm module would make a call to a localhost resolver via a unix domain socket consuming a tokio util codec type. However this required block_on and a tokio run time, so the following code was used in the nss client support module.
As tokio was built with
signal
support the call to "enable_all" would cause signal handlers to be added. These signal handlers are managed by signal-hook-registry, and it appears they are not loaded with thesigaction
flags forSA_ONSTACK
. Go loads it's signal handlers on an alternate stack, meaning any other registered handlers which do not use this flag cause the Go runtime to panic.Workarounds
Remove the "signal" flag from tokio in Cargo.toml to prevent these being loaded in the call for enable_all.
Possible Resolutions
I think there are a few ways this could be handled.
One is that signal-hook-registry has the capability to call sigaction with SA_ONSTACK in case that Rust + Go is linked in the same situation, but this may require work with the related projects.
Another possible approach is that Builder is very "coarse" with the enable flags. The options today are:
Today enable_io from docs.rs states "Doing this enables using net, process, signal, and some I/O types on the runtime." This is rather "coarse", when the Cargo.toml in tokio shows that signal is not a dependency to net/io (but is for process).
So perhaps tokio could improve the enable_ flags to have more fine grained settings such as enable_net, enable_process (implies signal). This would allow consumers to have finer control to avoid signal handler registration which can also work around the issue.
Thanks,
The text was updated successfully, but these errors were encountered: