-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make improvements to signal handling on .NET applications #50527
Comments
Change that introduced the current behavior: dotnet/coreclr#4309 I am not sure whether it is a good idea to overload the existing |
I'm fine with introducing a new event if that's what it takes, I just want the problem to be solved by the runtime itself instead of the libraries (which does a poor job at it). |
OK let's design an API. Today we have namespace System
{
public static class Environment
{
+ public static Task WaitForShutdownSignalAsync();
+ public static void WaitForShutdownSignal();
}
} Usageusing System;
// Blocking wait for shutdown to start
Environment.WaitForShutdownSignal();
// This is where the application performs graceful shutdown Non-blocking waitusing System;
// Non-Blocking wait for shutdown to start
await Environment.WaitForShutdownSignalAsync();
// This is where the application performs graceful shutdown I know this has the potential to get into the generic signal handling discussion but I'd like us to focus on the main requirements:
|
|
Updated.
I wanted to avoid that for this change. We can always add overloads that take an enum or flags enum with the signals to care about.
It could. And like today it would be confusing as there could be multiple waiters an each who can decide what the global state of resuming the signal handling should be. I think not having a callback API means it won't matter much in practice since my entire continuation will execute before signal processing resumes anyways.
I don't think this API should be a replacement for the current ConsoleKeyProcess handler. I don't think it should initiate shutdown at all, just leave that up to the application. |
Why? What's special about these specific signals other than they're the ones required today? If we're going to add a general signal listening mechanism, it should be generalized. That said, what is the expected behavior on Windows?
The console handler doesn't initiate shutdown. It let's you intercept the signal that otherwise shuts down. It only has to do something special in this regard if you choose not to intercept.
Not if it's async. Also, you can't run arbitrary managed code in signal handiers, so the callback needs to be made async from the signal handler anyway. For reference, you can see this in how we handle ctrl-C in Console. When a signal arrives, our registered signal handler queues our handling of it and then also immediately invokes the next handler in the chain: runtime/src/libraries/Native/Unix/System.Native/pal_signal.c Lines 43 to 68 in 4523fbf
There's then a thread processing this queue, which handles the SIGINT by invoking the user's callback: runtime/src/libraries/Native/Unix/System.Native/pal_signal.c Lines 109 to 117 in 79ae74f
And if the callback says the ctrl-C shouldn't be canceled, at that point we can't still rely on the default tear-down-the-app-behavior, since the original signal is long gone, so we clear out our handler and re-send the signal, such that this time it'll tear things down: runtime/src/libraries/Native/Unix/System.Native/pal_signal.c Lines 191 to 199 in 79ae74f
|
My motivation is to fix bugs we cannot work around today to do with graceful shutdown. What makes these signals special is what I said here:
The pattern that repeats itself in many of these systems (Kubernetes, Docker, Systemd), send a SIGTERM to give the application some time to shutdown, then follow up with a SIGKILL, if the app hasn't shutdown in some timeframe. SIGINT, is the dev scenario where the developer hits CTRL+C on the command line.
On windows, this usually ends up being CTRL+C. Last time this conversation stalled at Mono.Posix. I don't have a problem if you think we should make it more general purpose, but these signals are special because of the shutdown semantics other systems are applying to these signals.
Right, I figured this would be the case. (though I didn't look at the code). I don't have a strong opinion on if we should allow this. Again my motivation is about handling these buggy shutdown cases that are hard to workaround today without pinvoking. |
I see there's a desire to make this more generic, I don't have a problem with that but I also don't have the more generic scenarios. SIGINT is already handled by CTRL+C, so there actually isn't a huge problem with not supporting that here IMO.
You can see they all follow this pattern or SIGTERM followed by SIGKILL. I had 2 goals with the original proposal:
I do think there are 2 scenarios that I previously mentioned outside of generic signal handling:
These are the shutdown cases we want to be able to handle uniformly in Microsoft.Extensions.Hosting and any console application in general. What do we need to do to move this forward? |
We have been fixing this multiple times already by small incremental changes that just focused on the next problem, only to find that it still does not work well. If we continue this pattern by introducing a new API that just fixes the next problem, we are likely to find that it still does not work well. I think we need an API that does at least the following:
|
If we want something this general purpose then we might want to take some inspiration from others https://golang.org/pkg/os/signal/ https://nodejs.org/api/process.html#process_signal_events Where we're unique in this space is typically our handling of windows. Most other platforms are *nix first so their APIs try to map *nix signals to various windows behaviors. PS: I know we've had lengthy discussions about mono.posix and that's why this discussion worries me a bit (scope creep). My primary concern is still making sure those scenarios I mentioned get fixed. cc @tmds because he knows all things about linux. cc @danmoseley because this has come up before. My main goal is to fix those bugs I closed and referenced in the issue, I see this general purpose feature as a plus to that. |
I think this is gonna be problematic for the async case. |
Agree. It suggests that the API should be more like the existling |
Ugh I very much dislike that. I'd prefer passing an argument to the method that informs if signal processing should continue. The async version would never has this mode and signal processing would always move to the next handler. |
I expect that the implementation is going to have an API like that at the lowest level internally. Why not polist it and expose it to give full control about the behavior if you need it? Then the rest is just a discussion about convenience wrappers. |
Meaning we're not exposing an API that we expect the majority of users to use? We build another layer for that? That's fine with me but will result in potential duplication. It's needed because this is basically how every application that runs in a container works today. Microsoft.Extensions.Hosting tries to abstract this across different hosting environments with a lifetime abstraction. I assume we want people to be able to easily write console applications that can wait for shutdown signals without using that library right? var running = Task.Run(() =>
{
while (true)
{
Console.WriteLine(DateTime.Now);
await Task.Delay(5000);
}
});
var sigtermTask = Environment.WaitForSingalAsync(Signals.SIGTERM);
await Task.WhenAny(running, sigtermTask);
|
Actually it might be nicer to use CancellationToken cancel = Environment.GetSingalToken(Signals.SIGTERM);
while (true)
{
cancel.ThrowIfCancellationRequested();
Console.WriteLine(DateTime.Now);
await Task.Delay(5000, cancel);
} |
command-line-api uses a Compared to an event, the
I think a key part of what is requested here is that using the API, implies the runtime assumes
The API should allow for SIGINT (CTRL+C) to do something other than terminate the app. For example, if you write a REPL, you may want it to stop the current command. |
@tmds Why are you in my brain 😄 |
YES. I want main to be the continuation.
YEP. |
I think it is rare for simple console applications to need to wait for shutdown signals. I think waiting for shutdown signals is for complex console applications or appmodels like Microsoft.Extensions.Hosting (in which case the appmodel can provide the convenience wrapper). |
What would convince you that this isn't the case? |
#44086 is about |
namespace System.Runtime.InteropServices
{
public enum PosixSignal
{
SIGHUP = -1,
SIGINT = -2,
SIGQUIT = -3,
SIGTERM = -4,
SIGCHLD = -5,
...
}
public sealed class PosixSignalContext
{
public PosixSignal Signal { get; }
public bool Cancel { get; set; }
public PosixSignalContext(PosixSignal signal) { }
}
public sealed class PosixSignalRegistration : IDisposable
{
private PosixSignalRegistration() { }
public static PosixSignalRegistration Create(PosixSignal signal, Action<PosixSignalContext> handler);
}
} |
@bartonjs it would be neat if the signals enum mapped numerically to the most common values for the various signals (Linux x86/ARM or BSD, they are in broad agreement). So e.g. where you have SIGTERM as -4 I would have expected it to be -9 (SIGKILL being -4). This is an awesome proposal all around though! |
The decision was to map these defaults signals understood by us to negative numbers so we can handle them specially. If you want to use other raw signals, pass the raw number through and it will bind to the OS. |
I see. As a user I would have expected to be able to register to handle a wider variety of signals (e.g. SIGUSR1/SIGUSR2). I think SIGWINCH is also of note. Anecdotally I've seen SIGHUP used not as a terminator but as a "reload config" signal in a variety of systems as well. If the API intent is to narrowly handle a certain class of "maybe terminate" signals then I think it needs a more specific name, the implication (at a glance) to me was that I could handle any or at least most POSIX signals if I registered for them... |
For PowerShell, we would love to see SIGPIPE, SIGTTIN, and SIGTTOU. Related #452 |
You can map whatever signal you want, it's a cast away. Then we can discuss what things people need added to the enum and if it's possible to implement them without borking the runtime 😄: PosixSignalRegistration.Create((PosixSignal)(10) /* SIGUSR1 */, context => ...); |
TTIN/TTOU were on the list of things discussed. From the API Review perspective we're OK with adding anything using the standard POSIX name, using an arbitrary/incrementing negative number to get handled appropriately in a PAL. Assuming SIGPIPE isn't something that the runtime is using already in an important way, it'd be in the same bucket.
There's no real reason for symmetry. A note I took in the middle of the review was that "universal" codes (e.g. SIGTERM looks like it's always 15) would be positive and non-universal ones would be negative, but we decided to just use negative numbers and translate. (And, as @davidfowl pointed out, if you pass a positive number we just don't translate it). Since our code will probably be something like public static PosixSignalRegistration Create(PosixSignal signal, ...)
{
if (signal < PosixSignal.TheLastOneWeSupported)
throw new ArgumentOutOfRangeException(nameof(signal));
int nativeCode = PalMap(signal);
...
} PALEXPORT int32_t PalMap(PalSignal value)
{
switch (value)
{
case PAL_SIGABRT:
return SIGABRT;
...
default:
return value;
}
} There's not much benefit in using affinitized numbers (in fact, it makes "you used a negative value we don't understand" harder). |
@tmds is this something you would want to take on? |
@davidfowl I'll try to pick this up in June. I'll add a comment when I start working on it. |
@bartonjs could we consider the name |
@tmds, I merged your PR (thanks!) and I'll finish it. |
@stephentoub are you going to change Microsoft.Extensions.Hosting as well? |
No, just the PosixSignalRegistration Windows implementation. |
OK I will work on the hosting PR |
Background and Motivation
Many systems use SIGTERM as a way to signal a graceful shutdown. To make this work well in container scenarios, we need to start the shutdown process and wait for some timeframe until it's done. To coordinate that, Microsoft.Extensions.Hosting sets up a bunch of events to allow the main thread to exit before continuing the process exit handler. Here's what that would look like (without the dependencies):
The above code shows how a user could set this up today. It's much more complex than the windows equivalent (handling Ctrl+C) for a couple of reasons:
Environment.Exit
at the wrong point in the application.This is what waiting for CTRL+C looks like:
The runtime itself handles
e.Cancel = true
and doesn't shut down the process after the event handler runs. Instead the application can use this to coordinate letting the application gracefully unwind from the main thread.The request here is to treat SIGTERM like Ctrl+C and support e.Cancel.
cc @janvorli @stephentoub @kouvel
Proposed API
NOTES:
SignalContext.Signal
provides the signals that fired to cause the event to trigger so that can be checked in the callback to take action if the same handler was used for different registrations.SignalContext.Cancel
cancels default processing.We will map windows behavior to linux signal names:
Usage Examples
Waiting synchronously on shutdown to start:
The hosting model in Microsoft.Extensions.Hosting will look like this in .NET 6:
Alternative Designs
Don't add new API but overload the existing Console.CancelKeyPress. This would cover one specific scenario but wouldn't handle the arbitrary signals.
Risks
None
The text was updated successfully, but these errors were encountered: