Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SignalR connections are shutdown directly during graceful shutdowns #25069

Open
WJRovers opened this issue Aug 20, 2020 · 24 comments
Open

SignalR connections are shutdown directly during graceful shutdowns #25069

WJRovers opened this issue Aug 20, 2020 · 24 comments
Assignees
Labels
affected-most This issue impacts most of the customers area-signalr Includes: SignalR clients and servers enhancement This issue represents an ask for new feature or an enhancement to an existing one Needs: Design This issue requires design work before implementating. Priority:1 Work that is critical for the release, but we could probably ship without severity-minor This label is used by an internal tool
Milestone

Comments

@WJRovers
Copy link

WJRovers commented Aug 20, 2020

Context:
Hosting an application in a kubernetes environment can sometimes cause the service (with active connections) to be shutdown as autoscaling happens. We have implemented graceful shutdown behaviour for various parts of the application but I see no such feature/possibility for SignalR servers. This causes problems when using the streaming features in SignalR. The Hub interface does offer you a OnDisconnectedAsync but as it turns out this is fired after the connection has been closed (by the server itself).

Things Tried
I've stepped into the source code with a debugger to find out most is handled in privates/internal classes. Should a complete new implementation be written of a HubConnectionHandler that can deal with IApplicationLifetime? Or am I missing something like an injectable class that can deal with this. Or is there an existing SignalR lifetime available?
Also; Im not sure if the HubFilters that seem to be coming out in a later release would help us with our problem.

Question/Idea:
We would like to await stream completion (with a max timeout) before the SignalR connection is cut off. Is it possible to interact with the IApplicationLifetime or await completion before SignalR connections are actually closed?

@javiercn javiercn added the area-signalr Includes: SignalR clients and servers label Aug 20, 2020
@davidfowl
Copy link
Member

You might be able to achieve something if you handle the IHostApplicationLifetime notification before SignalR does and closes all connections. This logic actually isn't directly in the signalr layer, it's part of the transport logic.

@WJRovers
Copy link
Author

WJRovers commented Aug 21, 2020

I am not sure this is not possible as I could not find any interaction between the SignalR hosting service and the IHostApplicationLifetime. Ive done a little test:

Startup.cs

public void Configure(IApplicationBuilder app, IHostApplicationLifetime applicationLifetime)
{
     applicationLifetime.ApplicationStopping.Register(() =>
     {
          Console.WriteLine("App is stopping - We should wait before SignalR closes connections?");
     }, true);

    // Some more stuff here

ChatHub

public async Task StreamData(ChannelReader<string> streamOfStrings){
    try
    {
        await foreach (var receivedString in _channelReader.ReadAllAsync())
            Console.WriteLine(receivedString)
    catch (OperationCanceledException e)
    {
        Console.WriteLine(e.Message) // "The underlying connection was closed."
    }
}

When you start a Graceful shutdown during a stream of strings being send; the console will show the followiing:
Console Output

The underlying connection was closed.
App is stopping

To me it looks like the SignalR service does not respect or interact with any of the IHostApplicationLifetime events. Am I missing something obvious or is this something that is not implemented?

Little extra information
When you have a SignalR client with auto reconnect; and you put something like a thread sleep (for testing) in the ApplicationStopping registration. It closes the connection during graceful shutdown; and then allows the agent to reconnect.

@davidfowl
Copy link
Member

What server are you running on?

@BrennanConroy
Copy link
Member

So one idea would be to add a hook to allow users to run something before SignalR starts closing connections.

@BrennanConroy BrennanConroy added this to the Backlog milestone Aug 21, 2020
@ghost
Copy link

ghost commented Aug 21, 2020

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectation regarding different types of issues you can read our Triage Process.

@BrennanConroy BrennanConroy added affected-very-few This issue impacts very few customers enhancement This issue represents an ask for new feature or an enhancement to an existing one severity-minor This label is used by an internal tool labels Nov 9, 2020 — with ASP.NET Core Issue Ranking
@BrennanConroy
Copy link
Member

We're thinking of adding an option to delay connection closes. By default there is no delay so today as soon as shutdown is triggered connections start closing. With the option you can specify how long SignalR delays once shutdown is triggered before trying to gracefully close connections.

The server can still aggressively shutdown connections that are still alive, Kestrel has configuration for this delay, not sure about other servers.

cc @davidfowl

@ghost
Copy link

ghost commented Feb 23, 2021

Thanks for contacting us.
We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@davidfowl davidfowl added affected-most This issue impacts most of the customers and removed affected-very-few This issue impacts very few customers labels Apr 3, 2021
@cubed-it
Copy link

I have exactly the same problem as the author. My singalr core application needs graceful termination for streams when scaling and rolling out an update. It is a showstopper somehow... 🙁

So I would like to support the call for such a ApplicationStopping support for signalr core hereby explicitly!

Thanks a lot! 👍

@springy76
Copy link

I have the same problem with Server-Blazor (using SignalR implicitly): All open circuits got ICircuitTracker.OnConnectionDownAsync before my IHostApplicationLifetime.ApplicationStopping handler gets called. And I checked that my code attaching to ApplicationStopping was the first one adding delegates, so should be the first one called, too.

I need an earlier point which allows me to wait (async) and trigger Component.StateHasChanged().

I already have existing code (in a singleton service which monitors all circuits) which - based on a timeout - "kicks" users by first collecting state data and then redirecting them to a simple non-blazor razor page which allows them to reenter, but also allows IIS to apply idle-shutdown of the app pool.
This already works like a charm, I just cannot combine this code with a forced shutdown; all clients then just receive a reconnecting-attempt with final "not possible" error - either because the server was shutdown entirely or the freshly started server can't of course re-attach dropped Blazor circuits.

The automatic re-connect of Blazor/SignalR is extremely worse in my scenario which involves hundreds of app-pools of the same application (strict tenant separation) - because in the (seldom) case of an entire IIS-reset hundreds of app-pools are starting at the same time immediately just because of the automatic retry.

@cubed-it
Copy link

cubed-it commented Jun 26, 2021

@springy76 as stated above I'm mainly interested in signalr core graceful termination, but blazor is also such a sore spot... I've been looking for a solution for my employer how to roll out an update using kubernetes without all users being kicked out immediately...
however my I ask if the source for the "singleton service which monitors all circuits" is shareable? or at least could you provide some api hints? sounds really interessting and now I want such a thing too. ;) some users like to keep tabs open pretty long..

@springy76
Copy link

@cubed-it It's a relative simple service deriving from CircuitHandler: https://docs.microsoft.com/en-us/aspnet/core/blazor/fundamentals/signalr?view=aspnetcore-5.0&pivots=server#blazor-server-circuit-handler which tracks all opened and closed circuits plus an in-memory weak-referenced service bus which all app components are listening to for kick messages.

@BrennanConroy
Copy link
Member

As a workaround you can register to the IHostApplicationLifetime.ApplicationStopping token after any call to MapHub or MapConnections and perform work in there before returning and letting the rest of the events fire.

app.UseEndpoint(endpoints =>
{
    endpoints.MapHub<MyHub>("/hub");
});
// cancellation tokens fire callbacks in LIFO order, so make sure we're after MapHub is called
app.ApplicationServices.GetRequiredService<IHostApplicationLifetime>().ApplicationStopping.Register(() => DoWork());

@rafikiassumani-msft rafikiassumani-msft added this to the 7.0-preview2 milestone Jan 25, 2022
@rafikiassumani-msft rafikiassumani-msft added the Priority:1 Work that is critical for the release, but we could probably ship without label Feb 10, 2022
@kunom
Copy link

kunom commented May 3, 2022

Same issue here. I would like to have SignalR being up until MassTransit is fully closed down.

MassTransit works as a IHostedService, where shutdown happens one-per-one in reverse order of registration. I was a bit surprised that Azure SignalR does not do that but directly shortcuts to IHostApplicationLifetime.

@ghost
Copy link

ghost commented May 31, 2022

Thanks for contacting us.

We're moving this issue to the .NET 7 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@zackliu
Copy link

zackliu commented Jun 10, 2022

We're facing the same issue. And the workaround is not good enough because the customized graceful shutdown is usually async method. What you have to do is "sync over async" to block the next task. We have many WebHosts instance in one process for business needs. So, the "sync over async" will block a lot of threads and easy to cause thread starving.

@kunom
Copy link

kunom commented Jun 10, 2022

Not only should shutdown be happening as part of the IHostedService contracts instead of the IApplicationLifetime contract, it also should be possible to shut down individual SignalR hubs in a well-defined order.

This, at least, is our use case, and it took me a lot of time/reverse engineering/hacking/reflection to end up with a shutdown sequence of:

SignalRHubs group1 > MassTransit > SignalRHubs group2

Needless to say, as a side remark, that I don't understand why all the Azure SignalR extensions are marked as internal and thus are also extremely hard to monitor for health checking.

@ghost
Copy link

ghost commented Oct 11, 2022

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@springy76
Copy link

So what's the progress of .NET 8 Planning?

@BrennanConroy BrennanConroy modified the milestones: .NET 8 Planning, Backlog Mar 2, 2024
@DanielCordell
Copy link
Contributor

DanielCordell commented Dec 16, 2024

Has there been any progress on this? Feels like this issue may have been forgotten about! @BrennanConroy

Right now I want to perform a SignaLR connection to every connected client on server shutdown (basically like a "I'm shutting down right now!" message).

I've tried all combinations of ordering my middleware registration, I've done something similar to this as well #25069 (comment), but every time I try it seems that by the time I get to the ApplicationStopping event, the signalr connections are already stopped.

This is essentially what I want to do:

app.UseEndpoints(endpoints =>
{
    app.MapHub<GracefulShutdownHub>("/GracefulShutdownController");
});

var hubContextService = app.Services.GetRequiredService<IHubContext<GracefulShutdownHub, IGracefulShutdownClient>>();

app.Services.GetRequiredService<IHostApplicationLifetime>().ApplicationStopping.Register(() =>
{
    Logger.LogInformation("PERFORMING GRACEFUL SHUTDOWN");
    hubContextService.Clients.All.PerformGracefulShutdown().Wait();
    Logger.LogInformation("Graceful Shutdown Performed");
});

By the time I'm calling hubContextService.Clients.All.PerformGracefulShutdown(), it appears every client has already disconnected.

@BrennanConroy
Copy link
Member

I just tried the workaround, and it works as expected. The issue is that PerformGracefulShutdown only waits to send the clients a message, it doesn't wait for the clients to close.

@DanielCordell
Copy link
Contributor

DanielCordell commented Dec 16, 2024

I just tried the workaround, and it works as expected. The issue is that PerformGracefulShutdown only waits to send the clients a message, it doesn't wait for the clients to close.

Sorry, how would I wait for the clients to close in this context? I can't see anything on the hubContext or it's child objects, and I definitely want to delay the clients closing until after this invocation.

@DanielCordell
Copy link
Contributor

@BrennanConroy Sorry I wouldn't ping if I hadn't tried everything I could think of 😆, do you have any sample or reproducible code (or can you just explain) for what I would need to do to get the connection closing to wait until after these messages are sent?

@BrennanConroy
Copy link
Member

how would I wait for the clients to close in this context?

Your app needs to manage that. Here is an example:

builder.Services.AddSingleton(clientManager);

var app = builder.Build();

app.UseCors();
app.UseHttpsRedirection();

app.UseRouting();

app.UseEndpoints(endpoints =>
{
    endpoints.MapHub<MyHub>("/default");
});

var hubContextService = app.Services.GetRequiredService<IHubContext<MyHub>>();

app.Services.GetRequiredService<IHostApplicationLifetime>().ApplicationStopping.Register(() =>
{
    hubContextService.Clients.All.SendAsync("Shutdown").Wait();
    clientManager.WaitForShutdown().Wait();
});

app.Run();

public class MyHub : Hub
{
    private readonly ClientManager _clientManager;

    public MyHub(ClientManager clientManager)
    {
        _clientManager = clientManager;
    }

    public override Task OnConnectedAsync()
    {
        _clientManager.Add(Context);
        return Task.CompletedTask;
    }

    public override Task OnDisconnectedAsync(Exception? exception)
    {
        _clientManager.Remove(Context.ConnectionId);
        return Task.CompletedTask;
    }
}

public class ClientManager
{
    private readonly ConcurrentDictionary<string, HubCallerContext> _clients = new();

    public void Add(HubCallerContext caller)
    {
        _clients.TryAdd(caller.ConnectionId, caller);
    }

    public void Remove(string connectionId)
    {
        _clients.TryRemove(connectionId, out _);
    }

    public async Task WaitForShutdown()
    {
        var maxWaitTime = TimeSpan.FromSeconds(5);
        var start = Stopwatch.GetTimestamp();

        while (Stopwatch.GetElapsedTime(start) < maxWaitTime && _clients.Count > 0)
        {
            await Task.Delay(100);
        }
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affected-most This issue impacts most of the customers area-signalr Includes: SignalR clients and servers enhancement This issue represents an ask for new feature or an enhancement to an existing one Needs: Design This issue requires design work before implementating. Priority:1 Work that is critical for the release, but we could probably ship without severity-minor This label is used by an internal tool
Projects
None yet
Development

Successfully merging a pull request may close this issue.