Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock in EventPipeEventDispatcher #90053

Merged
merged 6 commits into from
Aug 14, 2023

Conversation

davmason
Copy link
Member

@davmason davmason commented Aug 4, 2023

Fixes #86838

In EventPipeEventDispatcher we currently wait on the session to close synchronously before updating the configuration, this can lead to a deadlock if the dispatcher thread tries to call back in to EventSource apis (as described in #86838).

This change fixes the deadlock by making stopping the session asynchronous, and moving the call to update EventPipeEventDispatcher outside of the EventListenersLock in EventListener.Dispose.

This is the test program I used to recreate the deadlock. It would reliably hit the deadlock within a minute before, and can run for hours without a deadlock after

// See https://aka.ms/new-console-template for more information
using System.Diagnostics;
using System.Diagnostics.Tracing;
using System.Reflection.Emit;

Console.WriteLine("Hello, World!");

List<Thread> threads = new List<Thread>();
DateTime lastUpdate = DateTime.Now;

threads.Add(new Thread(() =>
{
    while (true)
    {
        RuntimeEventListener listener = new RuntimeEventListener();
        Thread.Sleep(100);
        listener.Dispose();

        lastUpdate = DateTime.Now;
        Console.Write('.');
    }
}));

threads.ForEach(x => x.Start());

while (true)
{
    // create some GC events
    GC.Collect();
    GC.Collect();

    Thread.Sleep(1000);
    if ((DateTime.Now - lastUpdate) > TimeSpan.FromSeconds(10))
    {
        Console.WriteLine("Deadlock!");
        Debugger.Launch();
        Debugger.Break();
    }
}

[EventSource(Name = "SampleEventSource")]
class SampleEventSource : EventSource
{

}

class RuntimeEventListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        base.OnEventSourceCreated(eventSource);

        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x11);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        base.OnEventWritten(eventData);

        Console.Write("!");
        using (SampleEventSource testSource = new SampleEventSource())
        {
            testSource.Write("OneEvent");
        }
    }
}

As part of this fix we discovered a latent bug in EventPipeEventDispatcher as described in #90052. We are too close to releasing .net 8 to make this kind of change so the work will happen in a later release.

@ghost
Copy link

ghost commented Aug 4, 2023

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #86838

In EventPipeEventDispatcher we currently wait on the session to close synchronously before updating the configuration, this can lead to a deadlock if the dispatcher thread tries to call back in to EventSource apis (as described in #86838).

This change fixes the deadlock by making stopping the session asynchronous, and moving the call to update EventPipeEventDispatcher outside of the EventListenersLock in EventListener.Dispose.

This is the test program I used to recreate the deadlock. It would reliably hit the deadlock within a minute before, and can run for hours without a deadlock after

// See https://aka.ms/new-console-template for more information
using System.Diagnostics;
using System.Diagnostics.Tracing;
using System.Reflection.Emit;

Console.WriteLine("Hello, World!");

List<Thread> threads = new List<Thread>();
DateTime lastUpdate = DateTime.Now;

threads.Add(new Thread(() =>
{
    while (true)
    {
        RuntimeEventListener listener = new RuntimeEventListener();
        Thread.Sleep(100);
        listener.Dispose();

        lastUpdate = DateTime.Now;
        Console.Write('.');
    }
}));

threads.ForEach(x => x.Start());

while (true)
{
    // create some GC events
    GC.Collect();
    GC.Collect();

    Thread.Sleep(1000);
    if ((DateTime.Now - lastUpdate) > TimeSpan.FromSeconds(10))
    {
        Console.WriteLine("Deadlock!");
        Debugger.Launch();
        Debugger.Break();
    }
}

[EventSource(Name = "SampleEventSource")]
class SampleEventSource : EventSource
{

}

class RuntimeEventListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        base.OnEventSourceCreated(eventSource);

        if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
        {
            EnableEvents(eventSource, EventLevel.Informational, (EventKeywords)0x11);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        base.OnEventWritten(eventData);

        Console.Write("!");
        using (SampleEventSource testSource = new SampleEventSource())
        {
            testSource.Write("OneEvent");
        }
    }
}

As part of this fix we discovered a latent bug in EventPipeEventDispatcher as described in #90052. We are too close to releasing .net 8 to make this kind of change so the work will happen in a later release.

Author: davmason
Assignees: -
Labels:

area-System.Diagnostics.Tracing

Milestone: -

@davmason
Copy link
Member Author

davmason commented Aug 7, 2023

There is a race condition still present - since both the old task and new task use the same m_stopDispatchTask flag, if the new task starts running before the old one has finished shutting down it can potentially terminate early and never deliver events. I'm working on fixing it

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@davmason
Copy link
Member Author

There is a latent issue where ep_shutdown races with the EventPipeEventDispatcher thread, filed #90472 to track.

@davmason
Copy link
Member Author

The changes in this PR made the shutdown race much more prevalent, I added changes to address it. This PR now fixes #90472 as well.

@davmason
Copy link
Member Author

Test failures are all unrelated, merging.

@davmason davmason merged commit ad60861 into dotnet:main Aug 14, 2023
166 of 170 checks passed
@ghost ghost locked as resolved and limited conversation to collaborators Sep 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
2 participants