
EventCounter publishing thread can hang #53564

Closed
noahfalk opened this issue Jun 1, 2021 · 2 comments · Fixed by #53836

Comments

@noahfalk
Member

noahfalk commented Jun 1, 2021

Description

When listening to EventCounters, there is a dedicated thread that is supposed to publish the metric values via EventSource on a periodic timer. However, due to a bug, it is possible to put that thread into an infinite loop. This occurs any time _pollingIntervalMilliseconds <= 0, which could happen for a couple of reasons:

  1. The listener specified EventCounterIntervalSec=0 after previously specifying a non-zero value. We would never expect a well-behaved listener to request a 0 sec interval, but the runtime shouldn't blindly trust this input.
  2. The listener enables and then disables the EventSource. Due to a race, the infinite-loop code could run prior to re-checking _eventSource.IsEnabled(), so it would observe _pollingIntervalMilliseconds = 0.
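To illustrate the failure mode, here is a minimal, hypothetical sketch of the scheduling step (the NextDueTime name and the loop shape are illustrative, not the actual runtime source). The next due time is advanced in interval-sized steps until it passes the current time, so a non-positive interval can never make progress; treating it as "disabled" instead (DateTime.MaxValue, mirroring the internal-state default the eventual fix uses) avoids the spin:

```csharp
using System;

// Hypothetical simplification of the publishing thread's scheduling step.
// With a 0 ms interval, the while loop below could never advance past "now"
// and the thread would spin forever; guarding up front turns a bad interval
// into "no next due time" instead.
DateTime NextDueTime(DateTime lastDue, DateTime now, int intervalMilliseconds)
{
    if (intervalMilliseconds <= 0)
        return DateTime.MaxValue; // never blindly trust a listener-supplied interval

    DateTime due = lastDue;
    while (due <= now)
        due = due.AddMilliseconds(intervalMilliseconds);
    return due;
}

DateTime now = DateTime.UtcNow;
Console.WriteLine(NextDueTime(now.AddSeconds(-5), now, 1000)); // one interval past "now"
Console.WriteLine(NextDueTime(now, now, 0));                   // DateTime.MaxValue, i.e. disabled
```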

Configuration

.NET 5 (but presumably also reproducible on .NET Core 3.0 and 3.1)

Regression?

The underlying bug was introduced in 2019 and shipped in .NET Core 3.0, 3.1, and .NET 5. At that point, however, only failure reason (1) above would have been able to trigger it.

In August 2020 an additional change was made in .NET 5 (and backported to 3.1) which opened the race condition, allowing failure reason (2) above to be hit.

Repro

using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using System.Threading.Tasks;

internal sealed class RuntimeEventListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name.Equals("System.Runtime"))
        {
            EnableEvents(source, EventLevel.LogAlways, EventKeywords.None, new Dictionary<string, string>()
                { { "EventCounterIntervalSec", "1" } });
            
            // The next line makes the example fail, comment this out to see it work normally
            EnableEvents(source, EventLevel.LogAlways, EventKeywords.None, new Dictionary<string, string>()
                { { "EventCounterIntervalSec", "0" } });
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Console.WriteLine(eventData.TimeStamp + " " + eventData.EventName);
    }
}

class Program
{
    static void Main(string[] args)
    {
        RuntimeEventListener listener = new RuntimeEventListener();
        Console.ReadLine();
    }
}

Expected behavior: No threads should spin in the counter infinite loop
Actual behavior: A thread will hang in that loop

Note: this repro only demonstrates technique (1). Any fix still needs to account for technique (2) as well.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-System.Diagnostics.Tracing and untriaged labels Jun 1, 2021
@ghost

ghost commented Jun 1, 2021

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.


@noahfalk noahfalk added this to the 6.0.0 milestone Jun 1, 2021
@noahfalk
Member Author

noahfalk commented Jun 1, 2021

@josalem

@tommcdon tommcdon removed the untriaged label Jun 2, 2021
josalem pushed a commit to josalem/runtime that referenced this issue Jun 7, 2021
* default internal state to DateTime.MaxValue
* re-check IsEnabled after not holding lock
* add test for setting interval to 0
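The second bullet can be sketched roughly as follows (the names and structure here are hypothetical, not the actual CounterGroup code): read the shared interval while holding the lock, then re-validate both IsEnabled() and the interval after releasing it, so a racing disable or a zero interval is observed before the thread commits to waiting:

```csharp
using System;

// Illustrative sketch of the "re-check IsEnabled after not holding lock"
// pattern from the fix description; sourceEnabled stands in for
// _eventSource.IsEnabled() and pollingIntervalMilliseconds for the shared
// field of the same name.
object gate = new object();
bool sourceEnabled = false;
int pollingIntervalMilliseconds = 0;

bool ShouldPoll(out int intervalMs)
{
    // Snapshot shared state under the lock.
    lock (gate) { intervalMs = pollingIntervalMilliseconds; }

    // Re-check after the lock is released: if a disable raced with us, or the
    // interval is non-positive, bail out instead of spinning on stale state.
    if (!sourceEnabled || intervalMs <= 0)
        return false;
    return true;
}

sourceEnabled = true;
lock (gate) { pollingIntervalMilliseconds = 1000; }
Console.WriteLine(ShouldPoll(out _)); // True: enabled with a sane interval

lock (gate) { pollingIntervalMilliseconds = 0; }
Console.WriteLine(ShouldPoll(out _)); // False: zero interval is rejected
```

The key design point is that the IsEnabled check happens outside the lock, immediately before the decision to wait, narrowing the window in which a disable can leave the thread looping on a stale interval.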
@ghost ghost added the in-pr label Jun 7, 2021
@ghost ghost removed the in-pr label Jun 10, 2021
github-actions bot pushed a commit that referenced this issue Jun 10, 2021
* default internal state to DateTime.MaxValue
* re-check IsEnabled after not holding lock
* add test for setting interval to 0
mmitche pushed a commit that referenced this issue Jun 15, 2021
* Fix #53564
* default internal state to DateTime.MaxValue
* re-check IsEnabled after not holding lock
* add test for setting interval to 0

* simplify fix to just a for loop

* remove one more line

* Remove loop

* Update test to match #54081

Co-authored-by: John Salem <josalem@microsoft.com>
@ghost ghost locked as resolved and limited conversation to collaborators Jul 10, 2021