fix Flush deadlock by implementing SpinWait pattern using Interlocked.CompareExchange #2595

Closed
wants to merge 11 commits into from

@TimothyMothra (Member) commented May 7, 2022

Fix Issue #1186.

Metrics Flushing has a deadlock which occurs here:

    lock (buffer)
    {
        int maxFlushIndex = Math.Min(buffer.PeekLastWriteIndex(), buffer.Capacity - 1);
        int minFlushIndex = buffer.NextFlushIndex;
        if (minFlushIndex > maxFlushIndex)
        {
            return;
        }

        stage1Result = this.UpdateAggregate_Stage1(buffer, minFlushIndex, maxFlushIndex);
        buffer.NextFlushIndex = maxFlushIndex + 1;
    }

This PR removes the lock and implements a SpinWait pattern using Interlocked.CompareExchange().
Because Interlocked.CompareExchange takes its target by reference, it needs direct access to the index variable inside the MetricValuesBuffer class. This required some refactoring.

Changes

  • Removed the lock from the MetricSeriesAggregatorBase class.
  • Added a new method, TryGetFlushIndexes(), to the MetricValuesBuffer class.
    This new method uses Interlocked.CompareExchange to try to advance the index variable.
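
To illustrate the second change, here is a minimal sketch of how a method like TryGetFlushIndexes() could claim a flush range with Interlocked.CompareExchange and SpinWait instead of a lock. It is not the code from this PR: the class FlushIndexBufferSketch, its fields, and ReserveWriteIndex() are hypothetical stand-ins, and only the index handling of MetricValuesBuffer is modeled.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for MetricValuesBuffer; only the index handling is modeled.
public sealed class FlushIndexBufferSketch
{
    private readonly int capacity;
    private int lastWriteIndex = -1; // index of the most recent write, -1 when empty
    private int nextFlushIndex = 0;  // first index that has not been flushed yet

    public FlushIndexBufferSketch(int capacity)
    {
        this.capacity = capacity;
    }

    // Reserves the next write slot. The result may exceed capacity - 1 once the buffer is full.
    public int ReserveWriteIndex()
    {
        return Interlocked.Increment(ref this.lastWriteIndex);
    }

    // Tries to claim the range [minFlushIndex, maxFlushIndex] for the calling thread.
    // Returns false when there is nothing new to flush.
    public bool TryGetFlushIndexes(out int minFlushIndex, out int maxFlushIndex)
    {
        var spinner = new SpinWait();
        while (true)
        {
            int observedNext = Volatile.Read(ref this.nextFlushIndex);
            int observedMax = Math.Min(Volatile.Read(ref this.lastWriteIndex), this.capacity - 1);

            if (observedNext > observedMax)
            {
                minFlushIndex = 0;
                maxFlushIndex = -1;
                return false;
            }

            // Advance nextFlushIndex only if no other thread changed it since we read it.
            int previous = Interlocked.CompareExchange(ref this.nextFlushIndex, observedMax + 1, observedNext);
            if (previous == observedNext)
            {
                minFlushIndex = observedNext;
                maxFlushIndex = observedMax;
                return true;
            }

            // Lost the race: back off briefly and retry with fresh values.
            spinner.SpinOnce();
        }
    }
}
```

In this sketch a thread that loses the compare-and-swap never blocks. It re-reads both indexes and either claims a newer range or returns false, so no thread can be left waiting on a lock held by another.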

Alternatives considered

I had a PoC (#2594) that replaced the lock with Monitor.TryEnter.
That approach more closely matches the original author's intent of using a lock, and adds a timeout to mitigate the deadlock.
It was abandoned in favor of removing the lock entirely.
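
For comparison, the abandoned alternative could look roughly like the sketch below. It is not the code from #2594: TimedLockSketch and TryRunLocked are hypothetical names, and the timeout value is left to the caller. Only Monitor.TryEnter and Monitor.Exit are real APIs here.

```csharp
using System;
using System.Threading;

// Hypothetical helper illustrating a lock with a timeout; not taken from #2594.
public static class TimedLockSketch
{
    // Runs action while holding lockObject, giving up after timeout.
    // Returns false if the lock could not be acquired in time.
    public static bool TryRunLocked(object lockObject, TimeSpan timeout, Action action)
    {
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(lockObject, timeout, ref lockTaken);
            if (!lockTaken)
            {
                return false; // skip this attempt instead of blocking forever
            }

            action();
            return true;
        }
        finally
        {
            // Release only if this call actually acquired the lock.
            if (lockTaken)
            {
                Monitor.Exit(lockObject);
            }
        }
    }
}
```

The timeout turns a permanent hang into a skipped attempt, but threads can still stall for the full timeout under contention, which is one reason a lock-free approach may be preferable.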

Checklist

  • I ran Unit Tests locally.
  • CHANGELOG.md updated with one line description of the fix, and a link to the original issue if available.

For significant contributions please make sure you have completed the following items:

  • Design discussion issue #
  • Changes in public surface reviewed

The PR will trigger build, unit tests, and functional tests automatically. Please follow these instructions to build and test locally.

Notes for authors:

  • FxCop and other analyzers will fail the build. To see these errors yourself, compile locally using the Release configuration.

Notes for reviewers:

  • We support comment build triggers
    • /AzurePipelines run will queue all builds
    • /AzurePipelines run <pipeline-name> will queue a specific build

@TimothyMothra TimothyMothra changed the title from "[WIP] Interlocked.CompareExchange" to "fix Flush deadlock by implementing SpinWait pattern using Interlocked.CompareExchange" May 20, 2022
@TimothyMothra TimothyMothra requested a review from cijothomas May 20, 2022 20:33
@TimothyMothra TimothyMothra marked this pull request as ready for review May 20, 2022 20:34
@TimothyMothra (Member, Author) commented:

GitHub won't let Kennedy comment. He shared his comments offline:

Won't let me leave a comment on that PR...
But taking a look, I'm curious why a while(true) and not a for loop with a break and return false at the end?
continue at the bottom is also not needed
Otherwise looks good
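
The loop shape suggested in that comment might look like the following sketch: a bounded for loop that breaks when there is nothing to do and falls through to a single return false. The names BoundedRetrySketch, TryAdvance, and maxAttempts are hypothetical, and this is an illustration of the suggestion rather than the code under review.

```csharp
using System.Threading;

// Hypothetical illustration of a bounded compare-and-swap retry loop.
public static class BoundedRetrySketch
{
    // Tries to move index forward to newValue using at most maxAttempts CAS attempts.
    // On success, claimedFrom holds the value that index had before the update.
    public static bool TryAdvance(ref int index, int newValue, int maxAttempts, out int claimedFrom)
    {
        var spinner = new SpinWait();
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            int observed = Volatile.Read(ref index);
            if (observed >= newValue)
            {
                break; // nothing left to claim
            }

            if (Interlocked.CompareExchange(ref index, newValue, observed) == observed)
            {
                claimedFrom = observed;
                return true;
            }

            // Another thread changed index first: back off, then retry.
            spinner.SpinOnce();
        }

        claimedFrom = -1;
        return false;
    }
}
```

Compared with while(true), the bounded loop needs no trailing continue and guarantees that the method returns after a fixed number of attempts, at the cost of occasionally reporting failure under heavy contention.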
