When threads have low free stack space, StackSpiller may infinite-loop through the thread pool #61757

Open
kouvel opened this issue Nov 17, 2021 · 2 comments

Comments

@kouvel
Member

kouvel commented Nov 17, 2021

When a thread has only marginally more than 128 KB of stack space, RuntimeHelpers.TryEnsureSufficientExecutionStack discounts 128 KB and considers only the remainder, which may not be enough even for a non-nested, non-recursive thread pool work item. In that situation, StackGuard.RunOnEmptyStackCore is triggered again from its own continuation, and this repeats indefinitely because TryEnsureSufficientExecutionStack always returns false.

This appears to be the continuation triggered by RunOnEmptyStackCore, which triggers itself continually:

    private Result RewriteExpression(Expression node, Stack stack)
    {
        if (node == null)
        {
            return new Result(RewriteAction.None, null);
        }
        // When compiling deep trees, we run the risk of triggering a terminating StackOverflowException,
        // so we use the StackGuard utility here to probe for sufficient stack and continue the work on
        // another thread when we run out of stack space.
        if (!_guard.TryEnterOnCurrentStack())
        {
            return _guard.RunOnEmptyStack((StackSpiller @this, Expression n, Stack s) => @this.RewriteExpression(n, s), this, node, stack);
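
To see how the probe behaves on small stacks in isolation, a minimal sketch along these lines may help. It is not taken from the runtime sources: StackProbe and ProbeOnThread are hypothetical names, and the sketch only calls RuntimeHelpers.TryEnsureSufficientExecutionStack on a dedicated thread created with a requested maximum stack size. It does not reproduce the thread pool loop itself. The OS may round the requested size up, so the printed results are expected to vary by platform.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical helper for illustration only; not part of the runtime.
public static class StackProbe
{
    // Runs RuntimeHelpers.TryEnsureSufficientExecutionStack on a fresh thread
    // created with the requested maximum stack size and returns its result.
    public static bool ProbeOnThread(int maxStackSizeBytes)
    {
        bool result = false;
        var thread = new Thread(() =>
        {
            // The probe runs near the top of an otherwise empty stack, which is
            // roughly the situation of a non-nested work item.
            result = RuntimeHelpers.TryEnsureSufficientExecutionStack();
        }, maxStackSizeBytes);

        thread.Start();
        thread.Join();
        return result;
    }

    public static void Main()
    {
        // Sizes chosen around the 128 KB figure mentioned above (assumption:
        // the platform honors these requests closely enough to matter).
        foreach (int kb in new[] { 128, 192, 256, 1024 })
        {
            Console.WriteLine($"{kb} KB stack: sufficient = {ProbeOnThread(kb * 1024)}");
        }
    }
}
```

If the probe returns false on a thread that has done no work yet, handing the work to another thread with the same stack size cannot help, which is consistent with the self-retriggering continuation described above.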

@kouvel kouvel added this to the 7.0.0 milestone Nov 17, 2021
@kouvel kouvel self-assigned this Nov 17, 2021
@ghost

ghost commented Nov 17, 2021

Tagging subscribers to this area: @cston
See info in area-owners.md if you want to be subscribed.

Author: kouvel
Assignees: kouvel
Labels:

area-System.Linq.Expressions

Milestone: 7.0.0

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Nov 17, 2021
@kouvel kouvel removed this from the 7.0.0 milestone Nov 18, 2021
@kouvel kouvel removed their assignment Nov 18, 2021
@kouvel
Member Author

kouvel commented Nov 18, 2021

To reproduce on an Ubuntu 20.04 VM, run the System.Linq.Expressions tests with the following environment variable set (128 KB stack size for threads):

COMPlus_DefaultStackSize=0x20000

With the environment variable set, running dotnet build /p:Configuration=Release /t:Test from src/libraries/System.Linq.Expressions/tests appears to hang for the same reason. Running the tests directly from the build output also hangs, with some tests reported as long-running:

   System.Linq.Expressions.Tests: [Long Running Test] 'System.Linq.Expressions.Tests.LiftedSubtractNullableTests.CheckLiftedSubtractNullableULongTest', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.AsTests.CheckCustomArrayAsIEnumerableOfInterfaceTest', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.ListBindTests.NonAddableListType', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.Assign.ReferenceAssignable', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.ConvertTests.ConvertNullableDoubleToNullableEnumTest', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.MemberInitTests.Reduce', Elapsed: 00:02:11
[Long Running Test] 'System.Dynamic.Tests.BindingRestrictionsTests.TypeRestrictionTrueForMatchType', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.ConvertCheckedTests.ConvertCheckedNullableFloatToSByteTest', Elapsed: 00:02:11
[Long Running Test] 'System.Dynamic.Tests.ExpandoObjectProxyTests.KeyCollectionCorrectlyViewed', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.LiftedMultiplyCheckedNullableTests.CheckLiftedMultiplyCheckedNullableSByteTest', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.BinaryNullableAndTests.CheckNullableULongAndTest', Elapsed: 00:02:11
[Long Running Test] 'System.Linq.Expressions.Tests.LambdaMultiplyTests.LambdaMultiplyUShortTest', Elapsed: 00:02:11

I attempted a simple fix, skipping the StackGuard check when the queued work item runs, but that doesn't seem to be enough to fix the hang, so the fix may be more involved.

@cston cston added this to the Future milestone Aug 2, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 2, 2022