BDN randomly crashes on Linux? #90691

Closed
EgorBo opened this issue Aug 16, 2023 · 14 comments · Fixed by #90794

Comments

@EgorBo
Member

EgorBo commented Aug 16, 2023

Filing a tracking issue for the bug @stephentoub noticed on Linux-x64, minimal repro:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[DisassemblyDiagnoser]
public class Tests
{
    IEnumerable<int> _source = Enumerable.Repeat(0, 1024);
    Random _rand = new();

    [Benchmark]
    public bool Any() => _rand.NextDouble() < 1.0 ?
        _source.Any(i => i == 42) :
        _source.Any(i => i == 43);
}

Run with:

dotnet run -c Release -f net8.0 -- -j short --filter "*" --iterationTime 50

(--iterationTime 50 is just to make the crash happen faster; it reproduces without it as well)

When run, it randomly quits silently in the middle of benchmarking (it doesn't happen for all benchmarks, but so far it doesn't appear to be related to PGO, TieredCompilation, or AVX512, and Stephen even managed to repro it on .NET 7.0).

Presumably, the culprit is [DisassemblyDiagnoser] and ClrMD behind it.

cc @adamsitnik

dotnet-issue-labeler bot added the area-CodeGen-coreclr label Aug 16, 2023
ghost added the untriaged label Aug 16, 2023
@ghost

ghost commented Aug 16, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@AndyAyersMS
Member

I have seen this too, and removing [DisassemblyDiagnoser] has fixed things.

@stephentoub
Member

removing [DisassemblyDiagnoser] has fixed things

Yeah, but unfortunately I need it :)

@AndyAyersMS
Member

Profiling (or other knowledge of which methods matter) plus a suitably crafted DOTNET_JitDisasm and DOTNET_JitStdOutFile might get you something you could use instead.
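
For illustration, here is one way those two variables could be set for the repro above without the diagnoser. This is only a sketch: it assumes BenchmarkDotNet's WithEnvironmentVariables job extension, and both the "Any" method filter and the /tmp/jitdisasm.txt path are placeholders I picked, not values anyone in this thread used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

// Assumption: "Any" (method filter) and the output path are placeholders to adapt.
var job = Job.ShortRun.WithEnvironmentVariables(
    new EnvironmentVariable("DOTNET_JitDisasm", "Any"),
    new EnvironmentVariable("DOTNET_JitStdOutFile", "/tmp/jitdisasm.txt"));

// The job carries the variables into the benchmark process;
// note there is no [DisassemblyDiagnoser] on the class below.
var config = DefaultConfig.Instance.AddJob(job);
BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args, config);

public class Tests
{
    IEnumerable<int> _source = Enumerable.Repeat(0, 1024);
    Random _rand = new();

    [Benchmark]
    public bool Any() => _rand.NextDouble() < 1.0 ?
        _source.Any(i => i == 42) :
        _source.Any(i => i == 43);
}
```

The JIT output then lands in a file rather than in the BenchmarkDotNet report, so it would still need to be collected by hand.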

@stephentoub
Member

stephentoub commented Aug 16, 2023

Profiling (or other knowledge of which methods matter) plus a suitably crafted DOTNET_JitDisasm and DOTNET_JitStdOutFile might get you something you could use instead.

Yes, it just doesn't work well for my blog post. As written, everyone will get crashes (on Linux), which is obviously not a good experience.

@adamsitnik
Member

It may be caused by something similar to #79846 which got fixed by @janvorli.

it randomly just quits silently in the middle of benchmarking

BDN attaches the disassembler after all iterations (to not affect the results), so I am not 100% sure.

@stephentoub
Member

The process is segfaulting, from:

OS Thread Id: 0xf0e6
        Child SP               IP Call Site
00007F77F0C99870                  [InlinedCallFrame: 00007f77f0c99870]
00007F77F0C99870                  [InlinedCallFrame: 00007f77f0c99870]
00007F77F0C99850 00007FB89FF6C8FD Microsoft.Diagnostics.Runtime.dll!ILStubClass.IL_STUB_PInvoke(IntPtr, Microsoft.Diagnostics.Runtime.DacInterface.ClrDataAddress, UInt64, Microsoft.Diagnostics.Runtime.DacInterface.MethodDescData ByRef, Int32, Microsoft.Diagnostics.Runtime.DacInterface.RejitData[], Int32 ByRef) + 285
00007F77F0C99930 00007FB89FF6C7B6 Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.DacInterface.SOSDac.GetMethodDescData(UInt64, UInt64, Microsoft.Diagnostics.Runtime.DacInterface.MethodDescData ByRef) + 134 [D:\a\_work\1\s\src\Microsoft.Diagnostics.Runtime\src\DacInterface\SosDac.cs @ 52]
00007F77F0C99990 00007FB89FF93243 Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Builders.RuntimeBuilder.CreateMethodFromHandle(UInt64) + 195 [D:\a\_work\1\s\src\Microsoft.Diagnostics.Runtime\src\Builders\RuntimeBuilder.cs @ 1297]
00007F77F0C99AB0 00007FB89FF93156 Microsoft.Diagnostics.Runtime.dll!Microsoft.Diagnostics.Runtime.Implementation.ClrmdRuntime.GetMethodByHandle(UInt64) + 70 [D:\a\_work\1\s\src\Microsoft.Diagnostics.Runtime\src\Implementation\ClrmdRuntime.cs @ 155]
00007F77F0C99AE0 00007FB89FF92388 BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.ClrMdV2Disassembler.TryTranslateAddressToName(UInt64, Boolean, BenchmarkDotNet.Disassemblers.State, Boolean, Int32, Microsoft.Diagnostics.Runtime.ClrMethod) + 568
00007F77F0C99C30 00007FB89FF75E33 BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.IntelDisassembler+<Decode>d__3.MoveNext() + 3219
00007F77F0C99FE0 00007FB89CDAD5A2 System.Private.CoreLib.dll!System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib]].AddRange(System.Collections.Generic.IEnumerable`1<System.__Canon>) + 290 [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs @ 269]
00007F77F0C9A030 00007FB89FF6FB6F BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.ClrMdV2Disassembler.DisassembleMethod(BenchmarkDotNet.Disassemblers.MethodInfo, BenchmarkDotNet.Disassemblers.State, BenchmarkDotNet.Disassemblers.Settings, BenchmarkDotNet.Diagnosers.DisassemblySyntax) + 2271
00007F77F0C9A3A0 00007FB89FF6EE10 BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.ClrMdV2Disassembler.Disassemble(BenchmarkDotNet.Disassemblers.Settings, BenchmarkDotNet.Disassemblers.State) + 416
00007F77F0C9A470 00007FB89FF3F184 BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.ClrMdV2Disassembler.AttachAndDisassemble(BenchmarkDotNet.Disassemblers.Settings) + 932
00007F77F0C9A610 00007FB89FF3E75F BenchmarkDotNet.dll!BenchmarkDotNet.Disassemblers.SameArchitectureDisassembler.Disassemble(BenchmarkDotNet.Diagnosers.DiagnoserActionParameters) + 143
00007F77F0C9A660 00007FB89E5E2A57 BenchmarkDotNet.dll!BenchmarkDotNet.Diagnosers.DisassemblyDiagnoser.Handle(BenchmarkDotNet.Engines.HostSignal, BenchmarkDotNet.Diagnosers.DiagnoserActionParameters) + 151
00007F77F0C9A6E0 00007FB89E5E28F1 BenchmarkDotNet.dll!BenchmarkDotNet.Diagnosers.CompositeDiagnoser.Handle(BenchmarkDotNet.Engines.HostSignal, BenchmarkDotNet.Diagnosers.DiagnoserActionParameters) + 225
00007F77F0C9A7D0 00007FB89E5E3948 BenchmarkDotNet.dll!BenchmarkDotNet.Loggers.Broker.ProcessDataBlocking() + 696
00007F77F0C9A920 00007FB89CB6EC92 System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) + 66 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 264]
00007F77F0C9A960 00007FB89CB854C3 System.Private.CoreLib.dll!System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread) + 179 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 2349]
00007F77F0C9A9E0 00007FB89E5D5583 System.Private.CoreLib.dll!System.Threading.ThreadPoolWorkQueue.Dispatch() + 419 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPoolWorkQueue.cs @ 913]
00007F77F0C9AA50 00007FB89CB807E7 System.Private.CoreLib.dll!System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart() + 327 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.WorkerThread.NonBrowser.cs @ 58]
00007F77F0C9AC50                  [DebuggerU2MCatchHandlerFrame: 00007f77f0c9ac50]

@janvorli
Member

I've reproed that locally too. ClrDataAccess::GetMethodDescData was passed a MethodDesc* with value 0xffffffffffffffff.

@janvorli
Member

BenchmarkDotNet.Disassemblers.ClrMdV2Disassembler.TryTranslateAddressToName is being passed an address with value 0xffffffffffffffff.

@janvorli
Member

janvorli commented Aug 18, 2023

It looks like it is basically the same problem I've fixed for MethodTable in the PR @adamsitnik has mentioned, only that this time, it is happening for MethodDesc at some other place in the same function.
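
To make the failure mode concrete, here is a minimal caller-side sketch of rejecting an all-ones handle before it reaches a lookup. IsPlausibleHandle, TryResolve and fakeLookup are hypothetical names used only for illustration; they are not BenchmarkDotNet or ClrMD APIs, and the actual fix was made in the runtime rather than in code like this.

```csharp
using System;

// Hypothetical guard: treat 0 and all-ones (-1 viewed as a pointer) as "no method here".
static bool IsPlausibleHandle(ulong handle) =>
    handle != 0 && handle != ulong.MaxValue;

// Hypothetical wrapper: only invoke the lookup when the handle passes the guard.
static string? TryResolve(ulong handle, Func<ulong, string?> lookup) =>
    IsPlausibleHandle(handle) ? lookup(handle) : null;

// Stand-in for a real query; a real caller would pass its own lookup here.
Func<ulong, string?> fakeLookup = h => $"method@0x{h:x}";

// An ordinary-looking address is forwarded to the lookup.
Console.WriteLine(TryResolve(0x00007FB89FF6C8FD, fakeLookup) ?? "<skipped>");
// The value seen in the crash is filtered out before any lookup happens.
Console.WriteLine(TryResolve(0xffffffffffffffff, fakeLookup) ?? "<skipped>");
```

A guard like this only hides the symptom on the caller's side; making the validation inside the data access layer tolerate the invalid value covers every caller.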

@janvorli
Member

I am trying a fix now

@janvorli
Member

The fix worked, I'll create a PR shortly.

janvorli added a commit to janvorli/runtime that referenced this issue Aug 18, 2023
Make both methods more resilient to the case of invalid MethodDesc
and MethodTable with value -1.

Close dotnet#90691
ghost added the in-pr label Aug 18, 2023
github-actions bot pushed a commit that referenced this issue Aug 18, 2023
Make both methods more resilient to the case of invalid MethodDesc
and MethodTable with value -1.

Close #90691
janvorli added the area-Diagnostics-coreclr label and removed the area-CodeGen-coreclr and untriaged labels Aug 18, 2023
janvorli added this to the 8.0.0 milestone Aug 18, 2023
@ghost

ghost commented Aug 18, 2023

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.


carlossanlop pushed a commit that referenced this issue Aug 18, 2023
Make both methods more resilient to the case of invalid MethodDesc
and MethodTable with value -1.

Close #90691

Co-authored-by: Jan Vorlicek <janvorli@microsoft.com>
janvorli added a commit that referenced this issue Aug 22, 2023
Make both methods more resilient to the case of invalid MethodDesc
and MethodTable with value -1.

Close #90691
ghost removed the in-pr label Aug 22, 2023
@AndyAyersMS
Member

Should we port this to RC1?

@ghost ghost locked as resolved and limited conversation to collaborators Sep 22, 2023