Fix profiler evacuation loop logic #99075

Merged: 2 commits, Feb 29, 2024

davmason
Member

This way we get a crash dump when it fails.

@davmason davmason added this to the 9.0.0 milestone Feb 28, 2024
@davmason davmason requested a review from a team February 28, 2024 19:54
@davmason davmason self-assigned this Feb 28, 2024
@ghost

ghost commented Feb 28, 2024

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

This way we get a crash dump when it fails.

Author: davmason
Assignees: davmason
Labels:

area-Diagnostics-coreclr

Milestone: 9.0.0

@AndyAyersMS
Member

I thought this was merged because I have a failure with a dump, but ...?

https://dev.azure.com/dnceng-public/public/_build/results?buildId=583200&view=ms.vss-test-web.build-test-results-tab

@jkotas
Member

jkotas commented Feb 29, 2024

https://dev.azure.com/dnceng-public/public/_build/results?buildId=583200&view=ms.vss-test-web.build-test-results-tab

@AndyAyersMS Thank you for sharing the link to the dumps!

The profiled process is stuck waiting without making progress in ProfilingAPIDetach::ExecuteEvacuationLoop:

coreclr!CLREventWaitHelper::__l4::__Body::Run+0xc [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 397] 
coreclr!CLREventWaitHelper+0x28 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 399] 
coreclr!CLREventBase::WaitEx+0xe4 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 466] 
coreclr!CLREventBase::Wait+0x114 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 412] 
coreclr!ProfilingAPIDetach::ExecuteEvacuationLoop+0x298 [D:\a\_work\1\s\src\coreclr\vm\profdetach.cpp @ 298] 
coreclr!ProfilingAPIDetach::ProfilingAPIDetachThreadStart+0x84 [D:\a\_work\1\s\src\coreclr\vm\profdetach.cpp @ 569] 
kernel32!BaseThreadInitThunk+0x30 [clientcore\base\win32\client\thread.c @ 77] 
ntdll!RtlUserThreadStart+0x3c [minkernel\ntdll\rtlstrt.c @ 1166] 

even though there are s_profilerDetachInfos to clean up:

0:003> dt coreclr!ProfilingAPIDetach::s_profilerDetachInfos
   +0x000 pbBuff           : (null) 
   +0x008 iSize            : 0x60
   +0x010 cbTotal          : 0x200
   +0x018 rgData           : [64] 0x00007ffe`04e08e08
   +0x218 m_curSize        : 1

I think that the problem is in the following code:

for (SIZE_T pos = 0; pos < s_profilerDetachInfos.Size(); ++pos)
{
    ProfilerDetachInfo current = s_profilerDetachInfos.Pop();

If two profilers want to detach at the same time, s_profilerDetachInfos is going to have two elements (s_profilerDetachInfos.Size() == 2) before this loop starts. After the first iteration of the loop, s_profilerDetachInfos.Size() is going to be 1, but pos is going to be 1 as well, so the loop exits and we miss cleaning up the second profiler. @davmason If you agree with my analysis, could you please submit a PR with a fix?
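To make the analysis above concrete, here is a minimal standalone sketch of the same loop shape. DetachInfoStack, MakeStack, DrainWithIndexedLoop, and DrainUntilEmpty are hypothetical names invented for illustration; they are not coreclr types, and the element type is reduced to an int. The "loop until empty" variant is only one plausible way to correct the condition and is not necessarily what the commit in this PR does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the runtime's detach-info stack. Only Size() and
// Pop() are modeled; each entry is reduced to an int id.
class DetachInfoStack {
public:
    void Push(int id) { items_.push_back(id); }
    int Pop() {
        int id = items_.back();
        items_.pop_back();
        return id;
    }
    std::size_t Size() const { return items_.size(); }

private:
    std::vector<int> items_;
};

// Builds a stack holding `count` entries (ids 0 .. count-1).
DetachInfoStack MakeStack(std::size_t count) {
    DetachInfoStack stack;
    for (std::size_t i = 0; i < count; ++i) {
        stack.Push(static_cast<int>(i));
    }
    return stack;
}

// Same shape as the quoted loop: pos increases while Size() decreases, so the
// two meet in the middle. Returns the number of entries left behind.
std::size_t DrainWithIndexedLoop(DetachInfoStack stack) {
    for (std::size_t pos = 0; pos < stack.Size(); ++pos) {
        stack.Pop();
    }
    return stack.Size();
}

// Assumed alternative: keep popping until the stack reports empty.
// Returns the number of entries left behind.
std::size_t DrainUntilEmpty(DetachInfoStack stack) {
    while (stack.Size() > 0) {
        stack.Pop();
    }
    return stack.Size();
}
```

With two entries, DrainWithIndexedLoop(MakeStack(2)) leaves one entry behind, which is consistent with m_curSize being 1 in the dump, while DrainUntilEmpty(MakeStack(2)) leaves none.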

@davmason davmason changed the title from "Throw exception when we see timeout in Profiler test" to "Fix profiler evacuation loop logic" Feb 29, 2024
@davmason
Member Author

I just pushed a commit with a fix for the loop condition

@davmason davmason linked an issue Feb 29, 2024 that may be closed by this pull request
@hoyosjs hoyosjs merged commit 207f2bb into dotnet:main Feb 29, 2024
121 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 31, 2024
Development

Successfully merging this pull request may close these issues.

profiler\\multiple\\multiple\\multiple.cmd failing on windows arm64
4 participants