Test process crash on Microsoft.CodeAnalysis.EditorFeatures.UnitTests.dll #55639
Comments
We have no dumps for any of these failures, so I'm investigating that first. I believe this is due to incorrect configuration when we pass our test runs off to Helix. Example failure: Pipelines - Run 20210812.14 logs (azure.com). The issue, I think, is that our RunTests project is not correctly configuring the Windows dump collection registry key on the Helix work item. The dump collection is configured here: roslyn/Program.cs at 315c2e1 · dotnet/roslyn (github.com), which runs before the code where we pass the job off to Helix (roslyn/Program.cs at 315c2e1 · dotnet/roslyn (github.com)). If so, the key is presumably being set on the machine that submits the job rather than on the machine that executes the work item.
After discussing with coreeng: the Windows Error Reporting (WER) registry keys should already be set on the machines running Helix work items. However, some other user could have changed or wiped these values. Next step: if we still get failures with no dumps even when the keys are set, it likely means WER is not capturing these failures.
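To make "the keys are set" checkable from a work item's log, a check along these lines could run before the tests start. This is a minimal sketch, not the RunTests implementation: it assumes the key in question is the documented WER `LocalDumps` key, it is written in Python purely for illustration, and the expected values (`C:\dumps`, a count of 10) and the helper names `read_local_dumps_values` and `find_mismatches` are hypothetical.

```python
import sys

# Documented registry location for WER user-mode dump collection (under HKLM).
LOCAL_DUMPS_KEY = r"SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"

# Hypothetical expected values; the folder and count are placeholders.
# DumpType 2 means "full dump" in the WER LocalDumps settings.
EXPECTED = {"DumpFolder": r"C:\dumps", "DumpType": 2, "DumpCount": 10}


def read_local_dumps_values(names):
    """Read the named values from HKLM; returns {} off Windows or if the key is missing."""
    if sys.platform != "win32":
        return {}
    import winreg  # Windows-only standard library module

    found = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, LOCAL_DUMPS_KEY) as key:
            for name in names:
                try:
                    found[name], _ = winreg.QueryValueEx(key, name)
                except FileNotFoundError:
                    pass  # value not present; reported as a mismatch below
    except FileNotFoundError:
        pass  # the whole key is missing
    return found


def find_mismatches(expected, actual):
    """Return {name: (expected, actual_or_None)} for each missing or differing value."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


if __name__ == "__main__":
    actual = read_local_dumps_values(EXPECTED)
    for name, (want, got) in find_mismatches(EXPECTED, actual).items():
        print(f"{name}: expected {want!r}, found {got!r}")
```

Logging the mismatches, rather than silently rewriting the values, keeps a record of whether another process wiped the settings between runs.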
I made a change to always throw in the NFW (non-fatal Watson) path to check dump uploading. Only one work item with Microsoft.CodeAnalysis.EditorFeatures.UnitTests.dll failed to upload a dump; other work items running the same tests (but on different legs) uploaded dumps successfully. In all the runs, however, the registry keys look correctly set.
I do notice multiple calls to FailFast in the run that failed to upload a dump. Next step is to add more logging to verify that no dump was produced.
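The extra logging could be as simple as listing the dump folder once the test process exits. The sketch below is illustrative Python rather than the actual RunTests change; it assumes dumps are written as `*.dmp` files somewhere under a single folder, and the names `find_dumps` and `report_dumps` are hypothetical.

```python
import logging
from pathlib import Path

log = logging.getLogger("dump-check")


def find_dumps(dump_dir, since=0.0):
    """Return .dmp files under dump_dir modified at or after `since` (epoch seconds)."""
    root = Path(dump_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.dmp") if p.stat().st_mtime >= since)


def report_dumps(dump_dir, since=0.0):
    """Log what was found so a missing dump is visible in the work item's console log."""
    dumps = find_dumps(dump_dir, since)
    if not dumps:
        log.warning("No dump files found under %s", dump_dir)
    for path in dumps:
        log.info("Found dump %s (%d bytes)", path, path.stat().st_size)
    return dumps
```

Passing the test run's start time as `since` would filter out stale dumps left behind by earlier work items on the same machine.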
While attempting to reproduce the issue locally in #55657, I managed to crash the test process numerous times without producing a dump file (though it is very inconsistent). In those cases, as best I can tell, WER is not attempting to do anything, even though the Windows Event Viewer does log the test host process crash.
Some notes. All the debug failures that did not upload dumps show multiple calls to the NFW handling, which result in multiple Environment.FailFast calls. I think there is some kind of race where calling Environment.FailFast multiple times causes Windows Error Reporting to not upload a dump. I don't know exactly how to follow that thread further (likely need to get in touch with runtime? / WER folks). So for now my immediate next step is to see whether procdump will capture dumps locally when WER does not. If it does, I plan to switch the test runs over to procdump temporarily in the hope of capturing some dumps of the actual CI failures.
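For the local procdump experiment, the test process can be launched under ProcDump so that dump capture no longer depends on WER. This is a hedged sketch in Python for illustration only: the flags are standard ProcDump options, but whether `-e` reliably fires for a process terminated by Environment.FailFast is exactly what the experiment would need to confirm, and the helper names and default executable path are hypothetical.

```python
import subprocess


def procdump_command(dump_dir, image, args=(), procdump="procdump.exe"):
    """Build a command line that launches `image` under ProcDump.

    -accepteula  accept the Sysinternals license non-interactively
    -ma          write a full memory dump
    -e           write a dump when the process hits an unhandled exception
    -x           launch the image with its arguments, writing dumps to dump_dir
    """
    return [procdump, "-accepteula", "-ma", "-e", "-x", str(dump_dir), str(image), *args]


def run_under_procdump(dump_dir, image, args=()):
    """Run the process under ProcDump and return the exit code (Windows only)."""
    return subprocess.run(procdump_command(dump_dir, image, args), check=False).returncode
```

Adding ProcDump's `-t` option (dump on process termination) may be worth trying if `-e` alone does not produce a dump for the FailFast case.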
#55939 potentially fixes the issue with collecting dumps.
Latest dump from a debug + non-Spanish run is here: https://dev.azure.com/dnceng/public/_build/results?buildId=1341334&view=ms.vss-test-web.build-test-results-tab&runId=39293838&resultId=216584&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab Looks like calling
This hasn't shown up in runfo in quite a while, so closing.
In multiple runs, the test runner crashes during the Microsoft.CodeAnalysis.EditorFeatures.UnitTests.dll Helix work item.
Example output from console log (from https://runfo.azurewebsites.net/view/build/?number=1290779)