Add NT_SIGINFO NOTE to ELF dumps #83059

Merged: 1 commit, Mar 8, 2023

@mikem8361 commented Mar 7, 2023

Customer Impact

Linux Watson needs this to better triage ELF dumps. 1st party teams have asked for this.

Issue: #40958

This change updates createdump so that windbg/Watson can determine which thread actually crashed (via the .lastevent command). The NT_SIGINFO record has been missing from Linux core dumps, causing the wrong thread (the startup thread) to be blamed for the crash. This breaks Watson bucketing.

The underlying issue is that we do not put enough data in Linux core dumps: the “crashing thread” isn’t marked as the one of interest. Without this information, the debugger assumes that the 0th thread (usually the startup thread) is the guilty party.

When an automated debugging service such as Watson/!analyze comes along, it cannot properly triage the bug. Instead of blaming the correct thread (with the correct exception), it blames the non-crashing “crashing thread” (usually just the main function doing nothing, sitting in a wait call). In effect, this renders Azure Watson bucketing useless for all of our partner teams and customers.
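
For illustration only, here is a minimal Python sketch of how a dump reader could look for the NT_SIGINFO note in the raw bytes of a PT_NOTE segment. It is not the windbg or createdump implementation. It assumes a little-endian 64-bit Linux core, the standard ELF note layout (12-byte header, name and descriptor each padded to 4 bytes), and the usual 64-bit `siginfo_t` layout where `si_addr` sits at offset 16 for fault signals. The function names `iter_notes` and `find_siginfo` are hypothetical.

```python
import struct

NT_SIGINFO = 0x53494749  # note type "SIGI", as defined in <elf.h>


def iter_notes(buf):
    """Yield (name, type, desc) for each ELF note in a PT_NOTE segment's bytes."""
    offset = 0
    while offset + 12 <= len(buf):
        # Note header: namesz, descsz, type (three 32-bit words).
        namesz, descsz, ntype = struct.unpack_from("<III", buf, offset)
        offset += 12
        name = buf[offset:offset + namesz].rstrip(b"\0")
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = buf[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        yield name, ntype, desc


def find_siginfo(buf):
    """Return the decoded signal info from the first NT_SIGINFO note, or None."""
    for name, ntype, desc in iter_notes(buf):
        if name == b"CORE" and ntype == NT_SIGINFO:
            # Assumed layout: si_signo, si_errno, si_code, padding, si_addr.
            signo, errno, code = struct.unpack_from("<iii", desc, 0)
            (address,) = struct.unpack_from("<Q", desc, 16)
            return {"signo": signo, "errno": errno, "code": code, "address": address}
    return None
```

A dump without this note gives `find_siginfo` nothing to return, which mirrors the situation described above: the reader has no record of which signal was raised or where.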

Added an "ExceptionType" field to the "Parameters" section of the Linux crash report JSON.

Testing

All the SOS diagnostics tests pass with these changes.

Risk

Low. Createdump/core generation only.

Linux Watson needs this to better triage ELF dumps.

Add a CreateDumpOptions helper struct to pass all the command options around. Add the
"--code", "--errno", and "--address" command line options used to fill the NT_SIGINFO
NOTE. The runtime passes these to createdump on a crash.
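
As a rough illustration of the idea (not the actual C++ createdump code), the Python sketch below parses the three options into an options object and packs them into an NT_SIGINFO note. Everything beyond the option names is an assumption: the field names of `CreateDumpOptions`, the accepted value formats (decimal or `0x` hex), the separate `signo` argument, a little-endian target, and the 128-byte 64-bit Linux `siginfo_t` layout with `si_addr` at offset 16.

```python
import argparse
import struct
from dataclasses import dataclass

NT_SIGINFO = 0x53494749  # note type "SIGI", as defined in <elf.h>
SIGINFO_SIZE = 128       # sizeof(siginfo_t) on 64-bit Linux


@dataclass
class CreateDumpOptions:
    # Hypothetical fields; the real struct carries all createdump options.
    code: int = 0
    errno: int = 0
    address: int = 0


def parse_options(argv):
    """Parse the three signal-related options named in the commit message."""
    number = lambda text: int(text, 0)  # assumption: decimal or 0x-prefixed hex
    parser = argparse.ArgumentParser(prog="createdump-sketch")
    parser.add_argument("--code", type=number, default=0)
    parser.add_argument("--errno", type=number, default=0)
    parser.add_argument("--address", type=number, default=0)
    args = parser.parse_args(argv)
    return CreateDumpOptions(code=args.code, errno=args.errno, address=args.address)


def build_siginfo_note(signo, options):
    """Return the bytes of a CORE/NT_SIGINFO note for the given signal."""
    # Assumed layout: si_signo, si_errno, si_code, padding, si_addr, zero fill.
    desc = struct.pack("<iiiiQ", signo, options.errno, options.code, 0, options.address)
    desc = desc.ljust(SIGINFO_SIZE, b"\0")
    name = b"CORE\0"
    header = struct.pack("<III", len(name), len(desc), NT_SIGINFO)
    padded_name = name + b"\0" * (-len(name) % 4)  # pad name to 4 bytes
    return header + padded_name + desc
```

For example, `build_siginfo_note(11, parse_options(["--code", "1", "--address", "0x7f00"]))` would describe a SIGSEGV (signal 11) with code 1 at address `0x7f00`.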

ghost commented Mar 7, 2023

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Author: mikem8361
Assignees: mikem8361
Labels: area-Diagnostics-coreclr
Milestone: -

@jeffschwMSFT left a comment

Approved. We will take this for consideration in 7.0.x. Please get a code review.

@jeffschwMSFT added the Servicing-consider label (Issue for next servicing release review) Mar 7, 2023
@jeffschwMSFT added this to the 7.0.x milestone Mar 7, 2023
@rbhanda modified the milestones: 7.0.x, 7.0.5 Mar 7, 2023
@rbhanda added the Servicing-approved label (Approved for servicing release) and removed the Servicing-consider label (Issue for next servicing release review) Mar 7, 2023
@carlossanlop

Approved by Tactics.
Signed-off by area owners.
CI can't get any greener than this.
No OOB changes needed (native).
Ready to merge. :shipit:

@carlossanlop merged commit dbb333c into dotnet:release/7.0 Mar 8, 2023
@mikem8361 deleted the nt_siginfo7 branch March 21, 2023 22:55
@ghost locked as resolved and limited conversation to collaborators Apr 21, 2023