runtime/trace: flush trace data on non-throw crashes #65319
Comments
Note: this is also related to #63185 (flight recording), since it could make recovering trace data from a crash much more successful when flight recording is enabled. We could also consider letting users install an optional handler on the flight recorder that writes out trace data in these cases, though that should probably go in the flight recording proposal.
This also goes hand-in-hand with #65316: the crash still has to happen, so the tail end of the trace data is still likely to be broken.
Change https://go.dev/cl/562616 mentions this issue:
This ensures the trace buffers are as up-to-date as possible right before crashing. It increases the chance of finding the culprit for the crash when looking at core dumps, e.g. if slowness is the cause for the crash (monitor kills process).

Fixes golang#65319.

Change-Id: Iaf5551911b3b3b01ba65cb8749cf62a411e02d9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/562616
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
I'm wondering if we could do this in more cases. I think I see three top-level crashing functions (all calling
Which of these paths could we conceivably add a
If a Go program has tracing enabled and crashes, chances are that the most recent data (which is also the most useful data) won't be properly flushed, and the trace will be broken. We can discard the broken part of the trace in the tooling (#65316), but that doesn't change the fact that we might lose a lot of information.
The thing is, many crashes that only impact user program state (such as nil dereferences and uncaught-but-recoverable panics) can absolutely still go through with a global buffer flush (`runtime.traceAdvance`), since the runtime state is still OK.
I'd like to suggest explicitly flushing all trace data on an uncaught panic or on a crash due to some "easier" case, like a nil dereference, so that as much of the data as possible comes out intact.