-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement crash-tracking handler #282
Conversation
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.37.23 to 0.37.26. - [Release notes](https://github.com/bytecodealliance/rustix/releases) - [Commits](bytecodealliance/rustix@v0.37.23...v0.37.26) --- updated-dependencies: - dependency-name: rustix dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
* Bump webpki from 0.22.1 to 0.22.4 Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.1 to 0.22.4. - [Commits](https://github.com/briansmith/webpki/commits) --- updated-dependencies: - dependency-name: webpki dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> * Update LICENSE-3rdparty.yml --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ivo Anjo <ivo.anjo@datadoghq.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good! Thank you!
I'm not 100% sure how the CMake will work out, but I'm happy to accept things as-is. I think there might be value in revealing the source file through a variable; that way I can include this cmake in order to find the sourcefile, then add my own linker stuff (for instance, setting a runpath and giving it a wrapper library with suitably exposed symbols).
But again, I don't know and I'll have to play around with it to see how everything fits together. I can easily infer the source location based on other hints, so my comment here is probably non-actionable.
The rest still looks good.
19d5608
to
2486394
Compare
69e8759
to
0894ac8
Compare
c781d77
to
471fae9
Compare
Some tests with 'continue-on-error: true' have failed:
Created by continue-on-error-comment |
**What does this PR do?** This PR adds support for the libdatadog crash tracker feature (off by default). The crash tracker works by detecting when the Ruby VM reaches a segmentation fault, reporting the crash information as a last profile before the VM dies. All of the interesting work is in <DataDog/libdatadog#282>, this PR basically just wires things up. **Motivation:** This will be a useful tool when debugging VM crashes. **Additional Notes:** I'm opening this PR as a draft as the libdatadog support has not yet landed/been released. Also, there's a few open questions on: * fork handling * when to shut down **How to test the change?** (You'll need to build <<DataDog/libdatadog#282> until the crash tracker gets included in a libdatadog release) To test the crash tracker with an actual crash, try running the following on Ruby 2.6: ```bash $ DD_PROFILING_ENABLED=true DD_PROFILING_EXPERIMENTAL_CRASH_TRACKING_ENABLED=true DD_TRACE_DEBUG=true bundle exec ddtracerb exec ruby -e 'Process.detach(fork { exit! }).instance_variable_get(:@foo)' ``` This should also work in the future but right now it doesn't work correctly; still looking into why: ```bash $ DD_PROFILING_ENABLED=true DD_PROFILING_EXPERIMENTAL_CRASH_TRACKING_ENABLED=true DD_TRACE_DEBUG=true bundle exec ddtracerb exec ruby -e 'Process.kill("SEGV", Process.pid)' ```
What does this PR do?
This module implements a crash tracker that enables libdatadog using profilers to send crash-reports to the datadog backend.
It has three related parts:
A
CrashInfo
struct which stores information about a crash in a structured format, and has built-in functions to upload itself to the datadog backend.This struct can be generated by the crash-handlers described below (e.g. for python or Ruby), or by language runtime tools from a crash dump (e.g. .Net).
A UNIX crash-handler, which registers signal handlers to detect crashes.
The legal operations within a crash-handler are limited, so the handler does as much out of process as possible.
When a crash occurs, this handler collects relevant information, and puts it on a pipe to the receiver, which processes it out of process.
A binary application
libdatadog-crashtracking-receiver
, which receives the crash-report over a pipe and formats it for transmission to the backend.Motivation
Currently, if crashes occur in profiling code, there is no notification until we get a support ticket. Learning early will enable us to close the loop to faster customer resolution.
Additional Notes
There are a number of open TODOs: see the README.md
How to test the change?
Run the test in profiling/crashtracker/api
For Reviewers
@DataDog/security-design-and-guidance
.