Sentry #1461

Merged
merged 12 commits into from
Dec 9, 2020

Conversation

dconnolly
Contributor

@dconnolly dconnolly commented Dec 4, 2020

Motivation

Partly to fulfill #1286, but in general, long-running sync tests don't fit well into testing frameworks because (1) they run for a very long time, and (2) if the objective is 'sync to tip without panicking/erroring', that is basically the main use case. Therefore, instead of running a 'test case', we should run zebrad, appropriately configured, alert on panic/error/crash as a failure, and stay quiet if nothing goes wrong (i.e., there is never an endpoint at which a 'test' like this passes, only a possible 'fail').

Solution

When enabled, zebrad catches panics and other errors and reports them to our Sentry.io project. From there we can set up alerting, webhooks, and automatic GitHub issue opening if we want. Sentry support is behind a feature flag, so by default zebrad doesn't include it; when enabled, it reads the config for where to send events from a runtime environment variable. It's enabled for our Docker.build deployments, which are used to deploy automatically from the #main branch on every merge and on any Manual Deploy workflow run.
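
For reviewers who want the shape of this without reading the diff, here is a minimal std-only sketch of the pattern described above. It is not the code in this PR. The names `DSN_ENV_VAR`, `reporting_dsn`, `dsn_from_env`, and `install_panic_reporter` are hypothetical, the variable name `SENTRY_DSN` is an assumption, and the `report` callback stands in for whatever the real Sentry client does with an event.

```rust
use std::env;
use std::panic;

/// Assumed name of the runtime environment variable holding the DSN.
pub const DSN_ENV_VAR: &str = "SENTRY_DSN";

/// Normalizes a raw variable value: unset or blank means "do not report".
pub fn reporting_dsn(raw: Option<String>) -> Option<String> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}

/// Reads the DSN from the process environment at runtime.
pub fn dsn_from_env() -> Option<String> {
    reporting_dsn(env::var(DSN_ENV_VAR).ok())
}

/// Installs a panic hook that passes the panic text to `report`, then
/// calls the previously installed hook so normal panic output is kept.
///
/// In a Cargo project, the call site would sit behind something like
/// `#[cfg(feature = "...")]` so that default builds do not include it.
pub fn install_panic_reporter<F>(report: F)
where
    F: Fn(&str) + Send + Sync + 'static,
{
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        // The hook argument implements `Display` (location and message).
        report(&info.to_string());
        previous(info);
    }));
}
```

At startup, a caller would check `dsn_from_env()` and only call `install_panic_reporter` when it returns `Some`, which keeps the binary silent unless the deployment environment opts in.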

Review

Related Issues

#1286

Follow Up Work

Twiddle the alerting/webhook/GitHub integration configs to our liking; that does not require any code changes.

Inject our DSN more smartly via GitHub secrets management, through the workflow and into the GitHub container execution environment. The current approach (inline in the thin release image) works fine for now, and the DSN can be rotated later.

I can't tell whether this works on macOS; I've only successfully triggered events/caught panics on Debian with the ca-certificates package installed at runtime.

@dconnolly dconnolly added this to the First Alpha Release milestone Dec 4, 2020
@dconnolly dconnolly self-assigned this Dec 4, 2020
@dconnolly dconnolly linked an issue Dec 4, 2020 that may be closed by this pull request
@dconnolly dconnolly added the S-blocked Status: Blocked on other tasks label Dec 5, 2020

@mpguerra mpguerra removed this from the First Alpha Release milestone Dec 7, 2020
@teor2345 teor2345 marked this pull request as draft December 7, 2020 21:40
@dconnolly dconnolly requested a review from yaahc December 8, 2020 23:24
@dconnolly dconnolly marked this pull request as ready for review December 9, 2020 01:16
@dconnolly dconnolly added A-infrastructure Area: Infrastructure changes and removed S-blocked Status: Blocked on other tasks labels Dec 9, 2020
@dconnolly dconnolly merged commit cff28f7 into main Dec 9, 2020
@dconnolly dconnolly deleted the sentry branch December 9, 2020 18:06
Labels
A-infrastructure Area: Infrastructure changes
Development

Successfully merging this pull request may close these issues.

Add mainnet node canary that fails when syncing to tip fails/panics
3 participants