CheckpointLogger changes to improve signal #823
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## main #823 +/- ##
==========================================
- Coverage 98.06% 98.01% -0.05%
==========================================
Files 444 444
Lines 35474 35540 +66
==========================================
+ Hits 34786 34835 +49
- Misses 688 705 +17
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
@@ Coverage Diff @@
## main #823 +/- ##
==========================================
- Coverage 98.06% 98.03% -0.03%
==========================================
Files 444 444
Lines 35474 36477 +1003
==========================================
+ Hits 34786 35760 +974
- Misses 688 717 +29
f3bace2 to 6f07d01
UploadFlow metrics are not very trustworthy. there have always been "flow endings" that were missed by my first pass at instrumentation (on_timeout and on_failure task callbacks), and recent refactors of the upload flow and integration of new features have added a bunch more + logged some non-coverage endings in the coverage flow
because of the missing endings, we are probably underreporting errors and the reliability rate appears higher than it should be. also the notification_latency metric is probably inaccurate, but i don't know whether that is biased high or low
metrics that require all exit paths from such a huge expanse of code to be instrumented manually are weak, but they are all we have in today's platform for this sort of topline. this PR attempts to improve the signal so we can better gauge the health of today's platform, and we can use that insight to set goals for future work
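a minimal sketch of why missing flow endings can push the reliability rate up. every count below is made up for illustration, and `observed_reliability` is a hypothetical helper, not something from this repo. it assumes reliability is computed as successes over recorded endings, which may not match how the dashboards actually compute it.

```python
def observed_reliability(successes: int, errors: int) -> float:
    """Share of recorded flow endings that were successes."""
    recorded = successes + errors
    if recorded == 0:
        raise ValueError("no recorded flow endings")
    return successes / recorded


# Hypothetical counts: 1000 flows began, but only 500 logged an ending.
begun = 1000
recorded_successes = 480
recorded_errors = 20
missing = begun - recorded_successes - recorded_errors  # flows with no ending

# What a dashboard would show if it only sees recorded endings.
apparent = observed_reliability(recorded_successes, recorded_errors)

# Assumption for illustration: 60% of the missing endings were really errors.
missing_errors = missing * 60 // 100
missing_successes = missing - missing_errors

# What the rate would be if every ending had been captured.
actual = observed_reliability(
    recorded_successes + missing_successes,
    recorded_errors + missing_errors,
)

print(f"apparent={apparent:.2f} actual={actual:.2f}")
```

with these invented numbers the apparent rate is well above the rate you would get with every ending captured. the gap comes entirely from the assumed skew of the missing endings toward errors.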
full details
review by commit, not whole PR
- […] CheckpointLogger state to LogContext, auto-load from kwargs in base task
- […] on_failure() and on_timeout() handlers to access the latest checkpoints data and not just what was initially passed in kwargs
- […] CheckpointLogger class anymore lol
- […] UploadFlow event outside of an upload flow
- […] UploadFlow and TestResultFlow flows in celery on_failure() and on_timeout() handlers
- […] UploadTask only log UploadFlow checkpoints for coverage uploads (test results are a different flow)
background:
UploadFlow's reliability metrics are useful to track but the actual current value is probably not accurate. some 50% of the flow endings are apparently not captured and we have never dug into why. the missing endings are probably more slanted towards errors, which means the reliability in our dashboards appears higher than it really is
a more accurate understanding of our reliability will help us decide what/how much work to do to revamp the platform
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.