fix(share/eds): print error for stuck register shard #2516
Codecov Report
@@ Coverage Diff @@
## main #2516 +/- ##
==========================================
- Coverage 52.60% 52.33% -0.28%
==========================================
Files 156 156
Lines 9995 10013 +18
==========================================
- Hits 5258 5240 -18
- Misses 4272 4305 +33
- Partials 465 468 +3
Is this still the case after all the hangups are removed?
The latest tests showed no hangups. Still, we don't want to lose the hangup event and all the related info if it happens.
Agreed, yet it sounds like this would be better as a metric.
I agree with @Wondertan, this is better recorded as a metric.
A metric would need logic implemented in this PR. There are 3 cases for hangups, so I would rather have specific log lines. Also, metrics are optional, but it will be important to be able to find those errors in the logs in case this problem happens again.
Have we discussed making Put async?
Async Put would just kick the can down the road.
Sometimes register shard can take longer than the provided context timeout. This PR logs an error if register shard is stuck.