Fix flaky gauge value issue #13679
Thanks @tibrewalpratik17 for identifying the issue, really appreciate it! Could you help add a unit test in |
Minor but LGTM. Thanks!
pinot-common/src/test/java/org/apache/pinot/common/metrics/AbstractMetricsTest.java
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #13679 +/- ##
============================================
+ Coverage 61.75% 61.97% +0.21%
+ Complexity 207 198 -9
============================================
Files 2436 2554 +118
Lines 133233 140567 +7334
Branches 20636 21868 +1232
============================================
+ Hits 82274 87111 +4837
- Misses 44911 46820 +1909
- Partials 6048 6636 +588
Flags with carried forward coverage won't be shown.
Fix for #13629
Currently, we maintain a map, `_gaugeValues`, in the `AbstractMetrics` class for gauge values. When adding a new gauge, we hold a lock across the entire operation, so the gauge is added to the `_gaugeValues` map and registered with the `metricsRegistry` factory together (Ref). However, there was a bug: the removal was not done under a lock. This led to a race condition when `removeGauge` and `setValueOfGauge` were called almost simultaneously. In one possible interleaving, `removeGauge` removes the gauge from the `_gaugeValues` map; `setValueOfGauge` then takes the lock on the map, adds the gauge back to `_gaugeValues`, and registers it with `_metricsRegistry`; finally, the `removeFromMetricsRegistry` call in `removeGauge` kicks in and removes the gauge from `_metricsRegistry`. The gauge is then left in `_gaugeValues` but missing from `_metricsRegistry`.

This was specifically caught in our scenario by the way we have implemented our `_metricsRegistry`. We have a thread that loops through all the gauges every 10 seconds and emits their values. In our scenario, `_metricsRegistry` was missing the gauge altogether because of the above race condition.
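
To make the race and the shape of the fix concrete, here is a minimal, self-contained sketch. It is not Pinot's actual code: the class `GaugeRaceSketch`, the plain-map stand-in for the metrics registry, and the helpers `isRegistered` and `isConsistent` are all hypothetical. It only assumes what the description above states, namely that adding a gauge updates both structures under one lock, and it shows removal taking that same lock so the two structures cannot drift apart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for AbstractMetrics; illustrative only, not Pinot's code. */
public class GaugeRaceSketch {
  // Plays the role of _gaugeValues: the current value holder per gauge name.
  private final Map<String, AtomicLong> _gaugeValues = new HashMap<>();
  // Assumption: a plain map stands in for the real metrics registry.
  private final Map<String, LongSupplier> _metricsRegistry = new HashMap<>();

  /** Sets the gauge value, creating and registering the gauge if it is absent. */
  public void setValueOfGauge(String name, long value) {
    synchronized (_gaugeValues) {
      AtomicLong holder = _gaugeValues.get(name);
      if (holder != null) {
        holder.set(value);
        return;
      }
      // Both structures are updated while the lock is held.
      final AtomicLong created = new AtomicLong(value);
      _gaugeValues.put(name, created);
      _metricsRegistry.put(name, created::get);
    }
  }

  /** Removes the gauge from both structures under the same lock as the add path. */
  public void removeGauge(String name) {
    synchronized (_gaugeValues) {
      _gaugeValues.remove(name);
      _metricsRegistry.remove(name);
    }
  }

  /** True if the gauge is present in the registry stand-in. */
  public boolean isRegistered(String name) {
    synchronized (_gaugeValues) {
      return _metricsRegistry.containsKey(name);
    }
  }

  /** True if both structures hold exactly the same gauge names. */
  public boolean isConsistent() {
    synchronized (_gaugeValues) {
      return _gaugeValues.keySet().equals(_metricsRegistry.keySet());
    }
  }

  public static void main(String[] args) throws InterruptedException {
    GaugeRaceSketch metrics = new GaugeRaceSketch();
    // One thread keeps setting the gauge while another keeps removing it.
    Thread setter = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        metrics.setValueOfGauge("gauge", i);
      }
    });
    Thread remover = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        metrics.removeGauge("gauge");
      }
    });
    setter.start();
    remover.start();
    setter.join();
    remover.join();
    System.out.println("consistent: " + metrics.isConsistent());
  }
}
```

If the `synchronized` block in `removeGauge` were dropped and the two removals ran as separate unsynchronized steps, the interleaving described above could leave the gauge in `_gaugeValues` but absent from the registry, which is the state a periodic emitter thread would see as a missing gauge.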