Add check and handle negative SearchRequestStats #16569
Force-pushed from 1963e6b to e3db948
❌ Gradle check result for e3db948: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Force-pushed from e3db948 to b72b74d
Thanks @dzane17, this looks pretty interesting. Are you tracking the investigation of this in another GitHub issue? I'm curious which search phase specifically you're seeing this in, since having negative time values anywhere is quite concerning.
In particular, this block of code comes to mind, where for scroll requests the fetch phase can be performed before the query phase is completed: OpenSearch/server/src/main/java/org/opensearch/search/SearchService.java, lines 836 to 872 in 456ca97
and since the
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16569      +/-   ##
============================================
+ Coverage     72.08%   72.10%   +0.02%
+ Complexity    65099    65079      -20
============================================
  Files          5315     5315
  Lines        303572   303581       +9
  Branches      43925    43927       +2
============================================
+ Hits         218817   218892      +75
+ Misses        66861    66730     -131
- Partials      17894    17959      +65
@jed326 Here is the GitHub issue: #16598. Actually, regarding this code block in
I am wondering if we should have a generic way of ignoring negative values (writeUnsignedLong), especially for stats-type use cases, instead of handling it separately in every place. Missing any occurrence can result in node drops, which is never desirable IMO. @jed326 - Thoughts?
Signed-off-by: David Zane <davizane@amazon.com>
Force-pushed from b72b74d to e48ffba
@jainankitk I'm actually a little apprehensive about ignoring the negative values at all. This PR is adding negative checking to both
❌ Gradle check result for e48ffba: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
+1. I suspect a situation where both the failure and success code paths are called (with rare failures), but I wasn't able to find it in about an hour of searching.
This PR is stalled because it has been open for 30 days with no activity.
Description
We identified a bug where negative search stats are causing an exception when processed using `VLong` in the following line: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/search/stats/SearchStats.java#L89
This issue occurs because `VLong` does not handle negative values, resulting in an exception.
Solution
We have implemented a temporary fix that adds a check for negative values before calling `writeVLong`. If a negative value is encountered, a warning is logged, and `0` is written instead of the negative value.
This fix addresses the immediate problem but does not solve the root cause. We will continue to investigate why search stats are negative in the first place.
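The guard described under Solution can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the class `NegativeStatGuardSketch`, the helper `writeStat`, the stat names, and the simplified `writeVLong` stand-in (which only mimics a variable-length encoder that rejects negative input) are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Logger;

// Hypothetical sketch: none of these names come from the OpenSearch code base.
class NegativeStatGuardSketch {
    private static final Logger LOGGER = Logger.getLogger("NegativeStatGuardSketch");

    // Stand-in for a variable-length long encoder: 7 payload bits per byte,
    // high bit set on every byte except the last. Negative input is rejected.
    static void writeVLong(ByteArrayOutputStream out, long value) {
        if (value < 0) {
            throw new IllegalStateException("Negative longs unsupported [" + value + "]");
        }
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // The guard: log a warning and write 0 instead of a negative stat,
    // so serialization does not throw.
    static void writeStat(ByteArrayOutputStream out, String statName, long value) {
        if (value < 0) {
            LOGGER.warning("Negative value for " + statName + ": " + value + "; writing 0 instead");
            value = 0;
        }
        writeVLong(out, value);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeStat(out, "queryTimeInMillis", 300L); // encoded normally
        writeStat(out, "queryCurrent", -1L);       // clamped to 0, warning logged
        System.out.println(out.size() + " bytes written");
    }
}
```

As the description notes, a guard like this only keeps serialization from failing; it hides the negative value rather than explaining it, so the warning log is what preserves the evidence for the root-cause investigation.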
Related Issues
#16598
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.