-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flaky test in testApproximateRangeWithSizeOverDefault by adjusting totalHits assertion logic #16434
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #16434 +/- ##
============================================
- Coverage 72.09% 71.94% -0.15%
+ Complexity 65013 64943 -70
============================================
Files 5313 5313
Lines 303315 303315
Branches 43888 43888
============================================
- Hits 218661 218211 -450
- Misses 66721 67207 +486
+ Partials 17933 17897 -36 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! This is probably because the approximation tries to chop off at size
results for each segment. If the documents are split across multiple segments, we can end up returning all of the docs (since the approximation collects everything for each).
Thanks for fixing this @inpink!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to see a test for an "approximate" query has "approximate" tests ;)
…ing totalHits assertion logic (opensearch-project#15807) - Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000. - Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`). - This change prevents intermittent test failures caused by Lucene’s performance optimizations. Signed-off-by: inpink <inpink@kakao.com>
@msfroh Glad I could help! Thanks for clarifying the behavior with segments :) It makes a lot more sense now. Also, it’s ready for merge! |
Thank you, @dbwiddis ! I’m glad the test now appropriately reflects the “approximate” nature of the query. XD |
…ing totalHits assertion logic (#15807) (#16434) - Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000. - Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`). - This change prevents intermittent test failures caused by Lucene’s performance optimizations. Signed-off-by: inpink <inpink@kakao.com> (cherry picked from commit 66f0110) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ing totalHits assertion logic (#15807) (#16434) (#16459) - Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000. - Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`). - This change prevents intermittent test failures caused by Lucene’s performance optimizations. (cherry picked from commit 66f0110) Signed-off-by: inpink <inpink@kakao.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This PR addresses an issue in the
testApproximateRangeWithSizeOverDefault
method ofApproximatePointRangeQueryTests
, where the test would occasionally fail due to how Lucene handles total hits.By default,
search()
in Lucene'sIndexSearcher
provides an accurate count for up to 1000 hits.Beyond this threshold, Lucene may return a lower bound using GREATER_THAN_OR_EQUAL_TO for performance reasons (refer to the Lucene IndexSearcher documentation.)
In
testApproximateRangeWithSizeOverDefault
, the search range includes 12,001 documents, and the test would sometimes fail whenGREATER_THAN_OR_EQUAL_TO
occurred during the search.Changes:
EQUAL_TO
, the test checks for an exact count of 11000.GREATER_THAN_OR_EQUAL_TO
, the test ensures the hits are no less than 11000 and within the upper bound (maxHits).This issue is similar to OpenSearch PR #4270. I resolved it in a similar way. Special thanks to @dbwiddis for the valuable guidance.
Related Issues
Related #15807
Check List
API changes companion pull request created, if applicable.Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.