-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Add skiplist optimization to auto_date_histogram aggregation #20057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add skiplist optimization to auto_date_histogram aggregation #20057
Conversation
|
❌ Gradle check result for 27f8248: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Functionally correct but not showing improvement QueryResult |
|
@asimmahmood1 - Have you verified that the regression is only with |
|
So I figured out why performance was not up to par. Short answer is when auto date moves onto large intervals, we need to track of not just preparedRounding but also bucketOrds. If preparedRounding has changed since last time, we need to restart the skiplist logic. Otherwise, we'll collect too many docs, and although the end result doesn't change (i.e. unit test passes), performance is too low. Auto date histo has two modes: FromSingle and FromMany. FromSingle is often used in top level aggregation, so It's similar to Date Histogram where parent is null. FromMany is used e.g in big5's See validation below. Note: will update this PR after #19573 is merged. |
range auto date: from 1335 to 139 (89%) |
range-auto-date-with-metrics (22% lower)this is similar to the date-with-metrics, since the time is bounded by tavg stat |
|
❌ Gradle check result for 4d7f22d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
{"run-benchmark-test": "id_3"} |
|
{"run-benchmark-test": "id_11"} |
|
❌ Gradle check result for 4d7f22d: Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5174/ . Final results will be published once the job is completed. |
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5174/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/210/
|
|
{"run-benchmark-test": "id_3"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (1)
231-234: Improve comment grammar.The comment "Skiplist is based as top level agg" is grammatically awkward.
Consider rewording:
- /** - * Skiplist is based as top level agg (null parent) or parent that will execute in sorted order - * - */ + /** + * Skiplist can be used when executing as a top-level aggregation (null parent) or when the parent executes in sorted order + * + */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java(7 hunks)server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java(16 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (1)
server/src/main/java/org/opensearch/search/aggregations/LeafBucketCollector.java (1)
LeafBucketCollector(51-179)
server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java (1)
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (1)
HistogramSkiplistLeafCollector(33-247)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: detect-breaking-change
🔇 Additional comments (19)
server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java (9)
34-37: LGTM! Field name extraction for skiplist support.The imports and field name extraction logic are well-structured. The code safely handles cases where the values source is not field-data backed by setting
fieldNameto null, which is checked before use at line 279.Also applies to: 143-143, 166-168
255-259: LGTM! Abstract method signature updated consistently.The signature change to include
DocValuesSkipperis implemented consistently across bothFromSingleandFromManysubclasses within this file.
278-283: LGTM! Skipper retrieval is properly guarded.The code correctly retrieves the
DocValuesSkipperonly whenfieldNameis available, with null-safety handled by downstream skiplist eligibility checks.
444-489: LGTM! Standard collection path preserved.The fallback implementation maintains the existing collection behavior when skiplist optimization is unavailable, ensuring backward compatibility.
497-545: LGTM! Rounding advancement logic is well-structured.The
increaseRoundingIfNeededmethod correctly implements the rebucketing algorithm:
- Tracks min/max bounds to determine when rounding needs to advance
- Uses both bucket count and time range heuristics
- Properly manages resource cleanup with try-with-resources
- The do-while loop ensures at least one rebucketing attempt when thresholds are exceeded
707-737: LGTM! Skiplist integration for FromMany with appropriate constraints.The skiplist path correctly handles the constraint that
HistogramSkiplistLeafCollectorcurrently supports oneowningBucketOrdat a time. The comment on lines 712-720 clearly explains when this optimization applies (FilterRewrite context) and notes future enhancement possibilities.
739-785: LGTM! Standard collection path for FromMany preserved.The fallback implementation maintains existing multi-bucket collection behavior when skiplist is unavailable.
792-838: LGTM! Per-bucket rounding advancement with efficient rebucketing.The
increaseRoundingIfNeededmethod forFromManycorrectly implements the incremental rounding advancement:
- Dynamically grows tracking arrays (mins, maxes) as needed
- Uses ratio-based estimation to predict bucket counts after rounding changes
- Defers expensive rebucketing until
wastedBucketsOverestimateexceeds threshold- The exponential backoff for
nextRebucketAthelps avoid O(n²) behavior
392-393: LGTM! Debug tracking for skiplist usage.The
skiplistCollectorCounttracking provides valuable observability into how often the skiplist optimization is applied, which will be helpful for performance analysis and debugging.Also applies to: 556-556, 663-663, 884-884
server/src/main/java/org/opensearch/search/aggregations/bucket/HistogramSkiplistLeafCollector.java (10)
16-17: LGTM! Necessary imports for dynamic rounding support.The added imports support the new supplier-based design and skiplist eligibility checking functionality.
Also applies to: 19-19, 23-24
29-30: LGTM! Important constraint documented.The javadoc correctly notes the current single-owningBucketOrd limitation, which aligns with the usage restrictions described in
AutoDateHistogramAggregator.FromMany.
38-38: LGTM! Supplier-based design enables dynamic rounding.The refactoring from fixed fields to suppliers allows
AutoDateHistogramAggregatorto dynamically adjust rounding during collection, which is central to the skiplist optimization.Also applies to: 41-47
71-102: LGTM! Constructor delegation maintains backward compatibility.The original constructor now delegates to the new supplier-based constructor, preserving backward compatibility while enabling dynamic rounding support. The
isSubNoOpoptimization is a nice touch for performance when there are no sub-aggregations.
127-148: LGTM! Skiplist advancement now uses dynamic rounding.The
advanceSkippermethod correctly fetches the current rounding from the supplier, ensuring skiplist logic operates on the latest rounding configuration.
152-180: LGTM! Collection logic handles dynamic rounding with cache invalidation.The
collectmethod correctly invalidates cached skiplist state when rounding changes and uses the current rounding for all bucketing decisions. The callback notification at line 178 allows the aggregator to trigger further rounding adjustments as needed.
198-209: LGTM! Efficient stream handling based on sub-aggregation presence.The
isSubNoOpoptimization smartly usesstream.count()when sub-aggregations are absent, avoiding unnecessary per-document collection overhead.
222-229: LGTM! Callback interface for rounding adjustment notification.The
IncreaseRoundingIfNeededinterface provides a clean abstraction for notifying the aggregator when new buckets are collected, enabling dynamic rounding decisions.
235-246: LGTM! Skiplist eligibility check with noted limitation.The
canUseSkiplistmethod correctly identifies when skiplist optimization is applicable: top-level aggregations or filter-rewrite contexts. The TODO comment at line 237 appropriately flags that hard bounds support is not yet implemented.
65-69: Verify that reference equality reliably detects rounding changes.The code uses reference equality (
currentRounding != lastPreparedRounding) at line 156 to detect whenAutoDateHistogramAggregatorchanges rounding. This assumes that a newRounding.Preparedinstance is created whenever the rounding changes. Confirm that theprepareRounding()method consistently creates new instances rather than reusing or caching instances, as reuse would cause the reference equality check to fail silently and skip necessary cache invalidation.
| protected LeafBucketCollector getLeafCollector(SortedNumericDocValues values, DocValuesSkipper skipper, LeafBucketCollector sub) | ||
| throws IOException { | ||
| // Check if skiplist optimization is available | ||
| final NumericDocValues singleton = DocValues.unwrapSingleton(values); | ||
| if (HistogramSkiplistLeafCollector.canUseSkiplist(null, parent, skipper, singleton)) { | ||
| // Increment skiplist collector count | ||
| skiplistCollectorCount++; | ||
| return new HistogramSkiplistLeafCollector( | ||
| singleton, | ||
| skipper, | ||
| (owningBucketOrd) -> preparedRounding, // for FromSingle there will be no parent/ | ||
| () -> bucketOrds, | ||
| sub, | ||
| FromSingle.this, | ||
| (owningBucket, rounded) -> increaseRoundingIfNeeded(rounded) // Pass supplier to allow rounding change | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix comment typo; skiplist integration looks good.
The skiplist eligibility check and collector creation are well-structured with proper fallback to the standard path.
Minor issue: Line 436 has a trailing slash in the comment.
Apply this diff to fix the comment:
- (owningBucketOrd) -> preparedRounding, // for FromSingle there will be no parent/
+ (owningBucketOrd) -> preparedRounding, // for FromSingle there will be no parent📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| protected LeafBucketCollector getLeafCollector(SortedNumericDocValues values, DocValuesSkipper skipper, LeafBucketCollector sub) | |
| throws IOException { | |
| // Check if skiplist optimization is available | |
| final NumericDocValues singleton = DocValues.unwrapSingleton(values); | |
| if (HistogramSkiplistLeafCollector.canUseSkiplist(null, parent, skipper, singleton)) { | |
| // Increment skiplist collector count | |
| skiplistCollectorCount++; | |
| return new HistogramSkiplistLeafCollector( | |
| singleton, | |
| skipper, | |
| (owningBucketOrd) -> preparedRounding, // for FromSingle there will be no parent/ | |
| () -> bucketOrds, | |
| sub, | |
| FromSingle.this, | |
| (owningBucket, rounded) -> increaseRoundingIfNeeded(rounded) // Pass supplier to allow rounding change | |
| ); | |
| } | |
| protected LeafBucketCollector getLeafCollector(SortedNumericDocValues values, DocValuesSkipper skipper, LeafBucketCollector sub) | |
| throws IOException { | |
| // Check if skiplist optimization is available | |
| final NumericDocValues singleton = DocValues.unwrapSingleton(values); | |
| if (HistogramSkiplistLeafCollector.canUseSkiplist(null, parent, skipper, singleton)) { | |
| // Increment skiplist collector count | |
| skiplistCollectorCount++; | |
| return new HistogramSkiplistLeafCollector( | |
| singleton, | |
| skipper, | |
| (owningBucketOrd) -> preparedRounding, // for FromSingle there will be no parent | |
| () -> bucketOrds, | |
| sub, | |
| FromSingle.this, | |
| (owningBucket, rounded) -> increaseRoundingIfNeeded(rounded) // Pass supplier to allow rounding change | |
| ); | |
| } |
🤖 Prompt for AI Agents
server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregator.java
around lines 426 to 442: there is a minor typo — the inline comment on line 436
contains an extraneous trailing slash ("// for FromSingle there will be no
parent/"); remove the trailing slash and ensure the comment reads clearly (e.g.,
"// for FromSingle there will be no parent") so the comment has no stray
characters.
|
❕ Gradle check result for 44bbcb3: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #20057 +/- ##
============================================
+ Coverage 73.25% 73.66% +0.40%
- Complexity 71684 72102 +418
============================================
Files 5788 5793 +5
Lines 327866 328096 +230
Branches 47218 47252 +34
============================================
+ Hits 240194 241683 +1489
+ Misses 68399 67351 -1048
+ Partials 19273 19062 -211 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
❌ Gradle check result for 44bbcb3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❕ Gradle check result for 44bbcb3: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
|
{"run-benchmark-test": "id_11"} |
|
{"run-benchmark-test": "id_3"} |
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5270/ . Final results will be published once the job is completed. |
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5271/ . Final results will be published once the job is completed. |
|
The benchmark job https://build.ci.opensearch.org/job/benchmark-pull-request/5270/ failed. |
|
The benchmark job https://build.ci.opensearch.org/job/benchmark-pull-request/5271/ failed. |
|
benchmarks failed with insufficient ec2: |
|
{"run-benchmark-test": "id_11"} |
|
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5272/ . Final results will be published once the job is completed. |
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5272/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/222/
|
Description
This is a built on top #19573.
Compare to date histogram skiplist, this change needs to hook into dynamic rounding during collection. There are 2 variables to keep track of:
bucketOrds- seen rounded datespreparedRounding- starts at lowest interval: MINUTES and goes upWhen a new ord is created,
increaseRoundingIfNeededfunction is called to determine if newpreparedRoundingneeds to kick in (e.g. from HOURS to DAYS), and may also merge dates inbucketOrds. Thus, both are need to be supplied via lambda.In the future skiplist can be enhanced to keep track of multiple
owningBucketOrd, for now it only works when auto date histogram is root (parent == null), or within range filter rewrite context that guarantees new auto date histogram is created per range.Related Issues
Resolves #19827
Part of #18882
Also #19384
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
New Features
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.