Conversation

@ajleong623
Contributor

@ajleong623 ajleong623 commented Oct 6, 2025

Description

This change enables star-tree profiling and also profiles the pre-computation phase when profiling aggregations. Pre-computation sometimes runs as part of building the leaf collector, so we want to time the logic in tryPrecomputeAggregationForLeaf separately from the rest of that work.

To do this, I updated ProfilingAggregator.java so that it overrides tryPrecomputeAggregationForLeaf and times the call. I also moved the pre-computation method from AggregatorBase.java up to Aggregator.java so that ProfilingAggregator can override it.
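
A minimal sketch of the timing idea, not the actual OpenSearch code: the interface LeafPrecomputer, the class TimingPrecomputer, the int leaf argument, and the injected LongSupplier clock are all hypothetical stand-ins for Aggregator, ProfilingAggregator, and the profiling timer. It only illustrates why the method has to be visible on the base type, so that a wrapper can override it and record the elapsed time around the delegate call.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the part of Aggregator that a profiler needs to wrap. */
interface LeafPrecomputer {
    /** Returns true when the leaf was fully answered by pre-computation. */
    boolean tryPrecomputeAggregationForLeaf(int leafOrd);
}

/** Hypothetical stand-in for ProfilingAggregator: times the delegate's pre-compute call. */
final class TimingPrecomputer implements LeafPrecomputer {
    private final LeafPrecomputer delegate;
    private final LongSupplier clock; // injected so the sketch is deterministic; System::nanoTime in practice
    private long preComputeNanos;
    private long preComputeCount;

    TimingPrecomputer(LeafPrecomputer delegate, LongSupplier clock) {
        this.delegate = delegate;
        this.clock = clock;
    }

    @Override
    public boolean tryPrecomputeAggregationForLeaf(int leafOrd) {
        long start = clock.getAsLong();
        try {
            return delegate.tryPrecomputeAggregationForLeaf(leafOrd);
        } finally {
            // Record the time even if the delegate throws.
            preComputeNanos += clock.getAsLong() - start;
            preComputeCount++;
        }
    }

    long preComputeNanos() {
        return preComputeNanos;
    }

    long preComputeCount() {
        return preComputeCount;
    }
}
```

The two accumulated values correspond in spirit to the pre_compute and pre_compute_count fields in the profile output.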

To collect profiling information for the star-tree sub-phases, I noticed that every aggregator that uses the star tree does the same two things: it scans the star-tree index to find the matching star-tree document ids, and it adds those ids to the buckets. All aggregators that support star-tree pre-computation implement StarTreePreComputeCollector, so I added methods for these two steps to it for the aggregators to implement. This separates the logic for scanning the star tree from the logic for filling the buckets.
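
The split into two phases can be sketched as below. This is an illustration under stated assumptions: the PR text does not name the new methods, so StarTreePhases, scanStarTree, buildBucketsFromStarTree, and SumOverMatches are hypothetical names, and the "star tree" here is just an array of pre-aggregated values indexed by document id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical two-phase contract, loosely modeled on the description of StarTreePreComputeCollector. */
interface StarTreePhases {
    /** Phase 1: scan the star-tree index and return the matching star-tree document ids. */
    List<Integer> scanStarTree(int maxDoc, IntPredicate matches);

    /** Phase 2: fold the matched ids into the aggregator's buckets. */
    void buildBucketsFromStarTree(List<Integer> matchedDocIds);

    /** Runs both phases back to back; a profiler could time each call separately. */
    default void preCompute(int maxDoc, IntPredicate matches) {
        buildBucketsFromStarTree(scanStarTree(maxDoc, matches));
    }
}

/** Toy aggregator with a single bucket that sums pre-aggregated values. */
final class SumOverMatches implements StarTreePhases {
    private final long[] preAggregatedValues;
    private long sum;

    SumOverMatches(long[] preAggregatedValues) {
        this.preAggregatedValues = preAggregatedValues;
    }

    @Override
    public List<Integer> scanStarTree(int maxDoc, IntPredicate matches) {
        List<Integer> ids = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if (matches.test(doc)) {
                ids.add(doc);
            }
        }
        return ids;
    }

    @Override
    public void buildBucketsFromStarTree(List<Integer> matchedDocIds) {
        for (int id : matchedDocIds) {
            sum += preAggregatedValues[id];
        }
    }

    long sum() {
        return sum;
    }
}
```

Keeping the scan and the bucket fill behind separate methods is what allows them to be reported as separate timings, such as scan_star_tree_segments and build_buckets_from_star_tree.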

The tricky part was understanding InternalAggregationProfileTree and the profiling framework. InternalAggregationProfileTree is the backing structure that tracks the current position in the aggregation tree and stores and updates the breakdowns. One limitation is that, with aggregation profiling, only breakdowns for other aggregations can be added as nodes in the tree. In addition, SearchProfileShardResults references InternalAggregationProfileTree, which extends AbstractInternalProfileTree, when serializing profiling results.

Since the goal was to report profiling results for the star-tree pre-computation phases, I created a new class, StarTreeProfileBreakdown.java, to hold the profiling information from star-tree pre-computation.

To add a StarTreeProfileBreakdown as a child node of an aggregation profile breakdown, I could either rewrite InternalAggregationProfileTree to accommodate StarTreeProfileBreakdown nodes or edit the existing InternalAggregationProfileTree. Luckily, I noticed that for serializing the profile results I could override the function createProfileResult in AbstractInternalProfileTree. I ended up overriding that function and storing the StarTreeProfileBreakdown inside AggregationProfileBreakdown for serialization, since each aggregation can have at most one StarTreeProfileBreakdown.

Finally, to collect the breakdown results, StarTreeProfileBreakdown.java represents the breakdown for the star-tree phase. In AggregationProfileBreakdown.java, I added a few methods to detect whether a StarTreeProfileBreakdown has been attached and to keep track of it. Each AggregationProfileBreakdown can have at most one StarTreeProfileBreakdown attached. In InternalAggregationProfileTree.java, I added logic to add the star-tree profiling result as a child of that AggregationProfileBreakdown.
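
The relationship between the two breakdowns can be sketched roughly as follows. The class and method names (AggBreakdownSketch, StarTreeBreakdownSketch, attachStarTreeBreakdown, children) are hypothetical and do not come from the PR. Throwing on a second attach and computing time_in_nanos as the sum of the sub-phase timings are assumptions made for the sketch, not confirmed behavior of the real classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for StarTreeProfileBreakdown: named timings in nanoseconds. */
final class StarTreeBreakdownSketch {
    final Map<String, Long> timings = new LinkedHashMap<>();

    long totalNanos() {
        long total = 0;
        for (long nanos : timings.values()) {
            total += nanos;
        }
        return total;
    }
}

/** Hypothetical stand-in for AggregationProfileBreakdown with an optional star-tree child. */
final class AggBreakdownSketch {
    private StarTreeBreakdownSketch starTree; // null until a star-tree breakdown is attached

    boolean hasStarTreeBreakdown() {
        return starTree != null;
    }

    void attachStarTreeBreakdown(StarTreeBreakdownSketch breakdown) {
        if (starTree != null) {
            // Assumption: the "at most one" rule is enforced eagerly.
            throw new IllegalStateException("star-tree breakdown already attached");
        }
        starTree = breakdown;
    }

    /** Mimics what an overridden createProfileResult might emit under "children". */
    List<Map<String, Object>> children() {
        List<Map<String, Object>> out = new ArrayList<>();
        if (starTree != null) {
            Map<String, Object> node = new LinkedHashMap<>();
            node.put("type", "StarTree");
            node.put("description", "Pre-computation using star-tree index");
            node.put("time_in_nanos", starTree.totalNanos());
            node.put("breakdown", starTree.timings);
            out.add(node);
        }
        return out;
    }
}
```

With the timings from the example output in this description, 3715917 for the scan and 933584 for the bucket build, the child node's total comes to 4649501, which matches the reported time_in_nanos.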

An example of an aggregation breakdown is:

{
    "aggregations": [
        {
            "type": "NumericTermsAggregator",
            "description": "response_codes",
            "time_in_nanos": 4649501,
            "breakdown": {
                "build_aggregation": 443583,
                "build_aggregation_count": 1,
                "build_leaf_collector": 4678959,
                "build_leaf_collector_count": 1,
                "collect": 0,
                "collect_count": 0,
                "initialize": 3750,
                "initialize_count": 1,
                "post_collection": 1250,
                "post_collection_count": 1,
                "pre_compute": 4649501,
                "pre_compute_count": 1,
                "reduce": 0,
                "reduce_count": 0
            },
            "debug": {
                "total_buckets": 1,
                "result_selection_strategy": "select_all",
                "result_strategy": "double_terms"
            },
            "children": [
                {
                    "type": "StarTree",
                    "description": "Pre-computation using star-tree index",
                    "time_in_nanos": 4649501,
                    "breakdown": {
                        "build_buckets_from_star_tree": 933584,
                        "build_buckets_from_star_tree_count": 1,
                        "scan_star_tree_segments": 3715917,
                        "scan_star_tree_segments_count": 1
                    }
                }
            ]
        }
    ]
}

The part that changed is the new star-tree child breakdown:

 {
    "type": "StarTree",
    "description": "Pre-computation using star-tree index",
    "time_in_nanos": 4649501,
    "breakdown": {
        "build_buckets_from_star_tree": 933584,
        "build_buckets_from_star_tree_count": 1,
        "scan_star_tree_segments": 3715917,
        "scan_star_tree_segments_count": 1
    }
}

For concurrent aggregations, the breakdown looks like:

{
    "type": "StarTree",
    "description": "Pre-computation using star-tree index",
    "time_in_nanos": 15614333,
    "max_slice_time_in_nanos": 15614333,
    "min_slice_time_in_nanos": 15614333,
    "avg_slice_time_in_nanos": 15614333,
    "breakdown": {
        "max_build_buckets_from_star_tree": 4658542,
        "max_scan_star_tree_segments": 10955791,
        "avg_scan_star_tree_segments": 10955791,
        "scan_star_tree_segments": 10955791,
        "scan_star_tree_segments_count": 1,
        "min_scan_star_tree_segments": 10955791,
        "build_buckets_from_star_tree": 4658542,
        "min_build_buckets_from_star_tree": 4658542,
        "avg_build_buckets_from_star_tree": 4658542,
        "build_buckets_from_star_tree_count": 1
    }
}
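
How the min_, max_, and avg_ prefixed fields relate to per-slice timings can be sketched as follows. This is an illustration only, assuming one timing per slice for a given phase. The class SliceStats and its summarize method are hypothetical, and the real reduction in the concurrent-search profiling code may differ, for example in how the average is rounded.

```java
import java.util.LinkedHashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.stream.LongStream;

/** Hypothetical helper that derives min/max/avg fields for one phase from per-slice timings. */
final class SliceStats {
    private SliceStats() {}

    static Map<String, Long> summarize(String phase, long... perSliceNanos) {
        if (perSliceNanos.length == 0) {
            throw new IllegalArgumentException("at least one slice timing is required");
        }
        LongSummaryStatistics stats = LongStream.of(perSliceNanos).summaryStatistics();
        Map<String, Long> out = new LinkedHashMap<>();
        out.put("min_" + phase, stats.getMin());
        out.put("max_" + phase, stats.getMax());
        out.put("avg_" + phase, (long) stats.getAverage()); // truncation is an assumption of this sketch
        return out;
    }
}
```

With a single slice, as in the output above, all three statistics collapse to the same value, which is why min, max, and avg are identical for each phase there.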

In the aggregation breakdown itself, the pre_compute field is now available.

Related Issues

Resolves #19295

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

github-actions bot commented Oct 6, 2025

❌ Gradle check result for 6e2309a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

…utation in aggregation profiling tree

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for f008d6a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sandeshkr419
Member

Thanks @ajleong623 for spending time on this. Before looking into the specific code changes, do you have a before/after view of how the profile output looks from the point of view of a search user?

@ajleong623
Contributor Author

@sandeshkr419 not yet, but I will make sure to test it out and show the results before tomorrow. It should be very close to the example you shared in the issue.

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

github-actions bot commented Nov 5, 2025

❌ Gradle check result for 5267341: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

github-actions bot commented Nov 5, 2025

❌ Gradle check result for 02e5b2e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for 7059a4e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for 4eb8843: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

✅ Gradle check result for b5c2633: SUCCESS

@codecov

codecov bot commented Nov 11, 2025

Codecov Report

❌ Patch coverage is 37.19298% with 179 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.20%. Comparing base (46164b5) to head (b5c2633).
⚠️ Report is 26 commits behind head on main.

Files with missing lines | Patch % | Lines
...ofile/aggregation/AggregationProfileBreakdown.java | 0.00% | 27 Missing ⚠️
...arch/aggregations/StarTreePreComputeCollector.java | 8.00% | 23 Missing ⚠️
...le/aggregation/InternalAggregationProfileTree.java | 0.00% | 13 Missing ⚠️
...rch/search/aggregations/metrics/AvgAggregator.java | 42.85% | 11 Missing and 1 partial ⚠️
...rch/search/aggregations/metrics/SumAggregator.java | 57.69% | 10 Missing and 1 partial ⚠️
...ions/bucket/histogram/DateHistogramAggregator.java | 16.66% | 9 Missing and 1 partial ⚠️
...rch/aggregations/bucket/range/RangeAggregator.java | 23.07% | 9 Missing and 1 partial ⚠️
...ket/terms/GlobalOrdinalsStringTermsAggregator.java | 16.66% | 9 Missing and 1 partial ⚠️
...ggregations/bucket/terms/MultiTermsAggregator.java | 16.66% | 9 Missing and 1 partial ⚠️
...regations/bucket/terms/NumericTermsAggregator.java | 16.66% | 9 Missing and 1 partial ⚠️
... and 8 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19527      +/-   ##
============================================
- Coverage     73.21%   73.20%   -0.02%     
- Complexity    71254    71308      +54     
============================================
  Files          5766     5768       +2     
  Lines        325470   325725     +255     
  Branches      47084    47108      +24     
============================================
+ Hits         238296   238434     +138     
- Misses        68043    68135      +92     
- Partials      19131    19156      +25     


Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for f12a3a5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for b460a83: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for 7b0f837: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ajleong623 ajleong623 marked this pull request as ready for review November 14, 2025 18:35
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@github-actions
Contributor

❌ Gradle check result for e790eba: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Labels

  • bug: Something isn't working
  • enhancement: Enhancement or improvement to existing feature or request
  • Search:Aggregations
  • Search:Relevance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Enhance Profile API to show (star-tree/other) pre-computation time

2 participants