Skip sibling pipeline aggregators reduction during non-final reduce #40101

javanna · 2019-03-15T15:11:53Z

Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned.

Each coordinating node should rather honour the reduce context flag that
indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone.

Note that his bug affects only pipeline aggs that don't have a parent in
the aggs tree, while all the others work well.

Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to elastic#40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

elasticmachine · 2019-03-15T15:11:54Z

Pinging @elastic/es-analytics-geo

jimczi

The change looks good to me but I'd like @polyfractal or @colings86 to take a look since they are more familiar with pipeline aggs.

javanna · 2019-03-15T15:23:25Z

server/src/main/java/org/elasticsearch/search/aggregations/InternalAggregations.java

+                    reducedAggregations.add(newAgg);
+                }
+            }
+        }


Note that this is moved from SearchPhaseController (and fixed to honour the finalReduce flag), as I want to be able to later call this from SearchResponseMerger in 7.0 and later branches)

javanna · 2019-03-15T15:23:59Z

server/src/main/java/org/elasticsearch/action/search/TransportSearchAction.java

@@ -315,8 +315,6 @@ public void onFailure(Exception e) {
            if (localIndices != null) {
                ActionListener<SearchResponse> ccsListener = createCCSListener(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY,
                    false, countDown, skippedClusters, exceptions, searchResponseMerger, totalClusters, listener);
-                //here we provide the empty string a cluster alias, which means no prefix in index name,
-                //but the coord node will perform non final reduce as it's not null.


this is an outdated comment, unrelated to this change, I just stumbled upon it while working on it.

polyfractal

Also LGTM.

For posterity, I was curious why sibling aggs weren't throwing this exception normally under incremental reduce today, since it lacked the check. reduceAggs() is invoked by incremental reductions but nulls out the sibling aggs first (to keep them from running), or invoked directly by the SearchPhaseController when we are not incrementally reducing (in which case the siblings are valid).

I like the new arrangement, a little clearer since it's always relying on the reduce context instead of nullness of siblings.

We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With elastic#40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes elastic#40059

…0177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With #40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes #40059

…lastic#40101) Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to elastic#40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

…astic#40177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With elastic#40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes elastic#40059

…lastic#40101) Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to elastic#40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

…astic#40177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With elastic#40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes elastic#40059

…40101) Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

…0177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With #40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes #40059

…40101) Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

…0177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With #40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes #40059

…40101) Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned. Each coordinating node should rather honour the reduce context flag that indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone. Note that his bug affects only pipeline aggs that don't have a parent in the aggs tree, while all the others work well. Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.

…0177) We currently convert pipeline aggregators to their corresponding InternalAggregation instance as part of the final reduction phase. They arrive to the coordinating node as part of QuerySearchResult objects fom the shards and, despite we may incrementally reduce aggs (hence we may have some non-final reduce and the final one later) all the reduction phases happen on the same node. With CCS minimizing roundtrips though, each cluster performs its own non-final reduction, and then serializes the results back to the CCS coordinating node which will perform the final coordination. This breaks the assumptions made up until now around reductions happening all on the same node. With #40101 we have made sure that top-level pipeline aggs are not reduced as part of the non-final reduction. The next step is to make sure that they don't get lost, meaning that each coordinating node needs to send them back to the CCS coordinating node as part of the top-level `InternalAggregations` object. Closes #40059

javanna added >bug :Analytics/Aggregations Aggregations v7.0.0 v6.7.0 v8.0.0 v7.2.0 labels Mar 15, 2019

jimczi approved these changes Mar 15, 2019

View reviewed changes

javanna commented Mar 15, 2019

View reviewed changes

Merge branch 'master' into fix/sibling_pipeline_non_final

615d8bd

$polyfractal$

polyfractal approved these changes Mar 15, 2019

View reviewed changes

Merge branch 'master' into fix/sibling_pipeline_non_final

142d195

javanna merged commit 8fb5c32 into elastic:master Mar 18, 2019

javanna added the backport pending label Mar 18, 2019

javanna mentioned this pull request Mar 18, 2019

Serialize top-level pipeline aggs as part of InternalAggregations #40177

Merged

javanna mentioned this pull request Mar 19, 2019

Cumulative 6.7 backport #40190

Closed

This was referenced Mar 19, 2019

Cumulative 7.x backport #40195

Closed

Cumulative 7.0 backport #40196

Closed

javanna removed the backport pending label Mar 19, 2019

michaelbaamonde added v7.0.0-rc1 and removed v7.0.0 labels Mar 25, 2019

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Skip sibling pipeline aggregators reduction during non-final reduce #40101

Skip sibling pipeline aggregators reduction during non-final reduce #40101

javanna commented Mar 15, 2019

elasticmachine commented Mar 15, 2019

jimczi left a comment

javanna Mar 15, 2019 •

edited

Loading

javanna Mar 15, 2019

$@polyfractal$ polyfractal left a comment

Skip sibling pipeline aggregators reduction during non-final reduce #40101

Skip sibling pipeline aggregators reduction during non-final reduce #40101

Conversation

javanna commented Mar 15, 2019

elasticmachine commented Mar 15, 2019

jimczi left a comment

Choose a reason for hiding this comment

javanna Mar 15, 2019 • edited Loading

Choose a reason for hiding this comment

javanna Mar 15, 2019

Choose a reason for hiding this comment

polyfractal left a comment

Choose a reason for hiding this comment

javanna Mar 15, 2019 •

edited

Loading

$@polyfractal$ polyfractal left a comment