Make sure non-collecting aggs include sub-aggs #64214

nik9000 · 2020-10-27T13:11:14Z

Now that we're consistently using can_match to filter which shards we
run on we can get this confusing case:

1. You have a search with, say, a range and a sub-agg.
2. That search has a query that can_match can recognize will match no
   docs. On any shard.
3. So we dutifully run it on a single shard so it can produce the
   "empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
   documents.
6. Before this commit, that range aggregator also never produced any
   sub-aggs.

So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.

We've had this problem for a long, long time but it is more confusing
now because can_match is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the range agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
can_match is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what was happening, even
though I know how the individual pieces of all of this work. It took
four hours to figure out how they fit together in this case....

Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.
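
As a rough illustration of the before/after behavior described above, here is a toy model in plain Java. It is a sketch under stated assumptions, not the actual change: none of these types are Elasticsearch classes, and `Bucket`, `emptyResultDroppingSubAggs`, and `emptyResultKeepingSubAggs` are hypothetical names made up for this example. A string stands in for an empty sub-agg result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model only: every type and method here is hypothetical, not Elasticsearch code.
final class NonCollectingSketch {

    /** One bucket of a range result: its doc count plus its sub-agg results by name. */
    static final class Bucket {
        final String key;
        final long docCount;
        final Map<String, Object> subAggs;

        Bucket(String key, long docCount, Map<String, Object> subAggs) {
            this.key = key;
            this.docCount = docCount;
            this.subAggs = subAggs;
        }
    }

    /** Old behavior: a shard without the field mapped emits buckets with no sub-aggs at all. */
    static List<Bucket> emptyResultDroppingSubAggs(List<String> ranges) {
        return ranges.stream()
            .map(range -> new Bucket(range, 0, Collections.<String, Object>emptyMap()))
            .collect(Collectors.toList());
    }

    /** New behavior: the same shard still emits an empty result for every sub-agg. */
    static List<Bucket> emptyResultKeepingSubAggs(List<String> ranges, List<String> subAggNames) {
        return ranges.stream().map(range -> {
            Map<String, Object> subs = new LinkedHashMap<>();
            for (String name : subAggNames) {
                subs.put(name, "empty"); // stand-in for an empty sub-agg result
            }
            return new Bucket(range, 0, subs);
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> ranges = Arrays.asList("*-10", "10-*");
        List<String> subAggNames = Arrays.asList("avg_price");

        for (Bucket b : emptyResultDroppingSubAggs(ranges)) {
            System.out.println("before: " + b.key + " sub-aggs=" + b.subAggs.keySet());
        }
        for (Bucket b : emptyResultKeepingSubAggs(ranges, subAggNames)) {
            System.out.println("after:  " + b.key + " sub-aggs=" + b.subAggs.keySet());
        }
    }
}
```

In this model both variants report zero documents for every range. The only difference is whether a caller reading the response can still find the sub-agg under each bucket.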

Closes #64142


elasticmachine · 2020-10-27T13:11:16Z

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

not-napoleon

LGTM. As you said, hard to find, easy to fix once you find it. Nice work.

nik9000 · 2020-10-27T13:30:02Z

Thanks @not-napoleon ! Here's hoping the tests all pass!


nik9000 · 2020-10-29T14:02:39Z

I've got this into 7.10 and am trying to land it in 7.x. After that I'll need to update the skips in master and I can remove the backport pending label.


nik9000 · 2020-10-29T18:39:34Z

Everything is in now!

nik9000 added >bug :Analytics/Aggregations Aggregations v8.0.0 v7.10.0 v7.11.0 labels Oct 27, 2020

nik9000 requested a review from not-napoleon October 27, 2020 13:11

elasticmachine added the Team:Analytics label Oct 27, 2020

not-napoleon approved these changes Oct 27, 2020

nik9000 merged commit 7feb19a into elastic:master Oct 27, 2020

nik9000 added backport pending and removed backport pending labels Oct 29, 2020

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 29, 2020

Update skip after backport

a95d373

Now that elastic#64214 has landed in 7.10 we can run the bwc test including it.

nik9000 mentioned this pull request Oct 29, 2020

Update skip after backport #64369

Merged

nik9000 added a commit that referenced this pull request Oct 29, 2020

Update skip after backport (#64369)

2cb5803

Now that #64214 has landed in 7.10 we can run the bwc test including it.

nik9000 removed the backport pending label Oct 29, 2020

Mpdreamz mentioned this pull request Nov 16, 2020

7.10.1 Meta Ticket elastic/elasticsearch-net#5096

Closed

61 tasks

stevejgordon mentioned this pull request Dec 17, 2020

7.11.0 Meta Ticket elastic/elasticsearch-net#5198

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
