Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline aggregations shouldn't serialize to remote clusters in cross-cluster search #73680

Closed
benwtrent opened this issue Jun 2, 2021 · 4 comments · Fixed by #79638
Closed
Assignees
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@benwtrent
Copy link
Member

Pipeline aggregations have long been a way to "do one more thing" on aggregated data. I THINK that "one more thing" is always computed on the coordinating node and the coordinating node is always on the "local" cluster.

Consequently, there seems to be an opportunity here by not serializing pipeline aggregations to remote clusters in cross-cluster search. This way, newer pipeline aggs could be used when querying older clusters.

//CC @elastic/ml-core

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 2, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@nik9000
Copy link
Member

nik9000 commented Jun 2, 2021

It is always executed on the coordinating node. But old coordinating node versions they don't know how to read the pipelines from the request - only from a response from a response sent back from the data nodes. The thing I haven't reasoned through is what cases we don't have to send them.

@imotov
Copy link
Contributor

imotov commented Oct 14, 2021

We should probably talk about this in the next team meeting.

@nik9000 nik9000 self-assigned this Oct 20, 2021
@nik9000
Copy link
Member

nik9000 commented Oct 20, 2021

A few of us talked and think that we do actually need to serialize these responses. We thought that we didn't, but async search serializes the response with binary. So! We have to make sure these are all serializable. Sadness. I'll see about adding more tests to catch places that aren't.

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Oct 21, 2021
This tries to automatically detect problems seriealization aggregation
results by round tripping the results in our usual `AggregatorTestCase`.
It's "free" testing in that we already have the tests written and we'll
get round trip testing "on the side". But it's kind of sneaky because we
aren't *trying* to test serialization here. So they failures can be
surprising. But surprising test failures are better than bugs. At least
that is what I tell myself so I can sleep at night.

Closes elastic#73680
elasticsearchmachine pushed a commit that referenced this issue Oct 25, 2021
This tries to automatically detect problems seriealization aggregation
results by round tripping the results in our usual `AggregatorTestCase`.
It's "free" testing in that we already have the tests written and we'll
get round trip testing "on the side". But it's kind of sneaky because we
aren't *trying* to test serialization here. So they failures can be
surprising. But surprising test failures are better than bugs. At least
that is what I tell myself so I can sleep at night. Closes #73680
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this issue Oct 28, 2021
This tries to automatically detect problems seriealization aggregation
results by round tripping the results in our usual `AggregatorTestCase`.
It's "free" testing in that we already have the tests written and we'll
get round trip testing "on the side". But it's kind of sneaky because we
aren't *trying* to test serialization here. So they failures can be
surprising. But surprising test failures are better than bugs. At least
that is what I tell myself so I can sleep at night. Closes elastic#73680
nik9000 added a commit to nik9000/elasticsearch that referenced this issue Nov 10, 2021
This tries to automatically detect problems seriealization aggregation
results by round tripping the results in our usual `AggregatorTestCase`.
It's "free" testing in that we already have the tests written and we'll
get round trip testing "on the side". But it's kind of sneaky because we
aren't *trying* to test serialization here. So they failures can be
surprising. But surprising test failures are better than bugs. At least
that is what I tell myself so I can sleep at night. Closes elastic#73680
nik9000 added a commit that referenced this issue Nov 10, 2021
This tries to automatically detect problems seriealization aggregation
results by round tripping the results in our usual `AggregatorTestCase`.
It's "free" testing in that we already have the tests written and we'll
get round trip testing "on the side". But it's kind of sneaky because we
aren't *trying* to test serialization here. So they failures can be
surprising. But surprising test failures are better than bugs. At least
that is what I tell myself so I can sleep at night. Closes #73680
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
4 participants