Support probability: auto for random sampler aggregation. #86559
Labels
:Analytics/Aggregations
Aggregations
>enhancement
:ml
Machine learning
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
Team:ML
Meta label for the ML team
Description
Right now the onus of choosing an appropriate probability lies entirely with the consuming side.
This means the consuming side has to come up with multi-phase query patterns to ensure enough data is loaded (see e.g. elastic/kibana#127598).
It would be great if the random_sampler could explicitly support this multi-phase query behavior by automatically increasing the probability.
I am not sure whether `min_documents` needs to be per shard or overall for this to work out statistically. It would also help ensure that, if the query ends up running on a shard with very few documents, all documents are automatically taken into account.
`max_retries` controls how many times the coordinating node is able to reissue the query with a higher probability.
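
To make the requested behavior concrete, here is a minimal Python sketch of the retry loop that consumers currently have to implement themselves. It is not an Elasticsearch API: `run_query`, `fake_index`, and the `growth` factor are hypothetical names chosen for illustration, and the sketch treats `min_documents` as an overall count, which leaves the per-shard question above open.

```python
from typing import Callable, Tuple


def sample_with_retries(
    run_query: Callable[[float], int],
    probability: float,
    min_documents: int,
    max_retries: int,
    growth: float = 10.0,  # hypothetical step size, not part of the proposal
) -> Tuple[float, int, int]:
    """Reissue a sampled query with a higher probability until enough
    documents are sampled, the probability reaches 1.0, or the retry
    budget is used up. Returns (final_probability, doc_count, retries)."""
    retries = 0
    while True:
        count = run_query(probability)
        if count >= min_documents or probability >= 1.0 or retries >= max_retries:
            return probability, count, retries
        # Not enough documents were sampled: raise the probability and retry.
        probability = min(1.0, probability * growth)
        retries += 1


def fake_index(total_docs: int) -> Callable[[float], int]:
    """Deterministic stand-in for a sampled query: returns the expected
    number of sampled documents instead of contacting a cluster."""
    return lambda probability: round(total_docs * probability)


if __name__ == "__main__":
    # An index with very few documents ends up fully sampled.
    print(sample_with_retries(fake_index(20), 0.001, min_documents=50, max_retries=5))
```

With a small index the loop escalates until the probability reaches 1.0, so every document is taken into account, which is the fallback described above. A real implementation would likely derive the next probability from the observed count rather than use a fixed growth factor.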