Allow administrators to limit the depth of nested aggregations #67479
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics)
Adding team-discuss to this for our next team meeting. I suspect this would be implemented as a cluster setting, since it's relatively static and can be checked when the query is parsed rather than during agg runtime. That said, it's a pretty coarse hammer; you can construct non-abusive aggs that are quite deep (or shallow aggs that are very abusive). But I do agree that depth tends to be a good metric for abuse potential, and giving admins control over it could be a good safety net.
We discussed this in our team meeting and thought that an estimated overall number of buckets across the aggregations might be a better metric for preventing very large queries from running. Thanks to the recent optimization effort by @nik9000, we already have some idea of the dimensions of some of our aggregations early in the query phase. So we could perhaps combine #68504 with this issue and add a cluster-level parameter that ends requests likely to produce an excessive number of buckets earlier.
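To make the bucket-estimate idea concrete, here is a rough sketch in Python that works on the JSON request body rather than on Elasticsearch internals. It is an illustration under stated assumptions, not how the team would implement it: `estimate_buckets` is a hypothetical helper, it only understands `terms` aggregations, and it uses the requested `size` (default 10) as a stand-in for the real per-aggregation estimate mentioned above.

```python
def estimate_buckets(aggs, parent_buckets=1):
    """Crude upper bound on the buckets an aggs tree could create.

    Assumption: only `terms` aggregations create buckets here, and each one
    creates at most `size` buckets (default 10) per parent bucket.
    """
    total = 0
    for body in aggs.values():
        own = 1
        is_bucketing = False
        if "terms" in body:
            own = body["terms"].get("size", 10)
            is_bucketing = True
        # Buckets at this level multiply with every enclosing bucket.
        level = parent_buckets * own
        if is_bucketing:
            total += level
        sub = body.get("aggs") or body.get("aggregations") or {}
        total += estimate_buckets(sub, level)
    return total


request = {
    "aggs": {
        "by_host": {
            "terms": {"field": "host", "size": 100},
            "aggs": {
                "by_status": {
                    "terms": {"field": "status", "size": 50},
                    "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},
                }
            },
        }
    }
}

estimated = estimate_buckets(request["aggs"])  # 100 + 100 * 50 = 5100
```

A request whose estimate exceeds an admin-configured threshold could then be rejected before any shard does work.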
@imotov It seems you prefer using the overall number of buckets to break the request early. That's what I proposed in #68504. Then I found some newer issues, #67474 and #67478, which seem to track aggregation memory instead of bucket count, and I began to believe that is a better idea.
@maosuhan the main issue with breaking on aggregation memory is that we only cancel the query once it has used enough memory to trip the breaker. By that time, some damage is already done, so to speak. So it might be beneficial to cancel the query as early as possible, to prevent it from wasting resources before being cancelled.
@imotov I agree, and that is what we ran into in our production environment. In #67474, I think
If we use |
Instead of allowing a user to nest aggregations 10+ levels deep, the maximum depth could be tunable, so that an overly deep request is rejected before it ever hits a shard. Admins could set a limit on the number of nested aggregations.
This could be implemented as a circuit breaker, or perhaps as a search setting.
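As a minimal sketch of what a parse-time depth check might look like, the Python below walks the `aggs` / `aggregations` keys of a search request body and rejects requests that nest too deeply. The function names and the `max_depth` parameter are hypothetical and only illustrate the idea; they are not an existing Elasticsearch setting or API.

```python
def agg_depth(aggs):
    """Return how many levels of aggregations are nested in an aggs mapping."""
    if not aggs:
        return 0
    deepest = 0
    for body in aggs.values():
        # Sub-aggregations can appear under either key in the request body.
        sub = body.get("aggs") or body.get("aggregations") or {}
        deepest = max(deepest, agg_depth(sub))
    return 1 + deepest


def check_agg_depth(request, max_depth):
    """Raise ValueError if the request nests aggregations deeper than max_depth.

    `max_depth` stands in for a hypothetical admin-configured limit.
    """
    aggs = request.get("aggs") or request.get("aggregations") or {}
    depth = agg_depth(aggs)
    if depth > max_depth:
        raise ValueError(
            f"aggregation depth {depth} exceeds the configured limit {max_depth}"
        )
    return depth


nested_request = {
    "aggs": {
        "by_host": {
            "terms": {"field": "host"},
            "aggs": {
                "by_status": {
                    "terms": {"field": "status"},
                    "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},
                }
            },
        }
    }
}

depth = check_agg_depth(nested_request, max_depth=3)  # passes, depth is 3
```

Because the check only inspects the parsed request, it can run on the coordinating node before anything is sent to a shard.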
This issue was originally raised as part of #62457.