Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic max boolean clause counts #1526

Closed
malpani opened this issue Nov 10, 2021 · 8 comments · Fixed by #13568
Closed

Dynamic max boolean clause counts #1526

malpani opened this issue Nov 10, 2021 · 8 comments · Fixed by #13568
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc v2.16.0 Issues and PRs related to version 2.16.0 v3.0.0 Issues and PRs related to version 3.0.0

Comments

@malpani
Copy link
Contributor

malpani commented Nov 10, 2021

Is your feature request related to a problem? Please describe.
As query shapes evolve, there have been multiple requests by users ato dynamically change limits on max clauses allowed for a boolean query. Today, this is a static setting indices.query.bool.max_clause_count that defaults to 1024. Any change requires a yml change and rolling restart of the cluster which is not convenient

On hitting an error like TooManyClauses: maxClauseCount is set to 1024, while a re-look at query is always recommended, but in scenarios where larger clauses are required, we should allow supporting those

Describe the solution you'd like
Make indices.query.bool.max_clause_count a dynamic setting which can then be dynamically updated via settings update like

PUT /_cluster/settings --data '{
    "transient" : {
        "indices.query.bool.max_clause_count" : 2048
    }
}'
@malpani malpani added enhancement Enhancement or improvement to existing feature or request untriaged labels Nov 10, 2021
@malpani
Copy link
Contributor Author

malpani commented Nov 18, 2021

Summarizing the review of #1527 with @reta and @nknize

  1. Push back on making this a dynamic setting as it would provide users an easy way out to raise this limit instead of relooking at their query.
  2. A query utilizing large number of clauses will cause interference with well behaved queries
  3. If making dynamic, feedback is to provide as a per index setting (instead of cluster wide) - however exposing per index limit on clauses is something that can be looked at in 2.x only as it will involve building on top of https://issues.apache.org/jira/browse/LUCENE-8811 (lucene 9.x)

I dont plan to work on this in the short term and if someone wants to drive this to closure, feel free to pick it up.

@AmiStrn
Copy link
Contributor

AmiStrn commented Apr 27, 2022

This is actually a useful tool for multitenant clusters.
Restarting just to change this is a bit of a pain. Maybe we could make a possible override on the index level.

@reta
Copy link
Collaborator

reta commented May 11, 2022

More context on the issue:

  • IndexSearcher has maxClauseCount limit, which is what BooleanQuery is using in 9.x (see please https://issues.apache.org/jira/browse/LUCENE-8811), this property is also static
  • the idea of having boolean query clause limit configurable per-query is effectively buried (see please https://issues.apache.org/jira/browse/LUCENE-7880)
  • introducing something like indices.query.max_clause_count (and deprecating indices.query.bool.max_clause_count since respective BooleanQuery property is scheduled for removal in Apache Lucene 10) with dynamic scope (changeable at runtime) could be useful:
PUT /_cluster/settings --data '{
    "transient" : {
        "indices.query.max_clause_count" : 2048
    }
}'

(in 3.x, we would still support indices.query.bool.max_clause_count with deprecation warning). The major concern here is that IndexSearcher::maxClauseCount is not volatile (nor guarded by synchronization primitives), so changing it may not be visible to all threads.

@nknize does it make sense? if yes, I could pick it up, thanks!

@kkhatua
Copy link
Member

kkhatua commented Mar 9, 2023

@reta / @nknize
Is this going forward? A dynamic setting would certainly help in enabling and tuning in production clusters.

@reta
Copy link
Collaborator

reta commented Mar 9, 2023

@kkhatua yeah, I know, haven't gotten any feedback on the proposal :(

@anand3493
Copy link

Users in our org are facing this prob once we migrated to Opensearch from Elasticsearch.
This feature will really help my organization.

@jainankitk
Copy link
Collaborator

jainankitk commented Mar 13, 2023

@reta - After going through various concerns from the lucene community and @nknize on PR #1527, I am wondering if having per query parameter makes sense? That ensures every (potentially) not so well written query fails atleast once on the default limit of 1024, and customer can deliberately choose higher limit at the cost of performance if their workload really needs it?

@reta
Copy link
Collaborator

reta commented Mar 14, 2023

@jainankitk you mean resuming discussion on [1] & [2]? I think it would make a lot of sense, because on OpenSearch side we cannot enforce this limit in isolation since Lucene applies static limits across the board.

[1] apache/lucene#8931
[2] https://issues.apache.org/jira/browse/LUCENE-7880

@andrross andrross added Search Search query, autocomplete ...etc and removed Indexing & Search labels Feb 29, 2024
@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 v2.16.0 Issues and PRs related to version 2.16.0 labels Jun 13, 2024
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Search Project Board Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc v2.16.0 Issues and PRs related to version 2.16.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

8 participants