Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] KNN memory circuit breaker limit in opensearch.yaml #585

Closed
frejonb opened this issue Oct 19, 2022 · 11 comments
Closed

[BUG] KNN memory circuit breaker limit in opensearch.yaml #585

frejonb opened this issue Oct 19, 2022 · 11 comments
Assignees
Labels
bug Something isn't working

Comments

@frejonb
Copy link

frejonb commented Oct 19, 2022

Describe the bug
The setting knn.memory.circuit_breaker.limit doesn't seem to be respected when set in the opensearch.yaml. Even setting it in /_cluster/settings does not work unless I remove the setting from opensearch.yaml

To Reproduce
Steps to reproduce the behavior:

  1. Set knn.memory.circuit_breaker.limit: 56200000KB in opensearch.yaml
  2. Warm up an index
  3. Check /_plugins/_knn/*/stats and calculate graph_memory_usage*100/graph_memory_usage_percentage. This will be 50% of the native memory.
  4. Set knn.memory.circuit_breaker.limit: 56200000KB in the persistent settings (/_cluster/settings) and repeat 2-3.
  5. Now remove knn.memory.circuit_breaker.limit: 56200000KB from opensearch.yaml and repeat 2-3. This time the result should be 56200000.

Expected behavior
The opensearch.yaml setting should be picked up correctly.

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@frejonb frejonb added bug Something isn't working untriaged labels Oct 19, 2022
@saratvemulapalli
Copy link
Member

Thanks @frejonb settings is something we use everyday, lets start with how K-NN is handling the settings consumer and traceback if OpenSearch has problems.

@saratvemulapalli saratvemulapalli transferred this issue from opensearch-project/OpenSearch Oct 19, 2022
@vamshin
Copy link
Member

vamshin commented Oct 19, 2022

Hi @frejonb, we read from cluster settings to enable disable circuit breaker. Please use just cluster settings.

@frejonb
Copy link
Author

frejonb commented Oct 19, 2022

@vamshin but it’s harder to have IaC if I can’t set it in the config file

@vamshin
Copy link
Member

vamshin commented Oct 19, 2022

@frejonb sorry whats laC? Having it part of the config file requires process restart to read updated configuration which is not desirable. Also if there are multiple nodes this becomes even harder to update config. So we have cluster settings that can be changed dynamically with out having to do any restarts and do not have to worry about changing across all the nodes manually.

@frejonb
Copy link
Author

frejonb commented Oct 20, 2022

@vamshin thanks for the quick reply. I meant infrastructure as code. It's easier to have all my configuration manifests in source control and then easily and quickly deploy properly functional clusters on demand. Having one config file per node is not an issue because all nodes get the same config automatically (though a statefulset for example).

I'm not suggesting to remove the ability to dynamically change the settings, but to support modern DevOps practices. Also other configuration (from ES) seems to work that way.

@jmazanec15
Copy link
Member

Hi @frejonb, Im looking into this now. How much RAM does your machine have? I did a quick test of this and it appears the setting can be set in the opensearch yml file and show up in the cluster settings. That being said, could you check if when you set "knn.memory.circuit_breaker.limit": "1%", the same behavior shows up?

@jmazanec15
Copy link
Member

Actually, I was able to reproduce the behavior. Checking what might be the cause.

@jmazanec15
Copy link
Member

I see the problem. Instead of pulling the value for knn.memory.circuit_breaker.limit from the settings, we pull it from a map that sits on top of the settings (ref). This map will actually get updated with the latest settings before the settings are applied, because we update it as a settings update consumer (ref).

The problem is that when reading the settings from "opensearch.yml", this update settings consumer never gets called, so latest settings is never updated, so the cache uses the default value for the circuit breaker limit.

@frejonb
Copy link
Author

frejonb commented Jan 11, 2023

@jmazanec15 thanks for the explanation. Is this the only setting affected by this?

@jmazanec15
Copy link
Member

Yes, the other dynamic settings will also be impacted. Im working on fix now. Might require some refactoring in KNNSettings class.

@jmazanec15
Copy link
Member

Merged the fix. This will be fixed in 2.6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants