[BUG] OpenSearch not starting - [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block]; #14791
Comments
[Triage - attendees 1 2 3]
This is likely not a bug, but definitely feels like one. @DumboJetEngine maybe try to dig which exact metrics caused
@dblock I don't know if it is possible that it remembers the fact that it had less free space during an older run.
I really don't know. I would read the code back from where the error is thrown to figure out what metrics it uses, modify the code to log those values (the logging seems to be missing), then look at the numbers of those metrics. It's quite a bit of work :) I am sure there's a rational explanation for why this error happens.
@dblock
The weird part is that it used to work fine for quite some time, and I can't figure out what has changed since then to break it.
@dblock Or, I could patch a JAR file, if you decide to modify the 2.9.0 code, to get more log information.
Appreciate your help. I hope someone can dig into this.
@DumboJetEngine @dblock I looked into this and I think this is a bug. When the disk space on all nodes is under the threshold, all the indices will be marked with
An easier way to reproduce this issue is to spin up a small cluster or a single node and follow the steps below:
I have created this draft PR to fix this: #15258. Thanks.
Any news about this topic? I have the same issue on OpenSearch 2.11.1.
@zane-neo
@DumboJetEngine @danieltodea19 I investigated further and can reproduce the issue locally. After testing, I found that the initial investigation was not correct. The actual root cause is:
I've made a code change in the cluster start part to first start the
Basically, a quick and simple workaround for you is to remove (maybe back up first) the observability plugin and retry starting the cluster.
@zane-neo
Restoring the plugin folder continued to make the app exit, but that might be because I have filled my free disk space once again. I am not sure why the settings get ignored, but I guess having OpenSearch at least work is an improvement, both in terms of functionality and in terms of troubleshooting. Thanks! :)
@DumboJet From the logs, it seems your node's disk usage is still higher than the threshold, so the shards will be relocated away. A simple fix is to change the disk threshold cluster settings once your cluster starts up:
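As a rough illustration of changing the disk threshold cluster settings, here is a minimal Python sketch that builds the requests for the cluster settings API and for clearing the read-only-allow-delete block. The watermark values, the `http://localhost:9200` address, and the helper names are assumptions chosen for illustration, not the values proposed in the thread; a node running the security plugin will typically need HTTPS and credentials as well.

```python
import json
import urllib.request

# Assumption: a local node reachable without TLS or authentication.
BASE_URL = "http://localhost:9200"


def watermark_settings(low="90%", high="95%", flood_stage="98%"):
    """Build a transient cluster-settings body that raises the disk watermarks.

    The percentages are illustrative; pick values that fit your disk.
    """
    prefix = "cluster.routing.allocation.disk.watermark"
    return {
        "transient": {
            f"{prefix}.low": low,
            f"{prefix}.high": high,
            f"{prefix}.flood_stage": flood_stage,
        }
    }


def clear_block_settings():
    """Body that resets the read-only-allow-delete block on indices."""
    return {"index.blocks.read_only_allow_delete": None}


def build_request(path, body, base_url=BASE_URL):
    """Create a PUT request carrying a JSON body (not sent yet)."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running node, `send(build_request("/_cluster/settings", watermark_settings()))` would apply the watermarks, and `send(build_request("/_all/_settings", clear_block_settings()))` would clear the block on all indices. Transient settings are lost on a full cluster restart, so a persistent setting may be a better fit if the disk stays nearly full.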
Describe the bug
After indexing some documents, my OpenSearch (v.2.9.0) is not starting. In the opensearch.log file, I get this error:
I have emptied my disk a bit (Win11 system) and I now have 64GB free, but the error didn't go away.
I've found some related Elasticsearch questions, but I don't think I can use any of the solutions, given that I can't even start OpenSearch:
https://stackoverflow.com/questions/48155774/elasticsearch-read-only-allow-delete-auto-setting
BTW, my free disk space was at 16%. For Elasticsearch, the question I linked says that 15% must be free.
Then I freed up more space and my disk had 25% free space. Nothing changed with OpenSearch...
Then I moved the OpenSearch folder to an external disk with 75% free space. Again, nothing changed with OpenSearch... This is insane! What is wrong with it?
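For reference, the free-space figures above can be compared against the disk watermarks with a small Python sketch. It assumes the default watermarks (low 85%, high 90%, flood stage 95% of disk used) have not been overridden, and the helper names are hypothetical. `shutil.disk_usage` only approximates what the node measures, so treat the result as a sanity check rather than the node's own view.

```python
import shutil

# Default disk watermarks, as percent of disk space *used*.
# Assumption: the cluster has not overridden these defaults.
DEFAULT_WATERMARKS = {"low": 85.0, "high": 90.0, "flood_stage": 95.0}


def used_percent(total_bytes, free_bytes):
    """Convert total/free byte counts into percent of disk used."""
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def exceeded_watermarks(used_pct, watermarks=DEFAULT_WATERMARKS):
    """Return the names of all watermarks that the usage is above."""
    return [name for name, limit in watermarks.items() if used_pct > limit]


def check_path(path):
    """Check the disk that holds `path` (e.g. the OpenSearch data folder)."""
    usage = shutil.disk_usage(path)
    return exceeded_watermarks(used_percent(usage.total, usage.free))
```

Under these assumptions, 16%, 25%, and 75% free correspond to 84%, 75%, and 25% used, all below the 95% flood stage, which is consistent with the block not being explained by the current disk usage alone.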
Here are the complete logs:
https://gist.github.com/DumboJetEngine/8cdeaf9159e9af14831a352b1f5a3128
This is the second time I get this problem.
I start a new installation from scratch, reindex my documents, and the next day that I start OpenSearch I get this issue.
There is also this error, which apparently is related to the security plugin:
...according to this.
Related component
Other
To Reproduce
I guess, indexing 2.5 million documents with OpenSearch 2.9.0.
Expected behavior
OpenSearch starts up.
Additional Details
Plugins
[2024-07-17T13:57:57,386][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-alerting]
[2024-07-17T13:57:57,386][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-anomaly-detection]
[2024-07-17T13:57:57,386][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-asynchronous-search]
[2024-07-17T13:57:57,386][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-cross-cluster-replication]
[2024-07-17T13:57:57,388][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-geospatial]
[2024-07-17T13:57:57,388][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-index-management]
[2024-07-17T13:57:57,388][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-job-scheduler]
[2024-07-17T13:57:57,389][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-knn]
[2024-07-17T13:57:57,389][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-ml]
[2024-07-17T13:57:57,389][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-neural-search]
[2024-07-17T13:57:57,389][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-notifications]
[2024-07-17T13:57:57,390][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-notifications-core]
[2024-07-17T13:57:57,390][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-observability]
[2024-07-17T13:57:57,390][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-reports-scheduler]
[2024-07-17T13:57:57,390][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-security]
[2024-07-17T13:57:57,391][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-security-analytics]
[2024-07-17T13:57:57,391][INFO ][o.o.p.PluginsService ] [PC123] loaded plugin [opensearch-sql]
Host/Environment (please complete the following information):