-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
/_cat/indices: race condition/exception thrown when creating/deleting indices rapidly #17395
Comments
@joshuar thank you for logging this! (via @faeldt) To add, sometimes the Since this is a machine parseable output, it would be delightful for it to suppress any error and just not return JSON (or any errors for that matter, unless you add |
@kwilczynski if you happen to come across this NPE again, please paste the stack trace from the logs into this issue |
@clintongormley I will try to fish something on the Elasticsearch side, as since I fixed this on my side (to retry on error, and I don't log in debug in production, I lost it for now, sadly). |
@clintongormley I found the following NPE going through the older logs:
|
thanks @kwilczynski |
Note I also am receiving similar with
Withe following stacktrace from the node:
|
Closed indices are already displayed when no indices are explicitly selected. This commit ensures that closed indices are also shown when wildcard filtering is used. It also addresses another issue that is caused by the fact that the cat action is based internally on 3 different cluster states (one when we query the cluster state to get all indices, one when we query cluster health, and one when we query indices stats). We currently fail the cat request when the user specifies a concrete index as parameter that does not exist. The implementation works as intended in that regard. It checks this not only for the first cluster state request, but also the subsequent indices stats one. This means that if the index is deleted before the cat action has queried the indices stats, it rightfully fails. In case the user provides wildcards (or no parameter at all), however, we fail the indices stats as we pass the resolved concrete indices to the indices stats request and fail to distinguish whether these indices have been resolved by wildcards or explicitly requested by the user. This means that if an index has been deleted before the indices stats request gets to execute, we fail the overall cat request. The fix is to let the indices stats request do the resolving again and not pass the concrete indices. Closes #16419 Closes #17395
Closed indices are already displayed when no indices are explicitly selected. This commit ensures that closed indices are also shown when wildcard filtering is used. It also addresses another issue that is caused by the fact that the cat action is based internally on 3 different cluster states (one when we query the cluster state to get all indices, one when we query cluster health, and one when we query indices stats). We currently fail the cat request when the user specifies a concrete index as parameter that does not exist. The implementation works as intended in that regard. It checks this not only for the first cluster state request, but also the subsequent indices stats one. This means that if the index is deleted before the cat action has queried the indices stats, it rightfully fails. In case the user provides wildcards (or no parameter at all), however, we fail the indices stats as we pass the resolved concrete indices to the indices stats request and fail to distinguish whether these indices have been resolved by wildcards or explicitly requested by the user. This means that if an index has been deleted before the indices stats request gets to execute, we fail the overall cat request. The fix is to let the indices stats request do the resolving again and not pass the concrete indices. Closes #16419 Closes #17395
Closed indices are already displayed when no indices are explicitly selected. This commit ensures that closed indices are also shown when wildcard filtering is used. It also addresses another issue that is caused by the fact that the cat action is based internally on 3 different cluster states (one when we query the cluster state to get all indices, one when we query cluster health, and one when we query indices stats). We currently fail the cat request when the user specifies a concrete index as parameter that does not exist. The implementation works as intended in that regard. It checks this not only for the first cluster state request, but also the subsequent indices stats one. This means that if the index is deleted before the cat action has queried the indices stats, it rightfully fails. In case the user provides wildcards (or no parameter at all), however, we fail the indices stats as we pass the resolved concrete indices to the indices stats request and fail to distinguish whether these indices have been resolved by wildcards or explicitly requested by the user. This means that if an index has been deleted before the indices stats request gets to execute, we fail the overall cat request. The fix is to let the indices stats request do the resolving again and not pass the concrete indices. Closes #16419 Closes #17395
Elasticsearch Versions:
2.1.x - 2.2.x
JVM Version:
OpenJDK 64-Bit Server VM (build 25.77-b03, mixed mode)
Description of the problem including expected versus actual behavior:
It looks like if you create and delete indices very rapidly you can sometimes end up getting a 404 index_not_found_exception from the
/_cat/indices
API. So basically rather than seeing the indices that still exist, you get an exception because one index was created and deleted between when the endpoint resolves indices and then call indices stats api for those indices.Steps to reproduce:
Attached is a script that should hopefully reproduce the problem, but it can take some time (few minutes depending on the ability of your ES cluster to handle responses :)
cat-indices-issue.sh.zip
The script is Bash. It will run a number of background processes that simply spin creating and then deleting an index with the REST API. The script waits until the output of
_cat/indices
produces the exception, then exits. NOTE: It will clean-up any background processes it creates but won't clean up any test indices that might still exist.When the script dies, it will produce output like:
The text was updated successfully, but these errors were encountered: