[BUG] Index already exists error is raised if monitors are created simultaneously #646

Closed
stevanbz opened this issue Nov 4, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@stevanbz
Contributor

stevanbz commented Nov 4, 2022

What is the bug?

When the cluster has just started and the system indices have not been created yet, sending two index monitor requests in quick succession causes a ResourceAlreadyExistsException with the message "index [.opendistro-alerting-config/Zxb0U9ONTBKoxWKcKQ0C4g] already exists". This is a consequence of how index creation works: both requests can see that the index is missing and both try to create it, so the slower one fails. To resolve the issue, if the ResourceAlreadyExistsException is raised with the message listed above, the program flow should continue its execution, i.e. something like what we're doing here:

https://github.com/opensearch-project/alerting/blob/main/alerting/src/main/kotlin/org/opensearch/alerting/util/DocLevelMonitorQueries.kt#L41-L61

Code part to be refactored:
https://github.com/opensearch-project/alerting/blob/main/alerting/src/main/kotlin/org/opensearch/alerting/transport/TransportIndexMonitorAction.kt#L297-L306
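
For illustration, here is a minimal sketch of the "create, and tolerate already-exists" pattern described above. It runs against an in-memory stand-in rather than the real OpenSearch client: `FakeCluster`, `IndexAlreadyExistsException`, and `ensureIndex` are hypothetical names introduced for this example, and the actual change in TransportIndexMonitorAction.kt may well look different.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CreateIndexIfAbsent {
    /** Hypothetical stand-in for OpenSearch's ResourceAlreadyExistsException. */
    static class IndexAlreadyExistsException extends RuntimeException {
        IndexAlreadyExistsException(String index) {
            super("index [" + index + "] already exists");
        }
    }

    /** Hypothetical in-memory stand-in for the cluster's create-index call. */
    static class FakeCluster {
        private final Set<String> indices = ConcurrentHashMap.newKeySet();

        void createIndex(String name) {
            // add() returns false when the name is already present
            if (!indices.add(name)) {
                throw new IndexAlreadyExistsException(name);
            }
        }

        boolean exists(String name) {
            return indices.contains(name);
        }
    }

    /**
     * Tries to create the index and treats "already exists" as success.
     * Returns true if this call created the index, false if another
     * request created it first.
     */
    static boolean ensureIndex(FakeCluster cluster, String name) {
        try {
            cluster.createIndex(name);
            return true;
        } catch (IndexAlreadyExistsException e) {
            // Another request won the race; the index is there, so carry on.
            return false;
        }
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        String index = ".opendistro-alerting-config";
        System.out.println(ensureIndex(cluster, index)); // first request creates the index
        System.out.println(ensureIndex(cluster, index)); // second request continues without failing
    }
}
```

In either case the caller proceeds to index the monitor; only exceptions other than "already exists" should fail the request.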

How can one reproduce the bug?
Steps to reproduce the behavior:
Create an integration test that creates two or more monitors in quick succession while the .opendistro-alerting-config index has not been initialized yet.

  1. Trigger the integration test
  2. Create two monitors for the same detector
  3. Call create detector
  4. Test fails with the "index [.opendistro-alerting-config/Zxb0U9ONTBKoxWKcKQ0C4g] already exists" message

What is the expected behavior?
Two monitors should be created. When the ResourceAlreadyExistsException is thrown with the message above, execution should continue instead of failing the request.

@lezzago
Member

lezzago commented Nov 7, 2022

This can potentially happen with any of the create index operations. Possible solutions at the moment:

  • Create the index on the leader node (possible issues if create monitor requests arrive on non-leader nodes)
  • Wrap the create index call and swallow the exception if it is an "index already exists" exception

@stevanbz
Contributor Author

stevanbz commented Nov 7, 2022

You are definitely right.

The second thing I discovered is that even with the check and the swallowed exception in place, we get an "all shards failed" exception if we search the above-mentioned index. That's why, in a PR, I added a yellow-status check for the case where the ResourceAlreadyExistsException is raised. But this doesn't completely solve the issue.

Let me try to describe the issue.
Imagine that we have 3 simultaneous index monitor requests:

  1. The first request starts the creation of .opendistro-alerting-config
  2. The second request runs into the section that handles the ResourceAlreadyExistsException (which waits for the index to be created)
  3. The third request checks, sees that the index exists, and continues executing without entering the part that waits until the index status is yellow, so it throws an "all shards failed" exception
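
One possible way to close the gap in step 3, sketched under stated assumptions: wait for the index to become ready on every path (creator, loser of the creation race, and the request that simply saw the index exists) before searching it. The code uses an in-memory stand-in, not the OpenSearch API; `FakeCluster`, `awaitYellow`, `markShardsActive`, and `indexMonitor` are hypothetical names, and a latch stands in for the cluster reaching yellow status.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class AwaitIndexReady {
    /** Hypothetical stand-in for OpenSearch's ResourceAlreadyExistsException. */
    static class IndexAlreadyExistsException extends RuntimeException {
        IndexAlreadyExistsException(String index) {
            super("index [" + index + "] already exists");
        }
    }

    /** Hypothetical cluster: an index exists before its shards are searchable. */
    static class FakeCluster {
        // index name -> latch released once primary shards are active ("yellow")
        private final Map<String, CountDownLatch> indices = new ConcurrentHashMap<>();

        void createIndex(String name) {
            if (indices.putIfAbsent(name, new CountDownLatch(1)) != null) {
                throw new IndexAlreadyExistsException(name);
            }
        }

        boolean exists(String name) {
            return indices.containsKey(name);
        }

        void markShardsActive(String name) {
            CountDownLatch latch = indices.get(name);
            if (latch != null) latch.countDown();
        }

        /** Returns true once the index is searchable, false on timeout or interrupt. */
        boolean awaitYellow(String name, long timeoutMs) {
            CountDownLatch latch = indices.get(name);
            if (latch == null) return false;
            try {
                return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        String search(String name) {
            CountDownLatch latch = indices.get(name);
            if (latch == null || latch.getCount() > 0) {
                throw new IllegalStateException("all shards failed");
            }
            return "ok";
        }
    }

    /** Every path waits for readiness, including the "index already exists" path. */
    static String indexMonitor(FakeCluster cluster, String index, long timeoutMs) {
        if (!cluster.exists(index)) {
            try {
                cluster.createIndex(index);
            } catch (IndexAlreadyExistsException e) {
                // Lost the creation race; fall through to the readiness wait.
            }
        }
        if (!cluster.awaitYellow(index, timeoutMs)) {
            throw new IllegalStateException("timed out waiting for " + index);
        }
        return cluster.search(index);
    }

    public static void main(String[] args) throws Exception {
        FakeCluster cluster = new FakeCluster();
        String index = ".opendistro-alerting-config";
        cluster.createIndex(index); // request 1 has started creation; shards are not active yet
        Thread late = new Thread(() -> System.out.println(indexMonitor(cluster, index, 5_000)));
        late.start();               // request 3 sees that the index exists, but still waits
        cluster.markShardsActive(index);
        late.join();
    }
}
```

The trade-off is an extra readiness check on requests that arrive long after the index was created; in this sketch that check returns immediately once the shards are active.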
