[BUG] Index already exists error is raised if monitors are created simultaneously #646

stevanbz · 2022-11-04T21:15:15Z

What is the bug?

When the cluster has just started and the system indices have not been created yet, sending two index monitor requests within a short time frame causes a ResourceAlreadyExistsException with the message "index [.opendistro-alerting-config/Zxb0U9ONTBKoxWKcKQ0C4g] already exists". This is a consequence of how index creation works. To resolve the issue, when the ResourceAlreadyExistsException is raised with that message, the program flow should continue its execution, i.e. something like what we're doing here:

https://github.com/opensearch-project/alerting/blob/main/alerting/src/main/kotlin/org/opensearch/alerting/util/DocLevelMonitorQueries.kt#L41-L61

Code part to be refactored:
https://github.com/opensearch-project/alerting/blob/main/alerting/src/main/kotlin/org/opensearch/alerting/transport/TransportIndexMonitorAction.kt#L297-L306

How can one reproduce the bug?
Steps to reproduce the behavior:
Create an integration test that creates two or more monitors simultaneously within a short time frame, while the .opendistro-alerting-config index is not initialized yet.

Trigger the integration test
Create two monitors for the same detector
Call create detector
The test fails with the "index [.opendistro-alerting-config/Zxb0U9ONTBKoxWKcKQ0C4g] already exists" message
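
The race behind these steps can be illustrated without a cluster. What follows is only a minimal sketch in Python, not the plugin's code: `FakeCluster`, `ResourceAlreadyExistsError`, and `run_race` are hypothetical stand-ins, and a barrier is used to force every request to finish its existence check before any of them creates the index.

```python
import threading


class ResourceAlreadyExistsError(Exception):
    """Stand-in for OpenSearch's ResourceAlreadyExistsException."""


class FakeCluster:
    """Hypothetical in-memory cluster; it only tracks which indices exist."""

    def __init__(self):
        self._indices = set()
        self._lock = threading.Lock()

    def index_exists(self, name):
        with self._lock:
            return name in self._indices

    def create_index(self, name):
        with self._lock:
            if name in self._indices:
                raise ResourceAlreadyExistsError(f"index [{name}] already exists")
            self._indices.add(name)


CONFIG_INDEX = ".opendistro-alerting-config"


def run_race(num_requests=2):
    """Run simultaneous 'index monitor' requests and collect their errors."""
    cluster = FakeCluster()
    # Every request completes its existence check before any request
    # creates the index, which makes the race deterministic.
    checked = threading.Barrier(num_requests)
    errors = []

    def index_monitor():
        exists = cluster.index_exists(CONFIG_INDEX)
        checked.wait()
        if not exists:
            try:
                cluster.create_index(CONFIG_INDEX)
            except ResourceAlreadyExistsError as exc:
                # In the real request this would surface as a failed call.
                errors.append(str(exc))

    threads = [threading.Thread(target=index_monitor) for _ in range(num_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


print(run_race())
```

With two requests, both see the index as missing, one creation succeeds, and the other fails with the "already exists" message, which matches the failure described in the steps.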

What is the expected behavior?
Two monitors should be created. When the ResourceAlreadyExistsException is thrown with the message above, the program flow should continue its execution.

What is your host/environment?

OS: [e.g. iOS]
Version [e.g. 22]
Plugins

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.

Do you have any additional context?
Add any other context about the problem.

lezzago · 2022-11-07T17:27:28Z

This can potentially happen with all the create index operations. Current possible solutions:

Create the index on the leader node (possible issues if create monitor requests happen on non-leader nodes)
Wrap the create index call and swallow the exception if it is an index already exists exception
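
The second option could look roughly like the sketch below. It is written in Python purely for illustration; `create_index_if_absent`, `fake_create_index`, and `ResourceAlreadyExistsError` are hypothetical names and not part of the alerting plugin or the OpenSearch client.

```python
class ResourceAlreadyExistsError(Exception):
    """Stand-in for OpenSearch's ResourceAlreadyExistsException."""


def create_index_if_absent(create_index, name):
    """Call create_index(name) and treat "already exists" as success.

    Returns True if this call created the index and False if another
    request had already created it. Any other error still propagates.
    """
    try:
        create_index(name)
        return True
    except ResourceAlreadyExistsError:
        return False


# Hypothetical in-memory backend, used only to exercise the wrapper.
existing = set()


def fake_create_index(name):
    if name in existing:
        raise ResourceAlreadyExistsError(f"index [{name}] already exists")
    existing.add(name)


print(create_index_if_absent(fake_create_index, ".opendistro-alerting-config"))  # True
print(create_index_if_absent(fake_create_index, ".opendistro-alerting-config"))  # False
```

Only the specific "already exists" error is swallowed, so unrelated failures during index creation are still reported to the caller.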

stevanbz · 2022-11-07T17:32:09Z

You are definitely right.

The second thing I discovered is that even with the check that swallows the exception in place, we still get the all shards failed exception when we search the above-mentioned index. That's why, in a PR, I added a yellow status check for the case where the ResourceAlreadyExistsException is raised. But this doesn't completely solve the issue.

Let me try to describe the issue:
Imagine that we have 3 simultaneous index monitor requests.

The first request initiates the creation of .opendistro-alerting-config
The second request runs into the section that handles the ResourceAlreadyExistsException (which waits for the index to be created)
The third request checks, sees that the index exists, and continues execution without entering the part that waits until the index status is yellow, so it throws the all shards failed exception
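
The third request's path can be sketched with a toy model, assuming an index is visible as soon as creation starts but is only searchable once it reaches yellow status. Everything here (`FakeIndex`, `wait_for_yellow`, `search_config`, the tick-based clock) is hypothetical and only illustrates why the wait would be needed on every path, not just in the exception handler.

```python
class AllShardsFailedError(Exception):
    """Stand-in for the "all shards failed" search error."""


class FakeIndex:
    """Hypothetical index: exists once creation starts, turns yellow later."""

    def __init__(self, ticks_until_yellow=3):
        self.exists = False
        self._remaining = ticks_until_yellow

    def begin_create(self):
        self.exists = True

    def tick(self):
        """Advance simulated time by one step."""
        if self.exists and self._remaining > 0:
            self._remaining -= 1

    @property
    def status(self):
        if not self.exists:
            return None
        return "yellow" if self._remaining == 0 else "red"

    def search(self):
        if self.status != "yellow":
            raise AllShardsFailedError("all shards failed")
        return []


def wait_for_yellow(index, max_ticks=10):
    """Poll until the index is yellow or the tick budget is used up."""
    for _ in range(max_ticks):
        if index.status == "yellow":
            return True
        index.tick()
    return index.status == "yellow"


def search_config(index, wait):
    """Path taken by a request that finds the index already present."""
    if wait and not wait_for_yellow(index):
        raise TimeoutError("index did not reach yellow status")
    return index.search()


index = FakeIndex()
index.begin_create()  # the first request has started creating the index
try:
    search_config(index, wait=False)  # third request, no health wait
except AllShardsFailedError as exc:
    print("without wait:", exc)
print("with wait:", search_config(index, wait=True))
```

In this model the request that skips the wait fails, while the same request succeeds once it waits for yellow status regardless of whether it hit the exception handler.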

stevanbz added the bug and untriaged labels Nov 4, 2022

stevanbz mentioned this issue Nov 5, 2022

Added exception check once the .opendistro-alerting-config index is b… #650

Merged

praveensameneni removed the untriaged label Nov 18, 2022

praveensameneni closed this as completed Mar 22, 2024
