-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenDistro Security Parallel Writes #206
Comments
Hello - sorry to hear this is causing issues, this is a consequence of #119, which intended to make the client instantiation lazy. I believe it could be cached, but we'd need to make sure that (a) the caching is concurrency safe (as you noticed in #186, this was a failed attempt to cache the client), and (b) it's cached dependent on the config settings so that if the config changes, the client will re-instantiate. |
Interesting, okay thank you for that information. It absolutely makes sense why it's deferred, and it makes sense that #186 was a concurrency issue caused by this. For some extra context, I've heard back again on the AWS case that I opened about the symptom we were seeing that was caused by this and they had a couple of extra thoughts. The fact that the error we are seeing is terraform-provider-elasticsearch/es/resource_elasticsearch_opendistro_kibana_tenant.go Line 218 in c28d2de
How do you feel about adding retry logic to terraform-provider-elasticsearch/es/resource_elasticsearch_opendistro_kibana_tenant.go Line 210 in c28d2de
|
Okay, I've updated the title of this issue based on another round of chats with AWS support. The OpenDistro Security module maintains a single index for storing the configurations of all tenants, users, roles and roles mappings, and concurrent writes to this index are disallowed in order to maintain idempotency. They offer a bulk write API, but simultaneous calls to the single-write endpoints are apparently error prone. I can't find this documented everywhere, but I think it's probably super helpful information to you @phillbaker when it comes to this provider and interacting with OpenDistro Security. |
Hi @timcosta thanks for the updates. It looks like you got to the answer, but this seems very similar to #154 which was confusing because upstream OpenDistro returns a 500 instead of the canonical 409 status (issue upstream about this: opensearch-project/security#1095). I'd recommend confirming that this is the issue by passing |
Hello! I've been debugging an issue I run into sometimes with the AWS managed ES where multiple concurrent operations result in 500 errors. I reproduced it with debug logging enabled, and saw the following in the output:
I've opened a support case with AWS about this as I don't think 8 concurrent pings should result in 500s, but I was a little surprised to see that the provider creates a new ES Client for every operation, which triggers a ping for every single operation.
I'm considering making a PR to refactor this to either cache the client or create it earlier in the process, but am wondering if this is a purposeful design choice for a reason I'm not aware of. Again, I don't think it's a problem with the provider implementation that's causing the 500s which is why I opened the case with AWS, but wanted to see if optimizing this a bit was desirable as it would speed up a lot of operations theoretically.
The text was updated successfully, but these errors were encountered: