[BUG - Segment Replication + Remote Store] - Support segment replication for system indices #8182
Comments
Supporting remote store for system indices involves creating the repository before we create the index. This means we also need to get repository details as part of cluster settings (today, we accept only repository name). I will create a separate issue for the same. |
cc: @psychbot |
Issue specific to remote store: #8188 |
Hi @mch2 and @sachinpkale, looking at this, it seems like we will want to engage from the security side of things. Currently, we restrict modification of the security system index. The concern is escalation of privileges and some other issues that can arise, such as desync between copies. Can you provide some more details on how you plan to interact with the system indices? Is there an issue outlining the flow you are working on? Tagging @peternied for some additional context on concerns associated with this. |
@scrawfor99 I don't think there is an issue with the security plugin itself - it's how plugins can interact with system indices that is breaking down with seg-rep enabled. @mch2 @psychbot @sachinpkale Please let the security team know if there seems to be a 'governance change' related to these indices. That doesn't seem to be the case, and I'd guess it would break backward compatibility. |
@scrawfor99 @peternied Thanks for taking a look here! In short, there will be no modification to the content of the system index, only to the replication strategy that is used behind the scenes to copy it out to replicas. Segment Replication is now configurable at a cluster level, so we would like any index, including system indices, to use segrep if enabled. However, there are perf implications here that need to be called out. SegRep will on average increase the amount of time that replica shards are inconsistent with their primaries, due to the time required to copy out segments (time dependent on cluster config/hardware etc). System indices are generally small and should copy out quickly, so I wouldn't expect this to be of major concern. Are there cases where stronger read/write consistency is desired on the system index? If so, plugins would need to prioritize primaries for searches. There is no strong r/w guarantee today with docrep, unless you use the _bulk API with the wait_until preference, which won't ack the request until each shard has refreshed on those documents. This option is not supported with segrep. Looking through security, I don't think this is used. |
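To illustrate what "prioritize primaries for searches" could look like at the request level, here is a minimal sketch that builds a search URL carrying a shard-routing preference. It assumes the cluster version in use accepts `_primary` as a value of the search `preference` parameter; the host and the index name `.my-system-index` are hypothetical placeholders, and `search_url` is an illustrative helper, not part of any OpenSearch client.

```python
from urllib.parse import quote, urlencode


def search_url(base_url, index, preference="_primary"):
    """Build a _search URL that asks for routing to primary shard copies.

    Assumption: the target cluster supports the given preference value.
    """
    query = urlencode({"preference": preference})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_search?{query}"


# Hypothetical host and index name, for illustration only.
print(search_url("http://localhost:9200", ".my-system-index"))
```

A plugin that only needs fresh reads for a few critical lookups could apply the preference to those requests alone and leave other searches free to hit replicas.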
@scrawfor99 What do you think about creating an issue to contemplate how we configure segment replication for the security index? Building up the architecture documentation of security index <-> in memory security configuration might be a good companion for that issue. |
Hi @peternied, will do! Thanks for taking a look at this. You have a better understanding of the overall picture involved with these types of changes so I dragged you in. I will make an issue over in the Security repo and link it back here. Update: Here is a link to the issue I created: opensearch-project/security#2898. |
@mch2 When there is a change to the security configuration, the security plugin will run a transport action on every node of the cluster to instruct it to read the entire security index and refresh its local cache. See: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/action/configupdate/TransportConfigUpdateAction.java#L126-L131 Tracing through the code this will ultimately do a Would there be any risk of the Edit: I see your comment above about preferences now:
Looks like the security codebase can take advantage of that. |
Thanks @cwperks, yes, either prefer primary or primary can be used. If security is using RefreshPolicy.IMMEDIATE, this will only start the refresh & replication process after a write, so it's possible there could be a stale read. This issue is currently blocked by #8193. Steps here.
|
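The stale-read window described above can be pictured with a toy in-memory model. This is only a sketch of the timing argument, not OpenSearch code: `ToySegRepIndex` and its methods are hypothetical names, and the model assumes a single primary and a single replica where an immediate refresh makes a write searchable on the primary at once, while the replica sees it only after segments are copied.

```python
class ToySegRepIndex:
    """Toy model: one primary and one replica under segment replication."""

    def __init__(self):
        self.primary = {}   # documents searchable on the primary
        self.replica = {}   # documents searchable on the replica
        self._pending = {}  # refreshed on primary, not yet copied out

    def write_with_immediate_refresh(self, doc_id, doc):
        # The write is visible on the primary right away; replication
        # to the replica has only been started, not completed.
        self.primary[doc_id] = doc
        self._pending[doc_id] = doc

    def replicate(self):
        # Segment copy finishes: the replica catches up with the primary.
        self.replica.update(self._pending)
        self._pending.clear()

    def read(self, doc_id, prefer_primary):
        shard = self.primary if prefer_primary else self.replica
        return shard.get(doc_id)


index = ToySegRepIndex()
index.write_with_immediate_refresh("config", "v2")
stale = index.read("config", prefer_primary=False)   # replica not caught up
fresh = index.read("config", prefer_primary=True)    # primary has the write
index.replicate()
caught_up = index.read("config", prefer_primary=False)
print(stale, fresh, caught_up)
```

In this model a reader that goes to the replica between the write and `replicate()` misses the update, which is why routing such reads to the primary closes the gap.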
Joining this conversation coming from Job Scheduler repo and I presume I have similar concerns/questions as @peternied and @cwperks discussed with security plugin. JS uses a system index for locking, which I do not think would be able to use the Is there any way to keep the current (2.8) situation as a configuration option (allow users to not use SEGREP for system indices) or are we forced to have these perf impacts if the user is using SEGREP at all? |
Thank you @mch2 . The security plugin is using RefreshPolicy.IMMEDIATE after a change to the security index. I have opened a PR on the security repo to add |
Closing this, since we are tracking the effort here: #8193 |
Describe the bug
Segment Replication is currently not supported on system indices. This decision was made for segment replication after tests failed with the CCR plugin (issues linked below).
Related issues:
#6602 (comment)
#3823
#8158
To Reproduce
Steps to reproduce the behavior:
cluster.indices.replication.strategy: SEGMENT
Expected behavior
Segment Replication should be a supported strategy for system indices.
Additional context
This limitation also prevents system indices from being backed up with remote storage.