debug add-dc [DO NOT MERGE] #324
Closed
Summary:
* Add k8ssandra.io/rebuild to CassandraDatacenter when rebuild required
* Use Initialized condition to check if dc is being added to existing cluster
* Add RBAC annotations for CassandraTasks
* Add integration tests for adding dc to existing cluster
* Add new set of subtests that use an existing cluster as test fixture
* Create CassandraTask for rebuild job
* Update logic for computing replication factor
* Add support for working with arbitrary number of kind clusters
* Update replication of system keyspaces
* Update replication of user keyspaces
* Use k8ssandra.io/dc-replication annotation
* Update replication of stargate auth and reaper keyspaces

Details:
In Cassandra 4 you cannot declare a non-existent dc in the replication strategy. If we are creating a K8ssandraCluster with 2 DCs, dc1 and dc2, for example, we can only declare replicas for dc1 initially. Only after dc2 is added to the C* cluster can we specify replicas for it.

The cassandra.system_distributed_replication_dc_names and cassandra.system_distributed_replication_per_dc Java system properties are a kind of backdoor via the management-api that does allow us to specify non-existent DCs for system keyspaces, but only on the initial cluster creation.

The GetDatacentersForReplication function is used for system, stargate, reaper, and user keyspaces to determine which DCs should be included for the replication. If the cluster is already initialized, then only the DCs that are already part of the cluster are included.

When adding a new dc, replication for user keyspaces is specified via the k8ssandra.io/dc-replication annotation. If the annotation is not specified, no replication changes are made for user keyspaces. If it is specified, all user keyspaces must be listed. If you don't want to replicate a particular keyspace, specify a value of zero for it.

Reconcile Stargate auth and Reaper keyspaces after reconciling each dc. This change is needed to handle rebuild and decommission scenarios.
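The datacenter selection described for GetDatacentersForReplication can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the Datacenter type, its Initialized field, and the datacentersForReplication helper are hypothetical stand-ins, and the branch that includes every DC for a cluster that is not yet initialized is an assumption drawn from the note about the system-property backdoor.

```go
package main

import "fmt"

// Datacenter is a hypothetical, simplified stand-in for a DC entry
// in a K8ssandraCluster; it is not the operator's real type.
type Datacenter struct {
	Name        string
	Initialized bool // assumption: true once the DC has joined the C* cluster
}

// datacentersForReplication returns the DC names that may be declared in a
// replication strategy.
//   - Cluster already initialized: only DCs that are already part of the
//     cluster are included, since Cassandra 4 rejects unknown DCs.
//   - Cluster not yet initialized (assumed behavior): all DCs are included.
func datacentersForReplication(clusterInitialized bool, dcs []Datacenter) []string {
	names := make([]string, 0, len(dcs))
	for _, dc := range dcs {
		if !clusterInitialized || dc.Initialized {
			names = append(names, dc.Name)
		}
	}
	return names
}

func main() {
	dcs := []Datacenter{
		{Name: "dc1", Initialized: true},
		{Name: "dc2"}, // being added, not yet part of the cluster
	}
	fmt.Println(datacentersForReplication(false, dcs)) // initial cluster creation
	fmt.Println(datacentersForReplication(true, dcs))  // adding dc2 to an existing cluster
}
```

With an initialized cluster the sketch returns only dc1, so dc2 would get replicas only after it has actually joined.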
See k8ssandra#262 (comment) for a detailed explanation on why the changes are necessary.
Previously we only set the default superuser secret name in memory and did not persist it. The version check patches that I implemented caused a problem with that. The setting is lost after the first patch is applied. It makes more sense to just persist the default.
To date we have relied on setting a couple of system properties to configure the replication of system keyspaces. As I added support for managing replication of the reaper and stargate auth keyspaces, I attempted to consolidate how they are managed, since they basically need to be managed in the same way. I had to undo those changes, though, until we implement support for running repairs after replication changes.

If we configure the replication for system_auth with client CQL calls from the operator instead of the backdoor in the management-api, then we need to run a repair on system_auth when a second dc is added to the cluster; otherwise, any queries against nodes in the second dc will fail. This applies when we are deploying a new cluster as well as when adding a dc to an existing cluster.

This commit also updates the version of cass-operator now that the CassandraTask API has landed in master in cass-operator.
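For illustration, here is a sketch of the kind of CQL statement a client-side replication change for system_auth would involve. The alterKeyspaceStatement helper is hypothetical and is not how the operator builds its statements. It assumes NetworkTopologyStrategy and does no identifier quoting or escaping.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// alterKeyspaceStatement builds an ALTER KEYSPACE statement that sets
// per-DC replication factors using NetworkTopologyStrategy.
// Hypothetical helper for illustration only; names are not escaped.
func alterKeyspaceStatement(keyspace string, replication map[string]int) string {
	// Sort DC names so the generated statement is deterministic.
	dcs := make([]string, 0, len(replication))
	for dc := range replication {
		dcs = append(dcs, dc)
	}
	sort.Strings(dcs)

	parts := []string{"'class': 'NetworkTopologyStrategy'"}
	for _, dc := range dcs {
		parts = append(parts, fmt.Sprintf("'%s': %d", dc, replication[dc]))
	}
	return fmt.Sprintf("ALTER KEYSPACE %s WITH replication = {%s}",
		keyspace, strings.Join(parts, ", "))
}

func main() {
	stmt := alterKeyspaceStatement("system_auth", map[string]int{"dc1": 3, "dc2": 3})
	fmt.Println(stmt)
}
```

Changing replication this way only updates the keyspace metadata; existing data is not streamed to dc2, which is why the repair mentioned above would be needed before queries against the second dc can succeed.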
A debug PR for a test failure in #262 that I cannot reproduce locally.