Add datacenter to existing cluster [K8SSAND-615][K8SSAND-784] #262
Conversation
I think we need to change when we reconcile the Stargate auth and Reaper schemas. The schemas should be reconciled after each CassandraDatacenter is reconciled. Here is how things are currently done in K8ssandraReconciler:

```go
var actualDcs []*cassdcapi.CassandraDatacenter
if recResult, dcs := r.reconcileDatacenters(ctx, kc, kcLogger); recResult.Completed() {
	return recResult.Output()
} else {
	actualDcs = dcs
}
kcLogger.Info("All dcs reconciled")

if recResult := r.reconcileStargateAuthSchema(ctx, kc, actualDcs, kcLogger); recResult.Completed() {
	return recResult.Output()
}

if recResult := r.reconcileReaperSchema(ctx, kc, actualDcs, kcLogger); recResult.Completed() {
	return recResult.Output()
}

if recResult := r.reconcileStargateAndReaper(ctx, kc, actualDcs, kcLogger); recResult.Completed() {
	return recResult.Output()
}
```

We first reconcile all CassandraDatacenters, requeueing until all DCs are ready. Then we reconcile the Stargate auth and Reaper schemas, creating those keyspaces if necessary. Lastly, we reconcile Stargate and Reaper across all DCs. In other words, the schemas are not touched until every DC is ready. There are a number of changes to the per-DC reconciliation loop:

```go
for idx, dcTemplate := range kc.Spec.Cassandra.Datacenters {
	...
	if err = remoteClient.Get(ctx, dcKey, actualDc); err == nil {
		...
		logger.Info("The datacenter is ready")
		actualDcs = append(actualDcs, actualDc)
		if recResult := r.updateReplicationOfSystemKeyspaces(ctx, kc, desiredDc, remoteClient, logger); recResult.Completed() {
			return recResult, actualDcs
		}
		if rebuildNeeded {
			// TODO We need to handle the Stargate auth and Reaper keyspaces here.
			if recResult := r.updateUserKeyspacesReplication(ctx, kc, desiredDc, remoteClient, logger); recResult.Completed() {
				return recResult, actualDcs
			}
			if recResult := reconcileDcRebuild(ctx, actualDc, actualDcs, remoteClient, logger); recResult.Completed() {
				return recResult, actualDcs
			}
		}
	} else {
		...
	}
}
```

After the current DC is ready, we update the replication of the system keyspaces and, if a rebuild is needed, update the replication of user keyspaces before performing the rebuild. Note the TODO about handling the Stargate auth and Reaper keyspaces. If Stargate and Reaper are already installed, we need to update the replication for their keyspaces before doing the rebuild so that those keyspaces are included in the streaming. It is easy enough to add the necessary calls where the TODO is.

Now let's say we have an existing K8ssandraCluster that does not have Stargate and Reaper enabled. Suppose we update the K8ssandraCluster by adding a second DC and enabling Stargate and Reaper in each DC. With the code as is, the keyspaces won't get created until after the rebuild of dc2. I think we are ok in this case.

As a last example, consider removing a DC from the cluster. One of the first steps we perform in this process is to stop replicating to that DC. We can't do that if we reconcile the Stargate and Reaper keyspaces only after reconciling all DCs.

Based on my findings, I am going to move the calls to reconcile the Stargate and Reaper keyspaces to happen after each DC is ready. @adutra any thoughts/concerns?
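A minimal sketch of the proposed ordering, assuming the same helpers shown above are callable from inside the per-DC loop (the exact signatures and return handling here are assumptions, not the final implementation):

```go
// Sketch only: as soon as the current DC is ready, reconcile the Stargate
// auth and Reaper keyspaces so that any replication updates land before a
// rebuild streams data to the new DC.
if recResult := r.reconcileStargateAuthSchema(ctx, kc, actualDcs, logger); recResult.Completed() {
	return recResult, actualDcs
}
if recResult := r.reconcileReaperSchema(ctx, kc, actualDcs, logger); recResult.Completed() {
	return recResult, actualDcs
}
if rebuildNeeded {
	// ... update user keyspace replication and run the rebuild as before ...
}
```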
…ate tests. See k8ssandra#262 (comment) for a detailed explanation on why the changes are necessary.
@jsanda sorry for the late reply. The suggested changes make sense and look good to me. I suppose many tests will fail because of the order in which they wait for components to show up, but other than that, I don't see any red flags.
Summary:

* Add k8ssandra.io/rebuild to CassandraDatacenter when rebuild required
* Use Initialized condition to check if dc is being added to existing cluster
* Add RBAC annotations for CassandraTasks
* Add integration tests for adding dc to existing cluster
* Add new set of subtests that use an existing cluster as test fixture
* Create CassandraTask for rebuild job
* Update logic for computing replication factor
* Add support for working with arbitrary number of kind clusters
* Update replication of system keyspaces
* Update replication of user keyspaces
* Use k8ssandra.io/dc-replication annotation
* Update replication of stargate auth and reaper keyspaces

Details:

In Cassandra 4 you cannot declare a non-existent dc in the replication strategy. If we are creating a K8ssandraCluster with 2 DCs, dc1 and dc2, for example, we can only declare replicas for dc1 initially. Only after dc2 is added to the C* cluster can we specify replicas for it. The cassandra.system_distributed_replication_dc_names and cassandra.system_distributed_replication_per_dc Java system properties are kind of a backdoor via the management-api that do allow us to specify non-existent DCs for system keyspaces, but only on the initial cluster creation.

The GetDatacentersForReplication function is used for system, stargate, reaper, and user keyspaces to determine which DCs should be included for the replication. If the cluster is already initialized, then only the DCs that are already part of the cluster are included.

When adding a new dc, replication for user keyspaces is specified via the k8ssandra.io/dc-replication annotation. If it is not specified, no replication changes are made for user keyspaces. If it is specified, all user keyspaces must be specified; if you don't want to replicate a particular keyspace, then specify a value of zero.

Reconcile Stargate auth and Reaper keyspaces after reconciling each dc. This change is needed to handle rebuild and decommission scenarios. See k8ssandra#262 (comment) for a detailed explanation on why the changes are necessary.
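For illustration, the annotation might look like this on the K8ssandraCluster; the keyspace names and the exact JSON value format are assumptions for this sketch, not documented syntax:

```yaml
metadata:
  annotations:
    # Hypothetical example: when dc2 is added, replicate user keyspace ks1
    # to it with RF 3 and explicitly exclude ks2 with a value of zero.
    k8ssandra.io/dc-replication: '{"dc2": {"ks1": 3, "ks2": 0}}'
```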
```go
createSuperuserSecret(ctx, t, f, kcKey, kc.Spec.Cassandra.Cluster)

createReplicatedSecret(ctx, t, f, kcKey, "cluster-0")
```
When adding a new cluster, we don't actually have any replication rule updates. I don't see here whether we actually verify that any secrets are even copied to the new clusters. It's not strictly necessary per datacenter, of course, since a datacenter could be replicated to an existing cluster in another namespace, but I wonder if these kinds of issues came to mind (at least in documentation).
`verifyReplicatedSecretReconciled` is called in each test. It verifies that the `ReplicationDone` condition gets set.
```go
		}
	}
} else {
	datacenters = kc.Spec.Cassandra.Datacenters
```
I'm slightly confused here: why do we want only the Ready DCs when Initialized is true, but everything when Initialized is false? It feels unnatural that this path should ever be taken instead of just returning nothing.
Or is this path taken when we initialize all the datacenters the first time K8ssandraCluster is created?
> Or is this path taken when we initialize all the datacenters the first time K8ssandraCluster is created?
Short answer: yes.
Long answer...
We configure replication of system keyspaces by setting a couple of system properties:

* `cassandra.system_distributed_replication_dc_names`
* `cassandra.system_distributed_replication_per_dc`

management-api looks for these properties and applies changes to the system_auth keyspace using internal Cassandra APIs, not CQL. If we are creating a cluster with 2 dcs, dc1 and dc2, both will be configured for system_auth. Note that what I am describing is already the current behavior in main.
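Concretely, for the two-DC example this means passing JVM flags along these lines at initial cluster creation (the replication factor of 3 is illustrative):

```
-Dcassandra.system_distributed_replication_dc_names=dc1,dc2
-Dcassandra.system_distributed_replication_per_dc=3
```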
In 4.0, Cassandra won't let you specify replication for a datacenter that does not yet exist (at least through CQL). I had to introduce changes in this PR for configuring the replication of the Stargate auth and Reaper keyspaces, which are configured via CQL calls. So going back to the 2 DCs example: when dc1 becomes ready, we can create the keyspaces and specify replication for dc1 but not for dc2; Cassandra won't allow that. This resulted in different handling of system keyspaces vs other keyspaces.
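For example (the keyspace name and replication factors here are illustrative, not the operator's actual values):

```sql
-- While only dc1 exists, dc2 may not appear in the replication map:
CREATE KEYSPACE IF NOT EXISTS stargate_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

-- Only after dc2 has joined the cluster (and before its rebuild):
ALTER KEYSPACE stargate_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};
```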
Once I added support for handling non-system keyspaces, I attempted (in vain) to handle system keyspaces the same way, to make the code less confusing and more maintainable. The problem is that there are queries with `LOCAL_ONE` consistency executed in dc2 that will fail with an auth error, because the replica(s) in dc2 need a repair.

In summary, until we implement support for running repairs, at least of system_auth, we have to handle system keyspaces the way we currently do. That means 1) for system keyspaces we need to specify replication for all DCs up front, and 2) for non-system keyspaces we cannot specify the replication for all DCs up front.
I will rename the method and add some docs to make it more clear that it is only intended for system keyspaces.
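A hedged sketch of the selection rule that GetDatacentersForReplication implements per the commit message (the types and field names below are stand-ins, not the operator's actual API):

```go
package replication

// dc is a pared-down stand-in for the operator's CassandraDatacenter type.
type dc struct {
	Name  string
	Ready bool
}

// datacentersForReplication sketches the rule: once the cluster is
// initialized, only DCs already part of the cluster may appear in a
// replication map; on initial creation all declared DCs are used
// (system keyspaces are seeded up front via the system properties above).
func datacentersForReplication(initialized bool, dcs []dc) []string {
	names := make([]string, 0, len(dcs))
	for _, d := range dcs {
		if !initialized || d.Ready {
			names = append(names, d.Name)
		}
	}
	return names
}
```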
Second round of remarks :-)
Approving this as the e2e failures seem unrelated. A few minor questions are still in flight but can be revisited later. It would be good to have this merged ASAP.
Previously we only set the default superuser secret name in memory and did not persist it. The version check patches that I implemented caused a problem with that: the setting is lost after the first patch is applied. It makes more sense to just persist the default.
To date we have relied on setting a couple of system properties to configure the replication of system keyspaces. As I added support for managing replication of the Reaper and Stargate auth keyspaces, I attempted to consolidate how they are managed, since they basically need to be managed in the same way. I had to undo those changes, though, until we implement support for running repairs after replication changes. If we configure the replication for system_auth with client CQL calls from the operator, instead of through the back door in the management-api, then we need to run a repair on system_auth when a second dc is added to the cluster; otherwise, queries against nodes in the second dc will fail. This applies both when we are deploying a new cluster and when adding a dc to an existing cluster. This commit also updates the version of cass-operator now that the CassandraTask API has landed in cass-operator's master branch.
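For reference, the repair in question is the kind that could be run manually once the new dc's replicas exist; an illustrative invocation (not something this commit automates):

```
nodetool repair --full system_auth
```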
Which issue(s) this PR fixes:
Fixes #21