Add Restart task to CassandraTasks #394

Merged · 8 commits · Sep 6, 2022
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,8 @@ Changelog for Cass Operator, new PRs should update the `main / unreleased` section
## unreleased

* [CHANGE] [#354](https://github.com/k8ssandra/cass-operator/issues/354) Remove oldDefunctLabel support since we recreate StS. Fix #335 created-by value to match expected value.
* [CHANGE] [#385](https://github.com/k8ssandra/cass-operator/issues/385) Deprecate CassandraDatacenter's RollingRestartRequested. Use CassandraTask instead.
* [ENHANCEMENT] [#385](https://github.com/k8ssandra/cass-operator/issues/385) Add rolling restart as a CassandraTask action.
* [CHANGE] [#397](https://github.com/k8ssandra/cass-operator/issues/397) Remove direct dependency to k8s.io/kubernetes
* [FEATURE] [#384](https://github.com/k8ssandra/cass-operator/issues/384) Add a new CassandraTask operation "replacenode" that removes the existing PVCs from the pod, deletes the pod and starts a replacement process.
* [FEATURE] [#387](https://github.com/k8ssandra/cass-operator/issues/387) Add a new CassandraTask operation "upgradesstables" that runs SSTable upgrades after a Cassandra version upgrade.
2 changes: 1 addition & 1 deletion apis/cassandra/v1beta1/cassandradatacenter_types.go
@@ -187,7 +187,7 @@ type CassandraDatacenterSpec struct {
// The k8s service account to use for the server pods
ServiceAccount string `json:"serviceAccount,omitempty"`

// Whether to do a rolling restart at the next opportunity. The operator will set this back
// DEPRECATED. Use CassandraTask for rolling restarts. Whether to do a rolling restart at the next opportunity. The operator will set this back
// to false once the restart is in progress.
RollingRestartRequested bool `json:"rollingRestartRequested,omitempty"`

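With rollingRestartRequested deprecated, the replacement flow is to create a CassandraTask that carries a single "restart" job. A minimal sketch, assuming the v1alpha1 spec references the target datacenter as a corev1.ObjectReference and that CassandraJob carries a Command field (both consistent with the controller code later in this diff); the helper name, task name, and namespace are illustrative:

package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	api "github.com/k8ssandra/cass-operator/apis/control/v1alpha1"
)

// requestRollingRestart replaces setting rollingRestartRequested: true on the
// CassandraDatacenter: it creates a CassandraTask with one "restart" job for
// the task controller to pick up.
func requestRollingRestart(ctx context.Context, c client.Client, dcName, namespace string) error {
	task := &api.CassandraTask{
		ObjectMeta: metav1.ObjectMeta{
			Name:      dcName + "-rolling-restart", // illustrative name
			Namespace: namespace,
		},
		Spec: api.CassandraTaskSpec{
			// Datacenter as an ObjectReference is an assumption; adjust to
			// however the spec actually points at the target datacenter.
			Datacenter: corev1.ObjectReference{Name: dcName, Namespace: namespace},
			Jobs: []api.CassandraJob{
				{Name: dcName + "-restart", Command: api.CommandRestart},
			},
		},
	}
	return c.Create(ctx, task)
}

Once the task finishes, its status carries a CompletionTime and a JobComplete condition, as the controller changes later in this PR show.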
10 changes: 10 additions & 0 deletions apis/control/v1alpha1/cassandratask_types.go
@@ -22,6 +22,10 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const (
RestartedAtAnnotation = "control.k8ssandra.io/restartedAt"
)

// CassandraTaskSpec defines the desired state of CassandraTask
type CassandraTaskSpec struct {

@@ -69,6 +73,12 @@ const (
CommandScrub CassandraCommand = "scrub"
)

const (
KeyspaceArgument string = "keyspace_name"
Member: How is this related to a restart?

RackArgument string = "rack"
SourceDatacenterArgument string = "source_datacenter"
)

type CassandraJob struct {
Name string `json:"name"`

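The new RestartedAtAnnotation points at the likely restart mechanism: as with kubectl rollout restart, stamping a fresh timestamp into a StatefulSet's pod template prompts the StatefulSet controller to replace the pods one at a time. A sketch of that trigger, assuming restartSts in this PR works roughly this way (the real implementation must also track rollout progress across reconciles):

package example

import (
	"context"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	api "github.com/k8ssandra/cass-operator/apis/control/v1alpha1"
)

// stampRestartedAt rewrites the restartedAt pod-template annotation. Any
// change to the pod template makes the StatefulSet controller roll the pods,
// honoring the update strategy (one pod at a time by default).
func stampRestartedAt(ctx context.Context, c client.Client, sts *appsv1.StatefulSet) error {
	if sts.Spec.Template.ObjectMeta.Annotations == nil {
		sts.Spec.Template.ObjectMeta.Annotations = make(map[string]string)
	}
	sts.Spec.Template.ObjectMeta.Annotations[api.RestartedAtAnnotation] = time.Now().UTC().Format(time.RFC3339)
	return c.Update(ctx, sts)
}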
@@ -7452,9 +7452,9 @@ spec:
type: object
type: object
rollingRestartRequested:
description: Whether to do a rolling restart at the next opportunity.
The operator will set this back to false once the restart is in
progress.
description: DEPRECATED. Use CassandraTask for rolling restarts. Whether
to do a rolling restart at the next opportunity. The operator will
set this back to false once the restart is in progress.
type: boolean
serverImage:
description: 'Cassandra server image name. Use of ImageConfig to match
2 changes: 1 addition & 1 deletion config/manager/kustomization.yaml
@@ -14,4 +14,4 @@ kind: Kustomization
images:
- name: controller
newName: k8ssandra/cass-operator
newTag: latest
newTag: v1.13.0-dev.0bb3086-20220819
27 changes: 24 additions & 3 deletions controllers/control/cassandratask_controller.go
@@ -24,6 +24,7 @@ import (
"strconv"
"time"

appsv1 "k8s.io/api/apps/v1"
batchv1 "k8s.io/api/batch/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -256,6 +257,7 @@ func (r *CassandraTaskReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {

var err error
var failed, completed int
JobDefinition:
for _, job := range cassTask.Spec.Jobs {
taskConfig := &TaskConfiguration{
RestartPolicy: cassTask.Spec.RestartPolicy,
@@ -273,7 +275,17 @@ func (r *CassandraTaskReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
case api.CommandCleanup:
cleanup(taskConfig)
case api.CommandRestart:
r.restart(taskConfig)
// This job is targeting StatefulSets and not Pods
sts, err := r.getDatacenterStatefulSets(ctx, dc)
if err != nil {
return ctrl.Result{}, err
}

res, err = r.restartSts(ctx, sts, taskConfig)
Contributor: The res value doesn't seem to be used at all. Maybe that's what's causing the completion time to be set prematurely?

Contributor Author: Isn't that res used on line 333? I don't see it being overwritten anywhere.

Contributor: Isn't it overwritten on line 320?

res, failed, completed, err = r.reconcileEveryPodTask(ctx, dc, taskConfig)

Contributor: Or is break JobDefinition sending us straight to line 333?

Contributor Author: break JobDefinition should break us out of the for clause, and thus line 320 should never run.

if err != nil {
return ctrl.Result{}, err
}
break JobDefinition
case api.CommandReplaceNode:
r.replace(taskConfig)
case "forceupgraderacks":
@@ -333,6 +345,7 @@ func (r *CassandraTaskReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {

cassTask.Status.Active = 0
cassTask.Status.CompletionTime = &timeNow
SetCondition(&cassTask, api.JobComplete, corev1.ConditionTrue)

// Requeue for deletion later
deletionTime := calculateDeletionTime(&cassTask)
@@ -344,8 +357,6 @@ func (r *CassandraTaskReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
cassTask.Status.Succeeded = completed
cassTask.Status.Failed = failed

SetCondition(&cassTask, api.JobComplete, corev1.ConditionTrue)

if err = r.Client.Status().Update(ctx, &cassTask); err != nil {
return ctrl.Result{}, err
}
@@ -498,6 +509,16 @@ func (r *CassandraTaskReconciler) getDatacenterPods(ctx context.Context, dc *cassapi.CassandraDatacenter) ([]corev1.Pod, error) {
return pods.Items, nil
}

func (r *CassandraTaskReconciler) getDatacenterStatefulSets(ctx context.Context, dc *cassapi.CassandraDatacenter) ([]appsv1.StatefulSet, error) {
var sts appsv1.StatefulSetList

if err := r.Client.List(ctx, &sts, client.InNamespace(dc.Namespace), client.MatchingLabels(dc.GetDatacenterLabels())); err != nil {
return nil, err
}

return sts.Items, nil
}

// cleanupJobAnnotations removes the job annotations from the pod once it has finished
func (r *CassandraTaskReconciler) cleanupJobAnnotations(ctx context.Context, dc *cassapi.CassandraDatacenter, taskId string) error {
logger := log.FromContext(ctx)
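The review thread above turns on Go's labeled-break semantics: inside a switch, a plain break only exits the switch, while break JobDefinition exits the labeled for loop entirely, so the reconcileEveryPodTask assignment later in the loop body (line 320 in the discussion) never runs for a restart job. A standalone illustration:

package main

import "fmt"

func main() {
JobDefinition:
	for _, job := range []string{"restart", "cleanup"} {
		switch job {
		case "restart":
			fmt.Println("handling", job)
			// Exits the labeled for loop, not just the switch; the rest of
			// the loop body is skipped and no further jobs are processed.
			break JobDefinition
		}
		fmt.Println("per-job follow-up for", job) // never reached for "restart"
	}
	fmt.Println("after loop") // execution resumes here
}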