Use static labels in pods #791

Merged: 3 commits merged into vnext on May 3, 2024
Conversation

@spilchen (Collaborator) commented May 2, 2024

During a replicated upgrade we promote a sandbox to the main cluster, which causes it to drop its sandbox state, and we then need to rename subclusters back to their original names. Prior to this change, the pods deployed by the operator had labels for the sandbox and subcluster name, and those labels were set in the statefulset's pod template. Changing a pod-template label causes the k8s controller to do a rollout, which restarts each of the pods. But we require sandbox promotion and subcluster rename to happen without a pod restart. So, the change here is to use different labeling to ensure we don't need to restart the pods. In some cases, given a pod, we have to refer back to its statefulset to figure out certain metadata.

Since we are changing the statefulset selector values, statefulsets created by an old operator need to be recreated. We had done this in the past, so I repurposed upgradeoperator120_reconciler.go for this purpose.
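For context, spec.selector on a StatefulSet is immutable, so recreation is the usual route when a selector must change. Below is a minimal sketch of a recreate step that keeps the pods running, using controller-runtime; the function name and flow are illustrative, not the operator's actual reconciler code:

```go
// Sketch: replace a StatefulSet whose (immutable) selector must change,
// without restarting its pods. Illustrative only.
package recreate

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// recreateStatefulSet deletes the old object while orphaning its pods, then
// creates a replacement carrying the new selector. The orphaned pods keep
// serving and are re-adopted once the new selector matches their labels.
func recreateStatefulSet(ctx context.Context, c client.Client, oldSts, newSts *appsv1.StatefulSet) error {
	if err := c.Delete(ctx, oldSts, client.PropagationPolicy(metav1.DeletePropagationOrphan)); err != nil {
		return err
	}
	// Clear the ResourceVersion so the API server accepts this as a create.
	newSts.ResourceVersion = ""
	return c.Create(ctx, newSts)
}
```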

A summary of all of the labels that are set follows; the sketch after this list shows how the selectors are assembled from them.

- All objects will have these labels set:
```
app.kubernetes.io/instance: <CR name>
app.kubernetes.io/managed-by: verticadb-operator
app.kubernetes.io/component: database
app.kubernetes.io/name: vertica
app.kubernetes.io/version: 2.2.0
vertica.com/database: <vertica db name>
```
- StatefulSets and Service objects will have the following labels:
```
vertica.com/subcluster-name: <subcluster-name>
vertica.com/subcluster-type: <primary|secondary>
vertica.com/sandbox: <sandbox name if applicable>
```
- Service objects will have this additional label:
```
vertica.com/svc-type: <external|headless>
```
- Pods will have the following labels:
```
vertica.com/subcluster-svc: <name of service object>
vertica.com/client-routing: true # Only set if not doing a drain or pending removal
vertica.com/subcluster-selector-name: <full name of the pod's statefulset>
```
- ConfigMaps used by the sandbox controller will have this label:
```
vertica.com/watched-by-sandbox-controller: true
```
- When the statefulset sets up its pod selector, it will use these labels:
```
app.kubernetes.io/instance
vertica.com/subcluster-selector-name
```
- When a service object sets up its pod selector, it will use these labels:
```
app.kubernetes.io/instance
vertica.com/subcluster-selector-name
vertica.com/client-routing
```
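A minimal sketch of how these static selectors might be assembled, assuming hypothetical helper names (the real code lives in the operator's label/builder packages):

```go
// Sketch of the selector maps described above. The label keys come from this
// PR; the helper names and parameters are illustrative only.
package labels

const (
	InstanceLabel           = "app.kubernetes.io/instance"
	SubclusterSelectorLabel = "vertica.com/subcluster-selector-name"
	ClientRoutingLabel      = "vertica.com/client-routing"
)

// stsSelectorLabels builds a StatefulSet pod selector. It deliberately omits
// the subcluster-name and sandbox labels, so a subcluster rename or sandbox
// promotion never requires a selector (and hence pod template) change.
func stsSelectorLabels(crName, stsName string) map[string]string {
	return map[string]string{
		InstanceLabel:           crName,
		SubclusterSelectorLabel: stsName,
	}
}

// svcSelectorLabels builds a Service pod selector. client-routing is a pod
// label the operator can flip to drain a pod out of the Service without
// touching the StatefulSet.
func svcSelectorLabels(crName, stsName string) map[string]string {
	sel := stsSelectorLabels(crName, stsName)
	sel[ClientRoutingLabel] = "true"
	return sel
}
```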

Matt Spilchen added 2 commits May 2, 2024 09:20
@spilchen spilchen requested a review from roypaulin May 2, 2024 13:48
@spilchen spilchen self-assigned this May 2, 2024
@roypaulin (Collaborator) left a comment:

Looks good!

```
@@ -159,15 +159,15 @@ func (d *DBRemoveSubclusterReconciler) resetDefaultSubcluster(ctx context.Contex
 	scFinder := iter.MakeSubclusterFinder(d.VRec.Client, d.Vdb)
 	// We use the FindServices() API to get subclusters that already exist.
 	// We can only change the default subcluster to one of those.
-	svcs, err := scFinder.FindServices(ctx, iter.FindInVdb, vapi.MainCluster)
+	stss, err := scFinder.FindStatefulSets(ctx, iter.FindInVdb, vapi.MainCluster)
 	if err != nil {
 		return err
 	}
 	// If we don't find a service object we don't fail. The attempt to
```
Collaborator:

Should we update this comment now that we find sts instead of svc?

```
 }
-	return sts.Labels[vmeta.SubclusterTypeLabel] == scType && !isTransient, nil
+	return sts.Labels[vmeta.SubclusterNameLabel] != transientName
```
Collaborator:

I know it has been a while, but why do we need to return false for a transient subcluster?

Collaborator (author):

When online upgrade iterates over subclusters (e.g., to upgrade them or add client routing), we always want to skip transient subclusters. Those aren't ever upgraded; they are meant only to serve read requests during the upgrade.
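For reference, a sketch of the predicate being discussed, assuming the subcluster-name label key from this PR and a placeholder constant in place of the operator's vmeta package:

```go
// Sketch of the transient-subcluster filter. Transient subclusters only
// serve reads during an upgrade and are never themselves upgraded, so
// iteration over subclusters skips them.
package filters

const SubclusterNameLabel = "vertica.com/subcluster-name"

// isNonTransientSubcluster returns false for the transient subcluster so
// that upgrade iteration passes over it.
func isNonTransientSubcluster(stsLabels map[string]string, transientName string) bool {
	return stsLabels[SubclusterNameLabel] != transientName
}
```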

```
@@ -207,12 +211,16 @@ func (m *SubclusterFinder) buildObjList(ctx context.Context, list client.ObjectL
 // shouldSkipBasedOnSandboxState returns true if the object whose labels
 // is passed does not belong to the given sandbox or main cluster
 func shouldSkipBasedOnSandboxState(l map[string]string, sandbox string) bool {
+	// Note, for a pod, the labels passed in will be from the statefulset. So,
```
Collaborator:

Is it better to add this to the function's description?

Collaborator (author):

Yeah, that makes sense. There are 3 functions that have this comment, so I'll change all 3.
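A minimal sketch of such a filter, under the assumption that main-cluster objects simply lack the sandbox label (so an empty sandbox argument selects the main cluster); the real function lives in the SubclusterFinder code:

```go
// Sketch of the sandbox-state filter. For pods, callers pass the labels of
// the owning statefulset, since the sandbox label lives on the sts rather
// than on the pod itself.
package sandboxfilter

const SandboxLabel = "vertica.com/sandbox"

// shouldSkipBasedOnSandboxState returns true if the labeled object does not
// belong to the requested sandbox ("" means the main cluster).
func shouldSkipBasedOnSandboxState(l map[string]string, sandbox string) bool {
	return l[SandboxLabel] != sandbox
}
```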

```
// This label is added to a statefulset to indicate the sandbox it belongs
// to. This will allow the operator to filter these objects if it is looking
// for objects from the main cluster or a sandbox. Moreover, the sandbox
// controller will be watching statefulsets witGh this label and will trigger
```
Collaborator:

Suggested change:
```
-// controller will be watching statefulsets witGh this label and will trigger
+// controller will be watching statefulsets with this label and will trigger
```

```
// statefulsets for other subclusters in the database. It may appear like it
// is derived from a subcluster name, but it does not get updated for
// subcluster rename. So, it shouldn't be treated as such.
SubclusterSelectorLabel = "vertica.com/subcluster-selector-name"
```
Collaborator:

Does that mean the sts name no longer changes when the subcluster is renamed?

Collaborator (author):

Yes, this will stay static. This is tied to the statefulset name, which we'll keep static for the rename too.
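To illustrate why that matters, a sketch of what a rename then has to touch: only the mutable object-level label, never the name, the selector, or the pod template, so no rollout occurs. The helper below is hypothetical, not the operator's actual reconciler:

```go
// Sketch: on subcluster rename, patch only the object-level label. The
// StatefulSet name, its immutable selector, and its pod template stay
// untouched, so the controller performs no rolling restart.
package rename

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

const SubclusterNameLabel = "vertica.com/subcluster-name"

func relabelForRename(ctx context.Context, c client.Client, sts *appsv1.StatefulSet, newName string) error {
	orig := sts.DeepCopy()
	sts.Labels[SubclusterNameLabel] = newName
	// Metadata-only merge patch; spec.selector and spec.template are untouched.
	return c.Patch(ctx, sts, client.MergeFrom(orig))
}
```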

@roypaulin (Collaborator) left a comment:

I made a lot of changes for upgrade so I wonder if this can go in after my PR?

@spilchen (Collaborator, author) commented May 3, 2024

> I made a lot of changes for upgrade so I wonder if this can go in after my PR?

Yeah, no problem. I'll wait.

@spilchen spilchen merged commit ded5b41 into vnext May 3, 2024
31 checks passed
@spilchen spilchen deleted the spilchen/no-dynamic-labels branch May 3, 2024 22:58
cchen-vertica pushed a commit that referenced this pull request Jul 17, 2024
cchen-vertica pushed a commit that referenced this pull request Jul 17, 2024
cchen-vertica pushed a commit that referenced this pull request Jul 24, 2024
cchen-vertica pushed a commit that referenced this pull request Jul 24, 2024
cchen-vertica pushed a commit that referenced this pull request Jul 24, 2024