POD annotations are dropped with the reconcile of CHK STS #1469

Closed
jirislav opened this issue Aug 2, 2024 · 5 comments
Labels
Keeper ClickHouse Keeper issues

Comments

@jirislav
Contributor

jirislav commented Aug 2, 2024

Preserving pod annotations is essential for running the workload on EKS when the Fargate profile is the default one.

Dropping essential annotations such as "eks.amazonaws.com/compute-type" = "ec2" makes the pod unschedulable, because:

  • you can't mount a volume on a Fargate node.
  • you can't satisfy nodeSelector and nodeAffinity rules under a Fargate profile, for example when you request a dedicated EC2 node for billing purposes.

Example manifest:

apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: clickhouse-keeper
spec:
  configuration:
    clusters:
      - name: chk
        layout:
          replicasCount: 3
  templates:
    podTemplates:
      - name: clickhouse-keeper
        metadata:
          annotations:
            eks.amazonaws.com/compute-type: "ec2"

Interestingly, the first pod of the 3 replicas starts with the correct annotation, but the second does not, because the annotations are dropped from the underlying StatefulSet.
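
One way to confirm the drop is to compare the annotations declared in the CHK podTemplate with those found on the StatefulSet's pod template (.spec.template.metadata.annotations). The following is a minimal sketch using plain dictionaries instead of a Kubernetes client; the function name and the "before/after" values are hypothetical illustrations, not output captured from the operator.

```python
from typing import Dict, Optional


def missing_annotations(
    desired: Dict[str, str], actual: Optional[Dict[str, str]]
) -> Dict[str, str]:
    """Return the desired annotations that are absent or different in `actual`."""
    actual = actual or {}  # a dropped annotations block shows up as None
    return {key: value for key, value in desired.items() if actual.get(key) != value}


# Annotations declared in the CHK podTemplate (taken from the example manifest).
desired = {"eks.amazonaws.com/compute-type": "ec2"}

# Hypothetical values, as they might be read from the StatefulSet's
# .spec.template.metadata.annotations before and after the reconcile.
before_reconcile = {"eks.amazonaws.com/compute-type": "ec2"}
after_reconcile = None  # annotations block dropped entirely

print(missing_annotations(desired, before_reconcile))  # {}
print(missing_annotations(desired, after_reconcile))
```

An empty result means the StatefulSet still carries every annotation from the podTemplate; any returned key is one that pods created from that StatefulSet will lack.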

Note that I also see this in the log of the operator, which is possibly the result of this behavior:

E0802 07:02:22.310975       1 reconciler.go:299] err: Operation cannot be fulfilled on clickhousekeeperinstallations.clickhouse-keeper.altinity.com "chk": the object has been modified; please apply your changes to the latest version and try again
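
For context, this message is the Kubernetes API server's standard response to an optimistic-concurrency conflict: the update was based on a resourceVersion that is no longer the latest. The usual remedy is to re-read the object and reapply the change. The sketch below simulates that with an in-memory store; every class and function name here is hypothetical, "example.com/owner" is a made-up annotation, and none of this is the operator's actual code.

```python
class ConflictError(Exception):
    """Stands in for the API server's HTTP 409 Conflict response."""


class FakeStore:
    """In-memory stand-in for the API server, holding one object's annotations."""

    def __init__(self, annotations):
        self.resource_version = 1
        self.annotations = dict(annotations)

    def get(self):
        # Return a copy so callers cannot mutate the stored object directly.
        return self.resource_version, dict(self.annotations)

    def update(self, resource_version, annotations):
        # Reject writes that were based on an outdated read.
        if resource_version != self.resource_version:
            raise ConflictError("the object has been modified")
        self.annotations = dict(annotations)
        self.resource_version += 1


def update_with_retry(store, mutate, attempts=5):
    """Re-read the latest object and reapply `mutate` until the write succeeds."""
    for _ in range(attempts):
        version, annotations = store.get()
        mutate(annotations)
        try:
            store.update(version, annotations)
            return annotations
        except ConflictError:
            continue  # someone else wrote first; start again from a fresh read
    raise ConflictError(f"still conflicting after {attempts} attempts")


store = FakeStore({"eks.amazonaws.com/compute-type": "ec2"})

# One writer reads the object, then a second writer updates it first.
stale_version, stale_annotations = store.get()
update_with_retry(store, lambda a: a.update({"example.com/owner": "team-a"}))

try:
    store.update(stale_version, stale_annotations)
except ConflictError as err:
    print("conflict:", err)  # the stale write is rejected

print(store.annotations)
```

Note that writing the stale copy back unconditionally would overwrite the second writer's annotation, which is why the retry starts from a fresh read rather than resending the old object.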
jirislav added a commit to jirislav/clickhouse-operator that referenced this issue Aug 2, 2024
Signed-off-by: Jiří Kozlovský <jirislav@users.noreply.github.com>
@jirislav
Contributor Author

jirislav commented Aug 2, 2024

Please see this pull request against the 0.24.0 branch 🙏🏿.

@Kavinjsir

I encountered a similar issue when adding additional annotations for Datadog agent metrics scraping.

Here are the details:

  1. Defining annotations in the podTemplates for a CHK manifest works successfully when creating the CHK for the first time.
  2. However, modifying the annotations block later on causes the reconciliation process to drop all annotations.

@g-marius

We are also randomly seeing reconciler errors on some deploys. Since we use annotations in our environment, I suspect it's the same issue as the one mentioned above for Datadog:

 1 reconciler.go:299] err: Operation cannot be fulfilled on clickhousekeeperinstallations.clickhouse-keeper.altinity.com "keeper": the object has been modified; please apply your changes to the latest version and try again

alex-zaitsev added the "Keeper" (ClickHouse Keeper issues) label on Aug 15, 2024
@Slach
Collaborator

Slach commented Sep 7, 2024

@g-marius do you use something like Flux or ArgoCD?

@jirislav
Contributor Author

I believe the fix has already been released as part of 0.24.0.
