Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collector CRD 0.99.0 requires templated conversion webhook #1167

Closed
pavolloffay opened this issue May 3, 2024 · 20 comments · Fixed by #1175 or #1176
Closed

Collector CRD 0.99.0 requires templated conversion webhook #1167

pavolloffay opened this issue May 3, 2024 · 20 comments · Fixed by #1175 or #1176

Comments

@pavolloffay
Copy link
Member

Created from #1164

The conversion webhook defined in the collector CRD should be templated because it references the operator webhook service which can be dynamically set:

helm install --namespace=my-opentelemetry-operator-system --create-namespace my-opentelemetry-operator ./charts/opentelemetry-operator
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    cert-manager.io/inject-ca-from: opentelemetry-operator-system/opentelemetry-operator-serving-cert
    controller-gen.kubebuilder.io/version: v0.14.0
  name: opentelemetrycollectors.opentelemetry.io
spec:
  conversion:
    strategy: Webhook
    webhook:
      clientConfig:
        service:
          name: {missing my-}opentelemetry-operator-webhook
          namespace: {missing my}opentelemetry-operator-system
          path: /convert
      conversionReviewVersions:
      - v1alpha1
      - v1beta1
@pavolloffay
Copy link
Member Author

Also note that cert-manager.io/inject-ca-from: opentelemetry-operator-system/opentelemetry-operator-serving-cert is required

@TylerHelmuth
Copy link
Member

Related to #677 (comment)

@swiatekm
Copy link
Contributor

swiatekm commented May 3, 2024

There's two solutions I can think of:

  1. Template the CRDs.
    Create a separate crds helm chart #677 (comment) shows various different ways this can be done.

    Positives:

    • CRDs are always up-to-date, which makes a difference when they're actively developed and not yet stable, as is the case with the operator CRDs.
    • We can use the conversion webhook without user intervention.
    • We can make CRD installation optional and let users manage them with different tools, if that's what they want.

    Negatives:

    • There's potential for problems if other applications in the cluster depend on a particular version of the Otel CRD.
    • Charts which have the operator chart as a dependency and create otel CRDs, can run into stale discovery cache issues during upgrades.
  2. We can tell users to install their CRDs separately and patch them if they're installing for the first time and want to use the v1alpha1 CRDs.

    Positives:

    • Easy to do.
    • Explicit, users know what's going on.

    Negatives:

    • Manual intervention is needed for very basic tasks.
    • We'll surely get a bunch of issues opened about this no matter how we document it.

I'm leaning towards 1, but we can also delay the decision by doing 2 if we don't want to fall behind too much with operator Helm Chart releases.

@TylerHelmuth
Copy link
Member

If we go with option 2, do we have clear instructions for how they should install the CRDs themselves?

I like the end-user experience of option 1. I believe it will work for a majority of users. We'll need to provide clear paths forward for the users who will be affected by the downsides of option 1.

@jaronoff97
Copy link
Contributor

jaronoff97 commented May 3, 2024

I could see there being a combination of these. i.e. we add the CRDs in as templates in the operator chart with a very clear path for disabling (or enabling) that. We probably should also provide explicit instructions for installing the CRDs separately and patching them for people who are using the chart in a more advanced setup. At the least, it would be great to include something in the NOTES.txt and the release notes (or README) about this.

@TylerHelmuth
Copy link
Member

@jaronoff97 do you have availability to implement your idea?

@swiatekm
Copy link
Contributor

swiatekm commented May 3, 2024

Note that with 1 we also run into an upgrade problem. If we try to install templated CRDs when non-templated ones are present in the cluster, Helm will complain about missing ownership annotations:

Error: UPGRADE FAILED: Unable to continue with update: CustomResourceDefinition "opampbridges.opentelemetry.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "opentelemetry-operator"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "opentelemetry-operator"

So some level of manual intervention will be necessary regardless.

@TylerHelmuth
Copy link
Member

So some level of manual intervention will be necessary regardless.

Ya this is basically always true for the operator chart since the CRDs are changing frequently.

But after that manual intervention this time, the next time the CRDs change the template will take care of them. Honestly, option 1 will probably make a lot of people happy (until they run into the downsides, which are complex)

@swiatekm
Copy link
Contributor

swiatekm commented May 3, 2024

FYI, the two commits I added on top of Pavol's branch here: https://github.com/swiatekm-sumo/opentelemetry-helm-charts/tree/move-crds-to-templates do make it work without a lot of fuss. The upgrade from the previous version still fails though - not sure what the cleanest solution is. Manually adding the Helm annotations to the existing CRD works, but feels like a hack.

@TylerHelmuth
Copy link
Member

The upgrade from the previous version still fails though - not sure what the cleanest solution is.

I think like previous CRD changes, users should manually remove the old CRDs and install the new ones.

@swiatekm
Copy link
Contributor

swiatekm commented May 3, 2024

The upgrade from the previous version still fails though - not sure what the cleanest solution is.

I think like previous CRD changes, users should manually remove the old CRDs and install the new ones.

That's a destructive operation that will delete their CRs from the cluster. It'd be much better if we could just instruct users to add the required annotations to fool Helm into believing it owns them.

@jaronoff97
Copy link
Contributor

+1 to @swiatekm-sumo people can just patch with the annotations, we can guide people with the right command there.

@jaronoff97
Copy link
Contributor

@swiatekm-sumo in your PR is there a reason we're renaming the operator's service? (i.e. from <fullname>-webhook to <fullname>-webhook-service? I could imagine that may break users potentially.

@TylerHelmuth
Copy link
Member

I'm no sure patching is better than removing. Users are used to removing CRDs before installing the chart. In my mind it is clear to:

  1. remove the CRDs (which is what users do today when upgrading the operator chart)
  2. Install the chart with the templated/managed CRDs so that they get reinstalled.

Then during the next release, the user would simply upgrade the operator and it would take care of removing and repapplying the CRDs.

@jaronoff97
Copy link
Contributor

okay and to confirm – this would also probably assist in resolving the problem I'm having here At the least, if we had a separate chart for the CRs I could install that prior to everything else.

@TylerHelmuth
Copy link
Member

At the least, if we had a separate chart for the CRs I could install that prior to everything else.

As a separate install command yes, but I dont believe helm guarantees any subchart installation order, only the Kind installation order you linked earlier.

@swiatekm
Copy link
Contributor

swiatekm commented May 4, 2024

@swiatekm-sumo in your PR is there a reason we're renaming the operator's service? (i.e. from <fullname>-webhook to <fullname>-webhook-service? I could imagine that may break users potentially.

<fullname>-webhook-service is what it is in the bundle, and I figured it would make our life easier in the long term if it was the same in the Helm Chart. But we can just as well do another string replace on the CRD when importing it.

@pavolloffay
Copy link
Member Author

I would suggest following as a resolution to this ticket:

  1. as @swiatekm-sumo pointed out above - rename the webhook service to -webhook-service to align with the manifests from the operator repo
  2. host CRD in the helm chart but do not allow users to install the operator in a different namespace. This way the CRDs do not have to be templated.

@swiatekm
Copy link
Contributor

swiatekm commented May 6, 2024

In my opinion, 2 is a no-go, installing Charts in a user chosen namespace is a very basic feature of Helm.

@TylerHelmuth
Copy link
Member

Im currently in favor of @jaronoff97's idea here: #1167 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants