ECR: helm charts do not pull if from the root of the repository #951

Closed
Tracked by #3344
cep21 opened this issue Nov 8, 2022 · 10 comments · Fixed by fluxcd/pkg#434 or #983
Assignees
Labels
area/helm: Helm related issues and pull requests
area/oci: OCI related issues and pull requests
bug: Something isn't working

Comments

@cep21

cep21 commented Nov 8, 2022

Proof I can pull from the helm CLI.

< helm pull oci://123123123.dkr.ecr.us-west-2.amazonaws.com/trino --version 1.1.5
Pulled: 123123123.dkr.ecr.us-west-2.amazonaws.com/trino:1.1.5
Digest: sha256:2b953bc779aa9249695baf128a08d1543d0230ad5ad94bff194be17d23bc2f6c

Helm release Object

> kc get helmreleases.helm.toolkit.fluxcd.io trino  -o yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"helm.toolkit.fluxcd.io/v2beta1","kind":"HelmRelease","metadata":{"annotations":{},"name":"trino","namespace":"trino"},"spec":{"chart":{"spec":{"chart":"trino","reconcileStrategy":"ChartVersion","sourceRef":{"kind":"HelmRepository","name":"trino"},"version":"1.1.5"}},"interval":"5m0s"}}
  creationTimestamp: "2022-11-08T06:31:29Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 3
  name: trino
  namespace: trino
  resourceVersion: "185539053"
  uid: e6c56a19-eaf2-417f-b8a5-0bab555de5f4
spec:
  chart:
    spec:
      chart: trino
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: trino
      version: 1.1.5
  interval: 5m0s
status:
  conditions:
  - lastTransitionTime: "2022-11-08T06:37:30Z"
    message: HelmChart 'trino/trino-trino' is not ready
    reason: ArtifactFailed
    status: "False"
    type: Ready
  failures: 124
  helmChart: trino/trino-trino
  observedGeneration: 3

Helm repository object

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"source.toolkit.fluxcd.io/v1beta2","kind":"HelmRepository","metadata":{"annotations":{},"name":"trino","namespace":"trino"},"spec":{"interval":"1m0s","provider":"aws","timeout":"60s","type":"oci","url":"oci://123123123.dkr.ecr.us-west-2.amazonaws.com/trino"}}
  creationTimestamp: "2022-11-08T06:31:17Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 5
  name: trino
  namespace: trino
  resourceVersion: "185287514"
  uid: e3e84473-07b3-40e0-b2a8-65b7f623414c
spec:
  interval: 1m0s
  provider: aws
  timeout: 60s
  type: oci
  url: oci://123123123.dkr.ecr.us-west-2.amazonaws.com/trino
status:
  conditions:
  - lastTransitionTime: "2022-11-08T06:33:32Z"
    message: Helm repository is ready
    observedGeneration: 5
    reason: Succeeded
    status: "True"
    type: Ready
  observedGeneration: 5

Relevant error logs from the source controller

{"level":"info","ts":"2022-11-08T16:54:12.338Z","msg":"logging in to AWS ECR for 123123123.dkr.ecr.us-west-2.amazonaws.com/trino","controller":"helmchart","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"HelmChart","HelmChart":{"name":"trino-trino","namespace":"trino"},"namespace":"trino","name":"trino-trino","reconcileID":"5d316e59-ee1d-45b3-94d7-c981829ea52e"}
{"level":"error","ts":"2022-11-08T16:54:12.527Z","msg":"chart pull error: failed to download chart for remote reference: 123123123.dkr.ecr.us-west-2.amazonaws.com/trino/trino:1.1.5: not found","name":"trino-trino","namespace":"trino","reconciler kind":"HelmChart","annotations":null,"error":"ChartPullError","stacktrace":"github.com/fluxcd/pkg/runtime/events.(*Recorder).AnnotatedEventf\n\tgithub.com/fluxcd/pkg/runtime@v0.22.0/events/recorder.go:136\ngithub.com/fluxcd/pkg/runtime/events.(*Recorder).Eventf\n\tgithub.com/fluxcd/pkg/runtime@v0.22.0/events/recorder.go:113\ngithub.com/fluxcd/source-controller/internal/reconcile/summarize.RecordContextualError\n\tgithub.com/fluxcd/source-controller/internal/reconcile/summarize/processor.go:49\ngithub.com/fluxcd/source-controller/internal/reconcile/summarize.(*Helper).SummarizeAndPatch\n\tgithub.com/fluxcd/source-controller/internal/reconcile/summarize/summary.go:193\ngithub.com/fluxcd/source-controller/controllers.(*HelmChartReconciler).Reconcile.func1\n\tgithub.com/fluxcd/source-controller/controllers/helmchart_controller.go:228\ngithub.com/fluxcd/source-controller/controllers.(*HelmChartReconciler).Reconcile\n\tgithub.com/fluxcd/source-controller/controllers/helmchart_controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:234"}

Current source controller image

< kc get deployments.apps source-controller -o yaml | grep 'image: '
        image: ghcr.io/fluxcd/source-controller:v0.31.0

The strange part is "123123123.dkr.ecr.us-west-2.amazonaws.com/trino/trino:1.1.5: not found". Should there be only one trino in that reference?

@cep21
Author

cep21 commented Nov 8, 2022

Also, I couldn't find any examples (anywhere on the web, really) of someone using Flux with private ECR Helm repositories. What is the exact format of the HelmRelease and HelmRepository objects that I should expect to work?

@souleb
Member

souleb commented Nov 8, 2022

In your HelmRepository, the url should be oci://123123123.dkr.ecr.us-west-2.amazonaws.com.
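
The doubled trino in the earlier error is consistent with the chart name being appended to the HelmRepository URL. A minimal sketch of that joining, assuming this is roughly how the pull reference is built; `chart_reference` is a hypothetical helper, not the controller's actual code:

```python
def chart_reference(repo_url, chart, version):
    """Join an OCI HelmRepository URL, chart name and version into a pull reference."""
    prefix = "oci://"
    if not repo_url.startswith(prefix):
        raise ValueError(f"not an OCI URL: {repo_url}")
    # Drop the scheme and any trailing slash before appending the chart name.
    base = repo_url[len(prefix):].rstrip("/")
    return f"{base}/{chart}:{version}"


host = "123123123.dkr.ecr.us-west-2.amazonaws.com"

# URL from the original HelmRepository: the chart name ends up in the path twice.
print(chart_reference(f"oci://{host}/trino", "trino", "1.1.5"))

# URL suggested above: the reference matches what `helm pull` reported.
print(chart_reference(f"oci://{host}", "trino", "1.1.5"))
```

Under that assumption the first call yields the `.../trino/trino:1.1.5` reference from the log, and the second yields `.../trino:1.1.5`.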

@cep21
Author

cep21 commented Nov 8, 2022

It still does not work with the root path. I'm looking around and found others having issues with ECR. Is there a recommended setup?

@cep21
Author

cep21 commented Nov 9, 2022

Here is the debug log message I get:

{"level":"error","ts":"2022-11-09T00:11:52.376Z","msg":"Reconciler error","controller":"helmchart","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"HelmChart","HelmChart":{"name":"trino-trino","namespace":"httpbin"},"namespace":"httpbin","name":"trino-trino","reconcileID":"5d4faa2f-0a9b-4a85-8bd1-bd7b20ae0581","error":"chart pull error: failed to download chart for remote reference: pulling from host 123123123.dkr.ecr.us-west-2.amazonaws.com failed with status code [manifests 1.1.5]: 401 Unauthorized","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:326\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:234"}

However, I have verified that the service account for the source-controller has a role attached to it, and I get no error messages about being unable to authenticate against ECR.

@souleb
Member

souleb commented Nov 9, 2022

Can you verify that the policy attached to the role gives you read permission to all your repositories?

@cep21 changed the title from "Pulling helm charts from private ECR repositories does not work." to "ECR: helm charts do not pull if from the root of the repository" on Nov 9, 2022
@cep21
Author

cep21 commented Nov 9, 2022

After many attempts to get oci://123123123.dkr.ecr.us-west-2.amazonaws.com/trino to work, while oci://123123123.dkr.ecr.us-west-2.amazonaws.com/helm/trino worked trivially without any other changes, I'm fairly confident there is a bug in how Flux pulls Helm charts: /trino does not work, but /helm/trino does.

This is supported by looking at how other people are using ECR: they all have a /something/ path prefix.

@souleb
Member

souleb commented Nov 9, 2022

Thanks for the issue.

Can you post the output of flux check?

@cep21
Author

cep21 commented Nov 9, 2022

To clarify, it is working now because I moved the ECR repository from /trino to /helm/trino so the bug will no longer show up in my cluster ATM.

< flux check
► checking prerequisites
✔ Kubernetes 1.23.13-eks-fb459a0 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.26.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.26.1
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.22.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.30.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.28.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.31.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta1
✔ buckets.source.toolkit.fluxcd.io/v1beta1
✔ gitrepositories.source.toolkit.fluxcd.io/v1beta1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta1
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta1
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta1
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta1
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta1
✔ receivers.notification.toolkit.fluxcd.io/v1beta1
✔ all checks passed

@darkowlzz
Contributor

I managed to reproduce this using our OCI integration test infra in https://github.com/fluxcd/pkg/tree/main/oci/tests/integration.
Just create a new ECR repo for the chart and push the chart at the root.
The HelmRepository login succeeds. Only the HelmChart download fails.

HelmRepo:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: demo
  namespace: default
spec:
  type: "oci"
  interval: 1m0s
  provider: aws
  url: oci://1234567890.dkr.ecr.us-east-2.amazonaws.com

HelmChart:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmChart
metadata:
  name: demo
  namespace: default
spec:
  interval: 1m
  chart: demo
  reconcileStrategy: ChartVersion
  sourceRef:
    kind: HelmRepository
    name: demo
  version: "0.1.0"

The failure originates in the Helm registry client, where it calls oras.Copy() (see https://github.com/helm/helm/blob/v3.10.1/pkg/registry/client.go#L305). That pull is called from remoteChartBuilder.downloadFromRepository() (see https://github.com/fluxcd/source-controller/blob/v0.31.0/internal/helm/chart/builder_remote.go#L152).

Here's the error, same as the one pasted above:

chart pull error: failed to download chart for remote reference: pulling from host 1234567890.dkr.ecr.us-east-2.amazonaws.com failed with status code [manifests 0.1.0]: 401 Unauthorized

0.1.0 is the chart version that I used.

@darkowlzz added the bug, area/helm and area/oci labels on Nov 10, 2022
@souleb self-assigned this on Nov 23, 2022
@darkowlzz
Contributor

I did further testing and found that the issue was in the contextual login.
The repository address parser behaves surprisingly when the address is a repository root, and that caused contextual login to fail for all providers, not just AWS. See fluxcd/pkg#434 for more details.
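
One way a parser can surprise on a bare registry address, offered as an assumption rather than a description of the actual fix in fluxcd/pkg#434: Docker-style reference parsing only treats the first component as a registry host when a `/` follows it. A simplified sketch of that convention; `split_registry` and `DEFAULT_REGISTRY` are hypothetical names, and real parsers apply further rules (such as a `library/` prefix):

```python
DEFAULT_REGISTRY = "index.docker.io"


def split_registry(address):
    """Split an address into (registry, repository) using the Docker convention."""
    first, sep, rest = address.partition("/")
    looks_like_host = "." in first or ":" in first or first == "localhost"
    if not sep or not looks_like_host:
        # No slash, or the first part is not host-like: the whole string is
        # taken as a repository name on the default registry.
        return DEFAULT_REGISTRY, address
    return first, rest


host = "1234567890.dkr.ecr.us-east-2.amazonaws.com"

# With a path, the ECR host is recognised as the registry.
print(split_registry(f"{host}/demo"))

# At the repository root, the ECR host is mistaken for a repository name.
print(split_registry(host))
```

If the login code picks a provider from the parsed registry host, the second case would never match ECR (or any other provider), which would fit the "all providers" observation above.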
