This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Context deadline exceeded error using manifest-generation and kustomize #2477

Closed
davidpristovnik opened this issue Sep 25, 2019 · 5 comments

@davidpristovnik

Describe the bug
All of our deployments use kustomize to render the manifests. The directory structure looks like this, with up to a few hundred deployments:

├── .flux.yaml
├── flux1
│   └── kustomization.yaml
├── flux2
│   └── kustomization.yaml
├── flux3
│   └── kustomization.yaml
├── flux4
│   └── kustomization.yaml
└── flux5
    └── kustomization.yaml

Each kustomization.yaml has a reference to a common base, with multiple workloads/deployments.

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
- ../../../../overlays/flux

namePrefix: flux1-

commonLabels:
  app.kubernetes.io/instance: flux1
  hostname: flux1.somedomain.com

imageTags:
- name: flux
  newTag: 2019-08-07_15-11-24_master_87ed9db687
commonAnnotations:
  fluxcd.io/automated: "true"
  fluxcd.io/tag.nginx: glob:*_master_*
  fluxcd.io/tag.uwsgi: glob:*_master_*

We use the --manifest-generation feature, with a .flux.yaml config that looks like this:

version: 1
commandUpdated:
  generators:
    - command: |
        for k in `find . -name kustomization.yaml`; do
          echo ---; kustomize build `dirname $k`
        done
  updaters:
    - containerImage:
        command: |
          instance=${FLUX_WL_NAME}
          cd $instance
          kustomize edit set image "$FLUX_IMG:$FLUX_TAG"
      policy:
        command: |
          instance=${FLUX_WL_NAME}
          cd $instance
          kustomize edit add annotation -f "fluxcd.io/$FLUX_POLICY:$FLUX_POLICY_VALUE"

This setup works fine for a small number of deployments, but when we enable the automated release cycle or run `fluxctl release --all --update-all-images`, it fails with a timeout error.

Logs

{"caller":"loop.go:144","component":"sync-loop","err":{"type":"user","help":"The release process failed, with this message:\n\n    applying changes: updating resource development:deployment/flux1 in kubernetes/apps/flux/deploy/working-1/development/.flux.yaml: error executing generator command \"for k in `find . -name kustomization.yaml`; do\\n  echo ---; kustomize build `dirname $k`\\ndone\\n\" from file \"kubernetes/apps/flux/deploy/working-1/development/.flux.yaml\": context deadline exceeded\nerror output:\n\ngenerated output:\n\n\nThis may be because of a limitation in the formats of file Flux can\ndeal with.:\n\ngenerated output:\n"}
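The phrase "context deadline exceeded" in the log is the message of Go's `context.DeadlineExceeded` error, which suggests the generator command was still running when the job's deadline passed. The following is a minimal sketch of that mechanism, not Flux's actual code: `runGenerator` and `hitDeadline` are hypothetical helpers, and `sleep 1` stands in for a slow loop of `kustomize build` calls.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// runGenerator is a hypothetical helper (not part of Flux). It runs a shell
// script under a deadline and returns the context's error if the deadline
// passed before the script finished.
func runGenerator(timeout time.Duration, script string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the started process once ctx is done.
	out, err := exec.CommandContext(ctx, "sh", "-c", script).Output()
	if ctxErr := ctx.Err(); ctxErr != nil {
		return out, ctxErr
	}
	return out, err
}

// hitDeadline reports whether err is (or wraps) context.DeadlineExceeded.
func hitDeadline(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// A fast script finishes well within the deadline.
	_, err := runGenerator(5*time.Second, "echo ---")
	fmt.Println(hitDeadline(err)) // false

	// A slow script is cut off when the deadline passes.
	_, err = runGenerator(100*time.Millisecond, "sleep 1")
	fmt.Println(err) // context deadline exceeded
}
```

Because the generator in the config above runs every `kustomize build` serially inside one command, its total runtime grows with the number of directories while the deadline stays fixed.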

Additional context

  • Flux version: v1.14.2
  • Kubernetes version: 1.13.10
  • Git provider: Github
  • Container registry provider: quay.io
davidpristovnik added the blocked-needs-validation and bug labels Sep 25, 2019
@stefanprodan
Member

@davidpristovnik does this happen if you let Flux do the update on its own without calling fluxctl sync?

@davidpristovnik
Author

@stefanprodan Yes. I changed defaultJobTimeout and deployed the new build. It works now, but this does not scale.

--- a/pkg/daemon/daemon.go
+++ b/pkg/daemon/daemon.go
@@ -37,7 +37,7 @@ const (
        // A job can take an arbitrary amount of time but we want to have
        // a (generous) threshold for considering a job stuck and
        // abandoning it
-       defaultJobTimeout = 60 * time.Second
+       defaultJobTimeout = 4 * 60 * time.Second
 )

I think it would help if Flux supported this kind of setup, where we define a git path and Flux traverses all the subdirectories and uses the .flux.yaml config in each one, if present. It would give us finer-grained control and more flexibility. You could also run multiple sync loops that way.

│
├── flux1
│   ├── .flux.yaml
│   └── kustomization.yaml
├── flux2
│   ├── .flux.yaml
│   └── kustomization.yaml
├── flux3
│   ├── .flux.yaml
│   └── kustomization.yaml
├── flux4
│   ├── .flux.yaml
│   └── kustomization.yaml
└── flux5
    ├── .flux.yaml
    └── kustomization.yaml
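The traversal proposed above could be sketched roughly as follows. This is only an illustration of the idea, not something Flux provides: `findConfigDirs` and `demoTree` are hypothetical helpers, and the demo tree contains just three of the directories shown.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findConfigDirs is a hypothetical helper (not part of Flux). It walks root
// and returns every directory that directly contains a .flux.yaml file.
func findConfigDirs(root string) ([]string, error) {
	var dirs []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && d.Name() == ".flux.yaml" {
			dirs = append(dirs, filepath.Dir(path))
		}
		return nil
	})
	return dirs, err
}

// demoTree builds a throwaway version of the layout above (flux1 to flux3)
// in a temporary directory and returns its root.
func demoTree() (string, error) {
	root, err := os.MkdirTemp("", "flux-demo")
	if err != nil {
		return "", err
	}
	for _, name := range []string{"flux1", "flux2", "flux3"} {
		dir := filepath.Join(root, name)
		if err := os.Mkdir(dir, 0o755); err != nil {
			return "", err
		}
		for _, f := range []string{".flux.yaml", "kustomization.yaml"} {
			if err := os.WriteFile(filepath.Join(dir, f), nil, 0o644); err != nil {
				return "", err
			}
		}
	}
	return root, nil
}

func main() {
	root, err := demoTree()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	dirs, err := findConfigDirs(root)
	if err != nil {
		panic(err)
	}
	// Each directory could then be generated under its own deadline.
	for _, d := range dirs {
		fmt.Println(filepath.Base(d)) // flux1, flux2, flux3 (one per line)
	}
}
```

With one config per directory, each generator run would cover a single kustomization, so the time per run would no longer grow with the total number of deployments.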

@hiddeco
Member

hiddeco commented Sep 26, 2019

Related issue: #1857

stefanprodan removed the blocked-needs-validation label Sep 26, 2019
@hiddeco
Member

hiddeco commented Sep 26, 2019

Resolved via #2481: setting the --sync-timeout to a higher value in the next release of Flux (e.g. 4m for your use-case @davidpristovnik) should fix the issue.
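For reference, a value of `4m` is the same duration as the `4 * 60 * time.Second` used in the patch earlier in this thread. The sketch below checks that equivalence with the Go standard library's `flag` package. It is an assumption-laden illustration: `parseSyncTimeout` is a hypothetical helper, fluxd's own flag handling may differ, and the 1m default here simply mirrors the 60-second constant from the patch rather than the real flag's default.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseSyncTimeout is a hypothetical helper (not fluxd's code). It parses a
// --sync-timeout value using the standard library's flag package.
func parseSyncTimeout(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	// Assumed default of 1m, mirroring the 60s constant in the patch.
	timeout := fs.Duration("sync-timeout", time.Minute, "deadline for a sync job")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *timeout, nil
}

func main() {
	d, err := parseSyncTimeout([]string{"--sync-timeout=4m"})
	if err != nil {
		panic(err)
	}
	fmt.Println(d, d == 4*60*time.Second) // 4m0s true
}
```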

hiddeco closed this as completed Sep 26, 2019
@davidpristovnik
Author

@hiddeco Thank you for the quick fix.
