title | menu_order |
---|---|
Troubleshooting Weave Flux | 50 |
Also see the issues labeled with FAQ, which often explain workarounds.
If you notice that Flux takes tens of seconds or minutes to get through each sync, while you can apply the same manifests very quickly by hand, you may be running into this issue: fluxcd#1422
Briefly, the problem is that mounting a volume into `$HOME/.kube` effectively disables `kubectl`'s caching, which makes it much slower. You may have used such a volume mount to override `$HOME/.kube/config`, possibly unknowingly -- the Helm chart did this for you, prior to weaveworks/flux#1435.
The remedy is to mount the override to some other place in the filesystem, and use the environment entry `KUBECONFIG` to point `kubectl` at it. This is what the Helm chart now does, so fixing it may be as easy as reapplying the chart if that's what you're using.
This is also documented in the FAQ.
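As a rough illustration of the remedy, the sketch below prints a container-spec fragment in which the override is mounted outside `$HOME/.kube` and `KUBECONFIG` points at it. The helper name `kubeconfig_fragment`, the volume name `kubeconfig`, and the path `/etc/fluxd/kube` are all assumptions made up for this example; they are not taken from the Helm chart.

```shell
#!/bin/sh
# Print an illustrative container-spec fragment (not a complete manifest).
# Assumptions: the volume name "kubeconfig" and the path /etc/fluxd/kube
# are hypothetical; any location outside $HOME/.kube would do.
kubeconfig_fragment() {
  cat <<'EOF'
containers:
- name: flux
  env:
  - name: KUBECONFIG              # tells kubectl where the override lives
    value: /etc/fluxd/kube/config
  volumeMounts:
  - name: kubeconfig
    mountPath: /etc/fluxd/kube    # deliberately not $HOME/.kube
EOF
}

kubeconfig_fragment
```

If you installed Flux with the Helm chart, reapplying the chart is likely simpler than editing the Deployment by hand.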
This usually indicates there's a bug in the Flux daemon somewhere -- in which case please tell us about it!
This means Flux can't read from and write to the git repo. Check that
- ... you've supplied a git repo URL. If it's of the form `https://github.com/user/repo` then you will need to use the SSH-style URL, `git@github.com:user/repo`, instead.
- ... the deploy key has read/write access to the repo. In GitHub, deploy keys are installed in the settings for a repository. To get the deploy key Flux is using, use `fluxctl identity`.
- ... that the host where your git repo lives is in `~/.ssh/known_hosts` in the fluxd container. We prime the container image with host keys for `github.com`, `gitlab.com`, `bitbucket.org`, `dev.azure.com`, and `vs-ssh.visualstudio.com`, but if you're using your own git server, you'll need to add its host key. See ./standalone-setup.md.
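The URL rewrite in the first check can be sketched in shell. `to_ssh_url` is a hypothetical helper written for this example, and it assumes your git host accepts the `git@host:user/repo` form with the `git` user.

```shell
#!/bin/sh
# Hypothetical helper: rewrite an HTTPS repo URL into the SSH-style form.
# Input that does not start with https:// is printed unchanged.
to_ssh_url() {
  # https://github.com/user/repo -> git@github.com:user/repo
  printf '%s\n' "$1" | sed -E 's#^https://([^/]+)/(.+)$#git@\1:\2#'
}

to_ssh_url "https://github.com/user/repo"

# Related commands for the other two checks (not run here):
#   fluxctl identity               # prints the deploy key Flux is using
#   ssh-keyscan git.example.com    # prints host keys for your own git server
#                                  # (git.example.com is a placeholder)
```

How the collected host key gets into the fluxd container depends on your setup; ./standalone-setup.md covers that part.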
If you're using Weave Cloud, this probably means you haven't supplied the token. You can get the token from the settings in Weave Cloud; set the environment variable `FLUX_TOKEN` to the token.
If you have set Flux up standalone (as in the instructions in ./get-started.md), this probably means Flux is defaulting to using Weave Cloud because you've not set the environment variable `FLUX_URL` to point at the daemon. See ./standalone-setup.md.
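A minimal sketch of the two cases follows. The address and path in `FLUX_URL` are assumptions: they presume the daemon's API has been made reachable on `localhost:3030` (for example through a port-forward). Check ./standalone-setup.md for the value that matches your installation.

```shell
#!/bin/sh
# Pick ONE of the two settings, depending on how Flux is set up.

# Weave Cloud: supply the token from the Weave Cloud settings page.
# The value below is a placeholder, not a real token.
# export FLUX_TOKEN="<token-from-weave-cloud-settings>"

# Standalone: point at the daemon. The URL here is an assumed example.
export FLUX_URL="http://localhost:3030/api/flux"

echo "FLUX_URL=${FLUX_URL}"
```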
GCP (in general) has quite conservative API rate limiting, and Flux's default settings can bump API usage over the limits. See weaveworks/flux#1016 for advice.
If you're using `kubectl` v1.13.x to create them, then it may be due to this problem. In short, there was a breaking change to how `kubectl` creates secrets that found its way into the Kubernetes 1.13.0 release. It has been corrected in `kubectl` v1.13.2, so using that version or newer to create secrets should fix the problem.
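To check whether a given client version falls in the affected range described above (1.13.0 up to, but not including, 1.13.2), something like the following could be used. `is_affected_kubectl` is a hypothetical helper for this sketch; it only looks at the plain version string and does not account for pre-release suffixes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given kubectl version is one
# of the 1.13 releases that predate the fix in v1.13.2.
is_affected_kubectl() {
  case "${1#v}" in            # strip a leading "v" if present
    1.13.0|1.13.1) return 0 ;;
    *)             return 1 ;;
  esac
}

if is_affected_kubectl "v1.13.1"; then
  echo "v1.13.1: affected, upgrade kubectl before creating secrets"
fi

# Your own client version can be read with (not run here):
#   kubectl version --client
```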
Sometimes, instead of seeing the various images and their tags, the output of `fluxctl list-images` (or the UI in Weave Cloud, if you're using that) shows nothing. There are a number of reasons this can happen:
- Flux just hasn't fetched the image metadata yet. This may be the case if you've only just started using a particular image in a workload.
- Flux can't get suitable credentials for the image repository. At present, it looks at `imagePullSecret`s attached to workloads, service accounts, platform-provided credentials on GCP, AWS or Azure, and a Docker config file if you mount one into the fluxd container (see the command-line usage).
- When using images in ECR, from EC2, the `NodeInstanceRole` for the worker node running fluxd must have permissions to query the ECR registry (or registries) in question. `eksctl` and `kops` (with `.iam.allowContainerRegistry=true`) both make sure this is the case.
- When using images from ACR in AKS, the HostPath `/etc/kubernetes/azure.json` should be mounted into the Flux Pod. Set `registry.acr.enabled=True` in the helm chart or alter the Deployment:
  ```yaml
  spec:
    containers:
      image: docker.io/weaveworks/flux
      ...
      volumeMounts:
      - name: acr-credentials
        mountPath: /etc/kubernetes/azure.json
        readOnly: true
    volumes:
    - name: acr-credentials
      hostPath:
        path: /etc/kubernetes/azure.json
        type: ""
  ```
- Flux excludes images that have no suitable manifest (linux amd64) in their manifest list.
- Flux doesn't yet understand image refs that use digests instead of tags; see weaveworks/flux#885.
If none of these explanations seem to apply, please file an issue.
You may notice that the ordering given to image tags does not always correspond with the order in which you pushed the images. That's because Flux sorts them by the image creation time; and, if you have retagged an older image, the creation time won't correspond to when you pushed the image. (Why does Flux look at the image creation time? In general there is no way for Flux to retrieve the time at which a tag was pushed from an image registry.)
This can happen if you explicitly tag an image that already exists. Because of the way Docker shares image layers, it can also happen implicitly if you happen to build an image that is identical to an existing image.
If this appears to be a problem for you, one way to ensure each image build has its own creation time is to label it with a build time; e.g., using OpenContainers pre-defined annotations.
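One way to produce such a label is sketched below, assuming the OpenContainers `org.opencontainers.image.created` annotation key and a Docker-based build. The image name and tag in the commented command are placeholders.

```shell
#!/bin/sh
# Generate an RFC 3339 UTC timestamp for the build, e.g. 2019-01-31T12:00:00Z.
build_time="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# The label to attach at build time.
echo "org.opencontainers.image.created=${build_time}"

# Example build command (not run here; <image>:<tag> is a placeholder):
#   docker build --label "org.opencontainers.image.created=${build_time}" \
#     -t <image>:<tag> .
```

Because the timestamp changes on every build, two otherwise identical builds no longer share the same label value.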
Flux keeps track of the last commit that it's applied to the cluster, by pushing a tag (controlled by the command-line flags `--git-sync-tag` and `--git-label`) to the git repository. This gives it a persistent high water mark, so even if it is restarted from scratch, it will be able to tell where it got to.
Technically, it only needs this to be able to determine which image releases (including automated upgrades) it has applied, and that only matters if it has been asked to report those with the `--connect` flag. Future versions of Flux may be more sparing in use of the sync tag.
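To see where the sync tag currently points, you could query the remote directly. `show_sync_tag` is a hypothetical helper for this sketch, and the fallback tag name `flux-sync` is an assumption; pass whatever name you configured with the flags above.

```shell
#!/bin/sh
# Hypothetical helper: list the remote tag that Flux uses as its sync marker.
#   $1: repository URL
#   $2: tag name (falls back to "flux-sync", an assumed name)
show_sync_tag() {
  repo_url="$1"
  tag_name="${2:-flux-sync}"
  git ls-remote --tags "$repo_url" "$tag_name"
}

# Example (not run here):
#   show_sync_tag git@github.com:user/repo
```

The commit hash in the output is the last revision Flux recorded as applied.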