-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Empty commits #166
Comments
Any chance I can see the git repo you're using? I understand if it contains sensitive data and you'd prefer not. You might have to help me out with narrowing down where the problem happens .. |
The git repository is a private GitHub repository but I can share it with you. I would prefer to mail you a zip file but I guess I could just add you as a contributor. |
Either works for me -- you can email a zip to the address on my profile, or michael@weave.works. |
I'm having this same problem as well. Same behavior on Flux2 I haven't built or put a new image in this AWS ECR repo since last week. I just updated to Flux's latest and reapplied my kustomization to continue testing today. My Kustomization: apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
name: www-test-eks-com-v1
namespace: backend
spec:
images:
- name: 000000000000.dkr.ecr.us-east-1.amazonaws.com/www-test-eks-com-v1
newTag: 20210527-223554 # {"$imagepolicy": "flux-system:www-test-eks-com-v1:tag"}
interval: 10m0s
path: ./flux/kustomize/apps/www-test-eks-com-v1
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
validation: client
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: www-test-eks-com-v1
namespace: backend
timeout: 2m Excerpt from the relevant containers:
- name: www-test-eks-com-v1
image: 000000000000.dkr.ecr.us-east-1.amazonaws.com/www-test-eks-com-v1:20210505-233849 Image Automation objects: ---
apiVersion: image.toolkit.fluxcd.io/v1alpha2
kind: ImageUpdateAutomation
metadata:
name: www-test-eks-com-v1
namespace: flux-system
spec:
git:
commit:
messageTemplate: |
Automated image update
Automation name: {{ .AutomationObject }}
Files:
{{ range $filename, $_ := .Updated.Files -}}
- {{ $filename }}
{{ end -}}
Objects:
{{ range $resource, $_ := .Updated.Objects -}}
- {{ $resource.Kind }} {{ $resource.Name }}
{{ end -}}
Images:
{{ range .Updated.Images -}}
- {{.}}
{{ end -}}
author:
email: fluxcdbot@users.noreply.github.com
name: fluxcdbot
interval: 2m0s
sourceRef:
kind: GitRepository
name: flux-system
# namespace: flux-system
suspend: false
---
apiVersion: image.toolkit.fluxcd.io/v1alpha2
kind: ImageRepository
metadata:
name: www-test-eks-com-v1
namespace: flux-system
spec:
image: 000000000000.dkr.ecr.us-east-1.amazonaws.com/www-test-eks-com-v1
interval: 2m0s
secretRef:
name: ecr-credentials
---
apiVersion: image.toolkit.fluxcd.io/v1alpha2
kind: ImagePolicy
metadata:
name: www-test-eks-com-v1
namespace: flux-system
spec:
filterTags:
pattern: '((?P<timestamp>\d{8}-\d{6})|(?P<timestamp>\d{8}-\d{6}-\d{1}))'
extract: '$timestamp'
imageRepositoryRef:
name: www-test-eks-com-v1
policy:
alphabetical:
order: asc Flux CLI output concerning images:
Logs - Image Automation Controller: {"level":"info","ts":"2021-06-03T19:51:44.079Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"Starting workers","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","worker count":1}
{"level":"info","ts":"2021-06-03T19:51:59.653Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"pushed commit to origin","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"www-test-eks-com-v1","namespace":"flux-system","revision":"fa68e0a261ad6aa9b8cb4ecc8db89cc68f07326f","branch":"master"}
{"level":"info","ts":"2021-06-03T19:52:08.908Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"pushed commit to origin","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"www-test-eks-com-v1","namespace":"flux-system","revision":"8e214093bf5e9870ea18319ff63032e000569296","branch":"master"}
{"level":"info","ts":"2021-06-03T19:54:12.551Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"pushed commit to origin","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"www-test-eks-com-v1","namespace":"flux-system","revision":"c8c8dbd510859b8c8202eef437841693d37b8641","branch":"master"}
{"level":"info","ts":"2021-06-03T19:56:24.110Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"pushed commit to origin","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"www-test-eks-com-v1","namespace":"flux-system","revision":"89e4da1fe3221f66222ff32af1120061ab7859bc","branch":"master"} Events - Image Automation controller:
|
Sorry to hear that @chicken231 , and thank you for all the environmental info. Is it possible to fork your repo, or to otherwise get access to it (keeping it private of course)? |
@Nicklason I'm sure you would have mentioned it, but just in case: is there anything that appears to trigger the empty commits; e.g., making a commit yourself to the same branch, or restarting the controller, or a particular image getting a new tag? Also: do you have any symlinks or other non-regular files/directories in your git repo? |
I can't give you access to this cluster as it's for work (AWS EKS). Was the info and YAMLs I provided insufficient to reproduce or attempt to reproduce? Would anything short of giving you a copy of the whole repo be acceptable? What I can do this weekend is package up all the relevant YAMLs relating to this deployment and image automation and test it in my bare metal lab cluster. Different container registry, different git repo provider, different k8s version, etc. If it exhibits the same symptoms, would providing those relevant YAMLs to you so you can As for your note to the OP, it seems every reconciliation loop (at every automation interval), it creates and pushes a commit no matter what. I'm 90% sure the same occurs if I attempt a reconciliation via the I have 0 symlinks in my entire repo. I just verified this with certainty on my end. Thanks for any help! If we can get this fixed, I can roll out Flux for my CD. |
Fair enough, that's the usual situation -- it's worth asking just in case :-)
A lot of the behaviour of the image-automation-controller -- especially buggy behaviour -- depends on the contents of the git repo. For example, the problem in #164 depended on a particular interleaving of commits, and #135 involved broken symlinks in the repo (thus the question about symlinks!). As a result, it's tricky to see what's really going on given just the Kubernetes configuration.
Yes for sure -- or slightly better, if you can make a github repo with the reduced/redacted files, I can fork it and go from there.
That sounds like #135, which produced a commit every loop because go-git mistakenly reported a broken symlink as modified. So perhaps there's some other similar thing going on. |
I am not sure if it has something to do with me working on the repository or restarting flux pods, it just seems random. The repository shouldn't have any symbolic links either. |
@Nicklason Naively, I ran the controller against the definitions in your repo, to see if I could get empty commits. I've been running it for a day and have not seen any. I notice you see around one or two empty commits a day, but I will need to run it longer to have confidence that I'm failing to reproduce the problem (something something poisson distribution). In the meantime, I don't see any smoking guns in the repo -- it all looks innocuous 🤔. |
@squaremo Yes I still want to help. If you make me an image I'll run it. It might be a good idea to add debugging to all the images anyway 😃 |
I wasn't able to reproduce empty commits in my lab (Gitlab repo, Gitlab registry, Flux v0.15.2). It works like a charm. The only differences in my manifests pasted above between my two environments is the object names and using SemVer image policy. |
The image docker.io/squaremo/image-automation-controller:0a9c9a7437629093349eb3d3c77f8d9e5d6fba7c is built from #190. It outputs a bunch of tracing to the log at debug level -- if you use it, you'll need to set Among the tracing:
By definition, it cannot be changing files if the result is an empty commit; but it's difficult to know what else to trace. Suggestions encouraged :-) |
@squaremo You need to build it for arm64 :) Might have been worth mentioning this earlier. |
@squaremo I tested the image you provided with debug logs enabled in my x86 EKS cluster. Current Flux version = An interesting excerpt from the logs that it records before creating the empty commit:
|
That smells like the comparison thinks it is looking at different types, e.g. the old value being a interpreted as a string, but the new value as an integer, or vice-versa. |
I find it hard to get anything useful from the logs. I have quite a lot of image policies and most of the log messages are about them and are repeated every minute. I think I would have to set up a notification for when a commit is made in order to get a chance to see the relevant logs. These are the messages I keep seeing:
As @chicken231 said, I also see "old" and "new" being the same. The "old" and "new" are just links to a docker image. I don't know what it means but I would guess that it is used to check if there is a new tag matching the image policy, but you can correct me on that. |
That is the nature of log messages at this level -- I don't know what I'm looking for, so I output everything I can think of. I was hoping that there would be a noticeable pattern of trace messages before a commit which turns out to be empty -- given the commit ref, you can grep the log for it and look at what came immediately before (and compare with a commit that isn't empty). But it's a shot in the dark, really -- I rather suspect there's some environmental cause, rather than a particular path through the code. The trace message about applying a setter with "old" and "new" values is output whether or not the values differ (the filter sets the value unconditionally and the log follows). There's a message about finding a setter mark, so it would make sense to restrict the trace message to just when it's changing the value. |
Yes I understand. Sorry my last comment came out a bit wrong.
I think you are right. It looks like I have indirectly solved the issue - I have not had empty commits for a week (since 9/7/2021). The cluster consists of 6 network booted raspberry pi and a week ago (same day as last empty commit) I migrated to a new NAS, so it looks like the NAS was the problem. I should have looked closer at the logs before moving to a new NAS and I should have included information about the environment when I made the issue, but I just didn't suspect that the NAS / filesystem would affect the controller. |
I tried switching ImagePolicy policies with different RegExs, SemVer, but the empty commits kept coming. Is there anything I can do from my end to help get this fixed? |
To clarify: you stopped running the image registry that hosts images used by the automation? |
@squaremo This was a month or more ago so my memory is fuzzy. So I may have spoken incorrectly, I can't recall at what point in time these empty commits occurred, if it was after I deleted all the container images in the registry, or after I did that and after I changed username (then the endpoint should have returned a 404 I'd think). Either way I don't think it should be pushing a commit unless a newer image is actually detected. How about I spin up a small test cluster somewhere and give a few folks in this thread access to it and the corresponding repo, assuming I can reproduce the problem there? |
That would be super helpful! Thank you for putting time into this 🙏 |
I'm happy to report that I can no longer reproduce this issue. |
@DaemonDude23 thank you for testing this and confirming you could not reproduce it anymore, therefore I am closing the issue. If users are still experiencing this, please add a comment and we will reopen the issue. |
I am using the ImageUpdateAutomation resource and see empty commits being pushed to my git repository. I looked at a similiar issue (#135) but they got an empty commit every minute, where I get them randomly, so I believe it is a different issue.
I have had the issue since 20. april. I was using flux v0.12.1 back then, and now I use latest version v0.13.4 and I still get them.
20/4/2021
16/5/2021
The text was updated successfully, but these errors were encountered: