Describe the bug
On our clusters, we had connectivity issues on Dec 26th. I noticed that since then, image automation failed to update (but ImagePolicies were up to date).
Here are the last logs for one of the controllers:
{"level":"error","ts":"2021-12-26T21:37:24.895Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"Reconciler error","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"flux-system","namespace":"flux-system","error":"unable to clone 'ssh://git@github.com/myco/fleet-infra', error: SSH could not read data: Error waiting on socket"}
Since that point in time, the controller stopped working.
Killing the pod fixed the issue, but it would be great if the controller could self-heal in this scenario.
Steps to reproduce
Let the image-automation-controller lose connectivity to the Git repository
Observe that the controller neither retries the connection nor crashes
Expected behavior
The controller should retry the reconciliation once connectivity has been restored.
Screenshots and recordings
No response
OS / Distro
Ubuntu 20.04
Flux version
flux version 0.16.1
Flux check
N/A
Git provider
GitHub
Container Registry provider
No response
Additional context
No response
Code of Conduct
I agree to follow this project's Code of Conduct
getting "error: SSH could not read data: Error waiting on socket" in the logs
the image-automation-controller stops doing anything after that message
I know you are seeing the log message, from the comment you linked -- are you also experiencing the second problem, that the controller stops doing anything?