issue - telegraf-operator - MountVolume.SetUp failed for volume "telegraf-config" : secret "telegraf-XXXX" not found #137
Comments
We have the same issue. It looks like there is a race condition in the webhook with StatefulSets (STS), because the pod names, and therefore the secret names, are the same from deploy to deploy. What is supposed to happen:
But what ends up happening is:
|
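To illustrate why identical names from deploy to deploy matter, here is a toy model in Python. It is not the operator's code, and the event ordering is an assumption made for illustration only: it treats secrets as a store keyed by name and replays a "create for the new pod" and a "delete for the old pod" event in both orders. The secret name is taken from the event in the original report; `replay` and the event tuples are hypothetical.

```python
def replay(events):
    """Apply create/delete events to a name-keyed secret store, in order.

    Returns the set of secret names that exist after all events.
    """
    secrets = set()
    for action, name in events:
        if action == "create":
            secrets.add(name)
        elif action == "delete":
            secrets.discard(name)
        else:
            raise ValueError(f"unknown action: {action}")
    return secrets


# Secret name from the reported event; it is identical for the old and
# the new pod because a StatefulSet reuses pod names.
NAME = "telegraf-config-varnish-0"

# Assumed orderings (hypothetical, for illustration only).
cleanup_then_create = [("delete", NAME), ("create", NAME)]
create_then_cleanup = [("create", NAME), ("delete", NAME)]

print(NAME in replay(cleanup_then_create))  # secret survives
print(NAME in replay(create_then_cleanup))  # secret is gone
```

With unique names per pod instance the second ordering would be harmless, because the late delete would target a different key.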
Throwing in a +1 for this issue happening for us as well. Running influxdb/telegraf-operator:v1.3.10 |
@GBlodgett35 did you find a workaround? It's a pity that we cannot tell the pod to be respawned when this issue occurs :/ |
Not sure why, but it has started to recur during deployments and especially during node consolidation (where it is a real blocker, because there is no way to automatically trigger a restart when a pod is in error due to this missing secret). I am starting to doubt whether this project is still maintained, or whether we need to look for another solution. @gitirabassi @wojciechka I cannot really help with Go, but if you need testing or more information, don't hesitate to ask. |
@anthosz We ended up embedding Telegraf in the image instead of using the operator :( |
That's what I feared, seems this project is not maintained anymore 😑 |
Hello, |
According to influxdata/telegraf#15192 (comment), it seems that it is no longer maintained by InfluxData; the only way forward is to create a PR with the fix ourselves (not sure if anyone here has knowledge of Go or operators). |
@anthosz @tlereste @GBlodgett35 Hi folks, As this project appears to no longer be maintained, I've gone ahead and rewritten it from scratch over at https://github.com/jmickey/telegraf-sidecar-operator. It currently supports the majority of the same annotations as this project, with one notable exception: The project is technically pre-1.0.0, but I've been running it on a staging cluster for about a week and it's been working well. It also resolves this issue. |
Thank you for the feedback. At this time, personally, I have moved everything to a sidecar and removed the operator. |
Hello,
For the past few months we have been hitting this issue: about 50% of the time when we roll out an upgrade (when the pod respawns) and about 20% of the time during a pod reschedule (when it moves from one node to another).
The affected pod belongs to a Varnish StatefulSet.
Template
How to reproduce
Deploy a new version or move the pod to another node.
Current behavior (randomly):
Warning FailedMount 2m23s (x242 over 7h58m) kubelet MountVolume.SetUp failed for volume "telegraf-config" : secret "telegraf-config-varnish-0" not found
Due to that, the pod cannot start.
Workaround:
Kill the pod; the secret is then recreated correctly.
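Since nothing in this thread restarts the pod automatically, the manual workaround could be scripted. The following is a minimal Python sketch, not part of telegraf-operator: it recognises the `FailedMount` message quoted above and builds the `kubectl delete pod` command for the stuck pod. The helper names, the `default` namespace, and the pod name `varnish-0` (inferred from the secret name) are assumptions; feeding it real events and actually running the command are left out.

```python
import re

# Matches the kubelet message from the report and captures the secret name.
MISSING_SECRET = re.compile(
    r'MountVolume\.SetUp failed for volume "telegraf-config" : '
    r'secret "(?P<secret>[^"]+)" not found'
)


def missing_secret(event_message):
    """Return the missing secret's name, or None if the message does not match."""
    match = MISSING_SECRET.search(event_message)
    return match.group("secret") if match else None


def delete_pod_command(pod, namespace="default"):
    """Build (but do not run) the kubectl command that kills the stuck pod."""
    return ["kubectl", "delete", "pod", pod, "--namespace", namespace]


# Event message copied from the report above.
message = (
    'MountVolume.SetUp failed for volume "telegraf-config" : '
    'secret "telegraf-config-varnish-0" not found'
)

if missing_secret(message) is not None:
    # "varnish-0" is an assumed pod name, inferred from the secret name.
    print(" ".join(delete_pod_command("varnish-0")))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` once the pod name and namespace are known for the cluster in question.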
Expected behavior:
The secret is found
Other information
The source secret is more than 100 days old, so the problem cannot be related to it.
However, the telegraf secret seems to be recreated every time the pod is spawned, and the issue seems to lie there: the secret is not created, so Telegraf cannot start (the missing secret cannot be mounted) and the pod stays stuck until we terminate it.
Versions