Generic worker becomes stale #124
Comments
I believe I'm having the same issue: after a while, every query fails with "Unknown error occurred while performing connection test", and it seems that adhocworker gets stuck. For now that's only a guess, because none of the pods produce any indicative logs.
Would be good to confirm if this is a chart specific issue or a data source connection issue - I see some reports of this message (e.g. getredash/redash#2047 & getredash/redash#5664).
@grugnog This happened for all datasources I've tried - Postgres and Prometheus. Connections worked again after restarting the workers.
@oedri if you are able to add any detail (debug logs, strace perhaps?) it would be great if you could open a ticket regarding this on https://github.com/getredash/redash - it seems unlikely to be a Kubernetes issue, except perhaps something environmental (resource exhaustion etc) which is not really in scope of this chart, although we could adjust the docs/defaults perhaps if we identify that as the cause.
Happened to me too.
Happening to me every day too.
@grugnog How can we enable |
+1. There seems to be an issue respawning the process if it dies. A transient redis issue triggers a persistent problem for the worker.
The liveness check for workers PR should improve the situation.
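For anyone wanting to experiment before that PR lands, here is a minimal sketch of the decision a liveness check could make. It is not the PR's implementation: the function names, the 5-minute threshold, and the idea that a last-heartbeat timestamp is available for each worker are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_worker_stale(
    last_heartbeat: Optional[datetime],
    now: Optional[datetime] = None,
    max_silence: timedelta = timedelta(minutes=5),  # assumed threshold
) -> bool:
    """Return True when the worker has been silent longer than max_silence."""
    if last_heartbeat is None:
        # No heartbeat recorded at all: treat the worker as stale.
        return True
    now = now or datetime.now(timezone.utc)
    return now - last_heartbeat > max_silence


def probe_exit_code(
    last_heartbeat: Optional[datetime],
    now: Optional[datetime] = None,
) -> int:
    """Map staleness to an exit code: 0 means healthy, 1 means restart."""
    return 1 if is_worker_stale(last_heartbeat, now) else 0
```

A probe script built on this could exit with `probe_exit_code(...)`, so that a non-zero status lets Kubernetes restart the pod instead of someone killing it by hand. Where the heartbeat timestamp comes from is left open here.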
Hello,
I've stumbled upon an issue where generic workers (and possibly scheduled workers too) become stale at arbitrary intervals. By stale I mean they neither pick up new jobs nor process anything. The only workaround I've found so far is to kill the pods so that they get recreated, but I'm trying to automate this.
Has anyone had the same problem?
Chart Version: 3.0.0-beta1