Large number of warning logs with nomad service provider #16238
Comments
@shoenig:
Welp, I guess there was more than one bug then 😬 I'll try to reproduce again with a job structured exactly like yours, @chenjpu.
This PR fixes a bug where issuing a restart to a terminal allocation would cause the allocation to run its hooks anyway. This was particularly apparent with group_service_hook, which would register services but never deregister them, leaving the allocation in a "zombie" state where it is prepped to run tasks but never will. Fixes #17079 Fixes #16238 Fixes #14618
I am experiencing a tsunami of these logs with Nomad 1.6.1. It is also generating a large number of request error logs in Consul. Example Consul log:
This seems to have started with a task that failed to start due to a missing Vault secret. It's now generating hundreds of thousands of logs between Nomad and Consul. The job itself has two groups, each with different Vault policies. Each group contains one service and one task. Only the second group has a service check, and it is this check that is causing all the errors. The issue causing the task to fail has been resolved; however, the log tsunami persists. In the short term, is there a workaround to get the logs to stop, short of restoring to an older snapshot?
For others running into this who need a quick way to stop the influx of logs: restart the Nomad client. In our case, we ran an instance refresh across the affected autoscaling group.
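The restart workaround above can be sketched as a small shell helper. This is a minimal sketch, assuming a systemd-managed Nomad client (the `journalctl -fu nomad` command later in this issue suggests systemd is in use); the function name is illustrative, not part of Nomad.

```shell
# Hypothetical helper: restart the local Nomad client so it rebuilds
# allocation state and stops re-registering stale service checks.
restart_nomad_client() {
  # Assumes Nomad runs as the "nomad" systemd unit.
  sudo systemctl restart nomad
  # Confirm the client came back up and re-registered with the servers.
  nomad node status -self
}
```

In a fleet behind an autoscaling group, replacing the instances (an instance refresh) achieves the same effect as restarting each client in place.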
We're also running into this. |
Nomad version
1.4.4
Issue
Logs (`journalctl -fu nomad`)
job.hcl