Graceful shutdown #690
… to finish before dying.
log.Info("agent: Gracefully shutting down agent...")
go func() {
	plugin.CleanupClients()
When is this going to happen?
I've only tested it with containers. When the container exits, everything running inside it dies as well, so it doesn't matter in my case :)
But you're right, not everyone is using containers.
I'll add plugin.CleanupClients() somewhere below.
cmd/agent.go
		time.Sleep(1 * time.Second)
	}

	plugin.CleanupClients()
Is this gofmt'ed?
Sorry, edited it with vi first :)
LGTM
When I started testing graceful shutdown of worker-type agents, it turned out that they don't wait for their running jobs to finish before shutting down.
If I recall correctly, plugin.CleanupClients() kills plugins immediately, causing their child processes (in the case of the shell executor) to die.
This PR does the following:
Note: running jobs can still successfully send ExecutionDone to the leader even after the agent has left the cluster.