-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
supervisor terminates programs before closing #396
Conversation
Signed-off-by: Vladimir Lavor <vlavor@cisco.com>
exec/supervisor/plugin.go
Outdated
svLogger: svLogger, | ||
} | ||
|
||
return nil | ||
} | ||
|
||
func (p *Plugin) watch(stateChan chan status.ProcessStatus, name string) { | ||
func (p *Plugin) watch(stateChan chan status.ProcessStatus, doneChan chan struct{}, name string) { | ||
p.wg.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mentioned this few times before, but to be more specific I will quote important part of documentation for WaitGroup.Add
:
Typically this means the calls to Add should execute before the statement creating the goroutine or other event to be waited for...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but the reason for this rule is to prevent calling Wait()
before the first Add()
which may happen when the goroutine execution is delayed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is just wrong. You should really realize the flow of the execution.
You have chain of goroutines starting from Init
method and continuing startPrograms
and then this watch
where you can never ensure the the time of execution. The Close
could be called immediately after Init
when the p.programs
is still empty, which can easily happen, because of so many unsynchronized goroutines, which would actually make the execution skip to the Wait
call and immediately exiting, possibly creating a lot of race conditions here.
This is simply bad practice that breeds unreliable code.
exec/supervisor/plugin.go
Outdated
} | ||
|
||
// close hook watcher when all programs are terminated | ||
p.wg.Wait() | ||
close(p.hookChan) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is improper handling of channel as I've explained in other PR: ligato/vpp-agent#1318 (comment)
The Close
method is called from different goroutine than the goroutine that handles watch
method and sends events to the hookChan
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, that's the reason for the wait group. There is a watch
goroutine for every process started and the hook channel is shared between them. The wait group ensures that no more watchers are running, and then it's safe to close the channel even from a different goroutine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it doesn't really make much sense why you want to close the hookChan
when immediately after closing it you return from Close
.. which means that it will not wait for watchEvents
to finish anyway so why bother?
Signed-off-by: Vladimir Lavor <vlavor@cisco.com>
Signed-off-by: Vladimir Lavor vlavor@cisco.com