Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

supervisor terminates programs before closing #396

Merged
merged 2 commits into from
May 30, 2019

Conversation

VladoLavor
Copy link
Collaborator

Signed-off-by: Vladimir Lavor vlavor@cisco.com

Signed-off-by: Vladimir Lavor <vlavor@cisco.com>
svLogger: svLogger,
}

return nil
}

func (p *Plugin) watch(stateChan chan status.ProcessStatus, name string) {
func (p *Plugin) watch(stateChan chan status.ProcessStatus, doneChan chan struct{}, name string) {
p.wg.Add(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mentioned this few times before, but to be more specific I will quote important part of documentation for WaitGroup.Add:

Typically this means the calls to Add should execute before the statement creating the goroutine or other event to be waited for...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but the reason for this rule is to prevent calling Wait() before the first Add() which may happen when the goroutine execution is delayed.

Copy link
Member

@ondrej-fabry ondrej-fabry May 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is just wrong. You should really realize the flow of the execution.

You have chain of goroutines starting from Init method and continuing startPrograms and then this watch where you can never ensure the the time of execution. The Close could be called immediately after Init when the p.programs is still empty, which can easily happen, because of so many unsynchronized goroutines, which would actually make the execution skip to the Wait call and immediately exiting, possibly creating a lot of race conditions here.

This is simply bad practice that breeds unreliable code.

}

// close hook watcher when all programs are terminated
p.wg.Wait()
close(p.hookChan)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is improper handling of channel as I've explained in other PR: ligato/vpp-agent#1318 (comment)

The Close method is called from different goroutine than the goroutine that handles watch method and sends events to the hookChan.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, that's the reason for the wait group. There is a watch goroutine for every process started and the hook channel is shared between them. The wait group ensures that no more watchers are running, and then it's safe to close the channel even from a different goroutine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it doesn't really make much sense why you want to close the hookChan when immediately after closing it you return from Close.. which means that it will not wait for watchEvents to finish anyway so why bother?

Signed-off-by: Vladimir Lavor <vlavor@cisco.com>
@ondrej-fabry ondrej-fabry merged commit 691b90e into ligato:dev May 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants