-
Notifications
You must be signed in to change notification settings - Fork 880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docker-proxy cannot exit after send SIGINT #2421
base: master
Are you sure you want to change the base?
Conversation
Please sign your commits following these rules: $ git clone -b "master" git@github.com:mingfukuang/libnetwork.git somewhere
$ cd somewhere
$ git commit --amend -s --no-edit
$ git push -f Amending updates the existing PR. You DO NOT need to open a new one. |
5aa615a
to
db56873
Compare
Why libnetwork use os.Interrupt signal try to kill docker-proxy? Is there any special purpose? |
ping @selansen @thaJeztah, pls help . |
@mingfukuang looks like the commit message itself is missing a DCO sign-off; could you amend your commit to have a sign-off? (not just the PR description on GitHub)? Also make sure that the sign-off uses your real name (not your GitHub username) |
/cc @euanh @kolyshkin PTAL |
I already signed just a moment ago , please help review again, appreciate your quick reply. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
ping @euanh ptal |
portmapper/proxy.go
Outdated
select { | ||
case result := <-waitChan: | ||
return result | ||
case <-time.After(15 * time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This leads to a stray timer. It is better to create a timer and properly stop it.
t := time.NewTimer(15 * time.Second)
defer t.Stop()
select {
case result := <-waitChan:
return result
case <- t.C:
if err := p.cmd.Process.Signal(os.Kill); err != nil {
return err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it would be better to stop the timer by ourself rather than GC do, I will refresh code .
@@ -65,10 +66,23 @@ func (p *proxyCommand) Start() error { | |||
|
|||
func (p *proxyCommand) Stop() error { | |||
if p.cmd.Process != nil { | |||
if err := p.cmd.Process.Signal(os.Interrupt); err != nil { | |||
if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to replace SIGINT with SIGTERM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was wondering as well; should we handle both ? (for backward compat)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, i’m also not sure why send SIGINT other than SIGTERM. Proxy will deal with both of them . According appearance of problem, SIGINT looks can't got response. Besides, semantically speaking,SIGTERM could be suitable. so , i made this change.
Looking forward to your further suggestions :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at my comments.
Also, if would be nice if commit message would contain some info about why this is done (I guess you can just reuse the PR description).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy, thanks for your time , I will add commit message later.
portmapper/proxy.go
Outdated
select { | ||
case result := <-waitChan: | ||
return result | ||
case <-time.After(15 * time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, it would be better to stop the timer by ourself rather than GC do, I will refresh code .
@@ -65,10 +66,23 @@ func (p *proxyCommand) Start() error { | |||
|
|||
func (p *proxyCommand) Stop() error { | |||
if p.cmd.Process != nil { | |||
if err := p.cmd.Process.Signal(os.Interrupt); err != nil { | |||
if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, i’m also not sure why send SIGINT other than SIGTERM. Proxy will deal with both of them . According appearance of problem, SIGINT looks can't got response. Besides, semantically speaking,SIGTERM could be suitable. so , i made this change.
Looking forward to your further suggestions :-)
Signed-off-by: mingfukuang <kuang.mingfu@zte.com.cn> When stop or delete one container with docker-proxy, stop container will call Unmap() to release port , func Unmap() need to acquire pm.lock.Lock() firstly, then stop docker-proxy, and then pm.lock.UnLock(). but some situation(cannot be reproduced), stop docker-proxy will hang on. Once one container cann't stop docker-proxy, above mentioned pm.lock cann't be released, so this container cann’t be stopped, Also all operation of this container could be hang on. Furthermore, if other containers enter stop or delete process, those containers also need to acquire the global pm.lock, and get stuck. As result, other operations to those containers also hang on, such as docker inspect, docker exec, etc. To fix this, I add protective measures to avoid the situation of global lock(pm.lock)cann't being released.
Signed-off-by: mingfukuang kuang.mingfu@zte.com.cn
- What I did
when stop or delete one container with docker-proxy, stop container will call Unmap() to release port ,
func Unmap() need to acquire pm.lock.Lock() firstly , then stop docker-proxy , and then pm.lock.UnLock() . but some situation(cannot be reproduced), stop docker-proxy will hang on :
I have the whole stack file ,but it too big (89M), so I did not put here.
corresponding code as followed:
once one container cann't stop docker-proxy , above mentioned pm.lock cann't be released, so this container cann’t be stopped , Also all operation of this container could be hang on.
Furthermore , if other containers enter stop or delete process, those containers also need to acquire the global pm.lock, and get stuck . As result, other operations to those containers also hang on , such as docker inspect, docker exec ,etc.
when above mentioned situation happen, adding protective measures to fix the problem of the global lock(pm.lock)cann't being released.