Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: nil pointer dereference in closeConnIfStillIdle #16208

Closed
reusee opened this issue Jun 29, 2016 · 15 comments
Closed

net/http: nil pointer dereference in closeConnIfStillIdle #16208

reusee opened this issue Jun 29, 2016 · 15 comments

Comments

@reusee
Copy link

reusee commented Jun 29, 2016

  1. What version of Go are you using (go version)?
    go version devel +b75b063 Tue Jun 28 04:49:33 2016 +0000 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
    GOARCH="amd64"
    GOBIN=""
    GOEXE=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GOOS="linux"
    GOPATH="/home/reus"
    GORACE=""
    GOROOT="/home/reus/go"
    GOTOOLDIR="/home/reus/go/pkg/tool/linux_amd64"
    CC="gcc"
    GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build483039895=/tmp/go-build -gno-record-gcc-switches"
    CXX="g++"
    CGO_ENABLED="1"
  3. What did you see instead?
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4de58f]

goroutine 54787 [running]:
panic(0x7a2f40, 0xc4200120c0)
    /home/reus/go/src/runtime/panic.go:500 +0x1a1
net/http.(*persistConn).closeConnIfStillIdle(0xc4249cfb00)
    /home/reus/go/src/net/http/transport.go:1255 +0x2f
net/http.(*persistConn).(net/http.closeConnIfStillIdle)-fm()
    /home/reus/go/src/net/http/transport.go:643 +0x2a
created by time.goFunc
    /home/reus/go/src/time/sleep.go:154 +0x44

@ALTree
Copy link
Member

ALTree commented Jun 29, 2016

You should include a recipe to reproduce the error in your report. Point 3 in the issue report list says

  1. What did you do?

If possible, provide a recipe for reproducing the error.

A complete runnable program is good.

A link on play.golang.org is best.

@bradfitz
Copy link
Contributor

In addition to code, as @ALTree mentioned, does your program run under the race detector?

@reusee
Copy link
Author

reusee commented Jun 29, 2016

@bradfitz No, not with race detector.

I don't think it will be easy to reproduce by a short example. It's a large long-running program. I believe it's introduced by recent changes. I will try to do some git bisect to find out which commit.

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 29, 2016
@bradfitz
Copy link
Contributor

bradfitz commented Jun 29, 2016

No, not with race detector.

Does that mean you haven't tried, or it fails?

I will try to do some git bisect to find out which commit.

Please do. I labeled this "WaitingForInfo" since there's nothing for us to help with until we have more to go on. If you get us something soon, there's a chance it could be fixed for Go 1.7, if there's a problem.

@reusee
Copy link
Author

reusee commented Jun 30, 2016

@bradfitz I didn't run the program with race detector, I will give it a try.

@reusee
Copy link
Author

reusee commented Jul 4, 2016

Close because the program ran for several days with no crash at all, with race detector enabled / disabled. No way to reproduce.

@reusee reusee closed this as completed Jul 4, 2016
@vadmeste
Copy link

This bug occurred with me too (minio/minio#2456). However, it is not reproducible anymore. So, just wanted to encourage you to review the code if many other people experiment the same issue.

@bradfitz
Copy link
Contributor

@vadmeste, if you find a reproducible example, please file a new bug. You can reference this one. But we don't re-use old bugs.

@kidoman
Copy link

kidoman commented Aug 18, 2016

Just got this today after upgrading to Go 1.7. Try to see if it occurs multiple times and will file a new bug. As this occurs in a go routine outside our control, it ended up bringing down the entire service.

@bradfitz
Copy link
Contributor

@kidoman, if you find a reproducible example, please file a new bug. You can reference this one. But we don't re-use old bugs.

@golang golang locked and limited conversation to collaborators Aug 18, 2016
@golang golang unlocked this conversation Aug 18, 2016
@bradfitz bradfitz reopened this Aug 18, 2016
@bradfitz bradfitz changed the title net/http: invalid memory address or nil pointer dereference net/http: nil pointer dereference in closeConnIfStillIdle Aug 18, 2016
@bradfitz bradfitz removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 18, 2016
@bradfitz bradfitz modified the milestones: Go1.7.1, Go1.7Maybe Aug 18, 2016
@bradfitz
Copy link
Contributor

Okay, I have a fuzzy theory, so let's just use this bug, since it was never fixed anyway.

The crash is because persistConn.t is nil and we're crashing on the mutex Lock line here:

func (pc *persistConn) closeConnIfStillIdle() {
        t := pc.t
        t.idleMu.Lock()  // <--- BOOM, because t == nil

As far as I can tell, the only place wheret is nil (since it's never set to nil once created) is in the HTTP/2 (alt) path:

   return &persistConn{alt: next(cm.targetAddr, pconn.conn.(*tls.Conn))}, nil

So somehow that persistConn is getting into the tryPutIdleConn path. I suspect it's via t.putOrCloseIdleConn, in this path:

        handlePendingDial := func() {
                testHookPrePendingDial()
                go func() {
                        if v := <-dialc; v.err == nil {
                                t.putOrCloseIdleConn(v.pc)  // <--- here
                        }
                        testHookPostPendingDial()
                }()
        }

I don't have a repro yet, but it looks suspicious.

@mrjrieke
Copy link

Looks promising. Possibly race between setup and teardown?

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/27450 mentions this issue.

gopherbot pushed a commit that referenced this issue Sep 7, 2016
…rash after IdleConnTimeout

Go 1.7 crashed after Transport.IdleConnTimeout if an HTTP/2 connection
was established but but its caller no longer wanted it. (Assuming the
connection cache was enabled, which it is by default)

Fixes #16208

Change-Id: I9628757f7669e344f416927c77f00ed3864839e3
Reviewed-on: https://go-review.googlesource.com/27450
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-on: https://go-review.googlesource.com/28637
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
oconnor663 added a commit to keybase/client that referenced this issue Oct 6, 2016
@sudersen
Copy link

sudersen commented Dec 3, 2016

What version of Go are you using (go version)?

go version go1.7.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build191499881=/tmp/go-build"
CXX="g++"
CGO_ENABLED="1"

What did you do?

This is the first time I'm seeing this crash on the same code running for months and It happened for me sporadically on high load. Following are the initial trace.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x5241a0]

What did you expect to see?

We make an HTTP request to a service, process the response and return it as reference to the caller. The reference is never expected to be out of box since we have created an object and this failure did not occur by far for us in the service running for months.

Caller here is spawned using standard waitGroup method. Are there any other race conditions that could happen?

@bradfitz
Copy link
Contributor

bradfitz commented Dec 3, 2016

@sudersen, you're commenting on a closed issue. Closed issues aren't tracked.

If you're experiencing a problem, please file a new code and include code. There's nothing in your comment to use to help you.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants