net: document that on some systems SetLinger causes conn.Close to block #58882
Comments
UPDATE: I did a little more in-depth research:
In fact, a socket in non-blocking mode will not block calling From my point of view, the comment on Or, are you actually experiencing the blocking from |
What is your goal here? The default behavior is for any remaining unsent socket data to be sent in the background (the usual TCP timeouts continue to apply, so the data won't hang around indefinitely). Calling Since Go uses non-blocking I/O, as @panjf2000 says, the behavior of The most common use of |
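The three cases of the SetLinger argument, as described in the net package documentation, can be sketched as below. This is a minimal illustration, not code from the thread: lingerNote and loopbackPair are hypothetical helper names, and each connection is closed with nothing queued, so Close should return promptly in every case.

```go
package main

import (
	"fmt"
	"net"
)

// lingerNote summarizes the documented meaning of each SetLinger argument.
func lingerNote(sec int) string {
	switch {
	case sec < 0:
		return "default: OS finishes sending in the background"
	case sec == 0:
		return "OS discards unsent or unacknowledged data"
	default:
		return "data sent in the background; on some systems Close may block"
	}
}

// loopbackPair (hypothetical helper) returns the accepted and dialed ends
// of a TCP connection on the loopback interface.
func loopbackPair() (server, client *net.TCPConn, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	defer ln.Close()
	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, nil, err
	}
	s, err := ln.Accept()
	if err != nil {
		c.Close()
		return nil, nil, err
	}
	return s.(*net.TCPConn), c.(*net.TCPConn), nil
}

func main() {
	for _, sec := range []int{-1, 0, 15} {
		server, client, err := loopbackPair()
		if err != nil {
			panic(err)
		}
		if err := server.SetLinger(sec); err != nil {
			panic(err)
		}
		// No data is queued on this connection before it is closed.
		err = server.Close()
		fmt.Printf("SetLinger(%d): %s (close error: %v)\n", sec, lingerNote(sec), err)
		client.Close()
	}
}
```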
It seems that way (at least to me). After running |
The goal here is mainly to control how long the server attempts to flush the data before dropping it, rather than relying on an amorphous OS default. |
If this is the case, then I believe there is nothing to do with |
Hm, there are other goroutines calling The main difference here is that |
I think this may actually be the issue. It seems that on Linux systems, calling Apologies for the sketchy link (let me know if I can send this info better somehow): https://www.nybek.com/blog/2015/04/29/so_linger-on-non-blocking-sockets/
|
Sorry for the ambiguous wording in my previous comment. Your usage of
Despite what this link says, according to your comment, |
My understanding is that the OS default is simply the TCP timeout. And my understanding is that That said, @panjf2000 makes a good point: can you confirm that when you see a goroutine hanging in close, it is hanging specifically on the line Is there a test case we can run to recreate the problem? |
Yeah, I wouldn't consider this a pressing issue for us; we're likely to just remove our usage of
Yes. Here is a trace showing that (was run using our release binaries which were built with go1.19.6):
I'll work on getting a reproducible test case in the next couple days. |
After writing a very short program I was able to replicate I got this from running this server:
package main
import (
"fmt"
"net"
"time"
)
func main() {
fmt.Println("starting to listen")
listener, err := net.Listen("tcp", ":7777")
if err != nil {
panic(err)
}
fmt.Println("listening")
fmt.Println("waiting to accept")
conn, err := listener.Accept()
if err != nil {
panic(err)
}
fmt.Println("accepted connection")
fmt.Println("setting linger to 15s")
tcpConn := conn.(*net.TCPConn)
err = tcpConn.SetLinger(15) // SetLinger takes seconds; 15 matches the 15s logged here
if err != nil {
panic(err)
}
fmt.Println("set linger to 15s")
fmt.Println("writing some data")
msg := make([]byte, 1<<20)
n, err := conn.Write(msg)
if err != nil {
panic(err)
}
fmt.Printf("wrote %d bytes of data\n", n)
startClose := time.Now()
fmt.Println("starting to close the connection")
err = conn.Close()
if err != nil {
panic(err)
}
fmt.Printf("closing the connection took %s\n", time.Since(startClose))
}
Along with running the client from the previously mentioned SO_LINGER tests: https://github.com/nybek/linger-tools/blob/master/linger-client.c with the arguments The output of the server should look like:
This was all run using the go version + go env listed at the beginning of the issue. I feel like this is already a deviation from the documented behavior on |
Here is a pure Go example that includes both the server and the client and has the same results as above:
package main
import (
"fmt"
"net"
"sync"
"time"
)
func main() {
fmt.Println("starting to listen")
listener, err := net.Listen("tcp", ":")
if err != nil {
panic(err)
}
fmt.Println("listening")
defer listener.Close()
addr := listener.Addr()
var wg sync.WaitGroup
wg.Add(1)
go func() {
fmt.Println("waiting to accept")
conn, err := listener.Accept()
if err != nil {
panic(err)
}
fmt.Println("accepted connection")
fmt.Println("setting linger to 15s")
tcpConn := conn.(*net.TCPConn)
err = tcpConn.SetLinger(15)
if err != nil {
panic(err)
}
fmt.Println("set linger to 15s")
fmt.Println("writing some data")
msg := make([]byte, 1<<20)
n, err := conn.Write(msg)
if err != nil {
panic(err)
}
fmt.Printf("wrote %d bytes of data\n", n)
startClose := time.Now()
fmt.Println("starting to close the connection")
err = conn.Close()
if err != nil {
panic(err)
}
fmt.Printf("closing the connection took %s\n", time.Since(startClose))
wg.Done()
}()
conn, err := net.Dial("tcp", addr.String())
if err != nil {
panic(err)
}
defer conn.Close()
wg.Wait()
} |
Ok, I think I'm able to fully close the loop on this now. This program replicates the blocking on Here is the output from
package main
import (
"fmt"
"net"
"sync"
"time"
)
func main() {
fmt.Println("starting to listen")
listener, err := net.Listen("tcp", ":")
if err != nil {
panic(err)
}
fmt.Println("listening")
defer listener.Close()
addr := listener.Addr()
var wg sync.WaitGroup
wg.Add(1)
go func() {
fmt.Println("waiting to accept")
conn, err := listener.Accept()
if err != nil {
panic(err)
}
fmt.Println("accepted connection")
fmt.Println("setting linger to 15s")
tcpConn := conn.(*net.TCPConn)
err = tcpConn.SetLinger(15) // SetLinger takes seconds; 15 matches the 15s logged here
if err != nil {
panic(err)
}
fmt.Println("set linger to 15s")
go func() {
fmt.Println("starting read")
_, _ = conn.Read([]byte{0})
fmt.Println("exited read")
}()
fmt.Println("writing some data")
msg := make([]byte, 1<<20)
n, err := conn.Write(msg)
if err != nil {
panic(err)
}
fmt.Printf("wrote %d bytes of data\n", n)
startClose := time.Now()
fmt.Println("starting to close the connection")
err = conn.Close()
if err != nil {
panic(err)
}
fmt.Printf("closing the connection took %s\n", time.Since(startClose))
wg.Done()
}()
conn, err := net.Dial("tcp", addr.String())
if err != nil {
panic(err)
}
defer conn.Close()
wg.Wait()
}
It seems like the call to |
Thanks. For me that program blocks for 15 seconds in the call to But I guess we can mention that in the |
About the blocking on
Also, I've run the test code above on macOS and it didn't reproduce this issue, which supports my updated comment. |
Change https://go.dev/cl/473915 mentions this issue: |
Thank you for all the effort here. |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
We added an explicit call to SetLinger(15) here: ava-labs/avalanchego@1a2dca1
What did you expect to see?
go/src/net/tcpsock.go
Lines 161 to 173 in b94dc38
SetLinger with a positive value will:
We expected the OS to flush any outstanding data over the TCP stream in the background.
What did you see instead?
It doesn't seem that the data is being sent in the background; rather, conn.Close() may block until the specified timeout.
I don't think this is actually unexpected for the behavior of SO_LINGER:
However, I feel like the comment on SetLinger seems to contradict the man pages.