-
Notifications
You must be signed in to change notification settings - Fork 18k
net: copy from Unix socket to os.Stdout fails with "waiting for unsupported file type" #59041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Actually it looks like this can happen with TCP as well, so it may not be CL 466015. |
Change https://go.dev/cl/476335 mentions this issue: |
I don't understand why there is a stat call needed asking the inode caches to achieve this instead of the much cheaper and more obviously correct normal tty check that only checks for presence a kernel file operation (equivalent to a static itab compare in Go). See e.g. https://github.com/mattn/go-isatty how to achieve this. Checking for terminal only would still allow splicing to high speed device drivers that implement it correctly. |
Honestly, we've accumulated so many special-case checks on that path by now that I wonder if it would be more efficient to just try the |
@bcmills The problem in this particular case is that we actually use two splice calls, one from the source to a pipe, and one from the pipe to the destination. We do already try the first splice call and give up if it fails. But if it works then we have started reading from the source, so if the second splice fails then we are stuck. This change is trying to detect whether the second splice will work. Though now that I write that it occurs to me that perhaps we could add still more complexity: if the second splice fails, we could just copy the data ourselves. The result would likely be slower overall but it might work. @nightlyone Thanks for the suggestion. |
I now think that the stat call was a mistake on my part, and it may not be required at all. Taking it back out. Thanks for raising the issue. |
Change https://go.dev/cl/477035 mentions this issue: |
Was removing We've been running into an issue on a couple of our proxy servers (running github.com/andybalholm/redwood) where certain downloads time out instead of completing normally. We can't reproduce the issue on other machines, and the effects aren't even quite the same on the two machines. But it's quite consistent on those machines. So we did a diff --git a/src/internal/poll/splice_linux.go b/src/internal/poll/splice_linux.go
index 9505c5dcfc..4690962ae6 100644
--- a/src/internal/poll/splice_linux.go
+++ b/src/internal/poll/splice_linux.go
@@ -13,6 +13,9 @@ import (
)
const (
+ // spliceNonblock makes calls to splice(2) non-blocking.
+ spliceNonblock = 0x2
+
// maxSpliceSize is the maximum amount of data Splice asks
// the kernel to move in a single call to splice(2).
// We use 1MB as Splice writes data through a pipe, and 1MB is the default maximum pipe buffer size,
@@ -89,7 +92,7 @@ func spliceDrain(pipefd int, sock *FD, max int) (int, error) {
return 0, err
}
for {
- n, err := splice(pipefd, sock.Sysfd, max, 0)
+ n, err := splice(pipefd, sock.Sysfd, max, spliceNonblock)
if err == syscall.EINTR {
continue
}
@@ -127,7 +130,7 @@ func splicePump(sock *FD, pipefd int, inPipe int) (int, error) {
}
written := 0
for inPipe > 0 {
- n, err := splice(sock.Sysfd, pipefd, inPipe, 0)
+ n, err := splice(sock.Sysfd, pipefd, inPipe, spliceNonblock)
// Here, the condition n == 0 && err == nil should never be
// observed, since Splice controls the write side of the pipe.
if n > 0 { When we make this change to the tip of the master branch, our problem goes away, and all tests still pass. Interestingly, the removal of |
I removed Is there any difference in the kernel version on the machines where your code fails? Can you share a test case that demonstrates the failure? Thanks. |
Both of the machines with the problem are running Linux kernel 3.10.0 (from CentOS 7). We do have a few different kernel versions in use, but some of the ones that are not affected are also running 3.10.0. I have no idea how to make a test case. For my |
Looking at the documentation for
If the files are blocking, the splice call can block even if SPLICE_F_NONBLOCK is set. It doesn't say what could happen if the files are nonblocking but SPLICE_F_NONBLOCK is set, but maybe there are circumstances where the splice pipe operations themselves can block. |
Actually, I believe that the As far as I can tell, the original intent of Furthermore, the Linux kernel added I've drafted a CL for this. |
Change https://go.dev/cl/536015 mentions this issue: |
@panjf2000 Thanks for the analysis. |
…stency with O_NONBLOCK For #59041 Details: #59041 (comment) Change-Id: Id3fc1df6d86b7c4cc383d09f9465fa8f4cc2a559 Reviewed-on: https://go-review.googlesource.com/c/go/+/536015 Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
…stency with O_NONBLOCK For golang#59041 Details: golang#59041 (comment) Change-Id: Id3fc1df6d86b7c4cc383d09f9465fa8f4cc2a559 Reviewed-on: https://go-review.googlesource.com/c/go/+/536015 Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
@ianlancetaylor, @panjf2000 - thanks for the analysis and fix. goroutine 1643 [IO wait]: goroutine 1644 [syscall]: Would you know when the fix for https://go.dev/cl/536015 will be available in git? I don't see this as an open issue. Would the fix make it in 1.21.4 or will it be available in a new version of golang.org/x/net? Thanks! |
Could you pull the latest code of Go from the |
We have created a test program to reproduce the issue in a Linux system with with kernel <
We need to do some code cleanup before we can share a minimum program to reproduce. Should this issue be re-opened? Or should a new issue be created? It seems the issue is severe enough that it should be backported to the 1.21 branch. |
The attached go file can be used to recreate/validate the fix.
Stalled: Success: |
We should open a new issue for the subsequent discussions and link it to this issue. Could you open a new issue to which you can move those previous comments and attachments about the suspicious behaviors (including what @andybalholm mentioned earlier)? Thanks! @sebastien-rosset @sandeshrk |
I have created #63795 |
Change https://go.dev/cl/538117 mentions this issue: |
… splice to avoid inconsistency with O_NONBLOCK Fixes #63801 Updates #59041 Updates #63795 Details: #59041 (comment) Change-Id: Id3fc1df6d86b7c4cc383d09f9465fa8f4cc2a559 Reviewed-on: https://go-review.googlesource.com/c/go/+/536015 Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com> (cherry picked from commit 40cdf69) Reviewed-on: https://go-review.googlesource.com/c/go/+/538117 Auto-Submit: Heschi Kreinick <heschi@google.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
On HEAD this test function fails:
I see
Here line 50 is the
t.Fatal
called whenio.Copy
fails.This test passes with Go 1.20.
This is most likely due to https://go.dev/cl/466015. CC @panjf2000
The text was updated successfully, but these errors were encountered: