net/http: short writes with FileServer on macos #70000
Comments
Can you show us code that we can use to replicate the problem? Thanks. CC @golang/darwin |
CC @panjf2000 |
There's nothing special to it, really:

    m := http.NewServeMux()
    m.Handle("/large-payload/", http.StripPrefix("/large-payload/", http.FileServer(http.Dir("path/to/large-payload"))))

It looks at a directory with a bunch of numeric files, all 350MiB in size:

    $ tree path/to/large-payload | head -n3
    path/to/large-payload
    ├── 1
    ├── 10
    ...

and I simply hit the server with:

    curl localhost:44444/large-payload/1 -o ./test

In an OK case, this results in:
In the error case (
|
Can we get the error returned on your server side? This 970b1c0 will ship in Go 1.24. |
@panjf2000 I'll gladly comply if you can narrow the context. At a high level, it writes short here: Line 424 in ed07b32
From what I observed, there's no error, e.g.: go/src/net/sendfile_unix_alt.go Line 77 in ed07b32
returns with
the number of bytes written fluctuates, e.g.:
the main point is that it's a short write. Please let me know if you need more details. |
As @a-palchikov mentioned, we don't see any errors from the server itself. The observation from the client is a partially fetched file. |
cc @neild |
This seems to do the trick:
The sequence of events is as follows:
Given the series of EAGAIN results followed by a call with no error, it bails out when it reaches the limit (4MiB). |
@a-palchikov Thanks. Looks like we need something similar in the other sendfile implementations as well. |
Just wanted to chime in that this exhibits worse behavior on the client side when uploading large files. I have a client that will hang indefinitely on macOS pretty regularly, which I believe is due to this issue, as it uses an This was a bit of a heisenbug for me because when I tried wrapping the Thanks for uncovering this! |
Change https://go.dev/cl/622235 mentions this issue: |
@gopherbot Please open backport issues. This causes inexplicable and difficult to avoid failures on BSD systems. We should backport the fix. |
Backport issue(s) opened: Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases. |
The BSD implementation of poll.SendFile incorrectly halted copying after successfully writing one full chunk of data. Adjust the copy loop to match the Linux and Solaris implementations.

In testing, empirically macOS appears to sometimes return EAGAIN from sendfile after successfully copying a full chunk. Add a check to all implementations to return nil after successfully copying all data if the last sendfile call returns EAGAIN.

For #70000

Change-Id: I57ba649491fc078c7330310b23e1cfd85135c8ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/622235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Thanks for the details! I think I've found the root cause, and you should be able to get rid of this issue by applying the changes below to your go1.23.2 source code:

diff --git a/src/internal/poll/sendfile_bsd.go b/src/internal/poll/sendfile_bsd.go
index f2d13e1069..1f9c42253a 100644
--- a/src/internal/poll/sendfile_bsd.go
+++ b/src/internal/poll/sendfile_bsd.go
@@ -35,12 +35,16 @@ func SendFile(dstFD *FD, src int, pos, remain int64) (written int64, err error,
 		if int64(n) > remain {
 			n = int(remain)
 		}
+		m := n
 		pos1 := pos
 		n, err = syscall.Sendfile(dst, src, &pos1, n)
 		if n > 0 {
 			pos += int64(n)
 			written += int64(n)
 			remain -= int64(n)
+			if n == m {
+				continue
+			}
 		}
 		if err == syscall.EINTR {
 			continue

Please try it out and let me know if it works, thanks! @a-palchikov |
Change https://go.dev/cl/622255 mentions this issue: |
@panjf2000 Yes, it works as well and is short-circuiting the wait write (which I assume is only necessary when handling the EAGAIN). Thanks! |
Change https://go.dev/cl/622696 mentions this issue: |
Change https://go.dev/cl/622697 mentions this issue: |
…Sendfile return on BSD

The BSD implementation of poll.SendFile incorrectly halted copying after successfully writing one full chunk of data. Adjust the copy loop to match the Linux and Solaris implementations.

In testing, empirically macOS appears to sometimes return EAGAIN from sendfile after successfully copying a full chunk. Add a check to all implementations to return nil after successfully copying all data if the last sendfile call returns EAGAIN.

For #70000
For #70020

Change-Id: I57ba649491fc078c7330310b23e1cfd85135c8ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/622235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
(cherry picked from commit bd388c0)
Reviewed-on: https://go-review.googlesource.com/c/go/+/622696
…dfile(2) sending the full chunk CL 622235 would fix #70000 while resulting in one extra sendfile(2) system call when sendfile(2) returns (>0, EAGAIN). That's also why I left sendfile_bsd.go behind, and didn't make it line up with other two implementations: sendfile_linux.go and sendfile_solaris.go. Unlike sendfile(2)'s on Linux and Solaris that always return (0, EAGAIN), sendfile(2)'s on *BSD and macOS may return (>0, EAGAIN) when using a socket marked for non-blocking I/O. In that case, the current code will try to re-call sendfile(2) immediately, which will most likely get us a (0, EAGAIN). After that, it goes to `dstFD.pd.waitWrite(dstFD.isFile)` below, which should have been done in the first place. Thus, the real problem that leads to #70000 is that the old code doesn't handle the special case of sendfile(2) sending the exact number of bytes the caller requested. Fixes #70000 Fixes #70020 Change-Id: I6073d6b9feb58b3d7e114ec21e4e80d9727bca66 Reviewed-on: https://go-review.googlesource.com/c/go/+/622255 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Damien Neil <dneil@google.com> Run-TryBot: Andy Pan <panjf2000@gmail.com> Reviewed-on: https://go-review.googlesource.com/c/go/+/622697
Is this a regression, or has it been broken for a long time? I.e., what is the range of Go versions with this issue? |
I believe it was introduced in this commit, so it affects only the 1.23 release. |
Go version
go version go1.23.2 darwin/arm64
Output of go env in your module/workspace:
What did you do?
I have a small webserver that serves static files in a testing setup.
What did you see happen?
I'm on Go 1.23.2 on macOS 14.7 and I'm having issues with the stdlib http.FileServer.
After much poking around, I found that the BSD sendfile implementation in internal/poll/sendfile_bsd.go, with maxSendfileSize set to 4MiB, is causing all large file transfers to fail. For example, the following is the curl failure:
so all requests are failing with short writes.
If I update maxSendfileSize to the maximum (adopting the change from this commit), I don't observe any failures.
Just changing maxSendfileSize feels like a hack though (maybe I'm wrong), so I'd appreciate any hints on this behavior.
What did you expect to see?
The sendfile API should be able to send files in their entirety.