-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: broken sendfile on SmartOS/Solaris? large files garbled when sent over network #13892
Comments
I suspect the sendfile implementation for Solaris is broken, then. Can you modify func sendFile(c *netFD, r io.Reader) (written int64, err error, handled bool) {
return And see if that fixes it? (it would revert to Read+Write) /cc @4ad |
@bradfitz You nailed it. I added a log line and the return statement, and I can now see the file being served correctly. Really appreciate the fast diagnosis. |
Could you please try with the latest SmartOS image? Thanks. |
Just for confirmation, you tested on a recent SmartOS image, a.i. a recent kernel, not an older SmartOS image with a recent base-64 image? |
@4ad the tests were all run on instances in the Joyent Public Cloud, base-64 is the name of their default basic 64-bit SmartOS image. joyent_20151002T091442Z is SmartOS Base Image v15.3.0. I can donate an instance for testing, if that helps. EDIT: FWIW, I just rebuilt Consul with net/sendfile patched, and it now too works as expected. |
Perhaps we should just disable sendfile on Solaris for Go 1.6 as a safe option. We can re-enable in Go 1.7 once we understand and fix the problem. |
I think that is a good idea On Tue, 12 Jan 2016, 05:57 Brad Fitzpatrick notifications@github.com
|
There are reports of corruption. Let's disable it for now (for Go 1.6, especially) until we can investigate and fix properly. Update #13892 Change-Id: I557275e5142fe616e8a4f89c00ffafb830eb3b78 Reviewed-on: https://go-review.googlesource.com/18540 Reviewed-by: Dave Cheney <dave@cheney.net>
Moved milestone to Go1.7. In Go 1.6, sendfile will just be disabled for Solaris and it will fall back to the path that works. |
Any news here? |
+cc @binarycrusader |
I'll look into this. |
Have reproduced on Solaris development build using decompressed Mark.Twain-Tom.Sawyer.txt.bz2 from src/compress/bzip2/testdata. |
It consistently fails after the first 262,144 characters (256KiB), where it starts to the repeat the input all over again. |
So the root cause of this issue is that the implementation of sendfile() for Solaris is not correct. In particular, for sizes beyond the size of the socket buffer, the same data will be sent This is only vaguely referenced in the Solaris sendfile(3c) man page. The FreeBSD sendfile(2) man page however makes this quite clear:
As a result, I believe the implementations for the following platforms may not actually be correct:
I am currently testing to determine whether any other platforms need a similar fix. |
So my new test works on Solaris, but did not fail as expected on DragonFly/FreeBSD, despite the fact that they clearly do not account for the EAGAIN scenario. I would welcome any insight others might have to as to why it did not fail. The only guess I have is that the file is not big enough to trigger the failure on FreeBSD. |
CL https://golang.org/cl/21769 mentions this issue. |
What version of Go are you using (go version)? go version go1.5.1 solaris/amd64
What operating system and processor architecture are you using? SunOS 5.11 joyent_20141030T081701Z i86pc i386 i86pc Solaris, SunOS 5.11 joyent_20151002T091442Z i86pc i386 i86pc Solaris
What did you do? Retrieved a >250K file over http from a golang server
What did you expect to see? The correct contents of the file
What did you see instead? A file with the correct size, but the first ~25% of the file repeated to fill the file.
If I access the file through localhost, the file is served correctly. If I send it over any network interface, the bug appears. Different lengths of the file are repeated to fill the space in each case. All smaller files come through without error.
I attempted to step through using godebug but ran into mailgun/godebug#12, since the issue is clearly somewhere in the standard library. So I have setup a temporary fileserver to demonstrate the issue.
text.text.2
contains the beginning of Sherlock Holmes from Project Gutenberg (thanks Norvig).Control:
python -m SimpleHTTPServer
http://165.225.174.156:8000/
http://165.225.174.156:8000/test.txt.2
Serves correct contents of file.
Bug:
gohttp
(itang/gohttp)http://165.225.174.156:8080/
http://165.225.174.156:8080/test.txt.2
Serves garbled file, every time.
Localhost:
gohttp
with a barebones nginx proxy from localhosthttp://165.225.174.156/
http://165.225.174.156/test.txt.2
Serves correct contents of file.
I discovered this bug because we just recently have hashicorp/consul building on SmartOS (hashicorp/consul#159), and the Consul UI contains a ~750K minified JS file. Consul uses gorilla/mux and
http.FileServer
, but I am able to reproduce with the much simpler case of gohttp.The test suite for the http response methods doesn't have tests from a remote machine, or use large files, so this bug can slip through in most cases unnoticed. Please let me know if I can provide more details. We really need a way to step through and debug the standard libs!
The text was updated successfully, but these errors were encountered: