Skip to content

Need for Improvements on Documentation of io.Copy #38436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rafaelreinert opened this issue Apr 14, 2020 · 6 comments
Closed

Need for Improvements on Documentation of io.Copy #38436

rafaelreinert opened this issue Apr 14, 2020 · 6 comments
Labels
Documentation Issues describing a change to documentation. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@rafaelreinert
Copy link

Hey Guys, Recently I've been studying a lot about system calls and I noticed that The standard library uses many of then to increase performance, like on io.Copy when if Reader is a file, Writer a TCPConn and the OS is linux, freebsd or windows, it uses sendfile or splice. Another example is this issue to be implemented #36817.

But I feel these different paths are not clear enough on documentation. Example:
On io.Copy:

go/src/io/io.go

Lines 363 to 367 in 34e38ac

// If src implements the WriterTo interface,
// the copy is implemented by calling src.WriteTo(dst).
// Otherwise, if dst implements the ReaderFrom interface,
// the copy is implemented by calling dst.ReadFrom(src).
func Copy(dst Writer, src Reader) (written int64, err error) {

On net.TCPConn.ReadFrom:

go/src/net/tcpsock.go

Lines 98 to 99 in 34e38ac

// ReadFrom implements the io.ReaderFrom ReadFrom method.
func (c *TCPConn) ReadFrom(r io.Reader) (int64, error) {

If the programmer doesn't read all the code, he never will realize what the code is doing. Like he will never know that his solution isn't copying the file to userspace but it's just sending from page cache to socket.

I would like to know if we can describe these cases on io.Copy comments or create a wiki about this topic? What is the best approach to spread this knowledge?

Thanks a lot :) and it's my first issue on this project, if I've done something wrong let me know ;)

@gopherbot gopherbot added the Documentation Issues describing a change to documentation. label Apr 14, 2020
@ianlancetaylor
Copy link
Contributor

Thanks. My first reaction is: why does it matter? It's important that we clearly document what io.Copy does, but I don't see any need to document exactly what system calls it uses. Especially when that is going to vary across operating systems and operating system versions.

I think it would be entirely reasonable to write a blog post about this, maybe even on the Go blog. I'm not sure it has a clear place on the wiki, as I don't know how people would discover it.

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Apr 14, 2020
@andybons
Copy link
Member

Thanks for the issue.

If the programmer doesn't read all the code, he never will realize what the code is doing. Like he will never know that his solution isn't copying the file to userspace but it's just sending from page cache to socket.

What unexpected behavior results from the implementation details you describe? How often does a programmer need to understand that level of detail? I don’t really care how http.Get works internally as long as it behaves the way I expect it to. If something weird is going on in an uncommon case due to an implementation detail, then inspecting the code is the next step I would take.

I echo Ian’s points as well (posted before I got a chance to send this :). I think a blog post is a good first start, perhaps.

@andybons andybons added this to the Unplanned milestone Apr 14, 2020
@ericlagergren
Copy link
Contributor

I think major optimizations like splice, sendfile, etc. should at least be mentioned by name in TCPConn’s docs. (io.Copy is definitely out of scope.)

On more than one occasion I’ve seen stuff like this: https://github.com/inetaf/tcpproxy/blob/b6bb9b5b82524122bcf27291ede32d1517a14ab8/tcpproxy.go#L434

While Brad is intimately familiar with net, most people aren’t. Taking advantage of sort of generic optimizations shouldn’t require understanding the guts of net.

@ianlancetaylor
Copy link
Contributor

I agree that the use of magic methods and the risk of hiding them is a general problem, but I'm not quite sure it's this problem.

@rafaelreinert
Copy link
Author

Hey, I've got the point, Thanks @ianlancetaylor @andybons @ericlagergren, when I wrote this issue I thought about some not trivial cases, like if I serve a static file and I have some trouble on my response time, if I know that my application is using these kindle of syscalls I go directly to page cache stats to see the hit ratio. but if I don't know it, I will spend more time to figure that.
But how @andybons has pointed, It's a really uncommon case and probably people who have these kinds of troubles have good knowledge about it.

Now I think that a blog post is a good first start, if you haven't plans to create it on go blog, I can create it on medium and see the effectiveness, we can revisit this issue in future.

About the @ericlagergren point, write a mention on TCPConn is a good idea for me, I found some comments which has this kind of approach like:

go/src/net/http/server.go

Lines 568 to 570 in dad94e7

// ReadFrom is here to optimize copying from an *os.File regular file
// to a *net.TCPConn with sendfile.
func (w *response) ReadFrom(src io.Reader) (n int64, err error) {

go/src/net/http/transport.go

Lines 1670 to 1673 in ea1437a

// ReadFrom exposes persistConnWriter's underlying Conn to io.Copy and if
// the Conn implements io.ReaderFrom, it can take advantage of optimizations
// such as sendfile.
func (w persistConnWriter) ReadFrom(r io.Reader) (n int64, err error) {

go/src/net/net.go

Lines 622 to 624 in cbaa666

// Fallback implementation of io.ReaderFrom's ReadFrom, when sendfile isn't
// applicable.
func genericReadFrom(w io.Writer, r io.Reader) (n int64, err error) {

And about @ianlancetaylor last comment I really agree with you, magic methods have trade-offs but in this case, I strongly believe it gives a good optimisation with a low cost. and in golang is easy to understand this kind of methods.

Thanks ;)

@andybons
Copy link
Member

Would you like to close this issue or repurpose it to discuss the TCPConn comment?

@golang golang locked and limited conversation to collaborators Apr 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation Issues describing a change to documentation. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

5 participants