Using vectorized IO (scatter/gather) #404

Open
jakirkham opened this issue Feb 28, 2021 · 3 comments

Comments

@jakirkham
Contributor

Operations like writelines in the Stream and Transport APIs give library authors a way to hand over a collection of buffers to be written, sent, etc. in one go. This can be really handy, as the buffers take only one pass (as opposed to one pass per buffer) through the layers of code that prepare them before they go out.
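For readers less familiar with the API, here is a minimal sketch of what a writelines call looks like from the caller's side, using plain asyncio streams. The helper name `send_with_writelines` and the use of a socket pair in place of a real connection are assumptions made for illustration only.

```python
import asyncio
import socket


async def send_with_writelines(chunks):
    # A connected socket pair stands in for a real network connection.
    left, right = socket.socketpair()
    reader, reader_side_writer = await asyncio.open_connection(sock=right)
    _, writer = await asyncio.open_connection(sock=left)

    # One call hands the whole collection of buffers to the transport.
    writer.writelines(chunks)
    await writer.drain()

    # Read back exactly as many bytes as were handed over.
    expected = sum(len(chunk) for chunk in chunks)
    data = await reader.readexactly(expected)

    writer.close()
    reader_side_writer.close()
    await writer.wait_closed()
    await reader_side_writer.wait_closed()
    return data


if __name__ == "__main__":
    print(asyncio.run(send_with_writelines([b"header\r\n", b"payload", b"\r\n"])))
```

How the buffers are combined or sent underneath is up to the event loop implementation, which is what the rest of this issue is about.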

Many OSes supply similar C-level operations for file descriptors, like writev on POSIX-compatible systems or its equivalent on Windows. Similarly, sendmsg on Linux and Unix and WSASend on Windows provide implementations for sockets.
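Python's standard library exposes both calls on Unix, so the OS-level behavior can be tried directly. The sketch below is not uvloop or libuv code; the helper names are made up for illustration, and it assumes a Unix platform where `os.writev` and `socket.sendmsg` are available.

```python
import os
import socket


def gather_write_to_pipe(buffers):
    # os.writev (Unix only) writes several buffers with a single system call.
    read_fd, write_fd = os.pipe()
    try:
        written = os.writev(write_fd, buffers)
        return written, os.read(read_fd, written)
    finally:
        os.close(read_fd)
        os.close(write_fd)


def gather_send_on_socket(buffers):
    # socket.sendmsg (Unix only) does the same for sockets.
    left, right = socket.socketpair()
    try:
        sent = left.sendmsg(buffers)
        return sent, right.recv(sent)
    finally:
        left.close()
        right.close()
```

In both cases the separate buffers arrive as one contiguous byte stream on the other end, without being concatenated in Python first.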

Admittedly, I am not very familiar with libuv's API (so maybe devs here can comment on this), but it appears that some libuv APIs, like uv_write, can take multiple buffers and internally redirect to sendmsg or WSASend. This also appears to be true for files with the uv_fs_write API.

AFAICT (and I could be wrong about this) uvloop's writelines for Streams calls an internal _write function in a loop, which could write one entire buffer (if it is sufficiently large, etc.) or at least queue a write. Please correct me if I'm misunderstanding anything here.

However, given libuv's own propensity to use scatter/gather IO under the hood, it might be worth holding off on queuing write operations until all of the buffers in writelines are collected and prepped. This would allow one larger send, write, etc. to occur, and if the total is above the high watermark for any buffer (likely?), no additional buffer prepping would be necessary either.


Side note: A separate interesting question would be doing something similar for reading. Not sure there is an API that could leverage this currently (may be wrong about this though). Maybe through pausing and resuming reading one could get close (though likely still leaves something on the table)?
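For reference, the OS-level counterpart on the read side does exist in Python's standard library as `socket.recvmsg_into` (Unix only), which fills several pre-allocated buffers in one call. The sketch below only shows that primitive; it is not an asyncio or uvloop API, and the helper name `scatter_read` is made up for illustration.

```python
import socket


def scatter_read(payload, sizes):
    left, right = socket.socketpair()
    try:
        left.sendall(payload)
        # Pre-allocated destination buffers; the kernel fills them in order.
        buffers = [bytearray(size) for size in sizes]
        nbytes, _ancdata, _flags, _addr = right.recvmsg_into(buffers)
        return nbytes, [bytes(buf) for buf in buffers]
    finally:
        left.close()
        right.close()
```

Calling `scatter_read(b"headerbody", [6, 4])` splits the incoming bytes across the two buffers in a single receive. Whether an event loop could expose something like this is the open question.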

Note: There may be similar optimizations possible in asyncio ( python/asyncio#339 ) ( python/cpython#19062 )

@jakirkham
Contributor Author

After digging into this more, I think uv__try_write, which uvloop already calls in _exec_write, uses sendmsg and so does handle scatter/gather. Plus uvloop can already collect multiple buffers to send. So uvloop is already doing most of the right things.

We would just want to hold off on calling _queue_write until we have finished processing buffers in writelines.
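To make the difference concrete, here is a toy model of the two strategies. It is a hypothetical stand-in, not uvloop's actual code: the class `ToyStream` and its methods are invented for illustration, and it assumes (as a simplification) that every queue call results in its own write.

```python
class ToyStream:
    """Hypothetical stand-in for a stream; not uvloop's real implementation."""

    def __init__(self):
        self.pending = []      # buffers collected but not yet queued
        self.queue_calls = 0   # how many times a write was queued
        self.flushed = []      # each entry models one gathered write

    def _buffer(self, data):
        self.pending.append(bytes(data))

    def _queue_write(self):
        # Assumption: each queue call turns into one gathered write.
        self.queue_calls += 1
        self.flushed.append(list(self.pending))
        self.pending.clear()

    def writelines_eager(self, bufs):
        # Queue after every buffer: many small writes.
        for buf in bufs:
            self._buffer(buf)
            self._queue_write()

    def writelines_batched(self, bufs):
        # Collect everything first, then queue once: one larger write.
        for buf in bufs:
            self._buffer(buf)
        self._queue_write()


if __name__ == "__main__":
    eager, batched = ToyStream(), ToyStream()
    eager.writelines_eager([b"a", b"b", b"c"])
    batched.writelines_batched([b"a", b"b", b"c"])
    print(eager.queue_calls, batched.queue_calls)
```

In this model the batched variant queues a single write carrying all three buffers, which is the shape of the change being proposed.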

@jakirkham
Contributor Author

Submitted PR ( #445 ) to make the change proposed in the last comment.

@jakirkham
Contributor Author

PR ( #445 ) has since been merged and was included in the 0.17.0 release.

An open question remains around how to do this for reading.
