Using vectorized IO (scatter/gather) #404

jakirkham · 2021-02-28T00:36:42Z

Operations like writelines in the Stream and Transport APIs give library authors a way to hand over a collection of buffers to be written, sent, etc. in one go. This can be really handy, as preparing the buffers takes only one pass (as opposed to multiple passes) through the layers of code before they go out.

Many OSes supply similar C-level operations for file descriptors, such as writev on POSIX-compatible systems or its equivalents on Windows. Similarly, sendmsg on Linux and Unix and WSASend on Windows provide implementations for sockets.
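For readers less familiar with these calls, here is a minimal sketch of gather writes from Python's standard library, assuming a POSIX platform (os.writev and socket.sendmsg are not available everywhere). The buffer contents are made up for illustration; this is not uvloop or libuv code.

```python
import os
import socket

# Three separate buffers that should go out as one write.
buffers = [b"header:", b"payload", b"\n"]

# File descriptors: os.writev wraps writev(2) and gathers the buffers
# into a single system call.
read_fd, write_fd = os.pipe()
written = os.writev(write_fd, buffers)
os.close(write_fd)
pipe_data = os.read(read_fd, 1024)
os.close(read_fd)

# Sockets: socket.sendmsg wraps sendmsg(2) and also accepts a list of buffers.
left, right = socket.socketpair()
sent = left.sendmsg(buffers)
left.close()

# Read until EOF so the result does not depend on how the data is chunked.
chunks = []
while True:
    chunk = right.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
right.close()
sock_data = b"".join(chunks)

print(written, pipe_data)
print(sent, sock_data)
```

In both cases the receiver sees one contiguous byte stream, even though the sender never concatenated the buffers itself.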

Admittedly I am not very familiar with libuv's API (so maybe devs here can comment on this), but it appears that some libuv APIs like uv_write can take multiple buffers and internally redirect to sendmsg or WSASend. This also appears to be true for files with the uv_fs_write API.

AFAICT (and I could be wrong about this) uvloop's writelines for Streams calls an internal _write function in a loop, which could write one entire buffer (if it is sufficiently large, etc.) or at least queue a write. Please correct me if I'm misunderstanding anything here.

However, given libuv's own propensity to use scatter/gather IO under the hood, it might be worth holding off on queuing write operations until all of the buffers in writelines are collected and prepped. This would allow one larger send, write, etc. to occur, and if it is above the high watermark for any buffer (likely?), no additional buffer prepping would be necessary either.

Side note: A separate interesting question would be doing something similar for reading. Not sure there is an API that could leverage this currently (may be wrong about this though). Maybe through pausing and resuming reading one could get close (though likely still leaves something on the table)?
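At the OS level the reading counterpart does exist. As a small illustration (again assuming a POSIX platform, with made-up buffer sizes and data), socket.recvmsg_into and os.readv scatter incoming bytes across several preallocated buffers. Whether and how asyncio or uvloop could expose this remains the open question above; nothing here is an existing asyncio or uvloop API.

```python
import os
import socket

# Socket side: scatter one incoming message across two preallocated buffers.
left, right = socket.socketpair()
left.sendall(b"HEADpayload!")
left.close()

header = bytearray(4)
body = bytearray(8)
# recvmsg_into fills the buffers in order and returns
# (nbytes, ancdata, msg_flags, address).
nbytes, _ancdata, _flags, _addr = right.recvmsg_into([header, body])
right.close()

# File descriptor side: os.readv does the same for pipes and files.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"HEADpayload!")
os.close(write_fd)

file_header = bytearray(4)
file_body = bytearray(8)
file_nbytes = os.readv(read_fd, [file_header, file_body])
os.close(read_fd)

print(nbytes, bytes(header), bytes(body))
print(file_nbytes, bytes(file_header), bytes(file_body))
```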

Note: There may be similar optimizations possible in asyncio ( python/asyncio#339 ) ( python/cpython#19062 )

jakirkham · 2021-09-08T22:14:39Z

After digging into this more, I think uv__try_write, which uvloop already calls in _exec_write, uses sendmsg and so does handle scattering. Plus, uvloop can already collect multiple buffers to send. So uvloop is already doing most of the right things.

We would just want to hold off on calling _queue_write until we have finished processing buffers in writelines.
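To make the difference concrete, here is a toy model in plain Python. It is only a sketch: ToyTransport and its methods are hypothetical names invented for this illustration, and the real uvloop code paths (_write, _queue_write, _exec_write) are more involved. Each recorded flush stands in for one vectored send.

```python
class ToyTransport:
    """Hypothetical stand-in for a stream transport; not uvloop's code."""

    def __init__(self):
        self._pending = []   # buffers collected but not yet sent
        self.flushes = []    # one entry per simulated vectored send

    def _buffer(self, data):
        # Skip empty buffers, keep the rest in order.
        if data:
            self._pending.append(bytes(data))

    def _flush(self):
        # Simulates one gather write covering everything collected so far.
        if self._pending:
            self.flushes.append(tuple(self._pending))
            self._pending.clear()

    def writelines_eager(self, buffers):
        # Queue a write after every buffer: many small sends.
        for buf in buffers:
            self._buffer(buf)
            self._flush()

    def writelines_deferred(self, buffers):
        # Queue the write only after all buffers are processed: one send.
        for buf in buffers:
            self._buffer(buf)
        self._flush()


lines = [b"GET / HTTP/1.1\r\n", b"Host: example\r\n", b"\r\n"]

eager = ToyTransport()
eager.writelines_eager(lines)

deferred = ToyTransport()
deferred.writelines_deferred(lines)

print(len(eager.flushes), len(deferred.flushes))
```

The same bytes go out in the same order either way; the deferred variant just hands them to the lower layer in a single batch.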

jakirkham · 2021-10-01T21:16:43Z

Submitted PR ( #445 ) to make the change proposed in the last comment.

jakirkham · 2022-09-26T08:58:24Z

PR ( #445 ) has since been merged and included in the 0.17.0 release.

An open question remains around how to do this for reading.

jakirkham mentioned this issue Mar 15, 2021

Profiling Scheduler Performance dask/distributed#4443

Open

jakirkham mentioned this issue Oct 1, 2021

Queue write only after processing all buffers #445

Merged
