-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inspect io wrappers #5033
Inspect io wrappers #5033
Conversation
There are use cases like checking hashes of files that benefit from being able to inspect bytes read as they come in, while still letting the main code process the bytes as normal (e.g. deserializing into objects, knowing that if there's a hash failure, you'll discard the result). As this is non-trivial to get right (e.g. handling a `buf` that's not empty when passed to `poll_read`, add a wrapper `InspectReader` that gets this right, passing all newly read bytes to a supplied `FnMut` closure. Fixes: tokio-rs#4584
When writing things out, it's useful to be able to inspect the bytes that are being written and do things like hash them as they go past. This isn't trivial to get right, due to partial writes and efficiently handling vectored writes (if used). Provide an `InspectWriter` wrapper that gets this right, giving a supplied `FnMut` closure a chance to inspect the buffers that have been successfully written out. Fixes: tokio-rs#4584
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments on where bounds also apply to the writer.
Sorry about the delay in reviewing this. We had some trouble with our CI setup that took a while to fix. |
Co-authored-by: Alice Ryhl <aliceryhl@google.com>
tokio-util/src/io/inspect.rs
Outdated
if let Poll::Ready(Ok(count)) = res { | ||
(me.f)(&buf[..count]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if count
is zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current code, f
gets passed an empty slice, and is expected to handle this - which is fine for hashing and teeing uses,
I would prefer to document this, but can put a guard in place to stop you getting an empty slice if you'd prefer that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that we should avoid empty slices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do so for both writes and reads - the user should know about EOF on read, or ENOSPC on write from the "main" write or read process, and thus be able to handle that without needing the inspection callback to be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing an empty slice on EOF read seems ok, but otherwise I would not expect the closure to ever get passed an empty slice.
tokio-util/src/io/inspect.rs
Outdated
for buf in bufs { | ||
let size = count.min(buf.len()); | ||
(me.f)(&buf[..size]); | ||
count -= size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if buf
is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then f
gets called with an empty slice, since count.min(buf.len())
should be zero, and &buf[..0]
is an empty slice.
I only see two cases in which this can happen, and in both cases I think f
should be able to handle it (reasoning as above):
- The wrapped
AsyncWrite
genuinely returned a zero byte write. In this case, it is claiming to successfully write zero bytes, which is allowed in the contract (implying that the underlying object can't accept more data ever again). - The caller supplied one or more empty buffers to
poll_write_vectored
. In this case, you're being told that the supplied empty buffer was indeed written, which is accurate (if not particularly helpful).
I will add a documentation note to warn users that f
must cope with empty buffers to both wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoiding empty slices instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some doc suggestions.
tokio-util/src/io/inspect.rs
Outdated
/// If no new data is supplied by a successful `poll_read`, then `f` will | ||
/// be called with an empty slice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// If no new data is supplied by a successful `poll_read`, then `f` will | |
/// be called with an empty slice. | |
/// The closure is called with an empty slice only if the inner reader has reached EOF, or if `poll_read` is called with an empty buffer. |
tokio-util/src/io/inspect.rs
Outdated
/// `f` will never be called with an empty slice; a vectored write will | ||
/// result in multiple calls to `f`, one for each buffer that was used by | ||
/// the write. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally avoid sentences that start with code since we can't make f
a capital letter.
/// `f` will never be called with an empty slice; a vectored write will | |
/// result in multiple calls to `f`, one for each buffer that was used by | |
/// the write. | |
/// The closure `f` will never be called with an empty slice. A vectored write can result in multiple calls to `f`. |
Motivation
When writing or reading data, it can be useful to inspect the bytes passing through, so that you can (e.g.) hash as you write, or verify hashes of read data concurrently with deserialization.
Wrapping
AsyncRead
andAsyncWrite
successfully so that you can inspect the bytes being handled has a few edge cases to get right, so let's bundle this all up into one place.Solution
A pair of wrappers that allow you to inspect either the bytes being read, or the bytes that have been written, plus test cases to show that they work correctly even when the underlying implementation does partial writes, or is only able to read in small chunks.