Add Reader.read_at_least() #13127
@@ -320,8 +319,12 @@ pub enum IoErrorKind {
    ResourceUnavailable,
    IoUnavailable,
    InvalidInput,
    /// The Reader returned 0 from `read()` too many times.
I'd clarify this to "0 bytes"
I'm a little uncomfortable about the litany of methods that are getting added to the Approaching |
@alexcrichton I'm also open to the suggestion of renaming it to |
@alexcrichton I removed |
/// Fails if `len` is greater than the length of `buf`.
fn read_at_least(&mut self, buf: &mut [u8], len: uint) -> IoResult<uint> {
    assert!(len <= buf.len());
    // always read at least once in case len == 0
Why do we need to read at least once?
Because it seems exceedingly odd to me to call `read_at_least(buf, n)` and have it read zero times.
Hm, I'm going to try to put some thinking down into words, this is all just thinking out loud. Before this change, we have this list of methods for dealing with a partial amount of bytes on a
After this change, we have this list of methods
This still seems a little sprawling to me, especially when I look at the types that everything operates on. For example, Some thoughts:
Depending on the answers, we may be able to pare down the API to
fn read(&mut self, buf: &mut [u8]) -> IoResult<uint>;
fn read_at_least(&mut self, buf: &mut [u8], amt: uint) -> IoResult<uint>;
fn read_exact(&mut self, buf: &mut [u8]) -> IoResult<()>;
I like that this only operates on A little bit of a ramble, but I'm thinking that these utility functions on a |
@alexcrichton I would like to remove let bytes = try!(file.read_exact(names_bytes as uint - 1)); became the following:
and that felt just awkward enough that I decided not to remove it at this time. But I only count 5 calls to As for I could see removing |
Also, looking at your suggested API again, |
Looking at the API again, I think that I'm wondering if there's any utility to splitting In any case, I'm open to removing |
What about doing something like how the |
@DaGenix As for doing something like |
I was thinking that .next() could read whatever type is appropriate depending upon the value being assigned to: byte, uint, float, etc. However, I failed to consider big-endian vs. little-endian issues. So, yeah, never mind this idea.
@alexcrichton, @brson: Any more thoughts on this PR? |
I just rebased on top of the recent |
I'm sorry it took a while to get back on this, but can you elaborate on what the use case for this is? I can't really think of a case where I want to read N bytes, but if I read some number greater than N I can easily deal with it.
@alexcrichton I think the rationale for this is to provide a convenience method that deals with zero-length reads. I'm not sure if @kballard has other use cases in mind. Though, now that I think about it, for solving zero-length reads, maybe |
The short answer is we need a |
Rebased on top of master. I still think this is something we should do. @pongad: I don't think that not returning the number of bytes read is a problem. It's hard to think of a use case where those bytes are actually useful. The only real case I can think of where I'd want to be able to use those bytes is if I'm trying to report an error that includes the truncated bytes, and that seems niche enough that it's not worth complicating the API. |
@alexcrichton et al, any more feedback? I need to rebase it at this point, but I'd like to know that we can move forward with this. |
    data: self.as_ptr().offset(start as int),
    len: (end - start)
})
}
Can we avoid extending `Vec`'s API further?
So you prefer the ugly code that `slice_capacity()` is replacing? To be clear, that's calling `set_len()` to pretend uninitialized memory is valid, slicing that, then using `try_finally()` to call `set_len()` back to the correct value when done.
No, I prefer to not expand `Vec`'s API on a whim. If you want to add this as a private helper in `std::io`, that's fine.
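For readers on present-day Rust, here is a minimal sketch of the kind of private helper discussed in this thread. It is not the code from this PR: `push_from` is a hypothetical name, and it is written against the modern `std::io::Read` trait rather than the 2014 `Reader`. Instead of calling `set_len()` over uninitialized capacity, it zero-fills the tail of the `Vec`, reads into that slice, and truncates back to what was actually read, which trades an extra memset for staying in safe code.

```rust
use std::io::{self, Read};

/// Hypothetical private helper: read up to `len` bytes from `reader` and
/// append them to `buf`. Returns the number of bytes appended.
fn push_from<R: Read>(reader: &mut R, buf: &mut Vec<u8>, len: usize) -> io::Result<usize> {
    let start = buf.len();
    // Zero-fill the tail so the slice handed to `read` is always initialized.
    buf.resize(start + len, 0);
    let result = reader.read(&mut buf[start..]);
    // Keep only the bytes that were actually read; on error, keep none.
    let n = match &result {
        Ok(n) => *n,
        Err(_) => 0,
    };
    buf.truncate(start + n);
    result
}
```

Because the truncation runs on both the success and the error path, the caller never observes the zero padding, which is the job `try_finally()` did in the `set_len()` version described above.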
I'm ok with merging this pending comments. It will need a re-worded commit message to reflect the breaking change as well. |
@alexcrichton Thanks. I need to rebase it, and I'll reword the commit message. Based on my reply to your comments, do you still believe I need to change the semantics of |
It looks like this is kinda based off what go is doing, and they don't do the read when Additionally, perhaps an error should be returned rather than failing in these new methods? |
Hmm, I didn't check to see what Go does. I'm actually vaguely surprised to see that I documented that the read is always performed because I wanted clients to be able to rely on it. If you don't want it documented, then I think it's better to not do the read when Given this precedent, I'll go ahead and change the behavior to skip the read.
The failure happens in response to a logic error, i.e. the client passing a |
Our current spirit is to not fail as much as possible, and this seems like an easy case to not fail (you're already returning an error), and I figured that the |
I didn't realize that's what |
r? @alexcrichton I've made the requested changes, and also squashed it down to one commit. |
Reader.read_at_least() ensures that at least a given number of bytes
have been read. The most common use-case for this is ensuring at least 1
byte has been read. If the reader returns 0 enough times in a row, a new
error kind NoProgress will be returned instead of looping infinitely.
This change is necessary in order to properly support Readers that
repeatedly return 0, either because they're broken, or because they're
attempting to do a non-blocking read on some resource that never becomes
available.

Also add .push() and .push_at_least() methods. push() is like read() but
the results are appended to the passed Vec.

Remove Reader.fill() and Reader.push_exact() as they end up being thin
wrappers around read_at_least() and push_at_least().

[breaking-change]
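The behaviour described in the commit message can be sketched in present-day Rust roughly as follows. This is an illustration under stated assumptions, not the merged implementation: it is a free function over the modern `std::io::Read` trait, the cap `MAX_ZERO_READS` and its value are invented for the example, and `io::ErrorKind::Other` stands in for the `NoProgress` kind, which modern `std` does not have. Note also that in modern `std::io::Read` a return of `Ok(0)` conventionally signals end of input, whereas the 2014 `Reader` discussed here treated it as "no data yet".

```rust
use std::io::{self, Read};

/// Assumed cap on consecutive zero-length reads; the value is illustrative.
const MAX_ZERO_READS: u32 = 1000;

/// Keep calling `read` until at least `min` bytes are in `buf`.
/// Returns the total number of bytes read, which may exceed `min`.
fn read_at_least<R: Read>(reader: &mut R, buf: &mut [u8], min: usize) -> io::Result<usize> {
    // A too-large `min` is a caller error; report it instead of panicking.
    if min > buf.len() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "buffer is too short"));
    }
    let mut total = 0;
    let mut zeroes = 0;
    // With `min == 0` the loop body never runs, so no read is issued.
    while total < min {
        match reader.read(&mut buf[total..])? {
            0 => {
                zeroes += 1;
                if zeroes >= MAX_ZERO_READS {
                    // Stand-in for the `NoProgress` error kind added by the PR.
                    return Err(io::Error::new(io::ErrorKind::Other, "no progress"));
                }
            }
            n => {
                // Any progress resets the counter of consecutive empty reads.
                zeroes = 0;
                total += n;
            }
        }
    }
    Ok(total)
}

/// A reader that never produces data, like a resource that never becomes available.
struct AlwaysZero;

impl Read for AlwaysZero {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0)
    }
}
```

Calling `read_at_least(&mut AlwaysZero, &mut buf, 1)` returns an error after the cap is reached instead of spinning forever, and a `read_exact`-style helper reduces to calling it with `min` equal to `buf.len()`, which matches the "thin wrapper" observation above.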