Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify Read trait blocking behavior #625

Merged
merged 5 commits into from
Sep 9, 2024
Merged

Conversation

ivmarkov
Copy link
Contributor

This PR is an effort to clarify the blocking behavior of the Read trait with regards to the length of the user-supplied buffer.

It is a follow up on a recent discussion in the "Rust Embedded" Matrix chat where the agreement was that we could probably be more explicit about that.

For now, I've changed only the documentation of the blocking Read trait. Once/if we arrive at a text which seems reasonable to everybody, I'll extend the PR with a similar change to embedded_io_async::Read.

Ideally, we should arrive at a piece of documentation that we could also justify to be included upstream in the Rust STD Read trait as well, and possibly in its async variants in the futures and tokio crates.

@ivmarkov ivmarkov requested a review from a team as a code owner August 24, 2024 07:45
@ivmarkov
Copy link
Contributor Author

@Dirbaio @jamesmunns Pinging you here as you were actively participating in the Rust Embedded discussion on the subject.

Copy link
Member

@jamesmunns jamesmunns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I'm find with this addition, I think it is very clearly written!

My only questions are whether we say "SHOULD" or "is recommended", to make clear that this isn't mandatory, at least until we have matching language in std::io::Read, or whether we note "this is how the recv syscall works in posix, and we're aiming to match that behavior which is generally consistent in all major desktop operating systems".

embedded-io/src/lib.rs Outdated Show resolved Hide resolved
Co-authored-by: James Munns <james@onevariable.com>
@ivmarkov
Copy link
Contributor Author

Overall I'm find with this addition, I think it is very clearly written!

My only questions are whether we say "SHOULD" or "is recommended", to make clear that this isn't mandatory, at least until we have matching language in std::io::Read, or whether we note "this is how the recv syscall works in posix, and we're aiming to match that behavior which is generally consistent in all major desktop operating systems".

I'm very certain it cannot be "SHOULD" or "recommended". It is a "MUST", so to say, or else any protocol implementation on top of Read that assumes this behavior is simply buggy and needs to be rewritten to read only byte-by-byte.

That includes the "read_until" readers in STD and the fill_buf implementation.

@MabezDev
Copy link
Member

I'm very certain it cannot be "SHOULD" or "recommended". It is a "MUST", so to say, or else any protocol implementation on top of Read that assumes this behavior is simply buggy and needs to be rewritten to read only byte-by-byte.

I would agree with this, otherwise we end up in the same place where one implementor may choose not to follow this "recommendation" and we end up with differing impls.

This LGTM from my perspective once we've applied the same to the async docs.

/// If bytes are available to read:
/// - A non-zero amount of bytes is read to the beginning of `buf`, and the amount is returned immediately,
/// *without blocking and waiting for more bytes to become available*;
/// - It is not guaranteed that *all* available bytes are returned, it is possible for the implementation to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: this line looks like it applies only to the "if there are bytes available to read" case, while it should apply to both. Perhaps split it to a new paragraph and remove the -.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, but it is applicable only to the "if there are bytes available to read" case, as it talks about - if there are bytes available to read in the first place - how many of those would be returned - i.e. not necessarily all of them (because (a) the buffer might actually be shorter than what is available (b) because the implementation just decides so, for whatever other reasons).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to the first case as well, doesn't it? In "Once at least one (or more) bytes become available", one could also ask if it's guaranteed that all available bytes are returned. The answer is "it is not guaranteed", but it's not completely obvious from the description as it is written now.

But as Dirbaio wrote, it is a "minor nit". Personally, I would not expect to have such a guarantee when reading the text you proposed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how come the first case is unclear that only a subset of the just-read bytes are returned, given that it says (emphasis mine):
"

  • Once at least one (or more) bytes become available, a non-zero amount of those is copied to the beginning of buf, and the amount is returned...

"

"a non-zero amount of those". Do we really need to be more explicit than that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah be it. If two folks find it unclear then it is probably unclear. :(
I'll re-arrange so that the problematic text is a separate paragraph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jannic I've re-arranged it now so that the problematic text is a separate paragraph. If you can take a quick look - thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how come the first case is unclear that only a subset of the just-read bytes are returned, given that it says (emphasis mine): "

* Once at least one (or more) bytes become available, **a non-zero amount of those** is copied to the beginning of `buf`, and the amount is returned...

"

"a non-zero amount of those". Do we really need to be more explicit than that?

IMHO the only possible source of confusion was the asymmetry between the two cases. Yes, in general I agree that the wording "a non-zero amount of those" was sufficiently clear. But then the second case, in addition to that, explicitly stated that it's allowed to return less than available. For the first case, that sentence was missing. Readers may wonder why? Is there a difference in guarantees?

Again, this was a very minor detail. But if we can avoid this possible confusion by just putting the last sentence in a separate paragraph, as you now did, we should do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, don't want to over-complicate it, but now - with the problematic text being a separate paragraph - the next paragraph that starts with "This waiting behavior is important for the cases where Read..." becomes unclear... in that readers might be confused that it only refers to the newly-born "Note..." paragraph, while it refers to everything mentioned above.

Any brilliant ideas? Or is it just my mind that finds the new arrangement also problematic?

@Dirbaio
Copy link
Member

Dirbaio commented Aug 25, 2024

LGTM. Thanks for taking care of this 👍

(I too think it should be MUST, for the already-stated reasons)

@ivmarkov
Copy link
Contributor Author

I've now added identical changes to the async version of the Read trait, where the only changes I did is to use "waiting" at the places where the blocking version uses either "blocking" or "blocking and waiting".

With that said, shall we delay the merge a bit until I open a similar PR or issue for the std::io::Read trait? If they accept a similar change, they might have more feedback to the text, that we might want to incorporate in our Read trait versions too?

@Dirbaio
Copy link
Member

Dirbaio commented Aug 26, 2024

Up to you. Getting feedback or merges to rust-lang PRs is not exactly fast though :)

@ivmarkov
Copy link
Contributor Author

Up to you. Getting feedback or merges to rust-lang PRs is not exactly fast though :)

I want to mitigate the danger of making the described behavior a MUST in our traits, and get resistance from the STD team to do it for the Read trait.

  • If they just return feedback that it is "obvious" - fine...
  • ...but if they say that it is unspecified i.e. "Option B" (what to do with the line-readers and stuff is then another topic), that would be quite a problem...

@Dirbaio
Copy link
Member

Dirbaio commented Aug 26, 2024

IMO even if std explicitly decides to make it Option B we should go for Option A.

Option B is just insane. as you point out many things like ReadBuf / read_line become broken unless you read byte by byte.

@adamgreig
Copy link
Member

We briefly discussed this in the meeting today as well, it seems like merging this now seems the best option. We can always wait a bit before releasing it to see what happens on rust-lang.

@ivmarkov
Copy link
Contributor Author

We briefly discussed this in the meeting today as well, it seems like merging this now seems the best option. We can always wait a bit before releasing it to see what happens on rust-lang.

OK - then be it. I've anyway not even started the PR/issue in STD upstream.

So if you could look at the embedded_io_async::Read trait wording as well, and if no additional feedback, I guess we can merge.

@ivmarkov
Copy link
Contributor Author

ivmarkov commented Sep 7, 2024

If no further feedback, can somebody with committer permissions approve this?

jannic
jannic previously approved these changes Sep 7, 2024
/// If bytes are available to read:
/// - A non-zero amount of bytes is read to the beginning of `buf`, and the amount is returned immediately,
/// *without blocking and waiting for more bytes to become available*;
/// - It is not guaranteed that *all* available bytes are returned, it is possible for the implementation to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to the first case as well, doesn't it? In "Once at least one (or more) bytes become available", one could also ask if it's guaranteed that all available bytes are returned. The answer is "it is not guaranteed", but it's not completely obvious from the description as it is written now.

But as Dirbaio wrote, it is a "minor nit". Personally, I would not expect to have such a guarantee when reading the text you proposed.

Copy link
Member

@MabezDev MabezDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for taking care of this!

@MabezDev MabezDev added this pull request to the merge queue Sep 9, 2024
Merged via the queue into rust-embedded:master with commit 0c31d81 Sep 9, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants