Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ReadableStreamBYOBReader.prototype.read(view, { min }) #1145

Merged
merged 43 commits into from
Nov 13, 2023

Conversation

MattiasBuelens
Copy link
Collaborator

@MattiasBuelens MattiasBuelens commented Jul 15, 2021

Implements #1143 and #1175.

In my first attempt, I did most of the heavy lifting in a ReadableStreamBYOBReaderReadFully() helper that calls read(view). If the first read didn't fill the entire view yet, it immediately calls read(view) again but with a special "addToFront" flag so that the new read-into request goes to the front of the queue rather than the back. This works fine, but it's not very pretty. It also means we're possibly calling read(view) re-entrantly from within ReadIntoRequest.chunkSteps(), which is somewhat pushing the boundaries of what you can reasonably expect from the spec. 😬 (I've kept this attempt in the commit history, so you can judge for yourselves.)

I then realized that we already have logic that keeps a read-into request in the queue after a respond(). When the user performs a BYOB read with a multi-byte typed array, the stream only fulfills the request when at least elementSize have been written into the pull-into descriptor. If not, the request stays pending and the pull-into descriptor remains at the head of the queue. Therefore, in my second attempt, I add a special type of pull-into descriptor which must stay in the queue as long as bytesFilled < byteLength, rather than the usual bytesFilled < elementSize. This also works very well, but requires changes across multiple parts of the code.

In #1175, we realized that it would be more powerful to allow reading at least a given number of bytes, rather than reading an exact number of bytes. Thus, in my third attempt, every pull-into descriptor now has a "minimum fill" slot that tracks how many bytes must be filled before this descriptor can fulfill its associated read(view) request. As a result, the bytesFilled < elementSize and bytesFilled < byteLength conditions from the previous solution are now replaced with a single bytesFilled < minimumFill condition. This greatly simplifies the logic.

To do:

  • Decide upon the final API
    • Current proposal: byobReader.read(view, { min })
    • Do we still want byobReader.fill(view) too? This would basically be a shorthand for read() where min = view.length
  • Write (many more) tests to prove that this actually handles all possible edge cases
  • Write the specification text once we've decided on the approach

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems way too easy! I expected something more like the manual loop you would have to do if writing this in userspace.

If it works, I definitely like this approach. I imagine tests will let us know if it does...

@MattiasBuelens
Copy link
Collaborator Author

I looked at a few other languages to see what their API looks like:

  • Java java.io.DataInput.readFully
    • If end of file is detected before filling up to b.length, then this throws an EOFException. In this case, it may be that some but not all bytes of b have been updated with data from the input stream.
  • Rust std::io::Read::read_exact
    • If this function encounters an “end of file” before completely filling the buffer, it returns an error of the kind ErrorKind::UnexpectedEof. The contents of buf are unspecified in this case.
  • .NET System.IO.Stream.ReadAsync
    • Behaves like ReadableStreamBYOBReader.read(view), in that it doesn't necessarily fill the entire buffer. I couldn't find a direct equivalent that does fill up the entire buffer. Even BinaryReader.ReadBytes() may read less than the requested number of bytes.
    • BinaryReader.ReadUint32() throws an EndOfStreamException if it detects EOF before reading 4 bytes.

It looks like most other languages throw an error if they encounter EOF before completely filling the buffer. That said, the Streams standard already takes a different approach for "regular" reads too: instead of returning the number of bytes read into the given buffer, we transfer the given buffer and return a { done, value } pair with a view on that buffer. So it seems appropriate to align closer with our existing read(view) and use the done flag to indicate if we encountered EOF earlier. We can also guarantee that the returned view contains all data up to EOF, rather than leaving the buffer contents unspecified.

If we want to bikeshed a bit on the name, here are some ideas:

  • readFully(view)
  • readExact(view)

Copy link
Collaborator Author

@MattiasBuelens MattiasBuelens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we should update (or remove) the readInto() example too? Since that can be now replaced with a single readFully() call.

index.bs Outdated Show resolved Hide resolved
@MattiasBuelens
Copy link
Collaborator Author

I've been thinking about whether we want to expose pullIntoDescriptor.[[readFully]] to the underlying byte source, e.g. something like ReadableStreamBYOBRequest.isReadFullyRequest.

  • Pro: the underlying byte source might be able to optimize for large reads, instead of receiving repeated pull() calls with continuations of the same BYOB request. For example, for ReadableByteStreamTee, we could forward a readFully() request from the branch all the way to the original stream.
  • Con: we might end up with too much buffering. Again using the ReadableByteStreamTee example: if the first branch does readFully(new Uint8Array(1024)) and then the second branch does read(new Uint8Array(1)), the second branch wouldn't receive any data until the underlying source of the original stream has filled up the entire view of the forwarded readFully() request from the first branch. (This case probably deserves its own WPT test... 🤔)

So yeah, it might be an optimization opportunity, but it might also be a footgun. At least for now, I'll leave ReadableStreamBYOBRequest untouched. 🙂

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. This LGTM with the minor potential improvement of combining the boolean with reader type. @ricea do you think we could implement this? Shouldn't take long...

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
Copy link
Collaborator

@ricea ricea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With regards to the spec, this looks good.

Sorry to show up late with a bikeshed, but I like the name fill(). reader.fill(view) feels good to me. Disadvantages: no precedent for the name, and it would a nuisance to change all the tests and everything now. It may be to harder to find (but the reader doesn't have so many methods that it would be lost in the noise). I'm just suggesting it to see how others feel.

Regarding implementation in Blink, I think this clearly fixes an ergonomic problem. Some people expect read() to work this way already. Implementation is stuck behind #1123 and other things that are higher priority.

index.bs Outdated Show resolved Hide resolved
@MattiasBuelens
Copy link
Collaborator Author

Sure, fill(view) also works. Luckily readFully() was a very distinct name, so I could easily find/replace it. 😁

If we still don't like it, it's pretty easy to change/revert that last commit. Bikeshed away!

@MattiasBuelens MattiasBuelens changed the title Add ReadableStreamBYOBReader.readFully() Add ReadableStreamBYOBReader.fill(view) Aug 19, 2021
@MattiasBuelens MattiasBuelens marked this pull request as ready for review August 19, 2021 21:53
Copy link
Collaborator

@ricea ricea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. I like this very much.

@MattiasBuelens
Copy link
Collaborator Author

I take it that means Chrome is interested in implementing this? (Gotta tick those checkboxes! 😁)

@ricea
Copy link
Collaborator

ricea commented Aug 19, 2021

Yes, you can tick the checkbox for Chrome.

@domenic domenic added the addition/proposal New features or enhancements label Sep 7, 2021
@MattiasBuelens MattiasBuelens changed the title Add ReadableStreamBYOBReader.fill(view) Add ReadableStreamBYOBReader.read(view, { atLeast }) Jan 18, 2022
@MattiasBuelens MattiasBuelens force-pushed the rs-byob-read-fully branch 2 times, most recently from 1e427ce to 86b6982 Compare January 18, 2022 22:50
@MattiasBuelens
Copy link
Collaborator Author

Updated to use read(view, { atLeast }) instead of fill(view), as suggested in #1175. This is a more general solution: users can now choose to only fulfill when the view is filled with at least 1 byte, or 2 bytes, or view.length / 2 bytes, or whatever. With fill(view), it's all or nothing.

We can still introduce fill(view) as a shorthand for read(view, { atLeast: view.length }) if we want to. Happy to discuss. 🙂

Currently, the tests only cover the atLeast == view.length case (i.e. the fill() use case). I still need to write tests for the more general cases, i.e. where 0 < atLeast <= view.length.

@ricea
Copy link
Collaborator

ricea commented Jan 19, 2022

Thanks! I am happy not to have fill() since it is not too hard for people to use the more general API.

@lucacasonato
Copy link
Member

Deno is interested in implementing this.

@MattiasBuelens
Copy link
Collaborator Author

A little bikeshedding: how do we feel about the name "atLeast" for the new option? I'm thinking "min" might be better? (Still avoiding "minBytes", because of multi-byte typed arrays.)

const { done, value } = byobReader.read(view, { min: 8 });
// value is minimum 8 elements long

@jasnell
Copy link

jasnell commented Nov 13, 2023

Woo! super happy to see this land. Thanks for the work all.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Nov 22, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Nov 22, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723
vinnydiehl pushed a commit to vinnydiehl/mozilla-unified that referenced this pull request Nov 24, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723
vinnydiehl pushed a commit to vinnydiehl/mozilla-unified that referenced this pull request Nov 24, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Nov 30, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723

UltraBlame original commit: 820875020074b1b3ca04fde919a52f2fd580bbbb
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Nov 30, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723

UltraBlame original commit: 820875020074b1b3ca04fde919a52f2fd580bbbb
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Nov 30, 2023
…BReader.read(view, { min }), a=testonly

Automatic update from web-platform-tests
Streams: add tests for ReadableStreamBYOBReader read(view, { min })

Follows whatwg/streams#1145.
--

wpt-commits: 7eaf605c38d80377c717828376deabad86b702b2
wpt-pr: 29723

UltraBlame original commit: 820875020074b1b3ca04fde919a52f2fd580bbbb
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Oct 28, 2024
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this pull request Oct 29, 2024
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Oct 30, 2024
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this pull request Oct 30, 2024
jamienicol pushed a commit to jamienicol/gecko that referenced this pull request Oct 31, 2024
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: df7878660191f4991ce6a5898629c963b38a6e50
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: 56a71d42ee3220891aa1cb2dfae5df8f544865f9
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: df7878660191f4991ce6a5898629c963b38a6e50
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: 56a71d42ee3220891aa1cb2dfae5df8f544865f9
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: df7878660191f4991ce6a5898629c963b38a6e50
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Nov 1, 2024
… }). r=saschanaz,webidl,smaug

Implements whatwg/streams#1145

Differential Revision: https://phabricator.services.mozilla.com/D226225

UltraBlame original commit: 56a71d42ee3220891aa1cb2dfae5df8f544865f9
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this pull request Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements
Development

Successfully merging this pull request may close these issues.

8 participants