
Allow stream reader cancel() method to return bytes from queue instead of discarding them. #1147

Closed
DarDoro opened this issue Jul 18, 2021 · 12 comments

Comments

@DarDoro

DarDoro commented Jul 18, 2021

There is no benefit to returning {value: undefined, done: true} after "cancel()" and discarding already-received bytes.

Returning {value: bytes_read_so_far, done: true} would work much better, allowing a timeout to be implemented while preserving the data.
Also, "cancel()" implemented this way would stay compatible with the current implementation: when done=true, "readable" is unlocked, and when "value" is not "undefined" the last chunk can be reliably processed.
It's possibly the "lowest-hanging fruit" among many similar requests.
It would partially fulfill other requests, e.g. #1103
whatwg/fetch#180
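
For illustration, a sketch of how the requested behavior could be used for a read with a timeout. This shows the proposed semantics, not what the spec currently does, and processChunk is just a placeholder:

// Hypothetical: assumes cancel() resolves the pending read with the bytes
// received so far instead of discarding them (the behavior requested above).
const reader = port.readable.getReader();
const timer = setTimeout(() => reader.cancel(), 500); // give up after 500 ms
const { value, done } = await reader.read();
clearTimeout(timer);
if (value !== undefined) {
  processChunk(value); // bytes received before the timeout, not discarded
}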

@DarDoro
Author

DarDoro commented Jul 18, 2021

Sorry, intended for WICG/serial
WICG/serial#146

@DarDoro DarDoro closed this as completed Jul 18, 2021
@reillyeon

@domenic, can you reopen this issue? If there's action to take here then it should be done at the Streams API level.

While it might be nice syntactically to have cancel() return any bytes left in the queue, I think this can be polyfilled with:

const finalReadPromise = reader.read();
await reader.cancel();
const finalRead = await finalReadPromise;

If there was any data left in the queue then it will be in finalRead.value. Otherwise, if the read was blocked waiting for more data, then finalRead.value will be undefined. The key insight here is that you can do a "non-blocking" read to see if there are queued bytes by not awaiting the Promise it returns. Once you call cancel(), that Promise is guaranteed to have resolved one way or another.
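
Wrapped as a helper, that pattern might look like the sketch below (cancelAndDrain is a made-up name, not part of any API, and it relies on the behavior described above):

async function cancelAndDrain(reader) {
  const finalReadPromise = reader.read(); // start a read, but don't await it yet
  await reader.cancel();                  // cancel; the pending read settles
  const finalRead = await finalReadPromise;
  return finalRead.value;                 // queued bytes, or undefined if none
}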

@domenic domenic reopened this Jul 19, 2021
@domenic
Member

domenic commented Jul 19, 2021

This is indeed pretty interesting on the streams level.

In 8a7d92b we made cancel() immediately kill any pending read requests. I believe if that change gets implemented in Chromium it would invalidate @reillyeon's suggestion.

We did that because it was our belief that if you were canceling the read, you were OK with losing the memory. Canceling should be a relatively rare operation, so re-allocating each time seems OK.

Can you tell us more about your use case, where you want to cancel the stream and get the memory back, but were unable to wait for the read to complete?

@reillyeon

In 8a7d92b we made cancel() immediately kill any pending read requests. I believe if that change gets implemented in Chromium it would invalidate @reillyeon's suggestion.

Would it? If there are queued bytes then shouldn't the read complete before the call to cancel()? I guess it depends on whether read() queues a task to check the queue and so can never complete "synchronously".

@MattiasBuelens
Collaborator

MattiasBuelens commented Jul 19, 2021

While it might be nice syntactically to have cancel() return any bytes left in the queue, I think this can be polyfilled with:

const finalReadPromise = reader.read();
await reader.cancel();
const finalRead = await finalReadPromise;

That's pretty clever! 😀 It should work in simple cases, although I'm not sure how well it would translate to more complex scenarios (e.g. pipe chains or cross-realm streams) where chunks are buffered in multiple streams.

I don't know if we want to make this easier. Perhaps we could have a reader property indicating if there are chunks available in the queue (i.e. if [[queue]] is non-empty)? A bit like the EWOULDBLOCK error code, or Java's InputStream.available(). But as Domenic already asked: it would help us if we could better understand the use case. 🙂

Would it? If there are queued bytes then shouldn't the read complete before the call to cancel()? I guess it depends on whether read() queues a task to check the queue and so can never complete "synchronously".

Correct, the specification requires read() to pull a chunk from the queue synchronously. All of these steps happen in the same tick: ReadableStreamDefaultReaderRead ➡️ [[PullSteps]] ➡️ Remove entry from this.[[queue]].

However, it will only pull one chunk. If multiple chunks were enqueue()d, a single read() will only pull one of those chunks and there might still be chunks left in the queue by the time your example cancel()s the stream. That said, if you know you're dealing with a pull source with HWM = 0 (which is the main use case for BYOB), then you'll never have more than one chunk in the queue... so you'd still be fine.
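
To make that caveat concrete, here is a minimal sketch using a plain ReadableStream (no serial port involved): two chunks are queued up front, and the polyfill's single read() only recovers the first one.

const stream = new ReadableStream({
  start(controller) {
    controller.enqueue("first");
    controller.enqueue("second");
    controller.close();
  },
});

const reader = stream.getReader();
const finalReadPromise = reader.read(); // synchronously pulls "first" from the queue
await reader.cancel();                  // "second" is still queued and gets discarded
const finalRead = await finalReadPromise;
console.log(finalRead.value);           // logs "first"; the second chunk is lost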

@DarDoro
Author

DarDoro commented Jul 20, 2021

While it might be nice syntactically to have cancel() return any bytes left in the queue, I think this can be polyfilled with:

const finalReadPromise = reader.read();
await reader.cancel();
const finalRead = await finalReadPromise;

That's pretty clever! 😀 It should work in simple cases, although I'm not sure how well it would translate to more complex scenarios (e.g. pipe chains or cross-realm streams) where chunks are buffered in multiple streams.

I want exactly this behavior, but it does not work this way in current Chrome. "finalReadPromise" is immediately resolved with { value:undefined, done:true} and nothing is read from the port at all.
I send a lazy stream of 1 byte containing a sequence number every 50-100 ms.
I've got some results with this code, but with 1-2% data loss:

setTimeout(() => { reader.cancel(); }, 0);
const data = await reader.read();
console.log(data);

@reillyeon

I want exactly this behavior, but it does not work this way in current Chrome. "finalReadPromise" is immediately resolved with { value:undefined, done:true} and nothing is read from the port at all.
I send a lazy stream of 1 byte containing a sequence number every 50-100 ms.

I'd have to see a larger example, but are you sure that when this code runs the data has already been received and hasn't already been read? If the next sequence number hasn't been received yet then there's nothing in the queue and { value: undefined, done: true } is the expected result. If you know that the device will send something every 50-100 ms and want to be sure that you'll receive it, why not wait for it to be received and then cancel the stream?

Backing up though because maybe I don't understand what it is you are trying to accomplish at a higher level: Why are you canceling the stream?

The only reason to cancel a stream is (a) you are closing the port, or (b) you want to discard any queued data and start over with whatever the device sends next. From what you are saying about getting 1-2% data loss, it sounds like you are running your snippet in a loop and are losing 1-2% of the data you are expecting to receive. This is expected, because in addition to discarding data in the ReadableStream's queue, canceling also tells the operating system and hardware to flush their buffers, which means that if the data arrives at just the right time it won't be read and will instead be discarded. There's no reason to run this kind of code in a loop, however. If you want to read continuously from the device, just use the same reader and keep calling read().

In essence, if you don't want to lose data then don't call cancel(). I'd like to understand why you are calling cancel() in the first place because it is probably the wrong solution for the problem you are trying to solve.
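
For reference, the kind of continuous read loop being suggested might look like the sketch below (port is an already-open SerialPort and handleChunk is a placeholder for application logic):

// Keep one reader for the lifetime of the connection and keep calling read();
// only cancel when you actually want to stop reading or close the port.
const reader = port.readable.getReader();
try {
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;    // reader.cancel() was called somewhere else
    handleChunk(value); // value is a Uint8Array of freshly received bytes
  }
} finally {
  reader.releaseLock();
}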

@DarDoro
Author

DarDoro commented Jul 20, 2021

@domenic, can you reopen this issue? If there's action to take here then it should be done at the Streams API level.

While it might be nice syntactically to have cancel() return any bytes left in the queue, I think this can be polyfilled with:

const finalReadPromise = reader.read();
await reader.cancel();
const finalRead = await finalReadPromise;

If there was any data left in the queue then it will be in finalRead.value. Otherwise, if the read was blocked waiting for more data, then finalRead.value will be undefined. The key insight here is that you can do a "non-blocking" read to see if there are queued bytes by not awaiting the Promise it returns. Once you call cancel(), that Promise is guaranteed to have resolved one way or another.

So we agree that this does not preserve any data in the buffer?

Second, "reader.cancel()" is called to cancel the reader, NOT the stream itself. The whole stream has its own readable.cancel() to discard buffers.

This simple change (making the final outcome of a canceled read operation the bytes read so far, with done=true) would open up more possibilities, e.g. writing simple, sequential async code, even with the well-known timed-out read pattern.

Otherwise, despite having all that modern JS (async, await, promises, stream APIs), to send and read a few bytes you need to implement a state machine or your own FIFO buffer.

I literally used it this way in a loop:

if (FIFO.available()) return FIFO.pop();
const data = await reader.read();
FIFO.push(data.value);

But why design an API to lose data without any reason? Right after cancel(), the open port starts collecting the next bytes received.

As the spec says (https://wicg.github.io/serial/#dom-serialport-readable):
"Some other ways of encoding message ... to wait a defined length of time before transmitting the next message."
It's quite common to wait some time for an answer.

The read loop pattern is just a callback disguised as a promise-based API.
port.onavailable(data => { do_something(data) })
would do the same without the hassle of the whole infrastructure/locking.

But to do something non-trivial this way, you need to store and maintain at least the reader/writer plus some global state or a FIFO.
So we end up needing something like this (or the equivalent built with a generator):

do_something(data) {
  if (global_state === credentials_prompt) {
    global_state = credentials_reply;
    send_credentials();
  }
  if (global_state === credentials_reply) {
    if (credentialOK(data)) {
      global_state = credentials_accepted;
    } else {
      global_state = access_denied;
      reader.cancel();
      reader.releaseLock();
    }
  }
  if (global_state === credentials_accepted)
  ...

@DarDoro
Author

DarDoro commented Jul 20, 2021

I've reread the Streams spec.
I am now quite convinced that all I need is "byob" mode for Web Serial.
So my feature request is irrelevant and should be closed.
I will close this after 24h if no one wants it to stay open for further comments.
Thank you for the information.
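
For reference, a BYOB read looks roughly like the sketch below. Treat port.readable supporting mode: "byob" as an assumption (Web Serial did not expose a byte source at the time of this thread), and handleChunk as a placeholder:

// With BYOB the caller supplies the buffer, so bytes land directly in it
// instead of sitting in an internal queue that cancel() could throw away.
const reader = port.readable.getReader({ mode: "byob" });
let buffer = new ArrayBuffer(64);

const { value, done } = await reader.read(new Uint8Array(buffer));
if (!done) {
  handleChunk(value);    // value is a Uint8Array view over the bytes actually read
  buffer = value.buffer; // reclaim the (transferred) buffer for the next read
}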

@domenic
Member

domenic commented Jul 20, 2021

Second, "reader.cancel()" is called to cancel the reader, NOT the stream itself. The whole stream has its own readable.cancel() to discard buffers.

This is not correct. readable.cancel() is the same as readable.getReader().cancel().

If you want to release a given reader, use reader.releaseLock().
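
A minimal sketch of the distinction, using a bare ReadableStream:

const stream = new ReadableStream();

// Releasing the lock gives up the reader but keeps the stream usable:
const reader1 = stream.getReader();
reader1.releaseLock();  // the stream is NOT canceled and can be locked again

// Canceling through a reader cancels the stream itself:
const reader2 = stream.getReader();
await reader2.cancel(); // same effect as stream.cancel()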

@DarDoro
Author

DarDoro commented Jul 22, 2021

https://streams.spec.whatwg.org/#rs-prototype

4.4.3. Constructor, methods, and properties
(For web developers, non-normative)
await reader.cancel([ reason ])
If the reader is active, behaves the same as stream.cancel(reason).

It seems to be a non-normative implementation detail rather than part of the specification.

It is deliberately designed this way, so it's time to close. Maybe the desired behaviour will be implemented as an abortable read:
#1103
or
WICG/serial#122

#1123

From the other direction, if a stream consumer wants to cancel the stream (thus closing it), do they really expect to get their buffer back? It seems like annoying boilerplate to require that underlying source authors intercept cancels and send back the buffer. I'm inclined to just say that if you cancel the stream, you won't get your buffer back.

@DarDoro DarDoro closed this as completed Jul 22, 2021
@domenic
Member

domenic commented Jul 22, 2021

It's not a "non-normative implementation detail", it's a non-normative summary of the intentionally-designed normative algorithms elsewhere in the specification.
