Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async iteration #178

Closed
alubbe opened this issue Feb 19, 2020 · 5 comments
Closed

Async iteration #178

alubbe opened this issue Feb 19, 2020 · 5 comments

Comments

@alubbe
Copy link

alubbe commented Feb 19, 2020

At exceljs, we're currently exploring using async iteration as the interface with unzipper (exceljs/exceljs#1135). By default, unzip.Parse only emits the 'entry' event, so the following does not work

      const zip = unzip.Parse();
      someReadStream.pipe(zip);
      for await (const entry of zip) {
        // this loop block is never run
      }

I've found this in the code base, which allows me to 'enable' the normal stream api (via .push())

        if (self._readableState.pipesCount || (self._readableState.pipes && self._readableState.pipes.length))
          self.push(entry);

By creating a PassThrough, the above condition turns true and everything works:

      const zip = unzip.Parse();
      const passThrough = new Stream.PassThrough({objectMode: true});
      zip.pipe(passThrough);
      someReadStream.pipe(zip);
      for await (const entry of passThrough) {
        // now everything works!
      }

The PassThrough stream seems rather wasteful, so my question is - have you thought about async iteration in the design of your current implementation? One idea I had is to add an option during the instantiation, like unzip.Parse({pipeable: true}); and to extend the above check to this:

        if (self.opts.pipeable || self._readableState.pipesCount || (self._readableState.pipes && self._readableState.pipes.length))
          self.push(entry);

What do you think?

@ZJONSSON
Copy link
Owner

Hey @alubbe - really appreciate it. The code snippet above was unfortunately necessary to keep the backwards compatibility with unzip - i.e. if you are not piping, the parser should not pause and just continue emitting entry events (similar to flowing vs pause mode in native streams.

I agree it would be very convenient to force the stream mode so you can apply async iterators directly. Perhaps we should name the config more explicitly, i.e. forceStream: true, as if enabled the stream will be paused until you read it like a stream (i.e. the .on('entry) events will not emit automatically)

Would you mind doing a quick PR?

@alubbe
Copy link
Author

alubbe commented Feb 19, 2020

I could definitely do that - two quick questions:

  1. I'm not sure how to write a test for that, any suggestions on how to do that?
  2. should we disable/stop emitting 'entry' when forceStream is enabled?

@ZJONSSON
Copy link
Owner

  1. Currently the test suite runs on various legacy versions of node as a drop in replacement for unzip and therefore doesn't support the use of ES6 and async await etc. Any test that requires ES6 would have to be ignored by the legacy versions. See https://github.com/ZJONSSON/node-unzipper/pull/94/files for example (es6 code would have to be lazily loaded after the version check).

For the actual test itself, you can copy any of the tests in the /test directory, apply the async iterator and verify that results are correct

  1. I leave it up to you - I'm mostly indifferent because forceStream is a special case

@alubbe
Copy link
Author

alubbe commented Feb 21, 2020

Done: #180

@ZJONSSON
Copy link
Owner

closed with #180

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants