
Async support #121

Closed
jpillora opened this issue Apr 21, 2014 · 16 comments

@jpillora

No description provided.

@dduponchel
Collaborator

Async? An asynchronous API? async.js? What do you have in mind?

@jpillora
Author

Yep, an asynchronous version of the API. I'm for replacing the sync API,
though that might not sit well with everyone.

It currently works well for small files.

Though, if one were to zip 50MB (depending on the client; worse on mobile), the
browser would most definitely hang, which is a bad user experience. If you split
the computation across multiple ticks of the event loop, it would be possible
to show progress and still allow the user to interact with the page while
it archives.
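The tick-splitting idea could be sketched roughly like this (illustrative JavaScript only; none of these names are part of JSZip):

```javascript
// Sketch of splitting CPU-heavy work across event loop ticks.
// processChunks is a generator: each step transforms one slice of the
// input and yields a progress value, so a scheduler can hand control
// back to the browser between steps. All names here are hypothetical.
function* processChunks(data, chunkSize, transform) {
  const out = [];
  for (let i = 0; i < data.length; i += chunkSize) {
    out.push(transform(data.slice(i, i + chunkSize)));
    yield Math.min(1, (i + chunkSize) / data.length); // progress in [0, 1]
  }
  return out; // the generator's return value is the full result
}

// Async driver: one chunk per tick, keeping the page responsive.
function runAcrossTicks(gen, onProgress, onDone) {
  function step() {
    const { value, done } = gen.next();
    if (done) return onDone(value);
    onProgress(value);
    setTimeout(step, 0); // let the browser render / handle input
  }
  step();
}
```

A progress bar would hook into `onProgress`; the same generator can also be drained synchronously when blocking is acceptable.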

This could be improved further with web workers, with a double or quadruple
increase in archive speed!

I'm interested in using this library, though I haven't yet because of this.

I understand that this would be quite a large architectural change, so
let this be food for thought anyway.


@dduponchel
Collaborator

I agree with the need for an async API, at least for the CPU-heavy tasks.
I have other tasks to finish but that will be my next task :)

@jpillora
Author

Ah great, sounds good :)


@dduponchel
Collaborator

Just a quick update:

On a local branch I've created async getters (asXXXAsync()) and generateAsync() while keeping the synchronous methods. I had to duplicate some internal functions (related to compression, decompression, utf8 encode and utf8 decode) plus the API, but it's not that bad.
My code still needs some work/cleanup (plus documentation, unit tests and tests on IE) but the first results are encouraging:

Starting with a zip file (7MB) containing compressed xml files (412MB decompressed):

  • I read all the files from the zip
  • I recompress the decompressed files

Firefox/Chrome take between 4 and 14 seconds to decompress the files and 20 seconds to recompress them, without freezing the browser.
Node.js takes 10 seconds to decompress the files and 6 seconds to recompress them.

I'd say it's a good start.

@jpillora
Author

jpillora commented May 4, 2014

Awesome, sounds good David 👍


@dduponchel
Collaborator

I will add streaming support to this issue: removing the time limit but keeping the memory limit (because JSZip holds everything in memory) won't help much.

@dduponchel
Collaborator

This will take more time than expected: the async -> stream conversion isn't as straightforward as I thought (and streamed zip files break all my unit tests).
For the internal streams I started with readable-stream (the Node.js implementation) but that may not be a good idea. I feel like I'm fighting against them a lot: using objectMode because I'm streaming Uint8Array chunks, changing decodeStrings to avoid string -> Buffer conversions, etc. Since readable-stream heavily depends on Node.js Buffers, Browserify puts a lot of polyfills in the generated dist file.
On the plus side, the Node.js integration will be easy, the code handles back pressure for free, etc.
If anyone has used streams in a browser and has a better option, I'm interested!

@jpillora
Author

Any sync-to-async library conversion is a big job - I've done it a few times myself.

As for using node streams, I'd recommend you write your own basic streams implementation. All streams are just event emitters that you write() to and that emit("data", ...) over time, finishing with emit("end"). Streams2/3 are more complex, but since you're in the browser, I'd say they're not required.

As an idea, you could also make your streams behave synchronously when desired by using an optional setTimeout 0. This way your sync functions could call the new streaming functions in sync mode, preventing the need to maintain two versions of each function. Not sure if that's feasible though.
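The suggestion above (a bare event-emitter stream with an optional synchronous mode) could look roughly like this; everything here is a sketch, not JSZip internals:

```javascript
// Minimal stream sketch: an event emitter you write() to, which emits
// "data" chunks and a final "end". The `sync` flag implements the
// optional-setTimeout idea: in sync mode listeners fire immediately,
// in async mode they fire on a later tick. Hypothetical, not JSZip code.
function MiniStream(sync) {
  this.sync = !!sync;
  this.listeners = { data: [], end: [] };
}
MiniStream.prototype.on = function (event, cb) {
  this.listeners[event].push(cb);
  return this; // allow chaining, like node streams
};
MiniStream.prototype._emit = function (event, arg) {
  const fire = () => this.listeners[event].forEach(cb => cb(arg));
  if (this.sync) fire();    // synchronous delivery for the sync API
  else setTimeout(fire, 0); // deferred delivery keeps the page responsive
};
MiniStream.prototype.write = function (chunk) { this._emit("data", chunk); };
MiniStream.prototype.end = function () { this._emit("end"); };
```

With `sync = true` a synchronous public method can drive the exact same streaming code path and read the accumulated result immediately, which is the "one implementation, two modes" idea.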

@dduponchel
Collaborator

After fighting with/against streams, here is an update.

  • I started writing my own streams
  • I thought that I would have to implement a lot of features from nodejs streams
  • I rewrote my code to require('stream')
  • I discovered that nodejs streams have some issues
  • I wrote my own implementation of streams

Now, I agree with @jpillora: I should have started with my own basic stream implementation.

The results now. I rewrote a lot of stuff: instead of duplicating every method (sync vs async), I used synchronous "workers". Each one does one task (converting a chunk, utf8 transformation, deflate, etc.) and handles the "how". I have two implementations handling the "when": one synchronous and one asynchronous. Internal methods use workers; only the public API chooses the sync or async implementation.
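A minimal illustration of that split between the "how" and the "when" (hypothetical names, not the actual JSZip code):

```javascript
// A "worker" knows *how* to process one chunk; separate drivers decide
// *when* chunks are processed. This toy worker upper-cases strings; the
// real ones would deflate, utf8-encode, etc. All names are made up.
const upperCaseWorker = {
  processChunk: chunk => chunk.toUpperCase()
};

// Synchronous "when": process everything now, return the result.
function runSync(worker, chunks) {
  return chunks.map(c => worker.processChunk(c)).join("");
}

// Asynchronous "when": one chunk per tick, result via callback.
function runAsync(worker, chunks, cb) {
  const out = [];
  let i = 0;
  (function step() {
    if (i === chunks.length) return cb(out.join(""));
    out.push(worker.processChunk(chunks[i++]));
    setTimeout(step, 0); // yield to the event loop between chunks
  })();
}
```

The worker is written once; only the thin driver layer is duplicated, which is what avoids maintaining two copies of every internal function.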

I still have some work to do before a pull request: the utf8 encode/decode doesn't split the chunks correctly and the sync/async implementations are still a bit rough. I won't give any time estimate: I've already proven that I'm bad at them :-)

Regarding the public API: the sync methods didn't change. It may be a good idea to target a v3 for this, remove the deprecated stuff and rethink some parameters (checkCRC32 for example).
We could add 2 new async methods for each getter and for generate: one for a basic stream and one for the complete file (accumulating the chunks). We could create asTextStream(), asTextAsync(cb), ..., but it may be a better idea to create only asTextAsync(), returning a generic wrapper. This wrapper could then expose toSimpleStream() and accumulate(cb) (and maybe one day toNodejsStream()). It would be easier to maintain and easier to document.
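That generic wrapper could look something like this sketch. The names (accumulate, toSimpleStream) come from the comment above, but the implementation is entirely hypothetical; chunks are held in a plain array and delivered synchronously just to keep the sketch short:

```javascript
// Hypothetical wrapper returned by an async getter: one object exposing
// both a stream view and an accumulate-everything view of the content.
function ContentWrapper(chunks) {
  this.chunks = chunks; // in a real implementation, chunks arrive over time
}

// Collect all chunks and hand the complete content to the callback,
// node-style (error first).
ContentWrapper.prototype.accumulate = function (cb) {
  cb(null, this.chunks.join(""));
};

// Expose the chunks as a bare stream-like object with an on() API.
// For this sketch, on("data") replays all chunks immediately.
ContentWrapper.prototype.toSimpleStream = function () {
  const chunks = this.chunks;
  return {
    on(event, cb) {
      if (event === "data") chunks.forEach(cb);
      if (event === "end") cb();
      return this;
    }
  };
};
```

One getter returning one wrapper keeps the public surface small: new output shapes (a Node.js stream, a Blob) become methods on the wrapper instead of new getters.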

@okv

okv commented Feb 10, 2016

Hi there.

Any updates on this?
How is stream support going?

@Noitidart

@okv if you want async, put JSZip into a web worker. It's better than async: all operations are off the main thread, and the postMessage method is inherently async.

@okv

okv commented Feb 10, 2016

@Noitidart I'm using it on the server side (Node.js) and want to use the default non-blocking approach for this environment: streams.

@dduponchel
Collaborator

Here is the current state: #195 moved JSZip to async methods (and added stream support) and landed on the jszip_v3 branch. #253 continues the work on the future v3 version and fixes most of #224.
Once #253 is merged, the only missing feature before a v3 release should be improved support for Blob and Promise.

@okv

okv commented Feb 11, 2016

@dduponchel got it, thanks for the detailed explanation.

@dduponchel
Collaborator

v3.0.0 has been released.
