-
Thanks for the exact numbers; they make a nice reference. For context, this has been suggested before (by @andreialecu, I believe), but my pushback at the time was that having two different zip implementations inside the core would increase the maintenance surface while giving us less control over the implementation (we'll always need to provide an |
-
Here's a past discussion on this topic: https://discord.com/channels/226791405589233664/226793713722982400/777955526365544488 Thanks for providing benchmark numbers and for pointing to fflate. I'm also strongly in favor of this idea. I even started a quick-and-dirty prototype JS zip implementation for this purpose, which I didn't finish (https://github.com/andreialecu/tszip). It would be interesting to see whether a pure JS implementation also parallelizes better. I suspect that Yarn could take more advantage of it, and that a pure JS + zlib implementation would scale better.
-
I ran an experiment to check whether a zip implementation in JS would give us a perf boost over the WASM libzip. The answer is no: |
-
There is little point in using LibZip in Yarn: Node.js Zlib already handles decompression, and the ZIP format itself is very cheap to parse. In fact, a JavaScript ZIP library will easily outperform LibZip when used in Yarn, because of the cost of crossing the WASM-JS boundary and the extra memory copying that WASM requires compared to plain JS. I ran a test comparing LibZip with fflate, a fast compression library that has both a JS DEFLATE implementation and the potential to use custom (de)compressors.
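To illustrate why the container format is cheap to handle in plain JS, here is a minimal sketch of a ZIP reader that walks the central directory and hands the DEFLATE payload to Node's built-in zlib. It is not Yarn's, LibZip's, or fflate's implementation; the function names (`readZipEntries`, `extract`, `makeSingleEntryZip`) are hypothetical, and the sketch assumes no ZIP64, no encryption, and no CRC verification on read.

```javascript
const zlib = require("node:zlib");

// Plain bitwise CRC-32, only needed to write a valid demo archive.
function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) {
    crc ^= byte;
    for (let k = 0; k < 8; k++) crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Hypothetical helper: builds a one-file archive so the reader has input.
// Timestamps are left at zero for brevity.
function makeSingleEntryZip(name, data) {
  const nameBuf = Buffer.from(name, "utf8");
  const compressed = zlib.deflateRawSync(data);
  const crc = crc32(data);
  const local = Buffer.alloc(30);
  local.writeUInt32LE(0x04034b50, 0); // local file header signature
  local.writeUInt16LE(20, 4); // version needed to extract
  local.writeUInt16LE(8, 8); // method 8 = DEFLATE
  local.writeUInt32LE(crc, 14);
  local.writeUInt32LE(compressed.length, 18);
  local.writeUInt32LE(data.length, 22);
  local.writeUInt16LE(nameBuf.length, 26);
  const central = Buffer.alloc(46);
  central.writeUInt32LE(0x02014b50, 0); // central directory signature
  central.writeUInt16LE(20, 4);
  central.writeUInt16LE(20, 6);
  central.writeUInt16LE(8, 10);
  central.writeUInt32LE(crc, 16);
  central.writeUInt32LE(compressed.length, 20);
  central.writeUInt32LE(data.length, 24);
  central.writeUInt16LE(nameBuf.length, 28); // local header offset stays 0
  const eocd = Buffer.alloc(22);
  eocd.writeUInt32LE(0x06054b50, 0); // end of central directory signature
  eocd.writeUInt16LE(1, 8); // entries on this disk
  eocd.writeUInt16LE(1, 10); // entries in total
  eocd.writeUInt32LE(central.length + nameBuf.length, 12); // directory size
  eocd.writeUInt32LE(local.length + nameBuf.length + compressed.length, 16);
  return Buffer.concat([local, nameBuf, compressed, central, nameBuf, eocd]);
}

// Parsing is a backwards scan for the end record plus one pass over
// fixed-layout central directory headers.
function readZipEntries(buf) {
  let eocd = buf.length - 22;
  while (eocd >= 0 && buf.readUInt32LE(eocd) !== 0x06054b50) eocd--;
  if (eocd < 0) throw new Error("End of central directory not found");
  const count = buf.readUInt16LE(eocd + 10);
  let pos = buf.readUInt32LE(eocd + 16);
  const entries = new Map();
  for (let i = 0; i < count; i++) {
    if (buf.readUInt32LE(pos) !== 0x02014b50) throw new Error("Bad header");
    const method = buf.readUInt16LE(pos + 10);
    const compressedSize = buf.readUInt32LE(pos + 20);
    const nameLen = buf.readUInt16LE(pos + 28);
    const extraLen = buf.readUInt16LE(pos + 30);
    const commentLen = buf.readUInt16LE(pos + 32);
    const localOffset = buf.readUInt32LE(pos + 42);
    const name = buf.toString("utf8", pos + 46, pos + 46 + nameLen);
    // The local header carries its own name and extra-field lengths.
    const dataStart = localOffset + 30 +
      buf.readUInt16LE(localOffset + 26) + buf.readUInt16LE(localOffset + 28);
    // subarray() is a view: no bytes are copied until decompression.
    const raw = buf.subarray(dataStart, dataStart + compressedSize);
    entries.set(name, { method, raw });
    pos += 46 + nameLen + extraLen + commentLen;
  }
  return entries;
}

// Decompression is delegated to native zlib, as in the setup described above.
function extract(entry) {
  if (entry.method === 0) return Buffer.from(entry.raw); // stored
  if (entry.method === 8) return zlib.inflateRawSync(entry.raw); // DEFLATE
  throw new Error(`Unsupported compression method ${entry.method}`);
}

const archive = makeSingleEntryZip("hello.txt", Buffer.from("hello ".repeat(100)));
console.log([...readZipEntries(archive).keys()]);
```

The reader never copies compressed bytes: each entry is a view into the original buffer that is passed straight to zlib, which is the kind of copy a WASM-based library cannot avoid when moving data into and out of its own memory.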
Here's a small benchmark for decompression performance:
Clearly, decompression is the major factor, since both fflate and LibZip cut the time taken by similar amounts when they use native Zlib instead of a JS implementation. However, the WASM overhead means fflate is always faster. The few discrepancies are due to the JS implementation performing better when decompressing in memory than when decompressing streamed data. The ability to optionally stream data is yet another benefit of using a JS library over LibZip.
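The in-memory versus streamed distinction can be sketched with Node's built-in zlib alone. This is only an illustration of the two modes, not the benchmark code and not fflate's API; `inflateInMemory`, `inflateStreamed`, and the 16 KiB chunk size are assumptions made for the example.

```javascript
const zlib = require("node:zlib");
const { Readable } = require("node:stream");

// In-memory: the whole compressed payload is inflated in one synchronous call.
function inflateInMemory(compressed) {
  return zlib.inflateRawSync(compressed);
}

// Streamed: the payload is fed to the inflater chunk by chunk, so the caller
// never needs the full compressed input up front.
async function inflateStreamed(compressed, chunkSize = 16 * 1024) {
  function* chunks() {
    for (let i = 0; i < compressed.length; i += chunkSize) {
      yield compressed.subarray(i, i + chunkSize);
    }
  }
  const inflater = Readable.from(chunks()).pipe(zlib.createInflateRaw());
  const out = [];
  for await (const chunk of inflater) out.push(chunk);
  return Buffer.concat(out);
}
```

Both functions return the same bytes. The streamed variant trades some per-chunk overhead for bounded input buffering, which matters mostly for large entries.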
Obviously there are more implementation details (such as saving any changes back into the archive, using append mode when adding files and overwriting when modifying, making the code less ugly), but I'd be willing to create a PR with fflate, archiver, or some other ZIP library in JS, since it's almost guaranteed to be faster than WASM.