ability to fetch from zip files #17408
Comments
I happen to have been working with ZIP files in zig anyway, so I might take a shot at this if it's ok. I have a branch to start with: master...MFAshby:build_zip_support I'll figure out how to build & test it before submitting a PR. ZIP files are a little different from TAR archives in that they might have data at the start of the file which is not part of the ZIP at all, (a fact taken advantage of by things like self-extracting ZIPs, and the https://redbean.dev/ web server). They also store the metadata preceding the file data, and repeated in a 'central directory' at the end of the file. Probably the more reliable way to extract the ZIP is to use the central directory, which would need a little code restructuring so that the |
I know enough about zip files to confidently say that the End of Central Directory Record should certainly be used as the source of truth, rather than trying to scan a stream for local headers which may actually just be file contents. I suggest to wait until #17392 lands because the logic is much more straightforward and will not need any restructuring. The function is passed a temporary directory and a reader stream of a remote zip file. So it just needs to pipe the stream into a file in the temporary directory, then unzip all the files into the temporary directory. |
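To illustrate what treating the End of Central Directory record as the source of truth looks like, here is a minimal sketch in Python. It is not the Zig implementation discussed in this thread, and the helper name `find_eocd` is hypothetical. The signature bytes and the 22-byte fixed layout follow APPNOTE.txt; the sketch assumes a single-disk archive and does not handle Zip64.

```python
import io
import struct

EOCD_SIG = b"PK\x05\x06"   # End of Central Directory signature
EOCD_MIN = 22              # size of the fixed part of the record
MAX_COMMENT = 0xFFFF       # the trailing comment length is a 16-bit field


def find_eocd(f):
    """Return (absolute_offset, fields) for the EOCD record of a seekable file.

    Hypothetical helper for illustration only. It searches backwards from the
    end of the file, so leading non-zip data and signature-like bytes inside
    file contents do not confuse it.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    tail_len = min(size, EOCD_MIN + MAX_COMMENT)
    f.seek(size - tail_len)
    tail = f.read(tail_len)

    pos = tail.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_MIN <= len(tail):
            # sig, disk, cd_disk, entries_on_disk, entries_total,
            # cd_size, cd_offset, comment_len
            fields = struct.unpack_from("<4sHHHHIIH", tail, pos)
            comment_len = fields[7]
            # A genuine record ends exactly at end of file.
            if pos + EOCD_MIN + comment_len == len(tail):
                return size - tail_len + pos, fields
        pos = tail.rfind(EOCD_SIG, 0, pos)
    raise ValueError("no End of Central Directory record found")
```

The check that the record plus its comment ends exactly at end of file is one simple way to reject a signature that merely appears inside file data; a stricter reader would also validate the central directory offset and size.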
A while back I rewrote @jorangreef's pure in Zig, and was about to refactor it to use a Reader before I got tied up with work. I think this would be a better base for a ZIP reader, as pure is a static-analysis tool to protect against Zip bombs more than a generic Zip library. Of course I would need some helping hands if this were ever to land in the stdlib. |
Alright I started on this and have a working implementation of a ZIP reader in 400 lines of pure Zig with an Code is at https://github.com/nDimensional/zig-zip for now but I'd like to PR this into I have a couple questions for you @andrewrk if you have input
|
libzip has a regression testing suite. I think APPNOTE.txt is the spec from the original developer of the zip format; it does not come with a test suite, though. |
Okay here's my current idea. My implementation has a small test suite that uses the system My understanding is that this type of testing is appropriate for https://github.com/ziglang/contrib-testing - so maybe I open a PR to create |
no, a eg |
Okay happy to do that too, just wondered if there was a way to avoid checking more binary blobs into the repo :) |
Here's an mmap that should work on posix and windows: https://github.com/marler8997/zigx/blob/master/MappedFile.zig |
No, mmap is problematic because instead of convenient error codes from I/O you get signals which are nearly impossible to handle correctly. Mmap should be avoided for this use case. |
I was thinking of taking a stab at this and wanted to clarify my understanding of the solutions. Does this breakdown look correct?

Solution 1: Process zip as a stream (front to back)
No. Zip files cannot be interpreted correctly front-to-back, they must be done back to front starting from the central directory header/record (https://games.greggman.com/game/zip-rant/).

Solution 2: Read entire zip contents from reader into memory
No. A zip file could be too large to fit into memory. Let's write it to a file first instead, then we can load only the parts we need into memory as we go.

Solution 3: Use MMAP after writing it to disk
No. Nearly impossible to handle IO errors correctly.

Solution 4: Read/Process zip file in chunks after writing it to disk using normal reads/writes
Yes |
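A rough sketch of Solution 4, again in Python for illustration rather than as the Zig code under discussion: the incoming stream is copied to a temporary file in bounded chunks, and extraction then runs against that seekable file with ordinary reads. The function name `unpack_zip_stream` and the chunk size are assumptions. Python's standard `zipfile` module stands in for the zip reader here; it locates entries through the central directory and needs a seekable file, which is why the stream is spooled to disk first.

```python
import shutil
import tempfile
import zipfile


def unpack_zip_stream(stream, dest_dir, chunk_size=64 * 1024):
    """Spool a non-seekable stream to disk, then unzip it into dest_dir.

    Hypothetical helper for illustration. Memory use stays bounded by
    chunk_size regardless of how large the archive is.
    """
    with tempfile.TemporaryFile(dir=dest_dir) as tmp:
        # Step 1: pipe the stream into a file using plain chunked writes.
        shutil.copyfileobj(stream, tmp, chunk_size)
        tmp.seek(0)

        # Step 2: read back only the parts needed, entry by entry.
        extracted = []
        with zipfile.ZipFile(tmp) as archive:
            for info in archive.infolist():
                # extract() strips absolute paths and ".." components.
                archive.extract(info, dest_dir)
                extracted.append(info.filename)
        return extracted
```

Any I/O failure surfaces as a normal exception from the read or write that caused it, which is the property that motivates preferring this approach over mmap.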
Here's my version that avoids mmap based on Joel's implementation: I'll see if I can get a PR together for it. |
Nice! Although there's still more I need to work into my impl, at least Zip64 support and executable bit preservation |
FYI, I added a bufferedReader when passing the file stream to the inflate decompressor...and performance went from about 3 to 4 times slower than the mmap version to about the same performance. (i.e. decompressing |
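The general idea of putting a buffer between the file and the inflate step can be sketched as follows. This is a Python illustration under stated assumptions, not the Zig change described above: `inflate_entry` is a hypothetical name, the buffer size is arbitrary, and the 4096-byte request size is only there to mimic a decompressor that asks for small pieces. Zip entries that use the deflate method store raw DEFLATE data, hence the negative window-bits argument to `zlib.decompressobj`.

```python
import io
import zlib


def inflate_entry(raw, dst, compressed_size, buffer_size=64 * 1024):
    """Inflate compressed_size bytes of raw DEFLATE data from raw into dst.

    Hypothetical helper for illustration. The BufferedReader turns many small
    reads from the decompression loop into fewer, larger reads on the
    underlying file. Note that it may read ahead past the entry.
    """
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    inflater = zlib.decompressobj(-15)  # -15 selects raw DEFLATE, no zlib header
    remaining = compressed_size
    total = 0
    while remaining > 0:
        chunk = reader.read(min(4096, remaining))
        if not chunk:
            raise EOFError("entry is truncated")
        remaining -= len(chunk)
        out = inflater.decompress(chunk)
        dst.write(out)
        total += len(out)
    out = inflater.flush()
    dst.write(out)
    return total + len(out)
```

With an unbuffered file object (for example one opened with `buffering=0`), each small `read` would otherwise become its own system call.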
Here's the branch I'm currently working on: master...marler8997:zig:zip |
fixes ziglang#17408 Co-authored-by: Joel Gustafson <joelg@mit.edu>
fixes ziglang#17408 Improved by helpful reviews from Josh Wolfe and Auguste Rame. Co-authored-by: Joel Gustafson <joelg@mit.edu>
fixes ziglang#17408 Improved by helpful reviews from Josh Wolfe and Auguste Rame and Andrew Kelley. Co-authored-by: Joel Gustafson <joelg@mit.edu>
fixes ziglang#17408 Helpful reviewers/testers include Josh Wolfe, Auguste Rame, Andrew Kelley and Jacob Young. Co-authored-by: Joel Gustafson <joelg@mit.edu>
zig fetch foo.zip
should work. Relevant logic is here:
zig/src/Package/Fetch.zig
Lines 946 to 1024 in f7bc55c