Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hi, this is a followup to #85, #47, and zip-old#412.
First off, #93 seems to have sped up the code significantly. Thanks!
I have some tweaks here that can improve the performance even more. But even if these changes do not get accepted, I think #93 is probably fast enough for my use case—the old-zip code was taking somewhere between 200ms and 500ms to search for the end-of-central-directory-record, the new code is in the 30ms to 60ms range. It's not amazing but it's 10x faster and (more importantly) just under the human perception of lag.
That said, there are 3 commits here:
A benchmark for the particular behavior I am seeing. It creates a 17MB file of zeros and measures how fast the code fails the open. I used a temporary file instead of a large buffer so that it would include real
read()
round-trips from the kernel.Note: I had issues running the benchmarks. It seems like this change might have broken one of them. As an aside, when I was reading the pkzip appnote for this PR, I happened across section
4.4.1.4
which sounds like it might apply to this breakage.To get past this I always just ran the one benchmark I cared about:
This new benchmark code uses
TempDir
so I made this branch on top of deps(dev): Use the tempfile crate instead of the deprecated tempdir crate #254. Apologies if that makes it more annoying to merge.A patch to enlarge the
END_WINDOW_SIZE
inZip32CentralDirectoryEnd::find_and_parse()
. I benchmarked several different sizes from 512 to 65536. Here are the results I got:END_WINDOW_SIZE
30,450,608 ns/iter (+/- 673,910)
7,741,366 ns/iter (+/- 521,101)
5,773,801 ns/iter (+/- 411,277)
4,792,533 ns/iter (+/- 332,918)
4,324,072 ns/iter (+/- 384,048)
4,047,872 ns/iter (+/- 352,416)
65,132,581 ns/iter (+/- 7,429,976)
14,109,503 ns/iter (+/- 2,892,086)
9,942,500 ns/iter (+/- 1,886,063)
8,205,851 ns/iter (+/- 2,902,041)
7,012,011 ns/iter (+/- 2,222,879)
6,577,275 ns/iter (+/- 881,546)
The linux box is an x86_64 running Debian testing. Its
/tmp
is a tmpfs mount. The mac is an M1 running macOS 15.0.1. Its/tmp
is an apfs backed by an SSD.From those times, 8192 seems to be the sweet spot. It's significantly (~6x) faster than 512 and increasing it doesn't further improve performance by very much. So the patch uses 8192.
I don't anticipate this patch being controversial as it doesn't really affect anything except using a tiny bit more RAM while loading the zip.
A patch to shorten the search window at the end. I tested 3 configurations here: searching the whole file, searching the last 128K and searching the last 66,000 bytes. I did these benchmarks with all the
END_WINDOW_SIZE
s combos shown above, but they aren't interesting so I'm just going to show the 8192 size here. Here are the results I got:5,773,801 ns/iter (+/- 411,277)
54,402 ns/iter (+/- 4,126)
36,152 ns/iter (+/- 4,293)
9,942,500 ns/iter (+/- 1,886,063)
73,604 ns/iter (+/- 16,662)
41,483 ns/iter (+/- 19,692)
As you might expect, closing this windows makes a extremely large difference when the file is large.
Since 7a55945, current code searches the entire file. Previously it searched the last 66,000 bytes (5237543). Before that, it searched 65557 bytes, as discussed in zip-old#183.
pkware's zip implementation isn't open source so it's hard to check, but a comment in zip-old#183 claims it uses 66,000 bytes which is where that number came from.
Info-Zip, however, seems to be the main zip program shipped in Debian, Homebrew, Nix, Fedora, etc. Its official source is only available on an ftp site, though there's someone's random fork on github that I'll link to. The relevant part is here and is the same as the info-zip source from the ftp site. They search back
0x20000
bytes, which is 128K.Given that is the de-factor zip in the open source would and seems to be more generous than pkzip, it seems like this is a reasonable number to split the difference between (overly?) strict spec adherence (65557 bytes) and searching the whole file.
That was a long winded way of saying that in the 3rd patch I chose to limit the search to the last 128K of the file.
I understand the last one might be more controversial given that it could impact compatibility. I believe the 128K is pretty reasonable number for the reasons give above, but I've left the PR in 3 commits so its easier if you don't want that last one.
Local test checklist:
cargo test
cargo test --no-default-features
cargo test --all-features
cargo clippy --all-targets
cargo clippy --all-targets --no-default-features
cargo clippy --all-targets --all-features
cargo doc --no-deps
†cargo doc --no-deps --no-default-features
†cargo doc --no-deps --all-features
†cargo fmt --check --all
†: Had warnings unrelated to these changes.