Directly produce an optimal ZIP/gzip file #13

Open
lifthrasiir opened this issue Aug 25, 2021 · 2 comments
Assignees
Labels
decompressor enhancement New feature or request

Comments

@lifthrasiir
Owner

Roadroller depends heavily on DEFLATE's Huffman coding because JS string literals are inefficient in terms of information entropy (~6.96 bits per byte). The first line specifically exploits this by using the minimum number of literal bytes, but stock zlib doesn't fully recognize the resulting difference in symbol distribution and normally combines the two lines into one block. Zopfli-like tools do recognize it, but the user has to run those tools separately to benefit.
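To make the block-splitting point concrete, here is a minimal Python sketch that compresses two "lines" with different symbol distributions, once as a single stream and once with a forced block boundary between them. Everything in it is an assumption for illustration: the 64-symbol alphabet, the sample data, and the helper name `deflate_raw` are hypothetical stand-ins, not Roadroller's real output or code. `Z_SYNC_FLUSH` is used because it ends the current block, at the cost of appending a small empty stored block.

```python
import random
import zlib


def deflate_raw(chunks, split_blocks):
    """Raw DEFLATE of all chunks; optionally force a block boundary between them."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # -15 = raw DEFLATE, no zlib header
    out = bytearray()
    for i, chunk in enumerate(chunks):
        out += comp.compress(chunk)
        if split_blocks and i < len(chunks) - 1:
            # Ends the current block (and emits an empty stored block as a marker).
            out += comp.flush(zlib.Z_SYNC_FLUSH)
    out += comp.flush()
    return bytes(out)


rng = random.Random(0)
# Hypothetical first line: random symbols drawn from a restricted 64-byte alphabet.
alphabet = bytes(range(0x40, 0x80))
line1 = bytes(rng.choice(alphabet) for _ in range(4000))
# Hypothetical second line: ordinary JS-like text with a very different distribution.
line2 = b"for(i=0;i<n;i++){s+=String.fromCharCode(a[i]^b[i%k])};eval(s)\n" * 40

merged = deflate_raw([line1, line2], split_blocks=False)
split = deflate_raw([line1, line2], split_blocks=True)
print("one stream, zlib decides blocks:", len(merged))
print("forced boundary between lines:  ", len(split))
```

Which variant comes out smaller depends on the data and on how zlib happens to place its block boundaries, so the sketch only prints both sizes rather than claiming a winner.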

Maybe we can solve this UX problem by directly producing an optimal ZIP/gzip file from Roadroller itself. This is not a small task because:

  • We should be able to insert a preamble before the resulting <script> tag.
  • In case of ZIP:
    • We should be able to insert arbitrary additional files to the resulting file; or
    • We should completely eliminate the need for additional files, e.g. we should process image files into Roadroller-friendly formats.

While Roadroller already happens to have a working implementation of zlib (-5), the optimal size can only be reached with Zopfli or a similar tool, so Roadroller would have to depend on one.
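As a rough sketch of what "directly producing a gzip file" involves, the Python below wraps a raw DEFLATE stream in a minimal gzip container. It is not Roadroller code: `wrap_gzip`, the preamble, and the script body are hypothetical, and zlib at level 9 stands in for Zopfli or whichever optimal compressor would actually produce the DEFLATE stream.

```python
import gzip
import struct
import zlib


def wrap_gzip(raw_deflate: bytes, original: bytes) -> bytes:
    """Wrap a raw DEFLATE stream in a minimal gzip container (RFC 1952)."""
    header = (
        b"\x1f\x8b"              # magic number
        + b"\x08"                # CM = 8 (DEFLATE)
        + b"\x00"                # FLG = 0: no file name, comment or extra field
        + struct.pack("<I", 0)   # MTIME = 0 (no timestamp)
        + b"\x02"                # XFL = 2 (maximum compression)
        + b"\xff"                # OS = 255 (unknown)
    )
    # Trailer: CRC-32 and length (mod 2**32) of the uncompressed data.
    trailer = struct.pack("<II", zlib.crc32(original), len(original) & 0xFFFFFFFF)
    return header + raw_deflate + trailer


# Hypothetical preamble placed before the <script> tag, plus a placeholder script.
preamble = b"<!doctype html><title>demo</title>"
html = preamble + b"<script>" + b"eval(0)" + b"</script>"

# Stand-in compressor: any tool that emits a raw DEFLATE stream could be used here.
comp = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = comp.compress(html) + comp.flush()

gz = wrap_gzip(raw, html)
print("container overhead in bytes:", len(gz) - len(raw))
```

With no optional header fields the container adds a fixed 18 bytes (10-byte header plus 8-byte trailer), so the remaining work is entirely in producing the best possible DEFLATE stream.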

@lifthrasiir lifthrasiir added the enhancement New feature or request label Aug 25, 2021
@lifthrasiir lifthrasiir self-assigned this Aug 25, 2021
@lifthrasiir
Owner Author

As per #29, these would be implemented as the following additional output formats:

  • -F8gz
  • -F8zip
  • -F8zpng

I originally used -F6gz and so on, but on reflection it should be -F8gz etc., because their coding rate should be around 8 bits/byte minus fixed overhead, and users would read the first digit as a relative efficiency, not as part of the internal implementation strategy.

For -F8zip the file name should be supplied. This can be done either with a separate argument (--zip-file-name index.html) or with a combined argument (-F8zip:index.html).
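The combined-argument form could be handled roughly as follows. This is only a sketch under stated assumptions: `parse_format` and `write_zip` are hypothetical helpers, the value is assumed to be whatever follows `-F`, and Python's `zipfile` with stock DEFLATE stands in for an optimal compressor.

```python
import io
import zipfile


def parse_format(value: str):
    """Split a hypothetical '-F' value such as '8zip:index.html' into (format, file name)."""
    fmt, sep, file_name = value.partition(":")
    return fmt, (file_name if sep else None)


def write_zip(file_name: str, payload: bytes, extra_files=()) -> bytes:
    """Build a ZIP archive in memory with the main file plus optional extra files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        zf.writestr(file_name, payload)
        for name, data in extra_files:
            zf.writestr(name, data)
    return buf.getvalue()


fmt, name = parse_format("8zip:index.html")
# Fall back to a default name when none is supplied (the default is an assumption).
archive = write_zip(name or "index.html", b"<script>eval(0)</script>")
print(fmt, name, len(archive))
```

The `extra_files` parameter corresponds to the "arbitrary additional files" requirement from the first comment; a separate `--zip-file-name` argument would simply bypass `parse_format`.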

To my knowledge there is no tool that directly recompresses a truncated PNG file, so -F8zpng might be a stretch since it can't be recompressed by external tools.

@lifthrasiir
Owner Author

I've also briefly considered -F8zwebp which uses WebP Lossless instead of PNG, but it wasn't significantly better than (optimally recompressed) PNG because both use bytewise LZ77 + Huffman coding as a backend. WebP Lossless is better at exploiting spatial locality than PNG but that is useless for our purpose anyway.
