Provide alternate, better compressed tarballs #1233

Closed
3 tasks done
N-R-K opened this issue May 12, 2023 · 16 comments · Fixed by #1235

Comments

@N-R-K

N-R-K commented May 12, 2023

  • I have searched the issues for my request and found nothing related and/or helpful
  • I have searched the FAQ for help
  • I have searched the Wiki for help

Is your feature request related to a problem? Please describe.

Some of the nerd-font packages can be really large, for example the Iosevka font clocks in at 244MiB on the latest release.

This can be painful for people who have a slow connection and/or limited bandwidth.

Describe the solution you'd like

It'd be great if the project could provide alternative tarballs with better compression methods. From my (somewhat limited) research, zstandard, lzip and xz seem to be good contenders for that spot.

To test things out, I've decided to go with VictorMono, which is currently 66MiB with the zip format (and 121MiB uncompressed).

To create the tarballs, I'm using the following commands. I'm also using plzip and pxz, the parallel versions of lzip and xz for better compression speed. For zstandard, I'm using the default/reference zstd package.

$ du -h VictorMono
121M    VictorMono
$ time tar -I 'zstd --ultra -22 -T10' -cf VictorMono.tar.zst VictorMono
4.53s user 0.16s system 100% cpu 4.672 total
$ time tar -I 'plzip -9 -n10' -cf VictorMono.tar.lz VictorMono
39.18s user 0.24s system 185% cpu 21.217 total
$ time tar -I 'pxz -9 -T10' -cf VictorMono.tar.xz VictorMono
17.00s user 0.22s system 100% cpu 17.090 total
$ du -h VictorMono.*
67M     VictorMono.zip
5.9M    VictorMono.tar.lz
3.8M    VictorMono.tar.xz
4.7M    VictorMono.tar.zst

As you can see, the size difference is massive compared to the default zip file!
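
For a rough sense of scale, the reduction relative to the zip can be worked out from the `du` figures above. This is only a minimal sketch: it assumes the rounded sizes that `du -h` printed (so the percentages are approximate), and `reduction_pct` is a helper name made up for this illustration.

```sh
#!/bin/sh
# Hypothetical helper: percentage saved when going from size $1 to size $2
# (both in the same unit). LC_ALL=C keeps "." as the decimal separator.
reduction_pct() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

# Sizes in MiB as reported by `du -h` above (rounded, so approximate).
for entry in lz:5.9 xz:3.8 zst:4.7; do
    fmt=${entry%%:*}
    size=${entry#*:}
    printf '%-4s %s%% smaller than the 67M zip\n' \
        "$fmt" "$(reduction_pct 67 "$size")"
done
```

All three formats come out at more than 90% smaller than the zip, so the choice between them is mostly about speed and tool availability rather than size.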

Some tests for decompression speed:

$ time tar --use-compress-program="zstd" -xf VictorMono.tar.zst 
0.04s user 0.08s system 134% cpu 0.093 total
$ time tar --use-compress-program="plzip" -xf VictorMono.tar.lz 
0.82s user 0.09s system 187% cpu 0.488 total
$ time tar --use-compress-program="pxz" -xf VictorMono.tar.xz 
0.26s user 0.07s system 115% cpu 0.284 total

zstd seems to be performing really well, while xz and plzip required a noticeable amount of time to extract.

Given this data, zstd seems like the ideal choice to me. It gives a compression ratio comparable to the alternatives while still being significantly faster to both compress and decompress.

I understand that providing multiple variants of the same archive can add some maintenance burden, but producing them is a one-time effort while thousands of people will be consuming the result, so I believe the effort will be well worth it.

Describe alternatives you've considered
(None)

Additional context
(None)

@Finii
Collaborator

Finii commented May 12, 2023

Nice evaluation!

Instinctively I would have chosen xz, as I have the feeling that it is in widespread use, while I believe I have never once seen zstd out in the wild. Either of the two would be a great addition.

Just to state the obvious...

-rw-rw-r-- 1 fini fini  63M Mai 12 15:07 Noto.tar.zst
-rw-rw-r-- 1 fini fini  69M Mai 12 14:57 Noto.tar.xz
-rw-rw-r-- 1 fini fini 459M Apr 30 21:11 Noto.zip
$ unzip -d Noto Noto.zip
6,96s user 0,87s system 99% cpu 7,846 total
$ tar xvf Noto.tar.xz
5,94s user 2,21s system 127% cpu 6,387 total
$ tar xvf Noto.tar.zst
0,97s user 1,60s system 146% cpu 1,756 total

@N-R-K
Author

N-R-K commented May 12, 2023

> Instinctively I would have chosen xz

About that... I wanted to mention a couple of things, but the post was already quite long. In short, you are probably right that xz has more adoption (7z, which is basically the de facto standard on Windows, supports it out of the box). However, there's this article from the lzip author which points out a lot of flaws with xz (note: I'm not an expert in this field, so I cannot tell how accurate the info is).

Moreover, adoption of zstandard seems to be climbing. According to Wikipedia, some notable adopters are:

  • Linux kernel
  • It's the default for Arch Linux and Ubuntu packages
  • It's the default filesystem compression for Fedora 33

And finally

> In 2020, Zstandard was implemented in version 6.3.8 of the zip file format with codec number 93. [...] New versions of zip programs often support this new feature.

Although I'm not sure how well supported zstandard compressed zip files are...

On Linux (and I assume the BSDs too), installing zstd should be easy. On Windows, it seems that the recommended tool is this 7z fork (.exe download here), which adds support for many new formats (including zstandard).

But with all that being said, yes, it's true that either xz or zst would bring a massive size reduction, which was the main problem raised by this issue.

So although I'm quite impressed with zstandard's all-round good performance, I'd still be content to have well-compressed xz tarballs!

@Finii
Collaborator

Finii commented May 12, 2023

Maintenance-wise this is just one more line in the 'archive all the stuff' script.
We can try things out (and later change if it was bad) already with 3.0.1 (meaning ... today? tomorrow?).

Dropping zip is another beast, as a lot of automated repackager scripts rely on those archives. They must be won over. I only know the people responsible for cask (me) and AUR. But maybe they will adopt it over time ... (years); keeping the zip is in principle no problem, and especially for Windows it seems to be the easiest format right now. But then, I have used Windows only very infrequently in the past years.
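
To illustrate the "one more line" point, here is a minimal sketch of what producing both formats per font directory could look like. It is not the project's actual archive script: the function name, the directory layout and the compressor flags are all assumptions, and each format is simply skipped when its tool is missing.

```sh
#!/bin/sh
# Hypothetical sketch, not the real release script.
# archive_font SRC_DIR OUT_DIR
# Writes OUT_DIR/<name>.zip and OUT_DIR/<name>.tar.xz for one font directory.
archive_font() {
    src=$1
    out=$2
    name=$(basename "$src")
    mkdir -p "$out" || return 1
    out_abs=$(cd "$out" && pwd) || return 1

    # The existing format, kept for repackagers that rely on it.
    if command -v zip >/dev/null 2>&1; then
        (cd "$src" && zip -q -r "$out_abs/$name.zip" .) || return 1
    fi

    # The "one more line": the same content as an xz-compressed tarball.
    if command -v xz >/dev/null 2>&1; then
        tar -cf - -C "$src" . | xz -9 > "$out_abs/$name.tar.xz" || return 1
    fi
}
```

It would be called once per font, e.g. `archive_font patched-fonts/VictorMono archives` (both paths are placeholders).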

@Finii
Collaborator

Finii commented May 12, 2023

@N-R-K
Author

N-R-K commented May 12, 2023

> Taken from https://clearlinux.org/news-blogs/linux-os-data-compression-options-comparing-behavior, from 2017

Thanks for the link, it was interesting. However I'm a bit disappointed that the author didn't share the commands he used.

XZ being able to compress a bit more aligns with my experience, but the "faster" part contradicts my experience completely. Perhaps the author didn't use multi-threading? (By default zstd doesn't use multiple compression threads; you need the -T flag for that, e.g. -T10 for 10 threads or -T0 for auto.)
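
For reference, a small sketch of passing the thread count when compressing a tar stream. `-T0` asks zstd to pick the number of threads itself, and xz accepts `-T0` as well (multi-threaded compression needs xz 5.2 or newer). The helper name and the compression levels are made up for illustration, and the function just reports an error when neither tool is installed.

```sh
#!/bin/sh
# Hypothetical helper: compress_dir DIR OUT_BASENAME
# Writes OUT_BASENAME.tar.zst if zstd is installed, otherwise
# OUT_BASENAME.tar.xz via xz, and prints the name of the file it wrote.
compress_dir() {
    dir=$1
    base=$2
    if command -v zstd >/dev/null 2>&1; then
        # -T0: let zstd choose the number of worker threads.
        tar -cf - -C "$dir" . | zstd -q -f -19 -T0 -o "$base.tar.zst" || return 1
        echo "$base.tar.zst"
    elif command -v xz >/dev/null 2>&1; then
        # xz also takes -T0 for automatic thread selection.
        tar -cf - -C "$dir" . | xz -6 -T0 > "$base.tar.xz" || return 1
        echo "$base.tar.xz"
    else
        echo "neither zstd nor xz found" >&2
        return 1
    fi
}
```

Timing this with and without the `-T` flag on the same input would show whether threading explains the difference from the article's numbers.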

@Finii
Collaborator

Finii commented May 12, 2023

Non-representative statistics on release package formats... (for fonts, on Mac)

image

5.5 of 2054 use xz (0.27%)
0 use zst

Edit: Add percentages

@Finii
Collaborator

Finii commented May 12, 2023

image

Edit: The runtime is without zip-ing, that has been commented out

Finii added a commit that referenced this issue May 12, 2023
See #1233

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator

Finii commented May 13, 2023

Unfortunately Homebrew seems to not support xz. There are a few (very few) Casks that utilize .tar.xz right now.

Also see

In the Cask manual (which might or might not be up-to-date) it is not listed:

image
From: https://docs.brew.sh/Cask-Cookbook#required-stanzas

@Finii
Collaborator

Finii commented May 13, 2023

Maybe @polyzen has an idea if the Arch packages would / could utilize .tar.xz.

@N-R-K
Author

N-R-K commented May 13, 2023

An additional data point: Gentoo requires xz-utils in its @system set (meaning it will be installed by default), whereas unzip is not installed by default (so using the zip requires an additional dependency on unzip).

Also FYI, zstd is a dependency of portage (the default package manager on Gentoo), although it's not in the @system set yet.
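
A packaging or install script could act on this by checking which extractor is present before deciding which archive to fetch. A minimal sketch: the function name is hypothetical and the preference order (the smaller .tar.xz first, zip as fallback) is my assumption, not something any of the packagers above is known to do.

```sh
#!/bin/sh
# Hypothetical helper: print the archive extension to fetch,
# preferring the smaller .tar.xz when both tar and xz are installed.
pick_archive_ext() {
    if command -v tar >/dev/null 2>&1 && command -v xz >/dev/null 2>&1; then
        echo "tar.xz"
    elif command -v unzip >/dev/null 2>&1; then
        echo "zip"
    else
        echo "no supported extractor found" >&2
        return 1
    fi
}

# Usage: only report the choice here; the actual download is left out.
if ext=$(pick_archive_ext); then
    echo "would fetch VictorMono.$ext"
fi
```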

@Finii
Collaborator

Finii commented May 13, 2023

Release with dual archives:

image

@polyzen
Contributor

polyzen commented May 14, 2023

> Maybe @polyzen has an idea if the Arch packages would / could utilize .tar.xz.

Definitely can.

Edit: blakkheim already switched over to them: archlinux/svntogit-community@3c60f2c

@msfjarvis

I've switched over the nixpkgs package to use the new tarballs: NixOS/nixpkgs#231749

When building a derivation with all available fonts, it cut the build time nearly in half: https://androiddev.social/@msfjarvis/110380158758015208

nandalopes added a commit to nandalopes/dotfiles that referenced this issue May 19, 2023
- Bump Nerd Fonts version to v3.0.1
- Use `tar.xz` archive format [`ryanoasis/nerd-fonts#1233`](ryanoasis/nerd-fonts#1233)
- Move code to default linux font folder
- Include template on MacOS font folder
@Finii Finii added this to the v3.0.2 milestone Jun 2, 2023
Finii added a commit that referenced this issue Jun 2, 2023
See #1233

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@N-R-K
Author

N-R-K commented Jun 2, 2023

The Gentoo GURU overlay also switched over to the .xz packs for its nerdfont packages. 🎉

@Finii
Collaborator

Finii commented Jun 2, 2023

👍

Somehow the bot failed to add you as a contributor, I guess because you have two dashes in the name 🙄
Did it manually, hope it will not break on subsequent bot runs. See #1235...

LNKLEO pushed a commit to LNKLEO/Nerd that referenced this issue Nov 24, 2023
See ryanoasis#1233

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
Contributor

github-actions bot commented Dec 4, 2023

This issue has been automatically locked since there has not been any recent activity (i.e. last half year) after it was closed. It helps our maintainers focus on the active issues. If you have found a problem that seems similar, please open a new issue, complete the issue template with all the details necessary to reproduce, and mention this issue as reference.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2023