Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port Chromium optimizations to zlib #346

Open
Adenilson opened this issue Apr 6, 2018 · 26 comments
Open

Port Chromium optimizations to zlib #346

Adenilson opened this issue Apr 6, 2018 · 26 comments

Comments

@Adenilson
Copy link

Chromium has a fork of zlib with optimizations both for Intel (compression, decompression) and ARM (decompression).

It would be pretty nice if those patches could be merged upstream (e.g. 'contrib/chromium').

For reference, chromium's zlib can be found in:
https://cs.chromium.org/chromium/src/third_party/zlib/

A quick list of optimizations:
a) Decompression: inflate_fast, adler32, crc32, etc.

b) Compression: fill_window, longest_match, hash function, etc.

Finally, Chromium also has a pretty nice tool (zlib_bench, see: https://chromium.googlesource.com/chromium/src/+/875ad5e3c08831f1efa91adade7fc350b1ef45bc) that is used to validate the optimizations. I feel that it would be potentially helpful to the project to have something similar.

@Adenilson
Copy link
Author

@madler any comment?

@aytey
Copy link

aytey commented Jul 10, 2018

@Adenilson you might get more traction on this if you created a fork of zlib, merged in the Chromium changes, and then opened-up a PR for these changes against this project.

I just took 30s to dive into 3rd party zlib link you gave, and it is (I feel) "not immediate" where all of the changes lie.

If this is important to you, I think it is only fair if you do some of the leg work.

@Adenilson
Copy link
Author

@andrewvaughanj thanks for the suggestion. I've created over 1 year ago a merge request for 2 of the ARM specific optimizations (#251) and 3 months ago for another (#345).

Notice that those merge requests were ARM specific and targeting decompression only, while Chromium's zlib has also Intel optimizations, which prompted to open the current issue.

@aytey
Copy link

aytey commented Jul 10, 2018

Fair enough; I can see your hesitation to do this if you've already done it twice already (in other areas) and haven't got traction.

I wasn't aware of that, so my apologies if you thought I was (to use the British phrase) "teaching you to suck eggs".

@Adenilson
Copy link
Author

@madler ping?

@madler
Copy link
Owner

madler commented Aug 15, 2018

Where can I find benchmarks that show the gain?

@Adenilson
Copy link
Author

Current benchmark data collected this week can be found at:
https://goo.gl/qLVdvh

The last 2 weeks I've implemented hash_slide and insert_string for ARM boosting compression performance in average 36% for big cores (14% for littles), if all goes well it will be featured in Chromium M70.

For the snappy dataset (https://github.com/google/snappy/tree/master/testdata), the average performance gains for using Z_DEFAULT_COMPRESSION are:
a) ARM: 1.36x compression, 1.69x decompression.
b) Intel: 1.22x compression, 1.97x decompression.

The averages can be a bit misleading, as the snappy corpus has files with some really high entropy (bad-data files are around 0.95). For HTML (entropy 0.7) the performance gains are even better (e.g. 1.3x compression, 2.2x decompression on x86).

It is relevant to share that zlib is used by Chromium network library (cronet), that is itself used by quite a few of Google Android apps, being widely distributed (i.e. over 1 billion users) and tested.

For decompression, the first optimization (inflate_fast_chunk: #345) started shipping in Chromium M62, followed by Adler-32 (M63) and crc-32 (M66).

For decompression, the x86 optimizations were landed on Chromium in late 2014 (while the ARM optimizations were just landed the last 2 weeks).

@Adenilson
Copy link
Author

Sorry, a fix: For compression, the x86 optimizations were landed on Chromium in late 2014 (while the ARM optimizations were just landed the last 2 weeks).

@Adenilson
Copy link
Author

@madler collected some extra data for x86 (Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz), hope it might be useful:
a) Fastest (1)
adenilson@chrome-monster:~/chromium/src/out/Release$ time ./zlib_bench gzip ~/temp/snappy/testdata/html*
/home/adenilson/temp/snappy/testdata/html :
GZIP: [b 1M] bytes 102400 -> 16623 16.2% comp 206.5 (211.0) MB/s uncomp 843.3 (857.1) MB/s
/home/adenilson/temp/snappy/testdata/html_x_4 :
GZIP: [b 1M] bytes 409600 -> 65568 16.0% comp 231.0 (231.5) MB/s uncomp 850.5 (852.3) MB/s

b) Default (7)
adenilson@chrome-monster:~/chromium/src/out/Release$ time ./zlib_bench gzip ~/temp/snappy/testdata/html*
/home/adenilson/temp/snappy/testdata/html :
GZIP: [b 1M] bytes 102400 -> 13707 13.4% comp 69.6 ( 70.1) MB/s uncomp 976.9 (982.3) MB/s
/home/adenilson/temp/snappy/testdata/html_x_4 :
GZIP: [b 1M] bytes 409600 -> 53285 13.0% comp 55.4 ( 66.9) MB/s uncomp 939.9 (970.5) MB/s

c) Vanilla Fastest
adenilson@chrome-monster:~/Desktop/chromium/zlib/uk/canonical-fork/x86$ ./zlib_bench gzip ~/temp/snappy/testdata/html*
/home/adenilson/temp/snappy/testdata/html :
GZIP: [b 1M] bytes 102400 -> 17016 16.6% comp 126.2 (128.7) MB/s uncomp 371.9 (385.9) MB/s
/home/adenilson/temp/snappy/testdata/html_x_4 :
GZIP: [b 1M] bytes 409600 -> 67311 16.4% comp 119.6 (121.3) MB/s uncomp 372.8 (385.4) MB/s

d) Vanilla Default
adenilson@chrome-monster:~/Desktop/chromium/zlib/uk/canonical-fork/x86$ ./zlib_bench gzip ~/temp/snappy/testdata/html*
/home/adenilson/temp/snappy/testdata/html :
GZIP: [b 1M] bytes 102400 -> 13711 13.4% comp 60.0 ( 60.1) MB/s uncomp 406.2 (413.2) MB/s
/home/adenilson/temp/snappy/testdata/html_x_4 :
GZIP: [b 1M] bytes 409600 -> 53299 13.0% comp 53.7 ( 54.2) MB/s uncomp 412.4 (428.6) MB/s

@breznak
Copy link

breznak commented Aug 31, 2018

Is this project actively developed in this repo? Latest activity ~2yrs ago for such a big project is sad.

@madler
Copy link
Owner

madler commented Aug 31, 2018

You can see recent activity in the develop branch.

https://github.com/madler/zlib/commits/develop

@Adenilson
Copy link
Author

@madler ping? Did you have time to have a look on the benchmark data?

Should we go ahead and submit a merge request?

@Adenilson
Copy link
Author

@madler friendly ping?

@Adenilson
Copy link
Author

Adenilson commented Jan 18, 2019

A google engineer started to perform compression on Javascript source strings to save memory in Chromium browser tabs and did some benchmarking comparing Chromium's zlib x zstd x brotli.

He collected some pretty interesting data on both x86 and ARM, I thought you could be interested in checking:
https://bugs.chromium.org/p/chromium/issues/detail?id=912902#c6

Just download the zip and open the html files in a webbrowser.
chromium_zlib_zstd_brotli.zip

@Adenilson
Copy link
Author

@madler ping?

2 similar comments
@Adenilson
Copy link
Author

@madler ping?

@Adenilson
Copy link
Author

@madler ping?

@Adenilson
Copy link
Author

@madler should I close the issue?

@LifeIsStrange
Copy link

@madler what do you think about making a crowdfunding or making a proposal to Google for funding zlib maintenance and development?

@madler
Copy link
Owner

madler commented Nov 23, 2019

I personally don't need any funding. What would we do with the money?

@LifeIsStrange
Copy link

Invest in more maintainers in order to integrate e.g chromium zlib optimizations to the benefit of many softwares?

@nmoinvaz
Copy link
Contributor

Why not just use the chromium fork instead?

@ProgramMax
Copy link

Ideally, Chromium's fork shouldn't be the source of truth.

Also, Mark has set in motion some things to make it easier for pull requests like this to land. To be honest, I dropped the ball in helping on it. I'll try to pick it back up. :)

@LifeIsStrange
Copy link

Because zlib is by default on Linux distributions unlike chromium zlib.
Moreover, most developers are unaware of the existence of chromium zlib.

@Strum355
Copy link

Chromiums zlib implementation causes SIGILL on CPUs without SSSE3 support from what I've gathered. nodejs/node#32553

@Adenilson
Copy link
Author

Upon further investigation, it seems to be an issue with the way Node is building the code (i.e. passing -mssse3 to the portable code section):
nodejs/node#32553 (comment)

GerHobbelt pushed a commit to GerHobbelt/zlib that referenced this issue Nov 20, 2021
* Added cmake build instructions, build options, install instructions, and repository contents to README.md.
* Moved INDEX file content to README.md files.
* Added configure instructions and options.
* Added CodeFactor to build integration.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants