ARM specific optimizations #216
Comments
I recently implemented a NEON-optimized version of the Adler-32 checksum (https://codereview.chromium.org/2676493007/, about 3x faster on ARMv8), and I would like to upstream these patches rather than diverge the zlib fork used in Chromium any further.
Please see pull request at:
The next candidate would be CRC32 (it can be made 7x to 10x faster by using the CRC32 instruction available in ARMv8): https://bugs.chromium.org/p/chromium/issues/detail?id=709716
If it can be made faster, just do it. Anyone using zlib would benefit from the performance increase.
zlib is both efficient and fast (not to mention insanely portable) and has provided great service to the world for the last two decades. It is used everywhere: the Linux kernel, Chromium, Firefox, libpng, iOS, Android, etc. We should all be grateful that its authors made it available for free.
One way to improve performance is to sacrifice portability (e.g. CPU-specific code). That is a considerable cost, so it is better to keep such code contained in well-separated functions/modules.
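As a rough illustration of what "contained in well-separated functions" could look like, here is a minimal sketch that hides a NEON path behind a single entry point chosen at compile time. The names `byte_sum`, `byte_sum_generic` and `byte_sum_neon` are hypothetical and are not part of zlib or of the Chromium patch; a plain byte sum stands in for a real checksum to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: always compiled, works on every target. */
static uint32_t byte_sum_generic(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

#if defined(__ARM_NEON)
#include <arm_neon.h>

/* ARM-only variant: compiled only when the compiler targets NEON. */
static uint32_t byte_sum_neon(const unsigned char *buf, size_t len)
{
    uint32x4_t acc = vdupq_n_u32(0);
    size_t i = 0;

    for (; i + 16 <= len; i += 16) {
        uint8x16_t v = vld1q_u8(buf + i);   /* load 16 bytes */
        uint16x8_t pairs = vpaddlq_u8(v);   /* widen and add adjacent bytes */
        acc = vpadalq_u16(acc, pairs);      /* accumulate into 4 x 32-bit lanes */
    }

    uint32_t sum = vgetq_lane_u32(acc, 0) + vgetq_lane_u32(acc, 1)
                 + vgetq_lane_u32(acc, 2) + vgetq_lane_u32(acc, 3);

    /* The remaining tail (fewer than 16 bytes) goes through the portable path. */
    return sum + byte_sum_generic(buf + i, len - i);
}
#endif

/* Single public entry point: callers never see the CPU-specific code. */
uint32_t byte_sum(const unsigned char *buf, size_t len)
{
#if defined(__ARM_NEON)
    return byte_sum_neon(buf, len);
#else
    return byte_sum_generic(buf, len);
#endif
}
```

Callers only ever use `byte_sum(data, n)`; on a non-ARM build the NEON function is not compiled at all, so the portable path stays untouched.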
@timofonic zlib-ng has accepted the ARM-specific optimizations; IIRC they are in a development branch.
libpng has both intrinsics and hand-written ASM code for ARM (on the pre-filters).
Would zlib be open to contributions of a few core/hot functions targeting ARM?
One good candidate we identified is Adler-32: a SIMD version is about 3x faster on ARMv8.
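For context, the hot loop a SIMD version has to speed up looks roughly like the scalar sketch below. This is not the Chromium patch; it is a plain reference implementation of Adler-32 with the usual deferred-modulo trick, and the function name `adler32_scalar` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD  65521u  /* largest prime below 2^16 */
#define ADLER_NMAX 5552u   /* bytes that can be summed before 32-bit overflow */

/* Update a running Adler-32 value; start with adler = 1 for a new stream. */
uint32_t adler32_scalar(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;          /* low half: 1 + sum of bytes */
    uint32_t b = (adler >> 16) & 0xffffu;  /* high half: sum of the 'a' values */

    while (len > 0) {
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= chunk;

        /* Two dependent additions per byte: this is the part a
         * vectorized version would process many bytes at a time. */
        for (size_t i = 0; i < chunk; i++) {
            a += buf[i];
            b += a;
        }
        buf += chunk;

        /* Reduce once per chunk instead of once per byte. */
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}
```

For example, `adler32_scalar(1, (const unsigned char *)"Wikipedia", 9)` returns `0x11E60398`, and feeding the same bytes in two calls (passing the first result back in as `adler`) gives the same value.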