numpy savez_compressed much smaller filesizes for small arrays #116
Comments
@lopsided thank you for asking about this. What settings are you using for Blosc and bloscpack? Maybe you need to use a higher compression setting (like 9) and/or change the internal algorithm. I think it could be worth a shot.
@lopsided a list of settings to explore is here: https://github.com/Blosc/bloscpack#settings
If you can share the data, or an anonymized variant that has similar entropy, we could look into this in more detail.
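As a rough illustration of the suggestion above, here is a minimal sketch that packs one array with several of Blosc's internal codecs at compression level 9 and compares the result with `np.savez_compressed`. It assumes the `python-blosc` package; the helper names `savez_compressed_size` and `blosc_sizes` are hypothetical, and the array is a synthetic stand-in, not the data from this issue, so the numbers it prints say nothing about the real images.

```python
import io

import numpy as np

try:
    import blosc  # python-blosc; optional for this sketch
except ImportError:
    blosc = None


def savez_compressed_size(arr):
    """Return the number of bytes np.savez_compressed writes for one array."""
    buf = io.BytesIO()
    np.savez_compressed(buf, arr=arr)
    return buf.getbuffer().nbytes


def blosc_sizes(arr, cnames=("blosclz", "lz4", "zlib", "zstd"), clevel=9):
    """Return {codec name: packed size in bytes}, or {} if blosc is missing."""
    if blosc is None:
        return {}
    # Only try codecs that this Blosc build was compiled with.
    available = set(blosc.compressor_list())
    return {
        name: len(
            blosc.pack_array(arr, clevel=clevel, shuffle=blosc.SHUFFLE, cname=name)
        )
        for name in cnames
        if name in available
    }


# Synthetic stand-in for one image triplet (assumption: NOT the issue's data).
rng = np.random.default_rng(0)
image = np.round(rng.random((3, 200, 200)) * 255).astype(np.float32)

print("raw bytes:", image.nbytes)
print("np.savez_compressed:", savez_compressed_size(image))
for name, size in blosc_sizes(image).items():
    print(f"blosc.pack_array ({name}, clevel=9):", size)
```

Running the same comparison on a handful of the real images would show whether a different codec or level closes the gap.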
Thanks for the quick reply! I've just been using pretty much default settings:
I've attached an example image (actually a triplet of greyscale images), saved uncompressed using
Thank you, it may take me a few days to tinker.
I am so sorry, but there was no space left in my schedule to look into this.
I have a few million images to save to disk and have been trying a few options out. I thought blosc/bloscpack would be well suited, but I'm getting far larger file sizes than with the standard numpy `savez_compressed`.

My images are size `(3, 200, 200)` and `dtype=float32`. Typical file sizes I'm getting are:

| Method | Typical file size |
| --- | --- |
| `np.savez` | ~470k |
| `np.savez_compressed` | ~53k |
| `blosc.pack_array` | ~200k |
| `blosc.compress_ptr` | ~200k |
| `bloscpack.pack_ndarray_to_file` | ~200-400k |
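For reference, the reported sizes can be put next to the raw payload size with a little arithmetic. The sketch below assumes that "k" in the figures above means KiB; under that assumption the `np.savez` figure is essentially the uncompressed array, and the other entries translate into approximate compression ratios.

```python
import math

shape = (3, 200, 200)  # image triplet shape from the post
itemsize = 4           # bytes per float32 element

raw_bytes = math.prod(shape) * itemsize
raw_kib = raw_bytes / 1024

# Typical sizes reported in the post, in "k" (assumed here to mean KiB).
reported_k = {
    "np.savez": 470,
    "np.savez_compressed": 53,
    "blosc.pack_array": 200,
    "blosc.compress_ptr": 200,
}

# Compression ratio relative to the raw array payload.
ratios = {name: raw_kib / size for name, size in reported_k.items()}

print(f"raw payload: {raw_bytes} bytes ({raw_kib:.2f} KiB)")
for name, ratio in ratios.items():
    print(f"{name}: ratio of about {ratio:.1f}x versus raw")
```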
For a sample of 370 images this gives:
For the `blosc_*` methods I'm writing the packed bytes like:

Is there anything I'm missing, or is numpy's compression just as good as it gets for small images like these?