forked from zarr-developers/numcodecs
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* added PCodec
* fix line length and print statements
* docs
* mock pcodec on rtd
* fix typo
* add dtype details
* changed import style for pcodec
* fix flake8
* revert import changes
* fix errors due to changes in pcodec API
* change import style
* skip coverage of failed import path
* skip pcodec tests if not installed
Showing 89 changed files with 250 additions and 5 deletions.
Submodule c-blosc
updated
4 files
+1 −1 | .github/workflows/cmake.yml | |
+7 −0 | RELEASE_NOTES.rst | |
+2 −2 | blosc/blosc.h | |
+1 −1 | tests/test_common.h |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,6 +67,7 @@ Contents | |
abc | ||
registry | ||
blosc | ||
pcodec | ||
lz4 | ||
zfpy | ||
zstd | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
PCodec | ||
====== | ||
|
||
.. automodule:: numcodecs.pcodec | ||
|
||
.. autoclass:: PCodec | ||
|
||
.. autoattribute:: codec_id | ||
.. automethod:: encode | ||
.. automethod:: decode |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 262144, | ||
"float_mult_spec": "enabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "enabled", | ||
"level": 8 | ||
} |
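Each JSON fixture like the one above holds a codec configuration: the `id` key plus the constructor arguments. The sketch below mimics that round trip with a stand-in class. `SketchCodec` is hypothetical and is not the real `numcodecs.abc.Codec`; it assumes the config is built from the instance's public attributes and that the codec is rebuilt by passing every key except `id` back to the constructor.

```python
import json


class SketchCodec:
    """Stand-in for a numcodecs-style codec; hypothetical, for illustration only."""

    codec_id = "pcodec"

    def __init__(
        self,
        level=8,
        delta_encoding_order=None,
        int_mult_spec="enabled",
        float_mult_spec="enabled",
        equal_pages_up_to=262144,
    ):
        self.level = level
        self.delta_encoding_order = delta_encoding_order
        self.int_mult_spec = int_mult_spec
        self.float_mult_spec = float_mult_spec
        self.equal_pages_up_to = equal_pages_up_to

    def get_config(self):
        # Assumed behavior: "id" plus every public instance attribute.
        config = {"id": self.codec_id}
        config.update(
            {k: v for k, v in vars(self).items() if not k.startswith("_")}
        )
        return config

    @classmethod
    def from_config(cls, config):
        # Drop "id" and pass the remaining keys to the constructor.
        kwargs = {k: v for k, v in config.items() if k != "id"}
        return cls(**kwargs)


codec = SketchCodec()
# Sorted keys and a 4-space indent give the same layout as the fixture.
text = json.dumps(codec.get_config(), indent=4, sort_keys=True)
restored = SketchCodec.from_config(json.loads(text))
```

Because the keys are sorted on output, a fixture written this way stays stable regardless of the order in which the attributes are assigned in `__init__`.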
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 262144, | ||
"float_mult_spec": "enabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "enabled", | ||
"level": 1 | ||
} |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 262144, | ||
"float_mult_spec": "enabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "enabled", | ||
"level": 5 | ||
} |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 262144, | ||
"float_mult_spec": "enabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "enabled", | ||
"level": 9 | ||
} |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 262144, | ||
"float_mult_spec": "disabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "disabled", | ||
"level": 8 | ||
} |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"delta_encoding_order": null, | ||
"equal_pages_up_to": 300, | ||
"float_mult_spec": "enabled", | ||
"id": "pcodec", | ||
"int_mult_spec": "enabled", | ||
"level": 8 | ||
} |
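This fixture sets `equal_pages_up_to` to 300, which the codec's docstring describes as dividing the chunk into equal pages of up to that many numbers. Below is a minimal sketch of one way to compute such page sizes. It assumes the page count is the smallest one that keeps every page within the limit, and that any remainder is spread one element at a time over the first pages. `equal_page_sizes` is a hypothetical helper, not part of pcodec's API, and pcodec's actual splitting may differ in detail.

```python
def equal_page_sizes(n, max_page_n):
    """Split n numbers into near-equal pages of at most max_page_n each."""
    if n < 0 or max_page_n < 1:
        raise ValueError("n must be >= 0 and max_page_n must be >= 1")
    if n == 0:
        return []
    # Smallest number of pages such that no page exceeds the limit.
    n_pages = -(-n // max_page_n)  # ceiling division
    base, extra = divmod(n, n_pages)
    # The first `extra` pages take one additional element each.
    return [base + 1 if i < extra else base for i in range(n_pages)]


# The test arrays in this commit hold 1000 elements.
sizes = equal_page_sizes(1000, 300)
```

Under these assumptions, a 1000-element array with a limit of 300 is split into four pages of 250 rather than three pages of 300 and one of 100.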
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
from typing import Optional, Literal | ||
|
||
import numcodecs | ||
import numcodecs.abc | ||
from numcodecs.compat import ensure_contiguous_ndarray | ||
|
||
try: | ||
from pcodec import standalone, ChunkConfig, PagingSpec | ||
except ImportError: # pragma: no cover | ||
standalone = None | ||
|
||
|
||
DEFAULT_MAX_PAGE_N = 262144 | ||
|
||
|
||
class PCodec(numcodecs.abc.Codec): | ||
""" | ||
PCodec (or pco, pronounced "pico") losslessly compresses and decompresses | ||
numerical sequences with high compression ratio and fast speed. | ||
See `PCodec Repo <https://github.com/mwlon/pcodec>`_ for more information. | ||
PCodec supports only the following numerical dtypes: uint32, uint64, int32, | ||
int64, float32, and float64. | ||
Parameters | ||
---------- | ||
level : int | ||
A compression level from 0-12, where 12 takes the longest and compresses | ||
the most. | ||
delta_encoding_order : int or None | ||
Either a delta encoding level from 0-7 or None. If set to None, pcodec | ||
will try to infer the optimal delta encoding order. | ||
int_mult_spec : {'enabled', 'disabled'} | ||
If enabled, pcodec will consider using int mult mode, which can | ||
substantially improve compression ratio but decrease speed in some cases | ||
for integer types. | ||
float_mult_spec : {'enabled', 'disabled'} | ||
If enabled, pcodec will consider using float mult mode, which can | ||
substantially improve compression ratio but decrease speed in some cases | ||
for float types. | ||
equal_pages_up_to : int | ||
Divide the chunk into equal pages of up to this many numbers. | ||
""" | ||
|
||
codec_id = "pcodec" | ||
|
||
def __init__( | ||
self, | ||
level: int = 8, | ||
delta_encoding_order: Optional[int] = None, | ||
int_mult_spec: Literal["enabled", "disabled"] = "enabled", | ||
float_mult_spec: Literal["enabled", "disabled"] = "enabled", | ||
equal_pages_up_to: int = DEFAULT_MAX_PAGE_N, | ||
): | ||
if standalone is None: # pragma: no cover | ||
raise ImportError( | ||
"pcodec must be installed to use the PCodec codec." | ||
) | ||
|
||
# note that we use `level` instead of `compression_level` to | ||
# match other codecs | ||
self.level = level | ||
self.delta_encoding_order = delta_encoding_order | ||
self.int_mult_spec = int_mult_spec | ||
self.float_mult_spec = float_mult_spec | ||
self.equal_pages_up_to = equal_pages_up_to | ||
|
||
def encode(self, buf): | ||
buf = ensure_contiguous_ndarray(buf) | ||
|
||
paging_spec = PagingSpec.equal_pages_up_to(self.equal_pages_up_to) | ||
|
||
config = ChunkConfig( | ||
compression_level=self.level, | ||
delta_encoding_order=self.delta_encoding_order, | ||
int_mult_spec=self.int_mult_spec, | ||
float_mult_spec=self.float_mult_spec, | ||
paging_spec=paging_spec, | ||
) | ||
return standalone.simple_compress(buf, config) | ||
|
||
def decode(self, buf, out=None): | ||
if out is not None: | ||
out = ensure_contiguous_ndarray(out) | ||
standalone.simple_decompress_into(buf, out) | ||
return out | ||
else: | ||
return standalone.simple_decompress(buf) |
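The `delta_encoding_order` parameter refers to how many times consecutive differences are taken before the values are compressed. Below is a minimal pure-Python sketch of that idea. It assumes order k means applying the differencing step k times and keeping the leading values needed to invert it. `delta_encode` and `delta_decode` are hypothetical helpers for illustration only; they say nothing about how pcodec stores the result internally.

```python
def delta_encode(values, order):
    """Return (initial_values, residuals) after `order` rounds of differencing."""
    residuals = list(values)
    initial = []
    for _ in range(order):
        if not residuals:
            break
        # Keep the first value of each round so the round can be inverted.
        initial.append(residuals[0])
        residuals = [b - a for a, b in zip(residuals, residuals[1:])]
    return initial, residuals


def delta_decode(initial, residuals):
    """Invert delta_encode by cumulatively summing, innermost round first."""
    values = list(residuals)
    for first in reversed(initial):
        restored = [first]
        for diff in values:
            restored.append(restored[-1] + diff)
        values = restored
    return values


# A sequence whose second differences are constant.
data = [3, 5, 8, 12, 17]
initial, residuals = delta_encode(data, order=2)
roundtrip = delta_decode(initial, residuals)
```

For smooth sequences like this one, the residuals are small and repetitive, which is why choosing a suitable order can improve the compression ratio.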
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
import pytest | ||
import numpy as np | ||
|
||
from numcodecs.pcodec import PCodec | ||
|
||
try: | ||
# initializing codec triggers ImportError | ||
PCodec() | ||
except ImportError: # pragma: no cover | ||
pytest.skip( | ||
"pcodec not available", allow_module_level=True | ||
) | ||
|
||
from numcodecs.tests.common import ( | ||
check_encode_decode_array_to_bytes, | ||
check_config, | ||
check_repr, | ||
check_backwards_compatibility, | ||
check_err_decode_object_buffer, | ||
check_err_encode_object_buffer, | ||
) | ||
|
||
|
||
codecs = [ | ||
PCodec(), | ||
PCodec(level=1), | ||
PCodec(level=5), | ||
PCodec(level=9), | ||
PCodec(float_mult_spec="disabled", int_mult_spec="disabled"), | ||
PCodec(equal_pages_up_to=300), | ||
] | ||
|
||
|
||
# mix of dtypes: integer, float | ||
# mix of shapes: 1D, 2D | ||
# mix of orders: C, F | ||
arrays = [ | ||
np.arange(1000, dtype="u4"), | ||
np.arange(1000, dtype="u8"), | ||
np.arange(1000, dtype="i4"), | ||
np.arange(1000, dtype="i8"), | ||
np.linspace(1000, 1001, 1000, dtype="f4"), | ||
np.linspace(1000, 1001, 1000, dtype="f8"), | ||
np.random.normal(loc=1000, scale=1, size=(100, 10)), | ||
np.asfortranarray(np.random.normal(loc=1000, scale=1, size=(100, 10))), | ||
np.random.randint(0, 2**60, size=1000, dtype="u8"), | ||
np.random.randint(-(2**63), -(2**63) + 20, size=1000, dtype="i8"), | ||
] | ||
|
||
|
||
@pytest.mark.parametrize("arr", arrays) | ||
@pytest.mark.parametrize("codec", codecs) | ||
def test_encode_decode(arr, codec): | ||
check_encode_decode_array_to_bytes(arr, codec) | ||
|
||
|
||
def test_config(): | ||
codec = PCodec(level=3) | ||
check_config(codec) | ||
|
||
|
||
def test_repr(): | ||
check_repr( | ||
"PCodec(delta_encoding_order=None, equal_pages_up_to=262144, float_mult_spec='enabled', " | ||
"int_mult_spec='enabled', level=3)" | ||
) | ||
|
||
|
||
def test_backwards_compatibility(): | ||
check_backwards_compatibility(PCodec.codec_id, arrays, codecs) | ||
|
||
|
||
def test_err_decode_object_buffer(): | ||
check_err_decode_object_buffer(PCodec()) | ||
|
||
|
||
def test_err_encode_object_buffer(): | ||
check_err_encode_object_buffer(PCodec()) |
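The skip logic at the top of this test module relies on the codec raising `ImportError` when it is constructed, not when its module is imported. Below is a stand-alone sketch of that pattern. The module name `optional_backend_that_is_not_installed` and the class `OptionalCodec` are hypothetical, chosen so that the import fails and the fallback path runs.

```python
try:
    # Hypothetical optional dependency; the name is chosen so the import fails.
    import optional_backend_that_is_not_installed as backend
except ImportError:
    backend = None


class OptionalCodec:
    """Importable without the backend, but not constructible without it."""

    def __init__(self):
        if backend is None:
            raise ImportError(
                "the optional backend must be installed to use OptionalCodec."
            )


def backend_available():
    """Probe availability by constructing the codec, as the test module does."""
    try:
        OptionalCodec()
    except ImportError:
        return False
    return True


available = backend_available()
```

In a test module, the `False` branch is where `pytest.skip(..., allow_module_level=True)` would be called, so the whole file is skipped when the dependency is missing.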