-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(feat): fall back to pure python indexing in case of unhandled rust indexing for read #30
Conversation
@ilan-gold, you are a testing wizard. There are a few extra failures here compared to |
100% |
Will take care of that here then @LDeakin |
Aside from the things that I think writing can come in a separate PR. Ideally the solution for both of these is integer indexing in |
05e098a
to
8b60bed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one!
The testing here is a bit broken, some stuff got lost in the refactor. |
…ip[vindex-contiguous_in_chunk_array-contiguous_in_chunk_array]"
Ok the issue is that we need to use https://numpy.org/doc/stable/reference/generated/numpy.broadcast_shapes.html to distinguish between cases we can handle and not i.e., input and output must have the same dimensionality upon selection. |
I tried extending our use-case here to 3d (hence the bigger changes than just xfailing the one previous failing test: deba69e) to ensure that things really work and that my checks are functional, but that really only exacerbated the "do things work" in zarr problem: zarr-developers/zarr-python#2485. I am tempted to just say "2D works for everything except for the explcitly skipped tests, and >2D only works with slicing" or something to this effect. |
29792a0
to
deba69e
Compare
* (chore): add basic 3d tests * (refactor): use `pytest_generate_tests`
…ndexing for read (#30) * (chore): file structure * (chore): parametrize tests to get full scope of possibilities * (chore): xfail tests that fail on zarr-python default pipeline * (fix) singular * (fix): check for contiguous index arrays * (fix): contiguous numpy arrays converted to slices * (feat): add reading for non-contiguous buffers * (chore): remove unused imports * (fix): cleanup unwraps in `retrieve_chunks` * Refactor full indexing (#34) * (chore): `make_chunk_info_for_rust` cleanup * (fix): all tests working except "tests/test_pipeline.py::test_roundtrip[vindex-contiguous_in_chunk_array-contiguous_in_chunk_array]" * (fix): skip read in `store_chunk_subset_bytes` for full chunks * (fix): improve dropped index detection + disallow integer write case * (chore): message more specific * (fix): use `Exception` * (chore): erroneous comment * (chore): `drop_axes` default * (chore): `drop_axes` param * (chore): apply review * (chore): `else` branch * (chore): add basic nd tests (#35) * (chore): add basic 3d tests * (refactor): use `pytest_generate_tests` * (fix): clarify collapsed dimension behavior * (chore): clean ups --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> Co-authored-by: Philipp A. <flying-sheep@web.de>
* Add a `CodecPipeline` stub * Pass chunk spec * (fix): return bytes as numpy array (#18) * (fix): return bytes as numpy array * (fix): reshape after making view * (fix): remove unused code * (fix): error handling in `get_store_and_path` * (fix): change `example/simple.py` to a 2D array * (fix): pass value to `store_chunk_subset` * (fix): clippy warnings * (fix): handle missing chunks in `retrieve_chunk_subset` and cleanup * (fix): panics to errors in `get_store_and_path` * (fix): add partial reads/writes to `simply.py` * (fix): minimum working codec pipeline - Add internal `get_chunk_representation`, `retrieve_chunk_bytes`, and `store_chunk_bytes` - Separate `retrieve_chunk` and `retrieve_chunk_subset` - Separate `store_chunk` and `store_chunk_subset` - Add assertions to simple.py * (fix): handle relative filesystem paths * (fix): minimal error handling and fix clippy warnings * (fix): handle relative filesystem paths take 2 * (fix): constant handling * (fix): add `retrieve_chunks` for parallel read - Add config options - CodecPipelineImpl interior mutability * (fix): add `store_chunks` for parallel write * (fix) convert write value to contiguous array if needed * (fix): bump zarrs to 21e86cb9 * CI for codec pipeline (#20) * (fix): ci for running tests * (fix): no need to extract tests * (Fix): remove duplicate name * (fix): use a submodule... * (chore): remove memory store + port zarr codec tests * (chore): remove `dlpark` * (fix): getattr for itemsize * (chore): remove runtime F ordering * (chore): skip vlen * (feat): parse int --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> * (fix): support writing arrays with non-native endianness * (fix): disable bad/unsupported invalid metadata tests * (fix): do not store empty chunks * (fix) remove dead code in codec pipeline * (fix): move some selection logic from Rust to Python * (feat): `chunks_desc` handling + concurrency via `zarr.config` (#21) * (fix): minimum working codec pipeline - Add internal `get_chunk_representation`, `retrieve_chunk_bytes`, and `store_chunk_bytes` - Separate `retrieve_chunk` and `retrieve_chunk_subset` - Separate `store_chunk` and `store_chunk_subset` - Add assertions to simple.py * (fix): handle relative filesystem paths * (fix): minimal error handling and fix clippy warnings * (fix): handle relative filesystem paths take 2 * (fix): constant handling * (fix): add `retrieve_chunks` for parallel read - Add config options - CodecPipelineImpl interior mutability * (fix): add `store_chunks` for parallel write * (fix) convert write value to contiguous array if needed * (fix): bump zarrs to 21e86cb9 * CI for codec pipeline (#20) * (fix): ci for running tests * (fix): no need to extract tests * (Fix): remove duplicate name * (fix): use a submodule... * (chore): remove memory store + port zarr codec tests * (chore): remove `dlpark` * (fix): getattr for itemsize * (chore): remove runtime F ordering * (chore): skip vlen * (feat): parse int --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> * (fix): support writing arrays with non-native endianness * (fix): disable bad/unsupported invalid metadata tests * (fix): do not store empty chunks * (fix) remove dead code in codec pipeline * (fix): move some selection logic from Rust to Python * (chore): `chunks_desc` cleanup * (feat): adding concurrency via zarr config * (chore): remove extra comment * (fix): refactor chunk info creation + threads->threading + ruff * (fix): use `or` for `threading.max_workers` getting --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> * (chore): pipeline-specific tests (#23) * (chore): add pipeline-specific tests * (chore): add more tests, including weird zarr one * (fix): arr shape used in test * (chore): change names * (chore): remove unnecessary fill_value checking * (chore): make roundtrip_full_array shorter * (fix): handle singleton axis indexing Fixes `test_roundtrip_singleton_axis` test * (fix): deprecated numpy array to scalar in `make_slice_selection` Fixes the `test_roundtrip_partial` test --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> * (chore): add orthogonal indexing tests (#24) * (chore): add orthogonal indexing tests * (fix): oindex roundtrip * (fix): dont skip test because it works actually * (fix): clean up returning * Delete tests/__init__.py * (chore): add ellipsis indexing tests (#25) * Format (#26) * Pyo3-0.22 (#29) * Refactor and deduplicate ChunksItem (#31) * Improve error mapping code (#28) Co-authored-by: ilan-gold <ilanbassgold@gmail.com> * (chore): add `pyarray_itemsize()` * (chore): add safety comments to `ndarray_to_*` methods Also add `py_untyped_array_to_array_object` * (chore): label fields in `ChunksItemRaw` * (fix): fix store/retrieve with scalars + refactor - Fixed fill value bytes being larger than needed when storing - Handle reduced dimensionality inputs/outputs - Uses LDeakin/zarrs@8c1391f - This inconsistency was picked up by `zarrs` in debug build * (fix): address several unwraps * (chore): cleanup `slice_to_range` * (chore): change `retrieve_chunk_bytes` to return `ArrayBytes` * (chore): add SAFETY docs and `store_chunk_subset_bytes` input validation * (chore): file structure (#27) Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> * (feat): refactor `CodecPipelineStore` to a trait - Move filesystem store implementation to a submodule * (fix): bring back filesystem relative path support Broken with 4c1ce39 * (fix): bump `requires-python` to 3.11 This is the minimum set by `zarr` 3.0.0b1 * Rust tests (#33) * Simplify shape calculation * (chore): add rust-cache to CI and use the standard uv action (#32) * (fix): use the standard uv GH action * cache with pyproject.toml * ci: remove install rust * (chore): add rust-cache to CI * (fix): add note in CI about rust-toolchain action * (feat): fall back to pure python indexing in case of unhandled rust indexing for read (#30) * (chore): file structure * (chore): parametrize tests to get full scope of possibilities * (chore): xfail tests that fail on zarr-python default pipeline * (fix) singular * (fix): check for contiguous index arrays * (fix): contiguous numpy arrays converted to slices * (feat): add reading for non-contiguous buffers * (chore): remove unused imports * (fix): cleanup unwraps in `retrieve_chunks` * Refactor full indexing (#34) * (chore): `make_chunk_info_for_rust` cleanup * (fix): all tests working except "tests/test_pipeline.py::test_roundtrip[vindex-contiguous_in_chunk_array-contiguous_in_chunk_array]" * (fix): skip read in `store_chunk_subset_bytes` for full chunks * (fix): improve dropped index detection + disallow integer write case * (chore): message more specific * (fix): use `Exception` * (chore): erroneous comment * (chore): `drop_axes` default * (chore): `drop_axes` param * (chore): apply review * (chore): `else` branch * (chore): add basic nd tests (#35) * (chore): add basic 3d tests * (refactor): use `pytest_generate_tests` * (fix): clarify collapsed dimension behavior * (chore): clean ups --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> Co-authored-by: Philipp A. <flying-sheep@web.de> * (fix): support zarr 3.0.0b2 + bump zarrs to 0.18.0-beta.0 (#36) * (fix): support zarr 3.0.0b2 * (fix): open store read_only in test_roundtrip_read_only_zarrs * (fix): pin zarrs revision I think an old version is cached? * (chore): add Cargo.toml to uv dependency glob * (fix): unquote uv dependency glob * (chore): bump `zarrs` to 0.18.0-beta.0 * (fix): remove unused rust features and `paste` * (chore): first pass at docs * (chore): add `yml` * (chore): small cleanup in `README.md` * (chore): add some little examples + integer indexing note * (chore): more examples * (chore): link * Update hatch.toml Co-authored-by: Philipp A. <flying-sheep@web.de> * (chore): update package name * merge * (fix): add zarrs import to example * (fix): fix typo in docs * (fix): remove python/zarrs_python directory * (fix): clarify what is not supported * (chore): futher clarify --------- Co-authored-by: Lachlan Deakin <ljdgit@gmail.com> Co-authored-by: Philipp A. <flying-sheep@web.de>
There are a lot of failures here, and none of them come from numpy, which supports all of these indexing operations. This may be a bit out of scope....