Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow VLenUTF8 to encode a read-only source #515

Merged
merged 5 commits into from
Jul 12, 2024

Conversation

ivirshup
Copy link
Contributor

@ivirshup ivirshup commented Mar 12, 2024

Allow read-only arrays to be encoded with VLenUTF8

Arguably this should be tested and potentially fixed for all codecs.

Question: Is this implementation fine? I've done very little Cython, so I have no idea if there are any downsides to this.

TODO:

  • Unit tests and/or doctests in docstrings
  • Tests pass locally
  • Docstrings and API docs for any new/modified user-facing classes and functions
  • Changes documented in docs/release.rst
  • Docs build locally
  • GitHub Actions CI passes
  • Test coverage to 100% (Codecov passes)

@pep8speaks
Copy link

pep8speaks commented Mar 12, 2024

Hello @ivirshup! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-03-12 20:26:51 UTC

@ivirshup ivirshup marked this pull request as ready for review March 12, 2024 18:20
@ivirshup ivirshup marked this pull request as draft March 12, 2024 18:23
@ivirshup ivirshup marked this pull request as ready for review March 12, 2024 18:25
@ivirshup ivirshup changed the title Vlen readonly Allow VLenUTF8 to encode a read-only source Mar 12, 2024
Copy link

codecov bot commented Jul 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.91%. Comparing base (1be12d3) to head (47f1afe).
Report is 31 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #515   +/-   ##
=======================================
  Coverage   99.91%   99.91%           
=======================================
  Files          59       59           
  Lines        2336     2339    +3     
=======================================
+ Hits         2334     2337    +3     
  Misses          2        2           
Files with missing lines Coverage Δ
numcodecs/tests/test_vlen_utf8.py 100.00% <100.00%> (ø)

Copy link
Contributor

@rabernat rabernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a simple improvement with no backwards incompatibilities.

I'm sorry @ivirshup that no one reviewed this sooner. It has been hard to find maintainers for numcodecs.

Changes documented in docs/release.rst

Would you care to document this change? If not we can just merge as is, as this is pretty minor.

@rabernat rabernat merged commit f934052 into zarr-developers:main Jul 12, 2024
26 checks passed
@@ -74,7 +75,7 @@ class VLenUTF8(Codec):
def encode(self, buf):
cdef:
Py_ssize_t i, l, n_items, data_length, total_length
object[:] input_values
ndarray[object, ndim=1] input_values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isaac, could you please share a bit more about why this is needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember that much of the specifics, but it has to do with which types Cython is able to work with a read-only buffer. I remember running into this a bit with pandas too when I tried to make the index arrays actually read-only.

Is that what you were wondering about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More info here: #514 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thank you! 🙏

rabernat added a commit that referenced this pull request Aug 6, 2024
DimitriPapadopoulos pushed a commit to DimitriPapadopoulos/numcodecs that referenced this pull request Aug 10, 2024
* fix?

* test

* format

* add release notes

---------

Co-authored-by: Ryan Abernathey <ryan.abernathey@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VLenUTF8().encode(buffer) fails is buffer is read-only
4 participants