Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a Base store class for Zarr Store (update) #789

Merged
merged 39 commits into from
Oct 21, 2021

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Jun 23, 2021

Use a base Store class instead of just MutableMapping

I am resuming the work @Carreau started in #612 here. This is mostly just a rebase of that PR with a small new commit in 8136713. Matthias and I met earlier today and we decided that starting with this PR should make it easier to add zarr-v3 support without trying to implement an entirely independent set of classes. The goal is to reduce code duplication between v2 and v3 where possible (None of the changes in this PR are specific to v3 yet)

Note that I updated the version string text to 2.9.0 in a couple of places, but that may need to be updated again depending on when this is merged.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@pep8speaks
Copy link

pep8speaks commented Jun 23, 2021

Hello @grlee77! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-10-21 04:50:40 UTC

@joshmoore
Copy link
Member

The Fatal Python error: Segmentation fault was also seen in #788 (comment)

@grlee77
Copy link
Contributor Author

grlee77 commented Jun 24, 2021

Yes, I just started looking into it. I can reproduce it locally after upgrading to NumPy 1.21. It does not occur for NumPy 1.20.3

@grlee77
Copy link
Contributor Author

grlee77 commented Jun 24, 2021

I opened numpy/numpy#19325 regarding this segfault

@codecov
Copy link

codecov bot commented Jun 24, 2021

Codecov Report

Merging #789 (06086dc) into master (d8ac8a7) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #789   +/-   ##
=======================================
  Coverage   99.94%   99.94%           
=======================================
  Files          31       32    +1     
  Lines       11133    11175   +42     
=======================================
+ Hits        11127    11169   +42     
  Misses          6        6           
Impacted Files Coverage Δ
zarr/_storage/absstore.py 100.00% <100.00%> (ø)
zarr/_storage/store.py 100.00% <100.00%> (ø)
zarr/attrs.py 100.00% <100.00%> (ø)
zarr/convenience.py 100.00% <100.00%> (ø)
zarr/core.py 100.00% <100.00%> (ø)
zarr/creation.py 100.00% <100.00%> (ø)
zarr/hierarchy.py 100.00% <100.00%> (ø)
zarr/storage.py 100.00% <100.00%> (ø)
zarr/tests/test_attrs.py 100.00% <100.00%> (ø)
zarr/tests/test_convenience.py 100.00% <100.00%> (ø)
... and 7 more

@grlee77 grlee77 mentioned this pull request Jun 24, 2021
6 tasks
@jakirkham
Copy link
Member

cc @martindurant @rabernat @jhamman @shoyer @shikharsg (in case any of you have thoughts here 🙂 this will affect how other store implementations work 😉)

@jakirkham
Copy link
Member

Also cc @TomAugspurger 🙂

@grlee77
Copy link
Contributor Author

grlee77 commented Jun 30, 2021

Looking ahead to adding v3 spec support:

In terms of class members for a Store interface, the v3 spec indicates the following should be present:

  • readable stores must have: get
  • writeable stores must have : set, erase, erase_prefix
  • listable stores should have one or more of: list, list_dir, list_prefix:

The Store class here matches the existing store implementations in zarr-python in that it has a listdir method, but this differs from the list_dir spelling in the v3 spec which is unfortunate. erase seems equivalent to __delitem__ while erase_prefix seems equivalent to rmdir

@joshmoore
Copy link
Member

Green now except for the removed build (==1.16.4) which can be ignored.

@martindurant
Copy link
Member

Clarification: if we open an array or group, and pass a dict-like as before, it gets wrapped to become a Store, right?

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned on call, this is looking good. Thanks, @grlee77. Probably the most important remaining step is to get the release blurb so that we can all agree on the degree of breakage and have a comprehensive list of what (if anything) will no longer take dicts. Additionally some documentation may be necessary around the "pass by reference" semantics.

Propose then to target this for a 2.11

@joshmoore
Copy link
Member

Solved the (minor) merge conflict, @grlee77. From my side, I'd say let's try to get this into a RC as early as tomorrow. (Or someone else can do it this evening. I'm going to 🛏️ ) Up to you if you think we should also try to get metadata handling in.

(Release blurb would still be appreciated.)

@grlee77
Copy link
Contributor Author

grlee77 commented Oct 20, 2021

Solved the (minor) merge conflict, @grlee77. From my side, I'd say let's try to get this into a RC as early as tomorrow. (Or someone else can do it this evening. I'm going to bed ) Up to you if you think we should also try to get metadata handling in.

I think we may as well also merge the metadata handling for the RC. Let's discuss in the community call.

One remaining doubt about the approach here is the presence of listdir and rmdir methods in this class which seems specific to our current v2 store implementations. They are not among the abstract store methods listed in the v3 spec (there is a list_dir, not listdir but behavior is not exactly the same. erase_prefix in the spec is like rmdir). I am wondering if we should instead hava a BaseStore that does not have these methods and then use a StoreV2 subclass that would add these?

A future StoreV3 class could then inherit from BaseStore instead of StoreV2, so that it would not have to inherit additional methods that are not part of the spec. That said, at the moment many high-level functions use listdir and rmdir, so we may want to keep them around, at least temporarily.

BaseStore does not have the listdir or rmdir methods

cleaned up some type declerations, making sure mypy passes
Otherwise the save_array doc example fails to write to a ZipStore
@@ -133,7 +133,7 @@ def rmdir(store: StoreLike, path: Path = None):
this will be called, otherwise will fall back to implementation via the
`Store` interface."""
path = normalize_storage_path(path)
if hasattr(store, "rmdir"):
if hasattr(store, "rmdir") and store.is_erasable(): # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker for getting this into a 2.11 alpha build but is the hasattr not now redundant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite yet. There are calls to rmdir from _init_array_metadata (via init_array which still allows a MutableMapping for the store). This is why I put store as StoreLike in the type annotation, but then I had to add "# type: ignore" on this line or mypy complained that MutableMappings do not have an is_erasable attribute.

@joshmoore
Copy link
Member

As discussed during the community call yesterday evening (my time), merging this into a 2.11 alpha release with the (now passing) proposed changes.

Thanks to both @grlee77 and @Carreau for all their labors! 👏🏽

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants