Add experimental code for data fragments. #282
Conversation
OK, I think this is ready for another pair of eyes, if only to sanity-check what I have done so far. I think it is pretty simple. Currently the CLI is very basic and only provides the option to … The CLI could optionally be extended with the following, more complicated functionality:
This PR doesn't depend on #284, but that PR is likely also required for this functionality to be exploited, as it is sometimes necessary to rechunk data being written to a fragment due to zarr chunk size limits.
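For context (this is not code from the PR), a minimal sketch of the kind of rechunking involved: zarr requires uniform chunks within an array (bar the final chunk), so data destined for a fragment may need to be rechunked to the store's chunking before it is written. Dimension names, chunk sizes and the output path below are assumptions for illustration only.

```python
import dask.array as da
import xarray as xr

# Hypothetical uniform row chunking required by the target fragment's zarr store.
TARGET_ROW_CHUNK = 10_000

# Hypothetical FLAG data with irregular row chunks (e.g. the result of a selection).
flags = da.zeros(
    (25_000, 64, 4),
    chunks=((12_000, 8_000, 5_000), 64, 4),
    dtype=bool,
)
ds = xr.Dataset({"FLAG": (("row", "chan", "corr"), flags)})

# Rechunk to the store's uniform chunking before writing.
ds = ds.chunk({"row": TARGET_ROW_CHUNK})
ds.to_zarr("fragment.zarr", mode="w")  # path is illustrative only
```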
Could you also please rebase this PR on master?
I have rebased to master.
- Tests added / passed
  If the pep8 tests fail, the quickest way to correct this is to run `autopep8`, and then `flake8` and `pycodestyle` to fix the remaining issues.
- Fully documented, including `HISTORY.rst` for all changes and one of the `docs/*-api.rst` files for new API
  To build the docs locally:
This PR is a WIP which investigates reading data from multiple sources (with potentially different backends) and utilising `xarray` functionality to merge the resulting datasets dynamically. Practically, this makes it possible to read the static contents of a measurement set (e.g. DATA, UVW) from one location (such as a read-only S3 bucket) and the mutable contents, such as FLAG, from another location. This may make it possible to implement a basic versioning system in which we create proxy datasets which hold some (mutable) data, but which point back at some parent object from which the remaining data can be retrieved.
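This is not the implementation in the PR, but a minimal sketch of the idea, assuming both locations hold zarr-backed datasets. The store paths are placeholders, and the use of `xr.merge` with `compat="override"` (so the fragment's variables take precedence) is one possible way to express the merge, not necessarily the one used here.

```python
import fsspec
import xarray as xr

# Static, read-only columns (e.g. DATA, UVW) from an archive location.
static_ds = xr.open_zarr(fsspec.get_mapper("s3://archive-bucket/ms.zarr"))

# Mutable columns (e.g. FLAG) from a writable fragment elsewhere.
mutable_ds = xr.open_zarr("/scratch/ms-fragment.zarr")

# Where both datasets contain the same variable, prefer the fragment's copy,
# so an updated FLAG overrides the archived one.
merged = xr.merge([mutable_ds, static_ds], compat="override")
```

The same pattern extends to the proxy/parent arrangement described above: the proxy dataset holds only the mutable variables, and the parent supplies the remaining variables at merge time.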