- Added module `zappend.contrib` that contributes functions to zappend's
  core functionality.
- Added experimental function `zappend.contrib.write_levels()` that generates
  datasets using the multi-level dataset format as specified by xcube.
  It resembles the `store.write_data(cube, "<name>.levels", ...)` method
  provided by the xcube filesystem data stores (`"file"`, `"s3"`, `"memory"`,
  etc.). The zappend version may be used for potentially very large datasets
  in terms of dimension sizes or for datasets with a very large number of
  chunks. It is considerably slower than the xcube version (which basically
  uses `xarray.Dataset.to_zarr()` for each resolution level), but should run
  robustly with stable memory consumption. The function requires the `xcube`
  package to be installed. (#19)
- The function `zappend.api.zappend()` now returns the number of slices
  processed. (#93)
- Moved all project configuration to `pyproject.toml` and removed `setup.cfg`
  and the requirements files. (#88)
- Added a new section "How do I ..." to the documentation. (#66)
- Fixed link to slice sources on the documentation main page.
- Fixed broken CI. (#97)
- Made writing custom slice sources easier and more flexible: (#82)
  - Slice items can now be a `contextlib.AbstractContextManager`,
    so custom slice functions can now be used with
    `@contextlib.contextmanager`.
  - Introduced `SliceSource.close()` so that `contextlib.closing()` is
    applicable. Deprecated `SliceSource.dispose()`.
  - Introduced a new optional configuration setting `slice_source_kwargs`
    that contains keyword arguments passed to a configured `slice_source`
    together with each slice item.
  - Introduced an optional configuration setting `extra` that holds
    additional configuration not validated by default. Its intended use is by
    a `slice_source` that expects an argument named `ctx` and can therefore
    access the configuration.
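The context-manager style of slice source described above can be sketched as follows. This is a minimal, self-contained illustration of the pattern only: `FakeDataset` and `open_slice` are stand-ins invented here, not zappend or xarray API.

```python
import contextlib


class FakeDataset:
    """Stand-in for a dataset object held open by a slice source."""

    def __init__(self, path: str):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


@contextlib.contextmanager
def open_slice(path: str):
    # Open the slice resource, yield it to the consumer,
    # and guarantee cleanup when the with-block exits.
    ds = FakeDataset(path)
    try:
        yield ds
    finally:
        ds.close()


# A slice item that is a context manager is consumed like this:
with open_slice("slice-2024-01.nc") as ds:
    assert not ds.closed  # resource is open inside the block
# After the with-block, the resource has been released.
```

The same cleanup guarantee is what `SliceSource.close()` together with `contextlib.closing()` provides for object-style slice sources.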
- Improved readability of the configuration reference by using setting
  categories and applying a logical ordering of settings within categories.
  (#85)
- Added configuration setting `force_new`, which forces creation of a new
  target dataset. An existing target dataset (and its lock) will be
  permanently deleted before the appending of slice datasets begins. (#72)
- Chunk sizes can now be `null` for a given dimension. In this case, the
  actual chunk size used is the size of the array's shape in that dimension.
  (#77)
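As an illustrative sketch only (the variable name `chl` and the exact nesting of the `chunks` setting are assumptions; the configuration reference is authoritative):

```yaml
variables:
  chl:
    encoding:
      # null: use the array's full size along that dimension as chunk size
      chunks: [1, null, null]
```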
- Simplified writing of custom slice sources for users. The configuration
  setting `slice_source` can now be a `SliceSource` class or any function
  that returns a slice item: a local file path or URI, an `xarray.Dataset`,
  or a `SliceSource` object. Dropped the concept of slice factories entirely,
  including the functions `to_slice_factory()` and `to_slice_factories()`.
  (#78)
- Extracted the `Config` class out of `Context` and made it available via the
  new `Context.config: Config` property. The change concerns any usages of
  the `ctx: Context` argument passed to user slice factories. (#74)
- Fixed rollback for situations where writing to Zarr fails shortly after the
  Zarr directory has been created. (#69) In this case the error message was
  `TypeError: Transaction._delete_dir() missing 1 required positional
  argument: 'target_path'`.
- The configuration setting `attrs` can now be used to define dynamically
  computed dataset attributes using the syntax `{{ expression }}`. (#60)
  Example:

  ```yaml
  permit_eval: true
  attrs:
    title: HROC Ocean Colour Monthly Composite
    time_coverage_start: {{ lower_bound(ds.time) }}
    time_coverage_end: {{ upper_bound(ds.time) }}
  ```
- Introduced new configuration setting `attrs_update_mode` that controls how
  dataset attributes are updated. (#59)
- Simplified logging to the console. You can now set the configuration
  setting `logging` to a log level, which will implicitly enable console
  logging with the given log level. (#64)
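A minimal sketch of the shorthand form, assuming a YAML configuration file:

```yaml
# Implicitly enables console logging at the given level
logging: INFO
```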
- Added a section in the notebook `examples/zappend-demo.ipynb` that
  demonstrates transaction rollbacks.
- Added CLI option `--traceback`. (#57)
- Fixed an issue where a NetCDF package was missing to run the demo Notebook
  `examples/zappend-demo.ipynb` in Binder. (#47)
- Global metadata attributes of the target dataset are no longer empty. (#56)
- If the target parent directory did not exist, an exception was raised
  reporting that the lock file to be written does not exist. Changed this to
  report that the target parent directory does not exist. (#55)
- Added missing documentation of the `append_step` setting in the
  configuration reference.
- A new configuration setting `append_step` can be used to validate the step
  sizes between the labels of a coordinate variable associated with the
  append dimension. Its value can be a number for numerical labels, or a
  time-delta value of the form `8h` (8 hours) or `2D` (two days) for
  date/time labels. The value can also be negative. (#21)
- The configuration setting `append_step` can take the special values `"+"`
  and `"-"`, which are used to verify that the labels are monotonically
  increasing or decreasing, respectively. (#20)
- It is now possible to reference environment variables in configuration
  files using the syntax `${ENV_VAR}`. (#36)
- Added a demo Notebook `examples/zappend-demo.ipynb` and linked it by a
  Binder badge in `README.md`. (#47)
- When `slice_source` was given as a class or function and passed to the
  `zappend()` function, either as a configuration entry or as a keyword
  argument, a `ValidationError` was accidentally raised. (#49)
- Fixed an issue where an absolute lock file path was computed if the target
  Zarr path was relative in the local filesystem and had no parent directory.
  (#45)
- Allow for passing custom slice sources via the configuration. The new
  configuration setting `slice_source` is the name of a class derived from
  `zappend.api.SliceSource` or of a function that creates an instance of
  `zappend.api.SliceSource`. If `slice_source` is given, slices passed to the
  zappend function or CLI command will be interpreted as parameter(s) passed
  to the constructor of the specified class or to the factory function. (#27)
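A hypothetical configuration fragment (the module and class names are invented for illustration):

```yaml
# Fully qualified name of a user-supplied SliceSource class
slice_source: mypackage.slices.MySliceSource
```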
- It is now possible to configure runtime profiling of the zappend processing
  using the new configuration setting `profiling`. (#39)
- Added `--version` option to the CLI. (#42)
- Now using the `sizes` attribute instead of the `dims` attribute of
  `xarray.Dataset` in implementation code. (#25)
- Enhanced documentation, including docstrings of several Python API objects.
- Fixed a problem where the underlying I/O stream of a persistent slice
  dataset was closed immediately after opening the dataset. (#31)
- Now logging ignored encodings on level DEBUG instead of WARNING, because
  they very likely occur when processing NetCDF files.
- Introduced slice factories:
  - Allow passing slice object factories to the `zappend()` function. The
    main use case is to return instances of a custom `zappend.api.SliceSource`
    implemented by users. (#13)
  - The utility functions `to_slice_factories` and `to_slice_factory`,
    exported by `zappend.api`, ease passing inputs specific to a custom
    `SliceSource` or other callables that can produce a slice object. (#22)
- Introduced new configuration flag `persist_mem_slices`. If set, in-memory
  `xr.Dataset` instances will first be persisted to a temporary Zarr, then
  reopened, and then appended to the target dataset. (#11)
- Added initial documentation. (#17)
- Improved readability of generated configuration documentation.
- Using `requirements-dev.txt` for development package dependencies.
- Fixed a problem when passing slices opened from NetCDF files. The error was
  `TypeError: VariableEncoding.__init__() got an unexpected keyword argument
  'chunksizes'`. (#14)
- Fixed a problem where info about closing a slice was logged twice. (#9)
Metadata fixes in `setup.cfg`. No actual code changes.
The initial release.