Support modifications of a read file in an external overlay #676
Labels
category: proposal
proposed enhancements or new features
priority: medium
non-critical problem and/or affecting only a small set of users
In neurophysiology, metadata is often not static. Sometimes the experiment description, related publications, or data annotations need to be changed after the file is written. Currently, users can open the file in HDMF in append mode and add containers. Users can also open the file in read mode, modify any part of a container in memory, and export the modified container to a new file on disk. However, simple changes to existing metadata cannot be made in append mode and require rewriting the file, which can be expensive (e.g., a 10 GB file must be rewritten even though only one attribute needs to change). Metadata changes can be made directly with h5py, but doing so is hacky.
Data archives would also like to be able to support versioning in a lean way, where making small metadata changes would not require maintaining a complete copy of a file with each change.
The HDF Group is planning to add support for similar changes and versioning in HDF5, but it will likely be a while before this feature is widely supported.
The NWB and DANDI teams have brainstormed several approaches, one of which is to maintain a human-readable, sidecar JSON file with the same name as the NWB file but with a .json suffix that contains the sequence of changes to be made to the data after reading it. The original NWB file would stay intact.
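To make the sidecar idea concrete, here is a minimal sketch of how an ordered list of changes could be applied after reading. Everything about the format is an assumption for illustration: the `changes` key, the `op` / `path` / `value` fields, and the `apply_overlay` helper are hypothetical and not part of HDMF, PyNWB, or any agreed-upon spec. A plain nested dict stands in for the metadata read from the NWB file.

```python
import copy
import json


def apply_overlay(metadata, overlay):
    """Return a copy of `metadata` with the overlay's changes applied in order.

    Hypothetical helper: the overlay layout used here is an assumption,
    not an existing HDMF or NWB format.
    """
    result = copy.deepcopy(metadata)  # the data as read stays untouched
    for change in overlay["changes"]:
        # Split "/general/experiment_description" into parent keys and a leaf.
        *parents, leaf = change["path"].strip("/").split("/")
        node = result
        for key in parents:
            node = node[key]
        if change["op"] == "set":
            node[leaf] = change["value"]
        elif change["op"] == "delete":
            del node[leaf]
        else:
            raise ValueError(f"unsupported op: {change['op']!r}")
    return result


# Stand-in for metadata read from the original NWB file.
metadata = {
    "general": {
        "experiment_description": "pilot recording",
        "related_publications": [],
        "notes": "draft",
    }
}

# Contents of the hypothetical sidecar JSON file.
sidecar_text = """
{
  "changes": [
    {"op": "set", "path": "/general/experiment_description",
     "value": "revised description"},
    {"op": "set", "path": "/general/related_publications",
     "value": ["doi:10.0000/example"]},
    {"op": "delete", "path": "/general/notes"}
  ]
}
"""

overlay = json.loads(sidecar_text)
updated = apply_overlay(metadata, overlay)
```

Because the changes are stored as an ordered sequence and applied to a copy, the original file never has to be rewritten, and an archive could version the small JSON file instead of the full data file.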
Other options should be explored too.