Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimeSeries can be written without 'data' dataset #1220

Open
5 tasks done
rly opened this issue Apr 10, 2020 · 8 comments · Fixed by #1274
Open
5 tasks done

TimeSeries can be written without 'data' dataset #1220

rly opened this issue Apr 10, 2020 · 8 comments · Fixed by #1274
Assignees
Labels
category: bug errors in the code or code behavior

Comments

@rly
Copy link
Contributor

rly commented Apr 10, 2020

Description

The TimeSeries/data dataset is required in the NWB 2.0+ schema. However, in PyNWB, you can create a TimeSeries without data. This results in a validation error. This has come up before but I forget where. Credit to @yarikoptic for identifying the issue again.

Steps to Reproduce

from pynwb import NWBFile, TimeSeries, NWBHDF5IO, validate
from datetime import datetime


ts = TimeSeries(name='ts_name',
                unit='ts_unit',
                timestamps=[0., 1., 2.])

nwbfile = NWBFile(session_description='ts',
                  identifier='i',
                  session_start_time=datetime.now().astimezone())

nwbfile.add_acquisition(ts)

filename = 'nwbfile.nwb'

with NWBHDF5IO(filename, 'w') as io:
    io.write(nwbfile)

with NWBHDF5IO(filename, 'r') as io:
    errors = validate(io)
    print(errors)
    nwbfile = io.read()

Environment

Python Executable: Python
Python Version: Python 3.7
Operating System: Windows
PyNWB Version: latest 1.3.0
HDMF Version: latest dev

Checklist

  • Have you ensured the feature or change was not already reported?
  • Have you included a brief and descriptive title?
  • Have you included a clear description of the problem you are trying to solve?
  • Have you included a minimal code snippet that reproduces the issue you are encountering?
  • Have you checked our Contributing document?
@rly rly added the category: bug errors in the code or code behavior label Apr 10, 2020
@rly rly self-assigned this Apr 10, 2020
@rly
Copy link
Contributor Author

rly commented Apr 10, 2020

Also worth noting is that in the NWB schema, the type ImageSeries, which extends TimeSeries, allows an empty 'data' dataset ('quantity: ?') since the data may be stored as an external image file. So ImageSeries is an exceptional case where the child type is allowed to omit a required dataset of a parent type.

@rly
Copy link
Contributor Author

rly commented Apr 10, 2020

This raises the question: should the API be changed so that TimeSeries/data is required or should the schema be changed so that TimeSeries/data is optional? @oruebel @ajtritt @bendichter thoughts?

@ajtritt
Copy link
Member

ajtritt commented Apr 10, 2020

If this behavior must be changed, I would prefer making TimeSeries.data required in the constructor, unless there is broad consensus that time-series data can legitimately be time-series data there without data.

If you can always use the validator to determine if a dataset is fully schema compliant, I don’t see why this is a problem? Does the mere existence of non schema compliant data create a problem?

@rly
Copy link
Contributor Author

rly commented Apr 15, 2020

@ajtritt Sorry, I wasn't clear. I think making TimeSeries.data required in the constructor makes the most sense. The minor minor issue is that since ImageSeries.data makes data optional, 1) we would be allowing subtypes to remove required fields from parent types and 2) inheritance in PyNWB becomes a little wonky because in ImageSeries.__init__, I can't pass data=None to super().__init__.

@ajtritt
Copy link
Member

ajtritt commented Apr 15, 2020

TimeSeries.data should be required. ImageSeries.__init__ should pass data=list() if external_file has been specified.

@yarikoptic
Copy link
Contributor

I think making TimeSeries.data required in the constructor...

wouldn't this break backward compatibility (in code and ability to load previously saved .nwb files)?

@ajtritt
Copy link
Member

ajtritt commented Apr 15, 2020

wouldn't this break backward compatibility

Yup, but we'll bump the version accordingly. We can modify the TimeSeries ObjectMapper to fill in a default value when reading a data-less TimeSeries.

@rly
Copy link
Contributor Author

rly commented Jul 10, 2020

Related: TimeSeries can also be written without the TimeSeries.unit field, which is required. We should make unit required in the constructor. As with TimeSeries.data, we will modify the TimeSeries ObjectMapper to fill in a default value when reading a unit-less TimeSeries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: bug errors in the code or code behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants