-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dask: Data.__init__
with no arguments
#551
dask: Data.__init__
with no arguments
#551
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes themselves are fine and good, and with this PR merged creation of a plain cf.Data()
works and is tested to do so, but I am concerned about side-effects of the creation of such an object. For example from seeing the list of methods available on a case via a __dir__()
call, and trying out some, I hit trivial errors such as those illustrated below. Those are often the kinds of errors you might expect to happen, conceptually, on an array-less instance, or following from making:
the "dask" key in the _custom dictionary is inintialised to None, thereby allowing basic inspection of the instance.
Ultimately I am not really sure whether allowing this (Data.__init__
with no arguments) is a good idea after all, because really if we allow it we should go through all of the possible methods for Data
and ensure they run or (probably for most) error gracefully with that case, and that could be pretty time-consuming. But with the alternative I implemented in #491 we have the complication of the docstring rewriting testing, so either way there is work to be done!
In light of that, what do you think we should do from here? Could we add some fancy logic, perhaps, to stop methods being called on a Data
instance created with no data? Or (less preferred, in my eyes) warn users upon initialisation of data-less Data
that many or most methods won't work properly?
Method call examples
>>> d = cf.Data()
>>> d.ndim
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 4357, in ndim
return dx.ndim
AttributeError: 'NoneType' object has no attribute 'ndim'
>>> d.ndim()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 4357, in ndim
return dx.ndim
AttributeError: 'NoneType' object has no attribute 'ndim'
>>> d.size()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 4427, in size
size = dx.size
AttributeError: 'NoneType' object has no attribute 'size'
>>> d.max()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cfdm/cfdm/decorators.py", line 42, in inplace_wrapper
self.INPLACE_ENABLED_PLACEHOLDER = self.copy()
File "/home/sadie/cfdm/cfdm/data/data.py", line 1520, in copy
return super().copy(array=array)
File "/home/sadie/cfdm/cfdm/core/data/data.py", line 322, in copy
return type(self)(source=self, copy=True, _use_array=array)
File "/home/sadie/cf-python/cf/data/data.py", line 361, in __init__
self._set_dask(array, copy=copy, conform=False)
File "/home/sadie/cf-python/cf/data/data.py", line 1282, in _set_dask
array = array.copy()
AttributeError: 'NoneType' object has no attribute 'copy'
>>> d.get_filenames()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 5474, in get_filenames
dsk = collections_to_dsk((self.to_dask_array(),), optimize_graph=True)
File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/dask/base.py", line 368, in collections_to_dsk
dsk, keys = _extract_graph_and_keys(val)
File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/dask/base.py", line 394, in _extract_graph_and_keys
graphs.append(v.__dask_graph__())
AttributeError: 'NoneType' object has no attribute '__dask_graph__'
>>> d.is_masked()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sadie/cf-python/cf/data/data.py", line 4281, in is_masked
out_ind = tuple(range(dx.ndim))
AttributeError: 'NoneType' object has no attribute 'ndim'
Hi Sadie, OK, how about 59a6e3d, which produces: >>> import cf
>>> d = cf.Data()
>>> d.ndim
Traceback (most recent call last):
...
ValueError: Data object has no data
>>> d.get_filenames()
Traceback (most recent call last):
...
ValueError: Data object has no data
>>> |
Ah, perfect! Achieves what I hoped for as best solution, namely:
without any of the nuisance I anticipated! Great idea. I'm just re-running the test suite and then I can approve. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good now (feedback addressed). Ideally we can add one or two self.assertRaises(ValueError)
checks to the test to ensure the behaviour implemented by 59a6e3d is coming through to the methods (which I have tested manually), but regardless please merge when ready.
Thanks, @sadielbartholomew - I've add a test (8e343bb) ... and merging. |
(Re-) Fixes #378. See that issue for details.