Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LAMA to Dask: managing default(-argument) Data instance #378

Closed
sadielbartholomew opened this issue Apr 11, 2022 · 9 comments · Fixed by #551
Closed

LAMA to Dask: managing default(-argument) Data instance #378

sadielbartholomew opened this issue Apr 11, 2022 · 9 comments · Fixed by #551
Assignees
Labels
dask Relating to the use of Dask
Milestone

Comments

@sadielbartholomew
Copy link
Member

(Opening as a general issue since __init__ isn't listed under our table of methods in #295, so I am not sure of the migration status of Data initialisation.) Whilst I was trying to create diverse cases of instances of Data (in the context of #376 (comment)), I noticed cf.Data() initialised as-is, i.e. with no arguments, which is seemingly documented as being valid (initial argument array=None with array: optional stated), runs into an error:

>>> cf.Data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sadie/cf-python/cf/data/data.py", line 964, in __repr__
    return super().__repr__().replace("<", "<CF ", 1)
  File "/home/sadie/anaconda3/envs/cf-env/lib/python3.8/site-packages/cfdm/data/data.py", line 214, in __repr__
    shape = self.shape
  File "/home/sadie/cf-python/cf/data/data.py", line 158, in wrapper
    return method(*args, **kwargs)
  File "/home/sadie/cf-python/cf/data/data.py", line 6288, in shape
    dx = self._get_dask()
  File "/home/sadie/cf-python/cf/data/data.py", line 1387, in _get_dask
    return self._custom["dask"]
KeyError: 'dask'

The fix is probably a quite simple adaptation, though I don't have time to look into it now; once I have finished reviewing my backlog of Dask migration PRs I will sort this, if you haven't already tackled it @davidhassell.

@sadielbartholomew sadielbartholomew added the dask Relating to the use of Dask label Apr 11, 2022
@davidhassell
Copy link
Collaborator

Hi Sadie, I think I'd rather make d = cf.Data(array=None, source=None) raise an exception. Let's add this to the list of items to talk about.

@sadielbartholomew
Copy link
Member Author

I think I'd rather make d = cf.Data(array=None, source=None) raise an exception. Let's add this to the list of items to talk about.

Sure, that's actually that's fine by me without the need for any debate! So I guess this issue becomes that of raising an appropriate exception rather than the error it presently runs into, as above. I'll change the title of this Issue to reflect that.

@sadielbartholomew sadielbartholomew changed the title LAMA to Dask: can't create default(-argument) Data instance LAMA to Dask: managing default(-argument) Data instance Apr 12, 2022
@sadielbartholomew
Copy link
Member Author

sadielbartholomew commented Nov 8, 2022

@davidhassell this Issue, which boils down now to:

make d = cf.Data(array=None, source=None) raise an exception

should be added to the list of tasks to do towards #182, since we agreed on applying that behaviour (see comments above!) and forgot about it, I guess! It's pretty trivial to implement, and concerns Data which we have largely moved on from, so probably best get it done sooner rather than later.

Assuming you agree with the above, do you want to tackle this, or shall I?

(FYI the underlying problem raised in the initial comment here is still true on our current lama-to-dask branch, assuming you call some operation on the cf.Data() e.g. run cf.Data().max().)

@davidhassell
Copy link
Collaborator

Thanks - All yours!

@davidhassell
Copy link
Collaborator

Fixed by #491

@davidhassell
Copy link
Collaborator

Hmm, this change causes a unit test to fail (test_docstring). Reopening for further thought ...

@davidhassell
Copy link
Collaborator

The new PR #551 changes tack, in that it allows a no-arg initialisation (Data()) but makes sure that the "dask" key in the _custom dictionary is inintialised to None, thereby allowing basic inspection of the instance.

This makes the Data class consistent with all other classes, i.e. all classes allow no-arg initialisations, and also allows the test_docstring.py to work.

@sadielbartholomew
Copy link
Member Author

This new approach is fine by me, assuming:

This makes the Data class consistent with all other classes, i.e. all classes allow no-arg initialisations, and also allows the test_docstring.py to work.

is the true motivation to change tack and not just that it is a convenient change in order to bypass the breaking of the docstring rewriting on the initial one! Which is definitely the case, I am sure... 😄

(Also it does seem theoretically nice to allow the no-argument initialisation.)

Thanks for putting up the associated PR. I'll review it in a moment.

@davidhassell
Copy link
Collaborator

We have settled, for now on letting all classes have a no-arg init, so closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dask Relating to the use of Dask
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants