Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dask: Data.where #260

Merged
merged 7 commits into from
Oct 5, 2021
Merged

Conversation

davidhassell
Copy link
Collaborator

@davidhassell davidhassell commented Sep 13, 2021

Includes in cf.Query a bug-fix and new utility method set_condition_units.

@davidhassell davidhassell added the dask Relating to the use of Dask label Sep 30, 2021
Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a rogue line, or a missed parameter from a typo, might have slipped into the new where docstring examples, but otherwise this is all good. The new unit tests are sound and pass for me locally.

cf/data/data.py Outdated Show resolved Hide resolved
cf/data/data.py Outdated Show resolved Hide resolved
shape_x = x.shape
shape_data = data.shape
for n, m in zip(shape_x[::-1], shape_data[::-1]):
if n != m and n != 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially my thought as follows would be premature optimisation, so maybe if anything it could be a tentative TODODASK comment, but could this logic be refactored to avoid the need to calculate the shape of the data in the cases across where x has only size-1 dimensions? In other words, to essentially change the short-circuiting on the if check here to:

Suggested change
if n != m and n != 1:
if n != 1 n != m:

in case x is trivial and data is not, notably if it is large so expensive to get its shape? Would that even be worthwhile or could it e.g. be a niche case with general usage of the where method (not sure right now, thinking aloud here)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we talk about this one on out next call?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidhassell, sure. I'll add it to my (sprawling) questions and thoughts list.

Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks! (There are some new left conflicts to resolve, but please go ahead and merge after they have been addressed.)

@davidhassell davidhassell merged commit cff3d79 into NCAS-CMS:lama-to-dask Oct 5, 2021
@davidhassell davidhassell deleted the dask-where branch November 15, 2022 09:11
@davidhassell davidhassell added this to the 3.14.0 milestone Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dask Relating to the use of Dask
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants