-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data.func
and friends: migrate to Dask
#300
Data.func
and friends: migrate to Dask
#300
Conversation
Data.func
: migrate to Dask
Data.func
: migrate to DaskData.func
and friends: migrate to Dask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Sadie, not much to say here, as it all looks good. I agree with you analysis/context/plan/proposal (#300 (comment)). I can confirm that the tests all pass, and have noted the TODOs in them to be sorted out when other functionality comes on line.
Thanks - please merge when you're ready.
Thanks for the feedback, @davidhassell. I've deprecated the 'out' keyword as suggested, and also added a minimal unit test for |
Convert the
Data.func
method towards #182, in doing so daskifying several other element-wise operation methods that use it to perform the underlying operation, as follows (though note the below context since in short most of these can and will be then converted to use Dask built-ins rather than callingfunc
):sin
,arcsin
,sinh
,arcsin
etc.);ceil
;exp
;floor
;rint
;round
;trunc
;and daskifying
log
except in the case of setting a base that isn'te
,2
or10
which requires which requires__itruediv__
to be daskified to work.Context and plan/proposal for follow-on work
As well as daskifying
func
this PR effectively daskifies the other methods listed above in doing so because they usefunc
to operate in the code at present, hence I have removed the test-skipping decorators from the unit tests for, and added thedaskified
marker to, each of the methods (exceptlog
where a comment has been made to note the nearly-daskified state), however such methods are more efficiently daskified by conversion to using the existing Dask built-in equivalents instead of callingfunc
this way.So the plan is to use follow-on PRs to convert from use of
func
to the appropriate built-in in each method. It turns out there are Dask built-ins for every one listed above available for us to use directly.A small complication is in the case of the trig. and hyperbolic methods with a restricted domain (4 of the 12, e.g.
arcsin
which is undefined outside of[-1, 1]
), since we support apreserve_invalid
keyword which preserves anyNaN
orinf
like values rather than masking them as important for the use context of cf-python. In order to still support that for methods we think should have it as a keyword flag, we should use the daskifiedfunc
rather than the Dask built-in, as it already implements that keyword. Overall, therefore, my proposal is as follows, to be applied in a series of new PRs:For data methods that apply a trivial operation in an element-by-element manner (see list above), use the following:
preserve_invalid
option that NumPy does not support in any trivial way.func
to apply the operation.