-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate Data.transpose
from LAMA to Dask
#247
Migrate Data.transpose
from LAMA to Dask
#247
Conversation
Of relevance to this PR and to keep an eye on going forward: data-apis/array-api#228 |
HI @davidhassell, just 'bumping' this in case it fell off your radar. It would be good to get all outstanding |
cf/data/data.py
Outdated
@@ -12142,21 +12142,14 @@ def transpose(self, axes=None, inplace=False, i=False): | |||
) | |||
# --- End: if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ditch all of this checking, as dask duplicates it, and replace it with a try ... except ValueError ...
around the actual dask transpose operation?
In [17]: from dask import array as da
In [18]: x = da.from_array([[2, 1]])
In [19]: x.transpose([2, 0])
...
ValueError: ('Unknown dimension', {2})
In [20]: x.transpose([1, 0, -1])
...
ValueError: axes don't match array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thinking, I didn't consider that. I imagine so, let me try it out and see...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK now I am implementing it, I see we use iaxes
for the new d._axes
value calculation, so we do need (want) the if/else
(as it also short circuits for a trivial case which is good generally) but can cut all but the first line from the else
(this might be what you were implying to being with). Unless, that is, there is a clever way to avoid calculating iaxes
or the equivalent...
I'm about to push a new commit but if you have thoughts or a new idea relating to this, just let me know and we can tweak it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I see. Given our need to also permute the _axes
attribute, I can't see a way other than keeping the existing code. No problem
cf/test/test_Data.py
Outdated
@@ -859,6 +859,7 @@ def test_Data_squeeze_insert_dimension(self): | |||
def test_Data___getitem__(self): | |||
if self.test_only and inspect.stack()[0][3] not in self.test_only: | |||
return | |||
# TODODASK - why is there no testing here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know! We'll certainly get some in when __getitem__
is daskified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... which has now happened (#257)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot. I'll remove this line. (Given the time since I put up this PR I probably should have done a quick scan to check for anything that could have been updated before tagging you, sorry 🙂)
Co-authored-by: David Hassell <davidhassell@users.noreply.github.com>
Co-authored-by: David Hassell <davidhassell@users.noreply.github.com>
39eff85
to
dd19fa3
Compare
As well as adding a few new commits to address the feedback, I have rebased onto |
... hence the need to force-push. |
Trying via open-close to re-trigger the CI jobs to indicate that the relevant test module passes... |
Thanks for your feedback @davidhassell, I believe I have addressed everything now so let me know if any adjustments are required or desired. |
Hi @davidhassell, since you are about to go on leave thought I might bump this in case you have a chance to confirm, is it ready to be merged? Next week I will return somewhat to cf-python daskifying work, FYI, though naturally mostly focusing on ES-DOC machine and performance. |
Co-authored-by: David Hassell <davidhassell@users.noreply.github.com>
Thanks for the nudge. I've made a minor style suggestion, but otherwise good to go! |
Thanks for all of the carefully-considered feedback @davidhassell, I have corrected the typo you pointed out in a final commit so will now merge. |
Towards #182, 'daskify' the
Data.transpose
method. This one is fairly trivial (I believe) since there is a Dask intrinsic method totranspose
.Note I had to remove the loop and context management over
chunksize in self.chunk_sizes
for the corresponding test, but otherwise I have not touched that test, which now passes, as indicated by the removal of theunittest.skipIf
decorator on the test which now passes locally and the CI will indicate if it passes across the job OS and version matrix.