Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] apply_gufunc #119

Closed
wants to merge 3 commits into from
Closed

Conversation

TomNicholas
Copy link
Member

WIP - all I've really done so far is copy across the relevant dask code from dask/array/gufunc.py. It's really just copied verbatim, but not importable as it's not factored out into smaller functions in dask.

Comments:

  • Does copying code verbatim present a licensing issue? Is there a better approach?
  • We might want an asarray function
  • Do we want to copy dask's whole meta thing?
  • blockwise will need concatenate kwarg
  • dask has a dependency on toolz that I haven't added here yet
  • I don't yet understand what the last part of dask.array.apply_gufunc is doing, the highlevel graph construction. I need to look through that more closely.

@tomwhite
Copy link
Member

tomwhite commented Sep 3, 2022

Awesome - thanks for starting on this @TomNicholas!

  • Does copying code verbatim present a licensing issue? Is there a better approach?

I don't think it necessarily is an issue, but we should provide attribution, much like Dask does for NumPy for example: https://github.com/dask/dask/blob/cf7781bcf2c7b478b701fd848e78d51c5a9c6e8f/dask/array/core.py#L4071-L4073.

Longer-term it would be interesting to see if it's possible to share code, perhaps by taking advantage of Dask's HLG work - but I'm not very familiar with that.

  • We might want an asarray function

There's one in https://github.com/tomwhite/cubed/blob/main/cubed/array_api/creation_functions.py

  • Do we want to copy dask's whole meta thing?

There's no meta in Cubed, and so far it's been OK to pass dtype, shape, chunks etc around directly. Would that work here?

  • blockwise will need concatenate kwarg

map_blocks in Dask uses concatenate=True, but Cubed doesn't so maybe follow the same approach if possible. Also,
IIRC concatenate can cause memory issues (e.g. in Dask matmul it was removed), so another reason to avoid if we can.

  • dask has a dependency on toolz that I haven't added here yet

I think we use toolz in other places already, so that's not a problem.

  • I don't yet understand what the last part of dask.array.apply_gufunc is doing, the highlevel graph construction. I need to look through that more closely.

It might be helpful to see how it was implemented before HLGs were introduced, just to help get some understanding.

@tomwhite tomwhite linked an issue Sep 12, 2022 that may be closed by this pull request
@tomwhite tomwhite mentioned this pull request Oct 31, 2022
@tomwhite
Copy link
Member

It would be useful to add support for gufuncs for the Pangeo examples. Is this PR something you would like to continue working on @TomNicholas? I could look at it in the new year if not.

@TomNicholas
Copy link
Member Author

TomNicholas commented Jan 8, 2023 via email

@tomwhite
Copy link
Member

tomwhite commented Feb 4, 2023

I plan to work on this soon.

@tomwhite
Copy link
Member

I'm going to close this now, as this was superseded by #149. Thanks for starting this @TomNicholas - it really helped me with the work in #149.

@tomwhite tomwhite closed this Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting xarray.apply_ufunc
2 participants