Block-mapped resample with the help of flox #1848

Merged
aulemahal merged 42 commits into main from resample-map
Oct 10, 2024

@aulemahal aulemahal commented Jul 19, 2024

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
    • This PR fixes #xyz
  • Tests for the changes have been added (for bug fixes / features)
    • (If applicable) Documentation has been added / updated (for bug fixes / features)
  • CHANGELOG.rst has been updated (with summary of main changes)
    • Link to issue (:issue:number) and pull request (:pull:number) has been added

What kind of change does this PR introduce?

Implements resample_map. This function is intended for all da.resample(...).map(...) calls. flox cannot optimize these automatically, so we borrow some of its logic to help. The idea is to apply the resample-map construct to each block in parallel. This is possible by first rechunking the array so that chunk boundaries coincide with resampling period boundaries (flox provides a function for this).
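
The alignment idea can be illustrated without xarray, dask or flox. The following is a minimal standard-library sketch, not xclim's implementation: all function names in it (resample_map here is a toy stand-in, aligned_chunks, blockwise_resample_map, value_range) are hypothetical and chosen only for illustration. It shows that when block edges fall on resampling period boundaries (months here), applying the per-period function block by block gives the same result as applying it to the whole series, while blocks that split a period do not.

```python
from datetime import date, timedelta
from itertools import groupby


def month_key(d):
    """Label of the resampling period (calendar month) a date belongs to."""
    return (d.year, d.month)


def resample_map(times, values, func):
    """Toy stand-in for da.resample(time="MS").map(func) on sorted data."""
    pairs = zip(times, values)
    out = {}
    for key, grp in groupby(pairs, key=lambda tv: month_key(tv[0])):
        out[key] = func([v for _, v in grp])
    return out


def aligned_chunks(times, months_per_chunk):
    """Return (start, stop) index pairs whose edges fall on month starts."""
    starts = [
        i for i, t in enumerate(times)
        if i == 0 or month_key(t) != month_key(times[i - 1])
    ]
    edges = starts[::months_per_chunk] + [len(times)]
    return list(zip(edges[:-1], edges[1:]))


def blockwise_resample_map(times, values, func, months_per_chunk=3):
    """Apply resample_map independently on each period-aligned block."""
    out = {}
    for start, stop in aligned_chunks(times, months_per_chunk):
        # Each block holds only whole months, so no period is split.
        out.update(resample_map(times[start:stop], values[start:stop], func))
    return out


def value_range(vals):
    """Example per-period function: max minus min."""
    return max(vals) - min(vals)


# One non-leap year of daily data with monotonically increasing values.
times = [date(2001, 1, 1) + timedelta(days=i) for i in range(365)]
values = list(range(365))

reference = resample_map(times, values, value_range)
blockwise = blockwise_resample_map(times, values, value_range)

# Fixed 50-day blocks ignore month boundaries, so February is split in two
# and the second partial result overwrites the first.
misaligned = {}
for start in range(0, len(times), 50):
    stop = start + 50
    misaligned.update(resample_map(times[start:stop], values[start:stop], value_range))

print(blockwise == reference)   # aligned blocks reproduce the reference
print(misaligned == reference)  # misaligned blocks do not
```

In the real setting, the blocks would be dask chunks and each block could be processed in parallel. The sketch only demonstrates why the rechunking step has to come first.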

The main improvement should come from the fact that map_blocks hides much of the complexity from dask, so the resulting graph is much lighter. I still need to test the performance more thoroughly. My goal is to add a short text to xclim's docs that highlights when the option is useful and when it is not. The option is activated through set_options.

The current function only works when the output object has the same type as the input. Some functions therefore cannot be wrapped with it yet. The most important code left untouched for now is the missing-value checks, where I think this could help a lot.

Does this PR introduce a breaking change?

It should not. This is completely optional.

Other information:

In progress: I still need to demonstrate the performance gain.

This depends on #1845 because I need all improvements for PC.

@github-actions github-actions bot added the indicators Climate indices and indicators label Jul 19, 2024
@aulemahal aulemahal changed the title Resample map Block-mapped resample-map with the help of flox Jul 19, 2024
Base automatically changed from generic-season to main August 1, 2024 20:53
@aulemahal aulemahal changed the title Block-mapped resample-map with the help of flox Block-mapped resample with the help of flox Sep 6, 2024
@aulemahal aulemahal marked this pull request as ready for review September 6, 2024 18:09
@aulemahal aulemahal mentioned this pull request Sep 18, 2024
8 tasks
@aulemahal aulemahal requested a review from coxipi October 1, 2024 22:07
@coveralls coveralls commented Oct 4, 2024

Coverage Status

coverage: 89.472% (+0.03%) from 89.445% when pulling 8096991 on resample-map into dd37e5a on main.

@Zeitsperre Zeitsperre added this to the v0.53.0 milestone Oct 8, 2024
@coxipi coxipi left a comment

approved!

@aulemahal aulemahal merged commit 438ef2e into main Oct 10, 2024
21 checks passed
@aulemahal aulemahal deleted the resample-map branch October 10, 2024 13:37
Labels
approved Approved for additional tests indicators Climate indices and indicators
4 participants