
Optimize rolling dask #311

Merged
merged 13 commits into master from optimize-rolling-dask
Dec 5, 2019

Conversation

aulemahal
Collaborator

@aulemahal aulemahal commented Nov 26, 2019

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • HISTORY.rst has been updated

After merging to master and before closing the branch:

  • bumpversion (minor / major / patch) has been called on master branch
  • Tags have been pushed
  • What kind of change does this PR introduce?
    Creates a new util function rolling to optimize rolling sums and means over dask arrays.
    Sum and mean are implemented first, since they both rely on the same numpy.cumsum trick; other rolling operations (min, max, std) could be implemented later on.

The function does the computation lazily, which xarray's rolling() didn't. It falls back to xarray's version if the data is not in a dask array, since it relies on a dask-specific method: dask.array.Array.map_overlap().

This function is faster than xarray's, even for datasets with only one chunk along the rolled dimension, and it avoids MemoryErrors on huge datasets.

I did some Jedi tricks to get the same behaviour as xarray's version when NaNs are encountered. Without the second np.cumsum, computation time drops by 30-50%, but NaNs would then be treated as zeros. A minimal sketch of the cumsum approach is included after this checklist.

  • Does this PR introduce a breaking change?
    No

  • Other information:
    I called the function xc.utils.rolling, but unlike xr.DataArray.rolling, it actually performs the sum or mean itself. Same name, different behaviour.

@aulemahal aulemahal added the enhancement New feature or request label Nov 26, 2019
@aulemahal aulemahal self-assigned this Nov 26, 2019
@coveralls

coveralls commented Nov 26, 2019

Pull Request Test Coverage Report for Build 1318

  • 52 of 52 (100.0%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 91.304%

Totals Coverage Status
Change from base Build 1316: 0.2%
Covered Lines: 6384
Relevant Lines: 6992

💛 - Coveralls

@aulemahal
Collaborator Author

So, it finally wasn't that hard. I added a function that takes care of rolling operations other than sum and mean. It acts the same way as conventional rolling: it creates a new dimension and copies* the data so that the rolling op can be applied along that new dim. But it still does this chunk-wise and lazily, so it doesn't run into a MemoryError on huge datasets.

  • I believe it returns a view of the values, not an actual copy.

Also, this can accept any function, not only the numpy ones.

@huard
Collaborator

huard commented Nov 26, 2019

Question: Could this somehow be ported to xarray?

@aulemahal
Collaborator Author

aulemahal commented Nov 27, 2019

Yes and no, as far as I understand xarray's code. The whole "rolling" module avoids any explicit call to dask, and I don't really see an elegant way to add this code. Also, xarray spends some time on overhead that I simply skip; what exactly that overhead does, and whether it is useful, I don't know.
During this exploration, I did find that you can do ds.rolling(time=5).mean(allow_lazy=True), which saves some memory at compute time if you then select a specific slice further along. However, there is still something I don't understand, because it uses 20-40% more memory than my version and is slower.
So it would be possible, but I don't see a way that would make an acceptable PR.

@tlogan2000
Collaborator

This might indirectly be a solution to #184, where I could never completely understand the snowballing memory problems. I will try to see if this helps once it is merged to master.

@tlogan2000
Collaborator

tlogan2000 commented Nov 27, 2019

Question: Could this somehow be ported to xarray?

I think we could at least create an issue on the xarray GitHub describing the problems we are experiencing with long time series and present @aulemahal's solution / workaround as being more efficient. They might decide to find a way to integrate it themselves?

@huard
Collaborator

huard commented Nov 27, 2019

I've read about similar memory issues happening when bottleneck is not installed. Could you confirm this is not the issue here?

@huard
Collaborator

huard commented Nov 27, 2019

See pydata/xarray#3332. It might be a place where you could describe what you've done.

@tlogan2000
Collaborator

Messing around with various reduction options, I see that using 'std' results in very small diffs between xarray's rolling and utils.rolling(); not sure why...

    std_nd_xr = ds_nd.pr.rolling(time=5).std()
    std_nd_xc = utils.rolling(
        ds_nd.pr, window=5, dim="time", mode="std", keep_attrs=False
    )
    std_dask = utils.rolling(ds_dask.pr, window=5, dim="time", mode="std")

    xr.testing.assert_identical(std_nd_xr, std_nd_xc)  # fails; assert_allclose works
    xr.testing.assert_allclose(std_dask, std_nd_xr)

@tlogan2000
Collaborator

tlogan2000 commented Nov 27, 2019

It might be prudent to include a chunked version of tests for functions that rely on rolling (or indirectly on runlength) in the indicator test suite?

E.g. tests on .nc files in test_precip.py, test_temperature.py, etc. currently do not chunk the test datasets before calculating indices.

Example from test_precip

class TestMaxNDay:
    # testing of wet_day and daily_pr_intensity, both are related

    nc_file = os.path.join(TESTS_DATA, "NRCANdaily", "nrcan_canada_daily_pr_1990.nc")

    def test_3d_data_with_nans(self):
        # test with 3d data
        pr = xr.open_dataset(self.nc_file).pr
        
        ##### potential added line
        pr_dask = xr.open_dataset(self.nc_file, chunks=dict(time=10)).pr

        prMM = xr.open_dataset(self.nc_file).pr
        prMM.values *= 86400.0
        prMM.attrs["units"] = "mm/day"
        # put a nan somewhere
        prMM.values[10, 1, 0] = np.nan
        pr.values[10, 1, 0] = np.nan
        wind = 3
        out1 = atmos.max_n_day_precipitation_amount(pr, window=wind, freq="MS")
        out2 = atmos.max_n_day_precipitation_amount(prMM, window=wind, freq="MS")
       
        ##### potential added line
        out3 = atmos.max_n_day_precipitation_amount(pr_dask, window=wind, freq="MS")
       
        # ... continued assertion tests ...

Maybe this is overkill, as we already test rolling generally in test_utils.py... Opinions?

@huard
Collaborator

huard commented Nov 27, 2019

I think you're right, but I would like this to be parametrized. We don't want to duplicate all tests; we need to use pytest fixtures to parametrize the test input so that each test runs with both chunked and un-chunked datasets. This goes back to #52 ... I suggest we raise the priority on this.

@aulemahal
Collaborator Author

Messing around with various reduction options, I see that using 'std' results in very small diffs between xarray's rolling and utils.rolling(); not sure why...

@tlogan2000 I wasn't able to replicate your problem... Are you sure ds_nd in your example is not a dask array? If it is, then rolling() defaults to my method, which gives slightly different results.

@huard I confirm bottleneck is installed in my environment. Also, AFAIK bottleneck is not used by xarray when rolling on a dask array.

I'll modify the tests with parametrization in mind.

@aulemahal
Collaborator Author

So, I realized (thanks @tlogan2000) that my solution was float-dependent, which isn't good. I fixed it, but it makes me wonder whether other edge cases are not covered in my code, so I decided to make the function private. That way, if xarray ever fixes the problem, or we find a more convenient solution, removing it won't be a breaking change.
For now, I believe the current solution is better than, or similar to, doing:

    r = da.rolling(**{dim: win})  # `dim` and `win`: the rolled dimension name and window size
    a = r.construct('roll').reduce(np.func, dim='roll', allow_lazy=True)  # np.func: the reduction to apply
    a = a.where(r._counts() >= win)  # mask NaNs and incomplete windows at the edges

where the last line takes care of NaN values and edges. Going through construct -> reduce saves some overhead and uses less memory than calling rolling().reduce() directly.

These changes were really useful for my own script generating indices for the Ouranos website, but I would understand if this isn't appropriate for xclim, as it's more of an xarray problem.

@aulemahal
Collaborator Author

Oh! And I added tests following recommendations from #52.

Collaborator

@tlogan2000 tlogan2000 left a comment


I have done quite a bit of testing with this now and I think it seems to work quite well for dealing with the memory issues we were having.

@aulemahal I did some preliminary benchmarking comparing this to xr.rolling() and xr.rolling(allow_lazy=True), and it seems to be more efficient. However, I think it might be worthwhile to do a more thorough benchmarking of all xclim functions requiring .rolling() in order to get a fuller picture of the benefits in terms of speed / memory use. If the overall benefit is marginal compared to xr.rolling(allow_lazy=True), then we will have to consider whether it is worthwhile to maintain a custom utility long term.

@tlogan2000
Collaborator

I see the py3.5 build is failing, but we will be dropping 3.5 shortly anyhow. Black and flake8 failed as well, but those should be an easy fix.

@aulemahal
Collaborator Author

So, I fixed that black thing.
The failing build was due to an implementation bug in Python 3.5 (https://bugs.python.org/issue9232); I fixed that too.

@aulemahal
Collaborator Author

I suggest we merge this for now.
We should definitely complete #316 before v1.0, but this is our best shot for the moment.

@aulemahal aulemahal merged commit eff131e into master Dec 5, 2019
@aulemahal aulemahal deleted the optimize-rolling-dask branch December 5, 2019 22:48
aulemahal added a commit that referenced this pull request Dec 6, 2019
@aulemahal aulemahal restored the optimize-rolling-dask branch January 6, 2020 21:54
@aulemahal aulemahal deleted the optimize-rolling-dask branch January 6, 2020 21:55
aulemahal added a commit that referenced this pull request Jan 6, 2020
@aulemahal aulemahal mentioned this pull request Jan 6, 2020
huard added a commit that referenced this pull request Jan 7, 2020