ZeroDivisionError for large arrays in rank_histogram because block_size is zero #334

Closed
raspstephan opened this issue Jun 28, 2021 · 3 comments

Comments

@raspstephan

For large arrays xs.rank_histogram will throw a ZeroDivisionError because block_size becomes zero.

/anaconda/envs/nwp-downscale/lib/python3.8/site-packages/xhistogram/core.py in _determine_block_chunks(bin_indices, block_size)
     64             block_size = min(_MAX_CHUNK_SIZE // N, M)
     65     assert isinstance(block_size, int)
---> 66     num_chunks = M // block_size
     67     block_chunks = num_chunks * (block_size,)
     68     residual = M % block_size

ZeroDivisionError: integer division or modulo by zero

This is because block_size = min(_MAX_CHUNK_SIZE // N, M) returns zero for arrays that have more elements N than _MAX_CHUNK_SIZE, which is set to 10_000_000. I am not entirely sure what block_size does, so I don't know whether setting it to 1 in this case would be a problem.
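To make the arithmetic concrete, here is a minimal sketch in plain Python that reproduces only the expression shown in the traceback. The function names `auto_block_size`, `clamped_block_size` and `num_chunks` are hypothetical helpers written for this illustration and are not part of xhistogram. The clamp to 1 is the idea floated above, not a confirmed fix.

```python
# Value quoted in this issue; the heuristic below mirrors line 64 of the traceback.
_MAX_CHUNK_SIZE = 10_000_000


def auto_block_size(M, N, max_chunk_size=_MAX_CHUNK_SIZE):
    """Reproduce the heuristic min(_MAX_CHUNK_SIZE // N, M)."""
    return min(max_chunk_size // N, M)


def clamped_block_size(M, N, max_chunk_size=_MAX_CHUNK_SIZE):
    """Hypothetical variant: never let the block size fall below 1."""
    return max(1, auto_block_size(M, N, max_chunk_size))


def num_chunks(M, block_size):
    """Mirror line 66 of the traceback: M // block_size."""
    return M // block_size


# N below _MAX_CHUNK_SIZE: the floor division is positive, so all is well.
small = auto_block_size(M=100, N=1_000_000)

# N above _MAX_CHUNK_SIZE: the floor division gives 0, so block_size is 0.
large = auto_block_size(M=100, N=20_000_000)

try:
    num_chunks(100, large)
    failed = False
except ZeroDivisionError:
    failed = True

print(small, large, failed, clamped_block_size(100, 20_000_000))
```

With the clamp, the division on line 66 no longer fails, but whether a block size of 1 is acceptable for performance or memory is exactly the open question here.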

Here is the array and code that causes the problem:

image

@raspstephan
Author

Note that chunking the data in dask before calling the function solves the problem, which is great, but the underlying problem still exists.

image

@dougiesquire
Collaborator

Thanks @raspstephan for the issue report and workaround! The problem comes from the package xhistogram, which is used under the hood by xskillscore. There is an open issue about it here: xgcm/xhistogram#16.

This bug should get fixed at the xhistogram end in the near future, as we're revisiting the block_size logic. In the meantime, your workaround is probably the best way forward.

@raspstephan
Author

Ah perfect, thanks for pointing me towards this. You can close this if you want :)
