Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why iris.cube.Cube.collapsed requires weights with full shape? #3707

Closed
TomekTrzeciak opened this issue May 14, 2020 · 9 comments · Fixed by #3943
Closed

Why iris.cube.Cube.collapsed requires weights with full shape? #3707

TomekTrzeciak opened this issue May 14, 2020 · 9 comments · Fixed by #3943

Comments

@TomekTrzeciak
Copy link

TomekTrzeciak commented May 14, 2020

We are looking at ways to reduce memory footprint of our iris code in https://github.com/metoppv/improver and are somewhat stuck on calculating weighted averages with iris.cube.Cube.collapsed. It looks like Iris insists on weights having the same shape as the collapsed cube even though the underlying numpy.ma.average in iris.analysis.MEAN can happily work with 1D array of weights. Why this restriction?

@TomekTrzeciak TomekTrzeciak changed the title Why iris.cube.Cube.collapse requires weights with full shape? Why iris.cube.Cube.collapsed requires weights with full shape? May 14, 2020
@rcomer
Copy link
Member

rcomer commented May 19, 2020

I can't answer the specific question, but if you use iris.util.broadcast_to_shape, it claims to make the new array out of multiple views of the input array. Does this help with the memory footprint issue?

@TomekTrzeciak
Copy link
Author

We already use iris.util.broadcast_to_shape, the problem was some intermediate step that destroyed that array view and made it full size (it's difficult to control this sort of thing).

Still, using 1D weights seems like a common case and it would be nice to allow it. Would the following change be acceptable?

diff --git a/lib/iris/cube.py b/lib/iris/cube.py
index fec68c57..f1fe313b 100644
--- a/lib/iris/cube.py
+++ b/lib/iris/cube.py
@@ -3957,8 +3957,9 @@ bound=(1994-12-01 00:00:00, 1998-12-01 00:00:00)
             unrolled_data = np.transpose(self.data, dims).reshape(new_shape)
 
             # Perform the same operation on the weights if applicable
-            if kwargs.get("weights") is not None:
-                weights = kwargs["weights"].view()
+            weights = kwargs.get("weights")
+            if weights is not None and weights.ndim > 1:
+                weights = weights.view()
                 kwargs["weights"] = np.transpose(weights, dims).reshape(
                     new_shape
                 )

@TomekTrzeciak
Copy link
Author

Bump. This is a simple change and it would be useful to have it. Any thoughts on this?

@pp-mo
Copy link
Member

pp-mo commented Jan 4, 2021

If we're going to get this into Iris 3.0 we need to move quick !
To make this into a viable PR we should add a testcase and a whatsnew description.
I will try and get something done about this today.

@pp-mo
Copy link
Member

pp-mo commented Jan 4, 2021

The scope of this proposed change is still rather limited, and doesn't do all that we might.

It allows us to pass 1D weights matching a single "collapse dimension".
From the code, that dimension can come from any single cube dimension (not just the last),
or a flattening of multiple cube dims -- in which case supplying a 1D array would be a bit odd (and we can't ensure that the dim ordering is correct).

It still can't broadcast any multi-dimensional weights, even if the dims match by normal array rules
-- e.g. cube is (100, 3, 4), collapse over last 2 dims, with weights of shape (3,4) or (12,).

So, ideally, I feel we should be able to pass weights matching the shape of the collapsed-dims (not just the whole cube), or the flattened equivalent.
E.G. if cube has dims x/T/y with sizes (3, 100, 4), we should be able to do ...

  • `cube.collapsed(['x', 'y'], ... weights=np.ones((3, 4)))', or
  • `cube.collapsed(['y', 'x'], ... weights=np.ones((4, 3)))', or
  • `cube.collapsed(['x', 'y'], ... weights=np.ones(12))'
    (obviously, not all ones, but weights of those shapes)

I assume this is more or less why you are using broadcast_to_shape
Does this make some sense to you, and is it worth considering such extra cases ?
(but maybe in another PR)

@TomekTrzeciak
Copy link
Author

So, ideally, I feel we should be able to pass weights matching the shape of the collapsed-dims (not just the whole cube), or the flattened equivalent.
E.G. if cube has dims x/T/y with sizes (3, 100, 4), we should be able to do ...

  • `cube.collapsed(['x', 'y'], ... weights=np.ones((3, 4)))', or
  • `cube.collapsed(['y', 'x'], ... weights=np.ones((4, 3)))', or
  • `cube.collapsed(['x', 'y'], ... weights=np.ones(12))'
    (obviously, not all ones, but weights of those shapes)

I assume this is more or less why you are using broadcast_to_shape
Does this make some sense to you, and is it worth considering such extra cases ?

It does make sense (with question mark about the flattened array case), but these cases are also less common and not directly supported at the numpy layer (would require broadcast_to_shape internally).

(but maybe in another PR)

Might as well in the interest of getting this in quickly.

@pp-mo
Copy link
Member

pp-mo commented Jan 4, 2021

How does #3943 look ?

@TomekTrzeciak
Copy link
Author

How does #3943 look ?

Looks all good to me 👍

@rcomer
Copy link
Member

rcomer commented Apr 14, 2021

This is linked to #3943, which was merged. So I believe it should be closed.

@rcomer rcomer closed this as completed Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants