Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid in-place multiplication of a large value to an array with small integer dtype #8867

Merged
merged 7 commits into from
Mar 29, 2024

Conversation

Illviljan
Copy link
Contributor

@Illviljan Illviljan commented Mar 22, 2024

Upstream numpy has become a bit more particular with which types you can use for inplace operations. This PR fixes

_______________ TestImshow.test_imshow_rgb_values_in_valid_range _______________

self = <xarray.tests.test_plot.TestImshow object at 0x7f88320c2780>

    def test_imshow_rgb_values_in_valid_range(self) -> None:
        da = DataArray(np.arange(75, dtype="uint8").reshape((5, 5, 3)))
        _, ax = plt.subplots()
>       out = da.plot.imshow(ax=ax).get_array()

/home/runner/work/xarray/xarray/xarray/tests/test_plot.py:2034: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/runner/work/xarray/xarray/xarray/plot/accessor.py:421: in imshow
    return dataarray_plot.imshow(self._da, *args, **kwargs)
/home/runner/work/xarray/xarray/xarray/plot/dataarray_plot.py:1601: in newplotfunc
    primitive = plotfunc(
/home/runner/work/xarray/xarray/xarray/plot/dataarray_plot.py:1853: in imshow
    alpha *= 255
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = masked_array(
  data=[[[1],
         [1],
         [1],
         [1],
         [1]],

        [[1],
         [1],
    ...,
         [1],
         [1],
         [1],
         [1]]],
  mask=False,
  fill_value=np.int64(999999),
  dtype=uint8)
other = 255

    def __imul__(self, other):
        """
        Multiply self by other in-place.
    
        """
        m = getmask(other)
        if self._mask is nomask:
            if m is not nomask and m.any():
                self._mask = make_mask_none(self.shape, self.dtype)
                self._mask += m
        elif m is not nomask:
            self._mask += m
        other_data = getdata(other)
        other_data = np.where(self._mask, other_data.dtype.type(1), other_data)
>       self._data.__imul__(other_data)
E       numpy._core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'multiply' output from dtype('int64') to dtype('uint8') with casting rule 'same_kind'

/home/runner/micromamba/envs/xarray-tests/lib/python3.12/site-packages/numpy/ma/core.py:4415: UFuncTypeError

Some curious behaviors seen while debugging:

alpha = np.array([1], dtype=np.int8)
alpha *= 255
repr(alpha) # 'array([-1], dtype=int8)'

alpha = np.array([1], dtype=np.int16)
alpha *= 255
repr(alpha) # 'array([255], dtype=int16)'

xref: #8844

@Illviljan Illviljan added the run-upstream Run upstream CI label Mar 22, 2024
@dopplershift
Copy link
Contributor

Why is in-place multiplication a problem here?

@Illviljan
Copy link
Contributor Author

Illviljan commented Mar 22, 2024

Quick response for a draft PR! My idea 5 minutes ago is that

alpha = np.array(1, dtype=np.int8) 
alpha *= 255

will create a number larger than int8, which is strange. So I think we should create a new array instead with whatever numpy thinks is the best dtype.

@Illviljan Illviljan changed the title Avoid inplace multiplication Avoid in-place multiplication of a large value to an array with small integer dtype Mar 22, 2024
@Illviljan Illviljan marked this pull request as ready for review March 26, 2024 20:42
@@ -1848,9 +1848,10 @@ def _center_pixels(x):
# missing data transparent. We therefore add an alpha channel if
# there isn't one, and set it to transparent where data is masked.
if z.shape[-1] == 3:
alpha = np.ma.ones(z.shape[:2] + (1,), dtype=z.dtype)
safe_dtype = np.promote_types(z.dtype, np.uint8)
Copy link
Contributor Author

@Illviljan Illviljan Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dtype should allow at least 0 and 255. But it will be converted in np.ma.concatenate at some point anyway so I thought it's best to just figure it out early and initialize alpha correctly.

@dcherian
Copy link
Contributor

Thanks for figuring this out!

@dcherian dcherian merged commit afce18f into pydata:main Mar 29, 2024
27 of 33 checks passed
dcherian added a commit to dcherian/xarray that referenced this pull request Apr 2, 2024
* main: (26 commits)
  [pre-commit.ci] pre-commit autoupdate (pydata#8900)
  Bump the actions group with 1 update (pydata#8896)
  New empty whatsnew entry (pydata#8899)
  Update reference to 'Weighted quantile estimators' (pydata#8898)
  2024.03.0: Add whats-new (pydata#8891)
  Add typing to test_groupby.py (pydata#8890)
  Avoid in-place multiplication of a large value to an array with small integer dtype (pydata#8867)
  Check for aligned chunks when writing to existing variables (pydata#8459)
  Add dt.date to plottable types (pydata#8873)
  Optimize writes to existing Zarr stores. (pydata#8875)
  Allow multidimensional variable with same name as dim when constructing dataset via coords (pydata#8886)
  Don't allow overwriting indexes with region writes (pydata#8877)
  Migrate datatree.py module into xarray.core. (pydata#8789)
  warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend (pydata#8874)
  groupby: Dispatch quantile to flox. (pydata#8720)
  Opt out of auto creating index variables (pydata#8711)
  Update docs on view / copies (pydata#8744)
  Handle .oindex and .vindex for the PandasMultiIndexingAdapter and PandasIndexingAdapter (pydata#8869)
  numpy 2.0 copy-keyword and trapz vs trapezoid (pydata#8865)
  upstream-dev CI: Fix interp and cumtrapz (pydata#8861)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants