Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use numpy.can_cast instead of casting and checking #7834

Closed
wants to merge 3 commits into from

Conversation

mx-moth
Copy link
Contributor

@mx-moth mx-moth commented May 11, 2023

In numpy >= 1.24 unsafe casting raises a RuntimeWarning for an operation that xarray does often to check if casting is safe. numpy.can_cast looks like an alternative approach designed for this exact case.

  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

In numpy >= 1.24 unsafe casting raises a RuntimeWarning for an operation
that xarray does often to check if casting is safe. `numpy.can_cast`
looks like an alternative approach designed for this exact case.
@mx-moth
Copy link
Contributor Author

mx-moth commented May 11, 2023

Unsure if new tests need to be added, as the intention is that no behaviour changes except for the lack of warnings from numpy.

@mx-moth
Copy link
Contributor Author

mx-moth commented May 11, 2023

Using latest xarray and numpy >= 1.24.0, the following code generates a warning. This function is called when saving datasets to disk using dataset.to_netcdf:

import numpy as np
from xarray.coding.times import cast_to_int_if_safe
array = np.array([1., 2., np.nan])
cast_to_int_if_safe(array)
$HOME/projects/xarray/xarray/coding/times.py:619: RuntimeWarning: invalid value encountered in cast
int_num = np.asarray(num, dtype=np.int64)
array([ 1.,  2., nan])

@kmuehlbauer
Copy link
Contributor

@mx-moth Yes, this casting should be fixed.

I'm adding a bit of context here, as this might need to be solved in combination with #7098 and #7827. #7098 removes undefined casting for decoding. In #7827 there are efforts to do this for encoding, too.

As cast_to_int_if_safe is called for encoding as well as decoding I'm not sure if all cases have been catched by these two PR.

One issue on decoding is that at least for datetime64 based times the calculated time_deltas are currently converted to float64 in the presence of NaT (although NaT can perfectly be expressed as int64). It would be great if you could try your PR on top of #7827 (which includes #7098) to see if that fixes the errors in this PR.

@mx-moth
Copy link
Contributor Author

mx-moth commented May 11, 2023 via email

@mx-moth
Copy link
Contributor Author

mx-moth commented Jun 26, 2023

Closing this PR, the approach I tried was incorrect. I've opened #7942 to track the issue instead.

@mx-moth mx-moth closed this Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants