Enable many complex number tests #54761

MichaelTiemannOSC · 2023-08-25T18:40:11Z

These changes put complex128 on an even footing with float64 and int64 as far as numerical testing is concerned. These changes have been tested against both the Pandas test suite and the Pint-Pandas test suite (using complex magnitudes).

These changes are a simpler version of a previous pull-request that was destroyed by GitHub's fork synchronize behavior.

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Add description of this PR to what's new file. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2023-08-26T08:08:54Z

False alarm: these simplified changes DO work with uncertainties in Pint...when used with the version of Pint that also supports uncertainties. 😊

MichaelTiemannOSC · 2023-08-26T17:46:11Z

Tests with uncertainties all pass (with similar request-related changes). I cannot commit those changes to Pint-Pandas until these changes are accepted. It is very nice that the underlying changes made to Pandas 2.1 really simplify the Pint-Pandas uses of EAs.

@andrewgsavage @hgrecco

Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

xfail complex tests, but otherwise defer to parent object to implement test case. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2023-09-08T13:40:20Z

@mroeschke can you have a look at this and let me know if any additional changes are needed?

mroeschke · 2023-09-08T17:41:58Z

pandas/core/dtypes/astype.py

@@ -100,6 +100,11 @@ def _astype_nansafe(
    elif np.issubdtype(arr.dtype, np.floating) and dtype.kind in "iu":
        return _astype_float_to_int_nansafe(arr, dtype, copy)

+    elif np.issubdtype(arr.dtype, np.complexfloating) and is_object_dtype(dtype):
+        if np.isnan(arr).any():
+            res = np.asarray([np.nan if np.isnan(x) else x for x in arr], dtype)


Could you just do res[np.isnan(arr)] = np.nan?

res has to be set to something first (it's the "result"). But I can look at using copy to decide if we really need something fresh before we do something that may have a side-effect.
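The reviewer's suggestion can be sketched roughly as follows, assuming `res` is first created with `astype(object)`, which allocates a new array so that `arr` is left untouched. The variable names follow the diff; the sample values are made up for illustration, and this is not necessarily the code that ended up in the PR.

```python
import numpy as np

# Sample complex128 data with one NaN entry (illustrative values only).
arr = np.array([1 + 2j, complex(np.nan, 0.0), 3 - 1j], dtype=np.complex128)

# `res` must exist before it can be indexed; astype(object) returns a fresh
# object array, so the masked assignment below cannot modify `arr`.
res = arr.astype(object)

# np.isnan on a complex array is True where the real or imaginary part is NaN.
mask = np.isnan(arr)

# Replace the complex NaN entries with the plain float NaN in one step.
res[mask] = np.nan
```

Compared with the list comprehension in the diff, the mask is computed once on the whole array rather than element by element.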

doc/source/whatsnew/v2.1.0.rst

pandas/tests/extension/base/missing.py

@mroeschke

Attempt to resolve all comments from @mroeschke Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

doc/source/whatsnew/v2.2.0.rst

pandas/tests/extension/base/ops.py

pandas/tests/extension/decimal/test_decimal.py

pandas/tests/extension/test_categorical.py

Replace `request` parameter with `*args, **kwargs` in many places. This allows us to avoid needlessly passing request parameters just to satisfy method signatures. Also remove whatsnew entry as this enhancement to test cases is not really user-visible. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2023-10-10T13:30:02Z

I strongly suspect the mypy failure was due to unrelated issues in CI/CD. In the intervening weeks, things have moved forward. Is it preferred practice to update via merge or rebase to catch up to latest after sitting idle for 3 weeks?

mroeschke · 2023-10-10T15:33:36Z

Merging in main is sufficient. History is squashed when the PR is merged into main.

The _duplicate functions expect complex128 data in ndarrays, not EAs. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

The command `pre-commit run -a` doesn't do all the things that Pandas CI/CD does. This commit fixes two problems found in the mypy section: (1) add comments to ignore some errors in _ensure_data for complex dtypes; (2) fix the type signature of _get_expected_exception to comply with the Liskov substitution principle (LSP). Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Fix indentation errors. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Don't need `*args, **kwargs` changes anymore due to refactoring upstream. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Re-harmonize with upstream. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

It turns out that `data` in `test_sparse.py` never contains complex numbers, and furthermore, that it's quite a lot of extra work to make the sparse tests complex-friendly. So we leave complex number testing out of test_sparse. Contributions welcome, as usual. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Arrow doesn't support complex numbers, so no need to special case tests as if it does. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Restore function accidentally deleted. Intention was to only delete unneeded complex code handling, not the whole function. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

Delete `request` parameter no longer needed for `test_fillna_no_op_returns_copy`. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2024-01-06T03:45:15Z

Thanks to some upstream changes, the changes required to support many complex test cases are greatly simplified. Please let me know your feedback on these changes.

In certain cases where Python throws a TypeError for complex values where it would throw ValueError for real values, transform exception to ValueError. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2024-01-16T17:54:55Z

Can somebody remove the stale tag from this PR?

MichaelTiemannOSC · 2024-01-20T01:36:31Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

I have updated this PR.

Make ruff happy by raising from `exc`. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2024-01-24T03:17:49Z

Note that the only failure now is Numpy Dev (and most likely "not my fault").

jbrockmendel · 2024-01-30T23:05:22Z

pandas/core/arrays/_mixins.py

+            # Don't let Python's handling of `complex` make extra complexity for Pandas
+            if self._ndarray.dtype.kind == "c":
+                raise ValueError(
+                    *(x.replace("real", "complex") for x in exc.args)


why are we tinkering with the exception message?

Maybe this is actually a NumPy bug. When value is an object, NumPy appears to try constructing a complex number by passing the object in as the real component, but that, too, fails, with the message TypeError: must be real number, not object. That message is wrong, because what we really want is to report the failure to construct a complex number. Here's a test case of sorts:

    import numpy as np
    xx = object()
    np.asarray(xx).astype('float64')
    # *** TypeError: float() argument must be a string or a real number, not 'object'
    np.asarray(xx).astype('complex128')
    # *** TypeError: must be real number, not object

That's what I'm trying to fix. So I could file an issue against NumPy, remove the above special case, and Pandas will fix itself when NumPy fixes itself. Is that the best approach?

jbrockmendel · 2024-01-30T23:05:40Z

pandas/core/algorithms.py

+        # error: Item "ExtensionDtype" of "Union[Any, ExtensionDtype]"
+        # has no attribute "itemsize"
+        if values.dtype.itemsize == 16:  # type: ignore[union-attr]
+            # We have support for complex128


not complex64?

I'd be happy to add complex64 to the test suite, but thought it better to do one case (complex128) at a time. I didn't want to add something (values.dtype.itemsize == 8) that wasn't tested by the test suite.

Not a deal-breaker, but I'd prefer to do them both in this pass, or at least leave a comment explaining why one is excluded.
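For reference, the two item sizes discussed here can be checked directly in NumPy. The sketch below also shows a `dtype.kind`-based check that would cover both widths at once; `is_numpy_complex` is a hypothetical helper for illustration (it only accepts NumPy dtypes, not ExtensionDtypes) and is not code from this PR.

```python
import numpy as np


def is_numpy_complex(dtype) -> bool:
    """Hypothetical helper: True for any NumPy complex dtype, regardless of width."""
    return np.dtype(dtype).kind == "c"


# complex64 stores two float32 values (8 bytes); complex128 stores two
# float64 values (16 bytes), which is what `itemsize == 16` selects.
for name in ("complex64", "complex128"):
    dt = np.dtype(name)
    print(name, dt.kind, dt.itemsize)
# prints:
# complex64 c 8
# complex128 c 16
```

Checking `kind == "c"` instead of a specific `itemsize` avoids hard-coding one width, at the cost of also admitting widths that the test suite does not exercise.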

pandas/tests/arithmetic/test_numeric.py

Updated code as per code review suggestions. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

`test_fillna_no_op_returns_copy` now works for all cases in Pandas 3.0.0 without special casing. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

MichaelTiemannOSC · 2024-03-22T12:46:52Z

The new 3.0.0 build rules make these changes happy. Previous build rules across a variety of merge points in the 2.2.x series created problems with inconsistent parameters unrelated to these changes. Perhaps this is now ready to merge?

jbrockmendel · 2024-03-27T22:57:55Z

pandas/core/arrays/_mixins.py

+            # Note: when `self._ndarray.dtype.kind == "c"`, Numpy incorrectly complains
+            # that `must be real number, not ...` when in reality
+            # a complex argument is more likely what's expected
+            raise ValueError(exc.args) from exc


the comment here is helpful, thanks. it suggests to me that we may want to only catch-and-re-raise in a subset of cases? otherwise we'll be re-raising as ValueError more often than we really want?

Previously @mroeschke questioned whether this would ever be a TypeError in the first place; the above assignment was expected to only ever raise a ValueError. So, instead of trying to also handle TypeError in higher-level code, we translate this "impossible" case into the canonical form that Pandas expects. The comment helps the user understand a completely unhelpful error message that comes from Python. Previously I attempted to edit the error message into something more reasonable, but that was challenged. At the end of the day the question is: how much steering do we want to do for this case, versus just letting the exception be raised in the expected way and letting users decipher what was wrong with their code in the first place?
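One way to picture the narrower catch-and-re-raise that the review comment asks about is sketched below. `setitem_complex_as_value_error` is a hypothetical stand-alone helper, not the pandas `_mixins.py` code, and it assumes (as reported earlier in this thread) that NumPy raises TypeError when an arbitrary object is assigned into a complex array.

```python
import numpy as np


def setitem_complex_as_value_error(arr: np.ndarray, key, value) -> None:
    """Hypothetical helper: assign arr[key] = value, translating TypeError
    into ValueError only when the array has a complex dtype."""
    try:
        arr[key] = value
    except TypeError as exc:
        if arr.dtype.kind == "c":
            # Complex case: re-raise in the form callers expect, keeping the
            # original exception chained for debugging.
            raise ValueError(*exc.args) from exc
        # Every other dtype keeps its original TypeError.
        raise
```

With this shape, a float64 array still surfaces the original TypeError, so the translation is limited to the complex case the comment describes.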

Allow all defined numpy complex dtypes to work as well as complex128. Note that by default the test suite only tests the complex128 case, but users can add or alter that default by modifying the test suite source code to test other cases. Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

mroeschke · 2024-10-29T20:22:12Z

Looks like this PR has gone stale, so closing to clear the queue. If interested in contributing, please merge in main and we can reopen.

MichaelTiemannOSC · 2024-10-29T21:04:08Z

Twice I've tried to get this merged, and twice it's gone stale. I thought it was a good idea, and I'm willing to try again, but only if there's enough interest on the Pandas side to make complex128 a first-class test case.

MichaelTiemannOSC added 2 commits August 25, 2023 14:38

Update v2.1.0.rst

e7a285a

MichaelTiemannOSC marked this pull request as ready for review August 25, 2023 18:48

MichaelTiemannOSC marked this pull request as draft August 26, 2023 08:07

MichaelTiemannOSC marked this pull request as ready for review August 26, 2023 08:32

MichaelTiemannOSC added 3 commits August 29, 2023 13:16

Merge branch 'main' into test_numpy_complex2

02719d9

Fix merge error in test_decimal.py

f9bfeb9

Simplify test_fillna_no_op_returns_copy

077213f

Merge remote-tracking branch 'upstream/main' into test_numpy_complex2

9fda0ef

mroeschke reviewed Sep 8, 2023

doc/source/whatsnew/v2.1.0.rst

pandas/tests/extension/base/missing.py

mroeschke requested a review from jbrockmendel September 8, 2023 17:46

mroeschke added the Complex Complex Numbers label Sep 8, 2023

MichaelTiemannOSC added 3 commits September 8, 2023 16:42

changes from review

d25baa2

Merge remote-tracking branch 'upstream/main' into test_numpy_complex2

ad841bf

Merge branch 'main' into test_numpy_complex2

7535374

mroeschke reviewed Sep 22, 2023

doc/source/whatsnew/v2.2.0.rst

pandas/tests/extension/base/ops.py

pandas/tests/extension/decimal/test_decimal.py

pandas/tests/extension/test_categorical.py

MichaelTiemannOSC added 3 commits October 10, 2023 13:03

Merge branch 'main' into test_numpy_complex2

f1139f5

Handle complex128 EA in _ensure_data

19d3127

MichaelTiemannOSC added 8 commits January 6, 2024 12:09

Update test_numpy.py

de56177

Merge branch 'main' into test_numpy_complex2

198a16d

Update ops.py

554a5c3

Update test_decimal.py

6ddb7f7

Update test_arrow.py

040c98b

Update test_arrow.py

3a58f5a

Update test_arrow.py

29aa747

MichaelTiemannOSC marked this pull request as ready for review January 6, 2024 03:44

MichaelTiemannOSC added 2 commits January 9, 2024 15:12

setitem exceptions for complex raise ValueError

5210c8b

Merge branch 'main' into test_numpy_complex2

9f4bea5

MichaelTiemannOSC added 2 commits January 24, 2024 06:33

Merge branch 'main' into test_numpy_complex2

be1f02b

Update _mixins.py

b3edefa

jbrockmendel reviewed Jan 30, 2024

pandas/tests/arithmetic/test_numeric.py

MichaelTiemannOSC added 3 commits January 31, 2024 22:03

Incorporate feedback

89ea60b

Merge branch 'main' into test_numpy_complex2

4dc3bea

Update test_sparse.py

4e273fa

jbrockmendel reviewed Mar 27, 2024

MichaelTiemannOSC added 2 commits March 29, 2024 08:39

Merge branch 'main' into test_numpy_complex2

abfdedb

mroeschke closed this Oct 29, 2024
