Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mechanism to find complex dtype info #433

Closed
honno opened this issue May 16, 2022 · 17 comments
Closed

Mechanism to find complex dtype info #433

honno opened this issue May 16, 2022 · 17 comments
Labels
API extension Adds new functions or objects to the API. topic: Complex Data Types Complex number data types.

Comments

@honno
Copy link
Member

honno commented May 16, 2022

When introducing complex dtypes and updating/introducing related functions (#373), I wonder if we'd want to introduce a cinfo() function ala iinfo()/finfo(). What would its resulting info object look like?

I don't see any array/tensor library which implements exactly cinfo(), although there might be an equivalent I'm missing. For say NumPy, was there just not demand for such a function, and/or such a need was somewhat served with finfo()?

>>> np.finfo(np.complex64)
finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)
>>> assert np.finfo(np.complex64) == np.finfo(np.float32)
@jakirkham
Copy link
Member

Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

@kgryte kgryte added API extension Adds new functions or objects to the API. topic: Complex Data Types Complex number data types. labels May 16, 2022
@honno
Copy link
Member Author

honno commented May 17, 2022

Sorry if this is a bit off topic, but I wonder if instead of having different functions like this, we could have just one function that was type agnostic like tinfo or similar. This would save needing to check type and call the right one when determining things like the range of values or similar.

My impression is this would be awkward as float and complex dtypes have additional information to ints, so if you don't want polymorphic returns from a universal function (previous unique discussions suggest a hard no), you might want a function (or separate functions) dedicated for bounds and bit-size. Personally whenever I've wanted to know bounds/size for code that deals with both ints and floats, I'm often writing additional logic for float scenarios (NaN branches 🙃), so would be checking dtype family anyway.

@jakirkham
Copy link
Member

Here's the results of an integer and floating point type from NumPy (line 1 was the import):

In [2]: np.iinfo(np.int32)
Out[2]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [3]: np.finfo(np.float32)
Out[3]: finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)

Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Anyways it's unclear why having one function would be challenging. Though please feel free to clarify

@honno
Copy link
Member Author

honno commented May 17, 2022

Nearly all of the information is shared other than resolution. In fact resolution could even be specified for integers for simplicity (as it would just be 1).

Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

Note the np.finfo() repr hides a lot of additional info NumPy finds, although most of that we can ignore. smallest_normal however is an attribute that both np.finfo() and xp.finfo() share—is there a nice solution for smallest_normal here?

@jakirkham
Copy link
Member

smallest_normal could also be 1 for integers

We could also make these things None if we prefer that

@jakirkham
Copy link
Member

jakirkham commented May 17, 2022

Leo, those show the same attributes as the example above. All they add is bits (which they both have) and smallest_normal (which we are now discussing).

@leofang
Copy link
Contributor

leofang commented May 18, 2022

OK perhaps I should've linked to the corresponding NumPy pages? My point is the returned attributes could increase and further diverge in the future, just like the status quo in NumPy. I feel this unifying discussion would make our life harder when it comes to extensibility in the future.

btw, to clarify:

Ah resolution=1 seems like a clean solution, although with the spec it's called eps (still feels correct, just not as semantically nice).

eps and resolution have different meanings in NumPy.

@kgryte
Copy link
Contributor

kgryte commented Jun 2, 2022

Not clear to me what info would belong on cinfo. For example, what should eps be for complex numbers? What would min and max be given that complex numbers don't have a natural ordering? Same for smallest_normal?

@leofang
Copy link
Contributor

leofang commented Jun 6, 2022

I'd keep at least bits and eps for cinfo, but definitely exclude min and max because we don't wanna follow NumPy to define some awkward comparison/sort capabilities over complex-valued arrays (IIRC it's discussed long time ago in one of the threads).

@honno
Copy link
Member Author

honno commented Jun 23, 2022

How about we have bound-y attributes per component, i.e. {real/imag}_{min/max}, {real/imag}_eps and {real/imag}_smallest_normal? Or just attributes for both the real and imag components (which should be the same), i.e. component_{min/max}, component_eps and component_smallest_normal.

Both of those options do feel weird, but having an array library tell you the bounds can be pretty useful.

@kgryte
Copy link
Contributor

kgryte commented Jun 23, 2022

Based on feedback in the most recent consortium meeting (2022-06-23), the plan is to open an issue on the NumPy issue tracker to gauge appetite there before moving forward adding cinfo to the specification.

The main argument for adding cinfo to the standard is consistency and thoroughness. However, hard to justify adding to spec atm without having any real-world array library implementations. We'd like to see an array library add a cinfo object first, hence opening an issue on NumPy proposing the addition.

@honno
Copy link
Member Author

honno commented Sep 14, 2022

Think we forgot to delegate someone to do this, so just now wrote an issue at numpy/numpy#22260

MatteoRaso added a commit to MatteoRaso/numpy that referenced this issue Sep 14, 2022
There was some confusion about how finfo works with complex numbers,
leading to an issue being made requesting that numpy either adds a
cinfo function or changes the documentation of finfo. The submitter of
that issue also linked to an issue for a seperate repository, which also
included conversation about the proposed change. (github.com/data-apis/array-api/issues/433)
In both discussions, there was a general concensus that it would be better
to change the documentation to explain how finfo works instead of
creating a cinfo function.

Since this is just a small documentation change, there's no need
to run the normal checks.

[skip ci]
@honno honno changed the title cinfo() for complex dtypes Mechanism to find complex dtype info Sep 19, 2022
@honno
Copy link
Member Author

honno commented Sep 19, 2022

How do folks feel about changing xp.finfo() so it supports complex dtypes? numpy/numpy#22260 (comment) I think makes a sound argument

cinfo() does not sound very useful to me. Complex types are a composite type, composed of two instances of a real type. When I try to think of actual use-cases for cinfo(), I end up wanting the information of the underlying real type. So I think the current behavior of finfo() is reasonable, and makes cinfo() unnecessary. Of course, this behavior of finfo() should be documented.

numpy/numpy#22263 has indeed now added documented complex support in numpy.finfo().

@kgryte
Copy link
Contributor

kgryte commented Sep 19, 2022

@honno I suggest adding your proposal to the next Array API meeting agenda.

@rgommers
Copy link
Member

rgommers commented Oct 7, 2022

@honno is there anything left to do here after gh-484?

@honno
Copy link
Member Author

honno commented Oct 7, 2022

@honno is there anything left to do here after gh-484?

Nope—forgot to link this issue to that merged PR, so I'll close this.

@honno honno closed this as completed Oct 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API extension Adds new functions or objects to the API. topic: Complex Data Types Complex number data types.
Projects
None yet
Development

No branches or pull requests

5 participants