
ASDF handling of numpy scalar values (and scalar precision) #1519

Open
braingram opened this issue Apr 11, 2023 · 4 comments

@braingram
Contributor

Based on discussions in recent PRs:
#1474 (comment)
#1475 (comment)

ASDF (and the ASDF standard) should likely clarify how to handle scalar values.

As a short example of what should be clarified: suppose a user adds a scalar (non-array) 'numpy.uint8' to the tree, then saves, then loads the tree. Should the loaded value be a 'numpy.uint8' or an int?
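A minimal illustration of the ambiguity, using numpy directly rather than asdf itself: the dtype carried by a numpy scalar is lost once the value is converted to a built-in int, so a loader cannot recover it from the serialized value alone.

```python
import numpy as np

# A numpy scalar carries dtype information that a built-in int does not.
value = np.uint8(200)
assert value.dtype == np.uint8

# Serializing a tree scalar to YAML typically goes through a plain
# Python int, discarding the dtype.
as_builtin = int(value)
assert type(as_builtin) is int

# Nothing in the plain integer records that it started as a uint8,
# so a loader cannot restore the original type without extra metadata.
```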

To summarize the discussion so far, based on contributions by @perrygreenfield:

What should the standard require?

Perhaps support for "The greatest common types for integers, floats and complex values across standard platforms... int64, float64 and complex128". With larger types possibly supported with associated metadata.

What is required for round-tripping?

"Up to the library", so for python:

  • require permission to write out larger than standard values
  • raise a warning when the library doesn't support the size and return a special object
  • require round-tripping the special objects
  • raise an exception if the standard requirement is not met
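One of these options (raising an exception when the standard requirement is not met) can be sketched as follows; `check_scalar` and its error message are hypothetical, not part of the asdf API:

```python
import numpy as np

# Bounds of the largest standard integer type.
INT64_MIN = np.iinfo(np.int64).min
INT64_MAX = np.iinfo(np.int64).max

def check_scalar(value):
    """Raise if an integer scalar cannot be stored as int64."""
    if isinstance(value, (int, np.integer)):
        if not INT64_MIN <= int(value) <= INT64_MAX:
            raise ValueError(
                f"{value!r} does not fit in int64; "
                "store it as an ndarray instead"
            )
    return value
```

For example, `check_scalar(np.uint64(2**63))` would raise, while `check_scalar(5)` passes the value through unchanged.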

How should the library handle numpy scalars that fit within the standard scalar types?

The library will return the largest (standard) type, which may violate round-tripping. Any user that expects a 'numpy.uint8' to stay a 'numpy.uint8' (and not be promoted to an int) should convert the value separately.
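For example, a user relying on the original dtype would re-cast after loading; a small sketch, assuming the promoted value comes back as a plain int:

```python
import numpy as np

loaded = 200                  # a numpy.uint8 promoted to int on load
restored = np.uint8(loaded)   # the explicit conversion is the user's job
assert restored.dtype == np.uint8
```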

@braingram
Contributor Author

Some guidance is already provided in the standard for integers:
https://asdf-standard.readthedocs.io/en/1.0.3/known_limits.html#literal-integer-values-in-the-tree

One thought, which I think is in partial agreement with what @perrygreenfield described, would be to say that scalars in the tree can be:

  • int64
  • float64
  • complex128

By default, ASDF (the python implementation) will convert any numpy scalar that fits into these types (violating round-tripping; this means numpy.uint8 will become int). For any value outside this range, I'm thinking we should throw an error mentioning the limitation and directing the user to a page with more information and a description of how to save the scalar as an ndarray (which will round-trip without the same scalar issues).

We could add an option to disable the automatic conversion from numpy scalar to builtin type (and instead throw an error).

@braingram
Contributor Author

braingram commented Jul 27, 2023

Copying from: #1605 (comment)

"""
Now that light has been shone on the issue of numpy scalars, I think ASDF should work towards serializing numpy scalars differently than the built-in python scalars. NEP 41 is working towards more complicated dtypes (which would apply to scalars too). One of the main motivations for this effort is to encode units into the dtype itself (for performance). Looking ahead to this suggests that ASDF might want to start considering how to encode the dtype for scalars.
"""

@WilliamJamieson
Contributor

WilliamJamieson commented Jul 27, 2023

Linking this to asdf-format/asdf-standard#396, which is a proposal to add a scalar type to the standard.

@perrygreenfield
Contributor

perrygreenfield commented Jul 27, 2023

I wonder if this is another instance of trying to do too much with ASDF, similar to the argument I made about tuples previously. What sense is there in storing float16 scalars in an ASDF file? I am leaning towards thinking that if people want to deal with these in a consistent way, it is for them to address it specifically rather than add things like dtypes to scalar floats. Can we just assume they are doubles and leave it at that?
