With NumPy 2.0, `text/plain` + float formatting results in `SphinxWarning` #610

bryanwweber · 2024-06-16T17:38:37Z

Describe the bug

context
When I do glue a NumPy float into a page with the {glue:text}`variable:<float format>`, I get a SphinxWarning:

sphinx.errors.SphinxWarning: <filename>:257:Failed to format text/plain data: could not convert string to float: 'np.float64(0.9643970836113095)' [mystnb.glue]

expectation
I expected no warning to occur.

problem
This is a problem for people doing formatting floating point NumPy objects because if they turn warnings to errors, their pages won't build.

The cause is described here: https://numpy.org/devdocs/release/2.0.0-notes.html#representation-of-numpy-scalars-changed

Basically, the repr of NumPy types changed to include the NumPy type. As such, the IPython display formatter outputs the literal 'numpy.float64(1.2345)' which obviously can't be formatted as a float. I'm not sure if this should be filed here or with IPython... I landed on here because it affects the glue functionality in particular and because it seems like the display formatter for NumPy types in IPython is probably doing the right thing.

The immediate workaround is as suggested in the NumPy release notes, setting np.set_printoptions(legacy="1.25") fixes the warning.

Reproduce the bug

Install NumPy 2.0

Glue and print a variable:

```{code-cell}
from myst_nb import glue
import numpy as np
a = np.float64(1.234)
glue("var-a", a)
```
{glue:text}:`var-a:.2F`

Build a book/the page and see the warning/error if -W is on.

List your environment

❯ jupyter-book --version
Jupyter Book      : 1.0.0
External ToC      : 1.0.1
MyST-Parser       : 2.0.0
MyST-NB           : 1.1.0
Sphinx Book Theme : 1.1.3
Jupyter-Cache     : 1.0.0
NbClient          : 0.10.0

and most importantly, NumPy >=2.0

The text was updated successfully, but these errors were encountered:

bryanwweber · 2024-06-16T17:48:39Z

Also want to recognize that as of this comment, NumPy 2.0 was released approximately 5 hours ago, so I'm way ahead of the curve. Happy to help contribute a fix as needed!

flying-sheep · 2024-06-27T12:24:35Z

Also want to recognize that as of this comment, NumPy 2.0 was released approximately 5 hours ago, so I'm way ahead of the curve. Happy to help contribute a fix as needed!

numpy has pre-releases, so it’s very possible to catch and fix issues with it before the official release 😛

agoose77 · 2024-06-27T12:28:47Z

I don't think this is something that we should try to solve. glue(name, object) is assuming that we can use eval for the text/plain formatting of floats. This is true for built-in floats, but no-longer true (without importing NumPy into the execution context) for NumPy floats.

I don't think we should special-case NumPy here, instead the user should either set the print options, or store the glue object as a Python float. Glue should be given something that's serialisable.

I'm not 100% firm on this personally; I can see merit in a less cumbersome UX. But, with my numeric-hat on; NumPy floats are not Python floats, and this distinction matters.

bryanwweber · 2024-06-27T20:50:49Z

But, with my numeric-hat on; NumPy floats are not Python floats, and this distinction matters.

Agreed in all the contexts where we care what types are being used! But, in this specific case of text formatting for prose, I'd argue the distinction doesn't matter; as a user, I simply want something that I'm using as a number to show up as a specific format.

I don't think we should special-case NumPy here, instead the user should either set the print options, or store the glue object as a Python float... I can see merit in a less cumbersome UX.

I understand that special-casing NumPy here is quite fraught. Where does it stop, should we also do this for SciPy objects? etc. I personally think NumPy floats are common enough that having a special case will be a huge benefit to avoid questions - and especially inscrutable errors/warnings! That said, although I think a fix is warranted, I don't whether it belongs here or in IPython or even in NumPy. Ideally, NumPy objects would know how to format themselves; on the other hand, glue is using the text/plain mimetype here, which should also be displayed in the REPL where clearly indicating numpy.float is important. Adding a new mimetype might be an option, but then that would have to go into IPython and support added here in MyST-NB, and it would be even more confusing that the current situation. Another option would be to avoid serializing to text before actually pasting, but that's a total reversal of how the gluing system works now where all the mimetypes available for an object are stored as metadata on a cell.

The only reasonable solution I can see in the short term is to do something like (this is pseudo-code):

def glue(...):
    try:
        import numpy
    except ImportError:
        numpy = None
    if numpy is not None and isinstance(variable, numpy.float):
        variable = float(variable)
    mimebundle, metadata = IPython.core.formatters.format_display_data(variable)
    ...

but then that takes away options from end users as far as how they want to do the conversion.

So, after all that, I've convinced myself that the status quo is the best option, with the exception that I think this behavior should be documented (I had to find it in the NumPy release notes!). I'll write up a PR for a doc improvement, probably over the weekend or early next week if no one beats me to it 😄

agoose77 · 2024-06-27T22:53:23Z

Yes, I went back and forth on this too. Ultimately, doing any kind of float conversion is "magic" and changes what is actually "stored". We should do a docs pass, I think.

flying-sheep · 2024-06-28T07:55:07Z

glue:text seems to be fundamentally flawed if ends up taking a round-trip through repr before applying formatting specifiers.

If you want to serialize things for later, you need to use a serialization format, not repr, which is documented to “often” be able to work as one, but not reliably:

Return a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval();

The only standard Python serialization format is pickle, which would mean that you’d have to have the same packages installed in the environment that unpickles the glued values as there were when pickling them.

Alternatively you could make the serializer/deserializer configurable, and provide a default that can handle basic Python types and throws an error when it can’t handle something (e.g. json.dumps/loads).

agoose77 · 2024-06-28T09:50:38Z

@flying-sheep it would seem to me that the intention of glue has been muddied. At its core, it provides a way to name MIME-bundles. The glue:text functionality is an addition which implies interpretation of the contents, i.e. glue:text assumes the content is a number if you pass in the proper format specifiers. To do that properly, one should perform the MIME-bundle generation at document-rendering time, rather than at execution time. But that would introduce peculiar bugs; the execution environment would no longer have any control over the representation of the object, etc., and moreover, it would presume a specific kernel type.

For now, I think the best path is what we have. Long-term, things like the {eval} role are a better way to do this - shifting the specific formatting work to the kernel.

bryanwweber · 2024-06-28T19:54:16Z

I was just looking at the formatting code:

MyST-NB/myst_nb/core/variables.py

Lines 202 to 218 in 6ce30cd

    
           def format_plain_text(text: str, fmt_spec: str) -> str: 
        
               """Format plain text for display in a docutils node.""" 
        
               # literal eval, to remove surrounding quotes 
        
               try: 
        
                   value = literal_eval(text) 
        
               except (SyntaxError, ValueError): 
        
                   value = text 
        
               if fmt_spec == "": 
        
                   return str(value) 
        
               type_char = fmt_spec[-1] 
        
               if type_char == "s": 
        
                   value = str(value) 
        
               elif type_char in ("b", "c", "d", "o", "x", "X"): 
        
                   value = int(value) 
        
               elif type_char in ("e", "E", "f", "F", "g", "G", "n", "%"): 
        
                   value = float(value) 
        
               return format(value, fmt_spec)

I wonder if it makes sense to check whether the string numpy or np exists in the text and extract the float from it? Something like:

def format_plain_text(text: str, fmt_spec: str) -> str:
    ...
    elif type_char in ("e", "E", "f", "F", "g", "G", "n", "%"):
        import re
        is_np = re.match(text, r"(np)|(numpy)\.float\d+\((\d+\.\d*)\))
        if is_np is not None:
             value = is_np.group(1)
        value = float(value)
    return format(value, fmt_spec)

Other than the problem with special-casing NumPy, the only problems I see here are numpy not imported as np or numpy and that the precision in the repr is limited. This would at least resolve most of the most common cases, I expect...

bryanwweber added the bug Something isn't working label Jun 16, 2024

flying-sheep mentioned this issue Jun 27, 2024

BUG: Get eval results from metadata #597

Open

agoose77 closed this as completed Jun 27, 2024

bryanwweber mentioned this issue Jul 8, 2024

Update glueing docs for NumPy >=2.0 #615

Merged

bsipocz added a commit to bryanwweber/MyST-NB that referenced this issue Sep 19, 2024

Merge branch 'master' into update-docs-executablebooksgh-610

521b0f4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

With NumPy 2.0, `text/plain` + float formatting results in `SphinxWarning` #610

With NumPy 2.0, `text/plain` + float formatting results in `SphinxWarning` #610

bryanwweber commented Jun 16, 2024 •

edited

Loading

bryanwweber commented Jun 16, 2024

flying-sheep commented Jun 27, 2024

agoose77 commented Jun 27, 2024

bryanwweber commented Jun 27, 2024

agoose77 commented Jun 27, 2024

flying-sheep commented Jun 28, 2024 •

edited

Loading

agoose77 commented Jun 28, 2024

bryanwweber commented Jun 28, 2024

With NumPy 2.0, text/plain + float formatting results in SphinxWarning #610

With NumPy 2.0, text/plain + float formatting results in SphinxWarning #610

Comments

bryanwweber commented Jun 16, 2024 • edited Loading

Describe the bug

Reproduce the bug

List your environment

bryanwweber commented Jun 16, 2024

flying-sheep commented Jun 27, 2024

agoose77 commented Jun 27, 2024

bryanwweber commented Jun 27, 2024

agoose77 commented Jun 27, 2024

flying-sheep commented Jun 28, 2024 • edited Loading

agoose77 commented Jun 28, 2024

bryanwweber commented Jun 28, 2024

With NumPy 2.0, `text/plain` + float formatting results in `SphinxWarning` #610

With NumPy 2.0, `text/plain` + float formatting results in `SphinxWarning` #610

bryanwweber commented Jun 16, 2024 •

edited

Loading

flying-sheep commented Jun 28, 2024 •

edited

Loading