naturalsize np.int32 multiplication overflow #217

Toprak2 · 2024-11-05T19:01:52Z

What did you do?

I was using the torchio library, which relies on humanize to return the memory size of image arrays. The images I processed had dimensions 512x512x166, with each pixel being a 32-bit (4-byte) integer.

What did you expect to happen?

torchio calls the naturalsize function with the occupied memory size in bytes and sets binary=True. Manually calculating the expected value:
512×512×166×4÷(1024×1024)=166 MiB

So, I expected the function to return approximately 166 MiB.

What actually happened?

Instead, the returned value was -2 MiB, accompanied by this warning:

RuntimeWarning: overflow encountered in scalar multiply
ret: str = format % ((base * bytes_ / unit)) + s

This overflow occurs because the input to the function was of type np.int32 instead of Python's native int. Since np.int32 has a maximum value of 2^31−1, the multiplication of base and bytes_ results in an overflow.

Steps to Reproduce

import humanize
import numpy as np

print(humanize.naturalsize(512*512*166*4, binary=True))  
# Expected: 166.0 MiB
# Works as expected with Python’s built-in int type

print(humanize.naturalsize(np.int32(512*512*166*4), binary=True))
# Returns: -2.0 MiB
# RuntimeWarning: overflow encountered in scalar multiply ret: str = format % ((base * bytes_ / unit)) + s

Proposed Solutions

Change the Order of Operations
Adjusting the order of operations can avoid overflow. In the current line:

ret: str = format % ((base * bytes_ / unit)) + s

when bytes_ is np.int32, multiplying base and bytes_ produces an np.int32 result, which overflows before it’s divided by unit. By dividing either base or bytes_ by unit before the multiplication, each sub-operation remains a float:

ret: str = format % ((base * (bytes_ / unit))) + s
# or
ret: str = format % ((base / unit * bytes_)) + s

Convert Input to Float
Alternatively, cast value to float without checking its type. Currently, the casting applies only if value is a string:

# Current approach
if isinstance(value, str):
    bytes_ = float(value)
else:
    bytes_ = value

Updating it to cast all inputs to float could resolve the issue:

bytes_ = float(value)

I haven’t created a pull request since I’m unsure if you’d prefer developers to ensure input compatibility or handle this within the function.
Environment

OS: Windows 11
Python: 3.12.1
Humanize: 4.11.0
Numpy: 1.26.3

The text was updated successfully, but these errors were encountered:

hugovk · 2024-11-06T14:44:48Z

Thank you for the well explained report!

As you suggest, we could say that np.int32 inputs are not supported and the caller should make sure they're using an int, float or str, as documented, but the suggested fixes appear quite straightforward. I'll reserve the right to revert if it causes problems in the future :)

Let's go for option 1, because option 2 fails this test:

    def test_naturalsize(test_args: list[int] | list[int | bool], expected: str) -> None:
>       assert humanize.naturalsize(*test_args) == expected
E       AssertionError: assert '1000.0 ZB' == '1.0 YB'
E
E         - 1.0 YB
E         + 1000.0 ZB

Would you like to open a PR?

Toprak2 · 2024-11-06T16:19:52Z

Thank you for reviewing the issue quickly.

I’ll proceed with implementing option 1 and will create a PR soon. If there are any additional tests you’d like me to add, please let me know.

hugovk · 2024-11-06T16:23:51Z

Yeah, I was considering if we should add a test using NumPy, but np.int32 isn't really a supported input, and installing NumPy can be tricky during the Python pre-release phase when there's no wheel, so maybe not.

hugovk added the bug Something isn't working label Nov 6, 2024

hugovk changed the title ~~naturalsize np.int32 Multiplication Overflow~~ naturalsize np.int32 multiplication overflow Nov 6, 2024

Toprak2 mentioned this issue Nov 6, 2024

Fix overflow issue in naturalsize with np.int32 input #218

Merged

hugovk closed this as completed in #218 Nov 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

naturalsize np.int32 multiplication overflow #217

naturalsize np.int32 multiplication overflow #217

Toprak2 commented Nov 5, 2024

hugovk commented Nov 6, 2024

Toprak2 commented Nov 6, 2024

hugovk commented Nov 6, 2024

naturalsize np.int32 multiplication overflow #217

naturalsize np.int32 multiplication overflow #217

Comments

Toprak2 commented Nov 5, 2024

What did you do?

What did you expect to happen?

What actually happened?

Steps to Reproduce

Proposed Solutions

hugovk commented Nov 6, 2024

Toprak2 commented Nov 6, 2024

hugovk commented Nov 6, 2024