You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I was using the torchio library, which relies on humanize to return the memory size of image arrays. The images I processed had dimensions 512x512x166, with each pixel being a 32-bit (4-byte) integer.
What did you expect to happen?
torchio calls the naturalsize function with the occupied memory size in bytes and sets binary=True. Manually calculating the expected value:
512×512×166×4÷(1024×1024)=166 MiB
So, I expected the function to return approximately 166 MiB.
What actually happened?
Instead, the returned value was -2 MiB, accompanied by this warning:
RuntimeWarning: overflow encountered in scalar multiply
ret: str = format % ((base * bytes_ / unit)) + s
This overflow occurs because the input to the function was of type np.int32 instead of Python's native int. Since np.int32 has a maximum value of 2^31−1, the multiplication of base and bytes_ results in an overflow.
Steps to Reproduce
importhumanizeimportnumpyasnpprint(humanize.naturalsize(512*512*166*4, binary=True))
# Expected: 166.0 MiB# Works as expected with Python’s built-in int typeprint(humanize.naturalsize(np.int32(512*512*166*4), binary=True))
# Returns: -2.0 MiB# RuntimeWarning: overflow encountered in scalar multiply ret: str = format % ((base * bytes_ / unit)) + s
Proposed Solutions
Change the Order of Operations
Adjusting the order of operations can avoid overflow. In the current line:
ret: str=format% ((base*bytes_/unit)) +s
when bytes_ is np.int32, multiplying base and bytes_ produces an np.int32 result, which overflows before it’s divided by unit. By dividing either base or bytes_ by unit before the multiplication, each sub-operation remains a float:
Convert Input to Float
Alternatively, cast value to float without checking its type. Currently, the casting applies only if value is a string:
# Current approachifisinstance(value, str):
bytes_=float(value)
else:
bytes_=value
Updating it to cast all inputs to float could resolve the issue:
bytes_=float(value)
I haven’t created a pull request since I’m unsure if you’d prefer developers to ensure input compatibility or handle this within the function.
Environment
OS: Windows 11
Python: 3.12.1
Humanize: 4.11.0
Numpy: 1.26.3
The text was updated successfully, but these errors were encountered:
As you suggest, we could say that np.int32 inputs are not supported and the caller should make sure they're using an int, float or str, as documented, but the suggested fixes appear quite straightforward. I'll reserve the right to revert if it causes problems in the future :)
Let's go for option 1, because option 2 fails this test:
Yeah, I was considering if we should add a test using NumPy, but np.int32 isn't really a supported input, and installing NumPy can be tricky during the Python pre-release phase when there's no wheel, so maybe not.
What did you do?
I was using the torchio library, which relies on humanize to return the memory size of image arrays. The images I processed had dimensions 512x512x166, with each pixel being a 32-bit (4-byte) integer.
What did you expect to happen?
torchio calls the naturalsize function with the occupied memory size in bytes and sets binary=True. Manually calculating the expected value:
512×512×166×4÷(1024×1024)=166 MiB
So, I expected the function to return approximately 166 MiB.
What actually happened?
Instead, the returned value was -2 MiB, accompanied by this warning:
This overflow occurs because the input to the function was of type np.int32 instead of Python's native int. Since np.int32 has a maximum value of 2^31−1, the multiplication of base and bytes_ results in an overflow.
Steps to Reproduce
Proposed Solutions
Adjusting the order of operations can avoid overflow. In the current line:
when bytes_ is np.int32, multiplying base and bytes_ produces an np.int32 result, which overflows before it’s divided by unit. By dividing either base or bytes_ by unit before the multiplication, each sub-operation remains a float:
Alternatively, cast value to float without checking its type. Currently, the casting applies only if value is a string:
Updating it to cast all inputs to float could resolve the issue:
I haven’t created a pull request since I’m unsure if you’d prefer developers to ensure input compatibility or handle this within the function.
Environment
The text was updated successfully, but these errors were encountered: