Only allow integer type_caster to call __int__ method when conversion is allowed; always call __index__ #2698

Merged 11 commits into pybind:master on Jan 17, 2021

@YannickJadoul YannickJadoul commented Nov 26, 2020

Description

As the title says and the patch shows: currently, PyLong_AsLong/PyLong_AsLongLong gets called independently of convert in the type_caster for integral types. This seems inconsistent with other type casters.

Suggested changelog entry:

The ``type_caster`` for integers does not convert Python objects with ``__int__`` anymore with ``noconvert`` or during the first round of trying overloads.

@henryiii
Collaborator

Milestone?

@YannickJadoul
Collaborator Author

I'd consider this a fix; do you agree, @henryiii? If not, feel free to change to 2.7.0

@YannickJadoul YannickJadoul added this to the v2.6.2 milestone Dec 16, 2020
@rwgk
Collaborator

rwgk commented Dec 18, 2020

Observation only, no analysis yet:

This PR breaks https://github.com/google/mediapipe/blob/master/mediapipe/python/packet_test.py
The test output is below.
The test runs fine without this PR; after patching in this PR, it fails.

[ RUN      ] PacketTest.test_boolean_packet
[       OK ] PacketTest.test_boolean_packet
[ RUN      ] PacketTest.test_bytes_packet
[       OK ] PacketTest.test_bytes_packet
[ RUN      ] PacketTest.test_detection_proto_packet
[       OK ] PacketTest.test_detection_proto_packet
[ RUN      ] PacketTest.test_double_packet
[       OK ] PacketTest.test_double_packet
[ RUN      ] PacketTest.test_empty_packet
[       OK ] PacketTest.test_empty_packet
[ RUN      ] PacketTest.test_float_array_packet
[       OK ] PacketTest.test_float_array_packet
[ RUN      ] PacketTest.test_float_image_frame_packet
[       OK ] PacketTest.test_float_image_frame_packet
[ RUN      ] PacketTest.test_float_packet
[       OK ] PacketTest.test_float_packet
[ RUN      ] PacketTest.test_float_vector_packet
[       OK ] PacketTest.test_float_vector_packet
[ RUN      ] PacketTest.test_image_frame_packet_copy_creation_with_cropping
[       OK ] PacketTest.test_image_frame_packet_copy_creation_with_cropping
[ RUN      ] PacketTest.test_image_frame_packet_creation_copy_mode
[       OK ] PacketTest.test_image_frame_packet_creation_copy_mode
[ RUN      ] PacketTest.test_image_frame_packet_creation_reference_mode
[       OK ] PacketTest.test_image_frame_packet_creation_reference_mode
[ RUN      ] PacketTest.test_int16_packet
[  FAILED  ] PacketTest.test_int16_packet
[ RUN      ] PacketTest.test_int32_packet
[  FAILED  ] PacketTest.test_int32_packet
[ RUN      ] PacketTest.test_int64_packet
[  FAILED  ] PacketTest.test_int64_packet
[ RUN      ] PacketTest.test_int8_packet
[  FAILED  ] PacketTest.test_int8_packet
[ RUN      ] PacketTest.test_int_array_packet
[       OK ] PacketTest.test_int_array_packet
[ RUN      ] PacketTest.test_int_packet
[  FAILED  ] PacketTest.test_int_packet
[ RUN      ] PacketTest.test_int_vector_packet
[       OK ] PacketTest.test_int_vector_packet
[ RUN      ] PacketTest.test_matrix_packet
[       OK ] PacketTest.test_matrix_packet
[ RUN      ] PacketTest.test_matrix_packet_with_non_c_contiguous_data
[       OK ] PacketTest.test_matrix_packet_with_non_c_contiguous_data
[ RUN      ] PacketTest.test_packet_vector_packet
[       OK ] PacketTest.test_packet_vector_packet
[ RUN      ] PacketTest.test_string_packet
[       OK ] PacketTest.test_string_packet
[ RUN      ] PacketTest.test_string_to_packet_map_packet
[       OK ] PacketTest.test_string_to_packet_map_packet
[ RUN      ] PacketTest.test_string_vector_packet
[       OK ] PacketTest.test_string_vector_packet
[ RUN      ] PacketTest.test_uint16_image_frame_packet
[       OK ] PacketTest.test_uint16_image_frame_packet
[ RUN      ] PacketTest.test_uint16_packet
[  FAILED  ] PacketTest.test_uint16_packet
[ RUN      ] PacketTest.test_uint32_packet
[  FAILED  ] PacketTest.test_uint32_packet
[ RUN      ] PacketTest.test_uint64_packet
[       OK ] PacketTest.test_uint64_packet
[ RUN      ] PacketTest.test_uint8_image_frame_packet
[       OK ] PacketTest.test_uint8_image_frame_packet
[ RUN      ] PacketTest.test_uint8_packet
[  FAILED  ] PacketTest.test_uint8_packet
======================================================================
ERROR: test_int16_packet (__main__.PacketTest)
PacketTest.test_int16_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 73, in test_int16_packet
    p2 = mp.packet_creator.create_int16(np.int16(1))
TypeError: create_int16(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_int32_packet (__main__.PacketTest)
PacketTest.test_int32_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 86, in test_int32_packet
    p2 = mp.packet_creator.create_int32(np.int32(1))
TypeError: create_int32(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_int64_packet (__main__.PacketTest)
PacketTest.test_int64_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 96, in test_int64_packet
    p2 = mp.packet_creator.create_int64(np.int64(1))
TypeError: create_int64(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_int8_packet (__main__.PacketTest)
PacketTest.test_int8_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 61, in test_int8_packet
    p2 = mp.packet_creator.create_int8(np.int8(1))
TypeError: create_int8(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_int_packet (__main__.PacketTest)
PacketTest.test_int_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 49, in test_int_packet
    p2 = mp.packet_creator.create_int(np.intc(1))
TypeError: create_int(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_uint16_packet (__main__.PacketTest)
PacketTest.test_uint16_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 120, in test_uint16_packet
    p2 = mp.packet_creator.create_uint16(np.uint16(1))
TypeError: create_uint16(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_uint32_packet (__main__.PacketTest)
PacketTest.test_uint32_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 132, in test_uint32_packet
    p2 = mp.packet_creator.create_uint32(np.uint32(1))
TypeError: create_uint32(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

======================================================================
ERROR: test_uint8_packet (__main__.PacketTest)
PacketTest.test_uint8_packet
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mediapipe/python/packet_test.py", line 108, in test_uint8_packet
    p2 = mp.packet_creator.create_uint8(np.uint8(1))
TypeError: create_uint8(): incompatible function arguments. The following argument types are supported:
    1. (arg0: int) -> mediapipe.python._framework_bindings.packet.Packet

Invoked with: 1

----------------------------------------------------------------------
Ran 31 tests in 0.523s

FAILED (errors=8)

@henryiii
Collaborator

__index__ is somewhat special. Would it work if __int__ was not called automatically, but __index__ was?

object.__index__(self)

Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.

Therefore, having an __index__ means that this object is an integer type, with a lossless representation as an integer. __int__ means you can convert it to an integer, possibly/probably lossy, and the object is possibly not an integer.
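
To make the distinction concrete, here is a minimal pure-Python sketch. `IndexOnly`, `IntOnly` and `is_lossless_int` are hypothetical names made up for illustration, not pybind11 code. It uses `operator.index()`, which only accepts objects that define `__index__`:

```python
import operator


class IndexOnly:
    """Hypothetical integer-like type: lossless conversion via __index__."""

    def __index__(self):
        return 42


class IntOnly:
    """Hypothetical convertible type: only __int__, so possibly lossy."""

    def __int__(self):
        return 42


def is_lossless_int(obj):
    """Return True if operator.index() accepts obj, i.e. it defines __index__."""
    try:
        operator.index(obj)
    except TypeError:
        return False
    return True


print(is_lossless_int(7))            # True: int defines __index__
print(is_lossless_int(IndexOnly()))  # True
print(is_lossless_int(IntOnly()))    # False: __int__ alone is not enough
print(is_lossless_int(3.9))          # False, although int(3.9) == 3 works
```

A float has `__int__` but no `__index__`, which is exactly the "convertible, but not an integer" case.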

@rwgk
Collaborator

rwgk commented Dec 18, 2020

test_int_convert_numpy below is a minimal example of the mediapipe issue.

Main conclusion: this PR is working as intended, the mediapipe code needs to remove py::arg().noconvert().

My feeling about milestone: given that we now know this PR breaks someone, maybe 2.7 is better?

Might be nice to include the extra test below in this PR. It's similar to test_numpy_bool but uses int_passthrough, i.e. adds a little bit of coverage.

Tangential: in test_numpy_bool, cant_convert(np.zeros(2, dtype="int")) seems to conflate two things: an array cannot be converted to a scalar, and int vs. bool. Note that cant_convert(np.zeros(2, dtype="bool")) also passes the test. How about this instead (to keep the two things separate)?

    pytest.raises(TypeError, convert, np.zeros(2, dtype="bool"))
    pytest.raises(TypeError, noconvert, np.int32(0))
diff --git a/tests/test_builtin_casters.py b/tests/test_builtin_casters.py
index 3942482..f130dd2 100644
--- a/tests/test_builtin_casters.py
+++ b/tests/test_builtin_casters.py
@@ -280,6 +280,16 @@ def test_int_convert():
     cant_convert(FuzzyThought())
 
 
+def test_int_convert_numpy():
+    np = pytest.importorskip("numpy")
+
+    convert, noconvert = m.int_passthrough, m.int_passthrough_noconvert
+
+    v = np.int32(84)
+    assert convert(v) == 84
+    pytest.raises(TypeError, noconvert, v)
+
+
 def test_tuple(doc):
     """std::pair <-> tuple & std::tuple <-> tuple"""
     assert m.pair_passthrough((True, "test")) == ("test", True)

@YannickJadoul
Collaborator Author

__index__ is somewhat special. Would it work if __int__ was not called automatically, but __index__ was?

The problem is that both are treated equally by PYBIND11_LONG_CHECK, I believe?
But yes, that would make sense.

I feel stupid for not documenting this better, but I believe this was tackling a complaint from Gitter, where the conversion from float to int is considered noconvert (since float has __int__). I'll check tomorrow if I can figure out the context!

So this was also why I considered this as a bugfix.

My feeling about milestone: given that we now know this PR breaks someone, maybe 2.7 is better?

But @rwgk's comment also makes sense, from that respect, yes. Though ... well, it's very tricky behavior to depend on, I'd say?

Might be nice to include the extra test below in this PR. It's similar to test_numpy_bool but uses int_passthrough, i.e. adds a little bit of coverage.

I'll have a look at the tests as well, but in principle, these NumPy types are just types with or without __int__/__index__ methods, no? So this should already be covered by my explicit, custom test classes DeepThought and/or ShallowThought, I believe? I'll check again, though.

@rwgk
Collaborator

rwgk commented Dec 18, 2020

I'll have a look at the tests as well, but in principle, these NumPy types are just types with or without __int__/__index__ methods, no? So this should already be covered by my explicit, custom test classes DeepThought and/or ShallowThought, I believe? I'll check again, though.

Yes, that's true, if you assume numpy behaves that way (seems like a very safe assumption).
Clearly, the test isn't essential, but I'd add those few lines anyway, mainly so we have something to point people (like the mediapipe folks) to. That implies the message: we thought about it, we know, this is the behavior we want, and you need to adjust.

@bstaletic
Collaborator

Those numpy types do simply implement __int__. I was going to argue in favour of sticking to milestone 2.6.2, but after thinking about it a bit more, I think @rwgk is right. We're essentially changing what pybind11 considers "an int".

@henryiii
Collaborator

PYBIND11_LONG_CHECK

Does that need to be updated or a different check used? There should be a difference between a class with only __int__ (possibly lossy conversion, like a float), and __index__ ("I am an int").

@henryiii henryiii modified the milestones: v2.6.2, v2.7 Dec 21, 2020
@henryiii
Collaborator

I think this is a 2.7 fix then, and I also still think __index__ should not be "a conversion", but __int__ should be - that's what those methods mean.
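
As a rough illustration of that rule, the sketch below models it in pure Python. It is a toy model and not pybind11's actual caster: `load_int`, `accepts` and `FloatLike` are hypothetical names. It also assumes that plain Python floats are rejected in both modes, as the discussion of the explicit float check in this thread suggests.

```python
import operator


class IndexedThought(object):
    def __index__(self):
        return 42


class FloatLike(object):
    """Hypothetical stand-in for a float-like type that only defines __int__."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)


def load_int(obj, convert):
    """Toy model of the integer caster; not pybind11's actual code."""
    if isinstance(obj, float):
        # Assumption: plain Python floats are rejected in both modes.
        raise TypeError("float is never accepted as an integer")
    if hasattr(type(obj), "__index__"):
        # "I am an int": accepted even when conversion is not allowed.
        return operator.index(obj)
    if convert and hasattr(type(obj), "__int__"):
        # Possibly lossy: only tried when conversion is allowed.
        return int(obj)
    raise TypeError("incompatible function arguments")


def accepts(obj, convert):
    """Return True if load_int() succeeds for obj in the given mode."""
    try:
        load_int(obj, convert)
    except TypeError:
        return False
    return True


print(load_int(IndexedThought(), convert=False))    # 42
print(accepts(FloatLike(3.14159), convert=False))   # False
print(load_int(FloatLike(3.14159), convert=True))   # 3
print(accepts(3.14159, convert=True))               # False
```

In this model, a type with only `__int__` behaves like the float-like NumPy scalars discussed here: accepted when conversion is allowed, rejected with noconvert.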

@YannickJadoul YannickJadoul changed the title Only allow integer type_caster to call __int__ or __index__ method when conversion is allowed Only allow integer type_caster to call __int__ method when conversion is allowed; always call __index__ Dec 29, 2020
@YannickJadoul
Collaborator Author

I fixed the __index__ thing, demonstrated by a new test (which failed, before). This change should now be way less intrusive (not saying I need it into 2.6.2, but there's a chance it would fit, now).

PyLong_AsLongLong (and others) only call __index__ from 3.8 onwards, but always call __int__, so I think we're fine in this respect? We're only in trouble if __int__ and __index__ do different things, but if that's the case, I'm not blaming pybind11. (Also, relying on __int__ here is now deprecated and gives a warning in Python >= 3.8, so I assume this is part of the same change, where PyLong_AsLongLong switches to only taking __index__ into account?)

Let's see why Python 2.7 is unhappy. sigh

assert convert(np.intc(42)) == 42
assert noconvert(np.intc(42)) == 42

assert convert(np.float32(3.14159)) == 3 # This might be wrong/unwanted?
Collaborator Author

This is also kind of weird. It's because we only explicitly test for float and not for "float-like" types. I'm not sure if there's a general way to recognize "float-like" types, though. We could just add a special case for NumPy float types?

Not really related to this PR, though, so definitely another PR.

Collaborator

I agree, although it looks tricky to implement, therefore I wonder about cost/benefit of working on this.
Pragmatic idea: report the situation in a comment and wait for an opportune moment in the future (e.g. when there is an actual problem that needs to be resolved). My first stab:

# The implicit conversion from np.float is undesirable but difficult to detect, without making the pybind11
# int converter dependent on numpy. Currently Python does not have a generalized concept of float-like types.

Collaborator Author

We do already have something like this (a special case for NumPy) in the type_caster for bool, though.

But yes, this might be something we'll have to live with? At the very least, this should now be considered a "conversion".

Collaborator

The Google-global testing of this PR had no issues.
I see two behavior changes here: 1. the one currently implemented, and 2. potentially also raising a TypeError for np.float types (with ifdefs I guess to use the numpy api if available).
Given that this PR is exhaustively tested and doesn't disturb anything anymore, I feel it's best to merge now (for 2.6.2), and try the second behavior change in a separate PR. The main rationale for this two-stage approach is to use our time efficiently. Putting off this well-tested change will cost extra time.
For the comment, this is my 2nd suggested version:

# The implicit conversion from np.float is undesirable but currently accepted.

Just a hint that we thought about it, leaving all options open for the future.

Collaborator Author

Agreed. Never meant to lump this in this PR. I really thought it would just be a straightforward one, but once again, Python had some surprises.

Collaborator Author

Btw, to make things more complicated, np.float to int isn't allowed, because np.float is just a Python float (I'm not joking; this is really the case). Anyway, I'll add 32 and commit this suggestion.

@YannickJadoul
Collaborator Author

Let's see why Python 2.7 is unhappy. sigh

even deeper sigh Old-style classes...

cant_convert(ShallowThought())
cant_convert(FuzzyThought())
if not env.PY2:
    # I have no clue why __index__ is not picked up by Python 2's PyIndex_Check
Collaborator Author

Too bad for Python 2? I don't know what went wrong, here...

This does work in Python 2.7, though:

Python 2.7.17 (default, Sep 30 2020, 13:38:04) 
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = list(range(50))
>>> class IndexedThought(object):
...     def __index__(self):
...         return 42
... 
>>> x[IndexedThought()]
42

@YannickJadoul
Collaborator Author

Ready for another review

@YannickJadoul
Collaborator Author

I need to look into these new failures, but, @rwgk, this might break a lot fewer tests, on your end?

@YannickJadoul
Collaborator Author

I think this should fix the tests (it does so on my machine anyway, where I had previously failed to run any Python version with 3 <= version < 3.8).

Remaining issue: PyLong_AsLongLong does not call __index__ before 3.8. This is consistent with calling int(...) but not with some_list[...].

Python 3.8:

>>> int(IndexedThought())
42
>>> list(range(100))[IndexedThought()]
42

Python 3.6:

>>> int(IndexedThought())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a number, not 'IndexedThought'
>>> list(range(100))[IndexedThought()]
42

Python 2.7:

>>> int(IndexedThought())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string or a number, not 'IndexedThought'
>>> list(range(100))[IndexedThought()]
42

So should we make this consistent, and always call PyNumber_Index in versions < 3.8? I'm not sure, but we could. We could also say: "this is just how Python versions < 3.8 handle int conversions, so pybind11 is not going to change that (and you should update anyway)".

Minor extra inconsistency: in versions < 3.6, we now check PyIndex_Check (whether it has an __index__ and is thus an integer numeric type!) to determine if the conversion is noconvert, but later, PyLong_AsLongLong still calls __int__. That's ... weird, at least. And maybe an argument to say we should manually call PyNumber_Index anyway?
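
The version difference above can be checked with a small script. This is only a sketch using the `IndexedThought` class from the sessions above. `operator.index()` is the Python-level counterpart of `PyNumber_Index`, so it shows what "always call PyNumber_Index" would give on any version:

```python
import operator
import sys


class IndexedThought(object):
    def __index__(self):
        return 42


thought = IndexedThought()

# Sequence indexing goes through __index__ on every version shown above.
via_indexing = list(range(100))[thought]

# operator.index() also relies on __index__ only, independent of the version.
via_operator_index = operator.index(thought)

# int() only falls back to __index__ from Python 3.8 onwards.
try:
    via_int = int(thought)
except TypeError:
    via_int = None  # Python < 3.8: int() ignores __index__

print(sys.version_info[:2], via_indexing, via_operator_index, via_int)
```

On Python >= 3.8 all three values agree; on older versions only `int()` fails, which is the inconsistency described above.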

@rwgk
Collaborator

rwgk commented Dec 30, 2020

A quick observation: https://github.com/google/mediapipe/blob/master/mediapipe/python/packet_test.py passes with the current version of this PR (915423d).
I verified that the mediapipe bindings still have the .noconvert() (that code was last changed on Nov 17).

@YannickJadoul YannickJadoul modified the milestones: v2.7, v2.6.2 Dec 30, 2020
@YannickJadoul
Collaborator Author

Added back to 2.6.2, as @rwgk suggested; we can always switch back to 2.7 (or 2.6.3), if there's no time to get this fully reviewed and in.

There's still the issue from #2698 (comment) that doesn't really feel nice to me. I'm wondering if we shouldn't strive for more consistency in pybind11, and adopt 3.8+ behavior.

@YannickJadoul
Collaborator Author

I've moved the last question to be decided to a more concrete PR, #2801, demonstrating the implementation is straightforward. All that's left is arguing why we do or don't want this.

…eck (unnoticed because we currently don't have PyPy >= 3.8)
@YannickJadoul
Collaborator Author

#2801 still discovered an inconsistency with PyPy. @rwgk, do have a look at the latest 0e89b14, but I'm assuming you're fine with this since it doesn't change CPython behavior?

Collaborator

@rwgk rwgk left a comment

Yes, the PyPy fix looks great to me. Thanks, @YannickJadoul!

@YannickJadoul
Collaborator Author

Except that ICC isn't happy with this. It's hardly been a day... :-|

@henryiii
Collaborator

Aren't you glad we added that check? 😉

@YannickJadoul
Collaborator Author

Aren't you glad we added that check? 😉

I'm thrilled. Anyway, yeah, otherwise that other PR had to do it.

I'm trying to get rid of that lambda by means of a macro, I guess. I wouldn't know what else could be the issue.

Sorry for the mess. Things worked smoothly, locally.

@YannickJadoul
Collaborator Author

Oof, all green again 😅
That stress really wasn't necessary, ICC.

@YannickJadoul
Collaborator Author

@henryiii, #2801 is also green, now, so as far as I'm concerned, this can go in, as the one loose end is handled in #2801.

@henryiii henryiii merged commit 8449a80 into pybind:master Jan 17, 2021
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Jan 17, 2021
@YannickJadoul YannickJadoul deleted the noconvert-int branch January 17, 2021 01:55
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Jan 25, 2021