Fail on passing py::object with wrong Python type to py::object subclass using PYBIND11_OBJECT macro #2349

YannickJadoul · 2020-07-31T14:56:25Z

Throwing this out there and seeing how it reacts to existing code, and opening this up for discussion.

As discussed in #2340, this would throw an error on e.g., py::bytes(o) with o any py::object that isn't a bytes or subclass instance.

Notes:

I'm not happy about the error message myself. I think it throwing a type_error is OK, but I'm hoping someone has a better formulations of the error message.

include/pybind11/pytypes.h

YannickJadoul · 2020-08-03T17:44:01Z

Python 2 tests fail, but #2360 needs to be merged (or we need to disable the Python 2 tests for py::ellipsis, if #2360 doesn't get merged)

rwgk · 2020-08-04T19:38:24Z

include/pybind11/pytypes.h

@@ -796,11 +796,16 @@ PYBIND11_NAMESPACE_END(detail)
    : Parent(check_(o) ? o.release().ptr() : ConvertFun(o.ptr()), stolen_t{}) \
    { if (!m_ptr) throw error_already_set(); }

+#define PYBIND11_OBJECT_CHECK_FAILED(Name, o) \
+    type_error("Object of type '" + std::string(Py_TYPE(o.ptr())->tp_name) + "' is not an instance of '" #Name "'")


I'm not sure this works correctly for nested classes. Questions:

Is there a higher-level function to get the fully-qualified class name?

Could we exercise with a native nested class, and a nested pybind11 extension class?

I know it looks horrible, but Py_TYPE(o.ptr())->tp_name is used in other places as well.
About nested types: good point. I should try. Probably something with __qualname__.

I know it looks horrible, but Py_TYPE(o.ptr())->tp_name is used in other places as well.
About nested types: good point. I should try. Probably something with __qualname__.

This thought randomly crossed my mind:
The "proper" way of doing this is:

Implement a helper function that returns the fully-qualified type name (module + qualname + tp_name) as a string.

Unit test that separately.

Hook it in here, the existing simple test is fine as-is.

Yeah, I just need to find time to check how the full name is normally retrieved :-)

Actually, a quick lookup confirms that tp_name is the way to go?

Pointer to a NUL-terminated string containing the name of the type. For types that are accessible as module globals, the string should be the full module name, followed by a dot, followed by the type name; for built-in types, it should be just the type name. If the module is a submodule of a package, the full package name is part of the full module name. For example, a type named T defined in module M in subpackage Q in package P should have the tp_name initializer "P.Q.M.T".

https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_name

Let me confirm with a few tests ;-)

Fix needed for PyPy, though, it seems. It looks like PyPy has tp_name the same as __name__. Need to check and confirm if I got that correct with half-sleepy mind.

I've merged the tp_name PR, should this be rebased?

tests/test_pytypes.py

YannickJadoul · 2020-10-04T21:42:41Z

Submitted an issue to PyPy: https://foss.heptapod.net/pypy/pypy/-/issues/3318

I guess I'll just try to see how we can best sweep this thing under the rug?

…ass using PYBIND11_OBJECT macro

…if the argument is not a type

…mon.h

… PyPy

YannickJadoul · 2020-10-04T22:24:28Z

This (i.e., f2424c5) should do it, I think. Is this too heavy?

The other option (while waiting for PyPy for a response and/or solution) would be to change the tests and have them accept not just ellipsis but also __builtin__.ellipsis or builtins.ellipsis.

rwgk · 2020-10-05T00:50:50Z

Is this too heavy?

Don't think so. Looks good to me.

wjakob

This change looks fine to me. Regarding:

This (i.e., f2424c5) should do it, I think. Is this too heavy?

Your change is very minor, and we have some very heavy workarounds for a few PyPy-specific problems.

include/pybind11/detail/class.h

…_tp_name

YannickJadoul · 2020-10-05T20:48:45Z

Green! Thanks, all, for the thorough reviews!

nyibbang · 2022-01-11T13:33:52Z

This is a breaking change. It breaks some of my code, such as:

  const py::bytes uid = py::getattr(obj, uidAttrName, py::none());
  if (!uid.is_none())
    return deserialize(static_cast<std::string>(uid));

If the attribute was not found before, it would work fine. Now it raises an exception saying that Object of type 'NoneType' is not an instance of 'bytes'.

I know that the fix is easy to implement, but I was hoping that upgrading from pybind 2.5.0 to 2.9.0 would not break my code, according to SemVer.
I believe this patch should have been merged in a major upgrade of the library version.

henryiii · 2022-01-11T15:04:25Z

SemVer is not a solution. You can't get a consensus as to what a breaking change is. Every change could break someone's code, and so the only perfect SemVer is to make every single release a major version, which makes it useless. Anything else is a practical approximation.

In this case, you were relying on undefined behavior - we didn't support converting to an object subclass if the types were invalid before either, it just failed to throw an error - it just happened to work for you since the only thing you used was is_none(). But that was not tested or defined behavior. The "fix" will also work just fine in 2.5, so I consider this a strict improvement.

Now, one could argue that "None" is a special case, especially due to the existence of is_none on all types. However, this would likely complicate code, because it would force you to check for None more often, unless there is some valid way to get None for py::bytes without a conversion like this. If there is some other way to get a None currently valid, then this is possibly worth a special case.

I believe this is the fix and is valid on both 2.5 and 2.9:

  const py::object uid = py::getattr(obj, uidAttrName, py::none());
  if (!uid.is_none())
    return deserialize(static_cast<std::string>(uid));

nyibbang · 2022-01-11T15:48:10Z

SemVer is not a solution. You can't get a consensus as to what a breaking change is. Every change could break someone's code, and so the only perfect SemVer is to make every single release a major version, which makes it useless. Anything else is a practical approximation.

This is true, so then the decision to release a major version is decided by the risk that a change might break client code.
As stated on the SemVer website:

Use your best judgment. If you have a huge audience that will be drastically impacted by changing the behavior back to what the public API intended, then it may be best to perform a major version release, even though the fix could strictly be considered a patch release.

I could argue that this change had the potential to affect a lot of client code, especially since it concerns implicit conversions. Although I guess I'm wrong since it seems I'm the only one that got bothered by it.

In this case, you were relying on undefined behavior - we didn't support converting to an object subclass if the types were invalid before either, it just failed to throw an error - it just happened to work for you since the only thing you used was is_none(). But that was not tested or defined behavior. The "fix" will also work just fine in 2.5, so I consider this a strict improvement.

The documentation was scarce in 2.5.0 about py::object inherited types constructors, and I had to read the code to understand what they meant. I especially got confused about the conversion from py::none to py::str. It has since been improved.

I didn't realize that my code was undefined behavior when I wrote it.

I believe this is the fix and is valid on both 2.5 and 2.9:

  const py::object uid = py::getattr(obj, uidAttrName, py::none());
  if (!uid.is_none())
    return deserialize(static_cast<std::string>(uid));

That's pretty much what I've done. It's actually more sensible to write it that way compared to before.

henryiii · 2022-01-11T16:20:07Z

SemVer is best considered "an abbreviated changelog". A patch release has only fixes - but could break your code. A minor release has features, but mostly should be compatible with past releases - but could break your code. A major release should have large new things a user should review, and is more likely, but not guarantied, to break some older code. For a stable library, in fact, unless you are a heavy user, major versions are not likely to break you - any more likely than patch/minor releases, actually.

And it differs based on the language - in JavaScript, you have nested dependencies, so having frequent major releases is encouraged - one library can use version 2 and another can use version 3 and both work together (though there turn out to be a lot of caveats, and I'm not sure it's really "better"). You can read more about this (and my opinions on capping) here: https://iscinumpy.dev/post/bound-version-constraints/

In Python, major releases are highly discouraged, and instead deprecation periods are encouraged. If possible, warnings should be produced for 1-3 releases, and then finally removed. You should upgrade frequently in order to benefit from this. Python itself follows this (Guido recently promised at least version 3.33) - things will slowly be removed and changed, but no more major versions, to avoid a transition like 2 to 3. This is a social construct of Python packaging.

Anyway, we've done quite a bit in pybind11 to improve the checking for undefined behavior. This had reduced bugs in client code. We do try to avoid breaking changes if possible - we test quite a bit of downstream code, and changes are required that are strictly fixes, we allow that, but if other types of changes are required, we try to avoid that. Major versions will likely be based on incrementing the ABI, not on the public API, though we will likely remove a few deprecated things on major version bumps. (PS: This is probably true for Python as well - ABI3 cannot be changed without a bump to Python 4, so this will likely be a/the driving factor).

The public constructor of `pybind11::module` is now deprecated. This patch replaces its only use by a call to `PYBIND11_EMBEDDED_MODULE` followed by a `pybind11::import(...)`. The constructor of a `pybind11::object` subclass from a `pybind11::object` from the wrong type now throws an exception. See pybind/pybind11#2349. We had one case of this happening where we tried constructing a `bytes` from a `None`. We now first construct an `object`, then test if it's `None` before trying to convert it to `bytes`. Change-Id: I4aa2a1b4a49d3024f203bdd5c6f9c4ee86bfc9c4 Reviewed-on: http://gerrit2.aldebaran.lan/1760 Reviewed-by: philippe.martin <philippe.martin@softbankrobotics.com> Reviewed-by: jmonnon <jmonnon@aldebaran.com> Tested-by: vincent.palancher <vincent.palancher@softbankrobotics.com>

YannickJadoul mentioned this pull request Jul 31, 2020

Tracking PR: intermediate state of completed str/bytes cleaning up PRs #2348

Closed

4 tasks

bstaletic reviewed Jul 31, 2020

View reviewed changes

include/pybind11/pytypes.h Outdated Show resolved Hide resolved

include/pybind11/pytypes.h Outdated Show resolved Hide resolved

YannickJadoul force-pushed the str-bytes-cleanup branch from 60cba1c to 3c8dbc2 Compare August 3, 2020 13:01

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 4d9ae63 to 2e66b40 Compare August 3, 2020 16:22

YannickJadoul mentioned this pull request Aug 3, 2020

Enable py::ellipsis on Python 2 #2360

Merged

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch 3 times, most recently from 8db37ff to 837710f Compare August 3, 2020 17:42

rwgk force-pushed the str-bytes-cleanup branch from 3c8dbc2 to 65130b7 Compare August 4, 2020 19:14

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 837710f to 47af504 Compare August 4, 2020 19:20

rwgk reviewed Aug 4, 2020

View reviewed changes

rwgk force-pushed the str-bytes-cleanup branch 2 times, most recently from dcc41ef to 0e4e64a Compare August 13, 2020 00:00

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 47af504 to 41f2e39 Compare August 13, 2020 23:10

rwgk force-pushed the str-bytes-cleanup branch 2 times, most recently from f38fafc to 6cd4552 Compare August 14, 2020 22:03

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 41f2e39 to 6c0f03b Compare August 15, 2020 21:33

rwgk force-pushed the str-bytes-cleanup branch from 1d553e4 to 0b49467 Compare August 16, 2020 17:24

YannickJadoul force-pushed the str-bytes-cleanup branch from 0b49467 to 0ab4b97 Compare August 16, 2020 21:14

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 6c0f03b to f32aa52 Compare August 16, 2020 23:29

rwgk force-pushed the str-bytes-cleanup branch from 7ae6bb2 to 4fc3b1b Compare August 17, 2020 17:55

YannickJadoul mentioned this pull request Sep 6, 2020

Add py::object casting example to embedding docs #2466

Merged

YannickJadoul mentioned this pull request Sep 14, 2020

feat: py::type::of<T>() and py::type::of(h) #2364

Merged

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 16a9fac to e43114f Compare September 21, 2020 00:35

YannickJadoul changed the base branch from str-bytes-cleanup to master September 21, 2020 00:35

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from e43114f to f36af9e Compare September 21, 2020 00:37

YannickJadoul mentioned this pull request Sep 21, 2020

Moving tp_class access, and consistent fully-qualified naming for PyPy, to detail::get_tp_name #2520

Merged

YannickJadoul added this to the v2.6.0 milestone Sep 22, 2020

moralex1 mentioned this pull request Sep 28, 2020

[BUG] Confusing weakref constructor copy the object instead of creating a weakref #2536

Closed

YannickJadoul added 5 commits October 4, 2020 23:49

Fail on passing py::object with wrong Python type to py::object subcl…

6f873cf

…ass using PYBIND11_OBJECT macro

Split off test_non_converting_constructors from test_constructors

1e99e9c

Fix test_as_type, as py::type constructor now throws an error itself …

22f9557

…if the argument is not a type

Replace tp_name access by pybind11::detail::get_fully_qualified_tp_name

4ac0bd4

Move forward-declaration of get_fully_qualified_tp_name to detail/com…

d8664ad

…mon.h

YannickJadoul force-pushed the throw-on-PYBIND11_OBJECT-check_-failure branch from 4e8ef13 to d8664ad Compare October 4, 2020 22:03

Don't add the builtins module name in get_fully_qualified_tp_name for…

f2424c5

… PyPy

henryiii assigned wjakob and unassigned YannickJadoul Oct 5, 2020

wjakob approved these changes Oct 5, 2020

View reviewed changes

include/pybind11/detail/class.h Outdated Show resolved Hide resolved

Add PYBIND11_BUILTINS_MODULE macro, and use it in get_fully_qualified…

6b1cfdf

…_tp_name

henryiii approved these changes Oct 5, 2020

View reviewed changes

YannickJadoul merged commit f537093 into pybind:master Oct 5, 2020

YannickJadoul deleted the throw-on-PYBIND11_OBJECT-check_-failure branch October 5, 2020 20:48

github-actions bot added the needs changelog Possibly needs a changelog entry label Oct 5, 2020

henryiii mentioned this pull request Oct 6, 2020

fix: PYBIND11_OBJECT required pybind11 namespace (regression) #2553

Merged

henryiii removed the needs changelog Possibly needs a changelog entry label Oct 6, 2020

YannickJadoul mentioned this pull request Oct 23, 2020

py::cast<T> where T is a subclass of py::object is unchecked #1108

Closed

YannickJadoul mentioned this pull request Dec 21, 2020

Also throw in the move-constructor added by the PYBIND11_OBJECT macro, after the argument has been moved-out (if necessary) #2701

Merged

virtuald mentioned this pull request Feb 16, 2021

Reported segfault in SmartDashboard.putStringArray robotpy/robotpy-wpilib#649

Closed

rwgk mentioned this pull request Feb 10, 2023

FWD pybind11 google/pybind11clif#2349

Closed

Fail on passing py::object with wrong Python type to py::object subclass using PYBIND11_OBJECT macro #2349

Fail on passing py::object with wrong Python type to py::object subclass using PYBIND11_OBJECT macro #2349

Uh oh!

Conversation

YannickJadoul commented Jul 31, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

YannickJadoul commented Aug 3, 2020

Uh oh!

rwgk Aug 4, 2020

Choose a reason for hiding this comment

Uh oh!

YannickJadoul Aug 4, 2020

Choose a reason for hiding this comment

Uh oh!

rwgk Aug 5, 2020

Choose a reason for hiding this comment

Uh oh!

YannickJadoul Aug 5, 2020

Choose a reason for hiding this comment

Uh oh!

YannickJadoul Aug 16, 2020

Choose a reason for hiding this comment

Uh oh!

YannickJadoul Aug 17, 2020

Choose a reason for hiding this comment

Uh oh!

henryiii Oct 2, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

YannickJadoul commented Oct 4, 2020

Uh oh!

YannickJadoul commented Oct 4, 2020

Uh oh!

rwgk commented Oct 5, 2020

Uh oh!

wjakob left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

YannickJadoul commented Oct 5, 2020

Uh oh!

nyibbang commented Jan 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

henryiii commented Jan 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nyibbang commented Jan 11, 2022

Uh oh!

henryiii commented Jan 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

YannickJadoul commented Jul 31, 2020 •

edited

Loading

wjakob left a comment •

edited

Loading

nyibbang commented Jan 11, 2022 •

edited

Loading

henryiii commented Jan 11, 2022 •

edited

Loading

henryiii commented Jan 11, 2022 •

edited

Loading