-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail on passing py::object with wrong Python type to py::object subclass using PYBIND11_OBJECT macro #2349
Fail on passing py::object with wrong Python type to py::object subclass using PYBIND11_OBJECT macro #2349
Conversation
60cba1c
to
3c8dbc2
Compare
4d9ae63
to
2e66b40
Compare
8db37ff
to
837710f
Compare
837710f
to
47af504
Compare
include/pybind11/pytypes.h
Outdated
@@ -796,11 +796,16 @@ PYBIND11_NAMESPACE_END(detail) | |||
: Parent(check_(o) ? o.release().ptr() : ConvertFun(o.ptr()), stolen_t{}) \ | |||
{ if (!m_ptr) throw error_already_set(); } | |||
|
|||
#define PYBIND11_OBJECT_CHECK_FAILED(Name, o) \ | |||
type_error("Object of type '" + std::string(Py_TYPE(o.ptr())->tp_name) + "' is not an instance of '" #Name "'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this works correctly for nested classes. Questions:
- Is there a higher-level function to get the fully-qualified class name?
- Could we exercise with a native nested class, and a nested pybind11 extension class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it looks horrible, but Py_TYPE(o.ptr())->tp_name
is used in other places as well.
About nested types: good point. I should try. Probably something with __qualname__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it looks horrible, but
Py_TYPE(o.ptr())->tp_name
is used in other places as well.
About nested types: good point. I should try. Probably something with__qualname__
.
This thought randomly crossed my mind:
The "proper" way of doing this is:
- Implement a helper function that returns the fully-qualified type name (module + qualname + tp_name) as a string.
- Unit test that separately.
- Hook it in here, the existing simple test is fine as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just need to find time to check how the full name is normally retrieved :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, a quick lookup confirms that tp_name
is the way to go?
Pointer to a NUL-terminated string containing the name of the type. For types that are accessible as module globals, the string should be the full module name, followed by a dot, followed by the type name; for built-in types, it should be just the type name. If the module is a submodule of a package, the full package name is part of the full module name. For example, a type named T defined in module M in subpackage Q in package P should have the tp_name initializer "P.Q.M.T".
https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_name
Let me confirm with a few tests ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix needed for PyPy, though, it seems. It looks like PyPy has tp_name
the same as __name__
. Need to check and confirm if I got that correct with half-sleepy mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've merged the tp_name
PR, should this be rebased?
dcc41ef
to
0e4e64a
Compare
47af504
to
41f2e39
Compare
f38fafc
to
6cd4552
Compare
41f2e39
to
6c0f03b
Compare
1d553e4
to
0b49467
Compare
0b49467
to
0ab4b97
Compare
6c0f03b
to
f32aa52
Compare
7ae6bb2
to
4fc3b1b
Compare
16a9fac
to
e43114f
Compare
e43114f
to
f36af9e
Compare
Submitted an issue to PyPy: https://foss.heptapod.net/pypy/pypy/-/issues/3318 I guess I'll just try to see how we can best sweep this thing under the rug? |
…ass using PYBIND11_OBJECT macro
…if the argument is not a type
4e8ef13
to
d8664ad
Compare
This (i.e., f2424c5) should do it, I think. Is this too heavy? The other option (while waiting for PyPy for a response and/or solution) would be to change the tests and have them accept not just |
Don't think so. Looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks fine to me. Regarding:
This (i.e., f2424c5) should do it, I think. Is this too heavy?
Your change is very minor, and we have some very heavy workarounds for a few PyPy-specific problems.
Green! Thanks, all, for the thorough reviews! |
This is a breaking change. It breaks some of my code, such as: const py::bytes uid = py::getattr(obj, uidAttrName, py::none());
if (!uid.is_none())
return deserialize(static_cast<std::string>(uid)); If the attribute was not found before, it would work fine. Now it raises an exception saying that I know that the fix is easy to implement, but I was hoping that upgrading from pybind 2.5.0 to 2.9.0 would not break my code, according to SemVer. |
SemVer is not a solution. You can't get a consensus as to what a breaking change is. Every change could break someone's code, and so the only perfect SemVer is to make every single release a major version, which makes it useless. Anything else is a practical approximation. In this case, you were relying on undefined behavior - we didn't support converting to an object subclass if the types were invalid before either, it just failed to throw an error - it just happened to work for you since the only thing you used was Now, one could argue that "None" is a special case, especially due to the existence of I believe this is the fix and is valid on both 2.5 and 2.9: const py::object uid = py::getattr(obj, uidAttrName, py::none());
if (!uid.is_none())
return deserialize(static_cast<std::string>(uid)); |
This is true, so then the decision to release a major version is decided by the risk that a change might break client code.
I could argue that this change had the potential to affect a lot of client code, especially since it concerns implicit conversions. Although I guess I'm wrong since it seems I'm the only one that got bothered by it.
The documentation was scarce in 2.5.0 about I didn't realize that my code was undefined behavior when I wrote it.
That's pretty much what I've done. It's actually more sensible to write it that way compared to before. |
SemVer is best considered "an abbreviated changelog". A patch release has only fixes - but could break your code. A minor release has features, but mostly should be compatible with past releases - but could break your code. A major release should have large new things a user should review, and is more likely, but not guarantied, to break some older code. For a stable library, in fact, unless you are a heavy user, major versions are not likely to break you - any more likely than patch/minor releases, actually. And it differs based on the language - in JavaScript, you have nested dependencies, so having frequent major releases is encouraged - one library can use version 2 and another can use version 3 and both work together (though there turn out to be a lot of caveats, and I'm not sure it's really "better"). You can read more about this (and my opinions on capping) here: https://iscinumpy.dev/post/bound-version-constraints/ In Python, major releases are highly discouraged, and instead deprecation periods are encouraged. If possible, warnings should be produced for 1-3 releases, and then finally removed. You should upgrade frequently in order to benefit from this. Python itself follows this (Guido recently promised at least version 3.33) - things will slowly be removed and changed, but no more major versions, to avoid a transition like 2 to 3. This is a social construct of Python packaging. Anyway, we've done quite a bit in pybind11 to improve the checking for undefined behavior. This had reduced bugs in client code. We do try to avoid breaking changes if possible - we test quite a bit of downstream code, and changes are required that are strictly fixes, we allow that, but if other types of changes are required, we try to avoid that. Major versions will likely be based on incrementing the ABI, not on the public API, though we will likely remove a few deprecated things on major version bumps. (PS: This is probably true for Python as well - ABI3 cannot be changed without a bump to Python 4, so this will likely be a/the driving factor). |
The public constructor of `pybind11::module` is now deprecated. This patch replaces its only use by a call to `PYBIND11_EMBEDDED_MODULE` followed by a `pybind11::import(...)`. The constructor of a `pybind11::object` subclass from a `pybind11::object` from the wrong type now throws an exception. See pybind/pybind11#2349. We had one case of this happening where we tried constructing a `bytes` from a `None`. We now first construct an `object`, then test if it's `None` before trying to convert it to `bytes`. Change-Id: I4aa2a1b4a49d3024f203bdd5c6f9c4ee86bfc9c4 Reviewed-on: http://gerrit2.aldebaran.lan/1760 Reviewed-by: philippe.martin <philippe.martin@softbankrobotics.com> Reviewed-by: jmonnon <jmonnon@aldebaran.com> Tested-by: vincent.palancher <vincent.palancher@softbankrobotics.com>
Throwing this out there and seeing how it reacts to existing code, and opening this up for discussion.
As discussed in #2340, this would throw an error on e.g.,
py::bytes(o)
witho
anypy::object
that isn't abytes
or subclass instance.Notes:
type_error
is OK, but I'm hoping someone has a better formulations of the error message.