Bug: exception in noexcept what()
when Python exception contains a surrogate character
#4288
Comments
Without having looked very closely here, it seems similar to what we were running into with the That approach could be adopted in Although that throws |
@TheShiftedBit Could you be a bit more specific about what type of exception is being thrown here, and possibly provide a minimal compilable example? Better yet, open a PR that adds a unit test which fails under the current framework. There are a couple of potential solutions. Apparently, one is just returning unicode characters directly in what(), per this StackOverflow exchange: https://stackoverflow.com/q/3760731/2444240 . Another option is escaping / replacing the error. I am trying to figure if the exception is coming. pybind11/include/pybind11/pytypes.h Line 1481 in fcb5554
pybind11/include/pybind11/pytypes.h Line 508 in fcb5554
Okay, there is a big perf issue here at least. We |
Something like this could work as a hotfix, but is still inefficient.
|
On macOS at least: I'm not getting an exception, I am just getting junk characters as the output. I suspect this may be an issue on Windows though.
|
You asked for a reproducer, I posted it on the other issue: #4287 (comment)
I temporarily fixed it by adapting your
auto value_str = py::reinterpret_steal<py::object>(PyObject_Str(ex.value().ptr()));
py::bytes bytes = value_str.attr("encode").call("utf-8", "backslashreplace");
std::string result = bytes;
(This one doesn't have the backtrace, just the message). In my case, I don't care about recovering from an exception, I just want to end the program with as much detail as possible; so using |
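For anyone who wants to see what the `backslashreplace` step in that workaround does to the message, here is a minimal pure-Python sketch of the same idea. The helper name `safe_message` is made up for illustration and is not part of pybind11; it only mirrors the `str()` then `.encode("utf-8", "backslashreplace")` sequence from the C++ snippet.

```python
def safe_message(exc: BaseException) -> str:
    """Return str(exc) with lone surrogates escaped so it is valid UTF-8.

    Hypothetical helper for illustration only; it mirrors the C++ workaround
    (PyObject_Str followed by encode("utf-8", "backslashreplace")).
    """
    text = str(exc)
    # Characters that UTF-8 cannot encode (lone surrogates) become
    # backslash escapes such as \ud927 instead of raising UnicodeEncodeError.
    escaped = text.encode("utf-8", "backslashreplace")
    return escaped.decode("utf-8")


if __name__ == "__main__":
    err = RuntimeError("bad \ud927 value")
    print(safe_message(err))
```

The trade-off is that the result is no longer the original string: the surrogate is shown as an escape sequence, which is probably fine for a diagnostic message but would not round-trip back to the original `str`.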
@TheShiftedBit Your repro doesn't seem to actually reproduce the issue, :( #4295 |
I've attached a zip containing a Dockerfile with the full reproducer. I pinned every relevant version number I could find, including pinning pybind11 to the latest release. Run with:
cd <place_with_Dockerfile>
sudo docker build -t repro .
sudo docker run repro |
This reproduces the error:
index 70b6ffea..0114bc31 100644
--- a/tests/test_exceptions.py
+++ b/tests/test_exceptions.py
@@ -275,6 +275,11 @@ def test_local_translator(msg):
     assert msg(excinfo.value) == "this mod"


+def test_error_already_set_message_with_unicode_surrogate():
+    # Issue #4288
+    m.error_already_set_what(RuntimeError, "\ud927")
+
+
 class FlakyException(Exception):
     def __init__(self, failure_point):
         if failure_point == "failure_point_init":
I think it's an easy fix, but I need to play a bit. |
@henryiii This would be a good one to get out ASAP. It's a bit nasty that it dies with |
* Fix & test for issue #4288 (unicode surrogate character in Python exception message).
* DRY `message_unavailable_exc`
* fix: add a constexpr
  Co-authored-by: Aaron Gokaslan <skylion.aaron@gmail.com>
* style: pre-commit fixes

Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
Co-authored-by: Aaron Gokaslan <skylion.aaron@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Discussed in #4287
Originally posted by TheShiftedBit October 26, 2022
By default, Python produces errors when encoding `str`s with utf-8 if the `str` contains surrogate characters. This can be disabled by passing `surrogatepass` as a second argument to `.encode()`. Pybind11 has this same behavior with its `str` -> `std::string` conversion. However, the bug is this: if an exception message contains a surrogate character, calling `.what()` on an `error_already_set` with such an exception causes another exception to be thrown, but since `.what()` is `noexcept`, that exception cannot be caught and the program `std::terminate`s.

I'm not sure what the correct behavior regarding surrogate characters is. Perhaps pybind11 should always use `surrogatepass`, perhaps not. However, even if that's not the right choice, it should probably use it during exception handling, or Python exceptions like this are extremely difficult to diagnose.