Enhance error messages of conversion errors #1212
Thanks very much for this PR, very pleased to see it! I'll do my best to find time to review it tomorrow evening.
So... what is the motivation?
I think for users of libraries this can be very nice when figuring out what went wrong. C++'s I'm afraid I didn't find time to do any reviewing today, will do my best to find time asap.
But I think users don't know argument names...
Go look at the issue ticket #1055
I entirely disagree. This is one of the first features that I noticed was missing when I started using PyO3. It's extremely annoying to debug your code when your function calls are throwing type errors, but they don't tell you what the cause is. Like, imagine if you have some code that takes function arguments from the user:

    some_args = get_from_cli()
    pyo3_function(**some_args)
    # raises TypeError: str cannot be interpreted as int

That doesn't tell the user which one of their arguments has an error. There's lots of crazy stuff you can do in python, and there are lots of situations in which you may have a function with a lot of arguments, or a few arguments with the same type. TBH even if I have 2 arguments like:

    def print_range(lo: int, hi: int):
        ...

    lo, hi = do_some_stuff()
    print_range(lo, hi)
    # raises TypeError: str cannot be interpreted as int

It's annoying to have to go back and debug your code just to figure out which argument has the wrong type, when the library really has access to that information already, but refuses to tell you. Just as you may look at this code:

    def foo(bar, **kwargs):
        pass

    foo(1, bar=10)
    # raises TypeError: foo() got multiple values for argument 'bar'

and say, "well the error message really doesn't need to include the name of the function or the name of the argument because it's obvious from the code," well, there's a reason why the python developers chose to include all of that information in the error message: people will do crazy things, context will get lost, and it helps if your error messages tell you the complete reason for an error. Rant over. For now...
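The ambiguity described above can be reproduced in plain Python. This is a minimal sketch, not PyO3 code: `operator.index` stands in for a strict integer conversion, and `error_for` is a hypothetical helper used only to capture the message.

```python
import operator


def print_range(lo, hi):
    # operator.index rejects anything that is not integer-like,
    # similar in spirit to a strict conversion at a function boundary.
    lo_i = operator.index(lo)
    hi_i = operator.index(hi)
    return list(range(lo_i, hi_i))


def error_for(*args):
    """Return the TypeError message raised for the given arguments, if any."""
    try:
        print_range(*args)
    except TypeError as exc:
        return str(exc)
    return None


msg_bad_lo = error_for("1", 5)  # first argument has the wrong type
msg_bad_hi = error_for(1, "5")  # second argument has the wrong type

# The two messages are identical, so the text alone cannot tell the
# caller which argument was at fault.
print(msg_bad_lo == msg_bad_hi)
```

Both calls fail with the same text, which is the situation the comment complains about.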
Yes, although there are a few things I don't like about Also I don't think @davidhewitt BTW what would be a good file to add unit tests in? I'm also not sure how the And lastly, is it normal for some of the tests to fail? Even the master branch had a few failing tests when I ran them locally after cloning.
OK, thank you for the explanation. But I feel
I recommend
No. The failure is because you changed the error message.
Oh I see. I thought you were objecting to including the argument name in error messages. I also agree that the error message of conversion errors isn't ideal, but that part of the message already exists on the master, it's not part of my changes. I think again, the main problem is that the error message includes the value of the argument instead of the type. If I were to write the error, I would model it off of the format of the builtin python errors which afaik only use object types:

    None.foo
    # AttributeError: 'NoneType' object has no attribute 'foo'
    "foo" + 1
    # TypeError: can only concatenate str (not "int") to str
    1 + "foo"
    # TypeError: unsupported operand type(s) for +: 'int' and 'str'
    "".join([1, 2])
    # TypeError: sequence item 0: expected str instance, int found

So I'd probably go with something like
Well as I said, even when building the master branch locally (before I made any changes), I got 2 errors.
Oops, I'm sorry I misunderstood it!
Yeah, I agree to use the type name instead.
I know some errors happen with pyenv, but if that's not the cause, please file a separate issue with the error messages.
Many thanks again for implementing this, and sorry for my delay in reviewing. Finally had a chance to think about this properly!
I'm definitely persuaded that using the type name is better than using the repr. I was thinking about this further and it's easy enough for debugging to print the problematic value. We should change this in PyDowncastError.
It seems we have two different arrangements for the argument name already proposed:
- TypeError: argument 'foo': 'str' object cannot be converted to 'PyTuple'
- TypeError: 'str' object cannot be converted to 'PyTuple' (for argument 'foo')
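For comparison, the two arrangements can be sketched in Python. The helper names (`conversion_message`, `prefix_style`, `suffix_style`) are hypothetical and only illustrate the formatting; the real messages are generated on the Rust side.

```python
def conversion_message(obj, target):
    # Report the type name rather than repr(obj), as agreed above.
    return f"'{type(obj).__name__}' object cannot be converted to '{target}'"


def prefix_style(arg_name, obj, target):
    # Arrangement 1: argument name first, separated by a colon.
    return f"argument '{arg_name}': {conversion_message(obj, target)}"


def suffix_style(arg_name, obj, target):
    # Arrangement 2: argument name appended in parentheses.
    return f"{conversion_message(obj, target)} (for argument '{arg_name}')"


print(prefix_style("foo", "abc", "PyTuple"))
print(suffix_style("foo", "abc", "PyTuple"))
```

Because only `type(obj).__name__` is used, the message stays short even when the offending value has a very long repr.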
Thinking about this further, I also quite like the idea of using Python exception chaining to add context. I think if we did this right, the effect could be something like the below:
TypeError: 'str' object cannot be converted to 'PyTuple'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
...
ValueError: invalid value for argument 'foo'
Which style does everyone prefer?
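For reference, this is roughly how the chained output would be produced from pure Python. It is a sketch under assumptions: `convert_to_tuple` and `call_with_argument` are hypothetical stand-ins, and nothing here reflects how PyO3 would set the cause from Rust.

```python
import traceback


def convert_to_tuple(value):
    # Stand-in for a failing conversion.
    if not isinstance(value, tuple):
        raise TypeError(
            f"'{type(value).__name__}' object cannot be converted to 'PyTuple'"
        )
    return value


def call_with_argument(name, value):
    try:
        return convert_to_tuple(value)
    except TypeError as err:
        # "raise ... from err" sets __cause__, which is what makes the
        # traceback print the "direct cause" line between the two errors.
        raise ValueError(f"invalid value for argument '{name}'") from err


try:
    call_with_argument("foo", "abc")
except ValueError as exc:
    chained = exc
    report = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )

print(report)
```

Note that a caller now has to catch `ValueError` and inspect `__cause__` to reach the original `TypeError`.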
I thought about the exception chaining before, but I decided that it's uglier. Having a chain of exceptions kind of implies that there are multiple pieces of code that are failing, but really there is only one: the argument conversion. It feels a bit more like an implementation detail that the argument name isn't available further down the chain, and therefore needs to be injected into the error message later. So, I prefer just modifying the error message.
I agree, @Askaholic. I find the chained message confusing.
👍 happy to agree with that. I guess the choice is then between the argument name at the front, or at the end. I am personally not fond of multiple colons in exception messages, which is why I suggested adding
I prefer the multiple colon way, even though it's kind of weird having multiple layers of colons. It's almost like a replacement for the error chaining where it gives you the same context as the error chain, but more condensed. It also exists in some native python error messages like the one I posted earlier:

    "".join([1, 2])
    # TypeError: sequence item 0: expected str instance, int found

This brings up another interesting point, which is that in order for the approach we're taking here (catch an internal error and reformat like
👍 I am persuaded that the colon way is the right style!
Agreed. Feel free to amend the change the existing message for Line 493 in 95cebd8
Yea, I wasn't sure if I should do that in this PR or not, but I totally can. I'll just have to expand the scope a bit and rename it.
Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.
I'm running into a bit of a tricky situation whenever

    >>> e = UnicodeEncodeError('utf-8', '\ud800', 0, 1, 'surrogates not allowed')
    >>> str(e)
    "'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed"

I see 3 possible solutions:
I'm not really sure what the best choice is.
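The snippet above can be explored a little further in plain Python. The thread does not spell out the full concern, so treat the comments below as one possible reading: some built-in exceptions render their message from several structured fields, and therefore cannot simply be rebuilt from a single, pre-formatted string.

```python
e = UnicodeEncodeError("utf-8", "\ud800", 0, 1, "surrogates not allowed")

# The text returned by str(e) is rendered from five separate fields.
fields = (e.encoding, e.object, e.start, e.end, e.reason)

# Assumption for illustration: a naive "prefix the message and re-raise the
# same type" approach would try to construct the exception from one string.
prefixed = "argument 'foo': " + str(e)
try:
    UnicodeEncodeError(prefixed)
    rebuilt = True
except TypeError:
    # The constructor insists on all five arguments.
    rebuilt = False

print(rebuilt)
```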
👍 for this
+1 for this. The only other built-in exception commonly used for invalid arguments is
Just thought of another possible solution: Create our own wrapper exception This wrapper exception basically does the same thing as python error chaining, but it displays nicer. The only thing is if the user is trying to handle exceptions within a try-except block, they can't directly catch the underlying error and instead need to catch
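The trade-off mentioned here can be sketched in Python. `ArgumentError` and `wrapped_call` are hypothetical names invented for this illustration; the thread does not name the proposed wrapper type.

```python
import operator


class ArgumentError(Exception):
    """Hypothetical wrapper that carries the argument name and the inner error."""

    def __init__(self, arg_name, inner):
        super().__init__(f"argument '{arg_name}': {inner}")
        self.arg_name = arg_name
        self.inner = inner


def wrapped_call(value):
    try:
        return operator.index(value)
    except Exception as err:
        # Every conversion failure is re-raised as the wrapper type.
        raise ArgumentError("foo", err) from None


caught_by = None
inner_type = None
try:
    try:
        wrapped_call("1")
    except TypeError:
        # Not reached: the wrapper does not derive from TypeError.
        caught_by = "TypeError"
except ArgumentError as exc:
    caught_by = "ArgumentError"
    inner_type = type(exc.inner).__name__

print(caught_by, inner_type)
```

Code that previously used `except TypeError` would silently stop catching the failure, which is the downside raised in the comment.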
Have to say I'm 👎 on creating a custom exception type here. I know we've got a couple already in the codebase (and I kind of wish we didn't, but they're necessary for other reasons). It's very awkward for Python users to be able to import these exceptions in Python (see #805). And being unable to import them makes it very difficult for Python users to figure out even what super type they have that they can catch. Having used So in summary I'm in favour of keeping the design here as simple as possible. (If we were to create a new argument type, I think the correct design is to release a
Alright, I'll stick with only adding the argument name to TypeErrors.
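A minimal Python sketch of that decision, assuming a hypothetical `convert_argument` helper (the actual change lives in PyO3's generated Rust code): only `TypeError` gets the argument-name prefix, and every other exception type propagates unchanged.

```python
import operator


def convert_argument(name, value, converter):
    try:
        return converter(value)
    except TypeError as err:
        # Same exception type, message prefixed with the argument name.
        raise TypeError(f"argument '{name}': {err}") from None
    # Anything that is not a TypeError is deliberately left untouched.


def outcome(name, value, converter):
    """Return (exception type name, message) for a failing conversion."""
    try:
        convert_argument(name, value, converter)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


type_err = outcome("lo", "1", operator.index)  # TypeError, prefixed
value_err = outcome("lo", "abc", int)          # ValueError, unchanged

print(type_err)
print(value_err)
```

Since the exception type is preserved, existing `except TypeError` handlers keep working.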
Great, this LGTM! Just needs an `## Added` entry in the changelog please.
I see this is still in draft; anything else you want to do to this PR before it's merged?
.map(|s| format!("{} ({})", s.to_string_lossy(), type_name))
.unwrap_or_else(|_| type_name.to_string());
let err_msg = format!("Can't convert {} to {}", from, #error_names);
let err_msg = format!("'{}' object cannot be converted to '{}'", type_name, #error_names);
Nice 👍
Thanks. I think it's now nicely shaped up, too.
if !err.matches($py, $py.get_type::<pyo3::exceptions::$err>()) {
    panic!("Expected {} but got {:?}", stringify!($err), err)
}
err
}};
($py:expr, $val:ident, $code:expr, $err:ident, $err_msg:expr) => {{
Thank you for this 👍
Yea last bullet would be:
Anything I should add there?
Ah yes good memory. I would suggest just doing a quick search of the repository for the old error message "Can't convert" and replace all uses in docs with the equivalent new error message. That should be enough 👍 Thanks again for this PR!
I can't find any more references to the old error message. I don't think it was included in the docs/guide anywhere so I guess this is ready. It would be awesome if you could add a "hacktoberfest-accepted" label so this will count towards my hacktoberfest PRs. https://hacktoberfest.digitalocean.com/details#details
Perfect. Thanks very much again, will definitely add the label!
Adds the name of arguments that generate conversion errors as mentioned in bullet 1 of #1055 and changes the message of PyDowncastError to match more closely with existing python errors. I'm very new to procedural macros so this is likely not the best way to go about things. Here's what it looks like now:
However I did notice an oddity. When converting to an integer type it doesn't seem to trigger the same pyo3 code and instead generates a different error from within python.
I looked for possible sources of the error and it might be coming from here:
https://github.com/python/cpython/blob/fb0a4651f1be4ad936f8277478f73f262d8eeb72/Objects/abstract.c#L1265
But there are actually a number of different places where this same text is present in an error string.