-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[pycodestyle] Fix handling of format specs for invalid-escape-sequence (W605)
#19546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[pycodestyle] Fix handling of format specs for invalid-escape-sequence (W605)
#19546
Conversation
|
pycodestyle] Fix handling of format specs for invalid-escape-sequence (W605pycodestyle] Fix handling of format specs for invalid-escape-sequence (W605)
ntBre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks plausible to me in a quick skim. I just had one tiny docs nit and a question about doubled braces.
@MichaReiser or @dylwil3 if either of you remembers much context from #14748, you might be better suited to review this. If not, I'll do a deeper dive here myself.
crates/ruff_linter/src/rules/pycodestyle/rules/invalid_escape_sequence.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/pycodestyle/rules/invalid_escape_sequence.rs
Show resolved
Hide resolved
…sequence.rs Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>
|
Not really. But it makes sense that @dylwil3 takes a look |
dylwil3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this looks good! A few suggestions. Also, can you point me to some documentation around the comment you made about format specifiers being the same whether or not there is a raw prefix on the f-string?
| f"\XFF{a:\XFF}" | ||
| t"\n{a:\XFF}" | ||
| t"\XFF{a:\XFF}" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test for something like this?
rf"xyz{a:\XFF}"| // Range in the source code to perform the analysis on. | ||
| source_range: TextRange, | ||
| flags: AnyStringFlags, | ||
| in_format_spec: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a little clearer at the call site to make this an enum with two variants
| } | ||
| } | ||
|
|
||
| fn make_diagnostics(checker: &Checker, invalid_escape_chars: &[InvalidEscapeChar]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give this a more descriptive name and/or a doc-comment to differentiate it from check? Something to indicate that this function is specifically for adding escapes rather than making strings raw.
I did some more digging, and it looks like there is no documentation on this, see python/cpython#137314 . So looks like the fix might need to behave differently dependent on the python version. |
|
Thank you for the thorough investigation! It looks like there is now an open PR to change the behavior for some subset of Python versions where the change is still allowed to be made (python/cpython#137328). So let's wait to see how that pans out and then revisit this - I've subscribed to that PR so hopefully will be pinged if and when it's merged. |
|
That said - if you are optimistic that the PR will merge in (as I am), then you could make the changes now. It seems like after the PR is merged, the raw prefix will behave as expected except in version 3.12, where your custom logic is required. Update: PR has merged to main, and will be backported to 3.13 and 3.14. There was a leftover question of whether it would also be backported to 3.12, so we'll see if in the end we don't need any version-specific logic after all. |
|
@MeGaGiGaGon sorry for so many pings - it looks like the CPython changes are done and Python 3.12 is the only version that has this strange behavior going forward. Let me know if you need help implementing any of the updates, otherwise feel free to ping me when it's ready for re-review. Thank you! |
Summary
This PR fixes #11491, cleaning up the last of the interpolated string related issues with invalid-escape-sequence (W605). Interpolated string format specs behave very strangely, in that their contents are not affected by the raw-ness of their containing string, unless they are adjacent to a
{or}, in which case they are affected.The fix functions by separating out all of the non-raw-affected escapes into their own vec, since the only option for them is to be escaped via adding another
\. The edge cases of{and}adjacent\s are left with the rest of the invalid escapes.Passing the inside-format-spec-ness via a bool is a bit ugly, but I could not think of a better solution. Ideas welcome.
Test Plan
Added 4 new test cases for the non-
{-}edge cases, and 2 other new test cases for those edge cases.