gh-98401: Invalid escape sequences emits SyntaxWarning #99011
Previous attempt in 2018:
I tested the Python test suite with SyntaxWarning treated as error: it does pass.
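As a rough illustration of what "treated as error" means for a single piece of source text, here is a minimal sketch. The helper name `compiles_cleanly` and the sample strings are made up for this example. The filter escalates every warning category, because Python versions before 3.12 raise DeprecationWarning here rather than SyntaxWarning.

```python
import warnings


def compiles_cleanly(source):
    """Return True if `source` compiles with every warning escalated to an error."""
    with warnings.catch_warnings():
        # Similar in spirit to running Python with -W error.
        warnings.simplefilter("error")
        try:
            compile(source, "<example>", "exec")
        except SyntaxError:
            # The compiler reports an escalated invalid-escape warning
            # as a SyntaxError.
            return False
    return True


# The first source text holds a non-raw "\d+" literal, the second a raw one.
print(compiles_cleanly('pattern = "\\d+"'))
print(compiles_cleanly('pattern = r"\\d+"'))
```

Only the raw-string variant should compile cleanly under this filter.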
This issue mostly hit code defining regular expressions. Example in BeautifulSoup 3.2.2:
Examples of code, re.compile() calls:
Lib/test/test_codecs.py
         for i in range(97, 123):
             b = bytes([i])
             if b not in b'abfnrtvx':
-                with self.assertWarns(DeprecationWarning):
+                with self.assertWarns(SyntaxWarning):
SyntaxWarning is not related to codecs. It should only be emitted by the compiler.
Which warning should be emitted if SyntaxWarning is not the best choice? UnicodeWarning?
Does UnicodeWarning make sense for codecs.escape_decode()?
DeprecationWarning, as for all other deprecated features.
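The outcome of this thread (the codecs functions keep emitting DeprecationWarning) can be checked with a small sketch. The helper `decode_categories` is a made-up name for illustration, and `codecs.escape_decode()` is an undocumented function, so treat this as an experiment rather than a supported API example.

```python
import codecs
import warnings


def decode_categories(func, data):
    """Call a codecs decode function and return the names of the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(data)
    return [w.category.__name__ for w in caught]


# b"\\d" is the two bytes backslash + "d", an invalid escape sequence.
print(decode_categories(codecs.escape_decode, b"\\d"))
print(decode_categories(codecs.unicode_escape_decode, b"\\d"))
```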
Did you remove all pyc files and regenerate frozen modules before tests? Also try to regenerate generated code using the new Python binary as PYTHON_FOR_REGEN.
I ran:
Ok, I updated my PR: Use SyntaxWarning for invalid octal sequence.
A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning, instead of DeprecationWarning. For example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an invalid escape sequence), use raw strings for regular expression: re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will eventually be raised, instead of SyntaxWarning.

Octal escapes with value larger than 0o377 (ex: "\477"), deprecated in Python 3.11, now produce a SyntaxWarning, instead of DeprecationWarning. In a future Python version they will be eventually a SyntaxError.

codecs.escape_decode() and codecs.unicode_escape_decode() are left unchanged: they still emit DeprecationWarning.

* The parser only emits SyntaxWarning for Python 3.12 (feature version), and still emits DeprecationWarning on older Python versions.
* Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and wasm_build.py.
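To see which warning category a given interpreter actually uses, one option is to compile the source text and record the warnings. This is a minimal sketch: `warning_categories` is a hypothetical helper, and the two source strings simply mirror the re.compile() example from the commit message.

```python
import warnings


def warning_categories(source):
    """Compile `source` and return the names of the warning categories emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


# Source text with a non-raw and a raw regular expression literal.
non_raw = 'import re\nre.compile("\\d+\\.\\d+")\n'
raw = 'import re\nre.compile(r"\\d+\\.\\d+")\n'

# Expect SyntaxWarning on Python 3.12+, DeprecationWarning on older versions.
print(warning_categories(non_raw))
# The raw-string version should produce no warning at all.
print(warning_categories(raw))
```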
Hum, I messed up my PR so I squashed two commits and I fixed my PR:
I also mentioned the future conversion to SyntaxError in the doc (What's New / NEWS entries).
@mdickinson @serhiy-storchaka @hugovk: Would you mind reviewing my PR?
Did you remove all pyc files? find -name '*.py[co]' -exec rm -rf '{}' +
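Stale bytecode matters here because these warnings are emitted only when source is compiled, so a module loaded from a cached .pyc file will not warn again. For platforms without `find`, a rough Python equivalent might look like the sketch below. The function name `remove_bytecode` is made up, and unlike the `rm -rf` in the command above it removes regular files only.

```python
from pathlib import Path


def remove_bytecode(root):
    """Delete every *.pyc / *.pyo file below `root` and return how many were removed."""
    removed = 0
    # Materialize the matches first so the tree is not modified while walking it.
    for path in list(Path(root).rglob("*.py[co]")):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed
```

Calling `remove_bytecode(".")` from the top of a checkout would then force the next import or test run to recompile every module.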
Let me try these commands:
The last command fails as expected with:
Oops, test_string_literals didn't work when run with
I ran the test suite with:
Note: I would prefer to run the whole test suite with
Fixes this warning experienced with Python 3.12 (python/cpython#98401, python/cpython#99011):

faa_cs_aan.py:690: SyntaxWarning: invalid escape sequence '\.'
  email_match = re.search('For Inquiries: ([0-9a-z._-]+@[0-9a-z.-]+)\.?$',
Use r-strings for all regular expressions.

Fixes these warnings experienced with Python 3.12 (python/cpython#98401, python/cpython#99011, https://docs.python.org/3/whatsnew/3.12.html#other-language-changes point 2):

run_tests.py:200: SyntaxWarning: invalid escape sequence '\d'
  FINAL_LINE_RE = re.compile('status=(\d+)$')
run_tests.py:441: SyntaxWarning: invalid escape sequence '\*'
  re.match('^\* daemon .+ \*$', line) or line == ''):

Change-Id: I71ddfb1a2ca62654378ae67a99e9aeb4ce7b7394
Reviewed-on: https://chromium-review.googlesource.com/c/crashpad/crashpad/+/6254063
Commit-Queue: Mark Mentovai <mark@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
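The raw-string fix applied in the commits quoted above does not change the regular expression itself, only how the literal is written. A small sketch shows this. The pattern is taken from the warning text, while the sample input lines are invented for illustration.

```python
import re

# Escaped form: valid on every Python version, but harder to read.
escaped = re.compile("status=(\\d+)$")
# Raw-string form, as in the quoted fix.
FINAL_LINE_RE = re.compile(r"status=(\d+)$")

# Both literals describe exactly the same pattern text.
assert escaped.pattern == FINAL_LINE_RE.pattern

# Hypothetical input lines, not taken from the real test output.
match = FINAL_LINE_RE.search("run finished status=0")
print(match.group(1))
print(bool(re.match(r"^\* daemon .+ \*$", "* daemon not running *")))
```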
A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning, instead of DeprecationWarning. For example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an invalid escape sequence), use raw strings for regular expression: re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will eventually be raised, instead of SyntaxWarning.
Octal escapes with value larger than 0o377 (ex: "\477"), deprecated in Python 3.11, now produce a SyntaxWarning, instead of DeprecationWarning. In a future Python version they will be eventually a SyntaxError.
codecs.escape_decode() and codecs.unicode_escape_decode() are left unchanged: they still emit DeprecationWarning.
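The octal case can be probed with a short sketch. `octal_escape_report` is a hypothetical helper, not part of the PR. The categories it reports depend on the interpreter: nothing before Python 3.11, DeprecationWarning on 3.11, and SyntaxWarning from 3.12 on.

```python
import warnings


def octal_escape_report(source):
    """Evaluate a string-literal expression; return its value and the warning category names."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        code = compile(source, "<example>", "eval")
        value = eval(code)
    return value, [w.category.__name__ for w in caught]


# The source text is the literal "\477", an octal escape above 0o377.
value, categories = octal_escape_report('"\\477"')
# In a str literal the escape still maps to the code point 0o477.
print(value == chr(0o477), categories)
```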