"SyntaxError: invalid or missing encoding declaration" for (admittedly terrible) code that ast parses #1078

Open
DRMacIver opened this issue Dec 26, 2023 · 0 comments
Labels
bug (Something isn't working), parsing (Converting source code into CST nodes)

Comments

@DRMacIver

I can only apologise for this example, but the byte string b"#\x80" is accepted as valid Python by the ast module, compile, etc., yet passing it to libcst gives the following error:

Traceback (most recent call last):
  File "/Users/drmaciver/.pyenv/versions/3.12.0/lib/python3.12/tokenize.py", line 348, in find_cookie
    line_string = line.decode('utf-8')
                  ^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/drmaciver/Projects/shrink-ray/.venv/lib/python3.12/site-packages/libcst/_parser/entrypoints.py", line 109, in parse_module
    result = _parse(
             ^^^^^^^
  File "/Users/drmaciver/Projects/shrink-ray/.venv/lib/python3.12/site-packages/libcst/_parser/entrypoints.py", line 44, in _parse
    encoding, source_str = convert_to_utf8(source, partial=config)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/drmaciver/Projects/shrink-ray/.venv/lib/python3.12/site-packages/libcst/_parser/detect_config.py", line 125, in convert_to_utf8
    _detect_encoding(source)
  File "/Users/drmaciver/Projects/shrink-ray/.venv/lib/python3.12/site-packages/libcst/_parser/detect_config.py", line 49, in _detect_encoding
    return py_tokenize_detect_encoding(BytesIO(source).readline)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/drmaciver/.pyenv/versions/3.12.0/lib/python3.12/tokenize.py", line 389, in detect_encoding
    encoding = find_cookie(first)
               ^^^^^^^^^^^^^^^^^^
  File "/Users/drmaciver/.pyenv/versions/3.12.0/lib/python3.12/tokenize.py", line 353, in find_cookie
    raise SyntaxError(msg)
SyntaxError: invalid or missing encoding declaration
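
For anyone triaging this, the traceback suggests the failure happens inside the standard library's tokenize.detect_encoding, which libcst's _detect_encoding calls with BytesIO(source).readline. Below is a minimal sketch that exercises only that stdlib call, without libcst installed. The helper name detect_encoding_or_error is hypothetical and used only for illustration; it is not part of libcst or the standard library.

```python
import io
import tokenize

# The input from this report: a comment containing a byte that is not valid UTF-8.
SOURCE = b"#\x80"


def detect_encoding_or_error(source: bytes) -> str:
    """Return the detected encoding name, or the SyntaxError message on failure.

    Hypothetical helper: it mirrors the call shown in the traceback,
    tokenize.detect_encoding(BytesIO(source).readline), and nothing more.
    """
    try:
        encoding, _lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg}"
    return encoding


if __name__ == "__main__":
    print(detect_encoding_or_error(SOURCE))
    print(detect_encoding_or_error(b"# plain ascii comment\n"))
```

If the first call reports the same "invalid or missing encoding declaration" message as the traceback above, the error is reproducible without libcst, which would point at the encoding-detection step rather than the CST parser itself. Whether libcst should fall back to a more lenient decode for such input is a separate design question that this sketch does not answer.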
@zsol added the bug and parsing labels on Jan 4, 2024