Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include column offset information in the tokenizer #97997

Closed
lysnikolaou opened this issue Oct 6, 2022 · 0 comments · Fixed by #98000
Closed

Include column offset information in the tokenizer #97997

lysnikolaou opened this issue Oct 6, 2022 · 0 comments · Fixed by #98000
Assignees
Labels
type-feature A feature request or enhancement

Comments

@lysnikolaou
Copy link
Contributor

The tokenizer currently only holds information about the current line number (along with the starting line number for strings that span multiple lines). This makes computing column offset harder, since it has to be done with pointer arithmetic using pointers to the beginning and end of the token (even more complicated when line continuations or multiline tokens happen).

Feature or enhancement

All of this can be significantly simplified, if we keep a column offset counter in the tokenizer state.

@lysnikolaou lysnikolaou added the type-feature A feature request or enhancement label Oct 6, 2022
@lysnikolaou lysnikolaou self-assigned this Oct 6, 2022
lysnikolaou added a commit to lysnikolaou/cpython that referenced this issue Oct 6, 2022
carljm added a commit to carljm/cpython that referenced this issue Oct 8, 2022
* main: (38 commits)
  pythongh-92886: make test_ast pass with -O (assertions off) (pythonGH-98058)
  pythongh-92886: make test_coroutines pass with -O (assertions off) (pythonGH-98060)
  pythongh-57179: Add note on symlinks for os.walk (python#94799)
  pythongh-94808: Fix regex on exotic platforms (python#98036)
  pythongh-90085: Remove vestigial -t and -c timeit options (python#94941)
  pythonGH-83901: Improve Signature.bind error message for missing keyword-only params (python#95347)
  pythongh-61105: Add default param, note on using cookiejar subclass (python#95427)
  pythongh-96288: Add a sentence to `os.mkdir`'s docstring. (python#96271)
  pythongh-96073: fix backticks in NEWS entry (pythonGH-98056)
  pythongh-92886: [clinic.py] raise exception on invalid input instead of assertion (pythonGH-98051)
  pythongh-97997: Add col_offset field to tokenizer and use that for AST nodes (python#98000)
  pythonGH-88968: Reject socket that is already used as a transport (python#98010)
  pythongh-96346: Use double caching for re._compile() (python#96347)
  pythongh-91708: Revert params note in urllib.parse.urlparse table (python#96699)
  pythongh-96265: Fix some formatting in faq/design.rst (python#96924)
  pythongh-73196: Add namespace/scope clarification for inheritance section (python#92840)
  pythongh-97646: Change `.js` and `.mjs` files mimetype to conform to RFC 9239 (python#97934)
  pythongh-97923: Always run Ubuntu SSL tests with others in CI (python#97940)
  pythongh-97956: Mention `generate_global_objects.py` in `AC How-To` (python#97957)
  pythongh-96959: Update HTTP links which are redirected to HTTPS (python#98039)
  ...
mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant