Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TOML 1.0.0 parser used in latest release 1.1.285 fails to parse line endings of comments #4367

Closed
ClaasRostock opened this issue Dec 21, 2022 · 9 comments
Labels
addressed in next version Issue is fixed and will appear in next published version bug Something isn't working

Comments

@ClaasRostock
Copy link

Note: if you are reporting a wrong signature of a function or a class in the standard library, then the typeshed tracker is better suited for this report: https://github.com/python/typeshed/issues.

Describe the bug
line endings of comments in pyproject.toml lead to parse error.

To Reproduce
use pyproject.toml with following content:

[build-system]
requires = ["lxml", "setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[tool.black]
line-length = 120
target-version = ["py39", "py310"]

[tool.isort]
profile = "black"
extend_skip = ["__init__.py"]

[tool.pyright]
exclude = [
    ".git",
    ".venv",
    ".tox",
    "build",
    "dist",
    "**/__pycache__",
    "./docs/source/conf.py",
    "./src/old_not_used",
    "./src/mvf/predictors/tensorflow/kernel.py",
]
extraPaths = ["./src"]
typeCheckingMode = "basic"
useLibraryCodeForTypes = true
reportMissingParameterType = "error"
reportUnknownParameterType = "warning"
reportUnknownMemberType = "warning"
reportMissingTypeArgument = "error"
reportPropertyTypeMismatch = "error"
reportFunctionMemberAccess = "warning"
reportPrivateUsage = "warning"
reportTypeCommentUsage = "warning"
reportIncompatibleMethodOverride = "warning"
reportIncompatibleVariableOverride = "error"
reportInconsistentConstructor = "error"
reportOverlappingOverload = "warning"
reportUninitializedInstanceVariable = "warning"
reportCallInDefaultInitializer = "warning"
reportUnnecessaryIsInstance = "information"
reportUnnecessaryCast = "warning"
reportUnnecessaryComparison = "warning"
reportUnnecessaryContains = "warning"
reportUnusedCallResult = "warning"
reportUnusedExpression = "warning"
reportMatchNotExhaustive = "warning"
reportShadowedImports = "warning"
reportUntypedFunctionDecorator = "warning"
reportUntypedBaseClass = "error"
reportUntypedNamedTuple = "warning"
# Activate the following rules only locally and temporary, i.e. for a QA session.
# (For server side CI they are considered too strict.)
# reportConstantRedefinition = "warning"
# reportUnnecessaryTypeIgnoreComment = "information"
# reportImportCycles = "warning"
# reportImplicitStringConcatenation = "warning"

run pyright .

Following error returned on console log:

(.venv) PS C:\Dev\mvf> pyright .
No configuration file found.
pyproject.toml file found at C:\Dev\mvf.
Loading pyproject.toml file at C:\Dev\mvf\pyproject.toml
Pyproject file parse attempt 1 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Pyproject file parse attempt 2 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Pyproject file parse attempt 3 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Pyproject file parse attempt 4 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Pyproject file parse attempt 5 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Pyproject file parse attempt 6 error: {"name":"TomlError","fromTOML":true,"wrapped":null,"line":52,"col":81,"pos":1617}
Config file "C:\Dev\mvf\pyproject.toml" could not be parsed. Verify that format is correct.

Expected behavior
Line Endings of Line comments should not be parsed as non-allowed control characters, but removed before the check for control characters is executed.

Screenshots or Code
see excerpts above

VS Code extension or command-line
command-line

Additional context

@erictraut
Copy link
Collaborator

I'm not able to repro the problem you're reporting. Can you confirm that when you paste the above text into a fresh pyproject.toml file that you still see errors?

Is it possible that you have mixed line endings in your file (e.g. CR for some lines and CR/LF in others)?

@erictraut erictraut added the question Further information is requested label Dec 21, 2022
@ClaasRostock
Copy link
Author

Hi Eric,
just exercised it - deleted pyproject.toml, created new file, pasted above listing into it, run pyright again - but unfortunately still the same error.
I use VS Code on Windows for editing the file. execution of pyright in Powershell Terminal.
The error, however, is raised also in our CI workflows on GitHub. I.e. following repo: https://github.com/dnv-opensource/dictIO

@erictraut
Copy link
Collaborator

OK, I was able to repro the problem by converting it from CR to CRLF line endings. The latter is default on Windows, whereas the former is default on Mac and Linux. The problem occurs only if there is a comment in the file.

Yes, this looks like a bug in the TOML parser. I can report the bug to the maintainer, but it looks like it is not very actively maintained, so I suspect it will be a long time before it will be updated.

I think we have a couple of options here:

  1. I can revert the change for this issue, which means that some TOML 1.0 constructs will not be handled correctly.
  2. You can work around the issue by converting your TOML file to use CR line endings rather than CRLF.

Since this affects all Windows users, my inclination is to go with option 1. Thoughts? @smackesey, since you reported the other issue, I'm interested in your thoughts too.

@erictraut
Copy link
Collaborator

It looks like this is a known issue, and it has even been fixed already, but no new release has been published since that time.

@smackesey
Copy link

smackesey commented Dec 21, 2022

Well really both options are quite bad, since they result in broken TOML parsing.

I don't really have a personal dog in the fight since I stopped using the [[array]] syntax anyway, but I will say that the bug before updating the parser was quite nasty since it resulted in silently ignored config instead of throwing an error.

Perhaps the best option would be to temporarily vendor the master of the TOML parser where the issue has been fixed? Or, and I forget if npm/yarn allow this, set the dependency to an unpublished commit from github? Or maybe take another look at the options for 1.0-compliant parsers.

@ClaasRostock
Copy link
Author

  1. I can revert the change for this issue, which means that some TOML 1.0 constructs will not be handled correctly.
  2. You can work around the issue by converting your TOML file to use CR line endings rather than CRLF.

Since this affects all Windows users, my inclination is to go with option 1. Thoughts?

Hi Eric,

either fine for me. Take your time. I have, as a temporary solution on my end, nailed pyright to v1.1.284 in the CI workflows. This will work for me until next release of pyright with then possibly a permanent fix implemented.

Best regards

Claas

@erictraut
Copy link
Collaborator

@smackesey, I don't see a good option here. Vendoring isn't something we generally do because it defeats all vulnerability tracking mechanisms in github. There are no other good TOML 1.0 parsers available currently.

FWIW, I resisted adding support for pyproject.toml for a long time in pyright because I feared it would be a support burden. I still have mixed feelings about it. On the one hand, may pyright users like the ability to use pyproject.toml rather than pyrightconfig.json, but my concerns about support are proving somewhat true.

I'm going to revert the previous change for now.

@erictraut erictraut added bug Something isn't working addressed in next version Issue is fixed and will appear in next published version and removed question Further information is requested labels Dec 21, 2022
@ClaasRostock
Copy link
Author

@erictraut @smackesey @frl000 If there are really no good TOML 1.0 parsers out there despite a community need, we could think about adding support for toml to dictIO . Just 'thinking out loud'; but not impossible. Maybe there is someone out there who would like to contribute?

@erictraut
Copy link
Collaborator

This is included in pyright 1.1.286, which I just published. It will also be included in a future version of pylance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addressed in next version Issue is fixed and will appear in next published version bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants