-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Pass ParserOptions to the parser
#16220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
What's the motivation for changing the function signature for I've some additional thoughts that are related to this:
Curious to hear @MichaReiser thoughts on this. |
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @dhruvmanila that I wouldn't make ParserOptions generic or add ParserOptions to parse_unchecked_source. It unnecessarily complicates things internally.
We could consider adding a ParseSourceOptions struct which is ParserOptions with the source type prop omitted, but I'm not sure if it's worth exposing the options on parse_unchecked_source. Instead, advanced use cases should use pare_unchecked and then map/unwrap on the call site.
I think it could make sense to expose the ParserOptions to the Lexer, or at least a subset of the options but I think this is something we can tackle separately.
|
I'll leave this one to Micha/Dhruv :-) |
|
The motivation for changing the signature of And the |
The motivation for this is that `parse_unchecked_source` relies on the fact that
it's called with a `PySourceType` to know that it's safe to unwrap
`try_into_module`. If we change the signature from
```rust
pub fn parse_unchecked_source(source: &str, source_type: PySourceType) -> Parsed<ModModule> {
```
directly to
```rust
pub fn parse_unchecked_source(source: &str, options: ParserOptions) -> Parsed<ModModule> {
```
we will lose this safety. Instead, to enforce this in the type system, I want to
have to call `parse_unchecked_source` with a `ParserOptions<KnownSource>`
and *not* with a `ParserOptions<UnknownSource>`.
This is slightly more awkward because I also don't want to propagate the generic
on `ParserOptions` to every `Parser` site, so I introduced the separate
`AsParserOptions` trait so `Parser` could hold a `Box<dyn AsParserOptions>`
instead of needing `SourceType` generics everywhere.
f9977cd to
f486e76
Compare
|
I've reverted the generic stuff and put I also changed
I looked into this briefly, but I don't think the lexer currently has access to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet. I recommend giving @dhruvmanila some time to review the PR too.
| let parsed = parse(source, ParserOptions::from_mode(source_type.as_mode())) | ||
| .expect("Expect source to be valid Python"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We could implement from_source_type (or From<SourceType> for ParserOptions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I was thinking about that too. Should I implement From<Mode> for ParseOptions too? I guess I was thinking we might have multiple from_* methods, but I'm realizing now that they won't actually conflict with each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives me more to put in the separate options.rs file anyway. I thought it looked a bit sparse before 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a love-hate relationship with From and I've been very inconsistent in the past with having explicit from_ methods vs From implementations...
I do like that from_mode is explicit but it also is a bit verbose... 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know what you mean. I already switched to From immediately after commenting, but I'm happy to go back to the explicit versions if you and/or Dhruv prefer. I think I tend toward from_ methods usually, so I can really go either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to you and @dhruvmanila
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think From is fine. In the future, when there are more options and if it turns out a bit difficult to read, we could switch to from_ methods. Another option would be to use a builder pattern with the default impl so that it reads as ParseOptions::default().with_mode(mode).with_python_version(version).
dhruvmanila
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I'd recommend updating the PR description to reflect the current state of the PR for future readers. We can keep the "Details" content as a collapsed section for historical purposes.
|
I forgot to update the |
## Summary This PR builds on the changes in #16220 to pass a target Python version to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which collects version-related syntax errors while parsing. These syntax errors are then turned into `Message`s in ruff (in preview mode). This PR only detects one syntax error (`match` statement before Python 3.10), but it has been pretty quick to extend to several other simple errors (see #16308 for example). ## Test Plan The current tests are CLI tests in the linter crate, but these could be supplemented with inline parser tests after #16357. I also tested the display of these syntax errors in VS Code:   --------- Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
Summary
This is part of the preparation for detecting syntax errors in the parser from #16090. As suggested in this comment, I started working on a
ParseOptionsstruct that could be stored in the parser. For this initial refactor, I only made it hold the existingModeoption, but for syntax errors, we will also need it to have aPythonVersion. For that use case, I'm picturing something like aParseOptions::with_python_versionmethod, so you can extend the current calls to something likeBut I thought it was worth adding
ParseOptionsalone without changing any other behavior first.Most of the diff is just updating call sites taking
Modeto takeParseOptions::from(Mode)or those takingPySourceTypes to takeParseOptions::from(PySourceType). The interesting changes are in the newparser/options.rsfile and smaller parts ofparser/mod.rsandruff_python_parser/src/lib.rs.Outdated implementation details for future reference
NOTE: as the
detailssummary says, this does not correspond to the current implementation, which preserves the original signature ofparse_unchecked_sourceand uses a much simpler representation ofParseOptions. This is preserved just for historical purposes.The
ParserOptionsimplementation is complicated a bit by wanting to preserve theSAFETYcomment inparse_unchecked_source:ruff/crates/ruff_python_parser/src/lib.rs
Lines 289 to 295 in b5cd4f2
The only way I could see to enforce this while passing a
ParserOptionsinstead of aPySourceTypewas by adding a generic type state toParserOptions, so thatparse_unchecked_sourcewould take aParserOptions<KnownSource>constructed byParserOptions::from_source_type, as opposed to theParserOptions<UnknownSource>produced byParserOptions::from_mode.On top of that, I didn't want to propagate the generics from
ParserOptionsto all uses ofParser, so there's another indirection through theAsParserOptionstrait, which is used byParserto hold aBox<dyn AsParserOptions>instead of aParserOptions<S: SourceType>.Finally, other public functions in the parser crate could be updated to take
ParserOptionsif we want, butparse_unchecked_sourceis the variant that ruff and red-knot call where I want to integrate the syntax error checks, so it was my main priority.Test Plan
Existing tests, this should not change any behavior.