Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Learn to parse a as usize < b #42578

Merged
merged 5 commits into from
Jun 16, 2017
Merged

Learn to parse a as usize < b #42578

merged 5 commits into from
Jun 16, 2017

Conversation

estebank
Copy link
Contributor

Parsing a as usize > b always works, but a as usize < b was a
parsing error because the parser would think the < started a generic
type argument for usize. The parser now attempts to parse as before,
and if a DiagnosticError is returned, try to parse again as a type with
no generic arguments. If this fails, return the original
DiagnosticError.

Fix #22644.

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

};
let path = TyKind::Path(None, path);
let span = lo.to(self.prev_span);
let rhs = P(Ty { node: path, span: span, id: ast::DUMMY_NODE_ID });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use mk _ty here? Or is the DUMMY_NODE_ID intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mk_ty is part of TyCtxt which isn't available yet. This was extracted from parse_ty_common in order to see if this would work at all. I could modify parse_ty_common for this case, but I felt it would muddy the logic of that method.

@petrochenkov
Copy link
Contributor

petrochenkov commented Jun 10, 2017

I'd be okay with this if the unbounded lookahead and parser state saving/restoring machinery were used only for reporting better diagnostics, recovery and fix-it suggestions, not for actually accepting the code.

@Mark-Simulacrum
Copy link
Member

Looks like there are some test failures in Travis: https://travis-ci.org/rust-lang/rust/jobs/241424157#L1853, though addressing @petrochenkov's concern first seems like a good idea.

@Mark-Simulacrum Mark-Simulacrum added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 11, 2017
@estebank
Copy link
Contributor Author

I'd be okay with this if the unbounded lookahead and parser state saving/restoring machinery were used only for reporting better diagnostics, recovery and fix-it suggestions, not for actually accepting the code.

Should a as usize < b be accepted? I understand the concern, and would heavily lean towards your opinion on all other situations, but in this case in particular it is adding a very specific edge case to the grammar which IMO is a big ergonomic win.

Having said that, what would be your preferred output if this were used only for diagnostics? Something like this?

error: expected one of `!`, `(`, `+`, `,`, `::`, `<`, or `>`, found `;`
 --> <anon>:6:19
  |
6 |     a as usize < b;
  |                   ^ expected one of 7 possible tokens here
  help: may be you want to compare the casted value?
  |     (a as usize) < b;

Parsing `a as usize > b` always works, but `a as usize < b` was a
parsing error because the parser would think the `<` started a generic
type argument for `usize`. The parser now attempts to parse as before,
and if a DiagnosticError is returned, try to parse again as a type with
no generic arguments. If this fails, return the original
`DiagnosticError`.
@petrochenkov
Copy link
Contributor

@estebank

Having said that, what would be your preferred output if this were used only for diagnostics?

A specially crafted error would be the best, IMO.

error: `<` is interpreted as a start of generic arguments for `usize`, not comparison
 --> <anon>:6:19
  |
6 |     a as usize < b;
  |                ^ <some label, not sure what to write here>
  help: if you want to compare the casted value then write
  |     (a as usize) < b;

After this is reported as a non-fatal error, AST is build as if a as usize < b were legal and meant (a as usize) < b, e.g. most likely error recovery is performed.

@petrochenkov
Copy link
Contributor

@estebank

it is adding a very specific edge case to the grammar which IMO is a big ergonomic win.

It's an edge case, but it has large effect on properties of the grammar as a whole.
If syntactic disambiguation between types and expressions is going to be done, it should probably be done in more principled fashion, discussed properly and maybe modeled in something like https://github.com/nikomatsakis/rustypop
BTW, the Const generics RFC is related, first it break the assumption this PR does that if b is not a type, then it's the rhs argument for comparison, second it creates more demand for general disambiguation between types and expressions.

@petrochenkov
Copy link
Contributor

(a as usize) < b

a as (usize) < b works too and is more "local", but arguably more surprising, maybe it can be recommended instead? (I'm not sure)

```
warning: `<` is interpreted as a start of generic arguments for `usize`, not comparison
  --> $DIR/issue-22644.rs:16:33
   |
16 |     println!("{}", a as usize < b);
   |                                 ^ expected one of `!`, `(`, `+`, `,`, `::`, or `>` here
   |
help: if you want to compare the casted value then write
   |     println!("{}", (a as usize) < b);
```
@estebank estebank force-pushed the recover-binop branch 2 times, most recently from e53c7a1 to 803d323 Compare June 12, 2017 07:03
@estebank
Copy link
Contributor Author

estebank commented Jun 12, 2017

It's an edge case, but it has large effect on properties of the grammar as a whole.
If syntactic disambiguation between types and expressions is going to be done, it should probably be done in more principled fashion, discussed properly and maybe modeled in something like https://github.com/nikomatsakis/rustypop

I understand and completely agree.

a as (usize) < b works too and is more "local", but arguably more surprising, maybe it can be recommended instead? (I'm not sure)

I think I prefer the former, as it'd be less surprising for a newcomer.

I got something working which accepts the code as proposed but also emits a warning suggesting a correction.

I should probably change the code that emits the warning before we merge, as right now I am modifying the existing error in place, which is probably not what we want.

This is how it looks like right now:

warning: `<` is interpreted as a start of generic arguments for `usize`, not comparison
  --> $DIR/issue-22644.rs:16:33
   |
16 |     println!("{}", a as usize < b);
   |                               - ^ interpreted as generic argument
   |                               |
   |                               not interpreted as comparison
   |
help: if you want to compare the casted value then write:
   |     println!("{}", (a as usize) < b);

@@ -438,6 +441,8 @@ fn dummy_arg(span: Span) -> Arg {
Arg { ty: P(ty), pat: pat, id: ast::DUMMY_NODE_ID }
}

type RewindPoint = (token::Token, Span, Option<Span>, Span, TokenCursor, Vec<TokenType>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this a struct with field names?

```
warning: `<` is interpreted as a start of generic arguments for `usize`, not a comparison
  --> $DIR/issue-22644.rs:16:33
   |
16 |     println!("{}", a as usize < b);
   |                               - ^ interpreted as generic argument
   |                               |
   |                               not interpreted as comparison
   |
help: if you want to compare the casted value then write:
   |     println!("{}", (a as usize) < b);
```
@nikomatsakis
Copy link
Contributor

As long as we're not changing the grammar, I'm happy if @petrochenkov is happy =)

@nikomatsakis
Copy link
Contributor

r? @petrochenkov

@estebank
Copy link
Contributor Author

estebank commented Jun 12, 2017

@nikomatsakis just to be clear, the code as is accepts it under the assumption that it is correct and only emits a warn diagnostic, so it does technically change the grammar. If this is the only error, it will generate a valid binary. Is it ok to accept this code relying on people to be annoyed enough by warnings to fix their code? I can change the warn to an error and call it a day :)

@@ -1819,7 +1850,8 @@ impl<'a> Parser<'a> {
}

// Parse types, optionally.
let parameters = if self.eat_lt() {
let parameters = if self.is_lt() && !dont_parse_generics {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if !dont_parse_generics && self.eat_lt() {

&& doesnt run second arg when first arg is false.

Copy link
Contributor

@petrochenkov petrochenkov Jun 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
This also makes fn is_lt unnecessary
EDIT: ... and keeps < in the expected set.

@nikomatsakis
Copy link
Contributor

@estebank

the code as is accepts it under the assumption that it is correct and only emits a warn diagnostic

Accepts what exactly? And what do you mean by "as is"? You mean, in the PR, or in master?

@estebank
Copy link
Contributor Author

@nikomatsakis The PR accepts a as b < c as valid code if it can parse b as a type without type arguments, showing a warning, not an error to change the code.

@@ -1800,7 +1829,9 @@ impl<'a> Parser<'a> {
/// - `a::b<T,U>::c<V,W>`
/// - `a::b<T,U>::c(V) -> W`
/// - `a::b<T,U>::c(V)`
pub fn parse_path_segments_without_colons(&mut self) -> PResult<'a, Vec<PathSegment>> {
pub fn parse_path_segments_without_colons(&mut self, dont_parse_generics: bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: dont_parse_generics: bool -> parse_generics: bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ExprKind::Cast(lhs, rhs), ThinVec::new());
}
Err(mut err) => {
// Rewind to before attempting to parse the type with generics, to get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the error handling logic into a separate function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@nikomatsakis
Copy link
Contributor

The PR accepts a as b < c as valid code if it can parse b as a type without type arguments, showing a warning, not an error to change the code.

Huh, I'm confused. After all, you can do let x = foo as Box<Bar>. How do we know that b < c should not be parsed as a type?

@petrochenkov
Copy link
Contributor

r=me with stray note removed.

- generate error instead of warning
- remove `RewindPoint` and just keep a copy of `Parser` to rewind state.
- `dont_parse_generics: bool` -> `parse_generics: bool`
- remove `eat_lt`
- move error handling code to separate method
@estebank
Copy link
Contributor Author

@petrochenkov done

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Jun 15, 2017

📌 Commit ad260ff has been approved by petrochenkov

@nikomatsakis
Copy link
Contributor

@bors r-

@nikomatsakis
Copy link
Contributor

I'd like to know if this is changing the grammar that is accepted. I feel like I still don't understand quite what's happening here. =)

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jun 15, 2017

In particular, do we now compile (successfully) more programs than we did before? If so, what set of programs?

@estebank
Copy link
Contributor Author

@nikomatsakis The first commit of this PR changed the accepted grammar and only emitted a warning. I changed it back to not have any changes in the grammar and emit an error. The parser state changes slightly after finding this case in order to continue parsing successfully the rest of the file (before this PR the rest of the file would not get any diagnostics). No new set of programs will be accepted, the new behaviors are:

  • a more specific error for the case a as b < c
  • continued parsing of the rest of the file after encountering a as b < c

@nikomatsakis
Copy link
Contributor

@estebank perfect =)

@nikomatsakis
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Jun 15, 2017

📌 Commit ad260ff has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Jun 16, 2017

⌛ Testing commit ad260ff with merge 44eeb21...

bors added a commit that referenced this pull request Jun 16, 2017
Learn to parse `a as usize < b`

Parsing `a as usize > b` always works, but `a as usize < b` was a
parsing error because the parser would think the `<` started a generic
type argument for `usize`. The parser now attempts to parse as before,
and if a DiagnosticError is returned, try to parse again as a type with
no generic arguments. If this fails, return the original
`DiagnosticError`.

Fix #22644.
@bors
Copy link
Contributor

bors commented Jun 16, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 44eeb21 to master...

@bors bors merged commit ad260ff into rust-lang:master Jun 16, 2017
kennytm added a commit to kennytm/rust that referenced this pull request Nov 4, 2017
Add a nicer error message for missing  in for loop, fixes rust-lang#40782.

As suggested by @estebank in issue rust-lang#40782, this works in the same way as rust-lang#42578: if the in keyword is missing, we continue parsing the expression and if this works correctly an adapted error message is produced. Otherwise we return the old error.

A specific test case has also been added.
This is my first PR on rust-lang/rust so any feedback is very welcome.
@44100hertz
Copy link

One last thing: << isn't covered here.

error: `<` is interpreted as a start of generic arguments for `u16`, not a comparison                                          
 --> /tmp/angle.rs:4:29                                                                                                        
  |                                                                                                                            
4 |     println!("{}", x as u16 << 4);                                                                                         
  |                    -------- ^^ - interpreted as generic arguments                                                          
  |                    |        |                                                                                              
  |                    |        not interpreted as comparison                                                                  
  |                    help: try comparing the casted value: `(x as u16)`       

@petrochenkov
Copy link
Contributor

@Hertz4
<< is fixed on beta/nightly

error: `<` is interpreted as a start of generic arguments for `u16`, not a shift
 --> src/main.rs:2:29
  |
2 |     println!("{}", x as u16 << 4);
  |                    -------- ^^ - interpreted as generic arguments
  |                    |        |
  |                    |        not interpreted as shift
  |                    help: try shifting the casted value: `(x as u16)`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants