Skip to content

Add a parse recovery in array type syntax #81404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Add a parse recovery in array type syntax #81404

wants to merge 4 commits into from

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Jan 26, 2021

When user types [x, y] in a type context instead of [x; y] we now try to
recover. In particular this fixes confusing "unmatched angle bracket"
errors in code like

fn main() {
    drop::<[(), 0]>([]);
}

which would previously generate

 --> src/main.rs:2:11
  |
2 |     drop::<[(), 0]>([]);
  |           ^ help: remove extra angle bracket

error: expected one of `;` or `]`, found `,`
 --> src/main.rs:2:15
  |
2 |     drop::<[(), 0]>([]);
  |               ^ expected one of `;` or `]`

With the recovery we now only generate the second error.

Fixes #81097

Note that we only try this recovery in edition 2018 and later. In
earlier editions functions without bodies are now allowed to have named
arguments, so having this recovery is causing problems in cases like

trait Foo {
    fn foo([a, b]: [i32; 2]) {}
}

where the [a, b] part is an error, and assuming ; in place of ,
there makes the error message worse, not better. A test for this is in
ui/issues/issue-50571.rs.

@rust-highfive
Copy link
Contributor

r? @petrochenkov

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 26, 2021
None => TyKind::Slice(elt_ty),
}),
Err(mut err1)
if self.look_ahead(0, |t| t.kind == token::Comma) && size_const.is_none() =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to generalize this a bit more. What other operators should I include here to recover in more cases?

@rust-log-analyzer

This comment has been minimized.

@osa1
Copy link
Contributor Author

osa1 commented Jan 26, 2021

I'm looking into the failing test.

@osa1
Copy link
Contributor Author

osa1 commented Jan 26, 2021

Here's my understanding of why this test is failing. It seems like before 2018 edition we couldn't give names to trait arguments, but with 2018 edition we have to.

Now, the code in the test is this:

trait Foo {
    fn foo([a, b]: [i32; 2]) {}
}

We try to parse this using pre-2018 syntax, so we don't expect to see a pattern in the argument position and we try to parse it as a type.

Unfortunately that interacts badly with the new recovery when parsing array types which happens after seeing the comma after a, causing an extra error.

One hacky way to fix this would be to avoid recovery when edition is <2018.

I don't know how else to fix the new issue or the original issue in #81404 differently yet.

@osa1
Copy link
Contributor Author

osa1 commented Jan 26, 2021

One hacky way to fix this would be to avoid recovery when edition is <2018.

Note that doing this would mean we can only improve the error in #81097 in editions 2018 and above.

@osa1
Copy link
Contributor Author

osa1 commented Jan 26, 2021

I pushed a commit that should hopefully fix the issue. I basically introduce another "recovery" type (RecoverArrayType) and pass an argument to type parsers to whether to recover from [x, y] when expecting a type. I set it to "false" when parsing a trait method with unnamed arguments. (e.g. pre-2018 trait methods)

--> $DIR/issue-81097.rs:2:15
|
LL | drop::<[(), 0]>([]);
| ^ expected one of `;` or `]`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the nightly release we currently also get the ^ expected one of ';' or ']' error, the first error we currently get is misleading though. But I would argue that a little thought easily solves this initial confusion in all cases. I don't want to sound too negative here and the rust community appreciates your effort, but I feel that the overhead of reading your additional code in the parser is not worth the benefit of having the misleading first error message removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that the overhead of reading your additional code in the parser is not worth the benefit of having the misleading first error message removed.

I'd be happy to implement this differently if anyone has any suggestions. I'm quite new here and I don't know the code in detail.

Regarding the extra code, I'd leave that decision to the active maintainers/developers of the parser.

In compiler development this kind of tradeoff exists all the time: you add code for user convenience but it makes maintainers' job more difficult. I don't know how rustc devs decides on what to do in these cases but I'm happy to close the PR if they think this code is not worth it. (I guess in that case the issue should be closed too?)

I made the mistake of using , instead of ; in array types in the past many times, but for me the second message is clear enough and the first one doesn't bother me too much. So I don't feel strongly either way.

@osa1
Copy link
Contributor Author

osa1 commented Jan 28, 2021

One way to make this patch simpler is to assume edition 2018 or newer. If we do that we can remove the new flag and always try to recover. It will cause more verbose errors in editions < 2018 as shown in this comment.

@petrochenkov
Copy link
Contributor

I'm unlikely to get to reviewing this until the next weekend (Feb 7).

@osa1
Copy link
Contributor Author

osa1 commented Feb 9, 2021

(rebased)

Ping @petrochenkov

@petrochenkov
Copy link
Contributor

The PR is in my review queue, but with lower priority than others (because it's diagnostic-related, non-blocking, etc), I should get to it this weekend. (I didn't review last week due to vacation.)

@petrochenkov
Copy link
Contributor

It's hard to me to look at what the parser became these days, so I'm just going to @bors r+ this.

@camelid if you want to have some changes made to this PR feel free to r- and assign to yourself.

@bors
Copy link
Collaborator

bors commented Feb 13, 2021

📌 Commit a900825f5b9383b1ab9102ca608e94bcb6912d97 has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2021
@b-naber
Copy link
Contributor

b-naber commented Feb 13, 2021

It's hard to me to look at what the parser became these days, so I'm just going to @bors r+ this.

Wouldn't it be more rational to treat past changes to the parser as sunk costs and keep making the dicisions you hoped were made by others instead of giving up? This seems to me to ultimately lead to an unmaintenable code base.

@osa1
Copy link
Contributor Author

osa1 commented Feb 13, 2021

Let's maybe wait for @camelid before this gets merged as we discuss how to review it or who to ask for reviews.

r? @camelid

Anyone else know anyone who might be interested in reviewing this? Maybe @estebank?

@petrochenkov
Copy link
Contributor

@b-naber
Yeah, I would certainly do that if I had (much) more time, I'm giving up largely for prioritization reasons.

None => TyKind::Slice(elt_ty),
}),
Err(mut err1)
if recover_array_type == RecoverArrayType::Yes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any case where we wouldn't want to do the lookahead you're doing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not doing lookahead here, do you mean recovery? See the RecoverArrayType documentation for why we don't want to try to recover always. Also documented in the commit message. In short, it makes some error messages in pre-2018 editions worse.

I think that may be fine as we probably don't have a lot of users today that actively write pre-2018 code (and I'd expect those people to use an older version of rustc, without this patch). If you agree that this is fine, then the RecoverArrayType type and extra parameter can be removed.

Comment on lines 410 to 422
match self.parse_anon_const_expr() {
Ok(array_size) => {
err1.emit();
self.expect(&token::CloseDelim(token::Bracket))?;
Ok(TyKind::Array(elt_ty, array_size))
}
Err(mut err2) => {
err2.cancel();
*self = snapshot;
Err(err1)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This... is unfortunate. I understand why you're using it. I'm the one that introduced this pattern into the compiler in the first place but it should be used sparingly. I guess that ship might have sailed, and in that case we should then make this pattern less intrusive. :-/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part do you mean? Snapshotting? Or recovery in general?

Comment on lines +394 to +398
let size_const =
if self.eat(&token::Semi) { Some(self.parse_anon_const_expr()?) } else { None };

match self.expect(&token::CloseDelim(token::Bracket)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you could rewrite this as a single match on self.token.kind with four arms:

  • ;
  • , if following thing could be const (pretty sure there's a method for that
  • ]
  • _ case, where you call expect_any(&[CloseDelim(Bracket), Semi]) purely for the error.

At that point I think you can entirely avoid the snapshotting (but can still do if you find cases where it is worth it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using a 4-way match here. I just implemented this and pushed.

I'm now using struct_span_errs to generate the errors. Good: The error message is now improved! When the recovery successful I now emit "expected ;, found ," with a help, instead of "expected ; or ], found ,". Bad: the error message "expected ... found ..." will probably diverge from the one expected_one_of_not_found in time. I could call expected_one_of_not_found, but I'd have to add an unreachable()! to Ok case of the return value, which I'm not a fan of..

I think we could move the error string generation of expected_one_of_not_found and reuse it.

, if following thing could be const (pretty sure there's a method for that

I looked for this function but couldn't find it. Assuming this function is similar to is_certainly_not_a_block, I'm also not a fan of this idea. The problem is these functions are usually overly conservative and break very easily, as shown in #82051.

It's not clear to me what the problem is with recovering as currently implemented. If it's simply confusing code or something like that, perhaps we can implement a helper function to encapsulate this pattern (save the parser state, restore parser state on error, continue on success).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could call expected_one_of_not_found, but I'd have to add an unreachable()! to Ok case of the return value, which I'm not a fan of..

I think doing that would be reasonable to do.

@estebank
Copy link
Contributor

@bors r-

r? @estebank

I left a bunch of nitpicks, but I think we can clean up the code quite a bit and make it easier to maintain in the future :)

I celebrate the desire to improve the UX by making errors more targeted, as I consider it to be super important for the learnability of the language, but we do need to try and keep the "noise" to reasonable levels. (And I say this as likely the main noise introducer.)

@rust-highfive rust-highfive assigned estebank and unassigned camelid Feb 14, 2021
@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 14, 2021
@osa1 osa1 requested a review from estebank February 14, 2021 06:16
@rust-log-analyzer

This comment has been minimized.

@osa1
Copy link
Contributor Author

osa1 commented Feb 14, 2021

@estebank error message in two of the tests are changed now, could you also review those? First test:

fn main() {
    let x: [isize 3];
}

Diff:

- error: expected one of `!`, `(`, `+`, `::`, `;`, `<`, or `]`, found `3`
+ error: expected one of ';' or ']', found `3`

Second test:

type v = [isize * 3];

Diff:

- error: expected one of `!`, `(`, `+`, `::`, `;`, `<`, or `]`, found `*`
+ error: expected one of ';' or ']', found `*`

I think the new error messages are better (it doesn't make sense to replace * with + in isize * 3, for example), but perhaps I don't know the full Rust type syntax yet..

@rustbot rustbot assigned osa1 and unassigned estebank Feb 14, 2021
@osa1 osa1 removed their assignment Feb 14, 2021
@osa1
Copy link
Contributor Author

osa1 commented Feb 14, 2021

Sorry for the previous comment, it was meant for another issue.

r? @estebank

@estebank
Copy link
Contributor

I think the new error messages are better (it doesn't make sense to replace * with + in isize * 3, for example), but perhaps I don't know the full Rust type syntax yet..

That's the syntax for multiple traits: Trait1 + Trait2 is "equivalent" to a SomeTrait: Trait1 + Trait2.

@osa1
Copy link
Contributor Author

osa1 commented Feb 15, 2021

That's the syntax for multiple traits: Trait1 + Trait2 is "equivalent" to a SomeTrait: Trait1 + Trait2.

Hmm, right. I need to include type operators in the error message when I find something other than ; or ] after the type. Not sure how to best do that though. Manually listing the tokens would not be an option I think as it would diverge from the actual type syntax.

@osa1
Copy link
Contributor Author

osa1 commented Feb 19, 2021

The 4-way arm won't give us follow set of type parser in the error messages so I think I'll have to revert this to previous version.

Too bad I force pushed all my changes ...

osa1 added 2 commits February 20, 2021 12:32
When user types [x, y] in a type context instead of [x; y] we now try to
recover. In particular this fixes confusing "unmatched angle bracket"
errors in code like

    fn main() {
        drop::<[(), 0]>([]);
    }

which would previously generate

     --> src/main.rs:2:11
      |
    2 |     drop::<[(), 0]>([]);
      |           ^ help: remove extra angle bracket

    error: expected one of `;` or `]`, found `,`
     --> src/main.rs:2:15
      |
    2 |     drop::<[(), 0]>([]);
      |               ^ expected one of `;` or `]`

With the recovery we now only generate the second error.

Fixes #81097

Note that we only try this recovery in edition 2018 and later. In
earlier editions functions without bodies are now allowed to have named
arguments, so having this recovery is causing problems in cases like

    trait Foo {
        fn foo([a, b]: [i32; 2]) {}
    }

where the `[a, b]` part is an error, and assuming `;` in place of `,`
there makes the error message worse, not better. A test for this is in
`ui/issues/issue-50571.rs`.
@osa1
Copy link
Contributor Author

osa1 commented Feb 20, 2021

@estebank I'm thinking about how to proceed here. I have a few changes in mind to improve the error message in test removed-syntax-fixed-vec (shown in "changes" tab). But in the meantime could I get feedback on your 4-way match idea? You suggested it here. See my questions below the linked comment.

@crlf0710
Copy link
Member

crlf0710 commented Mar 20, 2021

@osa1 Ping from triage, it seems there's a merge commit within this PR. Would you mind rebasing this PR so that there's no merge commits? Also, you could mark this PR as ready-for-review using rustbot after the rebase. Thanks!

@osa1
Copy link
Contributor Author

osa1 commented Mar 21, 2021

Sorry I have too many in-flight PRs currently.. I'll try to get back to this this week.

@crlf0710
Copy link
Member

crlf0710 commented Apr 9, 2021

@osa1 Ping from triage. Any updates?

@JohnCSimon
Copy link
Member

@osa1 Ping from triage. Just letting you know there's a merge conflict.

@bstrie bstrie added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 12, 2021
@bstrie bstrie added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 2, 2021
@crlf0710
Copy link
Member

@osa1 I'm gonna close this pr due to inactivity. Feel free to reopen or create a new pr when you've got time to work on this again. Thanks!

@crlf0710 crlf0710 closed this Jun 19, 2021
@crlf0710 crlf0710 added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poor diagnostics for [T, N] turbofish instead of [T; N]