Be more careful about interpreting a label/lifetime as a mistyped char literal. #120460

Merged: 2 commits, Jan 30, 2024

nnethercote
Contributor

Currently the parser interprets any label/lifetime in certain positions as a mistyped char literal, on the assumption that the trailing single quote was accidentally omitted. In such cases it gives an error with a suggestion to add the trailing single quote, and then puts the appropriate char literal into the AST. This behaviour was introduced in #101293.

This is reasonable for a case like this:

let c = 'a;

because `'a'` is a valid char literal. It's less reasonable for a case like this:

let c = 'abc;

because `'abc'` is not a valid char literal.
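
For contrast, here is a small well-formed sketch of the three things such a token can legitimately be: a char literal, a lifetime, or a loop label. The function names `same` and `count_to` are made up for illustration and do not come from the PR.

```rust
// `'abc` used as a lifetime parameter: the function returns its argument.
fn same<'abc>(s: &'abc str) -> &'abc str {
    s
}

// `'abc` used as a loop label: counts iterations until `limit` is reached.
fn count_to(limit: u32) -> u32 {
    let mut n = 0;
    'abc: loop {
        n += 1;
        if n >= limit {
            break 'abc;
        }
    }
    n
}

fn main() {
    // A valid char literal holds exactly one character between the quotes.
    let c = 'a';
    println!("{} {} {}", c, same("hi"), count_to(3));
}
```

A multi-character name such as `'abc` can only be a lifetime or a label, so adding a closing quote to it can never produce a valid char literal.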

Prior to #120329 this could result in some sub-optimal suggestions in error messages, but nothing else. But #120329 changed LitKind::from_token_lit to assume that the char/byte/string literals it receives are valid, and to assert if not. This is reasonable because the lexer does not produce invalid char/byte/string literals in general. But in this "interpret label/lifetime as unclosed char literal" case the parser can produce an invalid char literal with contents such as abc, which triggers an assertion failure.
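
The interaction can be illustrated with a toy model. This is not rustc code: `toy_unescape_char`, `lenient_from_token`, and `strict_from_token` are hypothetical names, escapes are not modelled, and the panic message is only an assumption. The sketch shows how a conversion that assumes valid input turns a previously tolerated bad literal into a hard failure.

```rust
/// Hypothetical stand-in for a char-literal unescaper: succeeds only when
/// `contents` is exactly one character (escape sequences are not modelled).
fn toy_unescape_char(contents: &str) -> Result<char, String> {
    let mut chars = contents.chars();
    match (chars.next(), chars.next()) {
        (Some(c), None) => Ok(c),
        (None, _) => Err("empty char literal".to_string()),
        (Some(_), Some(_)) => Err(format!("more than one char: {:?}", contents)),
    }
}

/// Hypothetical lenient conversion: invalid contents are reported as an error.
fn lenient_from_token(contents: &str) -> Result<char, String> {
    toy_unescape_char(contents)
}

/// Hypothetical strict conversion: assumes the contents are valid and
/// panics if they are not.
fn strict_from_token(contents: &str) -> char {
    toy_unescape_char(contents).expect("failed to unescape char literal")
}

fn main() {
    // Valid contents work with both conversions.
    println!("{:?}", lenient_from_token("a"));
    println!("{:?}", strict_from_token("a"));
    // Invalid contents are an ordinary error for the lenient version;
    // calling strict_from_token("abc") would panic instead.
    println!("{:?}", lenient_from_token("abc"));
}
```

Under this model, a parser that turns `'abc` into a char literal with contents `abc` is harmless with the lenient conversion and fatal with the strict one.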

This PR changes the parser so it's more cautious about interpreting labels/lifetimes as unclosed char literals.

Fixes #120397.

r? @compiler-errors

Because it can be used for a lifetime or a label.
…r literal.

Currently the parser will interpret any label/lifetime in certain
positions as a mistyped char literal, on the assumption that the
trailing single quote was accidentally omitted. This is reasonable for
something like 'a (because 'a' would be valid) but not reasonable for
something like 'abc (because 'abc' is not valid).

This commit restricts this behaviour only to labels/lifetimes that would
be valid char literals, via the new `could_be_unclosed_char_literal`
function. The commit also augments the `label-is-actually-char.rs` test
in a couple of ways:
- Adds testing of labels/lifetimes with identifiers longer than one
  char, e.g. 'abc.
- Adds a new match with simpler patterns, because the
  `recover_unclosed_char` call in `parse_pat_with_range_pat` was not
  being exercised (in this test or any other ui tests).
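
The restriction described above can be sketched on plain strings. This is a standalone approximation, not the rustc implementation: the real function works on the parser's identifier type, while the hypothetical `sketch_could_be_unclosed_char_literal` below takes a `&str` and assumes the name contains no escape sequences.

```rust
/// Returns true if `label` (e.g. "'a") would become a valid char literal
/// when a closing quote is appended, i.e. when exactly one character
/// follows the leading quote. Assumes no escape sequences in the name.
fn sketch_could_be_unclosed_char_literal(label: &str) -> bool {
    match label.strip_prefix('\'') {
        Some(rest) => {
            let mut chars = rest.chars();
            // Exactly one character after the quote.
            chars.next().is_some() && chars.next().is_none()
        }
        // No leading quote: not a label/lifetime at all.
        None => false,
    }
}

fn main() {
    for label in ["'a", "'abc", "'", "a"] {
        println!("{:?} -> {}", label, sketch_could_be_unclosed_char_literal(label));
    }
}
```

With a check like this, only `'a`-style names are recovered as char literals, while `'abc` is left to be reported as a label or lifetime.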

Fixes rust-lang#120397, an assertion failure, which was caused by this behaviour
in the parser interacting with some new stricter char literal checking
added in rust-lang#120329.
@rustbot added the S-waiting-on-review and T-compiler labels Jan 29, 2024
@compiler-errors
Member

@bors r+ rollup

@bors
Contributor

bors commented Jan 29, 2024

📌 Commit 306612e has been approved by compiler-errors

It is now in the queue for this repository.

@bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Jan 29, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 30, 2024
…llaumeGomez

Rollup of 18 pull requests

Successful merges:

 - rust-lang#119123 (Add triagebot mentions entry for simd intrinsics)
 - rust-lang#119991 (Reject infinitely-sized reads from io::Repeat)
 - rust-lang#120172 (bootstrap: add more unit tests)
 - rust-lang#120250 (rustdoc: Prevent JS injection from localStorage)
 - rust-lang#120376 (Update codegen test for LLVM 18)
 - rust-lang#120387 (interpret/memory: fix safety comment for large array memset optimization)
 - rust-lang#120400 (Bound errors span label cleanup)
 - rust-lang#120402 (Make the coroutine def id of an async closure the child of the closure def id)
 - rust-lang#120403 (Add instructions of how to use pre-vendored 'rustc-src')
 - rust-lang#120424 (raw pointer metadata API: data address -> data pointer)
 - rust-lang#120425 (Remove unnecessary unit returns in query declarations)
 - rust-lang#120439 (Move UI issue tests to subdirectories)
 - rust-lang#120443 (Fixes footnote handling in rustdoc)
 - rust-lang#120452 (std: Update documentation of seek_write on Windows)
 - rust-lang#120460 (Be more careful about interpreting a label/lifetime as a mistyped char literal.)
 - rust-lang#120464 (Add matthewjasper to some review groups)
 - rust-lang#120467 (Update books)
 - rust-lang#120488 (Diagnostic lifetimes cleanups)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit c00192a into rust-lang:master Jan 30, 2024
11 checks passed
@rustbot rustbot added this to the 1.77.0 milestone Jan 30, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jan 30, 2024
Rollup merge of rust-lang#120460 - nnethercote:fix-120397, r=compiler-errors

@nnethercote nnethercote deleted the fix-120397 branch January 30, 2024 20:17
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

ICE failed to unescape char literal
4 participants