Lookahead for closing curly quote when encountering opening curly quote #58436

Closed
estebank opened this issue Feb 13, 2019 · 2 comments
Labels
A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix` C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@estebank
Contributor

estebank commented Feb 13, 2019

When the compiler encounters curly quotes, it currently points at the opening one and suggests replacing it with a regular straight quote. It should also look ahead for a closing curly quote and suggest replacing that one too, because fixing only the first one makes the parser treat the rest as an unterminated string.

fn main() {
    println!(“”);
}
error: unknown start of token: \u{201c}
 --> src/main.rs:2:14
  |
2 |     println!(“”);
  |              ^
help: Unicode character '“' (Left Double Quotation Mark) looks like '"' (Quotation Mark), but it is not
  |
2 |     println!("”);
  |              ^

After applying suggestion:

error: unterminated double quote string
 --> src/main.rs:2:14
  |
2 |       println!("”);
  |  ______________^
3 | | }
  | |_^

Ideally

error: unknown start of token: \u{201c}
 --> src/main.rs:2:14
  |
2 |     println!(“”);
  |              ^-
help: Unicode characters '“' (Left Double Quotation Mark) and '”' (Right Double Quotation Mark) look like '"' (Quotation Mark), but they are not
  |
2 |     println!("");
  |              ^^
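
To make the "Ideally" case concrete, here is a minimal sketch of what applying both replacements at once amounts to. It is a standalone illustration, not rustc code: `straighten_curly_pair` is a hypothetical helper name, and it only handles the first curly quote pair on a line.

```rust
/// Hypothetical helper, not rustc code: if `line` contains an opening curly
/// quote followed later by a closing one, return the line with both replaced
/// by straight quotes. Returns `None` when no such pair exists.
fn straighten_curly_pair(line: &str) -> Option<String> {
    // Byte offsets of the opening quote and of the text right after it.
    let open = line.find('\u{201c}')?;
    let after_open = open + '\u{201c}'.len_utf8();
    // Search for the closing quote only after the opening one.
    let close = after_open + line[after_open..].find('\u{201d}')?;
    let after_close = close + '\u{201d}'.len_utf8();

    let mut fixed = String::with_capacity(line.len());
    fixed.push_str(&line[..open]);
    fixed.push('"');
    fixed.push_str(&line[after_open..close]);
    fixed.push('"');
    fixed.push_str(&line[after_close..]);
    Some(fixed)
}
```

Because both quotes are replaced in a single step, the result never passes through the intermediate state that triggers the "unterminated double quote string" error.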
@pmccarter
Contributor

I could maybe do this with some mentoring, if not too difficult or trivial...

@estebank
Contributor Author

estebank commented Feb 13, 2019

@pmccarter this method is the one that supplies the error:

crate fn check_for_substitution<'a>(reader: &StringReader<'a>,
                                    ch: char,
                                    err: &mut DiagnosticBuilder<'a>) -> bool {
    UNICODE_ARRAY
    .iter()
    .find(|&&(c, _, _)| c == ch)
    .map(|&(_, u_name, ascii_char)| {
        let span = Span::new(reader.pos, reader.next_pos, NO_EXPANSION);
        match ASCII_ARRAY.iter().find(|&&(c, _)| c == ascii_char) {
            Some(&(ascii_char, ascii_name)) => {
                let msg =
                    format!("Unicode character '{}' ({}) looks like '{}' ({}), but it is not",
                            ch, u_name, ascii_char, ascii_name);
                err.span_suggestion(
                    span,
                    &msg,
                    ascii_char.to_string(),
                    Applicability::MaybeIncorrect);
                true
            },
            None => {
                let msg = format!("substitution character not found for '{}'", ch);
                reader.sess.span_diagnostic.span_bug_no_panic(span, &msg);
                false
            }
        }
    }).unwrap_or(false)
}

It gets called from the lexer:

unicode_chars::check_for_substitution(self, c, &mut err);

It should be possible to special-case the “ char and keep advancing, using self.nextch() I think, until finding a ”, and if one is found, add that to the diagnostic.
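
The scan described above can be sketched outside the compiler as a plain search over the source text. This is only an illustration under stated assumptions: `find_closing_curly_quote` is a hypothetical name, it works on byte offsets into a `&str` rather than on `StringReader`, and it searches the whole remaining input, where a real implementation might want to stop earlier.

```rust
/// Hypothetical helper, not part of rustc: given the source text and the byte
/// offset of an opening curly quote, return the byte offset of the next
/// closing curly quote, if any.
fn find_closing_curly_quote(src: &str, open_pos: usize) -> Option<usize> {
    const OPEN: char = '\u{201c}';
    const CLOSE: char = '\u{201d}';
    // Bail out unless `open_pos` is a valid offset that points at an
    // opening curly quote.
    if !src.get(open_pos..)?.starts_with(OPEN) {
        return None;
    }
    let after_open = open_pos + OPEN.len_utf8();
    // Scan forward char by char until a closing curly quote turns up.
    src[after_open..]
        .char_indices()
        .find(|&(_, c)| c == CLOSE)
        .map(|(i, _)| after_open + i)
}
```

The returned offset is what a diagnostic would need in order to build a second suggestion span covering the closing quote.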

The Forge has info on setting up the repo and building your own rustc. The workflow is usually running ./x.py test src/test/ui --stage 1 --bless and modifying/creating test files in src/test/ui.

Centril added a commit to Centril/rust that referenced this issue Feb 23, 2019
Special suggestion for illegal unicode curly quote pairs

Fixes rust-lang#58436

Did not end up expanding the error message span to include the full string literal since I figured the start of the token was the issue, while the help suggestion span would include up to the closing quotation mark.

The lookahead logic does not affect the reader position; not sure if that is an issue (e.g. if it should still continue to parse after the closing quote without erroring out).
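
A lookahead that leaves the reader position untouched can be sketched as follows. The `Reader` type here is a hypothetical stand-in with a single `pos` field, not rustc's `StringReader`; the point is only that a method taking `&self` can search ahead without consuming input.

```rust
/// Hypothetical stand-in for a lexer's reader, not rustc's `StringReader`.
struct Reader<'a> {
    src: &'a str,
    /// Byte offset of the next unread char.
    pos: usize,
}

impl<'a> Reader<'a> {
    /// Consume and return the next char, advancing `pos`.
    fn bump(&mut self) -> Option<char> {
        let c = self.src[self.pos..].chars().next()?;
        self.pos += c.len_utf8();
        Some(c)
    }

    /// Look ahead for `target` without consuming anything. It takes `&self`,
    /// so it cannot change `pos`.
    fn lookahead_for(&self, target: char) -> Option<usize> {
        self.src[self.pos..].find(target).map(|i| self.pos + i)
    }
}
```

With this shape, lexing resumes right after the opening quote once the diagnostic is emitted, so the closing curly quote is still seen as its own unknown token unless the caller explicitly advances past it.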