Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Punct::new() docs promise a panic on invalid punctuation character, but no panic actually ever happens making it possible to create an invalid Punct #470

Closed
Veetaha opened this issue Oct 5, 2024 · 3 comments · Fixed by #471

Comments

@Veetaha
Copy link

Veetaha commented Oct 5, 2024

See the docs for the Punct here. They are probably just copied from the original proc_macro::Punct type from the standard toolchain:

The ch argument must be a valid punctuation character permitted by the
language, otherwise the function will panic.

The problem here is that proc_macro2::Punct never actually panics on invalid input. It is defined literally as this:

proc-macro2/src/lib.rs

Lines 828 to 841 in 9c1d3eb

/// Creates a new `Punct` from the given character and spacing.
///
/// The `ch` argument must be a valid punctuation character permitted by the
/// language, otherwise the function will panic.
///
/// The returned `Punct` will have the default span of `Span::call_site()`
/// which can be further configured with the `set_span` method below.
pub fn new(ch: char, spacing: Spacing) -> Self {
Punct {
ch,
spacing,
span: Span::call_site(),
}
}

This allows for creating an invalid punctuation like this: Punct::new('{', Span::call_site()).

Why is this is problem?

It lead me to some unfortunate chain of debugging of the handling of incomplete syntax in Rust Analyzer. RA has yet another bug of its own (rust-lang/rust-analyzer#18244), where RA actually produces invalid proc_macro::Punct because it creates it using some unsafe ABI layout shenanigans. Then proc_macro2 happily converts such an invalid proc_macro::Punct into the proc_macro2::Punct equivalent, and that eventually leads to panic very late down the road (which is the actual pain) when one tries to run this code

Minimal reproduction of the late feedback about invalid punct. It leads to a panic in proc_macro2::TokenStream::extend():

// Code in proc-macro `my_lovely_macro/src/lib.rs` proc-macro crate

use proc_macro2::{Punct, Spacing, TokenStream, TokenTree};

#[proc_macro]
pub fn my_lovely_macro(_: proc_macro::TokenStream) -> proc_macro::TokenStream {
    let punct = Punct::new('{', Spacing::Alone);
    let token_stream2 = TokenStream::from_iter([TokenTree::Punct(punct)]);

    // This innocent conversion panics!
    proc_macro::TokenStream::from(token_stream2)
}
// Code in runtime crate:

my_lovely_macro::my_lovely_macro! {}

If you try to compile this code it panics with:

$ cargo build
   Compiling my-lovely-macro v0.1.0 (/home/veetaha/sandbox/rust/crates/my-lovely-macro)
   Compiling sandbox v0.1.0 (/home/veetaha/sandbox/rust/crates/sandbox)
error: proc macro panicked
 --> crates/sandbox/src/main.rs:2:1
  |
2 | my_lovely_macro::my_lovely_macro! {}
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: message: unsupported character `'{'`

This is because proc_macro::Punct::new() does actually have a panic on invalid characters (see code here).


Huge ❤️ for the proc_macro2/quote/syn ecosystem. It's a heroic design and implementation effort!

@wleslie
Copy link

wleslie commented Oct 16, 2024

It's too bad that the underscore is listed as a punctuation character in the Rust Reference, but is not accepted in Punct::new. I still think this crate is doing the right thing by being consistent with proc_macro::Punct's behavior, but perhaps the documentation should point out the discrepancy with the Reference?

@dtolnay
Copy link
Owner

dtolnay commented Oct 16, 2024

@wleslie do you have a use case where treating _ as Punct instead of Ident is more convenient? Maybe something that needs to handle multiple different kinds of Punct in a consistent manner, among them _? If so, I would accept a PR to make the standard library's proc_macro::Punct::new allow underscore.

@wleslie
Copy link

wleslie commented Oct 16, 2024

No, we don't have any legitimate need to treat underscore as punctuation. But we had preexisting code that did so, which was broken by this change.

Now that I think about it some more, I think that updating the Rust Reference to remove underscore from the list of punctuation characters would clear up the confusion. But maybe there's a subtle distinction I'm missing, where the language sometimes treats underscore as an identifier, and sometimes as punctuation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants