Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand format_args! with more details #14820

Merged
merged 3 commits into from
May 18, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
87 changes: 77 additions & 10 deletions crates/hir-def/src/macro_expansion_tests/builtin_fn_macro.rs
Original file line number Diff line number Diff line change
Expand Up @@ -193,17 +193,17 @@ fn main() {
format_args!("{} {:?}", arg1(a, b, c), arg2);
}
"#,
expect![[r#"
expect![[r##"
#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
$crate::fmt::Arguments::new_v1(&[], &[$crate::fmt::ArgumentV1::new(&(arg1(a, b, c)), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(arg2), $crate::fmt::Display::fmt), ]);
$crate::fmt::Arguments::new_v1(&["", " ", ], &[$crate::fmt::ArgumentV1::new(&(arg1(a, b, c)), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(arg2), $crate::fmt::Debug::fmt), ]);
}
"#]],
"##]],
);
}

Expand All @@ -221,17 +221,84 @@ fn main() {
format_args!("{} {:?}", a::<A,B>(), b);
}
"#,
expect![[r#"
expect![[r##"
#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
$crate::fmt::Arguments::new_v1(&[], &[$crate::fmt::ArgumentV1::new(&(a::<A, B>()), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(b), $crate::fmt::Display::fmt), ]);
$crate::fmt::Arguments::new_v1(&["", " ", ], &[$crate::fmt::ArgumentV1::new(&(a::<A, B>()), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(b), $crate::fmt::Debug::fmt), ]);
}
"#]],
"##]],
);
}

#[test]
fn test_format_args_expand_with_raw_strings() {
check(
r##"
#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
format_args!(
r#"{},mismatch,"{}","{}""#,
location_csv_pat(db, &analysis, vfs, &sm, pat_id),
mismatch.expected.display(db),
mismatch.actual.display(db)
);
}
"##,
expect![[r##"
#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
$crate::fmt::Arguments::new_v1(&[r#""#, r#",mismatch,""#, r#"",""#, r#"""#, ], &[$crate::fmt::ArgumentV1::new(&(location_csv_pat(db, &analysis, vfs, &sm, pat_id)), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(mismatch.expected.display(db)), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(mismatch.actual.display(db)), $crate::fmt::Display::fmt), ]);
}
"##]],
);
}

#[test]
fn test_format_args_expand_eager() {
check(
r#"
#[rustc_builtin_macro]
macro_rules! concat {}

#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
format_args!(concat!("xxx{}y", "{:?}zzz"), 2, b);
}
"#,
expect![[r##"
#[rustc_builtin_macro]
macro_rules! concat {}

#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
($fmt:expr, $($args:tt)*) => ({ /* compiler built-in */ })
}

fn main() {
$crate::fmt::Arguments::new_v1(&["xxx", "y", "zzz", ], &[$crate::fmt::ArgumentV1::new(&(2), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(b), $crate::fmt::Debug::fmt), ]);
}
"##]],
);
}

Expand All @@ -250,7 +317,7 @@ fn main() {
format_args!/*+errors*/("{} {:?}", a.);
}
"#,
expect![[r#"
expect![[r##"
#[rustc_builtin_macro]
macro_rules! format_args {
($fmt:expr) => ({ /* compiler built-in */ });
Expand All @@ -259,10 +326,10 @@ macro_rules! format_args {

fn main() {
let _ =
/* parse error: expected field name or number */
$crate::fmt::Arguments::new_v1(&[], &[$crate::fmt::ArgumentV1::new(&(a.), $crate::fmt::Display::fmt), ]);
/* error: no rule matches input tokens *//* parse error: expected field name or number */
$crate::fmt::Arguments::new_v1(&["", " ", ], &[$crate::fmt::ArgumentV1::new(&(a.), $crate::fmt::Display::fmt), $crate::fmt::ArgumentV1::new(&(), $crate::fmt::Debug::fmt), ]);
}
"#]],
"##]],
);
}

Expand Down
183 changes: 159 additions & 24 deletions crates/hir-expand/src/builtin_fn_macro.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,13 @@
//! Builtin macro

use std::mem;

use ::tt::Ident;
use base_db::{AnchoredPath, Edition, FileId};
use cfg::CfgExpr;
use either::Either;
use mbe::{parse_exprs_with_sep, parse_to_token_tree, TokenMap};
use rustc_hash::FxHashMap;
use syntax::{
ast::{self, AstToken},
SmolStr,
Expand Down Expand Up @@ -90,11 +94,6 @@ register_builtin! {
(module_path, ModulePath) => module_path_expand,
(assert, Assert) => assert_expand,
(stringify, Stringify) => stringify_expand,
(format_args, FormatArgs) => format_args_expand,
(const_format_args, ConstFormatArgs) => format_args_expand,
// format_args_nl only differs in that it adds a newline in the end,
// so we use the same stub expansion for now
(format_args_nl, FormatArgsNl) => format_args_expand,
HKalbasi marked this conversation as resolved.
Show resolved Hide resolved
(llvm_asm, LlvmAsm) => asm_expand,
(asm, Asm) => asm_expand,
(global_asm, GlobalAsm) => global_asm_expand,
Expand All @@ -106,6 +105,9 @@ register_builtin! {
(trace_macros, TraceMacros) => trace_macros_expand,

EAGER:
(format_args, FormatArgs) => format_args_expand,
(const_format_args, ConstFormatArgs) => format_args_expand,
(format_args_nl, FormatArgsNl) => format_args_nl_expand,
(compile_error, CompileError) => compile_error_expand,
(concat, Concat) => concat_expand,
(concat_idents, ConcatIdents) => concat_idents_expand,
Expand Down Expand Up @@ -232,42 +234,175 @@ fn file_expand(
}

fn format_args_expand(
db: &dyn ExpandDatabase,
id: MacroCallId,
tt: &tt::Subtree,
) -> ExpandResult<ExpandedEager> {
format_args_expand_general(db, id, tt, "")
.map(|x| ExpandedEager { subtree: x, included_file: None })
}

fn format_args_nl_expand(
db: &dyn ExpandDatabase,
id: MacroCallId,
tt: &tt::Subtree,
) -> ExpandResult<ExpandedEager> {
format_args_expand_general(db, id, tt, "\\n")
.map(|x| ExpandedEager { subtree: x, included_file: None })
}

fn format_args_expand_general(
_db: &dyn ExpandDatabase,
_id: MacroCallId,
tt: &tt::Subtree,
end_string: &str,
) -> ExpandResult<tt::Subtree> {
// We expand `format_args!("", a1, a2)` to
// ```
// $crate::fmt::Arguments::new_v1(&[], &[
// $crate::fmt::ArgumentV1::new(&arg1,$crate::fmt::Display::fmt),
// $crate::fmt::ArgumentV1::new(&arg2,$crate::fmt::Display::fmt),
// ])
// ```,
// which is still not really correct, but close enough for now
let mut args = parse_exprs_with_sep(tt, ',');
let args = parse_exprs_with_sep(tt, ',');

let expand_error =
ExpandResult::new(tt::Subtree::empty(), mbe::ExpandError::NoMatchingRule.into());

if args.is_empty() {
return ExpandResult::new(tt::Subtree::empty(), mbe::ExpandError::NoMatchingRule.into());
return expand_error;
}
for arg in &mut args {
let mut key_args = FxHashMap::default();
let mut args = args.into_iter().filter_map(|mut arg| {
// Remove `key =`.
if matches!(arg.token_trees.get(1), Some(tt::TokenTree::Leaf(tt::Leaf::Punct(p))) if p.char == '=')
{
// but not with `==`
if !matches!(arg.token_trees.get(2), Some(tt::TokenTree::Leaf(tt::Leaf::Punct(p))) if p.char == '=' )
if !matches!(arg.token_trees.get(2), Some(tt::TokenTree::Leaf(tt::Leaf::Punct(p))) if p.char == '=')
{
arg.token_trees.drain(..2);
let key = arg.token_trees.drain(..2).next().unwrap();
key_args.insert(key.to_string(), arg);
return None;
}
}
Some(arg)
}).collect::<Vec<_>>().into_iter();
// ^^^^^^^ we need this collect, to enforce the side effect of the filter_map closure (building the `key_args`)
let format_subtree = args.next().unwrap();
let format_string = (|| {
let token_tree = format_subtree.token_trees.get(0)?;
match token_tree {
tt::TokenTree::Leaf(l) => match l {
tt::Leaf::Literal(l) => {
if let Some(mut text) = l.text.strip_prefix('r') {
let mut raw_sharps = String::new();
while let Some(t) = text.strip_prefix('#') {
text = t;
raw_sharps.push('#');
}
text =
text.strip_suffix(&raw_sharps)?.strip_prefix('"')?.strip_suffix('"')?;
Some((text, l.span, Some(raw_sharps)))
} else {
let text = l.text.strip_prefix('"')?.strip_suffix('"')?;
let span = l.span;
Some((text, span, None))
}
}
_ => None,
},
tt::TokenTree::Subtree(_) => None,
}
})();
let Some((format_string, _format_string_span, raw_sharps)) = format_string else {
return expand_error;
};
let mut format_iter = format_string.chars().peekable();
let mut parts = vec![];
let mut last_part = String::new();
let mut arg_tts = vec![];
let mut err = None;
while let Some(c) = format_iter.next() {
// Parsing the format string. See https://doc.rust-lang.org/std/fmt/index.html#syntax for the grammar and more info
match c {
'{' => {
if format_iter.peek() == Some(&'{') {
format_iter.next();
last_part.push('{');
continue;
}
let mut argument = String::new();
while ![Some(&'}'), Some(&':')].contains(&format_iter.peek()) {
argument.push(match format_iter.next() {
Some(c) => c,
None => return expand_error,
});
}
let format_spec = match format_iter.next().unwrap() {
'}' => "".to_owned(),
':' => {
let mut s = String::new();
while let Some(c) = format_iter.next() {
if c == '}' {
break;
}
s.push(c);
}
s
}
_ => unreachable!(),
};
parts.push(mem::take(&mut last_part));
let arg_tree = if argument.is_empty() {
match args.next() {
Some(x) => x,
None => {
err = Some(mbe::ExpandError::NoMatchingRule.into());
tt::Subtree::empty()
}
}
} else if let Some(tree) = key_args.get(&argument) {
tree.clone()
} else {
// FIXME: we should pick the related substring of the `_format_string_span` as the span. You
// can use `.char_indices()` instead of `.char()` for `format_iter` to find the substring interval.
let ident = Ident::new(argument, tt::TokenId::unspecified());
quote!(#ident)
};
let formatter = match &*format_spec {
"?" => quote!(#DOLLAR_CRATE::fmt::Debug::fmt),
"" => quote!(#DOLLAR_CRATE::fmt::Display::fmt),
_ => {
// FIXME: implement the rest and return expand error here
quote!(#DOLLAR_CRATE::fmt::Display::fmt)
}
};
arg_tts.push(
quote! { #DOLLAR_CRATE::fmt::ArgumentV1::new(&(#arg_tree), #formatter), },
);
}
'}' => {
if format_iter.peek() == Some(&'}') {
format_iter.next();
last_part.push('}');
} else {
return expand_error;
}
}
_ => last_part.push(c),
}
}
last_part += end_string;
if !last_part.is_empty() {
parts.push(last_part);
}
let _format_string = args.remove(0);
let arg_tts = args.into_iter().flat_map(|arg| {
quote! { #DOLLAR_CRATE::fmt::ArgumentV1::new(&(#arg), #DOLLAR_CRATE::fmt::Display::fmt), }
}.token_trees);
let part_tts = parts.into_iter().map(|x| {
let text = if let Some(raw) = &raw_sharps {
format!("r{raw}\"{}\"{raw}", x).into()
} else {
format!("\"{}\"", x).into()
};
let l = tt::Literal { span: tt::TokenId::unspecified(), text };
quote!(#l ,)
});
let arg_tts = arg_tts.into_iter().flat_map(|arg| arg.token_trees);
let expanded = quote! {
#DOLLAR_CRATE::fmt::Arguments::new_v1(&[], &[##arg_tts])
#DOLLAR_CRATE::fmt::Arguments::new_v1(&[##part_tts], &[##arg_tts])
};
ExpandResult::ok(expanded)
ExpandResult { value: expanded, err }
}

fn asm_expand(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -171,5 +171,5 @@
<span class="macro">assert</span><span class="macro_bang">!</span><span class="parenthesis macro">(</span><span class="bool_literal macro">true</span><span class="comma macro">,</span> <span class="string_literal macro">"</span><span class="format_specifier">{</span><span class="format_specifier">}</span><span class="string_literal macro"> asdasd"</span><span class="comma macro">,</span> <span class="numeric_literal macro">1</span><span class="parenthesis macro">)</span><span class="semicolon">;</span>
<span class="macro">toho</span><span class="macro_bang">!</span><span class="parenthesis macro">(</span><span class="string_literal macro">"</span><span class="format_specifier">{</span><span class="format_specifier">}</span><span class="string_literal macro">fmt"</span><span class="comma macro">,</span> <span class="numeric_literal macro">0</span><span class="parenthesis macro">)</span><span class="semicolon">;</span>
<span class="macro unsafe">asm</span><span class="macro_bang">!</span><span class="parenthesis macro">(</span><span class="string_literal macro">"mov eax, </span><span class="format_specifier">{</span><span class="numeric_literal">0</span><span class="format_specifier">}</span><span class="string_literal macro">"</span><span class="parenthesis macro">)</span><span class="semicolon">;</span>
<span class="macro">format_args</span><span class="macro_bang">!</span><span class="parenthesis macro">(</span><span class="none macro">concat</span><span class="punctuation macro">!</span><span class="parenthesis macro">(</span><span class="string_literal macro">"</span><span class="format_specifier">{</span><span class="format_specifier">}</span><span class="string_literal macro">"</span><span class="parenthesis macro">)</span><span class="comma macro">,</span> <span class="string_literal macro">"{}"</span><span class="parenthesis macro">)</span><span class="semicolon">;</span>
<span class="macro">format_args</span><span class="macro_bang">!</span><span class="parenthesis macro">(</span><span class="none macro">concat</span><span class="punctuation macro">!</span><span class="parenthesis macro">(</span><span class="string_literal macro">"</span><span class="format_specifier">{</span><span class="format_specifier">}</span><span class="string_literal macro">"</span><span class="parenthesis macro">)</span><span class="comma macro">,</span> <span class="string_literal macro">"</span><span class="format_specifier">{</span><span class="format_specifier">}</span><span class="string_literal macro">"</span><span class="parenthesis macro">)</span><span class="semicolon">;</span>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test regressed, but I can't see why. The recursive expansion of format_args!(concat!("{}"), "{}") is correct but it still highlights the {} in the second argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like we aren't mapping down the argument into the format_args expansion anymore, resulting in the highlight. Our current logic for checking if a string is a format_args template argument is just checking if its in a token tree of a format_args (or similar) named macro call,

pub fn is_format_string(string: &ast::String) -> bool {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strange thing is that if I remove the concat! it will work correctly (so mapping down is working in at least some cases), and that concat! should not play any role here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, maybe our token mapping infra is not handling eager macros properly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's probably something related to eager macros. In this code:

let a = 2;
format_args!("{} {} {:?}", a, concat!("{}", "xxx"), a);

go to definition on first a works but on second a doesn't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a deep issue with eager macro expansion, and fixing it is out of scope for this PR. Do you consider this a blocker?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on whether this affects more things or not. But if its just caused by having anothereager macro call in the format_args macro it seems fine . Though what exactly do we gain from this change right now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if its just caused by having anothereager macro call in the format_args macro it seems fine.

I think it is not limited to the eager macros. Any argument after a macro argument in any eager macro will lose its span, and this PR is making the format_args! eager (we can keep format_args! lazy but it has its own problems: false positive expansion diagnostic when someone construct the first argument with macros).

Though what exactly do we gain from this change right now?

For users, a diagnostics when there are less arguments provided in a format_args! based macro + correct and meaningful inline-macro and Expand macro recursively output + more precise closure captures in case of #11260 which theoretically may cause false positives for mir based diagnostics. For me, I can panic-debugging the mir interpreter, seeing the variable values by panicking them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aight, that's a lot of benefits from this. I think it's fine to merge then, but do create an issue tracking this eager expansion problem once merged please (once bors is alive again).

<span class="brace">}</span></code></pre>