Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle block comment before Traits in derive attribute (issue 4984) #5002

Merged
merged 2 commits into from
Oct 6, 2021

Conversation

ytmimi
Copy link
Contributor

@ytmimi ytmimi commented Sep 24, 2021

Resolves #4984

When a block comment comes before the traits of a derive attribute
the string "#[derive(" is included as part of the the pre snippet, and as
a result gets duplicated.

For example:

#[derive(/*Debug, */Clone)] -> #[derive(#[derive(/*Debug, */ Clone)]

Now the string "#[derive(" is trimmed from the start of the pre snippet
before looking for the pre comment.

@calebcartwright
Copy link
Member

Are we sure the issue is specific to derive attributes? Seems to me that the actual issue/root cause is somewhere upstream in the call stack, and we should try to fix the underlying issue instead of working around the side effects that issue causes.

e.g. what happens with:

#[foo(/*Debug, */Clone)]
struct Foo;

@ytmimi
Copy link
Contributor Author

ytmimi commented Sep 27, 2021

@calebcartwright On rustfmt-nightly v1.4.37 the example you gave doesn't get reformatted.

Looking at the rewrite method it's probably because we branch if we're dealing with a derive (line 431)

rustfmt/src/attr.rs

Lines 379 to 494 in cb144c3

impl Rewrite for [ast::Attribute] {
fn rewrite(&self, context: &RewriteContext<'_>, shape: Shape) -> Option<String> {
if self.is_empty() {
return Some(String::new());
}
// The current remaining attributes.
let mut attrs = self;
let mut result = String::new();
// This is not just a simple map because we need to handle doc comments
// (where we take as many doc comment attributes as possible) and possibly
// merging derives into a single attribute.
loop {
if attrs.is_empty() {
return Some(result);
}
// Handle doc comments.
let (doc_comment_len, doc_comment_str) =
rewrite_initial_doc_comments(context, attrs, shape)?;
if doc_comment_len > 0 {
let doc_comment_str = doc_comment_str.expect("doc comments, but no result");
result.push_str(&doc_comment_str);
let missing_span = attrs
.get(doc_comment_len)
.map(|next| mk_sp(attrs[doc_comment_len - 1].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let snippet = context.snippet(missing_span);
let (mla, mlb) = has_newlines_before_after_comment(snippet);
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
let comment = if comment.is_empty() {
format!("\n{}", mlb)
} else {
format!("{}{}\n{}", mla, comment, mlb)
};
result.push_str(&comment);
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[doc_comment_len..];
continue;
}
// Handle derives if we will merge them.
if context.config.merge_derives() && is_derive(&attrs[0]) {
let derives = take_while_with_pred(context, attrs, is_derive);
let derive_str = format_derive(derives, shape, context)?;
result.push_str(&derive_str);
let missing_span = attrs
.get(derives.len())
.map(|next| mk_sp(attrs[derives.len() - 1].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
result.push_str(&comment);
if let Some(next) = attrs.get(derives.len()) {
if next.is_doc_comment() {
let snippet = context.snippet(missing_span);
let (_, mlb) = has_newlines_before_after_comment(snippet);
result.push_str(&mlb);
}
}
result.push('\n');
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[derives.len()..];
continue;
}
// If we get here, then we have a regular attribute, just handle one
// at a time.
let formatted_attr = attrs[0].rewrite(context, shape)?;
result.push_str(&formatted_attr);
let missing_span = attrs
.get(1)
.map(|next| mk_sp(attrs[0].span.hi(), next.span.lo()));
if let Some(missing_span) = missing_span {
let comment = crate::comment::recover_missing_comment_in_span(
missing_span,
shape.with_max_width(context.config),
context,
0,
)?;
result.push_str(&comment);
if let Some(next) = attrs.get(1) {
if next.is_doc_comment() {
let snippet = context.snippet(missing_span);
let (_, mlb) = has_newlines_before_after_comment(snippet);
result.push_str(&mlb);
}
}
result.push('\n');
result.push_str(&shape.indent.to_string(context.config));
}
attrs = &attrs[1..];
}
}
}

I also totally agree that the problem should be fixed higher up in the call stack. I had trouble figuring out where a good place to fix the issue would be though. From my understanding this is what's happening:

[<ast::Attribute] as Rewrite>::rewrite() calls attr::format_derive. attr::format_derive immediately maps each ast::Attribute into a lists::ListItems object and then iterates and flattens the iterator of lists::ListItems into an iterator of lists::ListItem.

Taking a look at the Iterator implementation for lists::ListItems, which is where we construct each lists::ListItem, we use the span to find the pre_snippet, and then we use the pre_snippet to find the pre_comment.

rustfmt/src/lists.rs

Lines 730 to 738 in cb144c3

fn next(&mut self) -> Option<Self::Item> {
self.inner.next().map(|item| {
// Pre-comment
let pre_snippet = self
.snippet_provider
.span_to_snippet(mk_sp(self.prev_span_end, (self.get_lo)(&item)))
.unwrap_or("");
let (pre_comment, pre_comment_style) = extract_pre_comment(pre_snippet);

Given the following:

#[derive(/*Debug, */Clone)]
struct Foo;

The pre_snippet is "#[derive(/*Debug, */" and because the pre_snippet ends with a "*/" lists::extract_pre_comment hits the following code block and just returns the trimmed pre_snippet as the found comment.

rustfmt/src/lists.rs

Lines 591 to 596 in cb144c3

} else {
(
Some(trimmed_pre_snippet.to_owned()),
ListItemCommentStyle::SameLine,
)
}

Let me know if all that makes sense.

Again, I wasn't really sure where the fix should go, but I thought it made the most sense to implement the fix in lists::extract_pre_comment.

@calebcartwright
Copy link
Member

Looking at the rewrite method it's probably because we branch if we're dealing with a derive (line 431)

Ah okay, thanks for confirming!

The pre_snippet is "#[derive(/*Debug, /" and because the pre_snippet ends with a "/" lists::extract_pre_comment hits the following code block and just returns the trimmed pre_snippet as the found comment.

Yeah it sounds like the starting point of the span is at the start of the attribute instead of the paren (. I feel like that's what we need to really fix, as carrying around extra baggage on formatting strings can be a source of other subtle bugs around width limit/wrapping checks.

@ytmimi
Copy link
Contributor Author

ytmimi commented Sep 29, 2021

I removed the old fix where I just stripped "#[derive(" from the start of the snippet and now I've updated the span to start after the opening "("

src/attr.rs Outdated
attr.span.lo(),
// We update derive attribute spans to start after the opening '('
// This helps us focus parsing to just what's inside #[derive(...)]
attr.span.lo() + BytePos("#[derive(".len() as u32),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, this is precisely what I had in mind! This type of offset derivation is so common that we actually have a lot of utilities that make this easy to do, and account for various edge cases and complexities (whitespace, comments, etc.) so we'll want to apply the same idea here, but utilize the helper instead. We can also just look for the opening paren in this case, so that we don't have to worry about anything weird but technically valid like #[ derive ( type scenaro

e.g. (disclaimer, typed inline so may need to be tweaked and/or have the SpanUtils imported from source_map)

Suggested change
attr.span.lo() + BytePos("#[derive(".len() as u32),
context.snippet_provider.span_after(attr.span, "("),

@calebcartwright
Copy link
Member

Realize this and PR 5000 are still pending, but if you have any interest and bandwidth, I think #5009 and #5011 might be decent issues to look at as they're likely caused (at least in part) by the same sort of span/byte position derivation issues as this one

ytmimi added 2 commits October 4, 2021 23:08
Fixes 4984

When parsing derive attributes we're only concerned about the traits
and comments listed between the opening and closing parentheses.

Derive attribute spans currently start at the '#'.

    Span starts here
    |
    v
    #[derive(...)]

After this update the derive spans start after the opening '('.

    Span starts here
             |
             V
    #[derive(...)]
@ytmimi
Copy link
Contributor Author

ytmimi commented Oct 5, 2021

Thanks for pointing out the helper method and for the heads up on bringing SpanUtils into scope! I went ahead and made the change.

I think your hunch about the other issues having a similar cause is probably right. I should have some time to take a crack at it later this week!

@calebcartwright
Copy link
Member

LGTM thanks!

@calebcartwright calebcartwright merged commit 365a2f8 into rust-lang:master Oct 6, 2021
@ytmimi ytmimi deleted the issue_4984 branch October 13, 2021 03:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Syntax error after formatting a derive attribute that contains a block comment
2 participants