-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Fix panic when formatting comments in unary expressions #21501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
see #21410 (comment), but the short summary is that `if` (and likely other) statement formatting code that uses `maybe_parenthesize` checks if the condition has any leading or trailing comments, so if we try to smuggle the comments in as dangling comments, it thinks the expression won't break, so it doesn't add parentheses when formatting a case like this: ```py if ( not # comment a): pass ``` and we end up with a syntax error: ```py if not a: pass ``` There may be some other way around this, but this is why I'm giving up for now. It really feels like we want another CommentPlacement variant or some kind of dangling tag, like Micha mentioned when I met with him.
this is very close, but now I have an extra newline in a few cases
missing the leading comments is causing the whole thing to get indented deep in the if formatting
Co-authored-by: Micha Reiser <micha@reiser.io>
|
...python_formatter/tests/snapshots/format@parentheses__expression_parentheses_comments.py.snap
Show resolved
Hide resolved
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought it through but does this change require preview gating?
I think what will help us to make this decision is if you update your summary and describe how the formatting, specifically the comment placement, changes compared to main.
| .map_or(unary_op.operand.start(), |lparen| lparen.start()); | ||
| if comment.end() < up_to { | ||
| CommentPlacement::leading(unary_op, comment) | ||
| let up_to = operand_start(unary_op, source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the method description to match our new behavior
| .iter() | ||
| .any(|comment| comment.start() < range.start()) | ||
| }); | ||
| if comments.has_leading(operand.as_ref()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assign the leading commnts to a variable to avoid retrieving them twice
| ) { | ||
| OptionalParentheses::Never | ||
| } else if context.comments().has(self.operand.as_ref()) { | ||
| return OptionalParentheses::Never; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to change the logic here too to match the logic for when we insert a hard line break in the unary formatting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean something like this?
if !context.comments().has_leading(self.operand.as_ref())
|| is_expression_parenthesized(
self.operand.as_ref().into(),
context.comments().ranges(),
context.source(),
)
{
return OptionalParentheses::Never;
}I played with a few variations on this and kept running into instabilities. It seems to be working okay without matching the check exactly, like on main.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, more like this:
let parenthesized_operand_range = parenthesized_range(
operand.into(),
item.into(),
comments.ranges(),
f.context().source(),
);
let leading_operand_comments = comments.leading(operand.as_ref());
let has_leading_comments_before_parens = parenthesized_operand_range.is_some_and(|range| {
leading_operand_comments
.iter()
.any(|comment| comment.start() < range.start())
});
if !leading_operand_comments.is_empty()
&& !is_expression_parenthesized(
operand.as_ref().into(),
f.context().comments().ranges(),
f.context().source(),
)
|| has_leading_comments_before_parens
It's important that it exactly mirrors the case when we insert a hard line break in the formatting code because any line break will lead to invalid syntax if the if formatting doesn't add parentheses.
Here's an example where your PR produces invalid syntax:
if (
not
# comment
(a)):
passWe should add more tests that exercise the new leading comment placement (may even be true for the trailing comment placement, are there more combinations that you could test?)
| let operand_start = operand_start(self, context.source()); | ||
| if context | ||
| .comments() | ||
| .dangling(self) | ||
| .iter() | ||
| .any(|comment| comment.end() < operand_start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can simplify this to returning Multiline when there's any dangling comment.
Does this need to take precedence over the Never case when the operand is parenthesized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to work in both orders, at least with our current tests.
I did think a little bit about this. I think it would be a bit difficult to preview-gate this since the panic fix and the new formatting seem intertwined. I verified that taking the formatted output from I will also expand the summary, though! |
operand_start was now used in only one place again, and the intermediate variable in NeedsParentheses was no longer needed
|
Thanks for updating the summary. I think it should be safe to not preview gate this change because:
That means, any ruff formatted code can't contain any comment for which we now preserve the position |
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you
| } | ||
|
|
||
| if needs_line_break(self, context) { | ||
| return OptionalParentheses::Multiline; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can even return Always here because we know it breaks over multiple lines and will need parentheses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, thanks! And thanks for all of your help here!
Co-authored-by: Takayuki Maeda <takoyaki0316@gmail.com>
Summary
This is another attempt at #21410 that fixes #19226.
@MichaReiser helped me get something working in a very helpful pairing session. I pushed one additional commit moving the comments back from leading comments to trailing comments, which I think retains more of the input formatting.
I was inspired by Dylan's PR (#21185) to make one of these tables:
hopefully it helps even though the snippets are much wider here.
The two main differences are (1) that we now retain own-line comments between the unary operator and its operand instead of moving these to leading comments on the operator itself, and (2) that we move end-of-line comments between the operator and operand to dangling end-of-line comments on the operand (the last example in the table).
Test Plan
Existing tests, plus new ones based on the issue. As I noted below, I also ran the output from main on the unary.py file back through this branch to check that we don't reformat code from main. This made me feel a bit better about not preview-gating the changes in this PR.