Skip to content

Commit

Permalink
Prefer expanding parenthesized expressions before operands
Browse files Browse the repository at this point in the history
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:

```python
# We want 
if a + [ 
	b,
	c
]: 
	pass

# Rather than
if (
    a
    + [b, c]
): 
	pass
```

This is implemented by using the new IR elements introduced in #5596. 

* We give the group wrapping the optional parentheses an ID (`parentheses_id`)
* We use `conditional_group` for the lower priority groups  (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary)
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the `parentheses_id` group. We gate the `fits_expand` to only apply if the `parentheses_id` group fits (because we  prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).

We limit using `fits_expanded` and `conditional_group` only to expressions that themselves are not in parentheses (checking the conditions isn't free)

## Test Plan

It increases the Jaccard index for Django from 0.915 to 0.917

## Incompatibilites

There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences). 

### Long string literals
I  commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points. 

I think we should ignore this incompatibility for now


### Expressions on statement level

I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width

```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```

But it would if the expression is used inside of a condition. 

What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.
  • Loading branch information
MichaReiser authored Jul 11, 2023
1 parent d30e912 commit 715250a
Show file tree
Hide file tree
Showing 26 changed files with 683 additions and 946 deletions.
21 changes: 16 additions & 5 deletions crates/ruff_python_formatter/src/builders.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,12 +21,21 @@ pub(crate) struct OptionalParentheses<'a, 'ast> {

impl<'ast> Format<PyFormatContext<'ast>> for OptionalParentheses<'_, 'ast> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'ast>>) -> FormatResult<()> {
group(&format_args![
let saved_level = f.context().node_level();

f.context_mut()
.set_node_level(NodeLevel::ParenthesizedExpression);

let result = group(&format_args![
if_group_breaks(&text("(")),
soft_block_indent(&Arguments::from(&self.inner)),
if_group_breaks(&text(")"))
if_group_breaks(&text(")")),
])
.fmt(f)
.fmt(f);

f.context_mut().set_node_level(saved_level);

result
}
}

Expand Down Expand Up @@ -113,7 +122,9 @@ impl<'fmt, 'ast, 'buf> JoinNodesBuilder<'fmt, 'ast, 'buf> {
0 | 1 => hard_line_break().fmt(self.fmt),
_ => empty_line().fmt(self.fmt),
},
NodeLevel::Expression => hard_line_break().fmt(self.fmt),
NodeLevel::Expression(_) | NodeLevel::ParenthesizedExpression => {
hard_line_break().fmt(self.fmt)
}
}?;
}

Expand Down Expand Up @@ -353,7 +364,7 @@ no_leading_newline = 30"#
// Removes all empty lines
#[test]
fn ranged_builder_parenthesized_level() {
let printed = format_ranged(NodeLevel::Expression);
let printed = format_ranged(NodeLevel::Expression(None));

assert_eq!(
&printed,
Expand Down
4 changes: 3 additions & 1 deletion crates/ruff_python_formatter/src/comments/format.rs
Original file line number Diff line number Diff line change
Expand Up @@ -318,7 +318,9 @@ impl Format<PyFormatContext<'_>> for FormatEmptyLines {
},

// Remove all whitespace in parenthesized expressions
NodeLevel::Expression => write!(f, [hard_line_break()]),
NodeLevel::Expression(_) | NodeLevel::ParenthesizedExpression => {
write!(f, [hard_line_break()])
}
}
}
}
9 changes: 6 additions & 3 deletions crates/ruff_python_formatter/src/context.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
use crate::comments::Comments;
use crate::PyFormatOptions;
use ruff_formatter::{FormatContext, SourceCode};
use ruff_formatter::{FormatContext, GroupId, SourceCode};
use ruff_python_ast::source_code::Locator;
use std::fmt::{Debug, Formatter};

Expand Down Expand Up @@ -78,6 +78,9 @@ pub(crate) enum NodeLevel {
/// (`if`, `while`, `match`, etc.).
CompoundStatement,

/// Formatting nodes that are enclosed in a parenthesized expression.
Expression,
/// The root or any sub-expression.
Expression(Option<GroupId>),

/// Formatting nodes that are enclosed by a parenthesized (any `[]`, `{}` or `()`) expression.
ParenthesizedExpression,
}
216 changes: 0 additions & 216 deletions crates/ruff_python_formatter/src/expression/binary_like.rs

This file was deleted.

55 changes: 11 additions & 44 deletions crates/ruff_python_formatter/src/expression/expr_bin_op.rs
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
use crate::comments::{trailing_comments, trailing_node_comments, Comments};
use crate::expression::binary_like::{BinaryLayout, FormatBinaryLike};
use crate::expression::parentheses::{
default_expression_needs_parentheses, is_expression_parenthesized, NeedsParentheses,
Parenthesize,
default_expression_needs_parentheses, in_parentheses_only_group, is_expression_parenthesized,
NeedsParentheses, Parenthesize,
};
use crate::expression::Parentheses;
use crate::prelude::*;
Expand Down Expand Up @@ -31,24 +30,11 @@ impl FormatRuleWithOptions<ExprBinOp, PyFormatContext<'_>> for FormatExprBinOp {

impl FormatNodeRule<ExprBinOp> for FormatExprBinOp {
fn fmt_fields(&self, item: &ExprBinOp, f: &mut PyFormatter) -> FormatResult<()> {
item.fmt_binary(self.parentheses, f)
}

fn fmt_dangling_comments(&self, _node: &ExprBinOp, _f: &mut PyFormatter) -> FormatResult<()> {
// Handled inside of `fmt_fields`
Ok(())
}
}

impl<'ast> FormatBinaryLike<'ast> for ExprBinOp {
type FormatOperator = FormatOwnedWithRule<Operator, FormatOperator, PyFormatContext<'ast>>;

fn fmt_default(&self, f: &mut PyFormatter<'ast, '_>) -> FormatResult<()> {
let comments = f.context().comments().clone();

let format_inner = format_with(|f: &mut PyFormatter| {
let source = f.context().contents();
let binary_chain: SmallVec<[&ExprBinOp; 4]> = iter::successors(Some(self), |parent| {
let binary_chain: SmallVec<[&ExprBinOp; 4]> = iter::successors(Some(item), |parent| {
parent.left.as_bin_op_expr().and_then(|bin_expression| {
if is_expression_parenthesized(bin_expression.as_any_node_ref(), source) {
None
Expand All @@ -63,7 +49,7 @@ impl<'ast> FormatBinaryLike<'ast> for ExprBinOp {
let left_most = binary_chain.last().unwrap();

// Format the left most expression
group(&left_most.left.format()).fmt(f)?;
in_parentheses_only_group(&left_most.left.format()).fmt(f)?;

// Iterate upwards in the binary expression tree and, for each level, format the operator
// and the right expression.
Expand Down Expand Up @@ -100,33 +86,26 @@ impl<'ast> FormatBinaryLike<'ast> for ExprBinOp {
space().fmt(f)?;
}

group(&right.format()).fmt(f)?;
in_parentheses_only_group(&right.format()).fmt(f)?;

// It's necessary to format the trailing comments because the code bypasses
// `FormatNodeRule::fmt` for the nested binary expressions.
// Don't call the formatting function for the most outer binary expression because
// these comments have already been formatted.
if current != self {
if current != item {
trailing_node_comments(current).fmt(f)?;
}
}

Ok(())
});

group(&format_inner).fmt(f)
in_parentheses_only_group(&format_inner).fmt(f)
}

fn left(&self) -> FormatResult<&Expr> {
Ok(&self.left)
}

fn right(&self) -> FormatResult<&Expr> {
Ok(&self.right)
}

fn operator(&self) -> Self::FormatOperator {
self.op.into_format()
fn fmt_dangling_comments(&self, _node: &ExprBinOp, _f: &mut PyFormatter) -> FormatResult<()> {
// Handled inside of `fmt_fields`
Ok(())
}
}

Expand Down Expand Up @@ -200,18 +179,6 @@ impl NeedsParentheses for ExprBinOp {
source: &str,
comments: &Comments,
) -> Parentheses {
match default_expression_needs_parentheses(self.into(), parenthesize, source, comments) {
Parentheses::Optional => {
if self.binary_layout(source) == BinaryLayout::Default
|| comments.has_leading_comments(self.right.as_ref())
|| comments.has_dangling_comments(self)
{
Parentheses::Optional
} else {
Parentheses::Custom
}
}
parentheses => parentheses,
}
default_expression_needs_parentheses(self.into(), parenthesize, source, comments)
}
}
Loading

0 comments on commit 715250a

Please sign in to comment.