Prefer expanding parenthesized expressions before operands #5608

MichaReiser · 2023-07-08T09:23:02Z

Summary

This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:

# We want 
if a + [ 
	b,
	c
]: 
	pass

# Rather than
if (
    a
    + [b, c]
): 
	pass

This is implemented by using the new IR elements introduced in #5596.

We give the group wrapping the optional parentheses an ID (parentheses_id)
We use conditional_group for the lower priority groups (all non-parenthesized expressions) with the condition that the parentheses_id group breaks (we want to split before operands only if the parentheses are necessary)
We use fits_expanded to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the parentheses_id group. We gate the fits_expand to only apply if the parentheses_id group fits (because we prefer a\n+[b, c] over expanding [b, c] if the whole expression gets parenthesized).

We limit using fits_expanded and conditional_group only to expressions that themselves are not in parentheses (checking the conditions isn't free)

Test Plan

It increases the Jaccard index for Django from 0.915 to 0.917

Incompatibilites

There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences).

Long string literals

I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points.

I think we should ignore this incompatibility for now

Expressions on statement level

I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd

But it would if the expression is used inside of a condition.

What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.

MichaReiser · 2023-07-08T09:23:16Z

Current dependencies on/for this PR:

main
- PR Extend formatter IR to support Black's expression formatting #5596
  - PR Prefer expanding parenthesized expressions before operands #5608 👈
    - PR Remove mode from BestFitting #5609
      - PR Pass FormatContext to NeedsParentheses #5643
        
        PR Introduce ExpressionRef #5644
      - PR Something operator related that I don't understand yet #5615

This comment was auto-generated by Graphite.

github-actions · 2023-07-08T10:14:35Z

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00      8.4±0.04ms     4.8 MB/sec    1.00      8.4±0.02ms     4.8 MB/sec
formatter/numpy/ctypeslib.py               1.00  1861.6±10.12µs     8.9 MB/sec    1.01   1876.6±6.75µs     8.9 MB/sec
formatter/numpy/globals.py                 1.00    199.4±0.35µs    14.8 MB/sec    1.02    204.1±1.24µs    14.5 MB/sec
formatter/pydantic/types.py                1.02      4.2±0.07ms     6.1 MB/sec    1.00      4.1±0.00ms     6.2 MB/sec
linter/all-rules/large/dataset.py          1.00     14.1±0.03ms     2.9 MB/sec    1.00     14.1±0.06ms     2.9 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.5±0.01ms     4.7 MB/sec    1.00      3.5±0.00ms     4.7 MB/sec
linter/all-rules/numpy/globals.py          1.00    365.9±1.30µs     8.1 MB/sec    1.00    367.1±1.43µs     8.0 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.3±0.05ms     4.1 MB/sec    1.00      6.3±0.06ms     4.1 MB/sec
linter/default-rules/large/dataset.py      1.00      7.2±0.01ms     5.6 MB/sec    1.00      7.2±0.15ms     5.6 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00   1484.3±4.68µs    11.2 MB/sec    1.00   1489.4±2.32µs    11.2 MB/sec
linter/default-rules/numpy/globals.py      1.00    161.1±0.73µs    18.3 MB/sec    1.00    161.4±1.51µs    18.3 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.2±0.01ms     7.9 MB/sec    1.00      3.2±0.01ms     8.0 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00     11.7±0.44ms     3.5 MB/sec    1.00     11.7±0.37ms     3.5 MB/sec
formatter/numpy/ctypeslib.py               1.00      2.6±0.12ms     6.4 MB/sec    1.05      2.7±0.13ms     6.1 MB/sec
formatter/numpy/globals.py                 1.00   286.1±14.85µs    10.3 MB/sec    1.04   298.0±18.83µs     9.9 MB/sec
formatter/pydantic/types.py                1.00      5.6±0.23ms     4.5 MB/sec    1.04      5.9±0.33ms     4.3 MB/sec
linter/all-rules/large/dataset.py          1.00     19.5±0.61ms     2.1 MB/sec    1.00     19.6±0.65ms     2.1 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.02      5.1±0.20ms     3.3 MB/sec    1.00      5.0±0.21ms     3.3 MB/sec
linter/all-rules/numpy/globals.py          1.01   629.4±21.41µs     4.7 MB/sec    1.00   622.7±80.29µs     4.7 MB/sec
linter/all-rules/pydantic/types.py         1.00      8.7±0.33ms     2.9 MB/sec    1.00      8.7±0.34ms     2.9 MB/sec
linter/default-rules/large/dataset.py      1.02     10.1±0.33ms     4.0 MB/sec    1.00      9.9±0.37ms     4.1 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00      2.1±0.07ms     7.9 MB/sec    1.00      2.1±0.08ms     7.9 MB/sec
linter/default-rules/numpy/globals.py      1.02   258.7±15.88µs    11.4 MB/sec    1.00   254.2±19.29µs    11.6 MB/sec
linter/default-rules/pydantic/types.py     1.00      4.5±0.20ms     5.6 MB/sec    1.01      4.6±0.31ms     5.6 MB/sec

MichaReiser · 2023-07-09T11:16:03Z

...er/tests/snapshots/black_compatibility@simple_cases__trailing_comma_optional_parens3.py.snap

+## Input
+
+```py
+if True:


This is a regression

Ruff tests if all lines fit. This is usless in the given case because strings have no further split point. Meaning, changing the layout to fully expanded doesn't really fix the does not fit problem, because we would need to split the srings, which we cannot. Splitting the last line doesn't really help either.
Should the FitsExpanded measure mode be FitsEnd rather than StartAndEnd? Meaning, we measure if everything up to the first separator fits, then everything coming after the last line break?

i think we do need to measure after the line break here

I'll leave this for now to fix later. This is a general problem when using groups that contain no further split points.

konstin · 2023-07-11T10:27:11Z

crates/ruff_python_formatter/src/expression/mod.rs

+                    let result = group(&format_args![
+                        if_group_breaks(&text("(")),
+                        indent_if_group_breaks(
+                            &format_args![soft_line_break(), format_expr],


why is the line break inside the indent?

Indent is a bit counter-intuitive. Luckily, this is hidden a way by soft_block_indent and block_indent. indent only increments the indentation level, it doesn't insert any line breaks itself. The soft_line_break is necessary to then start a new line, with the new, incremented indent.

This is why the closing newline is outside of the indent. We first need to decrement the indent counter, and only then start the new line.

crates/ruff_python_formatter/src/expression/mod.rs

konstin · 2023-07-11T10:36:18Z

...er/tests/snapshots/black_compatibility@simple_cases__trailing_comma_optional_parens3.py.snap

+## Input
+
+```py
+if True:


i think we do need to measure after the line break here

MichaReiser · 2023-07-11T12:07:42Z

@MichaReiser merged this pull request with Graphite.

MichaReiser mentioned this pull request Jul 8, 2023

Extend formatter IR to support Black's expression formatting #5596

Merged

MichaReiser marked this pull request as draft July 8, 2023 09:23

MichaReiser force-pushed the expand-by-parentheses-first branch from 0cd2dde to 5dda384 Compare July 8, 2023 09:40

MichaReiser mentioned this pull request Jul 8, 2023

Remove mode from BestFitting #5609

Merged

MichaReiser force-pushed the expand-by-parentheses-first branch from 5dda384 to c97caf1 Compare July 8, 2023 15:43

MichaReiser mentioned this pull request Jul 8, 2023

Something operator related that I don't understand yet #5615

Closed

MichaReiser force-pushed the conditional-group branch from 7135143 to 4c48749 Compare July 9, 2023 11:11

MichaReiser force-pushed the expand-by-parentheses-first branch 2 times, most recently from 537c84f to 56bae34 Compare July 9, 2023 11:22

MichaReiser changed the title ~~Prefer parenthesized expressions before operands~~ Prefer expanding parenthesized expressions before operands Jul 9, 2023

MichaReiser mentioned this pull request Jul 10, 2023

format ExprListComp #5600

Merged

MichaReiser force-pushed the conditional-group branch from 4c48749 to 4ffd817 Compare July 10, 2023 08:37

MichaReiser force-pushed the expand-by-parentheses-first branch from 56bae34 to aa6915d Compare July 10, 2023 08:37

MichaReiser requested a review from konstin July 10, 2023 11:36

MichaReiser added the formatter Related to the formatter label Jul 10, 2023

MichaReiser marked this pull request as ready for review July 10, 2023 11:37

MichaReiser commented Jul 10, 2023

View reviewed changes

MichaReiser force-pushed the expand-by-parentheses-first branch from 8803400 to 074d289 Compare July 10, 2023 12:01

This was referenced Jul 10, 2023

Pass FormatContext to NeedsParentheses #5643

Merged

Introduce ExpressionRef #5644

Closed

Format target: annotation = value? expressions #5661

Merged

MichaReiser linked an issue Jul 11, 2023 that may be closed by this pull request

Formatter: Parenthesized expressions #5671

Closed

MichaReiser force-pushed the conditional-group branch from 4ffd817 to 6d11701 Compare July 11, 2023 06:56

MichaReiser force-pushed the expand-by-parentheses-first branch from 074d289 to b8dff47 Compare July 11, 2023 06:56

MichaReiser mentioned this pull request Jul 11, 2023

Improve comprehension line break beheavior #5680

Merged

konstin approved these changes Jul 11, 2023

View reviewed changes

MichaReiser mentioned this pull request Jul 11, 2023

Only omit optinal parens if the expression ends or starts with a parenthesized expression #5683

Merged

MichaReiser force-pushed the conditional-group branch from 6d11701 to ff18d52 Compare July 11, 2023 11:12

Base automatically changed from conditional-group to main July 11, 2023 11:20

MichaReiser added 10 commits July 11, 2023 13:29

Use new IR to format expressions

91df239

Commit regressions

d40b868

Revert quicktest changes

eaa9491

Add partial implementation of can_omit_optional_parentheses

d90f4d4

Fix boolean and compare op count

f344531

Count conditional expressions as two operations

2194dae

Always use group for subscript

1ffaf26

Add some documentation

d08a62a

Clippy, restore lib.rs

6ef1b84

Code review feedback

830edb6

MichaReiser force-pushed the expand-by-parentheses-first branch from b8dff47 to 830edb6 Compare July 11, 2023 11:30

MichaReiser merged commit 715250a into main Jul 11, 2023

MichaReiser deleted the expand-by-parentheses-first branch July 11, 2023 12:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prefer expanding parenthesized expressions before operands #5608

Prefer expanding parenthesized expressions before operands #5608

MichaReiser commented Jul 8, 2023 •

edited

Loading

MichaReiser commented Jul 8, 2023 •

edited

Loading

github-actions bot commented Jul 8, 2023 •

edited

Loading

MichaReiser Jul 9, 2023

konstin Jul 11, 2023

MichaReiser Jul 11, 2023

konstin Jul 11, 2023

MichaReiser Jul 11, 2023

konstin Jul 11, 2023

MichaReiser commented Jul 11, 2023

Prefer expanding parenthesized expressions before operands #5608

Prefer expanding parenthesized expressions before operands #5608

Conversation

MichaReiser commented Jul 8, 2023 • edited Loading

Summary

Test Plan

Incompatibilites

Long string literals

Expressions on statement level

MichaReiser commented Jul 8, 2023 • edited Loading

github-actions bot commented Jul 8, 2023 • edited Loading

PR Check Results

Ecosystem

Benchmark

Linux

Windows

MichaReiser Jul 9, 2023

Choose a reason for hiding this comment

konstin Jul 11, 2023

Choose a reason for hiding this comment

MichaReiser Jul 11, 2023

Choose a reason for hiding this comment

konstin Jul 11, 2023

Choose a reason for hiding this comment

MichaReiser Jul 11, 2023

Choose a reason for hiding this comment

konstin Jul 11, 2023

Choose a reason for hiding this comment

MichaReiser commented Jul 11, 2023

MichaReiser commented Jul 8, 2023 •

edited

Loading

MichaReiser commented Jul 8, 2023 •

edited

Loading

github-actions bot commented Jul 8, 2023 •

edited

Loading