Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LineSuffix reserved width #6830

Merged
merged 3 commits into from
Aug 28, 2023
Merged

Conversation

cnpryer
Copy link
Contributor

@cnpryer cnpryer commented Aug 23, 2023

Closes #5630

Summary

Currently a trailing, end-of-line comment considered a line suffix element is not measured during formatting. This change introduces a reserved width field to the ruff_formatter crate for line suffix elements in order to allow formatter clients to opt-into this behavior.

Test Plan

Currently just doctest


Outdated; See (comment)
TODO:

TODO(optional):

  • Use TextWidth instead of internal LineSuffix (comment)
  • Add benchmark report (this shouldn't impact perf, but see #6819 for example)

@github-actions
Copy link
Contributor

github-actions bot commented Aug 23, 2023

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.10      4.5±0.01ms     9.1 MB/sec    1.00      4.1±0.01ms    10.0 MB/sec
formatter/numpy/ctypeslib.py               1.04    886.8±4.31µs    18.8 MB/sec    1.00    850.0±2.79µs    19.6 MB/sec
formatter/numpy/globals.py                 1.01     83.7±1.56µs    35.3 MB/sec    1.00     82.5±0.82µs    35.8 MB/sec
formatter/pydantic/types.py                1.00  1673.8±11.25µs    15.2 MB/sec    1.00   1674.9±4.73µs    15.2 MB/sec
linter/all-rules/large/dataset.py          1.02     10.3±0.09ms     3.9 MB/sec    1.00     10.1±0.03ms     4.0 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      2.7±0.01ms     6.1 MB/sec    1.00      2.7±0.01ms     6.2 MB/sec
linter/all-rules/numpy/globals.py          1.00    381.0±0.63µs     7.7 MB/sec    1.00    379.5±1.22µs     7.8 MB/sec
linter/all-rules/pydantic/types.py         1.01      5.3±0.04ms     4.8 MB/sec    1.00      5.2±0.03ms     4.9 MB/sec
linter/default-rules/large/dataset.py      1.01      5.5±0.03ms     7.4 MB/sec    1.00      5.4±0.02ms     7.5 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.01   1205.0±8.86µs    13.8 MB/sec    1.00   1187.9±5.21µs    14.0 MB/sec
linter/default-rules/numpy/globals.py      1.00    138.8±0.27µs    21.3 MB/sec    1.01    140.0±2.91µs    21.1 MB/sec
linter/default-rules/pydantic/types.py     1.01      2.5±0.02ms    10.3 MB/sec    1.00      2.4±0.02ms    10.5 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.08      5.1±0.04ms     8.0 MB/sec    1.00      4.7±0.05ms     8.6 MB/sec
formatter/numpy/ctypeslib.py               1.03   986.4±11.57µs    16.9 MB/sec    1.00   960.0±11.73µs    17.3 MB/sec
formatter/numpy/globals.py                 1.00     91.1±1.34µs    32.4 MB/sec    1.03     93.5±5.23µs    31.5 MB/sec
formatter/pydantic/types.py                1.00  1892.6±19.35µs    13.5 MB/sec    1.02  1939.0±29.01µs    13.2 MB/sec
linter/all-rules/large/dataset.py          1.00     12.2±0.10ms     3.3 MB/sec    1.01     12.3±0.10ms     3.3 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.4±0.03ms     4.9 MB/sec    1.01      3.4±0.03ms     4.9 MB/sec
linter/all-rules/numpy/globals.py          1.00    419.0±5.31µs     7.0 MB/sec    1.02    427.1±5.93µs     6.9 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.4±0.05ms     4.0 MB/sec    1.01      6.4±0.06ms     4.0 MB/sec
linter/default-rules/large/dataset.py      1.00      6.8±0.04ms     6.0 MB/sec    1.00      6.8±0.04ms     6.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1433.8±11.88µs    11.6 MB/sec    1.01  1455.1±20.07µs    11.4 MB/sec
linter/default-rules/numpy/globals.py      1.00    166.3±2.32µs    17.7 MB/sec    1.04    172.4±2.43µs    17.1 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.0±0.03ms     8.4 MB/sec    1.00      3.0±0.04ms     8.4 MB/sec

@cnpryer cnpryer changed the title Add reserved width to LineSuffix Add and use reserved width to LineSuffix Aug 23, 2023
@cnpryer cnpryer changed the title Add and use reserved width to LineSuffix Add and use LineSuffix reserved width Aug 23, 2023
@cnpryer cnpryer changed the title Add and use LineSuffix reserved width Use LineSuffix reserved width Aug 24, 2023
@MichaReiser
Copy link
Member

This is exciting. Thanks for taking the time to work on this!

@MichaReiser MichaReiser added the formatter Related to the formatter label Aug 24, 2023
@cnpryer
Copy link
Contributor Author

cnpryer commented Aug 24, 2023

I started updating the PR description to reflect feedback (see todo list).

As discussed on Discord we can also split this PR up if needed in order to unblock dependent work.

@cnpryer cnpryer force-pushed the line-suffix branch 6 times, most recently from eddc725 to 686f3da Compare August 26, 2023 00:00
@cnpryer
Copy link
Contributor Author

cnpryer commented Aug 26, 2023

Was looking at the ecosystem checks. This is unstable atm.

Pulled from the transformers repo:

class BaseMixedInt8Test(unittest.TestCase):
    # ...
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.540025  # This was obtained on a Quadro RTX 8000 so the number might slightly change
    )
2023-08-26T00:22:39.664102Z ERROR Unstable formatting /Users/chrispryer/github/ruff/scratch.py
@@ -92,7 +92,9 @@
-    EXPECTED_RELATIVE_DIFFERENCE = 1.540025  # This was obtained on a Quadro RTX 8000 so the number might slightly change
+    EXPECTED_RELATIVE_DIFFERENCE = (
+        1.540025
+    )  # This was obtained on a Quadro RTX 8000 so the number might slightly change

From django the same instability occurs with:

class Field:
    hidden_widget = (
        HiddenInput  # Default widget to use when rendering this as "hidden".
    )

Would this imply that black considers optional parentheses by using the entire line-width, including the line-suffix?

FWIW I wanted to just see if this would fix it and it does.

diff --git a/crates/ruff_python_formatter/src/expression/mod.rs b/crates/ruff_python_formatter/src/expression/mod.rs
index 52255ec4f..89121e38d 100644
--- a/crates/ruff_python_formatter/src/expression/mod.rs
+++ b/crates/ruff_python_formatter/src/expression/mod.rs
@@ -183,11 +183,12 @@ impl Format<PyFormatContext<'_>> for MaybeParenthesizeExpression<'_> {
         } = self;
 
         let comments = f.context().comments();
-        let preserve_parentheses = parenthesize.is_optional()
-            && is_expression_parenthesized((*expression).into(), f.context().source());
+        let has_parentheses = is_expression_parenthesized((*expression).into(), f.context().source());
+        let preserve_parentheses = parenthesize.is_optional() && has_parentheses;
 
-        let has_comments =
-            comments.has_leading(*expression) || comments.has_trailing_own_line(*expression);
+        let has_comments = comments.has_leading(*expression)
+            || comments.has_trailing_own_line(*expression)
+            || (has_parentheses && comments.has_trailing(*expression));
2023-08-26T01:29:46.973674Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/typeshed: 0 stability errors, 3496 files, similarity index 0.99947), took 1.44s, 0 input files contained syntax errors 
2023-08-26T01:29:48.560901Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/zulip: 0 stability errors, 1437 files, similarity index 0.99938), took 1.59s, 0 input files contained syntax errors 
2023-08-26T01:29:53.443121Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/cpython: 0 stability errors, 1789 files, similarity index 0.76074), took 4.88s, 2 input files contained syntax errors 
2023-08-26T01:29:59.402318Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/transformers: 0 stability errors, 2587 files, similarity index 0.99913), took 5.96s, 13 input files contained syntax errors 
2023-08-26T01:30:00.243994Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/warehouse: 0 stability errors, 648 files, similarity index 0.99657), took 0.84s, 0 input files contained syntax errors 
2023-08-26T01:30:02.818237Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/django: 0 stability errors, 2760 files, similarity index 0.99926), took 2.57s, 1 input files contained syntax errors 
2023-08-26T01:30:02.862987Z  INFO Finished /Users/chrispryer/github/ruff/target/progress_projects/twine: 0 stability errors, 33 files, similarity index 0.99911), took 0.04s, 0 input files contained syntax errors 
2023-08-26T01:30:02.863148Z  INFO Finished: 0 stability errors, 12750 files, tool 17.326796s, 16 input files contained syntax errors 

real	0m34.871s
user	2m9.043s
sys	0m2.703s
+ cat /Users/chrispryer/github/ruff/target/progress_projects_stats.txt
| project      | similarity index |
|--------------|------------------|
| cpython      | 0.76074          |
| django       | 0.99926          |
| transformers | 0.99913          |
| twine        | 0.99911          |
| typeshed     | 0.99947          |
| warehouse    | 0.99657          |
| zulip        | 0.99938          |

Maybe that's #6771?

@MichaReiser
Copy link
Member

Would this imply that black considers optional parentheses by using the entire line-width, including the line-suffix?

It seems to me that Black removes the parentheses always. This is a file that I used to play around with the formatting.

class BaseMixedInt8Test(unittest.TestCase):
    # ...
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.540025  # This was obtained on a Quadro RTX 8000 so the number might slightly change
    )

    EXPECTED_RELATIVE_DIFFERENCE = 1.540025  # This was obtained on a Quadro RTX 8000 so the number might slightly change

    
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.540025  
    ) # This was obtained on a Quadro RTX 8000 so the number might slightly change

    # ...
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.5  # This was obtained on a Quadro RTX 8000 so the number might slightly change
    )

    EXPECTED_RELATIVE_DIFFERENCE = 1.5  # This was obtained on a Quadro RTX 8000 so the number might slightly change

    
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.5  
    ) # This was obtained on a Quadro RTX 8000 so the number might slightly change
    

    # ...
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.540025  # This was 
    )

    EXPECTED_RELATIVE_DIFFERENCE = 1.540025  # This was

    
    EXPECTED_RELATIVE_DIFFERENCE = (
        1.540025  
    ) # This was
    

    (
        EXPECTED_RELATIVE_DIFFERENCE # This was obtained on a Quadro RTX 8000 so the number might slightly change
    )= (
        1.540025  
    ) # This was

I haven't come to a conclusion yet what the best behavior is but we have a few options:

  • Keep parentheses if the expression has any trailing comment. This would ensure that comments never move and has low complexity

  • Only add parentheses here

    if format_expression.inspect(f)?.will_break() {
    // The group here is necessary because `format_expression` may contain IR elements
    // that refer to the group id
    group(&format_expression)
    .with_group_id(Some(group_id))
    .should_expand(true)
    .fmt(f)
    } else {

    by changing it to

          group(&format_args![
            text("("),
            soft_block_indent(&format_expression),
            text(")")
        ])
        .with_group_id(Some(group_id))
        .fmt(f)

    This gives us black's formatting in most cases.

  • Use your implementation

We'll need to play around with this a little more.

@codspeed-hq
Copy link

codspeed-hq bot commented Aug 26, 2023

CodSpeed Performance Report

Merging #6830 will not alter performance

Comparing cnpryer:line-suffix (f32624c) with main (f33277a)

Summary

✅ 16 untouched benchmarks

@cnpryer
Copy link
Contributor Author

cnpryer commented Aug 26, 2023

Ah I see. I think I was too focused on dealing with the instability. You're right. black will remove the parentheses. Now I want to understand the expected behavior a bit better (black's behavior).

So is it specific nodes that are given the "break if line suffix exceeds line-lengths" treatment? On first glance it looks like it occurs where some kind of parentheses are present. When I say parentheses I mean {}, [], or (). I'm kind of curious how black implements this. I've been kicking the can on reading black's source, so maybe it's time.

Marking as ready to invite some discussion around the solution. I'll continue tinkering.

@cnpryer cnpryer marked this pull request as ready for review August 26, 2023 16:10
@cnpryer
Copy link
Contributor Author

cnpryer commented Aug 26, 2023

I've decided to split this into two PRs. One (this one) will be for adding the new reserved_width field to LineSuffix. The other (pending) will be for layering in the ruff_python_formatter changes.

I wanted to do this because the LineSuffix change doesn't need to wait for the remaining decisions here. This would also unblock anyone else from working off the change to close out some of the Alpha work remaining. The TextWidth implementation would be a great followup PR that I might throw up in parallel to reading black source. And all of this can occur in parallel to the child PR I've split half of this work into.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. My only ask is to use the struct syntax for StartLineSuffix and document the field.

crates/ruff_formatter/src/format_element/tag.rs Outdated Show resolved Hide resolved
@cnpryer cnpryer changed the title Use LineSuffix reserved width Add LineSuffix reserved width Aug 26, 2023
@MichaReiser MichaReiser merged commit 039694a into astral-sh:main Aug 28, 2023
@cnpryer cnpryer deleted the line-suffix branch August 28, 2023 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Measuring LineSuffix when formatting to include trailing comments
3 participants