Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

format ExprListComp #5600

Merged
merged 8 commits into from
Jul 11, 2023
Merged

Conversation

davidszotten
Copy link
Contributor

@davidszotten davidszotten commented Jul 7, 2023

includes Comprehension

The jaccard index on django goes from 0.915 to 0.923.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 7, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.05     10.3±0.51ms     3.9 MB/sec    1.00      9.8±0.33ms     4.2 MB/sec
formatter/numpy/ctypeslib.py               1.02      2.2±0.08ms     7.5 MB/sec    1.00      2.2±0.08ms     7.6 MB/sec
formatter/numpy/globals.py                 1.03   258.5±16.54µs    11.4 MB/sec    1.00   252.1±14.26µs    11.7 MB/sec
formatter/pydantic/types.py                1.02      4.9±0.18ms     5.2 MB/sec    1.00      4.8±0.20ms     5.3 MB/sec
linter/all-rules/large/dataset.py          1.03     17.2±0.51ms     2.4 MB/sec    1.00     16.7±0.42ms     2.4 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.04      4.3±0.17ms     3.9 MB/sec    1.00      4.1±0.14ms     4.0 MB/sec
linter/all-rules/numpy/globals.py          1.00   537.6±28.62µs     5.5 MB/sec    1.01   542.4±27.58µs     5.4 MB/sec
linter/all-rules/pydantic/types.py         1.03      7.7±0.30ms     3.3 MB/sec    1.00      7.4±0.26ms     3.4 MB/sec
linter/default-rules/large/dataset.py      1.03      8.6±0.27ms     4.7 MB/sec    1.00      8.3±0.27ms     4.9 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.02  1812.4±62.21µs     9.2 MB/sec    1.00  1780.8±82.95µs     9.4 MB/sec
linter/default-rules/numpy/globals.py      1.00   210.0±10.98µs    14.0 MB/sec    1.02   213.7±13.08µs    13.8 MB/sec
linter/default-rules/pydantic/types.py     1.02      3.8±0.11ms     6.8 MB/sec    1.00      3.7±0.10ms     6.9 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00     11.5±0.27ms     3.5 MB/sec    1.00     11.6±0.35ms     3.5 MB/sec
formatter/numpy/ctypeslib.py               1.01      2.6±0.10ms     6.4 MB/sec    1.00      2.6±0.11ms     6.5 MB/sec
formatter/numpy/globals.py                 1.00   296.6±16.65µs     9.9 MB/sec    1.00   296.2±20.78µs    10.0 MB/sec
formatter/pydantic/types.py                1.02      5.8±0.25ms     4.4 MB/sec    1.00      5.7±0.22ms     4.5 MB/sec
linter/all-rules/large/dataset.py          1.01     19.5±0.46ms     2.1 MB/sec    1.00     19.4±0.38ms     2.1 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      5.1±0.15ms     3.3 MB/sec    1.01      5.2±0.20ms     3.2 MB/sec
linter/all-rules/numpy/globals.py          1.01   631.1±37.31µs     4.7 MB/sec    1.00   624.9±26.40µs     4.7 MB/sec
linter/all-rules/pydantic/types.py         1.00      8.8±0.25ms     2.9 MB/sec    1.01      8.9±0.27ms     2.9 MB/sec
linter/default-rules/large/dataset.py      1.03     10.2±0.32ms     4.0 MB/sec    1.00      9.9±0.26ms     4.1 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00      2.1±0.08ms     7.8 MB/sec    1.00      2.1±0.07ms     7.8 MB/sec
linter/default-rules/numpy/globals.py      1.00    248.5±9.73µs    11.9 MB/sec    1.01   250.5±10.75µs    11.8 MB/sec
linter/default-rules/pydantic/types.py     1.01      4.6±0.18ms     5.6 MB/sec    1.00      4.5±0.15ms     5.6 MB/sec

@konstin konstin added the formatter Related to the formatter label Jul 7, 2023
@davidszotten davidszotten force-pushed the format-expr-list-comp branch from 48afef1 to 8cd490d Compare July 7, 2023 20:51
@davidszotten davidszotten marked this pull request as ready for review July 7, 2023 21:07
@MichaReiser
Copy link
Member

I'm not entirely sure if I understand the Black difference that you're describing. There are two ways (besides what #5608 introduces but this hopefully isn't needed here) to control the priority

  • Group sequences group(one), group(two): The Printer expands group two before group one (one has a higher priority)
  • Group nesting group(one, group(two)): The printer expands *outer* groups before inner groups. In this case, the group oneexpands before grouptwo`.

You may need to use a group sequence here (one group for the value, one group for the body)

@MichaReiser MichaReiser mentioned this pull request Jul 10, 2023
72 tasks
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Implementing list comprehension was on my TODO list for this week, I guess no more :) Would you mind commenting on #4798 next time before starting to work on a new syntax for better coordination?

Comment on lines 1286 to 1290
let Some(in_token) = tokens.next() else {
// Should always have an `in`
debug_assert!(false);
return CommentPlacement::Default(comment);
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We should assert that this is indeed the in token. Is there a possibility that the expression is parenthesized? If so, you may also need to skip over any ) or(.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use .expect("could not find if token in list comprehension"), i think crashing is more reasonable here than trying to format under false assumptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh interesting. i though the brackets would be part of the node. til

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's really inconsistent unfortunately (it's a particularity of our parser), e.g. tuple parentheses but others aren't are so ((1,2,3)) the range contains the inner parentheses but not the outer while in (1 + 2) they are not part of the range at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find_only_token_in_range also has this problem. will fix in this pr (and then use that function)

Comment on lines 1333 to 1337
let Some(if_token_start) = find_if_token(locator, last_end, if_node.range().end()) else {
// there should always be an `if` here
debug_assert!(false);
return CommentPlacement::Default(comment);
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We should assert that this is indeed the expected if token and not any other token. You may also need to deal with any ( or ) in case the expression can be parenthesized.

last_end = if_node.range().end();
}

return CommentPlacement::Default(comment);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return CommentPlacement::Default(comment);
CommentPlacement::Default(comment)

[parenthesized(
"[",
&group(&self::format_args!(
&elt.format(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
&elt.format(),
elt.format(),

"[NOT_YET_IMPLEMENTED_generator_key for NOT_YET_IMPLEMENTED_generator_key in []]"
[parenthesized(
"[",
&group(&self::format_args!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: parenthesized already adds a group. Is this additional group required or should it maybe only wrap the value rather than the value and joined? This way the printer would first try to expand joined, and only fallback to expanding value if the line still exceeds the configured line width.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll check, but it might also be a mistake, parenthesized landed after i'd written this code and i added it in when i spotted it landing

Copy link
Contributor Author

@davidszotten davidszotten Jul 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but i should also admit i don't fully grok the rules around expanding and groups yet, and sometimes end up trying a bunch of variations on examples. your explanation in a previous comment above helps a bit though (group sequences and nesting)

crates/ruff_python_formatter/src/other/comprehension.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/other/comprehension.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/other/comprehension.rs Outdated Show resolved Hide resolved
@MichaReiser MichaReiser requested a review from konstin July 10, 2023 07:27
Comment on lines 1286 to 1290
let Some(in_token) = tokens.next() else {
// Should always have an `in`
debug_assert!(false);
return CommentPlacement::Default(comment);
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use .expect("could not find if token in list comprehension"), i think crashing is more reasonable here than trying to format under false assumptions

Comment on lines +38 to +53
[
i
for
i
in
[
1,
]
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

black formats this as

[
    i
    for i in [
        1,
    ]
]

which i prefer, but i expect that to be unreasonably hard to implement (breaking the inner group but not the outer group)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting. I think this goes into the one case that my expression formatting refactor doesn't cover yet (and I also don't fully understand yet).

Black sometimes (for example on expression statement level) only expands by parenthesized expressions, but not by operators. But it does sometimes and inserts optional parentheses. I haven't understood the rules around this yet. It could be related and we may want to wait with addressing it just yet

@davidszotten davidszotten force-pushed the format-expr-list-comp branch 2 times, most recently from 476a978 to 8d11442 Compare July 10, 2023 19:06
@davidszotten
Copy link
Contributor Author

ready i think

@MichaReiser MichaReiser force-pushed the format-expr-list-comp branch from 9e7046b to 2511815 Compare July 11, 2023 06:24
@MichaReiser MichaReiser force-pushed the format-expr-list-comp branch from 2511815 to 7c2d38c Compare July 11, 2023 06:28
@MichaReiser MichaReiser enabled auto-merge (squash) July 11, 2023 06:28
@MichaReiser MichaReiser merged commit 1782fb8 into astral-sh:main Jul 11, 2023
@davidszotten
Copy link
Contributor Author

👍

I’ll have a look at the other comprehensions next

@MichaReiser
Copy link
Member

+1

I’ll have a look at the other comprehensions next

Nice. Do you have the permissions to create an issue from #4798 and assign it to you? If not, then I'll create it and assign it to you for you, if you let me know on which comprehension you want to work first.

@davidszotten
Copy link
Contributor Author

i don't think I have any repo permissions. i guess i can create issues (not sure what you by by creating "from 4798", do you mean and link from there?) but not assign

from a quick look, set and dict are pretty much like list comps so seem quite straightforward. generator comprehensions are trickier because the brackets are optional in some contexts but mandatory in others

@MichaReiser
Copy link
Member

i don't think I have any repo permissions. i guess i can create issues (not sure what you by by creating "from 4798", do you mean and link from there?) but not assign

from a quick look, set and dict are pretty much like list comps so seem quite straightforward. generator comprehensions are trickier because the brackets are optional in some contexts but mandatory in others

There's a button at the end that allows you to automatically create an issue:

image

I created #5674 now. I can assign it to you if you leave a comment.

@davidszotten
Copy link
Contributor Author

There's a button at the end that allows you to automatically create an issue:

the circle with the dot in? i don't have that

@davidszotten
Copy link
Contributor Author

while there is still other work to do, maybe we can mark set and list comp as "easy (see list comp)" to encourage newcomers?

you can assign met to generator exp though i'll probably need a pointer on how to approach this bracket issue

@MichaReiser
Copy link
Member

the circle with the dot in? i don't have that

Yes. Hmm, that's unfortunate. I created the issue #5676

you can assign met to generator exp though i'll probably need a pointer on how to approach this bracket issue

I'm currently looking into the expression handling. I, incorrectly, assumed that Black only uses the expand the right hand side before operands on the top level. But there seem to be other places where it applies the same logic (generators, with items, ... potentially more).

You can ignore this for now.

@davidszotten
Copy link
Contributor Author

brackets:

ok. note that for generator expressions it's not just a question of style, as the brackets are _required in some contexts. see e.g. https://peps.python.org/pep-0289/

This means that you can write:

sum(x**2 for x in range(10))
but you would have to write:

reduce(operator.add, (x**2 for x in range(10)))
and also:

g = (x**2 for x in range(10))

MichaReiser pushed a commit that referenced this pull request Jul 15, 2023
## Summary

Format `DictComp` like `ListComp` from #5600. It's not 100%, but I
figured maybe it's worth starting to explore.

## Test Plan

Added ruff fixture based on `ListComp`'s.
evanrittenhouse pushed a commit to evanrittenhouse/ruff that referenced this pull request Jul 19, 2023
## Summary

Format `DictComp` like `ListComp` from astral-sh#5600. It's not 100%, but I
figured maybe it's worth starting to explore.

## Test Plan

Added ruff fixture based on `ListComp`'s.
konstin pushed a commit that referenced this pull request Jul 19, 2023
## Summary

Format `DictComp` like `ListComp` from #5600. It's not 100%, but I
figured maybe it's worth starting to explore.

## Test Plan

Added ruff fixture based on `ListComp`'s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants