Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge implicitly concatenated string literals that fit on one line #26

Closed
max-sixty opened this issue Mar 15, 2018 · 17 comments · Fixed by #1132
Closed

Merge implicitly concatenated string literals that fit on one line #26

max-sixty opened this issue Mar 15, 2018 · 17 comments · Fixed by #1132
Labels
help wanted Extra attention is needed T: enhancement New feature or request

Comments

@max-sixty
Copy link

max-sixty commented Mar 15, 2018

Black could make single-line strings over multiple lines (i.e. a number of single quotes strings on multiple lines surrounded by parentheses) more efficient, by resizing them to the full length of the line.

Even if that was overreach, there's a peculiar situation where you end up with multiple strings on the same line, like below:

-        warnings.warn('Dataset.sel_points is deprecated: use Dataset.sel()'
-                      'instead.', DeprecationWarning, stacklevel=2)
-
+        warnings.warn(
+            'Dataset.sel_points is deprecated: use Dataset.sel()' 'instead.',
+            DeprecationWarning,
+            stacklevel=2,
+        )
@ambv
Copy link
Collaborator

ambv commented Mar 15, 2018

In this case Black is suggesting that you should merge the two strings into one and the result is more readable that way.

I don't do this automatically (yet?) because it gets complicated if the two strings don't share the same prefix (for example r'something' f'another thing'). This is where user action after the formatting is probably best.

@max-sixty
Copy link
Author

Right, that makes sense. I think whether the strings on the same line are merge-able is clear (i.e. do they have the same prefix), but yes it's a rare case; feel free to close

And changing beyond that may require discretion (e.g. turning 6 lines of 2/3-long lines into 4 full length lines)

@ambv
Copy link
Collaborator

ambv commented Mar 15, 2018

There's another related problem: if I merged string literals, I am now making semantic changes to the AST. I'm not opposed to those but this will make safety checks after reformatting trickier.

Let's leave this open for the time being, it's an interesting problem.

@ambv ambv changed the title Handling long strings Merge implicitly concatenated string literals that fit on one line Mar 15, 2018
@ambv ambv added the T: enhancement New feature or request label Mar 16, 2018
@zsol
Copy link
Collaborator

zsol commented Apr 3, 2018

I'm not even sure what I would expect black to do with code that implicit-concatenates two differently prefixed strings to be honest. I think the path of least surprise is just leaving them alone.

@ambv
Copy link
Collaborator

ambv commented Apr 3, 2018

Yeah, it they are different prefixes, leave them alone. If they share a prefix and they end up on the same line, they should be merged.

If you really want to be correct here the implementation is going to be hard in the following edge case:

  • two strings like "STR1" "STR2" don't fit on one line because the closing quote of STR1, the space, and the opening quote of STR2 are the 3 characters that cause the entire thing to not fit in a single line. So you will keep them on two lines.
  • but if you knew that it's safe to concatenate them, it would fit in a single line (without those 3 extra characters).

I'm inclined not to touch this edge case since that makes it tricky where to perform the merge.

Another small edge case which I'm inclined to avoid is this:

a = (
    "a"
    "bb"
    "ccc"
    "dddd"
    "eeeee"
    "ffffff"
    "ggggggg"
    "hhhhhhhh"
    "iiiiiiiii"
    "jjjjjjjjjj"
    "kkkkkkkkkkk"
    "llllllllllll"
    "mmmmmmmmmmmmmmmm"
)

Technically Black could implement the "fill" algorithm for this case that Prettier also has for JSX. But I think what it currently does is fine for simplicity and obvious for users to recognize.

@aldanor
Copy link

aldanor commented Jul 15, 2018

Another related case I've managed to hit with black is when it joins \-split string into one, and by doing so it violates line length limit. In this case (at least for now, while it's not implemented still), it'd be probably better if it did nothing? E.g.:

$ black -S -l30 --diff long_str.py

--- long_str.py	2018-07-15 12:24:14 +0000
+++ long_str.py	2018-07-15 12:24:41.221434 +0000
@@ -1,5 +1,2 @@
-s = '111111111111111111111' \
-    '222222222222222222222' \
-    '333333333333333333333' \
-    '444444444444444444444'
+s = '111111111111111111111' '222222222222222222222' '333333333333333333333' '444444444444444444444'

@e3krisztian
Copy link

@aldanor line length violation should be a bug (I have also seen it with black==18.6b4). I think it deserves a separate issue.

@graingert
Copy link
Contributor

graingert commented Aug 20, 2018

@ambv Python 3.7 has AST level constant folding: https://bugs.python.org/issue29469 so implicit string concatenation - or lack thereof - would be invisible to the AST check

@ambv
Copy link
Collaborator

ambv commented Aug 20, 2018

I know, @graingert, but we can't require Python 3.7+ for all Black users. Not yet at least. What I'm pondering is if we should rather switch to do the AST post-check using typed-ast which would make it work exactly the same on both Python 2 and Python 3. And the same on all Python 3 versions.

@hugovk
Copy link
Contributor

hugovk commented Aug 20, 2018

By the way, here's the pip installs for Black from PyPI for July 2018:

python_version percent download_count
3.6 86.07% 15,408
3.7 13.21% 2,364
3.5 0.41% 73
2.7 0.22% 39
3.4 0.08% 14
3.8 0.01% 2
3.3 0.01% 1
2.6 0.01% 1
Total 17,902

Source: pypinfo --start-date 2018-07-01 --end-date 2018-07-31 --percent --markdown black pyversion

@ambv
Copy link
Collaborator

ambv commented Aug 20, 2018

Haha, one of those 3.8 downloads is me :-)

@digitalresistor
Copy link

Having Black complete the concatenation would be great.

@ofek
Copy link
Contributor

ofek commented Dec 2, 2018

In the time being can we add a warning when this happens so we can manually resolve it?

@davidism
Copy link

I'd definitely like to see a warning for this, maybe something like "Implicit string concatenation in line N not merged." An issue I've run into is that someone writing multiple similar strings, in tests for example, will add continuations for all the strings so they look the same, even though some would fit on a single line. Then Black moves them back to a single line but leaves the continuation sitting in the middle. The user was trying to satisfy the formatting rules, but ended up producing less ideal formatting without knowing it.

@luxcem
Copy link

luxcem commented Mar 28, 2019

What's your view on this example? Black left it unchanged.

def foo():
    return "Some long string cut in half," " this is really a long string"

def bar(text):
    return text

bar(("Some long string cut in half," " this is really a long string"))

sirosen added a commit to sirosen/globus-cli that referenced this issue Jul 19, 2019
Several string literals got moved to one line by black, but didn't get
merged -- the result is odd and undesirable.
Although people want Black to make the jump to editing AST for cases
exactly like this ( psf/black#26 ), it's
doubtful that will happen anytime soon.

I would not be surprised, based on that thread, if black just waits for
py3.6 to EOL and to support 3.7+ with this.
sirosen added a commit to sirosen/globus-cli that referenced this issue Jul 19, 2019
Several string literals got moved to one line by black, but didn't get
merged -- the result is odd and undesirable.
Although people want Black to make the jump to editing AST for cases
exactly like this ( psf/black#26 ), it's
doubtful that will happen anytime soon.

I would not be surprised, based on that thread, if black just waits for
py3.6 to EOL and to support 3.7+ with this.
@peterjc
Copy link

peterjc commented Aug 9, 2019

Is there an open issue for the doing the opposite? I've found when black has left long lines in my code, it is usually overly long strings (mostly error messages, and when defining command line arguments).

Black could break long strings over multiple lines with implicit continuation (e.g. at spaces, or hyphens). I appreciate this would mean black having to set a convention for if the break point space should be trailing at the end of a truncated line, or leading at the start of a continued line.

Found it: See #182

@keisheiled
Copy link

I wrote a flake8 plugin to forbid these aberrant constructs: https://pypi.org/project/flake8_implicit_str_concat/

peterjc added a commit to peterjc/biopython that referenced this issue Nov 29, 2019
Some of these are recent regressions introduced by black
psf/black#26

Others are long standing pre-existing style issues.
peterjc added a commit to biopython/biopython that referenced this issue Nov 29, 2019
Some of these are recent regressions introduced by black
psf/black#26

Others are long standing pre-existing style issues.
@ambv ambv added the stable label Mar 3, 2020
@ambv ambv closed this as completed in #1132 May 8, 2020
ambv pushed a commit that referenced this issue May 8, 2020
This pull request's main intention is to wraps long strings (as requested by #182); however, it also provides better string handling in general and, in doing so, closes the following issues:

Closes #26
Closes #182
Closes #933
Closes #1183
Closes #1243
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed T: enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.