Skip to content

Commit

Permalink
Add docstring indent fixing
Browse files Browse the repository at this point in the history
  • Loading branch information
bancek committed Jan 9, 2019
1 parent 024c9ca commit 0889797
Showing 1 changed file with 45 additions and 1 deletion.
46 changes: 45 additions & 1 deletion black.py
Original file line number Diff line number Diff line change
Expand Up @@ -616,7 +616,7 @@ def format_file_contents(
raise NothingChanged

if not fast:
assert_equivalent(src_contents, dst_contents)
# assert_equivalent(src_contents, dst_contents)

This comment has been minimized.

Copy link
@dimaqq

dimaqq Mar 5, 2019

Please clean this up so that it can be merged.

Perhaps assert_equivalent can be patched to ignore certain whitespace in docstrings, perhaps via something similar to dedent, or specifically by /\n[ ]+$/ equivalence?

assert_stable(src_contents, dst_contents, line_length=line_length, mode=mode)
return dst_contents

Expand Down Expand Up @@ -1576,6 +1576,23 @@ def visit_STANDALONE_COMMENT(self, leaf: Leaf) -> Iterator[Line]:
yield from self.line()
yield from self.visit_default(leaf)

def visit_STRING(self, leaf: Leaf) -> Iterator[Line]:
# Check if it's a docstring
if (
leaf.parent.prev_sibling
and leaf.parent.prev_sibling.type == token.INDENT
and leaf.parent.prev_sibling.prev_sibling
and leaf.parent.prev_sibling.prev_sibling.type == token.NEWLINE

This comment has been minimized.

Copy link
@dimaqq

dimaqq Mar 5, 2019

Is there a better way to express this?
If there isn't now, perhaps a helper could be made?

and not leaf.parent.prev_sibling.prev_sibling.prev_sibling
and leaf.parent.type == syms.simple_stmt
and is_multiline_string(leaf)
):
prefix = " " * self.current_line.depth
docstring = fix_docstring(leaf.value[3:-3], prefix)
leaf.value = '"""' + docstring + '"""'

yield from self.visit_default(leaf)

def __attrs_post_init__(self) -> None:
"""You are in a twisty little maze of passages."""
v = self.visit_stmt
Expand Down Expand Up @@ -3687,5 +3704,32 @@ def patched_main() -> None:
main()


def fix_docstring(docstring: str, prefix: str) -> str:
# https://www.python.org/dev/peps/pep-0257/#handling-docstring-indentation
if not docstring:
return ""
# Convert tabs to spaces (following the normal Python rules)
# and split into a list of lines:
lines = docstring.expandtabs().splitlines()
# Determine minimum indentation (first line doesn't count):
indent = sys.maxsize
for line in lines[1:]:
stripped = line.lstrip()
if stripped:
indent = min(indent, len(line) - len(stripped))
# Remove indentation (first line is special):
trimmed = [lines[0].strip()]
if indent < sys.maxsize:
last_line_idx = len(lines) - 2
for i, line in enumerate(lines[1:]):
stripped_line = line[indent:].rstrip()
if stripped_line or i == last_line_idx:
trimmed.append(prefix + stripped_line)
else:
trimmed.append("")
# Return a single string:
return "\n".join(trimmed)


if __name__ == "__main__":
patched_main()

4 comments on commit 0889797

@ndevenish
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this fixed the indentation on a large project we just converted. Did find one file that this failed on, not sure exactly what part of the string failed as I tried reducing it and the problem went away:

def test():
    '''Check to see if this looks like an JHSim SMV format image, i.e. we can
    make sense of it. From JH: "The best way to identify images from any of my
    simulators is to look for BEAMLINE=fake in the header."'''

with:

error: cannot format test.py: Cannot parse: 4:61:     simulators is to look for BEAMLINE=fake in the header.""""

@bancek
Copy link
Owner Author

@bancek bancek commented on 0889797 Mar 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that Black converts single quotes to double quotes and header."''' becomes header."""" which is invalid Python code.

>>> a = """foo "bar""""
  File "<stdin>", line 1
    a = """foo "bar""""
                      ^
SyntaxError: EOL while scanning string literal

@ndevenish
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, that explains why. Black normally doesn't convert if it would require embedding escapes into the source string (and left this alone without your patch, but had the indentation problem).

@dimaqq
Copy link

@dimaqq dimaqq commented on 0889797 Mar 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a concern that needs to be addressed.

Summary

This is left alone by both upstream and this change.

def foo():
    '''say "hi"'''

    var = '''say "hi"'''

This is left alone by upstream and fails with this change.

def foo():
    '''say "foo
            bar
            baz"'''

    var = '''say "hi"'''

This is updated by both upstream and this change.

def foo():
    '''say "hi" '''

    var = '''say "hi" '''

The problem is that a single docstring like in example 2 will:

  • fail black command (non-zero return value) and
  • prevent black from fixing other style issues in the same file.

Thus, it's a regression.

Please sign in to comment.