String formatting linting #443

asottile · 2019-04-04T05:13:39Z

There are three classes of lint messages added here -- I'm happy to split out / reorder / etc. these commits as you see fit. Also, please suggest better names / messages -- these were my first-pass attempt at the messages 😆

Resolves #148

Supersedes #370 (CC @fperrin)

f-strings

f-strings without placeholders
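A minimal sketch of how such a check could be expressed with the standard `ast` module. The class name `PlaceholderlessFStrings` and its `lines` attribute are hypothetical, chosen for illustration; this is not the code in this PR.

```python
import ast


class PlaceholderlessFStrings(ast.NodeVisitor):
    """Collect line numbers of f-strings that interpolate nothing."""

    def __init__(self):
        self.lines = []

    def visit_JoinedStr(self, node):
        fields = [v for v in node.values if isinstance(v, ast.FormattedValue)]
        if not fields:
            self.lines.append(node.lineno)
        for field in fields:
            # Visit only the interpolated expression. The format spec is
            # itself a JoinedStr and would otherwise be flagged, e.g. the
            # '>10' in f'{a:>10}'.
            self.visit(field.value)


source = "a = f'hello'\nb = f'{a:>10}'\nc = f'{a}'\n"
checker = PlaceholderlessFStrings()
checker.visit(ast.parse(source))
# Only the first line holds an f-string without a placeholder.
```

Skipping the format spec is the one design choice worth noting: a plain `ast.walk` would report every format spec that has no nested placeholder.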

`'...'.format(...)`

'{}'.format(1, 2) -- extra positional arguments
'{foo}'.format(foo=1, bar=2) -- extra named arguments
'{} {}'.format(1) -- placeholder missing argument (positional)
'{foo} {bar}'.format(foo=1) -- placeholder missing argument (named)
'{} {1}'.format(1, 2) -- switching between manual / automatic numbering
'{0} {}'.format(1, 2) -- switching between manual / automatic numbering
'{'.format(1) -- invalid format string
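For context, a small sketch of what each of these cases does at runtime. The extra-argument cases format silently, which is why they are easy to miss without a linter. The helper name `format_outcome` is hypothetical.

```python
def format_outcome(template, *args, **kwargs):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template.format(*args, **kwargs)
    except (IndexError, KeyError, ValueError) as exc:
        return type(exc).__name__


format_results = {
    'extra positional': format_outcome('{}', 1, 2),
    'extra named': format_outcome('{foo}', foo=1, bar=2),
    'missing positional': format_outcome('{} {}', 1),
    'missing named': format_outcome('{foo} {bar}', foo=1),
    'auto then manual': format_outcome('{} {1}', 1, 2),
    'manual then auto': format_outcome('{0} {}', 1, 2),
    'invalid': format_outcome('{', 1),
}
```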

`'...' % ...`

'%(foo)' % () -- invalid format string (missing the conversion type)
'%j' % () -- invalid format specifier (j is not a valid conversion type)
'%s %s' % (1, 2, 3) -- positional count mismatch (too many args)
'%s %s' % (1,) -- positional count mismatch (too few args)
'%(bar)s' % {'bar': 1, 'baz': 2} -- extra named argument supplied
'%(bar)s %(baz)s' % {'bar': 1} -- missing named argument
'%(bar)s' % (1, 2, 3) -- expected mapping, but got sequence
'%s %s' % {'baz': 1} -- expected sequence, got mapping
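The same kind of sketch for a subset of the `%` cases, again with a hypothetical helper name (`percent_outcome`). Note that an extra key in a mapping is silently ignored, while an extra positional argument raises.

```python
def percent_outcome(template, args):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template % args
    except (TypeError, KeyError, ValueError) as exc:
        return type(exc).__name__


percent_results = {
    'too many args': percent_outcome('%s %s', (1, 2, 3)),
    'too few args': percent_outcome('%s %s', (1,)),
    'extra named': percent_outcome('%(bar)s', {'bar': 1, 'baz': 2}),
    'missing named': percent_outcome('%(bar)s %(baz)s', {'bar': 1}),
    'mapping expected': percent_outcome('%(bar)s', (1, 2, 3)),
    'sequence expected': percent_outcome('%s %s', {'baz': 1}),
}
```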

Code for parsing .format() uses string.Formatter.parse
Code for parsing % format is lifted from pyupgrade.
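As a rough illustration of the first point, `string.Formatter().parse` yields `(literal_text, field_name, format_spec, conversion)` tuples without evaluating anything. The helper `field_names` below is a hypothetical name, and the comparison against supplied keyword names is a simplified assumption of how such a check might work, not the PR's implementation.

```python
import string


def field_names(template):
    """Return the top-level field names used by a str.format template."""
    return [
        field_name
        for _, field_name, _, _ in string.Formatter().parse(template)
        if field_name is not None  # None marks trailing literal text
    ]


names = field_names('{foo} {bar!r:>10} {}')

# Compare named fields against hypothetical supplied keyword names.
supplied = {'foo', 'baz'}
used = {name for name in names if name and not name.isdigit()}
unused = sorted(supplied - used)
missing = sorted(used - supplied)

# An invalid template surfaces as ValueError while parsing.
try:
    field_names('{')
    parse_failed = False
except ValueError:
    parse_failed = True
```

Field names such as `foo.attr` or `foo[0]` would need their first component split off before comparison; that step is omitted here.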

asmeurer · 2019-04-04T07:24:52Z

Cool that it doesn't actually evaluate the string so it isn't susceptible to #370 (comment). Maybe that should be added as a test.

asottile · 2019-04-04T15:13:04Z

Some examples of things this found in my code:

And other code I have access to at a glance:

asottile · 2019-04-04T15:14:55Z

Oops this is still WIP -- found two more cases to handle:

# currently flagged as "too many placeholders"
'%.*f %s' % (2, b / float(factor), suffix)

# currently flagged as "unused placeholder at position 2"
'{:{}} {}'.format(1, 15, 2)
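Both expressions are valid at runtime, which is why they should not be flagged. A short sketch with illustrative values (the numbers and the `'kB'` suffix are made up):

```python
import string

# '*' takes the precision from the argument tuple, so two conversion
# specifiers consume three arguments.
star = '%.*f %s' % (2, 3.14159, 'kB')

# A nested '{}' inside the format spec also consumes an argument:
# here 15 becomes the field width for the value 1.
nested = '{:{}} {}'.format(1, 15, 2)

# string.Formatter.parse reports the nested field only as part of the
# format spec, so the spec has to be parsed again to count it.
parsed = list(string.Formatter().parse('{:{}} {}'))
```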

asmeurer · 2019-04-04T16:50:28Z

Is it possible to have real column numbers here? For example

"abc %t" % 1

reports "unsupported format character 't'" line 1, column 1. It would be nice to have column 7 (where the t is).

asmeurer · 2019-04-04T16:52:22Z

Does this have any support for finding multiple errors in the same string? It seems like it could be another advantage of this approach. I tried "%s%t" % 1 but it only gave one error.

asottile · 2019-04-04T17:11:48Z

> Is it possible to have real column numbers here? For example
> "abc %t" % 1
> reports "unsupported format character 't'" line 1, column 1. It would be nice to have column 7 (where the t is).

it's difficult, but doable -- I'd like to leave that to a v2 as a later improvement because it is not easy and purely cosmetic. pyflakes currently only has support for node-based reporting and it would be a large refactor to point at positions within a node:

pyflakes/pyflakes/messages.py, lines 12 to 13 in 232cb1d:

    self.lineno = loc.lineno
    self.col = getattr(loc, 'col_offset', 0)
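To make the difference concrete, here is a sketch (assuming Python 3.8+, where string literals parse to `ast.Constant`) of the position a node provides versus a naive per-character column. The naive arithmetic only holds for a single-line literal with a one-character quote, no prefix, and no escape sequences, which is part of why the general case is hard.

```python
import ast

source = '"abc %t" % 1'
binop = ast.parse(source).body[0].value  # the `'...' % ...` expression

# Node-based reporting: the only location available is the node's own.
node_position = (binop.lineno, binop.col_offset + 1)  # 1-based column

# Naive refinement (assumption: plain single-line literal, see above).
literal = binop.left
index_in_value = literal.value.index('%t') + 1  # index of the 't'
char_column = literal.col_offset + 1 + index_in_value + 1  # quote, then 1-based
```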

> Does this have any support for finding multiple errors in the same string? It seems like it could be another advantage of this approach. I tried "%s%t" % 1 but it only gave one error.

It does, however there's only 1 statically determinable error in that format string you've posted (without attempting to solve an intractably large problem of full type checking) -- notably the positional / keyword can only really be determined if it's a tuple / list / dict literal (and for the case of dictionaries it can only determine it if all the keys are strings)

Here's an example where multiple are detected:

$ python3 -m pyflakes t.py
t.py:1:1 '...' % ... has unused named argument(s): bar
t.py:1:1 '...' % ... is missing argument(s) for placeholder(s): foo
$ cat t.py
'%(foo)s' % {'bar': 'baz'}
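A simplified sketch of why a dict literal with string keys is statically checkable. The helper `check_percent_mapping` and the `NAMED_KEY` pattern are hypothetical and much cruder than the parser in this PR (for example, the regex ignores `%%` escapes); Python 3.8+ is assumed for `ast.Constant`.

```python
import ast
import re

# Naive pattern for '%(name)' keys; an assumption made for brevity.
NAMED_KEY = re.compile(r'%\(([^)]*)\)')


def check_percent_mapping(source):
    """Return lint-style messages for a `'...' % {...}` expression."""
    node = ast.parse(source, mode='eval').body
    if not (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod)):
        return []
    left, right = node.left, node.right
    if not (isinstance(left, ast.Constant) and isinstance(left.value, str)):
        return []
    if not isinstance(right, ast.Dict):
        return []  # not a dict literal: nothing can be determined
    keys = [
        key.value for key in right.keys
        if isinstance(key, ast.Constant) and isinstance(key.value, str)
    ]
    if len(keys) != len(right.keys):
        return []  # non-string or computed keys, or ** unpacking
    placeholders = set(NAMED_KEY.findall(left.value))
    unused = sorted(set(keys) - placeholders)
    missing = sorted(placeholders - set(keys))
    messages = []
    if unused:
        messages.append(
            "'...' % ... has unused named argument(s): " + ', '.join(unused))
    if missing:
        messages.append(
            "'...' % ... is missing argument(s) for placeholder(s): "
            + ', '.join(missing))
    return messages


messages = check_percent_mapping("'%(foo)s' % {'bar': 'baz'}")
```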

asmeurer · 2019-04-04T17:19:54Z

Ah. I had assumed it would take literals as obviously not unpackable. But I can see that in general pyflakes would need a recursive type inference system, which it currently does not have. I guess this just shows why format is better than %.

asottile · 2019-04-04T17:26:24Z

It ~could probably infer int / float / str / bool / None literals without too much more complexity -- though I left them out since they're not interesting formatting

asmeurer · 2019-04-04T17:28:37Z

That's a good point.

asottile · 2019-04-04T19:37:14Z

here's a few more cases it found: python/mypy#6631

asottile · 2019-04-04T19:45:09Z

Found one more unhandled case that I'll get a fix for:

```
'{foo}-{}'.format(1, foo=2)
```

asottile · 2019-05-02T17:06:25Z

I've tried this out on quite a few internal codebases and it appears to be stable and has proved quite useful (fixed a bunch of bugs in rare error cases!) -- would definitely appreciate some reviews on the approach with the hope that this gets integrated :)

asmeurer · 2019-05-02T18:58:13Z

I ran this on a bunch of codebases. I didn't really find many errors. Only a handful of issues that wouldn't lead to runtime errors (extra unused formatting arguments). But there were no crashes or incorrect reports, so that's good. I have no idea if it missed anything.

Some of the error messages can be a bit confusing. The arguments to the format function call are counted starting at 0. I can see why this is done, as that's how formatting works, but for something like f(x, y, z) I'm used to calling z the 3rd argument, not position 2. Maybe instead of listing the position it could name the argument itself.

It can also be a bit confusing with something like '{} {}'.format(1) that the second {} is called "placeholder 1", same as in '{0} {1}'.format(1). I can't think of a cleaner way to write this error message, however. I'm also not clear from the code whether it would be straightforward to distinguish '{} {}' and '{0} {1}' (I haven't read the code especially closely, though).

Another suggestion would be to replace the (s) in the error messages with an automatic pluralizer to improve readability. Alternatively, emit one message per error. So for instance

'{a} {b} {c}'.format(a=1)

would produce

'...'.format(...) is missing an argument for placeholder 'b'
'...'.format(...) is missing an argument for placeholder 'c'

instead of

'...'.format(...) is missing argument(s) for placeholder(s): b, c

I think I would prefer splitting like this, as this matches the way pyflakes handles other errors, and it will allow better reporting once column numbers are added (the 'b' and 'c' errors could each point to their respective position in the format string).

In all I think the changes here are pretty good. But again I didn't read the code too closely, so you might want someone else to do that.
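The one-message-per-error suggestion above could be sketched as follows. The function `missing_named_messages` is a hypothetical name, the message wording is taken from the proposal in this comment, and attribute or index access in field names (such as `a.b`) is not handled.

```python
import string


def missing_named_messages(template, supplied):
    """Yield one message per named placeholder that has no argument."""
    used = []
    for _, field_name, _, _ in string.Formatter().parse(template):
        # Skip literal text, automatic ('') and manual (digit) fields.
        if field_name and not field_name.isdigit() and field_name not in used:
            used.append(field_name)
    return [
        "'...'.format(...) is missing an argument for placeholder {!r}".format(name)
        for name in used
        if name not in supplied
    ]


per_error = missing_named_messages('{a} {b} {c}', {'a'})
```

Keeping one message per placeholder means each message could later carry its own column, as the comment points out.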

myint

Looks good!

I added a minor change request.

asmeurer · 2019-07-30T12:54:19Z

Should we open a new issue for splitting out the errors into separate errors for each item and adding column information?

asottile mentioned this pull request Apr 4, 2019

Detect broken string templates that will raise ValueError when interpolated #148

Closed

asottile mentioned this pull request Apr 4, 2019

Fix up a few string formatting issues python/mypy#6631

Merged

asottile mentioned this pull request Apr 4, 2019

Fix invalid format string in error message aws/aws-cli#4049

Closed

JukkaL pushed a commit to python/mypy that referenced this pull request Apr 5, 2019

Fix up a few string formatting issues (#6631)

ce467a7

Using the PR of pyflakes @ PyCQA/pyflakes#443

asmeurer reviewed May 2, 2019

asottile added 2 commits May 6, 2019 08:18

Add lint rule for f-strings without placeholders

2b74ba5

Add linting for string.format(...)

62e44a9

myint requested changes Jul 7, 2019

Add linting for % formatting

105878b

myint approved these changes Jul 7, 2019

mxr approved these changes Jul 13, 2019

myint merged commit eeb6263 into PyCQA:master Jul 30, 2019

asottile deleted the string_formatting branch July 30, 2019 15:50

fperrin mentioned this pull request Aug 2, 2019

Detect broken string interpolation and formatting #370

Closed

pquentin mentioned this pull request Sep 24, 2019

fix invalid f-string encode/httpx#375

Merged

asottile mentioned this pull request Apr 3, 2021

Plugins with conflicting entry-point error-code specifications do not auto-load properly [REPLACEMENT ISSUE] PyCQA/flake8#701

Closed

asottile mentioned this pull request Apr 3, 2021

Plugins with conflicting entry-point error-code specifications do not auto-load properly PyCQA/flake8#1090

Closed

asmeurer mentioned this pull request Jun 9, 2021

Check for invalid format specifiers #644

Open
