Detect broken string interpolation and formatting #370

fperrin · 2018-10-05T09:26:23Z

Verify simple cases, such as: "%s: %s" % (foo, bar, baz) and "{foo}: {bar}".format(foo="foo", baaar="baz").

This is done by asking Python to do the formatting, filling in the arguments with dummy objects.
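For readers who want to see the idea in miniature, here is a rough sketch of "let Python do the formatting with dummy arguments". It is not the code from this PR: `Dummy`, `percent_error` and `format_error` are hypothetical names made up for illustration, and the sketch only covers plain positional `%` arguments and keyword `.format` arguments.

```python
class Dummy(int):
    """Hypothetical placeholder argument (not from the PR).

    Subclassing int lets it satisfy %s, %d, %f, %x and a '*' width.
    """


def percent_error(template, n_args):
    """Run `template % (n_args dummies)`; return the error text or None."""
    try:
        template % tuple(Dummy() for _ in range(n_args))
    except (TypeError, ValueError) as exc:
        return str(exc)
    return None


def format_error(template, keywords):
    """Run `template.format(**dummies)`; return the error repr or None."""
    try:
        template.format(**{name: Dummy() for name in keywords})
    except (KeyError, IndexError, ValueError) as exc:
        return repr(exc)
    return None


# Both calls report a problem for the two examples in the description:
print(percent_error("%s: %s", 3))
print(format_error("{foo}: {bar}", ["foo", "baaar"]))
```

Because the template is really evaluated, the sketch costs whatever the formatting itself costs in time and memory.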

Testing of .format strings is skipped on Python 3 for now, but it should be fairly easy to add; I want to get an idea of whether you like the current approach first.

Resolves #148

asmeurer · 2018-10-05T18:36:49Z

This is awesome.

The only potential issue I can see with this is that someone could DoS pyflakes by creating a very large string using float interpolation. For instance, "%1000000000000f" % 1 will create a string that's 1 terabyte in size. Hopefully this isn't difficult to work around.
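To make the scaling concrete without allocating anything large, the snippet below measures the result for small widths. `padded_length` is a hypothetical helper used only for this illustration; the point is that the output length tracks the width written in the format string, so a few extra digits in the source multiply the memory needed.

```python
def padded_length(width):
    """Length of the string produced by "%<width>f" % 1."""
    template = "%" + str(width) + "f"
    return len(template % 1)


# The result is padded to the requested width, so its size grows with
# the number written in the template (keep these widths small).
for width in (1000, 100000):
    print(width, padded_length(width))
```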

bitglue · 2018-10-05T18:58:29Z

Maybe that's a feature, since it's equally a problem with the program under analysis.

What's the impact to performance?

asmeurer · 2018-10-05T19:29:42Z

Well, I would expect rogue code not to be able to crash a static analysis tool. I can also write os.system('rm -rf /'), which is a problem for the code if it runs, but if pyflakes actually ran the code it would be a serious issue for pyflakes. Practically speaking, it means someone can construct a very small Python source file that will eat up all the memory and crash any program that uses pyflakes (text editors, REPLs, etc.).

Also, it's a mistake to believe that pyflakes code <=> code that will be run. pyflakes is often used as an editor extension, meaning it runs on intermediate versions of code, not just the final version that will be executed. All it takes is for someone to accidentally hold down a number key for too long when constructing a formatting string to create something like what I have above, and it will crash pyflakes (and possibly the editor as well).

fperrin · 2018-10-08T11:15:21Z

> The only potential issue I can see with this is that someone could DoS pyflakes by creating a very large string using float interpolation. For instance, "%1000000000000f" % 1 will create a string that's 1 terabyte in size. Hopefully this isn't difficult to workaround.

I didn't think of that; I thought evaluating the string interpolation was safe because there is no arbitrary Python code involved (as opposed to, e.g., f"" strings). I guess it could even be argued that your example would run fine in a beefy production environment, so pyflakes should not fail when running on the developer's smaller laptop. I'll try to think more about it.

There is another bug in number formatting: checking "%*f" % (5, 1.2) fails with "* wants int". ~~Looking into this~~ Pushed new commit.

fperrin · 2019-02-25T14:58:20Z

Hi,

I don't have any bright idea for avoiding the unbounded memory use. It's an intrinsic flaw of asking Python to evaluate the string. I think I'll close this PR, because it'll not go anywhere I'm afraid.

asottile · 2019-02-25T15:15:56Z

Instead of evaluating the format string, you could parse it and compare the fields. There's a stdlib parser at least for .format(...). Here's how I use it for pyupgrade
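As a rough illustration of the parse-and-compare idea (a sketch, not the pyupgrade code and not what #443 does), the stdlib parser is `string.Formatter().parse`, which yields `(literal_text, field_name, format_spec, conversion)` tuples without formatting anything. `check_format` and its parameters are hypothetical names, and nested fields inside a format spec are ignored here.

```python
import re
import string


def check_format(template, n_positional=0, keywords=()):
    """Compare fields used by a .format template with supplied arguments.

    Returns (missing, unused) as two lists of strings.
    """
    used_names, used_positions = set(), set()
    auto_index = 0
    for _literal, field, _spec, _conv in string.Formatter().parse(template):
        if field is None:  # trailing literal text, no replacement field
            continue
        # Keep only the part before any ".attr" or "[index]" suffix.
        base = re.split(r"[.\[]", field, maxsplit=1)[0]
        if base == "":  # "{}" uses automatic numbering
            used_positions.add(auto_index)
            auto_index += 1
        elif base.isdigit():  # "{0}" names a position explicitly
            used_positions.add(int(base))
        else:
            used_names.add(base)

    given_positions = set(range(n_positional))
    given_names = set(keywords)
    missing = sorted(str(p) for p in used_positions - given_positions)
    missing += sorted(used_names - given_names)
    unused = sorted(str(p) for p in given_positions - used_positions)
    unused += sorted(given_names - used_names)
    return missing, unused


print(check_format("{foo}: {bar}", keywords=("foo", "baaar")))
print(check_format("{} {}", n_positional=3))
```

Since nothing is evaluated, a huge width in the template costs nothing, and extra arguments can be reported as well as missing ones.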

Parsing % format is a little trickier, though I think I have that as well: pyupgrade again
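One way to see why `%` formats are trickier: there is no stdlib parser for them, so the conversion specifiers have to be matched by hand. The sketch below is an assumption-laden simplification, not the pyupgrade parser; `_PERCENT_RE` and `count_percent_args` are hypothetical names, and mapping keys such as `%(name)s` are matched but simply counted as one argument.

```python
import re

# Simplified pattern for a printf-style conversion specifier.
_PERCENT_RE = re.compile(
    r"%(?:\((?P<key>[^)]*)\))?"        # optional mapping key
    r"(?P<flags>[#0\- +]*)"            # conversion flags
    r"(?P<width>\*|\d+)?"              # minimum width, or '*'
    r"(?:\.(?P<precision>\*|\d*))?"    # precision, or '*'
    r"[hlL]?"                          # length modifier (ignored by Python)
    r"(?P<conversion>[diouxXeEfFgGcrsa%])"
)


def count_percent_args(template):
    """Count the positional arguments a %-template consumes."""
    needed = 0
    for match in _PERCENT_RE.finditer(template):
        if match.group("conversion") == "%":  # "%%" is a literal percent
            continue
        needed += 1
        # Each '*' takes its value from one extra argument.
        needed += match.group("width") == "*"
        needed += match.group("precision") == "*"
    return needed


print(count_percent_args("%s: %s"))           # compare against 3 supplied
print(count_percent_args("%*f"))              # '*' consumes an argument
print(count_percent_args("%1000000000000f"))  # counted, never formatted
```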

I don't think there's anything necessary for f-strings since they have their own ast representation, though if you wanted a parser for those I've ported the C parser to pure python here: future-fstrings

fperrin · 2019-08-02T16:57:15Z

Closed as it is superseded by #443, thanks @asottile!

#443 found a couple extra issues on our code base, where "{} {}".format(1, 2, 3) has more (unnamed) arguments than are used in the template. That case was missed by the present PR. So that makes #443 plainly better.

fperrin added 2 commits October 5, 2018 10:45

checker.py: verify string %-interpolation in simple cases

bd0e963

checker.py: verify string formatting in simple cases

3a26e6d

Handle %*f constructs

d502ec8

asottile mentioned this pull request Apr 4, 2019

String formatting linting #443

Merged

fperrin closed this Aug 2, 2019

asottile mentioned this pull request Apr 3, 2021

Add warning on invalid percent markers; ValueError: unsupported format character PyCQA/flake8#689

Closed
