Incorrect detection of self-documenting f-strings #39

GjjvdBurg · 2020-01-16T17:20:01Z

Describe the bug

False-positives are returned for self-documenting f-strings.

To Reproduce

$ cat /tmp/test.py
a = 1
print(f"Hello world={a}")
$ vermin /tmp/test.py
Minimum required versions: 3.8
Incompatible versions:     2

Expected behavior

Expected the minimum required version for this code to be 3.6. A self-documenting f-string has the equals sign on the other side of the variable (i.e. {a=}).

Environment (please complete the following information):

Vermin version 0.9.2

Thanks in advance!

The text was updated successfully, but these errors were encountered:

netromdk · 2020-01-16T18:06:56Z

Thanks for noticing, @GjjvdBurg! Looking at it now. :)
I can deduce that you were running Python 3.8 to execute Vermin to achieve this result, right?

netromdk · 2020-01-16T19:12:54Z

There were some cases I hadn't noticed the first time around. Thanks again!

gousaiyang · 2020-01-16T22:09:31Z

Actually there are more cases (tested on Python 3.8 on your patch just now):

Case 1
f'{ a = }'
Expected: detected as 3.8+
Actual: detected as 3.6+
Notes: ASCII whitespaces ([ \t\n\r\f\v]) are allowed before and after the =, and are preserved in the output.

Case 2
f'{a+b=}', f'{a+1=}'
Expected: detected as 3.8+
Actual: detected as 3.6+
Notes: However f'{1+a=}' and f'{1+1=}' are detected correctly.

Case 3
f'{(x:=1)}'
Expected: detected as 3.8+
Actual: detected as 3.6+
Notes: This is not a "self-documenting f-string", this is an f-string containing an assignment expression (which is new in 3.8).

Unfortunately, "self-documenting f-string" or "f-string debugging" is poorly documented in the official documentation (I can only find it in whatsnew and changelog). The PR introducing this change contains some useful information (especially the test cases), along with its corresponding issue (contains relevant discussion when introducing this feature, might be a bit noisy because some previous proposals mentioned were rejected).

netromdk · 2020-01-16T22:18:01Z

Thanks, @gousaiyang :) Looking into it.

netromdk · 2020-01-16T23:12:07Z

I already implemented support for named expressions actually, but the sub node wasn't getting visited. So now f'{(x:=1)}' will correctly yield:

!2, 3.8      /tmp/fstr_named_expr.py
  f-strings require 3.6+
  named expressions require 3.8+

Also, visit sub nodes of f-strings. Related to #39.

netromdk · 2020-01-17T00:07:09Z

Nice test cases. Have to include those, too.

Related to #39.

netromdk · 2020-01-17T21:09:27Z

Okay, that should be all the remaining cases. Phew..

gousaiyang · 2020-01-17T22:48:47Z

I just realized that the false positives happen because the ast module returns an optimized AST (= related information is lost). Which means that f'foo{name=}' is parsed into ast.Constant with value = fooname= (constant string merged) and ast.FormattedValue containing ast.Name name.

Unfortunately, this means that f'{a=}' and f'a={a!r}' will be parsed into the same AST, except for the end_col_offset attribute:

In [26]: ast.parse("f'{a=}'").body[0].value.values[0].__dict__
Out[26]:
{'value': 'a=',
 'kind': None,
 'lineno': 1,
 'col_offset': 0,
 'end_lineno': 1,
 'end_col_offset': 7}

In [27]: ast.parse("f'{a=}'").body[0].value.values[1].__dict__
Out[27]:
{'value': <_ast.Name at 0x29b0cc78c70>,
 'conversion': 114,
 'format_spec': None,
 'lineno': 1,
 'col_offset': 0,
 'end_lineno': 1,
 'end_col_offset': 7}

In [28]: ast.parse("f'a={a!r}'").body[0].value.values[0].__dict__
Out[28]:
{'value': 'a=',
 'kind': None,
 'lineno': 1,
 'col_offset': 0,
 'end_lineno': 1,
 'end_col_offset': 10}

In [29]: ast.parse("f'a={a!r}'").body[0].value.values[1].__dict__
Out[29]:
{'value': <_ast.Name at 0x29b0cc737f0>,
 'conversion': 114,
 'format_spec': None,
 'lineno': 1,
 'col_offset': 0,
 'end_lineno': 1,
 'end_col_offset': 10}

So it seems impossible to differentiate f'{a=}' and f'a={a!r}' with the standard ast module without using the end_col_offset attribute or doing extra parsing of the original source code. Using the end_col_offset attribute might help (deduce from the length of code), but it will be somewhat tricky because the f-string could be very complex like f'foo{f"nested"=}bar{no}baz{another=}'.

netromdk · 2020-01-18T11:57:13Z

Yes, you are exactly right. I've run into this a number of times for different features, and as also mentioned in #35, I've thought about maybe utilizing an additional parser. But it would still be required to function with Python 2.7/3.0, and I still prefer the standard ast parser for maintainability and always being up-to-date.

The source visitor class is given the parsed AST (because errors need to be handled beforehand due to the concurrent behavior), but it could perhaps be given the entire source in string form as well if any extra parsing is necessary - like parsing the [(lineno, col_offset); (end_lineno, end_col_offset)] part for this case. Hm.

bulletmark · 2020-05-26T05:18:30Z

There are still trivial cases tripping this so please re-open. E.g:

$ cat tmp.py
a = 0
b = 0
print(f'a={a} b={b}')

$ vermin -vv --no-tips tmp.py
Detecting python files..
Analyzing using 4 processes..
!2, 3.8      /home/mark/tmp.py
  f-strings require 3.6+
  print(expr) requires 2+ or 3+
  self-documenting fstrings require 3.8+

Minimum required versions: 3.8
Incompatible versions:     2

netromdk · 2020-07-19T12:32:34Z

Sorry for the slow reply.

The problem, as already described above, is that ast optimizes away valuable information. The AST cannot distinguish f'{a=}' from f'a={a}'. This is why your "trivial" case is seen as a self-doc f-string.

With this in mind, I think it would make more sense to disable the entire self-doc f-string detection (but keeping regular f-string detection, of course).

This is done since the built-in AST cannot distinguish `f'{a=}'` from `f'a={a}'`, for instance, because it optimizes some information away. And this incorrectly marks some source code as using fstring self-doc when only using general fstring.

netromdk closed this as completed in d61470f Jan 16, 2020

netromdk reopened this Jan 16, 2020

netromdk added a commit that referenced this issue Jan 17, 2020

Fixed more self-documenting f-string cases

244dbbc

Also, visit sub nodes of f-strings. Related to #39.

netromdk added a commit that referenced this issue Jan 17, 2020

Fixed more self-documenting f-string cases

8987d20

Related to #39.

netromdk added a commit that referenced this issue Jan 17, 2020

Fixed more self-documenting f-string cases

0639042

Related to #39.

netromdk closed this as completed Jan 17, 2020

netromdk added a commit that referenced this issue Aug 15, 2020

fstring self-doc detection can be enabled via --feature argument (#39)

b3390cb

netromdk added a commit that referenced this issue Aug 15, 2020

fstring self-doc detection can be enabled via --feature argument (#39)

c25c44d

lou1306 mentioned this issue Sep 28, 2021

Vermin reports 3.6 instead of 3.8 on files with self-documenting f-strings #77

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect detection of self-documenting f-strings #39

Incorrect detection of self-documenting f-strings #39

GjjvdBurg commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 16, 2020

gousaiyang commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 17, 2020

netromdk commented Jan 17, 2020

gousaiyang commented Jan 17, 2020

netromdk commented Jan 18, 2020

bulletmark commented May 26, 2020

netromdk commented Jul 19, 2020 •

edited

Loading

Incorrect detection of self-documenting f-strings #39

Incorrect detection of self-documenting f-strings #39

Comments

GjjvdBurg commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 16, 2020

gousaiyang commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 16, 2020

netromdk commented Jan 17, 2020

netromdk commented Jan 17, 2020

gousaiyang commented Jan 17, 2020

netromdk commented Jan 18, 2020

bulletmark commented May 26, 2020

netromdk commented Jul 19, 2020 • edited Loading

netromdk commented Jul 19, 2020 •

edited

Loading