bpo-37910: argparse usage wrapping should allow whitespace differences caused by metavar #15372

@sjfranklin sjfranklin commented Aug 21, 2019

An empty metavar (metavar="") can introduce two consecutive spaces into the usage summary of the help text. Here's a minimal working example:

import argparse
# based on Vajrasky Kok's script in https://bugs.python.org/issue11874
parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--nil', metavar='', required=True)
parser.add_argument('--a', metavar='a' * 165)
parser.parse_args()

This would produce the following AssertionError:

  File "/minimum_argparse_bug.py", line 7, in <module>
    parser.parse_args()
  File "/path/to/cpython/Lib/argparse.py", line 1758, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/path/to/cpython/Lib/argparse.py", line 1790, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/path/to/cpython/Lib/argparse.py", line 1996, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/path/to/cpython/Lib/argparse.py", line 1936, in consume_optional
    take_action(action, args, option_string)
  File "/path/to/cpython/Lib/argparse.py", line 1864, in take_action
    action(self, namespace, argument_values, option_string)
  File "/path/to/cpython/Lib/argparse.py", line 1037, in __call__
    parser.print_help()
  File "/path/to/cpython/Lib/argparse.py", line 2483, in print_help
    self._print_message(self.format_help(), file)
  File "/path/to/cpython/Lib/argparse.py", line 2467, in format_help
    return formatter.format_help()
  File "/path/to/cpython/Lib/argparse.py", line 281, in format_help
    help = self._root_section.format_help()
  File "/path/to/cpython/Lib/argparse.py", line 212, in format_help
    item_help = join([func(*args) for func, args in self.items])
  File "/path/to/cpython/Lib/argparse.py", line 212, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
  File "/path/to/cpython/Lib/argparse.py", line 336, in _format_usage
    assert ' '.join(opt_parts) == opt_usage
AssertionError

The desired output would be:

$ ./python minimum_argparse_bug.py --help
usage: PROG [-h] --nil
            [--a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa]

optional arguments:
  -h, --help            show this help message and exit
  --nil 
  --a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Changing these two asserts to essentially ignore whitespace fixes the above issue and also allows metavar="\n", ="\t", and a range of other whitespace options.
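To make the idea concrete, here is a minimal sketch of a whitespace-insensitive comparison. It is not the diff from this PR: the helper names and the simplified `PART_PATTERN` regex are assumptions made for illustration, standing in for the splitting step that `_format_usage` performs before its asserts.

```python
import re

# Simplified stand-in (an assumption) for the pattern that splits a usage
# string into parts: a bracketed group, or a run of non-whitespace.
PART_PATTERN = r'\[.*?\]|\S+'


def parts_match_strict(parts, usage):
    """Mirror the shape of the original assert: exact string equality."""
    return ' '.join(parts) == usage


def parts_match_ignoring_whitespace(parts, usage):
    """Compare after dropping every whitespace character.

    str.split() with no arguments splits on all Unicode whitespace,
    so '\\n', '\\t', '\\x1c', '\\u2028' and similar are ignored too.
    """
    def collapse(text):
        return ''.join(text.split())

    return collapse(' '.join(parts)) == collapse(usage)


# An empty metavar leaves a trailing space after '--nil', which becomes
# a double space once the pieces are joined.
opt_usage = '[-h] --nil  [--a AAAA]'
opt_parts = re.findall(PART_PATTERN, opt_usage)

print(parts_match_strict(opt_parts, opt_usage))
print(parts_match_ignoring_whitespace(opt_parts, opt_usage))
```

The strict check fails because re-joining the parts with single spaces cannot reproduce the double space, while the relaxed check only requires the non-whitespace content to agree.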

A more substantial example of the special characters now allowed in metavar:

import argparse
# based on Vajrasky Kok's script in https://bugs.python.org/issue11874
parser = argparse.ArgumentParser(prog='PROG')

parser.add_argument('--nil', metavar='', required=True)
parser.add_argument('--Line-Feed', metavar='\n', required=True)
parser.add_argument('--Tab', metavar='\t', required=True)
parser.add_argument('--Carriage-Return', metavar='\r', required=True)
parser.add_argument('--Carriage-Return-and-Line-Feed',
                    metavar='\r\n', required=True)
parser.add_argument('--vLine-Tabulation', metavar='\v', required=True)
parser.add_argument('--x0bLine-Tabulation', metavar='\x0b', required=True)
parser.add_argument('--fForm-Feed', metavar='\f', required=True)
parser.add_argument('--x0cForm-Feed', metavar='\x0c', required=True)
parser.add_argument('--File-Separator', metavar='\x1c', required=True)
parser.add_argument('--Group-Separator', metavar='\x1d', required=True)
parser.add_argument('--Record-Separator', metavar='\x1e', required=True)
parser.add_argument('--C1-Control-Code', metavar='\x85', required=True)
parser.add_argument('--Line-Separator', metavar='\u2028', required=True)
parser.add_argument('--Paragraph-Separator', metavar='\u2029', required=True)
parser.add_argument('--a', metavar='a' * 165)
parser.parse_args()

$ ./python argparse_bug.py --help
usage: PROG [-h] --nil --Line-Feed --Tab --Carriage-Return --Carriage-Return-and-Line-Feed --vLine-Tabulation
            --x0bLine-Tabulation --fForm-Feed --x0cForm-Feed --File-Separator --Group-Separator
            --Record-Separator --C1-Control-Code --Line-Separator --Paragraph-Separator
            [--a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa]

optional arguments:
  -h, --help            show this help message and exit
  --nil 
  --Line-Feed 

  --Tab 	
  --Carriage-Return 
  --Carriage-Return-and-Line-Feed 

  --vLine-Tabulation �
  --x0bLine-Tabulation �
  --fForm-Feed 
  --x0cForm-Feed 
  --File-Separator �
  --Group-Separator �
  --Record-Separator �
  --C1-Control-Code �
  --Line-Separator 

  --Paragraph-Separator 

  --a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

I tested this fix on Linux (RHEL 7.6; 3.10.0-957) with yesterday's master branch (Python 3.9.0a0). The fix was also compatible when I manually changed argparse.py on Python 3.7.3 installed through Miniconda3.

https://bugs.python.org/issue37910

Franklin, Samuel added 2 commits August 21, 2019 18:30
…erences. Stripping whitespace in the opt_ and pos_usage assertions solves this issue, which can be introduced by empty metavars or metavars containing various forms of whitespace.
@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

@@ -0,0 +1 @@
The two asserts in _format_usage now ignore extra whitespaces which are used to split a long usages into multiple lines.
Contributor

Can you add more context here so it's clear that this affects argparse, and what behavior it allows argparse to do?

Author

Sure, I'll try to do this soon. Thanks for noticing that!

assert ' '.join(opt_parts) == opt_usage
assert ' '.join(pos_parts) == pos_usage

# ignore extra whitespace differences
Contributor

Can you add some tests in test_argparse.py verifying this behavior?

Author

Absolutely. It might not be too soon since work's a bit busy at the moment.

Author

Thanks for your review, @epicfaace. I added several new tests for a range of special characters.

Franklin, Samuel added 3 commits September 19, 2019 23:26
…y all Unicode whitespace characters. Thank you Ashwin Ramaswami for the recommendation.
@sjfranklin sjfranklin force-pushed the fix-37910-extra-whitespace-breaks-argparse-wrap branch from 6abd8e6 to dba2acc Compare September 20, 2019 04:06
@sjfranklin sjfranklin force-pushed the fix-37910-extra-whitespace-breaks-argparse-wrap branch from 2f8f025 to 646fbad Compare September 20, 2019 04:24
…break argparse usage help text. Fixed whitespace issue.
@sjfranklin sjfranklin force-pushed the fix-37910-extra-whitespace-breaks-argparse-wrap branch from 9737c28 to 30c69ff Compare September 20, 2019 04:40
Franklin, Samuel added 2 commits September 20, 2019 01:09
…break argparse usage help text. Fixed whitespace issue.
…break argparse usage help text. Fixed whitespace issue.
@sjfranklin sjfranklin force-pushed the fix-37910-extra-whitespace-breaks-argparse-wrap branch from 28d9f09 to d941a85 Compare September 20, 2019 05:09
@csabella csabella requested a review from rhettinger June 12, 2020 12:26
class TestAddArgumentMetavarWrapNoException(TestCase):
    """Check that certain special character wrap with no exceptions
    Based off TestAddArgumentMetavar"""

Contributor

Please combine all of these into a single test method that runs the cases in a loop with unittest subtests.

Member

@shihai1991 shihai1991 left a comment

Thanks, it works.

@@ -0,0 +1,3 @@
argparse.py now allows metavar to be certain whitespace characters, such as
Member

Please use :mod:`argparse` here.

@hauntsaninja
Contributor

@sjfranklin are you interested in seeing this PR through?

hamdanal added a commit to hamdanal/cpython that referenced this pull request May 28, 2023
Rationale
=========

argparse performs a complex formatting of the usage for argument grouping
and for line wrapping to fit the terminal width. This formatting has been
a constant source of bugs for at least 10 years (see linked issues below)
where defensive assertion errors are triggered or brackets and parentheses
are not properly handled.

Problem
=======

The current implementation of argparse usage formatting relies on regular
expressions to group arguments usage only to separate them again later
with another set of regular expressions. This is a complex and error-prone
approach that caused all the issues linked below. Special casing certain
argument formats has not solved the problem. The following are some of
the most common issues:
- empty `metavar`
- mutually exclusive groups with `SUPPRESS`ed arguments
- metavars with whitespace
- metavars with brackets or parentheses

Solution
========

The following two comments summarize the solution:
- python#82091 (comment)
- python#77048 (comment)

Mainly, the solution is to rewrite the usage formatting to avoid the
group-then-separate approach. Instead, the usage parts are kept separate
and only joined together at the end. This allows for a much simpler
implementation that is easier to understand and maintain. It avoids the
regular expressions approach and fixes the corresponding issues.

This closes the following issues:
- Closes python#62090
- Closes python#62549
- Closes python#77048
- Closes python#82091
- Closes python#89743
- Closes python#96310
- Closes python#98666

These PRs become obsolete:
- Closes python#15372
- Closes python#96311
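
The "keep the parts separate, join at the end" idea from the commit message above can be sketched roughly as follows. This is not the implementation from the referenced commit: `usage_part` and `wrap_parts` are hypothetical helpers, and the sketch only shows why no regex re-splitting (and so no consistency assert) is needed when the parts are never merged into one string first.

```python
def usage_part(flag, metavar, required):
    """Build one usage part; an empty metavar simply contributes nothing."""
    text = flag if not metavar else f"{flag} {metavar}"
    return text if required else f"[{text}]"


def wrap_parts(parts, width, indent=""):
    """Greedily pack already-separate parts into lines of at most `width`.

    A part longer than `width` gets a line of its own rather than being split.
    """
    lines, current = [], []
    for part in parts:
        candidate = " ".join(current + [part])
        if current and len(indent) + len(candidate) > width:
            lines.append(indent + " ".join(current))
            current = [part]
        else:
            current.append(part)
    if current:
        lines.append(indent + " ".join(current))
    return lines


# (flag, metavar, required) triples, loosely modelled on the PR example.
specs = [("-h", "", False), ("--nil", "", True), ("--a", "a" * 30, False)]
parts = [usage_part(*spec) for spec in specs]

for line in wrap_parts(parts, width=24, indent="    "):
    print(line)
```

Because wrapping operates on the list of parts directly, a stray or missing space inside any one part cannot make a later re-split disagree with the joined string.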
encukou pushed a commit that referenced this pull request May 7, 2024
@erlend-aasland
Contributor

Closing this, as the linked issue was resolved by #105039. Thanks for the PR, though!

SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024