Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[symilar] Migrate from getopt to argparse #9731

Merged
merged 5 commits into from
Aug 25, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions doc/whatsnew/fragments/9731.user_action
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
We migrated ``symilar`` to argparse, from getopt, so the error and help output changed
(for the better). We exit with 2 instead of sometime 1, sometime 2. The error output
is not captured by the runner anymore. It's not possible to use a value for the
boolean options anymore (``--ignore-comments 1`` should become ``--ignore-comments``).

Refs #9731
80 changes: 22 additions & 58 deletions pylint/checkers/symilar.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,6 @@
import warnings
from collections import defaultdict
from collections.abc import Callable, Generator, Iterable, Sequence
from getopt import GetoptError, getopt
from io import BufferedIOBase, BufferedReader, BytesIO
from itertools import chain
from typing import TYPE_CHECKING, NamedTuple, NewType, NoReturn, TextIO, Union
Expand Down Expand Up @@ -876,67 +875,32 @@ def register(linter: PyLinter) -> None:
linter.register_checker(SimilaritiesChecker(linter))


def usage(status: int = 0) -> NoReturn:
"""Display command line usage information."""
print("finds copy pasted blocks in a set of files")
print()
print(
"Usage: symilar [-d|--duplicates min_duplicated_lines] \
[-i|--ignore-comments] [--ignore-docstrings] [--ignore-imports] [--ignore-signatures] file1..."
)
sys.exit(status)


def Run(argv: Sequence[str] | None = None) -> NoReturn:
"""Standalone command line access point."""
if argv is None:
argv = sys.argv[1:]

s_opts = "hd:i:"
l_opts = [
"help",
"duplicates=",
"ignore-comments",
"ignore-imports",
"ignore-docstrings",
"ignore-signatures",
]
min_lines = DEFAULT_MIN_SIMILARITY_LINE
ignore_comments = False
ignore_docstrings = False
ignore_imports = False
ignore_signatures = False
try:
opts, args = getopt(list(argv), s_opts, l_opts)
except GetoptError as e:
print(e)
usage(2)
for opt, val in opts:
if opt in {"-d", "--duplicates"}:
try:
min_lines = int(val)
except ValueError as e:
print(e)
usage(2)
elif opt in {"-h", "--help"}:
usage()
elif opt in {"-i", "--ignore-comments"}:
ignore_comments = True
elif opt in {"--ignore-docstrings"}:
ignore_docstrings = True
elif opt in {"--ignore-imports"}:
ignore_imports = True
elif opt in {"--ignore-signatures"}:
ignore_signatures = True
if not args:
usage(1)
sim = Symilar(
min_lines, ignore_comments, ignore_docstrings, ignore_imports, ignore_signatures
parser = argparse.ArgumentParser(
prog="symilar", description="Finds copy pasted blocks in a set of files."
)
parser.add_argument("files", nargs="+")
parser.add_argument(
"-d", "--duplicates", type=int, default=DEFAULT_MIN_SIMILARITY_LINE
)
parser.add_argument("-i", "--ignore-comments", action="store_true")
parser.add_argument("--ignore-docstrings", action="store_true")
parser.add_argument("--ignore-imports", action="store_true")
parser.add_argument("--ignore-signatures", action="store_true")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Not sure under what circumstances users might end up seeing the parser's "usage" explanation, but if they might, help statements might be helpful here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch !

parsed_args = parser.parse_args(args=argv)
similar_runner = Symilar(
min_lines=parsed_args.duplicates,
ignore_comments=parsed_args.ignore_comments,
ignore_docstrings=parsed_args.ignore_docstrings,
ignore_imports=parsed_args.ignore_imports,
ignore_signatures=parsed_args.ignore_signatures,
)
for filename in args:
for filename in parsed_args.files:
with open(filename, encoding="utf-8") as stream:
sim.append_stream(filename, stream)
sim.run()
similar_runner.append_stream(filename, stream)
similar_runner.run()
# the sys exit must be kept because of the unit tests that rely on it
sys.exit(0)


Expand Down
28 changes: 16 additions & 12 deletions tests/checkers/unittest_symilar.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
from pathlib import Path

import pytest
from _pytest.capture import CaptureFixture

from pylint.checkers import symilar
from pylint.lint import PyLinter
Expand Down Expand Up @@ -364,13 +365,16 @@ def test_help() -> None:
pytest.fail("not system exit")


def test_no_args() -> None:
def test_no_args(capsys: CaptureFixture) -> None:
output = StringIO()
with redirect_stdout(output):
try:
symilar.Run([])
except SystemExit as ex:
assert ex.code == 1
assert ex.code == 2
out, err = capsys.readouterr()
assert not out
assert "the following arguments are required: files" in err
else:
pytest.fail("not system exit")

Expand Down Expand Up @@ -494,30 +498,30 @@ def test_set_duplicate_lines_to_zero() -> None:
assert output.getvalue() == ""


@pytest.mark.parametrize("v", ["d"])
def test_bad_equal_short_form_option(v: str) -> None:
def test_equal_short_form_option() -> None:
"""Regression test for https://github.com/pylint-dev/pylint/issues/9343"""
output = StringIO()
with redirect_stdout(output), pytest.raises(SystemExit) as ex:
symilar.Run([f"-{v}=0", SIMILAR1, SIMILAR2])
assert ex.value.code == 2
assert "invalid literal for int() with base 10: '=0'" in output.getvalue()
symilar.Run(["-d=2", SIMILAR1, SIMILAR2])
assert ex.value.code == 0
assert "similar lines in" in output.getvalue()


@pytest.mark.parametrize("v", ["i", "d"])
def test_space_short_form_option(v: str) -> None:
def test_space_short_form_option() -> None:
"""Regression test for https://github.com/pylint-dev/pylint/issues/9343"""
output = StringIO()
with redirect_stdout(output), pytest.raises(SystemExit) as ex:
symilar.Run([f"-{v} 2", SIMILAR1, SIMILAR2])
symilar.Run(["-d 2", SIMILAR1, SIMILAR2])
assert ex.value.code == 0
assert "similar lines in" in output.getvalue()


def test_bad_short_form_option() -> None:
def test_bad_short_form_option(capsys: CaptureFixture) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be out of scope of this PR, but could it be helpful to add a couple more unit tests?

  1. --i could be useful because it's ambiguous and argparse allows abbreviation (allow_abbrev defaults to True)
  2. I also hope a user wouldn't fall into this trap, but it might be relevant to have a test for if a user provides an argument accompaniment to one of the boolean flags (i.e., --ignore-comments 1 as you mentioned this in the fragments).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of those test cases are going to fail with an elegant argparse message. We had a lot of unit tests because we were using getopt and doing too much ourselves (no other choices at time of implementation, as argparse was not builtin yet). I added test previously to check that I wasn't changing the behavior with the migration. I don't know if we should remove tests because we trust argparse now, maybe later (?)

Copy link
Contributor

@rogersheu rogersheu Aug 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Makes sense! Yeah, seeing as a lot of this error handling gets offloaded to argparse, I think it sounds like it would make sense to remove the tests when ready.

"""Regression test for https://github.com/pylint-dev/pylint/issues/9343"""
output = StringIO()
with redirect_stdout(output), pytest.raises(SystemExit) as ex:
symilar.Run(["-j=0", SIMILAR1, SIMILAR2])
out, err = capsys.readouterr()
assert ex.value.code == 2
assert "option -j not recognized" in output.getvalue()
assert not out
assert "unrecognized arguments: -j=0" in err
Loading