Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: improve diagnostic formatting #3258

Open
perillo opened this issue Dec 20, 2023 · 4 comments
Open

RFC: improve diagnostic formatting #3258

perillo opened this issue Dec 20, 2023 · 4 comments

Comments

@perillo
Copy link
Contributor

perillo commented Dec 20, 2023

In #3256 I introduced the --caret-diagnostic flag in order to report the diagnostic message as it is done by modern compilers.
The name of the flag is derived from https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fcaret-diagnostics (though I missed the ending s).

@peternewman suggested to add a --format flag, instead, in order to be future proof.

Since this requires more changes to the current code, I would like to open a discussion about the best strategy.
The main problem is how the context (-C/-A/-B) is applied. Here is an example:

:     with fp:
:         # GH-87235: On macOS all file descriptors for /dev/fd/N share the same
>         # file offset, reset the file offset after scanning the zipfile diretory
:         # to not cause problems when some runs 'python3 /dev/fd/9 9<some_script'
:         start_offset = fp.tell()
Lib/zipimport.py:355: diretory ==> directory

This works only when there is one line.

--format flag

Introducing the --format flag has the advantage that we know how many context source lines are printed by a format, so with -C/-A/-B the code only needs to print additional lines around the lines printed by the format.

Possible formats:

plain (default)

if filename != "-":
    f"{cfilename}:{cline}: {cwrongword} "
    f"==> {crightword}{creason}"
else:
    f"{cline}: {cwrongword} ==> {crightword}{creason}"

Not sure if ccolumn can be added.

diagnostic / caret-diagnostics / develop/ ?

if filename != "-":
       f"{cfilename}:{cline}:{ccolumn}: {cwrongword} "
       f"==> {crightword}{creason}\n"
       f"{line}"
       f"{ccaret}\n"                   
else:
       f"{cfilename}:{cline}:{ccolumn}: {cwrongword} "
       f"==> {crightword}{creason}\n"
       f"{line}"
       f"{ccaret}\n"

TODO

Not sure if the code in interactive mode should also be modified. Example:

:         env['LC_ALL'] = 'C'
>         # Empty strings will be quoted by popen so we should just ommit it
:         if args != ('',):
        # Empty strings will be quoted by popen so we should just ommit it
	ommit ==> omit (Y/n)

The same line is printed twice; not sure if it is a good user experience.

Another issue I found is:

elif options.stdin_single_line:
    print(f"{cline}: {cwrongword} ==> {crightword}{creason}")
else:
    print(
        f"{cline}: {line.strip()}\n\t{cwrongword} "
        f"==> {crightword}{creason}"
    )

Should we have the second template be an alternative format?
Should the --stdin_single_line flag be removed before the next release?

Thanks.

@DimitriPapadopoulos
Copy link
Collaborator

Should the --stdin_single_line flag be removed before the next release?

Removed or changed to --format=...?

@perillo
Copy link
Contributor Author

perillo commented Dec 20, 2023

Should the --stdin_single_line flag be removed before the next release?

Removed or changed to --format=...?

Yes. My idea was to have the default format be "plain" (and --stdin-single-line was probably added to have the same plain format for both stdin and file). However this will break the compatibility, so a possible solution is to set the default format to None , "legacy" or "default", so that we can keep the current API.

However, as I wrote, the way the context is written should probably be modified, in order to improve readability.
In alternative, each format can decide if context should be ignored, maybe reporting an error.

An example is a format that prints a JSON document.

@perillo
Copy link
Contributor Author

perillo commented Dec 27, 2023

@DimitriPapadopoulos

I'm planning to implement the --format option.
Before starting, I checked if the current tests are able to detect a change in the codespell command output, but unfortunately after this change:

diff --git a/codespell_lib/_codespell.py b/codespell_lib/_codespell.py
index 6e3662a8..40683357 100644
--- a/codespell_lib/_codespell.py
+++ b/codespell_lib/_codespell.py
@@ -994,14 +994,14 @@ def parse_file(
                     print_context(lines, i, context)
                 if filename != "-":
                     print(
-                        f"{cfilename}:{cline}: {cwrongword} "
+                        f"{cfilename}:{cline}:x {cwrongword} "
                         f"==> {crightword}{creason}"
                     )
                 elif options.stdin_single_line:
                     print(f"{cline}: {cwrongword} ==> {crightword}{creason}")
                 else:
                     print(
-                        f"{cline}: {line.strip()}\n\t{cwrongword} "
+                        f"{cline}: {line.strip()}x\n\t{cwrongword} "
                         f"==> {crightword}{creason}"
                     )
 

there was no test failure.

My suggestion is:

  1. Move run_codespell, run_codespell_stdin, test_command and test_stdin from test_basic.py to test_cli.py
  2. Add a new expect parameter to check that the output matches the expected value

What do you think?

@perillo
Copy link
Contributor Author

perillo commented Jan 11, 2024

Any news about this proposal?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants