RFC: improve diagnostic formatting #3258

perillo · 2023-12-20T10:51:01Z

In #3256 I introduced the --caret-diagnostic flag in order to report the diagnostic message as it is done by modern compilers.
The name of the flag is derived from https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fcaret-diagnostics (though I missed the ending s).

@peternewman suggested to add a --format flag, instead, in order to be future proof.

Since this requires more changes to the current code, I would like to open a discussion about the best strategy.
The main problem is how the context (-C/-A/-B) is applied. Here is an example:

:     with fp:
:         # GH-87235: On macOS all file descriptors for /dev/fd/N share the same
>         # file offset, reset the file offset after scanning the zipfile diretory
:         # to not cause problems when some runs 'python3 /dev/fd/9 9<some_script'
:         start_offset = fp.tell()
Lib/zipimport.py:355: diretory ==> directory

This works only when there is one line.

--format flag

Introducing the --format flag has the advantage that we know how many context source lines are printed by a format, so with -C/-A/-B the code only needs to print additional lines around the lines printed by the format.

Possible formats:

plain (default)

if filename != "-":
    f"{cfilename}:{cline}: {cwrongword} "
    f"==> {crightword}{creason}"
else:
    f"{cline}: {cwrongword} ==> {crightword}{creason}"

Not sure if ccolumn can be added.

diagnostic / caret-diagnostics / develop/ ?

if filename != "-":
       f"{cfilename}:{cline}:{ccolumn}: {cwrongword} "
       f"==> {crightword}{creason}\n"
       f"{line}"
       f"{ccaret}\n"                   
else:
       f"{cfilename}:{cline}:{ccolumn}: {cwrongword} "
       f"==> {crightword}{creason}\n"
       f"{line}"
       f"{ccaret}\n"

TODO

Not sure if the code in interactive mode should also be modified. Example:

:         env['LC_ALL'] = 'C'
>         # Empty strings will be quoted by popen so we should just ommit it
:         if args != ('',):
        # Empty strings will be quoted by popen so we should just ommit it
	ommit ==> omit (Y/n)

The same line is printed twice; not sure if it is a good user experience.

Another issue I found is:

elif options.stdin_single_line:
    print(f"{cline}: {cwrongword} ==> {crightword}{creason}")
else:
    print(
        f"{cline}: {line.strip()}\n\t{cwrongword} "
        f"==> {crightword}{creason}"
    )

Should we have the second template be an alternative format?
Should the --stdin_single_line flag be removed before the next release?

Thanks.

The text was updated successfully, but these errors were encountered:

DimitriPapadopoulos · 2023-12-20T12:50:01Z

Should the --stdin_single_line flag be removed before the next release?

Removed or changed to --format=...?

perillo · 2023-12-20T14:03:00Z

Should the --stdin_single_line flag be removed before the next release?

Removed or changed to --format=...?

Yes. My idea was to have the default format be "plain" (and --stdin-single-line was probably added to have the same plain format for both stdin and file). However this will break the compatibility, so a possible solution is to set the default format to None , "legacy" or "default", so that we can keep the current API.

However, as I wrote, the way the context is written should probably be modified, in order to improve readability.
In alternative, each format can decide if context should be ignored, maybe reporting an error.

An example is a format that prints a JSON document.

perillo · 2023-12-27T12:26:36Z

@DimitriPapadopoulos

I'm planning to implement the --format option.
Before starting, I checked if the current tests are able to detect a change in the codespell command output, but unfortunately after this change:

diff --git a/codespell_lib/_codespell.py b/codespell_lib/_codespell.py
index 6e3662a8..40683357 100644
--- a/codespell_lib/_codespell.py
+++ b/codespell_lib/_codespell.py
@@ -994,14 +994,14 @@ def parse_file(
                     print_context(lines, i, context)
                 if filename != "-":
                     print(
-                        f"{cfilename}:{cline}: {cwrongword} "
+                        f"{cfilename}:{cline}:x {cwrongword} "
                         f"==> {crightword}{creason}"
                     )
                 elif options.stdin_single_line:
                     print(f"{cline}: {cwrongword} ==> {crightword}{creason}")
                 else:
                     print(
-                        f"{cline}: {line.strip()}\n\t{cwrongword} "
+                        f"{cline}: {line.strip()}x\n\t{cwrongword} "
                         f"==> {crightword}{creason}"
                     )

there was no test failure.

My suggestion is:

Move run_codespell, run_codespell_stdin, test_command and test_stdin from test_basic.py to test_cli.py
Add a new expect parameter to check that the output matches the expected value

What do you think?

perillo · 2024-01-11T10:35:25Z

Any news about this proposal?

DimitriPapadopoulos added enhancement awaiting review labels Jan 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: improve diagnostic formatting #3258

RFC: improve diagnostic formatting #3258

perillo commented Dec 20, 2023 •

edited

Loading

DimitriPapadopoulos commented Dec 20, 2023

perillo commented Dec 20, 2023 •

edited

Loading

perillo commented Dec 27, 2023

perillo commented Jan 11, 2024

RFC: improve diagnostic formatting #3258

RFC: improve diagnostic formatting #3258

Comments

perillo commented Dec 20, 2023 • edited Loading

--format flag

plain (default)

diagnostic / caret-diagnostics / develop/ ?

TODO

DimitriPapadopoulos commented Dec 20, 2023

perillo commented Dec 20, 2023 • edited Loading

perillo commented Dec 27, 2023

perillo commented Jan 11, 2024

perillo commented Dec 20, 2023 •

edited

Loading

perillo commented Dec 20, 2023 •

edited

Loading