Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: Support configuration of multiple tools for jj fix #3801

Closed
hooper opened this issue Jun 1, 2024 · 4 comments
Closed

FR: Support configuration of multiple tools for jj fix #3801

hooper opened this issue Jun 1, 2024 · 4 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@hooper
Copy link
Contributor

hooper commented Jun 1, 2024

Problem:

As of #3698, jj fix can only be configured to run different code formatters on different file types if a wrapper script is used to make such decisions. That is an unreasonable requirement for anyone who didn't already have such a wrapper script, and it means that jj fix is unable to make any useful decisions about which files even need to be processed at all. Even users of a wrapper script might want to avoid processing some files.

Solution:

Support a repeatable configuration syntax for defining the properties of each tool:

  • The name of the tool for use in progress bars, error messages, etc.
  • The name or location of the executable.
  • The file patterns the tool should affect.
  • The arguments that should be passed to the tool, including:
    • Arguments that are always needed.
    • Arguments that communicate the name of the file to the tool (since the file content is given via stdin).
    • Arguments that communicate the line ranges that the tool should focus its effects on.
  • Some notion of whether the tool should be applied before or after any other tool that applies to any of the same paths.

As @PhilipMetzger mentioned, TOML tables seem to be a good approach:
#3698 (comment)

Alternatives:

You could imagine configuring some other source of information, from which jj fix would fetch the tool properties. I don't think that would be useful, although we could look around to see if there is some existing standard for declaring this kind of thing.

You could imagine deferring formatting to something like LSP, but that would require more setup, would be less generic, would encounter more complicated failure modes, and would share some of the limitations of wrapper scripts as described above:
https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_formatting

Context:

Prior art: https://www.mercurial-scm.org/doc/hg.1.html#fix

I specifically want to avoid re-implementing the :metadata subconfig from hg fix, because it's weird and probably unnecessary. I say this as the person who implemented it and oversaw its use at Google; I don't know if others have found it useful.

@arxanas
Copy link
Contributor

arxanas commented Jun 1, 2024

jj fix can only be configured to run different code formatters on different file types if a wrapper script is used to make such decisions

If one is not willing to implement a wrapper script, is it really a problem to just run multiple serial jj fix invocations with the different tools?

I don't know if the listed "solution" has a lot to do with the listed "problem", anyways. The listed items largely seem like general configuration options.

@hooper
Copy link
Contributor Author

hooper commented Jun 10, 2024

My expectation is that users will generally run jj fix without arguments, relying on a configuration of multiple tools that they rarely interact with, and that multiple invocations for different tools would be cumbersome. That's mostly based on good feedback about hg fix working that way.

I think the key word in the solution was "repeatable". I described the main configuration options just to make sure those ideas are documented. They're generally borrowed from hg fix, but not set in stone. The "affected file pattern" is the one most related to the idea of having multiple instances in the configuration. Imagine a config with several formatters relevant to a repo:

  • clang-format for **.{h,c,cpp}
  • mdformat for **.md
  • keep-sorted for ** (the file sets don't need to be disjoint)
  • etc.

@hooper hooper self-assigned this Jun 18, 2024
@hooper hooper added the good first issue Good for newcomers label Jun 18, 2024
@hooper
Copy link
Contributor Author

hooper commented Jun 18, 2024

Here's a concrete proposal for the configuration syntax.

A simple example for a single code formatter:

[[fix.clang-format]]
command = ["/usr/bin/clang-format", "-", "--assume-filename=$path"]
patterns = ["**.cc", "**.h"]

This would result in argv like the following, which would vary per ToolInput due to the $path substitution:

/usr/bin/clang-format - --assume-filename=src/lib/foo.cc

The patterns will be handled by parse_file_patterns() to determine what we do with each element of unique_tool_inputs during fix_file_ids().

Multiple tools can be defined by repeating the section with unique identifiers, and their priority is undefined. Until we implement the priority option, we can operate under the assumption that the tools are run in the order they are specified.

[[fix.cpp]]
command = ["/usr/bin/clang-format", "-", "--assume-filename=$path"]
patterns = ["**.cc", "**.h"]

[[fix.python]]
command = ["/usr/bin/black", "--stdin-filename=$path"]
patterns = ["**.py"]

We would like to implement that basic syntax this week if there are no objections. Some other functionality can be added incrementally.

When #3802 is implemented, we will need to support communicating the boundaries of the diff hunks using repeated flags. We see two main approaches. The first one puts the repeated flags in the same list with the other arguments, and implicitly repeats them at the same position in the argument list. The presence of substitutions for the line range variables determines which arguments are handled this way. For example:

[[fix.clang-format]]
command = ["/usr/bin/clang-format", "-", "--lines=$first:$last", "--assume-filename=$path"]
patterns = ["**.cc", "**.h"]

This would result in an argv like the following:

/usr/bin/clang-format - \
--lines=12:34 --lines=56:78 \
--assume-filename=src/lib/foo.cc

Another approach, which is used by Mercurial, is to list the line range flags separately so that the difference in how they are handled is more explicit:

[[fix.clang-format]]
command = ["/usr/bin/clang-format", "-", "--assume-filename=$path"]
linerange = "--lines=$first:$last"
patterns = ["**.cc", "**.h"]

Without any extra features to specify where the line range flags go, they would just be appended to the end of the command list:

/usr/bin/clang-format - \
--assume-filename=src/lib/foo.cc \
--lines=12:34 --lines=56:78

Including the other options alluded to in #3801 (comment), a more complete configuration could look like this:

[[fix.clang-format]]
command = ["/usr/bin/clang-format", "-", "--sort-includes",
           "--lines=$first:$last", "--assume-filename=$path"]
patterns = ["**.cc", "**.h"]
priority = 123     # Run this tool after tools with lower priority numbers (on the same file)
skipclean = false  # Run this tool even if there are no line ranges, for --sort-includes
enabled = false    # Meant to be used if a tool needs to be disabled temporarily

@hooper
Copy link
Contributor Author

hooper commented Oct 10, 2024

This happened a while back in #4079.

@hooper hooper closed this as completed Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants