Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement usethis tool pre-commit #22

Closed
nathanjmcdougall opened this issue Oct 5, 2024 · 3 comments · Fixed by #25
Closed

Implement usethis tool pre-commit #22

nathanjmcdougall opened this issue Oct 5, 2024 · 3 comments · Fixed by #25
Assignees
Labels
enhancement New feature or request

Comments

@nathanjmcdougall
Copy link
Owner

nathanjmcdougall commented Oct 5, 2024

Motivation
pre-commit is an essential framework. It will interact with many other tools so it is better to implement usethis tool pre-commit sooner rather than later.

Summary of desired feature
usethis tool pre-commit should add, install and configure pre-commit.
By default, it will add a single, simple hook, however also any other tools which have been added using usethis should be detected and configured as hooks too; at the moment it is just deptry.

Design
We can add pre-commit as a dev dependency using uv, with a message.

Every tool provided by usethis will potentially write to configuration files. Usually, this will be pyproject.toml but for pre-commit hooks, it will be .pre-commit-config.yaml.

Any time we come to add to a .pre-commit-config.yaml file, if it does not already exist, then we can start with a simple file with one example hook and display a message:

repos:
  - repo: https://github.com/abravalheri/validate-pyproject
    rev: "v0.19"
    hooks:
      - id: validate-pyproject
        additional_dependencies: ["validate-pyproject-schema-store[all]"]

Then we can run uv run pre-commit autoupdate in a subprocess to update the rev tag to the latest verison. If offline, this won't work so we'll just leave the hard-coded version (e.g. as v0.19), although for now let's not worry about handling the case of being offline. There will be a chore associated with bumping the rev in each release of usethis - we can ensure this with a version check test.

We should create a Protocol class for each tool; implementations of the protocol should provide a method which gives the contents of the relevant config files. In general, this might depend on other things that are installed, or existing configuration.
Another part of the protocol should be a method to determine whether the tool is being used in a given repo. This might involve some heuristics but to begin with we will just check whether the tool is a dev dependency (for the pre-commit and deptry tools, which are the only ones so far). Here is what I expect the classes would roughly look like:

class Tool(Protocol):
    def get_pre_commit_repo_config(self) -> PreCommitRepoConfig:
        raise NotImplementedError
        
    def add_pre_commit_repo_config(self) -> None:
        #TODO add the pre-commit config, creating the config file if necessary, and deleting any old config first.
        ...
    
    @abstractmethod
    def is_used(self) -> bool: ...

class DeptryTool(Tool):
    def get_pre_commit_repo_config(self) -> PreCommitRepoConfig:
        # TODO deptry pre-commit config
        ...
       
    def is_used(self) -> bool:
        # TODO for now just check whether deptry is a dev dependency in the pyproject.toml file
        ...

class PreCommitTool(Tool):
    def is_used(self) -> bool:
        # TODO for now just check whether pre-commit is a dev dependency in the pyproject.toml file
        ...

We should loop over tools, Following the protocol, we can use the method to determine whether get_pre_commit_repo_config raises NotImplementedError: if so, we don't need to add pre-commit hooks, otherwise, we do. This should be done with a function which takes a PreCommitRepoConfig and adds it. This function can call another function which creates the .pre-commit-config.yaml file, if it does not already exist, as described above.

To add the new deptry hook, we need to remove any hook that already has the ID deptry, and replace it with the new one, including a message. This ensures that any older version of usethis or any repo configured with a different methodology will still be compatible with the current version of usethis. There should be a function which removes a hook with a given ID. Again, this can be called the hook-adding function.

We will use ruamel.yaml to preserve round-trip formatting, comments, etc. of the YAML file.

There are different ways to set up the deptry pre-commit hook, but currently, I believe the best way something like this:

  - repo: local
    hooks:
    - id: deptry
      name: deptry
      entry: uv run --frozen deptry src
      language: system
      always_run: true
      pass_filenames: false

There are a variety of reasons for this choice, explained later.

We should have a canonical sort order for all usethis-supported hooks to decide where to place the section. The main objective with the sort order is to ensure dependency relationships are satisfied. For example, valdiate-pyproject will check if the pyproject.toml is valid - if it isn't then some later tools might fail. It would be better to catch this earlier. A general principle is to move from the simpler hooks to the more complicated. Of course, this order might already be violated, or the config might include unrecognized repos - in any case, we aim to ensure the new tool is configured correctly, so it should be placed after the last of its precedents. This logic can be encoded in the adding function.

Once we have added the hooks, we will then want to try install pre-commit from a subprocess; if we're not in a git repo, this will fail, in which case, we should fail hard with a descriptive error message (eventually we might support usethis tool git under the hood and manage setup automatically instead of giving up). Otherwise give a message that pre-commit has been installed.

As a part of all these changes, we should also change the usethis tool deptry command - if using pre-commit (again, the tool protocol should provide a method to detect this) we should again add the pre-commit configuration for deptry.

One issue will be determining the directory in which the source code is found. I am of the opinion that src should be a default/pre-requisite for usethis. Alternatively, this could be some kind of global usethis configuration. For now, I think let's hard-code it and fail hard if src is not a directory, with a helpful message. There are potentially other directories such as tests and doc which would benefit from checking, but deptry does not yet support checking of folders where dev dependencies are allowed. This check should be enforced as a part of the usethis tool deptry command.

Describe alternatives you've considered

  • Rather than a Protocol class which is config-file oriented, we might try and create an abstraction which doesn't give pre-commit special first-class status and just models inter-tool interactions (possibly between more than 2 tools) and the objects would define the interaction effects (which might be saving a particular piece of config in a file, or perhaps something else). I still haven't ruled something like this out but the abstraction isn't clear to me yet, and I don't think it would be too hard to refactor later. Also, interactions between tools is unusual for the most part. The protocol approach seems simple and appropriate at this stage of development.
  • ruamel.yaml seems like the only option in terms of round-trip modification of YAML; pyyaml would not be powerful enough.
  • Rather than including the hook for validate-pyproject, we might include a different hook (or hooks). One option was maximum file size (but that might disrupt a user of Git LFS). Ultimately it's basically just a placeholder and can be changed later. It is very unlikely that someone wouldn't want this sort of valdiation to their pyproject.toml file. We take for granted that a pyproject.toml file is being used for uv add to work, so this is a reasonable choice.
  • Rather than running uv run pre-commit autoupdate to get the version of validate-pyproject, we might just rely on the hard-coded version. This would mean less complexity and maybe a performance improvement would mean that an outdated version of usethis would give an outdated version of validate-pyproject. On the balance, the user experience of getting the latest version is more important.

It's worth going into some detail as to why the deptry hook is set up the way it is. Configuring deptry with pre-commit is a little tricky since it needs to access the virtual environment to automatically determine the names of modules (used in import statements) exposed by each distribution package (i.e. installed with uv/pip/etc.). As such, a system hook is really the only option to get the necessary dependencies without explicitly enumerating them in the .pre-commit-config.yaml file. This is approach taken by the officially supported hook; as a system hook. However, that particular official hook doesn't really work in my experience when running while committing rather than manually from the CLI, since the system hook is not able to activate the virtual environment. Whereas using uv run will activate the venv. Also, with uv run, all the packages are all based on the lockfile, including deptry itself, which is useful since deptry otherwise there is a version syncing issue.

This is a more general issue: how to handle synchronization between dev dependencies' versions and the corresponding versions in the pre-commit-config.yaml file; also how to ensure that the syncing is not disrupted by pre-commit.ci's automatic update requests.
In the long run, I think the best approach is to use something like sync-pre-commit-lock but this is still waiting on uv support. deptry will not benefit from this approach. Since you can't exclude specific repos from the autoupdate command, we would need to use [pre-commit-update](https://gitlab.com/vojko.pribudic.foss/pre-commit-update) and this would be incompatible with using the autoupdate provided by pre-commit.ci, which is a shame. An alternative is just to remove the dev dependency and rely on pre-commit, which is fine for some tools (e.g. validate-pyproject) which are only ever really used in the context of pre-commit, but for things like ruff which have IDE integrations, etc. it's not an option. So the uv run local hook is a very attractive workaround in the interim until we work through the complexity of doing pre-commit version updates; not just for deptry but for other tools too.

It would be theoretically possible to use a python language hook and make uv a dependency, but then pre-commit.ci will try to run it by default, so we would have to introduce complexity to configure the ci section for support by pre-commit.ci, etc. Better to use a system hook and just assume uv is installed, which we are doing anyway.

We could fail hard if we're not in a git repo, but this probably means there needs to be complexity associated with checking this up-front - it's potentially easier to just add all the configuration and then rely on pre-commit itself to determine whether installation is successful (which should hopefully be equivalent to whether git is installed).

Testing Strategy

Just calling usethis tool pre-commit when there is a git repo and basic, valid pyproject.toml file, and no `.pre-commit-config.yaml`` file:

  • Test pre-commit is added as a dev dependency
  • Test the config file gets created
  • Test the config file contains the expected contents, including hard-coded version for validate-pyproject which will need bumping
  • Test that calling pre-commit run --all-files from a subprocess with invalid TOML in the pyproject.toml file fails.
  • Test that calling pre-commit run --all-files from a subprocess with valid TOML in the pyproject.toml file runs successfully
  • Test that trying to commit invalid TOML to the pyproject.toml file is rejected by the pre-commit hook
  • Test that trying to commit valid TOML to the pyproject.toml file is accepted by the pre-commit hook.

Test the case where the pre-commit-config.yaml file already exists (and git repo, and pyproject.toml), that there is no error raised.

When calling usethis tool pre-commit when there isn't a git repo (but is a pyproject.toml file), test the command fails.
When calling usethis tool pre-commit when there isn't a pyproject.toml file (but is a git repo), test the command fails.

Test we can call the CLI for usethis tool pre-commit from a subprocess.

Calling usethis tool deptry and then usethis tool pre-commit in succession (and vice versa; test cases should be the same):

  • Test that calling pre-commit run deptry --all-files from a subprocess succeeds.
  • Test that calling pre-commit run deptry --all-files from a subprocess with bad dependency relationships fails.

Test all the output messages are correct (potentially just within multiple of the tests above).

Steps

  1. Write an empty def pre_commit() -> None function.
  2. Test pre-commit is added as a dev dependency.
  3. Call uv add from a subprocess in the pre_commit function.
  4. Test a message regarding the new dev dependency.
  5. Add the message.
  6. Test the .pre-commit-config.yaml file exists.
  7. In the pre_commit function, add an empty .pre-commit-config.yaml file if it does not already exist.
  8. Test a message regarding the new config file.
  9. Add the message.
  10. Test that the .pre-commit-config.yaml file has the expected contents, including latest version of validate-pyproject.
  11. Modify the .pre-commit-config.yaml population logic so that it doesn't write empty contents but instead writes the expected contents, followed by pre-commit autoupdate from a subprocess.
  12. Add all the remaining tests and pass them.

Acceptance Criteria
Assuming uv and git are installed, it should be possible to call usethis tool pre-commit from the command-line in a new project, and then immediately call pre-commit run --all-files to run the pre-commit hooks for any tools managed by usethis, e.g. deptry.

@nathanjmcdougall nathanjmcdougall self-assigned this Oct 5, 2024
@nathanjmcdougall
Copy link
Owner Author

Rather than using pre-commit autoupdate from a subprocess (which is quite slow and I think involves cloning the repo), I am planning to write a function to use the GitHub API.

@nathanjmcdougall
Copy link
Owner Author

Not going to test the case of where the isn't a git repo, mostly since pre-commit will search recursively into parent directories to look for one. We can/should think about this at a later date. For now, we are just assuming there is a git repo already.

@nathanjmcdougall
Copy link
Owner Author

Also not going to bother testing the case where pyproject.toml doesn't exist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant