Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pre-commit config so formatR can be used as a remote hook #101

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

Serene-Arc
Copy link

Implements #97

Saw that there was an open issue so I implemented it myself. There's a section in the README for the options that can be given to the hook, but it's very similar to those that are given to the tidy_source() function. There's a wrapper script that the hook calls. I've tested it an it works.

If there's any comments or suggestions, happy to hear and implement them.

Copy link
Owner

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Could you leave out the irrelevant renv stuff in this PR?

@lorenzwalthert
Copy link

lorenzwalthert commented Apr 11, 2024

renv is required for language: r in the .pre-commit-config.yaml. You can use the system language if you want to rely on the global R library.

@yihui
Copy link
Owner

yihui commented Apr 11, 2024

@lorenzwalthert Thanks for the explanation! I don't really understand why renv is required, but I don't want to enable it for this repo. The formatR package is fairly stable and has no other dependencies, so it's probably not worth using a sophisticated tool like renv to manage it.

Perhaps the better place to host this feature is your precommit package?

Actually I know nothing about the python package pre-commit (as I said in #97). Most of time I only use R. If I were to add a git pre-commit hook, I'd probably just do:

Rscript -e 'formatR::tidy_dir()'

in .git/hooks/pre-commit. I understand that the python package could be useful to multilingual people, though. I guess the above one-liner could be added as a system hook (https://pre-commit.com/#system).

@lorenzwalthert
Copy link

lorenzwalthert commented Apr 11, 2024

I completely understand you don’t want to add renv as a dependency (although this PR only adds infrastructure to use your package ins pre-commit compatible way for isolated dependency management with renv, it’s not a dependency that will be installed when the package is installed from CRAN agora example). The goal of the pre-commit python framework is among others to avoid copying scrips into .git/hooks/, which is unversioned.

My above suggestion (as you grasped correctly) was to use the systems language instead of the r language as a hooks language. This one line change in the .pre-commit-config.yaml file would make all renv elements of this PR redundant. It’s similar to your suggestion of running tidy_dir(), except that it’s possible to specify the command line arguments for the new script in the pre-commit config file, running to hook only when R files are changed etc.

@Serene-Arc
Copy link
Author

Hi @yihui, @lorenzwalthert is correct. The renv stuff only changes the pre-commit process, since pre-commit automatically looks for and installs the renv files so that the hook is in it's own environment. It's pretty essential.

The problem with doing it as a system library is that it will be dependent on the installed R packages and library, which can change and pollute the process. The reason that pre-commit insists on renv is so that there is a self-contained environment that is just for the hook, and which won't be affected by any outside influences. A system script for this kind of defeats the purpose that pre-commit fills. As mentioned above, this won't change formatR in any way other than enabling pre-commit. It won't be used for managing your tool at all.

I don't think calling a system script is the way to go for this. Also pre-commit isn't so much a python package as a tool written in python; that it's python is pretty irrelevant to the usage.

If you're not comfortable with the addition of the PR to your repository, then perhaps @lorenzwalthert will want to integrate it, or I can put it in its own repo. But I don't think the system script is a good idea and renv is required unfortunately for the R integration with pre-commit.

@yihui
Copy link
Owner

yihui commented Apr 12, 2024

Okay, thanks a lot for the explanation! Honestly, I tend not to merge a PR that adds an R script of 1200 lines of code from renv. If this thing could be hosted elsewhere, I'll be more than happy add a link to formatR's README so people who are interested in pre-commit can use it.

@lorenzwalthert
Copy link

lorenzwalthert commented Apr 12, 2024

I don't think calling a system script is the way to go for this.

I am not sure you understand that from a user point of view a system hook behaves exactly the same way except that the global r library will be used instead of a hook specific one. the user does not need to call any script.

My experience with {precommit} shows that a dependency isolation is almost an overkill for R and even creates problems for hooks that rely on all dependencies of package or the source in the repo where the hook is used. As CRAN ensures compatibility between all packages at any time, unlike python and other tools, version incompatibility with R is not an issue . So I strongly advise to just change that line in the confit file and drop the renv stuff. I won’t support formatR in {precommit}. If you really want the renv isolation, just create your own repo with the relevant files. There is no rule that the hook repo must contain the source of a package that is used in a hook. And it’s not the case for any hook in {precommit}. Just capture the dependencies with renv and expose the hooks work the .precommit-hooks.yaml (not hooks.yaml!) and that’s it.

@Serene-Arc
Copy link
Author

I do understand that there is no difference to the user, but most hooks are self-contained and don't require any changes to the host operating system, which is what pre-commit is designed to facilitate. With such a script, it would require that the user install formatR, docopt, pacman, and any other packages that are required, which may cause problems and inconsistencies.

I can drop it but I just want to be clear that I can't guarantee that it'll always work then, or not cause problems for some users, however rare that might be.

@Serene-Arc
Copy link
Author

Changes have been made, it is now a system script that runs on my machine fine, but I am on Linux. I don't know if there's a /usr/bin/env function on Windows so not sure what it'll do there.

@lorenzwalthert
Copy link

lorenzwalthert commented Apr 14, 2024

Ok, I won't comment any further on this, since I have actually nothing to do with this repo. Hope my comments helped to implement a good solution. Ping me if you have specific questions.

@Serene-Arc
Copy link
Author

@yihui I have fixed everything suggested.

Copy link
Owner

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still hesitating on whether to introduce the command-line arguments, because all these options could be set in options() in (per-project or global) .Rprofile: https://yihui.org/formatr/#8-global-options Introducing command-line arguments may increase the learning cost to formatR users, and will also introduce the dependency on docopt. If users are happy with options() for configuring formatR::tidy_source(), I think the implementation could be significantly simplified to just one line of code:

formatR::tidy_file(commandArgs(TRUE)) 

@Serene-Arc What do you think?

.Rbuildignore Outdated Show resolved Hide resolved
scripts/pre-commit-formatR.R Outdated Show resolved Hide resolved
@Serene-Arc
Copy link
Author

No, I don't think that removing the command line parsing should be done. This would be moving too far from the point of pre-commit. Arguments to configure the hook should be in the hook configuration file, which requires that the script be able to parse them. The script absolutely must be able to parse them, it is expected that it will be able to by users for pre-commit.

@TimTaylor
Copy link

TimTaylor commented Apr 16, 2024

FWIW (and appreciate these are my unsolicited thoughts). I personally think this should be created in a separate repository and, should @yihui wish, it could be linked to in the README. AFAICT - there is no reason it needs to be within the formatR source repository itself. Given recent xz events, lLimiting unnecessary additions to popular packages seems prudent (even within the R ecosystem).

@Serene-Arc
Copy link
Author

The xz affair is not really applicable, and the lesson from that certainly shouldn't be that popular tools have as little code as possible. Many, including myself, who use pre-commit expect that the hook for a tool comes from the tool's source. This is in itself a security concern that leans in the opposite direction: using a hook means using the tool, and if they come from the same source, if the user trusts the tool then they can implicitly trust the hook too. Separating them means more security concerns from an end user's perspective, not less, as they need to vet both repos.

The xz takeaway should be about supporting devs and security audits on popular repos. Given that these changes won't affect the R package in any way, it isn't really a problem imo.

That being said, it's pretty clear to me that @yihui isn't totally enthusiastic on the idea, at least as far as pre-commit goes. The feedback so far seems to be (and please @yihui correct me if I'm wrong) that they don't want to add a pre-commit hook, just maybe a git hook. All of the suggestions so far have basically been to remove all of the features that make pre-commit different from a regular, plain git hook. Maybe @yihui isn't familiar with the tool, or just doesn't want to add it here, and either way is fine. I don't want to add code that they're not 100% confident in, since this is their tool and a popular one at that.

I mean no offense with my tone, and I hope I'm being clear. If the additions in the PR as they stand now aren't to anyone's satisfaction, then I'm more than happy to create a separate repository. I came to this repo looking for the hook, and judging from the issue, at least one other person did too, so I submitted it here. But a separate repository isn't a huge burden and it would allow the hook to work as expected according to the docs.

I would ask that a link be added to the README, though of course @yihui is free to say no. As I said above, the home of the tool is generally where people look first for any pre-commit hooks, so a note on where to find it might be helpful.

@TimTaylor
Copy link

@Serene-Arc - Have edited as referencing xz was a little cliché and unnecessary. My point remains:

Limiting unnecessary additions to popular packages seems prudent

"Unnecessary" is clearly subjective but, from a personal perspective, I prefer a separation between package and development functionality. Of course a maintainer may want to push a framework (e.g. pre-commit) but I felt this was not the case here and wanted to ensure there was an additional 👎 voice present.

@yihui - Apologies if I've created unwanted noise on this PR. Will leave the discussion now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants