Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update pre-commit, remove vale #1282

Merged
merged 9 commits into from
May 26, 2022
Merged

Update pre-commit, remove vale #1282

merged 9 commits into from
May 26, 2022

Conversation

iameskild
Copy link
Member

@iameskild iameskild commented May 14, 2022

Think of this PR as more of a discussion.

Theme (or goals)

  • move linting and formatting as close to the developer as possible (pre-commit) for quicker feedback that is NOT reliant on CI.
  • on the road to a more consistent and manageable CI

Description and motivations for these changes:

  • Completely remove Vale and replace it with codespell pre-commit hook.
    • Although Vale is a far more sophisticated of spell checker than codespell, the benefits of using codespell over Vale include:
      • codespell has a very low false positive rate and only makes suggestions/fixes for common words, skipping technical terms. This means we don't have to continuously maintain a dictionary like we do for Vale.
      • codespell can make the fixes automatically (using pre-commit, with codespell --write enabled).
      • codespell can run in files besides just .md, see setup.cfc::codespell for files currently excluded.
      • And perhaps the biggest reason for this change is to move spell checking closer to the user (not CI reliant).
        • If Vale had a pre-commit hook, this might not be necessary but Vale doesn't. And forcing developers/writers to download, config and run vale from the command line seems like it adds a lot friction for new contributors.
  • Clean up and update pre-commit.
    • Add codespell pre-commit hook, as mentioned above.
    • Add terraform_fmt pre-commit hook to format Terraform files according the Terraform style guidelines.
    • Add isort pre-commit hook to cleanly sort python imports.
    • Add check-json pre-commit hook to check json files ;)
    • There are also a few minor updates to the existing pre-commit hooks.
      • The most noticeable change is moving as many of the configuration settings into either pyproject.toml, setup.cfg or within the .pre-commit-config.yaml itself. Files as such .hadolint.yaml and .mdformat.toml can subsequently be removed.
  • Completely remove docs.yaml (Documentation Linter) Github Action workflow.
    • If we remove Vale, and if are fine removing the link-checker for the time being, then we can completely remove this workflow.
      • I personally find the current link-checker to be exceptionally flaky. It feels like it reports false positives on every other one of my doc related PRs. As a result, I've spent more time than I'd like running down upstream bugs, making minor changes to the link-checker GHA or simply ignoring the failed CI.
  • Create run-pre-commit.yaml which hands all linting and formatting.
    • This workflows simply runs pre-commit run --all-files to ensure all of our files are always properly linted / formatted. Triggered for every PR and for every push to main.

These proposed changes are ultimately the result of the frustration I've felt while writing or updating docs, consistently getting errors from Vale or the link-checker or both. Like I mentioned above, I think Vale is definitely a more comprehensive spell checker (and style checker) but the added benefits don't outweigh the friction it causes the development workflow especially compared to a baseline using codespell.

If accepted, we can close #1007 #972.

I would love to hear others have to say about this propose change :)

cc. @costrouc @tonyfast @magsol @viniciusdc @HarshCasper @aktech @kcpevey @trallard

@iameskild
Copy link
Member Author

iameskild commented May 14, 2022

I intentionally committed with --no-verify so that we can focus on my proposed changes and not get lost in all the noise created from the pre-commit hooks themselves.

Have a look at the GHA logs for run-pre-commit.yaml to see which files will be affected by the pre-commit hooks.

@viniciusdc
Copy link
Contributor

viniciusdc commented May 14, 2022

Amazing work @iameskild, just finished looking through the changes, and still need to know the end result of the action. But I liked this organization!!!

One thing that I suggest is to add the following if condition to the pre-commit job:
if: github.event.pull_request.merged == false
this will skip the workflow to run everytime when a PR is merged (to avoid running twice)

@viniciusdc
Copy link
Contributor

Also, not sure about what the others think of this but we could add a message to the PR automatically in case the pre-commit action fails, explaining to the user (if it's a new user) about pre-commit and how to install it, something like this:

...
    steps:
      - uses: mshick/add-pr-comment@v1
        with:
          message: |
            First of all, thank you for taking the time to contribute! :tada: 
            It seems that some tests are falling, please refer to the pre-commit logs for more details or check if pre-commit is 
            correctly installed. If you're a first-time contributor, you can follow this steps to automatically set up pre-commit and 
            avoid these errors.
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: 'github-actions[bot]' # The user.login for temporary GitHub tokens
          allow-repeats: false # This will avoid duplicating comments

We can add a link to some step-by-step installation from pre-commit itself, or add extra information in the comment.

.github/workflows/docs.yml Show resolved Hide resolved
.github/workflows/run-pre-commit.yaml Outdated Show resolved Hide resolved
.github/workflows/run-pre-commit.yaml Outdated Show resolved Hide resolved
@iameskild
Copy link
Member Author

One thing that I suggest is to add the following if condition to the pre-commit job: if: github.event.pull_request.merged == false this will skip the workflow to run everytime when a PR is merged (to avoid running twice)

I like this!

@viniciusdc
Copy link
Contributor

@iameskild once you get a preview of these changes let us know. The PR LGTM!!

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@iameskild
Copy link
Member Author

iameskild commented May 25, 2022

@viniciusdc @HarshCasper @costrouc this PR is ready to be merged whenever you feel ready :)

@iameskild
Copy link
Member Author

I need to understand why the Kubernetes tests are failing before merging...

@viniciusdc
Copy link
Contributor

@iameskild @HarshCasper the pre-commit job seems to be failing now, would you mind having a look? Also, how this will work again? I thought the pre-commit stuff would be executed by the user only...

@iameskild
Copy link
Member Author

@viniciusdc some of the recent PRs had "unformatted" files which caused the pre-commit CI to fail. At this point, we will need to pull down the changes, make pre-commit happy and push those changes up. Is this workflow too complex?

I will need to investigate why the jupyterlab image failed to build though.

@iameskild iameskild merged commit d0cc266 into main May 26, 2022
@iameskild iameskild deleted the run_pre_commit branch May 26, 2022 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants