add link_checker settings for external_level and internal_level #1848

Merged: 16 commits, May 11, 2022

Contributor

@mwcz commented May 3, 2022

Based on this convo.

This PR adds new link checker configuration settings: internal_level and external_level, which can both be set to "warn" or "error". The current behavior of zola is to always error on broken internal links and on external links with unparsable URLs, so these new settings allow those to be treated as warnings instead. In other words, it allows zola to build sites even though they have broken links. The current behavior is still the default.
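
As a rough illustration of the two settings described above, here is a minimal sketch of a level type that accepts only "warn" or "error" and defaults to "error". The names `Level` and `LinkCheckerLevels` are hypothetical and are not taken from the PR's diff; the real implementation may differ (for example, it may deserialize from the config file with a serialization crate rather than using `FromStr`).

```rust
use std::str::FromStr;

/// Hypothetical severity for link checker findings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Warn,
    Error,
}

impl Default for Level {
    /// The pre-existing behavior (fail the build) stays the default.
    fn default() -> Self {
        Level::Error
    }
}

impl FromStr for Level {
    type Err = String;

    /// Accept only the two values named in the PR description.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "warn" => Ok(Level::Warn),
            "error" => Ok(Level::Error),
            other => Err(format!("unknown level `{}`, expected \"warn\" or \"error\"", other)),
        }
    }
}

/// Hypothetical holder for the two new settings.
#[derive(Debug, Default, PartialEq, Eq)]
struct LinkCheckerLevels {
    internal_level: Level,
    external_level: Level,
}
```

With this shape, a config that omits both settings keeps the old behavior, and a site owner opts in to warnings per link type.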

Sanity check:

  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Code changes

  • Are you doing the PR on the next branch?

If the change is a new feature or adding to/changing an existing one:

  • Have you created/updated the relevant documentation page(s)?

  },
- Err(err) => bail!("could not parse domain `{}` from link: `{}`", link, err),
+ Err(err) => Err(format!("could not parse domain `{}` from link: `{}`", link, err)),
Collaborator

Why that change? It should be equivalent to bail!

Contributor Author

I made that change so it doesn't bail on the first unparseable URL. It tries to parse everything, and then if there were any failures, it prints out all the unparseable URLs and then bails.

Contributor Author

I'm happy to revert that change, though. I can see how bailing on the first error is more consistent and leads to less complexity.

Collaborator

But I believe get_link_domain behaves exactly the same before and after your changes, no? You're returning the error either way in get_link_domain.

It's fine to report all of them though

Contributor Author

Whoops, for some reason I was under the impression that bail! caused an immediate panic. I'll change it back.
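
To make the equivalence discussed in this thread concrete, here is a minimal sketch. It assumes that the bail! macro in use behaves like the one in common Rust error crates, i.e. it expands to an early `return Err(...)` rather than a panic. The local `bail!` macro, `parse_domain`, and the other function names are hypothetical stand-ins, and the URL handling is deliberately naive.

```rust
/// Simplified stand-in for an error crate's `bail!`: an early return, not a panic.
macro_rules! bail {
    ($($arg:tt)*) => { return Err(format!($($arg)*)) };
}

/// Naive, hypothetical domain extraction; real code would use a URL parser.
fn parse_domain(link: &str) -> Option<String> {
    let rest = link
        .strip_prefix("https://")
        .or_else(|| link.strip_prefix("http://"))?;
    let host = rest.split('/').next().unwrap_or("");
    if host.is_empty() { None } else { Some(host.to_string()) }
}

/// Version using `bail!` in the error arm.
fn domain_with_bail(link: &str) -> Result<String, String> {
    match parse_domain(link) {
        Some(domain) => Ok(domain),
        None => bail!("could not parse domain from link: `{}`", link),
    }
}

/// Version returning `Err(format!(...))` directly; same result for the caller.
fn domain_with_err(link: &str) -> Result<String, String> {
    match parse_domain(link) {
        Some(domain) => Ok(domain),
        None => Err(format!("could not parse domain from link: `{}`", link)),
    }
}

/// Reporting every failure is a decision made by the caller, not by the parser.
fn collect_unparseable(links: &[&str]) -> Vec<String> {
    links
        .iter()
        .filter_map(|link| domain_with_err(link).err())
        .collect()
}
```

Because both variants hand an `Err` back to the caller, whether the build stops at the first bad URL or reports all of them depends only on how the caller loops over the results.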

Collaborator

> Sorry for writing so much, this PR has gone down several different paths and I'm trying to get my bearings.

No worries!
I was thinking about it today as well and my thoughts might be a bit out of scope for the PR.

What if the link checker did not use Errors but instead used a Vec<String> for the errors?
This way we can build the error message ourselves nicely, something like:

Failed to build the site
Error: Found 3 broken anchor links
  1. The anchor in the link `@/post.md#foo` in content/post.md does not exist.
  2. The anchor in the link `@/post.md#bar` in content/post.md does not exist.
  3. The anchor in the link `@/post.md#baz` in content/post.md does not exist.

We can/should apply the same kind of formatting to external link errors, obviously.
What do you think?
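
A minimal sketch of how a Vec<String> of messages could be turned into the numbered report proposed above. The function name `format_report` and its parameters are hypothetical; only the output layout follows the example in this comment.

```rust
/// Build a numbered report from collected messages.
/// `kind` is the label after "broken", e.g. "anchor links".
fn format_report(kind: &str, messages: &[String]) -> String {
    let mut out = format!("Error: Found {} broken {}\n", messages.len(), kind);
    for (i, message) in messages.iter().enumerate() {
        // Entries are numbered from 1 and indented by two spaces.
        out.push_str(&format!("  {}. {}\n", i + 1, message));
    }
    out
}
```

Keeping the messages as plain strings until this point means the header count and the numbering are produced in one place.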

Contributor Author

@mwcz, May 6, 2022

I can imagine check_external_links and check_internal_links_with_anchors emitting a Vec<String> which contains the text for all the errors or warnings that were encountered. Those functions are called from site/src/lib.rs, so I suppose that's where the logging could happen. Seems like a clean solution; the logging code would only need to exist in site/src/lib.rs instead of being duplicated at every early-return site. It would potentially change the ordering of error/warning logs relative to regular printlns since the errors/warnings would be delayed until the function returns, but the link checker hardly has any printlns so I don't see a problem there. Seems good to me, so I guess I'll give it a try next week.
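
The arrangement described here could look roughly like the sketch below: the checker only collects messages, and a single caller decides whether they are warnings or a build failure. All names (`Level`, `check_anchors`, `handle_messages`) are hypothetical, and the input is simplified to pairs of a link and a flag saying whether its anchor exists.

```rust
/// Hypothetical severity chosen in the site config.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Warn,
    Error,
}

/// Hypothetical checker: returns one message per broken anchor, never fails early.
fn check_anchors(links: &[(&str, bool)]) -> Vec<String> {
    links
        .iter()
        .filter(|(_, exists)| !*exists)
        .map(|(link, _)| format!("The anchor in the link `{}` does not exist.", link))
        .collect()
}

/// Single place where messages are logged or turned into a build failure.
fn handle_messages(level: Level, messages: &[String]) -> Result<(), String> {
    if messages.is_empty() {
        return Ok(());
    }
    match level {
        Level::Warn => {
            // Warnings are printed, and the build continues.
            for message in messages {
                eprintln!("Warning: {}", message);
            }
            Ok(())
        }
        Level::Error => Err(format!("Found {} broken links", messages.len())),
    }
}
```

Since logging happens only after the checker returns, its output appears after anything the checker printed itself, which matches the ordering caveat mentioned above.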

Contributor Author

The internal link checking that happens in markdown.rs would need a similar treatment I suppose, so the code would be more-or-less duplicated, but only in two places. 🤔

Collaborator

Yes the markdown one is unfortunate... maybe keep it as is for now and see if it feels odd when trying it out.

Contributor Author

@mwcz, May 10, 2022

It seems pretty good. The markdown.rs link checking errors/warnings aren't formatted exactly the same, but it doesn't feel odd when using zola. Here's a demo.

https://asciinema.org/a/Ructp25lMiFB1ztSatUr74ASn

@mwcz mwcz force-pushed the link-checker-levels-with-colorization branch from 74c7de6 to defbd54 Compare May 10, 2022 17:21
@mwcz mwcz force-pushed the link-checker-levels-with-colorization branch from 91a6f64 to 4a2857d Compare May 10, 2022 20:56
@mwcz mwcz marked this pull request as ready for review May 10, 2022 20:57
Collaborator

@Keats left a comment

Sweet thanks

@Keats Keats merged commit 6240ed5 into getzola:next May 11, 2022
Collaborator

Keats commented May 11, 2022

FYI I just pushed a commit to make the external link checking a bit more DRY: 1a3b783

@mwcz mwcz deleted the link-checker-levels-with-colorization branch May 11, 2022 21:48