add link_checker settings for external_level and internal_level #1848

mwcz · 2022-05-03T19:41:41Z

This PR adds new link checker configuration settings: internal_level and external_level, which can both be set to "warn" or "error". The current behavior of zola is to always error on broken internal links and on external links with unparsable URLs, so these new settings allow those to be treated as warnings instead. In other words, it allows zola to build sites even though they have broken links. The current behavior is still the default.
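To make the two settings concrete, here is a minimal sketch of a level type that accepts "warn" or "error" and defaults to "error", matching the description above. The name `LinkCheckerLevel`, its variants, and the `FromStr` parsing are assumptions for illustration only; the PR's actual type and deserialization code are not shown in this conversation.

```rust
use std::str::FromStr;

/// Hypothetical level type behind `internal_level` and `external_level`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkCheckerLevel {
    /// Broken links abort the build (the existing behavior).
    Error,
    /// Broken links are reported, but the build continues.
    Warn,
}

impl Default for LinkCheckerLevel {
    /// The current behavior stays the default.
    fn default() -> Self {
        LinkCheckerLevel::Error
    }
}

impl FromStr for LinkCheckerLevel {
    type Err = String;

    /// Accepts exactly the two values named in the PR description.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "error" => Ok(LinkCheckerLevel::Error),
            "warn" => Ok(LinkCheckerLevel::Warn),
            other => Err(format!("unknown link checker level `{}`", other)),
        }
    }
}
```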

Sanity check:

Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Code changes

Are you doing the PR on the next branch?

If the change is a new feature or adding to/changing an existing one:

Have you created/updated the relevant documentation page(s)?

components/console/Cargo.toml

components/markdown/src/markdown.rs

src/messages.rs

components/config/src/config/link_checker.rs

Keats · 2022-05-04T08:09:43Z

components/site/src/link_checking.rs

        },
-        Err(err) => bail!("could not parse domain `{}` from link: `{}`", link, err),
+        Err(err) => Err(format!("could not parse domain `{}` from link: `{}`", link, err)),


Why that change? It should be equivalent to bail!

I made that change so it doesn't bail on the first unparseable URL. It tries to parse everything, and then if there were any failures, it prints out all the unparseable URLs and then bails.
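The "parse everything, then fail once" approach described in this comment could look roughly like the sketch below. Both functions are simplified stand-ins: this `get_link_domain` only splits on `://` instead of using a real URL parser, and `parse_all_domains` is a hypothetical helper name, not code from the PR.

```rust
/// Simplified, hypothetical domain extraction; real code would use a URL parser.
fn get_link_domain(link: &str) -> Result<String, String> {
    let rest = link
        .split_once("://")
        .map(|(_, rest)| rest)
        .ok_or_else(|| format!("could not parse domain from link: `{}`", link))?;
    // Everything up to the first `/` is treated as the host.
    let host = rest.split('/').next().unwrap_or("");
    if host.is_empty() {
        return Err(format!("could not parse domain from link: `{}`", link));
    }
    Ok(host.to_string())
}

/// Tries every link first, then fails once with all problems listed.
fn parse_all_domains(links: &[&str]) -> Result<Vec<String>, String> {
    let mut domains = Vec::new();
    let mut failures = Vec::new();
    for link in links {
        match get_link_domain(link) {
            Ok(domain) => domains.push(domain),
            Err(msg) => failures.push(msg),
        }
    }
    if failures.is_empty() {
        Ok(domains)
    } else {
        // One message per unparseable link, joined into a single error.
        Err(failures.join("\n"))
    }
}
```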

I'm happy to reverse that change though, I can see how bailing on the first error is more consistent and leads to less complexity.

But I believe get_link_domain behaves exactly the same before and after your changes, no? Either way, get_link_domain just returns the error.

It's fine to report all of them though

Whoops, for some reason I was under the impression that bail! caused an immediate panic. I'll change it back.
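For readers with the same impression: `bail!`-style macros (for example the one in the anyhow crate) expand to an early `return Err(...)`, not a panic. The sketch below uses a local stand-in macro with `String` errors so it compiles without dependencies; it is not the macro Zola itself uses, and both function names are made up for the comparison.

```rust
// Local stand-in for a `bail!` macro: it expands to an early `return Err(...)`.
macro_rules! bail {
    ($($arg:tt)*) => {
        return Err(format!($($arg)*))
    };
}

/// Uses the macro to leave the function early with an error.
fn with_bail(link: &str) -> Result<String, String> {
    if link.is_empty() {
        bail!("could not parse domain from link: `{}`", link);
    }
    Ok(link.to_string())
}

/// Writes out the same early return by hand.
fn with_plain_err(link: &str) -> Result<String, String> {
    if link.is_empty() {
        return Err(format!("could not parse domain from link: `{}`", link));
    }
    Ok(link.to_string())
}
```

Both functions hand the same `Err` value back to the caller, which is why the two versions of the line in the diff are equivalent when the expression sits in return position.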

Sorry for writing so much, this PR has gone down several different paths and I'm trying to get my bearings.

No worries!
I was thinking about it today as well and my thoughts might be a bit out of scope for the PR.

What if the link checker did not use Errors but instead collected the errors in a Vec<String>?
That way we can build a nice error message ourselves, something like:

Failed to build the site
Error: Found 3 broken anchor links
1. The anchor in the link `@/post.md#foo` in content/post.md does not exist.
2. The anchor in the link `@/post.md#bar` in content/post.md does not exist.
3. The anchor in the link `@/post.md#baz` in content/post.md does not exist.

We can/should obviously apply the same kind of formatting to external link errors.
What do you think?

I can imagine check_external_links and check_internal_links_with_anchors emitting a Vec<String> which contains the text for all the errors or warnings that were encountered. Those functions are called from site/src/lib.rs, so I suppose that's where the logging could happen. Seems like a clean solution; the logging code would only need to exist in site/src/lib.rs instead of being duplicated at every early-return site. It would potentially change the ordering of error/warning logs relative to regular printlns since the errors/warnings would be delayed until the function returns, but the link checker hardly has any printlns so I don't see a problem there. Seems good to me, so I guess I'll give it a try next week.
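The idea in this comment can be sketched as follows: the checkers hand back a `Vec<String>`, and a single caller turns it into one numbered report and decides whether that report is an error or a warning. Everything here (`Level`, `format_report`, `report`) is a hypothetical illustration of the design, not the code that was merged.

```rust
/// Hypothetical severity setting, standing in for the configured level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Error,
    Warn,
}

/// Builds one numbered report from all collected messages.
fn format_report(summary: &str, messages: &[String]) -> String {
    let mut out = format!("Found {} {}", messages.len(), summary);
    for (i, msg) in messages.iter().enumerate() {
        out.push_str(&format!("\n{}. {}", i + 1, msg));
    }
    out
}

/// Caller-side handling: `Err(report)` at error level, `Ok(Some(report))`
/// as a warning to log, and `Ok(None)` when nothing was found.
fn report(level: Level, summary: &str, messages: Vec<String>) -> Result<Option<String>, String> {
    if messages.is_empty() {
        return Ok(None);
    }
    let text = format_report(summary, &messages);
    match level {
        Level::Error => Err(text),
        Level::Warn => Ok(Some(text)),
    }
}
```

Keeping the formatting and the level decision in one place means the checker functions only collect strings, so the logging does not need to be repeated at every early return.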

The internal link checking that happens in markdown.rs would need a similar treatment I suppose, so the code would be more-or-less duplicated, but only in two places. 🤔

Yes the markdown one is unfortunate... maybe keep it as is for now and see if it feels odd when trying it out.

It seems pretty good. The markdown.rs link checking errors/warnings aren't formatted exactly the same, but it doesn't feel odd when using zola. Here's a demo.

https://asciinema.org/a/Ructp25lMiFB1ztSatUr74ASn

components/site/src/link_checking.rs

test_site/config.toml

components/site/src/link_checking.rs

components/markdown/src/markdown.rs

components/config/src/config/link_checker.rs

Keats

Sweet, thanks!

Keats · 2022-05-11T20:32:46Z

FYI I just pushed a commit to make the external link checking a bit more DRY: 1a3b783

Keats requested changes May 4, 2022

Keats reviewed May 5, 2022

mwcz force-pushed the link-checker-levels-with-colorization branch from 74c7de6 to defbd54 Compare May 10, 2022 17:21

mwcz added 15 commits May 10, 2022 16:56

add external_level and internal_level

e34848c

remove unnecessary debug derive on LinkDef

5683c00

clarify doc comment about link check levels

4b4ef2b

simplify link checker logging

ca0a8c0

add missing warn prefix

acba668

simplify link level logging, remove "Level" from linklevel variants

be81373

remove link level config from test site

328b6fe

switch back to using bail! from get_link_domain

526ed89

move console's deps to libs

fb4d41e

remove unnecessary reference

35e818a

calling console::error/warn directly

2fa6754

emit one error, or one warning, per link checker run

8b56180

various link checker level changes

12dc16e

add docs about link checker levels

e3a9a21

remove accidentally committed test site

4a2857d

mwcz force-pushed the link-checker-levels-with-colorization branch from 91a6f64 to 4a2857d Compare May 10, 2022 20:56

mwcz marked this pull request as ready for review May 10, 2022 20:57

remove completed TODO

af70fc7

Keats approved these changes May 11, 2022

Keats merged commit 6240ed5 into getzola:next May 11, 2022

mwcz deleted the link-checker-levels-with-colorization branch May 11, 2022 21:48
