-
Notifications
You must be signed in to change notification settings - Fork 989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add link_checker settings for external_level and internal_level #1848
add link_checker settings for external_level and internal_level #1848
Conversation
components/site/src/link_checking.rs
Outdated
}, | ||
Err(err) => bail!("could not parse domain `{}` from link: `{}`", link, err), | ||
Err(err) => Err(format!("could not parse domain `{}` from link: `{}`", link, err)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why that change? It should be equivalent to bail!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made that change so it doesn't bail on the first unparseable URL. It tries to parse everything, and then if there were any failures, it prints out all the unparseable URLs and then bails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to reverse that change though, I can see how bailing on the first error is more consistent and leads to less complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I believe the code of get_link_domain
before and the one with your changes is exactly the same no? You're just returning the error either way in get_link_domain
It's fine to report all of them though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, for some reason I was under the impression that bail caused an immediate panic. Whoops, I'll change it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for writing so much, this PR has gone down several different paths and I'm trying to get my bearings.
No worries!
I was thinking about it today as well and my thoughts might be a bit out of scope for the PR.
What if the link checker was not using Error
s but was instead using a Vec<String>
string for the errors.
This way we can build the error message ourselves nicely, something like:
Failed to build the site
Error: Found 3 broken anchor links
1. The anchor in the link `@/post.md#foo` in content/post.md does not exist.
2. The anchor in the link `@/post.md#bar` in content/post.md does not exist.
3. The anchor in the link `@/post.md#baz` in content/post.md does not exist.
We can/should obviously apply the same kind of .formatting to external link errors obviously.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine check_external_links
and check_internal_links_with_anchors
emitting a Vec<String>
which contains the text for all the errors or warnings that were encountered. Those functions are called from site/src/lib.rs
, so I suppose that's where the logging could happen. Seems like a clean solution; the logging code would only need to exist in site/src/lib.rs
instead of being duplicated at every early-return site. It would potentially change the ordering of error/warning logs relative to regular printlns since the errors/warnings would be delayed until the function returns, but the link checker hardly has any printlns so I don't see a problem there. Seems good to me, so I guess I'll give it a try next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The internal link checking that happens in markdown.rs
would need a similar treatment I suppose, so the code would be more-or-less duplicated, but only in two places. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the markdown one is unfortunate... maybe keep it as is for now and see if it feels odd when trying it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems pretty good. The markdown.rs
link checking errors/warnings aren't formatted exactly the same, but it doesn't feel odd when using zola. Here's a demo.
74c7de6
to
defbd54
Compare
91a6f64
to
4a2857d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet thanks
FYI I just pushed a commit to make the external link checking a bit more DRY: 1a3b783 |
Based on this convo.
This PR adds new link checker configuration settings:
internal_level
andexternal_level
, which can both be set to"warn"
or"error"
. The current behavior of zola is to always error on broken internal links and on external links with unparsable URLs, so these new settings allow those to be treated as warnings instead. In other words, it allows zola to build sites even though they have broken links. The current behavior is still the default.Sanity check:
Code changes
next
branch?If the change is a new feature or adding to/changing an existing one: