Migrate to lychee for link checking #291
as liche has been deprecated by its developer
since Rust applications are by default placed into a custom directory outside of PATH.
since those currently require an external base URL; checking for local file existence instead is not possible. But we do not want to flood our server with hundreds of concurrent requests for now. Let's see if this feature is implemented soon; until then we stay with the still functional liche.
as building it takes too long. Also fix the URL glob, as "build/docs/**.html" does not seem to match HTML files in subdirectories.
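To illustrate the glob issue mentioned above, here is a minimal sketch using Python's `glob` module. It is only an analogy: the link checker uses its own glob implementation, which may behave differently in detail. The helper names `match_html` and `demo` are made up for this example.

```python
import glob
import os
import tempfile


def match_html(root: str, pattern: str) -> list[str]:
    """Return paths under root that match pattern, relative to root, sorted."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


def demo() -> tuple[list[str], list[str]]:
    """Build a tiny docs tree and compare the two patterns."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "build", "docs", "sub"))
        for rel in ("build/docs/index.html", "build/docs/sub/page.html"):
            with open(os.path.join(root, rel), "w") as fh:
                fh.write("<html></html>")
        # "**" glued to ".html" is not a full path component,
        # so it only behaves like a single "*".
        flat = match_html(root, "build/docs/**.html")
        # "**" as its own component recurses into subdirectories.
        deep = match_html(root, "build/docs/**/*.html")
    return flat, deep
```

In this sketch, the first pattern only finds `build/docs/index.html`, while the second also finds `build/docs/sub/page.html`.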
as the previous Rust setup has been removed
Thoughts on using |
Good to know about that option. However, often one step is required for the following step but not for the step after that, e.g. spell checking can only work if pyspelling was built successfully, but link checking can still be done. I think this can currently only be handled correctly the way it is done now.
Also (more related to this PR), why not use the actual lychee action? https://github.com/lycheeverse/lychee-action
The question is: why should we? 🙂 Those GitHub action containers are generally nice to get started quickly and easily without doing things like manual builds (or download and extract in the case of lychee), but in the end they do nothing more than execute the same tool, so once that is set up they have no further benefit.
Since local file checks currently do not seem to be a prioritised feature in lychee, an alternative is to use
... since it does not support local file checking for internal links yet: lycheeverse/lychee#21
The base URL is not suffixed with the file path, hence it is wrong as soon as subdirectories are processed. It is therefore required to loop through all directories and check each one individually with that directory as base path. Remove the verbose flag, as it is ignored when STDOUT and STDERR are redirected. Do not follow redirects, so we can find outdated links.
as otherwise too many excludes are required. lychee GET requests are quite efficient, so that is not an issue. Add further excludes, required mainly because lychee also checks URLs in code tags :(.
It required some more changes and a more complex integration, but now it works fine. It's worth it to reduce the check processing time and the timeout errors that liche suffered from. Positive side-effects are:
A downside is that URLs in code blocks are checked as well, which required some additional exclusions when we used things like
Since we need to check every directory with a separate call, we have many summary prints in the checks output. Not awesome, but acceptable.
OpenSSL sees a problem with their certificate, browsers on Windows do not. For now simply exclude it.
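The per-directory loop described above can be sketched roughly as follows. This is not the actual script from this PR, just a minimal Python illustration; `base_urls_per_directory`, `build_root` and `site_url` are hypothetical names, and `site_url` is assumed to end with a slash.

```python
import os
from urllib.parse import urljoin


def base_urls_per_directory(build_root: str, site_url: str) -> dict[str, list[str]]:
    """Map a per-directory base URL to the HTML files found in that directory.

    site_url is assumed to end with "/" so that urljoin appends to it.
    """
    groups: dict[str, list[str]] = {}
    for dirpath, _dirs, files in os.walk(build_root):
        html = sorted(f for f in files if f.endswith(".html"))
        if not html:
            continue
        # Directory path relative to the build root, with URL-style separators.
        rel = os.path.relpath(dirpath, build_root).replace(os.sep, "/")
        suffix = "" if rel == "." else rel + "/"
        groups[urljoin(site_url, suffix)] = [os.path.join(dirpath, f) for f in html]
    return groups
```

Each entry would then be passed to one link checker call, with the key as base URL and the value as the list of files. The reason a single base does not work: a relative link like `img/a.png` inside `sub/page.html` has to resolve against `https://example.org/docs/sub/`, not against `https://example.org/docs/`.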
also to force another checks run, as I want to have it green :).
It's not as fast as on the main website, I guess mostly since we need to do a separate call for each HTML file to have internal links checked correctly. This also means that the same URLs are checked multiple times. And it's about 3000 links overall, so MUCH more 😄. Ready from my end.
as liche has been deprecated by its developer. But there are a few issues that need to be addressed first:
EDIT: Solved by running mkdocs serve as a local web server and checking internal links against it.
EDIT: All full URLs now work with a newly implemented HTML parser 👍.
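When checking against a local web server like this, the checker should only start once the server accepts connections. A minimal sketch of such a wait, assuming the server listens on 127.0.0.1:8000 (the mkdocs default as far as I know); `wait_for_port` is a hypothetical helper, not part of mkdocs or lychee.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is accepting requests.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet; retry shortly.
            time.sleep(0.2)
    return False
```

A CI step could start the server in the background, call `wait_for_port("127.0.0.1", 8000)`, run the link check, and stop the server afterwards.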