Create dead-links-check.yml #137
base: develop
CI that checks for dead links
Thank you for opening this PR!! 🤩 Links are a tricky thing. For sites we own (MITRE & CTID), checking makes sense, and this is a great callout. For sites we do not own, we will probably always get errors. The reason is that vendors (the main suppliers of reports) can and do remove published reports 💔. Annoying, but since it's their report it's also their right. It's not uncommon for us to be using a report during development and suddenly find the report 💨 gone 😿. Our workaround ❤️‍🩹 has been to download reports earmarked as useful so we do not rely on the online version. This way, if anyone has questions regarding citations, we can promptly provide the documentation even if the links are broken 🔗. However, GitHub is not the best place for document storage, so we don't upload those here. Any thoughts on other solutions? I haven't looked too deeply into this project yet, but it's now on my docket. If there is a way to ignore some links while verifying others, that would be helpful. This is also a good callout for a documentation update. Thank you! 🙏
ci: add: arguments to workflow and clean workflow test commits
add: accept code 403 as not error Signed-off-by: dcaldr <22105838+dcaldr@users.noreply.github.com>
I did some trial-and-error testing on the tool. For dead links, it can also suggest their saved versions in the Wayback Machine (Internet Archive). This is much slower, but it gets the job done.
@@ -13,5 +13,5 @@ jobs:
       - name: Link Checker
         uses: lycheeverse/lychee-action@v1.8.0
         with:
-          args: " --suggest --verbose --no-progress './**/*.md' './**/*.html' './**/*.rst' --exclude-mail -a 429 --exclude-path *fin7/Resources/Step7/BOOSTWRITE-src/curl/README.md "
+          args: " --suggest --verbose --no-progress './**/*.md' './**/*.html' './**/*.rst' --exclude-mail -a 403,429 --exclude-path *fin7/Resources/Step7/BOOSTWRITE-src/curl/README.md "
- --suggest: adds Wayback Machine links for dead links
- --verbose --no-progress: control the format of the output
- *.md etc.: target only the selected file types (not .c, for example)
- --include-verbatim: could be added to also search inside md code blocks
- -a: treats HTTP codes 403 and 429 as good. Bitdefender and about two other sites return those (because they need cookies and JS). This could maybe be replaced with a specific exclusion via --exclude
- --exclude-path: this one file is in UTF-16 (?bug?) and the link checker crashes on it (very rarely even now, after excluding it)
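One possible way to ignore some links while still verifying the rest, as asked above, is lychee's --exclude option, which takes a regular expression matched against URLs. The step below is only a sketch, not the change made in this PR: it assumes www.bitdefender.com is the only host that needs skipping (the other sites returning 403 are not named here and would need their own patterns), and it drops 403 from the accepted codes again so that a 403 from any other site is still reported. The args are written as a folded scalar purely for readability.

```yaml
- name: Link Checker
  uses: lycheeverse/lychee-action@v1.8.0
  with:
    # Sketch only: skip one known false positive by URL pattern
    # instead of accepting HTTP 403 from every site.
    # The excluded host is an assumption taken from the PR description.
    args: >-
      --suggest --verbose --no-progress
      './**/*.md' './**/*.html' './**/*.rst'
      --exclude-mail -a 429
      --exclude 'www.bitdefender.com'
      --exclude-path *fin7/Resources/Step7/BOOSTWRITE-src/curl/README.md
```

If the list of ignored links grows, lychee can also read patterns from a .lycheeignore file in the repository root, one regular expression per line, which may be easier to maintain than a long args string.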
CI that checks for dead links, as suggested by Issue #60. I used work from https://github.com/lycheeverse/lychee-action
It has some false positives, e.g. www.bitdefender.com returns a 403 error that I'm not able to fix, but most reported links are really broken. Further additions could include using a cache or trying to auto-resolve links via the Internet Archive, as presented in: more commandline arguments
I could put in more time and effort, but as this is my first pull request I'm not sure if it's useful.
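For the caching idea mentioned in the description, a rough sketch could look like the steps below. It follows the pattern shown in the lychee-action documentation: lychee writes its results to a .lycheecache file when run with --cache, and actions/cache restores that file between runs. The cache key names and the one-day maximum age are assumptions chosen for illustration, not values tested against this repository.

```yaml
# Sketch only: restore lychee's result cache so recently checked
# links are not requested again on every run.
- name: Restore lychee cache
  uses: actions/cache@v3
  with:
    path: .lycheecache
    key: cache-lychee-${{ github.sha }}
    restore-keys: cache-lychee-

- name: Link Checker
  uses: lycheeverse/lychee-action@v1.8.0
  with:
    # --max-cache-age 1d is an assumed value; tune it to how often
    # vendor reports tend to disappear.
    args: >-
      --cache --max-cache-age 1d
      --suggest --verbose --no-progress
      './**/*.md' './**/*.html' './**/*.rst'
      --exclude-mail -a 403,429
      --exclude-path *fin7/Resources/Step7/BOOSTWRITE-src/curl/README.md
```

This mainly reduces run time and the number of requests that can trigger 429 responses. It does not help with reports that vendors have removed.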