seo: monitor and fix broken links #746
Comments
I would clarify that there are a lot of false positives here, and we only need to fix the very few that use redirects like
Running the script. Only seeing a few issues after turning on --max-redirect=10 and --method=GET. Lots of false positives with redirects=0 and method=HEAD.
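To illustrate why those two settings matter, here is a minimal Python sketch of the checking logic, not the actual script from #690. The `check_link` function and the injectable `fetch` callable are hypothetical names introduced for this example; `fetch(method, url)` is assumed to return a status code and an optional `Location` header without following redirects itself.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetcher: (method, url) -> (status code, Location header or None).
# It is assumed NOT to follow redirects on its own.
Fetch = Callable[[str, str], Tuple[int, Optional[str]]]


def check_link(url: str, fetch: Fetch, max_redirects: int = 10) -> bool:
    """Return True if the link ends in a 2xx response.

    HEAD is tried first; GET is the fallback, since some servers
    reject or mishandle HEAD requests.
    """
    for method in ("HEAD", "GET"):
        current = url
        # One initial request plus at most `max_redirects` follow-ups.
        for _ in range(max_redirects + 1):
            status, location = fetch(method, current)
            if 300 <= status < 400 and location:
                # Resolve relative Location values against the current URL.
                current = urljoin(current, location)
                continue
            break
        if 200 <= status < 300:
            return True
    return False
```

With `max_redirects=0`, any link that answers with a 3xx is reported as broken, and a HEAD-only check flags servers that only answer GET. Both cases would show up as false positives.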
Thanks @taylorlee1! Yes, I'm not sure that script is completely flawless. Please use your criteria and fix the broken links you are able to find 🙂 You may open a PR with the fixes and say "Fix #746" in its description. More info in https://dvc.org/doc/user-guide/contributing/docs
@taylorlee1 some false positives on redirects=0 should be fixed. Mostly those which are redirects we keep for backward compatibility (docs -> doc), etc. They are not external links; they automatically transform one docs link into another.
All three errors can be fixed by using the .html suffix instead of .htm.
My 2c on this:
- keep redirects that just remove/add a slash at the end like
- def fix redirects like this https://plugins.jetbrains.com/plugin/11368-dvc-support-poc - these look like the owners of the site moved the page (probably it should be returning 301?)
- https://dvc.org/chat https://discordapp.com/invite/dvwXA2N - these are specifically made so that we can change the invite if it's needed.
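The first rule above (keep redirects that only add or remove a trailing slash) could be automated along these lines. This is a sketch under assumptions: `is_trailing_slash_redirect` is a hypothetical helper, not part of the existing script, and it treats any change of scheme, host, query, or fragment as a real redirect.

```python
from urllib.parse import urlsplit


def is_trailing_slash_redirect(source: str, target: str) -> bool:
    """True if the two URLs differ only by a trailing slash on the path."""
    src, dst = urlsplit(source), urlsplit(target)
    same_origin = (src.scheme, src.netloc) == (dst.scheme, dst.netloc)
    same_rest = (src.query, src.fragment) == (dst.query, dst.fragment)
    return (
        same_origin
        and same_rest
        and src.path != dst.path  # identical URLs are not a redirect
        and src.path.rstrip("/") == dst.path.rstrip("/")
    )
```

A checker could skip redirects for which this returns True and only report the rest for manual review.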
Another issue I just detected is that we can't really check all the anchors of links, for example
Don't think we need to remove them. It's more or less fine to have some of them broken and update them from time to time. We can make a script that analyzes the content of the page to see if there is an anchor there. SSR would be helpful in this case, but it can be done without that as well.
Agree, as long as the link before the anchor exists. But finding broken anchors may indicate that the original content has changed, so the link may no longer be relevant. (Unlikely.)
Only for internal links. For any external links to dynamic sites, broken anchors will also be hard to detect. (A crawler would throw false positives, which we could simply review manually when reported.) Anyway, yeah, not a huge deal, just something I realized and wanted to note here.
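A script that looks for the anchor in the page content, as suggested above, might look like the following sketch. It assumes the HTML has already been downloaded and that anchors are present in the served markup (which is why SSR helps); `AnchorCollector` and `anchor_exists` are hypothetical names, and pages that build their anchors client-side would still yield false positives.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collect every `id` attribute and every `<a name="...">` value."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def anchor_exists(url: str, html: str) -> bool:
    """True if the URL has no fragment or the fragment is found in the HTML."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return True
    collector = AnchorCollector()
    collector.feed(html)
    return fragment in collector.anchors
```

Links that fail this check would be candidates for manual review rather than automatic removal.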
Should be solved by @casperdcl's fix and a few commits that removed/fixed broken links.
My implementation (#958) was quick and dirty but probably does the job. It didn't actually fix the current broken links, but it will find any future ones and keep warning about the current ones.
Use this script or similar manually to double check which links are broken in the docs: https://github.com/iterative/dvc.org/pull/690/files#diff-a5173e320dcf100fc3ff5b32ba2ea911
The last run (see #690 (comment)) reported the following problems:
UPDATE: Scroll to #746 (comment) and below for latest pending work here.