test: check links #958
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| - id: dead-url | ||
| name: Dead URL Checker | ||
| entry: scripts/link-check.sh | ||
| language: script | ||
| types: [text] | ||
| description: This hook searches for problematic URLs. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| http://127.0.0.1:10000/devstoreaccount1; | ||
| http://localhost:3000/ | ||
| https://$ | ||
| https://api.github.com/repos/$ | ||
| https://blog.$ | ||
| https://discuss.$ | ||
| https://dvc.org/some.link | ||
| https://example.com/data.txt | ||
| https://example.com/path/to/data | ||
| https://example.com/path/to/data.csv | ||
| https://example.com/path/to/dir | ||
| https://github.com/$ | ||
| https://github.com/dataversioncontrol/myrepo.git | ||
| https://github.com/example/registry | ||
| https://github.com/iterative/dvc.org/blob/master/public$ | ||
| https://github.com/iterative/dvc/releases/download/$ | ||
Comment on lines +15 to +16
Contributor: Oh so it's actually matching the
Contributor (Author): Yes my regexes match only up to the literal
| https://github.com/myaccount/myproject.git | ||
| https://myendpoint.com | ||
| https://object-storage.example.com | ||
| https://www.youtube.com/embed/$ | ||
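Several entries in this list end in `$`. One explanation consistent with the first pattern in `finder()` (in `link-check.sh`) is that `{` is outside the allowed character class, so a templated link such as `https://github.com/${user}` stops matching right after the `$`. The sketch below approximates that pattern with `grep -oE` and POSIX classes instead of `pcregrep`, dropping the lookbehind. The function name `extract_urls` and the sample input are hypothetical.

```shell
#!/usr/bin/env bash
# Approximation of finder()'s first pattern using grep -E instead of pcregrep.
# The negative lookbehind is omitted; only the character class matters here.
extract_urls() {
  grep -oE 'https?://[^[:space:]<>{}"'"'"'`]+'
}

# Hypothetical templated link: the match stops at '{', keeping the '$'.
printf '%s\n' 'see https://github.com/${user}/repo for details' | extract_urls
```

Under these assumptions, the extracted text is the truncated prefix, which is why the exclude list needs the `$`-terminated form and not the full template.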
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| #!/usr/bin/env bash | ||
| (find pages/ public/static/docs/ src/ .github/ -name '*.md' -o -name '*.js' && ls *.md *.js) \ | ||
| | xargs -n1 -P8 "$(dirname "$0")"/link-check.sh |
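The `find ... | xargs -n1 -P8` pipeline above splits its input on whitespace, so a path containing a space would reach the checker as two arguments. A minimal sketch of a NUL-delimited variant follows, assuming a `find` and `xargs` that support `-print0` and `-0` (GNU and BSD both do). `list_sources` is a hypothetical name, and `printf` stands in for `link-check.sh`.

```shell
#!/usr/bin/env bash
# List Markdown and JavaScript files under the given directories, NUL-delimited.
# The parentheses group the -o alternatives so that -print0 applies to both.
list_sources() {
  find "$@" -type f \( -name '*.md' -o -name '*.js' \) -print0
}

# Demo on a throwaway directory that includes a file name with a space.
demo_dir="$(mktemp -d)"
touch "$demo_dir/plain.md" "$demo_dir/with space.js" "$demo_dir/skip.txt"

# One invocation per file, up to 8 in parallel; printf stands in for the checker.
list_sources "$demo_dir" | xargs -0 -n1 -P8 printf 'checking: %s\n'

rm -rf "$demo_dir"
```

Because the jobs run in parallel, the order of the output lines is not guaranteed.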
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| #!/usr/bin/env bash | ||
| set -euxo pipefail | ||
| "$(dirname "$0")"/link-check.sh <(git diff origin/master -U0) |
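`<(git diff origin/master -U0)` is bash process substitution: the script receives a path (typically `/dev/fd/N`) that it can read like a regular file, so the diff text itself is what gets scanned for links. The sketch below shows the mechanism without needing a repository. `count_links` is a hypothetical stand-in for `link-check.sh`, and the diff hunk is made up.

```shell
#!/usr/bin/env bash
# Stand-in for link-check.sh: counts http(s) links in the file named by $1.
count_links() {
  grep -oE 'https?://[^[:space:]]+' "$1" | wc -l | tr -d ' '
}

# A made-up zero-context diff hunk, fed through process substitution
# the same way the script above feeds `git diff origin/master -U0`.
fake_diff() {
  printf '%s\n' '@@ -0,0 +1,2 @@' '+See https://dvc.org/doc' '+and https://example.com/data.txt'
}

count_links <(fake_diff)
```

Since the whole diff is searched, links on removed (`-`) lines are scanned as well as those on added lines.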
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| #!/usr/bin/env bash | ||
| # Check HTTP status codes of links in the given files. | ||
| # Success: 2xx, Errors: 4xx/5xx, Warnings: anything else. | ||
| # Redirects (3xx) are followed. | ||
| # Usage: | ||
| # link-check.sh [<files>] | ||
| set -euo pipefail | ||
| | ||
| base_url="${CHECK_LINKS_RELATIVE_URL:-https://dvc.org}" | ||
| exclude="${CHECK_LINKS_EXCLUDE_LIST:-$(dirname "$0")/exclude-links.txt}" | ||
| [ -f "$exclude" ] && exclude="$(cat "$exclude")" | ||
| | ||
| finder(){ # expects list of files | ||
| # explicit links not in markdown | ||
| pcregrep -o '(?<!\]\()https?://[^\s<>{}"'"'"'`]+' "$@" | ||
| # explicit links in markdown | ||
| pcregrep -o '(?<=\])\(https?://[^[\]\s]+\)' "$@" | pcregrep -o '\((?:[^)(]*(?R)?)*+\)' | pcregrep -o '(?<=\().*(?=\))' | ||
| # relative links in markdown | ||
| sed -nr 's/.*]\((\/[^)[:space:]]+).*/\1/p' "$@" | xargs -n1 -II echo ${base_url}I | ||
| # relative links in html | ||
| sed -nr 's/.*href=["'"'"'](\/[^"'"'"']+)["'"'"'].*/\1/p' "$@" | xargs -n1 -II echo ${base_url}I | ||
| } | ||
| checker(){ # expects list of urls | ||
| errors=0 | ||
| for url in "$@"; do | ||
| status="$(curl -IL -w '%{http_code}' -so /dev/null "$url")" | ||
| case "$status" in | ||
| 2??) | ||
| # success | ||
| ;; | ||
| [45]??) | ||
| echo | ||
| echo " ERROR:$status:$url" >&2 | ||
| errors=$(($errors + 1)) | ||
| ;; | ||
| *) | ||
| echo | ||
| echo " WARNING:$status:$url" >&2 | ||
| ;; | ||
| esac | ||
| done | ||
| return $((errors > 255 ? 255 : errors)) | ||
| } | ||
| | ||
| fails=0 | ||
| for file in "$@"; do | ||
| echo -n "$file:" | ||
| prev=$fails | ||
| checker $(finder "$file" | sort -u | comm -23 - <(echo "$exclude" | sort -u)) || fails=$(($fails + 1)) | ||
| [ $prev -eq $fails ] && echo OK | ||
| done | ||
| [ $fails -eq 0 ] || echo -e "ERROR:$fails failures\n---" >&2 | ||
| exit $((fails > 255 ? 255 : fails)) |
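The main loop above combines two steps: dropping excluded URLs with `sort -u | comm -23`, and mapping each HTTP status to success, error, or warning. The sketch below reproduces both offline, assuming bash. `filter_excluded` and `classify` are hypothetical helper names, and the status codes are passed in by hand instead of coming from `curl`.

```shell
#!/usr/bin/env bash
# Print URLs from stdin that are not in the exclude list ($1, newline-separated).
# comm -23 keeps lines unique to the first input; both inputs must be sorted.
filter_excluded() {
  sort -u | comm -23 - <(printf '%s\n' "$1" | sort -u)
}

# Map an HTTP status code to the categories described in the script header.
classify() {
  case "$1" in
    2??) echo OK ;;
    [45]??) echo ERROR ;;
    *) echo WARNING ;;
  esac
}

exclude='https://example.com/data.txt
http://localhost:3000/'

# The duplicate is collapsed and the excluded URL is dropped.
printf '%s\n' \
  'https://dvc.org/doc' \
  'https://example.com/data.txt' \
  'https://dvc.org/doc' \
  | filter_excluded "$exclude"

classify 200   # OK
classify 404   # ERROR
classify 000   # WARNING (curl reports 000 when no response was received)
```

Matching against the exclude list is exact and line-based, so an entry only suppresses a URL that `finder()` extracts with identical text.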