lychee shows network error: forbidden for valid links #733
Comments
Hum, looks like it's an issue on your end. 🤔 At least it works over here:
Can you try from a different network? Or maybe reconnect to your wifi? Maybe it was just a temporary hiccup? |
Hi @mre, it originally happened on GitHub Actions. The same happened when I tried it on my local PC. I ran it several times on GitHub Actions too, and the result is still the same. We need it to work on GitHub Actions. |
Oh I see. No clue what's going on there yet. |
Any idea how to make it work? |
I just tried it on another cloud server and same error. |
What's strange is that it works for me. So it must be something that's wrong with the client-side.
I'm afraid there ain't much I can do on my end other than asking questions. 😅 |
On your thoughts,
|
I've created a new git repository to make a mock test. And the result is negative. |
Hi @mre , Any update? |
Nope, no update. I don't think I can help much here. 🙁 It works over here, so it's definitely a problem with the server preventing certain clients from getting access. It's not something we can fix in lychee. |
Understood. But it's not working on my local PC or on a Linode server either. I'm confused. So it fails in three places in a row. |
Yeah that's definitely weird. If anyone can help narrow this down, please run the following command on your machine and report back the result. echo 'https://catboost.ai' | lychee - |
With lychee it's not working
On the other hand working with curl
|
Does this work for you? echo 'https://catboost.ai' | lychee --user-agent 'curl/7.79.1' - |
Not working
|
No clue, really. The last thing that comes to mind is echo 'https://catboost.ai' | lychee --user-agent 'curl/7.79.1' --headers 'Accept=*/*' - That's the only thing I can see when I inspect the curl command. |
With my curl version
|
@lebensterben could you run a test on your side if you find the time? |
|
This issue is similar to #715 We need to build a minimal curl-like client with reqwest to check whether the problem is in reqwest or lychee. |
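A rough sketch of how the results of such a comparison could be read, assuming each client (reqwest, libcurl, and so on) is run against the same URL from the same machine. The `Outcome` alias, the `Verdict` enum, and `diagnose` are hypothetical names for illustration only; none of them exist in lychee or geturl-test, and the classification is a heuristic, not a proof of where the fault lies.

```rust
/// Outcome of one client fetching the URL: an HTTP status code,
/// or a description of a transport-level error.
type Outcome = Result<u16, String>;

/// Hypothetical verdicts for the comparison (not part of lychee).
#[derive(Debug, PartialEq)]
enum Verdict {
    /// Every client received a 2xx response.
    AllSucceed,
    /// No client got through: more likely the network or the server itself.
    AllFail,
    /// Some clients pass and others do not: the server appears to treat
    /// clients differently, so the fault is tied to specific clients.
    ClientSpecific { rejected: Vec<String> },
}

/// Compare per-client outcomes for the same URL.
fn diagnose(results: &[(&str, Outcome)]) -> Verdict {
    // Collect the names of all clients that did not get a 2xx status.
    let rejected: Vec<String> = results
        .iter()
        .filter(|(_, outcome)| !matches!(outcome, Ok(code) if (200..300).contains(code)))
        .map(|(name, _)| name.to_string())
        .collect();

    if rejected.is_empty() {
        Verdict::AllSucceed
    } else if rejected.len() == results.len() {
        Verdict::AllFail
    } else {
        Verdict::ClientSpecific { rejected }
    }
}
```

Feeding it one entry per client, for example `("reqwest", Ok(403))` next to `("curl", Ok(200))`, would yield `ClientSpecific` with `reqwest` in the rejected list, which points at the HTTP client rather than at lychee's own logic.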
The strange thing is that it works for me with lychee. 🤔 I sort of hacked together a curl/reqwest client here: https://github.com/lycheeverse/geturl-test
# reqwest (the default)
~/C/s/r/geturl ❯❯❯ cargo run -- 'https://catboost.ai'
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
Running `target/debug/geturl 'https://catboost.ai'`
200 OK
# libcurl
~/C/s/r/geturl ❯❯❯ cargo run --features curl -- 'https://catboost.ai'
Finished dev [unoptimized + debuginfo] target(s) in 0.05s
Running `target/debug/geturl 'https://catboost.ai'`
200 OK
~/C/s/r/geturl ❯❯❯ |
|
We may need this |
I've added support for it to geturl-test and it indeed works:
Tested locally and inside a GitHub Codespace. Can you both test it on your machines as well? If it works I really don't know if we should add |
I won't be able to test it for a while because I got covid (again). |
Get well soon. 🤗 |
@Rizwan-Hasan in case you find the time, maybe you can test this: #733 (comment) |
@mre Hi, of course. I pulled a docker image
root@22c7cf489de3:/geturl-test# cargo run -- 'https://catboost.ai'
Updating git repository `https://github.com/4JX/h2.git`
Updating git repository `https://github.com/4JX/hyper.git`
Updating crates.io index
Updating git repository `https://github.com/4JX/reqwest-impersonate.git`
Updating git repository `https://github.com/4JX/boring`
Updating git submodule `https://github.com/google/boringssl.git`
Updating git submodule `https://github.com/google/boringssl.git`
Downloaded bitflags v1.3.2
Downloaded memchr v2.5.0
Downloaded quote v1.0.21
Downloaded mio v0.8.4
Downloaded serde_urlencoded v0.7.1
Downloaded reqwest v0.11.12
Downloaded syn v1.0.103
Downloaded unicode-ident v1.0.5
Downloaded ipnet v2.5.0
Downloaded hyper-tls v0.5.0
Downloaded httparse v1.8.0
Downloaded http-body v0.4.5
Downloaded foreign-types-shared v0.1.1
Downloaded foreign-types v0.3.2
Downloaded tower-service v0.3.2
Downloaded tokio-util v0.7.4
Downloaded openssl v0.10.42
Downloaded tokio-native-tls v0.3.0
Downloaded tracing v0.1.37
Downloaded fnv v1.0.7
Downloaded num_cpus v1.13.1
Downloaded pin-utils v0.1.0
Downloaded native-tls v0.2.10
Downloaded tracing-core v0.1.30
Downloaded want v0.3.0
Downloaded try-lock v0.2.3
Downloaded bytes v1.2.1
Downloaded log v0.4.17
Downloaded pin-project-lite v0.2.9
Downloaded indexmap v1.9.1
Downloaded once_cell v1.15.0
Downloaded url v2.3.1
Downloaded slab v0.4.7
Downloaded form_urlencoded v1.1.0
Downloaded ryu v1.0.11
Downloaded httpdate v1.0.2
Downloaded itoa v1.0.4
Downloaded unicode-bidi v0.3.8
Downloaded mime v0.3.16
Downloaded percent-encoding v2.2.0
Downloaded tinyvec_macros v0.1.0
Downloaded serde v1.0.147
Downloaded http v0.2.8
Downloaded tinyvec v1.6.0
Downloaded proc-macro2 v1.0.47
Downloaded openssl-macros v0.1.0
Downloaded anyhow v1.0.66
Downloaded base64 v0.13.1
Downloaded autocfg v1.1.0
Downloaded cfg-if v1.0.0
Downloaded cc v1.0.73
Downloaded futures-core v0.3.25
Downloaded idna v0.3.0
Downloaded futures-task v0.3.25
Downloaded futures-sink v0.3.25
Downloaded futures-io v0.3.25
Downloaded pkg-config v0.3.25
Downloaded openssl-probe v0.1.5
Downloaded futures-channel v0.3.25
Downloaded futures-util v0.3.25
Downloaded socket2 v0.4.7
Downloaded openssl-sys v0.9.77
Downloaded hashbrown v0.12.3
Downloaded unicode-normalization v0.1.22
Downloaded tokio v1.21.2
Downloaded libc v0.2.135
Downloaded encoding_rs v0.8.31
Downloaded 67 crates (5.6 MB) in 2.05s (largest was `encoding_rs` at 1.4 MB)
Compiling autocfg v1.1.0
Compiling libc v0.2.135
Compiling cfg-if v1.0.0
Compiling log v0.4.17
Compiling proc-macro2 v1.0.47
Compiling pin-project-lite v0.2.9
Compiling cc v1.0.73
Compiling once_cell v1.15.0
Compiling memchr v2.5.0
Compiling pkg-config v0.3.25
Compiling quote v1.0.21
Compiling futures-core v0.3.25
Compiling bytes v1.2.1
Compiling unicode-ident v1.0.5
Compiling syn v1.0.103
Compiling futures-task v0.3.25
Compiling itoa v1.0.4
Compiling openssl v0.10.42
Compiling fnv v1.0.7
Compiling foreign-types-shared v0.1.1
Compiling futures-util v0.3.25
Compiling futures-io v0.3.25
Compiling native-tls v0.2.10
Compiling pin-utils v0.1.0
Compiling tinyvec_macros v0.1.0
Compiling httparse v1.8.0
Compiling futures-channel v0.3.25
Compiling hashbrown v0.12.3
Compiling bitflags v1.3.2
Compiling futures-sink v0.3.25
Compiling percent-encoding v2.2.0
Compiling openssl-probe v0.1.5
Compiling try-lock v0.2.3
Compiling serde v1.0.147
Compiling encoding_rs v0.8.31
Compiling httpdate v1.0.2
Compiling tower-service v0.3.2
Compiling unicode-bidi v0.3.8
Compiling ryu v1.0.11
Compiling anyhow v1.0.66
Compiling mime v0.3.16
Compiling ipnet v2.5.0
Compiling base64 v0.13.1
Compiling tracing-core v0.1.30
Compiling tokio v1.21.2
Compiling slab v0.4.7
Compiling indexmap v1.9.1
Compiling http v0.2.8
Compiling foreign-types v0.3.2
Compiling tinyvec v1.6.0
Compiling openssl-sys v0.9.77
Compiling form_urlencoded v1.1.0
Compiling tracing v0.1.37
Compiling want v0.3.0
Compiling http-body v0.4.5
Compiling unicode-normalization v0.1.22
Compiling num_cpus v1.13.1
Compiling mio v0.8.4
Compiling socket2 v0.4.7
Compiling idna v0.3.0
Compiling url v2.3.1
Compiling serde_urlencoded v0.7.1
Compiling tokio-util v0.7.4
Compiling h2 v0.3.14 (https://github.com/4JX/h2.git?branch=imp#90af7b9d)
Compiling openssl-macros v0.1.0
Compiling hyper v0.14.18 (https://github.com/4JX/hyper.git?branch=v0.14.18-patched#abf28d87)
Compiling tokio-native-tls v0.3.0
Compiling hyper-tls v0.5.0
Compiling reqwest v0.11.12
Compiling tectonic_geturl v0.0.0-dev.0 (/geturl-test)
warning: variable does not need to be mutable
--> src/bin/geturl.rs:9:13
|
9 | let mut response = backend.get_url(&url, &mut status).unwrap();
| ----^^^^^^^^
| |
| help: remove this `mut`
|
= note: `#[warn(unused_mut)]` on by default
warning: `tectonic_geturl` (bin "geturl") generated 1 warning
Finished dev [unoptimized + debuginfo] target(s) in 2m 59s
Running `target/debug/geturl 'https://catboost.ai'`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: unexpected HTTP response code 403 Forbidden for URL https://catboost.ai', src/bin/geturl.rs:9:63
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
root@22c7cf489de3:/geturl-test#
root@22c7cf489de3:/geturl-test# cargo run --no-default-features --features curl -- 'https://catboost.ai'
Downloaded curl v0.4.44
Downloaded libz-sys v1.1.8
Downloaded curl-sys v0.4.56+curl-7.83.1
Downloaded 3 crates (5.5 MB) in 2.91s (largest was `curl-sys` at 3.0 MB)
Compiling curl v0.4.44
Compiling libz-sys v1.1.8
Compiling curl-sys v0.4.56+curl-7.83.1
Compiling socket2 v0.4.7
Compiling tectonic_geturl v0.0.0-dev.0 (/geturl-test)
error[E0599]: no method named `status_code` found for struct `std::io::Cursor` in the current scope
--> src/bin/geturl.rs:21:40
|
21 | let status_code = response.status_code();
| ^^^^^^^^^^^ method not found in `std::io::Cursor<Vec<u8>>`
For more information about this error, try `rustc --explain E0599`.
error: could not compile `tectonic_geturl` due to previous error
root@22c7cf489de3:/geturl-test#
root@22c7cf489de3:/geturl-test# cargo run --no-default-features --features reqwest_impersonate -- 'https://catboost.ai'
Downloaded libloading v0.7.3
Downloaded clang-sys v1.4.0
Downloaded cexpr v0.6.0
Downloaded cmake v0.1.48
Downloaded brotli-decompressor v2.3.2
Downloaded glob v0.3.0
Downloaded foreign-types-shared v0.3.1
Downloaded linked_hash_set v0.1.4
Downloaded foreign-types-macros v0.2.2
Downloaded crc32fast v1.3.2
Downloaded flate2 v1.0.24
Downloaded linked-hash-map v0.5.6
Downloaded alloc-stdlib v0.2.2
Downloaded adler v1.0.2
Downloaded async-compression v0.3.15
Downloaded nom v7.1.1
Downloaded antidote v1.0.0
Downloaded lazycell v1.3.0
Downloaded regex-syntax v0.6.27
Downloaded tower-layer v0.3.2
Downloaded peeking_take_while v0.1.2
Downloaded alloc-no-stdlib v2.0.4
Downloaded regex v1.6.0
Downloaded minimal-lexical v0.2.1
Downloaded rustc-hash v1.1.0
Downloaded foreign-types v0.5.0
Downloaded bindgen v0.60.1
Downloaded lazy_static v1.4.0
Downloaded shlex v1.1.0
Downloaded miniz_oxide v0.5.4
Downloaded brotli v3.3.4
Downloaded 31 crates (3.0 MB) in 1.39s (largest was `brotli` at 1.4 MB)
Compiling glob v0.3.0
Compiling minimal-lexical v0.2.1
Compiling bindgen v0.60.1
Compiling regex-syntax v0.6.27
Compiling lazy_static v1.4.0
Compiling lazycell v1.3.0
Compiling rustc-hash v1.1.0
Compiling peeking_take_while v0.1.2
Compiling shlex v1.1.0
Compiling alloc-no-stdlib v2.0.4
Compiling foreign-types-shared v0.3.1
Compiling crc32fast v1.3.2
Compiling adler v1.0.2
Compiling linked-hash-map v0.5.6
Compiling tower-layer v0.3.2
Compiling antidote v1.0.0
Compiling libloading v0.7.3
Compiling cmake v0.1.48
Compiling alloc-stdlib v0.2.2
Compiling miniz_oxide v0.5.4
Compiling linked_hash_set v0.1.4
Compiling clang-sys v1.4.0
Compiling brotli-decompressor v2.3.2
Compiling nom v7.1.1
Compiling foreign-types-macros v0.2.2
Compiling tokio-util v0.7.4
Compiling flate2 v1.0.24
Compiling regex v1.6.0
Compiling h2 v0.3.14 (https://github.com/4JX/h2.git?branch=imp#90af7b9d)
Compiling brotli v3.3.4
Compiling foreign-types v0.5.0
Compiling cexpr v0.6.0
Compiling hyper v0.14.18 (https://github.com/4JX/hyper.git?branch=v0.14.18-patched#abf28d87)
Compiling async-compression v0.3.15
Compiling boring-sys v2.0.0 (https://github.com/4JX/boring?rev=2a7463a#2a7463aa)
error: failed to run custom build command for `boring-sys v2.0.0 (https://github.com/4JX/boring?rev=2a7463a#2a7463aa)`
Caused by:
process didn't exit successfully: `/geturl-test/target/debug/build/boring-sys-0708443bc13991d2/build-script-build` (exit status: 101)
--- stdout
cargo:rerun-if-env-changed=BORING_BSSL_PATH
CMAKE_TOOLCHAIN_FILE_x86_64-unknown-linux-gnu = None
CMAKE_TOOLCHAIN_FILE_x86_64_unknown_linux_gnu = None
HOST_CMAKE_TOOLCHAIN_FILE = None
CMAKE_TOOLCHAIN_FILE = None
CMAKE_GENERATOR_x86_64-unknown-linux-gnu = None
CMAKE_GENERATOR_x86_64_unknown_linux_gnu = None
HOST_CMAKE_GENERATOR = None
CMAKE_GENERATOR = None
CMAKE_PREFIX_PATH_x86_64-unknown-linux-gnu = None
CMAKE_PREFIX_PATH_x86_64_unknown_linux_gnu = None
HOST_CMAKE_PREFIX_PATH = None
CMAKE_PREFIX_PATH = None
CMAKE_x86_64-unknown-linux-gnu = None
CMAKE_x86_64_unknown_linux_gnu = None
HOST_CMAKE = None
CMAKE = None
running: "cmake" "/usr/local/cargo/git/checkouts/boring-b574da2d10d0f762/2a7463a/boring-sys/deps/boringssl" "-DCMAKE_INSTALL_PREFIX=/geturl-test/target/debug/build/boring-sys-6cc3a56ed015afe6/out" "-DCMAKE_C_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_C_COMPILER=/usr/bin/cc" "-DCMAKE_CXX_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_CXX_COMPILER=/usr/bin/c++" "-DCMAKE_ASM_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_ASM_COMPILER=/usr/bin/cc" "-DCMAKE_BUILD_TYPE=Debug"
--- stderr
thread 'main' panicked at '
failed to execute command: No such file or directory (os error 2)
is `cmake` not installed?
build script failed, must exit now', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/cmake-0.1.48/src/lib.rs:975:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
root@22c7cf489de3:/geturl-test# |
Okay, thanks. The second one should not have failed; that's an error on my end. However, I do expect it to fail just like the first test with reqwest. For the last one, which is the most promising, you need to install boringssl first. |
...and for boringssl you need to install |
I had to install
root@e4e91d5d237d:/geturl-test# cargo run --no-default-features --features reqwest_impersonate -- 'https://catboost.ai'
warning: unused variable: `redirect_policy`
--> src/reqwest_impersonate.rs:59:13
|
59 | let redirect_policy = Policy::custom(move |attempt| {
| ^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_redirect_policy`
|
= note: `#[warn(unused_variables)]` on by default
warning: `tectonic_geturl` (lib) generated 1 warning
warning: variable does not need to be mutable
--> src/bin/geturl.rs:9:13
|
9 | let mut response = backend.get_url(&url, &mut status).unwrap();
| ----^^^^^^^^
| |
| help: remove this `mut`
|
= note: `#[warn(unused_mut)]` on by default
warning: `tectonic_geturl` (bin "geturl") generated 1 warning
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
Running `target/debug/geturl 'https://catboost.ai'`
get_url: https://catboost.ai
response: Response { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("catboost.ai")), port: None, path: "/", query: None, fragment: None }, status: 200, headers: {"content-length": "79367", "content-security-policy": "default-src 'none'; script-src 'unsafe-eval' 'unsafe-inline' 'nonce-/lIFeLdebbwR139V24X0Xg==' mc.yandex.ru social.yandex.ru yastatic.net; style-src 'unsafe-inline' mc.yandex.ru yastatic.net; img-src 'self' data: avatars.yandex.net avatars.mds.yandex.net avatars.mdst.yandex.net mc.yandex.ru ext.captcha.yandex.net yastatic.net; connect-src 'self' mc.yandex.ru; frame-src www.youtube.com video.yandex.ru player.video.yandex.net; media-src ext.captcha.yandex.net; font-src yastatic.net; report-uri https://csp.yandex.net/csp?from=promo-catboost-2017&yandex_login=undefined&yandexuid=undefined;", "content-type": "text/html; charset=utf-8", "date": "Fri, 28 Oct 2022 19:51:08 GMT", "x-content-type-options": "nosniff", "x-frame-options": "DENY", "x-xss-protection": "1; mode=block"} }
status: 200 OK
root@e4e91d5d237d:/geturl-test# |
Dang. It works. 😞 That means if we integrate that backend into lychee it would solve your issue.
|
I don't know the answer to the second question. For the first one, I suggest testing it on other related issues where browsers are able to open a URL but curl and lychee are not.
|
Good idea. Did some tests but haven't found other use-cases yet where this fixes the problem. I'll keep this on hold until then. |
Similar case that got resolved with browser impersonation: Thinking about adding reqwest-impersonate as a fallback to handle such cases. |
Hi @mre, Is there any update regarding the issue❔ |
No updates, but if I find the time I will create a pull request to integrate reqwest impersonate as a fallback backend. It will be an optional library feature, but it will be enabled by default in the binary. |
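For anyone wondering what "fallback backend" could mean in practice, here is a minimal sketch, assuming a second client is only consulted when the first one gets a 403. The `Backend` trait, `check_with_fallback`, and the `Fixed` stub are all hypothetical and written for illustration; they do not mirror lychee's, reqwest's, or reqwest-impersonate's real APIs, and the stub returns canned status codes instead of doing any network I/O.

```rust
/// Minimal stand-in for an HTTP backend (hypothetical, for illustration).
trait Backend {
    /// Human-readable backend name, used for reporting.
    fn name(&self) -> &str;
    /// Returns the HTTP status code, or a description of a transport error.
    fn status(&self, url: &str) -> Result<u16, String>;
}

/// Try `primary` first and retry with `fallback` only on 403 Forbidden.
/// Returns the name of the backend that produced the final result,
/// together with that result.
fn check_with_fallback<'a>(
    primary: &'a dyn Backend,
    fallback: &'a dyn Backend,
    url: &str,
) -> (&'a str, Result<u16, String>) {
    match primary.status(url) {
        // Only a 403 triggers the more expensive fallback request.
        Ok(403) => (fallback.name(), fallback.status(url)),
        // Any other status or error is reported as-is from the primary.
        other => (primary.name(), other),
    }
}

/// Stub backend that always answers with a fixed status code.
struct Fixed {
    label: &'static str,
    code: u16,
}

impl Backend for Fixed {
    fn name(&self) -> &str {
        self.label
    }

    fn status(&self, _url: &str) -> Result<u16, String> {
        Ok(self.code)
    }
}
```

With a stub primary that always returns 403 and a stub fallback that returns 200, `check_with_fallback` reports the fallback's name and `Ok(200)`. Retrying only on 403 keeps the common path unchanged; whether other status codes should also trigger the fallback is an open design question.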
That's great. Looking forward to the release then. 🙂 |
Bad news. I wanted to integrate this, but I don't think it's possible right now. |
As much as I would like this to be part of lychee, I don't think there is an easy solution right now. |
Seems like there isn't much upstream traction, and it's not something we can fix on our side, so I'm gonna go ahead and close this. If the upstream issue gets fixed, we can reopen and integrate |
Suddenly lychee shows this error for valid links. But these links are valid and also accessible from the browser.
Contents of
work/ok.txt
Lychee version