[4chanarchives] 403 Forbidden Bug #4686

Closed
cheese529 opened this issue Oct 20, 2023 · 11 comments

Comments

@cheese529

gallery-dl was working perfectly to download from 4chanarchives ever since support for it was added back in #4012, but now, when trying to download from a board, you just get 403 Forbidden errors. Here is an example board to test with, along with my verbose log.

https://4chanarchives.com/board/hr/thread/2693240

https://gist.github.com/cheese529/abd49f5abb737bb16f3181cd37a6838c

@Hrxn
Contributor

Hrxn commented Oct 20, 2023

Ugh.. the images are hosted on Imgur, I guess this is why this fails.

Although it does not make sense to me, at the moment, because some links are working here in the browser (at least sometimes?)

This gives me also a HTTP 403 in the browser (incognito mode)

https://i.imgur.com/9FQY2Xy.jpg

This one works..

https://i.imgur.com/9FQY2Xy.jpg

😄

Doesn't really make any sense, not sure what triggers those 403s, I can't reproduce at the moment..

@mikf
Owner

mikf commented Oct 20, 2023

This happens because of the `referer` option added in v1.26.0.
Disable it with `-o referer=` or in your config with `"referer": false` and it'll work again.

Imgur doesn't like its files referred to from a non-imgur site, it seems.
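
For anyone who wants to apply this per site rather than globally, here is a minimal sketch of the config fragment. It assumes the extractor category is named `4chanarchives` (taken from the issue title) and that the option sits under the usual `extractor` section of the gallery-dl config. The helper name `referer_override` is made up for illustration.

```python
import json


def referer_override(category: str) -> dict:
    """Build a config fragment that turns off the Referer header
    for a single extractor category (hypothetical helper)."""
    return {"extractor": {category: {"referer": False}}}


# "4chanarchives" is assumed to be the category name used by the extractor.
config = referer_override("4chanarchives")

# Print the fragment as JSON so it can be merged into an existing config file.
print(json.dumps(config, indent=4))
```

The printed JSON is only the relevant fragment, so merge it into your existing config rather than replacing the whole file.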

@cheese529
Author

> Ugh.. the images are hosted on Imgur, I guess this is why this fails.

The images that are still available on imgur still fail tho. This board is mostly SFW stuff and therefore a lot of these are still available.

> Although it does not make sense to me, at the moment, because some links are working here in the browser (at least sometimes?)

Yep I am having the exact same problem. It seems that if you click the link you get a 403 but if you click the actual image it will work.

@cheese529
Author

> This happens because of the `referer` option added in v1.26.0. Disable it with `-o referer=` or in your config with `"referer": false` and it'll work again.
>
> Imgur doesn't like its files referred to from a non-imgur site, it seems.

Ahhh okay that explains everything, thank you for this. Does this also mean we need to disable the referer option on reddit too, since that is a non-imgur site?

@mikf
Owner

mikf commented Oct 20, 2023

> Does this also mean we need to disable the referer option on reddit too, since that is a non-imgur site?

I guess it wouldn't hurt, but reddit handles imgur links indirectly (it spawns an ImgurExtractor that does the "right" thing and uses its own headers and options), while 4chanarchives downloads from these links directly.

@cheese529
Author

> it spawns an ImgurExtractor that does the "right" thing and uses its own headers and options

Ahhh well that makes perfect sense. I guess I'll test with both to see if it changes anything.

BTW is {num} not supported for the filename? I currently have it set as "{filename}_{num}.{extension}", so in case it runs into a duplicate filename it will download with a number at the end but instead I am just getting "None" at the end for all images.

@Hrxn
Contributor

Hrxn commented Oct 20, 2023

Ah yeah, I see it now...

<a target="_blank" href="https://i.imgur.com/8dfUO96.jpg">TaylorSwift-2008MTV-VMA-Arrivals_Vettri.Net-38.jpg</a>

<a target="_blank" href="https://i.imgur.com/8dfUO96.jpg" class="fileThumb" rel="noreferrer">
                     
</a>

(I removed the content inside of the second anchor tag)

> Yep I am having the exact same problem. It seems that if you click the link you get a 403 but if you click the actual image it will work.

That is why.

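`rel="noreferrer"` on a link works just like `referrerpolicy="no-referrer"`: the browser sends no Referer header when you follow it, which is why the thumbnail link loads and the plain filename link gets a 403.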
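
To check which links on a page suppress the referrer, something like the following sketch could be used. It relies only on Python's standard `html.parser`. The class and function names are invented for this example, and the snippet is a shortened copy of the two anchors above with the link text replaced by a placeholder.

```python
from html.parser import HTMLParser


class AnchorRelParser(HTMLParser):
    """Collect (href, has_noreferrer) for every <a> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; a missing rel counts as empty.
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.anchors.append((attr.get("href"), "noreferrer" in rel_tokens))


def anchors_with_referrer_flag(html: str) -> list:
    """Return a list of (href, has_noreferrer) tuples for the given HTML."""
    parser = AnchorRelParser()
    parser.feed(html)
    parser.close()
    return parser.anchors


# Shortened copy of the two anchors quoted in this thread.
SNIPPET = """
<a target="_blank" href="https://i.imgur.com/8dfUO96.jpg">name.jpg</a>
<a target="_blank" href="https://i.imgur.com/8dfUO96.jpg" class="fileThumb" rel="noreferrer"></a>
"""

for href, no_referrer in anchors_with_referrer_flag(SNIPPET):
    print(href, "-> no Referer sent" if no_referrer else "-> Referer sent")
```

Both anchors point at the same file, and only the `rel` attribute differs between them.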

@Hrxn
Contributor

Hrxn commented Oct 20, 2023

> > it spawns an ImgurExtractor that does the "right" thing and uses its own headers and options
>
> Ahhh well that makes perfect sense. I guess I'll test with both to see if it changes anything.
>
> BTW is {num} not supported for the filename? I currently have it set as "{filename}_{num}.{extension}", so in case it runs into a duplicate filename it will download with a number at the end but instead I am just getting "None" at the end for all images.

Use {no} instead of {num} here...

Shows the age of this extractor, I think..

@mikf
Owner

mikf commented Oct 20, 2023

> Shows the age of this extractor, I think..

Support for this site was added this year :) (1406f71)
It's just that it tries to replicate the metadata names used by all the other *chan extractors.
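
To illustrate why `{num}` ends up as "None" while `{no}` works, here is a rough sketch. It assumes a missing metadata field is rendered as `None`, which matches the behaviour reported above. It is not gallery-dl's actual formatter, and the class name and metadata values are invented.

```python
import string


class NoneDefaultFormatter(string.Formatter):
    """Formatter that substitutes None for unknown field names
    (a stand-in for the behaviour described in this thread)."""

    def get_value(self, key, args, kwargs):
        if isinstance(key, str):
            # Unknown names yield None instead of raising KeyError.
            return kwargs.get(key)
        return super().get_value(key, args, kwargs)


# Hypothetical metadata for one file; only "no" exists, there is no "num".
metadata = {"filename": "example", "no": 2693241, "extension": "jpg"}

fmt = NoneDefaultFormatter()
with_num = fmt.format("{filename}_{num}.{extension}", **metadata)
with_no = fmt.format("{filename}_{no}.{extension}", **metadata)

print(with_num)  # example_None.jpg
print(with_no)   # example_2693241.jpg
```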

@Hrxn
Contributor

Hrxn commented Oct 20, 2023

Should have taken a look before posting :D

@cheese529
Author

> Use {no} instead of {num} here...

This seems to work. Thank you both once again very much for all your help :)


3 participants