DeviantArt download fails after few images #5363

Closed
UnforeseenOcean opened this issue Mar 23, 2024 · 9 comments

@UnforeseenOcean

The first 20 or so images seem to download properly, but after that every request fails with a JSON error. The log from a run with the -v switch is below.

[1/2] https://www.deviantart.com/nahelus
[gallery-dl][debug] Starting DownloadJob for 'https://www.deviantart.com/nahelus'
[deviantart][debug] Using DeviantartUserExtractor for 'https://www.deviantart.com/nahelus'
[deviantart][debug] Using DeviantartGalleryExtractor for 'https://www.deviantart.com/nahelus/gallery'
[deviantart][debug] Using custom API credentials (client-id 14376)
[deviantart][debug] Sleeping 1.00 seconds (api)
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/user/profile/nahelus HTTP/1.1" 403 919
[deviantart][error] Unable to download data:  JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[deviantart][debug]
Traceback (most recent call last):
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\gallery_dl\job.py", line 127, in run
    for msg in extractor:
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\gallery_dl\extractor\deviantart.py", line 103, in items
    profile = self.api.user_profile(self.user)
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\gallery_dl\cache.py", line 34, in __call__
    value = self.cache[key] = self.func(*args, **kwargs)
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\gallery_dl\extractor\deviantart.py", line 1281, in user_profile
    return self._call(endpoint, fatal=False)
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\gallery_dl\extractor\deviantart.py", line 1360, in _call
    data = response.json()
  File "C:\Users\Torbjorn\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
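
The log above shows a 403 response immediately followed by a JSONDecodeError at char 0, which is consistent with a non-JSON body being handed to a JSON parser. A minimal sketch of that failure mode, using only the standard library: the `html_body` string and the `parse_api_response` helper are hypothetical stand-ins for illustration, not the actual response or gallery-dl code.

```python
import json

# Hypothetical stand-in for a 403 response body; the real body is not in this log.
html_body = "<html><body>Request blocked.</body></html>"


def parse_api_response(status_code, body):
    """Return the decoded JSON value, or None when the body is not JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML page fails on its very first character.
        print(f"HTTP {status_code}: body is not JSON ({exc})")
        return None


result = parse_api_response(403, html_body)
```

Checking the status code or catching the decode error before using the data is one way to turn the traceback into a readable message.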
@defaultd3f4lt

Same here. Here's my log:

[deviantart][debug]
Traceback (most recent call last):
  File "requests\models.pyc", line 971, in json
  File "json\__init__.pyc", line 357, in loads
  File "json\decoder.pyc", line 337, in decode
  File "json\decoder.pyc", line 355, in raw_decode
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "gallery_dl\job.pyc", line 127, in run
  File "gallery_dl\extractor\deviantart.pyc", line 103, in items
  File "gallery_dl\cache.pyc", line 34, in __call__
  File "gallery_dl\extractor\deviantart.pyc", line 1281, in user_profile
  File "gallery_dl\extractor\deviantart.pyc", line 1360, in _call
  File "requests\models.pyc", line 975, in json
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

@TheMerricat

Having the same issue as well.

@mikf
Owner

mikf commented Mar 23, 2024

Seems like one now gets blocked for doing too many requests in too short a time:

ERROR: The request could not be satisfied

Request blocked.

We can't connect to the server for this app or website at this time.
There might be too much traffic or a configuration error.
Try again later, or contact the app or website owner.

If you provide content to customers through CloudFront,
you can find steps to troubleshoot and help prevent this error
by reviewing the CloudFront documentation.

Generated by cloudfront (CloudFront)

Using --sleep-request or a somewhat higher wait-min (3 seconds?) might help once your IP / client ID is no longer blocked, which seems to happen after only a few minutes.
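
As a rough sketch of what that could look like in a config file: the snippet below builds the settings as a Python dict and prints them as JSON. The option names `sleep-request` and `wait-min` are the ones mentioned above; placing them under `extractor` / `deviantart` is assumed from gallery-dl's usual config layout, and the 3-second values are only the guess from this comment, not tested settings.

```python
import json

# Assumed values: 3 seconds is the figure floated above, not a verified threshold.
config = {
    "extractor": {
        "deviantart": {
            "sleep-request": 3.0,  # pause between HTTP requests
            "wait-min": 3,         # minimum wait before API requests
        }
    }
}

# Print the JSON that would go into the config file.
config_text = json.dumps(config, indent=4)
print(config_text)
```

On the command line, the rough equivalent of the first setting would be passing `--sleep-request 3` along with the URL.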

@Twi-Hard

Twi-Hard commented Mar 23, 2024

I'm getting two of these errors at a time, and each error skips to the next extractor, which is really bad. In the log below you can see it go from "gallery" to "journal" to "scraps" in about a second, so none of the accounts I'm trying to download get fully downloaded: the extractor stops as soon as it hits this error. Is it possible to make gallery-dl retry what it failed to download after the rate limit ends?

[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/55883A45-DA56-7A60-A761-AE7BA49960D0?mature_content=true HTTP/1.1" 403 919
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/deviation/download/55883A45-DA56-7A60-A761-AE7BA49960D0?mature_content=true HTTP/1.1" 403 919
[deviantart][error] Unable to download data:  JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[deviantart][debug]
Traceback (most recent call last):
  File "/home/mane/.local/lib/python3.10/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/job.py", line 127, in run
    for msg in extractor:
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 144, in items
    content = self._extract_content(deviation)
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 347, in _extract_content
    self._update_content(deviation, content)
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 397, in _update_content_default
    data = self.api.deviation_download(deviation["deviationid"], public)
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 1299, in deviation_download
    return self._call(endpoint, params=params, public=False)
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 1418, in _call
    data = response.json()
  File "/home/mane/.local/lib/python3.10/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[deviantart][debug] Using DeviantartJournalExtractor for 'https://www.deviantart.com/8bitlola/posts'
[deviantart][debug] Using custom API credentials (client-id 11034)
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET /api/v1/oauth2/browse/user/journals?username=8BitLola&offset=0&limit=50&mature_content=true&featured=false HTTP/1.1" 403 919
[deviantart][error] Unable to download data:  JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[deviantart][debug]
Traceback (most recent call last):
  File "/home/mane/.local/lib/python3.10/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/job.py", line 127, in run
    for msg in extractor:
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 115, in items
    for deviation in self.deviations():
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 1477, in _pagination
    data = self._call(endpoint, params=params, public=public)
  File "/home/mane/.local/lib/python3.10/site-packages/gallery_dl/extractor/deviantart.py", line 1418, in _call
    data = response.json()
  File "/home/mane/.local/lib/python3.10/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[deviantart][debug] Using DeviantartScrapsExtractor for 'https://www.deviantart.com/8bitlola/gallery/scraps'
[deviantart][debug] Using custom API credentials (client-id 11034)
[urllib3.connectionpool][debug] https://www.deviantart.com:443 "GET / HTTP/1.1" 403 919
[deviantart][info] Waiting until 12:50:21 for rate limit reset.

mikf added a commit that referenced this issue Mar 23, 2024
This was already done for non-OAuth requests (#655)
but CF is now blocking OAuth API requests as well.
@mikf mikf pinned this issue Mar 23, 2024
@mikf
Owner

mikf commented Mar 23, 2024

These errors now get properly handled and downloading gets resumed after 5 minutes (waiting for 3 minutes is not long enough). (925123e)

CloudFront blocking requests to DeviantArt already happened quite some time ago (#655), but back then it only affected non-API requests. It seems the blocking has now been extended to all HTTP requests, including OAuth API ones.

This fix is included in the latest v1.26.9 release, by the way.
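
To illustrate the general idea of "wait and resume" described above, here is a minimal sketch. It is not the code from 925123e: `call_with_block_retry`, `fetch`, and `max_attempts` are hypothetical names, and the only value taken from this thread is the 5-minute wait.

```python
import json
import time

BLOCK_WAIT_SECONDS = 300  # 5 minutes, the wait time mentioned above


def call_with_block_retry(fetch, wait=time.sleep, max_attempts=3):
    """Call fetch() until its body decodes as JSON or attempts run out.

    fetch is a hypothetical callable returning (status_code, body_text).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        try:
            return json.loads(body)
        except json.JSONDecodeError:
            # Only a non-JSON 403 is treated as a block; anything else is re-raised.
            if status != 403 or attempt == max_attempts:
                raise
            wait(BLOCK_WAIT_SECONDS)


# Simulated run: one blocked response, then a normal one. No real sleeping happens
# because the wait function is replaced by a list append.
responses = iter([(403, "<html>Request blocked.</html>"), (200, '{"ok": true}')])
waits = []
data = call_with_block_retry(lambda: next(responses), wait=waits.append)
```

Passing the wait function in as a parameter keeps the sketch testable without actually pausing for five minutes.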

@mikf mikf closed this as completed Mar 23, 2024
@Twi-Hard

Twi-Hard commented Mar 24, 2024

Do you think they might change it if somebody pointed out that CloudFront is now doing this to the API too? It might have been unintentional.

@TheMerricat

TheMerricat commented Mar 24, 2024 via email

@mikf
Owner

mikf commented Mar 25, 2024

I'd think this was a deliberate decision by DeviantArt, given that CloudFront's services cost money. They wouldn't accidentally extend CF's protection without doing some sort of cost-benefit analysis beforehand, or would they?

You might as well give it a try. Maybe it works, especially if this whole thing has negative effects on their mobile users and otherwise.

JackTildeD added a commit to JackTildeD/gallery-dl-forked that referenced this issue Apr 24, 2024
* [deviantart] handle CloudFront blocks in general (mikf#5363)

This was already done for non-OAuth requests (mikf#655)
but CF is now blocking OAuth API requests as well.

@mikf mikf unpinned this issue Apr 25, 2024
@Twi-Hard

Twi-Hard commented May 16, 2024

I accidentally used an out-of-date snapshot of my scripts folder from before they started this rate limiting, so it doesn't include any "sleep" options. It's downloading thousands of images at full speed (very fast) with all possible metadata/comments, and there's been no rate limiting at all. To test it, I just downloaded an account with 5,400 images (5796 non-json files in 25m52s) and nothing slowed it down. The rate limiting seems to have been fixed on their end. I'm on the most recent dev version.
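
For scale, the figures in this comment work out to roughly 3.7 files per second. The short calculation below shows that, plus a comparison against a 3-second delay per file. The comparison assumes exactly one throttled request per file, which is a simplification and not something measured in this thread.

```python
files = 5796
elapsed_seconds = 25 * 60 + 52  # 25m52s from the comment above

# Observed throughput without any sleep options.
rate = files / elapsed_seconds
print(f"{rate:.2f} files per second")

# Assumption: one request per file, each delayed by 3 seconds.
throttled_seconds = files * 3
print(f"{throttled_seconds / 3600:.1f} hours at 3 s per file")
```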
