Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal error messages should remind me about 429 errors. #32706

Open
3 tasks done
mk-pmb opened this issue Jan 27, 2024 · 15 comments
Open
3 tasks done

Fatal error messages should remind me about 429 errors. #32706

mk-pmb opened this issue Jan 27, 2024 · 15 comments
Labels

Comments

@mk-pmb
Copy link
Contributor

mk-pmb commented Jan 27, 2024

Checklist

  • I'm reporting a feature request
  • I've verified that I'm running youtube-dl version 2021.12.17
  • I've searched the bugtracker for similar feature requests including closed ones

Description

I put too much effort into trying to report what seemed to be a bug, only to then discover in the log that there was a "HTTP Error 429: Too Many Requests" earlier. Can we make it so that any non-fatal 429 sets a kind of "tainted" flag and have all subsequent fatal errors start their message with something like "This might be caused by an earlier error: HTTP Error 429: Too Many Requests –"? That way I'd instantly see I can ignore the "please report this bug" part.

@mk-pmb mk-pmb added the request label Jan 27, 2024
@dirkf
Copy link
Contributor

dirkf commented Jan 27, 2024

This seems like a case where the non-fatal error shouldn't have been, but you'd have to provide more details, as in the bug report template(s).

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Jan 28, 2024

Not a bug because the problem is that my proxy's IP is temporarily blacklisted due to too other users' behaviors. It works via another proxy.

Verbose log
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--no-call-home', '--abort-on-error', '--no-overwrites', '--keep-video', '--fixup=warn', '--prefer-ffmpeg', '--verbose', '--', 'http://youtu.be/ZQeSeYNEwHM']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: dc512e3a8
[debug] Python 3.8.10 (CPython x86_64 64bit) - ####
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7
[debug] Proxy map: {####}
[youtube] ZQeSeYNEwHM: Downloading webpage
WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests
[youtube] ZQeSeYNEwHM: Downloading API JSON
ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/####/youtube_dl/YoutubeDL.py", line 859, in wrapper
    return func(self, *args, **kwargs)
  File "/####/youtube_dl/YoutubeDL.py", line 955, in __extract_info
    ie_result = ie.extract(url)
  File "/####/youtube_dl/extractor/common.py", line 565, in extract
    ie_result = self._real_extract(url)
  File "/####/youtube_dl/extractor/youtube.py", line 2108, in _real_extract
    self._unthrottle_format_urls(video_id, player_url, dct)
  File "/####/youtube_dl/extractor/youtube.py", line 1725, in _unthrottle_format_urls
    n_response = decrypt_nsig(n_param)(n_param, video_id, player_url)
  File "/####/youtube_dl/extractor/youtube.py", line 1614, in inner
    raise ret
  File "/####/youtube_dl/extractor/youtube.py", line 1606, in inner
    self._player_cache[cache_id] = func(*args, **kwargs)
  File "/####/youtube_dl/extractor/youtube.py", line 1634, in _decrypt_nsig
    raise ExtractorError('Cannot decrypt nsig without player_url')
youtube_dl.utils.ExtractorError: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.


E: error type: 'DownloadError' msg: 'ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.'
(2, 'DownloadError', ('ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.',))

The Unable to download webpage message exists in

youtube_dl/extractor/common.py:687:                errnote = 'Unable to download webpage'
youtube_dl/extractor/itv.py:51:            errnote = 'Unable to download webpage'

so probably common.py should be the place to set the taint flag (probably in self._downloader.report_warning), and the other should be refactored to use a common(.py) annotation function.

@dirkf
Copy link
Contributor

dirkf commented Jan 28, 2024

The ITV extractor probably fails anyway, and the default errnote didn't need to be re-specified.

You would have to maintain a list of (host, port) that had experienced 429. A good place to set that up might be in _request_webpage(). But I reiterate the original point. Why is this error being treated as non-fatal, if the result of not receiving a valid response is a later error? Let's see the log.

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Jan 31, 2024

Let's see the log.

Have you seen the verbose log in the details tag? I wish GitHub would make their "click to open" arrow a bit more visible. Or what more log would you need?

@dirkf
Copy link
Contributor

dirkf commented Jan 31, 2024

AHA, sorry. In this case, the YT webpage failed with 429 but yt-dl was able to fetch the JSON which is supposed to be enough to get the video, though not to resolve the n-sig, or get the full metadata.

Which actual code are you running there? The logic was updated in the most recent nightly. One possibility, without checking, is that the failure during n-sig was introduced, or not removed; another is that it more properly proceeds with a warning.

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Jan 31, 2024

[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: dc512e3

iirc I had pulled before reproducing for the log. If it happens again, what version information should I collect?

@dirkf
Copy link
Contributor

dirkf commented Feb 1, 2024

Yes, so I think this patch should be applied to treat the first failure in the routine in the same way as the later cases:

--- old/youtube-dl/youtube_dl/extractor/youtube.py
+++ new/youtube-dl/youtube_dl/extractor/youtube.py
@@ -1631,7 +1631,9 @@
     def _decrypt_nsig(self, n, video_id, player_url):
         """Turn the encrypted n field into a working signature"""
         if player_url is None:
-            raise ExtractorError('Cannot decrypt nsig without player_url')
+            self.report_warning(
+                'Cannot decrypt nsig without player_url', video_id)
+            return
 
         try:
             jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url)
@@ -1646,11 +1648,9 @@
             ret = extract_nsig(jsi, func_code)(n)
         except JSInterpreter.Exception as e:
             self.report_warning(
-                '%s (%s %s)' % (
-                    self.__ie_msg(
-                        'Unable to decode n-parameter: download likely to be throttled'),
-                    error_to_compat_str(e),
-                    traceback.format_exc()))
+                'Unable to decode n-parameter: download likely to be throttled (%s %s)' % (
+                    error_to_compat_str(e), traceback.format_exc()),
+                video_id)
             return
 
         self.write_debug('Decrypted nsig {0} => {1}'.format(n, ret))

The second hunk is an unrelated clean-up!

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Feb 2, 2024

Thanks! I will try it.
Edit 2024-02-03: Error 429 vanished when applying the patch. Probably coincidence. I'll have to wait until it reappears.

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Feb 3, 2024

The patch works nicely. It seems to replace the one specific fatal error above with downloading anyway:

[youtube] ZQeSeYNEwHM: Downloading webpage
WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests
[youtube] ZQeSeYNEwHM: Downloading API JSON
WARNING: [youtube] ZQeSeYNEwHM: Cannot decrypt nsig without player_url
WARNING: [youtube] ZQeSeYNEwHM: Cannot decrypt nsig without player_url
[youtube] ZQeSeYNEwHM: Downloading API JSON
[info] Writing video description metadata as JSON to: …
[debug] Invoking downloader on 'https://…'
[download] Destination: Big_Buck_Bunny_animation_student.mp4

Which is good and we should get this into master. Thanks!

However, now without that error, I can't tell whether it would achieve the actual feature requested above. Would it annotate any other fatal errors that may occurr after a 429 error?

@dirkf
Copy link
Contributor

dirkf commented Feb 3, 2024

Such cases should be resolved in the same way, if they occur: (a) if the 429 will prevent extraction it should be an error; (b) if not it should be a warning, unless (c) exceptionally, the fallback tactics result in the same quality of extraction, where it might just be ignored or logged for debug. The problem here was poor logic such that although (b) should have applied the extraction failed anyway.

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Feb 3, 2024

I'm not entirely sure I understand it but I think you mean that even with a 429 error, fatal errors should still be reported as a bug, so the message saying so is actually correct. In that case, my feature request was misguided and you can just close it.

@dirkf
Copy link
Contributor

dirkf commented Feb 3, 2024

Yes, that was my point. The issue could be left open until I push the change (actually there's a slicker version now).

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Feb 4, 2024

I have a new case, which looks a LOT like yet another dupe of 429 + ERROR: No video formats found and as such, not worthy of reporting. However they all seem to have WARNING: unable to extract player URL (e.g. #32635#32630) or ERROR: Unable to extract yt initial data on playlist (e.g. #32499) which I don't have, so I'm not 100% sure.

It's not about the patch either, as I was able to download the same video earlier.

Verbose log (click to open)
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--no-call-home', '--abort-on-error', '--no-overwrites', '--keep-video', '--fixup=warn', '--prefer-ffmpeg', '--format=22', '--verbose', '--', '71WCJh7WKrM']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: dc512e3a8
[debug] Python 3.8.10 (CPython x86_64 64bit) - Linux-…-x86_64-with-glibc… - OpenSSL 1.1.1f  31 Mar 2020 - glibc …
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7
[debug] Proxy map: {…}
[youtube] 71WCJh7WKrM: Downloading webpage
WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests
[youtube] 71WCJh7WKrM: Downloading API JSON
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/…/youtube_dl/YoutubeDL.py", line 859, in wrapper: return func(self, *args, **kwargs)
  File "/…/youtube_dl/YoutubeDL.py", line 955, in __extract_info: ie_result = ie.extract(url)
  File "/…/youtube_dl/extractor/common.py", line 565, in extract: ie_result = self._real_extract(url)
  File "/…/youtube_dl/extractor/youtube.py", line 2221, in _real_extract: self._sort_formats(formats)
  File "/…/youtube_dl/extractor/common.py", line 1494, in _sort_formats: raise ExtractorError('No video formats found')
youtube_dl.utils.ExtractorError: No video formats found; please report this issue on […]

@dirkf
Copy link
Contributor

dirkf commented Feb 4, 2024

Well, #23638 applies.

In this case the JSON contains streaming formats but they only contain encrypted media links ("signature cipher"). The extractor needs the player_url to resolve these; that URL in turn is only available from the webpage that could not be downloaded. The code recognises this and skips those formats. If it didn't you would also get warning messages like this:

WARNING: [youtube] 71WCJh7WKrM: Signature extraction failed: Some formats may be missing

Would that be better? We now have the only_once parameter for warnings so the output needn't be spammed as it might have been when the signature processing was implemented.

@mk-pmb
Copy link
Contributor Author

mk-pmb commented Feb 4, 2024

Well, #23638 applies.

Yeah I read that, and am rather confident that the issue is redundantly not worth reporting, especially due to lack of reproducible console access. Which means we probably have a case where my original feature request is useful.

Would that be better?

Probably not. "Some formats may be missing" to me sounds like "YouTube doesn't like you(r proxy)", for which my default route of action would be to use the next proxy in my list. Thus, my use is actually twofold:

  1. Avoid the effort for manually re-running verbosely and reporting;
  2. Make it easier to guess from the final error whether I should try the next proxy. I already have a wrapper that communicates the final error message to my download manager, and have tried to use full log capture to also get previous 429 messages, but it's messy and has side effects. Having the indication in the fatal error would make it much easier.

Update 2024-05-15:
I investigated how I might be able to monkey-patch a flag into it. It seems there are three separate report_warning functions, with the one in youtube_dl/YoutubeDL.py being the relevant one. (The one in extractors seems to delegate to the downloader, and the one in the downloader seems to delegate to the main file.) Was a good opportunity to refactor it.

Based on the message, the 429 error in my verbose log must be from _request_webpage in extractor/common.py (called without errnote, thus equivalent to errnote=None), which repackages errors caught from self._downloader.urlopen. The latter is a hot candidate for what to monkey-decorate. Another approach would be to configure my own logger and match based on strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants