[Invidious] Add new extractor #31426
base: master
Thanks, but I doubt that this is a good solution. The existing YT extractor knows about a whole lot of Invidious instances. I believe that your problem is just that the list of instances in See also #29885. The discussion there is now really of historical interest, though (and also the linked PR) because yt-dlp has now implemented a page-based extraction system in the generic extractor to handle these cases (Invidious, PeerTube, etc). yt-dl will eventually pull this in instead of the original PR, so as to maximise commonality and avoid incompatible reinvention. |
It seems like the youtube extractor sends at least one request to youtube.
This behavior can be problematic in environments with limited access to youtube itself. |
But then won't the download (googlevideo.com) URLs also be inaccessible? |
In my case, no. |
Normally a benefit of yt-dl vs using the YT web interface is to avoid the odious bloat of the latter while being able to capture a lot of the detailed metadata that comes with it. If a user who has YT access wants yt-dl to process an Invidious page, going to YT directly can give a better result because (AFAIK) less rich metadata is available on the IV page. The API is another matter, but reproducing the details of the YT extractor using the IV API would be a massive task. But if YT is blocked for the user, it would plainly be better to use the IV page instead. The problem is how to combine these tactics. For one-off uses, the IV page may have a download function, but yt-dl users are going to want a batchable solution. And all these considerations apply equally for other YT front-ends, which seem to be proliferating. |
IMHO, when a user gives yt-dl an invidious link, he/she probably wants to download from invidious servers. |
If this were a completely new feature, I would agree. But we have been auto-translating Invidious links to YouTube for a long time. This means many users would be expecting to get all the metadata YouTube provides even with an Invidious URL. Having the new extractor return less data is a regression. Perhaps a |
Maybe, as there are other front-end sites for which the same issue could arise, we should introduce an option like This might also apply where a site has links and metadata in the page but could also use some API URL(s) for more metadata and formats, whether to avoid blocked URLs or increase extraction speed. |
If that were the case, why would they use an invidious url?
I struggle to see how performing the expected behaviour is a regression. Invidious is always going to be worse than youtube, but that doesn't mean people who pass invidious urls expect their urls to be silently converted to youtube urls
That seems reasonable, although I think there should be a warning if someone passes an invidious url with neither option, and people can silence that warning by explicitly using
Wanting to avoid Google feels like a completely different use-case to not wanting to download from the API of the website you're using. |
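The option-plus-warning idea discussed in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions, not youtube-dl code: the function `resolve_source`, the `prefer` parameter, and the hosts in `KNOWN_FRONTENDS` are all hypothetical, and defaulting to the existing translate-to-YouTube behaviour when no preference is given is one possible choice among several.

```python
import warnings
from urllib.parse import urlparse, parse_qs

# Hypothetical placeholder hosts; a real instance list is not reproduced here.
KNOWN_FRONTENDS = {'invidious.example', 'yewtu.example'}


def resolve_source(url, prefer=None):
    """Decide where to extract from for a given URL.

    prefer: 'frontend', 'youtube', or None (no explicit user choice).
    Returns a (source, url) tuple.
    """
    parsed = urlparse(url)
    if parsed.hostname not in KNOWN_FRONTENDS:
        # Not a front-end URL we know about: leave it untouched.
        return 'other', url
    if prefer == 'frontend':
        # The user explicitly asked to stay on the front-end instance.
        return 'frontend', url
    if prefer is None:
        # Assumed default: keep the long-standing translation, but say so.
        warnings.warn(
            'Front-end URL given without an explicit preference; '
            'translating to YouTube')
    video_id = parse_qs(parsed.query).get('v', [None])[0]
    if video_id is None:
        # No video ID to translate; fall back to the front-end page.
        return 'frontend', url
    return 'youtube', 'https://www.youtube.com/watch?v=' + video_id
```

Passing `prefer='youtube'` or `prefer='frontend'` explicitly silences the warning, which matches the suggestion that users should be able to opt in to either behaviour.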
If this is an issue (and imo it's not) perhaps the new invidious extractor should be limited to new instances (that aren't in youtube.py) |
Recently I wanted to download a video from one of the Invidious servers. I was very surprised when it redirected to YouTube. :)
|
@gamer191 Invidious will always return less data than YouTube, regardless of which version of Invidious you use. It also doesn't support things like multiple audio tracks and subtitle translation (they have to use the Innertube transcript API endpoint, which doesn't support translating, and convert the response to WebVTT, as the publicly listed instances get ratelimited on YouTube's subtitle endpoint). The format list |
Consider the two use cases:
Since the second case was trivial, if tiresome, to support, that's what happened. Arguably the first case should have been given priority, since it would have supported users who need (or want) Invidious to act as a proxy, and so are content with whatever limitations that implies. |
I do agree that passing an Invidious URL should download from Invidious, I just wanted to point out that there is significantly less usable metadata than you might have thought at first. So you'll either have to decide to show the incorrect metadata that Invidious returns or not show it at all; in either case you are likely to get user complaints. My point is that while the change does seem like a good idea, it will be a breaking change, which you'll want to mention clearly in the changelog and potentially even log a warning message for a while. Think of it from a user's perspective: if you upgrade youtube-dl and suddenly your format filter/selector no longer works, because height, width and fps are not available or completely incorrect, you would want to be clearly informed during downloading why that is happening. |
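To make the format-filter concern concrete, here is a small sketch of a height-based filter applied to format lists with and without height metadata. It is a simplified stand-in, not youtube-dl's format selector: `filter_by_max_height` and the sample format dictionaries are hypothetical, and the field names are chosen only to mirror the height field mentioned above.

```python
def filter_by_max_height(formats, max_height):
    """Split formats into those that pass a max-height filter and those
    that cannot be judged because 'height' is missing.

    Returns (kept, unknown). Formats taller than max_height are dropped.
    """
    kept, unknown = [], []
    for fmt in formats:
        height = fmt.get('height')
        if height is None:
            # No metadata: the filter cannot say yes or no.
            unknown.append(fmt)
        elif height <= max_height:
            kept.append(fmt)
    return kept, unknown


# Hypothetical data: the same two streams, with and without height info.
rich_formats = [
    {'format_id': 'a', 'height': 1080},
    {'format_id': 'b', 'height': 720},
]
sparse_formats = [
    {'format_id': 'a'},
    {'format_id': 'b'},
]

kept_rich, unknown_rich = filter_by_max_height(rich_formats, 720)
kept_sparse, unknown_sparse = filter_by_max_height(sparse_formats, 720)

if not kept_sparse and unknown_sparse:
    # This is the situation where a clear message to the user would help.
    print('No format matched: height is unavailable for %d format(s)'
          % len(unknown_sparse))
```

With the richer list the filter keeps one format; with the sparser list it keeps none, and the only useful thing the tool can do is report why.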
Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:
What is the purpose of your pull request?
Description of your pull request and other information
Add a new extractor that can download from Invidious instances, since the YouTube extractor isn't able to download from Invidious correctly.