-
Notifications
You must be signed in to change notification settings - Fork 425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YouTube] Support more channel headers #1094
[YouTube] Support more channel headers #1094
Conversation
…rer channel headers The addition of this support required to turn the isCarouselHeader boolean into an enum containing all supported channel headers named HeaderType. Also assert that the page has been fetched where needed to avoid NullPointerExceptions when the channel page has been not fetched and remove the getChannelHeaderJson method in YoutubeChannelExtractor, method for which its code has been moved to its sole usage after the new headers support changes.
This test uses the Minecraft game topic channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I pushed a small commit simplifying Optional call chains, since Optional.map()
accepts null
values just fine (actually you did so in one place but in the other you used flatMap
with Optional.ofNullable
). I also included all cases in the switch
statements so that it is clear for which header kinds the default branch is taken.
.getArray("thumbnails") | ||
.getObject(0) | ||
.getString("url")) | ||
.filter(url -> !url.contains("s.ytimg.com") && !url.contains("default_banner")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you not returning a banner at all if the url contains s.ytimg.com
or default_banner
? This seems to date back to the initial commit of the extractor: see here. You seem to have removed this in the multiple images PR.
Also, the thumbnails array returned in the c4 header does not seem to contain such urls:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to keep the original behavior, that's the reason I didn't change this. Like you said, that's indeed useless.
I only used my test class on which no errors occurred during my tests and a temporary one to test a part of my changes (see why below).
This PR adds support of more YouTube channel headers, the
pageHeaderRenderer
andinteractiveTabbedHeaderRenderer
ones, fixing crashes when extracting channels with these headers.The first header is currently A/B tested (that's the reason why no test has been added it) on system topic channels such as the Gaming one and renders in the following way on the desktop web client:
It only returns the info you can see (channel's name and its avatar).
The second header is used for gaming topic channels and only provides the channel's name, its banner and a poster as its "avatar".
A new test in
YoutubeChannelExtractorTest
has been added for this header, which uses the Minecraft system topic channel (UCQvWX73GQygcwXOTSf_VDVg
).Even if the support of these headers has been added, the tabs of the (system) channels on which the headers are returned are not supported.
Another header is also returned when you try to access some system topic channels with an invalid tab for these channels, such as the Gaming one with the Videos tab (it looks like a bug on YouTube's side):
feedTabbedHeaderRenderer
, which is almost empty and only contains the channel name (the response doesn't seem to render anything on the website). Its support is not added with this PR.I also made a few other changes related to the headers support, which are described in the commits' message.