[YouTube] Fix hashtags links extraction and escape HTML links #1032
The `webCommandMetadata` object is contained inside a `commandMetadata` one, so it is not accessible from the root of the `navigationEndpoint` object.

The corresponding statement has been moved to the bottom of the specific endpoints parsing because the `webCommandMetadata` object is present almost everywhere; otherwise, the URLs of some endpoints would have changed, such as uploader URLs (from channel IDs to handles).

As `getUrlFromNavigationEndpoint` no longer throws a `ParsingException`, and therefore neither do `getTextFromObject`, `getUrlFromObject` and `getTextAtKey`, the methods which were catching `ParsingException`s thrown by these methods had to be updated.

URLs obtained in the HTML version of `getTextFromObject` are now properly escaped to provide valid HTML to clients. This has also been done for attribute descriptions, together with the description text for this type of description.

As YouTube descriptions are in HTML format (except for the fallback on the JSON player response, which is plain text and is only used when there is no visual metadata or after a breaking change), the URLs returned are escaped, so tests which check for the presence of URLs containing escaped characters had to be updated (this was only the case for `YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing`).

I've also updated the mocks of two test classes of `YoutubeCommentsExtractorTest` which were missing, in order to fully test my changes: `RepliesTest` and `FormattingTest`.

Fixes #1019 (for real this time)
Related issue: TeamNewPipe/NewPipe#9774
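
For readers unfamiliar with the JSON layout, here is a minimal, self-contained sketch of the two ideas described above: reading `webCommandMetadata` through its parent `commandMetadata` object, and escaping a URL before placing it in an HTML link. It is not the code from this PR. The class `EndpointSketch`, its helper methods and the `url` key are assumptions made for illustration, and nested `Map`s stand in for the parsed JSON objects.

```java
import java.util.Map;

final class EndpointSketch {

    // Hypothetical helper: looks up commandMetadata -> webCommandMetadata -> url.
    // Returns null instead of throwing when any level is missing.
    static String webCommandUrl(Map<String, Object> navigationEndpoint) {
        Object commandMetadata = navigationEndpoint.get("commandMetadata");
        if (!(commandMetadata instanceof Map)) {
            return null;
        }
        Object web = ((Map<?, ?>) commandMetadata).get("webCommandMetadata");
        if (!(web instanceof Map)) {
            return null;
        }
        // The "url" key is an assumption for this sketch.
        Object url = ((Map<?, ?>) web).get("url");
        return url instanceof String ? (String) url : null;
    }

    // Escapes the characters that are significant in HTML text and attributes.
    static String escapeHtml(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds an anchor element with both the URL and the link text escaped.
    static String toHtmlLink(String url, String text) {
        return "<a href=\"" + escapeHtml(url) + "\">" + escapeHtml(text) + "</a>";
    }

    public static void main(String[] args) {
        Map<String, Object> endpoint = Map.of(
                "commandMetadata", Map.of(
                        "webCommandMetadata", Map.of("url", "/hashtag/example")));

        // The object is not at the root of the endpoint, only under commandMetadata.
        System.out.println(endpoint.containsKey("webCommandMetadata")); // false
        System.out.println(webCommandUrl(endpoint)); // /hashtag/example

        // The ampersand in the query string becomes &amp; in the generated HTML.
        System.out.println(toHtmlLink("https://www.youtube.com/watch?v=abc&t=10", "a & b"));
    }
}
```

Returning `null` rather than throwing mirrors the change described above, where callers no longer need to catch a `ParsingException`. Whether the real methods return `null` or another fallback value is not stated here.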