Improve HDoujin info.txt parsing #1053

HDoujinDownloader · 2024-08-13T09:16:20Z

Currently, tags are only extracted from the TAGS field for HDoujin's info.txt files. I've updated the plugin to extract tags from other fields as well (artist, series, language, parody, etc.), and namespace them accordingly.
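As a rough illustration of the idea (not the plugin's actual code, which is Perl), the field-to-namespace mapping could be sketched in Python as below. The `KEY: value` line layout, the comma-separated values, and the names `FIELD_NAMESPACES` and `parse_info_txt` are assumptions made for this example; only the field names come from the description above.

```python
# Hypothetical mapping from info.txt field names to tag namespaces.
# An empty namespace means the values are emitted as plain tags.
FIELD_NAMESPACES = {
    "ARTIST": "artist",
    "SERIES": "series",
    "LANGUAGE": "language",
    "PARODY": "parody",
    "TAGS": "",
}


def parse_info_txt(text):
    """Collect namespaced tags from 'KEY: value, value' lines (assumed layout)."""
    tags = []
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        namespace = FIELD_NAMESPACES.get(key.strip().upper())
        if namespace is None:
            continue  # field is not in the mapping (e.g. title, URL)
        for item in value.split(","):
            item = item.strip()
            if item:
                tags.append(f"{namespace}:{item}" if namespace else item)
    return tags
```

Fields that are not listed in the mapping are skipped, so a title or URL line would not end up as a tag in this sketch.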

Parse tags from additional fields

Difegue

LGTM overall - I think the tests should be updated to reflect the new summary/description parsing here. Thanks!

lib/LANraragi/Plugin/Metadata/Hdoujin.pm

Filter out irrelevant fields and support more JSON configurations

HDoujinDownloader · 2024-08-17T07:26:29Z

Thank you for the feedback!

I updated the JSON parser to read the summary and make it more consistent with the output from the TXT file parser. It was adding all the fields as tags (including titles and URLs), but I've limited it to a more relevant subset. I also updated it to work with different JSON configurations (the outer manga_info may or may not be present based on user settings). The namespace-related issues should be resolved now as well.
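A minimal Python sketch of the two JSON changes described here, assuming lowercase key names and a `description` key for the summary (both are guesses for illustration; the real plugin is written in Perl and its keys may differ). Only `manga_info` is taken from the comment above; `TAG_FIELDS` and `parse_info_json` are hypothetical names.

```python
import json

# Hypothetical allow-list: JSON keys that become tags, and their namespaces.
# Keys outside this list (titles, URLs, ...) are ignored.
TAG_FIELDS = {
    "artist": "artist",
    "series": "series",
    "language": "language",
    "parody": "parody",
    "tags": "",
}


def parse_info_json(raw):
    """Return tags and a summary from an info.json string (assumed layout)."""
    data = json.loads(raw)
    # The outer manga_info wrapper may or may not be present.
    info = data.get("manga_info", data)
    tags = []
    for key, namespace in TAG_FIELDS.items():
        value = info.get(key)
        if value is None:
            continue
        # Accept either a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        for item in values:
            item = str(item).strip()
            if item:
                tags.append(f"{namespace}:{item}" if namespace else item)
    return {"tags": tags, "summary": info.get("description", "")}
```

With this shape, a wrapped file and an unwrapped file holding the same fields produce the same result.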

> I think the tests should be updated to reflect the new summary/description parsing here

Correct me if I'm wrong, but it doesn't look like there are any tests for this format right now. I could possibly add some.

Difegue · 2024-08-18T22:57:03Z

Thanks! The JSON parser was a pretty old bit of code, so I'm not surprised it was worse than the txt version.
There are indeed no specific tests for the HDoujin plugin. Adding some with samples, like the other plugins have, would be welcome, but that's not blocking me from merging this in the meantime.

holopin-bot · 2024-08-18T22:57:29Z

Congratulations @HDoujinDownloader, the maintainer of this repository has issued you a holobyte! Here it is: https://holopin.io/holobyte/cm0063ggy18850clbr48u3ufu

This badge can only be claimed by you, so make sure that your GitHub account is linked to your Holopin account. You can manage those preferences here: https://holopin.io/account.
Or if you're new to Holopin, you can simply sign up with GitHub, which will do the trick!

Boontato · 2024-08-31T00:47:25Z

> I updated the JSON parser to read the summary and make it more consistent with the output from the TXT file parser. It was adding all the fields as tags (including titles and URLs), but I've limited it to a more relevant subset. I also updated it to work with different JSON configurations (the outer manga_info may or may not be present based on user settings). The namespace-related issues should be resolved now as well.

Thanks squiddy for working on this. I actually enjoyed that it would pull URLs: in mihon/tachi I could search for nhentai codes and they would resolve, because the URL was part of the tags. That was useful, at least for me.

When I saw this PR, I was hoping it would also let this plugin pull the title, because right now I'm using a secondary plugin just to pull title information from the metadata file.

HDoujinDownloader · 2024-08-31T01:11:43Z

@Boontato Oh! I didn't even notice plugins could specify a gallery title. I'll get that fixed and submit a new PR in a bit.

@Difegue What's your take on having URLs in the tags (e.g. url:https://nhentai.net/g/XXXXXX/)? If the use case is just being able to search by NHentai code, maybe there's a better way to do it.

Difegue · 2024-08-31T01:19:16Z

You should use source:nhentai.net/xxxx tags if you want to add URLs to the metadata; there's support for those in the browser extension and a few other spots.

Boontato · 2024-08-31T01:53:46Z

Yes, I have been using tag rules to convert the url namespace to the source namespace. Mihon also allowed specifying which namespace to use to pull the URL.
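For illustration only, the url-to-source conversion discussed above could look like the Python sketch below. It is not LANraragi's tag-rule syntax or the plugin's code; `url_tag_to_source` is a hypothetical helper, and dropping the scheme, query string, and trailing slash is an assumption about what a source tag should contain.

```python
from urllib.parse import urlsplit


def url_tag_to_source(tag):
    """Rewrite 'url:<address>' as 'source:<host><path>'; leave other tags alone."""
    namespace, sep, value = tag.partition(":")
    if not sep or namespace.strip().lower() != "url":
        return tag
    parts = urlsplit(value.strip())
    # Keep host and path only; the scheme and any query string are dropped.
    return "source:" + (parts.netloc + parts.path).rstrip("/")
```

A tag such as `url:https://nhentai.net/g/XXXXXX/` would become `source:nhentai.net/g/XXXXXX` under these assumptions, while tags in other namespaces pass through unchanged.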

Improve HDoujin info.txt parsing

1e8b180

Parse tags from additional fields

Difegue requested changes Aug 14, 2024

3 review comments on lib/LANraragi/Plugin/Metadata/Hdoujin.pm (outdated, resolved)

HDoujinDownloader added 2 commits August 16, 2024 22:32

Improve HDoujin info.json parsing

62fd9e1

Filter out irrelevant fields and support more JSON configurations

Update HDoujin plugin namespace

c5561ea

Rename Hdoujin.pm to HDoujin.pm

893d68e

Difegue approved these changes Aug 18, 2024

Difegue merged commit abf1ec5 into Difegue:dev Aug 18, 2024
1 check passed

Difegue added the holobyte label Aug 18, 2024

HDoujinDownloader mentioned this pull request Aug 31, 2024

Add title and source extraction for HDoujin info files #1068

Merged
