Filter more credit pages & move covers to start #742

Merged · 8 commits · Feb 5, 2023

Asinin3 (Contributor) commented Jan 21, 2023

A number of scanlators name their files in a way that puts the credit/translation note pages at the start of a gallery. This PR filters those pages so they end up at the end of the gallery instead.

artist_info - [RedLantern]
9999nhnl - [nhnl]
end_card_save_file - [LunaticSeibah]
9999999* - [MegaFaget]
bumper - [cowboy]
ramble - [degenerate]

It also filters out filenames ending in "note." or "notes." (the period being the start of the file extension), which aims to catch commonly used names for translation notes. I also added a filter that finds filenames ending with "cover." and makes them the first images in a gallery, so we more reliably find front covers and correct thumbnails for galleries.

I tried to use stricter regex for the commonly used names/characters to avoid false positives, but more may be required.
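
For illustration, the ordering described above could be sketched roughly as follows. This is Python rather than the Perl in lib/LANraragi/Utils/Archive.pm, and it is not the regex from this PR: `CREDIT_RE`, `COVER_RE`, `page_rank` and `order_pages` are hypothetical names, the credit names are assumed to be filename prefixes, and a plain case-insensitive sort stands in for whatever ordering the project actually uses.

```python
import re
from pathlib import PurePosixPath

# Assumption: the credit-page names listed above appear at the start of the filename.
CREDIT_RE = re.compile(
    r"^(artist_info|9999nhnl|end_card_save_file|9999999|bumper|ramble)"
    r"|notes?\.[^.]+$",  # "note.<ext>" or "notes.<ext>"
    re.IGNORECASE,
)
# Filenames ending in "cover.<ext>" are treated as front covers.
COVER_RE = re.compile(r"cover\.[^.]+$", re.IGNORECASE)


def page_rank(path: str) -> int:
    """Return 0 for covers, 1 for ordinary pages, 2 for credit/notes pages."""
    name = PurePosixPath(path).name  # only look at the filename, not the folders
    if CREDIT_RE.search(name):
        return 2
    if COVER_RE.search(name):
        return 0
    return 1


def order_pages(paths):
    """Sort covers first and credit pages last, keeping name order within each group."""
    return sorted(paths, key=lambda p: (page_rank(p), p.lower()))
```

With this sketch, `order_pages(["002.jpg", "9999nhnl.png", "zz_cover.jpg", "001.jpg"])` puts `zz_cover.jpg` first and `9999nhnl.png` last. A sort key keeps the rule in one place, at the cost of a second regex pass per filename.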

Some RedLantern credit pages were just "Artist_Info" and not "Artist_Info_Store_Links"
Exclude Back, End, Rear from results
Asinin3 (Contributor, Author) commented Jan 28, 2023

This still needs to be improved to reduce false positives. The search for cover pages now excludes filenames containing "back", "rear" and "end", but I still need to make it match "cover" only as a whole word, or as a word that starts and/or ends with numbers/special characters, so that it doesn't match e.g. "discover". I'm not very good at regex, so help would be appreciated.

Otherwise, I can simplify this PR to filter only the most common credit pages, e.g. NHNL and RedLantern.
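
One possible way to express the "whole word" requirement is with letter lookarounds instead of `\b`, since `\b` treats digits and underscores as word characters and so would reject names like "01_cover". The sketch below is in Python, not the project's Perl, and is not the pattern that was merged; `is_front_cover` and both pattern names are hypothetical, and it assumes only the bare filename (no folder path) is passed in.

```python
import re

# "cover" with no ASCII letter directly before or after it,
# so digits, "_", "-", "." and spaces all count as separators.
_COVER = re.compile(r"(?<![a-z])cover(?![a-z])", re.IGNORECASE)
# The same boundary rule for words that mark a non-front cover.
_NOT_FRONT = re.compile(r"(?<![a-z])(back|rear|end)(?![a-z])", re.IGNORECASE)


def is_front_cover(filename: str) -> bool:
    """True if the name contains a standalone "cover" and no standalone back/rear/end."""
    return bool(_COVER.search(filename)) and not _NOT_FRONT.search(filename)
```

Under these assumptions "01_cover.png" and "cover.jpg" are accepted, while "discover.jpg", "covers.jpg" and "back_cover.jpg" are rejected. Applying the same boundary rule to "end" also keeps a name like "legend_cover.jpg" from being excluded by accident.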

Difegue (Owner) left a comment

Thanks for this -- Such a large regex feels a bit unwieldy to maintain imo, but I'm open to accepting a trimmed down version. 👍

Three review comments on lib/LANraragi/Utils/Archive.pm (outdated, resolved)
Asinin3 and others added 3 commits February 1, 2023 19:39
Co-authored-by: Difegue <8237712+Difegue@users.noreply.github.com>
Co-authored-by: Difegue <8237712+Difegue@users.noreply.github.com>
Difegue (Owner) commented Feb 5, 2023

I didn't test all possible permutations but this seems solid enough, let's get this merged. 👍

Thanks for taking the time with this! Here's a Holobyte.

Difegue merged commit a9a81fd into Difegue:dev on Feb 5, 2023
2 participants