Kemono.party Patreon posts always contain duplicate images #1667
Comments
You can use |
Yeah, I mean this definitely seems like an issue with the site. |
The main issue is that the main file and the first attachment of any Patreon post refer to the same file. Before v1.18.0 this was "solved" by effectively using It should be possible to distinguish between Patreon and everything else with conditional filenames to use the "filename": {
"service == 'patreon'": "{id}-{filename}.{extension}",
"" : "{id}-{num}.{extension}"
} Or with
The devs are already working on a solution (https://desuarchive.org/g/thread/82346276/#q82366219), although just having file hashes or some way to detect duplicates other than unreliable filenames in API responses would be very handy (@kemono-bugs) |
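For anyone unsure how the conditional "filename" option in this comment picks a format, here is a rough Python sketch of the idea. It is not gallery-dl's actual implementation: the helper name pick_filename_format is made up for illustration, and it assumes the conditions are plain Python expressions evaluated against a file's metadata, with the empty key acting as the fallback.

```python
def pick_filename_format(conditions, metadata):
    """Return the first format whose condition holds, else the "" fallback."""
    for condition, fmt in conditions.items():
        if not condition:
            continue  # the empty key is the fallback, used only if nothing matches
        try:
            # Assumption: a condition is a Python expression over metadata fields.
            if eval(condition, {}, dict(metadata)):
                return fmt
        except NameError:
            pass  # the field is missing for this file, so the condition fails
    return conditions[""]


# The two formats from the config in this comment.
conditions = {
    "service == 'patreon'": "{id}-{filename}.{extension}",
    "": "{id}-{num}.{extension}",
}
```

With this, a Patreon file would be named from {id} and {filename}, while files from any other service fall back to {id}-{num}.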
But they're not quite the same file, because one uses spaces and the other uses underscores. I tried downloading the example post with the
Well that's good to hear, and it saves me the trouble of contacting them directly. Thanks for your help. I suppose I'll just wait for a fix and get used to cleaning out duplicates when I'm downloading from Patreon. |
You could use |
I think it's more reasonable to change the Patreon extractor itself to convert spaces to underscores. ^ Better yet if the extractor can tell those files apart before starting to download the file, so that the downloading doesn't have to be aborted before moving to the next one. |
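As a rough sketch of that suggestion, the file list could be filtered by a normalized name before anything is downloaded. This is only an illustration in Python: the helper names and the "name" key are assumptions, not the Patreon or Kemono extractor's real data layout.

```python
def normalize_name(name):
    """Treat spaces and underscores as the same character."""
    return name.replace(" ", "_")


def drop_duplicate_names(files):
    """Keep the first file for each normalized name, preserving order."""
    seen = set()
    unique = []
    for file in files:
        key = normalize_name(file["name"])  # "name" is a hypothetical field
        if key in seen:
            continue  # same name apart from spaces/underscores: skip it
        seen.add(key)
        unique.append(file)
    return unique
```

Note that this only compares names, so two different files that happen to share a filename would also be collapsed into one.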
That works when it's the only formatting in the I think the issue is that the
I also came across posts (lots of them) where the header file and a completely different attachment shared the same filename, so I guess that's also an issue, especially for people that prefer the {id}+{filename} format. |
@jlazarskiparkin9815 |
Ah, that did it. I upgraded to 1.18.x and now the problem is solved when I use this filename format:
Testing the example from the OP gives me one file with the format |
Fixes mikf#1667 Judging by the discussion in that thread, the first file is always duplicated in the attachments list, except for some posts which only have the image and no attachments. This change makes it so if attachments are present it only downloads those. It might require testing from various posts though. It worked in the one I tried for what it's worth, but I'm not too familiar with the service.
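The selection rule described in this change can be sketched in a few lines of Python. This is not the actual patch: the function name select_files and the "file" and "attachments" keys are assumptions made for illustration.

```python
def select_files(post):
    """Prefer attachments; fall back to the main file only when there are none."""
    attachments = post.get("attachments") or []
    if attachments:
        # The main file is assumed to be repeated as the first attachment,
        # so the attachments alone already cover it.
        return list(attachments)
    main = post.get("file")
    return [main] if main else []
```

A post with one main image that is also listed as its only attachment yields a single file, and a post with no attachments still yields its main image.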
I have noticed an inconsistency in kemono.party. In short, I cannot seem to find a way to configure kemono.party to download non-duplicate pictures from Patreon posts even though my configuration works with other data sources like SubscribeStar. I believe this is due to the way kemono.party displays images from these two sites.
Example post (NSFW but no nudity): https://kemono.party/patreon/user/2909939/post/48126953
My config file:
Attempting to download the example post with this config gets me two files with the same image and file size, but different names: 48126953-1.png and 48126953-2.png. For a SubscribeStar post, I would only get 48126953-1.png, which is fine for my organization needs.
I tried looking at the keywords for filenames to find something that would help, but there does not seem to be anything there that could help.
I also tried configuring a postprocessor option to compare images once they've been downloaded, but that has two problems:
compare.shallow postprocessor option to work properly. I had thought that I could use that option to compare the filesizes as a pseudo-checksum, but either I configured it wrong or it didn't work.
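For cleaning up after the fact, the file size idea can be combined with a real checksum. The Python sketch below is a standalone script idea, not a gallery-dl postprocessor: it groups files in a directory by size and then confirms matches with SHA-256. The function names are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory):
    """Return (kept, duplicate) path pairs for files with identical content."""
    by_size = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        seen = {}
        for path in paths:
            digest = file_digest(path)
            if digest in seen:
                duplicates.append((seen[digest], path))
            else:
                seen[digest] = path
    return duplicates
```

Equal size alone only narrows down the candidates; the hash comparison is what confirms that two files such as 48126953-1.png and 48126953-2.png really hold the same data.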