Add image scraping support #370
Tests ok with me.
6ccdfe2 to 4e7a75f
Rebased and ported UI changes to 2.5. @bnkai can you please review on v2 and v2.5 of the UI?
stashapp/CommunityScrapers#2 some scrapers for it
Tested against the v2 and v2.5 UI using the boobpedia scraper and @MrX292's mofos and newfreeones xpath scrapers.
Both scene and performer images seem to work fine.
* Add sub-scraper functionality
* Add scraping of performer image
* Add scene cover image scraping
* Port UI changes to v2.5
* Fix v2.5 dialog suggest color
* Don't convert eol of UI to support pretty
Resolves #344
Adds the ability to scrape performer images and scene cover images.
This change also introduces the `subScraper` xpath post-processing option. If `subScraper` appears in an attribute xpath configuration, then the sub-scraper will be executed after all other post-processes are complete. It then takes the value and performs an HTTP request, using the value as the URL. Within the `subScraper` config is a nested scraping configuration. This allows you to traverse to other webpages to get the attribute value you are after.
For example, from the Boobpedia scraper config in #333:
This fragment gets the URL from the xpath `//table[@class="infobox"]//tr[2]//a/@href`, then adds the `http://www.boobpedia.com` prefix with the `replace` post-process. Then the sub-scraper post-process is run. It requests the document from the resulting URL, then gets the URL from `//div[@class="fullImageLink"]/a/@href` of the resulting page, followed by the replace post-process.
The `Image` value is expected to be a URL itself, which the system will subsequently request and encode.
Also adds image scraping to the stash scraper.
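To make the sub-scraper flow described above easier to follow, here is a minimal Python sketch of the same sequence: select a value, run the replace rules, then (if a sub-scraper is configured) request that URL and repeat on the resulting page. It is an illustration only, not the code in this PR. The names `AttrConfig`, `select`, `scrape_attribute`, `scrape_image`, the `^` prefix regex, the sample pages and the use of base64 for the final encoding are all assumptions made for the example. The `select` helper only understands the small XPath subset that Python's `xml.etree.ElementTree` supports, and HTTP access is replaced by an injected `fetch` callable.

```python
import base64
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple
from xml.etree import ElementTree


@dataclass
class AttrConfig:
    """Hypothetical stand-in for one attribute's xpath configuration."""
    xpath: str
    replace: List[Tuple[str, str]] = field(default_factory=list)  # (regex, with)
    sub_scraper: Optional["AttrConfig"] = None


def select(document: str, xpath: str) -> Optional[str]:
    """Return the first attribute value for a path ending in /@attr."""
    match = re.fullmatch(r"(.+)/@([\w-]+)", xpath)
    if match is None:
        raise ValueError("this sketch only handles paths ending in /@attribute")
    path, attribute = match.groups()
    root = ElementTree.fromstring(document)  # requires well-formed markup
    element = root.find("." + path)  # ElementTree only accepts relative paths
    return None if element is None else element.get(attribute)


def scrape_attribute(document: str, config: AttrConfig,
                     fetch: Callable[[str], bytes]) -> Optional[str]:
    """Select a value, apply replace rules, then run the sub-scraper last."""
    value = select(document, config.xpath)
    if value is None:
        return None
    for pattern, replacement in config.replace:
        value = re.sub(pattern, replacement, value)
    if config.sub_scraper is not None:
        # The processed value is treated as a URL; the nested config is
        # applied to the page found there.
        page = fetch(value).decode("utf-8")
        value = scrape_attribute(page, config.sub_scraper, fetch)
    return value


def scrape_image(document: str, config: AttrConfig,
                 fetch: Callable[[str], bytes]) -> Optional[str]:
    """Resolve the image URL, request it and encode it (base64 is assumed)."""
    url = scrape_attribute(document, config, fetch)
    if url is None:
        return None
    return base64.b64encode(fetch(url)).decode("ascii")


# Assumed prefix rule: matching the start of the string prepends the host.
PREFIX = [("^", "http://www.boobpedia.com")]
IMAGE_CONFIG = AttrConfig(
    xpath='//table[@class="infobox"]//tr[2]//a/@href',
    replace=PREFIX,
    sub_scraper=AttrConfig(
        xpath='//div[@class="fullImageLink"]/a/@href',
        replace=PREFIX,
    ),
)

# Made-up pages standing in for real HTTP responses.
PERFORMER_PAGE = (
    '<html><body><table class="infobox"><tr><td>Name</td></tr>'
    '<tr><td><a href="/boobs/File:Example.jpg">image</a></td></tr>'
    '</table></body></html>'
)
PAGES: Dict[str, bytes] = {
    "http://www.boobpedia.com/boobs/File:Example.jpg": (
        b'<html><body><div class="fullImageLink">'
        b'<a href="/wiki/images/example.jpg">full</a></div></body></html>'
    ),
    "http://www.boobpedia.com/wiki/images/example.jpg": b"fake-image-bytes",
}


def fake_fetch(url: str) -> bytes:
    """Look the URL up in PAGES instead of performing a network request."""
    return PAGES[url]
```

Calling `scrape_attribute(PERFORMER_PAGE, IMAGE_CONFIG, fake_fetch)` walks both pages and yields the final image URL; `scrape_image` then requests that URL and returns the encoded bytes. Injecting `fetch` keeps the sketch testable without network access.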