Autotag scraper #1817
✓ Code reviewed. Auto-tagging just one scene / image is a nice touch!
	return ret, nil
}

func PathToStudios(path string, reader models.StudioReader) ([]*models.Studio, error) {
These `PathTo*` functions should probably be refactored. Instead of doing 3 database hits for every path and then repeatedly compiling the REs for each candidate match, we should probably enumerate all 'names' (Tags+aliases, Studios+aliases, Performer names) at the beginning of an autotag job and memoize their compiled REs at a higher level. The queries and compiles add up, and I suspect narrowing down the candidate list through `QueryForAutoTag` doesn't save a ton of time comparatively.
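To make the suggestion concrete, here is a minimal sketch of compiling every candidate name once, up front, and reusing the compiled patterns for each path. The names `nameMatcher`, `buildMatchers`, `matchPath` and the separator rules are all hypothetical and are not taken from this PR; the real matching logic in the auto-tagger may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nameMatcher pairs a candidate name with its precompiled pattern.
// Hypothetical type, used only for this illustration.
type nameMatcher struct {
	Name string
	re   *regexp.Regexp
}

// Assumed separator and boundary rules; the project's actual rules may differ.
const (
	separator = `[\s._-]+`
	boundary  = `[^a-zA-Z0-9]`
)

// compileName builds a case-insensitive pattern that matches the words of
// name joined by separators and delimited by non-alphanumeric characters.
func compileName(name string) (*regexp.Regexp, error) {
	words := strings.Fields(name)
	for i, w := range words {
		words[i] = regexp.QuoteMeta(w)
	}
	pattern := `(?i)(?:^|` + boundary + `)` +
		strings.Join(words, separator) +
		`(?:$|` + boundary + `)`
	return regexp.Compile(pattern)
}

// buildMatchers compiles each non-empty name exactly once.
func buildMatchers(names []string) ([]nameMatcher, error) {
	matchers := make([]nameMatcher, 0, len(names))
	for _, n := range names {
		if strings.TrimSpace(n) == "" {
			continue
		}
		re, err := compileName(n)
		if err != nil {
			return nil, fmt.Errorf("compiling %q: %w", n, err)
		}
		matchers = append(matchers, nameMatcher{Name: n, re: re})
	}
	return matchers, nil
}

// matchPath returns the names whose precompiled pattern matches path.
func matchPath(path string, matchers []nameMatcher) []string {
	var found []string
	for _, m := range matchers {
		if m.re.MatchString(path) {
			found = append(found, m.Name)
		}
	}
	return found
}

func main() {
	// In a real job these names would come from the database once, at start.
	matchers, err := buildMatchers([]string{"Jane Doe", "Acme Studio"})
	if err != nil {
		panic(err)
	}
	for _, p := range []string{"/media/Acme.Studio/jane_doe-scene1.mp4", "/media/other.mp4"} {
		fmt.Println(p, matchPath(p, matchers))
	}
}
```

The trade-off is that every name is tested against every path, which is the cost the comment weighs against the per-path queries.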
After some reflection and some rough experiments, the queries are probably fine, but we should still memoize the compiled RE objects.
I did start implementing RE caching since initial benchmarks suggested that there would be a performance benefit. However, after analysing the way that this code is used, the benefit would be limited to the autotag task only, and would require creating the cache in `task_autotag.go` and passing it down through all of the subsequent function calls. This is fine, but well out of scope for this PR, and the effort involved is non-trivial.
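For reference, the shape of that change might look roughly like the sketch below: a lazily filled cache created once by the task and threaded through the helpers as a parameter. `regexCache` and `pathMatchesName` are hypothetical stand-ins, not functions from this codebase, and the pattern used is deliberately simplified.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// regexCache memoizes compiled patterns by their source string.
// Hypothetical type; safe for concurrent use via the mutex.
type regexCache struct {
	mu       sync.Mutex
	compiled map[string]*regexp.Regexp
	compiles int // number of actual compilations, kept for illustration
}

func newRegexCache() *regexCache {
	return &regexCache{compiled: make(map[string]*regexp.Regexp)}
}

// get returns the cached regexp for pattern, compiling it on first use.
func (c *regexCache) get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.compiled[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	c.compiled[pattern] = re
	c.compiles++
	return re, nil
}

// pathMatchesName stands in for a lower-level helper that now has to
// accept the cache as an extra argument.
func pathMatchesName(path, name string, cache *regexCache) (bool, error) {
	re, err := cache.get(`(?i)` + regexp.QuoteMeta(name))
	if err != nil {
		return false, err
	}
	return re.MatchString(path), nil
}

func main() {
	cache := newRegexCache() // created once, at the start of the task
	for _, p := range []string{"/a/acme/one.mp4", "/a/acme/two.mp4", "/b/three.mp4"} {
		ok, err := pathMatchesName(p, "Acme", cache)
		if err != nil {
			panic(err)
		}
		fmt.Println(p, ok)
	}
	fmt.Println("compilations:", cache.compiles)
}
```

Every function between the task and the matcher needs the extra parameter, which is where the non-trivial effort mentioned above comes from.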
Retested with new changes
match `Auto Tag`. This scraper finds performers, studios and tags only and is supported for scene and gallery scrapes. It operates using the same logic as the auto-tagger.