Batch suggest in Omikuji backend #669

Merged: 2 commits, Feb 3, 2023

osma commented Feb 3, 2023

This PR clarifies that backends only have to implement one of _suggest and _suggest_batch, then implements batched suggest in the Omikuji backend. In practice, only the text vectorization is performed on the whole batch at once; the Omikuji implementation only supports a predict method for a single document at a time, so prediction has to be done within a for loop.
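
The pattern described above can be sketched roughly as follows. This is an illustration only, not Annif's actual code: `BaseBackend`, `ToyVectorizer`, `ToyModel`, `SingleDocBackend` and `BatchBackend` are hypothetical stand-ins, and the toy vectorizer and model merely imitate "transform a whole batch" and "predict one document at a time".

```python
from collections import Counter
from typing import List


class BaseBackend:
    """Hypothetical base class: a subclass overrides _suggest OR _suggest_batch."""

    def _suggest(self, text: str) -> List[str]:
        raise NotImplementedError("implement _suggest or _suggest_batch")

    def _suggest_batch(self, texts: List[str]) -> List[List[str]]:
        # Default behaviour: fall back to one _suggest call per document.
        return [self._suggest(text) for text in texts]


class ToyVectorizer:
    """Stand-in vectorizer (assumption): turns a whole batch into word counts."""

    def transform(self, texts: List[str]) -> List[Counter]:
        return [Counter(text.lower().split()) for text in texts]


class ToyModel:
    """Stand-in model (assumption): predict() accepts a single document only."""

    def predict(self, features: Counter, limit: int = 1) -> List[str]:
        return [word for word, _ in features.most_common(limit)]


class SingleDocBackend(BaseBackend):
    """Implements only _suggest; batching comes from the base class default."""

    def _suggest(self, text: str) -> List[str]:
        return [text.upper()]


class BatchBackend(BaseBackend):
    """Implements only _suggest_batch, in the spirit of this PR."""

    def __init__(self) -> None:
        self.vectorizer = ToyVectorizer()
        self.model = ToyModel()

    def _suggest_batch(self, texts: List[str]) -> List[List[str]]:
        # Vectorize the whole batch in a single call...
        vectors = self.vectorizer.transform(texts)
        # ...but predict inside a loop, one document at a time.
        return [self.model.predict(vector) for vector in vectors]
```

In this sketch, any saving comes from making one `transform` call per batch instead of one per document; the prediction step is still a per-document loop.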

There seems to be a small performance benefit. I tested this using annif eval on the Finto AI yso-parabel-fi project/model, with the kirjaesittelyt2021/fin/test corpus. The evaluation results were unchanged; only the time spent differed slightly. Memory usage remained essentially the same.

With 1 job

                 user time  wall time  max rss
before (master)  86.69      1:29.94    6322624
after (PR)       78.66      1:22.79    6336584

With 4 jobs

                 user time  wall time  max rss
before (master)  121.33     1:22.33    6293804
after (PR)       96.55      1:18.78    6293640
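
To put the size of the benefit in relative terms, the figures above can be turned into percentage reductions. The snippet below is a small sketch that assumes user time is in seconds and wall time is in `m:ss.ff` format; the units are not stated in the measurements, and the helper names `to_seconds` and `reduction_pct` are made up for this illustration.

```python
def to_seconds(wall: str) -> float:
    """Convert an 'm:ss.ff' wall-clock string to seconds (assumed format)."""
    minutes, seconds = wall.split(":")
    return int(minutes) * 60 + float(seconds)


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction from before to after, as a percentage."""
    return (before - after) / before * 100


# (before, after) pairs copied from the measurements in this PR.
runs = {
    "1 job": {"user": (86.69, 78.66), "wall": ("1:29.94", "1:22.79")},
    "4 jobs": {"user": (121.33, 96.55), "wall": ("1:22.33", "1:18.78")},
}

for label, run in runs.items():
    user_before, user_after = run["user"]
    wall_before, wall_after = (to_seconds(value) for value in run["wall"])
    print(
        f"{label}: user time -{reduction_pct(user_before, user_after):.1f}%, "
        f"wall time -{reduction_pct(wall_before, wall_after):.1f}%"
    )
```

Under those assumptions, user time drops by roughly 9% with 1 job and roughly 20% with 4 jobs, while wall time drops by roughly 8% and 4% respectively.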

Fixes #665

osma added this to the 0.61 milestone Feb 3, 2023
osma self-assigned this Feb 3, 2023
osma requested a review from juhoinkinen February 3, 2023 14:49

sonarqubecloud bot commented Feb 3, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: no coverage information
Duplication: 0.0%


codecov bot commented Feb 3, 2023

Codecov Report

Base: 99.56% // Head: 99.56% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (f4b55cd) compared to base (a7e3b4b).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #669   +/-   ##
=======================================
  Coverage   99.56%   99.56%           
=======================================
  Files          87       87           
  Lines        6143     6145    +2     
=======================================
+ Hits         6116     6118    +2     
  Misses         27       27           

Impacted Files            Coverage Δ
annif/backend/backend.py  100.00% <ø> (ø)
annif/backend/omikuji.py  97.53% <100.00%> (+0.09%) ⬆️


juhoinkinen left a comment

LGTM

osma merged commit cc6dfcf into master Feb 3, 2023
osma deleted the issue663-suggest-batch-omikuji branch February 3, 2023 14:59
Linked issue: Support batch suggest in Omikuji backend