-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API performance issues (packages endpoints) #1538
Comments
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
Remove unused imports Move univers monkey patching to top level Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
* is_vulnerable is now a queryset annotation. * with_is_vulnerable Package manager method annotates the queryset * It is now called anywhere needed. The results with a single API call to http://127.0.0.1:8001/api/packages/63 are: * before 1.7 sec * after 0.25 sec Or an improvement where the old is 6.8 times slower than the new. Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 11, 2024
* simplify sorting code, revert to using lists * use cached_property where relevant Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 12, 2024
* simplify sorting code, revert to using lists * use cached_property where relevant Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
added a commit
that referenced
this issue
Aug 12, 2024
* Fix how we find vulnerable and non vulnerable versions * Complete using is_vulnerable in relevant querysets. * Adjust tests for clarity and DRY using small method to make the tests setup easier to read * Use proper typing annotation for PackageURL or str * Add new Package manager from_purl() method to create a Package from a PURL Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
TG1999
pushed a commit
that referenced
this issue
Aug 12, 2024
* Add pyinstrument to debug mode Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * Apply minor cosmetic models.py changes Remove unused imports Move univers monkey patching to top level Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * Annotate queryset for is_vulnerable * is_vulnerable is now a queryset annotation. * with_is_vulnerable Package manager method annotates the queryset * It is now called anywhere needed. The results with a single API call to http://127.0.0.1:8001/api/packages/63 are: * before 1.7 sec * after 0.25 sec Or an improvement where the old is 6.8 times slower than the new. Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * Improve sorting of packages by version * simplify sorting code, revert to using lists * use cached_property where relevant Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * Annotate queryset for is_vulnerable everywhere * Fix how we find vulnerable and non vulnerable versions * Complete using is_vulnerable in relevant querysets. * Adjust tests for clarity and DRY using small method to make the tests setup easier to read * Use proper typing annotation for PackageURL or str * Add new Package manager from_purl() method to create a Package from a PURL Reference:#1538 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * Do not print things in tests Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> --------- Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne
changed the title
API QuerySets need optimization (thousands of queries are currently triggered)
API performance issues (packages endpoints)
Aug 20, 2024
First batch of improvements went in with this PR: More to come |
4 tasks
The performance is now acceptable. We need to9 make sure we save a baseline of performances. |
FYI, the key code to get this fixed is the new API design in #1572 |
Merged
github-project-automation
bot
moved this to Done
in 04-VulnerableCode Data Quality and next
Nov 11, 2024
Merged
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Let's focus on the Package-related endpoints for now as those are the ones used to collect vulnerability data in DejaCode.
Those tests were run on a clean install of VCIO with only the
nginx.NginxImporter
data set.It's a very small amount of data but somehow looking at a single Package triggers over a thousand queries.
/api/packages/
(on only 88 packages) -> 6,124 queries: 5706 similar queries. Duplicated 73 times./api/packages/63
-> 1,329 queries: 1230 similar queries. Duplicated 16 times./api/packages/bulk_search
(providing the 88 purl): 39,925 queries.This is quite problematic in the context of batch data collection using the VCIO API.
The
PackageSerializer
and related QuerySets require optimization.Once done, make sure to implement unit test using the
assertNumQueries
to make sure that future code change do not add uncontrolled queries back.Related issues:
For bulk lookup, we track this here:
The text was updated successfully, but these errors were encountered: