Reduce database round-trips during BOM processing #1006
In the previous implementation, a `SELECT` query was issued for every single component and service in a BOM to find existing components that match their identity. In retrospect, this caused many unnecessary database round-trips and put needless load on the database, particularly for new projects where no components or services exist yet.

Now, we query all existing components and services of the project once, in bulk.

This approach can perform worse when a BOM is uploaded to an existing project and the content of BOM and project differs wildly. We would then load many components into memory, only to delete them shortly after. However, this scenario should be less common: usually, projects are either empty or overlap significantly with the uploaded BOM.

Backports DependencyTrack/hyades-apiserver#1006

Signed-off-by: nscuro <nscuro@protonmail.com>
Description
Reduces database round-trips during BOM processing.
In the previous implementation, a `SELECT` query was issued for every single component and service in a BOM to find existing components that match their identity. In retrospect, this caused many unnecessary database round-trips and put needless load on the database, particularly for new projects where no components or services exist yet.
Now, we query all existing components and services of the project once in bulk.
This approach can perform worse when a BOM is uploaded to an existing project and the content of BOM and project differs wildly. We would then load many components into memory, only to delete them shortly after. However, this scenario should be less common: usually, projects are either empty or overlap significantly with the uploaded BOM.
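To make the trade-off concrete, the following sketch reconciles the identities already in a project against those in an uploaded BOM and reports what share of the loaded rows is loaded only to be deleted. The function names and the use of plain sets are assumptions made for illustration; they do not mirror the actual code.

```python
def reconcile(existing, bom):
    """Split identities into matched, to-create, and to-delete groups."""
    existing, bom = set(existing), set(bom)
    return {
        "matched": existing & bom,      # already in the project and in the BOM
        "to_create": bom - existing,    # new in the BOM
        "to_delete": existing - bom,    # in the project, but no longer in the BOM
    }


def wasted_load_ratio(existing, bom):
    """Share of bulk-loaded rows that are loaded only to be deleted."""
    existing = set(existing)
    if not existing:
        # Empty project: the bulk query returns nothing, so nothing is wasted.
        return 0.0
    return len(existing - set(bom)) / len(existing)
```

An empty project yields a ratio of `0.0`, a project with full overlap also yields `0.0`, and a project that shares nothing with the BOM yields `1.0`, which is the unfavorable case described above.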
Addressed Issue
N/A
Additional Details
Profiling the bloated BOM test shows that we previously spent a large chunk of CPU time waiting for Postgres to respond to identity matching queries.
In fact, performing these queries was more expensive than flushing new changes, despite the project initially being completely empty.
This overhead is now entirely gone.
Checklist
- This PR fixes a defect, and I have provided tests to verify that the fix is effective
- This PR introduces changes to the database model, and I have updated the migration changelog accordingly
- This PR introduces new or alters existing behavior, and I have updated the documentation accordingly