Reduce database round-trips during BOM processing #1006

nscuro · 2024-12-21T14:32:29Z

Description

Reduces database round-trips during BOM processing.

In the previous implementation, a SELECT query was issued for every single component and service in a BOM, in order to find existing components that match their identity.

In retrospect, this caused many unnecessary database round-trips and put the database under needless load, in particular for new projects where no components or services exist yet.

Now, we query all existing components and services of the project once in bulk.
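To illustrate the difference, here is a minimal sketch and not the actual implementation. It assumes a simplified identity of `(name, version)`, a hypothetical `component` table, and SQLite in place of Postgres; the function names are made up for this example.

```python
import sqlite3


def setup_db():
    # Hypothetical schema; the real component model has many more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE component ("
        "id INTEGER PRIMARY KEY, project_id INTEGER, name TEXT, version TEXT)"
    )
    conn.executemany(
        "INSERT INTO component (project_id, name, version) VALUES (?, ?, ?)",
        [(1, "acme-lib", "1.0"), (1, "acme-util", "2.3")],
    )
    return conn


def match_per_item(conn, project_id, bom_components):
    """Previous approach: one SELECT per BOM component."""
    matches, queries = {}, 0
    for name, version in bom_components:
        row = conn.execute(
            "SELECT id FROM component"
            " WHERE project_id = ? AND name = ? AND version = ?",
            (project_id, name, version),
        ).fetchone()
        queries += 1
        matches[(name, version)] = row[0] if row else None
    return matches, queries


def match_in_bulk(conn, project_id, bom_components):
    """New approach: one SELECT for the whole project, matching in memory."""
    rows = conn.execute(
        "SELECT id, name, version FROM component WHERE project_id = ?",
        (project_id,),
    ).fetchall()
    existing = {(name, version): cid for cid, name, version in rows}
    matches = {identity: existing.get(identity) for identity in bom_components}
    return matches, 1


conn = setup_db()
bom = [("acme-lib", "1.0"), ("new-lib", "0.1"), ("acme-util", "2.3")]
per_item_matches, per_item_queries = match_per_item(conn, 1, bom)
bulk_matches, bulk_queries = match_in_bulk(conn, 1, bom)
print(per_item_queries, bulk_queries)
```

Both functions produce the same matches; the per-item variant issues one query per BOM entry, while the bulk variant issues a single query regardless of BOM size.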

This approach can perform worse when a BOM is uploaded to an existing project and the content differs wildly between BOM and project. We would then load many components into memory, only to delete them shortly after. However, this scenario should be less common: usually, projects are either empty or overlap significantly with the uploaded BOM.
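The trade-off can be sketched with plain set operations. This is an illustration under assumptions, not the project's code: identities are simplified to `(name, version)` tuples, and `reconcile` is a hypothetical helper.

```python
def reconcile(loaded_identities, bom_identities):
    """Split identities into what to keep, create, and delete."""
    loaded = set(loaded_identities)
    bom = set(bom_identities)
    return {
        "keep": loaded & bom,     # already in the project and in the BOM
        "create": bom - loaded,   # only in the BOM
        "delete": loaded - bom,   # loaded into memory, then removed
    }


bom = [("acme-lib", "1.0"), ("acme-util", "2.3")]

# Common case 1: empty project, nothing is loaded.
empty_project = reconcile([], bom)

# Common case 2: significant overlap, most loaded rows are reused.
overlapping = reconcile([("acme-lib", "1.0"), ("old-lib", "0.9")], bom)

# Worst case: no overlap, every loaded row is deleted again.
disjoint = reconcile([("old-lib", "0.9"), ("legacy", "3.1")], bom)
```

In the disjoint case, the size of `delete` equals the number of loaded components, which is the wasted work described above.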

Addressed Issue

N/A

Additional Details

Profiling the bloated BOM test shows that we previously spent a large chunk of CPU time waiting for Postgres to respond to identity matching queries.

In fact, performing these queries was more expensive than flushing new changes, despite the project initially being completely empty.

This overhead is now entirely gone.

Checklist

I have read and understand the contributing guidelines
~~This PR fixes a defect, and I have provided tests to verify that the fix is effective~~
This PR implements an enhancement, and I have provided tests to verify that it works as intended
~~This PR introduces changes to the database model, and I have updated the migration changelog accordingly~~
~~This PR introduces new or alters existing behavior, and I have updated the documentation accordingly~~

codacy-production · 2024-12-21T15:02:55Z

Coverage summary from Codacy


| Coverage variation | Diff coverage |
| --- | --- |
| ✅ +0.14% (target: -1.00%) | ✅ 80.65% (target: 70.00%) |

Coverage variation details

| Commit | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (`bc8683a`) | 22292 | 18436 | 82.70% |
| Head commit (`a78e9e6`) | 22249 (-43) | 18432 (-4) | 82.84% (+0.14%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| Scope | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#1006) | 62 | 50 | 80.65% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
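As a quick sanity check, the reported percentages can be reproduced from the line counts above. This is a small sketch; whether Codacy rounds before or after subtracting is an assumption here, and `coverage_pct` is a hypothetical helper.

```python
def coverage_pct(covered, coverable):
    # Percentage of coverable lines that are covered by tests.
    return covered / coverable * 100


ancestor = coverage_pct(18436, 22292)   # common ancestor commit
head = coverage_pct(18432, 22249)       # head commit
# Assumption: variation is computed from the two-decimal values.
variation = round(head, 2) - round(ancestor, 2)
diff_coverage = coverage_pct(50, 62)    # lines added or modified by the PR

print(f"{ancestor:.2f}% -> {head:.2f}% ({variation:+.2f}%), diff {diff_coverage:.2f}%")
```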

Codacy stopped sending the deprecated coverage status on June 5th, 2024.

In the previous implementation, a `SELECT` query was issued for every single component and service in a BOM, in order to find existing components that match their identity.

In retrospect, this caused many unnecessary database round-trips and put the database under needless load, in particular for new projects where no components or services exist yet.

Now, we query all existing components and services of the project once in bulk.

This approach can perform worse when a BOM is uploaded to an existing project and the content differs wildly between BOM and project. We would then load many components into memory, only to delete them shortly after. However, this scenario should be less common: usually, projects are either empty or overlap significantly with the uploaded BOM.

Backports DependencyTrack/hyades-apiserver#1006

Signed-off-by: nscuro <nscuro@protonmail.com>


nscuro added the enhancement label Dec 21, 2024

nscuro added this to the 5.6.0 milestone Dec 21, 2024

nscuro mentioned this pull request Dec 21, 2024

Reduce database round-trips during BOM processing DependencyTrack/dependency-track#4486

Merged

2 tasks

nscuro force-pushed the bom-processing-db-rountrips branch from e78985d to 57f02fa December 21, 2024 15:31

nscuro force-pushed the bom-processing-db-rountrips branch from 57f02fa to a78e9e6 December 21, 2024 17:45

nscuro merged commit e12e9b1 into main Dec 23, 2024
9 checks passed

nscuro deleted the bom-processing-db-rountrips branch December 23, 2024 17:34
