Skip to content

Conversation

@G-Rath
Copy link
Owner

@G-Rath G-Rath commented Oct 3, 2025

Based on google/osv-scanner#2241, this should greatly reduce peak memory usage of the detector and in theory make it faster since there should be less advisories to check, though in practice I don't expect this to be noticeable as it mainly matters for large databases like Ubuntu which we don't support out of the box.

I've implemented this as part of the memDb meaning we retain the total count of advisories loaded externally even though internally the number of advisories retained is likely far lower

@G-Rath G-Rath force-pushed the skip-advisories branch 2 times, most recently from 6a21168 to 48bf4b4 Compare December 2, 2025 18:10
@G-Rath
Copy link
Owner Author

G-Rath commented Dec 2, 2025

due to another spike in npm advisories, this is pretty much a must-have for the local databases to be viable - without it I'm getting OOMs running the tests locally 😓

@G-Rath G-Rath marked this pull request as ready for review December 2, 2025 18:11
@G-Rath G-Rath requested a review from Copilot December 2, 2025 18:12
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a performance optimization to reduce memory usage by filtering out advisories that don't affect any of the packages being scanned. The implementation adds a pkgNames []string parameter throughout the database loading chain, allowing databases to skip loading advisories for unrelated packages.

Key Changes:

  • Added package name filtering in memDB.addVulnerability() to skip advisories that don't affect specified packages
  • Modified database loading functions (NewZippedDB, NewDirDB, Load) to accept and pass through a list of package names
  • Collected all package names from lockfiles in main.go and passed them to database loaders

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pkg/database/mem-check.go Added filtering logic to skip advisories that don't affect the specified packages while maintaining total count
pkg/database/zip.go Added mightAffectPackages helper function and updated signatures to accept pkgNames parameter
pkg/database/dir.go Updated load and NewDirDB signatures to accept and pass pkgNames parameter
pkg/database/config.go Updated Load function to accept pkgNames and route it to appropriate database constructors
main.go Added collection of all package names from lockfiles and passed to loadDatabases
pkg/database/zip_test.go Updated all test calls to pass nil for new parameter and added comprehensive test for filtering behavior
pkg/database/dir_test.go Updated all test calls and added test validating package filtering works correctly
pkg/database/load_test.go Updated test calls to pass nil for new parameter
pkg/database/testdata/db/nested-2/osv-3.json Added test data for OSV-3 with package "mine2" to validate filtering

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +595 to +604
pkgSet := make(map[string]struct{})
for _, p := range files {
for _, pkg := range p.lockf.Packages {
pkgSet[pkg.Name] = struct{}{}
}
}
allPackages := make([]string, 0, len(pkgSet))
for name := range pkgSet {
allPackages = append(allPackages, name)
}
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package names collected here are not normalized, but the filtering logic in mightAffectPackages compares them against affected.Package.NormalizedName(). This creates a mismatch for PIP ecosystem packages where normalization converts names to lowercase and replaces [-_.] with -.

For example, if a lockfile contains "Django" but the OSV database has "django", the filtering will fail to match them because NormalizedName() returns "django" but the collected package name is "Django".

To fix this, the package names should be normalized when collecting them. Consider creating a map that accounts for the ecosystem and normalizes accordingly, or normalize package names before adding them to pkgSet.

Copilot uses AI. Check for mistakes.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should be fine because the lockfile parsers handle normalization - the only way this could happen is if someone passes an unnormalized name in via CSV, which already won't work

@G-Rath G-Rath merged commit 016ef51 into main Dec 2, 2025
19 checks passed
@G-Rath G-Rath deleted the skip-advisories branch December 2, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants