-
Notifications
You must be signed in to change notification settings - Fork 614
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up cataloging by replacing globs searching with index lookups #1510
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Benchmark Test ResultsBenchmark results from the latest changes vs base branch
|
a9193c9
to
7f3382f
Compare
85b7f66
to
cba0e45
Compare
kzantow
reviewed
Jan 26, 2023
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
spiffcs
reviewed
Feb 8, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comments about if values could be nil and cause a panic. I'm going to take another look at the resolver in a second pass since those had the most changes.
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
spiffcs
approved these changes
Feb 9, 2023
…indexes Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
This was referenced Feb 9, 2023
This was referenced Feb 17, 2023
GijsCalis
pushed a commit
to GijsCalis/syft
that referenced
this pull request
Feb 19, 2024
…nchore#1510) * replace raw globs with index equivelent operations Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add cataloger test for alpm cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix import sorting for binary cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix linting for mock resolver Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * separate portage cataloger parser impl from cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * enhance cataloger pkgtest utils to account for resolver responses Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for alpm cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for apkdb cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for dpkg cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for cpp cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for dart cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for dotnet cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for elixir cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for erlang cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for golang cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for haskell cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for java cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for javascript cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for php cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for portage cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for python cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for rpm cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for rust cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for sbom cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for swift cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * allow generic catloger to run all mimetype searches at once Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * remove stutter from php and javascript cataloger constructors Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * bump stereoscope Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add tests for generic.Search Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add exceptions for java archive git ignore entries Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * enhance basename and extension resolver methods to be variadic Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * dont allow * prefix on extension searches Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * add glob-based cataloger tests for ruby cataloger Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * remove unnecessary string casting Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * incorporate surfacing of leaf link resolitions from stereoscope results Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * [wip] switch to stereoscope file metadata Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * [wip + failing] revert to old globs but keep new resolvers Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * index files, links, and dirs within the directory resolver Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix several resolver bugs and inconsistencies Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * move format testutils to internal package Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * update syft json to account for file type string normalization Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * split up directory resolver from indexing Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * update docs to include details about searching Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * [wip] bump stereoscope to development version Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix linting Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * adjust symlinks fixture to be fixed to digest Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix all-locations resolver tests Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix test fixture reference Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * rename file.Type Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * bump stereoscope Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * fix PR comment to exclude extra * Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * bump to dev version of stereoscope Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * bump to final version of stereoscope Signed-off-by: Alex Goodman <alex.goodman@anchore.com> * move observing resolver to pkgtest Signed-off-by: Alex Goodman <alex.goodman@anchore.com> --------- Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Leverages anchore/stereoscope#154 to improve cataloging speeds. Searching by full path globs is used everywhere, however, in images that have a large number of files this kind of operation is very intensive. This PR introduces new searching objects that leverage basename indexes in stereoscope in order to vastly improve the initial lookup against the index and then additionally prune potential results with additional requirements.
Today's behavior:
With the new indexes being leveraged:
Specific changes:
source.FileResover
implementations to show that they behave very similarly (this PR addresses several inconsistencies)source.directoryResolver
has been split out into another object entirely:source.directoryIndexer
source.directoryResolver
to use the new stereoscopefiletree.Index
andfiletree.Searcher
to best leverage the new performance related enhancements from Add additional catalog indexes for performance stereoscope#154source.FileMetadata
object with the stereoscopefile.Metadata
object instead.source.FileType
object with the stereoscopefile.Type
object instead.Closes #1328
Closes #1480