read relative etc/apk/repositories for alpine version when no OS provided #1615
Overall, I'm good with this approach; I left some implementation detail notes (consolidating into the r == nil section, simpler regex handling). Also, could you add some sort of test for this?
var (
	repoRegex = regexp.MustCompile(`^https://.*\.alpinelinux\.org/alpine/v([^\/]+)/([a-zA-Z0-9_]+)$`)
	newlineSplitRegex = regexp.MustCompile(`[\r \s\n]+`)
)

func init() {
	repoRegex.Longest()
	newlineSplitRegex.Longest()
}
I think this could be simplified to use a multiline modifier ((?m)) in the original regex, e.g.:
repoRegex = regexp.MustCompile(`(?m)^https://.*\.alpinelinux\.org/alpine/v([^\/]+)/([a-zA-Z0-9_]+)$`)
... and since it reads the entire line (the ^ and $), it probably doesn't need the .Longest() modifier.
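A minimal sketch of what the single-regex suggestion could look like when applied to the full contents of a repositories file. The regex is the one quoted above; the `repoEntry` type, the `parseRepoEntries` helper, and the sample URLs are hypothetical names and data used only for illustration, not code from the PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// repoRegex carries the (?m) flag, so ^ and $ anchor at every line of the
// input instead of only at the start and end of the whole string.
var repoRegex = regexp.MustCompile(`(?m)^https://.*\.alpinelinux\.org/alpine/v([^\/]+)/([a-zA-Z0-9_]+)$`)

// repoEntry is a hypothetical holder for one matched repository line.
type repoEntry struct {
	Version string // first capture group, e.g. "3.17"
	Name    string // second capture group, e.g. "main"
}

// parseRepoEntries (hypothetical helper) returns one entry per line that
// matches the alpinelinux.org pattern; non-matching lines are skipped.
func parseRepoEntries(content string) []repoEntry {
	var entries []repoEntry
	for _, m := range repoRegex.FindAllStringSubmatch(content, -1) {
		entries = append(entries, repoEntry{Version: m[1], Name: m[2]})
	}
	return entries
}

func main() {
	content := "https://dl-cdn.alpinelinux.org/alpine/v3.17/main\n" +
		"https://example.com/custom/repo\n" +
		"https://dl-cdn.alpinelinux.org/alpine/v3.17/community\n"
	for _, e := range parseRepoEntries(content) {
		fmt.Printf("version=%s name=%s\n", e.Version, e.Name)
	}
}
```

With this shape there is no separate split step: matching and version extraction happen in one pass. Note that `$` in multiline mode matches only before `\n`, so a file with `\r\n` line endings would need extra handling.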
That is a good approach. Pity that I deleted the play.golang where I was testing all of it. Oh well.
func parseApkDB(resolver source.FileResolver, env *generic.Environment, reader source.LocationReadCloser) ([]pkg.Package, []artifact.Relationship, error) {
	// find the repositories file from the relative directory of the DB file
	var repos []string
	if resolver != nil {
I think this whole block can be moved down into the r == nil check to only occur if there isn't a linux release identified. As noted in your assessment of this issue, it might be valid to read the repositories in order to generate a correct PURL, but since there is no way to correlate these entries to the apkdb currently, and we are just looking for alpinelinux.org, we could avoid doing this if we already identified a release for now.
I am still holding back on the thought that this is valuable on its own, and at some point, we will prefer this over the OS. If it doesn't hurt, let's leave it up here?
I understand the point that this could be more useful to identify the source than the linux release itself, and agree. My hesitation here is that currently it is only validating this is a known alpine release, which is why for now I would probably move it to the r == nil check below.
The oddity here is that if there is a linux release identified, this isn't used even if it has differing information, so there would need to be a different mechanism to set this.
Ok, I dug in a bit more to the code, and I think a cleaner solution would be to do a little bit of refactoring here, starting with packageURL here: https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/apkdb/package.go#L30 (it would also need to replace the hardcoded "alpine" with the namespace arg and remove the check that it's "alpine").
And newPackage here: https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/apkdb/package.go#L12
These currently take the linux.Release, which is why you need to create one, but they only really need the string namespace.
So if we change var repos []linux.Release to var namespace string, then we can default to the linux release and subsequently look up the repositories file, e.g.:
var namespace string
if env != nil && env.LinuxRelease != nil {
	namespace = env.LinuxRelease.ID
}
// do the /etc/apk/repositories lookup for a namespace
// and other stuff
// then later on: newPackage(apk, namespace, reader.Location)
This would pave the way to detect different sources a bit better, I think.
What do you think about this?
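To make the namespace proposal concrete, here is a self-contained sketch under stated assumptions. The `release` and `environment` types are hypothetical stand-ins for `linux.Release` and `generic.Environment`, and `resolveNamespace` is an invented helper; it also assumes the repositories lookup is consulted only when no release was detected, which is one possible reading of the proposal rather than the code that was merged.

```go
package main

import (
	"fmt"
	"strings"
)

// release is a hypothetical stand-in for linux.Release.
type release struct {
	ID        string
	VersionID string
}

// environment is a hypothetical stand-in for generic.Environment.
type environment struct {
	LinuxRelease *release
}

// resolveNamespace (hypothetical) defaults to the detected linux release ID
// and otherwise falls back to scanning the repositories file content for an
// alpinelinux.org entry. It returns "" when neither source gives an answer.
func resolveNamespace(env *environment, reposContent string) string {
	if env != nil && env.LinuxRelease != nil {
		return env.LinuxRelease.ID
	}
	for _, line := range strings.Split(reposContent, "\n") {
		if strings.Contains(line, ".alpinelinux.org/alpine/") {
			return "alpine"
		}
	}
	return ""
}

func main() {
	repos := "https://dl-cdn.alpinelinux.org/alpine/v3.17/main\n"
	fmt.Println(resolveNamespace(nil, repos))
	fmt.Println(resolveNamespace(&environment{LinuxRelease: &release{ID: "wolfi"}}, repos))
}
```

Passing a plain string downstream means callers such as `newPackage` would no longer need a constructed release object just to carry the ID.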
reposDirect := newlineSplitRegex.Split(string(reposB), -1)
for _, repo := range reposDirect {
	if repo != "" {
		repos = append(repos, repo)
	}
}
If simplified to the single regex, this would just match and extract the version, like what is happening in the section below.
Addressed all of your points, take a look please.
Sorry for the lengthy comment, but I think I've outlined how we can make this change a bit cleaner without the need to make "fake" linux.Release objects, WDYT?
Oh, yeah, I like the namespace approach. Going to give it a shot quickly now.
OK, I tried. The problem is that
For now, I left it as
If you ask me, the correct thing to do here is to have something entirely different passed to
type PackageRelease interface {
	ID() string
	VersionID() string
	BuildID() string
}
func PURLQualifiers(vars map[string]string, release PackageRelease) (q packageurl.Qualifiers) {
Some packages are tied to a linux release, others not so much. That requires changing everything that calls it, so it might be a bit much for this. I would sooner have this in, and then we can worry about the right abstraction for
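As an illustration of the interface idea above, the following sketch shows how a value that is not a full linux release could still satisfy `PackageRelease`. Everything besides the interface itself is an assumption: `repoRelease` and `distroQualifier` are hypothetical, and the "id-version" string format is only an example of what a qualifier builder might produce, not a statement about the real `PURLQualifiers` behavior.

```go
package main

import "fmt"

// PackageRelease mirrors the interface proposed in the comment above.
type PackageRelease interface {
	ID() string
	VersionID() string
	BuildID() string
}

// repoRelease is a hypothetical implementation backed by values read from
// a repositories file rather than from os-release.
type repoRelease struct {
	id        string
	versionID string
}

func (r repoRelease) ID() string        { return r.id }
func (r repoRelease) VersionID() string { return r.versionID }
func (r repoRelease) BuildID() string   { return "" }

// distroQualifier (hypothetical) builds an "id-version" string, preferring
// VersionID and falling back to BuildID. The format is an assumption.
func distroQualifier(release PackageRelease) string {
	if release == nil || release.ID() == "" {
		return ""
	}
	version := release.VersionID()
	if version == "" {
		version = release.BuildID()
	}
	if version == "" {
		return release.ID()
	}
	return release.ID() + "-" + version
}

func main() {
	fmt.Println(distroQualifier(repoRelease{id: "alpine", versionID: "3.17"}))
}
```

The appeal of an interface here is that callers depend only on the three accessors, so a repositories-derived source and an os-release-derived source become interchangeable.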
…ided Signed-off-by: Avi Deitcher <avi@deitcher.net>
Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@deitch solid improvement 💯. I pushed a small refactor that decomposed the additions into smaller functions and swapped out multiple nesting of conditions with guard clauses instead, but it should be logically the same (with a couple of changes). You can treat this commit as a suggestion; if you really don't like it or would prefer a different approach, feel free to force-push over it / revert it. The upside of the decomposition is to allow for easier unit testing of just the apk repository file parsing (now the
Yeah, that's a fine breakdown; smaller is easier whenever possible.
Signed-off-by: Avi Deitcher <avi@deitcher.net>
Added a test of that. Back at you.
Thanks for this contribution @deitch!
…ided (anchore#1615) Signed-off-by: Avi Deitcher <avi@deitcher.net>
fixes #1572
Sort of
The way apkdb works now, if it cannot find an /etc/os-release (or similar files on a fixed list), and if the ID is not "alpine", it still reports on the package, but no purl is provided.
This has several problems.
etc/os-release. This especially rings true when you are looking at a filesystem with multiple containers, or perhaps you have modified the os-release because your OS isn't alpine, but the packages in the embedded container are.
os-release is not necessarily the best place to find out where it came from.
/etc/apk/repositories can have alpinelinux.org but also other places.
This tries to address some of it. It does so very cautiously. If it cannot find an /etc/os-release, i.e. linux.Release is nil, then it tries to read the etc/apk/repositories relative to where it found lib/apk/db/installed (so if it is embedded deep in the filesystem, it still can find it). If it finds alpinelinux.org in there, it sets it up as if running on alpine, and takes the version from there.
It is an improvement over the current design, but could be improved further.
etc/apk/repositories over os-release; this uses it only if release still is nil
repositories, but that will have to wait, since there is no way to correlate those with a particular package in installed.
This is a real if imperfect improvement over the current state.
apk team is looking at improving this chain-of-custody, including providing a purl in the installed file. Things will get much easier then. See this issue. Until then, this can help.
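The relative lookup described above (finding etc/apk/repositories from wherever lib/apk/db/installed was discovered) can be sketched with the standard `path` package. The `repositoriesPathFor` helper and the sample paths are hypothetical; the PR may compute the location differently.

```go
package main

import (
	"fmt"
	"path"
)

// repositoriesPathFor (hypothetical) walks up from the directory holding
// the installed DB (lib/apk/db) to the embedded root, then descends into
// etc/apk/repositories. path.Join cleans the ".." segments.
func repositoriesPathFor(installedDBPath string) string {
	return path.Join(path.Dir(installedDBPath), "..", "..", "..", "etc", "apk", "repositories")
}

func main() {
	// A DB embedded deep in a filesystem still resolves to its own root.
	fmt.Println(repositoriesPathFor("/containers/a/lib/apk/db/installed"))
	fmt.Println(repositoriesPathFor("lib/apk/db/installed"))
}
```

Because the result is derived from the DB location rather than a fixed absolute path, each embedded root resolves to its own repositories file.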