Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sbom): use PURL or Group and Name in case of Java #5154

Merged
merged 7 commits into from
Oct 3, 2023
Merged

fix(sbom): use PURL or Group and Name in case of Java #5154

merged 7 commits into from
Oct 3, 2023

Conversation

j1nka
Copy link
Contributor

@j1nka j1nka commented Sep 8, 2023

Description

If another SCA generates CycloneDX file, it may does not have fields like "group" and "name" in SBOM file. As example, an Dependency Track. There's problem in handling SBOM files, where Trivy trying to get data from specific SBOM fields, but not from "purl" field of SBOM (CycloneDX-json) file.

Related issues

Checklist

  • [ X] I've read the guidelines for contributing to this repository.
  • [X ] I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@j1nka j1nka requested a review from knqyf263 as a code owner September 8, 2023 12:13
@knqyf263 knqyf263 requested a review from nikpivkin September 8, 2023 13:39
@@ -410,8 +411,26 @@ func toTrivyCdxComponent(component cdx.Component) ftypes.Component {

func getPackageName(typ string, component cdx.Component) string {
// Jar uses `Group` field for `GroupID`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

// Split the package into its parts
parts := strings.Split(pkg, "/")

// Get Group
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this comment? It almost duplicates the variable name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! Done.


pkgWithoutVersion := fmt.Sprintf("%s:%s", group, nameWOVersion)

return pkgWithoutVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return pkgWithoutVersion
return fmt.Sprintf("%s:%s", group, nameWOVersion)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again :) Done

@nikpivkin
Copy link
Contributor

@j1nka I left a couple of small comments.

@knqyf263
Copy link
Collaborator

I don't remember, but there might be a reason why we use groups. I'd defer to @DmitriyLewen. He will be back next week.

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @j1nka
Thanks for your work!

Some languages have nuances with lowercase in purl (e.g. Go, npm).
To avoid problem with that - we use original names (#4306 (comment)).

Maven plugin uses Name and Group field.
We try to adhere to the language format.

But i understand your point.
I left a comment. Take a look, please.

Regards, Dmitriy

Comment on lines 419 to 432
func convertMavenPackage(pkg string) string {
// Split the package into its parts
parts := strings.Split(pkg, "/")

group := parts[1]

// Get FullName with Version
nameWithVersion := parts[len(parts)-1]

// Remove the Version from the package
nameWOVersion := strings.Split(nameWithVersion, "@")[0]

return fmt.Sprintf("%s:%s", group, nameWOVersion)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we handle all possible ways?
Something like that:

Suggested change
func convertMavenPackage(pkg string) string {
// Split the package into its parts
parts := strings.Split(pkg, "/")
group := parts[1]
// Get FullName with Version
nameWithVersion := parts[len(parts)-1]
// Remove the Version from the package
nameWOVersion := strings.Split(nameWithVersion, "@")[0]
return fmt.Sprintf("%s:%s", group, nameWOVersion)
}
func convertMavenPackage(component cdx.Component) string {
// By default, Jar uses `Group` field for `GroupID`
// Use original `ArtifactID` and `GroupID`
if component.Group != "" {
return fmt.Sprintf("%s:%s", component.Group, component.Name)
}
// if `Name` and `Group` fields don't exist - unmarshal purl
if component.Name == "" {
// Split the package into its parts
parts := strings.Split(component.PackageURL, "/")
group := parts[1]
// Get FullName with Version
nameWithVersion := parts[len(parts)-1]
// Remove the Version from the package
nameWOVersion := strings.Split(nameWithVersion, "@")[0]
return fmt.Sprintf("%s:%s", group, nameWOVersion)
}
// in some cases `GroupID:ArtifactID` is stored in `Name`
return component.Name
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the specification, name is a required field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... right. Thanks!
Then, we can check that the "Name" field contains "GroupID".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikpivkin @DmitriyLewen thanks for your advices! Added some more checks. Main idea is to catch smth like:
1st Version of SBOM (we don't have Group and that's why we can get empty list of vulners):

{
      "bom-ref": "",
      "name": "",
      "purl": "",
      "scope": "",
      "type": "",
      "version": ""
}

2nd Version of SBOM (we have both - Group and Name):

{
      "group" : "",
      "name" : "",
      "version" : "",
      "cpe" : "",
      "purl" : "",
      "externalReferences" : [
        {
          "type" : "",
          "url" : ""
        }
      ],
      "type" : "",
      "bom-ref" : ""
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your examples are not valid.
e.g. Name is required field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your examples are not valid. e.g. Name is required field.

Hm, i removed all data in this two examples just to show fields. In both cases we have name, but in first we don't have group. Maybe i didn't understand you comment abiut "name is required".

Yes if purl and group are missing we miss the package. In last commit I've added checks like you told me - check the group. You want me to add some checks to identify if purl is present? Because at the moment I don't check if purl is present and trying to parse it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... i didn't understand that you removed all values from these fields ( i thought these field are empty).

You want me to add some checks to identify if purl is present

I think we need to add that.
group and purl can be omitted. So in this case we need to get package name from name field.
It is strange, but it is possible:

{
      "bom-ref": "",
      "name": "",
      "scope": "",
      "type": "",
      "version": ""
}

I

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! Will do it. But if we don't have group or purl what we need in return? In case of Java only name will be not enough to get vulnerability data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use name field. It is better than nothing.
It is also possible that name will use <GroupID>:<ArtifactID> format.
e.g. we used this format in name before #4674.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope i get it right :) Added checks and name as default behavior if there is no group or purl

@j1nka
Copy link
Contributor Author

j1nka commented Sep 18, 2023

Hello @j1nka Thanks for your work!

Some languages have nuances with lowercase in purl (e.g. Go, npm). To avoid problem with that - we use original names (#4306 (comment)).

Maven plugin uses Name and Group field. We try to adhere to the language format.

But i understand your point. I left a comment. Take a look, please.

Regards, Dmitriy

@DmitriyLewen thanks for the clarification! But i'm talking only about Java projects and nothing else. It may happen that we don't get vulnerability data from SBOM because sometimes we don't have "Group"

@DmitriyLewen
Copy link
Contributor

It may happen that we don't get vulnerability data from SBOM because sometimes we don't have "Group"

Can you share some example for this case?
I am wondering where these apps store GroupID.

e.g. before #4674 we save ArtifactID + GroupID in name field.

@j1nka
Copy link
Contributor Author

j1nka commented Sep 18, 2023

It may happen that we don't get vulnerability data from SBOM because sometimes we don't have "Group"

Can you share some example for this case? I am wondering where these apps store GroupID.

e.g. before #4674 we save ArtifactID + GroupID in name field.

Sure! Two examples above are the real SBOMs generated by different tools (not Trivy)

Comment on lines 424 to 436
// Split the package into its parts
parts := strings.Split(component.PackageURL, "/")

group := parts[1]

// Get FullName with Version
nameWithVersion := parts[len(parts)-1]

// Remove the Version from the package
nameWOVersion := strings.Split(nameWithVersion, "@")[0]

return fmt.Sprintf("%s:%s", group, nameWOVersion)
// In case we don't have Group or PURL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already parsed purl here -

pkg := p.Package()

We can use this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already parsed purl here -

pkg := p.Package()

We can use this.

Great idea, thanks! Made some changes

Comment on lines 411 to 415
func getPackageName(purl *purl.PackageURL, component cdx.Component) string {
if purl.Type == packageurl.TypeMaven {
return convertMavenPackage(purl, component)
}
return component.Name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i rechecked code before this line and found that we don't parse CycloneDX without purl -

if component.PackageURL == "" {
log.Logger.Warnf("Skip the component (BOM-Ref: %s) as the PURL is empty", component.BOMRef)
return false, "", nil, ErrPURLEmpty
}

So we can remove this check (sorry for confusing you).

I think we can remove purl == "" check and move logic to getPackageName and use parsed package name (pkg.Name = getPackageName(p.Type, pkg.Name, component)). Something like that:

Suggested change
func getPackageName(purl *purl.PackageURL, component cdx.Component) string {
if purl.Type == packageurl.TypeMaven {
return convertMavenPackage(purl, component)
}
return component.Name
func getPackageName(typ, pkgNameFromPurl string, component cdx.Component) string {
if typ == packageurl.TypeMaven {
// Jar uses `Group` field for `GroupID`
if component.Group != "" {
return fmt.Sprintf("%s:%s", component.Group, component.Name)
} else {
// use name derived from purl if `Group` doesn't exist
return pkgNameFromPurl
}
}
return component.Name
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow :) Great idea! Thanks a lot! Done :)

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now.
@j1nka Thanks for your work!

Can you update title of this PR?
(we don't use purl instead of name and group name => we use one of these fields.

@j1nka j1nka changed the title fix(sbom): use PURL instead of Group and Name fix(sbom): use PURL or Group and Name in case of Java Sep 20, 2023
@j1nka
Copy link
Contributor Author

j1nka commented Sep 20, 2023

Looks good now. @j1nka Thanks for your work!

Can you update title of this PR? (we don't use purl instead of name and group name => we use one of these fields.

Hi! Sure. Smth like that? Thanks for your help and patience, Dmitriy!

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks!

@knqyf263 i approved this PR.

@j1nka
Copy link
Contributor Author

j1nka commented Sep 29, 2023

Great! Thanks!

@knqyf263 i approved this PR.

@knqyf263 @DmitriyLewen Hi! Is there anything else I should to to merge this PR? Thanks!

@knqyf263 knqyf263 requested a review from nikpivkin October 3, 2023 09:09
@knqyf263
Copy link
Collaborator

knqyf263 commented Oct 3, 2023

@nikpivkin requested changes. We cannot merge the PR until @nikpivkin approves it.

@knqyf263 knqyf263 enabled auto-merge October 3, 2023 09:10
@knqyf263 knqyf263 added this pull request to the merge queue Oct 3, 2023
Merged via the queue into aquasecurity:main with commit 393bfdc Oct 3, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants