feat: CBOM support #933

Open · wants to merge 8 commits into base: main
Conversation

@san-zrl commented Sep 27, 2024

Description

Enhances DT to read, persist, serve, and export CBOM 1.6 data. See the Additional Details section for specifics on what has changed. Note that there is a corresponding PR for hyades-frontend that enhances the UI to render CBOM data.

Addressed Issue

Issue #1538

Additional Details

  • Added model classes and metrics for CBOM data to org.dependencytrack.model
  • Enhanced ModelConverter to populate the additional model classes
  • ModelConverter became quite massive, so I split it into ModelConverter (CycloneDX -> internal model) and ModelExporter (internal model -> CycloneDX)
  • Added org.dependencytrack.persistence.v1.CryptoAssetsResource with endpoints that serve the UI
  • Enhanced the exporter to include CBOM data in the downloaded BOM
  • CBOM data are components of type Classifier.CRYPTOGRAPHIC_ASSET with CryptoProperties and Occurrence attributes (see the sketch below)
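
For orientation, this is roughly the shape of such a component when built with the cyclonedx-core-java model (a minimal sketch; Component.Type.CRYPTOGRAPHIC_ASSET and the crypto property classes follow that library's CycloneDX 1.6 support, but the exact setter names should be treated as assumptions):

    import org.cyclonedx.model.Component;
    import org.cyclonedx.model.component.crypto.AlgorithmProperties;
    import org.cyclonedx.model.component.crypto.CryptoProperties;
    import org.cyclonedx.model.component.crypto.enums.AssetType;

    public class CryptoComponentSketch {
        // Builds a CycloneDX 1.6 cryptographic-asset component: an ordinary
        // component whose type is CRYPTOGRAPHIC_ASSET, carrying its crypto
        // details in a CryptoProperties object.
        static Component buildExample() {
            final Component component = new Component();
            component.setName("AES-128-GCM"); // illustrative name
            component.setType(Component.Type.CRYPTOGRAPHIC_ASSET);

            final CryptoProperties cryptoProperties = new CryptoProperties();
            cryptoProperties.setAssetType(AssetType.ALGORITHM);
            cryptoProperties.setOid("2.16.840.1.101.3.4.1.6"); // OID reused from the logs below

            final AlgorithmProperties algorithmProperties = new AlgorithmProperties();
            cryptoProperties.setAlgorithmProperties(algorithmProperties);

            component.setCryptoProperties(cryptoProperties);
            return component;
        }
    }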

Checklist

  • I have read and understand the contributing guidelines
  • This PR fixes a defect, and I have provided tests to verify that the fix is effective
  • This PR implements an enhancement, and I have provided tests to verify that it works as intended
  • This PR introduces changes to the database model, and I have updated the migration changelog accordingly
  • This PR introduces new or alters existing behavior, and I have updated the documentation accordingly

@nscuro added the "enhancement" label Sep 27, 2024
@VinodAnandan (Contributor) commented:

Hi @san-zrl, thank you for the pull request! I believe this will be a great feature. It seems the test pipeline action is currently failing; would you mind taking a look at it?

https://github.com/DependencyTrack/hyades-apiserver/actions/runs/11067343603/job/30750563096?pr=933

"Error:    BomResourceTest.exportProjectAsCycloneDxInventoryTest:266 JSON documents are different:
Array "dependencies" has different length, expected: <4> but was: <2>.
Array "dependencies" has different content. Missing values: [{"ref":"${json-unit.matches:componentWithVulnUuid}","dependsOn":[]}, {"ref":"${json-unit.matches:componentWithVulnAndAnalysisUuid}","dependsOn":[]}], extra values: [], expected: <[{"ref":"${json-unit.matches:projectUuid}","dependsOn":["${json-unit.matches:componentWithoutVulnUuid}","${json-unit.matches:componentWithVulnAndAnalysisUuid}"]},{"ref":"${json-unit.matches:componentWithoutVulnUuid}","dependsOn":["${json-unit.matches:componentWithVulnUuid}"]},{"ref":"${json-unit.matches:componentWithVulnUuid}","dependsOn":[]},{"ref":"${json-unit.matches:componentWithVulnAndAnalysisUuid}","dependsOn":[]}]> but was: <[{"ref":"a2e6e2b6-9352-49e7-a189-140edfda8b13","dependsOn":["dbce1c1b-e1d1-40d6-8d3e-8bb0b27a3ec1","e119913b-4ced-4582-a1de-f31f49c7d27f"]},{"ref":"dbce1c1b-e1d1-40d6-8d3e-8bb0b27a3ec1","dependsOn":["f8ccf117-7d61-47f6-b417-bb74927184ca"]}]>
Different keys found in node "metadata", missing: "metadata.manufacture", expected: <{"authors":[{"name":"bomAuthor"}],"component":{"author":"SampleAuthor","bom-ref":"${json-unit.matches:projectUuid}","name":"acme-app","supplier":{"name":"projectSupplier"},"type":"application","version":"SNAPSHOT"},"manufacture":{"name":"projectManufacturer"},"supplier":{"name":"bomSupplier"},"timestamp":"${json-unit.any-string}","tools":[{"name":"Dependency-Track","vendor":"OWASP","version":"${json-unit.any-string}"}]}> but was: <{"authors":[{"name":"bomAuthor"}],"component":{"author":"SampleAuthor","bom-ref":"a2e6e2b6-9352-49e7-a189-140edfda8b13","manufacturer":{"name":"projectManufacturer"},"name":"acme-app","supplier":{"name":"projectSupplier"},"type":"application","version":"SNAPSHOT"},"supplier":{"name":"bomSupplier"},"timestamp":"2024-09-27T08:57:00Z","tools":[{"name":"Dependency-Track","vendor":"OWASP","version":"5.6.0-SNAPSHOT"}]}>
Different keys found in node "metadata.component", extra: "metadata.component.manufacturer", expected: <{"author":"SampleAuthor","bom-ref":"${json-unit.matches:projectUuid}","name":"acme-app","supplier":{"name":"projectSupplier"},"type":"application","version":"SNAPSHOT"}> but was: <{"author":"SampleAuthor","bom-ref":"a2e6e2b6-9352-49e7-a189-140edfda8b13","manufacturer":{"name":"projectManufacturer"},"name":"acme-app","supplier":{"name":"projectSupplier"},"type":"application","version":"SNAPSHOT"}>
Different value found in node "specVersion", expected: <"1.5"> but was: <"1.6">.

Error:    BomResourceTest.exportProjectAsCycloneDxInventoryWithVulnerabilitiesTest:505 
expected: 200
 but was: 500
Error:    BomResourceTest.exportProjectAsCycloneDxLicenseTest:391 JSON documents are different:
Array "dependencies" has different length, expected: <2> but was: <0>.
Array "dependencies" has different content. Missing values: [{"ref":"${json-unit.matches:projectUuid}","dependsOn":[]}, {"ref":"${json-unit.matches:component}","dependsOn":[]}], expected: <[{"ref":"${json-unit.matches:projectUuid}","dependsOn":[]},{"ref":"${json-unit.matches:component}","dependsOn":[]}]> but was: <[]>
Different value found in node "specVersion", expected: <"1.5"> but was: <"1.6">.

Error:    BomResourceTest.exportProjectAsCycloneDxVdrTest:700 
expected: 200
 but was: 500
Error:    VexResourceTest.exportProjectAsCycloneDxTest:145 
expected: 200
 but was: 500
Error:    VexResourceTest.exportVexWithDifferentVulnAnalysisValidJsonTest:485 
expected: 200
 but was: 500
Error:    VexResourceTest.exportVexWithSameVulnAnalysisValidJsonTest:387 
expected: 200
 but was: 500
Error:    BomUploadProcessingTaskTest.informWithExistingDuplicateComponentPropertiesAndBomWithDuplicateComponentProperties:1618 
Expecting code not to raise a throwable but caught
  "javax.jdo.JDOObjectNotFoundException: Object with id "org.dependencytrack.model.ComponentProperty:5188" not found !
	at org.datanucleus.api.jdo.JDOAdapter.getJDOExceptionForNucleusException(JDOAdapter.java:637)
	at org.datanucleus.api.jdo.JDOPersistenceManager.jdoRefresh(JDOPersistenceManager.java:495)
	at org.datanucleus.api.jdo.JDOPersistenceManager.refresh(JDOPersistenceManager.java:507)
	at org.dependencytrack.tasks.BomUploadProcessingTaskTest.lambda$informWithExistingDuplicateComponentPropertiesAndBomWithDuplicateComponentProperties$116(BomUploadProcessingTaskTest.java:1618)
	at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:63)
	at org.assertj.core.api.NotThrownAssert.isThrownBy(NotThrownAssert.java:43)
	at org.dependencytrack.tasks.BomUploadProcessingTaskTest.informWithExistingDuplicateComponentPropertiesAndBomWithDuplicateComponentProperties(BomUploadProcessingTaskTest.java:1618)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:316)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:240)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:214)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:155)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
NestedThrowablesStackTrace:
Object with id "org.dependencytrack.model.ComponentProperty:5188" not found !
org.datanucleus.exceptions.NucleusObjectNotFoundException: Object with id "org.dependencytrack.model.ComponentProperty:5188" not found !
	at org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:492)
	at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:427)
	at org.datanucleus.state.StateManagerImpl.loadFieldsFromDatastore(StateManagerImpl.java:1632)
	at org.datanucleus.state.StateManagerImpl.refreshFieldsInFetchPlan(StateManagerImpl.java:4034)
	at org.datanucleus.api.jdo.state.Hollow.transitionRefresh(Hollow.java:169)
	at org.datanucleus.state.StateManagerImpl.refresh(StateManagerImpl.java:1031)
	at org.datanucleus.ExecutionContextImpl.refreshObject(ExecutionContextImpl.java:1664)
	at org.datanucleus.api.jdo.JDOPersistenceManager.jdoRefresh(JDOPersistenceManager.java:490)
	at org.datanucleus.api.jdo.JDOPersistenceManager.refresh(JDOPersistenceManager.java:507)
	at org.dependencytrack.tasks.BomUploadProcessingTaskTest.lambda$informWithExistingDuplicateComponentPropertiesAndBomWithDuplicateComponentProperties$116(BomUploadProcessingTaskTest.java:1618)
	at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:63)
	at org.assertj.core.api.NotThrownAssert.isThrownBy(NotThrownAssert.java:43)
	at org.dependencytrack.tasks.BomUploadProcessingTaskTest.informWithExistingDuplicateComponentPropertiesAndBomWithDuplicateComponentProperties(BomUploadProcessingTaskTest.java:1618)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:316)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:240)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:214)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:155)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
"
Error:  Errors: 
Error:    ModelConverterTest.testConvertCycloneDX1:91 » NullPointer Cannot invoke "org.cyclonedx.model.component.crypto.enums.AssetType.ordinal()" because the return value of "org.dependencytrack.model.CryptoAssetProperties.getAssetType()" is null
Error:    ModelConverterTest.testGenerateDependencies:203 » JDOUser One or more instances could not be made persistent
Error:    BomUploadProcessingTaskTest.informTest:138->lambda$informTest$8:139 NullPointer Cannot invoke "org.dependencytrack.model.OrganizationalEntity.getName()" because "manufacturer" is null
Error:    BomUploadProcessingTaskTest.informWithExistingComponentPropertiesAndBomWithComponentProperties:1399 » JDOObjectNotFound Object with id "org.dependencytrack.model.Component:17394" not found !
Error:    BomUploadProcessingTaskTest.informWithExistingComponentPropertiesAndBomWithoutComponentProperties:1371 » JDOObjectNotFound Object with id "org.dependencytrack.model.Component:2852" not found !
[INFO] 
Error:  Tests run: 1708, Failures: 8, Errors: 5, Skipped: 2
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  32:34 min
[INFO] Finished at: 2024-09-27T09:10:35Z
[INFO] ------------------------------------------------------------------------
Error:  Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.5.0:test (default-test) on project dependency-track: There are test failures.
Error:  
Error:  Please refer to /home/runner/work/hyades-apiserver/hyades-apiserver/target/surefire-reports for the individual test results.
Error:  Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
Error:  -> [Help 1]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please read the following articles:
Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Error: Process completed with exit code 1.
"

Signed-off-by: san-zrl <san@zurich.ibm.com>

fix: added CryptoAssetsResource

Signed-off-by: san-zrl <san@zurich.ibm.com>

added getAllCryptoAssets() per project and globally

Signed-off-by: san-zrl <san@zurich.ibm.com>
Signed-off-by: san-zrl <san@zurich.ibm.com>
@san-zrl (Author) commented Oct 1, 2024

Hi @VinodAnandan - I looked into the tests and fixed the main problems, which are related to the migration to CycloneDX 1.6. I'm not sure whether the entire test set passes, because I never managed to run it successfully on my local system. I'm on a Mac M1, and getting Testcontainers to run was a challenge. Issues:

  1. There are some tests that rely on the JDO refresh feature to sync Java objects with background changes in the database. One such example is informWithExistingComponentPropertiesAndBomWithoutComponentProperties. I couldn't get those to run and commented them out for the time being.
  2. MigrationInitializerTest creates a temporary Postgres container as the migration target. Some of the 4 tests in this class always fail; the behavior is pretty random.

@nscuro (Member) commented Oct 1, 2024

Not sure if the entire test set works because I never managed to run it successfully on my local system. I'm on a Mac M1 and getting testcontainers to run was a challenge

Can you elaborate on what about Testcontainers was problematic? The team has so far been working predominantly with M1 Macs, so that should not be a problem.

There are some tests that rely on the jdo refresh feature to sync java objects with background changes in the database. [...] I couldn't get those to run and commented them out for the time being.

Can you share the errors you were getting here? Really all the refresh is doing is reloading the object from the database, so it's not much different from doing a SELECT query.
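
For illustration, the refresh the tests rely on boils down to this plain JDO call (a minimal sketch):

    import javax.jdo.PersistenceManager;

    final class RefreshSketch {
        // refresh() re-reads the object's fields from the database, discarding
        // in-memory state -- effectively a SELECT by the object's identity. If
        // the backing row has been deleted in the meantime, it throws
        // javax.jdo.JDOObjectNotFoundException, which is the failure mode seen
        // in the CI log above.
        static void syncWithDatabase(final PersistenceManager pm, final Object persistentObject) {
            pm.refresh(persistentObject);
        }
    }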

MigrationInitializerTest creates a temporary postgres container as the migration target. Some of the 4 tests in this class always fail. Behavior is pretty random.

Same as above, can you share the errors you're getting?

@nscuro (Member) commented Oct 1, 2024

Also, have you tried whether launching the API server with Dev Services works for you? https://dependencytrack.github.io/hyades/0.6.0-SNAPSHOT/development/testing/#api-server

@san-zrl (Author) commented Oct 1, 2024

  1. BomUploadProcessingTaskTest#informWithExistingComponentPropertiesAndBomWithComponentProperties (test report: org.dependencytrack.tasks.BomUploadProcessingTaskTest.txt) fails with
     javax.jdo.JDOObjectNotFoundException: Object with id "org.dependencytrack.model.Component:33590" not found
  2. Some tests in MigrationInitializerTest (test report: org.dependencytrack.persistence.migration.MigrationInitializerTest.txt) always fail with the message below. Some seem to work even though they use the same mechanism underneath.
     Caused by: org.postgresql.util.PSQLException: Connection to localhost:33271 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.

I ran the above tests with: mvn -P enhance verify -Dtest=<class>[#<method>]

@nscuro (Member) commented Oct 1, 2024

I can reproduce the failures of the component property tests in this PR, but not in main. The tests expect a component to be modified, but it gets deleted instead, which means DT couldn't match its identity upon the second BOM upload.

The problem seems to be the modified buildExactComponentIdentityQuery method, where you currently have:

        if (cid.getOid() != null) {
            filterParts.add("(cryptoAssetProperties != null && cryptoAssetProperties.oid == :oid)");
            params.put("oid", cid.getOid());
        } else {
            filterParts.add("cryptoAssetProperties != null && cryptoAssetProperties.oid == null");
        }

But it should be this instead:

        if (cid.getOid() != null) {
            filterParts.add("(cryptoAssetProperties != null && cryptoAssetProperties.oid == :oid)");
            params.put("oid", cid.getOid());
        } else {
            filterParts.add("(cryptoAssetProperties == null || cryptoAssetProperties.oid == null)");
        }

MigrationInitializerTest completes successfully for me, and it doesn't fail in CI either. I wonder if perhaps there's something special about your local Docker setup. Are you using Docker, or some derivative such as Podman or Rancher?

@san-zrl (Author) commented Oct 1, 2024

@nscuro: Good catch, thanks!

I'm using Rancher.

@san-zrl (Author) commented Oct 1, 2024

@nscuro - The buildExactComponentIdentityQuery fix resolved most of the issues related to refresh(). However, there are still 3 testUpdateMetricsUnchanged tests that fail with assertion violations on timestamps. Here's an example: org.dependencytrack.tasks.metrics.ProjectMetricsUpdateTaskTest.txt. Could you have a look at this?

@nscuro (Member) commented Oct 1, 2024

Will have a look at the failing tests.

Regarding Testcontainers, could this be relevant? https://docs.rancherdesktop.io/how-to-guides/using-testcontainers/#prerequisites

Signed-off-by: san-zrl <san@zurich.ibm.com>
@san-zrl (Author) commented Oct 1, 2024

Regarding Testcontainers, could this be relevant? https://docs.rancherdesktop.io/how-to-guides/using-testcontainers/#prerequisites

I've seen this. Rancher runs with admin rights, Kubernetes is disabled, and the VM type is set to QEMU. My env settings are:

DOCKER_HOST=unix:///Users/san/.rd/docker.sock
TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE=/var/run/docker.sock
TESTCONTAINERS_HOST_OVERRIDE=localhost
TESTCONTAINERS_RYUK_DISABLED=true

Signed-off-by: san-zrl <san@zurich.ibm.com>
@nscuro (Member) commented Oct 1, 2024

TESTCONTAINERS_RYUK_DISABLED=true

Not entirely sure, but maybe Ryuk not being there could be a problem. Can you try enabling it?

codacy-production (bot) commented Oct 1, 2024

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: -0.54% (target: -1.00%)
Diff coverage: 62.00% (target: 70.00%)

Coverage variation details:

  Commit                             Coverable lines   Covered lines   Coverage
  Common ancestor commit (b0581ff)   21938             18137           82.67%
  Head commit (279450c)              22674 (+736)      18623 (+486)    82.13% (-0.54%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:

  Scope                 Coverable lines   Covered lines   Diff coverage
  Pull request (#933)   1029              638             62.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified> / <coverable lines added or modified> * 100%

@san-zrl (Author) commented Oct 1, 2024

TESTCONTAINERS_RYUK_DISABLED=false makes no difference for MigrationInitializerTest.

Signed-off-by: san-zrl <san@zurich.ibm.com>
Signed-off-by: san-zrl <san@zurich.ibm.com>
Signed-off-by: san-zrl <san@zurich.ibm.com>
@san-zrl (Author) commented Oct 5, 2024

Hi @nscuro, I pushed some code to bump the test coverage of the PR above the required threshold. The corresponding pipeline actions have been hanging since yesterday morning (https://github.com/DependencyTrack/hyades-apiserver/actions/runs/11177278805) with the message "This workflow is awaiting approval from a maintainer". Could you check what's going on?

@nscuro (Member) commented Oct 7, 2024

@san-zrl PRs from first-time contributors need explicit approval for workflows to run. Approved it now.

@san-zrl (Author) commented Oct 8, 2024

@VinodAnandan, thanks for letting me know. Diff test coverage is not quite on target. Do you want me to increase this number further?

@nscuro (Member) commented Oct 8, 2024

@san-zrl Don't sweat the coverage too much for now. I haven't yet had a chance to thoroughly review the PR, and I'm taking a few days "off" at the moment, so there's limited time I'm spending on GitHub.

Assigning this to me, I'll get back to you ASAP.

@nscuro self-requested a review October 8, 2024 14:49
@nscuro (Member) left a review comment:

Some initial feedback with mostly questions from my side. Apologies for letting this sit for so long.

Comment on lines 261 to 266
<column name="BOM_REF" type="VARCHAR(64)"/>
<column name="LOCATION" type="VARCHAR(255)"/>
<column name="LINE" type="INT"/>
<column name="OFFSET" type="INT"/>
<column name="SYMBOL" type="INT"/>
<column name="ADDITIONAL_CONTEXT" type="VARCHAR(255)"/>
@nscuro (Member):

The character limits on BOM_REF, LOCATION and ADDITIONAL_CONTEXT seem a bit optimistic to me. If LOCATION ends up being a file path, I can see the limit of 255 being broken rather trivially. Similar story for ADDITIONAL_CONTEXT, which seems to be a field of arbitrary content?

@san-zrl (Author):

Hi @nscuro, thanks a lot for your review.

The code above comes from the definition of the OCCURRENCES table. The PR currently adds all crypto-related data fields that our scanner (https://github.com/IBM/cbomkit) generates. We can drop OCCURRENCES because the location of crypto assets in source is probably not important in DT.

Nevertheless, your point regarding the size of the varchar columns is still valid. There is no proper reasoning behind the current numbers. We started with some guesstimates based on the standard (https://cyclonedx.org/docs/1.6/json/) and examples (https://github.com/CycloneDX/bom-examples/tree/master/CBOM), and simply doubled them when the initial numbers weren't sufficient. I'd gladly use other numbers with a better foundation. If data size is an issue, we should go through all varchars one by one and decide (a) whether the field has to be persisted and (b) if so, what size is needed.

@nscuro (Member):

We can drop OCCURRENCES because the location of crypto assets in source is probably not important in DT.

Occurrence will likely come to DT in the future to cater to other use cases, but if it's not crucial for the crypto functionality it's good to drop here IMO.

I'd gladly use other numbers with better foundation. If data size is an issue we should go through all varchars one-by-one and decide (a) does it have to be persisted and (b) if so, what size is needed.

In PostgreSQL it's all TEXT (unlimited length) behind the scenes, with an implicit CHECK constraint for the length. So other than preventing users from storing GBs of useless data, size constraints don't matter. With this in mind, I think choosing an initial length of 1024 should be fine.

@san-zrl (Author):

Occurrence will likely come to DT in the future to cater to other use cases, but if it's not crucial for the crypto functionality it's good to drop here IMO
....
With this in mind, I think choosing an initial length of 1024 should be fine.

Okay, thanks.

  1. I will remove Occurrence with the next commit.
  2. VARCHAR length will default to 1024 unless we know better.

Comment on lines +124 to +131
<addColumn tableName="COMPONENT">
<column name="CRYPTO_PROPERTIES_ID" type="BIGINT"/>
</addColumn>

<createTable tableName="CRYPTO_PROPERTIES">
<column autoIncrement="true" name="ID" type="BIGINT">
<constraints nullable="false" primaryKey="true" primaryKeyName="CRYPTO_PROPERTIES_PK"/>
</column>
@nscuro (Member):

Wouldn't it make more sense to have a COMPONENT_ID column on the CRYPTO_PROPERTIES table?

Given the FK constraint with onDelete="CASCADE", the current setup would cause the COMPONENT record to automatically be deleted when CRYPTO_PROPERTIES gets deleted. It should be the other way around, i.e. CRYPTO_PROPERTIES being deleted automatically when COMPONENT gets deleted.

@san-zrl (Author):

Yes, true. That would be a better idea, but I didn't manage to make it work.

I tried to specify a COMPONENT_ID in CRYPTO_PROPERTIES that refers upwards to the enclosing component, along with an FK constraint that would cause the right order of cascading deletes. This led to a problem when trying to persist the modified crypto asset properties object. I reproduced the error message with stack trace and appended it below for reference. It turns out that CryptoAssetProperties@5af49362 is not the instance from ModelConverter (in which the component ID was properly set) but another instance created somewhere deep down in the transaction code.

2024-10-21 15:22:19,384 ERROR [Persist] Insert of object "org.dependencytrack.model.CryptoAssetProperties@5af49362" using statement "INSERT INTO "CRYPTO_PROPERTIES" ("ALGORITHM_PROPERTIES_ID","ASSET_TYPE","CERTIFICATE_PROPERTIES_ID","COMPONENT_ID","OID","PROTOCOL_PROPERTIES_ID","RELATED_MATERIAL_PROPERTIES_ID") VALUES (?,?,?,?,?,?,?)" failed : ERROR: null value in column "COMPONENT_ID" of relation "CRYPTO_PROPERTIES" violates not-null constraint
  Detail: Failing row contains (3, null, ALGORITHM, 3, null, null, null, 2.16.840.1.101.3.4.1.6). [bomSerialNumber=e8c355aa-2142-4084-a8c7-6d42c8610ba2, bomFormat=CycloneDX, bomUploadToken=6cb46db7-e009-4146-8615-122179457b63, projectName=x, bomSpecVersion=1.6, projectUuid=416f54cc-8621-492f-838f-b87ef3de9dad, projectVersion=null, bomVersion=1]
javax.jdo.JDODataStoreException: Insert of object "org.dependencytrack.model.CryptoAssetProperties@5af49362" using statement "INSERT INTO "CRYPTO_PROPERTIES" ("ALGORITHM_PROPERTIES_ID","ASSET_TYPE","CERTIFICATE_PROPERTIES_ID","COMPONENT_ID","OID","PROTOCOL_PROPERTIES_ID","RELATED_MATERIAL_PROPERTIES_ID") VALUES (?,?,?,?,?,?,?)" failed : ERROR: null value in column "COMPONENT_ID" of relation "CRYPTO_PROPERTIES" violates not-null constraint
  Detail: Failing row contains (3, null, ALGORITHM, 3, null, null, null, 2.16.840.1.101.3.4.1.6).
	at org.datanucleus.api.jdo.JDOAdapter.getJDOExceptionForNucleusException(JDOAdapter.java:608)
	at org.datanucleus.api.jdo.JDOPersistenceManager.flush(JDOPersistenceManager.java:2057)
	at org.dependencytrack.tasks.BomUploadProcessingTask.processComponents(BomUploadProcessingTask.java:529)
...

@nscuro (Member):

Just to be sure, is this how you modeled the relationship in this case? https://www.datanucleus.org/products/accessplatform_6_0/jdo/mapping.html#one_many_fk_bi

@san-zrl (Author):

I wanted the 'inner' dependent object to be fetched when the 'outer' object is queried, and I wanted the cascading delete to work both ways. The code does that, so I think we can leave it as is.

The fetching works since the inner objects (e.g., CryptoAssetProperties) are modelled as persistent members of the outer (e.g., Component) object. Inner objects cannot exist without their 'outer' container. Thus, cascading delete must work both ways:

  1. Deleting an 'outer' object must delete all 'inner' objects. This is guaranteed by the dependent = "true" flag in the Persistent annotation. Here is an example:
    @Persistent(defaultFetchGroup = "true", dependent = "true")
    @Index(name = "COMPONENT_CRYPTO_PROPERTIES_ID_IDX")
    @Column(name = "CRYPTO_PROPERTIES_ID", allowsNull = "true")
    private CryptoAssetProperties cryptoAssetProperties;
  2. Deleting an 'inner' object renders the outer objects invalid and they must be deleted. This is guaranteed by the foreign key constraints. Here is an example:
    <addForeignKeyConstraint baseTableName="COMPONENT" baseColumnNames="CRYPTO_PROPERTIES_ID"
    constraintName="COMPONENT_CRYPTO_PROPERTIES_FK" deferrable="true" initiallyDeferred="true"
    referencedTableName="CRYPTO_PROPERTIES" referencedColumnNames="ID"
    onDelete="CASCADE" onUpdate="NO ACTION" validate="true"/>

@nscuro (Member):

Can we add a few tests with crypto BOMs here that verify correct ingestion of data?

From what I'm seeing we're currently relying on the ORM to implicitly persist the object graph, but it may very well be that a more manual approach is necessary (see BomUploadProcessingTask#processComponents).

@san-zrl (Author):

I will add some tests. Your observation is right; we rely on the ORM magic.
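
For what it's worth, a rough shape such a test could take (a sketch only: the base class, fixture path, and uploadBomAndWait helper are hypothetical stand-ins loosely modeled on BomUploadProcessingTaskTest, not code from this PR):

    import static org.assertj.core.api.Assertions.assertThat;

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;
    import org.dependencytrack.model.Classifier;
    import org.dependencytrack.model.Component;
    import org.dependencytrack.model.Project;
    import org.junit.Test;

    public class CryptoBomIngestionTest extends PersistenceCapableTest { // hypothetical base class providing qm

        @Test
        public void shouldPersistCryptoAssetPropertiesFromCbom() throws Exception {
            // Project creation signature as in existing tests; treat as a stand-in.
            final Project project = qm.createProject("acme-crypto-app", null, "1.0", null, null, null, true, false);

            // Ingest a CBOM fixture containing cryptographic-asset components.
            final byte[] bom = Files.readAllBytes(Paths.get("src/test/resources/unit/bom-crypto.json"));
            uploadBomAndWait(project, bom); // hypothetical helper triggering BomUploadProcessingTask

            // Verify the crypto layer of the object graph was actually persisted,
            // rather than silently dropped by implicit ORM reachability.
            final List<Component> components = qm.getAllComponents(project);
            assertThat(components)
                    .filteredOn(c -> c.getClassifier() == Classifier.CRYPTOGRAPHIC_ASSET)
                    .isNotEmpty()
                    .allSatisfy(c -> {
                        assertThat(c.getCryptoAssetProperties()).isNotNull();
                        assertThat(c.getCryptoAssetProperties().getAssetType()).isNotNull();
                    });
        }
    }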

Comment on lines +66 to +71
@Persistent
@Column(name = "SIGNATURE_ALGORITHM_REF", jdbcType = "VARCHAR", length=64)
private String signatureAlgorithmRef;
@Persistent
@Column(name = "SUBJECT_PUBLIC_KEY_REF", jdbcType = "VARCHAR", length=64)
private String subjectPublicKeyRef;
@nscuro (Member):

From what I understand from the schema, these are BOM refs. I'm wondering if they should be resolved prior to persisting them, such that these would be foreign keys instead? Is there a reason to leave them plain like this?

@san-zrl (Author):

You're right, they should be resolved. I haven't done this yet because the issue appears in many places where the standard has xxxRef data fields. In most cases, these are bomrefs that could be resolved to components. Most prominently, the bomref resolution question arises in the dependencies, where DT currently stores a JSON structure, which imho cannot be the final solution. Hence I thought that bomref resolution is still an open issue and left the refs plain for the time being.

@nscuro (Member) commented Oct 22, 2024:

Here DT currently stores a JSON structure which imho cannot not be the final solution

Very much agreed. An overhaul is planned but needs more fleshing out: DependencyTrack/dependency-track#3452

Would the resolved references here be part of the dependency graph though, or would it be a separate, graph-like structure? Seeing as the CDX standard still lacks "type" properties on dependency graph nodes...

@san-zrl (Author):

In principle it's the same tree structure. The bomrefs in the ref fields must obviously point to other components in the BOM, otherwise they would be dangling. In a sense, the standard is "over-specified": you can have ref fields that point to components, and you can have dependency objects that also state that one component depends on another. It is not clear whether the specification of a signatureAlgorithmRef requires the specification of a dependency object.

I think our final goal should be to model signatureAlgorithmRef as

 private Component signatureAlgorithmRef;

that is set as the result of the bomref resolution. Maybe we can make these fields transient because they get re-created on the fly when the BOM is loaded.
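
A sketch of what that resolution could look like during ingestion (hypothetical helper, not code from this PR; it assumes a first pass has already registered every persisted Component under its bom-ref):

    import java.util.Map;
    import org.dependencytrack.model.Component;

    final class BomRefResolver {

        private final Map<String, Component> componentsByBomRef;

        BomRefResolver(final Map<String, Component> componentsByBomRef) {
            this.componentsByBomRef = componentsByBomRef;
        }

        // Resolves a ref field such as signatureAlgorithmRef to the component it
        // points at. Returns null for a dangling reference, which -- per the
        // discussion above -- the spec does not strictly rule out.
        Component resolve(final String bomRef) {
            return bomRef != null ? componentsByBomRef.get(bomRef) : null;
        }
    }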

Comment on lines 197 to 203
<createTable tableName="CRYPTO_FUNCTIONS">
<column autoIncrement="true" name="ID" type="BIGINT">
<constraints nullable="false" primaryKey="true" primaryKeyName="CRYPTO_FUNCTIONS_PK"/>
</column>
<column name="ALGORITHM_PROPERTY_ID" type="BIGINT"/>
<column name="CRYPTO_FUNCTION" type="VARCHAR(32)"/>
</createTable>
@nscuro (Member):

Could this just be an array column in the ALGORITHM_PROPERTY table?

@san-zrl (Author):

Yes, absolutely. I avoided array column types since they are not supported by all DB systems, but if we stick to Postgres, using array types would greatly simplify the table structure. There are a couple of similar structures in CBOM 1.6 where we could migrate to arrays.
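
For reference, the JDO side of such a mapping could look like this (a sketch under stated assumptions: whether DataNucleus honors sqlType = "text[]" for a String collection in exactly this way needs verifying, and the column would still need its counterpart in the Liquibase changelog):

    import java.util.List;
    import javax.jdo.annotations.Column;
    import javax.jdo.annotations.PersistenceCapable;
    import javax.jdo.annotations.Persistent;

    @PersistenceCapable(table = "ALGORITHM_PROPERTY")
    public class AlgorithmProperties {

        // Replaces the CRYPTO_FUNCTIONS join table with a single PostgreSQL
        // array column. The exact mapping attributes are an assumption, not
        // taken from this PR.
        @Persistent
        @Column(name = "CRYPTO_FUNCTIONS", sqlType = "text[]")
        private List<String> cryptoFunctions;
    }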

Comment on lines 252 to 255
<createTable tableName="COMPONENT_OCCURRENCES">
<column name="COMPONENT_ID" type="BIGINT"/>
<column name="OCCURRENCE_ID" type="BIGINT"/>
</createTable>
@nscuro (Member):

I'd replace this join table with a join column (i.e. have a COMPONENT_ID column in the OCCURRENCES table). Since the component<->occurrence relationship is 1:N, the overhead of a join table is not necessary.

@san-zrl (Author):

The PR currently adds crypto-related data fields that our scanner (https://github.com/IBM/cbomkit) generates. We can drop OCCURRENCES because the location of crypto assets in source is probably not important in DT.

@nscuro (Member):

General question: Is all this new data required to make crypto integration useful? One thing I'd like to avoid is persisting a lot of data without having a need for it.

The less data we store, the fewer pains we'll have in the future as the CycloneDX specification evolves and shifts things around. Due to the tight coupling of the model to the CycloneDX spec, that is definitely something to be wary about.

@san-zrl (Author):

As I said, we could drop Occurrences. Given that each crypto component can have many occurrences, this would already save a lot. Other than Occurrences, the PR adds the entire cryptoProperties realm from https://cyclonedx.org/docs/1.6/json/ to persistence. There are some things that may be of less importance in the context of DT (e.g., the implementation platform and execution environment in algorithm properties), but these are only enum values, so the saving potential is limited. We could have a call and go through the data fields one by one. Perhaps there's too much detail in the standard for DT's purposes.

@@ -135,7 +135,7 @@ public List<Component> getAllComponents() {
@SuppressWarnings("unchecked")
public List<Component> getAllComponents(Project project) {
final Query<Component> query = pm.newQuery(Component.class, "project == :project");
query.getFetchPlan().setMaxFetchDepth(2);
query.getFetchPlan().setMaxFetchDepth(3);
@nscuro (Member):

What data are we trying to capture with this increase in fetch depth? I wonder if FetchGroups would be a better tool to avoid over-fetching of data that is not strictly needed.

@san-zrl (Author):

The frontend PR contains code for rendering crypto properties. They form an additional layer in the persistence model which must be fetched for rendering (fetch depth 2 was not enough). To keep things simple, I just raised the fetch depth to make this work. We could define a dedicated crypto component fetch group and use it only when crypto assets are queried.
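
Such a dedicated group could look roughly like this in plain JDO (a sketch; the group name and the stub classes are hypothetical):

    import javax.jdo.PersistenceManager;
    import javax.jdo.Query;
    import javax.jdo.annotations.FetchGroup;
    import javax.jdo.annotations.PersistenceCapable;
    import javax.jdo.annotations.Persistent;

    // A named fetch group on the component class that pulls in the crypto layer
    // only when explicitly requested.
    @PersistenceCapable
    @FetchGroup(name = "CRYPTO", members = {@Persistent(name = "cryptoAssetProperties")})
    class ComponentStub {
        @Persistent(defaultFetchGroup = "false")
        CryptoAssetPropertiesStub cryptoAssetProperties;
    }

    @PersistenceCapable
    class CryptoAssetPropertiesStub {
    }

    final class CryptoQueries {
        // Activate the group only for crypto-asset queries; every other
        // component query keeps the default fetch plan untouched.
        static Query<ComponentStub> newCryptoAssetQuery(final PersistenceManager pm) {
            final Query<ComponentStub> query = pm.newQuery(ComponentStub.class, "project == :project");
            query.getFetchPlan().addGroup("CRYPTO");
            return query;
        }
    }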

@nscuro (Member):

OK, I'd propose we deal with this once everything else is in place, as it's just an optimization and DataNucleus' fetch behavior can be quite fiddly to tune.

Comment on lines 43 to 45
@Persistent
@Column(name = "BOM_REF", jdbcType = "VARCHAR", length = 64)
private String bomRef;
@nscuro (Member):

Do we need to store the BOM ref?

@san-zrl (Author):

No, we could drop this. The PR currently adds all crypto-related data fields that our scanner (https://github.com/IBM/cbomkit) generates.

…oin tables with array columns

Signed-off-by: san-zrl <san@zurich.ibm.com>
@san-zrl (Author) commented Oct 29, 2024

Hi @nscuro - I just pushed commit 850eab9 that addresses some of the issues we discussed:

  • Occurrence data is no longer processed.
  • Added some tests for ingesting crypto BOMs.
  • Collections of simple types are persisted as text[] columns in Postgres; join tables are only used for collections of complex types.

Issues not addressed yet:

  • BOM refs are not yet resolved; they are still stored as plain strings.
  • Crypto fields are still part of the default fetch group. No dedicated fetch plan yet.

There is also a corresponding commit for hyades-frontend: DependencyTrack/hyades-frontend@62be90f
