Feature improved java cataloging #2769
Signed-off-by: Gijs Calis <51088038+GijsCalis@users.noreply.github.com>
Update configuration documentation Improve maven groupid detection Signed-off-by: Gijs Calis <51088038+GijsCalis@users.noreply.github.com>
Note: The 'Detect schema changes / Label changes' check failed, but it should pass on a re-run of the job.
@kzantow, @willmurphyscode : I can split this PR into smaller parts, each adding part of the improvements:
What would be the best way forward?
I would love to have this PR merged, but I can totally understand that syft currently doesn't trigger external tools. Still, this would also help to better catalog gradle packages.
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Hey @GijsCalis -- I made a significant number of changes since you submitted this PR, as you may have noticed, so I wanted to explain why and ask whether you have any time or desire to kick the tires on this branch, to make sure it still works as you expect for the cases you care about and had working. I tried not to miss anything, but I could have, since it was a large PR and maven is a large ecosystem 😬

I also wanted to say a HUGE THANK YOU for getting this work in motion and for all the effort you've put in here! Thank you very much for doing so much of the work so we could understand a number of things that needed to be in place for this to work (like BOMs, which I hadn't dealt with during my Maven days long ago).

One of the blocking things in the original set of changes was keeping global state that tracked all the poms found in a static map. This could cause problems, for example, if Syft was being used as a library and scanned multiple things without resetting the map between runs. However, once I got into the nuts and bolts of how this worked, it was quickly apparent that having an in-memory cache of poms was necessary for performance and for locating things in certain cases. This was the motivating factor behind most of the changes here.

To accomplish this, I refactored maven things into a

The original set of changes was not explicitly working to scan a multi-module project source but rather relied on the fact that top-level

A few other notes: I've also integrated with the general cache, so any network requests for poms will be cached locally to improve subsequent performance. I also implemented a few more small details, like the [probably infrequently used] alternate maven local repository directory lookup, and fixed as much of the property resolution usage as I could. An example is... it actually works to have a variable in an artifact name provided by a parent pom.
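To illustrate the two ideas above (a pom cache owned by a resolver instance rather than a global static map, and properties resolved through a parent pom), here is a minimal sketch in Go. It is not the code from this PR: the `pom`, `coordinate`, and `resolver` types and their methods are hypothetical names made up for the illustration, and built-in Maven properties such as `project.version` are left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// coordinate identifies a pom (hypothetical type, for illustration only).
type coordinate struct{ groupID, artifactID, version string }

// pom is a heavily simplified stand-in for a parsed pom.xml.
type pom struct {
	groupID, artifactID, version string
	parent                       *coordinate // nil when there is no <parent>
	properties                   map[string]string
}

// resolver owns its pom cache, so separate scans never share state the way
// they would with a package-level map.
type resolver struct {
	poms map[coordinate]*pom
}

func newResolver() *resolver { return &resolver{poms: map[coordinate]*pom{}} }

func (r *resolver) add(p *pom) {
	r.poms[coordinate{p.groupID, p.artifactID, p.version}] = p
}

var placeholder = regexp.MustCompile(`\$\{([^}]+)\}`)

// lookup searches p and then its ancestors for a property value.
func (r *resolver) lookup(p *pom, name string) (string, bool) {
	for depth := 0; p != nil && depth < 32; depth++ { // depth limit guards against parent cycles
		if v, ok := p.properties[name]; ok {
			return v, true
		}
		if p.parent == nil {
			break
		}
		p = r.poms[*p.parent] // nil if the parent pom is not cached
	}
	return "", false
}

// resolve expands ${...} placeholders in value; unknown ones are left untouched.
func (r *resolver) resolve(p *pom, value string) string {
	for i := 0; i < 8; i++ { // bounded passes, since properties may reference other properties
		next := placeholder.ReplaceAllStringFunc(value, func(m string) string {
			name := placeholder.FindStringSubmatch(m)[1]
			if v, ok := r.lookup(p, name); ok {
				return v
			}
			return m
		})
		if next == value {
			break
		}
		value = next
	}
	return value
}

func main() {
	r := newResolver()
	r.add(&pom{groupID: "com.example", artifactID: "parent", version: "1.0.0",
		properties: map[string]string{"base.name": "demo"}})
	child := &pom{groupID: "com.example", artifactID: "${base.name}-core", version: "1.0.0",
		parent: &coordinate{"com.example", "parent", "1.0.0"}}
	fmt.Println(r.resolve(child, child.artifactID)) // prints "demo-core"
}
```

Because each `resolver` carries its own map, a caller using this as a library could create one resolver per scan and discard it afterwards, with nothing to reset between runs.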
I suspect this type of thing isn't used a whole lot but I'm sure there are people using properties for group, artifact, and version somewhere. Using the repositories you linked above, the current results look similar to where you left the PR, although I didn't quite get the same numbers you had in your table, with the latest commit before my changes. Regardless, this is what things look like with the latest changes I pushed:
And some examples of the property resolution with the same things:
Again, THANK YOU for this, it is very much appreciated. Apologies that it has taken so long to get these changes reviewed and mergeable, and also for the length of this comment. 😁
…rties-order' into feature-improved-java-cataloging
I just tried it with a few repos of mine and it works like a charm for large maven multi-module projects.
Really nice work @GijsCalis and @kzantow !!
As announced in PR #2669 I've improved the package detection for Java/Maven packages by:
I've added support for using the local Maven cache because it usually contains all the required pom.xml files when scanning on a system where the code has been built.
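As a rough illustration of such a lookup (a sketch, not the code in this PR), the Go snippet below builds the path where Maven's standard repository layout stores a pom for given coordinates, assuming the default local repository location `~/.m2/repository`. The function names `localPomPath` and `defaultLocalRepo` and the example coordinates are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// localPomPath returns the location of a pom under repoDir, following the
// standard Maven repository layout:
//   <repo>/<group/as/dirs>/<artifactId>/<version>/<artifactId>-<version>.pom
func localPomPath(repoDir, groupID, artifactID, version string) string {
	groupDir := filepath.Join(strings.Split(groupID, ".")...)
	return filepath.Join(repoDir, groupDir, artifactID, version, artifactID+"-"+version+".pom")
}

// defaultLocalRepo returns ~/.m2/repository. This is an assumption: the
// location can be overridden in Maven settings, which this sketch ignores.
func defaultLocalRepo() (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".m2", "repository"), nil
}

func main() {
	repo, err := defaultLocalRepo()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	p := localPomPath(repo, "org.apache.commons", "commons-lang3", "3.12.0")
	if _, err := os.Stat(p); err != nil {
		fmt.Println("not in local repository:", p)
		return
	}
	fmt.Println("found:", p)
}
```

A lookup like this needs no network access, which is why a repository populated by a prior build can make scanning both faster and more complete.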
As a result the scanning is significantly more complete and faster, see table below with test results.
I've run the tests on the following projects:
Also find attached some SBOM files generated by syft v1.1.1 and the version in this PR.
sbom.cyclonedx.httpcomponents-new.json
sbom.cyclonedx.httpcomponents-v1.1.1.json
sbom.cyclonedx.jackson-new.json
sbom.cyclonedx.jackson-v1.1.1.json
sbom.cyclonedx.zookeeper-new-no-network-with-local-repo-after-build.json
sbom.cyclonedx.zookeeper-v1.1.1-with-network.json
sbom.cyclonedx.petclinic-new-no-network-no-local-repo.json
Uploading sbom.cyclonedx.petclinic-v1.1.1-with-network.json…