Implement the hackage and nixpkgs meta analyzers #3549
Conversation
Signed-off-by: Magnus Viernickel <magnus.viernickel@wire.com>
src/main/java/org/dependencytrack/tasks/repositories/NixpkgsMetaAnalyzer.java
what's missing:
different than the older http client

Signed-off-by: Magnus Viernickel <magnus.viernickel@wire.com>

Force-pushed from bf2c25b to c5096b2
Signed-off-by: Magnus Viernickel <magnus.viernickel@wire.com>

Force-pushed from e96397d to 965fbc3
so if my theory is true, then this is some truly cursed shit:
Signed-off-by: Magnus Viernickel <magnus.viernickel@wire.com>

Force-pushed from a5813d9 to 93573e0
Hi, I would like some input on where to best invoke the update of the nixpkgs meta analyzer singleton. Thanks in advance!
@nscuro I don't want to pester you and understand you have a lot on your mind. We'd be grateful if you could provide some guidance at your earliest convenience to make this PR a success and land this feature in DT. Thank you!
@MangoIV @emil-wire Hey, will have a look at this ASAP and provide feedback. Thank you for your contributions! :)
@MangoIV Instead of using a singleton, consider a Caffeine cache with expiry. So the first time a task tries to access nix versions, it will fetch and parse the package index; subsequent accesses within the expiry window are served from the cache. The benefit of this is that you won't have to deal with concurrency controls such as locking on your own. Also, refreshing happens implicitly and as such doesn't require a dedicated scheduled task. Because we'd want to refresh the entire package index at once, the entire index would be a single cache entry:

```java
class NixPkgMetaAnalyzer {

    private static final Cache<String, Map<String, String>> CACHE = Caffeine.newBuilder()
            .expireAfterWrite(60, TimeUnit.MINUTES)
            .maximumSize(1)
            .build();

    MetaModel analyze(Component component) {
        Map<String, String> latestVersions = CACHE.get("nixpkgs", cacheKey -> {
            final var versions = new HashMap<String, String>();
            // TODO: Download and parse package index
            return versions;
        });

        // TODO: Do things with latestVersions
    }
}
```

More details here: https://github.com/ben-manes/caffeine/wiki/Population#manual
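For illustration, here is a stdlib-only sketch of the behavior Caffeine provides in this setup: a single, lazily loaded entry that expires after a fixed window, with loading synchronized so concurrent callers don't trigger duplicate fetches. The class and names (`ExpiringEntry`, `loader`) are made up for this example; this is neither Dependency-Track nor Caffeine code.

```java
import java.util.function.Supplier;

// Hypothetical miniature of a single-entry expiring cache.
// Caffeine implements this (and much more) for you.
class ExpiringEntry<V> {
    private final Supplier<V> loader;   // e.g. fetches and parses the package index
    private final long ttlMillis;       // expiry window
    private V value;
    private long loadedAt;

    ExpiringEntry(Supplier<V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    // Synchronized so concurrent callers never trigger duplicate loads,
    // mirroring Caffeine's per-key loading guarantee. The caller passes
    // the clock so the behavior is easy to test deterministically.
    synchronized V get(long nowMillis) {
        if (value == null || nowMillis - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = nowMillis;
        }
        return value;
    }
}
```

With a 60-minute TTL, the first `get` triggers a load and later calls inside the window return the cached index; the first call after expiry reloads it.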
Thank you for the answer, that looks very nice! It looks like a fairly large hammer for the tiny nail I'm trying to get into the wall, but if it's already used across the codebase, this looks like a feasible option; I'll adjust accordingly.
…xpkgs json Signed-off-by: Magnus Viernickel <magnus.viernickel@wire.com>
package-url/packageurl-java@73b2f51 got merged btw, i.e. theoretically I could do this minor cleanup; however, upstream hasn't done a release yet, so I don't know if it's worth the effort.
new tests pass locally, let's see if CI is merciful today 🥺
@nscuro is this how you'd imagine it? what can I do to make this PR progress?
Thanks @MangoIV!
As a future enhancement, we'd want to improve the parsing of the nixpkgs catalogue. At the moment, the entire thing is deserialized at once, which causes quite a large number of memory allocations.
A more efficient way would be to leverage a streaming API such as the one provided by Jackson. That allows us to only deserialize fields we actually care about. We make use of that when parsing CVE data from the NVD already:
dependency-track/src/main/java/org/dependencytrack/parser/nvd/NvdParser.java
Lines 82 to 107 in 3e8f93f:

```java
try (final InputStream in = Files.newInputStream(file.toPath());
     final JsonParser jsonParser = objectMapper.createParser(in)) {
    jsonParser.nextToken(); // Position cursor at first token

    // Due to JSON feeds being rather large, do not parse them completely,
    // but "stream" through them. Parsing individual CVE items
    // one-by-one allows for garbage collection to kick in sooner,
    // keeping the overall memory footprint low.
    JsonToken currentToken;
    while (jsonParser.nextToken() != JsonToken.END_OBJECT) {
        final String fieldName = jsonParser.getCurrentName();
        currentToken = jsonParser.nextToken();
        if ("CVE_Items".equals(fieldName)) {
            if (currentToken == JsonToken.START_ARRAY) {
                while (jsonParser.nextToken() != JsonToken.END_ARRAY) {
                    final ObjectNode cveItem = jsonParser.readValueAsTree();
                    parseCveItem(cveItem);
                }
            } else {
                jsonParser.skipChildren();
            }
        } else {
            jsonParser.skipChildren();
        }
    }
} catch (Exception e) {
```
It’s a tradeoff. I would be reluctant to make a guess on whether it’s wise to use streaming here without thorough benchmarks. |
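In miniature, the tradeoff is between materializing every item before reducing and folding over items as they arrive; only the latter lets earlier items become garbage while later ones are still being read. A stdlib-only sketch (the class name and integer item type are made up for illustration, standing in for parsed package entries):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class StreamVsBatch {
    // Batch: collect everything into a list first, then reduce.
    // Peak memory is proportional to the total number of items.
    static int maxBatch(Iterator<Integer> items) {
        List<Integer> all = new ArrayList<>();
        items.forEachRemaining(all::add);
        return all.stream().mapToInt(Integer::intValue).max().orElse(Integer.MIN_VALUE);
    }

    // Streaming: fold over items one at a time; each item becomes
    // collectable as soon as it has been folded into the accumulator.
    static int maxStreaming(Iterator<Integer> items) {
        int max = Integer.MIN_VALUE;
        while (items.hasNext()) {
            max = Math.max(max, items.next());
        }
        return max;
    }
}
```

Both compute the same result; whether the reduced allocations outweigh the extra parsing complexity is exactly what a benchmark would have to show.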
This was missed in DependencyTrack#3549 Signed-off-by: nscuro <nscuro@protonmail.com>
Description
This solves part 2 of #3545
Addressed Issue
#3545
Additional Details
this depends on #3546
The Apache HTTP client utils version 4 don't implement Brotli decompression. I redid some of the common HTTP client code to keep this change scoped; a future refactoring could remove the extraneous code.
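For context on the decompression point: the JDK ships a GZIP codec in `java.util.zip` but no Brotli one, which is why a `Content-Encoding: br` response needs extra handling in the HTTP client code while gzip does not. A stdlib-only round trip of the gzip case (class name and sample body are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipRoundTrip {
    // Compress a body the way a server responding with
    // Content-Encoding: gzip would.
    static byte[] gzip(String body) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Decompress on the client side. For Brotli there is no JDK
    // equivalent, so a third-party decoder would be required.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```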
Checklist