-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify apache_kafka.py and related tests for migration #1042
Conversation
842799b
to
a5f7248
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Please see feedback mostly to cleanup the code from comments so we can merge.
a5f7248
to
bc141b6
Compare
@johnmhoran please rebase your branch and also provide importer improver logs for this importer. |
@TG1999 After prep steps I ran into an error trying to run the apache_kafka importer.
I don't understand why the |
@TG1999 I've been unable to track down the source of the error. All tests pass. Perhaps we can discuss tomorrow? |
639979b
to
e016650
Compare
@TG1999 All 10 GH checks have passed and both the Please note, however, that the improver success message included a warning:
|
1fb0eef
to
0b26584
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you++
See my comments for your consideration.
"affected_packages": [ | ||
{ | ||
"package": { | ||
"type": "maven", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no such thing as an "apache_kafka" maven package. This is a 404: https://repo1.maven.org/maven2/apache_kafka
There are instead two possible package URLs to return:
1. pkg:apache/kafka
(let's use this ONLY for now)
2. and many different Maven PURLs because there are many JARS in kafka. That's a job for an improver later.
To get an idea of what maven PURls would be, in this https://downloads.apache.org/kafka/3.3.2/kafka_2.12-3.3.2.tgz we have all these JARS:
- kafka_2.13-3.3.2.jar
- kafka-clients-3.3.2.jar
- kafka-log4j-appender-3.3.2.jar
- kafka-metadata-3.3.2.jar
- kafka-raft-3.3.2.jar
- kafka-server-common-3.3.2.jar
- kafka-shell-3.3.2.jar
- kafka-storage-3.3.2.jar
- kafka-storage-api-3.3.2.jar
- kafka-streams-3.3.2.jar
- kafka-streams-examples-3.3.2.jar
- kafka-streams-scala_2.13-3.3.2.jar
- kafka-streams-test-utils-3.3.2.jar
- kafka-tools-3.3.2.jar
That's only for a kafka build for Scale 2.12.
There is another for 2.13 with a second set of similar JARs.
All these 14 JARs would exist on Maven. For instance with version 3.3.2, we would have:
- https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.12/3.3.2/ as
pkg:maven/org.apache.kafka/kafka_2.12@3.3.2
- https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.13/3.3.2/ as
pkg:maven/org.apache.kafka/kafka_2.13@3.3.2
and many variation for each JARs above.
THEREFORE, let's use only a series of pkg:apache/kafka
PURLs for now.
@johnmhoran please create an issue to later create an improver that will handle the maven creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @pombredanne for your detailed comments. 👍 This test file was an earlier draft and I should have deleted it, doing so now; and I've opened a new issue re a maven
importer as requested.
vulnerabilities/tests/test_data/apache_kafka/test-advisories.json
Outdated
Show resolved
Hide resolved
"qualifiers": null, | ||
"subpath": null | ||
}, | ||
"affected_version_range": "vers:maven/2.0.0|2.0.1|2.1.0|2.1.1|2.2.0|2.2.1|2.2.2|2.3.0|2.3.1|2.4.0|2.4.1|2.5.0|2.5.1|2.6.0|2.6.1|2.6.2|!=2.6.3|2.7.0|2.7.1|!=2.7.2|2.8.0.|!=2.8.1|<3.0.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the range is using maven syntax for all their versions? including some older ones as found in http://archive.apache.org/dist/kafka/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne I don't understand your question, but I think this is superseded by your decision above to use only apache
PURLs for now and to add an issue for a maven
improver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
vulnerabilities/tests/test_data/apache_kafka/test-advisories.json
Outdated
Show resolved
Hide resolved
vulnerabilities/tests/test_data/apache_kafka/to-advisory-apache_kafka-expected.json
Outdated
Show resolved
Hide resolved
fixed_versions_string = fixed_versions_row.find_all("td")[1].text | ||
|
||
# Remove leading white space after initial comma | ||
affected_versions_string_split_SPLIT = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very variable name.... Consider this code instead:
affected_versions = [v.strip() for v in affected_versions.split(",")]
affected_versions = [v for v in affected_versions if v]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne After making your suggested changes I get an error re my hard-coded mapping:
TypeError: unhashable type: 'list'
Replacing the awkwardly-named
affected_versions_string_split_SPLIT = [
substring.strip()
for substring in affected_versions.split(",")
if not substring.isspace()
]
with
affected_versions = [v.strip() for v in affected_versions.split(",")]
affected_versions = [v for v in affected_versions if v]
has converted the previously-defined affected_versions = affected_versions_row.find_all("td")[1].text
to a list, which won't work as a mapping/dict key. There's probably a simple fix for this, giving it some thought now....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne This is where the conversion of affected_versions
to a list throws an error. It also applies to your subsequent comment re fixed_versions_string_split_SPLIT
.
# This throws a KeyError if the opening h2 tag `id` data changes or is not in the
# hard-coded affected_version_range_mapping dictionary.
if affected_version_range_mapping[cve_id]["action"] == "include":
# These 2 variables (not used elsewhere) trigger the KeyError for changed/missing data.
check_affected_versions_key = affected_version_range_mapping[cve_id][
affected_versions
]
check_fixed_versions_key = affected_version_range_mapping[cve_id][
fixed_versions_string
]
I don't see how to modify the excerpted code to deal with your conversion of affected_versions
to a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, by simply changing the variable name ever so slightly to avoid using affected_versions
, your approach works. Am I missing something? ;-) This way we use your cleaner code AND avoid converting affected_versions
itself to an unwanted list.
affected_versions_clean = [v.strip() for v in affected_versions.split(",")]
affected_versions_clean = [v for v in affected_versions if v]
for substring in affected_versions_string.split(",") | ||
if not substring.isspace() | ||
] | ||
fixed_versions_string_split_SPLIT = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My above comment seems to apply with equal vigor to your suggested fixed_versions_string_split_SPLIT
improvement. All tests once again pass.
0ddb6fe
to
a284994
Compare
@pombredanne I've addressed all of your PR comments (thank you ;-), committed and pushed my updated code, and all GH checks have passed. Ready for you to review at your convenience. |
Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
b973738
to
e33e18d
Compare
@pombredanne I've replaced the rather long mapping reference with a variable as you suggested and all tests and GH checks pass. 👍 |
|
||
# This throws a KeyError if the opening h2 tag `id` data changes or is not in the | ||
# hard-coded affected_version_range_mapping dictionary. | ||
cve_version_mapping = affected_version_range_mapping[cve_id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. I would prefer not having a _mapping
suffix or similar type suffixes when this is not absolutely needed, but this is cosmetic we can fix this later!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
No description provided.