Modify apache_kafka.py and related tests for migration #1042

johnmhoran · 2022-12-12T22:59:28Z

No description provided.

pombredanne

Thank you! Please see my feedback, which is mostly about cleaning comments out of the code so we can merge.

vulnerabilities/importers/apache_kafka.py

vulnerabilities/tests/test_apache_kafka.py

vulnerabilities/tests/test_data/apache_kafka/jmh-test-01.txt

vulnerabilities/importers/apache_kafka.py

TG1999 · 2023-01-23T22:03:25Z

@johnmhoran please rebase your branch and also provide the importer and improver logs for this importer.

johnmhoran · 2023-01-24T01:33:44Z

@TG1999 After prep steps I ran into an error trying to run the apache_kafka importer.

(venv) Mon Jan 23, 2023 05:22 PM  /home/jmh/dev/nexb/vulnerablecode jmh (972-migrate-apache-kafka-importer)
$ ./manage.py import vulnerabilities.importers.apache_kafka.ApacheKafkaImporter
Importing data using vulnerabilities.importers.apache_kafka.ApacheKafkaImporter
Traceback (most recent call last):
  File "/home/jmh/dev/nexb/vulnerablecode/vulnerabilities/management/commands/import.py", line 60, in import_data
    ImportRunner(importer).run()
  File "/home/jmh/dev/nexb/vulnerablecode/vulnerabilities/import_runner.py", line 43, in run
    advisory_datas = importer_class().advisory_data()
  File "/home/jmh/dev/nexb/vulnerablecode/vulnerabilities/importer.py", line 321, in advisory_data
    raise NotImplementedError
NotImplementedError
Failed to run importer vulnerabilities.importers.apache_kafka.ApacheKafkaImporter. Continuing...
CommandError: 1 failed!: vulnerabilities.importers.apache_kafka.ApacheKafkaImporter

(venv) Mon Jan 23, 2023 05:23 PM  /home/jmh/dev/nexb/vulnerablecode jmh (972-migrate-apache-kafka-importer)
$

I don't understand why NotImplementedError was raised, since we do return AdvisoryData() objects. Perhaps this is related to our use of hard-coded values in lieu of code?
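For readers hitting the same traceback: the error is raised by the base class's `advisory_data` method, which is what runs when a subclass does not override it under that exact name. The sketch below reproduces the pattern with hypothetical stand-in classes. `BaseImporter`, `OldStyleImporter`, `MigratedImporter` and the `updated_advisories` method name are illustrative assumptions, not the actual VulnerableCode code, and this is only one possible cause of the error reported here.

```python
class BaseImporter:
    """Hypothetical stand-in for an importer base class."""

    def advisory_data(self):
        # Subclasses are expected to override this method.
        raise NotImplementedError


class OldStyleImporter(BaseImporter):
    # Puts its logic under a different (hypothetical) method name, so the
    # base-class advisory_data() is still the one that gets called.
    def updated_advisories(self):
        return ["advisory"]


class MigratedImporter(BaseImporter):
    # Overrides advisory_data() itself, so the runner receives real data.
    def advisory_data(self):
        return ["advisory"]


def run(importer_class):
    """Mimic a runner that instantiates the class and calls advisory_data()."""
    try:
        return list(importer_class().advisory_data())
    except NotImplementedError:
        return "NotImplementedError"
```

With these stand-ins, `run(OldStyleImporter)` reports the error even though the class can produce advisories, while `run(MigratedImporter)` returns them.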

johnmhoran · 2023-01-24T01:53:32Z

@TG1999 I've been unable to track down the source of the error. All tests pass. Perhaps we can discuss tomorrow?

johnmhoran · 2023-01-24T22:34:55Z

@TG1999 All 10 GH checks have passed and both the apache_kafka importer and the default improver ran successfully (see attached file).
vcio-apache_kafka-import-and-improve-2023-01-24.txt

Please note, however, that the improver success message included a warning:

(venv) Tue Jan 24, 2023 02:09 PM  /home/jmh/dev/nexb/vulnerablecode jmh (972-migrate-apache-kafka-importer)
$ ./manage.py improve vulnerabilities.improvers.default.DefaultImprover
Improving data using vulnerabilities.improvers.default.DefaultImprover
Failed to get exact purls for AffectedPackage(package=PackageURL(type='apache', namespace=None, name='kafka', version=None, qualifiers={}, subpath=None), affected_version_range=ApacheVersionRange(constraints=(VersionConstraint(comparator='>=', version=SemverVersion(string='0.11.0.0')), VersionConstraint(comparator='<=', version=SemverVersion(string='2.1.0')), VersionConstraint(comparator='<', version=SemverVersion(string='2.1.1')))), fixed_version=None) Invalid constraints sequence: [VersionConstraint(comparator='>=', version=SemverVersion(string='0.11.0.0')), VersionConstraint(comparator='<=', version=SemverVersion(string='2.1.0')), VersionConstraint(comparator='<', version=SemverVersion(string='2.1.1'))]
Successfully improved data using vulnerabilities.improvers.default.DefaultImprover

(venv) Tue Jan 24, 2023 02:15 PM  /home/jmh/dev/nexb/vulnerablecode jmh (972-migrate-apache-kafka-importer)
$
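The warning above is consistent with the constraint list containing two upper bounds in a row (`<=2.1.0` followed by `<2.1.1`) with no lower bound between them. The following is a minimal sketch of such an alternation check. `find_invalid_pair` is a hypothetical helper written for illustration, not the univers API, and the rule it applies is a simplified reading of how range constraints are expected to alternate.

```python
LOWER = {">", ">="}
UPPER = {"<", "<="}


def find_invalid_pair(constraints):
    """Return the first adjacent pair of constraints that breaks alternation.

    `constraints` is a list of (comparator, version) tuples that is assumed
    to be already sorted by version. Returns None when the sequence is fine.
    """
    # "!=" constraints are ignored for this check.
    kept = [(c, v) for c, v in constraints if c != "!="]
    for (c1, v1), (c2, v2) in zip(kept, kept[1:]):
        # A lower bound must be closed by an upper bound.
        if c1 in LOWER and c2 not in UPPER:
            return (c1 + v1, c2 + v2)
        # An upper bound must not be followed by another upper bound.
        if c1 in UPPER and c2 in UPPER:
            return (c1 + v1, c2 + v2)
    return None


# Constraints copied from the warning message above.
kafka_constraints = [(">=", "0.11.0.0"), ("<=", "2.1.0"), ("<", "2.1.1")]
print(find_invalid_pair(kafka_constraints))  # ('<=2.1.0', '<2.1.1')
```

Dropping either of the two upper bounds makes the sequence pass this check.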

TG1999

LGTM!

pombredanne

Thank you++
See my comments for your consideration.

pombredanne · 2023-01-26T21:26:53Z

vulnerabilities/tests/test_data/apache_kafka/test-advisories.json

+        "affected_packages": [
+            {
+                "package": {
+                    "type": "maven",


There is no such thing as an "apache_kafka" maven package. This is a 404: https://repo1.maven.org/maven2/apache_kafka

There are instead two possible package URLs to return:
1. pkg:apache/kafka (let's use this ONLY for now)
2. and many different Maven PURLs because there are many JARS in kafka. That's a job for an improver later.

To get an idea of what the Maven PURLs would be, the archive at https://downloads.apache.org/kafka/3.3.2/kafka_2.12-3.3.2.tgz contains all these JARs:

kafka_2.13-3.3.2.jar

kafka-clients-3.3.2.jar

kafka-log4j-appender-3.3.2.jar

kafka-metadata-3.3.2.jar

kafka-raft-3.3.2.jar

kafka-server-common-3.3.2.jar

kafka-shell-3.3.2.jar

kafka-storage-3.3.2.jar

kafka-storage-api-3.3.2.jar

kafka-streams-3.3.2.jar

kafka-streams-examples-3.3.2.jar

kafka-streams-scala_2.13-3.3.2.jar

kafka-streams-test-utils-3.3.2.jar

kafka-tools-3.3.2.jar

That's only for a Kafka build for Scala 2.12.
There is another for 2.13 with a second set of similar JARs.

All these 14 JARs would exist on Maven. For instance with version 3.3.2, we would have:

https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.12/3.3.2/ as pkg:maven/org.apache.kafka/kafka_2.12@3.3.2

https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.13/3.3.2/ as pkg:maven/org.apache.kafka/kafka_2.13@3.3.2
and many variations for each of the JARs above.

THEREFORE, let's use only a series of pkg:apache/kafka PURLs for now.
@johnmhoran please create an issue to later create an improver that will handle the maven creation.
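To illustrate the distinction drawn above, here is a small sketch that builds the single `pkg:apache/kafka` PURL and, for comparison, the kind of Maven PURL a later improver might derive from a JAR file name. The helpers `make_purl` and `maven_purl_from_jar` are hypothetical and use plain string formatting rather than the packageurl library. The `org.apache.kafka` namespace is taken from the Maven URLs quoted above.

```python
def make_purl(type_, name, namespace=None, version=None):
    """Format a simple Package URL string (no qualifiers or subpath)."""
    path = f"{namespace}/{name}" if namespace else name
    purl = f"pkg:{type_}/{path}"
    return f"{purl}@{version}" if version else purl


def maven_purl_from_jar(jar_name, version, namespace="org.apache.kafka"):
    """Derive a Maven PURL from a JAR file name such as kafka-clients-3.3.2.jar."""
    suffix = f"-{version}.jar"
    if not jar_name.endswith(suffix):
        raise ValueError(f"unexpected JAR name: {jar_name}")
    # Everything before "-<version>.jar" is treated as the artifact id.
    artifact = jar_name[: -len(suffix)]
    return make_purl("maven", artifact, namespace=namespace, version=version)


print(make_purl("apache", "kafka"))
# pkg:apache/kafka
print(maven_purl_from_jar("kafka-clients-3.3.2.jar", "3.3.2"))
# pkg:maven/org.apache.kafka/kafka-clients@3.3.2
```

One importer-level PURL stays stable across releases, whereas the Maven side fans out into one PURL per JAR and per Scala build, which is why it is left to a separate improver.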

Thank you @pombredanne for your detailed comments. 👍 This test file was an earlier draft that I should have deleted; I'm doing so now. I've also opened a new issue re a Maven improver as requested.

vulnerabilities/tests/test_data/apache_kafka/test-advisories.json

pombredanne · 2023-01-26T21:31:34Z

vulnerabilities/tests/test_data/apache_kafka/test-advisories.json

+                    "qualifiers": null,
+                    "subpath": null
+                },
+                "affected_version_range": "vers:maven/2.0.0|2.0.1|2.1.0|2.1.1|2.2.0|2.2.1|2.2.2|2.3.0|2.3.1|2.4.0|2.4.1|2.5.0|2.5.1|2.6.0|2.6.1|2.6.2|!=2.6.3|2.7.0|2.7.1|!=2.7.2|2.8.0.|!=2.8.1|<3.0.0",


Are you sure the range uses Maven syntax for all their versions, including some older ones found in http://archive.apache.org/dist/kafka/ ?

@pombredanne I don't understand your question, but I think this is superseded by your decision above to use only apache PURLs for now and to add an issue for a maven improver.

Good point!
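For anyone revisiting the range question later, a quick way to eyeball a range string like the one quoted above is to split it into its scheme and constraints and flag versions that do not look like plain dotted numbers. This is only a sketch: `split_vers` and `suspicious_versions` are hypothetical helpers, not the univers API, and the dotted-numeric pattern is an assumption for illustration rather than the Maven version grammar.

```python
import re

# Two-character comparators come first so "<=" is not read as "<".
COMPARATORS = ("!=", "<=", ">=", "<", ">", "=")
DOTTED_NUMERIC = re.compile(r"\d+(\.\d+)*")


def split_vers(vers):
    """Split a vers string into its scheme and (comparator, version) pairs."""
    prefix, _, rest = vers.partition(":")
    if prefix != "vers" or "/" not in rest:
        raise ValueError(f"not a vers string: {vers}")
    scheme, _, constraints = rest.partition("/")
    pairs = []
    for item in constraints.split("|"):
        for comparator in COMPARATORS:
            if item.startswith(comparator):
                pairs.append((comparator, item[len(comparator):]))
                break
        else:
            # A bare version is treated as an equality constraint.
            pairs.append(("=", item))
    return scheme, pairs


def suspicious_versions(vers):
    """Return versions that are not plain dotted-numeric strings."""
    _, pairs = split_vers(vers)
    return [v for _, v in pairs if not DOTTED_NUMERIC.fullmatch(v)]


# Tail of the range string from the diff above.
print(suspicious_versions("vers:maven/2.7.1|!=2.7.2|2.8.0.|!=2.8.1|<3.0.0"))
# ['2.8.0.']
```

Run against the tail of the quoted range, the check flags the `2.8.0.` entry with its trailing dot.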

vulnerabilities/tests/test_data/apache_kafka/test-advisories.json

vulnerabilities/tests/test_data/apache_kafka/to-advisory-apache_kafka-expected.json

vulnerabilities/importers/apache_kafka.py

pombredanne · 2023-01-26T21:50:47Z

vulnerabilities/importers/apache_kafka.py

+            fixed_versions_string = fixed_versions_row.find_all("td")[1].text
+
+            # Remove leading white space after initial comma
+            affected_versions_string_split_SPLIT = [


This is a very variable name.... Consider this code instead:

affected_versions = [v.strip() for v in affected_versions.split(",")]
affected_versions = [v for v in affected_versions if v]

@pombredanne After making your suggested changes I get an error re my hard-coded mapping:

TypeError: unhashable type: 'list'

Replacing the awkwardly-named

affected_versions_string_split_SPLIT = [
    substring.strip()
    for substring in affected_versions.split(",")
    if not substring.isspace()
]

with

affected_versions = [v.strip() for v in affected_versions.split(",")]
affected_versions = [v for v in affected_versions if v]

has converted the previously-defined affected_versions = affected_versions_row.find_all("td")[1].text to a list, which won't work as a mapping/dict key. There's probably a simple fix for this, giving it some thought now....

@pombredanne This is where the conversion of affected_versions to a list throws an error. It also applies to your subsequent comment re fixed_versions_string_split_SPLIT.

# This throws a KeyError if the opening h2 tag `id` data changes or is not in the
# hard-coded affected_version_range_mapping dictionary.
if affected_version_range_mapping[cve_id]["action"] == "include":
    # These 2 variables (not used elsewhere) trigger the KeyError for changed/missing data.
    check_affected_versions_key = affected_version_range_mapping[cve_id][
        affected_versions
    ]
    check_fixed_versions_key = affected_version_range_mapping[cve_id][
        fixed_versions_string
    ]

I don't see how to modify the excerpted code to deal with your conversion of affected_versions to a list.

Hmm, by simply changing the variable name ever so slightly to avoid using affected_versions, your approach works. Am I missing something? ;-) This way we use your cleaner code AND avoid converting affected_versions itself to an unwanted list.

affected_versions_clean = [v.strip() for v in affected_versions.split(",")]
affected_versions_clean = [v for v in affected_versions_clean if v]
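The `TypeError` discussed in this thread can be reproduced in isolation: a string works as a dictionary key and a list does not, so rebinding `affected_versions` to a list breaks the later lookup. A minimal sketch follows. The sample string, the one-entry `mapping` and the `clean_versions` helper are made-up illustrative values, not data or code from the importer.

```python
def clean_versions(raw):
    """Split a comma-separated version string and drop empty entries."""
    stripped = [v.strip() for v in raw.split(",")]
    return [v for v in stripped if v]


# Made-up sample data for illustration only.
affected_versions = "2.0.0, 2.0.1, , 2.1.0"
mapping = {affected_versions: "some hard-coded range"}

# Keep the raw string for the lookup; bind the list to a different name.
affected_versions_clean = clean_versions(affected_versions)
print(affected_versions_clean)     # ['2.0.0', '2.0.1', '2.1.0']
print(mapping[affected_versions])  # some hard-coded range

try:
    mapping[affected_versions_clean]
    lookup_failed = False
except TypeError:
    # A list is unhashable, so it cannot be used as a dict key.
    lookup_failed = True
print(lookup_failed)               # True
```

Note that the second comprehension has to filter the already-stripped list; iterating over the original string instead would yield single characters.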

pombredanne · 2023-01-26T21:51:00Z

vulnerabilities/importers/apache_kafka.py

+                for substring in affected_versions_string.split(",")
+                if not substring.isspace()
+            ]
+            fixed_versions_string_split_SPLIT = [


Same comment as above

My above comment seems to apply with equal vigor to your suggested fixed_versions_string_split_SPLIT improvement. All tests once again pass.

johnmhoran · 2023-01-30T22:55:22Z

@pombredanne I've addressed all of your PR comments (thank you ;-), committed and pushed my updated code, and all GH checks have passed. Ready for you to review at your convenience.

vulnerabilities/importers/apache_kafka.py

johnmhoran · 2023-02-08T21:02:14Z

@pombredanne I've replaced the rather long mapping reference with a variable as you suggested and all tests and GH checks pass. 👍

pombredanne · 2023-02-10T10:24:46Z

vulnerabilities/importers/apache_kafka.py

+
+            # This throws a KeyError if the opening h2 tag `id` data changes or is not in the
+            # hard-coded affected_version_range_mapping dictionary.
+            cve_version_mapping = affected_version_range_mapping[cve_id]


Thanks for the update. I would prefer not having a _mapping suffix or similar type suffixes when they are not absolutely needed, but this is cosmetic; we can fix it later!
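As a sketch of the fail-fast lookup in the diff above: binding `affected_version_range_mapping[cve_id]` to a local name once shortens later references and raises `KeyError` immediately for a CVE id that is missing from the hard-coded dictionary. The dictionary contents and the `lookup` helper below are hypothetical placeholders, not the importer's real data.

```python
# Hypothetical placeholder data; the real mapping lives in the importer.
affected_version_range_mapping = {
    "CVE-0000-0001": {"action": "include"},
}


def lookup(cve_id):
    """Return the entry for cve_id, or raise KeyError with a clearer message."""
    try:
        return affected_version_range_mapping[cve_id]
    except KeyError:
        raise KeyError(f"{cve_id} is not in the hard-coded mapping") from None


cve_version_mapping = lookup("CVE-0000-0001")
print(cve_version_mapping["action"])  # include
```

An unknown id such as `lookup("CVE-0000-9999")` stops the run at the lookup instead of silently producing an advisory without version data.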

pombredanne

Thanks!

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from 842799b to a5f7248 Compare December 13, 2022 00:10

johnmhoran requested review from pombredanne and TG1999 December 13, 2022 01:38

pombredanne requested changes Dec 24, 2022

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from a5f7248 to bc141b6 Compare January 3, 2023 22:24

TG1999 reviewed Jan 11, 2023

TG1999 added this to the v32.0.0 milestone Jan 17, 2023

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from 639979b to e016650 Compare January 24, 2023 21:05

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from 1fb0eef to 0b26584 Compare January 25, 2023 18:19

TG1999 approved these changes Jan 26, 2023

TG1999 requested a review from pombredanne January 26, 2023 07:57

pombredanne self-assigned this Jan 26, 2023

pombredanne requested changes Jan 26, 2023

johnmhoran mentioned this pull request Jan 27, 2023

Create an improver to handle Maven creation #1106

Open

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from 0ddb6fe to a284994 Compare January 30, 2023 18:15

johnmhoran requested a review from pombredanne January 30, 2023 22:55

pombredanne reviewed Feb 8, 2023

johnmhoran added 8 commits February 8, 2023 12:57

Modify apache_kafka.py and related tests for migration #972

502d85c

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Remove unneeded comments and test files, add changelog entry for apac…

3602fe9

…he_kafka #972 Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Update vers: and package naming #972

01ffe7e

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Prepare to run apache_kafka importer #972

a9f4842

Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Fix NotImplementedError #972

bd69a8f

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Fix additional import errors #972

33bf05a

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Correct manual version range error #972

daca89d

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Fix failing test #972

6b2cd15

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

johnmhoran added 2 commits February 8, 2023 12:57

Address PR comments including simplifying code #972

2776cb2

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

Replace long mapping reference with variable #972

e33e18d

Reference: #972 Signed-off-by: John M. Horan <johnmhoran@gmail.com>

johnmhoran force-pushed the 972-migrate-apache-kafka-importer branch from b973738 to e33e18d Compare February 8, 2023 20:57

johnmhoran requested a review from pombredanne February 8, 2023 21:02

pombredanne reviewed Feb 10, 2023

pombredanne approved these changes Feb 10, 2023

pombredanne merged commit 0c04da4 into main Feb 10, 2023

pombredanne deleted the 972-migrate-apache-kafka-importer branch February 10, 2023 10:25

TG1999 mentioned this pull request Feb 10, 2023

Migrate apache_kafka #972

Closed
