cyhy-cvesync is a Python library that can retrieve JSON files containing Common
Vulnerabilities and Exposures (CVE) data (such as those from the National
Vulnerability Database (NVD)) and import the data into a
MongoDB collection.
- Python 3.12 or newer
- A running MongoDB instance that you have access to
Important
Docker must be installed for the command below to work.
You can start a local MongoDB instance in a container with the following command:
pytest -vs --mongo-express
Note
The command pytest -vs --mongo-express not only starts a local MongoDB instance, but also runs all the cyhy-cvesync unit tests, which will create various collections and documents in the database.
Sample output (trimmed to highlight the important parts):
<snip>
MongoDB is accessible at mongodb://mongoadmin:secret@localhost:32881 with database named "test"
Mongo Express is accessible at http://admin:pass@localhost:8081
Press Enter to stop Mongo Express and MongoDB containers...
Based on the example output above, you can access the MongoDB instance at mongodb://mongoadmin:secret@localhost:32881 and the Mongo Express web interface at http://admin:pass@localhost:8081. Note that the MongoDB and Mongo Express containers will remain running until you press "Enter" in that terminal.
Once you have a MongoDB instance running, the sample Python code below demonstrates how to initialize the CyHy database, fetch CVE data from a source, and then load the data into your database.
import asyncio

from cyhy_cvesync import DEFAULT_CVE_URL_PATTERN
from cyhy_cvesync.cve_sync import process_urls
from cyhy_db import initialize_db
from cyhy_db.models import CVEDoc


async def main():
    # Initialize the CyHy database
    await initialize_db("mongodb://mongoadmin:secret@localhost:32881", "test")

    # Count number of CVE documents in DB before sync
    cve_count_before = await CVEDoc.find_all().count()
    print(f"CVE documents in DB before sync: {cve_count_before}")

    # Fetch CVE data from the default source for a single year and sync it to the database
    cve_url = DEFAULT_CVE_URL_PATTERN.format(year=2024)
    print(f"Processing CVE data from: {cve_url}...")
    (
        created_cve_docs_count,
        updated_cve_docs_count,
        deleted_cve_docs_count,
    ) = await process_urls([cve_url], cve_data_gzipped=True, concurrency=1)
    print(f"Created CVE documents: {created_cve_docs_count}")
    print(f"Updated CVE documents: {updated_cve_docs_count}")
    print(f"Deleted CVE documents: {deleted_cve_docs_count}")

    # Count number of CVE documents in DB after sync
    cve_count_after = await CVEDoc.find_all().count()
    print(f"CVE documents in DB after sync: {cve_count_after}")


asyncio.run(main())
Output:
CVE documents in DB before sync: 20
Processing CVE data from: https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2024.json.gz...
Deleting outdated CVE docs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Created CVE documents: 12174
Updated CVE documents: 0
Deleted CVE documents: 0
CVE documents in DB after sync: 12194
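The sample above syncs a single year. Since process_urls accepts a list of URLs, one plausible way to cover several years is to format the URL pattern once per year. The sketch below is illustrative only: the pattern string is an assumption inferred from the URL printed in the sample output (in real code you would use DEFAULT_CVE_URL_PATTERN from cyhy_cvesync), and build_cve_urls is a hypothetical helper, not part of the library.

```python
# Assumed pattern, inferred from the URL shown in the sample output above.
# In real code, prefer DEFAULT_CVE_URL_PATTERN imported from cyhy_cvesync.
ASSUMED_CVE_URL_PATTERN = (
    "https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-{year}.json.gz"
)


def build_cve_urls(first_year, last_year, pattern=ASSUMED_CVE_URL_PATTERN):
    """Return one feed URL per year in the inclusive range (hypothetical helper)."""
    if first_year > last_year:
        raise ValueError("first_year must not be greater than last_year")
    return [pattern.format(year=year) for year in range(first_year, last_year + 1)]


if __name__ == "__main__":
    # Print the URLs that would be handed to process_urls
    for url in build_cve_urls(2022, 2024):
        print(url)
```

The resulting list could then be passed to process_urls in place of [cve_url] in the sample code.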
Variable | Description | Default |
---|---|---|
MONGO_INITDB_ROOT_USERNAME | The MongoDB root username | mongoadmin |
MONGO_INITDB_ROOT_PASSWORD | The MongoDB root password | secret |
DATABASE_NAME | The name of the database to use for testing | test |
MONGO_EXPRESS_PORT | The port to use for the Mongo Express web interface | 8081 |
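If you override these variables, the connection string used in your own scripts needs to match. A minimal sketch of how that might be done is shown here. It assumes the same variable names and defaults as the table; build_mongo_uri and database_name are hypothetical helpers, and the port must be supplied by you because it is taken from the pytest output (32881 in the sample output above).

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(port, host="localhost", env=None):
    """Build a MongoDB URI from the documented variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to the defaults listed in the table
    username = env.get("MONGO_INITDB_ROOT_USERNAME", "mongoadmin")
    password = env.get("MONGO_INITDB_ROOT_PASSWORD", "secret")
    # Percent-encode credentials so characters such as "@" or "/" stay valid in a URI
    return f"mongodb://{quote_plus(username)}:{quote_plus(password)}@{host}:{port}"


def database_name(env=None):
    """Return the test database name, defaulting to the documented value."""
    env = os.environ if env is None else env
    return env.get("DATABASE_NAME", "test")


if __name__ == "__main__":
    # 32881 is the port from the sample output; yours will likely differ
    print(build_mongo_uri(32881), database_name())
```

The two return values correspond to the two arguments passed to initialize_db in the sample code.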
Option | Description | Default |
---|---|---|
--mongo-express | Start a local MongoDB instance and Mongo Express web interface | n/a |
--mongo-image-tag | The tag of the MongoDB Docker image to use | docker.io/mongo:latest |
--runslow | Run slow tests | n/a |
We welcome contributions! Please see CONTRIBUTING.md for details.
This project is in the worldwide public domain.
This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.