Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pagination Support for Retrieving All CVEs API #357

Closed
wants to merge 2 commits into from

Conversation

mghotz
Copy link

@mghotz mghotz commented Jan 8, 2024

I implemented a new API feature for retrieving all Common Vulnerabilities and Exposures (CVEs) with pagination support. This enhancement allows users to fetch CVE data in a paginated format, making it more efficient and user-friendly, especially when dealing with large datasets.

I manually tested the API to ensure it handles various real-world use cases effectively.

@MaineK00n
Copy link
Collaborator

MaineK00n commented Jan 8, 2024

If you want to get all CVEs, I think you can do it by combining it with existing functionality if you can get the CVE ID list.
#358

Still want to page and retrieve CVEs?

@mghotz
Copy link
Author

mghotz commented Jan 8, 2024

If you want to get all CVEs, I think you can do it by combining it with existing functionality if you can get the CVE ID list. #358

Still want to page and retrieve CVEs?
I appreciate your suggestion. However, fetching the CVE ID list and then retrieving details separately would mean two separate API calls, which isn't ideal. Our revised GetAll method efficiently consolidates this into a single API call, enhancing overall performance and reducing the load on both the server and network. This approach is more streamlined, ensuring data completeness and efficiency in one go.

@MaineK00n
Copy link
Collaborator

MaineK00n commented Jan 9, 2024

If you want to retrieve all the data and do something with it, I think it is better to create a local clone using go-cve-dictionary fetch instead of going through server mode.

I don't know the use case where you want to use pagination to retrieve all items from the DB.
Also, I am wondering whether such a function is necessary for go-cve-dictionary itself.

@@ -67,6 +69,26 @@ func getCve(driver db.DB) echo.HandlerFunc {
}
}

func getAllCves(driver db.DB) echo.HandlerFunc {
return func(c echo.Context) error {
cveDetails, err := driver.GetAll()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cveDetails, err := driver.GetAll()

Comment on lines +75 to +76
limit, _ := strconv.Atoi(c.QueryParam("limit"))
pageNum, _ := strconv.Atoi(c.QueryParam("page_num"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please handle the errors.

ctx := context.Background()

// Get all keys for CVE details
keys, err := r.conn.Keys(ctx, fmt.Sprintf(cveKeyFormat, "*")).Result()
Copy link
Collaborator

@MaineK00n MaineK00n Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaineK00n
Copy link
Collaborator

MaineK00n commented Jan 9, 2024

However, fetching the CVE ID list and then retrieving details separately would mean two separate API calls, which isn't ideal. Our revised GetAll method efficiently consolidates this into a single API call, enhancing overall performance and reducing the load on both the server and network. This approach is more streamlined, ensuring data completeness and efficiency in one go.

Since this part is executed every time a page is requested, I think it would be more efficient to fetch all the CVE ID list at once and then request them separately.
https://github.com/mghotz/go-cve-dictionary/blob/12be202f8d8b19c39c6913637e84237da43d42d5/db/rdb.go#L314-L318

@jbmaillet
Copy link

@MaineK00n , @mghotz , my 2 cents, here is one of my daily use cases, and my performance bottleneck.

I audit full embedded products, built from sources (e.g. Linux or Android devices). As of today for example, a Linux kernel 4.14.335 released this week yields 824 CVE from go-cve-dictionary (of course, in the end 95% are false positives for various reasons, but that's another story). To get these 824 CVE, my go-cve-dictionary query takes more than 10 minutes to complete. Note that on a full product, I of course not only have a kernel to query for, but also dozens of OSS libraries.

I don't think my performance problem here would be solved by pagination, nor by getting the CVE Ids and then retrieve them if I wish: the bottleneck is that the CPE to CVE matching takes a very long time.

For example, try querying o:linux:linux_kernel < 6.5.3. A real world CVE example that would match could be https://nvd.nist.gov/vuln/detail/CVE-2023-45871, but you will get hundred more.

Even worst, try with no version boundary at all, with cpe:2.3:o:linux:linux_kernel:-:*:*:*:*:*:*:*, such as in https://nvd.nist.gov/vuln/detail/CVE-2023-2156.

@MaineK00n
Copy link
Collaborator

MaineK00n commented Jan 10, 2024

@jbmaillet
Thanks for pointing out the performance issue with the use case.
The reason why it takes time to query a CPE is because we are looking for matches to the queried CPE, rather than a simple string comparison.

There are two things I'm thinking about with this PR.
One is whether there is a need for an API that fetches the entire data using pagination.
The second is that the pagination method that is implemented by fetching the entire data is likely to have worse performance than using the CVE ID list, even if it uses two types of API.

@MaineK00n MaineK00n closed this Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants