NVD API 1.0 is being deprecated in January 2023 #2542

Closed
anthonyharrison opened this issue Jan 13, 2023 · 8 comments

@anthonyharrison
Contributor

NVD API 1.0 is being deprecated from January 2023.

https://nvd.nist.gov/general/news/change-timeline

Since we introduced support for the NVD 2.0 API, a number of changes have been made to it since the initial release. From a quick review, most won't affect cve-bin-tool; I think only the /cves/ changes need some review.

General

· Improved CORS header support.

· Many clerical and clarifying changes to the 2.0 API documentation.

· Improved handling of certain scenarios requiring encoded characters.

CVE (/cves/)

· Clarified that the “lastModified” date for a CVE record is not changed when a CVE record changes to “Undergoing Analysis” status in the NVD data set.

· Added a new parameter that filters responses to exclude rejected CVE records. See https://nvd.nist.gov/developers/vulnerabilities#cves-noRejected

· Added a series of parameters that allows users to search for a range of versions for a given virtualMatchString value. (Note that search results are limited to searching the CPE Match Criteria of a CVE based on how the virtualMatchString parameter operates.) See https://nvd.nist.gov/developers/vulnerabilities#cves-versionStart See https://nvd.nist.gov/developers/vulnerabilities#cves-versionEnd

· Added a new property, cisaVulnerabilityName, in responses regarding CISA KEV data.

Additionally, we relabeled other related properties (cisaExploitAdd, cisaActionDue, cisaRequiredAction) to align and identify they are CISA populated items.

· Moved “baseSeverity” property to its proper location in the cvssMetricV2 object.

· Removed the “negate” property from appearing in the configurations object in responses.

· Amended schema to include “id”, “published” and “lastModified” as required.
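As a rough illustration of the new /cves/ query parameters listed above, here is a minimal sketch that builds a request URL. The helper name `build_cves_url` is hypothetical (it is not cve-bin-tool code), and the `versionStartType`/`versionEndType` values are my reading of the NVD documentation, so check them against the links above before relying on them.

```python
from urllib.parse import quote

# Public NVD 2.0 CVE endpoint.
NVD_CVES_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_cves_url(virtual_match_string, version_start=None, version_end=None,
                   no_rejected=True):
    """Build a /cves/ query URL (hypothetical helper for illustration only)."""
    # Keep ':' and '*' readable in the CPE match string; encode everything else.
    parts = ["virtualMatchString=" + quote(virtual_match_string, safe=":*")]
    if version_start is not None:
        parts.append("versionStart=" + quote(version_start))
        parts.append("versionStartType=including")  # assumed: inclusive lower bound
    if version_end is not None:
        parts.append("versionEnd=" + quote(version_end))
        parts.append("versionEndType=excluding")  # assumed: exclusive upper bound
    if no_rejected:
        # noRejected is a flag parameter and is sent without a value.
        parts.append("noRejected")
    return NVD_CVES_URL + "?" + "&".join(parts)


if __name__ == "__main__":
    print(build_cves_url("cpe:2.3:a:openssl:openssl", "1.1.1", "3.0.0"))
```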

CVE Change History (/cvehistory/)

· Released this API for public use in October.

CPE (/cpes/)

· Added a “deprecates” array for relevant CPE records. Previously we only included a “deprecatedBy” array when a CPE had been deprecated by another. This change allows for awareness in either direction of the deprecation chain. (Example)

Match Criteria (/cpematch/)

· Amended data regarding “cpeLastModified” to be populated as expected or to align with the lastModified date.

· Resolved inconsistent encodings in the responses for CPE Names and CPE Match Criteria. This involved changing the schema and aligns with the approach used in other API responses.

@terriko
Contributor

terriko commented Jan 18, 2023

I was originally planning to switch over to NVD 2.0 as our default circa 3.3 or having it as the driving factor for 4.0.

That said, I'm currently thinking that 4.0 should switch us over to using our own github mirror of the data as the default, assuming we can keep it sufficiently up to date. This would give first-time users a much better experience as they wouldn't need to set up an NVD_API_KEY right away, and it would potentially make life a lot easier/faster for folk running in Github Actions specifically.

Doing this is a lot more work than just flipping the default to API2.0, though. And as well as the technical work we'll need to do some work to make sure our data store is actually sufficiently trustworthy and verifiable.

Thoughts?

@anthonyharrison
Contributor Author

@terriko I need to start looking at this now, as I just tried to use API 2.0 (using the --nvd api2 option) and I get lots of 403/404 errors. The 403 error is '403 Forbidden - Request forbidden by administrative rules.' so it looks like something may have changed (maybe the API key is now required?).

@terriko
Contributor

terriko commented Jan 18, 2023

If we need the API key in 2.0 right now, we can make that switch and do a 3.2.1 release as soon as I get CI working again.

@anthonyharrison
Contributor Author

It still seems to work (without the API key) but not if I try to populate an empty database! I am wondering if I am being rate limited without the API key, so I have sent an email to NIST asking whether I need a separate API key in order to use the 2.0 APIs.

@anthonyharrison
Contributor Author

@terriko Been playing with the 2.0 API. It isn't much fun :-( but I think I have a much better understanding of what is going on :-)

(1) The API KEY doesn't work. I tried sending an API 2.0 request to NVD with a valid key (I used curl) and I got a 404 error with the helpful message 'Invalid parameter: apiKey'. I have sent an email to NVD to ask what is going on.

(2) I think not having the API KEY is resulting in rate limiting. If we are resetting the database, 100 requests are sent to NVD to get all of the CVE data. NVD recommends a 6 second interval between requests, which results in a maximum of 5 requests every 30 seconds. However, as https://nvd.nist.gov/developers/terms-of-use says -

NIST firewall rules put in place to prevent denial of service attacks can thwart your application if it exceeds a predetermined rate limit. The public rate limit (without an API key) is 5 requests in a rolling 30 second window; the rate limit with an API key is 50 requests in a rolling 30 second window. Requesting an API key significantly raises the number of requests that can be made in a given time frame. However, it is still recommended that your application sleeps for several seconds between requests so that legitimate requests are not denied, and all requests are responded to in sequence.

THIS MEANS WE CANNOT DOWNLOAD THE WHOLE DATABASE IN A REASONABLE TIME WITHOUT AN API KEY as there will be more than 50 requests unless we divide out the requests across a 30 minute boundary (i.e. one request every 40 seconds) to avoid the 50 requests in 30 minute rolling window.

(3) Incremental data update does seem to work - it is normally only a single request.

(4) The standard connection has a 5 MINUTE timeout. This is being invoked. We need to consider whether this should be increased, particularly for multiple requests. When a timeout occurs, the tool never recovers.

(5) Using Curl to send a single request to NVD does work.

(6) I am seeing various exceptions being raised with the aiohttp client interface, including TimeoutError() and Disconnected from Server.
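For reference, the arithmetic behind point (2) can be sketched under the limits quoted from the terms of use (5 requests per rolling 30 second window without a key, 50 with one). This is only a back-of-the-envelope estimate that assumes evenly spaced requests; `estimated_download_seconds` is a hypothetical helper, and the figure of 100 requests is the one mentioned above.

```python
def estimated_download_seconds(total_requests, requests_per_window, window_seconds=30):
    """Estimate wall-clock time when requests are spaced evenly to respect the limit."""
    # Spacing that keeps at most `requests_per_window` requests in any window.
    interval = window_seconds / requests_per_window
    # The first request goes out immediately; each later one waits one interval.
    return (total_requests - 1) * interval


if __name__ == "__main__":
    # Without an API key: one request every 6 seconds.
    print(estimated_download_seconds(100, 5))  # 594.0 seconds, just under 10 minutes
    # With an API key: roughly one request every 0.6 seconds, about a minute in total.
    print(round(estimated_download_seconds(100, 50)))
```

This ignores response time, retries and timeouts, so real downloads will take longer.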

I have tried reducing the rate limiter to only have a single connection (it is currently set to 19). This improves things but it still resulted in timeouts. Turning timeouts off resulted in the following pattern: 5 sets of results received, followed by five 403 errors, followed by a 30 second back off. This demonstrates the rate limiting effect.
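One way to avoid triggering the 403s in the first place is to track the rolling window on the client side. The following is a minimal sketch of that idea, not the rate limiter cve-bin-tool actually uses; the class name and the injectable `clock` parameter are assumptions made to keep the example self-contained.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Allow at most `max_requests` requests in any rolling `window_seconds` window."""

    def __init__(self, max_requests, window_seconds=30.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be exercised without waiting
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return the seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that have dropped out of the rolling window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # Wait until the oldest request in the window expires.
        return self.window_seconds - (now - self.sent[0])

    def record(self):
        """Note that a request has just been sent."""
        self.sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. With `max_requests=5` this reproduces the pattern seen above: five requests go through, then the sixth has to wait for the window to roll over.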

I have a number of modifications to the nvd_api.py code which have been useful in helping track down some of the issues but they probably don't need to be merged just yet.

In summary, until the API KEY interface is working with the 2.0 API, the 2.0 API isn't fit for purpose. Without an API KEY, the 1.0 interface probably suffers in a similar way, which probably explains some of the issues with the CI when accessing the NVD.

@anthonyharrison
Contributor Author

anthonyharrison commented Jan 19, 2023

UPDATE - According to the NVD team, the way the apiKey is passed in the request has changed from 1.0 as it now needs to be passed in the header and not as part of the URL. Think it just requires a simple modification to the request code to get it working.
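To illustrate the change, here is a minimal sketch of building a 2.0 request with the key in an `apiKey` header instead of the query string. `build_nvd_request` is a hypothetical helper and not the actual change made to nvd_api.py; `startIndex` and `resultsPerPage` are the paging parameters from the NVD documentation.

```python
# Public NVD 2.0 CVE endpoint.
NVD_CVES_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(api_key=None, start_index=0, results_per_page=2000):
    """Return (url, headers) for one page of CVE data (hypothetical helper)."""
    url = f"{NVD_CVES_URL}?startIndex={start_index}&resultsPerPage={results_per_page}"
    headers = {}
    if api_key:
        # API 2.0 expects the key as a request header, not as a URL parameter.
        headers["apiKey"] = api_key
    return url, headers
```

The pair can then be passed to an HTTP client, for example `session.get(url, headers=headers)` with an aiohttp `ClientSession`.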

UPDATE2: - @terriko It now works!!!! The API key is now being accepted, but I am still running into rate limits which result in 403 errors (a maximum of 50 requests in a 30 second window is allowed with an API key). Slowing everything down to a single connection with a small sleep between each request seems to be sufficient to reliably download all of the data (it seems to take around 7 or 8 minutes to download the whole dataset). I ALWAYS get a 'Server Disconnected' error around a minute into the download (progress bar is around 2%) but the processing seems to carry on (and the progress bar starts moving much quicker). It sounds like a common problem with the aiohttp library, and although I tried a recommended hack of adding await asyncio.sleep(0.001) into the processing, it didn't make any difference.
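The single-connection approach described above can be sketched roughly as follows. This is an illustration only, not the code from the referenced commit: `fetch_page` is a hypothetical callable that returns one decoded JSON page, the one second delay is an arbitrary placeholder, and the `totalResults` and `vulnerabilities` fields are taken from the NVD 2.0 response format.

```python
import time


def download_all(fetch_page, page_size=2000, delay_seconds=1.0, sleep=time.sleep):
    """Fetch every page one at a time, pausing between requests."""
    records = []
    start_index = 0
    while True:
        # fetch_page(start_index, page_size) is assumed to return the decoded JSON body.
        page = fetch_page(start_index, page_size)
        records.extend(page["vulnerabilities"])
        start_index += page_size
        if start_index >= page["totalResults"]:
            break
        # A short pause between requests keeps us under the rate limit.
        sleep(delay_seconds)
    return records
```

Because only one request is ever in flight, the request rate is bounded by the delay plus the response time, which is simpler to reason about than a pool of concurrent connections.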

anthonyharrison added a commit to anthonyharrison/cve-bin-tool that referenced this issue Jan 20, 2023
@terriko
Contributor

terriko commented Jan 23, 2023

Ugh. Thank you so much for persevering on this one!

I think we do need some work on improved timeout recovery not just for NVD -- the cache sometimes times out in the middle of OSV too and I suspect we just need to fail if we get a couple of timeouts.
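A minimal sketch of the "fail after a couple of timeouts" idea, assuming an async `fetch` callable supplied by the caller. The function name, the limit of two timeouts and the 30 second backoff are all placeholders, not existing cve-bin-tool behaviour.

```python
import asyncio


async def fetch_with_timeout_budget(fetch, max_timeouts=2, backoff_seconds=30.0,
                                    sleep=asyncio.sleep):
    """Retry `fetch` after a timeout, but give up once `max_timeouts` is reached."""
    timeouts = 0
    while True:
        try:
            return await fetch()
        except asyncio.TimeoutError:
            timeouts += 1
            if timeouts >= max_timeouts:
                # Fail loudly instead of hanging or silently continuing.
                raise RuntimeError(f"giving up after {timeouts} timeouts")
            # Back off before trying again.
            await sleep(backoff_seconds)
```

The same wrapper could sit around any data source fetch (NVD, OSV, and so on), since it only depends on the callable raising `asyncio.TimeoutError`.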

@terriko
Contributor

terriko commented Apr 17, 2024

I think we've done all the things that needed doing here, so closing this now.

@terriko terriko closed this as completed Apr 17, 2024