Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: CI pre-checker for VENDOR_PRODUCT #3628

Open
terriko opened this issue Dec 18, 2023 · 14 comments · May be fixed by #3840
Open

feat: CI pre-checker for VENDOR_PRODUCT #3628

terriko opened this issue Dec 18, 2023 · 14 comments · May be fixed by #3840
Labels
enhancement New feature or request

Comments

@terriko
Copy link
Contributor

terriko commented Dec 18, 2023

Description

The VENDOR_PRODUCT variable in a binary checker provides {vendor, product} pairs that correspond to CPE entries and can be used to look up vulnerabilities. Currently, I check these manually while reviewing pull requests, but there's no reason we couldn't have a script in github actions or run from pre-commit that could check that they're valid and fail if they are not.

Design thoughts:

  • this should run if and only if files in the cve_bin_tool/checkers directory are changed/added and only on those files
  • laziest way to check: see if a database query for that {vendor, product} returns any CVEs. (In theory you can have a CPE that doesn't have any CVEs associated yet, but in practice there's no much point in making a checker until there's a least one CVE so it's unlikely to come up very often.)
  • if you don't want to do something using cve-bin-tool itself, remember that you can make direct sqlite queries in a script, though since you're expecting to run this code on arbitrary pull requests you should have some basic validation in place as well as safe sql usage (in this case, some simple limits on allowed characters + bound parameters)
  • you could run it as part of pre-commit so people could get a local check before they even make a pull request.
  • you definitely want to run it in Github actions somehow because not everyone has pre-commit set up and running

Why?

Occasionally, people make a typo or new contributors try to make a checker without seeing if there's even a CPE

Anything else?

This is probably suitable for adventurous beginners with some SQL experience, but I don't think it's quite simple enough for me to tag it as a good first issue because it's going to be a bit finicky to figure out how to test and deploy it in github actions.

@terriko terriko added the enhancement New feature or request label Dec 18, 2023
@terriko
Copy link
Contributor Author

terriko commented Dec 19, 2023

Also noting from #3605 -- @ffontaine has beenfinding invalid ones in older code. Some of that may be due to CPEs changing over time, some may just be because I trusted submitters too much and didn't double-check every thing manually when I approved the original PR.

We may also want to run the validation script on all of the VENDOR_PRODUCT entries once a week or something and have it open up issues if any of them become invalid.

@terriko
Copy link
Contributor Author

terriko commented Dec 20, 2023

Quoting from #3363 on the topic of allowing "invalid" checkers: so it's here and easy to read:

If such kind of checkers is kept, it means that the solution designed for #3628 will have to explicitly handle such of kind of checkers.

Moreover, it also means that if for some reason in the future, a CVE is allocated for debianutils but with a different CPE ID that the one that was "randomly" chosen in the checker (e.g. debianutils_project:debianutils instead of debian:debianutils), cve-bin-tool will wrongly report them no CVEs affect debianutils.
So, I would agree to not run those "incomplete" checkers by default even for SBOM generation.

Originally posted by @ffontaine in #3633 (comment)

@joydeep049
Copy link
Contributor

Hello @terriko , I want to work on this . How should I proceed?

@terriko
Copy link
Contributor Author

terriko commented Jan 11, 2024

@crazytrain328 I'd probably start by writing a python script that takes a checker *.py file and checks to see if each thing in the VENDOR_PRODUCT list provided has any CVEs associated with it.

So basically, parse out VENDOR_PRODUCT and run something like
SELECT count(*) from cve_range where vendor = % and product = %

with % replaced by each {vendor, product} pair you need to look up.

We don't really have a function for this in the existing cvedb class, so you can pretty much just jam some sql into your new script and tell it to use the same database name.

(eventually you might want to use the cvedb class to make sure the database is there and is up to date, but for a first pass that doesn't need to be in the script)

@joydeep049
Copy link
Contributor

Hello @terriko ,
I have been reading more about the repo and also have been learning how to set-up github actions for this kind of pre-checker.
I have a doubt.
According to what you said above, I have to take a checker.py and parse out its VENDOR_PRODUCT list. How do I look up the database after that?

@terriko
Copy link
Contributor Author

terriko commented Jan 29, 2024

@crazytrain328 you might find my experiment script from the version compare investigations useful:

https://github.com/intel/cve-bin-tool/blob/main/experiments/sqlite-experiments.py

I won't say that that particular file is great coding, since it was never intended to be used "in production" as it were, but it should be enough to get you started with a minimal connection to the database and hwo to run a sql query manually

It's got a pretty minimum connection setup:

dbcon = sqlite3.connect("/home/terri/.cache/cve-bin-tool/cve.db")
dbcon.create_function("regexp", 2, lambda x, y: 1 if re.search(x, y) else 0)
cursor = dbcon.cursor()

(You don't need the regexp function if you're not doing regex searches.)

And then the rest of the file is a bunch of select statements that work against the db and output some data. You could copy that file, change out the home directory to match your name instead of mine, and edit one of the cursor.execute lines to contain whatever you actually want to look up.

@joydeep049
Copy link
Contributor

Hello @terriko
I am having problems regarding setting up the database for cve-bin-tool. I am installing it , and it is always saying server disconnected, I have perfect internet connection but it always same.

@terriko
Copy link
Contributor Author

terriko commented Feb 15, 2024

@crazytrain328 Are you using an NVD api key or the mirror? I ask because NVD does sometimes just won't connect, especially for folk on windows -- we actually download our cache on linux and copy it to windows for that reason. But the mirror should be more generally available even if NVD is misbehaving.

@joydeep049
Copy link
Contributor

joydeep049 commented Feb 15, 2024

I got it! I ran the initial script and am able to parse out whether all the vendor_product pairs have associated cves or not .
But it is on my local . How do i get the script to run in the environment of the tool?

@joydeep049 joydeep049 linked a pull request Feb 19, 2024 that will close this issue
@terriko
Copy link
Contributor Author

terriko commented Feb 20, 2024

Are you asking how to run it in github actions? You can look at our other scripts under the .github/ directory to see how we're doing this for various tools, and github itself has pretty decent documentation of actions.

Parts you'll need off the top of my head:

  • grabbing the database from the cache
  • something that runs on only new files in checkers/* when they're in PRs
  • possibly something that runs weekly or only on main to see if any checkers need updates? likely a separate github actions job.

@joydeep049
Copy link
Contributor

Are you asking how to run it in github actions? You can look at our other scripts under the .github/ directory to see how we're doing this for various tools, and github itself has pretty decent documentation of actions.

Parts you'll need off the top of my head:

  • grabbing the database from the cache
  • something that runs on only new files in checkers/* when they're in PRs
  • possibly something that runs weekly or only on main to see if any checkers need updates? likely a separate github actions job.

I've already made a PR (#3840). Can you check it once?

@joydeep049
Copy link
Contributor

joydeep049 commented Feb 20, 2024

@terriko Do we really need to run it weekly? I have added the github actions so that it checks any newly added/modified checker for related CVEs.

There is one thing we cud do to assure completeness. I'll file an issue if someone wants to volunteer to check whether the existing checkers have all related CVEs or not? How does it sound?

@terriko
Copy link
Contributor Author

terriko commented Feb 21, 2024

Ideally: the version that runs on a PR checks only the files changed in that PR.

The idea behind running it weekly (or monthly or whatever) would be to check existing checkers not in a PR so we'd get some warning if the CPE for a checker was removed from NVD. (which does happen periodically).

the frequency there doesn't matter so much, it's just the idea that "check your PR before merging" and "address technical debt" should probably be different but related github actions.

@terriko
Copy link
Contributor Author

terriko commented Feb 21, 2024

And yes, I'll get to the PR when I can!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants