
fix: discrepancy between results on windows vs linux for test_SBOM #2793

Closed
terriko opened this issue Mar 6, 2023 · 8 comments
Labels
bug (Something isn't working) · CI (Related to our continuous integration service, GitHub Actions) · higher priority (Issues we'd like fixed sooner rather than later, often ones that come directly from users)
Milestone

Comments

@terriko
Contributor

terriko commented Mar 6, 2023

In the course of creating #2747 @metabiswadeep figured out that our SBOM tests were failing because Windows and Linux were reporting different numbers of CVEs:

91 CVEs are detected for glibc in windows, whereas 90 CVEs are detected in Linux. So I suppose some CVEs are not detected by Linux?

There should not be any cases where the number of CVEs changes depending on which platform is used for scanning at this time, so this is a bug.

(It is possible to have a CVE that impacts only one platform, but we explicitly avoid assuming that the platform used for scanning is the same platform that will run the code, so we should not be taking such data into account in the current version.)

To fix this bug:

  • figure out why we're getting different results on different platforms. It may be related to the different line feeds in windows vs linux (\r\n vs \n) as that can change detection, but it could also be something completely different.
  • if you don't have access to windows and linux on your own systems, you can use github actions on your own fork to experiment. Although you're welcome to open a draft PR while you experiment, you don't need to open a PR to run the tests -- you should be able to run them directly on your branch.
  • fix it so they both give the correct result, whatever that may be.
  • fix the test updated in #2747 (test: windows longtests in test/test_cli.py::TestCLI::test_SBOM) so that it behaves correctly. Right now its logic makes that portion of the test essentially non-functional: as currently written, it will pass no matter how many CVEs the code finds. It will either need to be reverted to something closer to its previous state or replaced with something better that actually works correctly.
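On the line-ending theory in the first bullet: a minimal sketch (purely illustrative; the function names and the signature string here are invented, not cve-bin-tool code) of how a byte-level match can diverge between \r\n and \n inputs, and how normalizing first makes both platforms agree:

```python
# Hypothetical illustration of the CRLF-vs-LF detection theory.
# Nothing here is real cve-bin-tool code.

def contains_signature(data: bytes, signature: bytes = b"GLIBC_2.34\n") -> bool:
    """Naive check: misses the signature when the file uses \r\n endings."""
    return signature in data

def contains_signature_normalized(data: bytes, signature: bytes = b"GLIBC_2.34\n") -> bool:
    """Normalize CRLF to LF before matching, so both platforms agree."""
    return signature in data.replace(b"\r\n", b"\n")

windows_blob = b"...GLIBC_2.34\r\n..."
print(contains_signature(windows_blob))             # False: \r breaks the match
print(contains_signature_normalized(windows_blob))  # True after normalization
```

If the real detectors do anything resembling the naive version, a file checked out with Windows line endings could plausibly gain or lose a match relative to Linux.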

A note for new contributors: I don't expect this to be an easy bug. I don't have any answers about how to fix it or any insights that I haven't included here. If you can't figure out how to get started in investigating on your own, please move on to another bug. This isn't to say that you can't work on it or that I think a new contributor wouldn't be able to find a solution, just that this is definitely an independent study project and you should be prepared to work in a space where no one has the answer for you.

@terriko terriko added bug Something isn't working CI Related to our continuous integration service (GitHub Actions) higher priority Issues we'd like fixed sooner rather than later, often ones that come directly from users. labels Mar 6, 2023
@b31ngd3v
Contributor

b31ngd3v commented Mar 9, 2023

i'm working on this issue!

@anthonyharrison
Contributor

@terriko @b31ngd3v I have just set up a Windows system (I have to download all the data so it will take some time), but I have just looked at the database and noted that the version strings for the two CVEs associated with the jena product contain extra characters.

CVE-2022-28890 has a version of 4.2.0)
CVE-2021-39239 has a version of 4.1.0]

Both of these CVEs originate from the GAD data source so it looks as if the parsing of the version string is not correct.

Looking at the raw data, the version string for CVE-2021-39239 is "(,4.1.0]" and the version string for CVE-2022-28890 is "[4.4.0],(,4.2.0)". The correct parsing of these strings is not handled by parse_range_string within GAD_Source, so there is clearly an issue here.

Whether this bug fully accounts for the discrepancy between Linux and Windows systems I don't know, but there is clearly a bug that needs to be fixed.
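For illustration, here is a sketch of how such Maven-style interval strings could be parsed so the brackets don't end up attached to the version (the bug above produced versions like "4.2.0)" and "4.1.0]"). This is not the actual parse_range_string in GAD_Source; the return shape, (lower, lower_inclusive, upper, upper_inclusive) tuples, is an assumption for the example:

```python
import re

def parse_range_string(range_str: str):
    """Parse a Maven/GAD-style range like "(,4.1.0]" or "[4.4.0],(,4.2.0)"
    into (lower, lower_inclusive, upper, upper_inclusive) tuples,
    stripping the bracket characters instead of leaving them on the version."""
    intervals = []
    # One bracketed interval at a time: [a,b], (a,b), (,b], [a,) ...
    for m in re.finditer(r"([\[\(])\s*([^,\[\]\(\)]*)\s*,\s*([^,\[\]\(\)]*)\s*([\]\)])", range_str):
        open_b, lo, hi, close_b = m.groups()
        intervals.append((lo or None, open_b == "[", hi or None, close_b == "]"))
    # A bare "[4.4.0]" (no comma) pins an exact version.
    for m in re.finditer(r"\[([^,\[\]\(\)]+)\]", range_str):
        v = m.group(1)
        intervals.append((v, True, v, True))
    return intervals
```

With this, "(,4.1.0]" parses to an unbounded lower end with an inclusive 4.1.0 upper bound, and no stray "]" survives into the version field.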

@b31ngd3v
Contributor

b31ngd3v commented Mar 9, 2023

@terriko @anthonyharrison this is not a problem of Windows or Linux from my research; you can reproduce this on any distro. Run cve-bin-tool -u latest -n json and then cve-bin-tool --sbom spdx --sbom-file test/sbom/spdx_test.spdx and it will return two products with CVEs [jena and glibc]. Normally it returns only one [glibc].

[screenshot]

@anthonyharrison
Contributor

@b31ngd3v What do you mean 'normally' it only returns glibc? There are CVEs in the database for Jena, so I would expect these to be reported, although as these CVEs are relatively new and come from GAD they will only have been reported in the past 6-9 months.

@b31ngd3v
Contributor

b31ngd3v commented Mar 9, 2023

@anthonyharrison the issue we're having is that sometimes Jena shows up on the list of vulnerable products and sometimes not. By "normally" I meant: run cve-bin-tool --sbom spdx --sbom-file test/sbom/spdx_test.spdx on any OS and it will not show jena on the list of vulnerable products. But if you run cve-bin-tool -u latest -n json and then scan the SBOM file, it will show up.

[screenshot]
And are you sure it's coming from GAD? Because here on the list of NewFound CVEs it says the source is NVD.

@anthonyharrison
Contributor

@terriko @b31ngd3v This is interesting. I have a different set of data for the two CVEs related to Jena

[screenshot]

This might explain why we see differences occurring.

I have just deleted the database and reloaded all of the data to see if that makes a difference - it didn't! I have even reloaded the GAD data as well, but that makes no difference either.

I have noted that even if we delete the GAD cache, the data remains in the database, so old data will persist.

However, what I have discovered is that there are two entries for each of the CVEs in the cve_severity table (one for GAD and one for NVD) but only one entry for each CVE in the cve_range table (the GAD one). What appears to have happened is that the NVD data has been overwritten by the GAD data. So there must be a sort of 'race' condition where the latest data gets used: whichever of the NVD processing or GAD processing finishes last will be the one which is used. Given the earlier issue with version formatting of GAD data, this may explain why Jena data isn't reported if GAD is the latest data but is reported if NVD is the latest data (because the version data is correct).

So there are at least 2 BUGS to fix

  1. GAD version formatting
  2. Investigate why data is being overwritten in the cve_range table.
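A sketch of how bug 2 could be confirmed directly against the database: find CVEs with severity rows from more sources than they have range rows. The schema here (table and column names cve_severity, cve_range, cve_number, data_source) is assumed from the discussion above, with a tiny in-memory SQLite database standing in for the real one:

```python
import sqlite3

# Assumed schema, modelled on the tables named in this thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cve_severity (cve_number TEXT, data_source TEXT);
    CREATE TABLE cve_range (cve_number TEXT, data_source TEXT, version TEXT);
    -- Both sources reported severity for this CVE...
    INSERT INTO cve_severity VALUES ('CVE-2021-39239', 'NVD'), ('CVE-2021-39239', 'GAD');
    -- ...but only the last writer's range row survived.
    INSERT INTO cve_range VALUES ('CVE-2021-39239', 'GAD', '4.1.0]');
""")

# CVEs where some source's range data appears to have been overwritten.
rows = conn.execute("""
    SELECT s.cve_number,
           COUNT(DISTINCT s.data_source) AS severity_sources,
           COUNT(DISTINCT r.data_source) AS range_sources
    FROM cve_severity s
    LEFT JOIN cve_range r ON r.cve_number = s.cve_number
    GROUP BY s.cve_number
    HAVING severity_sources > range_sources
""").fetchall()
print(rows)
```

Running the same query against a real cve-bin-tool database (adjusting names as needed) would show whether the overwrite is systematic or limited to the Jena CVEs.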

terriko pushed a commit that referenced this issue Mar 29, 2023
* Part of what's needed for #2793
terriko pushed a commit to anthonyharrison/cve-bin-tool that referenced this issue Mar 30, 2023
@terriko terriko added this to the 3.3 milestone Jun 28, 2023
@anthonyharrison
Contributor

@terriko I have enabled test_SBOM in the test_cli.py file. The assertion just needs to change from 3 to 1 for the test to work.

@terriko
Contributor Author

terriko commented Oct 26, 2023

Sounds like re-enabling should be pretty easy. I'm going to close this issue and immediately open a new one for just re-enabling that test (so anyone picking this up for hacktoberfest won't get stuck reading all the earlier debugging)
