Initial commit of OSV record linter #243

Merged
andrewpollock merged 32 commits into ossf:main on Aug 29, 2024

Conversation

andrewpollock
Collaborator

@andrewpollock andrewpollock commented May 21, 2024

This is reasonably functional at this point, with multiple checks of two different aspects:

Ranges:

  • introduced exists
  • don't overlap

Packages:

  • plumbing for ecosystem-specific behaviour
  • package existence
    • PyPI
    • Go
  • package version existence
    • PyPI
    • Go (with some caveats around pseudoversions)
  • Basic Purl validity

```
$ go run ./cmd/osv record lint test_data/
Running "osv.dev" check collection on &["test_data/"]
2024/08/07 23:26:14 Found 9 files in "test_data/"
Running "introduced-event-exists" check on "test_data/CVE-2018-5407.json"
Running "range-is-distinct" check on "test_data/CVE-2018-5407.json"
Running "package-exists" check on "test_data/CVE-2018-5407.json"
2024/08/07 23:26:14 "test_data/CVE-2018-5407.json": "package-exists": []checks.CheckError{checks.CheckError{Code:"P0001", Message:": package \"openssl\" not found"}}
Running "package-versions-exist" check on "test_data/CVE-2018-5407.json"
2024/08/07 23:26:14 "test_data/CVE-2018-5407.json": "package-versions-exist": []checks.CheckError{checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}}
Running "package-purl-valid" check on "test_data/CVE-2018-5407.json"
Running "introduced-event-exists" check on "test_data/CVE-2023-41045.json"
Running "range-is-distinct" check on "test_data/CVE-2023-41045.json"
Running "package-exists" check on "test_data/CVE-2023-41045.json"
Running "package-versions-exist" check on "test_data/CVE-2023-41045.json"
Running "package-purl-valid" check on "test_data/CVE-2023-41045.json"
Running "introduced-event-exists" check on "test_data/GHSA-9v2f-6vcg-3hgv.json"
Running "range-is-distinct" check on "test_data/GHSA-9v2f-6vcg-3hgv.json"
Running "package-exists" check on "test_data/GHSA-9v2f-6vcg-3hgv.json"
Running "package-versions-exist" check on "test_data/GHSA-9v2f-6vcg-3hgv.json"
Running "package-purl-valid" check on "test_data/GHSA-9v2f-6vcg-3hgv.json"
Running "introduced-event-exists" check on "test_data/GO-2020-0001.json"
Running "range-is-distinct" check on "test_data/GO-2020-0001.json"
Running "package-exists" check on "test_data/GO-2020-0001.json"
Running "package-versions-exist" check on "test_data/GO-2020-0001.json"
2024/08/07 23:26:16 "test_data/GO-2020-0001.json": "package-versions-exist": []checks.CheckError{checks.CheckError{Code:"P0002", Message:": Failed to find some versions of github.com/gin-gonic/gin: &errors.errorString{s:\"failed to find [1.6] for \\\"github.com/gin-gonic/gin\\\" in [v1.9.0 v1.3.0 v1.7.0 v1.8.0 v1.6.0 v1.8.2 v1.1.1 v1.5.0 v1.7.2 v1.7.1 v1.1.3 v1.1.2 v1.9.1 v1.6.3 v1.10.0 v1.7.3 v1.7.5 v1.4.0 v1.1.4 v1.6.1 v1.7.7 v1.8.1 v1.6.2 v1.7.4 v1.7.6 ]\"}"}}
Running "package-purl-valid" check on "test_data/GO-2020-0001.json"
Running "introduced-event-exists" check on "test_data/GO-2024-2963.json"
Running "range-is-distinct" check on "test_data/GO-2024-2963.json"
Running "package-exists" check on "test_data/GO-2024-2963.json"
Running "package-versions-exist" check on "test_data/GO-2024-2963.json"
Running "package-purl-valid" check on "test_data/GO-2024-2963.json"
Running "introduced-event-exists" check on "test_data/PYSEC-2023-74.json"
Running "range-is-distinct" check on "test_data/PYSEC-2023-74.json"
Running "package-exists" check on "test_data/PYSEC-2023-74.json"
Running "package-versions-exist" check on "test_data/PYSEC-2023-74.json"
Running "package-purl-valid" check on "test_data/PYSEC-2023-74.json"
Running "introduced-event-exists" check on "test_data/nointroduced-CVE-2023-41045.json"
2024/08/07 23:26:18 "test_data/nointroduced-CVE-2023-41045.json": "introduced-event-exists": []checks.CheckError{checks.CheckError{Code:"R0001", Message:": missing 'introduced' object in event"}}
Running "range-is-distinct" check on "test_data/nointroduced-CVE-2023-41045.json"
Running "package-exists" check on "test_data/nointroduced-CVE-2023-41045.json"
Running "package-versions-exist" check on "test_data/nointroduced-CVE-2023-41045.json"
Running "package-purl-valid" check on "test_data/nointroduced-CVE-2023-41045.json"
Running "introduced-event-exists" check on "test_data/nondistinct-CVE-2018-5407.json"
Running "range-is-distinct" check on "test_data/nondistinct-CVE-2018-5407.json"
2024/08/07 23:26:18 "test_data/nondistinct-CVE-2018-5407.json": "range-is-distinct": []checks.CheckError{checks.CheckError{Code:"R0002", Message:": overlapping event: \"e818b74be2170fbe957a07b0da4401c2b694b3b8\""}}
Running "package-exists" check on "test_data/nondistinct-CVE-2018-5407.json"
2024/08/07 23:26:18 "test_data/nondistinct-CVE-2018-5407.json": "package-exists": []checks.CheckError{checks.CheckError{Code:"P0001", Message:": package \"openssl\" not found"}}
Running "package-versions-exist" check on "test_data/nondistinct-CVE-2018-5407.json"
2024/08/07 23:26:18 "test_data/nondistinct-CVE-2018-5407.json": "package-versions-exist": []checks.CheckError{checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}, checks.CheckError{Code:"P0002", Message:": Failed to find some versions of openssl: &errors.errorString{s:\"unsupported ecosystem: Alpine\"}"}}
Running "package-purl-valid" check on "test_data/nondistinct-CVE-2018-5407.json"
Running "introduced-event-exists" check on "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json"
Running "range-is-distinct" check on "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json"
Running "package-exists" check on "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json"
2024/08/07 23:26:19 "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json": "package-exists": []checks.CheckError{checks.CheckError{Code:"P0001", Message:": package \"Gradi0\" not found"}}
Running "package-versions-exist" check on "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json"
2024/08/07 23:26:19 "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json": "package-versions-exist": []checks.CheckError{checks.CheckError{Code:"P0002", Message:": Failed to find some versions of Gradi0: &errors.errorString{s:\"unable to validate package: fail: \\\"https://pypi.org/pypi/Gradi0/json\\\": bad response: 404\"}"}}
Running "package-purl-valid" check on "test_data/nopackage-GHSA-9v2f-6vcg-3hgv.json"
2024/08/07 23:26:19 found errors
exit status 1
```
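
For readers skimming the output above: the `Gradi0` failure comes from a 404 on `https://pypi.org/pypi/Gradi0/json`. A minimal sketch of that kind of existence check follows. It is not the code in this PR; the function names, the injectable base URL, and the local stand-in server (with `gradio` as an assumed valid name) are all illustrative so the example runs offline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// pypiPackageExists asks a PyPI-style JSON endpoint whether a package exists.
// baseURL would be "https://pypi.org" in real use (hypothetical helper).
func pypiPackageExists(client *http.Client, baseURL, name string) (bool, error) {
	resp, err := client.Get(baseURL + "/pypi/" + url.PathEscape(name) + "/json")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("bad response: %d", resp.StatusCode)
	}
}

// newFakePyPI starts a local stand-in server that knows only the given names,
// so the example does not need network access.
func newFakePyPI(known ...string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, name := range known {
			if r.URL.Path == "/pypi/"+name+"/json" {
				fmt.Fprint(w, "{}")
				return
			}
		}
		http.NotFound(w, r)
	}))
}

func main() {
	srv := newFakePyPI("gradio")
	defer srv.Close()

	for _, name := range []string{"gradio", "Gradi0"} {
		ok, err := pypiPackageExists(srv.Client(), srv.URL, name)
		fmt.Println(name, ok, err)
	}
}
```

Treating 404 as "does not exist" and any other non-200 status as an error keeps a registry outage from being reported as a missing package.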

Part of google/osv.dev#2187

```
go run cmd/osv/main.go record lint test_data/nointroduced-CVE-2023-41045.json
```

Signed-off-by: Andrew Pollock <apollock@google.com>
- Generate the maps of checks and collections dynamically, to reduce
  future maintenance burden
- Define an interface for Checks
- Refactor existing code accordingly

Signed-off-by: Andrew Pollock <apollock@google.com>
Add more docstrings

Signed-off-by: Andrew Pollock <apollock@google.com>
(actual directory walking support coming soon)

Still functional:

```
$ go run cmd/osv/main.go record lint test_data/nointroduced-CVE-2023-41045.json
Running "osv.dev" check collection on &["test_data/nointroduced-CVE-2023-41045.json"]
Running "introduced-event-exists" check on "test_data/nointroduced-CVE-2023-41045.json"
2024/05/29 07:07:55 "test_data/nointroduced-CVE-2023-41045.json": "introduced-event-exists": []checks.CheckError{checks.CheckError{Code:"R0001", Message:"missing 'introduced' object in event at index 0"}}
2024/05/29 07:07:55 found errors
exit status 1
```

Signed-off-by: Andrew Pollock <apollock@google.com>
I'm still getting the hang of GJSON's query syntax and how to operate on
results from it. I'm at least now more confident about the behaviour of
this check.

Signed-off-by: Andrew Pollock <apollock@google.com>
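
To make the intent of the `introduced-event-exists` check concrete, here is a rough sketch that uses only the standard library's `encoding/json` rather than GJSON. The struct, the function name, and the message format are assumptions for illustration and are not the implementation in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record models only the parts of an OSV record this sketch needs.
type record struct {
	Affected []struct {
		Ranges []struct {
			Type   string              `json:"type"`
			Events []map[string]string `json:"events"`
		} `json:"ranges"`
	} `json:"affected"`
}

// missingIntroduced returns one message per range that has no "introduced"
// event. A record with no ranges at all yields no messages.
func missingIntroduced(data []byte) ([]string, error) {
	var r record
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	var problems []string
	for ai, a := range r.Affected {
		for ri, rng := range a.Ranges {
			found := false
			for _, ev := range rng.Events {
				if _, ok := ev["introduced"]; ok {
					found = true
					break
				}
			}
			if !found {
				problems = append(problems,
					fmt.Sprintf("affected[%d].ranges[%d]: missing 'introduced' event", ai, ri))
			}
		}
	}
	return problems, nil
}

func main() {
	// Hypothetical record fragment: a range with a fix but no introduction.
	bad := []byte(`{"affected":[{"ranges":[{"type":"GIT","events":[{"fixed":"abc123"}]}]}]}`)
	problems, err := missingIntroduced(bad)
	if err != nil {
		panic(err)
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```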
Signed-off-by: Andrew Pollock <apollock@google.com>
Revert the interface, on the premise that there's only going to be one
known implementation at this time.

Rename the types, do away with the custom string and map. Move more of
the definitional variables to the same place as the code.

Simplify how checks are inventoried by adding another "ALL" collection.

Signed-off-by: Andrew Pollock <apollock@google.com>
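
As an illustration of the shape described in this commit message (plain structs instead of an interface, plus an "ALL" collection), a sketch might look like the following. The type names, the stub `Run` functions, and the membership of each collection are assumptions; only the check names, codes, and collection names appear in this PR's output.

```go
package main

import "fmt"

// checkError mirrors the Code/Message pairs seen in the linter output.
type checkError struct {
	Code    string
	Message string
}

// check is a hypothetical plain struct: metadata plus the function to run.
type check struct {
	Code string
	Name string
	Run  func(record []byte) []checkError
}

// Stub checks; real ones would inspect the record.
var (
	introducedEventExists = &check{Code: "R0001", Name: "introduced-event-exists",
		Run: func([]byte) []checkError { return nil }}
	rangeIsDistinct = &check{Code: "R0002", Name: "range-is-distinct",
		Run: func([]byte) []checkError { return nil }}
)

// collections inventories checks by name; "ALL" holds every known check.
var collections = map[string][]*check{
	"ALL":     {introducedEventExists, rangeIsDistinct},
	"osv.dev": {introducedEventExists, rangeIsDistinct},
}

// checksFor returns the named collection, falling back to "ALL" when no
// name is given.
func checksFor(name string) ([]*check, error) {
	if name == "" {
		name = "ALL"
	}
	cs, ok := collections[name]
	if !ok {
		return nil, fmt.Errorf("unknown collection %q", name)
	}
	return cs, nil
}

func main() {
	cs, err := checksFor("")
	if err != nil {
		panic(err)
	}
	for _, c := range cs {
		fmt.Printf("Running %q check (%s)\n", c.Name, c.Code)
	}
}
```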
Use a record that was historically generated incorrectly and was flagged in
google/osv.dev#1984

```
$ go run ./cmd/osv record lint test_data/CVE-2018-5407.json  test_data/nondistinct-CVE-2018-5407.json
Running "osv.dev" check collection on &["test_data/CVE-2018-5407.json" "test_data/nondistinct-CVE-2018-5407.json"]
Running "introduced-event-exists" check on "test_data/CVE-2018-5407.json"
Running "range-is-distinct" check on "test_data/CVE-2018-5407.json"
Running "introduced-event-exists" check on "test_data/nondistinct-CVE-2018-5407.json"
Running "range-is-distinct" check on "test_data/nondistinct-CVE-2018-5407.json"
2024/07/19 05:22:54 "test_data/nondistinct-CVE-2018-5407.json": "range-is-distinct": []checks.CheckError{checks.CheckError{Code:"R0002", Message:": overlapping event: \"e818b74be2170fbe957a07b0da4401c2b694b3b8\""}}
2024/07/19 05:22:54 found errors
exit status 1
```

Signed-off-by: Andrew Pollock <apollock@google.com>
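
One plausible reading of "overlapping" in the output above is that the same event value shows up in more than one range. The sketch below implements only that reading, under stated assumptions: ranges are compared within the same type, the open-ended value "0" is ignored, and every name other than the commit hash from the log is made up. The actual `range-is-distinct` check may well differ.

```go
package main

import "fmt"

// affectedRange is a pared-down stand-in for an OSV range.
type affectedRange struct {
	Type   string
	Events []map[string]string // e.g. {"introduced": "abc"} or {"fixed": "def"}
}

// overlappingEvents returns a message for each event value that appears in
// more than one range of the same type. The value "0" is skipped.
func overlappingEvents(ranges []affectedRange) []string {
	firstSeen := map[string]int{} // "type|value" -> index of first range using it
	reported := map[string]bool{}
	var problems []string
	for i, r := range ranges {
		for _, ev := range r.Events {
			for _, value := range ev {
				if value == "0" {
					continue
				}
				key := r.Type + "|" + value
				first, seen := firstSeen[key]
				if !seen {
					firstSeen[key] = i
					continue
				}
				if first != i && !reported[key] {
					reported[key] = true
					problems = append(problems, fmt.Sprintf("overlapping event: %q", value))
				}
			}
		}
	}
	return problems
}

func main() {
	shared := "e818b74be2170fbe957a07b0da4401c2b694b3b8"
	ranges := []affectedRange{
		{Type: "GIT", Events: []map[string]string{{"introduced": "0"}, {"fixed": shared}}},
		{Type: "GIT", Events: []map[string]string{{"introduced": "0"}, {"fixed": shared}}},
	}
	for _, p := range overlappingEvents(ranges) {
		fmt.Println(p)
	}
}
```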
For more conciseness, and because the linter is expected to be run on
records before they're processed by OSV.dev, not after.

Signed-off-by: Andrew Pollock <apollock@google.com>
Add checks for package and package version existence.

Add end-to-end support for two ecosystems: PyPI and Go
- Normalize ecosystems with colons in them down to their base
- Cache the existence/nonexistence of a package (in a normalized
  ecosystem) to reduce duplicate network checks
- Correct the test data for CVE-2018-5407 to be the current live record
  without overlapping ranges present (this shouldn't fail range validation)
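
The two ideas in this list (collapsing ecosystems such as `Alpine:v3.16` to their base, and caching existence results per normalized ecosystem) can be sketched as follows. The type and function names are hypothetical, and the lookup function is injected so the example needs no network access.

```go
package main

import (
	"fmt"
	"strings"
)

// baseEcosystem strips any ":suffix" qualifier, e.g. "Alpine:v3.16" -> "Alpine".
func baseEcosystem(ecosystem string) string {
	base, _, _ := strings.Cut(ecosystem, ":")
	return base
}

// existenceCache remembers both positive and negative lookup results.
type existenceCache struct {
	lookup func(ecosystem, pkg string) (bool, error)
	seen   map[string]bool
}

func newExistenceCache(lookup func(ecosystem, pkg string) (bool, error)) *existenceCache {
	return &existenceCache{lookup: lookup, seen: map[string]bool{}}
}

// exists consults the cache first and only calls lookup on a miss.
// Errors are not cached, so a transient failure can be retried.
func (c *existenceCache) exists(ecosystem, pkg string) (bool, error) {
	base := baseEcosystem(ecosystem)
	key := base + "\x00" + pkg
	if found, ok := c.seen[key]; ok {
		return found, nil
	}
	found, err := c.lookup(base, pkg)
	if err != nil {
		return false, fmt.Errorf("unable to validate package %q: %w", pkg, err)
	}
	c.seen[key] = found
	return found, nil
}

func main() {
	lookups := 0
	cache := newExistenceCache(func(ecosystem, pkg string) (bool, error) {
		lookups++ // stands in for a network request
		return ecosystem == "Alpine" && pkg == "openssl", nil
	})
	for _, e := range []string{"Alpine:v3.16", "Alpine:v3.17", "Alpine"} {
		found, err := cache.exists(e, "openssl")
		fmt.Println(e, found, err)
	}
	fmt.Println("lookups:", lookups)
}
```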
It is valid to not have any range at all, as seen in the likes of
GHSA-9v2f-6vcg-3hgv, which was being flagged incorrectly.
Remove some println() debugging
The Go module proxy seems to not support package names with uppercase in
their name. GitHub URLs are known to be case-insensitive, so it's safe
to explicitly lowercase these. I dare say it'll be safe to lowercase
everything, but I wanted to start conservatively for now.

Also treat Go toolchain vulnerabilities the same way as stdlib ones so
they aren't flagged as a non-existent package.
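
A small sketch of the two behaviours described in this commit message, with hypothetical function names. For comparison it also shows the case-encoding that the Go module proxy protocol documents for uppercase letters (each one becomes `!` followed by its lowercase form), which is an alternative to lowercasing and is not something this commit claims to do.

```go
package main

import (
	"fmt"
	"strings"
)

// isBuiltinGoModule reports whether name is one of the pseudo-module names
// used for Go standard library and toolchain vulnerabilities.
func isBuiltinGoModule(name string) bool {
	return name == "stdlib" || name == "toolchain"
}

// lowercaseGitHubPath lowercases only github.com paths, leaving other hosts
// untouched (the conservative approach described in the commit message).
func lowercaseGitHubPath(name string) string {
	if strings.HasPrefix(name, "github.com/") {
		return strings.ToLower(name)
	}
	return name
}

// escapeModulePath applies the proxy protocol's case-encoding:
// every ASCII uppercase letter becomes "!" plus its lowercase form.
func escapeModulePath(name string) string {
	var b strings.Builder
	for _, r := range name {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// versionListURL builds the proxy's version-list URL for a module path.
func versionListURL(name string) string {
	return fmt.Sprintf("https://proxy.golang.org/%s/@v/list", escapeModulePath(name))
}

func main() {
	name := "github.com/BurntSushi/toml"
	fmt.Println(lowercaseGitHubPath(name))
	fmt.Println(versionListURL(name))
	fmt.Println(isBuiltinGoModule("toolchain"))
}
```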
Overwrite, don't append so that only the check requested gets run
They don't get returned by the Go proxy.

Also support versions with or without the "v" prefix.

All existing published Go vulnerabilities with the exception of
GO-2024-3012 now pass validation.

Signed-off-by: Andrew Pollock <apollock@google.com>
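
The version handling described here (accepting versions with or without the "v" prefix, and not expecting pseudo-versions to appear in the proxy's list) could be sketched roughly like this. The regular expression is a simplified approximation written for this example, and all names are hypothetical; `golang.org/x/mod/module` has a real `IsPseudoVersion` that would be preferable in practice.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pseudoVersion is a simplified pattern for Go pseudo-versions such as
// v0.0.0-20190101000000-abcdef123456 (an approximation, not the official one).
var pseudoVersion = regexp.MustCompile(
	`^v\d+\.\d+\.\d+-(?:[0-9A-Za-z.]+\.)?\d{14}-[0-9a-f]{12}(?:\+incompatible)?$`)

// withV adds the "v" prefix if it is missing.
func withV(version string) string {
	if strings.HasPrefix(version, "v") {
		return version
	}
	return "v" + version
}

func isPseudoVersion(version string) bool {
	return pseudoVersion.MatchString(withV(version))
}

// checkVersions returns an error naming the wanted versions that are absent
// from available. Pseudo-versions are skipped rather than looked up.
// Short forms such as "1.6" are not expanded to "1.6.0" here.
func checkVersions(wanted, available []string) error {
	known := map[string]bool{}
	for _, v := range available {
		known[withV(v)] = true
	}
	var missing []string
	for _, v := range wanted {
		if isPseudoVersion(v) || known[withV(v)] {
			continue
		}
		missing = append(missing, v)
	}
	if len(missing) > 0 {
		return fmt.Errorf("failed to find %v", missing)
	}
	return nil
}

func main() {
	available := []string{"v1.6.0", "v1.6.1"}
	fmt.Println(checkVersions([]string{"1.6.0", "v1.6.1"}, available))
	fmt.Println(checkVersions([]string{"1.6"}, available))
}
```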
@andrewpollock andrewpollock marked this pull request as ready for review August 7, 2024 23:22
Contributor

@oliverchang oliverchang left a comment

initial (very quick and incomplete) first pass!

@andrewpollock
Collaborator Author

@cuixq would you mind reviewing this also? (In particular, the package and version validation, but overall as well)

@andrewpollock andrewpollock changed the title Initial commit of a very bare-bones linter Initial commit of OSV record linter Aug 11, 2024
Unnecessary, default to the ALL one.

Signed-off-by: Andrew Pollock <apollock@google.com>
I was following examples in the GJSON documentation, but I can't really
see a reason for them.

Signed-off-by: Andrew Pollock <apollock@google.com>
return the result of the comparison instead of branching to return a
boolean

Signed-off-by: Andrew Pollock <apollock@google.com>
New project, no reason not to

Signed-off-by: Andrew Pollock <apollock@google.com>
General wisdom seems to be not to include these in the repo

Signed-off-by: Andrew Pollock <apollock@google.com>
1.23.0 isn't available for gLinux yet

Signed-off-by: Andrew Pollock <apollock@google.com>
Addresses reviewer feedback

Signed-off-by: Andrew Pollock <apollock@google.com>
Address reviewer feedback

Signed-off-by: Andrew Pollock <apollock@google.com>
- use better Go package names
- break out Go packages more granularly
- make some functions private

Signed-off-by: Andrew Pollock <apollock@google.com>
The callers need to do this as they access the response

Signed-off-by: Andrew Pollock <apollock@google.com>
Signed-off-by: Andrew Pollock <apollock@google.com>
@@ -0,0 +1,308 @@
{
Contributor

Most of these test files don't seem to be referenced anywhere as part of an automated test. Are you planning to add them in another PR?

Collaborator Author

There's so much extra scaffolding to add around CI/CD type stuff... Yes, that'll be in a future PR. Right now, these test files are for running the code against manually, per the PR description.

Collaborator

@another-rex another-rex left a comment

Added some comments!

Also, some ideas for future implementation:

  • a flag to disable checks (could be useful for the version check for unsupported ecosystems)
  • allowing the record to come from stdin, so people can pipe in their records to check

This reduces failure noise for ecosystems yet to be implemented

Signed-off-by: Andrew Pollock <apollock@google.com>
Addresses user expectation feedback from @another-rex

Signed-off-by: Andrew Pollock <apollock@google.com>

@cuixq cuixq left a comment

LGTM!

Thanks to @cuixq alerting me to golang.org/x/mod/{module,semver}, I can
check for pseudoversions and mess with semver versions somewhat more
cleanly.

Signed-off-by: Andrew Pollock <apollock@google.com>
This variable has no need to be public

Signed-off-by: Andrew Pollock <apollock@google.com>
This handles all the idiosyncrasies of PEP440 for versions, and
normalizes package names per the documented guidelines.

Signed-off-by: Andrew Pollock <apollock@google.com>
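
For context on the name normalization mentioned above: the Python packaging guidelines (PEP 503) normalize a project name by lowercasing it and collapsing every run of `-`, `_`, and `.` into a single `-`. A standalone sketch of that rule follows. The function names are made up for this example, it is presumably not how the commit does it (the message suggests a library handles this), and PEP 440 version handling is not covered.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// separators matches runs of the characters PEP 503 treats as equivalent.
var separators = regexp.MustCompile(`[-_.]+`)

// normalizePyPIName applies the PEP 503 rule: collapse separator runs to a
// single "-" and lowercase the result.
func normalizePyPIName(name string) string {
	return strings.ToLower(separators.ReplaceAllString(name, "-"))
}

// pypiJSONURL builds the JSON metadata URL for a normalized project name
// (hypothetical helper; the URL shape matches the one in the linter output).
func pypiJSONURL(name string) string {
	return fmt.Sprintf("https://pypi.org/pypi/%s/json", normalizePyPIName(name))
}

func main() {
	for _, name := range []string{"Flask_SQLAlchemy", "zope.interface", "A__b..C"} {
		fmt.Println(name, "->", normalizePyPIName(name))
	}
	fmt.Println(pypiJSONURL("Flask_SQLAlchemy"))
}
```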
@andrewpollock andrewpollock merged commit df1341c into ossf:main Aug 29, 2024
1 check passed
4 participants