Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand matching of requirements.txt #167

Closed
alfredodeza opened this issue Aug 13, 2020 · 6 comments · Fixed by #168
Closed

Expand matching of requirements.txt #167

alfredodeza opened this issue Aug 13, 2020 · 6 comments · Fixed by #168
Assignees
Labels
good-first-issue Good for newcomers

Comments

@alfredodeza
Copy link
Contributor

Currently, grype matches on requirements.txt files for that file only, and I think it is valid to match on more combinations of requirements.txt like:

  • requirements-dev.txt
  • requirements-prod.txt
  • test-requirements.txt

This grep found those variations in (my) local repositories that would benefit from getting inspected. This will probably need only the globs to be updated for Python.

@dakaneye dakaneye self-assigned this Sep 3, 2020
@dakaneye dakaneye transferred this issue from anchore/grype Sep 3, 2020
@dakaneye
Copy link
Contributor

dakaneye commented Sep 3, 2020

@alfredodeza @luhring @wagoodman I spent some time looking through the Grype code, and it seems like Grype is just doing the matching based on the packages returned from syft, which is looking in any requirements.txt file.

I believe if we change that glob expression from **/requirements.txt to **/*requirements*.txt it should allow syft to look through multiple files, hence returning packages from all of them for Grype to scan and match against vulnerabilities.

@dakaneye dakaneye added the good-first-issue Good for newcomers label Sep 3, 2020
@luhring
Copy link
Contributor

luhring commented Sep 3, 2020

Sounds cool, @dakaneye! I don't know the Python ecosystem very well. The only risk I could imagine is picking up false positives. But assuming we think there's a low chance that we'll come across txt files with "requirements" somewhere in the name that don't specify Python dependencies, I think this is a fantastic solution.

@dakaneye
Copy link
Contributor

dakaneye commented Sep 3, 2020

@luhring I think mostly the use case for having multiple types of requirements files is to add debugging tools or other things that won't go into you're production binary / app environment. The value add here is pretty small, to be honest, I was just looking for a "Good First Issue" to do some Golang! You'd probably still want to know if a package there was vulnerable, as long as you could trace it back to which file it was declared in.

@wagoodman
Copy link
Contributor

wagoodman commented Sep 8, 2020

I think it would be relatively safe to loosen the pattern to **/*requirements*.txt since false positives would require the filename to be the same as well as the internal format (lines with name == some.sem.ver), which seems unlikely to me. What I do wonder about is if the contents of requirements-dev.txt is really valid? That is, a service probably isn't vulnerable in prod just because a dev package is vulnerable. For a container, no one should be packaging requirements-dev.txt files anyway, so it shouldn't be an issue... for a dir scan I could go either way.

I ultimately see that having a vulnerable package is undesirable and the point of the tool is to notify the user of vulnerabilities outside of any dev/prod context, so the change is probably worth it.

@alfredodeza
Copy link
Contributor Author

I don't think we do any distinction about "is this package in prod? or is it in a dev environment?" to selectively say something is vulnerable and worth notifying.

The whole directory approach is based on the suspicion that packages will get installed. A development workflow for example, regardless of a containerized output or not, could include adding new (or newer) dependencies to a separate requirements file, so that it can be promoted later. Catching it early, and reporting it, is valuable information, to prevent churn at the end of the process.

I think @wagoodman is basically on the same page, but wanted to emphasize that it is important to catch these things even in seemingly non-prod environments.

@wagoodman
Copy link
Contributor

agreed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good-first-issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants