feat(python): support PKG-INFO and METADATA #716

Closed
knqyf263 opened this issue Oct 24, 2020 · 5 comments · Fixed by aquasecurity/go-dep-parser#23 or aquasecurity/go-dep-parser#24
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@knqyf263
Collaborator

Currently, Trivy traverses all paths and looks for every Pipfile.lock or poetry.lock in a container image. However, an image sometimes contains only a Pipfile.lock, without the Python packages listed in it actually being installed. An installed Python package should have a PKG-INFO or METADATA file, depending on whether it is an egg or a wheel.

https://packaging.python.org/discussions/wheel-vs-egg/

To avoid false positives from Pipfile.lock, we can probably take advantage of the *.dist-info/METADATA and *.egg-info/PKG-INFO files.

How it works:

  1. Look for all *.dist-info and *.egg-info directories
    • Note that *.egg-info is sometimes a file.
  2. Parse PKG-INFO or METADATA under those directories
  3. Extract a python package name and version
  4. Use them for vulnerability detection
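
The steps above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the parser that was eventually merged: the function names `isPythonMetadataPath`, `parseNameVersion`, and `parseNameVersionString` are hypothetical, paths are assumed to be slash-separated paths of regular files, and only the `Name` and `Version` header fields are read.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"path"
	"strings"
)

// isPythonMetadataPath (hypothetical helper) reports whether filePath looks
// like installed-package metadata: *.dist-info/METADATA, *.egg-info/PKG-INFO,
// or a bare *.egg-info file. The caller is assumed to pass regular files only.
func isPythonMetadataPath(filePath string) bool {
	dir, base := path.Split(filePath)
	dir = strings.TrimSuffix(dir, "/")
	switch {
	case base == "METADATA" && strings.HasSuffix(dir, ".dist-info"):
		return true
	case base == "PKG-INFO" && strings.HasSuffix(dir, ".egg-info"):
		return true
	case strings.HasSuffix(base, ".egg-info"):
		// *.egg-info is sometimes a single file instead of a directory.
		return true
	}
	return false
}

// parseNameVersion reads the "Key: value" header block of a METADATA or
// PKG-INFO file and returns the Name and Version fields. It stops at the
// first blank line, where the headers end and the description begins.
func parseNameVersion(r io.Reader) (name, version string, err error) {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.TrimSpace(line) == "" {
			break // end of the header block
		}
		if line[0] == ' ' || line[0] == '\t' {
			continue // continuation of a multi-line header value
		}
		parts := strings.SplitN(line, ":", 2)
		if len(parts) != 2 {
			continue
		}
		switch strings.TrimSpace(parts[0]) {
		case "Name":
			name = strings.TrimSpace(parts[1])
		case "Version":
			version = strings.TrimSpace(parts[1])
		}
		if name != "" && version != "" {
			break // both fields found, no need to read further
		}
	}
	if scanErr := scanner.Err(); scanErr != nil {
		return "", "", scanErr
	}
	if name == "" || version == "" {
		return "", "", fmt.Errorf("Name or Version header not found")
	}
	return name, version, nil
}

// parseNameVersionString is a convenience wrapper for in-memory content.
func parseNameVersionString(content string) (string, string, error) {
	return parseNameVersion(strings.NewReader(content))
}

func main() {
	// Made-up sample content in the METADATA header format.
	sample := "Metadata-Version: 2.1\nName: requests\nVersion: 2.24.0\n\nLong description...\n"
	name, version, err := parseNameVersionString(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name, version)
}
```

The extracted name and version pair is what step 4 would hand to vulnerability detection. How that hand-off looks inside Trivy is not shown here.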

How to implement it:

  1. Add a new parser here like pkg/egg/parse.go or pkg/python/egg/parse.go
  2. Add a new analyzer here like analyzer/library/egg/egg.go
  3. Add a new detector here

I'm sure the first two tasks are not difficult and are good for a first-time contributor as well. There are a few things to consider when implementing it in Trivy.

@knqyf263 knqyf263 added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. labels Oct 24, 2020
@dai-dao

dai-dao commented Nov 6, 2020

Hello, I would like to take on this feature.

@dai-dao

dai-dao commented Nov 12, 2020

Hello, I've spent some time on this and figured out the following:

  1. Pipfile.lock doesn't include all sub-dependency packages, only some of them. So if you parse METADATA and PKG-INFO to get a list of all the sub-dependencies, the final list will be much bigger than the list in Pipfile.lock.
  2. A simpler solution would probably be to parse Pipfile.lock to get the list of dependencies, then verify it by checking the site-packages folder for those packages only.

Let me know what you think. Cheers

@knqyf263
Collaborator Author

This issue aims to scan Python packages without Pipfile.lock. Some packages are not listed in Pipfile.lock, which can lead to false negatives. It would be great if you got a bigger list than Pipfile.lock.
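
To make the gap between the two lists concrete, here is a minimal Go sketch that reports installed packages missing from a Pipfile.lock. It is an illustration only: the helpers `lockedNames`, `normalizeName`, and `missingFromLock` are hypothetical, the sample data is made up, and only the `default` and `develop` sections of the lock file are read. Names are compared after PEP 503 normalization, since the lock file and the installed metadata may spell the same package differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// pipfileLock models only the sections of Pipfile.lock needed here.
// The per-package details (version, hashes) are left unparsed.
type pipfileLock struct {
	Default map[string]json.RawMessage `json:"default"`
	Develop map[string]json.RawMessage `json:"develop"`
}

var separators = regexp.MustCompile(`[-_.]+`)

// normalizeName applies PEP 503 name normalization: runs of "-", "_" and "."
// become a single "-", and the result is lowercased.
func normalizeName(name string) string {
	return strings.ToLower(separators.ReplaceAllString(name, "-"))
}

// lockedNames returns the normalized package names listed in the lock file.
func lockedNames(lockJSON []byte) (map[string]bool, error) {
	var lock pipfileLock
	if err := json.Unmarshal(lockJSON, &lock); err != nil {
		return nil, err
	}
	names := make(map[string]bool)
	for name := range lock.Default {
		names[normalizeName(name)] = true
	}
	for name := range lock.Develop {
		names[normalizeName(name)] = true
	}
	return names, nil
}

// missingFromLock returns, sorted, the installed package names that do not
// appear in the lock file. These would be missed by lock-file-only scanning.
func missingFromLock(installed []string, locked map[string]bool) []string {
	var missing []string
	for _, name := range installed {
		if !locked[normalizeName(name)] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Made-up lock file and installed-package list for illustration.
	lockJSON := []byte(`{"default": {"requests": {"version": "==2.24.0"}}, "develop": {}}`)
	installed := []string{"requests", "urllib3"}

	locked, err := lockedNames(lockJSON)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(missingFromLock(installed, locked)) // prints [urllib3]
}
```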

@westonsteimel

Clair appears to take a similar approach here if that is useful to anyone who may take on implementing this: https://github.com/quay/claircore/blob/master/python/packagescanner.go

@github-actions

This issue is stale because it has been labeled with inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and will be auto-closed. label Mar 22, 2021
@krol3 krol3 added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and will be auto-closed. labels Mar 22, 2021
@knqyf263 knqyf263 reopened this Jul 8, 2021
4 participants