Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for "file" source type in syftjson unmarshaling #750

Merged
merged 4 commits into from
Jan 18, 2022

Conversation

luhring
Copy link
Contributor

@luhring luhring commented Jan 18, 2022

This fixes the root cause of anchore/grype#592.

Syft's "syftjson" format had not accounted for the "file" type of Source during unmarshaling. This PR:

  • adds "file" as an expected Source type
  • creates tests for Source unmarshaling, including a new test case to catch this "file" case

To address the Grype bug, a follow-up PR will be needed in Grype to use a version of Syft that has this fix.


Lingering thoughts for the future:

  1. It'd be nice to also assert in tests that all possible Source.Type values are being tested, similar to how we test the completeness of the test suite itself in other areas of the code.
  2. Using a source of truth like AllSchemes might be the way to go. But those underlying string values don't line up with the string values we use during marshaling (i.e. "FileScheme" != "file"). (Perhaps we could make them line up, if there's not value in creating two different versions of each given value?)
  3. An opportunity to add clarity to the code in the future is to disambiguate the term "scheme". In some places we're using it to refer to the enum of {image, directory, and file), and in other places we're using it to refer to the user input text that precedes a colon (e.g. registry:...).

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Copy link
Contributor

@spiffcs spiffcs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great test and great find! Glad this was such an easy fix.

"digest": "sha256:396c31837116ac290458afcb928f68b6cc1c7bdd6963fc72f52f365a2a89c1b5"
}
]
}`),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick nit: should the indentation here be moved so it's closer to RawManifest and not at the 0 position of the line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spiffcs I can try that! The only constraint here worth mentioning is that this is a string literal (encoded as a byte slice), and it has to match the result of unmarshaling the raw input. The way I've tried to indent so far, it causes a test failure, since those inserted tabs are not expected. Do you have ideas on how to get the test to pass and have this be indented to the positiion of RawManifest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries on the indentations. I understand that the byte match fails because of those added characters. The only way we could update it is if the fixture was also represented in the test struct which seems way more brittle than what we're doing now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, sounds good

@spiffcs
Copy link
Contributor

spiffcs commented Jan 18, 2022

Regarding the follow-up questions:

  • +1 for more enums and lining up the terms
  • here is a list of all the "schemes" we use. It looks like (image --> docker, docker-archive, ociarchive, etc). Does this hint that we add more specific source types and be more granular than image in our output? (dir --> directory) this shorthand is probably for user convenience more than anything else. I think we can map (dir -> directory) for the sbom output to keep the convenience. Linking the input value with what shows up in the Source field is a good idea for clarifying terms and keeping concepts linked across the code.
docker:yourrepo/yourimage:tag          use images from the Docker daemon
docker-archive:path/to/yourimage.tar   use a tarball from disk for archives created from "docker save"
oci-archive:path/to/yourimage.tar      use a tarball from disk for OCI archives (from Skopeo or otherwise)
oci-dir:path/to/yourimage              read directly from a path on disk for OCI layout directories (from Skopeo or otherwise)
dir:path/to/yourproject                read directly from a path on disk (any directory)
file:path/to/yourproject/file          read directly from a path on disk (any single file)
registry:yourrepo/yourimage:tag        pull image directly from a registry (no container runtime required)

@luhring
Copy link
Contributor Author

luhring commented Jan 18, 2022

+1 for more enums and lining up the terms

❤️

Re: disambiguation of the word "scheme":

My understanding of the code is that there really are two different concepts here (both of which today are referred to as "scheme" in the code).

There is:

  1. "how to find the thing to scan" — i.e. what's used in user input, inspired by the URI spec (e.g. oci-dir:)
  2. "what thing was scanned" — i.e., was it a file? was it a directory? was it an image (regardless of if we obtained it using Docker, found it on disk as an OCI archive, pulled it directly from an OCI registry, etc.)?

IMHO, item 1 makes sense to be called "scheme", whereas calling item 2 a "scheme" causes me confusion. On top of that, it seems like we also refer to item 2 as the "source type" in some places (e.g. in Source.Type).

In my mind, one relatively low impact but high value solution would to be to adjust the "item 2" uses of "scheme" to be "type" / "source type". This would, for example, change AllSchemes to AllTypes (in the source package).

Does this perspective make sense?

@luhring luhring merged commit c61204f into anchore:main Jan 18, 2022
spiffcs added a commit that referenced this pull request Jan 19, 2022
…hub.com/hectorj2f/syft into hectorj2f/add_dependencies_to_cyclonedx

* 'hectorj2f/add_dependencies_to_cyclonedx' of https://github.com/hectorj2f/syft: (29 commits)
  Improve CycloneDX format output (#710)
  Add additional PHP metadata (#753)
  Update Syft formats for SyftJson (#752)
  Add support for "file" source type in syftjson unmarshaling (#750)
  remove contains file from spdx dependency generation
  support .sar for java ecosystem (#748)
  Start developer documentation (#746)
  Align SPDX export more with SPDX 2.2 specification (#743)
  Replace distro type (#742)
  update goreleaser with windows checksums (#740)
  bump stereoscope version to remove old containerd (#741)
  Add support for multiple output files in different formats (#732)
  Add support for searching for jars within archives (#734)
  683 windows filepath (#735)
  Fix CPE encode/decode when it contains special chars (#714)
  support .par for java ecosystems (#727)
  Add arm64 support to install script (#729)
  Revert "bump goreleaser to v1.2 (#720)" (#731)
  Add a version flag (#722)
  Add lpkg as java package format (#694)
  ...
@luhring luhring added the bug Something isn't working label Jan 20, 2022
fengshunli pushed a commit to fengshunli/syft that referenced this pull request Jan 24, 2022
)

* Add tests for image and directory syftjson source

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add failing test case for file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Fix file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add test case for unknown source type

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Signed-off-by: fsl <1171313930@qq.com>
spiffcs pushed a commit that referenced this pull request Jan 24, 2022
* Add tests for image and directory syftjson source

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add failing test case for file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Fix file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add test case for unknown source type

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
spiffcs pushed a commit that referenced this pull request Jan 25, 2022
* Add tests for image and directory syftjson source

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add failing test case for file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Fix file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add test case for unknown source type

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
jonasagx pushed a commit to jonasagx/syft that referenced this pull request Jan 28, 2022
)

* Add tests for image and directory syftjson source

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add failing test case for file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Fix file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add test case for unknown source type

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
)

* Add tests for image and directory syftjson source

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add failing test case for file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Fix file source unmarshaling

Signed-off-by: Dan Luhring <dan+github@luhrings.com>

* Add test case for unknown source type

Signed-off-by: Dan Luhring <dan+github@luhrings.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants