Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix intake of SBOM from JSON #241

Merged
merged 4 commits into from
Mar 15, 2021
Merged

Fix intake of SBOM from JSON #241

merged 4 commits into from
Mar 15, 2021

Conversation

luhring
Copy link
Contributor

@luhring luhring commented Mar 2, 2021

This PR:

Notes:

This PR also adds the beginnings of "CLI-level" tests — to ensure that the built binary behaves correctly from a CLI context, such as the handling of data piped to the command's STDIN.

What's done so far:

  • introduced a new "cli" make target to run these tests
  • introduces tests to cover the cases listed in Can't use syft JSON output as input #235 (with the exception of ~ home directory processing)
  • target snapshot builds for CLI tests
  • included these tests in the CI pipeline

What's left for consideration:

  • factoring out generalized CLI test helpers to a new directory or project

path string
}{
{
"absolute path",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware that you could define structs without their field names. When is this style preferred?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I have a good answer to that question offhand. In this case, I had two reasons for this syntax, 1) the field names appeared immediately before the data (in the struct type literal), and 2) in my editor (GoLand), the field names actually do appear as hints next to the field values.

Having said all that, I don't mind changing to include field names 😃

@@ -160,10 +185,6 @@ func (p *partialSyftPackage) UnmarshalJSON(b []byte) error {
PomGroupID: group,
ManifestName: name,
}
case "":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this part of the issue as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes — I forget the exact execution path that was causing trouble, but I can dig that up if you'd like

}

func openSbom(path string) (*os.File, error) {
expandedPath, err := homedir.Expand(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more of an open-ended question, but how does this expand when using sudo? is it acceptable to expand to /root instead of /home/myusername ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question... I would guess so. The homedir library tries its best to rely on the OS's own behavior for "home directory" lookup. (E.g. see here -> https://github.com/mitchellh/go-homedir/blob/af06845cf3004701891bf4fdb884bfe4920b3727/homedir.go#L89)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be tricky in system calls using shell and interpreting back into code (I've run into issues like this in Python). Just tried it out with a quick test and it checks out. Thanks for the link the source code!

func openSbom(path string) (*os.File, error) {
expandedPath, err := homedir.Expand(path)
if err != nil {
return nil, fmt.Errorf("unable to open SBOM: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this error should be specific to expanding the path, not to open the SBOM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make that change. 👍

This is one area I think it'd be good to seek alignment on among the group. Errors are interesting... (especially in Go). If function A calls function B, and thus has to handle getting an error back from function B, which function reports that there was a problem with function B's responsibility? Lately, I've been preferring function B to do this, since that consolidates the message to function B, and any number of callers of function B can rely on this error message rather than have them all say something like "there was a problem with function B" in their own function bodies.

In this specific case, if you look into the code for the function homedir.Expand(path), you'll notice it returns errors like "cannot expand user-specific home dir". So in my function, openSbom, rather than fmt.Errorf("error expanding the home dir in the path": %w", err), which would produce:

ERROR: error expanding the home dir in the path: cannot expand user-specific home dir

I decided to have my error reflect the failure in my function's job, which is to open an SBOM.

Again, happy to change it, but just wanted to share those thoughts!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explanation makes sense to me, and I agree with all your points. The caller of this function mentions something similar which seemed repetitive:

    return nil, fmt.Errorf("unable to use specified SBOM: %w", err)

Re-using a bit your example, that would mean it would produce this error:

ERROR: unable to use specified SBOM: unable to open SBOM: cannot expand user-specific home dir

This reads somewhat repetitive. I don't have a good suggestion or alternative here, but that repetition is what made me think that perhaps the messaging should change.

However! I'm fine with whatever you decide, this is not a hard comment :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting point. I'll think about that. I think my thinking was that there could be multiple reasons why an SBOM is unusable (e.g. invalid JSON syntax). getSyftJSON doesn't care why the SBOM is unusable, it just needs to know that it's not usable. The inner error unable to open SBOM... in openSbom is a specific reason why the SBOM was unusable — specifically that the SBOM couldn't be opened.

luhring added 3 commits March 9, 2021 08:58
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
Signed-off-by: Dan Luhring <dan.luhring@anchore.com>
@luhring luhring marked this pull request as ready for review March 9, 2021 14:29

// this reader is a combination of previous bytes read from stdin by other providers as well as what still is
// available from stdin. The Tee reader is to ensure that any read bytes from stdin are preserved.
cfg.reader = io.MultiReader(&previousStdin, io.TeeReader(os.Stdin, &previousStdin))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking this out has means that we must read all of stdin to determine which provider to use, where the former implementation was "peeking" as far as necessary. Was there a reason to change this to read all stdin bytes into memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main goal here was to simplify. To be honest, I was having trouble keeping track of the logic, which was notable to me because the lack of availability of SBOM JSON readers was central to these bugs. For now, we only have one provider that makes use of the stdin bytes reader, so the multireader + teereader weren't being leveraged beyond providing an io.Reader for stdin. I could see this changing if we see a need for more providers, and at that point, it'd be good to reassess the symmetry we're seeing among the providers.

If there's a more straightforward way to improve this code, I'd welcome that!

Copy link
Contributor

@wagoodman wagoodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice addition 💯

@luhring luhring merged commit 75c0d36 into main Mar 15, 2021
@luhring luhring deleted the fix-sbom-input branch March 15, 2021 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't use syft JSON output as input
3 participants