Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to write a SBOM per binary #563

Closed
wants to merge 7 commits into from

Conversation

Shnatsel
Copy link
Contributor

@Shnatsel Shnatsel commented Nov 16, 2023

Mostly (?) fixes #557

This sticks to having the whole SBOM operate in terms of packages, so that the toplevel element is still a package potentially consisting of several binaries; but there is a SBOM named after every binary, making them much easier to locate.

Why do it this way

So you may be thinking "Eww, gross! Why don't you just make the binary your toplevel element? Why generate a SBOM for the whole package?"

There is a long list of reasons why it is done this way, most of which are weaksauce ("all components are packages with PURLs so it's more consistent", "the per-binary mode isn't special and isn't a source of lots of extra complexity and bugs", etc). The real reason we output a SBOM for the whole package is the SBOMs clobbering each other.

Imagine you have a target that's both a binary and a shared library. The compiled artifacts get different extensions (.exe vs .dll on Windows, nothing vs .so on Linux), etc. However, when you try to emit stuff like debug info for them, the names will clash! Both are myproject.pdb on Windows. If you run into this, Cargo will print a warning but not actually do anything to fix this.

We have to deal with this problem somehow as well.

There are two sources of clobbering:

  1. The target with the same name (this is common)
  2. The workspace producing e.g. a library and a binary with the same name but coming from different packages (also common), or even binaries with the same name (unlikely)

We easily sidestep (2) by writing the SBOM into each package's directory. Boom, stuff from different packages cannot clash.

We sidestep (1) by emitting the SBOM for the whole package. So the library and binary targets within the same package can clobber each other as much as they like - it's still going to be the exact same file! You don't have to invent a scheme to differentiate between an rlib, a cdylib and an executable that all got built from the same package - you just look up the name of the binary and off you go.

In case you're wondering why don't we just name each SBOM after the final name of the binary, .exe and all: the answer is that we don't know them. cargo metadata does not provide this information. #532 might help with that. But even if that ever happens, that would still be incompatible with the --target=all mode.


You may notice that these things are now named after a variable prefix and no longer fit the standard naming convention. And that is absolutely correct! That's a bug that was present all along for --output-pattern=package and that we inherited in this PR too. I think there needs to be a .cdx in there when it's not bom.{xml,json}. That is a change in user-visible behavior so that'll require shipping a 0.5.0 release. I think this is a necessary fix, I'll make a PR for that once this is merged to avoid conflicts.

Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
@Shnatsel Shnatsel requested a review from a team as a code owner November 16, 2023 22:20
Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
@Shnatsel
Copy link
Contributor Author

Oh, and if you want to test this on real projects, merge #554 first to get actually sensible behavior.

Signed-off-by: Sergey "Shnatsel" Davidoff <shnatsel@gmail.com>
Copy link
Contributor

@justahero justahero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me.

@@ -80,7 +80,7 @@ Defaults to the host target, as printed by 'rustc -vV'"
#[clap(long = "output-cdx")]
pub output_cdx: bool,

/// Prefix patterns to use for the filename: bom, package
/// Prefix patterns to use for the filename: bom, package, binary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not add
"Implies --output-cdx" for binary?

@lfrancke
Copy link
Contributor

I'm afraid this returns the wrong result. At least not what I'd expect and in this form it'd not be too helpful.

If I have a member with multiple targets (e.g. multiple binaries) I'd expect the top-level metadata.component to be exactly that.
e.g. I have a crate called cargo-cyclonedx with a main.rs plus bin/hello_world.rs and bin/hello_world2.rs I'd expect three SBOMs.
One called hello_world.cdx.json which would have a top-level metadata.component with the name hello_world and not cargo-cylonedx.

We can ask this on the CycloneDX slack but that'd be my interpretation of what users will expect.
Libraries don't get any separate SBOMs in this case either.

The implementation might suck but my original PR implemented what I thought was sane back then.

@Shnatsel
Copy link
Contributor Author

I see that this does not fit your use case. I'm glad I held off merging until I got your feedback. I'm going to close this, and work on solving your use case in another PR.


e.g. I have a crate called cargo-cyclonedx with a main.rs plus bin/hello_world.rs and bin/hello_world2.rs I'd expect three SBOMs.

To clarify, this is the case in this PR.

One called hello_world.cdx.json which would have a top-level metadata.component with the name hello_world and not cargo-cylonedx.

But this is not.

Libraries don't get any separate SBOMs in this case either.

This is not the case in this PR either.

@Shnatsel Shnatsel closed this Nov 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide an easy way to map an artifact to a SBOM
3 participants