How to handle symlinks in SPDX documents? #610

Open
pseudoyim opened this issue Dec 15, 2021 · 8 comments

@pseudoyim

Hello!

I’m working on a project to build SBOMs for packages using the SPDX specification (v2.2.1). Many of our packages contain symlinks to other files (within the same package), and we were wondering exactly how these should be described in the SPDX documents we generate?

For example, our zlib package contains the following symlink: libz.so -> libz.so.1.2.11.

We have a tool, which uses tools-python, that builds SPDX documents for us. It has a function that analyzes each file contained in a package (e.g. to generate checksums for each file). When this function opens a symlink file, it de-references the symlink (e.g. libz.so) and just gets the checksums of the target file (e.g. libz.so.1.2.11). This doesn’t seem like the best thing to do because it feels like we’re saying those two files are the same thing, but they’re not. For now, we are planning to stick with this function as it is and add a comment indicating that "this file is a symlink to <some target>" and that the checksums represent those of the target.
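The plan described above (keep the target's checksum, but say so in a comment) could be sketched roughly as follows. This is only an illustration using the Python standard library, not the tools-python API; `describe_file` and the dictionary keys are hypothetical names chosen for the example.

```python
import hashlib
import os


def sha1_of(path):
    """SHA1 of a file's bytes; open() follows symlinks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_file(path):
    """Hypothetical helper: describe one file, flagging symlinks explicitly."""
    entry = {"name": path}
    if os.path.islink(path):
        # readlink returns the stored target path without opening the target.
        target = os.readlink(path)
        if not os.path.exists(path):
            # os.path.exists follows the link, so False here means it dangles.
            entry["comment"] = f"This file is a dangling symlink to {target}."
            return entry
        entry["comment"] = (
            f"This file is a symlink to {target}; "
            "the checksum is that of the target."
        )
    entry["sha1"] = sha1_of(path)
    return entry
```

For `libz.so -> libz.so.1.2.11`, the entry for `libz.so` would carry the same SHA1 as the target plus the comment, so a reader can tell the two entries are not independent files.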

Could you please advise on whether there is a prescribed or better way to handle symlinks that conforms with the SPDX specifications?

Thank you.

@swinslow
Member

Hi @pseudoyim, it's a great question and unfortunately I don't have an answer for you on a prescribed approach here.

The SPDX Golang tools disregard symlinks altogether when analyzing a package's files -- e.g. they just ignore the symlink and don't include it in the list of Files for that Package. I'm not convinced that's the right approach either, but it didn't feel right to include the target file as being "contained" by the Package, or to describe its hashes / licenses / etc. as part of the Package's contents.

Either way, it's a good question and probably something that the project should align on, one way or the other :)

@goneall
Member

goneall commented Dec 16, 2021

This would make a good topic for any upcoming Docfests.

The Java tools do not check for symlinks when calculating the verification code, so it likely includes them. Like @swinslow, I'm not sure whether this is the right approach, but we have already discovered one inconsistency between the tools.

One suggestion is to skip the files with the symlinks AND add the file paths to the excludes file list when generating the Package Verification Code. That way anyone validating the verification code would skip the symlinked files and the verification codes should match.
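A minimal sketch of that suggestion, assuming the SPDX 2.2 Package Verification Code algorithm (SHA1 each file, sort the hex digests, concatenate them, SHA1 the result). The function name and return shape are hypothetical; this is not how any of the SPDX tools are implemented.

```python
import hashlib
import os


def package_verification_code(root):
    """Return (code, excluded_paths) for the tree at root, skipping symlinks."""
    digests = []
    excluded = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories by default,
        # but it still lists them in dirnames, so record them as excluded.
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                excluded.append(os.path.relpath(full, root))
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                excluded.append(os.path.relpath(full, root))
                continue
            with open(full, "rb") as handle:
                digests.append(hashlib.sha1(handle.read()).hexdigest())
    # Sort the per-file digests, join them without separators, hash again.
    joined = "".join(sorted(digests)).encode("ascii")
    return hashlib.sha1(joined).hexdigest(), sorted(excluded)
```

Because the symlink paths come back alongside the code, a validator that honours the excludes list would skip the same entries and should arrive at the same value.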

@seabass-labrax
Contributor

Linking relationships could be a really useful feature to add in SPDX 3.0 for use-cases like @pseudoyim's. I've made a comment in spdx/spdx-3-model#5 so that this can be tracked :)

@pseudoyim
Author

I apologize for the extremely late follow-up on this thread! We greatly appreciate you all taking up this issue and addressing it in this Punch List.

@swinslow
Member

Thanks for replying on this thread, @pseudoyim! I'd forgotten we started this discussion :)

This is timely as I just realized this morning that the Golang tools might be handling Packages with symlinks differently on Windows than they do on Linux. The thread at spdx/tools-golang#117 has some details about (I'm guessing) differences in symlinks leading to getting different Package Verification Codes on Windows vs. on Linux. I need to poke around more to confirm that's what's going on, but just noting that this is less settled in the Golang tools than I'd hoped...

@dholth

dholth commented Mar 31, 2022

Some symlinks are aliases for libraries included in the same package; others point at, e.g., the system's default timezone.

@goneall
Member

goneall commented Apr 4, 2024

It looks like this is still an open question. Moving to 3.1, since it likely would not involve a breaking change.

@goneall modified the milestones: 3.0, 3.1 Apr 4, 2024
@goneall
Member

goneall commented May 28, 2024

In SPDX 3.X we are deprecating the package verification code in favor of either gitoid or swhid.

In these algorithms, the name is captured from the symlink but not the content.

For SPDX, I suggest we follow the same approach and capture the name, not the actual content.

We need to model out how symlinks will work:

  • Could be another "type" of file
  • Could be a subclass of File
  • Could be a relationship between the Symlink and a File
  • Could be a combination of the above

Discussed on the 28 May 2024 tech call - agreed it should be solved for 3.1 but will require more discussion.
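To illustrate the "name but not content" idea: Git stores a symlink as a blob whose content is the link's target path, so the identifier depends on where the link points, never on the bytes of the target file. The sketch below applies Git's SHA1 blob hashing to the link target. That gitoid and swhid treat symlinks in exactly this way is an assumption here, and the function names are hypothetical.

```python
import hashlib
import os


def git_blob_id(data):
    """Git-style SHA1 blob id: sha1(b"blob <len>\\0" + data)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def symlink_blob_id(path):
    """Hash the stored target path of a symlink; the target is never opened."""
    target = os.fsencode(os.readlink(path))
    return git_blob_id(target)
```

With this approach, `libz.so -> libz.so.1.2.11` gets an identifier derived from the string `libz.so.1.2.11`, which stays the same even if the contents of the target file change.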
