Alternate non-embedding proposal #53

Open
edwarnicke opened this issue Jul 26, 2023 · 10 comments
Labels

  • c-spec (Category: Improvements or additions to the OmniBOR specification)
  • s-needs-info (Status: Further information is requested)

Comments

@edwarnicke
Contributor

There has been much discussion about a non-embedding mode for OmniBOR.

Typically these proposals have all run up against a complication: it must be possible for a build tool to, by inspection, figure out the Input Manifest Identifier of an artifact it is attempting to use as input.

One possible approach to consider: for each output artifact, the build tool writes a small file named ${outputfile}.omnibor into the same directory as the output artifact, containing a very simple single-record Input Manifest for the artifact.

Example:

  1. foo.o is the name of the output file
  2. foo.o has Artifact ID gitoid:blob:sha1:03fb9d595634e14c261c8732d52e9ee8e7976f55
  3. foo.o has Input Manifest ID gitoid:blob:sha1:7df059597099bb7dcf25d2a9aedfaf4465f72d8d

Then in the same directory as foo.o, write out a file foo.o.omnibor containing:

gitoid:blob:sha1:
blob 03fb9d595634e14c261c8732d52e9ee8e7976f55 bom 7df059597099bb7dcf25d2a9aedfaf4465f72d8d
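
A minimal sketch of how a build tool might write such a sidecar file, assuming the Artifact ID is the usual git blob hash of the output file and that the Input Manifest ID is already known to the tool. The function names `gitoid_blob` and `write_sidecar` are hypothetical and not part of any OmniBOR tooling; the file layout simply mirrors the example above.

```python
import hashlib
from pathlib import Path


def gitoid_blob(data: bytes, algo: str = "sha1") -> str:
    """Hash data the way git hashes a blob: 'blob <len>\\0' followed by the content."""
    h = hashlib.new(algo)
    h.update(b"blob %d\0" % len(data))
    h.update(data)
    return h.hexdigest()


def write_sidecar(output_path: Path, manifest_id: str, algo: str = "sha1") -> Path:
    """Write <output>.omnibor next to the output artifact (hypothetical helper)."""
    artifact_id = gitoid_blob(output_path.read_bytes(), algo)
    sidecar = output_path.with_name(output_path.name + ".omnibor")
    # Layout copied from the example in this issue: a type line, then one record.
    sidecar.write_text(
        f"gitoid:blob:{algo}:\n"
        f"blob {artifact_id} bom {manifest_id}\n"
    )
    return sidecar
```

A consuming build tool could then look for `foo.o.omnibor` beside `foo.o` and read the `bom` field to learn the Input Manifest ID without inspecting the artifact itself.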

Things to address with this idea:

  1. How to handle dual tree
  2. What should the suffix be: .omnibor or something else?

@edwarnicke
Contributor Author

Offhand idea for handling the multiple artifact identifier types:

  1. Define a standard for concatenating multiple Input Manifests together, similar to that for YAML, where each document is separated by a single line containing ---
  2. Use that standard for the ${output artifact}.omnibor file

Example:

---
gitoid:blob:sha1:
blob 03fb9d595634e14c261c8732d52e9ee8e7976f55 bom 7df059597099bb7dcf25d2a9aedfaf4465f72d8d
---
gitoid:blob:sha256:
blob 09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 bom c71d239df91726fc519c6eb72d318ec65820627232b2f796219e87dcf35d0ab4
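
As a rough illustration of how a reader might consume such a concatenated file, here is a sketch that splits the text on `---` lines and indexes each document by its first line. The helper names `split_manifests` and `index_by_type` are hypothetical, and the handling of blank lines and the trailing colon is an assumption based only on the example above.

```python
def split_manifests(text: str) -> list:
    """Split a concatenated .omnibor file into individual manifest documents."""
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":
            # A separator closes the document collected so far, if any.
            if current:
                docs.append("\n".join(current) + "\n")
            current = []
        elif line.strip():
            current.append(line)
    if current:
        docs.append("\n".join(current) + "\n")
    return docs


def index_by_type(text: str) -> dict:
    """Map each identifier type (e.g. 'gitoid:blob:sha1') to its document."""
    return {doc.splitlines()[0].rstrip(":"): doc for doc in split_manifests(text)}
```

With this shape, a tool that only understands sha256 identifiers could pick `index_by_type(text)["gitoid:blob:sha256"]` and ignore the other document.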

@edwarnicke
Contributor Author

One other option might be to write the 'mapping' files to

${OMNIBOR_DIR}/metadata/mapping/gitoid_blob_sha1/${gitoid:blob:sha1 Output Artifact ID}

${OMNIBOR_DIR}/metadata/mapping/gitoid_blob_sha256/${gitoid:blob:sha256 Output Artifact ID}

Example:

${OMNIBOR_DIR}/metadata/mapping/gitoid_blob_sha1/03fb9d595634e14c261c8732d52e9ee8e7976f55

${OMNIBOR_DIR}/metadata/mapping/gitoid_blob_sha256/09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772

Please don't take the choice of 'mapping' as a ${context} overly seriously... it is likely not the ultimate choice we wish to make.
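
A small sketch of how a tool might compute these paths, assuming the layout shown above. The function name `mapping_path` is hypothetical, and the context name is left as a parameter because 'mapping' is explicitly provisional.

```python
from pathlib import Path


def mapping_path(omnibor_dir: str, algo: str, artifact_id: str,
                 context: str = "mapping") -> Path:
    """Build ${OMNIBOR_DIR}/metadata/<context>/gitoid_blob_<algo>/<artifact id>."""
    return Path(omnibor_dir) / "metadata" / context / f"gitoid_blob_{algo}" / artifact_id


# Example with the sha1 Artifact ID from this issue and a made-up directory.
path = mapping_path("/tmp/omnibor", "sha1",
                    "03fb9d595634e14c261c8732d52e9ee8e7976f55")
```

Because the lookup key is the Output Artifact ID, a consumer only needs the artifact's bytes and the value of `OMNIBOR_DIR` to find the mapping file, regardless of where the artifact itself lives.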

@edwarnicke
Contributor Author

@bharsesh suggests possibly instead of 'mapping' we might call the context 'adg' or something else.

  • Jeff Hewett seconded 'adg' and supported directory structure to delineate

  • Yongkui suggested that binding and non-binding can be combined: even when embedding is used, this mapping can still work.

@alilleybrinker
Member

@yonhan3 and @edwarnicke, I think the state of the non-embedding discussion is a little messy right now. Specifically, we have:

The draft PR is currently conflicted with the main branch, and it's not clear to me whether it reflects the current status of the discussion, or what the current status of the discussion is.

Ideally, I'd like for us to be able to achieve the following:

  • Either resolve the conflicts and merge Ed's PR with some set of minimal further changes to reflect the current consensus points, potentially leaving open further points of disagreement for future PRs, OR close Ed's PR with a plan to open a fresh one once further conversation has clarified the intended design.
  • Close out the two open issues, which touch on a wide range of points, and open either a more focused set of issues which each describe individual changes to the spec where the design space has been narrowed, OR open a Discussion which enumerates the open challenges for this topic.

Basically, I'm seeking to clarify the status here, and make it easier to judge when we've reached sufficient consensus to merge content to the spec which describes how the no-embedding case ought to be handled.

Given this, my questions to you both are:

  • What do you view as the open issues for non-embedding?
  • What is your disposition on the state of Ed's PR and what ought to happen to it?

alilleybrinker added the c-spec and s-needs-info labels on Sep 20, 2023

@yonhan3

yonhan3 commented Sep 20, 2023

@alilleybrinker I think my old issue #22 is outdated. I implemented a prototype for GCC, but it never went upstream; it still lives only in a private branch.

We can just follow Ed's new draft #24 for further discussions.

@alilleybrinker
Member

@yonhan3 sounds good! I'll close the other issue then.

@edwarnicke
Contributor Author

@alilleybrinker

Actually... I'm not a huge fan of #24 either. This issue captures a bit more of my current thinking on direction (which moves away from the link farm approach in #24).

@yonhan3

yonhan3 commented Oct 17, 2023

Can we call it ADF (Artifact Dependency Fragment) for the saved OmniBOR manifest mapping file in non-embedding mode?

The ADF (Artifact Dependency Fragment) contains a single output file and a list of input files (with sha1 or sha256 gitoid). Some metadata like optional build_cmd can also be generated for this ADF.

If a single command generates multiple output files, then multiple ADFs can be generated.

The bomsh post-processing scripts are based on this ADF concept. As long as these ADFs are provided, the Bomsh scripts can create all the OmniBOR manifest documents, mappings, metadata, etc. For example, the runtime dependency ADFs can be generated by the bomsh_dynlib.py script, and the Python runtime dependency ADFs can be generated by the bomsh_dylib.py script.

Any tool can create such ADF documents, which makes the OmniBOR framework very flexible.
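
To make the shape of an ADF concrete, here is a hedged sketch of it as a data structure: one output, a list of inputs, and optional metadata such as `build_cmd`. The class and field names are hypothetical and are not taken from Bomsh; only the overall shape comes from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArtifactDependencyFragment:
    output: str                      # gitoid of the single output file
    inputs: List[str]                # gitoids of the input files
    algo: str = "sha1"               # "sha1" or "sha256"
    build_cmd: Optional[str] = None  # optional metadata


def fragments_for_command(outputs, inputs, build_cmd=None, algo="sha1"):
    """One ADF per output file when a single command produces several outputs."""
    return [
        ArtifactDependencyFragment(out, sorted(inputs), algo, build_cmd)
        for out in outputs
    ]
```

Sorting the inputs here is a design choice of the sketch, made so that two tools observing the same command emit identical fragments.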

@alilleybrinker
Member

alilleybrinker commented Oct 18, 2023

Based on discussion in the WG meeting, some points:

  • Compared to the original proposal, the ADF concept does not include the Input Manifest ID. The idea is that this ID can be derived on the fly for the output artifact's manifest when/if needed.
  • This ADF proposal is intended to add flexibility by not needing to include as much information.
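
A sketch of what "derived on the fly" could look like, under stated assumptions: the manifest text is rendered from the ADF's input IDs in sorted order, using the record layout from the examples earlier in this issue, and the Input Manifest ID is the git blob hash of that text. The names `render_manifest` and `manifest_id` are hypothetical, and the exact manifest serialization is defined by the OmniBOR spec, not by this snippet.

```python
import hashlib


def gitoid_blob(data: bytes, algo: str = "sha1") -> str:
    """Git-style blob hash: 'blob <len>\\0' followed by the content."""
    h = hashlib.new(algo)
    h.update(b"blob %d\0" % len(data))
    h.update(data)
    return h.hexdigest()


def render_manifest(inputs, algo: str = "sha1") -> bytes:
    """inputs: iterable of (artifact_id, input_manifest_id or None) pairs."""
    lines = [f"gitoid:blob:{algo}:"]  # type line as written in this issue (assumed)
    for artifact_id, bom_id in sorted(inputs, key=lambda pair: pair[0]):
        line = f"blob {artifact_id}"
        if bom_id:
            line += f" bom {bom_id}"
        lines.append(line)
    return ("\n".join(lines) + "\n").encode("ascii")


def manifest_id(inputs, algo: str = "sha1") -> str:
    """Derive the Input Manifest ID from the inputs instead of storing it."""
    return gitoid_blob(render_manifest(inputs, algo), algo)
```

Since the rendering is deterministic, any tool holding the same ADF would arrive at the same ID, which is what makes omitting it from the fragment plausible.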

I'd personally like to look at the bomsh_dynlib.py implementation to better understand what is being done with this concept.


Open questions:

  • How much data is the minimal amount of data we need to include?
  • What exact spec changes would this proposal entail?

@yonhan3

yonhan3 commented Oct 18, 2023

For the bomsh_dynlib.py script, you can follow the instructions below:
https://github.com/omnibor/bomsh#Creating-Runtime-Dependency-Tree-for-ELF-Binaries
