GITBOM_RECORD_HASH_MODE to support use case of not embedding gitoid into artifacts for reproducible build #22
Comments
I would suggest we recommend that build tools keep a map from artifacts to graphs as a symlink farm (see #20). This issue then reduces to allowing an option to disable embedding in build tools. |
I have implemented a prototype in GCC-11.3 to support gitBOM document generation in a very flexible way. I use the following environment variable to control how GCC builds software:
GITBOM_BUILD_MODE=sha1,sha256,create_adg,embed_bomid,record_hash
Each comma-separated key is a flag that turns a specific gitBOM feature on or off.
The default value is 'create_no_adg,embed_no_bomid,record_no_hash', but we can discuss what the most appropriate defaults for these flags are. A few examples to illustrate the usage:
Let me know if you have any questions/comments. |
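For readers following the thread, the comma-separated flag scheme described for GITBOM_BUILD_MODE could be parsed roughly as below. This is only a sketch, not the GCC prototype's code: the function name `parse_build_mode` and the settings dictionary are hypothetical, and the flag names are the ones quoted in the comment above. The quoted default does not name a hash algorithm, so the sketch leaves that list empty rather than guessing one.

```python
# Hypothetical parser for a GITBOM_BUILD_MODE-style value (illustration only).
DEFAULT_MODE = "create_no_adg,embed_no_bomid,record_no_hash"

HASH_FLAGS = ("sha1", "sha256")
ON_FLAGS = ("create_adg", "embed_bomid", "record_hash")
OFF_FLAGS = ("create_no_adg", "embed_no_bomid", "record_no_hash")


def parse_build_mode(value=None):
    """Apply the default flags first, then the user's flags, left to right."""
    settings = {
        "hash_algs": [],
        "create_adg": False,
        "embed_bomid": False,
        "record_hash": False,
    }
    for raw in (DEFAULT_MODE + "," + (value or "")).split(","):
        flag = raw.strip()
        if not flag:
            continue
        if flag in HASH_FLAGS:
            if flag not in settings["hash_algs"]:
                settings["hash_algs"].append(flag)
        elif flag in ON_FLAGS:
            settings[flag] = True
        elif flag in OFF_FLAGS:
            # e.g. "embed_no_bomid" switches "embed_bomid" off
            settings[flag.replace("_no_", "_")] = False
        else:
            raise ValueError("unknown GITBOM_BUILD_MODE flag: %r" % flag)
    return settings
```

A build tool would call something like `parse_build_mode(os.environ.get("GITBOM_BUILD_MODE"))` once at startup; later flags override earlier ones, which is one possible design choice among several.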
I have also used another GITBOM environment variable to specify the top-level directory in which to store the generated gitBOM docs, symlinks, and record_hash log files. Using absolute path: or Using relative path: Of course, there are still some sub-directories under this GITBOM_DOC_SAVE_DIR to store the gitBOM docs, symlinks, or record_hash log files. If GITBOM_DOC_SAVE_DIR is unset, all the gitBOM docs are saved in the default directory that is specified or recommended in the gitBOM specification. |
I'd suggest we reduce this down to: GITBOM_DO_NOT_EMBED, which, if set and non-empty, instructs the build tool not to embed gitBOM docs. This does open some questions about the a2g link farm. Non-embedding opens the possibility of multiple gitBOM docs for the same artifact id. In construction of the link farm, this can be solved by replacing the symlink with a directory of symlinks. For consumption of the link farm, there is no one true solution: it cannot be known which gitBOM doc corresponds to the particular build the build tool is currently part of. The best we can manage is to use a simple heuristic like 'last modified'. |
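The 'last modified' heuristic for a directory of symlinks might look something like the following. It is a sketch under assumptions: the function name `pick_latest_doc` is hypothetical, and it assumes each entry in the per-artifact directory is a symlink (or file) pointing at one candidate gitBOM doc.

```python
import os


def pick_latest_doc(artifact_dir):
    """Return the entry whose target was modified most recently, or None.

    os.stat follows symlinks, so the modification time compared here is
    that of the gitBOM doc each symlink points to. Dangling symlinks are
    skipped.
    """
    candidates = [
        os.path.join(artifact_dir, name) for name in os.listdir(artifact_dir)
    ]
    candidates = [path for path in candidates if os.path.exists(path)]
    if not candidates:
        return None
    return max(candidates, key=lambda path: os.stat(path).st_mtime)
```

As the comment notes, this is only a heuristic: the most recently modified doc is not guaranteed to belong to the build currently in progress.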
Hi @edwarnicke Ed, how do we completely turn off the generation of gitBOM documents? Also, I think it is very valuable to have the flexibility to choose among all 3 combinations: 1. sha1 only; 2. sha256 only; 3. sha1+sha256. Because sha1+sha256 produces at least twice as many gitBOM documents as sha1-only, most users will probably prefer either sha1-only or sha256-only. Having no knob to provide this flexibility seems like a drawback. Let me know your suggestions. Thanks. |
Would it make sense to simply not generate GITBOM Docs if GITBOM_DIR (or equivalent tool flag) is not set? |
That would work. |
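Taken together, the two proposals above (no docs when GITBOM_DIR is not set, no embedding when GITBOM_DO_NOT_EMBED is set and non-empty) could be expressed as a small policy check. This is a sketch only: the function name `gitbom_policy` is hypothetical, and treating an empty GITBOM_DIR the same as an unset one is an assumption the thread does not settle.

```python
import os


def gitbom_policy(env=None):
    """Decide what a build tool should do from the two proposed variables."""
    env = os.environ if env is None else env
    # Assumption: an empty GITBOM_DIR counts as "not set".
    generate_docs = bool(env.get("GITBOM_DIR"))
    # Embedding only makes sense if docs are generated at all.
    embed = generate_docs and not env.get("GITBOM_DO_NOT_EMBED")
    return {"generate_docs": generate_docs, "embed": embed}
```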
I guess you are talking about the 3 combinations: 1. sha1-only; 2. sha256-only; 3. sha1+sha256. In BOMSH, it is a command line option to specify it: --hashtype "sha1 | sha256 | sha1,sha256". |
Per the linked comment in #53, I am trying to clarify and move forward the no-embedding discussion to a point of resolution, as I think it's a little bit murky right now. Full context here: #53 (comment) |
Closing, per this comment: #53 (comment) |
This GITBOM_RECORD_HASH_MODE is orthogonal and complementary to the existing gitoid-embedding mode specified in the gitBOM spec.
In this GITBOM_RECORD_HASH_MODE, during the software build we only compute and record the git-hashes of the input and output artifacts for all build steps; we neither create gitBOM documents nor compute the bom-id of output artifacts.
Later, after the software build, these recorded git-hashes are processed by the bomsh_create_bom.py script to generate all gitBOM docs and create the external artifact-to-bomid mapping database. This post-processing can occur on a remote host.
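To make the record-only step concrete, here is a minimal sketch of computing git-hashes and recording one build step. The `gitoid` function uses the standard git blob hashing scheme (hash of `blob <size>\0` followed by the content). The `record_step` function and its dictionary layout are hypothetical and are not the log format that the prototype or bomsh_create_bom.py actually uses.

```python
import hashlib


def gitoid(data: bytes, alg: str = "sha1") -> str:
    """Git blob hash of data: hash of b"blob <size>\\0" + data."""
    header = b"blob %d\0" % len(data)
    return hashlib.new(alg, header + data).hexdigest()


def record_step(output: bytes, inputs: list, alg: str = "sha1") -> dict:
    """Record one build step as hashes only (hypothetical record layout).

    No gitBOM document is created and no bom-id is computed here; a later
    post-processing pass would consume records like this one.
    """
    return {
        "alg": alg,
        "output": gitoid(output, alg),
        "inputs": sorted(gitoid(item, alg) for item in inputs),
    }
```

Because only hashes are recorded and nothing is written into the output artifact, the artifact's bytes are unaffected by this step, which is the property the reproducible-build use case relies on.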
A lot of benefits with this GITBOM_RECORD_HASH_MODE:
I think our gitBOM spec should support this GITBOM_RECORD_HASH_MODE and provide some explicit guidelines for tool developers to follow. This GITBOM_RECORD_HASH_MODE may make it easier for the software industry to accept gitBOM, especially the reproducible-build people.
The purpose of this GITBOM_RECORD_HASH_MODE is to achieve the same functionality as bomsh, while eliminating the high performance overhead of bomsh.
I have modified binutils and the gcc/llvm compilers, with only a few dozen lines of code changes, to implement this GITBOM_RECORD_HASH_MODE feature. An end-to-end hello-world demo is available.