Add example scripts for how to extract source files #2867

Merged
merged 3 commits into from
Mar 23, 2023

Conversation

eriknordmark
Contributor

@eriknordmark eriknordmark commented Oct 18, 2022

Three scripts: one for the golang files, one for the alpine build recipes we use, and one for the kernel sources.

These are now good enough for manual invocation and we can explore using them in the release build pipeline.

@eriknordmark eriknordmark requested review from deitch and a user October 18, 2022 06:01
@eriknordmark eriknordmark requested a review from rvs as a code owner October 18, 2022 06:01
@eriknordmark eriknordmark changed the title [WIP] Add example scripts for how to extract source files Add example scripts for how to extract source files Oct 18, 2022
@eriknordmark eriknordmark force-pushed the checkpoint branch 3 times, most recently from 6e68296 to ca742d4 Compare October 18, 2022 22:46
@rouming
Contributor

rouming commented Oct 19, 2022

Not sure if this is the right PR to ask in, but since kdump is merged, can we have a script for extracting the kernel debug information (kernel-debug.tar.gz) from the eve-kernel / eve-new-kernel containers? (Kernel debug info will eventually be needed for debugging kdumps.)

@eriknordmark
Contributor Author

Not sure if this is the right PR to ask in, but since kdump is merged, can we have a script for extracting the kernel debug information (kernel-debug.tar.gz) from the eve-kernel / eve-new-kernel containers? (Kernel debug info will eventually be needed for debugging kdumps.)

@rouming what information do you need? The kernel OCI containers contain a file with KERNEL_SOURCE= but do you need to pull something else out of the OCI container?
FWIW a hacky way to get something is e.g.,
CONT=docker.io/lfedge/eve-kernel:27897827d2e6fab1e2eb4f6ce0ea3a6d44a27cc1-amd64
docker image save $CONT | tar xf -
tar xf */layer.tar kernel-source-info
cat kernel-source-info

@deitch
Contributor

deitch commented Oct 20, 2022

Side note: I got a whole bunch of errors (though it still worked) when I ran it in a bash terminal in Codespaces. I started a new Codespace on this checkpoint branch, ran pkg/pillar/scripts/get-alpine-pkg-source.sh -t lfedge/eve-pillar:17837a9fcd05c765e9a1f6707b2e48f0f1dd215b-amd64, and ended up with a lot of lines like:

busybox.2bf6ec48e526113f87216683cd341a78af5f0b3f/APKBUILD: line 1: Too: command not found
bzip2.596cd15e692776222d49da2700c6041b39ffbea9/APKBUILD: line 1: Too: command not found
ca-certificates.bb51fa7743320ac61f76e181cca84daa9977573e/APKBUILD: line 1: Too: command not found
coreutils.b9d70788e03fbf913f7a1872917856d2290adba8/APKBUILD: line 1: Too: command not found
curl.720eee024ae2221901cc607bedb5626ecb0c45d0/APKBUILD: line 1: Too: command not found
dhcpcd.c8db01e1ce2cf2c8c0fd2f387ff666f9f49ba91b/APKBUILD: line 1: Too: command not found
...

I assume those come from sourcing the APKBUILD. Speaking of which, why do we source APKBUILD?

Ah, yes, the errors appear because the content of every fetched APKBUILD looks like:

$ cat /tmp/7354/zstd.22b9167f4898653b61c7a8f289f631ac17f83740/APKBUILD 
Too Many Requests
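
One possible guard, as a minimal sketch: check that the fetched file looks like an APKBUILD before sourcing it, so that an HTTP error body such as "Too Many Requests" is reported once instead of being executed as a script. The helper names looks_like_apkbuild and print_pkg_version and the pkgname= heuristic are assumptions for illustration; they are not taken from get-alpine-pkg-source.sh.

```shell
# Hypothetical helper: succeed only if FILE looks like a real APKBUILD.
# Assumption: a real APKBUILD has a pkgname= assignment at the start of a line.
looks_like_apkbuild() {
    [ -s "$1" ] && grep -q '^pkgname=' "$1"
}

# Hypothetical wrapper: source FILE in a subshell only when it passes the
# check, then print "pkgname pkgver". The subshell keeps the APKBUILD's
# variables and functions out of the calling script.
print_pkg_version() {
    if ! looks_like_apkbuild "$1"; then
        echo "skipping $1: not an APKBUILD: $(head -n 1 "$1" 2>/dev/null)" >&2
        return 1
    fi
    # "." searches PATH for names without a slash, so force a path.
    case "$1" in */*) f=$1 ;; *) f=./$1 ;; esac
    ( . "$f" >/dev/null 2>&1; echo "$pkgname $pkgver" )
}
```

With a check like this, a rate-limited download shows up as a single "skipping" message per package rather than a "command not found" error from the shell.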

@deitch
Contributor

deitch commented Oct 20, 2022

I have been trying to wrap my head around what we are trying to get from the apk script (will look at go after).

Existing SBoM tools (like Syft) scan the image and pull out the package information into a purl. I think you are trying to replicate that, but also go several layers further: the actual source URL (the purl for apk doesn't include it), the upstream source that the APKBUILD uses, etc.

Is that correct?

If so, would we be better served by having this tool consume an SBoM in standard format and building on it? I am not sure we need to replicate everything they do. For example, the output from spdx-json:

     "referenceCategory": "PACKAGE_MANAGER",
     "referenceLocator": "pkg:alpine/findmnt@2.38-r1?arch=x86_64&upstream=util-linux&distro=busybox-1.35.0",
     "referenceType": "purl"

Granted, that example is missing the commit, but if it included one, we would have enough to create the URL and retrieve information from it.

FYI, Syft does get the aports commit, but only includes it in its own output format, not the standard SPDX or spdx-json formats. I have an open issue there.
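
To make the idea concrete, here is a rough sketch of building on a purl. It splits the purl from the spdx-json example into its parts and, given an aports commit and repository name supplied separately, forms a URL for the APKBUILD. The function names are hypothetical, the commit value in the usage note is a placeholder, and two things are assumptions rather than facts from this thread: that the upstream qualifier names the aports directory, and that the cgit-style URL layout on git.alpinelinux.org is the right place to fetch from.

```shell
# Hypothetical helpers for pulling fields out of a purl such as
# pkg:alpine/findmnt@2.38-r1?arch=x86_64&upstream=util-linux&distro=busybox-1.35.0

# purl_name PURL -> package name
purl_name() {
    p=${1%%\?*}     # drop the qualifiers
    p=${p%%@*}      # drop the version
    echo "${p##*/}" # keep what follows the last "/"
}

# purl_version PURL -> version, or nothing if the purl has none
purl_version() {
    p=${1%%\?*}
    case "$p" in *@*) echo "${p#*@}" ;; esac
}

# purl_qualifier PURL KEY -> value of KEY, or nothing if it is absent
purl_qualifier() {
    case "$1" in
        *\?*) q=${1#*\?} ;;
        *) q= ;;
    esac
    # qualifiers are key=value pairs separated by "&"
    printf '%s\n' "$q" | tr '&' '\n' | sed -n "s/^$2=//p"
}

# aports_apkbuild_url REPO COMMIT PURL -> URL of the APKBUILD at COMMIT.
# REPO (main, community, ...) and COMMIT are not in the purl and must be
# supplied by the caller. The URL layout is an assumption.
aports_apkbuild_url() {
    origin=$(purl_qualifier "$3" upstream)
    [ -n "$origin" ] || origin=$(purl_name "$3")
    echo "https://git.alpinelinux.org/aports/plain/$1/$origin/APKBUILD?id=$2"
}
```

For example, aports_apkbuild_url main abc123 "$purl" would point at the util-linux APKBUILD for the findmnt purl above, where abc123 stands in for the real aports commit.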

@deitch
Contributor

deitch commented Oct 20, 2022

Getting lots of deprecation errors on the go one. I ran it with bash -x to see the line.

+ go get -d cloud.google.com/go/compute@v1.2.0
go: go.mod file not found in current directory or any parent directory.
        'go get' is no longer supported outside a module.
        To build and install a command, use 'go install' with a version,
        like 'go install example.com/cmd@latest'
        For more information, see https://golang.org/doc/go-get-install-deprecation
        or run 'go help get' or 'go help install'.
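
One way around the "outside a module" error, sketched under assumptions: run the fetch from a throwaway module and use go mod download -json, which reports where the module was unpacked. The function name fetch_go_module and the module name scratch are placeholders, and this is not what the script in this PR does.

```shell
# Hypothetical helper: fetch_go_module MODULE@VERSION
# Prints the directory in the module cache that holds the module's sources.
fetch_go_module() {
    scratch=$(mktemp -d) || return 1
    (
        cd "$scratch" || exit 1
        # Create a throwaway go.mod so the go command has a module to work in.
        go mod init scratch >/dev/null 2>&1 || exit 1
        # -json prints one JSON object per module; "Dir" is the unpacked source.
        go mod download -json "$1" |
            sed -n 's/^[[:space:]]*"Dir": "\(.*\)",\{0,1\}$/\1/p'
    )
    status=$?
    rm -rf "$scratch"
    return $status
}
```

Usage would be along the lines of dir=$(fetch_go_module cloud.google.com/go/compute@v1.2.0), after which the sources and any LICENSE file can be copied from "$dir".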

In addition, I ran the whole thing on pkg/pillar/go.sum and ended up with nothing:

1 packages without LICENSE file in /tmp/31207
$ ls -la /tmp/31207/
total 12
drwxr-xrw-+  2 codespace codespace 4096 Oct 20 14:16 .
drwxr-xrwt+ 13 root      root      4096 Oct 20 14:15 ..
-rw-r--rw-   1 codespace codespace    0 Oct 20 14:16 null.NO-LICENSE
-rw-r--rw-   1 codespace codespace   45 Oct 20 14:16 null.tgz
$ tar -ztvf /tmp/31207/null.tgz
$

@deitch
Contributor

deitch commented Oct 20, 2022

What is the purpose of the go one? Is it to find licenses? Get the sources? In other words, is it to:

  • determine what packages (and hashes) are in use? OR
  • given the above, get license files? OR
  • given the above, get the entire source?

My general thought would be like with the apk: let an SBoM tool gather it, let us build on top of it. Although there probably are go libraries for parsing go.mod/go.sum that could help. We could look at the source to go get and go mod for pointers.

@eriknordmark
Contributor Author

What is the purpose of the go one? Is it to find licenses? Get the sources? In other words, is it to:

  • determine what packages (and hashes) are in use? OR
  • given the above, get license files? OR
  • given the above, get the entire source?

My general thought would be like with the apk: let an SBoM tool gather it, let us build on top of it. Although there probably are go libraries for parsing go.mod/go.sum that could help. We could look at the source to go get and go mod for pointers.

The first one requires no script: cat $(find . -name go.sum) is enough.
Initially the goal was to get the entire source, and then I added the collection of license files as I went along (since we don't yet have that ability AFAIK)
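
As a small illustration of the first point, the go.sum files can be reduced to a unique module@version list with standard tools. This is a sketch; the function name list_go_modules is hypothetical. Note that go.sum can also list modules that were only needed to resolve versions, so the result may be a superset of what is actually built.

```shell
# Hypothetical helper: list_go_modules DIR
# Prints unique "module@version" lines from every go.sum under DIR.
list_go_modules() {
    find "$1" -name go.sum -exec cat {} + |
        # go.sum lines are "module version hash" or "module version/go.mod hash";
        # strip the /go.mod suffix so both forms collapse to one entry.
        awk 'NF >= 2 { v = $2; sub(/\/go\.mod$/, "", v); print $1 "@" v }' |
        sort -u
}
```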

I hope that both of these scripts will be short lived; at some level they merely cast the textual description in https://github.com/lf-edge/eve/blob/master/docs/EVE-IMAGE-SOURCES.md to code.

@deitch
Contributor

deitch commented Oct 21, 2022

OK, got it. These definitely should be top-level then, but I am wondering if they should exist outside of lf-edge/eve? These aren't eve or even lf-edge specific, but just useful tools for scanning for sources.

@eriknordmark
Contributor Author

Granted, that example is missing the commit, but if it included one, we would have enough to create the URL and retrieve information from it.

@deitch in your example starting with spdx-json and its purl, don't we still need to look at the source= in the APKBUILD file to be able to fetch all of the Alpine patches in addition to the upstream source? (and the sha512sums in the APKBUILD if we want to verify the shas?)
Having a parser for APKBUILD which could get us those settings would be great.
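
Short of a real parser, a minimal sketch of getting at those two settings is to source the APKBUILD in a subshell and print the variables. The helper names are hypothetical. Sourcing executes whatever the file contains, so this only makes sense for APKBUILDs from a trusted location, and it assumes the file is a valid APKBUILD in the first place.

```shell
# Hypothetical helper: apkbuild_sources FILE
# Prints each entry of the APKBUILD's source= variable on its own line.
apkbuild_sources() {
    # "." searches PATH for names without a slash, so force a path.
    case "$1" in */*) f=$1 ;; *) f=./$1 ;; esac
    (
        . "$f" >/dev/null 2>&1
        set -f   # no globbing while $source is split on whitespace
        for s in $source; do
            printf '%s\n' "$s"
        done
    )
}

# Hypothetical helper: apkbuild_sha512sums FILE
# Prints the non-empty "hash  filename" lines of sha512sums=.
apkbuild_sha512sums() {
    case "$1" in */*) f=$1 ;; *) f=./$1 ;; esac
    (
        . "$f" >/dev/null 2>&1
        printf '%s\n' "$sha512sums" | sed -e 's/^[[:space:]]*//' -e '/^$/d'
    )
}
```

Entries without a URL scheme are local files (typically patches) that live next to the APKBUILD in aports, so they would be fetched from the same directory and commit as the APKBUILD itself.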

@deitch
Contributor

deitch commented Mar 11, 2023

in your example starting with spdx-json and its purl, don't we still need to look at the source= in the APKBUILD file to be able to fetch all of the Alpine patches in addition to the upstream source? (and the sha512sums in the APKBUILD if we want to verify the shas?)
Having a parser for APKBUILD which could get us those settings would be great.

Yes and yes. And there is nothing in the purl which tells you the source; you just need to know that it came from alpinelinux.org, and that it came from main or community or whatever.

Having a parser for APKBUILD which could get us those settings would be great.

look here: https://pkg.go.dev/gitlab.alpinelinux.org/alpine/go

@eriknordmark eriknordmark force-pushed the checkpoint branch 2 times, most recently from 754fa2f to 649249d Compare March 15, 2023 12:58
@eriknordmark eriknordmark requested a review from deitch March 15, 2023 12:59
@deitch
Contributor

deitch commented Mar 15, 2023

This looks fine.

Did you want any of the stuff I have that does some of this? I have:

  • The Dockerfile ADD scanner, which lists all ADD sources
  • a utility I have that fetches the go license for a given package and reference (commit, semver, etc.). It looks in local GOPATH and if not there then goes to the Internet

@deitch
Contributor

deitch commented Mar 17, 2023

I went through this again, as I look to compare the total sources to the SBoM.

Your alpine fetcher is pulling down lfedge/eve, but we already expand everything in the rootfs so that the sbom generator can find them.

Could we modify it so that it can be passed a path? Then I could build it right into the generation pipeline.

@eriknordmark eriknordmark force-pushed the checkpoint branch 3 times, most recently from b0f701a to 4273121 Compare March 21, 2023 12:14
@deitch
Contributor

deitch commented Mar 21, 2023

I like the modifications.

…rnel.

Includes fetching complete source and/or URLs+license information.

Signed-off-by: eriknordmark <erik@zededa.com>
@eriknordmark eriknordmark merged commit 267aee1 into lf-edge:master Mar 23, 2023