Add md5sums file list to distroless container #2065
For reference, I started this:
@loosebazooka gentle ping. You asked for a patch here in GoogleContainerTools/distroless#741 (comment)
Would you mind having a look before it goes stale?
@smukherj1 @alexeagle @pcj @gravypod gentle ping too. Some kind of review (and approval) would be very nice!
container/build_tar.py
  if not control_file_member:
-   raise self.DebError(deb + ' does not Metadata File!')
+   raise self.DebError(deb + ' does not contain a control Metadata File!')
  control_file = tar.extractfile(control_file_member[0])
  metadata = b''.join(control_file.readlines())
It feels like this could be more readably expressed as
metadata = control_file.read().decode("utf-8")
pkg_name = TarFile.parse_pkg_name(metadata, deb)
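As a minimal sketch of what that suggestion amounts to: the control file is read once, decoded to text, and the `Package` field is looked up. The `parse_pkg_name` function below is a simplified stand-in written for illustration; the actual `TarFile.parse_pkg_name` in `container/build_tar.py` may be implemented differently, and the sample control data is made up.

```python
def parse_pkg_name(metadata, filename):
    """Return the value of the Package field from control file text.

    Hypothetical, simplified stand-in for TarFile.parse_pkg_name.
    """
    for line in metadata.splitlines():
        if line.startswith("Package:"):
            return line.split(":", 1)[1].strip()
    raise ValueError(filename + " has no Package field in its control file")


# Made-up control file content, as bytes, the way tar.extractfile(...).read() returns it.
control_bytes = b"Package: libc6\nVersion: 2.31-13\nArchitecture: amd64\n"

# Single read plus decode instead of joining readlines().
metadata = control_bytes.decode("utf-8")
print(parse_pkg_name(metadata, "libc6.deb"))
```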
@aiuto good point. Done in the latest push. I also rebased. Thank you for the review!
This looks good to me, but I don't maintain this repo.
Again, repeating the friendly ping.
Joshua Katz asked on Slack:
Here is the skinny:
This introspection is important for security and license compliance.
Are scanners going to be aware of the distroless-specific custom directory structure:
@loosebazooka re:
All the open source scanners that can handle distroless images have long been aware of the This is true for scancode.io and scancode-toolkit, but also for all the other open source tools capable of scanning Distroless images that I know of, including tern and anchore. In scancode-toolkit (and scancode.io), this is handled here: https://github.com/nexB/scancode-toolkit/blob/aba31126dcb3ab57f2b885090f7145f69b67351a/src/packagedcode/debian.py#L345 (this has recently been deeply refactored, but it was there long before). All these tools, open source or not, will benefit directly from this small patch.
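To illustrate what a scanner gains from the added file, here is a minimal sketch of parsing a dpkg-style `md5sums` listing, where each line holds an MD5 hex digest followed by a path relative to the filesystem root. The `parse_md5sums` helper and the sample data are hypothetical and are not taken from scancode-toolkit or any other tool mentioned above.

```python
def parse_md5sums(text):
    """Return {relative_path: md5_hex} from the text of a dpkg md5sums file.

    Hypothetical helper for illustration only.
    """
    checksums = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The digest comes first; the rest of the line is the file path.
        digest, _, path = line.partition(" ")
        checksums[path.lstrip()] = digest
    return checksums


# Made-up sample content for a package's md5sums file.
sample = (
    "d41d8cd98f00b204e9800998ecf8427e  usr/share/doc/example/copyright\n"
    "9e107d9d372bb6826bd81d3542a419d6  usr/bin/example\n"
)
for path, digest in parse_md5sums(sample).items():
    print(path, digest)
```

With a mapping like this, a scanner could attribute files in the image to the package that shipped them and check whether they were modified.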
I am all for it. I guess I have no idea why a /status.d dir was used in the first place, except that it does allow independent installation without having to worry about merging control files into one. Since this is already in odd territory that's no longer exactly Debian, I just kept piling on. Tell me if you want to use the "standard" dpkg /info directory instead. Frankly, I grew to like the simple status.d/ dir.
Sure, I don't have a strong preference either way, I was just curious. I was hoping to reduce the burden on new scanners, but I barely know what that would even mean. LGTM from the distroless side.
This was reviewed by the SWEs on Distroless and it looks good. I'm going to approve this PR, but we can't merge until the tests go green. Some of these tests are flaky, so they might just need to be rerun.
Let's get the tests to pass, then we can submit this. Thanks for putting this together!
I triggered a build by removing odd trailing whitespace ;)
@gravypod the CI failure is about a failure to fetch https://bazel.build/bazel-release.pub.gpg
This ensures that a "distroless" container layer tarball built from Debian packages contains not only the control file of each package, but also the md5sums file that lists original files included in a package.
If present, we extract the md5sums file and save it side-by-side with the package control file under this path:
var/lib/dpkg/status.d/<package-name>.md5sums
Reference: bazelbuild#1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@gravypod the issue (or the fix) is that the published gpg checksum has just been updated (line 125 in 900572d).
I rebased and force-pushed, which should hopefully clean this up.
@gravypod we are all green now 🍏 🎉
This ensures that a "distroless" container layer tarball built from
Debian packages contains not only the control file of each package, but
also the md5sums file that lists original files included in a package.
The md5sums file is extracted from a .deb package and saved side-by-side
with the control file under this path:
var/lib/dpkg/status.d/<package-name>.md5sums
Reference: #1876
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
PR Checklist
Please check if your PR fulfills the following requirements:
PR Type
What kind of change does this PR introduce?
What is the current behavior?
Issue Number: #1876
The current behaviour is to skip the md5sums file, which lists the files of a package, when installing a Debian package in a distroless container image.
What is the new behavior?
The new behaviour is to include the md5sums file, which lists the files of a package, when installing a Debian package in a distroless container image, side-by-side with the control file.
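The new behaviour can be sketched roughly as follows. This is not the code from `container/build_tar.py`: the `status_entries` and `make_control_tar` helpers are hypothetical, the sketch starts from an already unpacked control tarball rather than a full `.deb` archive, and the destination of the control file itself (`var/lib/dpkg/status.d/<package-name>`) is an assumption made for illustration.

```python
import io
import posixpath
import tarfile

STATUS_DIR = "var/lib/dpkg/status.d"


def status_entries(control_tar_bytes, pkg_name):
    """Map status.d destination paths to the control and md5sums contents.

    Hypothetical helper; md5sums is only included if the package ships one.
    """
    entries = {}
    with tarfile.open(fileobj=io.BytesIO(control_tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            base = posixpath.basename(member.name)
            if base == "control":
                # Assumed destination for the control file.
                entries[STATUS_DIR + "/" + pkg_name] = tar.extractfile(member).read()
            elif base == "md5sums":
                entries[STATUS_DIR + "/" + pkg_name + ".md5sums"] = tar.extractfile(member).read()
    return entries


def make_control_tar(files):
    """Build an in-memory gzip tarball from {name: bytes}, for demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name="./" + name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Made-up package data.
demo = make_control_tar({
    "control": b"Package: example\nVersion: 1.0\n",
    "md5sums": b"d41d8cd98f00b204e9800998ecf8427e  usr/bin/example\n",
})
for path in sorted(status_entries(demo, "example")):
    print(path)
```

A package without an md5sums file simply yields the control entry alone, which matches the "if present" wording of the commit message.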
Does this PR introduce a breaking change?
Other information
I used only conservative Python syntax and kept the same code style as before.
There was no specific documentation entry, so I added a docstring.