Broken Hash Calculation for Embedded Container Images #260

Open
streaky opened this issue Feb 13, 2025 · 2 comments

streaky commented Feb 13, 2025

When embedding our custom container images, the build fails during hash generation. The error output includes messages like:

sha256sum: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher: No such file or directory
sha256sum: manifest.xml: No such file or directory
sha256sum: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/script: No such file or directory
sha256sum: '(dev).tmpl': No such file or directory

The script computes a hash for the container image by listing all files and then hashing the files. The current implementation uses this pipeline:

find "${container_base_dir}/${container_dir}" -type f -print | LC_ALL=C sort | xargs sha256sum | sha256sum | cut -d' ' -f1

While this approach works fine for many standard images, it fails with [all of] our custom container images. The failure indicates that valid file paths are being misinterpreted in some cases.

Explanation:

find -print emits one path per line, but by default xargs splits its input on blanks as well as newlines (and also treats quotes and backslashes specially). A path that contains a space is therefore split into multiple tokens, so sha256sum is called with file names that do not exist and the build fails.

For example, the file name images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher manifest.xml is incorrectly passed to sha256sum as two separate files: images/hook-embedded/images/overlay2/.../diff/usr/local/lib/python3.10/site-packages/setuptools/command/launcher and manifest.xml.
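The splitting can be reproduced outside the build with a minimal sketch like the one below. It assumes a POSIX-like system with find, xargs and mktemp, and a temporary directory whose own path contains no whitespace. The directory layout and the variable names (demo_dir, split_count, safe_count) are made up for illustration and are not taken from the project's scripts.

```shell
#!/usr/bin/env bash
# Reproduce the word-splitting problem in a throwaway directory.
# All paths and variable names here are hypothetical, for illustration only.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/command"
printf 'data\n' > "${demo_dir}/command/launcher manifest.xml"

# Count the arguments xargs hands to the command it runs.
# Newline-delimited input: the single path is split at the space.
split_count="$(find "${demo_dir}" -type f -print | xargs sh -c 'echo "$#"' sh)"

# NUL-delimited input: the path arrives as one argument.
safe_count="$(find "${demo_dir}" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)"

echo "default xargs: ${split_count} argument(s)"
echo "xargs -0:      ${safe_count} argument(s)"

rm -rf "${demo_dir}"
```

There is only one file in the directory, so any count above one for the default case means the path was broken apart before it reached the command.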

Proposed Fix:

To resolve the issue, the file enumeration should use a null-terminated approach that safely preserves file names regardless of any whitespace or special characters. This can be achieved by modifying the pipeline as follows:

container_files_hash="$(
  find "${container_base_dir}/${container_dir}" -type f -print0 | \
  LC_ALL=C sort -z | \
  xargs -0 sha256sum | \
  sha256sum | \
  cut -d' ' -f1
)"

-print0 and -0 options: find terminates each name with a NUL byte and xargs splits only on NUL, so every file name is treated as a complete, atomic string.
LC_ALL=C sort -z: This guarantees a consistent, bytewise sort order while reading and writing NUL-terminated records.
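As a rough check of the proposed pipeline, the sketch below runs it against a small tree containing the two awkward names from the error output. It assumes GNU findutils and coreutils (sort -z and sha256sum are not available everywhere). The function name hash_tree and the demo directory are hypothetical and do not come from the project's scripts.

```shell
#!/usr/bin/env bash
# hash_tree is a hypothetical helper name wrapping the proposed pipeline.
hash_tree() {
  find "$1" -type f -print0 |
    LC_ALL=C sort -z |
    xargs -0 sha256sum |
    sha256sum |
    cut -d' ' -f1
}

# Throwaway tree with file names containing spaces and parentheses.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/tree"
printf 'a\n' > "${demo_dir}/tree/launcher manifest.xml"
printf 'b\n' > "${demo_dir}/tree/script (dev).tmpl"

# Hash twice, collecting anything written to stderr outside the hashed tree.
first_hash="$(hash_tree "${demo_dir}/tree" 2> "${demo_dir}/stderr.log")"
second_hash="$(hash_tree "${demo_dir}/tree" 2>> "${demo_dir}/stderr.log")"
error_bytes="$(wc -c < "${demo_dir}/stderr.log" | tr -d ' ')"

echo "hash:         ${first_hash}"
echo "stable:       $([ "${first_hash}" = "${second_hash}" ] && echo yes || echo no)"
echo "stderr bytes: ${error_bytes}"

rm -rf "${demo_dir}"
```

Note that sha256sum prints each path as it was passed in, so the final hash depends on the path prefix as well as on file contents. That is also true of the original pipeline.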

@rpardini
Contributor

That makes sense; we simply didn't have any chance of a filename with spaces before Embedded Images, which were added later.

Can you send a Pull Request?

@streaky
Author

streaky commented Feb 14, 2025

Sure - I'll throw one together when I have a second.
