Add support for hashing files with header. #194

mihaimaruseac · 2024-06-03T23:55:19Z

Summary

Missed this in #188, but found out I need it when working on #190. The `serialize_v0`/`serialize_v1` methods all put a header in front of the file contents before hashing, so we need to do that too. Will update the usage of the header in #190 shortly.

As a benefit, we can simulate hashing a file with a header for the first portion of the file and a sharded hasher for the remainder of the file.
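For readers following along, here is a minimal sketch of what "hashing a file with a header" could look like. It is not the code from this PR: the function name `hash_file_with_header`, its parameters, and the default chunk size are assumptions made for illustration. It only relies on `hashlib.sha256` accepting initial data and on `update` being incremental.

```python
import hashlib
import os
import tempfile


def hash_file_with_header(path, header=b"", chunk_size=8192):
    """Return the SHA-256 digest of `header` followed by the file's bytes.

    Hypothetical helper for illustration only, not the PR's API.
    """
    h = hashlib.sha256(header)  # seed the hasher with the header bytes
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            h.update(data)  # feeding chunks is equivalent to one big update
    return h.digest()


# Usage: write a small temporary file and hash it with a header.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"file_content")
try:
    digest = hash_file_with_header(tmp.name, header=b"header", chunk_size=4)
finally:
    os.unlink(tmp.name)
```

The resulting digest is the same as hashing the concatenation `header + file_content` in one call, which is what makes the header just a prefix of the hashed stream.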

Release Note

NONE

Documentation

NONE

Missed this in sigstore#188, but found out I need it when working on sigstore#190. The `serialize_v0`/`serialize_v1` methods all had headers in front of the files, so we need to do that too. Will update usage of header on sigstore#190 shortly. As a benefit, we can simulate hashing a file with a header for the first portion of the file and a sharded hasher for the remainder of the file. Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>

laurentsimon

It's a bit dangerous that `"header" + "file_content"` will return the same hash as `"head" + "erfile_content"`, since the boundary between the header and the file content is not part of what gets hashed.

What's the use case?
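The concern can be reproduced with plain `hashlib`. The snippet below is a sketch, not code from the repository. The `digest_with_length_prefix` helper is hypothetical and shows one common way to make the header boundary unambiguous; nothing in this thread says the project uses it.

```python
import hashlib

# Two different (header, content) splits of the same byte string.
a = hashlib.sha256(b"header")
a.update(b"file_content")

b = hashlib.sha256(b"head")
b.update(b"erfile_content")

# The hasher only sees the concatenated bytes, so the digests collide.
same_digest = a.digest() == b.digest()


def digest_with_length_prefix(header, content):
    """Hypothetical mitigation: hash the header length before the header."""
    h = hashlib.sha256()
    h.update(len(header).to_bytes(8, "big"))  # fixed-width length prefix
    h.update(header)
    h.update(content)
    return h.digest()


# With the length prefix, the two splits no longer produce the same input.
split_1 = digest_with_length_prefix(b"header", b"file_content")
split_2 = digest_with_length_prefix(b"head", b"erfile_content")
```

Here `same_digest` is true for the naive scheme, while `split_1` and `split_2` are computed over different byte strings because the encoded header length differs.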

mihaimaruseac · 2024-06-04T00:37:53Z

Actually, I don't need it.

In the old serialization we had

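    import hashlib

    # Wrapper name and signature are assumed so that the fragment runs on its own.
    def hash_with_header(path, header, chunk):
        h = hashlib.sha256(header)
        with open(path, "rb") as f:
            if chunk == 0:
                # Read and hash the whole file in one call.
                all_data = f.read()
                h.update(all_data)
            else:
                # Compute the hash by reading chunk bytes at a time.
                while True:
                    chunk_data = f.read(chunk)
                    if not chunk_data:
                        break
                    h.update(chunk_data)
        return h.digest()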

So we needed a header before hashing the file. But we actually have two different hashers here, so the header is not needed.

mihaimaruseac requested a review from a team as a code owner June 3, 2024 23:55

mihaimaruseac added this to the V1 release milestone Jun 3, 2024

laurentsimon reviewed Jun 4, 2024


mihaimaruseac closed this Jun 4, 2024

mihaimaruseac deleted the hashing-header branch June 4, 2024 00:38
