How to compute the Sum512 of a stream of data without a known fixed length #20

Open
gianalbertochini opened this issue Jun 2, 2024 · 1 comment

gianalbertochini commented Jun 2, 2024

Hello,

I’m curious to know if it’s possible to compute the hash of a data stream without knowing its length in advance, and to do this without storing the entire data in RAM. Ideally, the hash should be updated incrementally.

I'm a beginner, so this may be a simple question. As I understand it, computing the hash requires dividing the data into chunks of 1024 bytes.

To put it in simpler terms, I want to write a Hash class with a method void HashBytes(byte[]). This method would take an arbitrary number of bytes as input and keep a “partial hash” in memory, updating it incrementally every time N bytes arrive from the stream.

Another method, byte[64] Close(void), would return the 512-bit hash of the entire received stream as an array of 64 bytes.

For example:

hash = New Hash()
hash.HashBytes(byte[] "This is the first array of byte.")
hash.HashBytes(byte[] " <A VERY LONG STRING>.") // Several MiB of data could be added here
hash.HashBytes(byte[] "") // Nothing is added
hash.HashBytes(byte[] " This is the second")
hash.HashBytes(byte[] ".") // Just 1 byte is added
byte[64] result0 = hash.Close()

byte[64] result1 = Sum512(byte[] "This is the first array of byte. <A VERY LONG STRING>. This is the second.")

result0 should be equal to result1.

Is this possible, and if so, how can I do it?

Many thanks

@lukechampine
Owner

Sure, here's how to do that in Go:

h := blake3.New(64, nil) // 512-bit output, no key
h.Write([]byte("This is the first array of byte."))
h.Write([]byte("<A VERY LONG STRING>"))
result := h.Sum(nil)

In Go, we typically use the io.Reader and io.Writer interfaces when working with lots of data. For example, if your data is stored in a file, you could do this:

f, _ := os.Open("path/to/file")
h := blake3.New(64, nil)
h.Write([]byte("This is the first array of byte."))
io.Copy(h, f) // stream the file contents into the hash
result := h.Sum(nil)

This works because io.Copy streams data from an io.Reader (in this case, f, the file) to an io.Writer (in this case, h, the hash).

Hope this helps!
