Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2/N][zero serialization] Make Batcher operate on chunks without ser/deser #700

Merged
merged 19 commits into from
Aug 19, 2024

Conversation

jianoaix
Copy link
Contributor

@jianoaix jianoaix commented Aug 13, 2024

Why are these changes needed?

This eliminates both the deserialization and serialization of chunks at the Batcher.

This matters because the cost on just the serialization of chunks can be significant: e.g. a large operator took 61s to serialize the dispersal request

Such perf/cost is on the critical path of E2E flow, so by eliminating them (as well as eliminating the cost to deserialize chunks from Encoder) will contribute directly the E2E perf.

Screenshot 2024-08-13 at 2 58 56 PM

Testing

In addition to the unit test, also tested it in Preprod, with following setup

  • Original chunks are in Gob
  • The 3 Nodes are running with: 1) v0.7.4; 2) v0.8.0+ with Gnark disabled (i.e. using Gob); 3) v0.8.0+ with Gnark enabled

And the Batcher was tested in these two cases:

  • Batcher shipping Gob chunks to Node: chunks will be shipped as it is
  • Batcher shipping Gnark chunks to Node: chunks will be converted to Gnark (since they were in Gob) and then shipped to Nodes

This shows the compatibility/safety of:

  1. interaction with the ongoing Gob->Gnark migration is correct
  2. the migration of frames (deserialized) -> ChunksData (zero ser/deser) itself is correct

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@jianoaix jianoaix changed the title [1/N][zero serialization] Make Batcher operate on chunks without deserialization [2/N][zero serialization] Make Batcher operate on chunks without deserialization Aug 13, 2024
@jianoaix jianoaix changed the title [2/N][zero serialization] Make Batcher operate on chunks without deserialization [2/N][zero serialization] Make Batcher operate on chunks without ser/deser Aug 13, 2024
@@ -160,7 +160,7 @@ func TestBatchTrigger(t *testing.T) {
assert.Nil(t, err)
count, size := encodingStreamer.EncodedBlobstore.GetEncodedResultSize()
assert.Equal(t, count, 1)
assert.Equal(t, size, uint64(16384))
assert.Equal(t, size, uint64(26630))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Occasional data point about the data size reduction with the new chunk encoding with Gnark(that's separately done): the 16384 is the theoretical size (4 bytes/symbol * numSymbols in the chunk), and 26630 is the number of bytes from Gob encoding. The new chunk encoding will achieve the former.

@jianoaix jianoaix requested review from bxue-l2 and ian-shim August 15, 2024 18:27
Comment on lines 230 to 231
assert.Equal(t, uint64(27631), size) // Robert checks it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove Robert checks it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Bundles: make(core.Bundles),
blobMessagesByOp[opID] = append(blobMessagesByOp[opID], &core.EncodedBlobMessage{
BlobHeader: blobHeaders[i],
EncodedBundles: make(map[core.QuorumID]*core.ChunksData),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ie. EncodedBundles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Contributor

@bxue-l2 bxue-l2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jianoaix jianoaix merged commit cacdc21 into Layr-Labs:master Aug 19, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants