
[1/N][zero serialization] Create data structure for passing encode chunks #690

Merged: 2 commits into Layr-Labs:master on Aug 13, 2024

Conversation

@jianoaix (Contributor) commented Aug 8, 2024

Why are these changes needed?

Create a ChunksData data structure for the encoded chunks data. This will be used to pass chunks around Encoder/Batcher/Dispatcher and eventually used to create dispersal requests to operator nodes.
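
Roughly, the structure could look like the following sketch. Only the Chunks and Format fields and the GnarkChunkEncodingFormat value are visible in the diff below; the type definitions here are assumed for illustration, not taken from the PR.

// ChunkEncodingFormat says how the chunk bytes are serialized (type name assumed).
type ChunkEncodingFormat uint8

const (
    // GobChunkEncodingFormat is an assumed name for the existing (old) format;
    // GnarkChunkEncodingFormat appears in the diff in this PR.
    GobChunkEncodingFormat ChunkEncodingFormat = iota
    GnarkChunkEncodingFormat
)

// ChunksData carries a blob's encoded chunks between Encoder, Batcher and Dispatcher.
type ChunksData struct {
    // Chunks holds the serialized bytes of each chunk.
    Chunks [][]byte
    // Format records which encoding (Gob or Gnark) the bytes in Chunks use.
    Format ChunkEncodingFormat
}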

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

    return result, nil
}

func (cd *ChunksData) ToGobFormat() (*ChunksData, error) {
jianoaix (Contributor, Author) commented:

The ToGobFormat and ToGnarkFormat conversions are used to handle compatibility when we enable Gnark at the Encoder server: during the transition, some Encoder servers may return the new format (Gnark) while others may still return the old one (Gob), and the EncoderClient/Batcher will need to work correctly with both.
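
For illustration, the client side could normalize whatever an encoder replica returns before handing it downstream. This is only a sketch: normalizeToGnark is a hypothetical helper, and ToGnarkFormat is assumed to mirror the ToGobFormat signature shown above; it is not code from this PR.

// normalizeToGnark converts a response from any encoder replica into the Gnark
// format, so the Batcher sees a single representation during the rollout.
func normalizeToGnark(cd *ChunksData) (*ChunksData, error) {
    switch cd.Format {
    case GnarkChunkEncodingFormat:
        // Already the new format; nothing to convert.
        return cd, nil
    case GobChunkEncodingFormat:
        // Older encoder replicas still return Gob during the transition.
        return cd.ToGnarkFormat()
    default:
        return nil, fmt.Errorf("unsupported chunk encoding format: %d", cd.Format)
    }
}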

Contributor commented:

We could also just drain all the encoder pods to 0 and then quickly spin up new pods, but that could result in some increased latency.

jianoaix (Contributor, Author) commented:

Yep. This is really a choice between software complexity vs. operational complexity (with a short, temporary latency spike). I'm just trying to make operations simpler here, at the cost of some added software complexity.

    return size
}

func (cd *ChunksData) FlattenToBundle() ([]byte, error) {
jianoaix (Contributor, Author) commented:

This is the part that will replace the serialization of chunks when dispersing them to nodes.

if cd.Format != GnarkChunkEncodingFormat {
    return nil, fmt.Errorf("unsupported chunk encoding format: %d", cd.Format)
}
gobChunks := make([][]byte, 0, len(cd.Chunks))
Contributor commented:

So for a while, serialization will still be done in the Batcher?

Contributor commented:

I guess the plan is to Gob-encode in the Encoder.

jianoaix (Contributor, Author) commented:

Yes, the serialization will happen only once, at the Encoder. That'll be in a separate PR.
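
As a rough sketch of where that single serialization could live: encodeChunkToGob and the Frame type below are placeholders for illustration, not code from this PR; the real chunk type and wire format are defined elsewhere.

import (
    "bytes"
    "encoding/gob"
)

// Frame is a stand-in for the chunk type produced by the Encoder.
type Frame struct {
    Coeffs []byte
    Proof  []byte
}

// encodeChunkToGob performs the one-time Gob serialization at the Encoder, so the
// Batcher/Dispatcher only pass the resulting bytes around via ChunksData.
func encodeChunkToGob(f *Frame) ([]byte, error) {
    var buf bytes.Buffer
    if err := gob.NewEncoder(&buf).Encode(f); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}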

@jianoaix merged commit 9a43392 into Layr-Labs:master on Aug 13, 2024
7 checks passed