network: proposal payload compression #4589
Conversation
zstd seems like a very good choice, since we can seed the dictionary it uses. It ought to do a great job on our repetitive msgpack txns after that.
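The dictionary-seeding idea mentioned above can be sketched as follows. zstd is not in the Go standard library, so this sketch uses `compress/flate`'s dictionary support to illustrate the same technique: a shared dictionary lets the substrings common to every message (such as repeated msgpack field names) be referenced instead of re-stored. The dictionary contents here are made up for illustration.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// compressWithDict compresses data after seeding the compressor with a
// shared dictionary, so byte sequences common to every message need not
// be stored again. flate stands in for zstd in this sketch.
func compressWithDict(data, dict []byte) ([]byte, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriterDict(&buf, flate.BestCompression, dict)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressWithDict reverses compressWithDict; both sides must use the
// exact same dictionary bytes.
func decompressWithDict(data, dict []byte) ([]byte, error) {
	r := flate.NewReaderDict(bytes.NewReader(data), dict)
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	// Hypothetical dictionary of field names shared by every message.
	dict := []byte(`{"txn":{"snd":"","rcv":"","amt":0,"fee":0}}`)
	msg := []byte(`{"txn":{"snd":"AAAA","rcv":"BBBB","amt":100,"fee":1000}}`)
	c, _ := compressWithDict(msg, dict)
	d, _ := decompressWithDict(c, dict)
	fmt.Println(bytes.Equal(d, msg)) // prints "true"
}
```

Both peers must agree on the dictionary bytes exactly, which is why seeding it from protocol-level constants (rather than per-connection state) is attractive.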
Codecov Report

```
@@           Coverage Diff           @@
##           master    #4589   +/-  ##
==========================================
+ Coverage   54.34%   54.42%   +0.07%
==========================================
  Files         402      403     +1
  Lines       51793    51917   +124
==========================================
+ Hits        28149    28255   +106
- Misses      21276    21284     +8
- Partials     2368     2378    +10
```
Some refactoring, unit tests, and rebase to master.
jannotti left a comment
I'm not so familiar with wsNetwork, but the parts I understood seemed good.
I wonder, since we have to support uncompressed senders anyway, why should we ever accept a compression that is bigger than the message? We might as well use the uncompressed form in that case. This simplifies the compression routine slightly, because we don't need to ask for the bound, we can just insist that it compresses in less than len(d) or refuse.
network/msgCompressor.go (outdated)

```go
func (dec zstdProposalDecompressor) convert(data []byte) ([]byte, error) {
	r := zstd.NewReader(bytes.NewReader(data))
	defer r.Close()
	b := make([]byte, 0, 1024)
```
I thought we are only compressing large messages, so maybe start higher? We could define a constant minCompressedMsgSize?
We compress all proposals, small and large. I think I like this idea; the magic check above would handle both compressed and non-compressed messages.
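The magic-number check referred to here can be sketched as follows: every zstd frame begins with the fixed 4-byte magic 0xFD2FB528 (little-endian on the wire), while a msgpack-encoded proposal starts with a map-header byte, so the receiver can distinguish compressed from non-compressed payloads without any extra flag. This is a sketch; the actual constant names in the PR may differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// zstdFrameMagic is the 4-byte magic number that begins every zstd
// frame (0xFD2FB528, stored little-endian on the wire).
var zstdFrameMagic = []byte{0x28, 0xB5, 0x2F, 0xFD}

// isZstdCompressed reports whether data looks like a zstd frame. A
// msgpack map header (0x8x, 0xde, or 0xdf) never starts with 0x28, so
// the check cleanly separates the two formats regardless of size.
func isZstdCompressed(data []byte) bool {
	return len(data) >= 4 && bytes.Equal(data[:4], zstdFrameMagic)
}

func main() {
	compressed := []byte{0x28, 0xB5, 0x2F, 0xFD, 0x00}
	msgpackMap := []byte{0x82, 0xA3, 'k', 'e', 'y'} // fixmap, 2 entries
	fmt.Println(isZstdCompressed(compressed), isZstdCompressed(msgpackMap))
	// prints "true false"
}
```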
oh, right, msgpacked, my bad
You could make the compressed encoding start with an additional 4 bytes to include the length up front?
I was considering suggesting that too. Then you could io.ReadFull on the known size (allocated all at once). And the check for exploding messages isn't even needed (it'll just fail to decompress into the buffer, which presumably you've prechecked against the maxsize)
Or, how about we start with a buffer 5x the size of the compressed message?
maybe 2.5x would be a good guess since we see 2-3x compression ratio
set to 3x
Sounds good. That should eliminate most copies.
algorandskiy left a comment
> This simplifies the compression routine slightly, because we don't need to ask for the bound, we can just insist that it compresses in less than len(d) or refuse.

I thought about it, and the only way is to compare the post-compression result (since the bound is always greater, and internally zstd re-allocates).
A common pattern for algod is to take an incoming message and relay it to others. Should we save the compressed message in

```go
outmsg := wn.handlers.Handle(msg)
...
case Broadcast:
	err := wn.Broadcast(wn.ctx, msg.Tag, msg.Data, false, msg.Sender)
```
@zeldovich the issue with proposals is that the relayAction constructs a new transmittedPayload by combining a PriorVote and an unauthenticatedProposal that were pulled separately from the voteTracker and proposalStore systems. I ran into this when doing a POC to avoid the re-msgp-encoding of proposals when relaying a proposal in #4568: I believe it isn't totally guaranteed that the PriorVote that was sent to you in the PP/transmittedPayload message (which you fed into agreement by splitting it into two messageEvents) would be the same PriorVote you send out when relaying the PP/transmittedPayload message. Or maybe it is always the same when you are relaying for period 0? In any case, you can see in my POC that I ended up adding references to the original msgp encoding of the unauthenticatedProposal and PriorVote separately, and then doing a byte-comparison on PriorVote just to be extra sure it is the same before re-using the encoding. But perhaps it would be possible to similarly feed the compressed proposal message into agreement and back out again to avoid re-compressing.
Hmm, yeah, on closer examination it seems like it will be tricky to do this optimization. You might be right that, in period 0, the prior vote that accompanied the proposal on arrival is the same as (or just as good as?) the prior vote we pull out of the voteTracker, but this also seems a subtle invariant to think about. I was thinking it would be doable because we try to track some information about the original incoming message (e.g.,

Not worth delaying this PR for this second-order optimization.
According to perf tests, PP traffic went down to about 30%, and round time shows fewer fluctuations.
@zeldovich I guess one thing we could do, but it would require a consensus upgrade, is move the compression into the

and it would be agreement's job to decompress the proposal, and also keep a reference to the original compressed and msgp-encoded version in the unauthenticatedProposal type, like I had in #4568. Then we would only be compressing the proposal and not the PriorVote, so we wouldn't have to worry about mixing them up.
Summary
Introduce proposal payload compression in protocol version 2.2.
Acceptance:
Testing
Possible optimization
If all peers support compressed proposals, do not allocate memory for the non-compressed data batch. Possibly in another PR.