RPC responses from go impl polling incomplete bytes from substream #505

Closed
austinabell opened this issue Jun 16, 2020 · 0 comments · Fixed by #530
Assignees
Labels
Network Libp2p and PubSub stuff Priority: 2 - High Very important and should be addressed ASAP Type: Bug Something isn't working

Comments

@austinabell
Contributor

Describe the bug

BlockSync responses from the Go implementation fail to decode more often than not, because only part of the response bytes have been pulled from the substream when it is polled. This was very inconsistent and hard to reproduce within our implementation, but with Kademlia and varying network latency it is now much easier to reproduce.

The error looks something like:

2020-06-16T19:42:09.421Z WARN  forest_libp2p::behaviour > RPC Error Custom("Codec Error: EOF while parsing a value at offset 514"), 1

but even with a long thread sleep placed right before the bytes are polled, the error is fairly consistently:

 2020-06-16T19:45:57.413Z WARN  forest_libp2p::behaviour  > RPC Error Custom("Codec Error: EOF while parsing a value at offset 8192"), 2

and I didn't dig deep enough to find out where the 8192 cap comes from: the substream itself (in either the Go or Rust implementation), a cap on the bytes polled from the substream, or some other limit. It is too consistent to be a coincidence, though. Whatever the limit is, it probably means we have to either decode the response from a reader that pulls from the substream (likely inefficient, since that work probably shouldn't happen in the poll function) or store chunks of unfinished bytes polled from the substream (also inefficient, since it keeps unnecessary bytes in memory and repeats decoding checks until the message is complete).
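To make the second option concrete, here is a minimal sketch of buffering partial chunks and retrying the decode after each one. It uses only the standard library. `Accumulator` and `decode_frame` are hypothetical names, and the length-prefixed toy decoder is only a stand-in for the real CBOR decoding; the 8192-byte chunk size simply mirrors the offset seen in the error above.

```rust
/// Hypothetical buffer that accumulates chunks until a decoder
/// reports that a complete message is present.
struct Accumulator {
    buf: Vec<u8>,
}

impl Accumulator {
    fn new() -> Self {
        Self { buf: Vec::new() }
    }

    /// Appends `chunk`, then retries `decode` on everything buffered so far.
    /// `decode` returns `Some((value, consumed))` once a full message is
    /// available, and `None` while the message is still incomplete.
    fn push<T>(
        &mut self,
        chunk: &[u8],
        decode: impl Fn(&[u8]) -> Option<(T, usize)>,
    ) -> Option<T> {
        self.buf.extend_from_slice(chunk);
        let (value, consumed) = decode(&self.buf)?;
        // Drop the bytes that made up the decoded message.
        self.buf.drain(..consumed);
        Some(value)
    }
}

/// Toy decoder standing in for CBOR (an assumption for this sketch):
/// a 4-byte big-endian length followed by that many payload bytes.
fn decode_frame(bytes: &[u8]) -> Option<(Vec<u8>, usize)> {
    let h = bytes.get(..4)?;
    let len = u32::from_be_bytes([h[0], h[1], h[2], h[3]]) as usize;
    let payload = bytes.get(4..4 + len)?;
    Some((payload.to_vec(), 4 + len))
}

fn main() {
    // Build a 10,000-byte message and deliver it in 8192-byte chunks.
    let payload = vec![1u8; 10_000];
    let mut wire = (payload.len() as u32).to_be_bytes().to_vec();
    wire.extend_from_slice(&payload);

    let mut acc = Accumulator::new();
    for (i, chunk) in wire.chunks(8192).enumerate() {
        match acc.push(chunk, decode_frame) {
            Some(msg) => println!("chunk {}: decoded {} bytes", i, msg.len()),
            None => println!("chunk {}: incomplete, waiting for more", i),
        }
    }
}
```

The cost described above is visible here: every incoming chunk triggers another decode attempt over the whole buffer, and the partial bytes stay in memory until the message completes.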

In any case, the main thing to find out is why the substream is polled as ready when bytes have only started to be written. I wasn't able to reproduce this from within our client, but maybe I didn't create a large enough tipset bundle or didn't simulate how the Go implementation writes to the substream (I believe their CBOR encoding just uses a writer, so bytes are written as they are encoded).
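For context, a byte stream being readable only means that some bytes are available, not that the sender has finished writing. The sketch below illustrates this with a synchronous `std::io::Read` stand-in rather than the actual libp2p substream API. `ChunkedReader` is hypothetical, and its 8192-byte per-read limit is an assumption chosen to mirror the offset in the error; where the real cap comes from is still unknown.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a substream: hands out at most `chunk`
/// bytes per `read` call, like a stream whose data arrives in pieces.
struct ChunkedReader {
    data: Vec<u8>,
    pos: usize,
    chunk: usize,
}

impl Read for ChunkedReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let remaining = self.data.len() - self.pos;
        let n = remaining.min(self.chunk).min(buf.len());
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n) // returning 0 signals end of stream
    }
}

/// A single read returns whatever is available right now,
/// which may be only a prefix of the message.
fn read_once<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; 64 * 1024];
    let n = r.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}

/// Reading until end of stream collects the whole message.
fn read_all<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    r.read_to_end(&mut out)?;
    Ok(out)
}

fn main() -> io::Result<()> {
    let msg = vec![7u8; 20_000];

    let mut r = ChunkedReader { data: msg.clone(), pos: 0, chunk: 8192 };
    println!("single read: {} bytes", read_once(&mut r)?.len());

    let mut r = ChunkedReader { data: msg, pos: 0, chunk: 8192 };
    println!("read to end: {} bytes", read_all(&mut r)?.len());
    Ok(())
}
```

Decoding the result of a single read fails with an EOF at the chunk boundary, which matches the shape of the error above. Reading to the end of the stream only works if the sender closes its write side after the response; otherwise some form of framing or incremental decoding is needed.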

My guess is that the RPC module, and possibly even rust-libp2p, isn't built to handle such large messages over the network (BlockSync tipset bundles are very large in practice), so the RPC module setup will probably need a refactor.

To Reproduce
Steps to reproduce the behavior:
Once #501 lands, this can be reproduced very consistently by connecting to testnet (default bootnodes and genesis, so just running the node).

Log output


Expected behaviour

BlockSync responses should not fail to decode

Screenshots

Environment (please complete the following information):

  • OS:
  • Rust version(e.g. rustc --version)
  • Branch/commit

Other information and links

@austinabell austinabell added Type: Bug Something isn't working Priority: 2 - High Very important and should be addressed ASAP Network Libp2p and PubSub stuff labels Jun 16, 2020