Handshake protocol: cannot decode stage 1/2 MessageBuffer: length less than 96 bytes #5372
Comments
Hey @prestonvanloon - this seems like it could be a bug in go-libp2p-noise. Do you have any more details about when this is happening, or more log output? Are you able to reproduce this locally?
@yusefnapora I have not personally encountered this locally, but it was happening very consistently after restarting Prysm in our internal testnet. We would have to restart the process 3 or 4 times before it would work. Do you have any advice on how to troubleshoot this issue? Thanks
@rauljordan updated some libp2p dependencies and now we have a slightly different message
I believe this is where the error originates from: https://github.com/flynn/noise/blob/2492fe189ae688d7edbeae0fd575de2f1c5fec8e/state.go#L420 One of the concerns here is that libp2p did not fall back to secio even though both options were given to the host. This failure rendered the host unable to peer with anyone.
@rauljordan mentioned that this issue is very easy to reproduce locally and happens quite often. This is the last item blocking our testnet restart, so it is high priority for us. Do you have any suggestions or feedback @yusefnapora?
Update on this issue: it seems both the local and the remote peer regard themselves as the initiator of the outbound handshake. Because both peers mark themselves as the initiator, the handshake is malformed. I added my own custom logging in Peer A:
Peer B
It seems like both of these peers dial each other at exactly the same time, and each then regards itself as the initiator of the connection and handshake. I am not sure how this is possible. But it would point... You can probably reproduce this with a libp2p host that has Noise enabled: disconnecting one peer and reconnecting it a few times should make the issue come up.
Hey, sorry I wasn't able to help much on Friday & thanks for digging deeper and discovering that it's a simultaneous connection issue. I think we can fix this by detecting the size mismatch and having one peer re-initialize the handshake as a responder. I'll start on this today & keep you guys posted.
Resolved in v0.12, now running in dev testnet
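For readers following along, here is a rough sketch of the idea described in the comment above. It is not the go-libp2p-noise implementation. The 96-byte threshold is taken from the error text in the issue title, and treating it as the minimum size of the responder's reply is an assumption. The names `checkResponse`, `staysInitiator`, `resolveRole` and `errSimultaneousDial` are hypothetical, and so is the peer ID comparison used to decide which side steps down; the thread does not say how that choice is made.

```go
package main

import "errors"

// minResponseLen comes from the error text in the issue title. Treating it as
// the minimum length of the responder's reply is an assumption of this sketch.
const minResponseLen = 96

// errSimultaneousDial is a hypothetical sentinel error used only in this sketch.
var errSimultaneousDial = errors.New("reply too short: remote peer may also be acting as initiator")

// checkResponse is what an initiator might run on the first message it reads.
// A message shorter than minResponseLen suggests the remote side sent an
// initiator message instead of a responder reply.
func checkResponse(msg []byte) error {
	if len(msg) < minResponseLen {
		return errSimultaneousDial
	}
	return nil
}

// staysInitiator is a hypothetical tie-break. Both peers evaluate the same
// ordering on the two (distinct) peer IDs, so exactly one keeps the initiator role.
func staysInitiator(localID, remoteID string) bool {
	return localID < remoteID
}

// resolveRole reports whether the local peer should continue as initiator
// after reading firstMsg. If it returns false, the local peer would restart
// the handshake as a responder.
func resolveRole(localID, remoteID string, firstMsg []byte) bool {
	if err := checkResponse(firstMsg); errors.Is(err, errSimultaneousDial) {
		return staysInitiator(localID, remoteID)
	}
	return true
}
```

With two peers that dialed each other at the same time, each side would call `resolveRole` with its own ID first. Because the comparison is symmetric, one call returns true and the other returns false, so the two sides end up with complementary roles without extra coordination.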
Observed in internal testnet today. Affected node cannot peer with other nodes.
Running from commit 9cf22ac