Fix node crash #1996
There is one detail missing in OP:
Although I don't think the new `_ready_for_sending` promise is necessary (if the `std::array` issue is not there or has been fixed), it looks harmless, so I agree to add it.
@@ -251,6 +255,7 @@ namespace graphene { namespace net {
}
~verify_no_send_in_progress() { var = false; }
} _verify_no_send_in_progress(_send_message_in_progress);
Since the `assert` above will be skipped in release builds, the `elog` above doesn't make sense. If we want to avoid two tasks calling `send_message` at the same time, we need a lock.

There is another `elog` below which looks suspicious as well.
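To illustrate the point about `assert` being compiled out in release builds, here is a minimal sketch of a guard whose check stays active regardless of `NDEBUG`. It is not the PR's code and not a lock: `send_in_progress_guard` and `toy_connection` are hypothetical names, and the guard only fails fast on a re-entrant call instead of serializing callers as a real lock would.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the guard in the diff above; not the PR's code.
class send_in_progress_guard {
public:
    explicit send_in_progress_guard(bool& flag) : _flag(flag) {
        // Unlike assert(), this check is not compiled out when NDEBUG is defined.
        if (_flag)
            throw std::logic_error("send_message() re-entered while a send is in progress");
        _flag = true;
    }
    // Only runs for a guard that was fully constructed, i.e. the one that set the flag.
    ~send_in_progress_guard() { _flag = false; }

    send_in_progress_guard(const send_in_progress_guard&) = delete;
    send_in_progress_guard& operator=(const send_in_progress_guard&) = delete;

private:
    bool& _flag;
};

// Hypothetical connection type used only to exercise the guard.
struct toy_connection {
    bool send_in_progress = false;
    std::vector<std::string> sent;

    // `reenter` simulates a second task calling send_message while the first is still inside it.
    void send_message(const std::string& msg, bool reenter = false) {
        send_in_progress_guard guard(send_in_progress);
        if (reenter)
            send_message(msg + " (nested)");
        sent.push_back(msg);
    }
};
```

Calling `send_message("b", true)` on a `toy_connection` throws from the nested call, and the outer guard resets the flag while the exception unwinds, so the connection stays usable afterwards.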
I haven't seen any indication that this happens (i.e. two tasks calling `send_message` in parallel), so I wouldn't change this at this time.

The check below (`MAX_MESSAGE_SIZE`) is OK IMO. We will start sending the oversized message, but the remote end will probably disconnect us. If they don't, it's fine too.

Note that a too-large message is an indication of a problem in the chain layer: it should catch oversized blocks and transactions. Protocol messages are much smaller than the limit.

Hm, actually... this should be verified for some message types. But that is out of scope here.
Does this solve the case where `requesting_peer` is disconnected due to inactivity before the firewall check result arrives, as per the TG discussion?
It currently looks as if the case where the peer disconnects is handled correctly. This change here is specifically about the case where the firewall check reply is received while the peer is being reconnected.
Or another peer is being connected whose
Added logging in
P2P log indicating the
Note:
Theory:

Node is connecting to a peer that has recently requested a firewall check and was then disconnected. `initiate_connect_to` creates a new `peer_connection`, inserts it into `_handshaking_connections`, then creates `accept_or_connect_task` to create the actual stcp connection, wherein `sock.connect()` will yield.

Suppose that while we're waiting for the connection, we receive the result of the firewall check from another peer. `on_check_firewall_reply_message` will look up the originating peer (to which we're still connecting) in `_handshaking_connections` and call `send_message` on it. This goes down to `stcp_socket.writesome`, which sends the message through `_send_aes` for encryption.

The problem is, because the connection has not been set up yet and there has been no key exchange, `_send_aes` hasn't been initialized yet.
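
The failure mode in this theory can be modelled with a small toy: a socket whose send cipher only exists after the key exchange, and a caller that may try to send before that point. This is a sketch under stated assumptions, not the actual `stcp_socket` or the fix in this PR. The names `toy_encrypted_socket`, `finish_key_exchange`, `ready_for_sending` and `try_send` are hypothetical, XOR stands in for real encryption, and a plain boolean check stands in for the `_ready_for_sending` promise mentioned in the review.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Toy stand-in for an encrypted socket; all names are hypothetical.
class toy_encrypted_socket {
public:
    // Models completion of the key exchange: only now is the send cipher usable.
    void finish_key_exchange(std::uint8_t key) {
        _send_key = key;
        _key_set = true;
    }

    bool ready_for_sending() const { return _key_set; }

    // Refuses to write until the cipher has been initialized,
    // instead of running the message through an uninitialized encoder.
    std::string write(const std::string& plaintext) const {
        if (!_key_set)
            throw std::runtime_error("write attempted before key exchange completed");
        std::string out = plaintext;
        for (char& c : out)  // XOR is a placeholder for real encryption
            c = static_cast<char>(static_cast<std::uint8_t>(c) ^ _send_key);
        return out;
    }

private:
    bool _key_set = false;
    std::uint8_t _send_key = 0;
};

// Hypothetical caller: skip a peer whose connection is still being set up.
bool try_send(const toy_encrypted_socket& sock, const std::string& msg, std::string& wire) {
    if (!sock.ready_for_sending())
        return false;
    wire = sock.write(msg);
    return true;
}
```

In this model, a `try_send` issued while the connection is still handshaking returns `false` and leaves `wire` untouched; after `finish_key_exchange` the same call succeeds.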