Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure network connection is closed when handshake with peer fails #1476

Conversation

pipermerriam
Copy link
Member

@pipermerriam pipermerriam commented Nov 18, 2018

fixes #1403

What was wrong?

As part of our peer connection logic we initiate a low level connection, then create a Peer instance, then perform a handshake, followed by starting the peer service. If anything goes wrong before the service starts, the connection will be left open causing these warnings.

How was it fixed?

wrapped the handshake step in a try/except which closes the connection in the event of an error.

Cute Animal Picture

bearlive05_04

@pipermerriam
Copy link
Member Author

I'll need to dig in a bit to validate where I think the remainder of these are coming from. The first place I want to investigate is:

py-evm/p2p/peer.py

Lines 934 to 936 in 1f60fd8

peer = await self.connect(node)
if peer is not None:
await self.start_peer(peer)

First we initiate the connection and then call PeerPool.start_peer(peer). This ends up calling self.run_child_service() which starts the peer's service component, but only by adding it as a task that it tracks. I'm looking for whether it's possible for the peer to disconnect before reaching:

py-evm/p2p/service.py

Lines 101 to 103 in 1f60fd8

async with self._run_lock:
self.events.started.set()
await self._run()

If so, then the _cleanup will never be called.

@pipermerriam pipermerriam changed the title [WIP] Ensure network connection is closed when handshake with peer fails Ensure network connection is closed when handshake with peer fails Nov 18, 2018
@pipermerriam
Copy link
Member Author

Some loose experimentation suggests this is a good enough starting point that it could be merged and we can chase down the remaining warnings in a later PR.

@pipermerriam
Copy link
Member Author

Nevermind, 3149871 seems to have cleaned up the remaining warnings.

Copy link
Contributor

@cburgdorf cburgdorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me but I was wondering if it could be improved using an asynccontextmanager?

aes_secret, mac_secret, egress_mac, ingress_mac = await _handshake(
initiator, reader, writer, token)
try:
aes_secret, mac_secret, egress_mac, ingress_mac = await _handshake(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could be rewritten using an asynccontextmanager so that becomes:

async with initiator.connect() as reader, writer:
    aes_secret, mac_secret, egress_mac, ingress_mac = await _handshake(
            initiator, reader, writer, token)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory yes. I've thought about this a bit but I'm unsure enough about it that I'd like to punt, with the compromise being adding a larger comment to the section explaining why the solution isn't ideal.

Copy link
Contributor

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@pipermerriam
Copy link
Member Author

Don't restart the jobs that failed to report in due to rate limiting, I'm surfacing this with github support and they need to be able to view the failures.

@pipermerriam pipermerriam merged commit 6fca8f7 into ethereum:master Nov 19, 2018
@pipermerriam pipermerriam deleted the piper/ensure-peer-transport-is-always-closed branch November 19, 2018 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ResourceWarning: unclosed resource <TCPTransport
3 participants