dial attempt failed: bind: address already in use #262

paralin · 2018-01-05T17:06:47Z

I'm seeing this error when multiple streams are rapidly opened to a peer from different goroutines:

dial attempt failed: <peer.ID fR8QjW> --> <peer.ID V6hTxD> dial attempt failed: dial tcp4 0.0.0.0:8180->100.96.234.187:8180: bind: address already in use

I think there may be a concurrency issue somewhere. Using go-libp2p @ 4bba0bb (latest).


Stebalien · 2018-01-05T19:41:41Z

If this is always happening, it's a bug that has been fixed in a dependency but I haven't bubbled it up here yet (dependency conflicts that I'm currently resolving). If you're not using gx (just using go get) you shouldn't be noticing this bug.

If this only happens sometimes, make sure that the peers aren't trying to dial each other at the same time (that will result in this error, unfortunately). It happens because we enable SO_REUSEPORT so that outgoing dials reuse the listening port as their source port. That means at most one connection can exist between the same pair of peer addresses. That is:

1. Peer A initiates a dial to peer B.
2. Peer B initiates a dial to peer A.
3. The second dial fails because there is already a connection between those two ports.

This is actually something we can and should fix. To do so, we'd need to detect this error and keep retrying until either we see that the other side has succeeded in connecting to us or we have succeeded in connecting to them. However, this will probably be a bit tricky...

So, what's your precise setup? Any chance the other side is trying to dial you back?

paralin · 2018-01-05T20:57:16Z

After introducing yamux, copying the setup in IPFS (without the msmux experiment) I no longer see this, so I believe the problem exists somewhere in the default stack used by libp2p.

For some context, the libp2p stack is in use in the FACEIT matchmaking system in production.

They are trying to dial each other at approximately the same time (+/- 2 seconds). The issue only occurs between peers that dial each other this way (A contacts B at the same time as B contacting A), not between peers that dial one way (A contacts B but never B contacts A).

The dials happen so close together because Kubernetes endpoints are used as the discovery mechanism (1 node per pod mapping via internal Kubernetes networking). Kubernetes informs the peers about each other's addresses this way at exactly the same time over the watch channel.

So, the issue is as you said: when two peers dial each other simultaneously, the SO_REUSEPORT approach breaks the connections. An easy fix is to retry with a staggered backoff (which we do, and it works quite nicely). It would be nice if we could find a way to make simultaneous dials possible.

qywang2012 · 2019-01-07T09:00:47Z

I also have the problem of 'bind: address already in use'. Currently we use an old version @edb6434ddf456f58fbe2538d5336435a23915bd9.
I want to update to version 6.0.30. Has this problem been fixed there? If not, can you suggest a good way to deal with it? @Stebalien

Stebalien · 2019-01-07T17:40:07Z

@qywang2012 what's the exact error you're seeing?

We haven't fixed the second issue I described but you may be experiencing a different issue.


Stebalien added the kind/bug A bug in existing code (including security flaws) label Jan 7, 2019

Stebalien closed this as completed Jul 22, 2021

marten-seemann added a commit that referenced this issue Apr 21, 2022

Merge pull request #262 from libp2p/speed-up-fd-limit-underflow-test

1f20d59

speed up the TestFDLimitUnderflow test

MarcoPolo mentioned this issue Jul 7, 2022

go-libp2p v0.21.0 #1514

