dial attempt failed: bind: address already in use #262
If this is always happening, it's a bug that has been fixed in a dependency but I haven't bubbled it up here yet (dependency conflicts that I'm currently resolving). If you're not using gx (just using go get) you shouldn't be noticing this bug. If this only happens sometimes, make sure that the peers aren't trying to dial each other at the same time (that will result in this error, unfortunately). It happens because we enable
This is actually something we can and should fix. To do so, we'd need to detect this error and keep retrying until either we see that the other side has succeeded in connecting to us or we have succeeded in connecting to them. However, this will probably be a bit tricky... So, what's your precise setup? Any chance the other side is trying to dial you back?
After introducing yamux and copying the setup in IPFS (without the msmux experiment), I no longer see this, so I believe the problem exists somewhere in the default stack used by libp2p.

For some context, the libp2p stack is in use in production in the FACEIT matchmaking system. The peers are trying to dial each other at approximately the same time (+/- 2 seconds). The issue only occurs between peers that dial each other this way (A contacts B at the same time as B contacts A), not between peers that dial one way (A contacts B but B never contacts A).

The dials happen so close together because Kubernetes endpoints are used as the discovery mechanism (a one-node-per-pod mapping via internal Kubernetes networking). Kubernetes informs the peers about each other's addresses at exactly the same time over the watch channel.

So the issue is as you said: when two peers dial each other simultaneously, the SO_REUSEPORT approach breaks the connections. An easy fix is to retry with a staggered backoff (which we do, and it works quite nicely). It would be nice if we could find a way to make simultaneous dials possible.
I'm also seeing 'bind: address already in use'. We currently use an old version, @edb6434ddf456f58fbe2538d5336435a23915bd9.
@qywang2012 what's the exact error you're seeing? We haven't fixed the second issue I described, but you may be experiencing a different issue.
I'm seeing this error when multiple streams are rapidly opened to a peer from different goroutines:
I think there may be a concurrency issue somewhere. Using go-libp2p @ 4bba0bb (latest).