NAT Connectivity and Discoverability Issues #2509
People have been noticing issues with connectivity through NATs lately. Let's use this issue to track those issues, and provide debugging information/tips/tricks.

Comments
Some tips I've posted in a different issue before. First, note down the peer IDs of all your involved nodes (run `ipfs id`). To check what peers a given node is connected to, run `ipfs swarm peers`. To check connectivity to a given node, I normally start at an ipfs node that I know has good connectivity (my VPS, normally) and run `ipfs swarm connect` toward it (see the sketch below). If you can successfully connect a node to the node with the data, you should be able to run an `ipfs get` of the data. If you connect and aren't able to get the data, I would check […]. If you can't make a connection from an outside node to your node with the data, the next thing I would try is making a connection from the data node out to other peers, then trying to fetch the data on those other peers. If that works, then the issue lies entirely with NAT traversal not working. IPFS does require some amount of port forwarding to work on NAT'ed networks (whether manual forwarding, NAT-PMP, or UPnP).
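A sketch of that checklist with concrete commands; the exact invocations in the original comment were lost in formatting, so the placeholders and the multiaddr shape below are assumptions:

```sh
# 1. On each involved node, note down its peer ID
ipfs id

# 2. On a given node, list the peers it is currently connected to
ipfs swarm peers

# 3. From a well-connected node (e.g. a VPS), look up and dial the data node
ipfs dht findpeer <peerID-of-data-node>
ipfs swarm connect /ip4/<public-ip>/tcp/4001/ipfs/<peerID-of-data-node>

# 4. Once connected, fetching should work
ipfs get <hash-of-data>
```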
I have thought about designing a NAT test lab; notes here: https://gist.github.com/whyrusleeping/a0ab8df68d1020df32c6
I keep getting "too many open files" errors on my ipfs daemon... not sure if this is related.
@slothbag hrm... getting the "too many open files" error will definitely cause issues with DHT connectivity.
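As an aside, a generic Unix sketch (not from the thread) for checking and raising the open-file limit before starting the daemon:

```sh
ulimit -n        # show the current open-file limit for this shell
ulimit -n 4096   # raise it for anything started from this shell
ipfs daemon
```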
An IRC user noted issues after seeing the mDNS "failed to bind unicast" error. There's likely some correlation here.
The issue appears to be a file descriptor leak in the utp codebase (thanks for the tip @slothbag, it really helped!). A temporary workaround (while I'm working on an official fix) is to add a utp swarm address to the `Addresses.Swarm` section of your config, as sketched below. After that value is set, restart your daemon and things should be better. If you continue to experience the same problems, please let me know ASAP. UPDATE: the utp code is disabled by default in recent versions of go-ipfs, so this suggestion is no longer valid.
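A minimal sketch of that workaround; the exact address was elided above, so the utp port (4002) and the use of the `ipfs config` CLI rather than hand-editing the config file are assumptions:

```sh
# Add a utp listener alongside the default TCP swarm address,
# then restart the daemon so the new address takes effect
ipfs config --json Addresses.Swarm \
  '["/ip4/0.0.0.0/tcp/4001", "/ip4/0.0.0.0/udp/4002/utp"]'
ipfs daemon
```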
So, following on from our other thread, a couple quick notes:

dht: I was able to retrieve a 2.04 GB data set (a 1.4 GB file, a 700 MB file, a few others) with the following procedure: […]. On the host running the daemon, some of these addresses are known to be from my own hosts; some clearly are not. I'll have a little more time tomorrow to investigate further and maybe get some packet traces.
Is this a regression? Maybe try going back through versions to see when it started?
I did the utp config change and it appears to have fixed the issue... nice find!
Fixes to the utp lib have been merged into master, so pull the latest down and run it. Please let me know how things go.
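For anyone following along, a sketch of pulling and rebuilding from master; the repo URL and make target are the standard go-ipfs ones, not quoted from the comment:

```sh
git clone https://github.com/ipfs/go-ipfs
cd go-ipfs
make install   # builds and installs the ipfs binary
ipfs daemon    # restart with the patched utp lib
```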
Added the utp change, added the port to my security groups.
Restarting this same node repeatedly, with slightly different configurations for API, Gateway, and Swarm (but always including the appropriate utp line), I got many different results regarding whether the AWS PUBLIC_IPV4 address was included or not.

I finally ran this machine out of space (I'm presuming this is related to Docker's handling of volumes rather than ipfs's handling of data) and blew away the ipfs dir. :) After starting with a fresh config on the same host, I moved on to a "fresher" node (fresh IP addresses, fresh config file, no data downloaded).

There's no perceptible pattern to the log messages when the get operation appears to "stall out", other than what I've noted above.

I'll be adding more nodes shortly, all with fresh IP addresses. I'll whip up an updated Docker image from master in the morning.
Updated the local and remote IPFS nodes with the latest UTP fixes. The local node is behind NAT but has port forwarding for IPFS. I don't seem to get the "too many open files" error anymore; however, discoverability is still not working. I have been trying to pin an object for an hour and it can't find it. The problem still exists.
@guruvan the "stalling out" is "no data at all received for a long time", right? Not "received some data and then hung"? If that's the case, then it's an issue with discoverability/connectivity (which I think is the problem). @slothbag in this case, can you discover valid addresses for the NAT'ed node from a node outside the NAT (`ipfs dht findpeer <peerID>`)?
`ipfs dht findpeer` returns a list of IP addresses... a mixture of my LAN IP and my external IP, but the correct incoming port is only on the LAN IP, and all the external addresses have incorrect ports.
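For illustration, a made-up sketch of the kind of output being described; the peer ID is a placeholder and the addresses use documentation IP ranges:

```sh
$ ipfs dht findpeer <peerID-of-NATed-node>
/ip4/192.168.1.10/tcp/4001    # LAN address, correct forwarded port
/ip4/203.0.113.7/tcp/53817    # external IP, wrong (ephemeral) port
/ip4/203.0.113.7/tcp/41002    # external IP, wrong port again
```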
@slothbag that's awesome information for me to have, thank you!
Why can't ipfs use the same method for determining the external IP as, for example, parity does with `--nat extip:`?
@mikhail-manuilov Do you mean you specified your external IP in `Addresses.Announce`? The `Announce` list tells your node which addresses to advertise to peers (sketch below).
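For reference, a hedged sketch of setting the announced address by hand via `Addresses.Announce`; the external IP here is a placeholder:

```sh
# Advertise only this external address to peers, regardless of
# what the node infers about its own reachability
ipfs config --json Addresses.Announce '["/ip4/203.0.113.7/tcp/4001"]'
```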
Here is another issue with connectivity I have observed at my house. I am behind carrier-grade NAT and then my local NAT. After starting go-ipfs it connects to one bootstrap node and that is it. Randomly I found that disabling reuseport (`IPFS_REUSEPORT=false`) "fixes" it; fixes as in: now I can dial out, but people still can't dial to me (the NAT is too strong). So if someone has problems with dialing out, disabling reuseport might help.
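A one-line sketch of that workaround, assuming the `IPFS_REUSEPORT` environment variable go-ipfs honored at the time:

```sh
# Start the daemon with SO_REUSEPORT-based dialing disabled
IPFS_REUSEPORT=false ipfs daemon
```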
It would be great to add tests for this sort of thing to https://github.com/whyrusleeping/natest. I need to spend more time working on that tool, but it should help us diagnose these things.
I tried to use this tool just now; it seems quite broken.
@Kubuxu I fixed the issue you reported, thanks! Mind trying again?
@Kubuxu hah, you can see it's a double NAT: you succeeded in mapping a port, but the connect back still failed. Quoting the natest report you posted:
```json
{
  "OutboundHTTP": {
    "OddPortConnection": "",
    "Port443Connection": ""
  },
  "Nat": {
    "Error": null,
    "MappedAddr": "/ip4/0.0.0.0/tcp/38044"
  },
  "HavePublicIP": false,
  "Response": {
    "SeenAddr": "/ip4/87.239.222.9/tcp/6812",
    "ConnectBackSuccess": false,
    "ConnectBackMsg": "dial attempt failed: \u003cpeer.ID Pah1CN\u003e --\u003e \u003cpeer.ID TwQRCH\u003e dial attempt failed: connection refused",
    "ConnectBackAddr": "",
    "TriedAddrs": [
      "/ip4/127.0.0.1/tcp/40941",
      "/ip4/0.0.0.0/tcp/38044",
      "/ip4/87.239.222.9/tcp/40941"
    ]
  },
  "Request": {
    "PeerID": "QmTwQRCHoF34HamrcfAQx9rti3AM127hKr6MGrzvBnxBoM",
    "SeenGateway": "",
    "PortMapped": "/ip4/0.0.0.0/tcp/38044",
    "ListenAddr": "/ip4/127.0.0.1/tcp/40941"
  },
  "TcpReuseportWorking": false
}
```
Yup. The interesting thing is that with reuseport enabled, I think I am only able to dial out once from that port.
I have the option to buy an external IP from my ISP, but I am deliberately not doing that until we successfully recreate a setup like this elsewhere.