Message gets dropped because ttl reaches 0 #2849
Comments
@mfornet any updates? We need to understand whether we should address this before phase 1.
I've been investigating this. It looks like a problem with the routing table, which is creating a cycle, but I haven't been able to track down the root cause yet. Is this happening consistently on some node?
No. At least I have seen this reported only once, but that is not (and should not be) an argument for the severity (or the lack thereof) of the issue.
@bowenwang1996 the simplest scenario where this can happen is three nodes A, B, and C, all connected to each other, where C goes offline. A and B each learn immediately that their own connection to C has dropped, but it takes "a moment" for each to learn that the other has also lost its connection to C. So A sends a routed message for C via B, and B sends it back to A (to route it to C). This repeats until the ttl reaches 0 and the message is dropped. Something similar can happen with many more nodes when a node goes offline.

Routing table synchronization should happen fairly fast (on the order of seconds), so this issue will only be present for a short period, and most of the time it will occur when trying to route a message to a disconnected peer. I'm of course assuming those nodes are running legit (non-tampered) clients, because the easiest way to reproduce this today is simply to send a message with ttl=1, or to send routed messages through non-optimal routes.
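To make the A/B/C scenario concrete, here is a minimal Python sketch of the loop. It is not nearcore code: the `route` function, the per-node next-hop tables, and the rule that each forward decrements the ttl by one are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical per-node next-hop tables: next_hop[node][target] -> neighbour.
# They model the stale views right after C goes offline and are NOT
# nearcore's actual data structures.
STALE_NEXT_HOP: Dict[str, Dict[str, str]] = {
    "A": {"C": "B"},  # A lost its own link to C but still believes B reaches C
    "B": {"C": "A"},  # B lost its own link to C but still believes A reaches C
}


def route(source: str, target: str, ttl: int,
          next_hop: Dict[str, Dict[str, str]]) -> Tuple[str, List[str]]:
    """Forward a message hop by hop; return (outcome, nodes visited)."""
    path = [source]
    current = source
    while current != target:
        hop = next_hop.get(current, {}).get(target)
        if hop is None:
            return "no_route", path
        ttl -= 1  # assumption: every forward consumes one unit of ttl
        if ttl == 0:
            return "dropped_ttl", path
        current = hop
        path.append(current)
    return "delivered", path


if __name__ == "__main__":
    # Stale tables: the message bounces between A and B until the ttl runs out.
    print(route("A", "C", ttl=4, next_hop=STALE_NEXT_HOP))
    # After synchronization neither node has a route to C, so nothing loops.
    print(route("A", "C", ttl=4, next_hop={"A": {}, "B": {}}))
```

Once both tables converge, the message fails fast with no route instead of bouncing, which is why the loop should only be observable during the short synchronization window.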
@bowenwang1996 I'm closing this, since this behavior is "expected" in some situations. Reopen if we see a high number of dropped messages for this reason.
We've seen on betanet the following:
It seems that there is some routing loop in the network, but it's not clear to me why that would happen.