[Routing Behavior] Possible Multi-Device Interface Problem? #605

LinuxinaBit · 2024-11-07T16:08:30Z


According to Parnikkapore on Matrix, when a packet comes in over an RNode (or other multi-device interface) and the best route to the destination is back over the same interface, the packet is dropped to avoid networking issues.

The best solution for this I can think of is temporarily adding an extra field containing the address of the next hop when routing over multi-device interfaces like plain RF.

I think it’s well worth the extra overhead because it’s the only way to solve this major issue without adding unnecessary network layers or introducing flaws like switching loops etc.

Answered by markqvist

Nov 7, 2024


LinuxinaBit · 2024-11-07T16:16:59Z

Author

It turns out I completely missed a message from @liamcottle saying:

> I have no idea what I'm talking about, I just woke up, but it was my understanding that each transport node has a transport ID. So each radio unit has a unique transport ID for the next hop. If your packet is destined to go via A, and not C, C doesn't rebroadcast it even if it heard it, because transport drops it.

If that is true, this is likely pretty close to a non-issue.
Can someone please confirm this?

0 replies

faragher · 2024-11-07T18:22:14Z


I'm going to boil this down to brass tacks, a truly terrifying mixed metaphor, and expand from there.

Reticulum routes across interfaces. RNode, TCPServer, Serial, whatever. It sends to the next hop based on which interface that next hop is on.

When an announce comes in on an interface, it doesn't repeat over that interface because it's assumed that everyone on that interface has heard it. Across practical wired interfaces, this makes sense. Serial, CAN, whatever, it's assumed all stations got the transmissions properly. Data corruption and resends are a different discussion I'm not up to speed on.

Over radio, this can't be assured. In fact, the cases when this isn't appropriate are generally all over radio. Both dumb repeaters and radios that are out of range of each other, but in range of an intermediate peer, require some rebroadcasts in a pure mesh topology.

Analog radio deals with this by frequency shifting. A radio on foo.20 MHz may rebroadcast its transmissions on foo.60 MHz, allowing for an expanded broadcast area by effectively being a different station altogether. There are a LOT of finer solutions, such as multiple control frequencies and computer-defined transmit/receive frequencies (see Project 25 Phase 2) but generally speaking, these require far more expensive systems than the SX series chips, and require central command and control.

By doing broadcasts over the same radio settings, you effectively take up N*bandwidth for N nodes in range of any given receiver. You can reduce this by limiting rebroadcasts to infrastructure repeaters only, but when everyone clicks over to being a transport node, there's no avoiding that everything will be rebroadcast and, as noted, any efficiency gained by routing is lost.
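To put rough numbers on the N*bandwidth point, here is a back-of-the-envelope sketch. The function name, the 0.25 s airtime, and the node counts are made up for illustration and are not measurements from any real RNode setup.

```python
def airtime_at_receiver(packet_airtime_s: float, transmissions_heard: int) -> float:
    """Channel time one packet occupies at a receiver that hears
    `transmissions_heard` copies of it on the same radio settings."""
    return packet_airtime_s * transmissions_heard


# Assumed figures: one packet takes 0.25 s on air.
PACKET_AIRTIME_S = 0.25

# Every one of 8 nodes in range transmits the packet once (blind rebroadcast).
flooded = airtime_at_receiver(PACKET_AIRTIME_S, transmissions_heard=8)

# Routed: the receiver only hears the sender and a single next hop.
routed = airtime_at_receiver(PACKET_AIRTIME_S, transmissions_heard=2)

print(f"flooded: {flooded} s, routed: {routed} s")
```

With these assumed figures the same packet costs four times the channel time when every node in range repeats it.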

This generally boils down to competing and incompatible topologies. Routed systems are more efficient than ad-hoc mesh networks, but an ad-hoc mesh network can contact nodes across multiple hops without any sort of planning. It's certainly possible to do this with modifications to Reticulum, or even with an application as-is (plain destinations, encrypted in software, which would only be propagated across the wider network by properly configured bridges).

Should it be done? That question can't be answered. The end user is responsible for finding a tool that suits their needs. If they find a novel way to use Reticulum as a pure mesh network and accept the issues therein, that's great. If another tool is a better fit, then perfect! No solution can cover every use case, and any system that tries will either fail or be a bloated compromise.

Now, the curious and astute will ask "What about TCP? If you don't repeat over an interface, then how could that possibly work?" TCP Interfaces list each individual IP connection as a discrete sub-interface. So when Alice gets an announce from Bob, be that directly or through a transport, then she'll send the announce to Claire and David across their individual sub-interfaces, but not to Bob, since that's really a point to point connection.
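A toy model of that rule, as described above, might look like the following. The interface names and the `announce_targets` helper are hypothetical and this is not how Reticulum's code is organised. It also ignores the interface modes and any limited same-medium rebroadcasts.

```python
def announce_targets(interfaces, received_on):
    """Return the interfaces an announce is forwarded to:
    every known interface except the one it arrived on."""
    return [name for name in interfaces if name != received_on]


# Alice's TCP server exposes one sub-interface per connected peer (assumed names).
alice_tcp = ["tcp:bob", "tcp:claire", "tcp:david"]
from_bob = announce_targets(alice_tcp, received_on="tcp:bob")

# A single shared radio interface has nowhere else to forward to.
alice_radio = ["rnode0"]
from_radio = announce_targets(alice_radio, received_on="rnode0")

print(from_bob)    # Claire's and David's sub-interfaces
print(from_radio)  # empty list
```

Because each TCP peer is its own sub-interface, "don't repeat on the interface it came in on" still reaches Claire and David, while the same rule on one shared radio interface yields no targets.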

At the end of the day, Reticulum does not suffer from the loop issue or mass rebroadcast issues as-is. It was built for routed efficiency. In certain cases over radio, this causes issues and people would like to see those issues resolved. It's possible these are fundamental issues that can't be resolved, either at all or within RNS specifically. Some of them can be solved at the application level (opportunistic messages, pre-shared keys, direct RNode access, etc. combined with proper interface modes to prevent announce bleed). Some are genuine issues that need serious thought, like the one in the Matrix discussion, which suggests that this is perhaps a generally unsolvable problem at the stack level, and that Reticulum mesh networks may be best assembled as a purely application-driven system, only using RNS to route over infrastructure endpoints.

Does that answer the direct questions and explain the overall situation and why the question was posed in the first place? I'm still half asleep.

2 replies

LinuxinaBit Nov 7, 2024
Author

> When an announce comes in on an interface, it doesn't repeat over that interface because it's assumed that everyone on that interface has heard it. Across practical wired interfaces, this makes sense. Serial, CAN, whatever, it's assumed all stations got the transmissions properly… Over radio, this can't be assured. In fact, the cases when this isn't appropriate are generally all over radio.

So then why isn’t it a per-interface configuration option, enabled by default on radio interfaces like RNode but not mandatory?

> Analog radio deals with this by frequency shifting.

Frequency shifting makes sense for analog, but it’s more complex and requires more setup. Digital radio doesn’t require this because it doesn’t need to broadcast constantly. It seems we both agree that it isn’t an ideal solution.

> By doing broadcasts over the same radio settings, you effectively take up N*bandwidth for N nodes in range of any given receiver…then there's no avoiding everything will be rebroadcast and, as noted, any efficiency gained by routing is lost.

I once summed up Reticulum’s routing as “flood routing announces” because that’s what I thought it did.
I see very little issue with this assuming the network is sectioned off properly. It loses an amount of efficiency that is not insignificant when compared to the complicated frequency shifting method previously mentioned, sure, but it’s nowhere near as inefficient as pure flood routing.
I personally believe the pros of working single channel operation completely outweigh the downsides it causes.

This was kind of an adjacent response to my actual question, but it is something worth discussing :)

faragher Nov 7, 2024

> So then why isn’t it a per-interface configuration option, enabled by default on radio interfaces like RNode but not mandatory?

Dunno. Not my domain.

> Frequency shifting makes sense for analog, but it’s more complex and requires more setup. Digital radio doesn’t require this because it doesn’t need to broadcast constantly. It seems we both agree that it isn’t an ideal solution.

It also makes sense in limited bandwidth situations, congested regions, or when you're setting up dissimilar systems, such as local nets and long range backbones. P25P2 is all digital, but the need for multiple simultaneous transmissions and a wide area of responsibility requires multiple channels. It's a larger discussion regarding network topology, not really related to meshes or the original question.

> I once summed up Reticulum’s routing as “flood routing announces” because that’s what I thought it did. I see very little issue with this assuming the network is sectioned off properly. It loses an amount of efficiency that is not insignificant when compared to the complicated frequency shifting method previously mentioned, sure, but it’s nowhere near as inefficient as pure flood routing. I personally believe the pros of working single channel operation completely outweigh the downsides it causes.
>
> This was kind of an adjacent response to my actual question, but it is something worth discussing :)

I mean, if people would just stop announcing every five minutes it would be quite efficient, and the efficiency increases when the network is properly configured and proper gateways and access points are used. Bear in mind that an announce contains the cryptographic key required to communicate with the announced destination at all. Since it's used to define network topology, it can't exactly be routed. It's useful to treat the creation of the network and the data transfer as separate concerns, with different intents and efficiency requirements.

markqvist · 2024-11-07T20:19:39Z

Maintainer

> The best solution for this I can think of is temporarily adding an extra field containing the address of the next hop when routing over multi-device interfaces like plain RF.

I completely agree, since this is already how it works :)

Consider the following scenario: Node B can hear both node A and C on the same radio interface, but A and C cannot hear each other. Any announces originating from either A or C will be heard by B, and inserted into the path table on B.

If A tries to communicate with a destination on C, it will request a path to that destination, and B will answer.
Any packets sent to C from A will be inserted into transport on A, and have the transport ID of node B added to their header in the first address field.
Node B will receive the packet, identify that it is in transport, and use its current path table to determine that the next hop for this packet is via node C, which is on the same interface the packet was received on.
The transport ID field of the packet header is updated, and the packet is then sent out again on the same interface as soon as possible.
Node C receives the packet in transport, and delivers it to the destination, which is local to that system.
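For readers who prefer code, the A → B → C case can be modelled roughly as below. This is only a sketch of the behaviour described in the steps above, not Reticulum's actual implementation: the `Packet` and `Node` classes, the string IDs, and the `rnode0` interface name are all invented for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    destination: str                    # final destination (hypothetical string form)
    payload: bytes
    transport_id: Optional[str] = None  # transport node expected to handle the next hop


class Node:
    def __init__(self, transport_id, local_destinations=()):
        self.transport_id = transport_id
        self.local_destinations = set(local_destinations)
        self.path_table = {}  # destination -> (next-hop transport ID, interface name)

    def originate(self, destination, payload):
        # Insert the packet into transport, addressed to the next hop.
        next_hop, _interface = self.path_table[destination]
        return Packet(destination, payload, transport_id=next_hop)

    def receive(self, packet):
        # Packets addressed to another transport node are not acted on.
        if packet.transport_id != self.transport_id:
            return "ignored", None, None
        if packet.destination in self.local_destinations:
            return "delivered", packet, None
        # Update the transport ID and send towards the next hop,
        # even if that is the interface the packet arrived on.
        next_hop, out_interface = self.path_table[packet.destination]
        return "forwarded", replace(packet, transport_id=next_hop), out_interface


a, b = Node("A"), Node("B")
c = Node("C", local_destinations={"dest_on_c"})
a.path_table["dest_on_c"] = ("B", "rnode0")
b.path_table["dest_on_c"] = ("C", "rnode0")

pkt = a.originate("dest_on_c", b"hello")
status_b, pkt_b, out_if = b.receive(pkt)   # B forwards on the same interface
status_a, _, _ = a.receive(pkt_b)          # A overhears B's retransmission
status_c, pkt_c, _ = c.receive(pkt_b)      # C delivers locally
print(status_b, out_if, status_a, status_c)
```

In this model A ignores the retransmission it overhears from B because the transport ID no longer matches anything A is responsible for, which is why sending back out on the same shared interface does not create a loop.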

Hope that clears things up a bit.

What @faragher points out here is still relevant though, but I'd like to offer one clarification. Reticulum does indeed not flood route or mindlessly rebroadcast anything, but a limited amount of announce rebroadcasts can actually occur, even on radio interfaces, provided they are in the full (or similar) interface mode. In such cases, Reticulum will rebroadcast an announce, but only once for every unique announce (from the perspective of the original announce emitter), and only if it has not heard any other adjacent nodes rebroadcasting the same announce. These rebroadcasts are inserted into a rebroadcast waiting table, and as soon as it hears someone else on the same shared medium rebroadcasting it, the announce is dropped from the rebroadcast waiting table. Additionally, all the normal congestion / bandwidth allocation / rate limiting logic still applies, of course.
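A minimal sketch of the rebroadcast waiting table idea follows. It assumes the caller can already tell a first-hand announce from someone else's rebroadcast and supplies its own timestamps. The class, method names, and delay values are hypothetical, and the congestion and rate limiting logic mentioned above is left out entirely.

```python
class AnnounceRebroadcaster:
    def __init__(self):
        self.seen = set()   # announce IDs that have already been handled once
        self.waiting = {}   # announce ID -> time at which we would rebroadcast

    def on_announce(self, announce_id, now, delay):
        """A new announce was heard; queue one rebroadcast for it."""
        if announce_id in self.seen:
            return False    # each unique announce is queued at most once
        self.seen.add(announce_id)
        self.waiting[announce_id] = now + delay
        return True

    def on_rebroadcast_heard(self, announce_id):
        """Another node on the shared medium rebroadcast it; drop ours."""
        self.seen.add(announce_id)
        return self.waiting.pop(announce_id, None) is not None

    def due(self, now):
        """Return (and remove) the announces whose waiting time has passed."""
        ready = sorted(a for a, t in self.waiting.items() if t <= now)
        for announce_id in ready:
            del self.waiting[announce_id]
        return ready


node = AnnounceRebroadcaster()
queued_1 = node.on_announce("ann-1", now=0.0, delay=2.0)
queued_2 = node.on_announce("ann-2", now=0.0, delay=2.0)
queued_again = node.on_announce("ann-1", now=0.5, delay=2.0)  # duplicate
cancelled = node.on_rebroadcast_heard("ann-1")  # a neighbour beat us to it
sent = node.due(now=3.0)
print(sent)
```

Only `ann-2` is rebroadcast here, because a neighbour was heard repeating `ann-1` before its waiting time ran out.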

0 replies

LinuxinaBit · 2024-11-07T21:58:20Z

Author

This all makes a lot more sense now, thank you both for clearing it up ❤️

0 replies
