
support multi address per node and same address for multi node #743

Open · wenerme opened this issue Aug 1, 2021 · 11 comments

@wenerme

wenerme commented Aug 1, 2021

When using tinc, I can add multiple addresses to an interface, mostly used as a poor man's load balancing or for network redundancy.

E.g. two masters each have their own IP, plus one more shared IP used for client access.

tinc documents this: if one node goes down, it takes about 15 seconds to route to the other master node.

In n2n 2, if I add multiple -a flags, only the last one works. How could n2n support something like this? It would make building a small network easier.

@Logan007
Collaborator

Logan007 commented Aug 2, 2021

add multiple addresses to an interface

n2n itself does not currently support this.

In n2n 2, if I add multiple -a flags, only the last one works. How could n2n support something like this? It would make building a small network easier.

But as n2n creates a regular layer 2 TAP device, you can easily add more IP addresses manually to the virtual adapter (named n2n0 in this case, i.e. -d n2n0).

ip addr add 10.xx.xx.xx/24 dev n2n0

Just put this line into your start-up script after the edge ... command.
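For example, to also cover the shared-IP scenario described above, a second address can be added the same way; a sketch only, the addresses are placeholders:

# add the shared address used for client access (placeholder values)
ip addr add 10.xx.xx.100/24 dev n2n0
# verify that both addresses are now assigned
ip addr show dev n2n0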

if one node goes down, it takes about 15 seconds to route to the other master node

That is an interesting idea! I guess it is about fail-over. So far, n2n has not gone too deep into layer 3 management; it does not care much about the IP level but rather about the Ethernet level. Aren't there any standard procedures for fail-over in regular networks that could be applied here?

@wenerme
Author

wenerme commented Aug 2, 2021

@Logan007 Something at the ARP level may help with failover, like https://github.com/metallb/metallb; I also use metallb over tinc.

So the behavior of a shared IP address is undefined. But maybe n2n could support some kind of start script? I want to use multiple -a flags because n2n uses a conf file; if n2n supported a start script like tinc-up, I could put the commands there. My n2n is managed by openrc and deployed by ansible (https://github.com/wenerme/ansible-collection-wenerme-alpine/blob/master/roles/alpine/tasks/n2n-edge-conf.yaml); having no proper start script makes customization harder.

@Logan007
Collaborator

Logan007 commented Aug 2, 2021

@wenerme n2n's primary goal is to offer Ethernet-like TAP adapter connectivity in a safe VPN environment that even works behind NAT. Up to now, we have only taken care of very few things beyond that, such as auto-IP address assignment and route control. Often, these features turn out to be very platform-specific and make the code difficult to maintain. To be honest, I have a similar feeling about multiple IP addresses. I googled how to add multiple IP addresses to an adapter on Linux, and it does not seem to be an all-too-easy C job; I don't want to imagine it on Windows...

However, if anyone wants to contribute code, we certainly are open to pull it in!

Until then, you could try to use scripts. Simple ones could just run

#!/bin/bash
/usr/bin/edge ... -d n2n0 ...
ip addr add ... dev n2n0
ip route add ... dev n2n0

If I remember correctly, edge without -f terminates after the virtual device is configured, up, and running. So the ip addr add command should work very well on the already started interface.

Alternatively, a custom script could run with PostUp as suggested in #694 along with #629 (use of allow-hotplug for the n2n virtual TAP device). Not sure what distributions support it.


As for fail-over, I think it is a very specific function. I am not sure whether it is too specific to be covered by n2n.

It would require the reserve node to actively monitor all the other nodes for the virtual reserve IP address. Those addresses are registered at the supernodes and only partly with the peer edges, so the reserve edge would permanently need to poll the supernode. Being layer 2, n2n is better prepared to look up other nodes by their MAC addresses; there is no look-up for their virtual IP addresses yet because the OS's ARP does that job (no need to do ICMP ping inside n2n, just poll the supernode; a new message type would be required). And then, the fail-over period could be as long as approximately 60 to 90 seconds according to the current registration-and-purge timing scheme. This is how the fail-over case could be detected; the TAP's virtual IP address could then be changed to the provided reserve address.

But what happens if the original node comes back, claiming the original address again? How to detect that, and how to react? If we get those points somewhat clearer, we could think about how to implement this feature. But then, still, shall we duplicate other tools' functionality (talking about fail-over solutions, not so much tinc)?

If I get it correctly, the following drafted script could do a similar job; beware, it probably does not cover the original-server-is-back case either.

#!/bin/bash
# draft with placeholder values; does not cover the original-server-is-back case
PEER_IP=10.xx.xx.1          # the other (original) edge's virtual IP
SHARED_IP=10.xx.xx.100/24   # shared reserve address to take over on failure
while true; do
    # ping the original edge; on a first failure, try the ping a second time
    if ping -c 1 -W 2 "$PEER_IP" >/dev/null || ping -c 1 -W 2 "$PEER_IP" >/dev/null; then
        ip addr del "$SHARED_IP" dev n2n0 2>/dev/null   # peer alive: keep quiet, release the shared address
    else
        ip addr add "$SHARED_IP" dev n2n0 2>/dev/null   # take over the shared address (ip addr ...)
        arping -U -I n2n0 -c 3 "${SHARED_IP%/*}"        # send out gratuitous ARP (arping -U ...)
    fi
    sleep 5   # sleep, start over
done

The virtual TAP device should handle ARP packets very well; it also broadcasts them to the other edges. So you could use the ip addr ... command to change the address and arping -U ... to send out an ARP announcing it.

@skyformat99
Contributor

[quotes @Logan007's previous comment in full]

Very good! The above can be added to the FAQ

@wenerme
Author

wenerme commented Aug 3, 2021

@Logan007 Thanks for the details.

I pushed n2n to Alpine's repo. The problem with this

/usr/bin/edge ... -d n2n0 ...
ip addr add ... dev n2n0
ip route add ... dev n2n0

is that there is nowhere to place the ip script. Maybe consider adding a flag like --post-up-script /etc/n2n/mynet.sh (just like tinc-up). The edge is started by openrc, so there is no way to know when edge has started.

Hotplug is even more complex and requires event support; openrc does not support hotplug.


As for the fail-over: as long as the TAP device acts like a normal L2 device and supports broadcast and ARP, fail-over should work. There is no need to consider the original-server-is-back case; like normal L2, just make this a defined behavior of n2n.

@Logan007
Collaborator

Logan007 commented Aug 3, 2021

Very good! The above can be added to the FAQ

That would make it a hard-to-read document then... 😄

@Logan007
Collaborator

Logan007 commented Aug 3, 2021

the edge is started by openrc, so there is no way to know when edge has started

I am absolutely not familiar with openrc, but couldn't you make it start one script, edgeStart.sh, which contains exactly the commands listed above? It would have the same effect as a post-up script (see the sketch below).

By "when", do you mean before or after the network is up? Under these circumstances, a post-up script could not be sure either whether the network is already up.
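For illustration, a minimal openrc-style sketch of that idea; this is untested, the start_post hook and the wait loop are assumptions, and all paths and addresses are placeholders:

#!/sbin/openrc-run
# hypothetical /etc/init.d/n2n-edge sketch; all values are placeholders

command="/usr/bin/edge"
command_args="-f -d n2n0 ..."
command_background="yes"
pidfile="/run/n2n-edge.pid"

start_post() {
    # wait until edge has created the TAP device, then add the extra address
    local i=0
    while ! ip link show n2n0 >/dev/null 2>&1 && [ $i -lt 10 ]; do
        sleep 1; i=$((i + 1))
    done
    ip addr add 10.xx.xx.xx/24 dev n2n0
}

The wait loop is one way around the timing question above: the ip command only runs once the TAP device actually exists.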

maybe consider adding a flag like --post-up-script /etc/n2n/mynet.sh

However, as we see rising demand here and over at other issues, we might actually consider it. Adding it is probably not a big deal, but I would assign it low priority. If anyone wants to cover this, please feel free to contribute a pull request!

as long as the TAP device acts like a normal L2 device and supports broadcast and ARP

n2n's TAP does support broadcast and especially ARP.

there is no need to consider the original-server-is-back case

Do you mean that other fail-over tools (relying on broadcast and ARP?) are able to work on top of current n2n?

If not, and n2n were to take care of IP fail-over itself, shouldn't we try to get it working somewhat right? From where I see it, we definitely cannot ignore the original-IP-is-back case.

like normal L2, just make this a defined behavior of n2n

I do not fully understand. Please specify what exactly you want n2n's default behavior to be or to become.

@wenerme
Author

wenerme commented Aug 4, 2021

openrc is just a common supervisor like systemd or s6. Supervisors need to control the process directly so that they can support signals, restarts, and monitoring, so it should not be a start.sh wrapper.

A post-up script means the TAP device has been created, so the ip script will succeed no matter whether the network is working or not.

n2n's TAP does support broadcast and especially ARP.

I will test the fail-over when I have time.

we definitely cannot ignore the original-IP-is-back case.

That is more than normal fail-over; original-IP-is-back should just work if fail-over works. Basically it's just like a bridge device, but I do see some bridge problems: https://github.com/ntop/n2n/issues?q=is%3Aissue+is%3Aopen+bridge

defined behavior of n2n

I just want to know whether a case like this is supported or tested by n2n, but it seems not. I should test this myself, use at my own risk.

@Logan007
Collaborator

Logan007 commented Aug 4, 2021

the ip script will succeed no matter whether the network is working or not

You are right and have made a very good point here. That makes me consider the up-script even more now.

I just want to know whether a case like this is supported or tested by n2n, but it seems not. I should test this myself, use at my own risk.

I see. Please let us know what you find!

@MurzNN

MurzNN commented Mar 7, 2024

Thanks for the suggestion, but it doesn't work for me. Could anyone please explain why?

/usr/sbin/edge -f -a 10.77.1.4 ...
ip addr add 10.77.1.5/24 dev edge0
ip route add 10.77.1.5 dev edge0

After executing this, I can ping 10.77.1.5 from the same edge node, but not from other nodes.
But 10.77.1.4 responds fine to pings from any other edge node of the virtual network.

By the way, arping responds well:

node-10-77-1-2$ arping 10.77.1.5 -c 1
ARPING 10.77.1.5 from 10.77.1.3 edge0
Unicast reply from 10.77.1.5 [EE:63:75:13:09:B4]  180.998ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

node-10-77-1-2$ ping 10.77.1.5 -c 1
PING 10.77.1.5 (10.77.1.5) 56(84) bytes of data.

--- 10.77.1.5 ping statistics ---
1 packet transmitted, 0 received, 100% packet loss, time 0ms

node-10-77-1-2$ ping 10.77.1.4 -c 1
PING 10.77.1.4 (10.77.1.4) 56(84) bytes of data.
64 bytes from 10.77.1.4: icmp_seq=1 ttl=64 time=195 ms

--- 10.77.1.4 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 194.811/194.811/194.811/0.000 ms

What can I try to do to make this work?

@Logan007
Collaborator

Logan007 commented Mar 8, 2024

Might be that ARP packets do not get through. If -E (formerly also -b, I think...) and perhaps -r on all edges does not help, someone might need to extend the multicast/broadcast handling code to also allow ARP packets (with perhaps a 00:00:00:00:00:00 destination MAC?).
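For example, a combination to try first (assuming an n2n version where -E accepts multicast MAC traffic and -r enables packet forwarding; adapt the remaining arguments to your setup):

/usr/sbin/edge -E -r -f -a 10.77.1.4 ...

If pings to the additional address then start working from other nodes, that would point to filtered broadcast/multicast traffic as the cause.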
