Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Interface-specific binding and packet marking broken #485

Closed
flu0r1ne opened this issue Sep 14, 2023 · 13 comments
Closed

BUG: Interface-specific binding and packet marking broken #485

flu0r1ne opened this issue Sep 14, 2023 · 13 comments

Comments

@flu0r1ne
Copy link
Contributor

flu0r1ne commented Sep 14, 2023

I am reporting two distinct issues related to the 'mtr' network diagnostic tool. The first issue (Bug 1) arises when specifying a network interface using the '-I' flag; contrary to expectations, 'mtr' does not bind directly to the specified device but seems to consult the system's routing table instead. This results in packets being sent via an unintended interface, leading to diagnostic inaccuracies. The second issue (Bug 2) occurs when using the '-M' flag to set a firewall mark on packets; despite setting the mark, it appears that 'mtr' does not actually apply it, causing the packets to not follow the expected policy-based routing rules. These two issues hinder the tool's effectiveness in specialized network diagnostics.

To demonstrate these issues, I have created a small proof of concept using Linux network namespaces to demonstrate the bug. Conceptually, it simulates two boxes with two connected interfaces:

 ┌──────────────────────┐ 172.30.1.0              ┌──────────────────────┐
 │ MTR-TEST0    VETH01A ├─────────────────────────┤ VETH10A    MTR-TEST1 │
 │                      │               172.30.1.1│                      │
 │                      │                         │                      │
 │                      │ 172.30.2.0              │                      │
 │              VETH01B ├─────────────────────────┤ VETH10B              │
 │                      │               172.30.2.1│                      │
 │                      │                         │                      │
 │                      │                         │                      │
 │                      │                         │                      │
 │                      │                         │                      │
 └──────────────────────┘                         └──────────────────────┘

The following tests were conducted on Linux 6.1.51-1-lts using mtr version 0.95.

#############
### SETUP ###
#############

# Create two network namespaces to simulate a connection between mtr-test0 and mtr-test1
ip netns add mtr-test0
ip netns add mtr-test1

# Create the first pair of linked virtual ethernet interfaces (a)
ip link add veth-01a type veth peer name veth-10a
ip link set veth-01a netns mtr-test0
ip link set veth-10a netns mtr-test1

# Create the second pair of linked virtual ethernet interfaces (b)
ip link add veth-01b type veth peer name veth-10b
ip link set veth-01b netns mtr-test0
ip link set veth-10b netns mtr-test1

# Assign IP addresses to the interfaces in namespace mtr-test0
ip netns exec mtr-test0 /bin/bash -c '
	ip addr add 172.30.1.0/31 dev veth-01a
	ip addr add 172.30.2.0/31 dev veth-01b
	ip link set veth-01a up
	ip link set veth-01b up
'

# Assign IP addresses to the interfaces in namespace mtr-test1
ip netns exec mtr-test1 /bin/bash -c '
	ip addr add 172.30.1.1/31 dev veth-10a
	ip addr add 172.30.2.1/31 dev veth-10b
	ip link set veth-10a up
	ip link set veth-10b up

	# Implement policy routing with fwmark. This will be used
	# to demonstrate the second bug. This rule routes all traffic
	# with mark 0x99 through veth-10b

	ip rule add fwmark 0x99 table 99
	ip route add 172.30.1.0/31 dev veth-10b table 99
'

#############
## TESTING ##
#############

# Start tshark in namespace mtr-test0 on interface veth-01a
ip netns exec mtr-test0 tshark -i veth-01a

# Ping from mtr-test1 to mtr-test0, routed to veth-01a
#
# This demonstrates that packets can be routed to veth01a
# from mtr-test1 using the route on the link and that our
# tshark command is listening.
ip netns exec mtr-test1 ping 172.30.1.0

# Ping from mtr-test1 to mtr-test0 via veth-10b (interface specification)

# This demonstrates that if packets sent through veth-10b, they
# do not appear in the tshark packet stream. Note, mtr-test0 will actually
# respond since this is a local address.
ip netns exec mtr-test1 ping -I veth-10b 172.30.1.0 -c 2

# BUG 1: Execute mtr from mtr-test1 with interface specification
# As is shown by tshark, these packets are routed through veth-10a
ip netns exec mtr-test1 mtr -I veth-10b 172.30.1.0

# I conduct an strace on mtr to look for `SO_BINDTODEVICE`. From what I
# understand, this is the mechanism by which a socket can be bound to
# an interface, bypassing the routing table.
ip netns exec mtr-test1 strace -o trace mtr -I veth-10b 172.30.1.0
cat trace | grep SO_BINDTODEVICE # Yields no results
rm trace

# Ping from mtr-test1 to mtr-test0 using fwmark
# This demonstrates traffic with mark 0x99 are routed
# through veth-10b
ip netns exec mtr-test1 ping -m 153 172.30.1.0 -c 2

# BUG 2: Execute mtr with the specified fwmark
# Again, they are routed through veth-10a and not according to
# the mark
ip netns exec mtr-test1 mtr -M 153 172.30.1.0


##############
## TEARDOWN ##
##############

# Cleanup: Delete virtual interfaces and network namespaces
ip netns exec mtr-test0 /bin/bash -c '
	ip link delete veth-01a
	ip link delete veth-01b
'
ip netns del mtr-test0
ip netns del mtr-test1
@flu0r1ne
Copy link
Contributor Author

I also verified that this exists in HEAD.

@eaglegai
Copy link

eaglegai commented Sep 18, 2023

@flu0r1ne hello,could you test this by traceroute?
and about ‘Bug 1’,if 7a03200 is useful ?

@flu0r1ne
Copy link
Contributor Author

Interesting. From that commit, it seems like there are code paths which are not executing.

could you test this by traceroute?

I don't know what you're intending. The traceroute tool works as intended on my machine with -i device and --fwmark=MARK.

@eaglegai
Copy link

eaglegai commented Sep 19, 2023

about 'Bug 2', from the code, it seems only tcp and sctp support -M/--mark, could you try by --tcp ?

@eaglegai
Copy link

I don't know what you're intending. The traceroute tool works as intended on my machine with -i device and --fwmark=MARK.
OK

@flu0r1ne
Copy link
Contributor Author

about 'Bug 2', from the code, it seems only tcp and sctp support -M/--mark, could you try by --tcp ?

I don't think this makes sense. Marks are a meta attribute assigned by the Linux kernel and work with all packet types. Even if there was some limitation in the implementation, the correct behavior would be to kill the program with an error message after the argument parser.

@eaglegai
Copy link

Yes, I agree with you, but now mtr indeed only support it in tcp and sctp, hoping maintainer could pay attention to this.

@flu0r1ne
Copy link
Contributor Author

I'll consider into fixing these issues next weekend, assuming the fixes are reasonable. Before I spend time on this, I want to ensure there is an active maintainer who will work with me to merge any fixes.

@rewolff
Copy link
Collaborator

rewolff commented Sep 19, 2023

yes. I will merge "reasonable" looking fixes.
But I've been burned in the past with "fixes" that break a little something else and then interest drops and we end up with long standing bugs that I don't have time for to fix myself.
Someone submitted a patch recently that removed the byteswap-if-necessary for a length field. I even knew it was necessary for bsd-like OSes. And then the complaints start rolling in.
Burned by this in the past, Thinking it would be ok this time I accepted the patch. I shouldn't have. I've been reminded. So be warned: I'll be strict.....
Don't worry we can discuss if there is an issue with the patch.

@flu0r1ne
Copy link
Contributor Author

I had a chance to start examining this over the weekend, although I haven't yet compiled a patch. My preliminary analysis of Bug 2 has led me to identify two issues; however, these findings may change upon more rigorous testing.

The first issue is with mtr-packet, which drops CAP_NET_ADMIN during its initialization phase. This capability is necessary for setting packet marks (for example, using setsockopt with SO_MARK). A viable solution might be to remove CAP_NET_ADMIN from the effective set, but keep it in the permitted set. Then, it can be added back to the effective set when SO_MARK is set, and subsequently removed again.

Additionally, there may be a need for cap_free calls. This is because cap_get_proc might allocate memory, so cleaning up would be advisable.

mtr/packet/packet.c

Lines 60 to 71 in 74d312d

#ifdef HAVE_LIBCAP
cap = cap_get_proc();
if (cap == NULL) {
return -1;
}
if (cap_clear(cap)) {
return -1;
}
if (cap_set_proc(cap)) {
return -1;
}
#endif

The second issue pertains to the selection of the source address on Linux systems. The Routing Policy Database (RPDB) can influence which routing table gets selected, and consequently, which source address is chosen. Currently, the packet mark is not set during this selection process, leading to the use of an incorrect source address. This pattern occurs multiple times in mtr. I suggest that the best approach for Linux hosts would be to leave the source address unspecified. (That is, if the user doesn't explicitly specify it.) This would allow the kernel to route the packet appropriately, even if the rules change intermittently.

mtr/packet/probe.c

Lines 314 to 351 in 74d312d

int find_source_addr(
struct sockaddr_storage *srcaddr,
const struct sockaddr_storage *destaddr)
{
int sock;
int len;
struct sockaddr_storage dest_with_port;
#ifdef __linux__
// The Linux code needs these.
struct sockaddr_in *srcaddr4;
struct sockaddr_in6 *srcaddr6;
#endif
dest_with_port = *destaddr;
/*
MacOS requires a non-zero sin_port when used as an
address for a UDP connect. If we provide a zero port,
the connect will fail. We aren't actually sending
anything to the port.
*/
*sockaddr_port_offset(&dest_with_port) = htons(1);
len = sockaddr_size(&dest_with_port);
sock = socket(destaddr->ss_family, SOCK_DGRAM, IPPROTO_UDP);
if (sock == -1) {
probe_err = "open socket";
return -1;
}
if (connect(sock, (struct sockaddr *) &dest_with_port, len) == 0) {
if (getsockname(sock, (struct sockaddr *) srcaddr, &len)) {
close(sock);
probe_err = "getsockname";
return -1;
}
} else {

@matt-kimball
Copy link
Contributor

Good investigation. With regard to dropping CAP_NET_ADMIN, I wrote that code when mtr-packet was created as an analogue to dropping root UID when setuid is used by the installation. I didn't realize that would interfere with setting the routing mark. It may have been an overly cautious approach.

When you do finish writing a patch for these issues, please include detailed and explicit comments in the code on what these issues are and how your changes fix them. They aren't obvious.

@matt-kimball
Copy link
Contributor

With regards to setting the source address, I don't remember the full history of that, but it would be wise to test that change on as many operating systems as you can to verify it doesn't break anything for non-Linux usage. When mtr-packet was created, I tested on Linux, MacOS, FreeBSD, OpenBSD and Solaris. Each had their own special quirks.

@flu0r1ne
Copy link
Contributor Author

flu0r1ne commented Oct 4, 2023

I'm closing this issue since both of these problems are now fixed. Please see #491 for a follow up proposal.

@flu0r1ne flu0r1ne closed this as completed Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants