sporadic failure of mptcp_join.sh selftest 13 #112

dcaratti · 2020-11-18T13:35:52Z

While backporting the support for "echo bit" to make ADD_ADDR "reliable", I noticed that some of the kselftests that check for the value of MPTcpExtEchoAdd often fail: I'm attaching pcap captures of the same test (number 13 in the current list), repeated twice on the same kernel.
In mp_join-01-ns1-0-FZiNdH.pcap the test failed as follows:

Created /tmp/tmp.6ll1l3yxsi (size 2 KB) containing data sent by client
Created /tmp/tmp.zo99WX2idt (size 2 KB) containing data sent by server
Capturing traffic for test 1 into mp_join-01-ns1-0-FZiNdH.pcap
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 65535 bytes
287 packets captured
290 packets received by filter
0 packets dropped by kernel
01 remove single address                syn[ ok ] - synack[ ok ] - ack[ ok ]
                                        add[ ok ] - echo  [fail] got 0 ADD_ADDR echo[s] expected 1
Server ns stats
MPTcpExtMPCapableSYNRX          1                  0.0
MPTcpExtMPCapableACKRX          1                  0.0
MPTcpExtMPTCPRetrans            14                 0.0
MPTcpExtMPJoinSynRx             1                  0.0
MPTcpExtMPJoinAckRx             1                  0.0
MPTcpExtDuplicateData           16                 0.0
MPTcpExtRmSubflow               1                  0.0
Client ns stats
MPTcpExtMPTCPRetrans            16                 0.0
MPTcpExtMPJoinSynAckRx          1                  0.0
MPTcpExtDuplicateData           14                 0.0
MPTcpExtAddAddr                 1                  0.0
MPTcpExtRmAddr                  1                  0.0
                                        rm [ ok ] - sf    [ ok ]

while in the other trace, namely mp_join-01-ns1-0-M8kYtD.pcap, the test passed.
Comparing the 2 captures, it's clear that the "echoed" ADD_ADDR is received correctly only if the peer transmitted it in a "pure ack" when TCP is in the ESTABLISHED state. However, it might happen that TCP subflows have no chance to send a pure ACK until they receive a FIN packet, like it happens in mp_join-01-ns1-0-FZiNdH.pcap. In this case, all the MPTCP options in the packet are currently discarded (here) by Linux.

This patch proved to fix the tests:

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6254,6 +6265,9 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
                         */
                        inet_csk_reset_keepalive_timer(sk, tmo);
                } else {
+                       if (sk_is_mptcp(sk) && TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq)
+                               mptcp_incoming_options(sk, skb);
+
                        tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
                        goto discard;
                }

but I'm unsure if we are allowed to do this (e.g. parsing the MPTCP options after sending FIN, and skipping the test of URG bit).

The text was updated successfully, but these errors were encountered:

dcaratti · 2020-11-18T13:38:02Z

issue112.tar.gz

dcaratti · 2020-11-18T13:42:05Z

I noticed that this behavior becomes more evident on kernel where the TCP "ping pong threshold" is 1. e.g. they don't have this commit:

commit 4a41f453bedfd5e9cd040bad509d9da49feb3e2c
Author: Wei Wang <weiwan@google.com>
Date:   Fri Jan 25 10:53:20 2019 -0800

    tcp: change pingpong threshold to 3

the issue can also be fixed by temporarily exiting the "pingpong" mode, when the ADD_ADDR is received, e.g. with something like:

--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -911,6 +922,8 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
                if (!mp_opt.echo) {
                        mptcp_pm_add_addr_received(msk, &addr);
                        MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
+                      /* printk("silly-fix\n");*/
+                       inet_csk_exit_pingpong_mode(sk);
                } else {
                        mptcp_pm_del_add_timer(msk, &addr);
                        MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ECHOADD)

dcaratti · 2020-11-18T13:52:31Z

A third option might be to ignore the problem in real-life scenarios, and just fix the tests in a way that subflows are created with TCP_QUICKACK socket option for kselftests where the value of MPTcpExtEchoAdd is required to pass the test (but this would require at least implementation of the setsockopt() in the kernel)

dcaratti · 2021-02-11T14:07:11Z

I still see sporadic failures in the "flush", but I see them independently from whether I keep commit 4a41f45 reverted or not. Hence, the issue can be considered a dup of #139

Fix a warning from checkpatch.pl --strict: CHECK: Prefer kernel type 'u16' over 'uint16_t' multipath-tcp#112: FILE: server.c:112: + uint16_t command; Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>

The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-4-xukuohai@huawei.com

With latest upstream llvm18, the following test cases failed: $ ./test_progs -j #13/2 bpf_cookie/multi_kprobe_link_api:FAIL #13/3 bpf_cookie/multi_kprobe_attach_api:FAIL #13 bpf_cookie:FAIL #77 fentry_fexit:FAIL #78/1 fentry_test/fentry:FAIL #78 fentry_test:FAIL #82/1 fexit_test/fexit:FAIL #82 fexit_test:FAIL #112/1 kprobe_multi_test/skel_api:FAIL #112/2 kprobe_multi_test/link_api_addrs:FAIL [...] #112 kprobe_multi_test:FAIL #356/17 test_global_funcs/global_func17:FAIL #356 test_global_funcs:FAIL Further analysis shows llvm upstream patch [1] is responsible for the above failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c, without [1], the asm code is: 0000000000000400 <bpf_fentry_test7>: 400: f3 0f 1e fa endbr64 404: e8 00 00 00 00 callq 0x409 <bpf_fentry_test7+0x9> 409: 48 89 f8 movq %rdi, %rax 40c: c3 retq 40d: 0f 1f 00 nopl (%rax) ... and with [1], the asm code is: 0000000000005d20 <bpf_fentry_test7.specialized.1>: 5d20: e8 00 00 00 00 callq 0x5d25 <bpf_fentry_test7.specialized.1+0x5> 5d25: c3 retq ... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7> and this caused test failures for #13/#77 etc. except #356. For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog looks like: 0000000000000000 <global_func17>: 0: b4 00 00 00 2a 00 00 00 w0 = 0x2a 1: 95 00 00 00 00 00 00 00 exit ... which passed verification while the test itself expects a verification failure. Let us add 'barrier_var' style asm code in both places to prevent function specialization which caused selftests failure. [1] llvm/llvm-project#72903 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231127050342.1945270-1-yonghong.song@linux.dev

dcaratti added bug enhancement question labels Nov 18, 2020

matttbe removed enhancement question labels Nov 19, 2020

dcaratti mentioned this issue Jan 26, 2021

mptcp: ADD_ADDR are now sent sooner multipath-tcp/packetdrill#34

Merged

matttbe assigned dcaratti Jan 28, 2021

dcaratti closed this as completed Feb 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sporadic failure of mptcp_join.sh selftest 13 #112

sporadic failure of mptcp_join.sh selftest 13 #112

dcaratti commented Nov 18, 2020 •

edited

Loading

dcaratti commented Nov 18, 2020

dcaratti commented Nov 18, 2020 •

edited

Loading

dcaratti commented Nov 18, 2020 •

edited

Loading

dcaratti commented Feb 11, 2021

sporadic failure of mptcp_join.sh selftest 13 #112

sporadic failure of mptcp_join.sh selftest 13 #112

Comments

dcaratti commented Nov 18, 2020 • edited Loading

dcaratti commented Nov 18, 2020

dcaratti commented Nov 18, 2020 • edited Loading

dcaratti commented Nov 18, 2020 • edited Loading

dcaratti commented Feb 11, 2021

dcaratti commented Nov 18, 2020 •

edited

Loading

dcaratti commented Nov 18, 2020 •

edited

Loading

dcaratti commented Nov 18, 2020 •

edited

Loading