Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

./mptcp_connect.sh -m mmap test blocks: regression #124

Closed
matttbe opened this issue Dec 10, 2020 · 1 comment
Closed

./mptcp_connect.sh -m mmap test blocks: regression #124

matttbe opened this issue Dec 10, 2020 · 1 comment
Labels

Comments

@matttbe
Copy link
Member

matttbe commented Dec 10, 2020

With the latest export branch, I got a similar issue than #67 :

00:32:12.889 + ./mptcp_connect.sh -m mmap
00:32:13.203 [  297.723793] IPv6: ADDRCONF(NETDEV_CHANGE): ns1eth2: link becomes ready
00:32:13.324 [  297.845668] IPv6: ADDRCONF(NETDEV_CHANGE): ns2eth3: link becomes ready
00:32:13.449 [  297.970239] IPv6: ADDRCONF(NETDEV_CHANGE): ns3eth4: link becomes ready
00:32:13.501 # INFO: set ns3-5fd1f6c4-HuMyRo dev ns3eth2: ethtool -K  gso off
00:32:13.517 # INFO: set ns4-5fd1f6c4-HuMyRo dev ns4eth3: ethtool -K  gro off
00:32:13.588 # Created /tmp/tmp.KQFpLf0eh7 (size 6059036	/tmp/tmp.KQFpLf0eh7) containing data sent by client
00:32:13.608 # Created /tmp/tmp.j4b69lb8mi (size 90140	/tmp/tmp.j4b69lb8mi) containing data sent by server
00:32:13.703 # New MPTCP socket can be blocked via sysctl		[ OK ]
00:32:13.738 # setsockopt(..., TCP_ULP, "mptcp", ...) blocked	[ OK ]
00:32:13.750 # INFO: validating network environment with pings
00:32:16.736 # INFO: Using loss of 0.96% delay 38 ms reorder 91% 96% with delay 9ms on ns3eth4
00:32:16.749 # INFO: extra options:  -m mmap
00:32:17.044 # ns1 MPTCP -> ns1 (10.0.1.1:10000      ) MPTCP	(duration    75ms) [ OK ]
00:32:17.288 # ns1 MPTCP -> ns1 (10.0.1.1:10001      ) TCP  	(duration    29ms) [ OK ]
00:32:17.543 # ns1 TCP   -> ns1 (10.0.1.1:10002      ) MPTCP	(duration    28ms) [ OK ]
00:32:17.837 # ns1 MPTCP -> ns1 (dead:beef:1::1:10003) MPTCP	(duration    68ms) [ OK ]
00:32:18.070 # ns1 MPTCP -> ns1 (dead:beef:1::1:10004) TCP  	(duration    29ms) [ OK ]
00:32:18.310 # ns1 TCP   -> ns1 (dead:beef:1::1:10005) MPTCP	(duration    28ms) [ OK ]
00:32:18.565 # ns1 MPTCP -> ns2 (10.0.1.2:10006      ) MPTCP	(duration    34ms) [ OK ]
00:32:18.817 # ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP	(duration    29ms) [ OK ]
00:47:17.818 Timeout: sending Ctrl+C
00:47:17.819 # ns1 MPTCP -> ns2 (10.0.2.1:10008      ) MPTCP	^C/usr/lib/klibc/bin/poweroff

I still need to instrument my CI to output more info in case of timeout...

I can reproduce it with my CI, no always blocking with the same test, e.g.

00:07:17.032 + ./mptcp_connect.sh -m mmap
00:07:17.342 [  231.333351] IPv6: ADDRCONF(NETDEV_CHANGE): ns1eth2: link becomes ready
00:07:17.472 [  231.463572] IPv6: ADDRCONF(NETDEV_CHANGE): ns2eth3: link becomes ready
00:07:17.598 [  231.589181] IPv6: ADDRCONF(NETDEV_CHANGE): ns3eth4: link becomes ready
00:07:17.656 # INFO: set ns3-5fd20e18-8txmkT dev ns3eth2: ethtool -K  gro off
00:07:17.673 # INFO: set ns4-5fd20e18-8txmkT dev ns4eth3: ethtool -K  gro off
00:07:17.741 # Created /tmp/tmp.76tE338eOt (size 6571036	/tmp/tmp.76tE338eOt) containing data sent by client
00:07:17.807 # Created /tmp/tmp.zO7wzqhHqp (size 6808604	/tmp/tmp.zO7wzqhHqp) containing data sent by server
00:07:17.884 # New MPTCP socket can be blocked via sysctl		[ OK ]
00:07:17.918 # setsockopt(..., TCP_ULP, "mptcp", ...) blocked	[ OK ]
00:07:17.929 # INFO: validating network environment with pings
00:07:19.950 # INFO: Using loss of 0.67% delay 12 ms reorder 95% 92% with delay 3ms on ns3eth4
00:07:19.962 # INFO: extra options:  -m mmap
00:23:26.963 Timeout: sending Ctrl+C
00:23:26.964 # ns1 MPTCP -> ns1 (10.0.1.1:10000      ) MPTCP	^C/usr/lib/klibc/bin/poweroff

It looks like it has been introduced by:

  • bbaa110: mptcp: push pending frames when subflow has free space
@matttbe
Copy link
Member Author

matttbe commented Dec 11, 2020

Can be closed thanks to a fix from Paolo (thx!):

  • 02754ad: "squashed" in mptcp: push pending frames when subflow has free space

@matttbe matttbe closed this as completed Dec 11, 2020
jenkins-tessares pushed a commit that referenced this issue Nov 18, 2021
When SUSPEND_DISCONNECTING bit is set that means Disconnect is pending
but the code was evaluating if the list is empty before calling
hci_conn_del which does the actual cleanup and remove the connection
from the list thus the bit is never cleared causing the suspend
procedure to always timeout when there are connections to be
disconnected:

Suspend/Resume - Success 5 (Pairing - Legacy) - waiting done
  Set the system into Suspend via force_suspend
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.200458
= mgmt-tester: Set the system into Suspend via force_suspend    17:03:13.205812
< HCI Command: Write Scan E.. (0x03|0x001a) plen 1  #122 [hci0] 17:03:13.213561
        Scan enable: No Scans (0x00)
> HCI Event: Command Complete (0x0e) plen 4         #123 [hci0] 17:03:13.214710
      Write Scan Enable (0x03|0x001a) ncmd 1
        Status: Success (0x00)
< HCI Command: Disconnect (0x01|0x0006) plen 3      #124 [hci0] 17:03:13.215830
        Handle: 42
        Reason: Remote Device Terminated due to Power Off (0x15)
> HCI Event: Command Status (0x0f) plen 4           #125 [hci0] 17:03:13.216602
      Disconnect (0x01|0x0006) ncmd 1
        Status: Success (0x00)
> HCI Event: Disconnect Complete (0x05) plen 4      #126 [hci0] 17:03:13.217342
        Status: Success (0x00)
        Handle: 42
        Reason: Remote Device Terminated due to Power Off (0x15)
@ MGMT Event: Device Disconn.. (0x000c) plen 8  {0x0002} [hci0] 17:03:13.217688
        BR/EDR Address: 00:AA:01:01:00:00 (Intel Corporation)
        Reason: Connection terminated by local host for suspend (0x05)
@ MGMT Event: Device Disconn.. (0x000c) plen 8  {0x0001} [hci0] 17:03:13.217688
        BR/EDR Address: 00:AA:01:01:00:00 (Intel Corporation)
        Reason: Connection terminated by local host for suspend (0x05)
Suspend/Resume - Success 5 (Pairing - Legacy) - test timed out
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.939317
Suspend/Resume - Success 5 (Pairing - Legacy) - teardown
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.947267
[   13.284291] Bluetooth: hci0: Timed out waiting for suspend events
[   13.287324] Bluetooth: hci0: Suspend timeout bit: 6

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
jenkins-tessares pushed a commit that referenced this issue Jun 23, 2023
[BUG]
David reported an ASSERT() get triggered during fio load on 8 devices
with data/raid6 and metadata/raid1c3:

  fio --rw=randrw --randrepeat=1 --size=3000m \
	  --bsrange=512b-64k --bs_unaligned \
	  --ioengine=libaio --fsync=1024 \
	  --name=job0 --name=job1 \

The ASSERT() is from rbio_add_bio() of raid56.c:

	ASSERT(orig_logical >= full_stripe_start &&
	       orig_logical + orig_len <= full_stripe_start +
	       rbio->nr_data * BTRFS_STRIPE_LEN);

Which is checking if the target rbio is crossing the full stripe
boundary.

  [100.789] assertion failed: orig_logical >= full_stripe_start && orig_logical + orig_len <= full_stripe_start + rbio->nr_data * BTRFS_STRIPE_LEN, in fs/btrfs/raid56.c:1622
  [100.795] ------------[ cut here ]------------
  [100.796] kernel BUG at fs/btrfs/raid56.c:1622!
  [100.797] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  [100.798] CPU: 1 PID: 100 Comm: kworker/u8:4 Not tainted 6.4.0-rc6-default+ #124
  [100.799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
  [100.802] Workqueue: writeback wb_workfn (flush-btrfs-1)
  [100.803] RIP: 0010:rbio_add_bio+0x204/0x210 [btrfs]
  [100.806] RSP: 0018:ffff888104a8f300 EFLAGS: 00010246
  [100.808] RAX: 00000000000000a1 RBX: ffff8881075907e0 RCX: ffffed1020951e01
  [100.809] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000001
  [100.811] RBP: 0000000141d20000 R08: 0000000000000001 R09: ffff888104a8f04f
  [100.813] R10: ffffed1020951e09 R11: 0000000000000003 R12: ffff88810e87f400
  [100.815] R13: 0000000041d20000 R14: 0000000144529000 R15: ffff888101524000
  [100.817] FS:  0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
  [100.821] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [100.822] CR2: 000055d54e44c270 CR3: 000000010a9a1006 CR4: 00000000003706a0
  [100.824] Call Trace:
  [100.825]  <TASK>
  [100.825]  ? die+0x32/0x80
  [100.826]  ? do_trap+0x12d/0x160
  [100.827]  ? rbio_add_bio+0x204/0x210 [btrfs]
  [100.827]  ? rbio_add_bio+0x204/0x210 [btrfs]
  [100.829]  ? do_error_trap+0x90/0x130
  [100.830]  ? rbio_add_bio+0x204/0x210 [btrfs]
  [100.831]  ? handle_invalid_op+0x2c/0x30
  [100.833]  ? rbio_add_bio+0x204/0x210 [btrfs]
  [100.835]  ? exc_invalid_op+0x29/0x40
  [100.836]  ? asm_exc_invalid_op+0x16/0x20
  [100.837]  ? rbio_add_bio+0x204/0x210 [btrfs]
  [100.837]  raid56_parity_write+0x64/0x270 [btrfs]
  [100.838]  btrfs_submit_chunk+0x26e/0x800 [btrfs]
  [100.840]  ? btrfs_bio_init+0x80/0x80 [btrfs]
  [100.841]  ? release_pages+0x503/0x6d0
  [100.842]  ? folio_unlock+0x2f/0x60
  [100.844]  ? __folio_put+0x60/0x60
  [100.845]  ? btrfs_do_readpage+0xae0/0xae0 [btrfs]
  [100.847]  btrfs_submit_bio+0x21/0x60 [btrfs]
  [100.847]  submit_one_bio+0x6a/0xb0 [btrfs]
  [100.849]  extent_write_cache_pages+0x395/0x680 [btrfs]
  [100.850]  ? __extent_writepage+0x520/0x520 [btrfs]
  [100.851]  ? mark_usage+0x190/0x190
  [100.852]  extent_writepages+0xdb/0x130 [btrfs]
  [100.853]  ? extent_write_locked_range+0x480/0x480 [btrfs]
  [100.854]  ? mark_usage+0x190/0x190
  [100.854]  ? attach_extent_buffer_page+0x220/0x220 [btrfs]
  [100.855]  ? reacquire_held_locks+0x178/0x280
  [100.856]  ? writeback_sb_inodes+0x245/0x7f0
  [100.857]  do_writepages+0x102/0x2e0
  [100.858]  ? page_writeback_cpu_online+0x10/0x10
  [100.859]  ? __lock_release.isra.0+0x14a/0x4d0
  [100.860]  ? reacquire_held_locks+0x280/0x280
  [100.861]  ? __lock_acquired+0x1e9/0x3d0
  [100.862]  ? do_raw_spin_lock+0x1b0/0x1b0
  [100.863]  __writeback_single_inode+0x94/0x450
  [100.864]  writeback_sb_inodes+0x372/0x7f0
  [100.864]  ? lock_sync+0xd0/0xd0
  [100.865]  ? do_raw_spin_unlock+0x93/0xf0
  [100.866]  ? sync_inode_metadata+0xc0/0xc0
  [100.867]  ? rwsem_optimistic_spin+0x340/0x340
  [100.868]  __writeback_inodes_wb+0x70/0x130
  [100.869]  wb_writeback+0x2d1/0x530
  [100.869]  ? __writeback_inodes_wb+0x130/0x130
  [100.870]  ? lockdep_hardirqs_on_prepare.part.0+0xf1/0x1c0
  [100.870]  wb_do_writeback+0x3eb/0x480
  [100.871]  ? wb_writeback+0x530/0x530
  [100.871]  ? mark_lock_irq+0xcd0/0xcd0
  [100.872]  wb_workfn+0xe0/0x3f0<

[CAUSE]
Commit a97699d ("btrfs: replace map_lookup->stripe_len by
BTRFS_STRIPE_LEN") changes how we calculate the map length, to reduce
u64 division.

Function btrfs_max_io_len() is to get the length to the stripe boundary.

It calculates the full stripe start offset (inside the chunk) by the
following code:

		*full_stripe_start =
			rounddown(*stripe_nr, nr_data_stripes(map)) <<
			BTRFS_STRIPE_LEN_SHIFT;

The calculation itself is fine, but the value returned by rounddown() is
dependent on both @stripe_nr (which is u32) and nr_data_stripes() (which
returned int).

Thus the result is also u32, then we do the left shift, which can
overflow u32.

If such overflow happens, @full_stripe_start will be a value way smaller
than @offset, causing later "full_stripe_len - (offset -
*full_stripe_start)" to underflow, thus make later length calculation to
have no stripe boundary limit, resulting a write bio to exceed stripe
boundary.

There are some other locations like this, with a u32 @stripe_nr got left
shift, which can lead to a similar overflow.

[FIX]
Fix all @stripe_nr with left shift with a type cast to u64 before the
left shift.

Those involved @stripe_nr or similar variables are recording the stripe
number inside the chunk, which is small enough to be contained by u32,
but their offset inside the chunk can not fit into u32.

Thus for those specific left shifts, a type cast to u64 is necessary so
this patch does not touch them and the code will be cleaned up in the
future to keep the fix minimal.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: a97699d ("btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN")
Tested-by: David Sterba <dsterba@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
matttbe pushed a commit that referenced this issue Jan 26, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs")
for s390x, add support for 64-bit pointers to kfuncs for LoongArch.
Since the infrastructure is already implemented in BPF core, the only
thing need to be done is to override bpf_jit_supports_far_kfunc_call().

Before this change, several test_verifier tests failed:

  # ./test_verifier | grep # | grep FAIL
  #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL
  #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL
  #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL
  #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL
  #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL
  #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL
  #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL
  #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL
  #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL
  #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL
  #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #486/p map_kptr: ref: reference state created and released on xchg FAIL

This is because the kfuncs in the loaded module are far away from
__bpf_call_base:

  ffff800002009440 t bpf_kfunc_call_test_fail1    [bpf_testmod]
  9000000002e128d8 T __bpf_call_base

The offset relative to __bpf_call_base does NOT fit in s32, which breaks
the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts
this limit.

Note that to reproduce the above result, tools/testing/selftests/bpf/config
should be applied, and run the test with JIT enabled, unpriv BPF enabled.

With this change, the test_verifier tests now all passed:

  # ./test_verifier
  ...
  Summary: 777 PASSED, 0 SKIPPED, 0 FAILED

Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant