[syzkaller] Use-after-free in subflow_state_change #250

Closed
mjmartineau opened this issue Dec 13, 2021 · 2 comments
@mjmartineau
Member

Syzkaller reports a UAF in __subflow_state_change()

Export branch, tag export/20211209T054821

R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000015
R13: 00007ffe3c01ee6f R14: 00007ffe3c01f010 R15: 00007f2007d61dc0
 </TASK>
==================================================================
BUG: KASAN: use-after-free in list_empty include/linux/list.h:284 [inline]
BUG: KASAN: use-after-free in waitqueue_active include/linux/wait.h:129 [inline]
BUG: KASAN: use-after-free in wq_has_sleeper include/linux/wait.h:163 [inline]
BUG: KASAN: use-after-free in skwq_has_sleeper include/net/sock.h:2290 [inline]
BUG: KASAN: use-after-free in __subflow_state_change net/mptcp/subflow.c:1600 [inline]
BUG: KASAN: use-after-free in subflow_state_change+0x8e2/0x970 net/mptcp/subflow.c:1615
Read of size 8 at addr ffff88800bf211c0 by task ksoftirqd/1/17

CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 5.16.0-rc3+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106
 print_address_description.constprop.0+0x21/0x140 mm/kasan/report.c:247
 __kasan_report mm/kasan/report.c:433 [inline]
 kasan_report.cold+0x7f/0x11b mm/kasan/report.c:450
 list_empty include/linux/list.h:284 [inline]
 waitqueue_active include/linux/wait.h:129 [inline]
 wq_has_sleeper include/linux/wait.h:163 [inline]
 skwq_has_sleeper include/net/sock.h:2290 [inline]
 __subflow_state_change net/mptcp/subflow.c:1600 [inline]
 subflow_state_change+0x8e2/0x970 net/mptcp/subflow.c:1615
 tcp_done net/ipv4/tcp.c:4450 [inline]
 tcp_done+0x1f7/0x380 net/ipv4/tcp.c:4429
 tcp_reset+0x155/0x460 net/ipv4/tcp_input.c:4312
 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6156 [inline]
 tcp_rcv_state_process+0x3028/0x52f0 net/ipv4/tcp_input.c:6420
 tcp_v4_do_rcv+0x38f/0xc00 net/ipv4/tcp_ipv4.c:1738
 tcp_v4_rcv+0x3594/0x44d0 net/ipv4/tcp_ipv4.c:2110
 ip_protocol_deliver_rcu+0xba/0xf30 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x207/0x370 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x1c5/0x4e0 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:461 [inline]
 ip_rcv_finish+0x236/0x4d0 net/ipv4/ip_input.c:429
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_rcv+0xcd/0x3b0 net/ipv4/ip_input.c:540
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5346
 __netif_receive_skb+0x24/0x1c0 net/core/dev.c:5460
 process_backlog+0x21a/0x7d0 net/core/dev.c:5792
 __napi_poll+0xb6/0x690 net/core/dev.c:6360
 napi_poll net/core/dev.c:6427 [inline]
 net_rx_action+0x82b/0xb60 net/core/dev.c:6514
 __do_softirq+0x1c4/0x83e kernel/softirq.c:558
 run_ksoftirqd kernel/softirq.c:920 [inline]
 run_ksoftirqd+0x2d/0x60 kernel/softirq.c:912
 smpboot_thread_fn+0x66c/0xa00 kernel/smpboot.c:164
 kthread+0x409/0x500 kernel/kthread.c:327
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>

Allocated by task 8089:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:434 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:467
 kasan_slab_alloc include/linux/kasan.h:259 [inline]
 slab_post_alloc_hook mm/slab.h:519 [inline]
 slab_alloc_node mm/slub.c:3234 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 kmem_cache_alloc+0x1c1/0x7c0 mm/slub.c:3247
 sock_alloc_inode+0x18/0x1c0 net/socket.c:303
 alloc_inode+0x61/0x1e0 fs/inode.c:235
 new_inode_pseudo+0x14/0xe0 fs/inode.c:944
 sock_alloc+0x3c/0x260 net/socket.c:626
 __sock_create+0xb9/0x750 net/socket.c:1428
 sock_create net/socket.c:1515 [inline]
 __sys_socket+0xef/0x200 net/socket.c:1557
 __do_sys_socket net/socket.c:1566 [inline]
 __se_sys_socket net/socket.c:1564 [inline]
 __x64_sys_socket+0x6f/0xb0 net/socket.c:1564
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 409:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free mm/kasan/common.c:328 [inline]
 __kasan_slab_free+0xea/0x120 mm/kasan/common.c:374
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:1723 [inline]
 slab_free_freelist_hook mm/slub.c:1749 [inline]
 slab_free mm/slub.c:3513 [inline]
 kmem_cache_free+0xb8/0x570 mm/slub.c:3530
 i_callback+0x3f/0x70 fs/inode.c:224
 rcu_do_batch kernel/rcu/tree.c:2506 [inline]
 rcu_core+0x81c/0x2090 kernel/rcu/tree.c:2741
 __do_softirq+0x1c4/0x83e kernel/softirq.c:558

Last potentially related work creation:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 __kasan_record_aux_stack+0xaf/0xc0 mm/kasan/generic.c:348
 __call_rcu kernel/rcu/tree.c:2985 [inline]
 call_rcu+0x82/0xb80 kernel/rcu/tree.c:3065
 destroy_inode+0x12b/0x1b0 fs/inode.c:290
 iput_final fs/inode.c:1670 [inline]
 iput fs/inode.c:1696 [inline]
 iput+0x491/0x840 fs/inode.c:1682
 dentry_unlink_inode+0x2e5/0x4a0 fs/dcache.c:376
 __dentry_kill+0x356/0x5c0 fs/dcache.c:582
 dentry_kill fs/dcache.c:708 [inline]
 dput+0x75c/0xc20 fs/dcache.c:888
 __fput+0x3a6/0x9e0 fs/file_table.c:293
 task_work_run+0xe2/0x1a0 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x1d9/0x1e0 kernel/entry/common.c:207
 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:300
 do_syscall_64+0x48/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Second to last potentially related work creation:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 __kasan_record_aux_stack+0xaf/0xc0 mm/kasan/generic.c:348
 __call_rcu kernel/rcu/tree.c:2985 [inline]
 call_rcu+0x82/0xb80 kernel/rcu/tree.c:3065
 destroy_inode+0x12b/0x1b0 fs/inode.c:290
 iput_final fs/inode.c:1670 [inline]
 iput fs/inode.c:1696 [inline]
 iput+0x491/0x840 fs/inode.c:1682
 dentry_unlink_inode+0x2e5/0x4a0 fs/dcache.c:376
 __dentry_kill+0x356/0x5c0 fs/dcache.c:582
 dentry_kill fs/dcache.c:708 [inline]
 dput+0x75c/0xc20 fs/dcache.c:888
 __fput+0x3a6/0x9e0 fs/file_table.c:293
 task_work_run+0xe2/0x1a0 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x1d9/0x1e0 kernel/entry/common.c:207
 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:300
 do_syscall_64+0x48/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88800bf21140
 which belongs to the cache sock_inode_cache of size 1344
The buggy address is located 128 bytes inside of
 1344-byte region [ffff88800bf21140, ffff88800bf21680)
The buggy address belongs to the page:
page:0000000038b47ce4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbf20
head:0000000038b47ce4 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head|node=0|zone=1)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff888100af4280
raw: 0000000000000000 0000000000160016 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800bf21080: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff88800bf21100: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
>ffff88800bf21180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff88800bf21200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800bf21280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

config: syz.config.gz
full syzkaller log: log0.gz

matttbe pushed a commit that referenced this issue Dec 17, 2021
The self-tests in a loop triggered a UaF similar to:

#250

The critical scenario is actually almost fixed by:

"mptcp: cleanup MPJ subflow list handling"

with a notable exception: if an MPJ handshake races with
mptcp_close(), the subflow enters the join_list, and
__mptcp_finish_join() is processed at the msk socket lock release in
mptcp_close(), then the subflow will preserve a dangling reference to
the msk sk_socket.

Address the issue by grafting the subflow only on successful
__mptcp_finish_join().

Note that issues/250 triggers even before
"mptcp: cleanup MPJ subflow list handling", as before such commit the join
list was not spliced by mptcp_close(). We could consider a net-only patch to
address that.
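
The ordering change described above can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: the `toy_*` structs and functions are hypothetical stand-ins, and the rule that the join fails once the msk is closing is an assumption made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real structs. */
struct toy_socket { int unused; };

struct toy_msk {
    struct toy_socket *sk_socket; /* owned by the file, freed on close */
    bool closing;                 /* set once close() has started */
};

struct toy_subflow {
    struct toy_socket *sk_socket; /* borrowed pointer to the msk's socket */
};

/* Models __mptcp_finish_join(); assumed to fail once the msk is closing. */
static bool toy_finish_join(const struct toy_msk *msk)
{
    return !msk->closing;
}

/* Racy ordering: graft first, then try to join. On failure the subflow
 * keeps a pointer to a socket that close() is about to free. */
static bool toy_join_graft_first(struct toy_msk *msk, struct toy_subflow *ssk)
{
    ssk->sk_socket = msk->sk_socket;
    return toy_finish_join(msk);
}

/* Ordering described in the commit message: graft only on success. */
static bool toy_join_graft_on_success(struct toy_msk *msk,
                                      struct toy_subflow *ssk)
{
    if (!toy_finish_join(msk))
        return false;
    ssk->sk_socket = msk->sk_socket;
    return true;
}
```

With a closing msk, the first variant leaves `ssk->sk_socket` set even though the join failed, which models the dangling reference; the second leaves it NULL.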

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
jenkins-tessares pushed a commit that referenced this issue Dec 31, 2021
Hulk robot reported a kmemleak problem:

    unreferenced object 0xffff93d1d8cc02e8 (size 248):
      comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
      hex dump (first 32 bytes):
        00 40 85 19 d4 93 ff ff 00 10 00 00 00 00 00 00  .@..............
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
         seq_open+0x2a/0x80
         full_proxy_open+0x167/0x1e0
         do_dentry_open+0x1e1/0x3a0
         path_openat+0x961/0xa20
         do_filp_open+0xae/0x120
         do_sys_openat2+0x216/0x2f0
         do_sys_open+0x57/0x80
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
    unreferenced object 0xffff93d419854000 (size 4096):
      comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
      hex dump (first 32 bytes):
        6b 66 65 6e 63 65 2d 23 32 35 30 3a 20 30 78 30  kfence-#250: 0x0
        30 30 30 30 30 30 30 37 35 34 62 64 61 31 32 2d  0000000754bda12-
      backtrace:
         seq_read_iter+0x313/0x440
         seq_read+0x14b/0x1a0
         full_proxy_read+0x56/0x80
         vfs_read+0xa5/0x1b0
         ksys_read+0xa0/0xf0
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x44/0xa9

I find that we can easily reproduce this problem with the following
commands:

	cat /sys/kernel/debug/kfence/objects
	echo scan > /sys/kernel/debug/kmemleak
	cat /sys/kernel/debug/kmemleak

The leaked memory is allocated in the stack below:

    do_syscall_64
      do_sys_open
        do_dentry_open
          full_proxy_open
            seq_open            ---> alloc seq_file
      vfs_read
        full_proxy_read
          seq_read
            seq_read_iter
              traverse          ---> alloc seq_buf

And it should have been released in the following process:

    do_syscall_64
      syscall_exit_to_user_mode
        exit_to_user_mode_prepare
          task_work_run
            ____fput
              __fput
                full_proxy_release  ---> free here

However, the release function corresponding to file_operations is not
implemented in kfence.  As a result, a memory leak occurs.  Therefore,
the solution to this problem is to implement the corresponding release
function.
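
The open/release pairing behind this leak can be modelled in userspace. The sketch below is an assumption-laden miniature: `toy_fops`, `toy_file`, and the `toy_*` functions are hypothetical and only mimic the shape of a file_operations table; which kernel helper the actual patch wires in is not stated here.

```c
#include <stdlib.h>

struct toy_file;

/* Hypothetical miniature of a file_operations table; not the kernel struct. */
struct toy_fops {
    int  (*open)(struct toy_file *f);
    void (*release)(struct toy_file *f); /* may be NULL, as in the leak */
};

struct toy_file {
    const struct toy_fops *fops;
    void *private_data; /* stands in for the seq_file allocation */
};

static int live_allocations; /* allocations made by open and not yet freed */

static int toy_seq_open(struct toy_file *f)
{
    f->private_data = malloc(64);
    if (!f->private_data)
        return -1;
    live_allocations++;
    return 0;
}

static void toy_seq_release(struct toy_file *f)
{
    free(f->private_data);
    f->private_data = NULL;
    live_allocations--;
}

/* Models the final fput(): release runs only if the table provides one. */
static void toy_close(struct toy_file *f)
{
    if (f->fops->release)
        f->fops->release(f);
}

/* Table without a release hook: whatever open allocated is never freed. */
static const struct toy_fops leaky_fops = {
    .open = toy_seq_open,
    .release = NULL,
};

/* Table with the matching release hook: close frees the allocation. */
static const struct toy_fops fixed_fops = {
    .open = toy_seq_open,
    .release = toy_seq_release,
};
```

Opening and closing a file through `leaky_fops` leaves `live_allocations` at 1, while the same sequence through `fixed_fops` returns it to 0.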

Link: https://lkml.kernel.org/r/20211206133628.2822545-1-libaokun1@huawei.com
Fixes: 0ce20dd ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
@matttbe
Member

matttbe commented Jan 5, 2022

Please note that in our tree we have a fix for the "export branch":

But we still have an issue in -net and we still need a dedicated patch for that:

Note that issues/250 triggers even before
"mptcp: cleanup MPJ subflow list handling", as before such commit the join
list was not spliced by mptcp_close(). We could consider a net-only patch to
address that.

I didn't close issues/250 for this reason.

Assigning @pabeni on this one because I thought you were looking at a patch for this one but if not, please tell me :-)

@pabeni

pabeni commented Jan 21, 2022

As per the public meeting discussion, this is probably not worth a stable-only patch.

@pabeni pabeni closed this as completed Jan 21, 2022
jenkins-tessares pushed a commit that referenced this issue Jul 20, 2023
Add a big batch of test coverage to assert all aspects of the tcx opts
attach, detach and query API:

  # ./vmtest.sh -- ./test_progs -t tc_opts
  [...]
  #238     tc_opts_after:OK
  #239     tc_opts_append:OK
  #240     tc_opts_basic:OK
  #241     tc_opts_before:OK
  #242     tc_opts_chain_classic:OK
  #243     tc_opts_demixed:OK
  #244     tc_opts_detach:OK
  #245     tc_opts_detach_after:OK
  #246     tc_opts_detach_before:OK
  #247     tc_opts_dev_cleanup:OK
  #248     tc_opts_invalid:OK
  #249     tc_opts_mixed:OK
  #250     tc_opts_prepend:OK
  #251     tc_opts_replace:OK
  #252     tc_opts_revision:OK
  Summary: 15/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20230719140858.13224-8-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
jenkins-tessares pushed a commit that referenced this issue Aug 11, 2023
Add a detachment test case with miniq present to assert that with and
without the miniq we get the same error.

  # ./test_progs -t tc_opts
  #244     tc_opts_after:OK
  #245     tc_opts_append:OK
  #246     tc_opts_basic:OK
  #247     tc_opts_before:OK
  #248     tc_opts_chain_classic:OK
  #249     tc_opts_delete_empty:OK
  #250     tc_opts_demixed:OK
  #251     tc_opts_detach:OK
  #252     tc_opts_detach_after:OK
  #253     tc_opts_detach_before:OK
  #254     tc_opts_dev_cleanup:OK
  #255     tc_opts_invalid:OK
  #256     tc_opts_mixed:OK
  #257     tc_opts_prepend:OK
  #258     tc_opts_replace:OK
  #259     tc_opts_revision:OK
  Summary: 16/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20230804131112.11012-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
matttbe pushed a commit that referenced this issue Aug 17, 2023
Add several new tcx test cases to improve test coverage. This also includes
a few new tests with ingress instead of clsact qdisc, to cover the fix from
commit dc644b5 ("tcx: Fix splat in ingress_destroy upon tcx_entry_free").

  # ./test_progs -t tc
  [...]
  #234     tc_links_after:OK
  #235     tc_links_append:OK
  #236     tc_links_basic:OK
  #237     tc_links_before:OK
  #238     tc_links_chain_classic:OK
  #239     tc_links_chain_mixed:OK
  #240     tc_links_dev_cleanup:OK
  #241     tc_links_dev_mixed:OK
  #242     tc_links_ingress:OK
  #243     tc_links_invalid:OK
  #244     tc_links_prepend:OK
  #245     tc_links_replace:OK
  #246     tc_links_revision:OK
  #247     tc_opts_after:OK
  #248     tc_opts_append:OK
  #249     tc_opts_basic:OK
  #250     tc_opts_before:OK
  #251     tc_opts_chain_classic:OK
  #252     tc_opts_chain_mixed:OK
  #253     tc_opts_delete_empty:OK
  #254     tc_opts_demixed:OK
  #255     tc_opts_detach:OK
  #256     tc_opts_detach_after:OK
  #257     tc_opts_detach_before:OK
  #258     tc_opts_dev_cleanup:OK
  #259     tc_opts_invalid:OK
  #260     tc_opts_mixed:OK
  #261     tc_opts_prepend:OK
  #262     tc_opts_replace:OK
  #263     tc_opts_revision:OK
  [...]
  Summary: 44/38 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/8699efc284b75ccdc51ddf7062fa2370330dc6c0.1692029283.git.daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
matttbe pushed a commit that referenced this issue Jul 5, 2024
Only export struct fb_info.fix.smem_start if that is required by the
user and the memory does not come from vmalloc().

Setting struct fb_info.fix.smem_start breaks systems where DMA
memory is backed by vmalloc address space. An example error is
shown below.

[    3.536043] ------------[ cut here ]------------
[    3.540716] virt_to_phys used for non-linear address: 000000007fc4f540 (0xffff800086001000)
[    3.552628] WARNING: CPU: 4 PID: 61 at arch/arm64/mm/physaddr.c:12 __virt_to_phys+0x68/0x98
[    3.565455] Modules linked in:
[    3.568525] CPU: 4 PID: 61 Comm: kworker/u12:5 Not tainted 6.6.23-06226-g4986cc3e1b75-dirty #250
[    3.577310] Hardware name: NXP i.MX95 19X19 board (DT)
[    3.582452] Workqueue: events_unbound deferred_probe_work_func
[    3.588291] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    3.595233] pc : __virt_to_phys+0x68/0x98
[    3.599246] lr : __virt_to_phys+0x68/0x98
[    3.603276] sp : ffff800083603990
[    3.677939] Call trace:
[    3.680393]  __virt_to_phys+0x68/0x98
[    3.684067]  drm_fbdev_dma_helper_fb_probe+0x138/0x238
[    3.689214]  __drm_fb_helper_initial_config_and_unlock+0x2b0/0x4c0
[    3.695385]  drm_fb_helper_initial_config+0x4c/0x68
[    3.700264]  drm_fbdev_dma_client_hotplug+0x8c/0xe0
[    3.705161]  drm_client_register+0x60/0xb0
[    3.709269]  drm_fbdev_dma_setup+0x94/0x148

Additionally, DMA memory is assumed to be contiguous in physical
address space, which is not guaranteed by vmalloc().

Resolve this by checking the module flag drm_leak_fbdev_smem when
DRM allocated the instance of struct fb_info. Fbdev-dma then sets
smem_start only if required (via FBINFO_HIDE_SMEM_START). Also
guarantee that the framebuffer is not located in vmalloc address
space.
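
The resulting condition can be written as a small predicate. This is a hedged sketch, not the driver code: `toy_fb_state` and `toy_should_export_smem_start()` are hypothetical names, and the two booleans only stand in for the module flag and the vmalloc check mentioned above.

```c
#include <stdbool.h>

/* Hypothetical inputs; the real code derives these from DRM/fbdev state. */
struct toy_fb_state {
    bool user_wants_smem_start; /* models drm_leak_fbdev_smem being set */
    bool backed_by_vmalloc;     /* models a buffer in vmalloc address space */
};

/* True only when exporting a physical start address is both requested
 * and meaningful, i.e. the buffer is not in vmalloc address space. */
static bool toy_should_export_smem_start(const struct toy_fb_state *s)
{
    return s->user_wants_smem_start && !s->backed_by_vmalloc;
}
```

Under this model a vmalloc-backed buffer never exports smem_start, regardless of the user setting.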

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Peng Fan (OSS) <peng.fan@oss.nxp.com>
Closes: https://lore.kernel.org/dri-devel/20240604080328.4024838-1-peng.fan@oss.nxp.com/
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Closes: https://lore.kernel.org/dri-devel/CAMuHMdX3N0szUvt1VTbroa2zrT1Nye_VzPb5qqCZ7z5gSm7HGw@mail.gmail.com/
Fixes: a51c766 ("drm/fb-helper: Consolidate CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # v6.4+
Link: https://patchwork.freedesktop.org/patch/msgid/20240617152843.11886-1-tzimmermann@suse.de