-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding title in README.md file #10
Open
kshithijiyer
wants to merge
1
commit into
intel:master
Choose a base branch
from
kshithijiyer:patch-1
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sys-oak
pushed a commit
that referenced
this pull request
Jan 13, 2020
[ Upstream commit 6ee4051 ] Fail to allocate memory for tgid_map, because it requires order-6 page. detail as: c3 sh: page allocation failure: order:6, mode:0x140c0c0(GFP_KERNEL), nodemask=(null) c3 sh cpuset=/ mems_allowed=0 c3 CPU: 3 PID: 5632 Comm: sh Tainted: G W O 4.14.133+ #10 c3 Hardware name: Generic DT based system c3 Backtrace: c3 [<c010bdbc>] (dump_backtrace) from [<c010c08c>](show_stack+0x18/0x1c) c3 [<c010c074>] (show_stack) from [<c0993c54>](dump_stack+0x84/0xa4) c3 [<c0993bd0>] (dump_stack) from [<c0229858>](warn_alloc+0xc4/0x19c) c3 [<c0229798>] (warn_alloc) from [<c022a6e4>](__alloc_pages_nodemask+0xd18/0xf28) c3 [<c02299cc>] (__alloc_pages_nodemask) from [<c0248344>](kmalloc_order+0x20/0x38) c3 [<c0248324>] (kmalloc_order) from [<c0248380>](kmalloc_order_trace+0x24/0x108) c3 [<c024835c>] (kmalloc_order_trace) from [<c01e6078>](set_tracer_flag+0xb0/0x158) c3 [<c01e5fc8>] (set_tracer_flag) from [<c01e6404>](trace_options_core_write+0x7c/0xcc) c3 [<c01e6388>] (trace_options_core_write) from [<c0278b1c>](__vfs_write+0x40/0x14c) c3 [<c0278adc>] (__vfs_write) from [<c0278e10>](vfs_write+0xc4/0x198) c3 [<c0278d4c>] (vfs_write) from [<c027906c>](SyS_write+0x6c/0xd0) c3 [<c0279000>] (SyS_write) from [<c01079a0>](ret_fast_syscall+0x0/0x54) Switch to use kvcalloc to avoid unexpected allocation failures. Link: http://lkml.kernel.org/r/1571888070-24425-1-git-send-email-chunyan.zhang@unisoc.com Signed-off-by: Yuming Han <yuming.han@unisoc.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jan 13, 2020
[ Upstream commit 32546a9 ] This patch moves the final part of the cifsFileInfo_put() logic where we need a write lock on lock_sem to be processed in a separate thread that holds no other locks. This is to prevent deadlocks like the one below: > there are 6 processes looping to while trying to down_write > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from > cifs_new_fileinfo > > and there are 5 other processes which are blocked, several of them > waiting on either PG_writeback or PG_locked (which are both set), all > for the same page of the file > > 2 inode_lock() (inode->i_rwsem) for the file > 1 wait_on_page_writeback() for the page > 1 down_read(inode->i_rwsem) for the inode of the directory > 1 inode_lock()(inode->i_rwsem) for the inode of the directory > 1 __lock_page > > > so processes are blocked waiting on: > page flags PG_locked and PG_writeback for one specific page > inode->i_rwsem for the directory > inode->i_rwsem for the file > cifsInodeInflock_sem > > > > here are the more gory details (let me know if I need to provide > anything more/better): > > [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df > #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d > #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d > #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33 > #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6 > #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315 > * (I think legitimize_path is bogus) > > in path_openat > } else { > const char *s = path_init(nd, flags); > while (!(error = link_path_walk(s, nd)) && > (error = do_last(nd, file, op)) > 0) { <<<< > > do_last: > if (open_flag & O_CREAT) > inode_lock(dir->d_inode); <<<< > else > so it's trying to take inode->i_rwsem for the directory > > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/ > inode.i_rwsem is ffff8c691158efc0 > > <struct rw_semaphore 0xffff8c691158efc0>: > owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 - > reopen_file), counter: 0x0000000000000003 > waitlist: 2 > 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926 > RWSEM_WAITING_FOR_WRITE > 0xffff996500393e00 9802 ls UN 0 1:17:26.700 > RWSEM_WAITING_FOR_READ > > > the owner of the inode.i_rwsem of the directory is: > > [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff99650065b828] __schedule at ffffffff9b6e6095 > #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df > #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff99650065b940] msleep at ffffffff9af573a9 > #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs] > #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs] > #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd > #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs] > #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs] > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs] > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7 > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33 > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6 > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315 > > cifs_launder_page is for page 0xffffd1e2c07d2480 > > crash> page.index,mapping,flags 0xffffd1e2c07d2480 > index = 0x8 > mapping = 0xffff8c68f3cd0db0 > flags = 0xfffffc0008095 > > PAGE-FLAG BIT VALUE > PG_locked 0 0000001 > PG_uptodate 2 0000004 > PG_lru 4 0000010 > PG_waiters 7 0000080 > PG_writeback 15 0008000 > > > inode is ffff8c68f3cd0c40 > inode.i_rwsem is ffff8c68f3cd0ce0 > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG > /mnt/vm1_smb/testfile.8853 > > > this process holds the inode->i_rwsem for the parent directory, is > laundering a page attached to the inode of the file it's opening, and in > _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem > for the file itself. > > > <struct rw_semaphore 0xffff8c68f3cd0ce0>: > owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 - > reopen_file), counter: 0x0000000000000003 > waitlist: 1 > 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912 > RWSEM_WAITING_FOR_WRITE > > this is the inode.i_rwsem for the file > > the owner: > > [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095 > #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df > #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2 > #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f > #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf > #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c > #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs] > #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4 > #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a > #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs] > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35 > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9 > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315 > > the process holds the inode->i_rwsem for the file to which it's writing, > and is trying to __lock_page for the same page as in the other processes > > > the other tasks: > [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df > #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007b3af0] msleep at ffffffff9af573a9 > #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs] > #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs] > #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a > #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f > #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33 > #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6 > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315 > > this is opening the file, and is trying to down_write cinode->lock_sem > > > [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3 > COMMAND: "reopen_file" > [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6 > COMMAND: "reopen_file" > #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095 > #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df > #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007c3d90] msleep at ffffffff9af573a9 > #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs] > #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e > #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614 > #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f > #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c > > closing the file, and trying to down_write cifsi->lock_sem > > > [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095 > #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df > #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2 > #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6 > #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028 > #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165 > #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs] > #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1 > #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e > #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315 > > in __filemap_fdatawait_range > wait_on_page_writeback(page); > for the same page of the file > > > > [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095 > #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df > #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs] > #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd > #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35 > #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9 > #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315 > > inode_lock(inode); > > > and one 'ls' later on, to see whether the rest of the mount is available > (the test file is in the root, so we get blocked up on the directory > ->i_rwsem), so the entire mount is unavailable > > [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4 > COMMAND: "ls" > #0 [ffff996500393d28] __schedule at ffffffff9b6e6095 > #1 [ffff996500393db8] schedule at ffffffff9b6e64df > #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421 > #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2 > #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56 > #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c > #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6 > #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315 > > in iterate_dir: > if (shared) > res = down_read_killable(&inode->i_rwsem); <<<< > else > res = down_write_killable(&inode->i_rwsem); > Reported-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jan 17, 2020
[ Upstream commit 6ee4051 ] Fail to allocate memory for tgid_map, because it requires order-6 page. detail as: c3 sh: page allocation failure: order:6, mode:0x140c0c0(GFP_KERNEL), nodemask=(null) c3 sh cpuset=/ mems_allowed=0 c3 CPU: 3 PID: 5632 Comm: sh Tainted: G W O 4.14.133+ #10 c3 Hardware name: Generic DT based system c3 Backtrace: c3 [<c010bdbc>] (dump_backtrace) from [<c010c08c>](show_stack+0x18/0x1c) c3 [<c010c074>] (show_stack) from [<c0993c54>](dump_stack+0x84/0xa4) c3 [<c0993bd0>] (dump_stack) from [<c0229858>](warn_alloc+0xc4/0x19c) c3 [<c0229798>] (warn_alloc) from [<c022a6e4>](__alloc_pages_nodemask+0xd18/0xf28) c3 [<c02299cc>] (__alloc_pages_nodemask) from [<c0248344>](kmalloc_order+0x20/0x38) c3 [<c0248324>] (kmalloc_order) from [<c0248380>](kmalloc_order_trace+0x24/0x108) c3 [<c024835c>] (kmalloc_order_trace) from [<c01e6078>](set_tracer_flag+0xb0/0x158) c3 [<c01e5fc8>] (set_tracer_flag) from [<c01e6404>](trace_options_core_write+0x7c/0xcc) c3 [<c01e6388>] (trace_options_core_write) from [<c0278b1c>](__vfs_write+0x40/0x14c) c3 [<c0278adc>] (__vfs_write) from [<c0278e10>](vfs_write+0xc4/0x198) c3 [<c0278d4c>] (vfs_write) from [<c027906c>](SyS_write+0x6c/0xd0) c3 [<c0279000>] (SyS_write) from [<c01079a0>](ret_fast_syscall+0x0/0x54) Switch to use kvcalloc to avoid unexpected allocation failures. Link: http://lkml.kernel.org/r/1571888070-24425-1-git-send-email-chunyan.zhang@unisoc.com Signed-off-by: Yuming Han <yuming.han@unisoc.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jan 23, 2020
Luca Abeni reported this: | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003 | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10 | Call Trace: | rt_spin_lock+0x3f/0x50 | gen6_read32+0x45/0x1d0 [i915] | g4x_get_vblank_counter+0x36/0x40 [i915] | trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915] The tracing events use trace_i915_pipe_update_start() among other events use functions acquire spin locks. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter() wich also might acquire a sleeping lock. Based on this I don't see any other way than disable trace points on RT. Cc: stable-rt@vger.kernel.org Reported-by: Luca Abeni <lucabe72@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
sys-oak
pushed a commit
that referenced
this pull request
Mar 4, 2020
[ Upstream commit c194839 ] free_match_list could be called when the flow table is already locked. We need to pass this notation to tree_put_node. It fixes the following lockdep warnning: [ 1797.268537] ============================================ [ 1797.276837] WARNING: possible recursive locking detected [ 1797.285101] 5.5.0-rc5+ #10 Not tainted [ 1797.291641] -------------------------------------------- [ 1797.299917] handler10/9296 is trying to acquire lock: [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at: tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.319694] [ 1797.319694] but task is already holding lock: [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.344707] [ 1797.344707] other info that might help us debug this: [ 1797.356952] Possible unsafe locking scenario: [ 1797.356952] [ 1797.368333] CPU0 [ 1797.373357] ---- [ 1797.378364] lock(&node->lock); [ 1797.384222] lock(&node->lock); [ 1797.390031] [ 1797.390031] *** DEADLOCK *** [ 1797.390031] [ 1797.403003] May be due to missing lock nesting notation [ 1797.403003] [ 1797.414691] 3 locks held by handler10/9296: [ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at: tc_setup_cb_add+0x70/0x250 [ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at: mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core] [ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.459913] [ 1797.459913] stack backtrace: [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not tainted 5.5.0-rc5+ #10 [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 [ 1797.491480] Call Trace: [ 1797.496701] dump_stack+0x96/0xe0 [ 1797.502864] __lock_acquire.cold.63+0xf8/0x212 [ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250 [ 1797.517701] ? mark_held_locks+0x55/0xa0 [ 1797.524547] ? quarantine_put+0xb7/0x160 [ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250 [ 1797.538913] lock_acquire+0xd6/0x1f0 [ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.553701] down_write+0x94/0x140 [ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.568464] ? down_write_killable_nested+0x170/0x170 [ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core] [ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core] [ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core] [ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core] [ 1797.621838] ? lock_acquire+0xd6/0x1f0 [ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core] [ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core] [ 1797.647459] ? try_to_wake_up+0x4c7/0xc70 [ 1797.654881] ? lock_downgrade+0x350/0x350 [ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0 [ 1797.670396] ? find_held_lock+0xac/0xd0 [ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core] [ 1797.704270] ? irq_exit+0xa5/0x170 [ 1797.710764] ? retint_kernel+0x10/0x10 [ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230 [mlx5_core] [ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core] [ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core] [ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core] [ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core] [ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core] [ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core] [ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core] [ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core] [ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0 [mlx5_core] [ 1797.815093] ? wait_for_completion+0x260/0x260 [ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core] [ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core] [ 1797.842096] ? down_read+0x11a/0x2e0 [ 1797.849090] ? down_write+0x140/0x140 [ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core] [ 1797.866027] tc_setup_cb_add+0x11a/0x250 [ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower] [ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower] [ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower] [ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.910728] ? lock_downgrade+0x350/0x350 [ 1797.918187] ? __radix_tree_lookup+0xa5/0x130 [ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower] [ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.945673] tc_new_tfilter+0xcd1/0x1240 [ 1797.953138] ? tc_del_tfilter+0xb10/0xb10 [ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320 [ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320 [ 1797.976816] ? avc_has_extended_perms+0x990/0x990 [ 1797.985090] ? mark_lock+0xaa/0x9e0 [ 1797.991988] ? match_held_lock+0x1b/0x240 [ 1797.999457] ? match_held_lock+0x1b/0x240 [ 1798.006859] ? find_held_lock+0xac/0xd0 [ 1798.014045] ? symbol_put_addr+0x40/0x40 [ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.029460] ? tc_del_tfilter+0xb10/0xb10 [ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620 [ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250 [ 1798.059837] ? match_held_lock+0x1b/0x240 [ 1798.067146] ? find_held_lock+0xac/0xd0 [ 1798.074246] netlink_rcv_skb+0xc6/0x1f0 [ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.089104] ? netlink_ack+0x440/0x440 [ 1798.096061] netlink_unicast+0x2d4/0x3b0 [ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0 [ 1798.110724] ? _copy_from_iter_full+0xda/0x370 [ 1798.118415] netlink_sendmsg+0x3ba/0x6a0 [ 1798.125478] ? netlink_unicast+0x3b0/0x3b0 [ 1798.132705] ? netlink_unicast+0x3b0/0x3b0 [ 1798.139880] sock_sendmsg+0x94/0xa0 [ 1798.146332] ____sys_sendmsg+0x36c/0x3f0 [ 1798.153251] ? copy_msghdr_from_user+0x165/0x230 [ 1798.160941] ? kernel_sendmsg+0x30/0x30 [ 1798.167738] ___sys_sendmsg+0xeb/0x150 [ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30 [ 1798.181649] ? lock_downgrade+0x350/0x350 [ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.196239] ? __fget+0x21d/0x320 [ 1798.202335] ? do_dup2+0x2a0/0x2a0 [ 1798.208499] ? lock_downgrade+0x350/0x350 [ 1798.215366] ? __fget_light+0xd6/0xf0 [ 1798.221808] ? syscall_trace_enter+0x369/0x5d0 [ 1798.229112] __sys_sendmsg+0xd3/0x160 [ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60 [ 1798.242478] ? syscall_trace_enter+0x233/0x5d0 [ 1798.249721] ? syscall_slow_exit_work+0x280/0x280 [ 1798.257211] ? do_syscall_64+0x1e/0x2e0 [ 1798.263680] do_syscall_64+0x72/0x2e0 [ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: bd71b08 ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 9, 2020
commit b57a55e upstream. There is a KASAN slab-out-of-bounds: BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0 Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539 CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xdd/0x12a print_address_description+0xa7/0x540 kasan_report+0x1ff/0x550 check_memory_region+0x2f1/0x310 memcpy+0x2f/0x80 _copy_from_iter_full+0x783/0xaa0 tcp_sendmsg_locked+0x1840/0x4140 tcp_sendmsg+0x37/0x60 inet_sendmsg+0x18c/0x490 sock_sendmsg+0xae/0x130 smb_send_kvec+0x29c/0x520 __smb_send_rqst+0x3ef/0xc60 smb_send_rqst+0x25a/0x2e0 compound_send_recv+0x9e8/0x2af0 cifs_send_recv+0x24/0x30 SMB2_open+0x35e/0x1620 open_shroot+0x27b/0x490 smb2_open_op_close+0x4e1/0x590 smb2_query_path_info+0x2ac/0x650 cifs_get_inode_info+0x1058/0x28f0 cifs_root_iget+0x3bb/0xf80 cifs_smb3_do_mount+0xe00/0x14c0 cifs_do_mount+0x15/0x20 mount_fs+0x5e/0x290 vfs_kern_mount+0x88/0x460 do_mount+0x398/0x31e0 ksys_mount+0xc6/0x150 __x64_sys_mount+0xea/0x190 do_syscall_64+0x122/0x590 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It can be reproduced by the following step: 1. samba configured with: server max protocol = SMB2_10 2. mount -o vers=default When parse the mount version parameter, the 'ops' and 'vals' was setted to smb30, if negotiate result is smb21, just update the 'ops' to smb21, but the 'vals' is still smb30. When add lease context, the iov_base is allocated with smb21 ops, but the iov_len is initiallited with the smb30. Because the iov_len is longer than iov_base, when send the message, copy array out of bounds. we need to keep the 'ops' and 'vals' consistent. Fixes: 9764c02 ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)") Fixes: d5c7076 ("smb3: add smb3.1.1 to default dialect list") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> [bwh: Backported to 4.14: We never switch to SMB3.1.1 here] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit f767078 upstream. There has oops as below happen on i.MX8MP EVK platform that has 6G bytes DDR memory. when (xmit->tail < xmit->head) && (xmit->head == 0), it setups one sg entry with sg->length is zero: sg_set_buf(sgl + 1, xmit->buf, xmit->head); if xmit->buf is allocated from >4G address space, and SDMA only support <4G address space, then dma_map_sg() will call swiotlb_map() to do bounce buffer copying and mapping. But swiotlb_map() don't allow sg entry's length is zero, otherwise report BUG_ON(). So the patch is to correct the tx DMA scatter list. Oops: [ 287.675715] kernel BUG at kernel/dma/swiotlb.c:497! [ 287.680592] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 287.686075] Modules linked in: [ 287.689133] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.3-00016-g3fdc4e0-dirty #10 [ 287.696872] Hardware name: FSL i.MX8MP EVK (DT) [ 287.701402] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 287.706199] pc : swiotlb_tbl_map_single+0x1fc/0x310 [ 287.711076] lr : swiotlb_map+0x60/0x148 [ 287.714909] sp : ffff800010003c00 [ 287.718221] x29: ffff800010003c00 x28: 0000000000000000 [ 287.723533] x27: 0000000000000040 x26: ffff800011ae0000 [ 287.728844] x25: ffff800011ae09f8 x24: 0000000000000000 [ 287.734155] x23: 00000001b7af9000 x22: 0000000000000000 [ 287.739465] x21: ffff000176409c10 x20: 00000000001f7ffe [ 287.744776] x19: ffff000176409c10 x18: 000000000000002e [ 287.750087] x17: 0000000000000000 x16: 0000000000000000 [ 287.755397] x15: 0000000000000000 x14: 0000000000000000 [ 287.760707] x13: ffff00017f334000 x12: 0000000000000001 [ 287.766018] x11: 00000000001fffff x10: 0000000000000000 [ 287.771328] x9 : 0000000000000003 x8 : 0000000000000000 [ 287.776638] x7 : 0000000000000000 x6 : 0000000000000000 [ 287.781949] x5 : 0000000000200000 x4 : 0000000000000000 [ 287.787259] x3 : 0000000000000001 x2 : 00000001b7af9000 [ 287.792570] x1 : 00000000fbfff000 x0 : 0000000000000000 [ 287.797881] Call trace: [ 287.800328] swiotlb_tbl_map_single+0x1fc/0x310 [ 287.804859] swiotlb_map+0x60/0x148 [ 287.808347] dma_direct_map_page+0xf0/0x130 [ 287.812530] dma_direct_map_sg+0x78/0xe0 [ 287.816453] imx_uart_dma_tx+0x134/0x2f8 [ 287.820374] imx_uart_dma_tx_callback+0xd8/0x168 [ 287.824992] vchan_complete+0x194/0x200 [ 287.828828] tasklet_action_common.isra.0+0x154/0x1a0 [ 287.833879] tasklet_action+0x24/0x30 [ 287.837540] __do_softirq+0x120/0x23c [ 287.841202] irq_exit+0xb8/0xd8 [ 287.844343] __handle_domain_irq+0x64/0xb8 [ 287.848438] gic_handle_irq+0x5c/0x148 [ 287.852185] el1_irq+0xb8/0x180 [ 287.855327] cpuidle_enter_state+0x84/0x360 [ 287.859508] cpuidle_enter+0x34/0x48 [ 287.863083] call_cpuidle+0x18/0x38 [ 287.866571] do_idle+0x1e0/0x280 [ 287.869798] cpu_startup_entry+0x20/0x40 [ 287.873721] rest_init+0xd4/0xe0 [ 287.876949] arch_call_rest_init+0xc/0x14 [ 287.880958] start_kernel+0x420/0x44c [ 287.884622] Code: 9124c021 9417aff8 a94363f7 17ffffd5 (d4210000) [ 287.890718] ---[ end trace 5bc44c4ab6b009ce ]--- [ 287.895334] Kernel panic - not syncing: Fatal exception in interrupt [ 287.901686] SMP: stopping secondary CPUs [ 288.905607] SMP: failed to stop secondary CPUs 0-1 [ 288.910395] Kernel Offset: disabled [ 288.913882] CPU features: 0x0002,2000200c [ 288.917888] Memory Limit: none [ 288.920944] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Reported-by: Eagle Zhou <eagle.zhou@nxp.com> Tested-by: Eagle Zhou <eagle.zhou@nxp.com> Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Cc: stable <stable@vger.kernel.org> Fixes: 7942f85 ("serial: imx: TX DMA: clean up sg initialization") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/1581401761-6378-1-git-send-email-fugang.duan@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
to projectceladon/linux-intel-lts
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 5eed6f1 ] Commit 9b6f7e1 ("mm: rework memcg kernel stack accounting") will result in fork failing if allocating a kernel stack for a task in dup_task_struct exceeds the kernel memory allowance for that cgroup. Unfortunately, it also results in a crash. This is due to the code jumping to free_stack and calling free_thread_stack when the memcg kernel stack charge fails, but without tsk->stack pointing at the freshly allocated stack. This in turn results in the vfree_atomic in free_thread_stack oopsing with a backtrace like this: #5 [ffffc900244efc88] die at ffffffff8101f0ab #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86 #7 [ffffc900244efce0] general_protection at ffffffff818ff082 [exception RIP: llist_add_batch+7] RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000 RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0 R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37 intel#10 [ffffc900244efe98] _do_fork at ffffffff810884e0 intel#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff intel#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43 RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948 RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421 RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00 R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040 R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018 ORIG_RAX: 000000000000003a CS: 0033 SS: 002b The simplest fix is to assign tsk->stack right where it is allocated. Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com Fixes: 9b6f7e1 ("mm: rework memcg kernel stack accounting") Signed-off-by: Rik van Riel <riel@surriel.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 17, 2020
commit f767078 upstream. There has oops as below happen on i.MX8MP EVK platform that has 6G bytes DDR memory. when (xmit->tail < xmit->head) && (xmit->head == 0), it setups one sg entry with sg->length is zero: sg_set_buf(sgl + 1, xmit->buf, xmit->head); if xmit->buf is allocated from >4G address space, and SDMA only support <4G address space, then dma_map_sg() will call swiotlb_map() to do bounce buffer copying and mapping. But swiotlb_map() don't allow sg entry's length is zero, otherwise report BUG_ON(). So the patch is to correct the tx DMA scatter list. Oops: [ 287.675715] kernel BUG at kernel/dma/swiotlb.c:497! [ 287.680592] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 287.686075] Modules linked in: [ 287.689133] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.3-00016-g3fdc4e0-dirty #10 [ 287.696872] Hardware name: FSL i.MX8MP EVK (DT) [ 287.701402] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 287.706199] pc : swiotlb_tbl_map_single+0x1fc/0x310 [ 287.711076] lr : swiotlb_map+0x60/0x148 [ 287.714909] sp : ffff800010003c00 [ 287.718221] x29: ffff800010003c00 x28: 0000000000000000 [ 287.723533] x27: 0000000000000040 x26: ffff800011ae0000 [ 287.728844] x25: ffff800011ae09f8 x24: 0000000000000000 [ 287.734155] x23: 00000001b7af9000 x22: 0000000000000000 [ 287.739465] x21: ffff000176409c10 x20: 00000000001f7ffe [ 287.744776] x19: ffff000176409c10 x18: 000000000000002e [ 287.750087] x17: 0000000000000000 x16: 0000000000000000 [ 287.755397] x15: 0000000000000000 x14: 0000000000000000 [ 287.760707] x13: ffff00017f334000 x12: 0000000000000001 [ 287.766018] x11: 00000000001fffff x10: 0000000000000000 [ 287.771328] x9 : 0000000000000003 x8 : 0000000000000000 [ 287.776638] x7 : 0000000000000000 x6 : 0000000000000000 [ 287.781949] x5 : 0000000000200000 x4 : 0000000000000000 [ 287.787259] x3 : 0000000000000001 x2 : 00000001b7af9000 [ 287.792570] x1 : 00000000fbfff000 x0 : 0000000000000000 [ 287.797881] Call trace: [ 287.800328] swiotlb_tbl_map_single+0x1fc/0x310 [ 287.804859] swiotlb_map+0x60/0x148 [ 287.808347] dma_direct_map_page+0xf0/0x130 [ 287.812530] dma_direct_map_sg+0x78/0xe0 [ 287.816453] imx_uart_dma_tx+0x134/0x2f8 [ 287.820374] imx_uart_dma_tx_callback+0xd8/0x168 [ 287.824992] vchan_complete+0x194/0x200 [ 287.828828] tasklet_action_common.isra.0+0x154/0x1a0 [ 287.833879] tasklet_action+0x24/0x30 [ 287.837540] __do_softirq+0x120/0x23c [ 287.841202] irq_exit+0xb8/0xd8 [ 287.844343] __handle_domain_irq+0x64/0xb8 [ 287.848438] gic_handle_irq+0x5c/0x148 [ 287.852185] el1_irq+0xb8/0x180 [ 287.855327] cpuidle_enter_state+0x84/0x360 [ 287.859508] cpuidle_enter+0x34/0x48 [ 287.863083] call_cpuidle+0x18/0x38 [ 287.866571] do_idle+0x1e0/0x280 [ 287.869798] cpu_startup_entry+0x20/0x40 [ 287.873721] rest_init+0xd4/0xe0 [ 287.876949] arch_call_rest_init+0xc/0x14 [ 287.880958] start_kernel+0x420/0x44c [ 287.884622] Code: 9124c021 9417aff8 a94363f7 17ffffd5 (d4210000) [ 287.890718] ---[ end trace 5bc44c4ab6b009ce ]--- [ 287.895334] Kernel panic - not syncing: Fatal exception in interrupt [ 287.901686] SMP: stopping secondary CPUs [ 288.905607] SMP: failed to stop secondary CPUs 0-1 [ 288.910395] Kernel Offset: disabled [ 288.913882] CPU features: 0x0002,2000200c [ 288.917888] Memory Limit: none [ 288.920944] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Reported-by: Eagle Zhou <eagle.zhou@nxp.com> Tested-by: Eagle Zhou <eagle.zhou@nxp.com> Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Cc: stable <stable@vger.kernel.org> Fixes: 7942f85 ("serial: imx: TX DMA: clean up sg initialization") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/1581401761-6378-1-git-send-email-fugang.duan@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 31, 2020
commit f767078 upstream. There has oops as below happen on i.MX8MP EVK platform that has 6G bytes DDR memory. when (xmit->tail < xmit->head) && (xmit->head == 0), it setups one sg entry with sg->length is zero: sg_set_buf(sgl + 1, xmit->buf, xmit->head); if xmit->buf is allocated from >4G address space, and SDMA only support <4G address space, then dma_map_sg() will call swiotlb_map() to do bounce buffer copying and mapping. But swiotlb_map() don't allow sg entry's length is zero, otherwise report BUG_ON(). So the patch is to correct the tx DMA scatter list. Oops: [ 287.675715] kernel BUG at kernel/dma/swiotlb.c:497! [ 287.680592] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 287.686075] Modules linked in: [ 287.689133] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.3-00016-g3fdc4e0-dirty #10 [ 287.696872] Hardware name: FSL i.MX8MP EVK (DT) [ 287.701402] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 287.706199] pc : swiotlb_tbl_map_single+0x1fc/0x310 [ 287.711076] lr : swiotlb_map+0x60/0x148 [ 287.714909] sp : ffff800010003c00 [ 287.718221] x29: ffff800010003c00 x28: 0000000000000000 [ 287.723533] x27: 0000000000000040 x26: ffff800011ae0000 [ 287.728844] x25: ffff800011ae09f8 x24: 0000000000000000 [ 287.734155] x23: 00000001b7af9000 x22: 0000000000000000 [ 287.739465] x21: ffff000176409c10 x20: 00000000001f7ffe [ 287.744776] x19: ffff000176409c10 x18: 000000000000002e [ 287.750087] x17: 0000000000000000 x16: 0000000000000000 [ 287.755397] x15: 0000000000000000 x14: 0000000000000000 [ 287.760707] x13: ffff00017f334000 x12: 0000000000000001 [ 287.766018] x11: 00000000001fffff x10: 0000000000000000 [ 287.771328] x9 : 0000000000000003 x8 : 0000000000000000 [ 287.776638] x7 : 0000000000000000 x6 : 0000000000000000 [ 287.781949] x5 : 0000000000200000 x4 : 0000000000000000 [ 287.787259] x3 : 0000000000000001 x2 : 00000001b7af9000 [ 287.792570] x1 : 00000000fbfff000 x0 : 0000000000000000 [ 287.797881] Call trace: [ 287.800328] swiotlb_tbl_map_single+0x1fc/0x310 [ 287.804859] swiotlb_map+0x60/0x148 [ 287.808347] dma_direct_map_page+0xf0/0x130 [ 287.812530] dma_direct_map_sg+0x78/0xe0 [ 287.816453] imx_uart_dma_tx+0x134/0x2f8 [ 287.820374] imx_uart_dma_tx_callback+0xd8/0x168 [ 287.824992] vchan_complete+0x194/0x200 [ 287.828828] tasklet_action_common.isra.0+0x154/0x1a0 [ 287.833879] tasklet_action+0x24/0x30 [ 287.837540] __do_softirq+0x120/0x23c [ 287.841202] irq_exit+0xb8/0xd8 [ 287.844343] __handle_domain_irq+0x64/0xb8 [ 287.848438] gic_handle_irq+0x5c/0x148 [ 287.852185] el1_irq+0xb8/0x180 [ 287.855327] cpuidle_enter_state+0x84/0x360 [ 287.859508] cpuidle_enter+0x34/0x48 [ 287.863083] call_cpuidle+0x18/0x38 [ 287.866571] do_idle+0x1e0/0x280 [ 287.869798] cpu_startup_entry+0x20/0x40 [ 287.873721] rest_init+0xd4/0xe0 [ 287.876949] arch_call_rest_init+0xc/0x14 [ 287.880958] start_kernel+0x420/0x44c [ 287.884622] Code: 9124c021 9417aff8 a94363f7 17ffffd5 (d4210000) [ 287.890718] ---[ end trace 5bc44c4ab6b009ce ]--- [ 287.895334] Kernel panic - not syncing: Fatal exception in interrupt [ 287.901686] SMP: stopping secondary CPUs [ 288.905607] SMP: failed to stop secondary CPUs 0-1 [ 288.910395] Kernel Offset: disabled [ 288.913882] CPU features: 0x0002,2000200c [ 288.917888] Memory Limit: none [ 288.920944] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Reported-by: Eagle Zhou <eagle.zhou@nxp.com> Tested-by: Eagle Zhou <eagle.zhou@nxp.com> Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Cc: stable <stable@vger.kernel.org> Fixes: 7942f85 ("serial: imx: TX DMA: clean up sg initialization") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/1581401761-6378-1-git-send-email-fugang.duan@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
May 5, 2020
[ Upstream commit 1bc7896 ] When experimenting with bpf_send_signal() helper in our production environment (5.2 based), we experienced a deadlock in NMI mode: #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24 #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012 #7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd #8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55 #9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602 #10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a #11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227 #12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140 #13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf #14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09 #15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47 #16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d #17 [ffffc9002219fc90] schedule at ffffffff81a3e219 #18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9 #19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529 #20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc #21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c #22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602 #23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068 The above call stack is actually very similar to an issue reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()") by Song Liu. The only difference is bpf_send_signal() helper instead of bpf_get_stack() helper. The above deadlock is triggered with a perf_sw_event. Similar to Commit eac9153, the below almost identical reproducer used tracepoint point sched/sched_switch so the issue can be easily caught. /* stress_test.c */ #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <pthread.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define THREAD_COUNT 1000 char *filename; void *worker(void *p) { void *ptr; int fd; char *pptr; fd = open(filename, O_RDONLY); if (fd < 0) return NULL; while (1) { struct timespec ts = {0, 1000 + rand() % 2000}; ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0); usleep(1); if (ptr == MAP_FAILED) { printf("failed to mmap\n"); break; } munmap(ptr, 4096 * 64); usleep(1); pptr = malloc(1); usleep(1); pptr[0] = 1; usleep(1); free(pptr); usleep(1); nanosleep(&ts, NULL); } close(fd); return NULL; } int main(int argc, char *argv[]) { void *ptr; int i; pthread_t threads[THREAD_COUNT]; if (argc < 2) return 0; filename = argv[1]; for (i = 0; i < THREAD_COUNT; i++) { if (pthread_create(threads + i, NULL, worker, NULL)) { fprintf(stderr, "Error creating thread\n"); return 0; } } for (i = 0; i < THREAD_COUNT; i++) pthread_join(threads[i], NULL); return 0; } and the following command: 1. run `stress_test /bin/ls` in one windown 2. hack bcc trace.py with the following change: # --- a/tools/trace.py # +++ b/tools/trace.py @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s); __data.tgid = __tgid; __data.pid = __pid; bpf_get_current_comm(&__data.comm, sizeof(__data.comm)); + bpf_send_signal(10); %s %s %s.perf_submit(%s, &__data, sizeof(__data)); 3. in a different window run ./trace.py -p $(pidof stress_test) t:sched:sched_switch The deadlock can be reproduced in our production system. Similar to Song's fix, the fix is to delay sending signal if irqs is disabled to avoid deadlocks involving with rq_lock. With this change, my above stress-test in our production system won't cause deadlock any more. I also implemented a scale-down version of reproducer in the selftest (a subsequent commit). With latest bpf-next, it complains for the following potential deadlock. [ 32.832450] -> #1 (&p->pi_lock){-.-.}: [ 32.833100] _raw_spin_lock_irqsave+0x44/0x80 [ 32.833696] task_rq_lock+0x2c/0xa0 [ 32.834182] task_sched_runtime+0x59/0xd0 [ 32.834721] thread_group_cputime+0x250/0x270 [ 32.835304] thread_group_cputime_adjusted+0x2e/0x70 [ 32.835959] do_task_stat+0x8a7/0xb80 [ 32.836461] proc_single_show+0x51/0xb0 ... [ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}: [ 32.840275] __lock_acquire+0x1358/0x1a20 [ 32.840826] lock_acquire+0xc7/0x1d0 [ 32.841309] _raw_spin_lock_irqsave+0x44/0x80 [ 32.841916] __lock_task_sighand+0x79/0x160 [ 32.842465] do_send_sig_info+0x35/0x90 [ 32.842977] bpf_send_signal+0xa/0x10 [ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000 [ 32.844301] trace_call_bpf+0x115/0x270 [ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0 [ 32.845411] perf_trace_sched_switch+0x10f/0x180 [ 32.846014] __schedule+0x45d/0x880 [ 32.846483] schedule+0x5f/0xd0 ... [ 32.853148] Chain exists of: [ 32.853148] &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock [ 32.853148] [ 32.854451] Possible unsafe locking scenario: [ 32.854451] [ 32.855173] CPU0 CPU1 [ 32.855745] ---- ---- [ 32.856278] lock(&rq->lock); [ 32.856671] lock(&p->pi_lock); [ 32.857332] lock(&rq->lock); [ 32.857999] lock(&(&sighand->siglock)->rlock); Deadlock happens on CPU0 when it tries to acquire &sighand->siglock but it has been held by CPU1 and CPU1 tries to grab &rq->lock and cannot get it. This is not exactly the callstack in our production environment, but sympotom is similar and both locks are using spin_lock_irqsave() to acquire the lock, and both involves rq_lock. The fix to delay sending signal when irq is disabled also fixed this issue. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200304191104.2796501-1-yhs@fb.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Aug 20, 2020
Luca Abeni reported this: | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003 | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10 | Call Trace: | rt_spin_lock+0x3f/0x50 | gen6_read32+0x45/0x1d0 [i915] | g4x_get_vblank_counter+0x36/0x40 [i915] | trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915] The tracing events use trace_i915_pipe_update_start() among other events use functions acquire spin locks. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter() wich also might acquire a sleeping lock. Based on this I don't see any other way than disable trace points on RT. Cc: stable-rt@vger.kernel.org Reported-by: Luca Abeni <lucabe72@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
sys-oak
pushed a commit
that referenced
this pull request
Aug 24, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Aug 25, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Sep 17, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Sep 30, 2020
[ Upstream commit b12eea5 ] The evsel->unit borrows a pointer of pmu event or alias instead of owns a string. But tool event (duration_time) passes a result of strdup() caused a leak. It was found by ASAN during metric test: Direct leak of 210 byte(s) in 70 object(s) allocated from: #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414 #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414 #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439 #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096 #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141 #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406 #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393 #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415 #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498 #10 0x559fbbc0109b in run_test tests/builtin-test.c:410 #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440 #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695 #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807 #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Sep 30, 2020
[ Upstream commit d26383d ] The following leaks were detected by ASAN: Indirect leak of 360 byte(s) in 9 object(s) allocated from: #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333 #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59 #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73 #4 0x560578e07045 in test__pmu tests/pmu.c:155 #5 0x560578de109b in run_test tests/builtin-test.c:410 #6 0x560578de109b in test_and_print tests/builtin-test.c:440 #7 0x560578de401a in __cmd_test tests/builtin-test.c:661 #8 0x560578de401a in cmd_test tests/builtin-test.c:807 #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: cff7f95 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Sep 30, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 12, 2020
[ Upstream commit d26383d ] The following leaks were detected by ASAN: Indirect leak of 360 byte(s) in 9 object(s) allocated from: #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333 #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59 #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73 #4 0x560578e07045 in test__pmu tests/pmu.c:155 #5 0x560578de109b in run_test tests/builtin-test.c:410 #6 0x560578de109b in test_and_print tests/builtin-test.c:440 #7 0x560578de401a in __cmd_test tests/builtin-test.c:661 #8 0x560578de401a in cmd_test tests/builtin-test.c:807 #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: cff7f95 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Nov 13, 2020
commit 66d204a upstream. Very sporadically I had test case btrfs/069 from fstests hanging (for years, it is not a recent regression), with the following traces in dmesg/syslog: [162301.160628] BTRFS info (device sdc): dev_replace from /dev/sdd (devid 2) to /dev/sdg started [162301.181196] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0 [162301.287162] BTRFS info (device sdc): dev_replace from /dev/sdd (devid 2) to /dev/sdg finished [162513.513792] INFO: task btrfs-transacti:1356167 blocked for more than 120 seconds. [162513.514318] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.514522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.514747] task:btrfs-transacti state:D stack: 0 pid:1356167 ppid: 2 flags:0x00004000 [162513.514751] Call Trace: [162513.514761] __schedule+0x5ce/0xd00 [162513.514765] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.514771] schedule+0x46/0xf0 [162513.514844] wait_current_trans+0xde/0x140 [btrfs] [162513.514850] ? finish_wait+0x90/0x90 [162513.514864] start_transaction+0x37c/0x5f0 [btrfs] [162513.514879] transaction_kthread+0xa4/0x170 [btrfs] [162513.514891] ? btrfs_cleanup_transaction+0x660/0x660 [btrfs] [162513.514894] kthread+0x153/0x170 [162513.514897] ? kthread_stop+0x2c0/0x2c0 [162513.514902] ret_from_fork+0x22/0x30 [162513.514916] INFO: task fsstress:1356184 blocked for more than 120 seconds. [162513.515192] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.515431] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.515680] task:fsstress state:D stack: 0 pid:1356184 ppid:1356177 flags:0x00004000 [162513.515682] Call Trace: [162513.515688] __schedule+0x5ce/0xd00 [162513.515691] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.515697] schedule+0x46/0xf0 [162513.515712] wait_current_trans+0xde/0x140 [btrfs] [162513.515716] ? finish_wait+0x90/0x90 [162513.515729] start_transaction+0x37c/0x5f0 [btrfs] [162513.515743] btrfs_attach_transaction_barrier+0x1f/0x50 [btrfs] [162513.515753] btrfs_sync_fs+0x61/0x1c0 [btrfs] [162513.515758] ? __ia32_sys_fdatasync+0x20/0x20 [162513.515761] iterate_supers+0x87/0xf0 [162513.515765] ksys_sync+0x60/0xb0 [162513.515768] __do_sys_sync+0xa/0x10 [162513.515771] do_syscall_64+0x33/0x80 [162513.515774] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.515781] RIP: 0033:0x7f5238f50bd7 [162513.515782] Code: Bad RIP value. [162513.515784] RSP: 002b:00007fff67b978e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2 [162513.515786] RAX: ffffffffffffffda RBX: 000055b1fad2c560 RCX: 00007f5238f50bd7 [162513.515788] RDX: 00000000ffffffff RSI: 000000000daf0e74 RDI: 000000000000003a [162513.515789] RBP: 0000000000000032 R08: 000000000000000a R09: 00007f5239019be0 [162513.515791] R10: fffffffffffff24f R11: 0000000000000206 R12: 000000000000003a [162513.515792] R13: 00007fff67b97950 R14: 00007fff67b97906 R15: 000055b1fad1a340 [162513.515804] INFO: task fsstress:1356185 blocked for more than 120 seconds. [162513.516064] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.516329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.516617] task:fsstress state:D stack: 0 pid:1356185 ppid:1356177 flags:0x00000000 [162513.516620] Call Trace: [162513.516625] __schedule+0x5ce/0xd00 [162513.516628] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.516634] schedule+0x46/0xf0 [162513.516647] wait_current_trans+0xde/0x140 [btrfs] [162513.516650] ? finish_wait+0x90/0x90 [162513.516662] start_transaction+0x4d7/0x5f0 [btrfs] [162513.516679] btrfs_setxattr_trans+0x3c/0x100 [btrfs] [162513.516686] __vfs_setxattr+0x66/0x80 [162513.516691] __vfs_setxattr_noperm+0x70/0x200 [162513.516697] vfs_setxattr+0x6b/0x120 [162513.516703] setxattr+0x125/0x240 [162513.516709] ? lock_acquire+0xb1/0x480 [162513.516712] ? mnt_want_write+0x20/0x50 [162513.516721] ? rcu_read_lock_any_held+0x8e/0xb0 [162513.516723] ? preempt_count_add+0x49/0xa0 [162513.516725] ? __sb_start_write+0x19b/0x290 [162513.516727] ? preempt_count_add+0x49/0xa0 [162513.516732] path_setxattr+0xba/0xd0 [162513.516739] __x64_sys_setxattr+0x27/0x30 [162513.516741] do_syscall_64+0x33/0x80 [162513.516743] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.516745] RIP: 0033:0x7f5238f56d5a [162513.516746] Code: Bad RIP value. [162513.516748] RSP: 002b:00007fff67b97868 EFLAGS: 00000202 ORIG_RAX: 00000000000000bc [162513.516750] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f5238f56d5a [162513.516751] RDX: 000055b1fbb0d5a0 RSI: 00007fff67b978a0 RDI: 000055b1fbb0d470 [162513.516753] RBP: 000055b1fbb0d5a0 R08: 0000000000000001 R09: 00007fff67b97700 [162513.516754] R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000004 [162513.516756] R13: 0000000000000024 R14: 0000000000000001 R15: 00007fff67b978a0 [162513.516767] INFO: task fsstress:1356196 blocked for more than 120 seconds. [162513.517064] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.517365] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.517763] task:fsstress state:D stack: 0 pid:1356196 ppid:1356177 flags:0x00004000 [162513.517780] Call Trace: [162513.517786] __schedule+0x5ce/0xd00 [162513.517789] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.517796] schedule+0x46/0xf0 [162513.517810] wait_current_trans+0xde/0x140 [btrfs] [162513.517814] ? finish_wait+0x90/0x90 [162513.517829] start_transaction+0x37c/0x5f0 [btrfs] [162513.517845] btrfs_attach_transaction_barrier+0x1f/0x50 [btrfs] [162513.517857] btrfs_sync_fs+0x61/0x1c0 [btrfs] [162513.517862] ? __ia32_sys_fdatasync+0x20/0x20 [162513.517865] iterate_supers+0x87/0xf0 [162513.517869] ksys_sync+0x60/0xb0 [162513.517872] __do_sys_sync+0xa/0x10 [162513.517875] do_syscall_64+0x33/0x80 [162513.517878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.517881] RIP: 0033:0x7f5238f50bd7 [162513.517883] Code: Bad RIP value. [162513.517885] RSP: 002b:00007fff67b978e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2 [162513.517887] RAX: ffffffffffffffda RBX: 000055b1fad2c560 RCX: 00007f5238f50bd7 [162513.517889] RDX: 0000000000000000 RSI: 000000007660add2 RDI: 0000000000000053 [162513.517891] RBP: 0000000000000032 R08: 0000000000000067 R09: 00007f5239019be0 [162513.517893] R10: fffffffffffff24f R11: 0000000000000206 R12: 0000000000000053 [162513.517895] R13: 00007fff67b97950 R14: 00007fff67b97906 R15: 000055b1fad1a340 [162513.517908] INFO: task fsstress:1356197 blocked for more than 120 seconds. [162513.518298] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.518672] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.519157] task:fsstress state:D stack: 0 pid:1356197 ppid:1356177 flags:0x00000000 [162513.519160] Call Trace: [162513.519165] __schedule+0x5ce/0xd00 [162513.519168] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.519174] schedule+0x46/0xf0 [162513.519190] wait_current_trans+0xde/0x140 [btrfs] [162513.519193] ? finish_wait+0x90/0x90 [162513.519206] start_transaction+0x4d7/0x5f0 [btrfs] [162513.519222] btrfs_create+0x57/0x200 [btrfs] [162513.519230] lookup_open+0x522/0x650 [162513.519246] path_openat+0x2b8/0xa50 [162513.519270] do_filp_open+0x91/0x100 [162513.519275] ? find_held_lock+0x32/0x90 [162513.519280] ? lock_acquired+0x33b/0x470 [162513.519285] ? do_raw_spin_unlock+0x4b/0xc0 [162513.519287] ? _raw_spin_unlock+0x29/0x40 [162513.519295] do_sys_openat2+0x20d/0x2d0 [162513.519300] do_sys_open+0x44/0x80 [162513.519304] do_syscall_64+0x33/0x80 [162513.519307] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.519309] RIP: 0033:0x7f5238f4a903 [162513.519310] Code: Bad RIP value. [162513.519312] RSP: 002b:00007fff67b97758 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 [162513.519314] RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f5238f4a903 [162513.519316] RDX: 0000000000000000 RSI: 00000000000001b6 RDI: 000055b1fbb0d470 [162513.519317] RBP: 00007fff67b978c0 R08: 0000000000000001 R09: 0000000000000002 [162513.519319] R10: 00007fff67b974f7 R11: 0000000000000246 R12: 0000000000000013 [162513.519320] R13: 00000000000001b6 R14: 00007fff67b97906 R15: 000055b1fad1c620 [162513.519332] INFO: task btrfs:1356211 blocked for more than 120 seconds. [162513.519727] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.520115] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.520508] task:btrfs state:D stack: 0 pid:1356211 ppid:1356178 flags:0x00004002 [162513.520511] Call Trace: [162513.520516] __schedule+0x5ce/0xd00 [162513.520519] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.520525] schedule+0x46/0xf0 [162513.520544] btrfs_scrub_pause+0x11f/0x180 [btrfs] [162513.520548] ? finish_wait+0x90/0x90 [162513.520562] btrfs_commit_transaction+0x45a/0xc30 [btrfs] [162513.520574] ? start_transaction+0xe0/0x5f0 [btrfs] [162513.520596] btrfs_dev_replace_finishing+0x6d8/0x711 [btrfs] [162513.520619] btrfs_dev_replace_by_ioctl.cold+0x1cc/0x1fd [btrfs] [162513.520639] btrfs_ioctl+0x2a25/0x36f0 [btrfs] [162513.520643] ? do_sigaction+0xf3/0x240 [162513.520645] ? find_held_lock+0x32/0x90 [162513.520648] ? do_sigaction+0xf3/0x240 [162513.520651] ? lock_acquired+0x33b/0x470 [162513.520655] ? _raw_spin_unlock_irq+0x24/0x50 [162513.520657] ? lockdep_hardirqs_on+0x7d/0x100 [162513.520660] ? _raw_spin_unlock_irq+0x35/0x50 [162513.520662] ? do_sigaction+0xf3/0x240 [162513.520671] ? __x64_sys_ioctl+0x83/0xb0 [162513.520672] __x64_sys_ioctl+0x83/0xb0 [162513.520677] do_syscall_64+0x33/0x80 [162513.520679] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.520681] RIP: 0033:0x7fc3cd307d87 [162513.520682] Code: Bad RIP value. [162513.520684] RSP: 002b:00007ffe30a56bb8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [162513.520686] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fc3cd307d87 [162513.520687] RDX: 00007ffe30a57a30 RSI: 00000000ca289435 RDI: 0000000000000003 [162513.520689] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [162513.520690] R10: 0000000000000008 R11: 0000000000000202 R12: 0000000000000003 [162513.520692] R13: 0000557323a212e0 R14: 00007ffe30a5a520 R15: 0000000000000001 [162513.520703] Showing all locks held in the system: [162513.520712] 1 lock held by khungtaskd/54: [162513.520713] #0: ffffffffb40a91a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x15/0x197 [162513.520728] 1 lock held by in:imklog/596: [162513.520729] #0: ffff8f3f0d781400 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x4d/0x60 [162513.520782] 1 lock held by btrfs-transacti/1356167: [162513.520784] #0: ffff8f3d810cc848 (&fs_info->transaction_kthread_mutex){+.+.}-{3:3}, at: transaction_kthread+0x4a/0x170 [btrfs] [162513.520798] 1 lock held by btrfs/1356190: [162513.520800] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write_file+0x22/0x60 [162513.520805] 1 lock held by fsstress/1356184: [162513.520806] #0: ffff8f3d576440e8 (&type->s_umount_key#62){++++}-{3:3}, at: iterate_supers+0x6f/0xf0 [162513.520811] 3 locks held by fsstress/1356185: [162513.520812] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write+0x20/0x50 [162513.520815] #1: ffff8f3d80a650b8 (&type->i_mutex_dir_key#10){++++}-{3:3}, at: vfs_setxattr+0x50/0x120 [162513.520820] #2: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs] [162513.520833] 1 lock held by fsstress/1356196: [162513.520834] #0: ffff8f3d576440e8 (&type->s_umount_key#62){++++}-{3:3}, at: iterate_supers+0x6f/0xf0 [162513.520838] 3 locks held by fsstress/1356197: [162513.520839] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write+0x20/0x50 [162513.520843] #1: ffff8f3d506465e8 (&type->i_mutex_dir_key#10){++++}-{3:3}, at: path_openat+0x2a7/0xa50 [162513.520846] #2: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs] [162513.520858] 2 locks held by btrfs/1356211: [162513.520859] #0: ffff8f3d810cde30 (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.}-{3:3}, at: btrfs_dev_replace_finishing+0x52/0x711 [btrfs] [162513.520877] #1: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs] This was weird because the stack traces show that a transaction commit, triggered by a device replace operation, is blocking trying to pause any running scrubs but there are no stack traces of blocked tasks doing a scrub. After poking around with drgn, I noticed there was a scrub task that was constantly running and blocking for shorts periods of time: >>> t = find_task(prog, 1356190) >>> prog.stack_trace(t) #0 __schedule+0x5ce/0xcfc #1 schedule+0x46/0xe4 #2 schedule_timeout+0x1df/0x475 #3 btrfs_reada_wait+0xda/0x132 #4 scrub_stripe+0x2a8/0x112f #5 scrub_chunk+0xcd/0x134 #6 scrub_enumerate_chunks+0x29e/0x5ee #7 btrfs_scrub_dev+0x2d5/0x91b #8 btrfs_ioctl+0x7f5/0x36e7 #9 __x64_sys_ioctl+0x83/0xb0 #10 do_syscall_64+0x33/0x77 #11 entry_SYSCALL_64+0x7c/0x156 Which corresponds to: int btrfs_reada_wait(void *handle) { struct reada_control *rc = handle; struct btrfs_fs_info *fs_info = rc->fs_info; while (atomic_read(&rc->elems)) { if (!atomic_read(&fs_info->reada_works_cnt)) reada_start_machine(fs_info); wait_event_timeout(rc->wait, atomic_read(&rc->elems) == 0, (HZ + 9) / 10); } (...) So the counter "rc->elems" was set to 1 and never decreased to 0, causing the scrub task to loop forever in that function. Then I used the following script for drgn to check the readahead requests: $ cat dump_reada.py import sys import drgn from drgn import NULL, Object, cast, container_of, execscript, \ reinterpret, sizeof from drgn.helpers.linux import * mnt_path = b"/home/fdmanana/btrfs-tests/scratch_1" mnt = None for mnt in for_each_mount(prog, dst = mnt_path): pass if mnt is None: sys.stderr.write(f'Error: mount point {mnt_path} not found\n') sys.exit(1) fs_info = cast('struct btrfs_fs_info *', mnt.mnt.mnt_sb.s_fs_info) def dump_re(re): nzones = re.nzones.value_() print(f're at {hex(re.value_())}') print(f'\t logical {re.logical.value_()}') print(f'\t refcnt {re.refcnt.value_()}') print(f'\t nzones {nzones}') for i in range(nzones): dev = re.zones[i].device name = dev.name.str.string_() print(f'\t\t dev id {dev.devid.value_()} name {name}') print() for _, e in radix_tree_for_each(fs_info.reada_tree): re = cast('struct reada_extent *', e) dump_re(re) $ drgn dump_reada.py re at 0xffff8f3da9d25ad8 logical 38928384 refcnt 1 nzones 1 dev id 0 name b'/dev/sdd' $ So there was one readahead extent with a single zone corresponding to the source device of that last device replace operation logged in dmesg/syslog. Also the ID of that zone's device was 0 which is a special value set in the source device of a device replace operation when the operation finishes (constant BTRFS_DEV_REPLACE_DEVID set at btrfs_dev_replace_finishing()), confirming again that device /dev/sdd was the source of a device replace operation. Normally there should be as many zones in the readahead extent as there are devices, and I wasn't expecting the extent to be in a block group with a 'single' profile, so I went and confirmed with the following drgn script that there weren't any single profile block groups: $ cat dump_block_groups.py import sys import drgn from drgn import NULL, Object, cast, container_of, execscript, \ reinterpret, sizeof from drgn.helpers.linux import * mnt_path = b"/home/fdmanana/btrfs-tests/scratch_1" mnt = None for mnt in for_each_mount(prog, dst = mnt_path): pass if mnt is None: sys.stderr.write(f'Error: mount point {mnt_path} not found\n') sys.exit(1) fs_info = cast('struct btrfs_fs_info *', mnt.mnt.mnt_sb.s_fs_info) BTRFS_BLOCK_GROUP_DATA = (1 << 0) BTRFS_BLOCK_GROUP_SYSTEM = (1 << 1) BTRFS_BLOCK_GROUP_METADATA = (1 << 2) BTRFS_BLOCK_GROUP_RAID0 = (1 << 3) BTRFS_BLOCK_GROUP_RAID1 = (1 << 4) BTRFS_BLOCK_GROUP_DUP = (1 << 5) BTRFS_BLOCK_GROUP_RAID10 = (1 << 6) BTRFS_BLOCK_GROUP_RAID5 = (1 << 7) BTRFS_BLOCK_GROUP_RAID6 = (1 << 8) BTRFS_BLOCK_GROUP_RAID1C3 = (1 << 9) BTRFS_BLOCK_GROUP_RAID1C4 = (1 << 10) def bg_flags_string(bg): flags = bg.flags.value_() ret = '' if flags & BTRFS_BLOCK_GROUP_DATA: ret = 'data' if flags & BTRFS_BLOCK_GROUP_METADATA: if len(ret) > 0: ret += '|' ret += 'meta' if flags & BTRFS_BLOCK_GROUP_SYSTEM: if len(ret) > 0: ret += '|' ret += 'system' if flags & BTRFS_BLOCK_GROUP_RAID0: ret += ' raid0' elif flags & BTRFS_BLOCK_GROUP_RAID1: ret += ' raid1' elif flags & BTRFS_BLOCK_GROUP_DUP: ret += ' dup' elif flags & BTRFS_BLOCK_GROUP_RAID10: ret += ' raid10' elif flags & BTRFS_BLOCK_GROUP_RAID5: ret += ' raid5' elif flags & BTRFS_BLOCK_GROUP_RAID6: ret += ' raid6' elif flags & BTRFS_BLOCK_GROUP_RAID1C3: ret += ' raid1c3' elif flags & BTRFS_BLOCK_GROUP_RAID1C4: ret += ' raid1c4' else: ret += ' single' return ret def dump_bg(bg): print() print(f'block group at {hex(bg.value_())}') print(f'\t start {bg.start.value_()} length {bg.length.value_()}') print(f'\t flags {bg.flags.value_()} - {bg_flags_string(bg)}') bg_root = fs_info.block_group_cache_tree.address_of_() for bg in rbtree_inorder_for_each_entry('struct btrfs_block_group', bg_root, 'cache_node'): dump_bg(bg) $ drgn dump_block_groups.py block group at 0xffff8f3d673b0400 start 22020096 length 16777216 flags 258 - system raid6 block group at 0xffff8f3d53ddb400 start 38797312 length 536870912 flags 260 - meta raid6 block group at 0xffff8f3d5f4d9c00 start 575668224 length 2147483648 flags 257 - data raid6 block group at 0xffff8f3d08189000 start 2723151872 length 67108864 flags 258 - system raid6 block group at 0xffff8f3db70ff000 start 2790260736 length 1073741824 flags 260 - meta raid6 block group at 0xffff8f3d5f4dd800 start 3864002560 length 67108864 flags 258 - system raid6 block group at 0xffff8f3d67037000 start 3931111424 length 2147483648 flags 257 - data raid6 $ So there were only 2 reasons left for having a readahead extent with a single zone: reada_find_zone(), called when creating a readahead extent, returned NULL either because we failed to find the corresponding block group or because a memory allocation failed. With some additional and custom tracing I figured out that on every further ocurrence of the problem the block group had just been deleted when we were looping to create the zones for the readahead extent (at reada_find_extent()), so we ended up with only one zone in the readahead extent, corresponding to a device that ends up getting replaced. So after figuring that out it became obvious why the hang happens: 1) Task A starts a scrub on any device of the filesystem, except for device /dev/sdd; 2) Task B starts a device replace with /dev/sdd as the source device; 3) Task A calls btrfs_reada_add() from scrub_stripe() and it is currently starting to scrub a stripe from block group X. This call to btrfs_reada_add() is the one for the extent tree. When btrfs_reada_add() calls reada_add_block(), it passes the logical address of the extent tree's root node as its 'logical' argument - a value of 38928384; 4) Task A then enters reada_find_extent(), called from reada_add_block(). It finds there isn't any existing readahead extent for the logical address 38928384, so it proceeds to the path of creating a new one. It calls btrfs_map_block() to find out which stripes exist for the block group X. On the first iteration of the for loop that iterates over the stripes, it finds the stripe for device /dev/sdd, so it creates one zone for that device and adds it to the readahead extent. Before getting into the second iteration of the loop, the cleanup kthread deletes block group X because it was empty. So in the iterations for the remaining stripes it does not add more zones to the readahead extent, because the calls to reada_find_zone() returned NULL because they couldn't find block group X anymore. As a result the new readahead extent has a single zone, corresponding to the device /dev/sdd; 4) Before task A returns to btrfs_reada_add() and queues the readahead job for the readahead work queue, task B finishes the device replace and at btrfs_dev_replace_finishing() swaps the device /dev/sdd with the new device /dev/sdg; 5) Task A returns to reada_add_block(), which increments the counter "->elems" of the reada_control structure allocated at btrfs_reada_add(). Then it returns back to btrfs_reada_add() and calls reada_start_machine(). This queues a job in the readahead work queue to run the function reada_start_machine_worker(), which calls __reada_start_machine(). At __reada_start_machine() we take the device list mutex and for each device found in the current device list, we call reada_start_machine_dev() to start the readahead work. However at this point the device /dev/sdd was already freed and is not in the device list anymore. This means the corresponding readahead for the extent at 38928384 is never started, and therefore the "->elems" counter of the reada_control structure allocated at btrfs_reada_add() never goes down to 0, causing the call to btrfs_reada_wait(), done by the scrub task, to wait forever. Note that the readahead request can be made either after the device replace started or before it started, however in pratice it is very unlikely that a device replace is able to start after a readahead request is made and is able to complete before the readahead request completes - maybe only on a very small and nearly empty filesystem. This hang however is not the only problem we can have with readahead and device removals. When the readahead extent has other zones other than the one corresponding to the device that is being removed (either by a device replace or a device remove operation), we risk having a use-after-free on the device when dropping the last reference of the readahead extent. For example if we create a readahead extent with two zones, one for the device /dev/sdd and one for the device /dev/sde: 1) Before the readahead worker starts, the device /dev/sdd is removed, and the corresponding btrfs_device structure is freed. However the readahead extent still has the zone pointing to the device structure; 2) When the readahead worker starts, it only finds device /dev/sde in the current device list of the filesystem; 3) It starts the readahead work, at reada_start_machine_dev(), using the device /dev/sde; 4) Then when it finishes reading the extent from device /dev/sde, it calls __readahead_hook() which ends up dropping the last reference on the readahead extent through the last call to reada_extent_put(); 5) At reada_extent_put() it iterates over each zone of the readahead extent and attempts to delete an element from the device's 'reada_extents' radix tree, resulting in a use-after-free, as the device pointer of the zone for /dev/sdd is now stale. We can also access the device after dropping the last reference of a zone, through reada_zone_release(), also called by reada_extent_put(). And a device remove suffers the same problem, however since it shrinks the device size down to zero before removing the device, it is very unlikely to still have readahead requests not completed by the time we free the device, the only possibility is if the device has a very little space allocated. While the hang problem is exclusive to scrub, since it is currently the only user of btrfs_reada_add() and btrfs_reada_wait(), the use-after-free problem affects any path that triggers readhead, which includes btree_readahead_hook() and __readahead_hook() (a readahead worker can trigger readahed for the children of a node) for example - any path that ends up calling reada_add_block() can trigger the use-after-free after a device is removed. So fix this by waiting for any readahead requests for a device to complete before removing a device, ensuring that while waiting for existing ones no new ones can be made. This problem has been around for a very long time - the readahead code was added in 2011, device remove exists since 2008 and device replace was introduced in 2013, hard to pick a specific commit for a git Fixes tag. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 31, 2020
[ Upstream commit d26383d ] The following leaks were detected by ASAN: Indirect leak of 360 byte(s) in 9 object(s) allocated from: #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333 #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59 #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73 #4 0x560578e07045 in test__pmu tests/pmu.c:155 #5 0x560578de109b in run_test tests/builtin-test.c:410 #6 0x560578de109b in test_and_print tests/builtin-test.c:440 #7 0x560578de401a in __cmd_test tests/builtin-test.c:661 #8 0x560578de401a in cmd_test tests/builtin-test.c:807 #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: cff7f95 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jan 5, 2021
Luca Abeni reported this: | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003 | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10 | Call Trace: | rt_spin_lock+0x3f/0x50 | gen6_read32+0x45/0x1d0 [i915] | g4x_get_vblank_counter+0x36/0x40 [i915] | trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915] The tracing events use trace_i915_pipe_update_start() among other events use functions acquire spin locks. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter() wich also might acquire a sleeping lock. Based on this I don't see any other way than disable trace points on RT. Cc: stable-rt@vger.kernel.org Reported-by: Luca Abeni <lucabe72@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
sys-oak
pushed a commit
that referenced
this pull request
Mar 12, 2021
…g system shutdown [ Upstream commit 45a2702 ] During Coldboot stress tests, system encountered the following panic. Panic logs depicts rt5682_i2c_shutdown() happened first and then later jack detect handler workqueue function triggered. This situation causes panic as rt5682_i2c_shutdown() resets codec. Fix this panic by cancelling all jack detection delayed work. Panic log: [ 20.936124] sof_pci_shutdown [ 20.940248] snd_sof_device_shutdown [ 20.945023] snd_sof_shutdown [ 21.126849] rt5682_i2c_shutdown [ 21.286053] rt5682_jack_detect_handler [ 21.291235] BUG: kernel NULL pointer dereference, address: 000000000000037c [ 21.299302] #PF: supervisor read access in kernel mode [ 21.305254] #PF: error_code(0x0000) - not-present page [ 21.311218] PGD 0 P4D 0 [ 21.314155] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 21.319206] CPU: 2 PID: 123 Comm: kworker/2:3 Tainted: G U 5.4.68 #10 [ 21.333687] ACPI: Preparing to enter system sleep state S5 [ 21.337669] Workqueue: events_power_efficient rt5682_jack_detect_handler [snd_soc_rt5682] [ 21.337671] RIP: 0010:rt5682_jack_detect_handler+0x6c/0x279 [snd_soc_rt5682] Fixes: a50067d ('ASoC: rt5682: split i2c driver into separate module') Signed-off-by: Jairaj Arava <jairaj.arava@intel.com> Signed-off-by: Sathyanarayana Nujella <sathyanarayana.nujella@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@intel.com> Reviewed-by: Shuming Fan <shumingf@realtek.com> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/20210205171428.2344210-1-ranjani.sridharan@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 12, 2021
[ Upstream commit c5c97ca ] The ubsan reported the following error. It was because sample's raw data missed u32 padding at the end. So it broke the alignment of the array after it. The raw data contains an u32 size prefix so the data size should have an u32 padding after 8-byte aligned data. 27: Sample parsing :util/synthetic-events.c:1539:4: runtime error: store to misaligned address 0x62100006b9bc for type '__u64' (aka 'unsigned long long'), which requires 8 byte alignment 0x62100006b9bc: note: pointer points here 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13 #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8 #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9 #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9 #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9 #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4 #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9 #7 0x56153264e796 in run_builtin perf.c:312:11 #8 0x56153264cf03 in handle_internal_command perf.c:364:8 #9 0x56153264e47d in run_argv perf.c:408:2 #10 0x56153264c9a9 in main perf.c:538:3 #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc) #12 0x561532596828 in _start ... SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use util/synthetic-events.c:1539:4 in Fixes: 045f8cd ("perf tests: Add a sample parsing test") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 12, 2021
[ Upstream commit c5c97ca ] The ubsan reported the following error. It was because sample's raw data missed u32 padding at the end. So it broke the alignment of the array after it. The raw data contains an u32 size prefix so the data size should have an u32 padding after 8-byte aligned data. 27: Sample parsing :util/synthetic-events.c:1539:4: runtime error: store to misaligned address 0x62100006b9bc for type '__u64' (aka 'unsigned long long'), which requires 8 byte alignment 0x62100006b9bc: note: pointer points here 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13 #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8 #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9 #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9 #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9 #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4 #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9 #7 0x56153264e796 in run_builtin perf.c:312:11 #8 0x56153264cf03 in handle_internal_command perf.c:364:8 #9 0x56153264e47d in run_argv perf.c:408:2 #10 0x56153264c9a9 in main perf.c:538:3 #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc) #12 0x561532596828 in _start ... SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use util/synthetic-events.c:1539:4 in Fixes: 045f8cd ("perf tests: Add a sample parsing test") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 17, 2021
[ Upstream commit c5c97ca ] The ubsan reported the following error. It was because sample's raw data missed u32 padding at the end. So it broke the alignment of the array after it. The raw data contains an u32 size prefix so the data size should have an u32 padding after 8-byte aligned data. 27: Sample parsing :util/synthetic-events.c:1539:4: runtime error: store to misaligned address 0x62100006b9bc for type '__u64' (aka 'unsigned long long'), which requires 8 byte alignment 0x62100006b9bc: note: pointer points here 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13 #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8 #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9 #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9 #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9 #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4 #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9 #7 0x56153264e796 in run_builtin perf.c:312:11 #8 0x56153264cf03 in handle_internal_command perf.c:364:8 #9 0x56153264e47d in run_argv perf.c:408:2 #10 0x56153264c9a9 in main perf.c:538:3 #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc) #12 0x561532596828 in _start ... SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use util/synthetic-events.c:1539:4 in Fixes: 045f8cd ("perf tests: Add a sample parsing test") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 17, 2021
commit 4d14c5c upstream Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_timeout at ffffffffa52a1bdd #3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held #4 start_delalloc_inodes at ffffffffc0380db5 #5 btrfs_start_delalloc_snapshot at ffffffffc0393836 #6 try_flush_qgroup at ffffffffc03f04b2 #7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. #8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex #9 btrfs_update_inode at ffffffffc0385ba8 #10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED #11 touch_atime at ffffffffa4cf0000 #12 generic_file_read_iter at ffffffffa4c1f123 #13 new_sync_read at ffffffffa4ccdc8a #14 vfs_read at ffffffffa4cd0849 #15 ksys_read at ffffffffa4cd0bd1 #16 do_syscall_64 at ffffffffa4a052eb #17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_preempt_disabled at ffffffffa529e80a #3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. #4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex #5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding #6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 #7 cow_file_range at ffffffffc03879c1 #8 btrfs_run_delalloc_range at ffffffffc038894c #9 writepage_delalloc at ffffffffc03a3c8f #10 __extent_writepage at ffffffffc03a4c01 #11 extent_write_cache_pages at ffffffffc03a500b #12 extent_writepages at ffffffffc03a6de2 #13 do_writepages at ffffffffa4c277eb #14 __filemap_fdatawrite_range at ffffffffa4c1e5bb #15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes #16 normal_work_helper at ffffffffc03b706c #17 process_one_work at ffffffffa4aba4e4 #18 worker_thread at ffffffffa4aba6fd #19 kthread at ffffffffa4ac0a3d #20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [sudip: adjust context] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 24, 2021
commit f06391c upstream. [ 6684.493350] Unable to handle kernel paging request at virtual address ffff800011c5b0f0 [ 6684.498531] mmc0: card 0001 removed [ 6684.501556] Mem abort info: [ 6684.509681] ESR = 0x96000047 [ 6684.512786] EC = 0x25: DABT (current EL), IL = 32 bits [ 6684.518394] SET = 0, FnV = 0 [ 6684.521707] EA = 0, S1PTW = 0 [ 6684.524998] Data abort info: [ 6684.528236] ISV = 0, ISS = 0x00000047 [ 6684.532986] CM = 0, WnR = 1 [ 6684.536129] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081b22000 [ 6684.543923] [ffff800011c5b0f0] pgd=00000000bffff003, p4d=00000000bffff003, pud=00000000bfffe003, pmd=00000000900e1003, pte=0000000000000000 [ 6684.557915] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 6684.564240] Modules linked in: sdhci_esdhc_imx(-) sdhci_pltfm sdhci cqhci mmc_block mmc_core fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes crct10dif_ce flexcan can_dev caam error [last unloaded: mmc_core] [ 6684.587281] CPU: 0 PID: 79138 Comm: kworker/0:3H Not tainted 5.10.9-01410-g3ba33182767b-dirty #10 [ 6684.596160] Hardware name: Freescale i.MX8DXL EVK (DT) [ 6684.601320] Workqueue: kblockd blk_mq_run_work_fn [ 6684.606094] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 6684.612286] pc : cqhci_request+0x148/0x4e8 [cqhci] ^GMessage from syslogd@ at Thu Jan 1 01:51:24 1970 ...[ 6684.617085] lr : cqhci_request+0x314/0x4e8 [cqhci] [ 6684.626734] sp : ffff80001243b9f0 [ 6684.630049] x29: ffff80001243b9f0 x28: ffff00002c3dd000 [ 6684.635367] x27: 0000000000000001 x26: 0000000000000001 [ 6684.640690] x25: ffff00002c451000 x24: 000000000000000f [ 6684.646007] x23: ffff000017e71c80 x22: ffff00002c451000 [ 6684.651326] x21: ffff00002c0f3550 x20: ffff00002c0f3550 [ 6684.656651] x19: ffff000017d46880 x18: ffff00002cea1500 [ 6684.661977] x17: 0000000000000000 x16: 0000000000000000 [ 6684.667294] x15: 000001ee628e3ed1 x14: 0000000000000278 [ 6684.672610] x13: 0000000000000001 x12: 0000000000000001 [ 6684.677927] x11: 0000000000000000 x10: 0000000000000000 [ 6684.683243] x9 : 000000000000002b x8 : 0000000000001000 [ 6684.688560] x7 : 0000000000000010 x6 : ffff00002c0f3678 [ 6684.693886] x5 : 000000000000000f x4 : ffff800011c5b000 [ 6684.699211] x3 : 000000000002d988 x2 : 0000000000000008 [ 6684.704537] x1 : 00000000000000f0 x0 : 0002d9880008102f [ 6684.709854] Call trace: [ 6684.712313] cqhci_request+0x148/0x4e8 [cqhci] [ 6684.716803] mmc_cqe_start_req+0x58/0x68 [mmc_core] [ 6684.721698] mmc_blk_mq_issue_rq+0x460/0x810 [mmc_block] [ 6684.727018] mmc_mq_queue_rq+0x118/0x2b0 [mmc_block] The problem occurs when cqhci_request() get called after cqhci_disable() as it leads to access of allocated memory that has already been freed. Let's fix the problem by calling cqhci_disable() a bit later in the remove path. Signed-off-by: Frank Li <Frank.Li@nxp.com> Diagnosed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20210303174248.542175-1-Frank.Li@nxp.com Fixes: f690f44 ("mmc: mmc: Enable CQE's") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 29, 2021
commit f06391c upstream. [ 6684.493350] Unable to handle kernel paging request at virtual address ffff800011c5b0f0 [ 6684.498531] mmc0: card 0001 removed [ 6684.501556] Mem abort info: [ 6684.509681] ESR = 0x96000047 [ 6684.512786] EC = 0x25: DABT (current EL), IL = 32 bits [ 6684.518394] SET = 0, FnV = 0 [ 6684.521707] EA = 0, S1PTW = 0 [ 6684.524998] Data abort info: [ 6684.528236] ISV = 0, ISS = 0x00000047 [ 6684.532986] CM = 0, WnR = 1 [ 6684.536129] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081b22000 [ 6684.543923] [ffff800011c5b0f0] pgd=00000000bffff003, p4d=00000000bffff003, pud=00000000bfffe003, pmd=00000000900e1003, pte=0000000000000000 [ 6684.557915] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 6684.564240] Modules linked in: sdhci_esdhc_imx(-) sdhci_pltfm sdhci cqhci mmc_block mmc_core fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes crct10dif_ce flexcan can_dev caam error [last unloaded: mmc_core] [ 6684.587281] CPU: 0 PID: 79138 Comm: kworker/0:3H Not tainted 5.10.9-01410-g3ba33182767b-dirty #10 [ 6684.596160] Hardware name: Freescale i.MX8DXL EVK (DT) [ 6684.601320] Workqueue: kblockd blk_mq_run_work_fn [ 6684.606094] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 6684.612286] pc : cqhci_request+0x148/0x4e8 [cqhci] ^GMessage from syslogd@ at Thu Jan 1 01:51:24 1970 ...[ 6684.617085] lr : cqhci_request+0x314/0x4e8 [cqhci] [ 6684.626734] sp : ffff80001243b9f0 [ 6684.630049] x29: ffff80001243b9f0 x28: ffff00002c3dd000 [ 6684.635367] x27: 0000000000000001 x26: 0000000000000001 [ 6684.640690] x25: ffff00002c451000 x24: 000000000000000f [ 6684.646007] x23: ffff000017e71c80 x22: ffff00002c451000 [ 6684.651326] x21: ffff00002c0f3550 x20: ffff00002c0f3550 [ 6684.656651] x19: ffff000017d46880 x18: ffff00002cea1500 [ 6684.661977] x17: 0000000000000000 x16: 0000000000000000 [ 6684.667294] x15: 000001ee628e3ed1 x14: 0000000000000278 [ 6684.672610] x13: 0000000000000001 x12: 0000000000000001 [ 6684.677927] x11: 0000000000000000 x10: 0000000000000000 [ 6684.683243] x9 : 000000000000002b x8 : 0000000000001000 [ 6684.688560] x7 : 0000000000000010 x6 : ffff00002c0f3678 [ 6684.693886] x5 : 000000000000000f x4 : ffff800011c5b000 [ 6684.699211] x3 : 000000000002d988 x2 : 0000000000000008 [ 6684.704537] x1 : 00000000000000f0 x0 : 0002d9880008102f [ 6684.709854] Call trace: [ 6684.712313] cqhci_request+0x148/0x4e8 [cqhci] [ 6684.716803] mmc_cqe_start_req+0x58/0x68 [mmc_core] [ 6684.721698] mmc_blk_mq_issue_rq+0x460/0x810 [mmc_block] [ 6684.727018] mmc_mq_queue_rq+0x118/0x2b0 [mmc_block] The problem occurs when cqhci_request() get called after cqhci_disable() as it leads to access of allocated memory that has already been freed. Let's fix the problem by calling cqhci_disable() a bit later in the remove path. Signed-off-by: Frank Li <Frank.Li@nxp.com> Diagnosed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20210303174248.542175-1-Frank.Li@nxp.com Fixes: f690f44 ("mmc: mmc: Enable CQE's") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 21, 2023
[ Upstream commit a84fbf2 ] Generating metrics llc_code_read_mpi_demand_plus_prefetch, llc_data_read_mpi_demand_plus_prefetch, llc_miss_local_memory_bandwidth_read, llc_miss_local_memory_bandwidth_write, nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read, memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw, C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency, C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency, C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would trigger an address sanitizer heap-buffer-overflows on a SkylakeX. ``` ==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98 READ of size 4 at 0x5020003eee78 thread T0 #0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12 #1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6 #2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9 #3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31 #4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18 #5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3 #6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5 #7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2 #8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3 #9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11 #10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8 #11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2 #12 0x558265bb9389 in main tools/perf/perf.c:537:3 ``` The issue was the use of testing a cpumap with NULL rather than using empty, as a map containing the dummy value isn't NULL and the -1 results in an empty aggr map being allocated which legitimately overflows when any member is accessed. Fixes: 8a96f45 ("perf stat: Avoid SEGV if core.cpus isn't set") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230906003912.3317462-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 21, 2023
[ Upstream commit 7196398 ] KMSAN reported the following uninit-value access issue: ===================================================== BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691 page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722 page_frag_alloc include/linux/gfp.h:322 [inline] __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== ppp_sync_input() checks the first 2 bytes of the data are PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when checking PPP_UI. This patch resolves this issue by checking the data length. Fixes: 1da177e ("Linux-2.6.12-rc2") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 21, 2023
[ Upstream commit fb317eb ] KMSAN reported the following kernel-infoleak issue: ===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 copy_to_iter include/linux/uio.h:197 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420 skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg net/socket.c:1066 [inline] __sys_recvfrom+0x476/0x860 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline] tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170 tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545 genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Bytes 34-35 of 36 are uninitialized Memory access of size 36 starts at ffff88802d464a00 Data copied to user address 00007ff55033c0a0 CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and the length of TLV value passed as an argument, and aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes. If the size of struct tlv_desc plus the length of TLV value is not aligned, the current implementation leaves the remaining bytes uninitialized. This is the cause of the above kernel-infoleak issue. This patch resolves this issue by clearing data up to an aligned size. Fixes: d0796d1 ("tipc: convert legacy nl bearer dump to nl compat") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2023
[ Upstream commit a2e36cd ] This allows it to break the following circular locking dependency. Aug 10 07:01:29 dg1test kernel: ====================================================== Aug 10 07:01:29 dg1test kernel: WARNING: possible circular locking dependency detected Aug 10 07:01:29 dg1test kernel: 6.4.0-rc7+ #10 Not tainted Aug 10 07:01:29 dg1test kernel: ------------------------------------------------------ Aug 10 07:01:29 dg1test kernel: wireplumber/2236 is trying to acquire lock: Aug 10 07:01:29 dg1test kernel: ffff8fca5320da18 (&fctx->lock){-...}-{2:2}, at: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: but task is already holding lock: Aug 10 07:01:29 dg1test kernel: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: which lock already depends on the new lock. Aug 10 07:01:29 dg1test kernel: the existing dependency chain (in reverse order) is: Aug 10 07:01:29 dg1test kernel: -> #3 (&event->list_lock#2){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: -> #2 (&device->intr.lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_inth_allow+0x2c/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x181/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> #1 (&event->refs_lock#4){....}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x37/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> #0 (&fctx->lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: other info that might help us debug this: Aug 10 07:01:29 dg1test kernel: Chain exists of: &fctx->lock --> &device->intr.lock --> &event->list_lock#2 Aug 10 07:01:29 dg1test kernel: Possible unsafe locking scenario: Aug 10 07:01:29 dg1test kernel: CPU0 CPU1 Aug 10 07:01:29 dg1test kernel: ---- ---- Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&device->intr.lock); Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&fctx->lock); Aug 10 07:01:29 dg1test kernel: *** DEADLOCK *** Aug 10 07:01:29 dg1test kernel: 2 locks held by wireplumber/2236: Aug 10 07:01:29 dg1test kernel: #0: ffff8fca53177bf8 (&device->intr.lock){-...}-{2:2}, at: nvkm_intr+0x29/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: #1: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: stack backtrace: Aug 10 07:01:29 dg1test kernel: CPU: 6 PID: 2236 Comm: wireplumber Not tainted 6.4.0-rc7+ #10 Aug 10 07:01:29 dg1test kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 Aug 10 07:01:29 dg1test kernel: Call Trace: Aug 10 07:01:29 dg1test kernel: <TASK> Aug 10 07:01:29 dg1test kernel: dump_stack_lvl+0x5b/0x90 Aug 10 07:01:29 dg1test kernel: check_noncircular+0xe2/0x110 Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: ? lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: RIP: 0033:0x7fb66174d700 Aug 10 07:01:29 dg1test kernel: Code: c1 e2 05 29 ca 8d 0c 10 0f be 07 84 c0 75 eb 89 c8 c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa e9 d7 0f fc ff 0f 1f 80 00 00 00 00 <f3> 0f 1e fa e9 c7 0f fc> Aug 10 07:01:29 dg1test kernel: RSP: 002b:00007ffdd3c48438 EFLAGS: 00000206 Aug 10 07:01:29 dg1test kernel: RAX: 000055bb758763c0 RBX: 000055bb758752c0 RCX: 00000000000028b0 Aug 10 07:01:29 dg1test kernel: RDX: 000055bb758752c0 RSI: 000055bb75887490 RDI: 000055bb75862950 Aug 10 07:01:29 dg1test kernel: RBP: 00007ffdd3c48490 R08: 000055bb75873b10 R09: 0000000000000001 Aug 10 07:01:29 dg1test kernel: R10: 0000000000000004 R11: 000055bb7587f000 R12: 000055bb75887490 Aug 10 07:01:29 dg1test kernel: R13: 000055bb757f6280 R14: 000055bb758875c0 R15: 000055bb757f6280 Aug 10 07:01:29 dg1test kernel: </TASK> Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Danilo Krummrich <dakr@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231107053255.2257079-1-airlied@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2023
[ Upstream commit 0550d46 ] KMSAN reported the following uninit-value access issue: lo speed is unknown, defaulting to 1000 ===================================================== BUG: KMSAN: uninit-value in ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline] BUG: KMSAN: uninit-value in ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998 ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline] ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998 siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline] siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203 setup_port_data drivers/infiniband/core/device.c:848 [inline] setup_device drivers/infiniband/core/device.c:1244 [inline] ib_register_device+0x1589/0x1df0 drivers/infiniband/core/device.c:1383 siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline] siw_newlink+0x129e/0x13d0 drivers/infiniband/sw/siw/siw_main.c:490 nldev_newlink+0x8fd/0xa60 drivers/infiniband/core/nldev.c:1763 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0xe8a/0x1120 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Local variable lksettings created at: ib_get_eth_speed+0x4b/0xaf0 drivers/infiniband/core/verbs.c:1974 siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline] siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203 CPU: 0 PID: 11257 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== If __ethtool_get_link_ksettings() fails, `netdev_speed` is set to the default value, SPEED_1000. In this case, if `lanes` field of struct ethtool_link_ksettings is not initialized, an uninitialized value is passed to ib_get_width_and_speed(). This causes the above issue. This patch resolves the issue by initializing `lanes` to 0. Fixes: cb06b6b ("RDMA/core: Get IB width and speed from netdev") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Link: https://lore.kernel.org/r/20231108143113.1360567-1-syoshida@redhat.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2023
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Feb 2, 2024
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Mar 8, 2024
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Apr 9, 2024
[ Upstream commit c957280 ] From commit a304e1b ("[PATCH] Debug shared irqs"), there is a test to make sure the shared irq handler should be able to handle the unexpected event after deregistration. For this case, let's apply MT76_REMOVED flag to indicate the device was removed and do not run into the resource access anymore. BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e] Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115 CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10 Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024 Call Trace: <TASK> dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x1f/0x190 ? mt7921_irq_handler+0xd8/0x100 [mt7921e] ? mt7921_irq_handler+0xd8/0x100 [mt7921e] kasan_report.cold+0x7f/0x11b ? mt7921_irq_handler+0xd8/0x100 [mt7921e] mt7921_irq_handler+0xd8/0x100 [mt7921e] free_irq+0x627/0xaa0 devm_free_irq+0x94/0xd0 ? devm_request_any_context_irq+0x160/0x160 ? kobject_put+0x18d/0x4a0 mt7921_pci_remove+0x153/0x190 [mt7921e] pci_device_remove+0xa2/0x1d0 __device_release_driver+0x346/0x6e0 driver_detach+0x1ef/0x2c0 bus_remove_driver+0xe7/0x2d0 ? __check_object_size+0x57/0x310 pci_unregister_driver+0x26/0x250 __do_sys_delete_module+0x307/0x510 ? free_module+0x6a0/0x6a0 ? fpregs_assert_state_consistent+0x4b/0xb0 ? rcu_read_lock_sched_held+0x10/0x70 ? syscall_enter_from_user_mode+0x20/0x70 ? trace_hardirqs_on+0x1c/0x130 do_syscall_64+0x5c/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 ? do_syscall_64+0x68/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Closes: https://lore.kernel.org/linux-wireless/CABXGCsOdvVwdLmSsC8TZ1jF0UOg_F_W3wqLECWX620PUkvNk=A@mail.gmail.com/ Fixes: 9270270 ("wifi: mt76: mt7921: fix PCI DMA hang after reboot") Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Apr 9, 2024
[ Upstream commit fd5860a ] The loop inside nfs_netfs_issue_read() currently does not disable interrupts while iterating through pages in the xarray to submit for NFS read. This is not safe though since after taking xa_lock, another page in the mapping could be processed for writeback inside an interrupt, and deadlock can occur. The fix is simple and clean if we use xa_for_each_range(), which handles the iteration with RCU while reducing code complexity. The problem is easily reproduced with the following test: mount -o vers=3,fsc 127.0.0.1:/export /mnt/nfs dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1 echo 3 > /proc/sys/vm/drop_caches dd if=/mnt/nfs/file1.bin of=/dev/null umount /mnt/nfs On the console with a lockdep-enabled kernel a message similar to the following will be seen: ================================ WARNING: inconsistent lock state 6.7.0-lockdbg+ #10 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. test5/1708 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888127baa598 (&xa->xa_lock#4){+.?.}-{3:3}, at: nfs_netfs_issue_read+0x1b2/0x4b0 [nfs] {IN-SOFTIRQ-W} state was registered at: lock_acquire+0x144/0x380 _raw_spin_lock_irqsave+0x4e/0xa0 __folio_end_writeback+0x17e/0x5c0 folio_end_writeback+0x93/0x1b0 iomap_finish_ioend+0xeb/0x6a0 blk_update_request+0x204/0x7f0 blk_mq_end_request+0x30/0x1c0 blk_complete_reqs+0x7e/0xa0 __do_softirq+0x113/0x544 __irq_exit_rcu+0xfe/0x120 irq_exit_rcu+0xe/0x20 sysvec_call_function_single+0x6f/0x90 asm_sysvec_call_function_single+0x1a/0x20 pv_native_safe_halt+0xf/0x20 default_idle+0x9/0x20 default_idle_call+0x67/0xa0 do_idle+0x2b5/0x300 cpu_startup_entry+0x34/0x40 start_secondary+0x19d/0x1c0 secondary_startup_64_no_verify+0x18f/0x19b irq event stamp: 176891 hardirqs last enabled at (176891): [<ffffffffa67a0be4>] _raw_spin_unlock_irqrestore+0x44/0x60 hardirqs last disabled at (176890): [<ffffffffa67a0899>] _raw_spin_lock_irqsave+0x79/0xa0 softirqs last enabled at (176646): [<ffffffffa515d91e>] __irq_exit_rcu+0xfe/0x120 softirqs last disabled at (176633): [<ffffffffa515d91e>] __irq_exit_rcu+0xfe/0x120 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xa->xa_lock#4); <Interrupt> lock(&xa->xa_lock#4); *** DEADLOCK *** 2 locks held by test5/1708: #0: ffff888127baa498 (&sb->s_type->i_mutex_key#22){++++}-{4:4}, at: nfs_start_io_read+0x28/0x90 [nfs] #1: ffff888127baa650 (mapping.invalidate_lock#3){.+.+}-{4:4}, at: page_cache_ra_unbounded+0xa4/0x280 stack backtrace: CPU: 6 PID: 1708 Comm: test5 Kdump: loaded Not tainted 6.7.0-lockdbg+ Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014 Call Trace: dump_stack_lvl+0x5b/0x90 mark_lock+0xb3f/0xd20 __lock_acquire+0x77b/0x3360 _raw_spin_lock+0x34/0x80 nfs_netfs_issue_read+0x1b2/0x4b0 [nfs] netfs_begin_read+0x77f/0x980 [netfs] nfs_netfs_readahead+0x45/0x60 [nfs] nfs_readahead+0x323/0x5a0 [nfs] read_pages+0xf3/0x5c0 page_cache_ra_unbounded+0x1c8/0x280 filemap_get_pages+0x38c/0xae0 filemap_read+0x206/0x5e0 nfs_file_read+0xb7/0x140 [nfs] vfs_read+0x2a9/0x460 ksys_read+0xb7/0x140 Fixes: 000dbe0 ("NFS: Convert buffered read paths to use netfs when fscache is enabled") Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
May 10, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
May 17, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jun 26, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 #20 0x55c9400e53ad in main tools/perf/perf.c:561 #21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Jun 26, 2024
commit 9d274c1 upstream. We have been seeing crashes on duplicate keys in btrfs_set_item_key_safe(): BTRFS critical (device vdb): slot 4 key (450 108 8192) new key (450 108 8192) ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:2620! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 3139 Comm: xfs_io Kdump: loaded Not tainted 6.9.0 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:btrfs_set_item_key_safe+0x11f/0x290 [btrfs] With the following stack trace: #0 btrfs_set_item_key_safe (fs/btrfs/ctree.c:2620:4) #1 btrfs_drop_extents (fs/btrfs/file.c:411:4) #2 log_one_extent (fs/btrfs/tree-log.c:4732:9) #3 btrfs_log_changed_extents (fs/btrfs/tree-log.c:4955:9) #4 btrfs_log_inode (fs/btrfs/tree-log.c:6626:9) #5 btrfs_log_inode_parent (fs/btrfs/tree-log.c:7070:8) #6 btrfs_log_dentry_safe (fs/btrfs/tree-log.c:7171:8) #7 btrfs_sync_file (fs/btrfs/file.c:1933:8) #8 vfs_fsync_range (fs/sync.c:188:9) #9 vfs_fsync (fs/sync.c:202:9) #10 do_fsync (fs/sync.c:212:9) #11 __do_sys_fdatasync (fs/sync.c:225:9) #12 __se_sys_fdatasync (fs/sync.c:223:1) #13 __x64_sys_fdatasync (fs/sync.c:223:1) #14 do_syscall_x64 (arch/x86/entry/common.c:52:14) #15 do_syscall_64 (arch/x86/entry/common.c:83:7) #16 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121) So we're logging a changed extent from fsync, which is splitting an extent in the log tree. But this split part already exists in the tree, triggering the BUG(). This is the state of the log tree at the time of the crash, dumped with drgn (https://github.com/osandov/drgn/blob/main/contrib/btrfs_tree.py) to get more details than btrfs_print_leaf() gives us: >>> print_extent_buffer(prog.crashed_thread().stack_trace()[0]["eb"]) leaf 33439744 level 0 items 72 generation 9 owner 18446744073709551610 leaf 33439744 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da item 0 key (450 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 7 transid 9 size 8192 nbytes 8473563889606862198 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 204 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417704.983333333 (2024-05-22 15:41:44) mtime 1716417704.983333333 (2024-05-22 15:41:44) otime 17592186044416.000000000 (559444-03-08 01:40:16) item 1 key (450 INODE_REF 256) itemoff 16110 itemsize 13 index 195 namelen 3 name: 193 item 2 key (450 XATTR_ITEM 1640047104) itemoff 16073 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 3 key (450 EXTENT_DATA 0) itemoff 16020 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 4096 ram 12288 extent compression 0 (none) item 4 key (450 EXTENT_DATA 4096) itemoff 15967 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 4096 nr 8192 item 5 key (450 EXTENT_DATA 8192) itemoff 15914 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 ... So the real problem happened earlier: notice that items 4 (4k-12k) and 5 (8k-12k) overlap. Both are prealloc extents. Item 4 straddles i_size and item 5 starts at i_size. Here is the state of the filesystem tree at the time of the crash: >>> root = prog.crashed_thread().stack_trace()[2]["inode"].root >>> ret, nodes, slots = btrfs_search_slot(root, BtrfsKey(450, 0, 0)) >>> print_extent_buffer(nodes[0]) leaf 30425088 level 0 items 184 generation 9 owner 5 leaf 30425088 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da ... item 179 key (450 INODE_ITEM 0) itemoff 4907 itemsize 160 generation 7 transid 7 size 4096 nbytes 12288 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 6 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417703.220000000 (2024-05-22 15:41:43) mtime 1716417703.220000000 (2024-05-22 15:41:43) otime 1716417703.220000000 (2024-05-22 15:41:43) item 180 key (450 INODE_REF 256) itemoff 4894 itemsize 13 index 195 namelen 3 name: 193 item 181 key (450 XATTR_ITEM 1640047104) itemoff 4857 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 182 key (450 EXTENT_DATA 0) itemoff 4804 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 8192 ram 12288 extent compression 0 (none) item 183 key (450 EXTENT_DATA 8192) itemoff 4751 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 Item 5 in the log tree corresponds to item 183 in the filesystem tree, but nothing matches item 4. Furthermore, item 183 is the last item in the leaf. btrfs_log_prealloc_extents() is responsible for logging prealloc extents beyond i_size. It first truncates any previously logged prealloc extents that start beyond i_size. Then, it walks the filesystem tree and copies the prealloc extent items to the log tree. If it hits the end of a leaf, then it calls btrfs_next_leaf(), which unlocks the tree and does another search. However, while the filesystem tree is unlocked, an ordered extent completion may modify the tree. In particular, it may insert an extent item that overlaps with an extent item that was already copied to the log tree. This may manifest in several ways depending on the exact scenario, including an EEXIST error that is silently translated to a full sync, overlapping items in the log tree, or this crash. This particular crash is triggered by the following sequence of events: - Initially, the file has i_size=4k, a regular extent from 0-4k, and a prealloc extent beyond i_size from 4k-12k. The prealloc extent item is the last item in its B-tree leaf. - The file is fsync'd, which copies its inode item and both extent items to the log tree. - An xattr is set on the file, which sets the BTRFS_INODE_COPY_EVERYTHING flag. - The range 4k-8k in the file is written using direct I/O. i_size is extended to 8k, but the ordered extent is still in flight. - The file is fsync'd. Since BTRFS_INODE_COPY_EVERYTHING is set, this calls copy_inode_items_to_log(), which calls btrfs_log_prealloc_extents(). - btrfs_log_prealloc_extents() finds the 4k-12k prealloc extent in the filesystem tree. Since it starts before i_size, it skips it. Since it is the last item in its B-tree leaf, it calls btrfs_next_leaf(). - btrfs_next_leaf() unlocks the path. - The ordered extent completion runs, which converts the 4k-8k part of the prealloc extent to written and inserts the remaining prealloc part from 8k-12k. - btrfs_next_leaf() does a search and finds the new prealloc extent 8k-12k. - btrfs_log_prealloc_extents() copies the 8k-12k prealloc extent into the log tree. Note that it overlaps with the 4k-12k prealloc extent that was copied to the log tree by the first fsync. - fsync calls btrfs_log_changed_extents(), which tries to log the 4k-8k extent that was written. - This tries to drop the range 4k-8k in the log tree, which requires adjusting the start of the 4k-12k prealloc extent in the log tree to 8k. - btrfs_set_item_key_safe() sees that there is already an extent starting at 8k in the log tree and calls BUG(). Fix this by detecting when we're about to insert an overlapping file extent item in the log tree and truncating the part that would overlap. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Jul 22, 2024
commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Jul 30, 2024
commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Sep 19, 2024
[ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 10, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
commit 7d59ac0 upstream. Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds". kasan report: [ 19.411889] ================================================================== [ 19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113 [ 19.417368] [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G E 6.9.0 #10 [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022 [ 19.422687] Call Trace: [ 19.424091] <TASK> [ 19.425448] dump_stack_lvl+0x5d/0x80 [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.428694] print_report+0x19d/0x52e [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.433539] kasan_report+0xf0/0x170 [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common] [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 [ 19.446337] cpuhp_thread_fun+0x21b/0x610 [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 19.452405] kthread+0x29c/0x350 [ 19.453817] ? __pfx_kthread+0x10/0x10 [ 19.455253] ret_from_fork+0x31/0x70 [ 19.456685] ? __pfx_kthread+0x10/0x10 [ 19.458114] ret_from_fork_asm+0x1a/0x30 [ 19.459573] </TASK> [ 19.460853] [ 19.462055] Allocated by task 1198: [ 19.463410] kasan_save_stack+0x30/0x50 [ 19.464788] kasan_save_track+0x14/0x30 [ 19.466139] __kasan_kmalloc+0xaa/0xb0 [ 19.467465] __kmalloc+0x1cd/0x470 [ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common] [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] [ 19.471670] do_one_initcall+0xa4/0x380 [ 19.472903] do_init_module+0x238/0x760 [ 19.474105] load_module+0x5239/0x6f00 [ 19.475285] init_module_from_file+0xd1/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.478920] do_syscall_64+0x82/0x160 [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 19.481292] [ 19.482205] The buggy address belongs to the object at ffff888829e65000 which belongs to the cache kmalloc-512 of size 512 [ 19.484818] The buggy address is located 0 bytes to the right of allocated 512-byte region [ffff888829e65000, ffff888829e65200) [ 19.487447] [ 19.488328] The buggy address belongs to the physical page: [ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60 [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) [ 19.493914] page_type: 0xffffffff() [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff [ 19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000 [ 19.503784] page dumped because: kasan: bad access detected [ 19.505058] [ 19.505970] Memory state around the buggy address: [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510014] ^ [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.515367] ================================================================== The reason for this error is physical_package_ids assigned by VMware VMM are not continuous and have gaps. This will cause value returned by topology_physical_package_id() to be more than topology_max_packages(). Here the allocation uses topology_max_packages(). The call to topology_max_packages() returns maximum logical package ID not physical ID. Hence use topology_logical_package_id() instead of topology_physical_package_id(). Fixes: 9a1aac8 ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering") Cc: stable@vger.kernel.org Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Zach Wade <zachwade.k@gmail.com> Link: https://lore.kernel.org/r/20240923144508.1764-1-zachwade.k@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 #3 0x5650d804568f in __hists__add_entry util/hist.c:754 #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <yunzhao@cloudflare.com> Signed-off-by: Matt Fleming (Cloudflare) <matt@readmodwrite.com> Cc: Ian Rogers <irogers@google.com> Cc: kernel-team@cloudflare.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org # v5.13+ Link: https://lore.kernel.org/r/20240815142212.3834625-1-matt@readmodwrite.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 #1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 #2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 #3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 #4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 #5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 #6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 #7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 #8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 #9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
…tion to perf_sched__replay() [ Upstream commit c690786 ] The start_work_mutex and work_done_wait_mutex are used only for the 'perf sched replay'. Put their initialization in perf_sched__replay () to reduce unnecessary actions in other commands. Simple functional testing: # perf sched record perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.197 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 14.952 MB perf.data (134165 samples) ] # perf sched replay run measurement overhead: 108 nsecs sleep measurement overhead: 65658 nsecs the run test took 999991 nsecs the sleep test took 1079324 nsecs nr_run_events: 42378 nr_sleep_events: 43102 nr_wakeup_events: 31852 target-less wakeups: 17 multi-target wakeups: 712 task 0 ( swapper: 0), nr_events: 10451 task 1 ( swapper: 1), nr_events: 3 task 2 ( swapper: 2), nr_events: 1 <SNIP> task 717 ( sched-messaging: 74483), nr_events: 152 task 718 ( sched-messaging: 74484), nr_events: 1944 task 719 ( sched-messaging: 74485), nr_events: 73 task 720 ( sched-messaging: 74486), nr_events: 163 task 721 ( sched-messaging: 74487), nr_events: 942 task 722 ( sched-messaging: 74488), nr_events: 78 task 723 ( sched-messaging: 74489), nr_events: 1090 ------------------------------------------------------------ #1 : 1366.507, ravg: 1366.51, cpu: 7682.70 / 7682.70 #2 : 1410.072, ravg: 1370.86, cpu: 7723.88 / 7686.82 #3 : 1396.296, ravg: 1373.41, cpu: 7568.20 / 7674.96 #4 : 1381.019, ravg: 1374.17, cpu: 7531.81 / 7660.64 #5 : 1393.826, ravg: 1376.13, cpu: 7725.25 / 7667.11 #6 : 1401.581, ravg: 1378.68, cpu: 7594.82 / 7659.88 #7 : 1381.337, ravg: 1378.94, cpu: 7371.22 / 7631.01 #8 : 1373.842, ravg: 1378.43, cpu: 7894.92 / 7657.40 #9 : 1364.697, ravg: 1377.06, cpu: 7324.91 / 7624.15 #10 : 1363.613, ravg: 1375.72, cpu: 7209.55 / 7582.69 # echo $? 0 Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240206083228.172607-2-yangjihong1@huawei.com Stable-dep-of: 1a5efc9 ("libsubcmd: Don't free the usage string") Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 29, 2024
…f_sched__{lat|map|replay}() [ Upstream commit bd2cdf2 ] The curr_pid and cpu_last_switched are used only for the 'perf sched replay/latency/map'. Put their initialization in perf_sched__{lat|map|replay () to reduce unnecessary actions in other commands. Simple functional testing: # perf sched record perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.209 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 16.456 MB perf.data (147907 samples) ] # perf sched lat ------------------------------------------------------------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Avg delay ms | Max delay ms | Max delay start | Max delay end | ------------------------------------------------------------------------------------------------------------------------------------------- sched-messaging:(401) | 2990.699 ms | 38705 | avg: 0.661 ms | max: 67.046 ms | max start: 456532.624830 s | max end: 456532.691876 s qemu-system-x86:(7) | 179.764 ms | 2191 | avg: 0.152 ms | max: 21.857 ms | max start: 456532.576434 s | max end: 456532.598291 s sshd:48125 | 0.522 ms | 2 | avg: 0.037 ms | max: 0.046 ms | max start: 456532.514610 s | max end: 456532.514656 s <SNIP> ksoftirqd/11:82 | 0.063 ms | 1 | avg: 0.005 ms | max: 0.005 ms | max start: 456532.769366 s | max end: 456532.769371 s kworker/9:0-mm_:34624 | 0.233 ms | 20 | avg: 0.004 ms | max: 0.007 ms | max start: 456532.690804 s | max end: 456532.690812 s migration/13:93 | 0.000 ms | 1 | avg: 0.004 ms | max: 0.004 ms | max start: 456532.512669 s | max end: 456532.512674 s ----------------------------------------------------------------------------------------------------------------- TOTAL: | 3180.750 ms | 41368 | --------------------------------------------------- # echo $? 0 # perf sched map *A0 456532.510141 secs A0 => migration/0:15 *. 456532.510171 secs . => swapper:0 . *B0 456532.510261 secs B0 => migration/1:21 . *. 456532.510279 secs <SNIP> L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 *L7 . . . . 456532.785979 secs L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 *L7 . . . 456532.786054 secs L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 *L7 . . 456532.786127 secs L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 *L7 . 456532.786197 secs L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 L7 *L7 456532.786270 secs # echo $? 0 # perf sched replay run measurement overhead: 108 nsecs sleep measurement overhead: 66473 nsecs the run test took 1000002 nsecs the sleep test took 1082686 nsecs nr_run_events: 49334 nr_sleep_events: 50054 nr_wakeup_events: 34701 target-less wakeups: 165 multi-target wakeups: 766 task 0 ( swapper: 0), nr_events: 15419 task 1 ( swapper: 1), nr_events: 1 task 2 ( swapper: 2), nr_events: 1 <SNIP> task 715 ( sched-messaging: 110248), nr_events: 1438 task 716 ( sched-messaging: 110249), nr_events: 512 task 717 ( sched-messaging: 110250), nr_events: 500 task 718 ( sched-messaging: 110251), nr_events: 537 task 719 ( sched-messaging: 110252), nr_events: 823 ------------------------------------------------------------ #1 : 1325.288, ravg: 1325.29, cpu: 7823.35 / 7823.35 #2 : 1363.606, ravg: 1329.12, cpu: 7655.53 / 7806.56 #3 : 1349.494, ravg: 1331.16, cpu: 7544.80 / 7780.39 #4 : 1311.488, ravg: 1329.19, cpu: 7495.13 / 7751.86 #5 : 1309.902, ravg: 1327.26, cpu: 7266.65 / 7703.34 #6 : 1309.535, ravg: 1325.49, cpu: 7843.86 / 7717.39 #7 : 1316.482, ravg: 1324.59, cpu: 7854.41 / 7731.09 #8 : 1366.604, ravg: 1328.79, cpu: 7955.81 / 7753.57 #9 : 1326.286, ravg: 1328.54, cpu: 7466.86 / 7724.90 #10 : 1356.653, ravg: 1331.35, cpu: 7566.60 / 7709.07 # echo $? 0 Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240206083228.172607-5-yangjihong1@huawei.com Stable-dep-of: 1a5efc9 ("libsubcmd: Don't free the usage string") Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Oct 30, 2024
[ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Nov 12, 2024
[ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 6, 2024
[ Upstream commit 1e67d86 ] Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: abfeea4 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2024
commit 7d59ac0 upstream. Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds". kasan report: [ 19.411889] ================================================================== [ 19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113 [ 19.417368] [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G E 6.9.0 #10 [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022 [ 19.422687] Call Trace: [ 19.424091] <TASK> [ 19.425448] dump_stack_lvl+0x5d/0x80 [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.428694] print_report+0x19d/0x52e [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.433539] kasan_report+0xf0/0x170 [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common] [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 [ 19.446337] cpuhp_thread_fun+0x21b/0x610 [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 19.452405] kthread+0x29c/0x350 [ 19.453817] ? __pfx_kthread+0x10/0x10 [ 19.455253] ret_from_fork+0x31/0x70 [ 19.456685] ? __pfx_kthread+0x10/0x10 [ 19.458114] ret_from_fork_asm+0x1a/0x30 [ 19.459573] </TASK> [ 19.460853] [ 19.462055] Allocated by task 1198: [ 19.463410] kasan_save_stack+0x30/0x50 [ 19.464788] kasan_save_track+0x14/0x30 [ 19.466139] __kasan_kmalloc+0xaa/0xb0 [ 19.467465] __kmalloc+0x1cd/0x470 [ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common] [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] [ 19.471670] do_one_initcall+0xa4/0x380 [ 19.472903] do_init_module+0x238/0x760 [ 19.474105] load_module+0x5239/0x6f00 [ 19.475285] init_module_from_file+0xd1/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.478920] do_syscall_64+0x82/0x160 [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 19.481292] [ 19.482205] The buggy address belongs to the object at ffff888829e65000 which belongs to the cache kmalloc-512 of size 512 [ 19.484818] The buggy address is located 0 bytes to the right of allocated 512-byte region [ffff888829e65000, ffff888829e65200) [ 19.487447] [ 19.488328] The buggy address belongs to the physical page: [ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60 [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) [ 19.493914] page_type: 0xffffffff() [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff [ 19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000 [ 19.503784] page dumped because: kasan: bad access detected [ 19.505058] [ 19.505970] Memory state around the buggy address: [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510014] ^ [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.515367] ================================================================== The reason for this error is physical_package_ids assigned by VMware VMM are not continuous and have gaps. This will cause value returned by topology_physical_package_id() to be more than topology_max_packages(). Here the allocation uses topology_max_packages(). The call to topology_max_packages() returns maximum logical package ID not physical ID. Hence use topology_logical_package_id() instead of topology_physical_package_id(). Fixes: 9a1aac8 ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering") Cc: stable@vger.kernel.org Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Zach Wade <zachwade.k@gmail.com> Link: https://lore.kernel.org/r/20240923144508.1764-1-zachwade.k@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2024
commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 #3 0x5650d804568f in __hists__add_entry util/hist.c:754 #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <yunzhao@cloudflare.com> Signed-off-by: Matt Fleming (Cloudflare) <matt@readmodwrite.com> Cc: Ian Rogers <irogers@google.com> Cc: kernel-team@cloudflare.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org # v5.13+ Link: https://lore.kernel.org/r/20240815142212.3834625-1-matt@readmodwrite.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Dec 26, 2024
commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 #1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 #2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 #3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 #4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 #5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 #6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 #7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 #8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 #9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sys-oak
pushed a commit
that referenced
this pull request
Jan 8, 2025
Luca Abeni reported this: | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003 | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10 | Call Trace: | rt_spin_lock+0x3f/0x50 | gen6_read32+0x45/0x1d0 [i915] | g4x_get_vblank_counter+0x36/0x40 [i915] | trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915] The tracing events use trace_intel_pipe_update_start() among other events use functions acquire spinlock_t locks which are transformed into sleeping locks on PREEMPT_RT. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter() wich also might acquire a sleeping locks on PREEMPT_RT. At the time the arguments are evaluated within trace point, preemption is disabled and so the locks must not be acquired on PREEMPT_RT. Based on this I don't see any other way than disable trace points on PREMPT_RT. Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reported-by: Luca Abeni <lucabe72@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.