forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM: dts: stm32: use pll4_p for ethernet PTP clock #4
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
IEEE 1588 support needs a clock defined as ptp_ref and this uses pll4_p as the parent clock
Hi Dan, Thanks for your patch. It's something I use internally to test ptp feature. However, it's not yet in branch as pll4_p could be shared with other devices and his rate changed. I gonna raise an internal ticket to see if we can change default clock tree to make ptp feature enable. In this case, no longer need to add an assigned-clock parent in device tree. regards |
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
[ Upstream commit 102210f ] rmnet_get_port() internally calls rcu_dereference_rtnl(), which checks RTNL. But rmnet_get_port() could be called by packet path. The packet path is not protected by RTNL. So, the suspicious RCU usage problem occurs. Test commands: modprobe rmnet ip netns add nst ip link add veth0 type veth peer name veth1 ip link set veth1 netns nst ip link add rmnet0 link veth0 type rmnet mux_id 1 ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1 ip netns exec nst ip link set veth1 up ip netns exec nst ip link set rmnet1 up ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1 ip link set veth0 up ip link set rmnet0 up ip a a 192.168.100.1/24 dev rmnet0 ping 192.168.100.2 Splat looks like: [ 146.630958][ T1174] WARNING: suspicious RCU usage [ 146.631735][ T1174] 5.6.0-rc1+ torvalds#447 Not tainted [ 146.632387][ T1174] ----------------------------- [ 146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() ! [ 146.634742][ T1174] [ 146.634742][ T1174] other info that might help us debug this: [ 146.634742][ T1174] [ 146.645992][ T1174] [ 146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1 [ 146.646937][ T1174] 5 locks held by ping/1174: [ 146.647609][ T1174] #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980 [ 146.662463][ T1174] #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150 [ 146.671696][ T1174] #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940 [ 146.673064][ T1174] #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150 [ 146.690358][ T1174] #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020 [ 146.699875][ T1174] [ 146.699875][ T1174] stack backtrace: [ 146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ torvalds#447 [ 146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 146.706565][ T1174] Call Trace: [ 146.707102][ T1174] dump_stack+0x96/0xdb [ 146.708007][ T1174] rmnet_get_port.part.9+0x76/0x80 [rmnet] [ 146.709233][ T1174] rmnet_egress_handler+0x107/0x420 [rmnet] [ 146.710492][ T1174] ? sch_direct_xmit+0x1e2/0x1020 [ 146.716193][ T1174] rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet] [ 146.717012][ T1174] dev_hard_start_xmit+0x160/0x740 [ 146.717854][ T1174] sch_direct_xmit+0x265/0x1020 [ 146.718577][ T1174] ? register_lock_class+0x14d0/0x14d0 [ 146.719429][ T1174] ? dev_watchdog+0xac0/0xac0 [ 146.723738][ T1174] ? __dev_queue_xmit+0x15fd/0x2940 [ 146.724469][ T1174] ? lock_acquire+0x164/0x3b0 [ 146.725172][ T1174] __dev_queue_xmit+0x20c7/0x2940 [ ... ] Fixes: ceed73a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
[ Upstream commit 6c5d911 ] journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
[ Upstream commit a866759 ] This reverts commit 64e62bd. This commit ends up causing some lockdep splats due to trying to grab the payload lock while holding the mgr's lock: [ 54.010099] [ 54.011765] ====================================================== [ 54.018670] WARNING: possible circular locking dependency detected [ 54.025577] 5.5.0-rc6-02274-g77381c23ee63 torvalds#47 Not tainted [ 54.031610] ------------------------------------------------------ [ 54.038516] kworker/1:6/1040 is trying to acquire lock: [ 54.044354] ffff888272af3228 (&mgr->payload_lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.054957] [ 54.054957] but task is already holding lock: [ 54.061473] ffff888272af3060 (&mgr->lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4 [ 54.071193] [ 54.071193] which lock already depends on the new lock. [ 54.071193] [ 54.080334] [ 54.080334] the existing dependency chain (in reverse order) is: [ 54.088697] [ 54.088697] -> #1 (&mgr->lock){+.+.}: [ 54.094440] __mutex_lock+0xc3/0x498 [ 54.099015] drm_dp_mst_topology_get_port_validated+0x25/0x80 [ 54.106018] drm_dp_update_payload_part1+0xa2/0x2e2 [ 54.112051] intel_mst_pre_enable_dp+0x144/0x18f [ 54.117791] intel_encoders_pre_enable+0x63/0x70 [ 54.123532] hsw_crtc_enable+0xa1/0x722 [ 54.128396] intel_update_crtc+0x50/0x194 [ 54.133455] skl_commit_modeset_enables+0x40c/0x540 [ 54.139485] intel_atomic_commit_tail+0x5f7/0x130d [ 54.145418] intel_atomic_commit+0x2c8/0x2d8 [ 54.150770] drm_atomic_helper_set_config+0x5a/0x70 [ 54.156801] drm_mode_setcrtc+0x2ab/0x833 [ 54.161862] drm_ioctl+0x2e5/0x424 [ 54.166242] vfs_ioctl+0x21/0x2f [ 54.170426] do_vfs_ioctl+0x5fb/0x61e [ 54.175096] ksys_ioctl+0x55/0x75 [ 54.179377] __x64_sys_ioctl+0x1a/0x1e [ 54.184146] do_syscall_64+0x5c/0x6d [ 54.188721] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 54.194946] [ 54.194946] -> #0 (&mgr->payload_lock){+.+.}: [ 54.201463] [ 54.201463] other info that might help us debug this: [ 54.201463] [ 54.210410] Possible unsafe locking scenario: [ 54.210410] [ 54.217025] CPU0 CPU1 [ 54.222082] ---- ---- [ 54.227138] lock(&mgr->lock); [ 54.230643] lock(&mgr->payload_lock); [ 54.237742] lock(&mgr->lock); [ 54.244062] lock(&mgr->payload_lock); [ 54.248346] [ 54.248346] *** DEADLOCK *** [ 54.248346] [ 54.254959] 7 locks held by kworker/1:6/1040: [ 54.259822] #0: ffff888275c4f528 ((wq_completion)events){+.+.}, at: worker_thread+0x455/0x6e2 [ 54.269451] #1: ffffc9000119beb0 ((work_completion)(&(&dev_priv->hotplug.hotplug_work)->work)){+.+.}, at: worker_thread+0x455/0x6e2 [ 54.282768] #2: ffff888272a403f0 (&dev->mode_config.mutex){+.+.}, at: i915_hotplug_work_func+0x4b/0x2be [ 54.293368] #3: ffffffff824fc6c0 (drm_connector_list_iter){.+.+}, at: i915_hotplug_work_func+0x17e/0x2be [ 54.304061] #4: ffffc9000119bc58 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x40/0xfd [ 54.314855] #5: ffff888272a40470 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x74/0xe2 [ 54.324385] #6: ffff888272af3060 (&mgr->lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4 [ 54.334597] [ 54.334597] stack backtrace: [ 54.339464] CPU: 1 PID: 1040 Comm: kworker/1:6 Not tainted 5.5.0-rc6-02274-g77381c23ee63 torvalds#47 [ 54.348893] Hardware name: Google Fizz/Fizz, BIOS Google_Fizz.10139.39.0 01/04/2018 [ 54.357451] Workqueue: events i915_hotplug_work_func [ 54.362995] Call Trace: [ 54.365724] dump_stack+0x71/0x9c [ 54.369427] check_noncircular+0x91/0xbc [ 54.373809] ? __lock_acquire+0xc9e/0xf66 [ 54.378286] ? __lock_acquire+0xc9e/0xf66 [ 54.382763] ? lock_acquire+0x175/0x1ac [ 54.387048] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.393177] ? __mutex_lock+0xc3/0x498 [ 54.397362] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.403492] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.409620] ? drm_dp_dpcd_access+0xd9/0x101 [ 54.414390] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.420517] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.426645] ? intel_digital_port_connected+0x34d/0x35c [ 54.432482] ? intel_dp_detect+0x227/0x44e [ 54.437056] ? ww_mutex_lock+0x49/0x9a [ 54.441242] ? drm_helper_probe_detect_ctx+0x75/0xfd [ 54.446789] ? intel_encoder_hotplug+0x4b/0x97 [ 54.451752] ? intel_ddi_hotplug+0x61/0x2e0 [ 54.456423] ? mark_held_locks+0x53/0x68 [ 54.460803] ? _raw_spin_unlock_irqrestore+0x3a/0x51 [ 54.466347] ? lockdep_hardirqs_on+0x187/0x1a4 [ 54.471310] ? drm_connector_list_iter_next+0x89/0x9a [ 54.476953] ? i915_hotplug_work_func+0x206/0x2be [ 54.482208] ? worker_thread+0x4d5/0x6e2 [ 54.486587] ? worker_thread+0x455/0x6e2 [ 54.490966] ? queue_work_on+0x64/0x64 [ 54.495151] ? kthread+0x1e9/0x1f1 [ 54.498946] ? queue_work_on+0x64/0x64 [ 54.503130] ? kthread_unpark+0x5e/0x5e [ 54.507413] ? ret_from_fork+0x3a/0x50 The proper fix for this is probably cleanup the VCPI allocations when we're enabling the topology, or on the first payload allocation. For now though, let's just revert. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 64e62bd ("drm/dp_mst: Remove VCPI while disabling topology mgr") Cc: Sean Paul <sean@poorly.run> Cc: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200117205149.97262-1-lyude@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
commit 9f61419 upstream. Looks like the dma_sync calls don't do what we want on armv7 either. Fixes: Unable to handle kernel paging request at virtual address 50001000 pgd = (ptrval) [50001000] *pgd=00000000 Internal error: Oops: 805 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc6-00271-g9f159ae07f07 #4 Hardware name: Freescale i.MX53 (Device Tree Support) PC is at v7_dma_clean_range+0x20/0x38 LR is at __dma_page_cpu_to_dev+0x28/0x90 pc : [<c011c76c>] lr : [<c01181c4>] psr: 20000013 sp : d80b5a88 ip : de96c000 fp : d840ce6c r10: 00000000 r9 : 00000001 r8 : d843e010 r7 : 00000000 r6 : 00008000 r5 : ddb6c000 r4 : 00000000 r3 : 0000003f r2 : 00000040 r1 : 50008000 r0 : 50001000 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 70004019 DAC: 00000051 Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: 3de433c ("drm/msm: Use the correct dma_sync calls in msm_gem") Tested-by: Fabio Estevam <festevam@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
commit 56df70a upstream. find_mergeable_vma() can return NULL. In this case, it leads to a crash when we access vm_mm(its offset is 0x40) later in write_protect_page. And this case did happen on our server. The following call trace is captured in kernel 4.19 with the following patch applied and KSM zero page enabled on our server. commit e86c59b ("mm/ksm: improve deduplication of zero pages with colouring") So add a vma check to fix it. BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 Oops: 0000 [#1] SMP NOPTI CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9 RIP: try_to_merge_one_page+0xc7/0x760 Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4 60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49> 8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48 RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246 RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000 RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000 RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577 R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000 R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40 FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ksm_scan_thread+0x115e/0x1960 kthread+0xf5/0x130 ret_from_fork+0x1f/0x30 [songmuchun@bytedance.com: if the vma is out of date, just exit] Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com [akpm@linux-foundation.org: add the conventional braces, replace /** with /*] Fixes: e86c59b ("mm/ksm: improve deduplication of zero pages with colouring") Co-developed-by: Xiongchun Duan <duanxiongchun@bytedance.com> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Hugh Dickins <hughd@google.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com Link: http://lkml.kernel.org/r/20200414132905.83819-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
commit e84fe99 upstream. Without CONFIG_PREEMPT, it can happen that we get soft lockups detected, e.g., while booting up. watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: __pageblock_pfn_to_page+0x134/0x1c0 Call Trace: set_zone_contiguous+0x56/0x70 page_alloc_init_late+0x166/0x176 kernel_init_freeable+0xfa/0x255 kernel_init+0xa/0x106 ret_from_fork+0x35/0x40 The issue becomes visible when having a lot of memory (e.g., 4TB) assigned to a single NUMA node - a system that can easily be created using QEMU. Inside VMs on a hypervisor with quite some memory overcommit, this is fairly easy to trigger. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Jun 22, 2020
commit 00e2176 upstream. The check for the HWO flag in dwc3_gadget_ep_reclaim_trb_sg() causes us to break out of the loop before we call dwc3_gadget_ep_reclaim_completed_trb(), which is what likely should be clearing the HWO flag. This can cause odd behavior where we never reclaim all the trbs in the sg list, so we never call giveback on a usb req, and that will causes transfer stalls. This effectively resovles the adb stalls seen on HiKey960 after userland changes started only using AIO in adbd. Cc: YongQin Liu <yongqin.liu@linaro.org> Cc: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> Cc: Yang Fei <fei.yang@intel.com> Cc: Thinh Nguyen <thinhn@synopsys.com> Cc: Tejas Joglekar <tejas.joglekar@synopsys.com> Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Cc: Jack Pham <jackp@codeaurora.org> Cc: Josh Gao <jmgao@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Felipe Balbi <balbi@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: stable@vger.kernel.org #4.20+ Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit 936e6b8 upstream. Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a5 ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
[ Upstream commit a066562 ] Enable promisc mode of PF, set VF link state to enable, and run iperf of the VF, then do self test of the PF. The self test will fail with a low frequency, and may cause a use-after-free problem. [ 87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 87.159722] ================================================================== [ 87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608 [ 87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186 [ 87.201012] [ 87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty #4 [ 87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020 [ 87.238292] Call trace: [ 87.243173] dump_backtrace+0x0/0x280 [ 87.250491] show_stack+0x24/0x30 [ 87.257114] dump_stack+0xe8/0x140 [ 87.263911] print_address_description.isra.8+0x70/0x380 [ 87.274538] __kasan_report+0x12c/0x230 [ 87.282203] kasan_report+0xc/0x18 [ 87.288999] __asan_load1+0x60/0x68 [ 87.295969] hex_dump_to_buffer+0x140/0x608 [ 87.304332] print_hex_dump+0x140/0x1e0 [ 87.312000] hns3_lb_check_skb_data+0x168/0x170 [ 87.321060] hns3_clean_rx_ring+0xa94/0xfe0 [ 87.329422] hns3_self_test+0x708/0x8c0 The length of packet sent by the selftest process is only 128 + 14 bytes, and the min buffer size of a BD is 256 bytes, and the receive process will make sure the packet sent by the selftest process is in the linear part, so only check the linear part in hns3_lb_check_skb_data(). So fix this use-after-free by using skb_headlen() to dump skb->data instead of skb->len. Fixes: c39c4d9 ("net: hns3: Add mac loopback selftest support in hns3 driver") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit f5e5677 upstream. NULL pointer exception happens occasionally on serial output initiated by login timeout. This was reproduced only if kernel was built with significant debugging options and EDMA driver is used with serial console. col-vf50 login: root Password: Login timed out after 60 seconds. Unable to handle kernel NULL pointer dereference at virtual address 00000044 Internal error: Oops: 5 [#1] ARM CPU: 0 PID: 157 Comm: login Not tainted 5.7.0-next-20200610-dirty #4 Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) (fsl_edma_tx_handler) from [<8016eb10>] (__handle_irq_event_percpu+0x64/0x304) (__handle_irq_event_percpu) from [<8016eddc>] (handle_irq_event_percpu+0x2c/0x7c) (handle_irq_event_percpu) from [<8016ee64>] (handle_irq_event+0x38/0x5c) (handle_irq_event) from [<801729e4>] (handle_fasteoi_irq+0xa4/0x160) (handle_fasteoi_irq) from [<8016ddcc>] (generic_handle_irq+0x34/0x44) (generic_handle_irq) from [<8016e40c>] (__handle_domain_irq+0x54/0xa8) (__handle_domain_irq) from [<80508bc8>] (gic_handle_irq+0x4c/0x80) (gic_handle_irq) from [<80100af0>] (__irq_svc+0x70/0x98) Exception stack(0x8459fe80 to 0x8459fec8) fe80: 72286b00 e3359f64 00000001 0000412d a0070013 85c98840 85c98840 a0070013 fea0: 8054e0d4 00000000 00000002 00000000 00000002 8459fed0 8081fbe8 8081fbec fec0: 60070013 ffffffff (__irq_svc) from [<8081fbec>] (_raw_spin_unlock_irqrestore+0x30/0x58) (_raw_spin_unlock_irqrestore) from [<8056cb48>] (uart_flush_buffer+0x88/0xf8) (uart_flush_buffer) from [<80554e60>] (tty_ldisc_hangup+0x38/0x1ac) (tty_ldisc_hangup) from [<8054c7f4>] (__tty_hangup+0x158/0x2bc) (__tty_hangup) from [<80557b90>] (disassociate_ctty.part.1+0x30/0x23c) (disassociate_ctty.part.1) from [<8011fc18>] (do_exit+0x580/0xba0) (do_exit) from [<801214f8>] (do_group_exit+0x3c/0xb4) (do_group_exit) from [<80121580>] (__wake_up_parent+0x0/0x14) Issue looks like race condition between interrupt handler fsl_edma_tx_handler() (called as result of fsl_edma_xfer_desc()) and terminating the transfer with fsl_edma_terminate_all(). The fsl_edma_tx_handler() handles interrupt for a transfer with already freed edesc and idle==true. Fixes: d6be34f ("dma: Add Freescale eDMA engine driver support") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Robin Gong <yibin.gong@nxp.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1591877861-28156-2-git-send-email-krzk@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit fc846e9 upstream. The `INSN_CONFIG` comedi instruction with sub-instruction code `INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is used as a right shift amount for other bitmask values without being checked. Shift amounts greater than or equal to 32 will result in undefined behavior. Add code to deal with this, adjusting the checks for invalid channels so that enabled channel bits that would have been lost by shifting are also checked for validity. Only channels 0 to 15 are valid. Fixes: a8c66b6 ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions") Cc: <stable@vger.kernel.org> #4.0+: ef75e14: staging: comedi: verify array index is correct before using it Cc: <stable@vger.kernel.org> #4.0+ Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit 254503a upstream. The drm/omap driver was fixed to correct an issue where using a divider of 32 breaks the DSS despite the TRM stating 32 is a valid number. Through experimentation, it appears that 31 works, and it is consistent with the value used by the drm/omap driver. This patch fixes the divider for fbdev driver instead of the drm. Fixes: f76ee89 ("omapfb: copy omapdss & displays for omapfb") Cc: <stable@vger.kernel.org> #4.5+ Signed-off-by: Adam Ford <aford173@gmail.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Rob Clark <robdclark@gmail.com> [b.zolnierkie: mark patch as applicable to stable 4.5+ (was 4.9+)] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200630182636.439015-1-aford173@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit 18c850f upstream. There's long existed a lockdep splat because we open our bdev's under the ->device_list_mutex at mount time, which acquires the bd_mutex. Usually this goes unnoticed, but if you do loopback devices at all suddenly the bd_mutex comes with a whole host of other dependencies, which results in the splat when you mount a btrfs file system. ====================================================== WARNING: possible circular locking dependency detected 5.8.0-0.rc3.1.fc33.x86_64+debug #1 Not tainted ------------------------------------------------------ systemd-journal/509 is trying to acquire lock: ffff970831f84db0 (&fs_info->reloc_mutex){+.+.}-{3:3}, at: btrfs_record_root_in_trans+0x44/0x70 [btrfs] but task is already holding lock: ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (sb_pagefaults){.+.+}-{0:0}: __sb_start_write+0x13e/0x220 btrfs_page_mkwrite+0x59/0x560 [btrfs] do_page_mkwrite+0x4f/0x130 do_wp_page+0x3b0/0x4f0 handle_mm_fault+0xf47/0x1850 do_user_addr_fault+0x1fc/0x4b0 exc_page_fault+0x88/0x300 asm_exc_page_fault+0x1e/0x30 -> #5 (&mm->mmap_lock#2){++++}-{3:3}: __might_fault+0x60/0x80 _copy_from_user+0x20/0xb0 get_sg_io_hdr+0x9a/0xb0 scsi_cmd_ioctl+0x1ea/0x2f0 cdrom_ioctl+0x3c/0x12b4 sr_block_ioctl+0xa4/0xd0 block_ioctl+0x3f/0x50 ksys_ioctl+0x82/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #4 (&cd->lock){+.+.}-{3:3}: __mutex_lock+0x7b/0x820 sr_block_open+0xa2/0x180 __blkdev_get+0xdd/0x550 blkdev_get+0x38/0x150 do_dentry_open+0x16b/0x3e0 path_openat+0x3c9/0xa00 do_filp_open+0x75/0x100 do_sys_openat2+0x8a/0x140 __x64_sys_openat+0x46/0x70 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #3 (&bdev->bd_mutex){+.+.}-{3:3}: __mutex_lock+0x7b/0x820 __blkdev_get+0x6a/0x550 blkdev_get+0x85/0x150 blkdev_get_by_path+0x2c/0x70 btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs] open_fs_devices+0x88/0x240 [btrfs] btrfs_open_devices+0x92/0xa0 [btrfs] btrfs_mount_root+0x250/0x490 [btrfs] legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x119/0x380 [btrfs] legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 do_mount+0x8c6/0xca0 __x64_sys_mount+0x8e/0xd0 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #2 (&fs_devs->device_list_mutex){+.+.}-{3:3}: __mutex_lock+0x7b/0x820 btrfs_run_dev_stats+0x36/0x420 [btrfs] commit_cowonly_roots+0x91/0x2d0 [btrfs] btrfs_commit_transaction+0x4e6/0x9f0 [btrfs] btrfs_sync_file+0x38a/0x480 [btrfs] __x64_sys_fdatasync+0x47/0x80 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (&fs_info->tree_log_mutex){+.+.}-{3:3}: __mutex_lock+0x7b/0x820 btrfs_commit_transaction+0x48e/0x9f0 [btrfs] btrfs_sync_file+0x38a/0x480 [btrfs] __x64_sys_fdatasync+0x47/0x80 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&fs_info->reloc_mutex){+.+.}-{3:3}: __lock_acquire+0x1241/0x20c0 lock_acquire+0xb0/0x400 __mutex_lock+0x7b/0x820 btrfs_record_root_in_trans+0x44/0x70 [btrfs] start_transaction+0xd2/0x500 [btrfs] btrfs_dirty_inode+0x44/0xd0 [btrfs] file_update_time+0xc6/0x120 btrfs_page_mkwrite+0xda/0x560 [btrfs] do_page_mkwrite+0x4f/0x130 do_wp_page+0x3b0/0x4f0 handle_mm_fault+0xf47/0x1850 do_user_addr_fault+0x1fc/0x4b0 exc_page_fault+0x88/0x300 asm_exc_page_fault+0x1e/0x30 other info that might help us debug this: Chain exists of: &fs_info->reloc_mutex --> &mm->mmap_lock#2 --> sb_pagefaults Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(sb_pagefaults); lock(&mm->mmap_lock#2); lock(sb_pagefaults); lock(&fs_info->reloc_mutex); *** DEADLOCK *** 3 locks held by systemd-journal/509: #0: ffff97083bdec8b8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x12e/0x4b0 #1: ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs] #2: ffff97083144d6a8 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x3f8/0x500 [btrfs] stack backtrace: CPU: 0 PID: 509 Comm: systemd-journal Not tainted 5.8.0-0.rc3.1.fc33.x86_64+debug #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack+0x92/0xc8 check_noncircular+0x134/0x150 __lock_acquire+0x1241/0x20c0 lock_acquire+0xb0/0x400 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs] ? lock_acquire+0xb0/0x400 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs] __mutex_lock+0x7b/0x820 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs] ? kvm_sched_clock_read+0x14/0x30 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xb0 btrfs_record_root_in_trans+0x44/0x70 [btrfs] start_transaction+0xd2/0x500 [btrfs] btrfs_dirty_inode+0x44/0xd0 [btrfs] file_update_time+0xc6/0x120 btrfs_page_mkwrite+0xda/0x560 [btrfs] ? sched_clock+0x5/0x10 do_page_mkwrite+0x4f/0x130 do_wp_page+0x3b0/0x4f0 handle_mm_fault+0xf47/0x1850 do_user_addr_fault+0x1fc/0x4b0 exc_page_fault+0x88/0x300 ? asm_exc_page_fault+0x8/0x30 asm_exc_page_fault+0x1e/0x30 RIP: 0033:0x7fa3972fdbfe Code: Bad RIP value. Fix this by not holding the ->device_list_mutex at this point. The device_list_mutex exists to protect us from modifying the device list while the file system is running. However it can also be modified by doing a scan on a device. But this action is specifically protected by the uuid_mutex, which we are holding here. We cannot race with opening at this point because we have the ->s_mount lock held during the mount. Not having the ->device_list_mutex here is perfectly safe as we're not going to change the devices at this point. CC: stable@vger.kernel.org # 4.19+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add some comments ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM
pushed a commit
that referenced
this pull request
Sep 25, 2020
commit 2d9a2c5 upstream. Before v4.15 commit 75492a5 ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
atorgue
added a commit
that referenced
this pull request
Nov 2, 2020
Mutex protection in not needed in _allocate_opp_table as it this function is already called under mutex lock. Furthermore, it fixes the following issue seen with CONFIG_PROVE_LOCKING flag: [ 3.648912] ========================= [ 3.648927] WARNING: held lock freed! [ 3.648944] 5.4.56-00014-g7cc4a3bdec16 #7 Not tainted [ 3.648957] ------------------------- [ 3.648976] kworker/1:1/22 is freeing memory df183c00-df183fff, with a lock still held there! [ 3.648991] df183ca0 (&opp_table->lock){+.+.}, at: _opp_get_opp_table+0x1cc/0x224 [ 3.649037] 5 locks held by kworker/1:1/22: [ 3.649049] #0: df0086a0 ((wq_completion)events){+.+.}, at: process_one_work+0x200/0x858 [ 3.649082] #1: df515f1c (deferred_probe_work){+.+.}, at: process_one_work+0x200/0x858 [ 3.649110] #2: d1ff08cc (&dev->mutex){....}, at: __device_attach+0x34/0x17c [ 3.649145] #3: c139e5e0 (opp_table_lock){+.+.}, at: _opp_get_opp_table+0x1c/0x224 [ 3.649175] #4: df183ca0 (&opp_table->lock){+.+.}, at: _opp_get_opp_table+0x1cc/0x224 Fixes: cf1f361 ("opp: core: fix memory leak in probe deferral") Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> Change-Id: I6da51fab26b698bb803a3a8ddf5454f73d8af956 Reviewed-on: https://gerrit.st.com/c/mpu/oe/st/linux-stm32/+/178474 Reviewed-by: CITOOLS <smet-aci-reviews@lists.codex.cro.st.com>
stalyatech
pushed a commit
to stalyatech/linux
that referenced
this pull request
Feb 26, 2021
commit 936e6b8 upstream. Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a5 ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [STMicroelectronics#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> STMicroelectronics#4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stalyatech
pushed a commit
to stalyatech/linux
that referenced
this pull request
Feb 26, 2021
[ Upstream commit a066562 ] Enable promisc mode of PF, set VF link state to enable, and run iperf of the VF, then do self test of the PF. The self test will fail with a low frequency, and may cause a use-after-free problem. [ 87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 87.159722] ================================================================== [ 87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608 [ 87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186 [ 87.201012] [ 87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty STMicroelectronics#4 [ 87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020 [ 87.238292] Call trace: [ 87.243173] dump_backtrace+0x0/0x280 [ 87.250491] show_stack+0x24/0x30 [ 87.257114] dump_stack+0xe8/0x140 [ 87.263911] print_address_description.isra.8+0x70/0x380 [ 87.274538] __kasan_report+0x12c/0x230 [ 87.282203] kasan_report+0xc/0x18 [ 87.288999] __asan_load1+0x60/0x68 [ 87.295969] hex_dump_to_buffer+0x140/0x608 [ 87.304332] print_hex_dump+0x140/0x1e0 [ 87.312000] hns3_lb_check_skb_data+0x168/0x170 [ 87.321060] hns3_clean_rx_ring+0xa94/0xfe0 [ 87.329422] hns3_self_test+0x708/0x8c0 The length of packet sent by the selftest process is only 128 + 14 bytes, and the min buffer size of a BD is 256 bytes, and the receive process will make sure the packet sent by the selftest process is in the linear part, so only check the linear part in hns3_lb_check_skb_data(). So fix this use-after-free by using skb_headlen() to dump skb->data instead of skb->len. Fixes: c39c4d9 ("net: hns3: Add mac loopback selftest support in hns3 driver") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
stalyatech
pushed a commit
to stalyatech/linux
that referenced
this pull request
Feb 26, 2021
commit f5e5677 upstream. NULL pointer exception happens occasionally on serial output initiated by login timeout. This was reproduced only if kernel was built with significant debugging options and EDMA driver is used with serial console. col-vf50 login: root Password: Login timed out after 60 seconds. Unable to handle kernel NULL pointer dereference at virtual address 00000044 Internal error: Oops: 5 [STMicroelectronics#1] ARM CPU: 0 PID: 157 Comm: login Not tainted 5.7.0-next-20200610-dirty STMicroelectronics#4 Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) (fsl_edma_tx_handler) from [<8016eb10>] (__handle_irq_event_percpu+0x64/0x304) (__handle_irq_event_percpu) from [<8016eddc>] (handle_irq_event_percpu+0x2c/0x7c) (handle_irq_event_percpu) from [<8016ee64>] (handle_irq_event+0x38/0x5c) (handle_irq_event) from [<801729e4>] (handle_fasteoi_irq+0xa4/0x160) (handle_fasteoi_irq) from [<8016ddcc>] (generic_handle_irq+0x34/0x44) (generic_handle_irq) from [<8016e40c>] (__handle_domain_irq+0x54/0xa8) (__handle_domain_irq) from [<80508bc8>] (gic_handle_irq+0x4c/0x80) (gic_handle_irq) from [<80100af0>] (__irq_svc+0x70/0x98) Exception stack(0x8459fe80 to 0x8459fec8) fe80: 72286b00 e3359f64 00000001 0000412d a0070013 85c98840 85c98840 a0070013 fea0: 8054e0d4 00000000 00000002 00000000 00000002 8459fed0 8081fbe8 8081fbec fec0: 60070013 ffffffff (__irq_svc) from [<8081fbec>] (_raw_spin_unlock_irqrestore+0x30/0x58) (_raw_spin_unlock_irqrestore) from [<8056cb48>] (uart_flush_buffer+0x88/0xf8) (uart_flush_buffer) from [<80554e60>] (tty_ldisc_hangup+0x38/0x1ac) (tty_ldisc_hangup) from [<8054c7f4>] (__tty_hangup+0x158/0x2bc) (__tty_hangup) from [<80557b90>] (disassociate_ctty.part.1+0x30/0x23c) (disassociate_ctty.part.1) from [<8011fc18>] (do_exit+0x580/0xba0) (do_exit) from [<801214f8>] (do_group_exit+0x3c/0xb4) (do_group_exit) from [<80121580>] (__wake_up_parent+0x0/0x14) Issue looks like race condition between interrupt handler fsl_edma_tx_handler() (called as result of fsl_edma_xfer_desc()) and terminating the transfer with fsl_edma_terminate_all(). The fsl_edma_tx_handler() handles interrupt for a transfer with already freed edesc and idle==true. Fixes: d6be34f ("dma: Add Freescale eDMA engine driver support") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Robin Gong <yibin.gong@nxp.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1591877861-28156-2-git-send-email-krzk@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stalyatech
pushed a commit
to stalyatech/linux
that referenced
this pull request
Feb 26, 2021
commit fc846e9 upstream. The `INSN_CONFIG` comedi instruction with sub-instruction code `INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is used as a right shift amount for other bitmask values without being checked. Shift amounts greater than or equal to 32 will result in undefined behavior. Add code to deal with this, adjusting the checks for invalid channels so that enabled channel bits that would have been lost by shifting are also checked for validity. Only channels 0 to 15 are valid. Fixes: a8c66b6 ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions") Cc: <stable@vger.kernel.org> STMicroelectronics#4.0+: ef75e14: staging: comedi: verify array index is correct before using it Cc: <stable@vger.kernel.org> STMicroelectronics#4.0+ Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arnopo
pushed a commit
to arnopo/linux
that referenced
this pull request
Mar 14, 2022
arm32 uses software to simulate the instruction replaced by kprobe. some instructions may be simulated by constructing assembly functions. therefore, before executing instruction simulation, it is necessary to construct assembly function execution environment in C language through binding registers. after kasan is enabled, the register binding relationship will be destroyed, resulting in instruction simulation errors and causing kernel panic. the kprobe emulate instruction function is distributed in three files: actions-common.c actions-arm.c actions-thumb.c, so disable KASAN when compiling these files. for example, use kprobe insert on cap_capable+20 after kasan enabled, the cap_capable assembly code is as follows: <cap_capable>: e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr} e1a05000 mov r5, r0 e280006c add r0, r0, torvalds#108 ; 0x6c e1a04001 mov r4, r1 e1a06002 mov r6, r2 e59fa090 ldr sl, [pc, torvalds#144] ; ebfc7bf8 bl c03aa4b4 <__asan_load4> e595706c ldr r7, [r5, torvalds#108] ; 0x6c e2859014 add r9, r5, STMicroelectronics#20 ...... The emulate_ldr assembly code after enabling kasan is as follows: c06f1384 <emulate_ldr>: e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr} e282803c add r8, r2, torvalds#60 ; 0x3c e1a05000 mov r5, r0 e7e37855 ubfx r7, r5, STMicroelectronics#16, STMicroelectronics#4 e1a00008 mov r0, r8 e1a09001 mov r9, r1 e1a04002 mov r4, r2 ebf35462 bl c03c6530 <__asan_load4> e357000f cmp r7, STMicroelectronics#15 e7e36655 ubfx r6, r5, STMicroelectronics#12, STMicroelectronics#4 e205a00f and sl, r5, STMicroelectronics#15 0a000001 beq c06f13bc <emulate_ldr+0x38> e0840107 add r0, r4, r7, lsl STMicroelectronics#2 ebf3545c bl c03c6530 <__asan_load4> e084010a add r0, r4, sl, lsl STMicroelectronics#2 ebf3545a bl c03c6530 <__asan_load4> e2890010 add r0, r9, STMicroelectronics#16 ebf35458 bl c03c6530 <__asan_load4> e5990010 ldr r0, [r9, STMicroelectronics#16] e12fff30 blx r0 e356000f cm r6, STMicroelectronics#15 1a000014 bne c06f1430 <emulate_ldr+0xac> e1a06000 mov r6, r0 e2840040 add r0, r4, torvalds#64 ; 0x40 ...... when running in emulate_ldr to simulate the ldr instruction, panic occurred, and the log is as follows: Unable to handle kernel NULL pointer dereference at virtual address 00000090 pgd = ecb46400 [00000090] *pgd=2e0fa003, *pmd=00000000 Internal error: Oops: 206 [STMicroelectronics#1] SMP ARM PC is at cap_capable+0x14/0xb0 LR is at emulate_ldr+0x50/0xc0 psr: 600d0293 sp : ecd63af8 ip : 00000004 fp : c0a7c30c r10: 00000000 r9 : c30897f4 r8 : ecd63cd4 r7 : 0000000f r6 : 0000000a r5 : e59fa090 r4 : ecd63c98 r3 : c06ae294 r2 : 00000000 r1 : b7611300 r0 : bf4ec008 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 32c5387d Table: 2d546400 DAC: 55555555 Process bash (pid: 1643, stack limit = 0xecd60190) (cap_capable) from (kprobe_handler+0x218/0x340) (kprobe_handler) from (kprobe_trap_handler+0x24/0x48) (kprobe_trap_handler) from (do_undefinstr+0x13c/0x364) (do_undefinstr) from (__und_svc_finish+0x0/0x30) (__und_svc_finish) from (cap_capable+0x18/0xb0) (cap_capable) from (cap_vm_enough_memory+0x38/0x48) (cap_vm_enough_memory) from (security_vm_enough_memory_mm+0x48/0x6c) (security_vm_enough_memory_mm) from (copy_process.constprop.5+0x16b4/0x25c8) (copy_process.constprop.5) from (_do_fork+0xe8/0x55c) (_do_fork) from (SyS_clone+0x1c/0x24) (SyS_clone) from (__sys_trace_return+0x0/0x10) Code: 0050a0e1 6c0080e2 0140a0e1 0260a0e1 (f801f0e7) Fixes: 35aa1df ("ARM kprobes: instruction single-stepping support") Fixes: 4210157 ("ARM: 9017/2: Enable KASan for ARM") Signed-off-by: huangshaobo <huangshaobo6@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
dponchalee
pushed a commit
to dponchalee/linux-st
that referenced
this pull request
Feb 2, 2023
[ Upstream commit 84a5358 ] The SRv6 layer allows defining HMAC data that can later be used to sign IPv6 Segment Routing Headers. This configuration is realised via netlink through four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual length of the SECRET attribute, it is possible to provide invalid combinations (e.g., secret = "", secretlen = 64). This case is not checked in the code and with an appropriately crafted netlink message, an out-of-bounds read of up to 64 bytes (max secret length) can occur past the skb end pointer and into skb_shared_info: Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 208 memcpy(hinfo->secret, secret, slen); (gdb) bt #0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 STMicroelectronics#1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600, extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>, family=<optimized out>) at net/netlink/genetlink.c:731 STMicroelectronics#2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00, family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775 STMicroelectronics#3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792 STMicroelectronics#4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>) at net/netlink/af_netlink.c:2501 STMicroelectronics#5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803 STMicroelectronics#6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000) at net/netlink/af_netlink.c:1319 STMicroelectronics#7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>) at net/netlink/af_netlink.c:1345 STMicroelectronics#8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921 ... (gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end $1 = 0xffff88800b1b76c0 (gdb) p/x secret $2 = 0xffff88800b1b76c0 (gdb) p slen $3 = 64 '@' The OOB data can then be read back from userspace by dumping HMAC state. This commit fixes this by ensuring SECRETLEN cannot exceed the actual length of SECRET. Reported-by: Lucas Leong <wmliang.tw@gmail.com> Tested: verified that EINVAL is correctly returned when secretlen > len(secret) Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure") Signed-off-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
dponchalee
pushed a commit
to dponchalee/linux-st
that referenced
this pull request
Feb 2, 2023
commit 29a5b8a upstream. When walking through an inode extents, the ext4_ext_binsearch_idx() function assumes that the extent header has been previously validated. However, there are no checks that verify that the number of entries (eh->eh_entries) is non-zero when depth is > 0. And this will lead to problems because the EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this: [ 135.245946] ------------[ cut here ]------------ [ 135.247579] kernel BUG at fs/ext4/extents.c:2258! [ 135.249045] invalid opcode: 0000 [STMicroelectronics#1] PREEMPT SMP [ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ STMicroelectronics#4 [ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0 [ 135.256475] Code: [ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246 [ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023 [ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c [ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c [ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024 [ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000 [ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0 [ 135.277952] Call Trace: [ 135.278635] <TASK> [ 135.279247] ? preempt_count_add+0x6d/0xa0 [ 135.280358] ? percpu_counter_add_batch+0x55/0xb0 [ 135.281612] ? _raw_read_unlock+0x18/0x30 [ 135.282704] ext4_map_blocks+0x294/0x5a0 [ 135.283745] ? xa_load+0x6f/0xa0 [ 135.284562] ext4_mpage_readpages+0x3d6/0x770 [ 135.285646] read_pages+0x67/0x1d0 [ 135.286492] ? folio_add_lru+0x51/0x80 [ 135.287441] page_cache_ra_unbounded+0x124/0x170 [ 135.288510] filemap_get_pages+0x23d/0x5a0 [ 135.289457] ? path_openat+0xa72/0xdd0 [ 135.290332] filemap_read+0xbf/0x300 [ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40 [ 135.292192] new_sync_read+0x103/0x170 [ 135.293014] vfs_read+0x15d/0x180 [ 135.293745] ksys_read+0xa1/0xe0 [ 135.294461] do_syscall_64+0x3c/0x80 [ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This patch simply adds an extra check in __ext4_ext_check(), verifying that eh_entries is not 0 when eh_depth is > 0. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283 Cc: Baokun Li <libaokun1@huawei.com> Cc: stable@kernel.org Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
devik2
pushed a commit
to devik2/st_linux
that referenced
this pull request
Oct 31, 2023
commit fe2b73d upstream. The code in decode_config4() of arch/mips/kernel/cpu-probe.c asid_mask = MIPS_ENTRYHI_ASID; if (config4 & MIPS_CONF4_AE) asid_mask |= MIPS_ENTRYHI_ASIDX; set_cpu_asid_mask(c, asid_mask); set asid_mask to cpuinfo->asid_mask. So in order to support variable ASID_MASK, KVM_ENTRYHI_ASID should also be changed to cpu_asid_mask(&boot_cpu_data). Cc: Stable <stable@vger.kernel.org> STMicroelectronics#4.9+ Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Xing Li <lixing@loongson.cn> [Huacai: Change current_cpu_data to boot_cpu_data for optimization] Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1590220602-3547-2-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
devik2
pushed a commit
to devik2/st_linux
that referenced
this pull request
Oct 31, 2023
[ Upstream commit f6766ff ] We need to check mddev->del_work before flush workqueu since the purpose of flush is to ensure the previous md is disappeared. Otherwise the similar deadlock appeared if LOCKDEP is enabled, it is due to md_open holds the bdev->bd_mutex before flush workqueue. kernel: [ 154.522645] ====================================================== kernel: [ 154.522647] WARNING: possible circular locking dependency detected kernel: [ 154.522650] 5.6.0-rc7-lp151.27-default STMicroelectronics#25 Tainted: G O kernel: [ 154.522651] ------------------------------------------------------ kernel: [ 154.522653] mdadm/2482 is trying to acquire lock: kernel: [ 154.522655] ffff888078529128 ((wq_completion)md_misc){+.+.}, at: flush_workqueue+0x84/0x4b0 kernel: [ 154.522673] kernel: [ 154.522673] but task is already holding lock: kernel: [ 154.522675] ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590 kernel: [ 154.522691] kernel: [ 154.522691] which lock already depends on the new lock. kernel: [ 154.522691] kernel: [ 154.522694] kernel: [ 154.522694] the existing dependency chain (in reverse order) is: kernel: [ 154.522696] kernel: [ 154.522696] -> STMicroelectronics#4 (&bdev->bd_mutex){+.+.}: kernel: [ 154.522704] __mutex_lock+0x87/0x950 kernel: [ 154.522706] __blkdev_get+0x79/0x590 kernel: [ 154.522708] blkdev_get+0x65/0x140 kernel: [ 154.522709] blkdev_get_by_dev+0x2f/0x40 kernel: [ 154.522716] lock_rdev+0x3d/0x90 [md_mod] kernel: [ 154.522719] md_import_device+0xd6/0x1b0 [md_mod] kernel: [ 154.522723] new_dev_store+0x15e/0x210 [md_mod] kernel: [ 154.522728] md_attr_store+0x7a/0xc0 [md_mod] kernel: [ 154.522732] kernfs_fop_write+0x117/0x1b0 kernel: [ 154.522735] vfs_write+0xad/0x1a0 kernel: [ 154.522737] ksys_write+0xa4/0xe0 kernel: [ 154.522745] do_syscall_64+0x64/0x2b0 kernel: [ 154.522748] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522749] kernel: [ 154.522749] -> STMicroelectronics#3 (&mddev->reconfig_mutex){+.+.}: kernel: [ 154.522752] __mutex_lock+0x87/0x950 kernel: [ 154.522756] new_dev_store+0xc9/0x210 [md_mod] kernel: [ 154.522759] md_attr_store+0x7a/0xc0 [md_mod] kernel: [ 154.522761] kernfs_fop_write+0x117/0x1b0 kernel: [ 154.522763] vfs_write+0xad/0x1a0 kernel: [ 154.522765] ksys_write+0xa4/0xe0 kernel: [ 154.522767] do_syscall_64+0x64/0x2b0 kernel: [ 154.522769] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522770] kernel: [ 154.522770] -> STMicroelectronics#2 (kn->count#253){++++}: kernel: [ 154.522775] __kernfs_remove+0x253/0x2c0 kernel: [ 154.522778] kernfs_remove+0x1f/0x30 kernel: [ 154.522780] kobject_del+0x28/0x60 kernel: [ 154.522783] mddev_delayed_delete+0x24/0x30 [md_mod] kernel: [ 154.522786] process_one_work+0x2a7/0x5f0 kernel: [ 154.522788] worker_thread+0x2d/0x3d0 kernel: [ 154.522793] kthread+0x117/0x130 kernel: [ 154.522795] ret_from_fork+0x3a/0x50 kernel: [ 154.522796] kernel: [ 154.522796] -> STMicroelectronics#1 ((work_completion)(&mddev->del_work)){+.+.}: kernel: [ 154.522800] process_one_work+0x27e/0x5f0 kernel: [ 154.522802] worker_thread+0x2d/0x3d0 kernel: [ 154.522804] kthread+0x117/0x130 kernel: [ 154.522806] ret_from_fork+0x3a/0x50 kernel: [ 154.522807] kernel: [ 154.522807] -> #0 ((wq_completion)md_misc){+.+.}: kernel: [ 154.522813] __lock_acquire+0x1392/0x1690 kernel: [ 154.522816] lock_acquire+0xb4/0x1a0 kernel: [ 154.522818] flush_workqueue+0xab/0x4b0 kernel: [ 154.522821] md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522823] __blkdev_get+0xea/0x590 kernel: [ 154.522825] blkdev_get+0x65/0x140 kernel: [ 154.522828] do_dentry_open+0x1d1/0x380 kernel: [ 154.522831] path_openat+0x567/0xcc0 kernel: [ 154.522834] do_filp_open+0x9b/0x110 kernel: [ 154.522836] do_sys_openat2+0x201/0x2a0 kernel: [ 154.522838] do_sys_open+0x57/0x80 kernel: [ 154.522840] do_syscall_64+0x64/0x2b0 kernel: [ 154.522842] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522844] kernel: [ 154.522844] other info that might help us debug this: kernel: [ 154.522844] kernel: [ 154.522846] Chain exists of: kernel: [ 154.522846] (wq_completion)md_misc --> &mddev->reconfig_mutex --> &bdev->bd_mutex kernel: [ 154.522846] kernel: [ 154.522850] Possible unsafe locking scenario: kernel: [ 154.522850] kernel: [ 154.522852] CPU0 CPU1 kernel: [ 154.522853] ---- ---- kernel: [ 154.522854] lock(&bdev->bd_mutex); kernel: [ 154.522856] lock(&mddev->reconfig_mutex); kernel: [ 154.522858] lock(&bdev->bd_mutex); kernel: [ 154.522860] lock((wq_completion)md_misc); kernel: [ 154.522861] kernel: [ 154.522861] *** DEADLOCK *** kernel: [ 154.522861] kernel: [ 154.522864] 1 lock held by mdadm/2482: kernel: [ 154.522865] #0: ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590 kernel: [ 154.522868] kernel: [ 154.522868] stack backtrace: kernel: [ 154.522873] CPU: 1 PID: 2482 Comm: mdadm Tainted: G O 5.6.0-rc7-lp151.27-default STMicroelectronics#25 kernel: [ 154.522875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 kernel: [ 154.522878] Call Trace: kernel: [ 154.522881] dump_stack+0x8f/0xcb kernel: [ 154.522884] check_noncircular+0x194/0x1b0 kernel: [ 154.522888] ? __lock_acquire+0x1392/0x1690 kernel: [ 154.522890] __lock_acquire+0x1392/0x1690 kernel: [ 154.522893] lock_acquire+0xb4/0x1a0 kernel: [ 154.522895] ? flush_workqueue+0x84/0x4b0 kernel: [ 154.522898] flush_workqueue+0xab/0x4b0 kernel: [ 154.522900] ? flush_workqueue+0x84/0x4b0 kernel: [ 154.522905] ? md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522908] md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522910] __blkdev_get+0xea/0x590 kernel: [ 154.522912] ? bd_acquire+0xc0/0xc0 kernel: [ 154.522914] blkdev_get+0x65/0x140 kernel: [ 154.522916] ? bd_acquire+0xc0/0xc0 kernel: [ 154.522918] do_dentry_open+0x1d1/0x380 kernel: [ 154.522921] path_openat+0x567/0xcc0 kernel: [ 154.522923] ? __lock_acquire+0x380/0x1690 kernel: [ 154.522926] do_filp_open+0x9b/0x110 kernel: [ 154.522929] ? __alloc_fd+0xe5/0x1f0 kernel: [ 154.522935] ? kmem_cache_alloc+0x28c/0x630 kernel: [ 154.522939] ? do_sys_openat2+0x201/0x2a0 kernel: [ 154.522941] do_sys_openat2+0x201/0x2a0 kernel: [ 154.522944] do_sys_open+0x57/0x80 kernel: [ 154.522946] do_syscall_64+0x64/0x2b0 kernel: [ 154.522948] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522951] RIP: 0033:0x7f98d279d9ae And md_alloc also flushed the same workqueue, but the thing is different here. Because all the paths call md_alloc don't hold bdev->bd_mutex, and the flush is necessary to avoid race condition, so leave it as it is. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
devik2
pushed a commit
to devik2/st_linux
that referenced
this pull request
Oct 31, 2023
commit 936e6b8 upstream. Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a5 ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [STMicroelectronics#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> STMicroelectronics#4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
devik2
pushed a commit
to devik2/st_linux
that referenced
this pull request
Oct 31, 2023
[ Upstream commit dcacbc1 ] It's possible for a block driver to set logical block size to a value greater than page size incorrectly; e.g. bcache takes the value from the superblock, set by the user w/ make-bcache. This causes a BUG/NULL pointer dereference in the path: __blkdev_get() -> set_init_blocksize() // set i_blkbits based on ... -> bdev_logical_block_size() -> queue_logical_block_size() // ... this value -> bdev_disk_changed() ... -> blkdev_readpage() -> block_read_full_page() -> create_page_buffers() // size = 1 << i_blkbits -> create_empty_buffers() // give size/take pointer -> alloc_page_buffers() // return NULL .. BUG! Because alloc_page_buffers() is called with size > PAGE_SIZE, thus it initializes head = NULL, skips the loop, return head; then create_empty_buffers() gets (and uses) the NULL pointer. This has been around longer than commit ad6bf88 ("block: fix an integer overflow in logical block size"); however, it increased the range of values that can trigger the issue. Previously only 8k/16k/32k (on x86/4k page size) would do it, as greater values overflow unsigned short to zero, and queue_ logical_block_size() would then use the default of 512. Now the range with unsigned int is much larger, and users w/ the 512k value, which happened to be zero'ed previously and work fine, started to hit this issue -- as the zero is gone, and queue_logical_block_size() does return 512k (>PAGE_SIZE.) Fix this by checking the bcache device's logical block size, and if it's greater than page size, fallback to the backing/ cached device's logical page size. This doesn't affect cache devices as those are still checked for block/page size in read_super(); only the backing/cached devices are not. Apparently it's a regression from commit 2903381 ("bcache: Take data offset from the bdev superblock."), moving the check into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks of backing devices out there with this larger value, we cannot refuse to load them (i.e., have a similar check in _BDEV.) Ideally perhaps bcache should use all values from the backing device (physical/logical/io_min block size)? But for now just fix the problematic case. Test-case: # IMG=/root/disk.img # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G # DEV=$(losetup --find --show $IMG) # make-bcache --bdev $DEV --block 8k < see dmesg > Before: # uname -r 5.7.0-rc7 [ 55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 STMicroelectronics#4 ... [ 55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100 ... [ 55.966434] Call Trace: [ 55.967021] create_page_buffers+0x48/0x50 [ 55.967834] block_read_full_page+0x49/0x380 [ 55.972181] do_read_cache_page+0x494/0x610 [ 55.974780] read_part_sector+0x2d/0xaa [ 55.975558] read_lba+0x10e/0x1e0 [ 55.977904] efi_partition+0x120/0x5a6 [ 55.980227] blk_add_partitions+0x161/0x390 [ 55.982177] bdev_disk_changed+0x61/0xd0 [ 55.982961] __blkdev_get+0x350/0x490 [ 55.983715] __device_add_disk+0x318/0x480 [ 55.984539] bch_cached_dev_run+0xc5/0x270 [ 55.986010] register_bcache.cold+0x122/0x179 [ 55.987628] kernfs_fop_write+0xbc/0x1a0 [ 55.988416] vfs_write+0xb1/0x1a0 [ 55.989134] ksys_write+0x5a/0xd0 [ 55.989825] do_syscall_64+0x43/0x140 [ 55.990563] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 55.991519] RIP: 0033:0x7f7d60ba3154 ... After: # uname -r 5.7.0.bcachelbspgsz [ 31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512) [ 31.675133] bcache: register_bdev() registered backing device loop0 # grep ^ /sys/block/bcache0/queue/*_block_size /sys/block/bcache0/queue/logical_block_size:512 /sys/block/bcache0/queue/physical_block_size:8192 Reported-by: Ryan Finnie <ryan@finnie.org> Reported-by: Sebastian Marsching <sebastian@marsching.com> Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
mingzhangqun
pushed a commit
to mingzhangqun/st-linux
that referenced
this pull request
Mar 23, 2024
[ Upstream commit 91d6a46 ] gpi_ch_init() doesn't lock the ctrl_lock mutex, so there is no need to unlock it too. Instead the mutex is handled by the function gpi_alloc_chan_resources(), which properly locks and unlocks the mutex. ===================================== WARNING: bad unlock balance detected! 6.3.0-rc5-00253-g99792582ded1-dirty STMicroelectronics#15 Not tainted ------------------------------------- kworker/u16:0/9 is trying to release lock (&gpii->ctrl_lock) at: [<ffffb99d04e1284c>] gpi_alloc_chan_resources+0x108/0x5bc but there are no more locks to release! other info that might help us debug this: 6 locks held by kworker/u16:0/9: #0: ffff575740010938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x220/0x594 STMicroelectronics#1: ffff80000809bdd0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x220/0x594 STMicroelectronics#2: ffff575740f2a0f8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x38/0x188 STMicroelectronics#3: ffff57574b5570f8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x38/0x188 STMicroelectronics#4: ffffb99d06a2f180 (of_dma_lock){+.+.}-{3:3}, at: of_dma_request_slave_channel+0x138/0x280 STMicroelectronics#5: ffffb99d06a2ee20 (dma_list_mutex){+.+.}-{3:3}, at: dma_get_slave_channel+0x28/0x10c stack backtrace: CPU: 7 PID: 9 Comm: kworker/u16:0 Not tainted 6.3.0-rc5-00253-g99792582ded1-dirty STMicroelectronics#15 Hardware name: Google Pixel 3 (DT) Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace+0xa0/0xfc show_stack+0x18/0x24 dump_stack_lvl+0x60/0xac dump_stack+0x18/0x24 print_unlock_imbalance_bug+0x130/0x148 lock_release+0x270/0x300 __mutex_unlock_slowpath+0x48/0x2cc mutex_unlock+0x20/0x2c gpi_alloc_chan_resources+0x108/0x5bc dma_chan_get+0x84/0x188 dma_get_slave_channel+0x5c/0x10c gpi_of_dma_xlate+0x110/0x1a0 of_dma_request_slave_channel+0x174/0x280 dma_request_chan+0x3c/0x2d4 geni_i2c_probe+0x544/0x63c platform_probe+0x68/0xc4 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 device_add+0x60c/0x7d8 of_device_add+0x44/0x60 of_platform_device_create_pdata+0x90/0x124 of_platform_bus_create+0x15c/0x3c8 of_platform_populate+0x58/0xf8 devm_of_platform_populate+0x58/0xbc geni_se_probe+0xf0/0x164 platform_probe+0x68/0xc4 really_probe+0x148/0x2ac __driver_probe_device+0x78/0xe0 driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x2bc/0x594 worker_thread+0x228/0x438 kthread+0x108/0x10c ret_from_fork+0x10/0x20 Fixes: 5d0c353 ("dmaengine: qcom: Add GPI dma driver") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20230409233355.453741-1-dmitry.baryshkov@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
mingzhangqun
pushed a commit
to mingzhangqun/st-linux
that referenced
this pull request
Mar 23, 2024
[ Upstream commit da94a77 ] Error handler of tcf_block_bind() frees the whole bo->cb_list on error. However, by that time the flow_block_cb instances are already in the driver list because driver ndo_setup_tc() callback is called before that up the call chain in tcf_block_offload_cmd(). This leaves dangling pointers to freed objects in the list and causes use-after-free[0]. Fix it by also removing flow_block_cb instances from driver_list before deallocating them. [0]: [ 279.868433] ================================================================== [ 279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0 [ 279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963 [ 279.873151] CPU: 6 PID: 2963 Comm: tc Not tainted 6.3.0-rc6+ STMicroelectronics#4 [ 279.874273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 279.876295] Call Trace: [ 279.876882] <TASK> [ 279.877413] dump_stack_lvl+0x33/0x50 [ 279.878198] print_report+0xc2/0x610 [ 279.878987] ? flow_block_cb_setup_simple+0x631/0x7c0 [ 279.879994] kasan_report+0xae/0xe0 [ 279.880750] ? flow_block_cb_setup_simple+0x631/0x7c0 [ 279.881744] ? mlx5e_tc_reoffload_flows_work+0x240/0x240 [mlx5_core] [ 279.883047] flow_block_cb_setup_simple+0x631/0x7c0 [ 279.884027] tcf_block_offload_cmd.isra.0+0x189/0x2d0 [ 279.885037] ? tcf_block_setup+0x6b0/0x6b0 [ 279.885901] ? mutex_lock+0x7d/0xd0 [ 279.886669] ? __mutex_unlock_slowpath.constprop.0+0x2d0/0x2d0 [ 279.887844] ? ingress_init+0x1c0/0x1c0 [sch_ingress] [ 279.888846] tcf_block_get_ext+0x61c/0x1200 [ 279.889711] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.890682] ? clsact_init+0x2b0/0x2b0 [sch_ingress] [ 279.891701] qdisc_create+0x401/0xea0 [ 279.892485] ? qdisc_tree_reduce_backlog+0x470/0x470 [ 279.893473] tc_modify_qdisc+0x6f7/0x16d0 [ 279.894344] ? tc_get_qdisc+0xac0/0xac0 [ 279.895213] ? mutex_lock+0x7d/0xd0 [ 279.896005] ? __mutex_lock_slowpath+0x10/0x10 [ 279.896910] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.897770] ? rtnl_calcit.isra.0+0x2b0/0x2b0 [ 279.898672] ? __sys_sendmsg+0xb5/0x140 [ 279.899494] ? do_syscall_64+0x3d/0x90 [ 279.900302] ? entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.901337] ? kasan_save_stack+0x2e/0x40 [ 279.902177] ? kasan_save_stack+0x1e/0x40 [ 279.903058] ? kasan_set_track+0x21/0x30 [ 279.903913] ? kasan_save_free_info+0x2a/0x40 [ 279.904836] ? ____kasan_slab_free+0x11a/0x1b0 [ 279.905741] ? kmem_cache_free+0x179/0x400 [ 279.906599] netlink_rcv_skb+0x12c/0x360 [ 279.907450] ? rtnl_calcit.isra.0+0x2b0/0x2b0 [ 279.908360] ? netlink_ack+0x1550/0x1550 [ 279.909192] ? rhashtable_walk_peek+0x170/0x170 [ 279.910135] ? kmem_cache_alloc_node+0x1af/0x390 [ 279.911086] ? _copy_from_iter+0x3d6/0xc70 [ 279.912031] netlink_unicast+0x553/0x790 [ 279.912864] ? netlink_attachskb+0x6a0/0x6a0 [ 279.913763] ? netlink_recvmsg+0x416/0xb50 [ 279.914627] netlink_sendmsg+0x7a1/0xcb0 [ 279.915473] ? netlink_unicast+0x790/0x790 [ 279.916334] ? iovec_from_user.part.0+0x4d/0x220 [ 279.917293] ? netlink_unicast+0x790/0x790 [ 279.918159] sock_sendmsg+0xc5/0x190 [ 279.918938] ____sys_sendmsg+0x535/0x6b0 [ 279.919813] ? import_iovec+0x7/0x10 [ 279.920601] ? kernel_sendmsg+0x30/0x30 [ 279.921423] ? __copy_msghdr+0x3c0/0x3c0 [ 279.922254] ? import_iovec+0x7/0x10 [ 279.923041] ___sys_sendmsg+0xeb/0x170 [ 279.923854] ? copy_msghdr_from_user+0x110/0x110 [ 279.924797] ? ___sys_recvmsg+0xd9/0x130 [ 279.925630] ? __perf_event_task_sched_in+0x183/0x470 [ 279.926656] ? ___sys_sendmsg+0x170/0x170 [ 279.927529] ? ctx_sched_in+0x530/0x530 [ 279.928369] ? update_curr+0x283/0x4f0 [ 279.929185] ? perf_event_update_userpage+0x570/0x570 [ 279.930201] ? __fget_light+0x57/0x520 [ 279.931023] ? __switch_to+0x53d/0xe70 [ 279.931846] ? sockfd_lookup_light+0x1a/0x140 [ 279.932761] __sys_sendmsg+0xb5/0x140 [ 279.933560] ? __sys_sendmsg_sock+0x20/0x20 [ 279.934436] ? fpregs_assert_state_consistent+0x1d/0xa0 [ 279.935490] do_syscall_64+0x3d/0x90 [ 279.936300] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.937311] RIP: 0033:0x7f21c814f887 [ 279.938085] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 279.941448] RSP: 002b:00007fff11efd478 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 279.942964] RAX: ffffffffffffffda RBX: 0000000064401979 RCX: 00007f21c814f887 [ 279.944337] RDX: 0000000000000000 RSI: 00007fff11efd4e0 RDI: 0000000000000003 [ 279.945660] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 279.947003] R10: 00007f21c8008708 R11: 0000000000000246 R12: 0000000000000001 [ 279.948345] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400 [ 279.949690] </TASK> [ 279.950706] Allocated by task 2960: [ 279.951471] kasan_save_stack+0x1e/0x40 [ 279.952338] kasan_set_track+0x21/0x30 [ 279.953165] __kasan_kmalloc+0x77/0x90 [ 279.954006] flow_block_cb_setup_simple+0x3dd/0x7c0 [ 279.955001] tcf_block_offload_cmd.isra.0+0x189/0x2d0 [ 279.956020] tcf_block_get_ext+0x61c/0x1200 [ 279.956881] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.957873] qdisc_create+0x401/0xea0 [ 279.958656] tc_modify_qdisc+0x6f7/0x16d0 [ 279.959506] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.960392] netlink_rcv_skb+0x12c/0x360 [ 279.961216] netlink_unicast+0x553/0x790 [ 279.962044] netlink_sendmsg+0x7a1/0xcb0 [ 279.962906] sock_sendmsg+0xc5/0x190 [ 279.963702] ____sys_sendmsg+0x535/0x6b0 [ 279.964534] ___sys_sendmsg+0xeb/0x170 [ 279.965343] __sys_sendmsg+0xb5/0x140 [ 279.966132] do_syscall_64+0x3d/0x90 [ 279.966908] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.968407] Freed by task 2960: [ 279.969114] kasan_save_stack+0x1e/0x40 [ 279.969929] kasan_set_track+0x21/0x30 [ 279.970729] kasan_save_free_info+0x2a/0x40 [ 279.971603] ____kasan_slab_free+0x11a/0x1b0 [ 279.972483] __kmem_cache_free+0x14d/0x280 [ 279.973337] tcf_block_setup+0x29d/0x6b0 [ 279.974173] tcf_block_offload_cmd.isra.0+0x226/0x2d0 [ 279.975186] tcf_block_get_ext+0x61c/0x1200 [ 279.976080] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.977065] qdisc_create+0x401/0xea0 [ 279.977857] tc_modify_qdisc+0x6f7/0x16d0 [ 279.978695] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.979562] netlink_rcv_skb+0x12c/0x360 [ 279.980388] netlink_unicast+0x553/0x790 [ 279.981214] netlink_sendmsg+0x7a1/0xcb0 [ 279.982043] sock_sendmsg+0xc5/0x190 [ 279.982827] ____sys_sendmsg+0x535/0x6b0 [ 279.983703] ___sys_sendmsg+0xeb/0x170 [ 279.984510] __sys_sendmsg+0xb5/0x140 [ 279.985298] do_syscall_64+0x3d/0x90 [ 279.986076] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.987532] The buggy address belongs to the object at ffff888147e2bf00 which belongs to the cache kmalloc-192 of size 192 [ 279.989747] The buggy address is located 32 bytes inside of freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0) [ 279.992367] The buggy address belongs to the physical page: [ 279.993430] page:00000000550f405c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147e2a [ 279.995182] head:00000000550f405c order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 279.996713] anon flags: 0x200000000010200(slab|head|node=0|zone=2) [ 279.997878] raw: 0200000000010200 ffff888100042a00 0000000000000000 dead000000000001 [ 279.999384] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [ 280.000894] page dumped because: kasan: bad access detected [ 280.002386] Memory state around the buggy address: [ 280.003338] ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.004781] ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.007700] ^ [ 280.008592] ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 280.010035] ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.011564] ================================================================== Fixes: 59094b1 ("net: sched: use flow block API") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
mingzhangqun
pushed a commit
to mingzhangqun/st-linux
that referenced
this pull request
Mar 23, 2024
[ Upstream commit 05bb016 ] ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 STMicroelectronics#1.2 0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f STMicroelectronics#1.1 0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f STMicroelectronics#1 0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f STMicroelectronics#2 0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d STMicroelectronics#3 0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d STMicroelectronics#4 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 STMicroelectronics#5 0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369 STMicroelectronics#6 0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac STMicroelectronics#7 0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2 STMicroelectronics#8 0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052 STMicroelectronics#9 0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8 STMicroelectronics#10 0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98 STMicroelectronics#11 0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac STMicroelectronics#12 0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d STMicroelectronics#13 0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603 Add a simple check that avoids incrementing a pointer by zero, but otherwise behaves as before. Note that our findings are against ACPICA 20221020, but the same code exists on master. Link: acpica/acpica@770653e3 Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
mingzhangqun
pushed a commit
to mingzhangqun/st-linux
that referenced
this pull request
Mar 23, 2024
[ Upstream commit c11bd04 ] The recursion check in __bpf_prog_enter* and __bpf_prog_exit* leave preempt_count_{sub,add} unprotected. When attaching trampoline to them we get panic as follows, [ 867.843050] BUG: TASK stack guard page was hit at 0000000009d325cf (stack is 0000000046a46a15..00000000537e7b28) [ 867.843064] stack guard page: 0000 [STMicroelectronics#1] PREEMPT SMP NOPTI [ 867.843067] CPU: 8 PID: 11009 Comm: trace Kdump: loaded Not tainted 6.2.0+ STMicroelectronics#4 [ 867.843100] Call Trace: [ 867.843101] <TASK> [ 867.843104] asm_exc_int3+0x3a/0x40 [ 867.843108] RIP: 0010:preempt_count_sub+0x1/0xa0 [ 867.843135] __bpf_prog_enter_recur+0x17/0x90 [ 867.843148] bpf_trampoline_6442468108_0+0x2e/0x1000 [ 867.843154] ? preempt_count_sub+0x1/0xa0 [ 867.843157] preempt_count_sub+0x5/0xa0 [ 867.843159] ? migrate_enable+0xac/0xf0 [ 867.843164] __bpf_prog_exit_recur+0x2d/0x40 [ 867.843168] bpf_trampoline_6442468108_0+0x55/0x1000 ... [ 867.843788] preempt_count_sub+0x5/0xa0 [ 867.843793] ? migrate_enable+0xac/0xf0 [ 867.843829] __bpf_prog_exit_recur+0x2d/0x40 [ 867.843837] BUG: IRQ stack guard page was hit at 0000000099bd8228 (stack is 00000000b23e2bc4..000000006d95af35) [ 867.843841] BUG: IRQ stack guard page was hit at 000000005ae07924 (stack is 00000000ffd69623..0000000014eb594c) [ 867.843843] BUG: IRQ stack guard page was hit at 00000000028320f0 (stack is 00000000034b6438..0000000078d1bcec) [ 867.843842] bpf_trampoline_6442468108_0+0x55/0x1000 ... That is because in __bpf_prog_exit_recur, the preempt_count_{sub,add} are called after prog->active is decreased. Fixing this by adding these two functions into btf ids deny list. Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Yafang <laoar.shao@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Olsa <olsajiri@gmail.com> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20230413025248.79764-1-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
mingzhangqun
pushed a commit
to mingzhangqun/st-linux
that referenced
this pull request
Mar 23, 2024
[ Upstream commit 822b5a1 ] syzkaller found a data race of pkt_sk(sk)->num. The value is changed under lock_sock() and po->bind_lock, so we need READ_ONCE() to access pkt_sk(sk)->num without these locks in packet_bind_spkt(), packet_bind(), and sk_diag_fill(). Note that WRITE_ONCE() is already added by commit c7d2ef5 ("net/packet: annotate accesses to po->bind"). BUG: KCSAN: data-race in packet_bind / packet_do_bind write (marked) to 0xffff88802ffd1cee of 2 bytes by task 7322 on cpu 0: packet_do_bind+0x446/0x640 net/packet/af_packet.c:3236 packet_bind+0x99/0xe0 net/packet/af_packet.c:3321 __sys_bind+0x19b/0x1e0 net/socket.c:1803 __do_sys_bind net/socket.c:1814 [inline] __se_sys_bind net/socket.c:1812 [inline] __x64_sys_bind+0x40/0x50 net/socket.c:1812 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff88802ffd1cee of 2 bytes by task 7318 on cpu 1: packet_bind+0xbf/0xe0 net/packet/af_packet.c:3322 __sys_bind+0x19b/0x1e0 net/socket.c:1803 __do_sys_bind net/socket.c:1814 [inline] __se_sys_bind net/socket.c:1812 [inline] __x64_sys_bind+0x40/0x50 net/socket.c:1812 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0x0300 -> 0x0000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7318 Comm: syz-executor.4 Not tainted 6.3.0-13380-g7fddb5b5300c STMicroelectronics#4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 96ec632 ("packet: Diag core and basic socket info dumping") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230524232934.50950-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
arnopo
pushed a commit
to arnopo/linux
that referenced
this pull request
Nov 29, 2024
Hou Tao says: ==================== The patch set fixes several issues in bits iterator. Patch STMicroelectronics#1 fixes the kmemleak problem of bits iterator. Patch STMicroelectronics#2~STMicroelectronics#3 fix the overflow problem of nr_bits. Patch STMicroelectronics#4 fixes the potential stack corruption when bits iterator is used on 32-bit host. Patch STMicroelectronics#5 adds more test cases for bits iterator. Please see the individual patches for more details. And comments are always welcome. --- v4: * patch STMicroelectronics#1: add ack from Yafang * patch STMicroelectronics#3: revert code-churn like changes: (1) compute nr_bytes and nr_bits before the check of nr_words. (2) use nr_bits == 64 to check for single u64, preventing build warning on 32-bit hosts. * patch STMicroelectronics#4: use "BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" v3: https://lore.kernel.org/bpf/20241025013233.804027-1-houtao@huaweicloud.com/T/#t * split the bits-iterator related patches from "Misc fixes for bpf" patch set * patch STMicroelectronics#1: use "!nr_bits || bits >= nr_bits" to stop the iteration * patch STMicroelectronics#2: add a new helper for the overflow problem * patch STMicroelectronics#3: decrease the limitation from 512 to 511 and check whether nr_bytes is too large for bpf memory allocator explicitly * patch STMicroelectronics#5: add two more test cases for bit iterator v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241030100516.3633640-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
arnopo
pushed a commit
to arnopo/linux
that referenced
this pull request
Nov 29, 2024
Petr Machata says: ==================== mlxsw: Fixes In this patchset: - Tx header should be pushed for each packet which is transmitted via Spectrum ASICs. Patch STMicroelectronics#1 adds a missing call to skb_cow_head() to make sure that there is both enough room to push the Tx header and that the SKB header is not cloned and can be modified. - Commit b5b60bb ("mlxsw: pci: Use page pool for Rx buffers allocation") converted mlxsw to use page pool for Rx buffers allocation. Sync for CPU and for device should be done for Rx pages. In patches STMicroelectronics#2 and STMicroelectronics#3, add the missing calls to sync pages for, respectively, CPU and the device. - Patch STMicroelectronics#4 then fixes a bug to IPv6 GRE forwarding offload. Patch STMicroelectronics#5 adds a generic forwarding test that fails with mlxsw ports prior to the fix. ==================== Link: https://patch.msgid.link/cover.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
arnopo
pushed a commit
to arnopo/linux
that referenced
this pull request
Nov 29, 2024
When CONFIG_KASAN_SW_TAGS and CONFIG_KASAN_STACK are enabled, the object_is_on_stack() function may produce incorrect results due to the presence of tags in the obj pointer, while the stack pointer does not have tags. This discrepancy can lead to incorrect stack object detection and subsequently trigger warnings if CONFIG_DEBUG_OBJECTS is also enabled. Example of the warning: ODEBUG: object 3eff800082ea7bb0 is NOT on stack ffff800082ea0000, but annotated. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:557 __debug_object_init+0x330/0x364 Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc5 STMicroelectronics#4 Hardware name: linux,dummy-virt (DT) pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __debug_object_init+0x330/0x364 lr : __debug_object_init+0x330/0x364 sp : ffff800082ea7b40 x29: ffff800082ea7b40 x28: 98ff0000c0164518 x27: 98ff0000c0164534 x26: ffff800082d93ec8 x25: 0000000000000001 x24: 1cff0000c00172a0 x23: 0000000000000000 x22: ffff800082d93ed0 x21: ffff800081a24418 x20: 3eff800082ea7bb0 x19: efff800000000000 x18: 0000000000000000 x17: 00000000000000ff x16: 0000000000000047 x15: 206b63617473206e x14: 0000000000000018 x13: ffff800082ea7780 x12: 0ffff800082ea78e x11: 0ffff800082ea790 x10: 0ffff800082ea79d x9 : 34d77febe173e800 x8 : 34d77febe173e800 x7 : 0000000000000001 x6 : 0000000000000001 x5 : feff800082ea74b8 x4 : ffff800082870a90 x3 : ffff80008018d3c4 x2 : 0000000000000001 x1 : ffff800082858810 x0 : 0000000000000050 Call trace: __debug_object_init+0x330/0x364 debug_object_init_on_stack+0x30/0x3c schedule_hrtimeout_range_clock+0xac/0x26c schedule_hrtimeout+0x1c/0x30 wait_task_inactive+0x1d4/0x25c kthread_bind_mask+0x28/0x98 init_rescuer+0x1e8/0x280 workqueue_init+0x1a0/0x3cc kernel_init_freeable+0x118/0x200 kernel_init+0x28/0x1f0 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- ODEBUG: object 3eff800082ea7bb0 is NOT on stack ffff800082ea0000, but annotated. ------------[ cut here ]------------ Link: https://lkml.kernel.org/r/20241113042544.19095-1-qun-wei.lin@mediatek.com Signed-off-by: Qun-Wei Lin <qun-wei.lin@mediatek.com> Cc: Andrew Yang <andrew.yang@mediatek.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Casper Li <casper.li@mediatek.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
IEEE 1588 support needs a clock defined as ptp_ref and this uses pll4_p as the parent clock
before:
$ ./testptp -g ; sleep 1 ; ./testptp -g
clock time: 0.000000000 or Wed Dec 31 18:00:00 1969
clock time: 0.000000000 or Wed Dec 31 18:00:00 1969
after:
$ ./testptp -g ; sleep 1 ; ./testptp -g
clock time: 1587623094.279486315 or Thu Apr 23 01:24:54 2020
clock time: 1587623095.292978915 or Thu Apr 23 01:24:55 2020