-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4.14 hdmi #27
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Defconfig is adopted from 4.16-hdmi branch. No WiFi/Bt.
4.14 hdmi
d3adme4t
pushed a commit
to d3adme4t/BPI-Router-Linux
that referenced
this pull request
Jun 4, 2018
[ Upstream commit 864db4d ] Following kernel panic is observed on 64bit machine while loading the driver. It is fixed if we pass dynamically allocated memory to SDIO for DMA. BUG: unable to handle kernel paging request at ffffeb04000172e0 IP: sg_miter_stop+0x56/0x70 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: rsi_sdio(OE+) rsi_91x(OE) btrsi(OE) rfcomm bluetooth ecdh_generic mac80211 mmc_block fuse xt_CHECKSUM iptable_mangle drm_kms_helper mmc_core serio_raw drm firewire_ohci tg3 CPU: 0 PID: 4003 Comm: insmod Tainted: G OE 4.16.0-rc1+ frank-w#27 Hardware name: Dell Inc. Latitude E5500 /0DW634, BIOS A19 06/13/2013 RIP: 0010:sg_miter_stop+0x56/0x70 RSP: 0018:ffff88007d003e78 EFLAGS: 00010002 RAX: 0000000000000003 RBX: 0000000000000004 RCX: 0000000000000000 RDX: ffffeb04000172c0 RSI: ffff88002f58002c RDI: ffff88007d003e80 RBP: 0000000000000004 R08: ffff88007d003e80 R09: 0000000000000008 R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000004 R13: ffff88002f580028 R14: 0000000000000000 R15: 0000000000000004 FS: 00007f35c29db700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffeb04000172e0 CR3: 000000007038e000 CR4: 00000000000406f0 Call Trace: <IRQ> sg_copy_buffer+0xc6/0xf0 sdhci_tasklet_finish+0x170/0x260 [sdhci] tasklet_action+0xf4/0x100 __do_softirq+0xef/0x26e irq_exit+0xbe/0xd0 do_IRQ+0x4a/0xc0 common_interrupt+0xa2/0xa2 </IRQ> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 10, 2018
[ Upstream commit 864db4d ] Following kernel panic is observed on 64bit machine while loading the driver. It is fixed if we pass dynamically allocated memory to SDIO for DMA. BUG: unable to handle kernel paging request at ffffeb04000172e0 IP: sg_miter_stop+0x56/0x70 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: rsi_sdio(OE+) rsi_91x(OE) btrsi(OE) rfcomm bluetooth ecdh_generic mac80211 mmc_block fuse xt_CHECKSUM iptable_mangle drm_kms_helper mmc_core serio_raw drm firewire_ohci tg3 CPU: 0 PID: 4003 Comm: insmod Tainted: G OE 4.16.0-rc1+ #27 Hardware name: Dell Inc. Latitude E5500 /0DW634, BIOS A19 06/13/2013 RIP: 0010:sg_miter_stop+0x56/0x70 RSP: 0018:ffff88007d003e78 EFLAGS: 00010002 RAX: 0000000000000003 RBX: 0000000000000004 RCX: 0000000000000000 RDX: ffffeb04000172c0 RSI: ffff88002f58002c RDI: ffff88007d003e80 RBP: 0000000000000004 R08: ffff88007d003e80 R09: 0000000000000008 R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000004 R13: ffff88002f580028 R14: 0000000000000000 R15: 0000000000000004 FS: 00007f35c29db700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffeb04000172e0 CR3: 000000007038e000 CR4: 00000000000406f0 Call Trace: <IRQ> sg_copy_buffer+0xc6/0xf0 sdhci_tasklet_finish+0x170/0x260 [sdhci] tasklet_action+0xf4/0x100 __do_softirq+0xef/0x26e irq_exit+0xbe/0xd0 do_IRQ+0x4a/0xc0 common_interrupt+0xa2/0xa2 </IRQ> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jul 3, 2018
…_neon_begin, _end commit 6e88f01 upstream. In a arm64 server(QDF2400),I met a similar might-sleep warning as [1]: [ 7.019116] BUG: sleeping function called from invalid context at ./include/crypto/algapi.h:416 [ 7.027863] in_atomic(): 1, irqs_disabled(): 0, pid: 410, name: cryptomgr_test [ 7.035106] 1 lock held by cryptomgr_test/410: [ 7.039549] #0: (ptrval) (&drbg->drbg_mutex){+.+.}, at: drbg_instantiate+0x34/0x398 [ 7.048038] CPU: 9 PID: 410 Comm: cryptomgr_test Not tainted 4.17.0-rc6+ #27 [ 7.068228] dump_backtrace+0x0/0x1c0 [ 7.071890] show_stack+0x24/0x30 [ 7.075208] dump_stack+0xb0/0xec [ 7.078523] ___might_sleep+0x160/0x238 [ 7.082360] skcipher_walk_done+0x118/0x2c8 [ 7.086545] ctr_encrypt+0x98/0x130 [ 7.090035] simd_skcipher_encrypt+0x68/0xc0 [ 7.094304] drbg_kcapi_sym_ctr+0xd4/0x1f8 [ 7.098400] drbg_ctr_update+0x98/0x330 [ 7.102236] drbg_seed+0x1b8/0x2f0 [ 7.105637] drbg_instantiate+0x2ac/0x398 [ 7.109646] drbg_kcapi_seed+0xbc/0x188 [ 7.113482] crypto_rng_reset+0x4c/0xb0 [ 7.117319] alg_test_drbg+0xec/0x330 [ 7.120981] alg_test.part.6+0x1c8/0x3c8 [ 7.124903] alg_test+0x58/0xa0 [ 7.128044] cryptomgr_test+0x50/0x58 [ 7.131708] kthread+0x134/0x138 [ 7.134936] ret_from_fork+0x10/0x1c Seems there is a bug in Ard Biesheuvel's commit. Fixes: 6833817 ("crypto: arm64/aes-blk - move kernel mode neon en/disable into loop") [1] https://www.spinics.net/lists/linux-crypto/msg33103.html Signed-off-by: jia.he@hxt-semitech.com Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> # 4.17 Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jul 25, 2018
commit 36eb8ff upstream. Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a #4 [ffff891ee00039e0] no_context at ffffffffbd074643 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Aug 25, 2018
[ Upstream commit 3256d29 ] lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops can be called when interface is down. Move enic_rfs_flw_tbl_init to enic_probe. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: dump_stack+0x85/0xc0 register_lock_class+0x550/0x560 ? __handle_mm_fault+0xa8b/0x1100 __lock_acquire+0x81/0x670 lock_acquire+0xb9/0x1e0 ? enic_get_rxnfc+0x139/0x2b0 [enic] _raw_spin_lock_bh+0x38/0x80 ? enic_get_rxnfc+0x139/0x2b0 [enic] enic_get_rxnfc+0x139/0x2b0 [enic] ethtool_get_rxnfc+0x8d/0x1c0 dev_ethtool+0x16c8/0x2400 ? __mutex_lock+0x64d/0xa00 ? dev_load+0x6a/0x150 dev_ioctl+0x253/0x4b0 sock_do_ioctl+0x9a/0x130 sock_ioctl+0x1af/0x350 do_vfs_ioctl+0x8e/0x670 ? syscall_trace_enter+0x1e2/0x380 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5a/0x170 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Aug 25, 2018
[ Upstream commit 3256d29 ] lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops can be called when interface is down. Move enic_rfs_flw_tbl_init to enic_probe. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: dump_stack+0x85/0xc0 register_lock_class+0x550/0x560 ? __handle_mm_fault+0xa8b/0x1100 __lock_acquire+0x81/0x670 lock_acquire+0xb9/0x1e0 ? enic_get_rxnfc+0x139/0x2b0 [enic] _raw_spin_lock_bh+0x38/0x80 ? enic_get_rxnfc+0x139/0x2b0 [enic] enic_get_rxnfc+0x139/0x2b0 [enic] ethtool_get_rxnfc+0x8d/0x1c0 dev_ethtool+0x16c8/0x2400 ? __mutex_lock+0x64d/0xa00 ? dev_load+0x6a/0x150 dev_ioctl+0x253/0x4b0 sock_do_ioctl+0x9a/0x130 sock_ioctl+0x1af/0x350 do_vfs_ioctl+0x8e/0x670 ? syscall_trace_enter+0x1e2/0x380 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5a/0x170 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Dec 17, 2018
[ Upstream commit 095faf4 ] There is a reference counter to ensure that masquerade modules register notifiers only once. However, the existing reference counter approach is not safe, test commands are: while : do modprobe ip6t_MASQUERADE & modprobe nft_masq_ipv6 & modprobe -rv ip6t_MASQUERADE & modprobe -rv nft_masq_ipv6 & done numbers below represent the reference counter. -------------------------------------------------------- CPU0 CPU1 CPU2 CPU3 CPU4 [insmod] [insmod] [rmmod] [rmmod] [insmod] -------------------------------------------------------- 0->1 register 1->2 returns 2->1 returns 1->0 0->1 register <-- unregister -------------------------------------------------------- The unregistation of CPU3 should be processed before the registration of CPU4. In order to fix this, use a mutex instead of reference counter. splat looks like: [ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381] [ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n] [ 323.869574] irq event stamp: 194074 [ 323.898930] hardirqs last enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c [ 323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c [ 323.898930] softirqs last enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b [ 323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0 [ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27 [ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240 [ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488 [ 323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 [ 323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000 [ 323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0 [ 323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001 [ 323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000 [ 323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0 [ 323.898930] FS: 00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000 [ 323.898930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0 [ 323.898930] Call Trace: [ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0 [ 323.898930] ? down_read+0x150/0x150 [ 323.898930] ? sched_clock_cpu+0x126/0x170 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] register_netdevice_notifier+0xbb/0x790 [ 323.898930] ? __dev_close_many+0x2d0/0x2d0 [ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740 [ 323.898930] ? wait_for_completion+0x710/0x710 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? up_write+0x6c/0x210 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables] [ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables] [ ... ] Fixes: 8dd33cc ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables") Fixes: be6b635 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Dec 17, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe #4 [9a068ba8] 440 bytes lock_acquire at 3576ee #5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 #6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 #7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 #8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 #9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c #10 [9a069dd0] 168 bytes wake_up_worker at 23f97c #11 [9a069e78] 200 bytes insert_work at 23fc2e #12 [9a069f40] 648 bytes __queue_work at 2487c0 #13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 #14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 #15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 #16 [9a06a3a0] 288 bytes __blk_mq_delay_run_hw_queue at 158168c #17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c #18 [9a06a580] 184 bytes blk_mq_sched_insert_requests at 15a2192 #19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a #20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 #21 [9a06acf8] 320 bytes schedule at 219e476 #22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac #23 [9a06b130] 408 bytes wait_for_common at 21a1706 #24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 #25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 #26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 #27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 #28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 #29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e #30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 #31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 #32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca #33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce #34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 #35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de #36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e #37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 #38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 #39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 #40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 #41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 #42 [9a06e518] 88 bytes do_writepages at 73fe6a #43 [9a06e570] 872 bytes __writeback_single_inode at a20cb6 #44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 #45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 #46 [9a06ec98] 928 bytes wb_writeback at a2500e #47 [9a06f038] 848 bytes wb_do_writeback at a260ae #48 [9a06f388] 536 bytes wb_workfn at a28228 #49 [9a06f5a0] 1088 bytes process_one_work at 24a234 #50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 #51 [9a06fe40] 104 bytes kthread at 26545a #52 [9a06fea8] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
frank-w
pushed a commit
that referenced
this pull request
Dec 17, 2018
There is a reference counter to ensure that masquerade modules register notifiers only once. However, the existing reference counter approach is not safe, test commands are: while : do modprobe ip6t_MASQUERADE & modprobe nft_masq_ipv6 & modprobe -rv ip6t_MASQUERADE & modprobe -rv nft_masq_ipv6 & done numbers below represent the reference counter. -------------------------------------------------------- CPU0 CPU1 CPU2 CPU3 CPU4 [insmod] [insmod] [rmmod] [rmmod] [insmod] -------------------------------------------------------- 0->1 register 1->2 returns 2->1 returns 1->0 0->1 register <-- unregister -------------------------------------------------------- The unregistation of CPU3 should be processed before the registration of CPU4. In order to fix this, use a mutex instead of reference counter. splat looks like: [ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381] [ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n] [ 323.869574] irq event stamp: 194074 [ 323.898930] hardirqs last enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c [ 323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c [ 323.898930] softirqs last enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b [ 323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0 [ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27 [ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240 [ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488 [ 323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 [ 323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000 [ 323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0 [ 323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001 [ 323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000 [ 323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0 [ 323.898930] FS: 00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000 [ 323.898930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0 [ 323.898930] Call Trace: [ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0 [ 323.898930] ? down_read+0x150/0x150 [ 323.898930] ? sched_clock_cpu+0x126/0x170 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] register_netdevice_notifier+0xbb/0x790 [ 323.898930] ? __dev_close_many+0x2d0/0x2d0 [ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740 [ 323.898930] ? wait_for_completion+0x710/0x710 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 323.898930] ? up_write+0x6c/0x210 [ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables] [ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables] [ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables] [ ... ] Fixes: 8dd33cc ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables") Fixes: be6b635 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
frank-w
pushed a commit
that referenced
this pull request
Feb 4, 2019
This is the same sort of error we saw in commit 17e2e7d ("mm, page_alloc: fix has_unmovable_pages for HugePages"). Gigantic hugepages cross several memblocks, so it can be that the page we get in scan_movable_pages() is a page-tail belonging to a 1G-hugepage. If that happens, page_hstate()->size_to_hstate() will return NULL, and we will blow up in hugepage_migration_supported(). The splat is as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1350 Comm: bash Tainted: G E 5.0.0-rc1-mm1-1-default+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__offline_pages+0x6ae/0x900 Call Trace: memory_subsys_offline+0x42/0x60 device_offline+0x80/0xa0 state_store+0xab/0xc0 kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x190 vfs_write+0xad/0x1b0 ksys_write+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: af_packet(E) xt_tcpudp(E) ipt_REJECT(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv4(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) bridge(E) stp(E) llc(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) iptable_filter(E) ip_tables(E) x_tables(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) bochs_drm(E) ttm(E) aesni_intel(E) drm_kms_helper(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) drm(E) virtio_net(E) syscopyarea(E) sysfillrect(E) net_failover(E) sysimgblt(E) pcspkr(E) failover(E) i2c_piix4(E) fb_sys_fops(E) parport_pc(E) parport(E) button(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid6_pq(E) sd_mod(E) ata_generic(E) ata_piix(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_pci(E) virtio_ring(E) virtio(E) sg(E) scsi_mod(E) autofs4(E) [akpm@linux-foundation.org: fix brace layout, per David. Reduce indentation] Link: http://lkml.kernel.org/r/20190122154407.18417-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
frank-w
pushed a commit
that referenced
this pull request
Feb 7, 2019
commit eeb0efd upstream. This is the same sort of error we saw in commit 17e2e7d ("mm, page_alloc: fix has_unmovable_pages for HugePages"). Gigantic hugepages cross several memblocks, so it can be that the page we get in scan_movable_pages() is a page-tail belonging to a 1G-hugepage. If that happens, page_hstate()->size_to_hstate() will return NULL, and we will blow up in hugepage_migration_supported(). The splat is as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1350 Comm: bash Tainted: G E 5.0.0-rc1-mm1-1-default+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__offline_pages+0x6ae/0x900 Call Trace: memory_subsys_offline+0x42/0x60 device_offline+0x80/0xa0 state_store+0xab/0xc0 kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x190 vfs_write+0xad/0x1b0 ksys_write+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: af_packet(E) xt_tcpudp(E) ipt_REJECT(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv4(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) bridge(E) stp(E) llc(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) iptable_filter(E) ip_tables(E) x_tables(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) bochs_drm(E) ttm(E) aesni_intel(E) drm_kms_helper(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) drm(E) virtio_net(E) syscopyarea(E) sysfillrect(E) net_failover(E) sysimgblt(E) pcspkr(E) failover(E) i2c_piix4(E) fb_sys_fops(E) parport_pc(E) parport(E) button(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid6_pq(E) sd_mod(E) ata_generic(E) ata_piix(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_pci(E) virtio_ring(E) virtio(E) sg(E) scsi_mod(E) autofs4(E) [akpm@linux-foundation.org: fix brace layout, per David. Reduce indentation] Link: http://lkml.kernel.org/r/20190122154407.18417-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Feb 7, 2019
commit eeb0efd upstream. This is the same sort of error we saw in commit 17e2e7d ("mm, page_alloc: fix has_unmovable_pages for HugePages"). Gigantic hugepages cross several memblocks, so it can be that the page we get in scan_movable_pages() is a page-tail belonging to a 1G-hugepage. If that happens, page_hstate()->size_to_hstate() will return NULL, and we will blow up in hugepage_migration_supported(). The splat is as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1350 Comm: bash Tainted: G E 5.0.0-rc1-mm1-1-default+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__offline_pages+0x6ae/0x900 Call Trace: memory_subsys_offline+0x42/0x60 device_offline+0x80/0xa0 state_store+0xab/0xc0 kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x190 vfs_write+0xad/0x1b0 ksys_write+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: af_packet(E) xt_tcpudp(E) ipt_REJECT(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv4(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) bridge(E) stp(E) llc(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) iptable_filter(E) ip_tables(E) x_tables(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) bochs_drm(E) ttm(E) aesni_intel(E) drm_kms_helper(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) drm(E) virtio_net(E) syscopyarea(E) sysfillrect(E) net_failover(E) sysimgblt(E) pcspkr(E) failover(E) i2c_piix4(E) fb_sys_fops(E) parport_pc(E) parport(E) button(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid6_pq(E) sd_mod(E) ata_generic(E) ata_piix(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_pci(E) virtio_ring(E) virtio(E) sg(E) scsi_mod(E) autofs4(E) [akpm@linux-foundation.org: fix brace layout, per David. Reduce indentation] Link: http://lkml.kernel.org/r/20190122154407.18417-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 25, 2022
commit cefa91b upstream. Given a sufficiently large number of actions, while copying and reserving memory for a new action of a new flow, if next_offset is greater than MAX_ACTIONS_BUFSIZE, the function reserve_sfa_size() does not return -EMSGSIZE as expected, but it allocates MAX_ACTIONS_BUFSIZE bytes increasing actions_len by req_size. This can then lead to an OOB write access, especially when further actions need to be copied. Fix it by rearranging the flow action size check. KASAN splat below: ================================================================== BUG: KASAN: slab-out-of-bounds in reserve_sfa_size+0x1ba/0x380 [openvswitch] Write of size 65360 at addr ffff888147e4001c by task handler15/836 CPU: 1 PID: 836 Comm: handler15 Not tainted 5.18.0-rc1+ #27 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x5a print_report.cold+0x5e/0x5db ? __lock_text_start+0x8/0x8 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_report+0xb5/0x130 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_check_range+0xf5/0x1d0 memcpy+0x39/0x60 reserve_sfa_size+0x1ba/0x380 [openvswitch] __add_action+0x24/0x120 [openvswitch] ovs_nla_add_action+0xe/0x20 [openvswitch] ovs_ct_copy_action+0x29d/0x1130 [openvswitch] ? __kernel_text_address+0xe/0x30 ? unwind_get_return_address+0x56/0xa0 ? create_prof_cpu_mask+0x20/0x20 ? ovs_ct_verify+0xf0/0xf0 [openvswitch] ? prep_compound_page+0x198/0x2a0 ? __kasan_check_byte+0x10/0x40 ? kasan_unpoison+0x40/0x70 ? ksize+0x44/0x60 ? reserve_sfa_size+0x75/0x380 [openvswitch] __ovs_nla_copy_actions+0xc26/0x2070 [openvswitch] ? __zone_watermark_ok+0x420/0x420 ? validate_set.constprop.0+0xc90/0xc90 [openvswitch] ? __alloc_pages+0x1a9/0x3e0 ? __alloc_pages_slowpath.constprop.0+0x1da0/0x1da0 ? unwind_next_frame+0x991/0x1e40 ? __mod_node_page_state+0x99/0x120 ? __mod_lruvec_page_state+0x2e3/0x470 ? __kasan_kmalloc_large+0x90/0xe0 ovs_nla_copy_actions+0x1b4/0x2c0 [openvswitch] ovs_flow_cmd_new+0x3cd/0xb10 [openvswitch] ... Cc: stable@vger.kernel.org Fixes: f28cd2a ("openvswitch: fix flow actions reallocation") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 25, 2022
commit cefa91b upstream. Given a sufficiently large number of actions, while copying and reserving memory for a new action of a new flow, if next_offset is greater than MAX_ACTIONS_BUFSIZE, the function reserve_sfa_size() does not return -EMSGSIZE as expected, but it allocates MAX_ACTIONS_BUFSIZE bytes increasing actions_len by req_size. This can then lead to an OOB write access, especially when further actions need to be copied. Fix it by rearranging the flow action size check. KASAN splat below: ================================================================== BUG: KASAN: slab-out-of-bounds in reserve_sfa_size+0x1ba/0x380 [openvswitch] Write of size 65360 at addr ffff888147e4001c by task handler15/836 CPU: 1 PID: 836 Comm: handler15 Not tainted 5.18.0-rc1+ #27 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x5a print_report.cold+0x5e/0x5db ? __lock_text_start+0x8/0x8 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_report+0xb5/0x130 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_check_range+0xf5/0x1d0 memcpy+0x39/0x60 reserve_sfa_size+0x1ba/0x380 [openvswitch] __add_action+0x24/0x120 [openvswitch] ovs_nla_add_action+0xe/0x20 [openvswitch] ovs_ct_copy_action+0x29d/0x1130 [openvswitch] ? __kernel_text_address+0xe/0x30 ? unwind_get_return_address+0x56/0xa0 ? create_prof_cpu_mask+0x20/0x20 ? ovs_ct_verify+0xf0/0xf0 [openvswitch] ? prep_compound_page+0x198/0x2a0 ? __kasan_check_byte+0x10/0x40 ? kasan_unpoison+0x40/0x70 ? ksize+0x44/0x60 ? reserve_sfa_size+0x75/0x380 [openvswitch] __ovs_nla_copy_actions+0xc26/0x2070 [openvswitch] ? __zone_watermark_ok+0x420/0x420 ? validate_set.constprop.0+0xc90/0xc90 [openvswitch] ? __alloc_pages+0x1a9/0x3e0 ? __alloc_pages_slowpath.constprop.0+0x1da0/0x1da0 ? unwind_next_frame+0x991/0x1e40 ? __mod_node_page_state+0x99/0x120 ? __mod_lruvec_page_state+0x2e3/0x470 ? __kasan_kmalloc_large+0x90/0xe0 ovs_nla_copy_actions+0x1b4/0x2c0 [openvswitch] ovs_flow_cmd_new+0x3cd/0xb10 [openvswitch] ... Cc: stable@vger.kernel.org Fixes: f28cd2a ("openvswitch: fix flow actions reallocation") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 25, 2022
commit cefa91b upstream. Given a sufficiently large number of actions, while copying and reserving memory for a new action of a new flow, if next_offset is greater than MAX_ACTIONS_BUFSIZE, the function reserve_sfa_size() does not return -EMSGSIZE as expected, but it allocates MAX_ACTIONS_BUFSIZE bytes increasing actions_len by req_size. This can then lead to an OOB write access, especially when further actions need to be copied. Fix it by rearranging the flow action size check. KASAN splat below: ================================================================== BUG: KASAN: slab-out-of-bounds in reserve_sfa_size+0x1ba/0x380 [openvswitch] Write of size 65360 at addr ffff888147e4001c by task handler15/836 CPU: 1 PID: 836 Comm: handler15 Not tainted 5.18.0-rc1+ #27 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x5a print_report.cold+0x5e/0x5db ? __lock_text_start+0x8/0x8 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_report+0xb5/0x130 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_check_range+0xf5/0x1d0 memcpy+0x39/0x60 reserve_sfa_size+0x1ba/0x380 [openvswitch] __add_action+0x24/0x120 [openvswitch] ovs_nla_add_action+0xe/0x20 [openvswitch] ovs_ct_copy_action+0x29d/0x1130 [openvswitch] ? __kernel_text_address+0xe/0x30 ? unwind_get_return_address+0x56/0xa0 ? create_prof_cpu_mask+0x20/0x20 ? ovs_ct_verify+0xf0/0xf0 [openvswitch] ? prep_compound_page+0x198/0x2a0 ? __kasan_check_byte+0x10/0x40 ? kasan_unpoison+0x40/0x70 ? ksize+0x44/0x60 ? reserve_sfa_size+0x75/0x380 [openvswitch] __ovs_nla_copy_actions+0xc26/0x2070 [openvswitch] ? __zone_watermark_ok+0x420/0x420 ? validate_set.constprop.0+0xc90/0xc90 [openvswitch] ? __alloc_pages+0x1a9/0x3e0 ? __alloc_pages_slowpath.constprop.0+0x1da0/0x1da0 ? unwind_next_frame+0x991/0x1e40 ? __mod_node_page_state+0x99/0x120 ? __mod_lruvec_page_state+0x2e3/0x470 ? __kasan_kmalloc_large+0x90/0xe0 ovs_nla_copy_actions+0x1b4/0x2c0 [openvswitch] ovs_flow_cmd_new+0x3cd/0xb10 [openvswitch] ... Cc: stable@vger.kernel.org Fixes: f28cd2a ("openvswitch: fix flow actions reallocation") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 25, 2022
commit cefa91b upstream. Given a sufficiently large number of actions, while copying and reserving memory for a new action of a new flow, if next_offset is greater than MAX_ACTIONS_BUFSIZE, the function reserve_sfa_size() does not return -EMSGSIZE as expected, but it allocates MAX_ACTIONS_BUFSIZE bytes increasing actions_len by req_size. This can then lead to an OOB write access, especially when further actions need to be copied. Fix it by rearranging the flow action size check. KASAN splat below: ================================================================== BUG: KASAN: slab-out-of-bounds in reserve_sfa_size+0x1ba/0x380 [openvswitch] Write of size 65360 at addr ffff888147e4001c by task handler15/836 CPU: 1 PID: 836 Comm: handler15 Not tainted 5.18.0-rc1+ #27 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x5a print_report.cold+0x5e/0x5db ? __lock_text_start+0x8/0x8 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_report+0xb5/0x130 ? reserve_sfa_size+0x1ba/0x380 [openvswitch] kasan_check_range+0xf5/0x1d0 memcpy+0x39/0x60 reserve_sfa_size+0x1ba/0x380 [openvswitch] __add_action+0x24/0x120 [openvswitch] ovs_nla_add_action+0xe/0x20 [openvswitch] ovs_ct_copy_action+0x29d/0x1130 [openvswitch] ? __kernel_text_address+0xe/0x30 ? unwind_get_return_address+0x56/0xa0 ? create_prof_cpu_mask+0x20/0x20 ? ovs_ct_verify+0xf0/0xf0 [openvswitch] ? prep_compound_page+0x198/0x2a0 ? __kasan_check_byte+0x10/0x40 ? kasan_unpoison+0x40/0x70 ? ksize+0x44/0x60 ? reserve_sfa_size+0x75/0x380 [openvswitch] __ovs_nla_copy_actions+0xc26/0x2070 [openvswitch] ? __zone_watermark_ok+0x420/0x420 ? validate_set.constprop.0+0xc90/0xc90 [openvswitch] ? __alloc_pages+0x1a9/0x3e0 ? __alloc_pages_slowpath.constprop.0+0x1da0/0x1da0 ? unwind_next_frame+0x991/0x1e40 ? __mod_node_page_state+0x99/0x120 ? __mod_lruvec_page_state+0x2e3/0x470 ? __kasan_kmalloc_large+0x90/0xe0 ovs_nla_copy_actions+0x1b4/0x2c0 [openvswitch] ovs_flow_cmd_new+0x3cd/0xb10 [openvswitch] ... Cc: stable@vger.kernel.org Fixes: f28cd2a ("openvswitch: fix flow actions reallocation") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 24, 2023
The cited commit adds a compeletion to remove dependency on rtnl lock. But it causes a deadlock for multiple encapsulations: crash> bt ffff8aece8a64000 PID: 1514557 TASK: ffff8aece8a64000 CPU: 3 COMMAND: "tc" #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418 #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898 #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8 #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core] #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core] #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core] #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core] #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core] #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core] #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core] #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower] #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower] #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c crash> bt 0xffff8aeb07544000 PID: 1110766 TASK: ffff8aeb07544000 CPU: 0 COMMAND: "kworker/u20:9" #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418 #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88 #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core] #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core] #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core] #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012 #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f After the first encap is attached, flow will be added to encap entry's flows list. If neigh update is running at this time, the following encaps of the flow can't hold the encap_tbl_lock and sleep. If neigh update thread is waiting for that flow's init_done, deadlock happens. Fix it by holding lock outside of the for loop. If neigh update is running, prevent encap flows from offloading. Since the lock is held outside of the for loop, concurrent creation of encap entries is not allowed. So remove unnecessary wait_for_completion call for res_ready. Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
frank-w
pushed a commit
that referenced
this pull request
Oct 2, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Oct 2, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Oct 26, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Oct 26, 2023
commit f9a4300 upstream. [ This backport includes ab5e5c0 ("netfilter: nf_tables: use kfree_rcu(ptr, rcu) to release hooks in clean_net path") ] __nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: 767d121 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Oct 26, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Oct 26, 2023
[ Upstream commit 0b0747d ] The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Jan 14, 2024
commit f9a4300 upstream. __nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: 767d121 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Apr 6, 2024
commit b7cc4ff upstream. It appears the client object tree has no locking unless I've missed something else. Fix races around adding/removing client objects, mostly vram bar mappings. 4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: 0000 [#1] PREEMPT SMP PTI [ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 [ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 [ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe [ 4562.099506] RSP: 0000:ffffa94cc420bbf8 EFLAGS: 00010206 [ 4562.099512] RAX: 6677ed422bceb814 RBX: ffff98108791f400 RCX: ffff9810f26b8f58 [ 4562.099517] RDX: 0000000000000000 RSI: ffff9810f26b9158 RDI: ffff98108791f400 [ 4562.099519] RBP: ffff9810f26b9158 R08: 0000000000000000 R09: 0000000000000000 [ 4562.099521] R10: ffffa94cc420bc48 R11: 0000000000000001 R12: ffff9810f02a7cc0 [ 4562.099526] R13: 0000000000000000 R14: 00000000000000ff R15: 0000000000000007 [ 4562.099528] FS: 00007f629c5017c0(0000) GS:ffff98142c700000(0000) knlGS:0000000000000000 [ 4562.099534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4562.099536] CR2: 00007f629a882000 CR3: 000000017019e004 CR4: 00000000003706f0 [ 4562.099541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4562.099542] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4562.099544] Call Trace: [ 4562.099555] <TASK> [ 4562.099573] ? die_addr+0x36/0x90 [ 4562.099583] ? exc_general_protection+0x246/0x4a0 [ 4562.099593] ? asm_exc_general_protection+0x26/0x30 [ 4562.099600] ? nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099730] nvkm_ioctl+0xa1/0x250 [nouveau] [ 4562.099861] nvif_object_map_handle+0xc8/0x180 [nouveau] [ 4562.099986] nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau] [ 4562.100156] ? dma_resv_test_signaled+0x26/0xb0 [ 4562.100163] ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm] [ 4562.100182] ? __mutex_unlock_slowpath+0x2a/0x270 [ 4562.100189] nouveau_ttm_fault+0x69/0xb0 [nouveau] [ 4562.100356] __do_fault+0x32/0x150 [ 4562.100362] do_fault+0x7c/0x560 [ 4562.100369] __handle_mm_fault+0x800/0xc10 [ 4562.100382] handle_mm_fault+0x17c/0x3e0 [ 4562.100388] do_user_addr_fault+0x208/0x860 [ 4562.100395] exc_page_fault+0x7f/0x200 [ 4562.100402] asm_exc_page_fault+0x26/0x30 [ 4562.100412] RIP: 0033:0x9b9870 [ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7 [ 4562.100422] RSP: 002b:00007fff9ba2dc70 EFLAGS: 00010246 [ 4562.100426] RAX: 0000000000000004 RBX: 000000000dd65e10 RCX: 000000fff0000000 [ 4562.100428] RDX: 00007f629a882000 RSI: 00007f629a882000 RDI: 0000000000000066 [ 4562.100432] RBP: 00007fff9ba2e570 R08: 0000000000000000 R09: 0000000123ddf000 [ 4562.100434] R10: 0000000000000001 R11: 0000000000000246 R12: 000000007fffffff [ 4562.100436] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 4562.100446] </TASK> [ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801 [ 4562.100541] ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse [ 4562.100616] ---[ end trace 0000000000000000 ]--- Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 9, 2024
Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
frank-w
pushed a commit
that referenced
this pull request
May 19, 2024
commit fff1386 upstream. Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 19, 2024
commit fff1386 upstream. Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
May 20, 2024
commit fff1386 upstream. Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 8, 2024
commit fff1386 upstream. Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 8, 2024
commit fff1386 upstream. Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 29, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 29, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Jun 29, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
BtbN
pushed a commit
to BtbN/BPI-Router-Linux
that referenced
this pull request
Jul 5, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ frank-w#27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Aug 11, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Sep 14, 2024
[ Upstream commit 8987515 ] When request_irq() fails, error path calls vp_del_vqs(). There, as vq is present in the list, free_irq() is called for the same vector. That causes following splat: [ 0.414355] Trying to free already-free IRQ 27 [ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0 [ 0.414510] Modules linked in: [ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27 [ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0 [ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40 [ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086 [ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000 [ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001 [ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001 [ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760 [ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600 [ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000 [ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0 [ 0.414540] Call Trace: [ 0.414540] <TASK> [ 0.414540] ? __warn+0x80/0x120 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] ? report_bug+0x164/0x190 [ 0.414540] ? handle_bug+0x3b/0x70 [ 0.414540] ? exc_invalid_op+0x17/0x70 [ 0.414540] ? asm_exc_invalid_op+0x1a/0x20 [ 0.414540] ? free_irq+0x1a1/0x2d0 [ 0.414540] vp_del_vqs+0xc1/0x220 [ 0.414540] vp_find_vqs_msix+0x305/0x470 [ 0.414540] vp_find_vqs+0x3e/0x1a0 [ 0.414540] vp_modern_find_vqs+0x1b/0x70 [ 0.414540] init_vqs+0x387/0x600 [ 0.414540] virtnet_probe+0x50a/0xc80 [ 0.414540] virtio_dev_probe+0x1e0/0x2b0 [ 0.414540] really_probe+0xc0/0x2c0 [ 0.414540] ? __pfx___driver_attach+0x10/0x10 [ 0.414540] __driver_probe_device+0x73/0x120 [ 0.414540] driver_probe_device+0x1f/0xe0 [ 0.414540] __driver_attach+0x88/0x180 [ 0.414540] bus_for_each_dev+0x85/0xd0 [ 0.414540] bus_add_driver+0xec/0x1f0 [ 0.414540] driver_register+0x59/0x100 [ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 0.414540] virtio_net_driver_init+0x90/0xb0 [ 0.414540] do_one_initcall+0x58/0x230 [ 0.414540] kernel_init_freeable+0x1a3/0x2d0 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] kernel_init+0x1a/0x1c0 [ 0.414540] ret_from_fork+0x31/0x50 [ 0.414540] ? __pfx_kernel_init+0x10/0x10 [ 0.414540] ret_from_fork_asm+0x1a/0x30 [ 0.414540] </TASK> Fix this by calling deleting the current vq when request_irq() fails. Fixes: 0b0f9dc ("Revert "virtio_pci: use shared interrupts for virtqueues"") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240426150845.3999481-1-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Sep 14, 2024
commit f9a4300 upstream. __nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: 767d121 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Nov 3, 2024
Wesley reported an issue: ================================================================== EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks ------------[ cut here ]------------ kernel BUG at fs/ext4/resize.c:324! CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27 RIP: 0010:ext4_resize_fs+0x1212/0x12d0 Call Trace: __ext4_ioctl+0x4e0/0x1800 ext4_ioctl+0x12/0x20 __x64_sys_ioctl+0x99/0xd0 x64_sys_call+0x1206/0x20d0 do_syscall_64+0x72/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e ================================================================== While reviewing the patch, Honza found that when adjusting resize_bg in alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than flexbg_size. The reproduction of the problem requires the following: o_group = flexbg_size * 2 * n; o_size = (o_group + 1) * group_size; n_group: [o_group + flexbg_size, o_group + flexbg_size * 2) o_size = (n_group + 1) * group_size; Take n=0,flexbg_size=16 as an example: last:15 |o---------------|--------------n-| o_group:0 resize to n_group:30 The corresponding reproducer is: img=test.img rm -f $img truncate -s 600M $img mkfs.ext4 -F $img -b 1024 -G 16 8M dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 248M Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE() to prevent the issue from happening again. [ Note: another reproucer which this commit fixes is: img=test.img rm -f $img truncate -s 25MiB $img mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img truncate -s 3GiB $img dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 3G umount $dev losetup -d $dev -- TYT ] Reported-by: Wesley Hershberger <wesley.hershberger@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231 Reported-by: Stéphane Graber <stgraber@stgraber.org> Closes: https://lore.kernel.org/all/20240925143325.518508-1-aleksandr.mikhalitsyn@canonical.com/ Tested-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Tested-by: Eric Sandeen <sandeen@redhat.com> Fixes: 665d3e0 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()") Cc: stable@vger.kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240927133329.1015041-1-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
BtbN
pushed a commit
to BtbN/BPI-Router-Linux
that referenced
this pull request
Nov 9, 2024
commit 6121258 upstream. Wesley reported an issue: ================================================================== EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks ------------[ cut here ]------------ kernel BUG at fs/ext4/resize.c:324! CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ frank-w#27 RIP: 0010:ext4_resize_fs+0x1212/0x12d0 Call Trace: __ext4_ioctl+0x4e0/0x1800 ext4_ioctl+0x12/0x20 __x64_sys_ioctl+0x99/0xd0 x64_sys_call+0x1206/0x20d0 do_syscall_64+0x72/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e ================================================================== While reviewing the patch, Honza found that when adjusting resize_bg in alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than flexbg_size. The reproduction of the problem requires the following: o_group = flexbg_size * 2 * n; o_size = (o_group + 1) * group_size; n_group: [o_group + flexbg_size, o_group + flexbg_size * 2) o_size = (n_group + 1) * group_size; Take n=0,flexbg_size=16 as an example: last:15 |o---------------|--------------n-| o_group:0 resize to n_group:30 The corresponding reproducer is: img=test.img rm -f $img truncate -s 600M $img mkfs.ext4 -F $img -b 1024 -G 16 8M dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 248M Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE() to prevent the issue from happening again. [ Note: another reproucer which this commit fixes is: img=test.img rm -f $img truncate -s 25MiB $img mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img truncate -s 3GiB $img dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 3G umount $dev losetup -d $dev -- TYT ] Reported-by: Wesley Hershberger <wesley.hershberger@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231 Reported-by: Stéphane Graber <stgraber@stgraber.org> Closes: https://lore.kernel.org/all/20240925143325.518508-1-aleksandr.mikhalitsyn@canonical.com/ Tested-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Tested-by: Eric Sandeen <sandeen@redhat.com> Fixes: 665d3e0 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()") Cc: stable@vger.kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240927133329.1015041-1-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
frank-w
pushed a commit
that referenced
this pull request
Dec 6, 2024
[ Upstream commit 60f07e2 ] We use uprobe in aarch64_be, which we found the tracee task would exit due to SIGILL when we enable the uprobe trace. We can see the replace inst from uprobe is not correct in aarch big-endian. As in Armv8-A, instruction fetches are always treated as little-endian, we should treat the UPROBE_SWBP_INSN as little-endian。 The test case is as following。 bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null & bash-4.4# cd /sys/kernel/debug/tracing/ bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events bash-4.4# echo 1 > events/uprobes/enable bash-4.4# bash-4.4# ps PID TTY TIME CMD 140 ? 00:00:01 bash 237 ? 00:00:00 ps [1]+ Illegal instruction ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null which we debug use gdb as following: bash-4.4# gdb attach 155 (gdb) disassemble send Dump of assembler code for function send: 0x0000000000400c30 <+0>: .inst 0xa00020d4 ; undefined 0x0000000000400c34 <+4>: mov x29, sp 0x0000000000400c38 <+8>: str w0, [sp, #28] 0x0000000000400c3c <+12>: strb w1, [sp, #27] 0x0000000000400c40 <+16>: str xzr, [sp, #40] 0x0000000000400c44 <+20>: str xzr, [sp, #48] 0x0000000000400c48 <+24>: add x0, sp, #0x1b 0x0000000000400c4c <+28>: mov w3, #0x0 // #0 0x0000000000400c50 <+32>: mov x2, #0x1 // #1 0x0000000000400c54 <+36>: mov x1, x0 0x0000000000400c58 <+40>: ldr w0, [sp, #28] 0x0000000000400c5c <+44>: bl 0x405e10 <mq_send> 0x0000000000400c60 <+48>: str w0, [sp, #60] 0x0000000000400c64 <+52>: ldr w0, [sp, #60] 0x0000000000400c68 <+56>: ldp x29, x30, [sp], #64 0x0000000000400c6c <+60>: ret End of assembler dump. (gdb) info b No breakpoints or watchpoints. (gdb) c Continuing. Program received signal SIGILL, Illegal instruction. 0x0000000000400c30 in send () (gdb) x/10x 0x400c30 0x400c30 <send>: 0xd42000a0 0xfd030091 0xe01f00b9 0xe16f0039 0x400c40 <send+16>: 0xff1700f9 0xff1b00f9 0xe06f0091 0x03008052 0x400c50 <send+32>: 0x220080d2 0xe10300aa (gdb) disassemble 0x400c30 Dump of assembler code for function send: => 0x0000000000400c30 <+0>: .inst 0xa00020d4 ; undefined 0x0000000000400c34 <+4>: mov x29, sp 0x0000000000400c38 <+8>: str w0, [sp, #28] 0x0000000000400c3c <+12>: strb w1, [sp, #27] 0x0000000000400c40 <+16>: str xzr, [sp, #40] Signed-off-by: junhua huang <huang.junhua@zte.com.cn> Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn Signed-off-by: Will Deacon <will@kernel.org> Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels") Signed-off-by: Sasha Levin <sashal@kernel.org>
frank-w
pushed a commit
that referenced
this pull request
Dec 6, 2024
[ Upstream commit 60f07e2 ] We use uprobe in aarch64_be, which we found the tracee task would exit due to SIGILL when we enable the uprobe trace. We can see the replace inst from uprobe is not correct in aarch big-endian. As in Armv8-A, instruction fetches are always treated as little-endian, we should treat the UPROBE_SWBP_INSN as little-endian。 The test case is as following。 bash-4.4# ./mqueue_test_aarchbe 1 1 2 1 10 > /dev/null & bash-4.4# cd /sys/kernel/debug/tracing/ bash-4.4# echo 'p:test /mqueue_test_aarchbe:0xc30 %x0 %x1' > uprobe_events bash-4.4# echo 1 > events/uprobes/enable bash-4.4# bash-4.4# ps PID TTY TIME CMD 140 ? 00:00:01 bash 237 ? 00:00:00 ps [1]+ Illegal instruction ./mqueue_test_aarchbe 1 1 2 1 100 > /dev/null which we debug use gdb as following: bash-4.4# gdb attach 155 (gdb) disassemble send Dump of assembler code for function send: 0x0000000000400c30 <+0>: .inst 0xa00020d4 ; undefined 0x0000000000400c34 <+4>: mov x29, sp 0x0000000000400c38 <+8>: str w0, [sp, #28] 0x0000000000400c3c <+12>: strb w1, [sp, #27] 0x0000000000400c40 <+16>: str xzr, [sp, #40] 0x0000000000400c44 <+20>: str xzr, [sp, #48] 0x0000000000400c48 <+24>: add x0, sp, #0x1b 0x0000000000400c4c <+28>: mov w3, #0x0 // #0 0x0000000000400c50 <+32>: mov x2, #0x1 // #1 0x0000000000400c54 <+36>: mov x1, x0 0x0000000000400c58 <+40>: ldr w0, [sp, #28] 0x0000000000400c5c <+44>: bl 0x405e10 <mq_send> 0x0000000000400c60 <+48>: str w0, [sp, #60] 0x0000000000400c64 <+52>: ldr w0, [sp, #60] 0x0000000000400c68 <+56>: ldp x29, x30, [sp], #64 0x0000000000400c6c <+60>: ret End of assembler dump. (gdb) info b No breakpoints or watchpoints. (gdb) c Continuing. Program received signal SIGILL, Illegal instruction. 0x0000000000400c30 in send () (gdb) x/10x 0x400c30 0x400c30 <send>: 0xd42000a0 0xfd030091 0xe01f00b9 0xe16f0039 0x400c40 <send+16>: 0xff1700f9 0xff1b00f9 0xe06f0091 0x03008052 0x400c50 <send+32>: 0x220080d2 0xe10300aa (gdb) disassemble 0x400c30 Dump of assembler code for function send: => 0x0000000000400c30 <+0>: .inst 0xa00020d4 ; undefined 0x0000000000400c34 <+4>: mov x29, sp 0x0000000000400c38 <+8>: str w0, [sp, #28] 0x0000000000400c3c <+12>: strb w1, [sp, #27] 0x0000000000400c40 <+16>: str xzr, [sp, #40] Signed-off-by: junhua huang <huang.junhua@zte.com.cn> Link: https://lore.kernel.org/r/202212021511106844809@zte.com.cn Signed-off-by: Will Deacon <will@kernel.org> Stable-dep-of: 13f8f1e ("arm64: probes: Fix uprobes for big-endian kernels") Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.