Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update obs build for kernel 6.5.3 #26

Merged
merged 6 commits into from
Sep 19, 2023
Merged

update obs build for kernel 6.5.3 #26

merged 6 commits into from
Sep 19, 2023

Conversation

matrix-wsk
Copy link
Collaborator

No description provided.

@deepin-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from matrix-wsk. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@matrix-wsk matrix-wsk merged commit c669c1f into deepin-community:kernel-mainline Sep 19, 2023
@Zeno-sole
Copy link
Contributor

/integrate

matrix-wsk pushed a commit that referenced this pull request Sep 25, 2023
commit 0b0747d upstream.

The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
matrix-wsk pushed a commit that referenced this pull request Dec 14, 2023
[ Upstream commit a524eab ]

As of commit b92143d ("net: dsa: mv88e6xxx: add infrastructure for
phylink_pcs") probing of a Marvell 88e6350 switch causes a NULL pointer
de-reference like this example:

    ...
    mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
    8<--- cut here ---
    Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read
    [00000000] *pgd=00000000
    Internal error: Oops: 5 [#1] ARM
    Modules linked in:
    CPU: 0 PID: 8 Comm: kworker/u2:0 Not tainted 6.7.0-rc2-dirty #26
    Hardware name: Marvell Armada 370/XP (Device Tree)
    Workqueue: events_unbound deferred_probe_work_func
    PC is at mv88e6xxx_port_setup+0x1c/0x44
    LR is at dsa_port_devlink_setup+0x74/0x154
    pc : [<c057ea24>]    lr : [<c0819598>]    psr: a0000013
    sp : c184fce0  ip : c542b8f4  fp : 00000000
    r10: 00000001  r9 : c542a540  r8 : c542bc00
    r7 : c542b838  r6 : c5244580  r5 : 00000005  r4 : c5244580
    r3 : 00000000  r2 : c542b840  r1 : 00000005  r0 : c1a02040
    ...

The Marvell 6350 switch has no SERDES interface and so has no
corresponding pcs_ops defined for it. But during probing a call is made
to mv88e6xxx_port_setup() which unconditionally expects pcs_ops to exist -
though the presence of the pcs_ops->pcs_init function is optional.

Modify code to check for pcs_ops first, before checking for and calling
pcs_ops->pcs_init. Modify checking and use of pcs_ops->pcs_teardown
which may potentially suffer the same problem.

Fixes: b92143d ("net: dsa: mv88e6xxx: add infrastructure for phylink_pcs")
Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
matrix-wsk pushed a commit to matrix-wsk/kernel-6.6 that referenced this pull request Feb 23, 2024
commit 56925f3 upstream.

With previous patch, one of subtests in test_btf_id becomes
flaky and may fail. The following is a failing example:

  Error: deepin-community#26 btf
  Error: deepin-community#26/174 btf/BTF ID
    Error: deepin-community#26/174 btf/BTF ID
    btf_raw_create:PASS:check 0 nsec
    btf_raw_create:PASS:check 0 nsec
    test_btf_id:PASS:check 0 nsec
    ...
    test_btf_id:PASS:check 0 nsec
    test_btf_id:FAIL:check BTF lingersdo_test_get_info:FAIL:check failed: -1

The test tries to prove a btf_id not available after the map is closed.
But btf_id is freed only after workqueue and a rcu grace period, compared
to previous case just after a rcu grade period.
Depending on system workload, workqueue could take quite some time
to execute function bpf_map_free_deferred() which may cause the test failure.
Instead of adding arbitrary delays, let us remove the logic to
check btf_id availability after map is closed.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231214203820.1469402-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sterling-teng pushed a commit to sterling-teng/kernel that referenced this pull request Jun 13, 2024
When we boot a 3C5000-7A2000 2Node server using "nr_cpus=1 maxcpus=1",
we get the following Call Trace:
    4.807397] ------------[ cut here ]------------
[    4.811996] WARNING: CPU: 0 PID: 1 at drivers/acpi/irq.c:63 acpi_register_gsi+0xe0/0x100
[    4.820048] Modules linked in:
[    4.823082] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc7-next-20240509+ deepin-community#26
[    4.830604] Hardware name: LOONGSON Dabieshan/Loongson-LS2C50C2, BIOS Loongson UEFI (3C50007A2000_Ls2c5lc2) V4.0.13-Dual 12/06/23 17:33:54
[    4.842964] pc 9000000000bdc6e0 ra 9000000000bdc66c tp 90000002051d4000 sp 90000002051d7ae0
[    4.851269] a0 0000000000000000 a1 00000000000000a0 a2 0000000000000000 a3 0000000000000000
[    4.859567] a4 9000000200014518 a5 000000000000003c a6 90000000018876b0 a7 00302e39303a3030
[    4.867866] t0 0000000000000000 t1 0000000000000080 t2 0000000000000040 t3 0000000000036800
[    4.876165] t4 0000000000036800 t5 0000000000000004 t6 0000000000000000 t7 0000000000000000
[    4.884464] t8 900000000800d910 u0 0000000000000000 s9 90000000014bf9e8 s0 00000000000000a0
[    4.892763] s1 0000000000000000 s2 0000000000000000 s3 9000000001985000 s4 90000002051d7b88
[    4.901063] s5 0000000000000001 s6 90000002055470b8 s7 90000000014400ac s8 0000000000000008
[    4.909362]    ra: 9000000000bdc66c acpi_register_gsi+0x6c/0x100
[    4.915330]   ERA: 9000000000bdc6e0 acpi_register_gsi+0xe0/0x100
[    4.921297]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[    4.927446]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[    4.931777]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[    4.936538]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[    4.941301] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[    4.946838]  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
[    4.952545] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc7-next-20240509+ deepin-community#26
[    4.960066] Hardware name: LOONGSON Dabieshan/Loongson-LS2C50C2, BIOS Loongson UEFI (3C50007A2000_Ls2c5lc2) V4.0.13-Dual 12/06/23 17:33:54
[    4.972425] Stack : 9000000001985000 0000000000000000 9000000000223524 90000002051d4000
[    4.980382]         90000002051d7740 90000002051d7748 0000000000000000 90000002051d7888
[    4.988338]         90000002051d7880 90000002051d7880 90000002051d7690 0000000000000001
[    4.996294]         0000000000000001 90000002051d7748 420a070366b7e75d 90000002054f4500
[    5.004251]         0000000000000001 0000000000000000 4137303030354333 c0000000ffffdfff
[    5.012206]         0000000000000003 00000000000ea642 0000000006b3c000 90000000014bf9e8
[    5.020163]         0000000000000000 0000000000000000 90000000017f7e80 9000000001985000
[    5.028119]         0000000000000001 0000000000000000 ffffffffffd7f549 90000000014400ac
[    5.036075]         0000000000000008 0000000000000000 900000000022353c 00007ffff1354000
[    5.044031]         00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
[    5.051987]         ...
[    5.054410] Call Trace:
[    5.054412] [<900000000022353c>] show_stack+0x5c/0x1a0
[    5.061940] [<900000000141e86c>] dump_stack_lvl+0x9c/0xc4
[    5.067305] [<900000000140170c>] __warn+0x94/0xcc
[    5.071977] [<90000000013cd900>] report_bug+0x160/0x220
[    5.077170] [<900000000141facc>] do_bp+0x26c/0x2e0
[    5.081929] [<0000000000000000>] 0x0
[    5.085475] [<9000000000bdc6e0>] acpi_register_gsi+0xe0/0x100
[    5.091182] [<9000000000bd5350>] acpi_pci_irq_enable+0xd0/0x220
[    5.097061] [<9000000000b5e8d4>] pci_device_probe+0x54/0x260
[    5.102683] [<9000000000d1fefc>] really_probe+0xbc/0x320
[    5.107960] [<9000000000d201f0>] __driver_probe_device+0x90/0x160
[    5.114012] [<9000000000d202f8>] driver_probe_device+0x38/0x120
[    5.119892] [<9000000000d205d4>] __driver_attach+0x94/0x1c0
[    5.125426] [<9000000000d1d7ac>] bus_for_each_dev+0x8c/0x100
[    5.131046] [<9000000000d1ee30>] bus_add_driver+0xf0/0x240
[    5.136494] [<9000000000d21688>] driver_register+0x68/0x140
[    5.142028] [<9000000000220158>] do_one_initcall+0x78/0x200
[    5.147561] [<9000000001440fcc>] kernel_init_freeable+0x23c/0x2b0
[    5.153614] [<900000000142123c>] kernel_init+0x1c/0x120
[    5.158802] [<9000000000221348>] ret_from_kernel_thread+0xc/0xa4

[    5.166244] ---[ end trace 0000000000000000 ]---
[    5.170998] GSI: No registered irqchip, giving up

The root cause is that when the CPU boot from flat mode, it is
unreliable to use cpu_to node to calculate the node number.
Therefore, we use NODES_PER_FLATMODE_NODE to calculate the
physical node number.

According to the 3C5000 user manual, the value of
NODES_PER_FLATMODE_NODE should be 4.

Signed-off-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants