workqueue lockup in rcu_sched #18

@oaken-source

Running linux-5.12 on the 7100, I get messages like the following on the serial console after a couple of hours or days of uptime:

[271072.624733] BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0 stuck for 2530s!
[271072.632592] Showing busy workqueues and worker pools:
[271072.637810] workqueue events_unbound: flags=0x2
[271072.642464]   pwq 4: cpus=0-1 flags=0x4 nice=0 active=3/512 refcnt=5
[271072.648930]     pending: flush_to_ldisc, flush_to_ldisc, flush_to_ldisc
[271072.655750] workqueue ext4-rsv-conversion: flags=0x2000a
[271072.661169]   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/1 refcnt=3
[271072.667470]     pending: ext4_end_io_rsv_work
[271072.671983] pool 1: cpus=0 node=0 flags=0x1 nice=-20 hung=31s workers=2 manager: 816094 idle: 823322
[271081.464351] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[271081.470431]         (detected by 0, t=254302 jiffies, g=8370033, q=8218)
[271081.476627] rcu: All QSes seen, last rcu_sched kthread activity 254300 (4322046580-4321792280), jiffies_till_next_fqs=1, root ->qsmask 0x0
[271081.489167] rcu: rcu_sched kthread timer wakeup didn't happen for 254299 jiffies! g8370033 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
[271081.500810] rcu:    Possible timer handling issue on cpu=1 timer-softirq=2492648
[271081.508132] rcu: rcu_sched kthread starved for 254300 jiffies! g8370033 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=1
[271081.518846] rcu:    Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[271081.527904] rcu: RCU grace-period kthread stack dump:
[271081.533050] task:rcu_sched       state:R stack:    0 pid:   11 ppid:     2 flags:0x00000000
[271081.541522] Call Trace:
[271081.544064] [<ffffffe000616bdc>] __schedule+0x1be/0x4a6
[271081.549410] [<ffffffe000616f1a>] schedule+0x56/0xca
[271081.554390] [<ffffffe000619b2a>] schedule_timeout+0x68/0xca
[271081.560071] [<ffffffe00005cddc>] rcu_gp_kthread+0x510/0x93e
[271081.565756] [<ffffffe000023bfa>] kthread+0xfe/0x10c
[271081.570739] [<ffffffe0000032b6>] ret_from_exception+0x0/0xc
[271081.576414] rcu: Stack dump where RCU GP kthread last ran:
[271081.581990] Task dump for CPU 1:
[271081.585308] task:sh              state:R  running task     stack:    0 pid:823462 ppid:822764 flags:0x00000000
[271081.595429] Call Trace:
[271081.597967] [<ffffffe000616bdc>] __schedule+0x1be/0x4a6

This repeats ad nauseam while everything else appears to be locked up: the serial console is otherwise unresponsive, and network access through ping and ssh simply times out.
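
To help read the numbers in the log, here is a minimal Python sketch that extracts the stall figures from three of the lines above and checks that they agree with each other. The function name `parse_stall` and the regular expressions are illustrative assumptions, not part of any kernel tooling, and the tick-rate estimate is only a rough cross-check because the report does not state the configured `CONFIG_HZ`.

```python
import re

# Lines copied verbatim from the console output above.
LOG = """\
[271072.624733] BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0 stuck for 2530s!
[271081.470431]         (detected by 0, t=254302 jiffies, g=8370033, q=8218)
[271081.476627] rcu: All QSes seen, last rcu_sched kthread activity 254300 (4322046580-4321792280), jiffies_till_next_fqs=1, root ->qsmask 0x0
"""


def parse_stall(log: str) -> dict:
    """Pull the stall figures out of the workqueue and RCU messages.

    Hypothetical helper for illustration; the patterns only cover the
    message formats seen in this particular log.
    """
    stuck_s = int(re.search(r"stuck for (\d+)s!", log).group(1))
    stall_jiffies = int(re.search(r"t=(\d+) jiffies", log).group(1))
    m = re.search(r"kthread activity (\d+) \((\d+)-(\d+)\)", log)
    reported, now, last = (int(g) for g in m.groups())
    return {
        "stuck_s": stuck_s,
        "stall_jiffies": stall_jiffies,
        "reported_gap": reported,
        "computed_gap": now - last,
        # Rough jiffies-per-second estimate. The two messages were logged
        # about 9 s apart, so treat this as an order-of-magnitude check.
        "approx_hz": stall_jiffies / stuck_s,
    }


info = parse_stall(LOG)
print(info)
```

On this log the reported and computed kthread gaps match (254300 jiffies), and the ratio of stall jiffies to stuck seconds comes out close to 100. If the tick rate really is 100 Hz, which is an inference and not something the log states, both the workqueue lockup and the RCU stall had been going on for roughly 42 minutes when these messages were printed.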

Labels: bug
