spl_taskq_expand() scheduling while atomic #12696

behlendorf · 2021-10-26T23:17:13Z

System information

Type	Version/Name
Distribution Name	Ubuntu
Distribution Version
Kernel Version	5.8.0-59-lowlatency #66~20.04.1-Ubuntu
Architecture
OpenZFS Version	2.1

Describe the problem you're observing

Reported in this #12664 (comment). This issue was introduced by the taskq hotplug change, 60a4c7d, which calls kthread_create() from an atomic context. This debug warning is enabled when the kernel is built with CONFIG_DEBUG_ATOMIC_SLEEP.

Oct 25 12:37:48 foxbox kernel: [103815.516588] CPU: 1 PID: 15 Comm: cpuhp/1 Tainted: P           OE     5
.8.0-59-lowlatency #66~20.04.1-Ubuntu
Oct 25 12:37:48 foxbox kernel: [103815.516589] Hardware name: Gigabyte Technology Co., Ltd. Z68MA-D2H-B3/
Z68MA-D2H-B3, BIOS F10 02/23/2012
Oct 25 12:37:48 foxbox kernel: [103815.516590] Call Trace:
Oct 25 12:37:48 foxbox kernel: [103815.516599]  dump_stack+0x74/0x92
Oct 25 12:37:48 foxbox kernel: [103815.516602]  __schedule_bug.cold+0x4c/0x5d
Oct 25 12:37:48 foxbox kernel: [103815.516606]  __schedule+0x789/0xa70
Oct 25 12:37:48 foxbox kernel: [103815.516609]  schedule+0x6e/0xe0
Oct 25 12:37:48 foxbox kernel: [103815.516611]  schedule_timeout+0x132/0x180
Oct 25 12:37:48 foxbox kernel: [103815.516615]  ? ttwu_queue_wakelist+0x9b/0xd0
Oct 25 12:37:48 foxbox kernel: [103815.516618]  wait_for_completion_killable+0x8a/0x140
Oct 25 12:37:48 foxbox kernel: [103815.516622]  __kthread_create_on_node+0xd6/0x1c0
Oct 25 12:37:48 foxbox kernel: [103815.516630]  ? task_done+0xb0/0xb0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516633]  ? string_nocheck+0x4f/0x60
Oct 25 12:37:48 foxbox kernel: [103815.516639]  ? task_done+0xb0/0xb0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516645]  ? taskq_thread_create+0xf0/0xf0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516647]  kthread_create_on_node+0x49/0x60
Oct 25 12:37:48 foxbox kernel: [103815.516652]  ? task_done+0xb0/0xb0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516658]  spl_kthread_create+0x88/0xd0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516665]  taskq_thread_create+0x6a/0xf0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516671]  spl_taskq_expand+0xc0/0xd0 [spl]
Oct 25 12:37:48 foxbox kernel: [103815.516675]  cpuhp_invoke_callback+0x248/0x630
Oct 25 12:37:48 foxbox kernel: [103815.516678]  cpuhp_thread_fun+0xb8/0x120
Oct 25 12:37:48 foxbox kernel: [103815.516681]  smpboot_thread_fn+0xfc/0x1f0
Oct 25 12:37:48 foxbox kernel: [103815.516683]  ? sort_range+0x30/0x30
Oct 25 12:37:48 foxbox kernel: [103815.516685]  kthread+0x129/0x170
Oct 25 12:37:48 foxbox kernel: [103815.516686]  ? kthread_park+0x90/0x90
Oct 25 12:37:48 foxbox kernel: [103815.516690]  ret_from_fork+0x1f/0x30
Oct 25 12:37:48 foxbox kernel: [103815.517031] BUG: scheduling while atomic: cpuhp/1/15/0x00000000

Describe how to reproduce the problem

It should be possible to reproduce this warning on any kernel with CONFIG_DEBUG_ATOMIC_SLEEP enabled when handling a hotplug event. The issue is that spl_taskq_expand() takes the taskq's tq_lock spinlock then calls spl_kthread_create() which may sleep. The thread creation needs to be moved outside the atomic spinlock context to resolve this. Based on the documentation I was able to find sleeping in the hotplug callback should be fine.

The text was updated successfully, but these errors were encountered:

pcd1193182 · 2021-10-26T23:48:05Z

I believe this diff should solve the problem; I'll post a PR once I have time to run the manual tests:

diff --git a/module/os/linux/spl/spl-taskq.c b/module/os/linux/spl/spl-taskq.c
index 61631256c8..7d1a73c738 100644
--- a/module/os/linux/spl/spl-taskq.c
+++ b/module/os/linux/spl/spl-taskq.c
@@ -1298,8 +1298,10 @@ spl_taskq_expand(unsigned int cpu, struct hlist_node *node)
 	ASSERT(tq);
 	spin_lock_irqsave_nested(&tq->tq_lock, flags, tq->tq_lock_class);
 
-	if (!(tq->tq_flags & TASKQ_ACTIVE))
-		goto out;
+	if (!(tq->tq_flags & TASKQ_ACTIVE)) {
+		spin_unlock_irqrestore(&tq->tq_lock, flags);
+		return (err);
+	}
 
 	ASSERT(tq->tq_flags & TASKQ_THREADS_CPU_PCT);
 	int nthreads = MIN(tq->tq_cpu_pct, 100);
@@ -1308,13 +1310,11 @@ spl_taskq_expand(unsigned int cpu, struct hlist_node *node)
 
 	if (!((tq->tq_flags & TASKQ_DYNAMIC) && spl_taskq_thread_dynamic) &&
 	    tq->tq_maxthreads > tq->tq_nthreads) {
-		ASSERT3U(tq->tq_maxthreads, ==, tq->tq_nthreads + 1);
+		spin_unlock_irqrestore(&tq->tq_lock, flags);
 		taskq_thread_t *tqt = taskq_thread_create(tq);
 		if (tqt == NULL)
 			err = -1;
+		return (err);
 	}
-
-out:
 	spin_unlock_irqrestore(&tq->tq_lock, flags);
 	return (err);
 }

adamdmoss · 2021-10-26T23:57:21Z

Wild, thanks for the quick attention! I'll give the patch a try.

pcd1193182 · 2021-10-27T00:00:04Z

No problem. It may be a bit before I have time to set up and run the manual cpu hotplug tests, with the dev summit coming up, but it's on my todo list. If you have a setup where you can test that yourself, that would also work, but if not I'll get to it.

AttilaFueloep · 2021-10-27T00:39:51Z

Is #12642 also an incarnation of this?

adamdmoss · 2021-10-27T00:43:17Z

With a couple of quick suspend/resumes this appears to have fixed the problem - no 'BUG:' whinging in logs and resume is successful.

If you have a setup where you can test that yourself

Afraid not. I think I'm only seeing this at all because (presumably) Linux soft-unplugs and replugs the CPU(s) over suspend-resume.

adamdmoss · 2021-10-27T00:44:01Z

Is #12642 also an incarnation of this?

Sure looks like it to me.

behlendorf · 2021-10-27T19:11:51Z

It sure is, I've closed it as a duplicate of this issue. Thanks for noticing that.

ff1bit · 2021-10-28T00:00:32Z

I compiled the latest zfs (head) with the patch in #12696 (comment) and the logs are now clean (no more scheduling while atomic errors) after resuming from suspend (kernel 5.14.11-arch1-1)
Thank you everyone for the quick fix

sunilvaltix · 2022-01-22T08:03:57Z

Is this marked to land in 2.1.x in near future?

RJVB · 2022-01-27T10:56:49Z

Other inquiring minds like to know this too. I'm seeing what I presume is the same bug after resuming from sleep: I get the backtrace 3 or 4 times in the syslog, curiously always for CPU (core) 3. I have yet to notice any other effects, it looks like the log entry is just a (debug) message about something that is being handled.

I do also see this before each backtrace:

Jan 27 11:48:07 Bola kernel: [259714.830485] Preemption disabled at:
Jan 27 11:48:07 Bola kernel: [259714.830491] [<          (null)>]           (null)
Jan 27 11:48:07 Bola kernel: [259714.830503] CPU: 3 PID: 26 Comm: cpuhp/3 Tainted: P        W  OE   4.14.23-ck1-mainline-core2-rjvb #1

I suppose I would have noticed if preemption remained disabled...?

satmandu · 2022-02-25T15:01:54Z

Enabling zswap seems to make this appear much more quickly.

[ 3403.549232] BUG: scheduling while atomic: cpuhp/7/54/0x00000000
[ 3403.550228] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer nls_utf8 cifs cifs_arc4 cifs_md4 fscache netfs qrtr cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 intel_rapl_msr mei_pxp intel_rapl_common mei_hdcp joydev x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_cirrus snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec applesmc snd_hda_core kvm_intel snd_hwdep snd_pcm kvm btusb wl(POE) snd_seq_midi btrtl snd_seq_midi_event snd_rawmidi btbcm btintel btmtk snd_seq bluetooth input_leds ecdh_generic rapl snd_seq_device ecc intel_cstate snd_timer bcm5974 apple_mfi_fastcharge cfg80211 snd mei_me efi_pstore mei soundcore sbs acpi_als sbshc industrialio_triggered_buffer apple_gmux kfifo_buf industrialio mac_hid apple_bl sch_fq_codel coretemp ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp parport ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE)
[ 3403.614873]  icp(POE) spl(OE) hid_apple hid_generic uas usbhid usb_storage hid nouveau mxm_wmi wmi i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd nvme i2c_i801 drm nvme_core thunderbolt xhci_pci lpc_ich i2c_smbus xhci_pci_renesas video z3fold lz4 lz4_compress
[ 3403.643031] CPU: 7 PID: 54 Comm: cpuhp/7 Tainted: P        W  OE     5.17.0-051700rc5-generic #202202202230
[ 3403.644774] Hardware name: Apple Inc. MacBookPro11,3/Mac-2BD1B31983FE1663, BIOS 432.60.3.0.0 10/27/2021
[ 3403.645714] Call Trace:
[ 3403.646263]  <TASK>
[ 3403.646889]  show_stack+0x52/0x58
[ 3403.649106]  dump_stack_lvl+0x4c/0x63
[ 3403.650946]  dump_stack+0x10/0x12
[ 3403.652576]  __schedule_bug.cold+0x50/0x5e
[ 3403.654744]  __schedule+0x463/0x5a0
[ 3403.657158]  schedule+0x55/0xd0
[ 3403.658982]  schedule_timeout+0x114/0x150
[ 3403.661395]  ? try_to_wake_up+0x210/0x600
[ 3403.663623]  wait_for_completion+0x8b/0xf0
[ 3403.666105]  devtmpfs_submit_req+0x73/0x90
[ 3403.668257]  devtmpfs_create_node+0x94/0xc0
[ 3403.671162]  device_add+0x72f/0x7b0
[ 3403.674197]  device_create_groups_vargs+0xd4/0xf0
[ 3403.676667]  ? msr_devnode+0x30/0x30 [msr]
[ 3403.679056]  device_create+0x49/0x60
[ 3403.681509]  ? _raw_spin_lock_irqsave+0xe/0x10
[ 3403.683717]  ? spl_taskq_expand+0x8c/0xd0 [spl]
[ 3403.692882]  msr_device_create+0x2b/0x40 [msr]
[ 3403.695259]  cpuhp_invoke_callback+0x108/0x4c0
[ 3403.697167]  ? taskq_thread_create+0xf0/0xf0 [spl]
[ 3403.706734]  cpuhp_thread_fun+0xc0/0x1b0
[ 3403.708975]  ? smpboot_register_percpu_thread+0x130/0x130
[ 3403.710969]  smpboot_thread_fn+0xe0/0x1d0
[ 3403.713042]  kthread+0xeb/0x120
[ 3403.714955]  ? kthread_complete_and_exit+0x20/0x20
[ 3403.717390]  ret_from_fork+0x1f/0x30
[ 3403.720924]  </TASK>

RJVB · 2022-02-25T18:21:23Z

I have zswap enabled and can confirm that #12696 is a sufficient fix even in that case.

behlendorf · 2022-02-25T18:41:22Z

Thanks for noticing this wasn't in the 2.1.x patch stack, we'll add it to the list.

behlendorf · 2022-02-25T18:43:06Z

Closing since this was fixed in master.

behlendorf added the Type: Defect label Oct 26, 2021

behlendorf mentioned this issue Oct 26, 2021

zvol: Support blk-mq for better performance #12664

Closed

13 tasks

behlendorf mentioned this issue Oct 27, 2021

spl - kernel error: scheduling while atomic #12642

Closed

pcd1193182 mentioned this issue Nov 2, 2021

Fix cpu hotplug atomic sleep issue #12714

Merged

13 tasks

behlendorf closed this as completed Feb 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

spl_taskq_expand() scheduling while atomic #12696

spl_taskq_expand() scheduling while atomic #12696

behlendorf commented Oct 26, 2021

pcd1193182 commented Oct 26, 2021

adamdmoss commented Oct 26, 2021

pcd1193182 commented Oct 27, 2021 •

edited

Loading

AttilaFueloep commented Oct 27, 2021

adamdmoss commented Oct 27, 2021

adamdmoss commented Oct 27, 2021

behlendorf commented Oct 27, 2021

ff1bit commented Oct 28, 2021

sunilvaltix commented Jan 22, 2022

RJVB commented Jan 27, 2022

satmandu commented Feb 25, 2022

RJVB commented Feb 25, 2022

behlendorf commented Feb 25, 2022

behlendorf commented Feb 25, 2022

spl_taskq_expand() scheduling while atomic #12696

spl_taskq_expand() scheduling while atomic #12696

Comments

behlendorf commented Oct 26, 2021

System information

Describe the problem you're observing

Describe how to reproduce the problem

pcd1193182 commented Oct 26, 2021

adamdmoss commented Oct 26, 2021

pcd1193182 commented Oct 27, 2021 • edited Loading

AttilaFueloep commented Oct 27, 2021

adamdmoss commented Oct 27, 2021

adamdmoss commented Oct 27, 2021

behlendorf commented Oct 27, 2021

ff1bit commented Oct 28, 2021

sunilvaltix commented Jan 22, 2022

RJVB commented Jan 27, 2022

satmandu commented Feb 25, 2022

RJVB commented Feb 25, 2022

behlendorf commented Feb 25, 2022

behlendorf commented Feb 25, 2022

pcd1193182 commented Oct 27, 2021 •

edited

Loading