spl_taskq_expand() scheduling while atomic #12696
Comments
I believe this diff should solve the problem; I'll post a PR once I have time to run the manual tests:
|
Wild, thanks for the quick attention! I'll give the patch a try. |
No problem. It may be a bit before I have time to set up and run the manual cpu hotplug tests, with the dev summit coming up, but it's on my todo list. If you have a setup where you can test that yourself, that would also work, but if not I'll get to it. |
Is #12642 also an incarnation of this? |
With a couple of quick suspend/resumes this appears to have fixed the problem - no 'BUG:' whinging in logs and resume is successful.
Afraid not. I think I'm only seeing this at all because (presumably) Linux soft-unplugs and replugs the CPU(s) over suspend-resume. |
Sure looks like it to me. |
It sure is, I've closed it as a duplicate of this issue. Thanks for noticing that. |
I compiled the latest zfs (head) with the patch in #12696 (comment) and the logs are now clean (no more scheduling while atomic errors) after resuming from suspend (kernel 5.14.11-arch1-1) |
Is this marked to land in 2.1.x in the near future? |
Other inquiring minds like to know this too. I'm seeing what I presume is the same bug after resuming from sleep: I get the backtrace 3 or 4 times in the syslog, curiously always for CPU (core) 3. I have yet to notice any other effects, it looks like the log entry is just a (debug) message about something that is being handled. I do also see this before each backtrace:
I suppose I would have noticed if preemption remained disabled...? |
Enabling zswap seems to make this appear much more quickly.
|
I have zswap enabled and can confirm that #12696 is a sufficient fix even in that case. |
Thanks for noticing this wasn't in the 2.1.x patch stack, we'll add it to the list. |
Closing since this was fixed in master. |
System information
Describe the problem you're observing
Reported in this #12664 (comment). This issue was introduced by the taskq hotplug change, 60a4c7d, which calls kthread_create() from an atomic context. This debug warning is enabled when the kernel is built with CONFIG_DEBUG_ATOMIC_SLEEP.
Describe how to reproduce the problem
It should be possible to reproduce this warning on any kernel with CONFIG_DEBUG_ATOMIC_SLEEP enabled when handling a hotplug event. The issue is that spl_taskq_expand() takes the taskq's tq_lock spinlock then calls spl_kthread_create(), which may sleep. The thread creation needs to be moved outside the atomic spinlock context to resolve this. Based on the documentation I was able to find, sleeping in the hotplug callback should be fine.
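To illustrate the shape of the fix described above, here is a minimal userspace sketch of "reserve under the lock, create the thread with the lock dropped". It is an analogy only and not the actual OpenZFS patch: `struct pool`, `pool_init()`, `pool_expand()`, `pool_join()` and `worker_main()` are hypothetical names, a pthread mutex stands in for the `tq_lock` spinlock, and `pthread_create()` stands in for `spl_kthread_create()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define POOL_MAX 8

/* Hypothetical stand-in for a taskq; field names are illustrative only. */
struct pool {
	pthread_mutex_t lock;        /* plays the role of tq_lock */
	int nthreads;                /* workers running or being created */
	int nstored;                 /* handles recorded in workers[] */
	int maxthreads;              /* upper bound on nthreads */
	pthread_t workers[POOL_MAX];
};

static void *worker_main(void *arg)
{
	(void)arg;                   /* a real worker would service a queue */
	return NULL;
}

static void pool_init(struct pool *p, int maxthreads)
{
	pthread_mutex_init(&p->lock, NULL);
	p->nthreads = 0;
	p->nstored = 0;
	p->maxthreads = maxthreads > POOL_MAX ? POOL_MAX : maxthreads;
}

/*
 * Grow the pool by one worker. Only the bookkeeping happens under the lock;
 * the potentially blocking thread creation runs with the lock dropped.
 */
static bool pool_expand(struct pool *p)
{
	pthread_t tid;
	int rc;

	pthread_mutex_lock(&p->lock);
	if (p->nthreads >= p->maxthreads) {
		pthread_mutex_unlock(&p->lock);
		return false;
	}
	p->nthreads++;               /* reserve, so concurrent callers cannot overshoot */
	pthread_mutex_unlock(&p->lock);

	rc = pthread_create(&tid, NULL, worker_main, NULL);  /* lock not held here */

	pthread_mutex_lock(&p->lock);
	if (rc == 0) {
		assert(p->nstored < POOL_MAX);
		p->workers[p->nstored++] = tid;
	} else {
		p->nthreads--;           /* creation failed: release the reservation */
	}
	pthread_mutex_unlock(&p->lock);

	return rc == 0;
}

/* Wait for every recorded worker; call once no expansion is in flight. */
static void pool_join(struct pool *p)
{
	for (int i = 0; i < p->nstored; i++)
		pthread_join(p->workers[i], NULL);
}
```

Reserving the count before dropping the lock keeps the thread limit intact even when several callers expand at once, and the rollback on failure keeps the count honest. In the kernel, the same reordering matters more, since sleeping while holding a spinlock is what triggers the "scheduling while atomic" warning.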