
Commit e178482

Tvrtko Ursulin authored and mehmetb0 committed
workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
BugLink: https://bugs.launchpad.net/bugs/2106770

[ Upstream commit de35994 ]

After commit 746ae46 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") amdgpu started seeing the following warning:

 [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
 ...
 [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
 ...
 [ ] Call Trace:
 [ ]  <TASK>
 ...
 [ ]  ? check_flush_dependency+0xf5/0x110
 ...
 [ ]  cancel_delayed_work_sync+0x6e/0x80
 [ ]  amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
 [ ]  amdgpu_ring_alloc+0x40/0x50 [amdgpu]
 [ ]  amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
 [ ]  ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
 [ ]  amdgpu_job_run+0xaa/0x1f0 [amdgpu]
 [ ]  drm_sched_run_job_work+0x257/0x430 [gpu_sched]
 [ ]  process_one_work+0x217/0x720
 ...
 [ ]  </TASK>

The intent of the verification done in check_flush_dependency() is to ensure forward progress during memory reclaim, by flagging cases when either a memory reclaim process, or a memory reclaim work item is flushed from a context not marked as memory reclaim safe.

This is correct when flushing, but when called from the cancel(_delayed)_work_sync() paths it is a false positive because the work is either already running, or will not be running at all. Therefore cancelling it is safe and we can relax the warning criteria by letting the helper know of the calling context.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: fca839c ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue")
References: 746ae46 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
CVE-2024-57888
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
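
The relaxed warning criteria described in the commit message can be modelled in userspace. This is a minimal sketch, not kernel code: `struct flush_ctx`, `warns_before()` and `warns_after()` are hypothetical names introduced for illustration, and the check is simplified to the two conditions the message names (a memory reclaim process, or a worker on a WQ_MEM_RECLAIM workqueue, waiting on a !WQ_MEM_RECLAIM target).

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct flush_ctx {
	bool target_wq_mem_reclaim;	/* target workqueue has WQ_MEM_RECLAIM */
	bool current_pf_memalloc;	/* caller is a memory reclaim process */
	bool current_wq_mem_reclaim;	/* caller runs on a WQ_MEM_RECLAIM worker */
	bool from_cancel;		/* reached via cancel(_delayed)_work_sync() */
};

/* Simplified pre-patch criteria: the calling path is not considered. */
bool warns_before(const struct flush_ctx *c)
{
	if (c->target_wq_mem_reclaim)
		return false;
	return c->current_pf_memalloc || c->current_wq_mem_reclaim;
}

/* Simplified post-patch criteria: the cancel path never warns. */
bool warns_after(const struct flush_ctx *c)
{
	if (c->from_cancel)
		return false;
	return warns_before(c);
}
```

With the amdgpu case from the trace (a WQ_MEM_RECLAIM worker cancelling delayed work queued on a !WQ_MEM_RECLAIM workqueue), `warns_before()` returns true and `warns_after()` returns false, while a real flush from the same context still warns under both.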
1 parent 717195c commit e178482

1 file changed: +13 -9 lines changed


kernel/workqueue.c

Lines changed: 13 additions & 9 deletions
@@ -2957,23 +2957,27 @@ static int rescuer_thread(void *__rescuer)
  * check_flush_dependency - check for flush dependency sanity
  * @target_wq: workqueue being flushed
  * @target_work: work item being flushed (NULL for workqueue flushes)
+ * @from_cancel: are we called from the work cancel path
  *
  * %current is trying to flush the whole @target_wq or @target_work on it.
- * If @target_wq doesn't have %WQ_MEM_RECLAIM, verify that %current is not
- * reclaiming memory or running on a workqueue which doesn't have
- * %WQ_MEM_RECLAIM as that can break forward-progress guarantee leading to
- * a deadlock.
+ * If this is not the cancel path (which implies work being flushed is either
+ * already running, or will not be at all), check if @target_wq doesn't have
+ * %WQ_MEM_RECLAIM and verify that %current is not reclaiming memory or running
+ * on a workqueue which doesn't have %WQ_MEM_RECLAIM as that can break forward-
+ * progress guarantee leading to a deadlock.
  */
 static void check_flush_dependency(struct workqueue_struct *target_wq,
-				   struct work_struct *target_work)
+				   struct work_struct *target_work,
+				   bool from_cancel)
 {
-	work_func_t target_func = target_work ? target_work->func : NULL;
+	work_func_t target_func;
 	struct worker *worker;
 
-	if (target_wq->flags & WQ_MEM_RECLAIM)
+	if (from_cancel || target_wq->flags & WQ_MEM_RECLAIM)
 		return;
 
 	worker = current_wq_worker();
+	target_func = target_work ? target_work->func : NULL;
 
 	WARN_ONCE(current->flags & PF_MEMALLOC,
 		  "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%ps",
@@ -3218,7 +3222,7 @@ void __flush_workqueue(struct workqueue_struct *wq)
 		list_add_tail(&this_flusher.list, &wq->flusher_overflow);
 	}
 
-	check_flush_dependency(wq, NULL);
+	check_flush_dependency(wq, NULL, false);
 
 	mutex_unlock(&wq->mutex);
 
@@ -3395,7 +3399,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 	}
 
 	wq = pwq->wq;
-	check_flush_dependency(wq, work);
+	check_flush_dependency(wq, work, from_cancel);
 
 	insert_wq_barrier(pwq, barr, work, worker);
 	raw_spin_unlock_irq(&pool->lock);
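
The two call-site changes in the diff differ in one respect: the whole-workqueue flush hard-codes `false`, while the single-work flush forwards the `from_cancel` value it was given. A rough userspace sketch of that propagation follows. All `model_*` names are hypothetical stand-ins rather than kernel symbols, and `caller_in_reclaim` is an assumed simplification that collapses the checks on `%current` into one flag.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's types or functions. */
struct model_wq {
	bool mem_reclaim;	/* models WQ_MEM_RECLAIM in wq->flags */
};

int model_warnings;	/* number of times the model would have warned */

/* Models the patched helper: cancel path or reclaim-safe target returns early. */
void model_check_flush_dependency(const struct model_wq *target,
				  bool caller_in_reclaim, bool from_cancel)
{
	if (from_cancel || target->mem_reclaim)
		return;
	if (caller_in_reclaim)
		model_warnings++;
}

/* Whole-workqueue flush: never a cancel, so the flag is hard-coded to false. */
void model_flush_workqueue(const struct model_wq *wq, bool caller_in_reclaim)
{
	model_check_flush_dependency(wq, caller_in_reclaim, false);
}

/* Single-work flush: forwards what the caller said about its origin. */
void model_start_flush_work(const struct model_wq *wq, bool caller_in_reclaim,
			    bool from_cancel)
{
	model_check_flush_dependency(wq, caller_in_reclaim, from_cancel);
}
```

In this model, only `model_start_flush_work()` called with `from_cancel` set to true stays silent for a reclaim-context caller and a target without `mem_reclaim`. Both flush entry points keep warning in that situation, which matches the commit's claim that the check remains correct for real flushes.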
