arc_shrinker_scan() on Linux should be reentrancy-safe #10986
Labels
Status: Stale
No recent activity for issue
Status: Triage Needed
New issue which needs to be triaged
Type: Defect
Incorrect behavior (e.g. crash, hang)
System information
18.04)) #4618.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020
Describe the problem you're observing
The arc_shrinker_scan() function in os/linux/zfs/arc_os.c may be invoked reentrantly, i.e. re-invoked before a previous invocation has completed. This isn't inherently a problem, but I'm not convinced that the downstream code (arc_reduce_target_size() and arc_wait_for_eviction()) expects this or is robust to it.
Describe how to reproduce the problem
Found while looking into ARC collapse; instrumenting arc_shrinker_scan() with an atomic enter/leave counter should reveal the problem.
I'm not sure if this is unique to either direct or indirect reclaim, though either can be a victim.
I'm also not sure whether this is due to arc_shrinker_scan() being invoked from multiple (kswapd?) threads or whether ZFS itself is causing stop-the-world emergency direct calls to its own shrinker.
I suspect it's a mix, but mostly the latter: arc_shrinker_scan() waits on eviction to shrink memory usage, yet the eviction path itself requires further allocations, a troubling combination when trying to respond to critical memory pressure. That is probably a separate bug or design mishap, which I'm happy to file separately if you think it's useful.
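For reference, here is a minimal userspace sketch of the kind of atomic enter/leave counter mentioned above. It is not the instrumentation I actually used and not OpenZFS code: scan_enter(), scan_leave(), fake_scan() and both counters are hypothetical names, and it uses C11 atomics rather than the kernel's atomic_t helpers. The same idea would have to be ported into arc_shrinker_scan() itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical instrumentation; none of these names exist in OpenZFS. */
static atomic_int scan_active;     /* invocations currently inside the function */
static atomic_int scan_max_active; /* highest overlap observed so far */

/* Call on entry. Returns the number of invocations now in flight. */
static int scan_enter(void)
{
	int now = atomic_fetch_add(&scan_active, 1) + 1;
	int prev = atomic_load(&scan_max_active);

	/* Raise the recorded maximum if this entry exceeds it. */
	while (now > prev &&
	    !atomic_compare_exchange_weak(&scan_max_active, &prev, now))
		;
	return now;
}

/* Call on every exit path. */
static void scan_leave(void)
{
	int prev = atomic_fetch_sub(&scan_active, 1);

	assert(prev > 0); /* catches a leave without a matching enter */
}

/* Stand-in for the instrumented function; depth simulates nested re-entry. */
static void fake_scan(int depth)
{
	int now = scan_enter();

	if (now > 1)
		fprintf(stderr, "overlapping invocation: %d in flight\n", now);
	if (depth > 0)
		fake_scan(depth - 1);
	scan_leave();
}
```

Any value above 1 returned by scan_enter() means a second invocation started before the first one finished. Logging the current task alongside it (for example, whether it is kswapd or a thread in direct reclaim) would help tell the two suspected causes apart.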