btrfs: fix the ordered extent hang after certain delalloc range failure
[BUG]
Even with the recent double ordered extent freeing bugs fixed, there is
still a low chance of leaving hung ordered extents that will never be
able to finish.

This is especially problematic if the block size (4K) is smaller than
page size (64K, aarch64), and can cause the following hang:

 BTRFS info (device dm-3): checking UUID tree
 BTRFS error (device dm-3): cow_file_range failed, root=446 inode=274 start=1523712 len=49152: -28
 BTRFS error (device dm-3): run_delalloc_nocow failed, root=446 inode=274 start=1523712 len=49152: -28
 BTRFS error (device dm-3): failed to run delalloc range, root=446 ino=274 folio=1507328 submit_bitmap=4-15 start=1523712 len=49152: -28
 BTRFS error (device dm-3): cow_file_range failed, root=3274 inode=283 start=454656 len=4096: -28
 BTRFS error (device dm-3): run_delalloc_nocow failed, root=3274 inode=283 start=454656 len=4096: -28
 BTRFS error (device dm-3): failed to run delalloc range, root=3274 ino=283 folio=393216 submit_bitmap=15 start=454656 len=4096: -28
 INFO: task kworker/u17:5:2948764 blocked for more than 122 seconds.
       Not tainted 6.13.0-rc1-custom+ #89
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:kworker/u17:5   state:D stack:0     pid:2948764 tgid:2948764 ppid:2      flags:0x00000008
 Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
 Call trace:
  __switch_to+0xf4/0x168 (T)
  __schedule+0x32c/0x7a8
  schedule+0x54/0x140
  schedule_timeout+0xe8/0x140
  __wait_for_common+0xf4/0x228
  wait_for_completion+0x28/0x40
  btrfs_wait_ordered_extents+0x2d0/0x400 [btrfs]
  btrfs_wait_ordered_roots+0x148/0x238 [btrfs]
  shrink_delalloc+0x164/0x280 [btrfs]
  flush_space+0x288/0x328 [btrfs]
  btrfs_async_reclaim_metadata_space+0xc8/0x240 [btrfs]
  process_one_work+0x228/0x680
  worker_thread+0x1bc/0x360
  kthread+0x100/0x118
  ret_from_fork+0x10/0x20
 INFO: task fsstress:3706703 blocked for more than 122 seconds.

[CAUSE]
The above dmesg shows a case where run_delalloc_nocow() failed for a
range inside the folio, not covering the whole folio.

When run_delalloc_nocow() failed, we call
btrfs_cleanup_ordered_extents() inside btrfs_run_delalloc_range() to
finish involved ordered extents.

But if btrfs_cleanup_ordered_extents() is passed a @locked_folio, and
the failed range is inside that folio, btrfs_cleanup_ordered_extents()
will not call btrfs_mark_ordered_io_finished() for that range,
preventing the involved ordered extents from ever finishing.

[FIX]
Just do not pass @locked_folio into btrfs_cleanup_ordered_extents().

The @locked_folio parameter is only used to skip calling
btrfs_mark_ordered_io_finished() on the locked folio, which should never
be skipped here at all.

Furthermore, the @locked_folio parameter of
btrfs_mark_ordered_io_finished() only affects clearing the ordered flag
of folios, to cooperate with the out-of-band dirty folio detection.

In our error handling case, the folios will already have their ordered
flags cleared inside btrfs_cleanup_ordered_extents(), so there is no
need to pass a @locked_folio into it.

Furthermore, this is a long-standing bug that needs a backport, so this
patch is the minimal fix.
A later patch will remove the @locked_folio parameter completely.

Fixes: d1051d6 ("btrfs: Fix error handling in btrfs_cleanup_ordered_extents")
Signed-off-by: Qu Wenruo <wqu@suse.com>
adam900710 committed Dec 11, 2024
1 parent 49ffbf1 commit 9b74181
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2297,7 +2297,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct folio *locked_fol
 
 out:
 	if (ret < 0)
-		btrfs_cleanup_ordered_extents(inode, locked_folio, start,
+		btrfs_cleanup_ordered_extents(inode, NULL, start,
 					      end - start + 1);
 	return ret;
 }
