Skip to content

Commit

Permalink
Don't assert mg_initialized due to device addition race
Browse files Browse the repository at this point in the history
During device removal stress tests, we noticed that we were tripping 
the assertion that mg_initialized was true. After investigation, it was 
determined that the mg in question was the embedded log metaslab 
group for a newly added vdev; the normal mg had been initialized (by 
metaslab_sync_reassess, via vdev_sync_done). However, because the spa 
config alloc lock is not held as writer across both calls to 
metaslab_sync_reassess, it is possible for an allocation to happen 
between the two metaslab_groups being initialized. Because the metaslab 
code doesn't check the group in question, just the vdev's main mg, it 
is possible to get past the initial check in vdev_allocatable and 
later fail due to the assertion.

We simply remove the assertions. We could also consider locking the 
ALLOC lock around the reassess calls in vdev_sync_done, but that risks 
deadlocks. We could check the actual target mg in vdev_allocatable, 
but that risks racing with a passivation that comes in after that 
check but before the assertion. We still won't be able to actually 
allocate from the metaslab group if no metaslabs are ready, so this 
change shouldn't break anything.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes openzfs#15818
  • Loading branch information
pcd1193182 authored and lundman committed Mar 13, 2024
1 parent cab3d6c commit 370259b
Showing 1 changed file with 0 additions and 3 deletions.
3 changes: 0 additions & 3 deletions module/zfs/metaslab.c
Original file line number Diff line number Diff line change
Expand Up @@ -5105,7 +5105,6 @@ metaslab_group_alloc(metaslab_group_t *mg, zio_alloc_list_t *zal,
int allocator, boolean_t try_hard)
{
uint64_t offset;
ASSERT(mg->mg_initialized);

offset = metaslab_group_alloc_normal(mg, zal, asize, txg, want_unique,
dva, d, allocator, try_hard);
Expand Down Expand Up @@ -5256,8 +5255,6 @@ metaslab_alloc_dva(spa_t *spa, metaslab_class_t *mc, uint64_t psize,
goto next;
}

ASSERT(mg->mg_initialized);

/*
* Avoid writing single-copy data to an unhealthy,
* non-redundant vdev, unless we've already tried all
Expand Down

0 comments on commit 370259b

Please sign in to comment.