Fix txg_quiesce thread deadlock
Commit e95853a accidentally introduced a deadlock that can occur
when the system is under memory pressure.  What happens
is that while the txg_quiesce thread is holding the tx->tx_cpu
locks it enters memory reclaim.  In the context of this memory
reclaim it then issues synchronous I/O to a ZVOL swap device.
Because the txg_quiesce thread is holding the tx->tx_cpu locks
a new txg cannot be opened to handle the I/O.  Deadlock.

The fix is straightforward.  Move the memory allocation outside
the critical region where the tx->tx_cpu locks are held, and for
good measure change the offending allocation to KM_PUSHPAGE to
ensure it never attempts to issue I/O during reclaim.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1274
behlendorf committed Apr 26, 2013
1 parent 0c15bf1 commit 57f5a20
Showing 2 changed files with 8 additions and 8 deletions.
2 changes: 1 addition & 1 deletion module/zfs/dsl_pool.c
@@ -143,7 +143,7 @@ dsl_pool_txg_history_add(dsl_pool_t *dp, uint64_t txg)
 {
 	txg_history_t *th, *rm;

-	th = kmem_zalloc(sizeof(txg_history_t), KM_SLEEP);
+	th = kmem_zalloc(sizeof(txg_history_t), KM_PUSHPAGE);
 	mutex_init(&th->th_lock, NULL, MUTEX_DEFAULT, NULL);
 	th->th_kstat.txg = txg;
 	th->th_kstat.state = TXG_STATE_OPEN;
14 changes: 7 additions & 7 deletions module/zfs/txg.c
@@ -366,6 +366,13 @@ txg_quiesce(dsl_pool_t *dp, uint64_t txg)
 	ASSERT(txg == tx->tx_open_txg);
 	tx->tx_open_txg++;

+	/*
+	 * Now that we've incremented tx_open_txg, we can let threads
+	 * enter the next transaction group.
+	 */
+	for (c = 0; c < max_ncpus; c++)
+		mutex_exit(&tx->tx_cpu[c].tc_lock);
+
 	/*
 	 * Measure how long the txg was open and replace the kstat.
 	 */
@@ -375,13 +382,6 @@ txg_quiesce(dsl_pool_t *dp, uint64_t txg)
 	dsl_pool_txg_history_put(th);
 	dsl_pool_txg_history_add(dp, tx->tx_open_txg);

-	/*
-	 * Now that we've incremented tx_open_txg, we can let threads
-	 * enter the next transaction group.
-	 */
-	for (c = 0; c < max_ncpus; c++)
-		mutex_exit(&tx->tx_cpu[c].tc_lock);
-
 	/*
 	 * Quiesce the transaction group by waiting for everyone to txg_exit().
 	 */
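
The reordering in this change can be illustrated with a small user-space sketch: advance the open txg while holding every per-CPU lock, drop the locks, and only then allocate the history record. This is not the ZFS code. It assumes POSIX threads, and the names tx_state_t, history_t, NCPUS, tx_state_init and advance_open_txg are hypothetical stand-ins chosen for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define	NCPUS	4	/* hypothetical stand-in for max_ncpus */

/* Hypothetical, simplified stand-in for a txg history record. */
typedef struct history {
	uint64_t txg;
} history_t;

/* Hypothetical, simplified stand-in for the per-pool tx state. */
typedef struct tx_state {
	pthread_mutex_t cpu_lock[NCPUS];
	uint64_t open_txg;
} tx_state_t;

static void
tx_state_init(tx_state_t *tx, uint64_t first_txg)
{
	assert(tx != NULL);
	for (int c = 0; c < NCPUS; c++)
		pthread_mutex_init(&tx->cpu_lock[c], NULL);
	tx->open_txg = first_txg;
}

/*
 * Advance the open txg and return a freshly allocated history record
 * for it, or NULL if the allocation fails.  The allocation happens
 * only after every per-CPU lock has been released, so an allocator
 * that blocks cannot stall threads waiting to enter the new txg.
 */
static history_t *
advance_open_txg(tx_state_t *tx)
{
	uint64_t next;
	history_t *h;

	for (int c = 0; c < NCPUS; c++)
		pthread_mutex_lock(&tx->cpu_lock[c]);

	next = ++tx->open_txg;

	/* Let other threads enter the next txg before allocating. */
	for (int c = 0; c < NCPUS; c++)
		pthread_mutex_unlock(&tx->cpu_lock[c]);

	h = calloc(1, sizeof (*h));
	if (h != NULL)
		h->txg = next;
	return (h);
}
```

A caller would run tx_state_init() once and then call advance_open_txg() for each transition, freeing the returned record when done. User-space calloc() has no equivalent of KM_PUSHPAGE, so the sketch covers only the lock ordering, not the allocation flag change.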
