Commit e219cca

jbd: don't give up looking for space so easily in __log_wait_for_space

Commit be07c4e introduced a regression because it assumed that if there were no transactions ready to be checkpointed, that no progress could be made on making space available in the journal, and so the journal should be aborted. This assumption is false; it could be the case that simply calling cleanup_journal_tail() will recover the necessary space, or, for small journals, the currently committing transaction could be responsible for chewing up the required space in the log, so we need to wait for the currently committing transaction to finish before trying to force a checkpoint operation.

This patch fixes the bug reported by Meelis Roos at:
http://bugzilla.kernel.org/show_bug.cgi?id=11937

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Duane Griffin <duaneg@dghda.com>
Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>

1 parent 45beca0 commit e219cca
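
The fallback order described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct journal_model`, `enum space_action`, and `choose_space_action()` are hypothetical names that do not exist in the kernel, and the boolean `tail_cleanup_succeeds` stands in for `cleanup_journal_tail()` returning 0. The real function also drops and retakes `j_state_lock` and `j_list_lock` around these steps, which the model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the recovery steps; NOT kernel API. */
enum space_action {
	ACT_NONE,         /* enough journal space already */
	ACT_CHECKPOINT,   /* stands in for log_do_checkpoint() */
	ACT_CLEANUP_TAIL, /* stands in for cleanup_journal_tail() == 0 */
	ACT_WAIT_COMMIT,  /* stands in for log_wait_commit() */
	ACT_ABORT         /* stands in for journal_abort() */
};

/* Hypothetical snapshot of the journal state the patch looks at. */
struct journal_model {
	int space_left;                   /* models __log_space_left() */
	int space_needed;                 /* models jbd_space_needed() */
	bool has_checkpoint_transactions; /* j_checkpoint_transactions != NULL */
	bool tail_cleanup_succeeds;       /* assumed outcome of tail cleanup */
	unsigned int committing_tid;      /* 0 means no committing transaction */
};

/*
 * Pick the first step that can make progress, in the order the patch
 * tries them: checkpoint, tail cleanup, wait for the commit, abort.
 */
enum space_action choose_space_action(const struct journal_model *j)
{
	assert(j != NULL);

	if (j->space_left >= j->space_needed)
		return ACT_NONE;
	if (j->has_checkpoint_transactions)
		return ACT_CHECKPOINT;
	if (j->tail_cleanup_succeeds)
		return ACT_CLEANUP_TAIL;
	if (j->committing_tid != 0)
		return ACT_WAIT_COMMIT;

	/* Nothing left to try: the patch warns and aborts the journal. */
	return ACT_ABORT;
}
```

Before this patch, the model would have had only the `ACT_CHECKPOINT` and `ACT_ABORT` outcomes, so a small journal with nothing to checkpoint but a commit still in flight went straight to abort.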


fs/jbd/checkpoint.c

Lines changed: 24 additions & 7 deletions
@@ -115,7 +115,7 @@ static int __try_to_free_cp_buf(struct journal_head *jh)
  */
 void __log_wait_for_space(journal_t *journal)
 {
-	int nblocks;
+	int nblocks, space_left;
 	assert_spin_locked(&journal->j_state_lock);
 
 	nblocks = jbd_space_needed(journal);
@@ -128,25 +128,42 @@ void __log_wait_for_space(journal_t *journal)
 		/*
 		 * Test again, another process may have checkpointed while we
 		 * were waiting for the checkpoint lock. If there are no
-		 * outstanding transactions there is nothing to checkpoint and
-		 * we can't make progress. Abort the journal in this case.
+		 * transactions ready to be checkpointed, try to recover
+		 * journal space by calling cleanup_journal_tail(), and if
+		 * that doesn't work, by waiting for the currently committing
+		 * transaction to complete. If there is absolutely no way
+		 * to make progress, this is either a BUG or corrupted
+		 * filesystem, so abort the journal and leave a stack
+		 * trace for forensic evidence.
 		 */
 		spin_lock(&journal->j_state_lock);
 		spin_lock(&journal->j_list_lock);
 		nblocks = jbd_space_needed(journal);
-		if (__log_space_left(journal) < nblocks) {
+		space_left = __log_space_left(journal);
+		if (space_left < nblocks) {
 			int chkpt = journal->j_checkpoint_transactions != NULL;
+			tid_t tid = 0;
 
+			if (journal->j_committing_transaction)
+				tid = journal->j_committing_transaction->t_tid;
 			spin_unlock(&journal->j_list_lock);
 			spin_unlock(&journal->j_state_lock);
 			if (chkpt) {
 				log_do_checkpoint(journal);
+			} else if (cleanup_journal_tail(journal) == 0) {
+				/* We were able to recover space; yay! */
+				;
+			} else if (tid) {
+				log_wait_commit(journal, tid);
 			} else {
-				printk(KERN_ERR "%s: no transactions\n",
-					__func__);
+				printk(KERN_ERR "%s: needed %d blocks and "
+				       "only had %d space available\n",
+				       __func__, nblocks, space_left);
+				printk(KERN_ERR "%s: no way to get more "
+				       "journal space\n", __func__);
+				WARN_ON(1);
 				journal_abort(journal, 0);
 			}
-
 			spin_lock(&journal->j_state_lock);
 		} else {
 			spin_unlock(&journal->j_list_lock);
