-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FSX mmap failure #2976
Comments
This is again xfstests 127 but in this case a deadlock was observed. It appears that we basically have a rare (but possible) lock inversion here. between the zfs range lock and individual page writeback bits. zpl_writepages zpl_fallocate
|
@behlendorf As to the first failure, I ran into something weird when trying to get a log of fsx's operations when using the 191110531 seed value: In the log dump above, operation 2962 is a write from 0x165fa through 0x1ef39. When I run fsx by hand and collect the log of all operations, that particular write is operation 1962. I'm a bit mystified as to why there's a difference. In the second sub-issue, nice catch on the lock inversion. I never encountered it in my testing, obviously. Isn't it |
I'm not quite sure what to make of that either. But I've certainly seen this for other seed values. On the second issue your right it is But you're absolutely right, I don't see why we couldn't do something much more modest and just take the range lock a few lines earlier in diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
index d05ccef..8834b05 100644
--- a/module/zfs/zfs_vnops.c
+++ b/module/zfs/zfs_vnops.c
@@ -3899,10 +3899,11 @@ zfs_putpage(struct inode *ip, struct page *pp, struct writeback_control *wbc)
}
#endif
+ rl = zfs_range_lock(zp, pgoff, pglen, RL_WRITER);
+
set_page_writeback(pp);
unlock_page(pp);
- rl = zfs_range_lock(zp, pgoff, pglen, RL_WRITER);
tx = dmu_tx_create(zsb->z_os);
dmu_tx_hold_write(tx, zp->z_id, pgoff, pglen); |
@behlendorf Yes, that was exactly what I had in mind. As to the other approach, I was concerned that covering the entire |
There exists a lock inversions involving the zfs range lock and the individual page writeback bits which can result in a deadlock. To prevent this we must always manipulate the writeback bit while holding the range lock. The exact deadlock is as follows: ------ Process A ------ ------ Process B ------ zpl_writepages zpl_fallocate write_cache_pages zpl_fallocate_common zpl_putpage zfs_space zfs_putpage (set bit) zfs_freesp zfs_range_lock (wait on lock) zfs_free_range (take lock) truncate_inode_pages_range wait_on_page_writeback (wait on bit) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#2976
I've opened #2980 with the proposed patch above so it can roll through testing and we don't loose track of it. The original issue I'm still not sure about but here's another example from the buildbot which seems to be hitting the issue more frequently now. |
There exists a lock inversions involving the zfs range lock and the individual page writeback bits which can result in a deadlock. To prevent this we must always manipulate the writeback bit while holding the range lock. The exact deadlock is as follows: ------ Process A ------ ------ Process B ------ zpl_writepages zpl_fallocate write_cache_pages zpl_fallocate_common zpl_putpage zfs_space zfs_putpage (set bit) zfs_freesp zfs_range_lock (wait on lock) zfs_free_range (take lock) [has not yet initiated I/O, truncate_inode_pages_range the bit will not be cleared] wait_on_page_writeback (wait on bit) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Issue openzfs#2976
This is a follow up commit to 74328ee which correctly resolved a lock inversion between zfs_putpage() and zfs_free_range(). Unfortunately, in the process it accidentally introduced another inversion between zfs_putpage() and zfs_read(). The page must be unlocked before taking the range lock. This patch corrects that issue. In addition, because the locking rules here are subtle a block comment has been added clearly explaining why the ordering here is critical. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#2976
This is a follow up commit to 74328ee which correctly resolved a lock inversion between zfs_putpage() and zfs_free_range(). Unfortunately, in the process it accidentally introduced another inversion between zfs_putpage() and zfs_read(). The page must be unlocked before taking the range lock. This patch corrects that issue. In addition, because the locking rules here are subtle a block comment has been added clearly explaining why the ordering here is critical. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#2976
This is a follow up commit to 74328ee which correctly resolved a lock inversion between zfs_putpage() and zfs_free_range(). Unfortunately, in the process it accidentally introduced another inversion between zfs_putpage() and zfs_read(). The page must be unlocked before taking the range lock. This patch corrects that issue. In addition, because the locking rules here are subtle a block comment has been added clearly explaining why the ordering here is critical. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ned Bass <bass6@llnl.gov> Issue openzfs#2976
While the lock inversion has been resolved. The original issue here still needs to be investigated. |
@behlendorf FYI: |
Commit 21a96fb which is now in master should address this issue. Unfortunately, I've never had much luck being able to reproduce this so we'll need to take a wait and see approach. I'm closing the issue but we'll want to keep an eye on the automated testing to see if it ever reoccurs. |
@dweeezil the following rare failure was accidentally introduced by commit 223df01 and still exists as of zfs-0.6.3-159-gc944be5. It causes an occasional xfstests failure during the regression tests. I'm filing this issue so we can track it but I have not yet seriously investigated the issue.
The text was updated successfully, but these errors were encountered: