Skip to content

Commit

Permalink
Fix mmap(2)/write(2)/read(2) deadlock
Browse files Browse the repository at this point in the history
When modifing overlapping regions of a file using mmap(2) and
write(2)/read(2) it is possible to deadlock due to a lock inversion.
The zfs_write() and zfs_read() hooks first take the zfs range lock
and then lock the individual pages.  Conversely, when using mmap'ed
I/O the zpl_writepage() hook is called with the individual page
locks already taken and then zfs_putpage() takes the zfs range lock.

The most straight forward fix is to simply not take the zfs range
lock in the mmap(2) case.  The individual pages will still be locked
thus serializing access.  Updating the same region of a file with
write(2) and mmap(2) has always been a dodgy thing to do.  This change
at a minimum ensures we don't deadlock and is consistent with the
existing Linux semantics enforced by the VFS.

This isn't an issue under Solaris because the only range locking
performed will be with the zfs range locks.  It's up to each filesystem
to perform its own file locking.  Under Linux the VFS provides many
of these services.

It may be possible/desirable at a latter date to entirely dump the
existing zfs range locking and rely on the Linux VFS page locks.
However, for now its safest to perform both layers of locking until
zfs is more tightly integrated with the page cache.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#302
  • Loading branch information
behlendorf committed Jul 19, 2011
1 parent 7f9d994 commit a140dc5
Showing 1 changed file with 0 additions and 5 deletions.
5 changes: 0 additions & 5 deletions module/zfs/zfs_vnops.c
Original file line number Diff line number Diff line change
Expand Up @@ -3838,7 +3838,6 @@ zfs_putpage(struct page *page, struct writeback_control *wbc, void *data)
struct inode *ip = mapping->host;
znode_t *zp = ITOZ(ip);
zfs_sb_t *zsb = ITOZSB(ip);
rl_t *rl;
u_offset_t io_off;
size_t io_len;
size_t len;
Expand All @@ -3850,19 +3849,15 @@ zfs_putpage(struct page *page, struct writeback_control *wbc, void *data)
ZFS_ENTER(zsb);
ZFS_VERIFY_ZP(zp);

rl = zfs_range_lock(zp, io_off, io_len, RL_WRITER);

if (io_off > zp->z_size) {
/* past end of file */
zfs_range_unlock(rl);
ZFS_EXIT(zsb);
return (0);
}

len = MIN(io_len, P2ROUNDUP(zp->z_size, PAGESIZE) - io_off);

error = zfs_putapage(ip, page, io_off, len);
zfs_range_unlock(rl);

if (zsb->z_os->os_sync == ZFS_SYNC_ALWAYS)
zil_commit(zsb->z_log, zp->z_id);
Expand Down

0 comments on commit a140dc5

Please sign in to comment.