Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbuf_free_range() overzealously frees dbufs #3223

Closed
wants to merge 1 commit into from

Conversation

nedbass
Copy link
Contributor

@nedbass nedbass commented Mar 25, 2015

When called to free a spill block from a dnode, dbuf_free_range() has a
bug that results in all dbufs for the dnode getting freed. A variety of
problems may result from this bug, but a common one was a zap lookup
tripping an ASSERT because the zap buffers had been zeroed out. This
could happen on a dataset with xattr=sa set when extended attributes are
written and removed on a directory concurrently with I/O to files in
that directory.

Signed-off-by: Ned Bass bass6@llnl.gov
Fixes #3195
Fixes #3222

@nedbass
Copy link
Contributor Author

nedbass commented Mar 25, 2015

@dweeezil please review if you have a chance. Thanks

@dweeezil
Copy link
Contributor

@nedbass I'm reviewing the original issue (#2884) and am now trying to convince myself that I shouldn't have used "&&" rather than "||" to set freespill (or that it even matters). That said, your fix certainly looks correct. I think we should also change:

if ((db->db_blkid < start || db->db_blkid > end) && !freespill)

to

  if (!freespill && (db->db_blkid < start || db->db_blkid > end))

to make it clearer that we don't want to do the magnitude comparison in the freespill case. It's also more symmetric with your fix.

My original version of the #2884 fix didn't use a boolean but it was added at @behlendorf's suggestion to clarify the code. The flag check was stuck on the right-hand side because that's where the older logic was.

@behlendorf
Copy link
Contributor

@nedbass along with @dweeezil's suggested improvements can you add a one line comment above those two conditionals briefly explaining them. Maybe something like this:

                /* Skip indirect blocks */
                if (db->db_level != 0)
                        continue;
                /* Skip spill blocks and direct blocks outside the range */
                if (!freespill && (db->db_blkid < start || db->db_blkid > end))
                        continue;
                /* Skip all direct blocks, only free spill blocks */
                if (freespill && (db->db_blkid != DMU_SPILL_BLKID))
                        continue;

When called to free a spill block from a dnode, dbuf_free_range() has a
bug that results in all dbufs for the dnode getting freed.  A variety of
problems may result from this bug, but a common one was a zap lookup
tripping an ASSERT because the zap buffers had been zeroed out.  This
could happen on a dataset with xattr=sa set when extended attributes are
written and removed on a directory concurrently with I/O to files in
that directory.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Fixes openzfs#3195
Fixes openzfs#3204
Fixes openzfs#3222
@nedbass
Copy link
Contributor Author

nedbass commented Mar 25, 2015

Updated with suggested changes. Also linked to #3204 as I think it may explain that as well.

@nedbass
Copy link
Contributor Author

nedbass commented Mar 25, 2015

Regarding #3204, tracepoints data does point to a link with this bug. PID 21287 calls dbuf_free_range() on the spill block and undirties block 0. This seems to botch up the size accounting and leads to the mismatch between lsize and datablksz that trips the ASSERT for PID 19891. Though I haven't quite wrapped my head around the exact sequence of events.

           <...>-21287 [001] 18743272.331737: zfs_zfs__dprintf: dbuf.c:878:dbuf_free_range(): ds=tank/fish obj=45 start=18446744073709551614 end=18446744073709551614
           <...>-21287 [001] 18743272.331740: zfs_zfs__dprintf: dbuf.c:1382:dbuf_undirty(): ds=tank/fish obj=45 lvl=0 blkid=0 size=1f400
           <...>-19891 [001] 18743273.748444: zfs_zfs__dprintf: dnode_sync.c:612:dnode_sync(): ds=tank/fish obj=45 is_hole=0 is_imbedded=0 lsize=107520 datablksz=107520
           <...>-19891 [001] 18743273.748546: zfs_zfs__dprintf: dnode.c:1680:dnode_diduse_space(): ds=tank/fish obj=45 dn=ffff88007c878050 dnp=ffff880041061a00 used=112640 delta=-5120
           <...>-19891 [001] 18743273.748550: zfs_zfs__dprintf: dbuf.c:2506:dbuf_sync_leaf(): ds=tank/fish obj=45 lvl=0 blkid=-1 blkptr=(null) <NULL>
           <...>-21406 [000] 18743282.606971: zfs_zfs__dprintf: dnode.c:1615:dnode_free_range(): ds=tank/fish obj=45 blkid=0 nblks=1 txg=39
           <...>-21406 [000] 18743282.606974: zfs_zfs__dprintf: dbuf.c:878:dbuf_free_range(): ds=tank/fish obj=45 start=0 end=0
           <...>-21406 [000] 18743282.606976: zfs_zfs__dprintf: dnode.c:1243:dnode_setdirty(): ds=tank/fish obj=45 txg=39
           <...>-21406 [000] 18743282.606998: zfs_zfs__dprintf: dbuf.c:1182:dbuf_dirty(): ds=tank/fish obj=45 lvl=0 blkid=-1 size=140
           <...>-21406 [000] 18743282.607129: zfs_zfs__dprintf: dbuf.c:1837:dbuf_create(): ds=tank/fish obj=45 lvl=0 blkid=0 db=ffff88003ba378f0
           <...>-21406 [000] 18743282.607161: zfs_zfs__dprintf: dbuf.c:1837:dbuf_create(): ds=tank/fish obj=45 lvl=0 blkid=0 db=ffff88003ba378f0
           <...>-21406 [000] 18743282.607172: zfs_zfs__dprintf: dbuf.c:1837:dbuf_create(): ds=tank/fish obj=45 lvl=0 blkid=0 db=ffff88003ba378f0
           <...>-21406 [000] 18743282.607189: zfs_zfs__dprintf: dbuf.c:1182:dbuf_dirty(): ds=tank/fish obj=45 lvl=0 blkid=0 size=1f400
           <...>-19891 [001] 18743283.747862: zfs_zfs__dprintf: dnode_sync.c:612:dnode_sync(): ds=tank/fish obj=45 is_hole=0 is_imbedded=0 lsize=107520 datablksz=128000

@behlendorf
Copy link
Contributor

Merged as:

58806b4 dbuf_free_range() overzealously frees dbufs

@behlendorf behlendorf closed this Mar 25, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants