Illumos 4047 #1760


dweeezil
Contributor

This branch consists of an illumos patch stack encompassing issues 3875, 3834, 4047 and a yet-unnumbered issue discovered following 4047.

These patches should help with various problems encountered during zfs recv operations.

mmatuska and others added 7 commits October 6, 2013 22:31
3699 zfs hold or release of a non-existent snapshot does not output error
3739 cannot set zfs quota or reservation on pool version < 22

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Approved by: Dan McDonald <danmcd@nexenta.com>

References:
  https://www.illumos.org/issues/3699
  https://www.illumos.org/issues/3739

Ported-by: Tim Chase <tim@chase2k.com>
3740 Poor ZFS send / receive performance due to snapshot hold / release processing

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Christopher Siden <christopher.siden@delphix.com>

References:
  https://www.illumos.org/issues/3740

Ported-by: Tim Chase <tim@chase2k.com>
3829 fix for 3740 changed behavior of zfs destroy/hold/release ioctl

Reviewed by: Matt Amdur <matt.amdur@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>

References:
  https://www.illumos.org/issues/3829

Ported by: Tim Chase <tim@chase2k.com>
3875 panic in zfs_root() after failed rollback
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>

References:
  https://www.illumos.org/issues/3875

Ported by: Tim Chase <tim@chase2k.com>

Porting Notes:

The ZoL version of zfs_resume_fs() had diverged from the upstream quite
a bit given that it has a number of Linux-isms.  As part of this patch,
I took this function in toto from illumos and re-modified it for ZoL in
the spirit of the previous version of the function.
3834 incremental replication of 'holey' file systems is slow
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>

References:
  https://www.illumos.org/issues/3834

Ported by: Tim Chase <tim@chase2k.com>
4047 panic from dbuf_free_range() from dmu_free_object() while doing zfs receive
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@nexenta.com>

References:
  https://www.illumos.org/issues/4047

Ported by: Tim Chase <tim@chase2k.com>
In the mailing list post referenced below, Matthew Ahrens said:

    The problem occurs when an object is reused for a new file which has a
    non-power-of-2 blocksize.

    The problem occurs due to the following sequence of events:

    First we process the OBJECT record in the send stream.
        Call dmu_object_reclaim() with the new, non-power-of-2 blocksize
            Call dmu_free_long_range() to free the entire file
                New code (27504) in dmu_free_long_range_impl() does not pass
                length=-1 to dnode_free_range()
                    dnode_free_range() does not realize we are truncating, and thus
                    does not decrease dn_maxblkid
                    Note, the fix for 26642 removes the dn_maxblkid-decreasing code
                    from dnode_free_range() entirely.
            Call dnode_reallocate() to change the dnode configuration
                On a debug build, fail ASSERT(dn_maxblkid == 0)

    Assuming we didn't fail the assertion, next we will process the FREE record
    which frees the remainder of the file (e.g. from 2.5K to the end)
        Eventually call dmu_tx_hold_free(off=2.5K, len=<big>)
            New code (24744) calls dmu_tx_count_write() on entire specified range
            Too much space has been scheduled to write, so fail with EFBIG

    The most expedient fix is to add code to zero out dn_maxblkid when doing
    dmu_object_reclaim().

    We can also make the code in dmu_tx_hold_free() more resilient by having it
    dmu_tx_count_write() only the first block of non-power-of-2 blocksize files
    (which only have one block).

    The problem can be reproduced by the following script:

    #!/bin/bash -x

    zpool create test c1t1d0

    # clean up from previous run
    zfs destroy -r test/fs
    zfs destroy -r test/recvd

    zfs create -o recordsize=8k test/fs

    dd if=/dev/zero of=/test/fs/big bs=1024k count=100

    # need to create enough files that ZFS will go back and reuse
    # object numbers
    for (( i=0; i<4000; i=i+1 )); do
            echo >/test/fs/empty-$i
    done

    zfs snapshot test/fs@big

    find /test/fs | xargs rm
    sync

    # replace "big" file with a file with uneven blocksize (1.5k)
    for (( i=0; i<100; i=i+1 )); do
            dd if=/dev/zero of=/test/fs/small-$i bs=1200 count=1
    done

    zfs snapshot test/fs@small

    zfs send test/fs@big | zfs recv test/recvd

    # this receive will fail
    zfs send -i @big test/fs@small | zfs recv test/recvd

References:
  http://www.listbox.com/member/archive/182191/2013/09/sort/time_rev/page/5/entry/3:140/20130909182626:D79EC5B8-199E-11E3-8BF5-CB08091A731B/
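
The sequence described in the quoted post can be sketched as a toy model. This is not ZFS code: `toy_dnode_t` and every `toy_*` function are hypothetical stand-ins that only mimic the behavior the post describes, namely that a free with an explicit length leaves `dn_maxblkid` stale, that reallocation expects it to be zero, and that zeroing it during reclaim avoids the failure. The 512-byte rounding in `toy_block_size()` is an assumption used to show why a 1200-byte file ends up with a 1.5K block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a dnode; NOT the real dnode_t. */
typedef struct toy_dnode {
	uint64_t maxblkid;	/* highest block id in use, like dn_maxblkid */
	uint32_t blksz;		/* block size in bytes */
} toy_dnode_t;

/* Models "length == -1", i.e. "free to end of object". */
#define	TOY_FREE_TO_END	UINT64_MAX

/* Assumption: a single-block file is sized up to a 512-byte multiple. */
static uint32_t
toy_block_size(uint32_t nbytes)
{
	return ((nbytes + 511u) / 512u * 512u);
}

static bool
toy_is_power_of_2(uint32_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Models the described dnode_free_range() behavior: maxblkid is lowered
 * only when the caller signals a truncation with TOY_FREE_TO_END.
 */
static void
toy_free_range(toy_dnode_t *dn, uint64_t off, uint64_t len)
{
	assert(dn->blksz != 0);
	if (len == TOY_FREE_TO_END)
		dn->maxblkid = (off == 0) ? 0 : (off - 1) / dn->blksz;
}

/* Models the chunked free path: it passes an explicit length. */
static void
toy_free_long_range(toy_dnode_t *dn)
{
	uint64_t end = (dn->maxblkid + 1) * dn->blksz;

	toy_free_range(dn, 0, end);	/* maxblkid stays stale */
}

/* Returns false where a debug build would trip ASSERT(dn_maxblkid == 0). */
static bool
toy_reallocate(toy_dnode_t *dn, uint32_t new_blksz)
{
	if (dn->maxblkid != 0)
		return (false);
	dn->blksz = new_blksz;
	return (true);
}

/* Models the reclaim step, with or without zeroing maxblkid first. */
static bool
toy_reclaim(toy_dnode_t *dn, uint32_t new_blksz, bool zero_maxblkid)
{
	toy_free_long_range(dn);
	if (zero_maxblkid)
		dn->maxblkid = 0;
	return (toy_reallocate(dn, new_blksz));
}
```

With a 100 MiB object of 8K blocks (`maxblkid` 12799), `toy_reclaim(&dn, toy_block_size(1200), false)` fails the reallocation check, while the same call with `true` succeeds and leaves `blksz` at 1536.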
@dweeezil
Contributor Author

dweeezil commented Oct 7, 2013

I've added patches for illumos issues 3829, 3740, 3699 and 3739 to this patch stack. They're not all send/recv-related but they're all somewhat intertwined. There's a commit corresponding to each illumos commit.

@dweeezil
Contributor Author

dweeezil commented Oct 9, 2013

These are in #1775.

@dweeezil dweeezil closed this Oct 9, 2013