
Move to ZFS volume creates correctly sized file filled with \0 #3125

Closed
Lalufu opened this issue Feb 20, 2015 · 70 comments
Labels
Type: Regression Indicates a functional regression

@Lalufu
Contributor

Lalufu commented Feb 20, 2015

Test case:

[sun@ethan ~ :) 4]$ cp a.txt b.txt
[sun@ethan ~ :) 6]$ sha1sum a.txt b.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  a.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  b.txt
[sun@ethan ~ :) 7]$ cp b.txt /tank/share/media/audio/
[sun@ethan ~ :) 8]$ sha1sum a.txt b.txt /tank/share/media/audio/b.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  a.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  b.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  /tank/share/media/audio/b.txt
[sun@ethan ~ :) 9]$ rm /tank/share/media/audio/b.txt
[sun@ethan ~ :) 10]$ mv b.txt /tank/share/media/audio/
[sun@ethan ~ :) 11]$ sha1sum a.txt /tank/share/media/audio/b.txt
3d8bb1556fd9d5d2c43d97011d7bf6bef8e8c295  a.txt
92382bb65cc7b6b5293356054e9a146a96043336  /tank/share/media/audio/b.txt
[sun@ethan ~ :) 12]$ hexdump -C /tank/share/media/audio/b.txt
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000009e0  00 00 00 00 00 00 00 00  00                       |.........|
000009e9

This seems to only happen on a single dataset so far.
Tracing through mv with gdb suggests this is somehow timing related, as stepping through
seems to prevent the effect from happening.

NAME                    PROPERTY               VALUE                    SOURCE
tank/share/media/audio  type                   filesystem               -
tank/share/media/audio  creation               Thu Oct  3 12:02 2013    -
tank/share/media/audio  used                   372G                     -
tank/share/media/audio  available              3.13T                    -
tank/share/media/audio  referenced             366G                     -
tank/share/media/audio  compressratio          1.00x                    -
tank/share/media/audio  mounted                yes                      -
tank/share/media/audio  quota                  none                     default
tank/share/media/audio  reservation            none                     default
tank/share/media/audio  recordsize             128K                     default
tank/share/media/audio  mountpoint             /tank/share/media/audio  inherited from tank/share
tank/share/media/audio  sharenfs               rw=10.200.254.1,ro       inherited from tank/share
tank/share/media/audio  checksum               on                       default
tank/share/media/audio  compression            off                      inherited from tank/share/media
tank/share/media/audio  atime                  off                      inherited from tank/share
tank/share/media/audio  devices                off                      inherited from tank
tank/share/media/audio  exec                   on                       default
tank/share/media/audio  setuid                 off                      inherited from tank
tank/share/media/audio  readonly               off                      default
tank/share/media/audio  zoned                  off                      default
tank/share/media/audio  snapdir                hidden                   default
tank/share/media/audio  aclinherit             restricted               default
tank/share/media/audio  canmount               on                       default
tank/share/media/audio  xattr                  sa                       inherited from tank
tank/share/media/audio  copies                 1                        default
tank/share/media/audio  version                5                        -
tank/share/media/audio  utf8only               off                      -
tank/share/media/audio  normalization          none                     -
tank/share/media/audio  casesensitivity        sensitive                -
tank/share/media/audio  vscan                  off                      default
tank/share/media/audio  nbmand                 off                      default
tank/share/media/audio  sharesmb               off                      default
tank/share/media/audio  refquota               none                     default
tank/share/media/audio  refreservation         none                     default
tank/share/media/audio  primarycache           all                      default
tank/share/media/audio  secondarycache         all                      default
tank/share/media/audio  usedbysnapshots        6.05G                    -
tank/share/media/audio  usedbydataset          366G                     -
tank/share/media/audio  usedbychildren         0                        -
tank/share/media/audio  usedbyrefreservation   0                        -
tank/share/media/audio  logbias                latency                  default
tank/share/media/audio  dedup                  off                      default
tank/share/media/audio  mlslabel               none                     default
tank/share/media/audio  sync                   standard                 default
tank/share/media/audio  refcompressratio       1.00x                    -
tank/share/media/audio  written                0                        -
tank/share/media/audio  logicalused            366G                     -
tank/share/media/audio  logicalreferenced      363G                     -
tank/share/media/audio  snapdev                hidden                   default
tank/share/media/audio  acltype                posixacl                 local
tank/share/media/audio  context                none                     default
tank/share/media/audio  fscontext              none                     default
tank/share/media/audio  defcontext             none                     default
tank/share/media/audio  rootcontext            none                     default
tank/share/media/audio  relatime               off                      default
tank/share/media/audio  com.sun:auto-snapshot  true                     inherited from tank/share
@Lalufu
Contributor Author

Lalufu commented Feb 20, 2015

Note: source and destination are not on the same filesystem, so an actual copy/unlink takes place.

@fajarnugraha
Contributor

Works for me. Is this some old zol version, perhaps?

$ dpkg -l zfs-dkms spl-dkms
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                 Version                 Architecture            Description
+++-====================================-=======================-=======================-==============================================================================
ii  spl-dkms                             0.6.3-16~a3c1eb~wheezy  all                     Solaris Porting Layer kernel modules for Linux
ii  zfs-dkms                             0.6.3-26~2d9d57~wheezy  all                     Native ZFS filesystem kernel modules for Linux
$ ./test.sh 
+ df -h . ../d2
Filesystem      Size  Used Avail Use% Mounted on
rpool/test/d1    47G     0   47G   0% /rpool/test/d1
rpool/test/d2    47G     0   47G   0% /rpool/test/d2
+ dd if=/dev/urandom of=a.txt bs=32 count=1
1+0 records in
1+0 records out
32 bytes (32 B) copied, 0.000312707 s, 102 kB/s
+ cp a.txt b.txt
+ sha1sum a.txt b.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  a.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  b.txt
+ cp b.txt ../d2
+ sha1sum a.txt b.txt ../d2/b.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  a.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  b.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  ../d2/b.txt
+ rm ../d2/b.txt
+ mv b.txt ../d2/b.txt
+ sha1sum a.txt ../d2/b.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  a.txt
2ad172f4e6de75e974df4b00bd97ddeb5e1135e0  ../d2/b.txt
+ hexdump -C ../d2/b.txt
00000000  f7 e3 a6 5c 86 e9 40 b3  58 1f 0c cd a3 0c 15 07  |...\..@.X.......|
00000010  01 1b c6 3c 6c 6c 55 1a  f1 59 ea 4e c5 21 f7 09  |...

@Lalufu
Contributor Author

Lalufu commented Feb 23, 2015

This happens only on one of the datasets on this pool, others are fine. On this single one it's reproducible, though.

Versions (forgot about those):
SPL: Loaded module v0.6.3-1.2
ZFS: Loaded module v0.6.3-1.2, ZFS pool version 5000, ZFS filesystem version 5
3.18.7-200.fc21.x86_64

@behlendorf
Contributor

@Lalufu that's very troubling. Is the problem with the new file persistent? That is if you export/import the pool does the empty file now have good contents or is it still empty. Can you reliably reproduce this for that dataset?

@Lalufu
Contributor Author

Lalufu commented Feb 25, 2015

Is a reboot equivalent to an export/import for this purpose? That would be easier to test than exporting the pool in a running system.

On this particular dataset the issue is reliably reproducible.

Looking at an strace of a (working) cp and a (not working) mv, the main difference seems to be that mv (after copying the file over to the destination) does the following syscalls:

utimensat() (on the destination)
flistxattr(NULL) (on the source)
flistxattr("security.selinux") (in the source)
fgetxattr("system.posix_acl_access") (on the source, fails with ENODATA)
fstat() (on the source)
fsetxattr("system.posix_acl_access") (on the destination)

@dweeezil
Contributor

@Lalufu your strace output plus the xattr=sa point at an SA corruption problem. The 0.6.3-1.2 tag is missing a62d1b0 which fixes the last known SA problem. It would be interesting to see all of the xattr-related calls from the strace of the mv command. That said, if your system has selinux enabled and you're running a ZFS-enabled selinux policy, you'll not see all of the xattr calls from strace because some of them are made internally from the kernel's selinux code.

Also, if you have used older versions of ZoL on this pool, they were likely missing other SA fixes which opens the possibility the filesystem and/or the pool is permanently corrupted.

It might be worth trying current master code or cherry-picking a62d1b0 into 0.6.3-1.2 (it applies cleanly) and see what happens. It would also be a good idea to run zdb -b <pool> (while it's exported) to make sure it can be traversed cleanly and that there's no missing space.

@Lalufu
Contributor Author

Lalufu commented Feb 25, 2015

Yes, I thought about xattr as well. Unfortunately most of the preconditions you mentioned are there: the system is running selinux, and the pool was created on 0.6.2 (as far as I can remember; is there an easy way to check this?).

I'll try the zdb call later to see what gives, and try to build a zfs version with the check.

@Lalufu
Contributor Author

Lalufu commented Feb 26, 2015

zdb -b seems to be happy:

[root@ethan ~]# zdb -b tank

Traversing all blocks to verify nothing leaked ...

        No leaks (block sum matches space maps exactly)

        bp count:        86093402
        bp logical:    7577584837632      avg:  88015
        bp physical:   7483290201088      avg:  86920     compression:   1.01
        bp allocated:  11350897614848      avg: 131843     compression:   0.67
        bp deduped:             0    ref>1:      0   deduplication:   1.00
        SPA allocated: 11350896795648     used: 63.53%

Still need to cherry pick the patch and see if the zero filled file is still empty after a reboot.

@Lalufu
Contributor Author

Lalufu commented Feb 26, 2015

The file is still empty after a clean reboot.

@Lalufu
Contributor Author

Lalufu commented Feb 26, 2015

I've cherry-picked a62d1b0, and it does not make a difference (I've only rebuilt the kernel module, not user space with the patch).

I've uploaded the strace of a cp and a mv of the same file to the same destination, for comparison.

http://tara.camperquake.de/temp/cp.strace
http://tara.camperquake.de/temp/mv.strace

Those links will expire after ca. 14 days.

@tuxoko
Contributor

tuxoko commented Mar 17, 2015

Hi, @Lalufu
The strace links are dead. Could you repost them?
I would like to look into it a bit.

@Lalufu
Contributor Author

Lalufu commented Mar 18, 2015 via email

@Lalufu
Contributor Author

Lalufu commented Jun 15, 2015 via email

@klkblake

I can reproduce this problem on 0.7.1. I use a Gentoo derivative, and packages are compiled on a different ZFS filesystem (mounted at /var/tmp) than the root. When I install packages, there is a random chance that some of the files will be zeroed. It seems bursty -- I can install a package a few times with it corrupting each time, and then spend a while unable to reproduce it. When it does corrupt, that too seems bursty, insofar as it'll often be just a bunch of man pages corrupted, or just a bunch of shared libraries, not a random selection of files.

@bunder2015
Contributor

bunder2015 commented Sep 20, 2017

I have a working theory that I'm testing. Can you set /var/tmp/portage to sync=always? It's gonna be slower than usual but I haven't seen this since.

edit: here's a little test script I banged out to find these corrupted files.

#!/bin/bash

find /lib64/*.so* | xargs file | grep " data"
find /lib32/*.so* | xargs file | grep " data"
find /usr/lib64/*.so* | xargs file | grep " data"
find /usr/lib32/*.so* | xargs file | grep " data"

find /usr/share/man | grep bz2 | xargs file | grep "  data"

@klkblake

I set it to sync=always and set it to recompile nss in a loop, checking for the presence of corrupted files after each iteration, and went to bed. It ran for... probably a bit over 12 hours, without detecting any corruption. That said, with how dramatically it was slowed, it's unclear how many times it actually completed. Still, I expect this is evidence that it works.
The script I ran was:

while [[ -z $stop ]]; do emerge -1 dev-libs/nss; for f in $(equery --no-color files --filter=obj,conf,cmd,doc,man,info dev-libs/nss 2>/dev/null); do tr -d '\0' < $f | read -n 1 || stop=1; done; done

This is slower than yours, but more precise: it will only detect files that are fully zeroed. When doing broader testing using find it's useful to pass find ... '!' -size 0 to exclude empty files, but that isn't necessary here because the test package doesn't happen to install any.

@bunder2015
Contributor

Hmm, at least I'm not the only one who sees this. genlop -t nss should tell you how many times it ran.

Question now is, what causes these files to get zero filled? @behlendorf @tuxoko any ideas?

@dweeezil
Contributor

@klkblake Since you seem to be able to reproduce this problem, could you perform a zdb -ddddd <pool>/<fs> <inum> (<inum> is the inode number from "ls -i") on one of the bogus files and post the result? It would be interesting to see whether the file is sparse. It might be useful to know your kernel version as well. The most obvious place I can think of where ZFS would produce zeroes would be fallocate.

@klkblake

Currently running kernel version 4.12.13.
output of ls -lsh /usr/lib64/debug/usr/lib64/libnssckbi.so.debug:

512 -rw-r--r-- 1 root root 557K Sep 22 04:39 /usr/lib64/debug/usr/lib64/libnssckbi.so.debug

lsattr -l and getfattr -d both give nothing
the complete output of zdb -ddddd mjolnir/mjolnir 4300130:

Dataset mjolnir/mjolnir [ZPL], ID 75, cr_txg 16, 271G, 2049944 objects, rootbp DVA[0]=<0:959ccfc000:1000> DVA[1]=<0:f1b5900000:1000> [L0 DMU objset] sha512 uncompressed LE contiguous unique double size=800L/800P birth=2274219L/2274219P fill=2049944 cksum=57d50f64245d8bb3:a6ce60c66e1d4807:80c60b8ed12fae6b:286c78b732cf6303

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
   4300130    1   128K   128K      0     512   128K    0.00  ZFS plain file
                                               168   bonus  System attributes
        dnode flags: USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED 
        dnode maxblkid: 0
        path    /usr/lib64/debug/usr/lib64/libnssckbi.so.debug
        uid     0
        gid     0
        atime   Fri Sep 22 04:39:32 2017
        mtime   Fri Sep 22 04:39:16 2017
        ctime   Fri Sep 22 04:39:17 2017
        crtime  Fri Sep 22 04:39:17 2017
        gen     2267470
        mode    100644
        size    569984
        parent  1015946
        links   1
        pflags  40800000004
Indirect blocks:


@dweeezil
Contributor

@klkblake Somewhat as expected, the file is completely sparse. Have you got any of the following enabled on the dataset: compression, dedup, non-standard checksumming? Are the corrupted files always ".so.debug" files?

@klkblake

Compression is on, deduplication is off, checksum is set to sha512. They aren't always .so.debug. I also have seen .so and .bz2 (compressed man pages). I might have seen executables at one point? But also, I've been testing this by installing packages repeatedly, so this is more or less what you would expect.

@dweeezil
Contributor

@klkblake The reason I asked about those specific properties is because they open other avenues (than traditional lseek and hole-punching) for creating sparse files or otherwise affecting the write path (nop-write, zero suppression, etc.). Could you please try the same test with checksum=on (fletcher4) to rule out nop-write. Also, without compression enabled.

@bunder2015
Contributor

FWIW I still use default checksumming. I've only ever seen this on libraries and man pages.

@behlendorf
Contributor

Let's decisively rule out NOP writes as a possible cause by disabling them entirely. This can be done by setting the module option zfs_nopwrite_enabled=0 at any time and they'll be disabled immediately.

@tuxoko
Contributor

tuxoko commented Sep 25, 2017

@bunder2015
Does this only happen on the 0.7 branch? If possible, can you try this on the 0.6.5 branch?
Since the only reports we have are on 0.6.3 and 0.7, I wonder if there's a regression.

@Lalufu
Contributor Author

Lalufu commented Oct 21, 2017 via email

@janlam7

janlam7 commented Oct 21, 2017

I created a branch from 0.7.3 with 66aca24 reverted at https://github.com/janlam7/zfs/tree/issue/3125

On my machine it prevents the issue from occurring.

@bunder2015
Contributor

@ryao has asked me to file a gentoo bug on this issue, https://bugs.gentoo.org/635002

@hunbalazs

@Lalufu indeed

--- /usr/lib64/python3.4/site-packages/portage/util/file_copy/__init__.py.orig	2017-10-21 17:49:36.707482499 +0200
+++ /usr/lib64/python3.4/site-packages/portage/util/file_copy/__init__.py	2017-10-21 17:50:13.168483047 +0200
@@ -30,7 +30,8 @@
 		_file_copy(src_file.fileno(), dst_file.fileno())
 
 
-if _file_copy is None:
-	copyfile = shutil.copyfile
-else:
-	copyfile = _optimized_copyfile
+#if _file_copy is None:
+#	copyfile = shutil.copyfile
+#else:
+#	copyfile = _optimized_copyfile
+copyfile = shutil.copyfile

I'll try that. I tried to reproduce by using portage's _optimized_copyfile over and over but my short stress test didn't give any result. I'll try to continue testing overnight.

@bunder2015
Contributor

News from the gentoo side...

Looking at the sendfile documentation, it looks like portage's interpretation of the *offset parameter is incorrect, since that variable represents the input file offset rather than the output file offset. I'll fix it to use sendfile's return value to measure bytes written to the output file.

Since I can't reproduce this very easily, can someone test this portage patch and report any changes? https://bugs.gentoo.org/635002#c6

@janlam7

janlam7 commented Oct 23, 2017

On my system the patch in https://bugs.gentoo.org/635002#c6 does not prevent this issue from occurring.

@hunbalazs

hunbalazs commented Oct 23, 2017

Using shutil.copyfile ( @Lalufu ) and https://bugs.gentoo.org/635002#c6 ( @bunder2015 ) did not help. Mounting /var/tmp/portage as tmpfs fixes it.
The interesting thing is that, using @janlam7 's method, I could not reproduce the issue by installing skypeforlinux multiple times, only by using nvidia-drivers.
I'll try with linux 4.1.x ( @ryao ) and with @janlam7 's branch

@janlam7

janlam7 commented Oct 24, 2017

With kernel 4.1.43-gentoo-r1 and zfs v0.7.2-r0-gentoo I can reproduce it.

@bunder2015
Contributor

The gentoo folks have pushed out a revision to the patch I linked yesterday, same URL.

@gedia
Contributor

gedia commented Oct 24, 2017

@bunder2015 I'm still able to reproduce this with version 5 of said patch and sync=disabled

@tuxoko
Contributor

tuxoko commented Oct 26, 2017

Would reverting the dirty check fix the issue?

diff --git a/module/zfs/dmu.c b/module/zfs/dmu.c
index b3cf10c..6569131 100644
--- a/module/zfs/dmu.c
+++ b/module/zfs/dmu.c
@@ -2054,12 +2054,10 @@ dmu_offset_next(objset_t *os, uint64_t object, boolean_t hole, uint64_t *off)
 	/*
 	 * Check if dnode is dirty
 	 */
-	if (dn->dn_dirtyctx != DN_UNDIRTIED) {
-		for (i = 0; i < TXG_SIZE; i++) {
-			if (!list_is_empty(&dn->dn_dirty_records[i])) {
-				clean = B_FALSE;
-				break;
-			}
+	for (i = 0; i < TXG_SIZE; i++) {
+		if (list_link_active(&dn->dn_dirty_link[i])) {
+			clean = B_FALSE;
+			break;
 		}
 	}
 

@bunder2015
Contributor

Sorry for not keeping an eye on this during the week... Got word back from gentoo again yesterday after their last round of portage patches.

A problem in the lseek SEEK_DATA/SEEK_HOLE implementation might cause this.

Has anyone had a chance to test @tuxoko's patch?

@janlam7

janlam7 commented Oct 29, 2017

@tuxoko 's patch works for me.

@behlendorf
Contributor

@tuxoko good thought, that would explain things. Better to solely check the dnode for dirty status. It would be great if we could get a few other users to verify the fix and a PR opened.

@gedia
Contributor

gedia commented Oct 31, 2017

With @tuxoko's patch I haven't been able to reproduce this bug so far either. Thanks!

@tuxoko
Contributor

tuxoko commented Nov 1, 2017

@behlendorf
To be honest, I don't know exactly which check we should use. The new check obviously causes this regression, but there might be some problem with the old one that warranted the change. Do you know what the reason for the change was in the first place? Also, do you think we should consider mmap here?

@behlendorf
Contributor

@tuxoko we should use the one you proposed, which is what was here prior to 66aca24 for SEEK_HOLE. This was a case of over optimizing the dirty check which slipped through since it's almost always right. As for mmap we should be OK here since zpl_writepage() will dirty the object with dmu_write() like any normal write.

@tuxoko
Contributor

tuxoko commented Nov 2, 2017

@behlendorf
But wouldn't zpl_writepage only happen when writing back dirty pages?
There would still be a window between dirtying a page and zpl_writepage. Or do we not need to care about that for SEEK_HOLE/SEEK_DATA?

@behlendorf
Contributor

@tuxoko my understanding from the msync(2) man page is there's no guarantee made to applications about this. We could go the extra mile here and conservatively force an msync when zp->z_is_mapped is set.

behlendorf added a commit to behlendorf/zfs that referenced this issue Nov 14, 2017
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3125
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Nov 21, 2017
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3125
Closes openzfs#6867

Requires-spl: refs/pull/669/head
tonyhutter pushed a commit that referenced this issue Nov 22, 2017
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3125 
Closes #6867
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Jan 29, 2018
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3125
Closes openzfs#6867
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Feb 13, 2018
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3125
Closes openzfs#6867