Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: PANIC at dmu_objset.c:1844:do_userquota_cacheflush() #7753

Closed
phryneas opened this issue Jul 28, 2018 · 22 comments
Closed

kernel: PANIC at dmu_objset.c:1844:do_userquota_cacheflush() #7753

phryneas opened this issue Jul 28, 2018 · 22 comments

Comments

@phryneas
Copy link

System information

Type Version/Name
Distribution Name NixOS
Distribution Version 18.03.132915.d6c6c7fcec6
Linux Kernel 4.14.56
Architecture x86_64
ZFS Version 0.7.0-1
SPL Version 0.7.0-1
Exact commit ba863d0

Declaration of complete build process: https://github.com/NixOS/nixpkgs/blob/d6c6c7fcec6dbd2b8ab14f0b35d56c7733872baa/pkgs/os-specific/linux/zfs/default.nix

Describe the problem you're observing

On shutdown, the system hangs indefinitely after trying to unmount zfs shares.

According to logs, the kernel panics.

Describe how to reproduce the problem

Simply shut down - happens every time.

Include any warning/errors/backtraces from the system logs

Jul 26 17:00:39 nixos systemd[1]: Unmounting /var/lib...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /run/keys...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /boot...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /tmp...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /home...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /run/wrappers...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /run/user/1000...
Jul 26 17:00:39 nixos systemd[1]: Unmounting /zroot...
Jul 26 17:00:39 nixos systemd[1]: Stopped Update UTMP about System Boot/Shutdown.
Jul 26 17:00:39 nixos systemd[1]: Unmounted /run/keys.
Jul 26 17:00:39 nixos systemd[1]: Unmounted /run/user/1000.
Jul 26 17:00:39 nixos systemd[1]: Unmounted /run/wrappers.
Jul 26 17:00:39 nixos systemd[1]: Stopped target Swap.
Jul 26 17:00:39 nixos systemd[1]: Unmounted /tmp.
Jul 26 17:00:39 nixos systemd[1]: Unmounted /tmp.
Jul 26 17:00:39 nixos kernel: VERIFY3(0 == zap_increment(os, (-1ULL), uqn->uqn_id, uqn->uqn_delta, tx)) failed (0 == 13)
Jul 26 17:00:39 nixos kernel: PANIC at dmu_objset.c:1844:do_userquota_cacheflush()
Jul 26 17:00:39 nixos kernel: Showing stack for process 274
Jul 26 17:00:39 nixos kernel: CPU: 1 PID: 274 Comm: dp_sync_taskq Tainted: P           O    4.14.56 #1-NixOS
Jul 26 17:00:39 nixos kernel: Hardware name: MSI MS-7821/Z87-G55 (MS-7821), BIOS V10.7 07/19/2014
Jul 26 17:00:39 nixos kernel: Call Trace:
Jul 26 17:00:39 nixos kernel:  dump_stack+0x5c/0x85
Jul 26 17:00:39 nixos kernel:  spl_panic+0xc8/0x110 [spl]
Jul 26 17:00:39 nixos kernel:  ? zap_lockdir+0x37/0x90 [zfs]
Jul 26 17:00:39 nixos kernel:  ? zap_lookup_norm+0x40/0xa0 [zfs]
Jul 26 17:00:39 nixos kernel:  ? zap_lookup+0x12/0x20 [zfs]
Jul 26 17:00:39 nixos kernel:  userquota_updates_task+0x2df/0x5b0 [zfs]
Jul 26 17:00:39 nixos kernel:  ? put_prev_entity+0x2c/0x4a0
Jul 26 17:00:39 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:00:39 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:00:39 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:00:39 nixos kernel:  taskq_thread+0x267/0x470 [spl]
Jul 26 17:00:39 nixos kernel:  ? wake_up_q+0x70/0x70
Jul 26 17:00:39 nixos kernel:  ? taskq_thread_should_stop+0x70/0x70 [spl]
Jul 26 17:00:39 nixos kernel:  kthread+0x11a/0x130
Jul 26 17:00:39 nixos kernel:  ? kthread_create_on_node+0x40/0x40
Jul 26 17:00:39 nixos kernel:  ret_from_fork+0x35/0x40
Jul 26 17:00:39 nixos systemd[1]: Unmounted /boot.
Jul 26 17:02:09 nixos systemd[1]: var-lib.mount: Unmounting timed out. Terminating.
Jul 26 17:02:09 nixos systemd[1]: zroot.mount: Unmounting timed out. Terminating.
Jul 26 17:02:09 nixos systemd[1]: home.mount: Unmounting timed out. Terminating.
Jul 26 17:03:40 nixos systemd[1]: var-lib.mount: Mount process timed out. Killing.
Jul 26 17:03:40 nixos systemd[1]: var-lib.mount: Killing process 19017 (umount) with signal SIGKILL.
Jul 26 17:03:40 nixos systemd[1]: zroot.mount: Mount process timed out. Killing.
Jul 26 17:03:40 nixos systemd[1]: zroot.mount: Killing process 19025 (umount) with signal SIGKILL.
Jul 26 17:03:40 nixos systemd[1]: home.mount: Mount process timed out. Killing.
Jul 26 17:03:40 nixos systemd[1]: home.mount: Killing process 19022 (umount) with signal SIGKILL.
Jul 26 17:03:44 nixos kernel: INFO: task dp_sync_taskq:273 blocked for more than 120 seconds.
Jul 26 17:03:44 nixos kernel:       Tainted: P           O    4.14.56 #1-NixOS
Jul 26 17:03:44 nixos kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 26 17:03:44 nixos kernel: dp_sync_taskq   D    0   273      2 0x80000000
Jul 26 17:03:44 nixos kernel: Call Trace:
Jul 26 17:03:44 nixos kernel:  ? __schedule+0x1a2/0x6c0
Jul 26 17:03:44 nixos kernel:  schedule+0x28/0x80
Jul 26 17:03:44 nixos kernel:  schedule_preempt_disabled+0xa/0x10
Jul 26 17:03:44 nixos kernel:  __mutex_lock.isra.2+0x17d/0x4b0
Jul 26 17:03:44 nixos kernel:  ? do_userobjquota_update.part.15+0xd5/0xe0 [zfs]
Jul 26 17:03:44 nixos kernel:  ? userquota_updates_task+0x288/0x5b0 [zfs]
Jul 26 17:03:44 nixos kernel:  userquota_updates_task+0x288/0x5b0 [zfs]
Jul 26 17:03:44 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:03:44 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:03:44 nixos kernel:  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
Jul 26 17:03:44 nixos kernel:  taskq_thread+0x267/0x470 [spl]
Jul 26 17:03:44 nixos kernel:  ? wake_up_q+0x70/0x70
% zfs get all zroot/crypt/tmp 
NAME             PROPERTY              VALUE                          SOURCE
zroot/crypt/tmp  type                  filesystem                     -
zroot/crypt/tmp  creation              Mon May 21 17:55 2018          -
zroot/crypt/tmp  used                  1.37G                          -
zroot/crypt/tmp  available             370G                           -
zroot/crypt/tmp  referenced            1.37G                          -
zroot/crypt/tmp  compressratio         1.00x                          -
zroot/crypt/tmp  mounted               yes                            -
zroot/crypt/tmp  quota                 none                           default
zroot/crypt/tmp  reservation           none                           default
zroot/crypt/tmp  recordsize            128K                           default
zroot/crypt/tmp  mountpoint            legacy                         local
zroot/crypt/tmp  sharenfs              off                            default
zroot/crypt/tmp  checksum              on                             default
zroot/crypt/tmp  compression           off                            default
zroot/crypt/tmp  atime                 on                             default
zroot/crypt/tmp  devices               on                             default
zroot/crypt/tmp  exec                  on                             default
zroot/crypt/tmp  setuid                on                             default
zroot/crypt/tmp  readonly              off                            default
zroot/crypt/tmp  zoned                 off                            default
zroot/crypt/tmp  snapdir               hidden                         default
zroot/crypt/tmp  aclinherit            restricted                     default
zroot/crypt/tmp  createtxg             16                             -
zroot/crypt/tmp  canmount              on                             default
zroot/crypt/tmp  xattr                 on                             default
zroot/crypt/tmp  copies                1                              default
zroot/crypt/tmp  version               5                              -
zroot/crypt/tmp  utf8only              off                            -
zroot/crypt/tmp  normalization         none                           -
zroot/crypt/tmp  casesensitivity       sensitive                      -
zroot/crypt/tmp  vscan                 off                            default
zroot/crypt/tmp  nbmand                off                            default
zroot/crypt/tmp  sharesmb              off                            default
zroot/crypt/tmp  refquota              none                           default
zroot/crypt/tmp  refreservation        none                           default
zroot/crypt/tmp  guid                  7632425401047322315            -
zroot/crypt/tmp  primarycache          all                            default
zroot/crypt/tmp  secondarycache        all                            default
zroot/crypt/tmp  usedbysnapshots       0B                             -
zroot/crypt/tmp  usedbydataset         1.37G                          -
zroot/crypt/tmp  usedbychildren        0B                             -
zroot/crypt/tmp  usedbyrefreservation  0B                             -
zroot/crypt/tmp  logbias               latency                        default
zroot/crypt/tmp  dedup                 off                            default
zroot/crypt/tmp  mlslabel              none                           default
zroot/crypt/tmp  sync                  disabled                       local
zroot/crypt/tmp  dnodesize             legacy                         default
zroot/crypt/tmp  refcompressratio      1.00x                          -
zroot/crypt/tmp  written               1.37G                          -
zroot/crypt/tmp  logicalused           1.23G                          -
zroot/crypt/tmp  logicalreferenced     1.23G                          -
zroot/crypt/tmp  volmode               default                        default
zroot/crypt/tmp  filesystem_limit      none                           default
zroot/crypt/tmp  snapshot_limit        none                           default
zroot/crypt/tmp  filesystem_count      none                           default
zroot/crypt/tmp  snapshot_count        none                           default
zroot/crypt/tmp  snapdev               hidden                         default
zroot/crypt/tmp  acltype               off                            default
zroot/crypt/tmp  context               none                           default
zroot/crypt/tmp  fscontext             none                           default
zroot/crypt/tmp  defcontext            none                           default
zroot/crypt/tmp  rootcontext           none                           default
zroot/crypt/tmp  relatime              on                             temporary
zroot/crypt/tmp  redundant_metadata    all                            default
zroot/crypt/tmp  overlay               off                            default
zroot/crypt/tmp  encryption            aes-256-gcm                    -
zroot/crypt/tmp  keylocation           none                           default
zroot/crypt/tmp  keyformat             passphrase                     -
zroot/crypt/tmp  pbkdf2iters           342K                           -
zroot/crypt/tmp  encryptionroot        zroot/crypt                    -
zroot/crypt/tmp  keystatus             available                      -
zroot/crypt/tmp  nixos:shutdown-time   Sat Jul 28 11:58:50 CEST 2018  inherited from zroot

I have not enabled quota on any volume:

% zfs get quota     
NAME                        PROPERTY  VALUE  SOURCE
zroot                       quota     none   default
zroot/crypt                 quota     none   default
zroot/crypt/home            quota     none   default
zroot/crypt/home@--head--   quota     -      -
zroot/crypt/root            quota     none   default
zroot/crypt/tmp             quota     none   default
zroot/crypt/var-lib         quota     none   default

If I can provide any additional data, please tell me.

@behlendorf
Copy link
Contributor

Thanks for reporting this. Just FYI quota accounting in ZFS is always enabled, it's only the enforcement which can be enabled/disabled.

What's a little surprising about this failure is the error reported by the VERIFY is 13 = EACCESS. By this point in the code most permission checks will already have been done. One exception is that EACCESS may be returned from the zio layer if the encryption keys aren't available (see commit cd32e5d). I see you're running a version of ZFS with encryption but without some of the follow up fixes which handle some corner cases like this. If possible could you upgrade to the latest ZFS master and see if you can still reproduce the issue.

Let's get @tcaputi's thoughts on this as well since I suspect it is encryption related.

@tcaputi
Copy link
Contributor

tcaputi commented Jul 30, 2018

I agree with Brian. We've made a few fixes since the version that you have came out. I would suggest installing the newest version, rebooting / reinserting the kernel module, and seeing if the problem persists. If it does, reply here and we'll take a closer look.

@phryneas
Copy link
Author

I will do that in the next few days and report back :)

Just for the record, to be sure: upgrading & possibly loading it with my old kernel later on will currently work without any corruption?

@tcaputi
Copy link
Contributor

tcaputi commented Jul 30, 2018

None of this code is currently in a stable release, and we make no guarantees about code that is just on master. That being said, I am fairly certain that we have done nothing to break the on-disk format of encryption since ba863d0 was released and I dont think any patches we have made could cause permanent errors that couldn't be undone by reverting to an older zs version.

If you are feeling concerned I would recommend making a backup of any data you can't afford to lose.

@behlendorf
Copy link
Contributor

This may go without saying. But I'd just add to @tcaputi's comment that if you plan to rollback to the older version do not run zpool upgrade. This will enable new features which won't be understood by the older code.

@phryneas
Copy link
Author

Okay, understood :) Will do some testing & report back!

@phryneas
Copy link
Author

phryneas commented Aug 3, 2018

I've switched to 3d503a7

and now I got

Aug 03 23:02:04 nixos systemd[1]: Unmounting /zroot...
Aug 03 23:02:04 nixos systemd[1]: Unmounting /boot...
Aug 03 23:02:04 nixos systemd[1]: Unmounting /var/lib...
Aug 03 23:02:04 nixos systemd[1]: Unmounting /home...
Aug 03 23:02:04 nixos systemd[1]: Unmounting /run/user/1000...
Aug 03 23:02:05 nixos systemd[1]: Unmounting /run/keys...
Aug 03 23:02:05 nixos systemd[1]: Unmounting /run/wrappers...
Aug 03 23:02:05 nixos systemd[1]: Unmounting /tmp...
Aug 03 23:02:05 nixos systemd[1]: Stopped Update UTMP about System Boot/Shutdown.
Aug 03 23:02:05 nixos systemd[1]: Unmounted /run/keys.
Aug 03 23:02:05 nixos systemd[1]: Unmounted /boot.
Aug 03 23:02:05 nixos kernel: VERIFY3(0 == spa_do_crypt_objset_mac_abd(B_TRUE, spa, dsobj, zio->io_abd, psize, (((((bp)->blk_
![img_20180803_230521](https://user-images.githubusercontent.com/4282439/43666151-ea461b9c-9772-11e8-95eb-131ff894a9d4.jpg)
prop) >> (63)) & ((1ULL << (1)) - 1)) != (1ULL)))) failed (0 == 2)

at shutdown.

Funny thing is, there were more logs that didn't make it to my log file, I'm adding a screen photo.
(Photo was taken one shutdown later and actually shows a different kernel panic?)

https://imgur.com/a/MyH4e2N

@tcaputi
Copy link
Contributor

tcaputi commented Aug 3, 2018

I'll take a look on Monday, but we haven't seen anything like this in a long time.... Were the filesystems created by zfs receive? If so what flags did you use?

@phryneas
Copy link
Author

phryneas commented Aug 4, 2018

No, they were created from scratch.

I was loosely following this wiki entry:

https://nixos.wiki/wiki/NixOS_on_ZFS#Encrypted_ZFS

@tcaputi
Copy link
Contributor

tcaputi commented Aug 6, 2018

@phryneas We are having a hard time getting NixOS up and running to do any tests. Do you have a nix configuration file that we can use to get started?

@phryneas
Copy link
Author

phryneas commented Aug 6, 2018

@tcaputi https://gist.github.com/phryneas/7ef5eb97e35d973eb1e7c2a8c9590153

I can also give you access to a completely set up hetzner cloud VM (including virtual console, snapshots, rollback etc.) - you'd have to contact me via mail for that. See my contact info at https://phryneas.de

PS: to clarify this: the hetzner cloud vm is set up quite similar to my system, but I haven't been able to reproduce that behaviour there yet :/

@tcaputi
Copy link
Contributor

tcaputi commented Aug 15, 2018

I think I have located the problem after spending some time with the affected machine. The problem only seems to effect one dataset, which is has the sync property set to disabled. I will see if I can reproduce the problem locally and get a fix implemented.

@tcaputi
Copy link
Contributor

tcaputi commented Aug 16, 2018

I am now able to reproduce the problem. It only happens with sync=disabled and when doing a readonly remount on the filesystem before umounting. I should be able to get a fix sometime today.

tcaputi pushed a commit to datto/zfs that referenced this issue Aug 16, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned. This patch simply enforces that all dirty
datasets should wait for a txg sync before umount completes.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
tcaputi pushed a commit to datto/zfs that referenced this issue Aug 16, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled. This
patch simply enforces that all dirty datasets should wait for a
txg sync before umount completes.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@tcaputi
Copy link
Contributor

tcaputi commented Aug 16, 2018

Pull request has been opened with a fix. @phryneas Would you mind checking if this solves your issue?

tcaputi pushed a commit to datto/zfs that referenced this issue Aug 16, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled. This
patch simply enforces that all dirty datasets should wait for a
txg sync before umount completes.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@phryneas
Copy link
Author

Will do this evening - thanks for all the work you've put into this so far!

@tcaputi
Copy link
Contributor

tcaputi commented Aug 17, 2018

I posted this on the PR, but it seems like the work isn't completely done.... Although my reproducer is fixed, your VM is still having problems. I will continue investigating tomorrow.

tcaputi pushed a commit to datto/zfs that referenced this issue Aug 20, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled.

While looking into fixes for this issue, it was discovered that
dsl_dataset_is_dirty() does not return B_TRUE when the dataset has
been removed from the txg dirty datasets list, but has not actually
been processed yet. Furthermore, the implementation is comletely
different from dmu_objset_is_dirty(), adding to the confusion.
Rather than relying on this function, this patch forces the umount
code path (and the remount readonly code path) to always perform a
txg sync on read-write datasets and removes the function altogether.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@tcaputi
Copy link
Contributor

tcaputi commented Aug 20, 2018

@phryneas I just pushed a commit to the PR that I think will fix the issue. I will have time to test it tomorrow, which should hopefully be the last time I need your VM's. Thank you for your patience in resolving this issue.

@phryneas
Copy link
Author

Cool! I'll give this a test on my bare metal machine (that initially had the problem) tomorrow :)

tcaputi pushed a commit to datto/zfs that referenced this issue Aug 21, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled.

While looking into fixes for this issue, it was discovered that
dsl_dataset_is_dirty() does not return B_TRUE when the dataset has
been removed from the txg dirty datasets list, but has not actually
been processed yet. Furthermore, the implementation is comletely
different from dmu_objset_is_dirty(), adding to the confusion.
Rather than relying on this function, this patch forces the umount
code path (and the remount readonly code path) to always perform a
txg sync on read-write datasets and removes the function altogether.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@tcaputi
Copy link
Contributor

tcaputi commented Aug 21, 2018

@phryneas I tested the patch against your VM 10 times without issue. Let me know if you have more problems on your bare metal machine.

@phryneas
Copy link
Author

I've given it a few reboots and so far it shut down nicely every time. Seems to work 👍

tcaputi pushed a commit to datto/zfs that referenced this issue Aug 24, 2018
Currently, when unmounting a filesystem, ZFS will only wait for
a txg sync if the dataset is dirty and not readonly. However, this
can be problematic in cases where a dataset is remounted readonly
immediately before being unmounted, which often happens when the
system is being shut down. Since encrypted datasets require that
all I/O is completed before the dataset is disowned, this issue
causes problems when write I/Os leak into the txgs after the
dataset is disowned, which can happen when sync=disabled.

While looking into fixes for this issue, it was discovered that
dsl_dataset_is_dirty() does not return B_TRUE when the dataset has
been removed from the txg dirty datasets list, but has not actually
been processed yet. Furthermore, the implementation is comletely
different from dmu_objset_is_dirty(), adding to the confusion.
Rather than relying on this function, this patch forces the umount
code path (and the remount readonly code path) to always perform a
txg sync on read-write datasets and removes the function altogether.

Fixes: openzfs#7753

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@phryneas
Copy link
Author

I'll take down the VMs, now that this is merged. Again, thanks for your time and hard work @tcaputi 🥇

@tcaputi
Copy link
Contributor

tcaputi commented Aug 27, 2018

Thanks for letting me use the VM's to debug the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants