nixos/zfs: introduce option to control hibernation #171680
You could add a note to the docstring (
Note that I believe that by now ZFS can be hibernated, as long as you do not import any pools before resuming. Until we can make assurances to that effect in NixOS, we probably should have something like this though.
Yeah, I got this impression too from reading the original ZFS issue mentioned in #106093, but I did not test further: I am using ZFS on both my desktop and my laptop, and I use neither suspend nor hibernation. As soon as I have some free time, I will revisit this issue and reopen the PR for further review. For now I am just following the notes from https://nixos.wiki/wiki/ZFS
I have this option explicitly set precisely for the same reason.
Can we please get this merged? Every time I used the hibernate functionality, intentionally or unintentionally, I got ZFS pool corruption and could not mount the pool at boot. Running
Hey there! This is mostly stale because I did not have enough time to research more, which is why it is still a draft :| Furthermore, the corruption seems to occur mostly when hibernating with swap on ZFS, see openzfs/zfs#12842. I will check if I can update this soon (maybe this week?). We should also decide whether to take this safety step for the user or just add a more prominent note about the possible problem.
OK, I did some quick research, and the result is "better safe than sorry". ZFS corruption seems to be caused either by using swap on ZFS (which is not a good idea, see openzfs/zfs#7734) or by hibernating and importing a pool before resume/swap (openzfs/zfs#12842 (comment) openzfs/zfs#12842 (comment)). I do not think that my first implementation is a good way to "fix" it, because it would require users to overwrite their
This would make the explicit
Indeed. For now, I think that's the nice and easy way, since we allow the user to opt in if desired. My first implementation would probably need a
Sorry, I lost track of this and have since switched to another system.
I think "better safe than sorry" is the correct approach.
@ofborg test zfs
I did a few runs locally and found no problems.
Thank you.
Oh, I might have inadvertently fixed swap with ZFS on NixOS then. I've patched my local nixpkgs to do the hibernation resume before doing any zpool imports (which happens in [...]). My motivation was not having to enter my ZFS encryption password if I'm restoring from swap anyway (notably, I'm not using swap encryption yet, so ZFS encryption doesn't add much right now). But as far as I can see, that fixes this issue too, right?
# https://github.com/openzfs/zfs/issues/260
# https://github.com/openzfs/zfs/issues/12842
# https://github.com/NixOS/nixpkgs/issues/106093
kernelParams = lib.optionals (!config.boot.zfs.allowHibernation) [ "nohibernate" ];
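For context, here is a minimal sketch of how an option like this might be declared and wired to the kernel parameter. The option name boot.zfs.allowHibernation and the kernelParams expression come from the diff above. The standalone module layout, the description text, and the default of false (inferred from "disabling hibernation by default" in the PR description) are assumptions for illustration, not the PR's actual code.

```nix
# Hypothetical standalone module, for illustration only. The real change lives
# inside the nixos/zfs module and only applies when ZFS support is enabled.
{ config, lib, ... }:
{
  options.boot.zfs.allowHibernation = lib.mkOption {
    type = lib.types.bool;
    # Assumed default: hibernation stays off unless the user opts in.
    default = false;
    # Illustrative wording, not the PR's docstring.
    description = "Whether to allow hibernation on a system that uses ZFS.";
  };

  config = {
    # The "nohibernate" kernel parameter disables hibernation and resume.
    boot.kernelParams =
      lib.optionals (!config.boot.zfs.allowHibernation) [ "nohibernate" ];
  };
}
```

Because the parameter is derived from a boolean option rather than hard-coded, users can opt back in with a single line instead of having to override boot.kernelParams themselves.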
I don't think this is the right workaround, because it lets people set swapDevices without it taking any effect. I think it would be better to have an assertion like this:
{
  assertions = [{
    assertion = (config.boot.zfs.enable && config.swapDevices != []) -> config.boot.zfs.allowHibernation;
    message = "Using swap with ZFS can corrupt data. Either disable swap by removing all `swapDevices` entries, or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];
}
We should also have another assertion which throws when ZFS is enabled and you have a swap file on a ZFS filesystem
Using swap with ZFS can corrupt data.
Does this apply even for a swap partition that's not on a ZVOL or otherwise on ZFS?
I don't think this is the right workaround, because it lets people set swapDevices without it taking any effect. I think it would be better to have an assertion like this:
{
  assertions = [{
    assertion = (config.boot.zfs.enable && config.swapDevices != []) -> config.boot.zfs.allowHibernation;
    message = "Using swap with ZFS can corrupt data. Either disable swap by removing all `swapDevices` entries, or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];
}
I just got bit by this. I'm only using ZFS on a block file in an EXT4 filesystem so I'm not concerned with hibernating ZFS filesystems. I think this suggestion would aid discoverability.
I think there's a misunderstanding in @infinisil's year-old comment. The issue is not that you cannot have swap devices while using ZFS; you totally can (though, for other unrelated reasons, those cannot be stored on ZFS). The issue is that you cannot hibernate while ZFS pools are imported, period, no matter where the swap is stored, or else those pools have a chance of being irreparably corrupted.
Point being: ZFS + swap on non-ZFS: Good! ZFS + hibernation in any case whatsoever: Bad! Hence what we have now is the right fix. If you want to ignore it, well, it is configurable.
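As a rough illustration of the two cases, a configuration fragment might look like the sketch below. The swap device path is a placeholder and the layout is an assumption; only the swapDevices and boot.zfs.allowHibernation option names come from this thread.

```nix
# Illustrative configuration.nix fragment; the device path is a placeholder.
{ ... }:
{
  # Swap on a plain partition (not a ZVOL or a file on ZFS) is fine with ZFS.
  swapDevices = [ { device = "/dev/disk/by-label/swap"; } ];

  # Explicit opt-in to hibernation despite the corruption risk discussed here.
  # Leaving this unset keeps "nohibernate" on the kernel command line.
  boot.zfs.allowHibernation = true;
}
```

Without the second setting, swap still works as ordinary swap; only hibernation is disabled.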
The issue is that you cannot hibernate while ZFS pools are imported, period, no matter where the swap is stored, or else those pools have a chance of being irreparably corrupted.
Makes sense! Though, the discoverability for hibernation being disabled is not great. I only realized ZFS was the problem after reading this discourse post. I'm really not using ZFS in anger, so I'm happy to risk data loss for being able to hibernate my laptop, and I think @infinisil's suggestion would make this clear to new users.
infinisil's suggestion would disable swap entirely with ZFS, which is certainly not desirable. We don't really have a way to know at eval-time whether a user intends to use hibernation, so it's difficult to warn about.
Actually, we should probably have an assertion that boot.resumeDevice cannot be set while allowHibernation is false, and that would cover some use cases. But NixOS has auto-detection of resume devices in stage 1 (if you're using EFI and scripted initrd exactly because... reasons), so this is very often not set.
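A sketch of what such an assertion could look like, under stated assumptions: that an unset boot.resumeDevice is the empty string, and that a boolean such as boot.zfs.enabled tells us whether ZFS support is on. The zfsEnabled binding and the message text are hypothetical. As noted above, this would not catch resume devices that are auto-detected in stage 1.

```nix
{ config, ... }:
let
  # Assumption: a boolean that is true when ZFS support is enabled.
  # Falls back to false if the attribute does not exist.
  zfsEnabled = config.boot.zfs.enabled or false;
in
{
  assertions = [{
    # Assumption: boot.resumeDevice is "" when the user has not set it.
    assertion =
      (zfsEnabled && config.boot.resumeDevice != "")
      -> config.boot.zfs.allowHibernation;
    # Hypothetical wording.
    message = ''
      boot.resumeDevice is set, but hibernation is disabled because ZFS is in
      use. Either unset boot.resumeDevice or, if you accept the risk, enable
      boot.zfs.allowHibernation.
    '';
  }];
}
```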
infinisil's suggestion would disable swap entirely with ZFS, which is certainly not desirable. We don't really have a way to know at eval-time whether a user intends to use hibernation, so it's difficult to warn about.
Hmm, how about:
{
  assertions = [{
    assertion = (config.boot.zfs.enable && ! (builtins.elem "nohibernate" config.boot.kernelParams)) -> config.boot.zfs.allowHibernation;
    message = "Using hibernation with ZFS can corrupt data. Either disable hibernation with `boot.kernelParams = [ \"nohibernate\" ]` or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];
}
As also hinted at in #171680 (comment), I've been using ZFS with swap (on a separate partition) with hibernation for a while now and nothing has corrupted 🤷. I am using a patch on top of Nixpkgs though: infinisil@448fe5d. This makes swap restore before ZFS imports.
@infinisil that patch helps. Certainly, it should be quite rare with that patch. But "quite a rare, but known, source of data loss" is not something we want people using by accident.
@RyanGibb That assertion doesn't really do anything though? allowHibernation implies that nohibernate is not in the cmdline. Unless the user manually added it themselves and enabled allowHibernation. While that would be a nonsensical thing for them to do, it would at least be safe, and nothing to worry about, and means that they explicitly added both configuration lines.
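The point can be checked with a small standalone Nix expression. This is a sketch that assumes ZFS is enabled and that kernelParams contains only what the module in this PR adds; the helper names kernelParamsFor and assertionHolds are hypothetical.

```nix
let
  # Kernel parameters as derived from allowHibernation, with nothing added
  # by the user (assumption).
  kernelParamsFor = allow: if allow then [ ] else [ "nohibernate" ];

  # The proposed assertion, with the "ZFS is enabled" part taken as true.
  assertionHolds = allow:
    (! (builtins.elem "nohibernate" (kernelParamsFor allow))) -> allow;
in
# Evaluate for allowHibernation = true and allowHibernation = false.
map assertionHolds [ true false ]
# evaluates to [ true true ]
```

With allowHibernation = false the parameter is present, so the left side of the implication is false; with allowHibernation = true the right side is true. Either way the assertion passes, so it can never fire in this setup.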
Related to: openzfs/zfs#12842 NixOS/nixpkgs#171680 NixOS/nixpkgs#203524 Signed-off-by: Jakub Sokołowski <jakub@status.im>
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/hibernate-doesnt-work-anymore/24673/12
Description of changes
This should close #106093.
Since ZFS is not hibernation-friendly at the moment (see openzfs/zfs#260), I think disabling hibernation by default is a safe measure to prevent data loss.
Kernel documentation for this parameter: https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3559.
Since this is a breaking change, I have added a note to the release notes for 22.11.