btrfs raid not ready but systemd tries to mount it anyway #947

Closed
danieljrmay opened this issue Oct 16, 2020 · 11 comments · Fixed by #1131
Labels
bug Our bugs
@danieljrmay

Describe the bug
I have a machine with an EXT4 / partition and a 10-HDD BTRFS RAID1 volume, which is mounted at /srv via /etc/fstab. Of the 10 HDDs that make up the BTRFS volume, 4 were connected directly to the motherboard and 6 were connected via an HBA card. In this configuration the BTRFS volume would fail to mount automatically at boot time.

I initially reported this issue at ask.fedoraproject.org:

https://ask.fedoraproject.org/t/btrfs-volume-mount-fails-at-boot-but-works-once-system-is-up/9593

The same issue was then reported to the systemd-devel mailing list:

https://lists.freedesktop.org/archives/systemd-devel/2020-October/045399.html

Where it has been suggested that the issue may be with dracut:

https://lists.freedesktop.org/archives/systemd-devel/2020-October/045440.html

Hence I am reporting this here.

Distribution used
Fedora 32.

Dracut version

$ sudo rpm -q dracut
dracut-050-61.git20200529.fc32.x86_64

Init system

$ sudo rpm -q systemd
systemd-245.8-2.fc32.x86_64

To Reproduce
This may be due to a race condition, so it might be difficult to reproduce. However, a setup similar to mine may well have the same problem: a non-BTRFS / filesystem, plus a multi-device BTRFS filesystem whose devices are split between direct motherboard connections and an HBA card and which is configured to mount in /etc/fstab.

The BTRFS volume mounts successfully at boot time when:

  • udev.log_priority=debug systemd systemd.log_level=debug is added to the grub boot parameters.
  • All 10 HDDs are connected to the HBA card (with normal grub boot parameters).

Expected behavior
The BTRFS volume should mount automatically during the boot process.

Additional context
This is a Fedora 32 machine which I think was upgraded via dnf from Fedora 31 some time ago. The BTRFS volume was added recently when migrating storage from an mdadm-based RAID6 XFS filesystem.

I hope this makes some kind of sense 😉

@danieljrmay danieljrmay added the bug Our bugs label Oct 16, 2020
@cmurf

cmurf commented Oct 16, 2020

I agree with Lennart. Either include the btrfs udev rule unconditionally or if fstab contains any btrfs reference. It's definitely not in @danieljrmay 's initramfs when inspected with lsinitrd.

@danieljrmay
Author

> It's definitely not in @danieljrmay 's initramfs when inspected with lsinitrd.

I should have mentioned this in the original report:

$ sudo uname -a
Linux oxygen 5.8.13-200.fc32.x86_64 #1 SMP Thu Oct 1 21:49:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ sudo lsinitrd /boot/initramfs-5.8.13-200.fc32.x86_64.img | grep btrfs
$ # No mentions of btrfs found

$ sudo dracut -f

$ sudo lsinitrd /boot/initramfs-5.8.13-200.fc32.x86_64.img | grep btrfs
$ # No mentions of btrfs found
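
For anyone repeating this check, here is a minimal sketch of a helper that reads an lsinitrd listing on stdin and looks for the btrfs udev rule. It assumes the rule is installed as usr/lib/udev/rules.d/64-btrfs.rules inside the image; the function name has_btrfs_rule is made up for illustration.

```shell
#!/bin/sh
# Sketch only: succeeds (exit 0) if the lsinitrd listing on stdin
# mentions the btrfs udev rule, fails (exit 1) otherwise.
# Assumption: the rule lives at usr/lib/udev/rules.d/64-btrfs.rules.
has_btrfs_rule() {
    grep -q 'usr/lib/udev/rules\.d/64-btrfs\.rules'
}

# On a real system (image path as used in this comment):
# sudo lsinitrd /boot/initramfs-5.8.13-200.fc32.x86_64.img | has_btrfs_rule \
#     && echo "btrfs rule present" || echo "btrfs rule missing"
```

A check on the exact rule path is a bit stricter than a bare grep for btrfs, which would also match the kernel module or the btrfs binary.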

@stale

stale bot commented Dec 16, 2020

This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions.

@stale stale bot added the stale communication is stuck label Dec 16, 2020
@cmurf

cmurf commented Dec 17, 2020

dracut-051 doesn't have a fix for this so I'm commenting to make sure it doesn't get closed.

@stale stale bot removed the stale communication is stuck label Dec 17, 2020
@stale

stale bot commented Jan 16, 2021

This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions.

@stale stale bot added the stale communication is stuck label Jan 16, 2021
@cmurf

cmurf commented Jan 17, 2021

Buried in the long ask.fpo thread referenced at the top, but not mentioned in this issue: rebuilding the initramfs with `dracut -f --add btrfs` fixes this problem. This results in:

$ sudo lsinitrd /boot/initramfs-5.10.7-200.fc33.x86_64.img | grep btrfs
Arguments: -f --add 'btrfs'
btrfs
-rw-r--r--   1 root     root           20 Oct 23 09:30 etc/cmdline.d/00-btrfs.conf
-rw-r--r--   1 root     root          616 Oct 23 09:30 usr/lib/udev/rules.d/64-btrfs.rules
-rwxr-xr-x   1 root     root      1014296 Oct 23 09:30 usr/sbin/btrfs
lrwxrwxrwx   1 root     root            5 Oct 23 09:30 usr/sbin/btrfsck -> btrfs
-rwxr-xr-x   1 root     root         1189 Oct 23 07:25 usr/sbin/fsck.btrfs
$ 
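
To keep the module across future initramfs rebuilds, the same option can probably be put in a dracut configuration drop-in instead of being passed by hand each time. A rough sketch follows; add_dracutmodules is the dracut.conf setting that corresponds to --add, while the file name 90-btrfs.conf and the function name write_btrfs_dropin are arbitrary choices for this example.

```shell
#!/bin/sh
# Sketch only: write a dracut drop-in that adds the btrfs module on
# every rebuild. The file name 90-btrfs.conf is an illustrative choice.
write_btrfs_dropin() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    # Intended to have the same effect as "--add btrfs" on the command line.
    printf 'add_dracutmodules+=" btrfs "\n' > "$conf_dir/90-btrfs.conf"
}

# On a real system (as root), followed by a rebuild:
# write_btrfs_dropin /etc/dracut.conf.d && dracut -f
```

The spaces inside the quotes matter, because dracut concatenates these values from several config files.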

@stale stale bot removed the stale communication is stuck label Jan 17, 2021
@haraldh
Collaborator

haraldh commented Feb 25, 2021

But your rootfs != btrfs... Shouldn't it bring up the btrfs after the switch root just fine?
Your initramfs did not even have the btrfs kernel module. So in theory everything should just work in hotplug mode in the real root... except if your real root does not wait for the btrfs device to be complete before it tries to mount it to /srv.

@haraldh
Collaborator

haraldh commented Feb 25, 2021

To be clear, dracut must not be involved for /srv if the btrfs devices are totally different from the rootfs device.

That means, initramfs boot must not be stalled for e.g. big SAN arrays mounted to /srv.

@haraldh haraldh closed this as completed Feb 25, 2021
@haraldh
Collaborator

haraldh commented Feb 25, 2021

ok, I stand corrected:

The btrfs udev rule file appears to be missing in the initrd. The
block devices with the btrfs file systems on them will thus be marked
ready in systemd instantly instead of being delayed until all other
devices of the same btrfs fs have shown up in udev too.
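
One way to see this on a running system is to look at the udev properties of a member device. The sketch below assumes the btrfs rule records its result in the ID_BTRFS_READY property (0 while devices are still missing, 1 once the filesystem is complete); the function name btrfs_ready_state and the device name /dev/sdX are placeholders.

```shell
#!/bin/sh
# Sketch only: reads "KEY=VALUE" udev properties on stdin and reports
# what the btrfs rule recorded for the device.
# Assumption: the rule sets ID_BTRFS_READY to 0 or 1.
btrfs_ready_state() {
    awk -F= '
        $1 == "ID_BTRFS_READY" { state = $2; seen = 1 }
        END {
            if (!seen)             print "unknown"
            else if (state == "1") print "ready"
            else                   print "not-ready"
        }'
}

# On a real system (replace /dev/sdX with a member device):
# udevadm info --query=property --name=/dev/sdX | btrfs_ready_state
```

An "unknown" result would suggest the rule never ran for that device, which matches the missing-rule case described here.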

@haraldh
Collaborator

haraldh commented Feb 25, 2021

That is really unfortunate.

@haraldh
Collaborator

haraldh commented Feb 25, 2021

This needs a systemd patch and a dracut patch

@haraldh haraldh reopened this Feb 25, 2021
@haraldh haraldh added this to the dracut-054 milestone Feb 25, 2021
haraldh added a commit to haraldh/dracut that referenced this issue Mar 3, 2021
Install `64-btrfs.rules` unconditionally to mark btrfs devices ready or
not.

In case no `btrfs` kernel module is available in the initramfs, the
device should not be ready.

Depends on: systemd/systemd#18802

Fixes: dracutdevs#947
haraldh added a commit that referenced this issue Mar 13, 2021
Install `64-btrfs.rules` unconditionally to mark btrfs devices ready or
not.

In case no `btrfs` kernel module is available in the initramfs, the
device should not be ready.

Depends on: systemd/systemd#18802

Fixes: #947