Fix multipath + LUKS and add a test #1728

Open
jlebon opened this issue Nov 22, 2023 · 4 comments
Labels: jira (for syncing to jira)


jlebon commented Nov 22, 2023

We don't have coverage for this right now but should.

The easiest approach would be to make it a kola test, though we could make it an external test if we add the ability to make the primary disk multipathed from the test metadata.


jlebon commented Nov 27, 2023

And indeed, this is currently broken. So yeah, we definitely need to add coverage for this.
xref: https://issues.redhat.com/browse/OCPBUGS-23343

jlebon added the jira label (for syncing to jira) on May 13, 2024
jlebon changed the title from "Add a test for multipath + LUKS" to "Fix multipath + LUKS and add a test" on May 13, 2024
jlebon transferred this issue from coreos/coreos-assembler on May 13, 2024
jlebon added a commit to jlebon/coreos-installer that referenced this issue May 21, 2024
In the multipath + LUKS case, `get_luks_uuid()` would incorrectly
skip over the multipath partition containing the LUKS header because
`is_dm_device()` returned true. The code eventually errors out when it
gets to the disks backing the multipath device.

The `is_dm_device()` check was added as part of 69b706d ("rootmap:
handle filesystems with LUKS integrity") to correctly handle the LUKS
integrity case in the Secure Execution path. There, the device right
under the LUKS device is another crypt device mapper device used for
integrity that we need to skip over.

Instead of generically checking for a device mapper target, check
specifically that it's a LUKS integrity target before deciding to skip.

Part of: coreos/fedora-coreos-tracker#1728

Co-authored-by: Aashish Radhakrishnan <aaradhak@redhat.com>
Co-authored-by: Gursewak Mangat <gursmangat@gmail.com>
Co-authored-by: Michael Nguyen <mnguyen@redhat.com>
Co-authored-by: Steven Presti <spresti@redhat.com>
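
The distinction drawn in the commit message above can be illustrated with a minimal shell sketch. It assumes the device-mapper target type is read from the third field of single-device `dmsetup table` output (`<start> <length> <target> <args>`) and that the LUKS integrity layer shows up as an `integrity` target. The function names and sample table lines are hypothetical, and this is not how coreos-installer (which is written in Rust) implements the check.

```shell
# Print the target type (third field) of one `dmsetup table <device>` line.
dm_target_type() {
    printf '%s\n' "$1" | awk '{ print $3 }'
}

# Succeed only for an integrity target. Any other device-mapper device
# (e.g. a partition on a multipath device) is NOT skipped, so it can still
# be inspected for a LUKS header.
should_skip_in_luks_lookup() {
    [ "$(dm_target_type "$1")" = "integrity" ]
}

# Illustrative (made-up) table lines:
for line in \
    '0 2064384 integrity 8:3 0 28 J 0' \
    '0 2097152 linear 253:0 2048'
do
    if should_skip_in_luks_lookup "$line"; then
        echo "skip: $(dm_target_type "$line")"
    else
        echo "inspect: $(dm_target_type "$line")"
    fi
done
```

The point of the sketch is only the narrower predicate: a generic "is this a device-mapper device?" test would also match the multipath partition, which is the behavior the commit message describes as the bug.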

jlebon commented May 21, 2024

coreos/coreos-installer#1473 should fix the underlying issue, but we should still add a test case in f-c-c for this.


jlebon commented May 23, 2024

To test this, use only the karg rd.multipath=default, i.e. without root=/dev/disk/by-label/dm-mpath-root rw, since the rootfs in this case is not directly on the multipath device but on the encrypted LUKS device. This in turn makes rdcore rootmap kick in as usual, which adds the necessary root, rw, and rd.luks.name kargs. So all together:

$ cat tmp.bu
variant: fcos
version: 1.4.0
boot_device:
  luks:
    tpm2: true
$ cosa run --qemu-multipath --kargs 'rd.multipath=default' -c -m 4096 -B tmp.bu
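
Once the machine is up, one way to confirm that the expected kargs are in place is to scan the kernel command line. The following is a minimal sketch, assuming it runs on a boot where `rd.multipath=default` and the kargs added by rdcore rootmap are expected to be present. The helper names are hypothetical.

```shell
# Succeed if the command line in $1 contains an argument that is exactly $2
# or starts with "$2=".
has_karg() {
    # $1 is deliberately unquoted so it splits into individual arguments.
    for arg in $1; do
        case "$arg" in
            "$2"|"$2"=*) return 0 ;;
        esac
    done
    return 1
}

# Check for the multipath karg plus the kargs rootmap is described as adding.
check_mpath_luks_kargs() {
    for karg in rd.multipath root rw rd.luks.name; do
        if ! has_karg "$1" "$karg"; then
            echo "missing karg: $karg" >&2
            return 1
        fi
    done
}

# On the booted system:
#   check_mpath_luks_kargs "$(cat /proc/cmdline)"
```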

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jun 14, 2024
This service was needed in the past to make multipath + LUKS work well.
The underlying bug seems to have been fixed now as I can no longer
reproduce it in Fedora or RHEL 9.4. Conveniently, this also works around
a bug in which that service would sometimes hang because of a bug[[1]]
in systemd which is still outstanding in RHEL 9.

Drop it.

We don't have any tests for this yet. Multipath + LUKS currently doesn't
work but should be fixed soon[[2]]. A test will be added as part of
that work.

[1]: systemd/systemd#29863
[2]: coreos/fedora-coreos-tracker#1728

Fixes: https://issues.redhat.com/browse/OCPBUGS-29325
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jun 17, 2024
This service was needed in the past to make multipath + LUKS work well.
The underlying bug seems to have been fixed now as I can no longer
reproduce it in Fedora or RHEL 9.4. Conveniently, this also works around
a bug in which that service would sometimes hang because of a bug[[1]]
in systemd which is still outstanding in RHEL 9.

Drop it.

We don't have any tests for this yet. Multipath + LUKS currently doesn't
work but should be fixed soon[[2]]. A test will be added as part of
that work.

[1]: systemd/systemd#29863
[2]: coreos/fedora-coreos-tracker#1728

Fixes: https://issues.redhat.com/browse/OCPBUGS-29325
(cherry picked from commit cc2e865)

jlebon commented Aug 21, 2024

A new enough coreos-installer has now landed in FCOS. With coreos/coreos-assembler#3822, it's now possible to write an external test for this that uses primaryDisk: ":mpath" and then sets up LUKS. It probably makes sense to add it in https://github.com/coreos/fedora-coreos-config/tree/testing-devel/tests/kola/root-reprovision/luks alongside the other LUKS tests.
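
A rough sketch of what such an external test script could look like follows. Only `primaryDisk: ":mpath"` comes from the comment above. The other metadata keys (`platforms`, `appendKernelArgs`, `minMemory`) and their values are assumptions modeled on the `cosa run` invocation earlier in this thread, and the helper names are hypothetical. The LUKS setup itself would come from a Butane config shipped alongside the script (for example the `boot_device.luks` snippet shown earlier), which is not included here.

```shell
#!/bin/bash
## kola:
##   platforms: qemu
##   primaryDisk: ":mpath"
##   appendKernelArgs: "rd.multipath=default"
##   minMemory: 4096

# Succeed if the newline-separated list of lsblk device types in $1
# contains the type $2.
has_type() {
    printf '%s\n' "$1" | awk -v t="$2" '$1 == t { found = 1 } END { exit !found }'
}

# The root device stack should contain both a LUKS (crypt) layer and a
# multipath (mpath) layer.
root_stack_ok() {
    has_type "$1" crypt && has_type "$1" mpath
}

main() {
    src=$(findmnt -nvr -o SOURCE /sysroot)
    # -s walks from the root device down to the disks backing it.
    types=$(lsblk -nls -o TYPE "$src")
    if ! root_stack_ok "$types"; then
        echo "rootfs is not on LUKS on top of multipath" >&2
        exit 1
    fi
    echo "ok: rootfs is on LUKS on top of multipath"
}

# main   # uncomment when running as a kola external test
```

Keeping the check as a pure function over the `lsblk` output makes it easy to exercise outside a VM. The real LUKS tests in that directory may share common helpers that this sketch does not attempt to reproduce.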
