Seal the LUKS encryption keys for the EPHEMERAL partition using a TPM register that depends on confidential information from the STATE partition.
Description
Consider a scenario where an attacker has physical access to a cluster (as opposed to access to a lone disk removed from the cluster). I'm currently unsure whether using TPM-based LUKS encryption for both the STATE and EPHEMERAL partitions is sufficient to guarantee that user data on EPHEMERAL will not be readable by the attacker.
As I understand it, the boot process with SecureBoot and TPM-based LUKS encryption looks roughly as follows:
1. At system startup, UEFI is loaded. The UEFI firmware itself and its configuration are measured into PCR[0] and PCR[1].
2. UEFI measures the bootloader and its configuration into PCR[4] and PCR[5]. If the PCR measurements up to this point are as expected, the bootloader is loaded.
3. The bootloader (systemd-boot) runs. The UKI content (which includes the kernel, its config, the initrd, etc.) is measured into PCR[11].
4. If the PCR measurements up to this point are as expected, the kernel and initrd are loaded.
5. The kernel initializes hardware, etc., and may measure data into PCR[10].
6. If the PCR measurements up to this point are as expected, the TPM releases the encryption key for the STATE partition. The partition is decrypted and mounted.
7. The kernel executes init and Talos takes over.
8. Talos runs through its start-up phases, measuring progress into PCR[11], and can read the machine config from the now-decrypted STATE partition.
9. If the PCR measurements up to this point are as expected, the TPM releases the encryption key for EPHEMERAL. The partition is decrypted and mounted as /var.
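To make the "if the PCR measurements are as expected, the TPM releases the key" steps concrete, here is a minimal toy model in Python. It is not how Talos or a real TPM is implemented: `ToySealedKey`, the event names, and the single simulated register are all assumptions for illustration. It only shows the general extend-and-compare idea, namely that a PCR can only be extended, never set, and that a sealed key is released only when the register holds the expected value.

```python
import hashlib
from typing import List, Optional


def extend(pcr: bytes, event: bytes) -> bytes:
    """Model a TPM extend: new = SHA256(old || SHA256(event))."""
    return hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()


def replay(events: List[bytes]) -> bytes:
    """Replay a measurement log, starting from the all-zero reset value."""
    pcr = bytes(32)
    for event in events:
        pcr = extend(pcr, event)
    return pcr


class ToySealedKey:
    """Hypothetical stand-in for a secret sealed against one PCR value."""

    def __init__(self, key: bytes, expected_pcr: bytes) -> None:
        self._key = key
        self._expected_pcr = expected_pcr

    def unseal(self, current_pcr: bytes) -> Optional[bytes]:
        # Release the key only if the register matches the sealing policy.
        return self._key if current_pcr == self._expected_pcr else None


# Event names are made up for this sketch, not real Talos measurements.
GOOD_BOOT = [b"uki-image", b"boot-phase-a", b"boot-phase-b"]
sealed = ToySealedKey(b"state-luks-key", replay(GOOD_BOOT))
```

In this model, `sealed.unseal(replay(GOOD_BOOT))` returns the key, while replaying a log whose first event differs (for example a modified UKI) yields a different register value and `unseal` returns `None`.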
If I'm not mistaken, all PCR measurements up to step 9, where the TPM releases the encryption key for EPHEMERAL, depend either on the physical device (CPU, TPM chip, etc.) or on the Talos installer image currently used on the machine (kernel, bootloader, SecureBoot signature, etc.), and not on the identity of the machine or cluster. I believe this would enable an attacker to overwrite the STATE partition with their own Talos STATE partition (created using the same installer image) and boot the machine with a machine config under their control, while the TPM would still release the key for EPHEMERAL.
The issue, I think, is that during step 8 no machine-specific (and/or cluster-specific) information is measured into any of the PCRs, including PCR[11], which Talos currently uses for LUKS. Compare that to systemd-pcrmachine.service, which measures the machine-id, a confidential identifier generated on first boot, into PCR[15]; the LUKS encryption key for EPHEMERAL could then be bound to that register.
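The concern can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Talos code: `pcr_after_boot`, the event strings, and the secrets are hypothetical, and for brevity everything is folded into one simulated register (systemd-pcrmachine uses a separate PCR[15]). It shows that if only image- and phase-dependent data is measured, a victim's boot and an attacker's boot with a replaced STATE end in the same value, whereas also extending a confidential value from STATE makes them differ.

```python
import hashlib


def extend(pcr: bytes, event: bytes) -> bytes:
    """Model a TPM extend: new = SHA256(old || SHA256(event))."""
    return hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()


def pcr_after_boot(installer_image: bytes, state_secret: bytes,
                   measure_identity: bool) -> bytes:
    """Toy register value at the point where the EPHEMERAL key is released."""
    pcr = bytes(32)
    # Boot chain: identical on every machine using the same installer image.
    pcr = extend(pcr, installer_image)
    # Start-up phase markers: also independent of the machine's identity.
    pcr = extend(pcr, b"boot-phases")
    if measure_identity:
        # Proposed change: also measure something confidential from STATE.
        pcr = extend(pcr, state_secret)
    return pcr


IMAGE = b"installer-image"  # hypothetical placeholder

# Current behaviour as described in this issue: no identity is measured.
victim = pcr_after_boot(IMAGE, b"victim-state-secret", False)
attacker = pcr_after_boot(IMAGE, b"attacker-state-secret", False)

# With a machine-specific measurement, the two boots diverge.
victim_bound = pcr_after_boot(IMAGE, b"victim-state-secret", True)
attacker_bound = pcr_after_boot(IMAGE, b"attacker-state-secret", True)
```

Here `victim == attacker` holds, so a key sealed against that value would also be released after the STATE swap, while `victim_bound != attacker_bound`, so a key sealed against `victim_bound` would not.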
Workaround
If the above is correct, I think a workaround for the time being is using TPM-based LUKS encryption just for the STATE partition and passphrase-based encryption for EPHEMERAL. The relevant part of the boot process would look like this:
6. If the PCR measurements up to this point are as expected, the TPM releases the encryption key for the STATE partition. The partition is decrypted and mounted.
7. The kernel executes init and Talos takes over.
8. Talos runs through its start-up phases, measuring progress into PCR[11], and can read the machine config from the now-decrypted STATE partition.
9. The static passphrase is read from the machine config and used as the key for EPHEMERAL. The partition is decrypted and mounted as /var.
Overwriting the STATE partition to try and boot talos with a machine config under the control of the attacker would destroy the static passphrase and render EPHEMERAL inaccessible to the attacker.