Key Rotation
DRAFT
As of 0.5.14, each munged supports only a single cryptographic key. Transitioning to a new key may be desirable following a security incident involving a compromised host, or as a routine part of a local security policy. When transitioning an administrative realm to a new key, there will be a point in time where some hosts have received the new key while others are still using the old key. munged should be capable of transitioning to a new key without downtime. Support is needed to allow authentication between hosts to continue during a time interval where some hosts have only the old key while others have both the old and new keys. During this time interval, both keys are valid; after it has elapsed, the old key will be invalidated.
This document proposes a mechanism to transition to a new key within an administrative realm spanning multiple hosts. A new key is generated on one host, distributed to all hosts within the administrative realm, and added to each local munged daemon asynchronously. During this time, credentials will be encoded with both keys, and can be decoded with either key. The underlying key rotation on individual hosts can thereby occur at different times within this time interval. After the new key has been distributed to all hosts and added to each munged daemon, the old key can be automatically removed, after which credentials will be encoded with only the new key. See #19.
- Set the expiration time of key k0 such that k0:valid-end = current time t + 1 day. This specifies the end of the time interval where both the old and new keys are valid.
- Create key k1 such that k1:valid-start < k0:valid-end. k1:valid-start is presumed to be the current time, but it could also be a future time when the key would become valid; that would allow the time interval where both keys are valid to be very short.
- Signal munged via mungekey to begin the key rotation.
- Upon receipt of the signal, munged reloads its keyring.
- munged identifies k0 and k1 as being valid at the current time t.
- munged sets a timer to expire k0 at k0:valid-end, at which point k0 will be disabled.
- munged sets a timer to expire k1 at k1:valid-end, at which point k1 will be disabled.
- munged encodes new credentials during this time interval with both k0 and k1.
- At time k0:valid-end, munged will disable k0; new credentials will then be encoded with only k1.
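The validity-window bookkeeping in the steps above can be sketched as follows. This is a minimal illustration, not munged's implementation: the `Key` record, its field names, and the helpers `is_valid`, `encoding_keys`, and `begin_rotation` are all hypothetical, and the one-day overlap is taken from the example in the first step.

```python
from dataclasses import dataclass
from typing import List, Optional

DAY = 24 * 60 * 60  # seconds


@dataclass
class Key:
    """Hypothetical keyring entry; field names are assumptions."""
    kid: str                    # key ID
    valid_start: float          # epoch seconds
    valid_end: Optional[float]  # None means no expiration has been set


def is_valid(key: Key, now: float) -> bool:
    """A key is usable from valid-start up to (not including) valid-end."""
    if now < key.valid_start:
        return False
    return key.valid_end is None or now < key.valid_end


def encoding_keys(keyring: List[Key], now: float) -> List[Key]:
    """Keys that a new credential would be encoded with at time `now`."""
    return [k for k in keyring if is_valid(k, now)]


def begin_rotation(k0: Key, new_kid: str, now: float,
                   overlap: float = DAY) -> Key:
    """Expire k0 after `overlap` seconds and create k1, valid immediately."""
    k0.valid_end = now + overlap
    k1 = Key(kid=new_kid, valid_start=now, valid_end=None)
    # The two validity windows must overlap: k1:valid-start < k0:valid-end.
    assert k1.valid_start < k0.valid_end
    return k1
```

With `t` as the rotation time, `encoding_keys` returns both k0 and k1 for any time inside the overlap, and only k1 once k0:valid-end has been reached.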
- Copy the keyring to host h1.
rsync /etc/munge/munge.keyring h1:/etc/munge/munge.keyring
- Signal munged on h1 to reload its keyring.
ssh h1 mungekey --reload
- Obtain the list of keys and their corresponding key IDs (kid).
mungekey --list
- Export the old key k0 with its updated expiration time.
mungekey --export --id <k0:kid> --output munge.keyring.new
- Export the new key k1.
mungekey --export --id <k1:kid> --output munge.keyring.new --append
- Copy the new keys to host h1.
rsync munge.keyring.new h1:/etc/munge/munge.keyring.new
- Import the new keys.
ssh h1 mungekey --import /etc/munge/munge.keyring.new
- Signal munged on h1 to reload its keyring.
ssh h1 mungekey --reload
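For a realm with many hosts, the export, copy, import, and reload steps could be generated per host. The sketch below only builds the command lines and does not run anything. The mungekey options are the ones proposed in this document and are not assumed to exist in any released version; the function name `distribution_commands` and the example key IDs are hypothetical.

```python
import shlex
from typing import List


def distribution_commands(
    host: str,
    kids: List[str],
    export_path: str = "munge.keyring.new",
    remote_path: str = "/etc/munge/munge.keyring.new",
) -> List[List[str]]:
    """Build the command sequence to ship the given keys to one host."""
    cmds: List[List[str]] = []
    for i, kid in enumerate(kids):
        cmd = ["mungekey", "--export", "--id", kid, "--output", export_path]
        if i > 0:
            # Later keys are appended to the same export file.
            cmd.append("--append")
        cmds.append(cmd)
    cmds.append(["rsync", export_path, f"{host}:{remote_path}"])
    cmds.append(["ssh", host, "mungekey", "--import", remote_path])
    cmds.append(["ssh", host, "mungekey", "--reload"])
    return cmds


def render(cmds: List[List[str]]) -> List[str]:
    """Render each command as a shell-quoted string for review."""
    return [shlex.join(c) for c in cmds]


if __name__ == "__main__":
    # "K0ID" and "K1ID" stand in for the key IDs shown by `mungekey --list`.
    for line in render(distribution_commands("h1", ["K0ID", "K1ID"])):
        print(line)
```

Keeping command generation separate from execution allows the sequence to be reviewed, or fed to whatever parallel remote-execution tool the site already uses.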
- If the host has not updated its keyring, it will only have key k0. When decoding a new credential, the k0:data-encryption-key packet with k0 will successfully decode the credential. The k1:data-encryption-key packet is not needed. However, all data-encryption-key packets within the credential should be processed to mitigate potential timing attacks.
- If the host has updated its keyring, it will have both k0 and k1. When decoding a new credential, the k0:data-encryption-key packet with k0 will successfully decode the credential. The k1:data-encryption-key packet is not needed. However, all data-encryption-key packets within the credential should be processed to mitigate potential timing attacks.
- After k0:valid-end, k0 will be disabled. When decoding a new credential, the k0:data-encryption-key packet will fail to decode the credential, but the k1:data-encryption-key packet with k1 will succeed.
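The three decoding cases above can be illustrated with a toy model in which each credential carries one data-encryption-key packet per valid key. The packet layout and the HMAC-based wrapping below are stand-ins chosen for brevity, not munge's credential format or cipher, and all function names are hypothetical. The point of the sketch is the loop structure in `recover_dek`: every packet is processed, with no early exit after the first success. Python itself makes no constant-time guarantees.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional

_DUMMY_KEY = b"\x00" * 32  # used so unknown key IDs still cost the same work


def _keystream(key: bytes, nonce: bytes) -> bytes:
    return hmac.new(key, b"wrap" + nonce, hashlib.sha256).digest()


def _tag(key: bytes, nonce: bytes, wrapped: bytes) -> bytes:
    return hmac.new(key, b"tag" + nonce + wrapped, hashlib.sha256).digest()


def wrap_dek(key: bytes, dek: bytes) -> bytes:
    """Toy packet: 16-byte nonce, 32-byte wrapped DEK, 32-byte tag."""
    nonce = os.urandom(16)
    wrapped = bytes(a ^ b for a, b in zip(dek, _keystream(key, nonce)))
    return nonce + wrapped + _tag(key, nonce, wrapped)


def unwrap_dek(key: bytes, packet: bytes) -> Optional[bytes]:
    """Return the DEK if the packet authenticates under `key`, else None."""
    nonce, wrapped, tag = packet[:16], packet[16:48], packet[48:]
    ok = hmac.compare_digest(tag, _tag(key, nonce, wrapped))
    dek = bytes(a ^ b for a, b in zip(wrapped, _keystream(key, nonce)))
    return dek if ok else None


def encode_packets(dek: bytes, valid_keys: Dict[str, bytes]) -> Dict[str, bytes]:
    """One data-encryption-key packet per currently valid key."""
    return {kid: wrap_dek(key, dek) for kid, key in valid_keys.items()}


def recover_dek(packets: Dict[str, bytes],
                keyring: Dict[str, bytes]) -> Optional[bytes]:
    """Process every packet; keep the first DEK that authenticates."""
    result = None
    for kid, packet in packets.items():
        dek = unwrap_dek(keyring.get(kid, _DUMMY_KEY), packet)
        if dek is not None and result is None:
            result = dek
        # No break: remaining packets are still processed.
    return result
```

A host holding only k0, a host holding both keys, and a host on which k0 has been disabled all recover the same DEK from a credential encoded with both keys; a host holding neither key gets `None`.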
- systemd socket activation: Currently, munged reads the key at start-up, and then creates a thread work crew to process requests. Reloading the key without restarting the daemon requires pausing the work crew and/or protecting key access with a mutex, both of which may adversely affect transaction processing speed. Another option would be to restart the daemon, while minimizing the window between service shutdown and restart where requests could be dropped. On Linux, systemd socket activation [1, 2, 3] allows the service to be restarted while keeping around its socket, thereby ensuring no incoming requests will be dropped.
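Under socket activation, systemd holds the listening socket and passes it to the service as an inherited file descriptor: descriptors start at 3, `LISTEN_FDS` gives the count, and `LISTEN_PID` names the process they are intended for. The following sketch shows how a daemon could pick up such a socket and fall back to creating its own otherwise. It illustrates the protocol only and is not how munged is written; the function names are hypothetical.

```python
import os
import socket
from typing import List, Mapping

SD_LISTEN_FDS_START = 3  # first descriptor passed by systemd


def inherited_fds(environ: Mapping[str, str], pid: int) -> List[int]:
    """Return the descriptors passed by systemd, or [] if none apply."""
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # descriptors were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables unset or malformed: not socket-activated
    return list(range(SD_LISTEN_FDS_START,
                      SD_LISTEN_FDS_START + max(count, 0)))


def listening_socket(path: str) -> socket.socket:
    """Use the inherited socket if present; otherwise bind a new one."""
    fds = inherited_fds(os.environ, os.getpid())
    if fds:
        # Family and type are detected from the descriptor (Python 3.7+).
        return socket.socket(fileno=fds[0])
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen()
    return sock
```

Because the socket outlives the service process, connections that arrive during a restart queue in the listen backlog until the new process accepts them.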
- The overall size of the credential will increase since data-encryption-key can no longer be implicitly derived from the message-authentication-code.
- During the key rotation time interval where multiple keys are valid, the creation of an additional data-encryption-key packet will incur a small overhead in terms of encoding time, credential size, and decoding time. Consequently, the length of this interval should be kept as short as possible (but no shorter!) based on the time required to propagate the new key throughout the administrative realm.
- The mechanism for mungekey to signal munged is yet to be determined. The simplest would be to send a SIGHUP to the appropriate munged process. The pid could be determined from the advisory lock on the appropriate socket. But if SIGHUP is already being used for something conventional like re-opening the logfile, a separate signaling mechanism may be required. This could be accomplished with the addition of a RELOAD_KEYRING command to the client/server protocol.
- The addition of a protocol command would necessitate changes to libmunge to support having mungekey signal munged.
- munged needs to handle the case where no keys on its keyring are currently valid. The simplest action to take would be to exit, but it should instead remain running so it can be signaled by mungekey again to reload its keyring, etc. During this time, requests to encode or decode a credential would return an error to the client.
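The reload-on-signal behavior and the "no valid keys" case can be sketched together. This assumes the SIGHUP option discussed above, which is only one of the candidates. The `Daemon` class, `NoValidKeyError`, and the `(kid, valid_start, valid_end)` tuple layout are hypothetical. The handler only sets a flag; the keyring is actually reloaded at the next request, outside signal context.

```python
import signal
from typing import Callable, List, Tuple

KeyEntry = Tuple[str, float, float]  # (kid, valid_start, valid_end)


class NoValidKeyError(Exception):
    """Returned to the client while no key on the keyring is valid."""


class Daemon:
    def __init__(self, load_keyring: Callable[[], List[KeyEntry]]):
        self._load = load_keyring
        self.keyring = load_keyring()
        self.reload_requested = False

    def on_sighup(self, signum, frame) -> None:
        # Keep the handler trivial: just record that a reload is wanted.
        self.reload_requested = True

    def _maybe_reload(self) -> None:
        if self.reload_requested:
            self.reload_requested = False
            self.keyring = self._load()

    def valid_kids(self, now: float) -> List[str]:
        return [kid for kid, start, end in self.keyring if start <= now < end]

    def encode(self, now: float) -> List[str]:
        """Return the key IDs a credential would be encoded with."""
        self._maybe_reload()
        kids = self.valid_kids(now)
        if not kids:
            # Stay running and report an error instead of exiting.
            raise NoValidKeyError("no key on the keyring is currently valid")
        return kids


def install(daemon: Daemon) -> None:
    """Register the handler (SIGHUP is available on POSIX systems only)."""
    signal.signal(signal.SIGHUP, daemon.on_sighup)
```

A daemon started with an expired keyring raises `NoValidKeyError` on each request, and begins serving again as soon as a reload brings in a currently valid key.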