DLPX-76794 recovery_sync dropbear failure causes delphix-bootcount service to fail on Ubuntu 20.04 #32
The recovery_sync process copies the ssh keys from the main environment into the boot environment and converts them to a format that the dropbear ssh server understands, by using dropbearconvert. On Ubuntu 20.04 this process is failing.

@pcd1193182 discovered that this was caused by all SSH keys now being generated with the proprietary OPENSSH header (-----BEGIN OPENSSH PRIVATE KEY-----) instead of a header that is specific to the given key's type (for example, -----BEGIN RSA PRIVATE KEY-----).
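To make the header difference concrete, here is a minimal Python sketch that inspects the first line of a private key and reports whether it carries the OPENSSH header. The function names are hypothetical and are not part of recovery_sync; the key bodies in the demo are placeholders, not real key material.

```python
# Header written by newer ssh-keygen versions for every key type.
OPENSSH_HEADER = "-----BEGIN OPENSSH PRIVATE KEY-----"


def private_key_header(key_text: str) -> str:
    """Return the first non-empty line of a private key file, stripped."""
    for line in key_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def uses_openssh_header(key_text: str) -> bool:
    """True when the key starts with the generic OPENSSH header."""
    return private_key_header(key_text) == OPENSSH_HEADER


if __name__ == "__main__":
    # Placeholder bodies: only the header line matters for this check.
    new_style = OPENSSH_HEADER + "\n(body elided)\n-----END OPENSSH PRIVATE KEY-----\n"
    old_style = "-----BEGIN RSA PRIVATE KEY-----\n(body elided)\n-----END RSA PRIVATE KEY-----\n"
    print("new style uses OPENSSH header:", uses_openssh_header(new_style))
    print("old style uses OPENSSH header:", uses_openssh_header(old_style))
```

A check like this could be used to decide whether a key needs any extra handling before it is handed to dropbearconvert.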
By default ssh-keygen generates a key in the newer RFC4716 format, which was used on both Ubuntu 18.04 and Ubuntu 20.04; however, on Ubuntu 20.04 the header has changed. The dropbearconvert utility on Ubuntu 20.04 doesn't know how to handle keys with an OPENSSH header, although that was fixed in the Ubuntu 20.10 package (see mkj/dropbear#91). One workaround is to instead use a key that is generated in PEM format, which is always accepted by dropbearconvert.

Therefore there are 3 ways that this problem could be fixed.
Option 3 would probably have been the cleanest one; however, we do not have an easy way to pull a package from a different archive, so it is the least practical one.
Option 1: while it's unclear what the downsides of using the older key format on the engine are, forcing it to use an old format just to satisfy the dropbearconvert utility sounds like a change that could have a bigger impact.
Option 2, while introducing an extra step in the recovery_sync process, seems to have the least impact on the system while remaining simpler to implement than option 3. Hence I've chosen option 2.
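As a rough illustration of what an extra conversion step could look like, the Python sketch below builds the command sequence for one key. It assumes that option 2 amounts to rewriting a copy of the key in PEM format with ssh-keygen before calling dropbearconvert; the actual recovery_sync change may differ. The function name and all paths are hypothetical, and the commands are only printed, not executed.

```python
import shlex
from typing import List


def build_conversion_commands(
    source_key: str, work_copy: str, dropbear_key: str
) -> List[List[str]]:
    """Return the argv lists for converting one unencrypted private key.

    Assumed steps (a sketch, not the actual recovery_sync logic):
      1. copy the key so the original in the main environment is untouched,
      2. rewrite the copy in PEM format (ssh-keygen -p -m PEM),
      3. convert the PEM copy to dropbear's format.
    """
    return [
        ["cp", "-p", source_key, work_copy],
        # -p rewrites the key file in place; empty -P/-N keep it unencrypted.
        ["ssh-keygen", "-p", "-m", "PEM", "-P", "", "-N", "", "-f", work_copy],
        ["dropbearconvert", "openssh", "dropbear", work_copy, dropbear_key],
    ]


if __name__ == "__main__":
    # Hypothetical paths, chosen only for illustration.
    commands = build_conversion_commands(
        "/example/main/ssh_host_rsa_key",
        "/example/tmp/ssh_host_rsa_key.pem",
        "/example/boot/dropbear_rsa_host_key",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Working on a copy keeps the key in the main environment in its default format, which matches the goal of leaving the rest of the system unaffected.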
Testing
ssh-keygen -l
.