-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support multiple initrd compression formats (non-gzip compression formats currently cause kernel crashes in worst case scenario) #4683
Comments
The most appropriate fix would seem to be to have dracut handle compression of initrd and the additional random data. Did we ever teach dracut how to do this as suggested in #1950 @jlebon @cgwalters ? Or is there a dracut ticket opened? |
Issue as a one-liner:
Working version (with gzip):
|
dracut-random.cpio.gz should go away, see dracutdevs/dracut#2331 |
I have a question @cgwalters , does the dracut-random.cpio.gz file provide value in it's current form? Could we just remove that from being appended to the initrd for now? It's not random anyway in it's current form... |
We added it for a reason:
The goal isn't that the data there is random...the goal is that it provides the |
In the short term, we can do two things here:
|
What we could do is:
Read the first 6 bytes of the initrd, if it is not "gzip" (like in the example shell script from
gunzip -c dracut-random.cpio.gz | whatever_is_in_compressor_variable Example of detection from lsinitrd:
What do you think? I think we might as well match it, if we have to detect the compression format, the easier part is matching it. |
Can you provide the exact error message from the kernel? |
I'll paste it here tomorrow, the exact one, off the top of my head it might have been "decoding failed"... That was the good case, getting the kernel log, in that case it would just ignore the unrecognisable gzip data and boot anyway.... In the worse case, you wouldn't even see that log, the kernel would freak out and panic tagging @masneyb for awareness... This is easy to reproduce by the way, just use a lz4 compressed initrd with ostree... This code assumes all initrd's are compressed with gzip, which is actually quite a dated compression algorithm, not sure why we are using that in RHEL or Fedora as default for initrd, seems like we are just using it because it's what's always been used. |
In the healthier case, this is the kernel log:
|
This is not working, tested or complete: https://github.com/coreos/rpm-ostree/pull/4702/files but this is what I'm thinking... It makes sense to shell out here as it's too much maintenance to integrate all the compression/decompression formats in rpm-ostree directly... Easier to just let the command line compressors/decompressors do their job. |
A hidden environment variable to allow callers to work around coreos#4683
I did #4704 which will allow you to opt-out of this for now. Now,
would be better but needs some grunt work to ensure we're detecting things correctly. |
Any arbitrary amount of padding between concatenated CPIOs is allowed: https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt However, some initramfs decompressors in the kernel expect a minimal amount of padding to detect EOF and allow the kernel outer loop to try out other decompressors for the next payload. This is the case for lz4. This upstream patch allows the lz4 decompressor to treat padding as EOF: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c484419efc09e7234c667aa72698cb79ba8d8ed But it assumes that the padding is at least 4 bytes. Since padding doesn't hurt regardless of the compression used, let's just unconditionally add it. Fixes coreos#4683.
The kernel should already support a series of CPIOs in different formats. The reproducer in #4683 (comment) only says that the user-space lz4 tool complains, but that's different code from what's in the kernel. For example, the latter needs to support arbitrary padding after the end of a compressed stream. (See the grammar at https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt). I thought at first this was a misalignment issue (see https://github.com/coreos/coreos-assembler/blob/eaaa1d7c5f235d365017c4b1d76a98833f9f411a/src/cmd-buildextend-live#L120-L133), though I don't think that applies here since we're appending a compressed CPIO. Searching for that error message lead me to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c484419efc09e7234c667aa72698cb79ba8d8ed which is quite informative. The patch fixes trailing padding handling for the kernel lz4 decompressor by checking if the next lz4 frame where the magic number is expected is all zeros. But for that logic to work, it needs at least 4 bytes of padding. So... I think all we need is to add some padding between the two? ...time passes... OK, this should work: #4705 |
Any arbitrary amount of padding between concatenated CPIOs is allowed: https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt However, some initramfs decompressors in the kernel expect a minimal amount of padding to detect EOF and allow the kernel outer loop to try out other decompressors for the next payload. This is the case for lz4. This upstream patch allows the lz4 decompressor to treat padding as EOF: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c484419efc09e7234c667aa72698cb79ba8d8ed But it assumes that the padding is at least 4 bytes. Since padding doesn't hurt regardless of the compression used, let's just unconditionally add it. Fixes coreos#4683. Co-authored-by: Colin Walters <walters@verbum.org>
Any arbitrary amount of padding between concatenated CPIOs is allowed: https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt However, some initramfs decompressors in the kernel expect a minimal amount of padding to detect EOF and allow the kernel outer loop to try out other decompressors for the next payload. This is the case for lz4. This upstream patch allows the lz4 decompressor to treat padding as EOF: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c484419efc09e7234c667aa72698cb79ba8d8ed But it assumes that the padding is at least 4 bytes. Since padding doesn't hurt regardless of the compression used, let's just unconditionally add it. Fixes coreos#4683. Co-authored-by: Colin Walters <walters@verbum.org>
Any arbitrary amount of padding between concatenated CPIOs is allowed: https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt However, some initramfs decompressors in the kernel expect a minimal amount of padding to detect EOF and allow the kernel outer loop to try out other decompressors for the next payload. This is the case for lz4. This upstream patch allows the lz4 decompressor to treat padding as EOF: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c484419efc09e7234c667aa72698cb79ba8d8ed But it assumes that the padding is at least 4 bytes. Since padding doesn't hurt regardless of the compression used, let's just unconditionally add it. Fixes #4683. Co-authored-by: Colin Walters <walters@verbum.org>
Host system details
Expected
with lz4 configured via:
/usr/lib/dracut/dracut.conf.d
of passed via parameter:
--lz4
--compress -l -2
System boots.
** Actual
with lz4 configured via:
/usr/lib/dracut/dracut.conf.d
of passed via parameter:
--lz4
--compress -l -2
Kernel either displays decoding error on decompression or panics.
You can also see the decoding error via userspace lz4 tool.
Steps forward
I scratched my head for a day or two on this issue. My interpretation is that it boils down to the appending of the:
dracut-random.cpio.gz
data towards the end of rpmostree_run_dracut.
We could do this in dracut, or compress all the cpio data ourselves in rpm-ostree (if we compress ourselves we don't have to dump dracut-random.cpio.gz, dracut-random.cpio.zstd, dracut-random.cpio.lz4, etc. just shell out and pipe to compression tool maybe?).
Encountered in Red Hat/CentOS Stream Automotive distributions as we changed the initramfs compression algorithm to lz4 (it boots faster from our testing) and were encountering non-booting systems, etc.
Not really related to the issue at all, but we started the effort to support more compression formats here:
#3745
The text was updated successfully, but these errors were encountered: