7z fails to extract initramfs #640

Closed
NiklasGollenstede opened this issue Aug 11, 2023 · 8 comments
@NiklasGollenstede

NiklasGollenstede commented Aug 11, 2023

Describe the bug
I was browsing Nix projects when I found this one and thought "that's cool, should be useful to inspect initrds". Unfortunately, that currently does not really work.

The below command tries to extract a zstd-compressed NixOS initramfs with prepended Intel CPU microcode, but:

  • 7z fails to extract the main CPIO archive, but unblob still exits with 0/success.
  • The prepended microcode CPIO archive is reported to be found, but then missing in the output.
  • There are unknown chunks that are really just zero-padding.

To Reproduce

 nix run github:onekey-sec/unblob/23.5.31 -- $( nix build --no-link --print-out-paths github:srid/nixos-config/1a6879bbd1c0f87f67533a7b91bc438e042b3bf6#nixosConfigurations.actual.config.system.build.initialRamdisk )/initrd
Command output
2023-08-10 23:58.54 [info     ] Start processing file          file=/nix/store/msmx1ylsyhxk6hx3p4nz39vqi2gkzn3j-initrd-linux-6.1.43/initrd pid=3806009
2023-08-10 23:58.54 [warning  ] Found unknown Chunks           chunks=[0x6f1200-0x6f1800] pid=3806014
2023-08-10 23:58.54 [info     ] Extracting unknown chunk       chunk=0x6f1200-0x6f1800 path=initrd_extract/7279104-7280640.unknown pid=3806014
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x6f1800-0x12284cb path=initrd_extract/7280640-19039435.zstd pid=3806014
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x0-0x6f1200 path=initrd_extract/0-7279104.cpio_portable_ascii pid=3806014
2023-08-10 23:58.54 [warning  ] Found unknown Chunks           chunks=[0x19b3000-0x19b4000] pid=3806016
2023-08-10 23:58.54 [info     ] Extracting unknown chunk       chunk=0x19b3000-0x19b4000 path=initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/26947584-26951680.unknown pid=3806016
2023-08-10 23:58.54 [info     ] Extracting valid chunk         chunk=0x0-0x19b3000 path=initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii pid=3806016
2023-08-10 23:58.54 [error    ] Extract command failed         command=7z x -y /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii -o/tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii_extract exit_code=0x2 pid=3806016 severity=<Severity.WARNING: 'WARNING'> stderr=
ERRORS:
There are data after the end of archive

ERROR: There are some data after the end of the payload data : 0-26947584
 stdout=
7-Zip [64] 17.05 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.05 (locale=C,Utf16=off,HugeFiles=on,64 bits,16 CPUs x64)

Scanning the drive for archives:
1 file, 26947584 bytes (26 MiB)

Extracting archive: /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii
--
Path = /tmp/initrd_extract/7280640-19039435.zstd_extract/zstd.uncompressed_extract/0-26947584.cpio_portable_ascii
Type = xz
ERRORS:
There are data after the end of archive
Offset = 35432
Physical Size = 5228
Tail Size = 26906924
Method = LZMA2:21
Streams = 1
Blocks = 1


Sub items Errors: 1

Archives with Errors: 1

Open Errors: 1

Sub items Errors: 1

tree -s ./initrd_extract
[          4]  initrd_extract
├── [       1536]  7279104-7280640.unknown
└── [          4]  7280640-19039435.zstd_extract
    ├── [   26951680]  zstd.uncompressed
    └── [          5]  zstd.uncompressed_extract
        ├── [   26947584]  0-26947584.cpio_portable_ascii
        ├── [          3]  0-26947584.cpio_portable_ascii_extract
        │   └── [      27568]  0-26947584
        └── [       4096]  26947584-26951680.unknown

Expected behavior

  • 7z not to fail / something else (like cpio) to extract the archive.
  • unblob to exit non-zero upon sub-command failure.
  • The microcode in the output tree.
  • unknown blocks that are entirely zero to be called zero-padding or something like that.

Environment information

  • Nix 2.13.3 on NixOS 23.05
  • (all other versions are pinned via nix flakes, see the above commands)
@qkaiser
Contributor

qkaiser commented Aug 11, 2023

Thanks for the very detailed report @NiklasGollenstede !

7z not to fail / something else (like cpio) to extract the archive.

7z is failing due to unblob miscalculating the CPIO chunk. We will look into it.

unblob to exit non-zero upon sub-command failure.

It should be non-zero according to get_exit_code_from_reports. We will look into it.
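
For illustration, here is a minimal sketch of how an exit code could be derived from report severities. It is not unblob's actual get_exit_code_from_reports; the Severity enum, exit_code_from_severities and the warnings_are_fatal flag are hypothetical names. One thing it makes visible: the failed 7z command in the log above was reported with Severity.WARNING, so any mapping in which only errors are fatal would still exit with 0.

```python
from enum import Enum


class Severity(Enum):
    # Hypothetical severity levels, modelled on the values seen in the log.
    ERROR = "ERROR"
    WARNING = "WARNING"


def exit_code_from_severities(severities, warnings_are_fatal=False):
    """Return a process exit code for a collection of report severities.

    Errors always produce a non-zero code. Warnings only do so when
    warnings_are_fatal is set (a hypothetical switch, for illustration).
    """
    found = set(severities)
    if Severity.ERROR in found:
        return 1
    if warnings_are_fatal and Severity.WARNING in found:
        return 1
    return 0


# A failed sub-command reported as a warning does not change the exit code
# unless warnings are treated as fatal.
lenient = exit_code_from_severities([Severity.WARNING])
strict = exit_code_from_severities([Severity.WARNING], warnings_are_fatal=True)
```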

The microcode in the output tree.

It will be present if you run unblob with -k option.

unknown blocks that are entirely zero to be called zero-padding or something like that.

Agree 100%. We're tracking this at #263 and have a draft branch for it.
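
A zero-padding check can be sketched in a few lines of Python. This is only an illustration, not the code in the draft branch; is_zero_padding, chunk_suffix and the "zero-padding" suffix (taken from the suggestion in the report) are assumptions.

```python
import io


def is_zero_padding(stream, block_size=1 << 20):
    """Return True if the binary stream is non-empty and contains only NUL bytes.

    The stream is read in blocks so that large chunks are not loaded at once.
    """
    seen_data = False
    while block := stream.read(block_size):
        seen_data = True
        # bytes.count(0) counts NUL bytes; any other byte means real data.
        if block.count(0) != len(block):
            return False
    return seen_data


def chunk_suffix(stream):
    """Pick a file suffix for an unidentified chunk (hypothetical naming)."""
    return "zero-padding" if is_zero_padding(stream) else "unknown"


# Usage with in-memory data; with real chunks, pass open(path, "rb") instead.
padding_suffix = chunk_suffix(io.BytesIO(b"\x00" * 1536))
other_suffix = chunk_suffix(io.BytesIO(b"\x00" * 1535 + b"\x01"))
```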

@qkaiser qkaiser added bug Something isn't working format:archive labels Aug 11, 2023
@NiklasGollenstede
Author

Sounds good, thanks!

The microcode in the output tree.

It will be present if you run unblob with -k option.

Well, that keeps the prepended CPIO archive. But that archive contains a file, which apparently gets completely ignored. Running with -k and then cpio -idv < initrd_extract/0-7279104.cpio_portable_ascii extracts that file:

kernel/x86/microcode/GenuineIntel.bin
14217 blocks

I guess the expected result (with -k) would be:

[          7]  initrd_extract/
├── [    7279104]  0-7279104.cpio_portable_ascii
├── [          3]  0-7279104.cpio_portable_ascii_extract/
│   └── [          3]  kernel/
│       └── [          3]  x86/
│           └── [          3]  microcode/
│               └── [    7278592]  GenuineIntel.bin
├── [       1536]  7279104-7280640.zero-padding
├── [   11758795]  7280640-19039435.zstd
└── [          4]  7280640-19039435.zstd_extract/
    ├── [   26951680]  zstd.uncompressed
    └── [          5]  zstd.uncompressed_extract/
        ├── [   26947584]  0-26947584.cpio_portable_ascii
        ├── [         ??]  0-26947584.cpio_portable_ascii_extract/
        │   └── ...
        └── [       4096]  26947584-26951680.zero-padding

@qkaiser
Contributor

qkaiser commented Aug 14, 2023

7z will not create the extraction directory if the source file name does not follow some convention which I still don't fully comprehend (name.cpio is OK, name.cpio.truncated is OK, but name.cpio.ext is not).

@qkaiser
Contributor

qkaiser commented Aug 30, 2023

Quick update: I'll probably write a CPIO extractor since we're already parsing the entries anyway. It should not take long given the recent addition of the Filesystem API.
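
For readers unfamiliar with the format, entry parsing can be sketched as below. This assumes the chunk is in the "newc" CPIO layout (magic 070701, a 110-byte ASCII header, names and data padded to 4 bytes), which is what Linux initramfs images normally use; iter_newc_entries and newc_end_offset are hypothetical helpers, not unblob's handler or its Filesystem API.

```python
HEADER_LEN = 110          # 6-byte magic + 13 fields of 8 hex digits each
MAGIC = b"070701"         # "newc" magic, assumed for this sketch
TRAILER = b"TRAILER!!!"   # name of the entry that terminates an archive


def _align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def iter_newc_entries(buf):
    """Yield (name, data_offset, data_size) for each entry, trailer included.

    buf must start at the beginning of the archive so that the 4-byte
    alignment is computed relative to the archive start.
    """
    offset = 0
    while True:
        header = buf[offset:offset + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:6] != MAGIC:
            raise ValueError(f"no newc header at offset {offset}")
        # Fields are fixed-width hexadecimal ASCII; index 6 is the file size
        # and index 11 is the name size (including the trailing NUL).
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        file_size, name_size = fields[6], fields[11]
        name_start = offset + HEADER_LEN
        name = bytes(buf[name_start:name_start + name_size - 1])
        data_start = _align4(name_start + name_size)
        yield name, data_start, file_size
        if name == TRAILER:
            return
        offset = _align4(data_start + file_size)


def newc_end_offset(buf):
    """Return the offset just past the trailer entry, ignoring later padding."""
    end = 0
    for _name, data_start, file_size in iter_newc_entries(buf):
        end = _align4(data_start + file_size)
    return end
```

Anything between the returned end offset and the end of the chunk (typically block padding) is then not part of the archive proper. An extractor would additionally look at the mode field to tell directories, symlinks and regular files apart, and must sanitise entry names before writing them to disk.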

@qkaiser
Contributor

qkaiser commented Aug 31, 2023

@NiklasGollenstede I opened a pull request to handle this. It will be reviewed over the coming weeks.

These are the results I'm getting with your sample and that branch:

.
├── 0-7279104.cpio_portable_ascii
├── 0-7279104.cpio_portable_ascii_extract
│   └── kernel
│       └── x86
│           └── microcode
│               └── GenuineIntel.bin
├── 7279104-7280640.unknown
├── 7280640-19039435.zstd
└── 7280640-19039435.zstd_extract
    ├── zstd.uncompressed
    └── zstd.uncompressed_extract
        ├── 0-26947584.cpio_portable_ascii
        ├── 0-26947584.cpio_portable_ascii_extract
        │   ├── dev
        │   ├── etc
        │   │   ├── mdadm.conf -> ../nix/store/ivzdqwmjb3g5cddb0l3kakqpym53n4sk-mdadm.conf
        │   │   └── modprobe.d
        │   │       ├── debian.conf -> ../../nix/store/1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        │   │       ├── nixos.conf -> ../../nix/store/fg1iypr8qlc4li832bsnqsv2182wjkmb-etc-modprobe.d-nixos.conf
        │   │       └── ubuntu.conf -> ../../nix/store/pyzxg3hb6r88l7bqfya22q002sbchfxi-initrd-kmod-blacklist-ubuntu
        │   ├── init -> nix/store/320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        │   └── nix
        │       └── store
        │           ├── 1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        │           ├── 320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        │           └── 5cxd4ywn7sis9h5yibxfc6bwvjz15af9-linux-6.1.43-modules-shrunk
        └── 26947584-26951680.unknown

13 directories, 14 files

Don't hesitate to give it a try.

@qkaiser qkaiser self-assigned this Aug 31, 2023
@NiklasGollenstede
Author

$ nix run github:onekey-sec/unblob/a5536446208f749c9df77f3d5a07528933e9e418 -- $( nix build --no-link --print-out-paths github:srid/nixos-config/1a6879bbd1c0f87f67533a7b91bc438e042b3bf6#nixosConfigurations.actual.config.system.build.initialRamdisk )/initrd
╭──────────────── unblob (23.8.11) ────────────────╮
│ Extracted files: 5                               │{{1}}
│ Extracted directories: 12                        │{{2}}
│ Extracted links: 5                               │
│ Extraction directory size: 50.82 MB              │
│ Chunks identification ratio: 99.99%              │
╰──────────────────── Summary ─────────────────────╯
            Chunks distribution
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓
┃ Chunk type          ┃   Size   ┃ Ratio  ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩
│ CPIO_PORTABLE_ASCII │ 32.64 MB │ 74.42% │{{3}}
│ ZSTD                │ 11.21 MB │ 25.57% │{{3}}
│ UNKNOWN             │ 5.50 KB  │ 0.01%  │
└─────────────────────┴──────────┴────────┘
       Encountered errors
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Severity       ┃ Name         ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ Severity.ERROR │ UnknownError │{{4}}
└────────────────┴──────────────┘
$ echo $?
1
$ tree -sF initrd_extract/
initrd_extract/
|-- [          3]  0-7279104.cpio_portable_ascii_extract/
|   `-- [          3]  kernel/
|       `-- [          3]  x86/
|           `-- [          3]  microcode/
|               `-- [    7278592]  GenuineIntel.bin {{1}}
|-- [       1536]  7279104-7280640.unknown {{1}}
`-- [          4]  7280640-19039435.zstd_extract/
    |-- [   26951680]  zstd.uncompressed {{5}}
    `-- [          5]  zstd.uncompressed_extract/
        |-- [   26947584]  0-26947584.cpio_portable_ascii
        |-- [          6]  0-26947584.cpio_portable_ascii_extract/
        |   |-- [          2]  dev/
        |   |-- [          4]  etc/
        |   |   |-- [         56]  mdadm.conf -> ../nix/store/ivzdqwmjb3g5cddb0l3kakqpym53n4sk-mdadm.conf
        |   |   `-- [          5]  modprobe.d/
        |   |       |-- [         80]  debian.conf -> ../../nix/store/1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1
        |   |       |-- [         74]  nixos.conf -> ../../nix/store/fg1iypr8qlc4li832bsnqsv2182wjkmb-etc-modprobe.d-nixos.conf
        |   |       `-- [         77]  ubuntu.conf -> ../../nix/store/pyzxg3hb6r88l7bqfya22q002sbchfxi-initrd-kmod-blacklist-ubuntu
        |   |-- [         58]  init -> nix/store/320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh
        |   `-- [          3]  nix/
        |       `-- [          5]  store/
        |           |-- [        655]  1hvskwda7r1spasqqg4ascjngqpnp0qw-kmod-debian-aliases.conf-22-1.1 {{1}}
        |           |-- [      20667]  320svbpp3f9mhdc4xr0p1n2gm3nfwzv1-stage-1-init.sh {{1}}
        |           `-- [          2]  5cxd4ywn7sis9h5yibxfc6bwvjz15af9-linux-6.1.43-modules-shrunk/
        `-- [       4096]  26947584-26951680.unknown {{1}}

13 directories, 12 files

That looks better. It seems to be handling the first archive correctly!
But an error is still reported, and most of the files from the nested archive were not extracted.

Some further nitpickiness (largely unrelated to this overall issue):

  1. I only see 3 extracted files. Do the unknown chunks count as files? I don't think they are "files" in that sense. (They result in regular files in the output, but semantically they are not files in the archive.)
  2. Similarly, there are 14 dirs in the output tree (incl. top-level), 9 of which are within *.cpio_portable_ascii_extract dirs (i.e. were encoded in the input).
  3. The CPIO (largely) was inside the ZSTD. I don't think it makes much sense to express their sizes as fractions of a whole (which one?).
  4. Good. But knowing at least which extraction was attempted and failed would be nice. I know to expect that there are things missing in the output, but not where.
  5. It seems to me that without the -k option, unblob removes blobs that it processed successfully. Why is zstd.uncompressed still there? It was split into 0-26947584.cpio_portable_ascii and 26947584-26951680.unknown and should then be done, no?

@qkaiser
Contributor

qkaiser commented Sep 1, 2023

@NiklasGollenstede converted to a discussion at #650 so everyone in the team can chime in. Thanks for taking the time to write this, by the way.

@qkaiser
Contributor

qkaiser commented Oct 12, 2023

Closing this issue since CPIO is properly extracted now. The discussion on console output is kept open for further exchanges.

@qkaiser qkaiser closed this as completed Oct 12, 2023