kernel_cmdline: Refactor into separate `bytes` and `utf8` modules #1603

jeckersb · 2025-09-09T20:18:50Z

Split the kernel command line parsing functionality into two focused
modules. The bytes module handles raw byte parsing without UTF-8
requirements, matching kernel behavior for arbitrary byte
sequences. The utf8 module provides string-based parsing for cases
where UTF-8 validation is needed. The utf8 module reuses the
bytes module primitives where possible, and uses the fact that
utf8::Cmdline can only be constructed from valid UTF-8 to do
unchecked conversions between the two.

Signed-off-by: John Eckersberg <jeckersb@redhat.com>
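As a rough illustration of the layering described above, here is a minimal sketch, not the crate's actual code. The names `BytesCmdline` and `Utf8Cmdline` are hypothetical stand-ins for `bytes::Cmdline` and `utf8::Cmdline`, and the tokenizer only splits on ASCII whitespace (quote handling is omitted). It assumes the string-based type wraps the byte-based one and can only be built from a `&str`.

```rust
/// Hypothetical stand-in for `bytes::Cmdline`: borrows raw bytes and makes
/// no assumption about UTF-8 validity.
pub struct BytesCmdline<'a>(&'a [u8]);

impl<'a> BytesCmdline<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self(data)
    }

    /// Yield whitespace-separated tokens as raw byte slices.
    /// (Simplified: quoted values containing spaces are not handled.)
    pub fn iter(&self) -> impl Iterator<Item = &'a [u8]> {
        let data: &'a [u8] = self.0;
        data.split(|b| b.is_ascii_whitespace())
            .filter(|token| !token.is_empty())
    }
}

/// Hypothetical stand-in for `utf8::Cmdline`: the only constructor takes a
/// `&str`, so the wrapped bytes are valid UTF-8 by construction.
pub struct Utf8Cmdline<'a>(BytesCmdline<'a>);

impl<'a> Utf8Cmdline<'a> {
    pub fn new(data: &'a str) -> Self {
        Self(BytesCmdline::new(data.as_bytes()))
    }

    /// Reuse the byte-level tokenizer, then convert each token to `&str`.
    /// Splitting valid UTF-8 at ASCII bytes cannot cut a multi-byte
    /// sequence, so the conversion is not expected to fail.
    pub fn iter(&self) -> impl Iterator<Item = &'a str> {
        self.0
            .iter()
            .map(|token| std::str::from_utf8(token).expect("token of a valid UTF-8 cmdline"))
    }
}
```

With this shape, `BytesCmdline::new(b"foo=\xff quiet")` still tokenizes even though `\xff` is not valid UTF-8, while `Utf8Cmdline` never sees such input because it can only be constructed from a `&str`.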

jeckersb · 2025-09-09T20:20:53Z

/gemini review

gemini-code-assist

Code Review

This pull request refactors the kernel_cmdline crate into bytes and utf8 modules, which is a significant improvement in code structure and clarity. The bytes module handles raw byte parsing, while the utf8 module provides a safe, string-based API on top. The use of unsafe in the utf8 module is well-justified with SAFETY comments. The changes are well-implemented. I have one minor suggestion to clean up a test case.

crates/kernel_cmdline/src/bytes.rs

crates/kernel_cmdline/src/utf8.rs


jeckersb · 2025-09-09T21:00:18Z

I replaced all of the unsafe usage with checked + expect. It's not like this is performance-critical, the extra checks aren't going to hurt anything. I'm as certain as I can be that it will never panic (but if it does... better than UB).
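For readers unfamiliar with the trade-off, the sketch below contrasts the two styles. `Utf8Key` is a hypothetical type invented for this example (it is not taken from the crate); it assumes, as in the PR description, that the value can only be constructed from a `&str`.

```rust
/// Hypothetical key type: the only constructor takes a `&str`, so the stored
/// bytes are valid UTF-8 by construction.
pub struct Utf8Key<'a>(&'a [u8]);

impl<'a> Utf8Key<'a> {
    pub fn new(s: &'a str) -> Self {
        Self(s.as_bytes())
    }

    /// Unchecked variant: skips validation and relies entirely on the
    /// constructor invariant. If the invariant were ever broken, this
    /// would be undefined behavior.
    pub fn as_str_unchecked(&self) -> &'a str {
        // SAFETY: `self.0` always originates from a `&str` (see `new`).
        unsafe { std::str::from_utf8_unchecked(self.0) }
    }

    /// Checked variant: re-validates the bytes and panics with a clear
    /// message if the invariant is violated, instead of invoking UB.
    pub fn as_str(&self) -> &'a str {
        std::str::from_utf8(self.0).expect("Utf8Key is only built from valid UTF-8")
    }
}
```

Both methods return the same `&str` for any value built through `new`; the checked version pays for one extra validation pass over the bytes, which is negligible for something as short as a kernel command line.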

crates/kernel_cmdline/src/utf8.rs

jeckersb · 2025-09-10T21:00:52Z

Ok I think this new commit works the way we want it to. Gracefully handles non-UTF-8 data but errors out if we have non-UTF-8 data in the specific parameters we're interested in.

jeckersb · 2025-09-10T21:13:24Z

/gemini review

gemini-code-assist

Code Review

This pull request is a good refactoring of the kernel command line parsing logic. It splits the functionality into a bytes module for raw byte slice parsing and a utf8 module for validated UTF-8 string parsing. This separation improves clarity and robustness, especially in handling potentially non-UTF-8 command line arguments from the kernel. The utf8 module correctly builds upon the bytes module, ensuring code reuse and maintaining safety invariants. The changes in install.rs to use the new modules are also well-done and improve error handling around non-UTF-8 data.

I've found one critical bug in the quote stripping logic and a minor performance issue. Apart from those, the changes look great.

crates/kernel_cmdline/src/bytes.rs

crates/kernel_cmdline/src/utf8.rs

crates/lib/src/install.rs

- Removed `From<bytes::Parameter>` implementation for `utf8::Parameter` and similar for `utf8::ParameterKey`. This was public and would allow end-users to construct utf8 parameters from non-utf8 data. Replaced internally with `from_bytes` in the places where we know we can safely convert known-UTF-8 data.
- Added `TryFrom<bytes::Parameter>` implementation for `utf8::Parameter` to allow checked conversions, plus tests.
- Added `iter_utf8` and `find_utf8` to `bytes::Cmdline`, plus tests.
- Updated `find_root_args_to_inherit` in bootc to use these improvements. Notably bootc will now allow non-UTF8 data in the kernel cmdline, *unless* it occurs in parameters that bootc is explicitly looking for.
- Added more tests to `find_root_args_to_inherit` to validate expected functionality with non-UTF-8 data.
- Fixed a parser bug that gemini pointed out with unmatched quotes, plus tests to check for that.

Signed-off-by: John Eckersberg <jeckersb@redhat.com>
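The checked conversion and the "tolerate non-UTF-8 unless it is in a parameter we look up" behavior can be sketched roughly as follows. This is an illustration under assumptions, not the crate's implementation: `BytesParam`, `Utf8Param`, `parse`, and the free function `find_utf8` are hypothetical, the signature of the real `find_utf8` method is not shown in this thread, and quote handling is omitted.

```rust
use std::str::Utf8Error;

/// Hypothetical stand-in for `bytes::Parameter`.
#[derive(Debug, PartialEq)]
pub struct BytesParam<'a> {
    key: &'a [u8],
    value: Option<&'a [u8]>,
}

/// Hypothetical stand-in for `utf8::Parameter`.
#[derive(Debug, PartialEq)]
pub struct Utf8Param<'a> {
    key: &'a str,
    value: Option<&'a str>,
}

/// Checked conversion: fails if the key or the value is not valid UTF-8.
impl<'a> TryFrom<BytesParam<'a>> for Utf8Param<'a> {
    type Error = Utf8Error;

    fn try_from(p: BytesParam<'a>) -> Result<Self, Self::Error> {
        Ok(Utf8Param {
            key: std::str::from_utf8(p.key)?,
            value: p.value.map(std::str::from_utf8).transpose()?,
        })
    }
}

/// Split one token at the first `=` into key and optional value.
fn parse(token: &[u8]) -> BytesParam<'_> {
    match token.iter().position(|&b| b == b'=') {
        Some(i) => BytesParam { key: &token[..i], value: Some(&token[i + 1..]) },
        None => BytesParam { key: token, value: None },
    }
}

/// Look up `key` in a raw command line. Parameters that do not match are
/// never validated, so stray non-UTF-8 bytes elsewhere are tolerated; only
/// the matching parameter must convert cleanly.
pub fn find_utf8<'a>(cmdline: &'a [u8], key: &str) -> Option<Result<Utf8Param<'a>, Utf8Error>> {
    cmdline
        .split(|b| b.is_ascii_whitespace())
        .filter(|token| !token.is_empty())
        .map(parse)
        .find(|p| p.key == key.as_bytes())
        .map(|p| Utf8Param::try_from(p))
}
```

In this sketch, looking up `root` in `b"junk=\xff root=UUID=abcd rw"` succeeds despite the invalid byte in `junk`, whereas looking up `root` in `b"root=\xff"` returns `Some(Err(_))`.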

bootc-bot bot requested a review from cgwalters September 9, 2025 20:19

gemini-code-assist bot reviewed Sep 9, 2025

crates/kernel_cmdline/src/bytes.rs (outdated, resolved)

jeckersb force-pushed the cmdline-improvements branch from a6d11bd to 7051af9 September 9, 2025 20:26

cgwalters reviewed Sep 9, 2025

crates/kernel_cmdline/src/utf8.rs (outdated, resolved)

cgwalters added the area/unsafe-code (Changes `unsafe` code) label Sep 9, 2025

jeckersb force-pushed the cmdline-improvements branch from 7051af9 to 5454cec September 9, 2025 20:58

jeckersb marked this pull request as ready for review September 9, 2025 21:12

bootc-bot bot requested a review from cgwalters September 9, 2025 21:12

cgwalters reviewed Sep 9, 2025

crates/kernel_cmdline/src/utf8.rs (resolved)

jeckersb commented Sep 10, 2025

crates/kernel_cmdline/src/utf8.rs (outdated, resolved)

crates/kernel_cmdline/src/utf8.rs (outdated, resolved)

gemini-code-assist bot reviewed Sep 10, 2025

crates/kernel_cmdline/src/bytes.rs (outdated, resolved)

crates/kernel_cmdline/src/utf8.rs (resolved)

cgwalters approved these changes Sep 10, 2025

crates/lib/src/install.rs (outdated, resolved)

jeckersb force-pushed the cmdline-improvements branch from 4413c56 to 621ea8f September 11, 2025 13:46

cgwalters approved these changes Sep 11, 2025

crates/lib/src/install.rs (outdated, resolved)

jeckersb force-pushed the cmdline-improvements branch from 621ea8f to 6cede27 September 11, 2025 14:24

cgwalters merged commit d947ed1 into bootc-dev:main Sep 11, 2025
28 checks passed

cgwalters mentioned this pull request Sep 12, 2025

Add a composefs backend #1444

Merged
