kernel_cmdline: Refactor into separate bytes and utf8 modules
#1603
/gemini review
Code Review
This pull request refactors the kernel_cmdline crate into bytes and utf8 modules, which is a significant improvement in code structure and clarity. The bytes module handles raw byte parsing, while the utf8 module provides a safe, string-based API on top. The use of unsafe in the utf8 module is well-justified with SAFETY comments. The changes are well-implemented. I have one minor suggestion to clean up a test case.
a6d11bd to 7051af9
Split the kernel command line parsing functionality into two focused modules. The `bytes` module handles raw byte parsing without UTF-8 requirements, matching kernel behavior for arbitrary byte sequences. The `utf8` module provides string-based parsing for cases where UTF-8 validation is needed. The `utf8` module reuses the `bytes` module primitives where possible, and uses the fact that `utf8::Cmdline` can only be constructed from valid UTF-8 to do unchecked conversions between the two.

Signed-off-by: John Eckersberg <jeckersb@redhat.com>
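The layering described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `params` method, the tuple-struct layout, and the whitespace-only splitting (no quote handling) are assumptions made for the example.

```rust
// Hypothetical, simplified stand-ins for the PR's `bytes` and `utf8` modules;
// the real crate's types and method names may differ.
pub mod bytes {
    /// A kernel command line as raw bytes, with no UTF-8 requirement.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Cmdline<'a>(&'a [u8]);

    impl<'a> Cmdline<'a> {
        pub fn new(data: &'a [u8]) -> Self {
            Cmdline(data)
        }

        /// Split on ASCII whitespace. Quote handling is omitted for brevity.
        pub fn params(&self) -> Vec<&'a [u8]> {
            let data: &'a [u8] = self.0;
            data.split(|b| b.is_ascii_whitespace())
                .filter(|p| !p.is_empty())
                .collect()
        }
    }
}

pub mod utf8 {
    use super::bytes;

    /// Wraps `bytes::Cmdline`. The field is private and the only constructor
    /// takes `&str`, so the wrapped bytes are always valid UTF-8.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Cmdline<'a>(bytes::Cmdline<'a>);

    impl<'a> Cmdline<'a> {
        pub fn new(data: &'a str) -> Self {
            Cmdline(bytes::Cmdline::new(data.as_bytes()))
        }

        /// Reuse the byte-level splitter, then convert back without
        /// re-validating each piece.
        pub fn params(&self) -> Vec<&'a str> {
            self.0
                .params()
                .into_iter()
                // SAFETY: the input was a valid `&str`, and splitting on ASCII
                // whitespace bytes never cuts a multi-byte UTF-8 sequence,
                // so every piece is itself valid UTF-8.
                .map(|p| unsafe { std::str::from_utf8_unchecked(p) })
                .collect()
        }
    }
}
```

In this sketch, `utf8::Cmdline::new("root=UUID=abc rw").params()` yields string slices, while `bytes::Cmdline::new(b"foo=\xff bar").params()` yields byte slices and accepts the invalid `\xff` byte. The unchecked conversion is only sound because nothing outside the `utf8` module can build a `utf8::Cmdline` from arbitrary bytes.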
7051af9 to 5454cec
I replaced all of the
Ok, I think this new commit works the way we want it to. It gracefully handles non-UTF-8 data but errors out if there is non-UTF-8 data in the specific parameters we're interested in.
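A rough sketch of that behavior, assuming a hypothetical free function `find_value` rather than the crate's real API: parameters that do not match the requested key are never decoded, so invalid bytes elsewhere on the command line are tolerated, and an error is returned only when the requested parameter's value is not valid UTF-8. Quote handling is left out to keep the example short.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: look up `key` in a raw command line.
/// Returns `Ok(None)` if the key is absent, `Ok(Some(value))` if it is
/// present with a valid UTF-8 value, and `Err` only if the matching
/// parameter's value is not valid UTF-8.
pub fn find_value<'a>(cmdline: &'a [u8], key: &str) -> Result<Option<&'a str>, Utf8Error> {
    for param in cmdline.split(|b| b.is_ascii_whitespace()) {
        // Split "key=value" at the first '='; a bare switch has an empty value.
        let mut parts = param.splitn(2, |b| *b == b'=');
        let k = parts.next().unwrap_or(&[]);
        if k != key.as_bytes() {
            // Not the parameter we want: skip it without decoding.
            continue;
        }
        let v = parts.next().unwrap_or(&[]);
        // Only the value we actually care about is validated.
        return std::str::from_utf8(v).map(Some);
    }
    Ok(None)
}
```

With this sketch, looking up `root` in `b"junk=\xff root=UUID=abc rw"` succeeds despite the invalid byte in `junk`, whereas looking up `root` in `b"root=\xff\xfe rw"` returns an error.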
/gemini review
Code Review
This pull request is a good refactoring of the kernel command line parsing logic. It splits the functionality into a bytes module for raw byte slice parsing and a utf8 module for validated UTF-8 string parsing. This separation improves clarity and robustness, especially in handling potentially non-UTF-8 command line arguments from the kernel. The utf8 module correctly builds upon the bytes module, ensuring code reuse and maintaining safety invariants. The changes in install.rs to use the new modules are also well-done and improve error handling around non-UTF-8 data.
I've found one critical bug in the quote stripping logic and a minor performance issue. Apart from those, the changes look great.
4413c56 to 621ea8f
- Removed `From<bytes::Parameter>` implementation for `utf8::Parameter` and similar for `utf8::ParameterKey`. This was public and would allow end-users to construct UTF-8 parameters from non-UTF-8 data. Replaced internally with `from_bytes` in the places where we know we can safely convert known-UTF-8 data.
- Added `TryFrom<bytes::Parameter>` implementation for `utf8::Parameter` to allow checked conversions, plus tests.
- Added `iter_utf8` and `find_utf8` to `bytes::Cmdline`, plus tests.
- Updated `find_root_args_to_inherit` in bootc to use these improvements. Notably bootc will now allow non-UTF-8 data in the kernel cmdline, *unless* it occurs in parameters that bootc is explicitly looking for.
- Added more tests to `find_root_args_to_inherit` to validate expected functionality with non-UTF-8 data.
- Fixed a parser bug that Gemini pointed out with unmatched quotes, plus tests to check for that.

Signed-off-by: John Eckersberg <jeckersb@redhat.com>
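The difference between the removed infallible `From` and the added `TryFrom` can be illustrated with a small sketch. The `BytesParameter` and `Utf8Parameter` types and the `as_str` accessor below are hypothetical stand-ins for `bytes::Parameter` and `utf8::Parameter`, and using `std::str::Utf8Error` as the error type is an assumption; the crate's real definitions may differ.

```rust
use std::convert::TryFrom;
use std::str::Utf8Error;

/// Hypothetical byte-level parameter, e.g. `root=UUID=abc`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BytesParameter<'a>(pub &'a [u8]);

/// Hypothetical UTF-8 parameter. In a real crate the field would be private
/// to its module, so it could only be built through a checked path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Utf8Parameter<'a>(&'a str);

impl<'a> Utf8Parameter<'a> {
    pub fn as_str(&self) -> &'a str {
        self.0
    }
}

/// Checked conversion: validation happens here, so a caller can never end up
/// with a `Utf8Parameter` that holds invalid UTF-8.
impl<'a> TryFrom<BytesParameter<'a>> for Utf8Parameter<'a> {
    type Error = Utf8Error;

    fn try_from(p: BytesParameter<'a>) -> Result<Self, Self::Error> {
        std::str::from_utf8(p.0).map(Utf8Parameter)
    }
}
```

A public `From` implementation would have had to skip this validation, which is why it was unsuitable as a public API; `TryFrom` keeps the conversion available while forcing callers to handle the failure case.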
621ea8f to 6cede27
Split the kernel command line parsing functionality into two focused modules. The `bytes` module handles raw byte parsing without UTF-8 requirements, matching kernel behavior for arbitrary byte sequences. The `utf8` module provides string-based parsing for cases where UTF-8 validation is needed. The `utf8` module reuses the `bytes` module primitives where possible, and uses the fact that `utf8::Cmdline` can only be constructed from valid UTF-8 to do unchecked conversions between the two.

Signed-off-by: John Eckersberg <jeckersb@redhat.com>