-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
stat: fix mount point handling for non-UTF8 paths #8538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
GNU testsuite comparison: |
|
please ask your AI to cleanup a bit the patch, too much duplicated code
|
|
Hello @sylvestre |
src/uu/stat/src/stat.rs
Outdated
| for _ in 0..padding_needed { | ||
| print!(" "); | ||
| } | ||
| } else { | ||
| for _ in 0..padding_needed { | ||
| print!(" "); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is duplicated code here
src/uu/stat/src/stat.rs
Outdated
| io::stdout().write_all(display_bytes)?; | ||
| } | ||
| } else { | ||
| io::stdout().write_all(display_bytes)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay thanks i will edit it
please look at the other tests for this :) |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
1 similar comment
|
GNU testsuite comparison: |
src/uu/stat/src/stat.rs
Outdated
| io::stdout().write_all(display_bytes)?; | ||
|
|
||
| if left && padding_needed > 0 { | ||
| for _ in 0..padding_needed { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still duplicated code
src/uu/stat/src/stat.rs
Outdated
| let bytes = s.as_bytes(); | ||
|
|
||
| if pad_and_print_bytes(bytes, flags.left, width, precision).is_err() { | ||
| let lossy_string = s.to_string_lossy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename the variable to explain what it is, not the style
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
src/uu/stat/src/stat.rs
Outdated
| if path.starts_with(root) { | ||
| // TODO: This is probably wrong, we should pass the OsString | ||
| return Some(root.to_string_lossy().into_owned()); | ||
| return Some(root.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid the clone here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we can by annotating the function and the enum OutputType by a lifetime to be like this
fn find_mount_point<'a, P: AsRef<Path>>(&'a self, p: P) -> Option<&'a OsString>
and the outputtype be generic over 'a but in line
1064: OutputType::OsStr(self.find_mount_point(file).unwrap())
i can't use unwrap or default i don't know how
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it errors we can fall back to normal outputType::str with a default value of "" is that acceptable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please investigate :)
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
src/uu/stat/src/stat.rs
Outdated
| /// | ||
| /// On Unix systems, this preserves non-UTF8 data by printing raw bytes | ||
| /// On other platforms, falls back to lossy string conversion | ||
| fn pad_and_print_bytes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have unit tests in this file, maybe test also this function ? thanks
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
Hello @sylvestre, is there any updates on this pr |
|
GNU testsuite comparison: |
a0fd51e to
300fbdb
Compare
|
GNU testsuite comparison: |
src/uu/stat/src/stat.rs
Outdated
| (padding_needed, 0) | ||
| }; | ||
|
|
||
| writer.write_all(&vec![b' '; left_pad])?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try you try using using repeat() iterator or a pre-allocated buffer ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like
std::io::repeat(b' ').take(n)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay will take a look at it
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
Hello @sylvestre , is there any updates on this pr |
Summary
This PR addresses a TODO in
find_mount_point()by properly handling mount point paths that contain non-UTF8 bytes. Previously,to_string_lossy()was used, which could corrupt mount point names with invalid UTF-8 sequences. The fix ensures raw bytes are preserved on Unix-like systems and safely handled on Windows.Changes Made
find_mount_point()return type updatedChanged from
Option<String>→Option<OsString>to preserve original bytes.Added
OsStrvariant toOutputTypeenumAllows proper handling of non-UTF8 data through the output pipeline.
Implemented
print_os_str().to_string_lossy(), safely handling UTF-16 strings.Added
pad_and_print_bytes()helperHandles alignment, width, and padding for raw byte output.
Updated
'm'format specifierReturns
OutputType::OsStrinstead of a lossy string, preserving original data.Platform Behavior
.to_string_lossy()safely, since Windows internally uses UTF-16.Related Issues
Fixes the TODO in
stat.rs:it is mentioned in this issue