Closed
stat: mount point output should preserve non-UTF8 bytes
Problem
The stat implementation in Rust currently converts mount points to a String using .to_string_lossy().
This does not match GNU stat: if the mount point contains non-UTF-8 bytes, the Rust implementation replaces them with the Unicode replacement character � (U+FFFD).
GNU stat preserves the original raw bytes, so the two tools diverge when dealing with mount points that include invalid UTF-8.
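To make the divergence concrete, here is a minimal standalone sketch (not taken from the stat source) that compares the lossy conversion with a byte-preserving one. The helper names lossy_bytes and raw_bytes are hypothetical, and the example assumes a Unix target because it relies on std::os::unix::ffi::OsStrExt.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt; // Unix-only: exposes the raw bytes of an OsStr

/// Lossy conversion (hypothetical helper): invalid UTF-8 sequences
/// are replaced with U+FFFD before the bytes are returned.
fn lossy_bytes(name: &OsStr) -> Vec<u8> {
    name.to_string_lossy().into_owned().into_bytes()
}

/// Byte-preserving conversion (hypothetical helper): returns the
/// name exactly as the operating system stores it.
fn raw_bytes(name: &OsStr) -> Vec<u8> {
    name.as_bytes().to_vec()
}

fn main() {
    // Same name as in the reproduction steps: one invalid byte, 0x80.
    let name = OsStr::from_bytes(b"mnt_\x80");
    println!("lossy: {:02x?}", lossy_bytes(name));
    println!("raw:   {:02x?}", raw_bytes(name));
}
```

The lossy path turns the single byte 0x80 into the three-byte UTF-8 encoding of U+FFFD (ef bf bd), while the raw path leaves it untouched.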
Steps to Reproduce
Setup
# Create a mount point with an invalid UTF-8 byte in the name (0x80)
mkdir $'mnt_\x80'
# Mount a tmpfs there
sudo mount -t tmpfs tmpfs $'mnt_\x80'
# Create a file inside it
touch $'mnt_\x80/file.txt'
Proof of Concept
#!/bin/bash
# GNU stat
stat -c '%m' $'mnt_\x80/file.txt' | xxd
# coreutils stat
uu_stat -c '%m' $'mnt_\x80/file.txt' | xxd
Output
Proposed Solution
- Update find_mount_point to return an OsString instead of String.
- Extend OutputType with a new variant, e.g. OutputType::OsStr.
- Add a new helper function print_osstr in the printing layer that writes raw bytes instead of assuming UTF-8.
- Update stat to use OutputType::OsStr for mount points.
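The steps above could look roughly like the following sketch. It is an illustration under assumptions, not the actual stat code: OutputType here is a simplified stand-in with only two variants, the signature of print_osstr is a guess, print_output is a hypothetical dispatcher, and the example is Unix-only.

```rust
use std::ffi::{OsStr, OsString};
use std::io::{self, Write};
use std::os::unix::ffi::OsStrExt; // Unix-only: OsStr::as_bytes / from_bytes

/// Simplified stand-in for the real OutputType; the real enum has more variants.
enum OutputType {
    Str(String),
    /// Proposed variant: carries the value without a UTF-8 round trip.
    OsStr(OsString),
}

/// Assumed shape of the proposed helper: writes the raw bytes as-is.
fn print_osstr<W: Write>(out: &mut W, s: &OsStr) -> io::Result<()> {
    out.write_all(s.as_bytes())
}

/// Hypothetical dispatcher standing in for the printing layer.
fn print_output<W: Write>(out: &mut W, value: &OutputType) -> io::Result<()> {
    match value {
        OutputType::Str(s) => out.write_all(s.as_bytes()),
        OutputType::OsStr(s) => print_osstr(out, s),
    }
}

fn main() -> io::Result<()> {
    // Pretend find_mount_point returned this OsString.
    let mount_point = OsStr::from_bytes(b"mnt_\x80").to_os_string();
    let mut stdout = io::stdout().lock();
    print_output(&mut stdout, &OutputType::OsStr(mount_point))?;
    stdout.write_all(b"\n")
}
```

Piping the output of this program through xxd should show the 0x80 byte intact, which is the behavior the issue asks for. Padding and width handling for format directives are left out of the sketch.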