stat: mount point output should preserve non-UTF8 bytes #8534

@dekuu5

Problem

The Rust implementation of stat currently converts mount points to a String using .to_string_lossy().
As a result, any non-UTF-8 bytes in a mount point are replaced with the Unicode replacement character (U+FFFD).

GNU stat preserves the original raw bytes, so the two tools diverge when a mount point contains invalid UTF-8.
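
To make the divergence concrete, here is a minimal, Unix-only sketch of what .to_string_lossy() does to the byte 0x80. The helper names lossy_bytes and preserved_bytes are hypothetical and exist only for this illustration; they are not part of the stat code.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

/// Hypothetical helper: the bytes that reach the output when the name
/// goes through to_string_lossy() first.
fn lossy_bytes(raw: &[u8]) -> Vec<u8> {
    OsStr::from_bytes(raw).to_string_lossy().as_bytes().to_vec()
}

/// Hypothetical helper: the bytes that reach the output when the OsStr
/// is written as-is.
fn preserved_bytes(raw: &[u8]) -> Vec<u8> {
    OsStr::from_bytes(raw).as_bytes().to_vec()
}

fn main() {
    // "mnt_" followed by the invalid UTF-8 byte 0x80, as in the reproduction steps.
    let raw = b"mnt_\x80";

    // 0x80 becomes U+FFFD, which is encoded in UTF-8 as ef bf bd.
    println!("lossy:     {:02x?}", lossy_bytes(raw));
    // The original byte 0x80 is kept.
    println!("preserved: {:02x?}", preserved_bytes(raw));
}
```

The lossy path turns one input byte into three output bytes, which is the difference that shows up when the output is piped through xxd.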


Steps to Reproduce

Setup

# Create a mount point with an invalid UTF-8 byte in the name (0x80)
mkdir $'mnt_\x80'

# Mount a tmpfs there
sudo mount -t tmpfs tmpfs $'mnt_\x80'

# Create a file inside it
touch $'mnt_\x80/file.txt'

Proof of Concept

#!/bin/bash

# GNU stat
stat -c '%m' $'mnt_\x80/file.txt' | xxd

# coreutils stat
uu_stat -c '%m' $'mnt_\x80/file.txt' | xxd

Output

[Image: screenshot of the output of the commands above]

Proposed Solution

  1. Update find_mount_point to return an OsString instead of String.
  2. Extend OutputType with a new variant, e.g. OutputType::OsStr.
  3. Add a new helper function print_osstr in the printing layer that writes raw bytes instead of assuming UTF-8.
  4. Update stat to use OutputType::OsStr for mount points.
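
The steps above could look roughly like the following Unix-only sketch. It is an assumption-laden illustration, not the actual stat code: the OutputType enum here is a simplified stand-in with only two variants, print_output is a hypothetical dispatcher, and padding or width handling is left out entirely.

```rust
use std::ffi::{OsStr, OsString};
use std::io::{self, Write};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Simplified stand-in for stat's output enum (the real enum has more variants).
enum OutputType {
    Str(String),
    /// Proposed new variant: carries the value without forcing UTF-8.
    OsStr(OsString),
}

/// Proposed helper: write the raw bytes of an OsStr with no UTF-8 conversion.
fn print_osstr<W: Write>(out: &mut W, s: &OsStr) -> io::Result<()> {
    out.write_all(s.as_bytes())
}

/// Hypothetical dispatcher that picks the printing routine per variant.
fn print_output<W: Write>(out: &mut W, value: &OutputType) -> io::Result<()> {
    match value {
        OutputType::Str(s) => out.write_all(s.as_bytes()),
        OutputType::OsStr(s) => print_osstr(out, s),
    }
}

fn main() -> io::Result<()> {
    let mut stdout = io::stdout().lock();

    // An ordinary UTF-8 value still goes through the existing string path.
    print_output(&mut stdout, &OutputType::Str(String::from("/mnt/data")))?;
    stdout.write_all(b"\n")?;

    // A mount point containing the invalid byte 0x80 is written unchanged.
    let mount_point = OsString::from_vec(b"mnt_\x80".to_vec());
    print_output(&mut stdout, &OutputType::OsStr(mount_point))?;
    stdout.write_all(b"\n")
}
```

Writing through a generic Write keeps the helper testable against an in-memory buffer, and write_all avoids the UTF-8 assumption that print! and println! carry.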
