| 1 | +--- |
| 2 | +name: C strings |
| 3 | +area: language and library |
| 4 | +status: |
| 5 | + label: in-progress |
| 6 | + css_class: attention |
| 7 | +buttons: |
| 8 | + - label: RFC for C string literals (merged) |
| 9 | + link: https://github.com/rust-lang/rfcs/pull/3348 |
| 10 | + - label: Tracking Issue |
| 11 | + link: https://github.com/rust-lang/rust/issues/105723 |
| 12 | + - label: RFC for UTF-8 in C and byte strings |
| 13 | + link: https://github.com/rust-lang/rfcs/pull/3349 |
| 14 | + - label: Issue to make `&CStr` a thin pointer |
| 15 | + link: https://github.com/rust-lang/rust/issues/59905 |
| 16 | +--- |
| 17 | +The `&std::ffi::CStr` type represents borrowed, null-terminated C strings in Rust.
| 18 | +Currently, this type has some subtle issues and is in many ways less ergonomic than the regular string type. |
| 19 | + |
| 20 | +Some of the areas of potential improvement: |
| 21 | + |
| 22 | +- There is no syntax for a `&CStr` literal. (Update: We now have the experimental `c"…"` syntax.) |
| 23 | + |
| 24 | +- Due to a language limitation, `&CStr` is currently represented as a pointer+size pair. |
| 25 | + It should instead be just a pointer without a size, since the size is already determined by the null terminator. |
| 26 | + |
| 27 | + - Because of this, conversion from `*const c_char` requires scanning the whole string to find its size.
| 28 | + Ideally, that conversion would be a free no-op.
| 29 | + See the note in [this documentation](https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html#method.to_bytes). |
| 30 | + |
| 31 | + - For the same reason, `&CStr` cannot be passed through a C FFI boundary. |
| 32 | + Ideally, `&CStr` should have the same ABI as a `*const c_char`. |
| 33 | + |
| 34 | +- `CStr` has far fewer useful methods than `str` (e.g. finding, splitting, replacing), making it hard to work with directly.
| 35 | + |
| 36 | +- `format!()` can only produce a `String`, not a `std::ffi::CString`. |
| 37 | + |
| 38 | +- All bytes in a `CStr` (before the terminator) are non-zero, but none of the |
| 39 | + methods use `NonZeroU8` to leverage the type system for this invariant. |
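
To make a few of the points above concrete, here is a minimal sketch using only the stable `std::ffi` API. The helper names `greeting`, `c_len`, and `cstr_ref_words` are hypothetical and exist only for illustration. The sketch assumes the current representation of `&CStr`; the exact size reported by `cstr_ref_words` is an implementation detail that could change if the thin-pointer issue linked above is resolved.

```rust
use std::ffi::{c_char, CStr, CString};
use std::mem::size_of;

/// Hypothetical helper: `format!` can only produce a `String`, so building a
/// `CString` takes a second step that also checks for interior zero bytes.
fn greeting(name: &str) -> Option<CString> {
    CString::new(format!("hello, {name}")).ok()
}

/// Hypothetical helper: borrows a C string from a raw pointer and returns its
/// length in bytes, excluding the terminator.
///
/// # Safety
/// `ptr` must point to a valid null-terminated string that stays alive and
/// unmodified for the duration of the call.
unsafe fn c_len(ptr: *const c_char) -> usize {
    // Building the `&CStr` requires locating the terminator, so this is not
    // a free conversion with the current pointer+size representation.
    let s: &CStr = unsafe { CStr::from_ptr(ptr) };
    // `to_bytes` hands back plain `u8` values, not `NonZeroU8`.
    s.to_bytes().len()
}

/// Hypothetical helper: size of `&CStr` measured in pointer-sized words.
/// A thin pointer would be one word; the value observed today is an
/// implementation detail, not a guarantee.
fn cstr_ref_words() -> usize {
    size_of::<&CStr>() / size_of::<*const c_char>()
}
```

For example, `greeting("world")` yields a `CString` whose pointer can be handed to `c_len`, while `greeting("a\0b")` returns `None` because of the interior zero byte. Comparing `cstr_ref_words()` against 1 shows whether `&CStr` is still wider than a raw `*const c_char` on a given toolchain.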