
Commit 30b4182

Rollup merge of #94984 - ericseppanen:cstr_from_bytes, r=Mark-Simulacrum
add `CStr` method that accepts any slice containing a nul-terminated string

I haven't created an issue (tracking or otherwise) for this yet; apologies if my approach isn't correct. This is my first code contribution.

This change adds a member fn that converts a slice into a `CStr`; it is intended to be safer than `from_ptr` (which is unsafe and may read out of bounds), and more useful than `from_bytes_with_nul` (which requires that the caller already know where the nul byte is).

The reason I find this useful is for situations like this:

```rust
let mut buffer = [0u8; 32];
unsafe {
    some_c_function(buffer.as_mut_ptr(), buffer.len());
}
let result = CStr::from_bytes_with_nul(&buffer).unwrap();
```

This code above returns an error with `kind = InteriorNul`, because `from_bytes_with_nul` expects that the caller has passed in a slice with the NUL byte at the end of the slice. But if I just got back a nul-terminated string from some FFI function, I probably don't know where the NUL byte is.

I would wish for a `CStr` constructor with the following properties:

- Accept `&[u8]` as input
- Scan for the first NUL byte and return the `CStr` that spans the correct sub-slice (see [future note below](#94984 (comment))).
- Return an error if no NUL byte is found within the input slice

I asked on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/CStr.20from.20.26.5Bu8.5D.20without.20knowing.20the.20NUL.20location.3F) whether this sounded like a good idea, and got a couple of positive-sounding responses from `@joshtriplett` and `@AzureMarker`.

This is my first draft, so feedback is welcome.

A few issues that definitely need feedback:

1. Naming. `@joshtriplett` called this `from_bytes_with_internal_nul` on Zulip, but after staring at all of the available methods, I believe that this function is probably what end users want (rather than the existing fn `from_bytes_with_nul`). Giving it a simpler name (**`from_bytes`**) implies that this should be their first choice.
2. Should I add a similar method on `CString` that accepts `Vec<u8>`? I'd assume the answer is probably yes, but I figured I'd try to get early feedback before making this change bigger.
3. What should the error type look like? I made a unit struct since `CStr::from_bytes` can only fail in one obvious way, but if I need to do this for `CString` as well then that one may want to return `FromVecWithNulError`. And maybe that should dictate the shape of the `CStr` error type also?

Also, cc `@poliorcetics` who wrote #73139 containing similar fns.
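The situation described in the commit message can be sketched as follows. This is only an illustration, and it assumes a toolchain where `CStr::from_bytes_until_nul` can be called without the `cstr_from_bytes_until_nul` feature gate (in this commit the method is still unstable). `fill_buffer`, `read_c_string`, and `strict_constructor_rejects` are hypothetical names introduced for the example; `fill_buffer` stands in for the FFI call.

```rust
use std::ffi::CStr;

/// Hypothetical stand-in for an FFI call: writes "hi" plus a nul terminator
/// at the start of `buf` and leaves the remaining bytes untouched.
fn fill_buffer(buf: &mut [u8]) {
    buf[..3].copy_from_slice(b"hi\0");
}

/// Returns the text up to the first nul byte, or `None` if the buffer holds
/// no nul byte or the bytes before it are not valid UTF-8.
fn read_c_string(buf: &[u8]) -> Option<&str> {
    CStr::from_bytes_until_nul(buf).ok()?.to_str().ok()
}

/// Reports whether the stricter constructor rejects the buffer. It does so
/// whenever the first nul byte is not the last byte of the slice.
fn strict_constructor_rejects(buf: &[u8]) -> bool {
    CStr::from_bytes_with_nul(buf).is_err()
}
```

With a zero-initialized 32-byte buffer passed through `fill_buffer`, the stricter constructor fails because the first nul is followed by padding, while `read_c_string` yields the short string.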
2 parents 463e516 + d5fe4ca commit 30b4182

2 files changed, +106 -0 lines changed
Diff for: library/std/src/ffi/c_str.rs

+69
@@ -328,6 +328,27 @@ impl FromVecWithNulError {
     }
 }

+/// An error indicating that no nul byte was present.
+///
+/// A slice used to create a [`CStr`] must contain a nul byte somewhere
+/// within the slice.
+///
+/// This error is created by the [`CStr::from_bytes_until_nul`] method.
+///
+#[derive(Clone, PartialEq, Eq, Debug)]
+#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
+pub struct FromBytesUntilNulError(());
+
+#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
+impl Error for FromBytesUntilNulError {}
+
+#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
+impl fmt::Display for FromBytesUntilNulError {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        write!(f, "data provided does not contain a nul")
+    }
+}
+
 /// An error indicating invalid UTF-8 when converting a [`CString`] into a [`String`].
 ///
 /// `CString` is just a wrapper over a buffer of bytes with a nul terminator;
@@ -1239,12 +1260,60 @@ impl CStr {
         }
     }

+    /// Creates a C string wrapper from a byte slice.
+    ///
+    /// This method will create a `CStr` from any byte slice that contains at
+    /// least one nul byte. The caller does not need to know or specify where
+    /// the nul byte is located.
+    ///
+    /// If the first byte is a nul character, this method will return an
+    /// empty `CStr`. If multiple nul characters are present, the `CStr` will
+    /// end at the first one.
+    ///
+    /// If the slice only has a single nul byte at the end, this method is
+    /// equivalent to [`CStr::from_bytes_with_nul`].
+    ///
+    /// # Examples
+    /// ```
+    /// #![feature(cstr_from_bytes_until_nul)]
+    ///
+    /// use std::ffi::CStr;
+    ///
+    /// let mut buffer = [0u8; 16];
+    /// unsafe {
+    ///     // Here we might call an unsafe C function that writes a string
+    ///     // into the buffer.
+    ///     let buf_ptr = buffer.as_mut_ptr();
+    ///     buf_ptr.write_bytes(b'A', 8);
+    /// }
+    /// // Attempt to extract a C nul-terminated string from the buffer.
+    /// let c_str = CStr::from_bytes_until_nul(&buffer[..]).unwrap();
+    /// assert_eq!(c_str.to_str().unwrap(), "AAAAAAAA");
+    /// ```
+    ///
+    #[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
+    pub fn from_bytes_until_nul(bytes: &[u8]) -> Result<&CStr, FromBytesUntilNulError> {
+        let nul_pos = memchr::memchr(0, bytes);
+        match nul_pos {
+            Some(nul_pos) => {
+                // SAFETY: We know there is a nul byte at nul_pos, so this slice
+                // (ending at the nul byte) is a well-formed C string.
+                let subslice = &bytes[..nul_pos + 1];
+                Ok(unsafe { CStr::from_bytes_with_nul_unchecked(subslice) })
+            }
+            None => Err(FromBytesUntilNulError(())),
+        }
+    }
+
     /// Creates a C string wrapper from a byte slice.
     ///
     /// This function will cast the provided `bytes` to a `CStr`
     /// wrapper after ensuring that the byte slice is nul-terminated
     /// and does not contain any interior nul bytes.
     ///
+    /// If the nul byte may not be at the end,
+    /// [`CStr::from_bytes_until_nul`] can be used instead.
+    ///
     /// # Examples
     ///
     /// ```
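
The scan performed by `from_bytes_until_nul` can be approximated in safe code, which may help when reading the hunk above. The sketch below is not the standard library implementation: `until_nul_sketch` is a hypothetical helper, it uses `Iterator::position` rather than `memchr`, it passes the sub-slice to the checked `CStr::from_bytes_with_nul` instead of the unchecked constructor, and it returns an `Option` rather than `FromBytesUntilNulError`.

```rust
use std::ffi::CStr;

/// Hypothetical safe approximation of the scan: locate the first nul byte,
/// then build a `CStr` from the sub-slice that ends at that byte.
fn until_nul_sketch(bytes: &[u8]) -> Option<&CStr> {
    // `position` stops at the first nul, so the sub-slice below cannot
    // contain an interior nul.
    let nul_pos = bytes.iter().position(|&b| b == 0)?;
    // The inclusive range keeps the terminating nul, as the checked
    // constructor requires.
    CStr::from_bytes_with_nul(&bytes[..=nul_pos]).ok()
}
```

Because the sub-slice always ends at the first nul byte, the checked constructor should never fail here; the unchecked call in the diff skips that redundant second scan.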

Diff for: library/std/src/ffi/c_str/tests.rs

+37
@@ -117,6 +117,43 @@ fn from_bytes_with_nul_interior() {
     assert!(cstr.is_err());
 }

+#[test]
+fn cstr_from_bytes_until_nul() {
+    // Test an empty slice. This should fail because it
+    // does not contain a nul byte.
+    let b = b"";
+    assert_eq!(CStr::from_bytes_until_nul(&b[..]), Err(FromBytesUntilNulError(())));
+
+    // Test a non-empty slice, that does not contain a nul byte.
+    let b = b"hello";
+    assert_eq!(CStr::from_bytes_until_nul(&b[..]), Err(FromBytesUntilNulError(())));
+
+    // Test an empty nul-terminated string
+    let b = b"\0";
+    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
+    assert_eq!(r.to_bytes(), b"");
+
+    // Test a slice with the nul byte in the middle
+    let b = b"hello\0world!";
+    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
+    assert_eq!(r.to_bytes(), b"hello");
+
+    // Test a slice with the nul byte at the end
+    let b = b"hello\0";
+    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
+    assert_eq!(r.to_bytes(), b"hello");
+
+    // Test a slice with two nul bytes at the end
+    let b = b"hello\0\0";
+    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
+    assert_eq!(r.to_bytes(), b"hello");
+
+    // Test a slice containing lots of nul bytes
+    let b = b"\0\0\0\0";
+    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
+    assert_eq!(r.to_bytes(), b"");
+}
+
 #[test]
 fn into_boxed() {
     let orig: &[u8] = b"Hello, world!\0";
