Skip to content

Commit

Permalink
A type representing an owned C-compatible wide string
Browse files Browse the repository at this point in the history
  • Loading branch information
bozaro committed Oct 20, 2016
1 parent 5b785bc commit 342f4a6
Showing 1 changed file with 72 additions and 0 deletions.
72 changes: 72 additions & 0 deletions text/0000-cwstring.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
- Feature Name: A type representing an owned C-compatible wide string
- Start Date: 2016-10-20
- RFC PR:
- Rust Issue:

# Summary

Add CWideString/CWideStr for more simple interaction with not well-formed UTF-16
external API (for example: with Windows API).

# Motivation

This RFC born from issue: [rust-lang/rust#34303](https://github.com/rust-lang/rust/issues/34303)

Many of Windows API use not well-formed UTF-16 strings. Some of this API use
null-terminated strings.

Rust lack simple null-terminated UTF-16 string conversions and now working with this
API need too many boilerplate copy-paste code like:
```
unsafe fn from_wide_null(ptr: *const u16) -> OsString {
let mut len = 0;
while *ptr.offset(len) != 0 {
len += 1;
}
OsStringExt::from_wide(slice::from_raw_parts(ptr, len as usize))
}
fn to_wide_null(s: &str) -> Vec<u16> {
self.encode_utf16().chain(Some(0)).collect()
}
```

OsString also can't be used with Windows API directly, because it stores string internally
in WTF-8 encoding.

So, this RFC try to add simple and effective way to work with UTF-16 string.

Also it can be usefull:

* Inside Rust OsString implementatino on Windows platform;
* With Java FFI (Java internally use UTF-16 strings).

# Detailed design

Copy CStr/CString as CWideStr/CWideString with using u16 instead of u8.

A preliminary implementation can be found at: https://github.com/bozaro/rust-cwstring

# Drawbacks

This classes is not generally platform specific, but it mostly useful on Windows.

Also CWideString name can be confused, because it use differ element size then
`std::wstring` in C++ world (`std::wstring` type defined as `std::basic_string<wchar_t>`,
but `sizeof(wchar_t)` is platform depended: 4 bytes on Linux, 2 bytes on Windows).

# Alternatives

Keep all as is.

# Unresolved questions

I try implement CWideStr/CWideString and got some issues/questions:

* NulError is copied as WideNulError, but it breaking currect code or name convensions.
* into_string method is removed, because unlike CString, he does not give the performance
profit. Also IntoStringError is not copied from c_str.
* I don't find good name for `u16` method like `to_bytes`.
* memchr and strlen replaced by wmemchr and wstrlen failback implementation.
* May be need better implementation fmt::Debug for CWideStr.

0 comments on commit 342f4a6

Please sign in to comment.