Support UTF-8 Everywhere #269
Comments
We are very concerned about UTF-8 support on Windows, but very unlikely to add implicit conversions as it can have unintended performance consequences. Same reason there's no implicit conversion between |
Surely it's up to the cppwinrt user to decide whether the cost of conversion is acceptable. I've been doing UTF-8 Everywhere for several years with WinAPI apps and it hasn't caused a single performance issue. And supposing such an issue did arise, it should be pretty simple to solve (for instance by using UTF-16 in the affected section of code). I think the reason there's no conversion between If cppwinrt included the implicit conversions I requested, it really would be a lot more attractive to folks like me who use UTF-8 everywhere. |
IMHO, things like string conversion should be explicit, as this can be not only a performance issue but also a correctness issue: a std::string might not actually contain a UTF-8 encoded string and, IIRC, NTFS paths might contain 16-bit values that don't form valid UTF-16 encoded code points (I hope I got the terminology correct). That being said, those explicit transformations should be as convenient as possible. |
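To make the correctness concern concrete, here is a minimal, portable sketch of an explicit conversion that can fail. The name `utf8_to_utf16` is hypothetical and is not part of cppwinrt; it only illustrates that a `std::string` holding, say, Latin-1 bytes has no valid UTF-16 result, which an implicit conversion would have no good way to report.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of cppwinrt): converts UTF-8 to UTF-16 and
// returns std::nullopt when the input is not well-formed UTF-8.
std::optional<std::u16string> utf8_to_utf16(std::string_view in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const auto b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else return std::nullopt;  // stray continuation byte or invalid lead byte
        if (in.size() - i < len) return std::nullopt;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const auto b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong encodings, surrogate code points, and values past U+10FFFF.
        static constexpr std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return std::nullopt;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

The caller has to decide what to do when the result is empty, which is exactly the decision an implicit conversion would hide. On Windows, the same check is available by calling `MultiByteToWideChar` with `CP_UTF8` and the `MB_ERR_INVALID_CHARS` flag.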
I'd make a strong point against implementing conversion constructors and operators. Beyond the performance implications there is the issue about correctness. While a Conversions must be explicit. Otherwise the ambiguity around Besides, the "UTF-8 Everywhere" mantra is more dogmatic than convincing. UTF-8 is great, for information interchange (writing files to disk, sending data across a network, etc.). For a Windows application I have yet to see a convincing argument against using UTF-16 internally throughout. |
Have you guys actually read the UTF-8 Everywhere manifesto? |
Yes. And it isn't very convincing. UTF-8 is great for data interchange. It isn't exactly well suited as an internal representation for text in a Windows application. As for this specific issue, you need to explain why assuming UTF-8 in a general purpose library is more important than allowing it to easily interface with legacy code that uses ANSI encoding. |
@4A696D: Yes, I have, and in any code that is portable I try to follow it (doing so in C++ is not always easy, though, as long as there is no standardized UTF-8 string). However, the fact of the matter is that Windows APIs (as well as Java and Qt, for that matter) mostly use wchars / UTF-16, and the round trip Windows API string -> UTF-8 -> Windows API string is not efficient, not always correct (although those are probably mostly very specific corner cases or bugs) and often simply not necessary. As I said, there definitely should be an easy way to do the conversion, but it should not be hidden. |
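One such corner case can be sketched in portable C++. The helper below, `utf16_to_utf8`, is hypothetical (not a cppwinrt or Windows API) and assumes the common policy of replacing an unpaired surrogate with U+FFFD. Under that assumption, a string holding a lone surrogate and a string holding a genuine U+FFFD produce the same UTF-8 bytes, so converting back cannot restore the original.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not part of cppwinrt): converts UTF-16 to UTF-8,
// substituting U+FFFD for any unpaired surrogate.
std::string utf16_to_utf8(std::u16string_view in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: the original value is lost here
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For example, `std::u16string(1, char16_t(0xD800))` and `std::u16string(1, char16_t(0xFFFD))` both convert to the bytes `EF BF BD`. A different policy (failing, or emitting the surrogate as-is) would trade this loss for other problems, which is another reason to keep the conversion visible at the call site.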
Also, we are coming pretty close to "If you have to do this all the time, you should rethink your design" territory. |
Our internal builds now have |
Thanks Kenny. However, this doesn't really offer cppwinrt users anything they couldn't already do for themselves. To avoid having to litter UTF-8 based code with endless |
Right, the helpers are merely provided as a convenience. Feel free to use them or not. |
In order to support the UTF-8 Everywhere principle, please consider adding the following to hstring:
- a constructor and assignment operator taking a std::string containing UTF-8 encoded text
- a conversion operator returning a std::string containing UTF-8 encoded text
Thanks!