Use fputws for outputting wide strings #1243
This should be done once per stream by the application, not per write call in the library. |
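A minimal sketch of the once-per-stream approach suggested above, assuming a Windows build against the Microsoft C runtime. `enable_wide_output` is a hypothetical helper name, not part of fmtlib; on other platforms it does nothing and reports success.

```cpp
#include <cstdio>
#ifdef _WIN32
#  include <fcntl.h>  // _O_WTEXT
#  include <io.h>     // _setmode
#endif

// Hypothetical helper (not part of fmtlib): prepare a stream for wide output
// once, at application startup. Returns true when the stream is ready.
inline bool enable_wide_output(std::FILE* f) {
#ifdef _WIN32
  // Flush pending output before switching the translation mode.
  std::fflush(f);
  // _setmode returns the previous mode, or -1 on failure.
  return _setmode(_fileno(f), _O_WTEXT) != -1;
#else
  (void)f;  // No translation mode to set outside Windows.
  return true;
#endif
}
```

An application would call `enable_wide_output(stdout)` once near the top of `main`. On Windows, a stream switched to `_O_WTEXT` should then only be used with wide output functions, which is the limitation the next comment runs into.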
Unfortunately that means I cannot use fmtlib as a drop in replacement for an application containing a mix of There is more to Microsoft's
int main(int argc, char** argv) {
  printf("Hello %s!\n", "World");
  wprintf(L"Hello %s!\n", L"World");
  fmt::print("Hello {}!\n", "World");
  fmt::print(L"Hello {}!\n", L"World");
  return 0;
}
Calling For my stdout prints, I can use a |
Another solution for the correct behavior is using |
I think the reasoning behind rejecting this PR isn't well thought out. Normally, implementing code doesn't need to worry about setting the mode of the stream, since the API handles it automatically. Since fmtlib implements stream writing by other means, it's responsible for properly setting the mode, imho. |
You can easily write your own wrapper around |
I think having it as part of fmtlib is the proper solution. |
Hmm, OK. This is not a great design, but let's do the same for consistency. |
Since when does anything Microsoft do qualify as good design? lol, thanks for reconsidering. |
@vitaut After doing some more experimentation, I've determined that The following
FMT_FUNC void vprint(std::FILE* f, wstring_view format_str, wformat_args args) {
  wmemory_buffer buffer;
  internal::vformat_to(buffer, format_str, args);
  // Code page of the current C locale.
  UINT code_page = ___lc_codepage_func();
  // First pass: query the size of the converted output in bytes.
  int len_a =
      WideCharToMultiByte(code_page, WC_NO_BEST_FIT_CHARS, buffer.data(),
                          static_cast<int>(buffer.size()), nullptr, 0, nullptr,
                          nullptr);
  if (len_a == 0)
    return;
  memory_buffer mb_buffer;
  mb_buffer.resize(len_a);
  // Second pass: convert into the sized multibyte buffer.
  WideCharToMultiByte(code_page, WC_NO_BEST_FIT_CHARS, buffer.data(),
                      static_cast<int>(buffer.size()), mb_buffer.data(),
                      len_a, nullptr, nullptr);
  internal::fwrite_fully(mb_buffer.data(), 1, mb_buffer.size(), f);
}
Aside from Microsoft's dubious handling of character encodings, the With no other changes aside from
int main(int argc, char** argv) {
  std::setlocale(LC_ALL, "en_US.UTF-8");
  std::printf("Hello %s\n", "ツ");
  std::wprintf(L"Hello %s\n", L"\x30C4");
  fmt::print("Hello {}\n", "ツ");
  fmt::print(L"Hello {}\n", L"\x30C4");
  return 0;
}
appears as (Japanese code page) It also works in the fancy new Terminal application with the UTF-8 code page |
Interesting. Are you suggesting that we should do the same? |
If the goal is functional parity with On the POSIX side of things, Linux man pages of |
I just tried it out under glibc 2.29 on Arch Linux. Violating glibc essentially uses separate vtables for low-level I/O operations. byte and wide oriented streams each have separate tables implemented in Unfortunately, |
A POSIX-compliant pair of
FMT_FUNC void vprint(std::FILE* f, string_view format_str, format_args args) {
  if (std::fwide(f, -1) >= 0) {
    errno = EILSEQ;
    return;
  }
  memory_buffer buffer;
  internal::vformat_to(buffer, format_str,
                       basic_format_args<buffer_context<char>>(args));
  internal::fwrite_fully(buffer.data(), 1, buffer.size(), f);
}
FMT_FUNC void vprint(std::FILE* f, wstring_view format_str, wformat_args args) {
  if (std::fwide(f, 1) <= 0) {
    errno = EILSEQ;
    return;
  }
  wmemory_buffer buffer;
  internal::vformat_to(buffer, format_str, args);
  buffer.push_back(L'\0');
  if (std::fputws(buffer.data(), f) == -1) {
    FMT_THROW(system_error(errno, "cannot write to file"));
  }
}
Those errno sets might be better done with exceptions, but I'll leave that decision with you. |
The behavior of
_lock_file(f);
foreach wc in buffer
  _fputwc_nolock(wc, f);
_unlock_file(f);
See also https://github.com/huangqinjin/wmain#ucrt-and-utf-8 and UCRT source. |
I agree with @huangqinjin's assessment. There's a lot of systemic nuance that fmtlib really shouldn't have to take on; even in POSIX-land. My vote is for The only discrepancy I've observed between |
Since Microsoft implemented EDIT: That if statement gets totally optimized out, so no harm in leaving it in |
OK, let's go with |
Never mind, I didn't see that you've already done this. |
@vitaut @jackoalan, I suggest not to issue See https://stackoverflow.com/questions/8947949/mixing-cout-and-wcout-in-same-program. |
I agree. |
By the time the corresponding overload of |
I'll go ahead and remove the |
Also adds fwide byte/wide orientation checking to verify streams are able to receive the character type in question. On Windows, the fwide calls are no-ops that pass through the second arg and optimize out the if statement entirely.
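The orientation check described in this commit message can be illustrated with a small sketch. The helper names `accepts_wide` and `accepts_narrow` are hypothetical and are not the functions added by the PR. The sketch assumes an implementation such as glibc, where `std::fwide` only assigns an orientation to a stream that has none yet; per the message above, the Windows calls simply pass the second argument through.

```cpp
#include <cstdio>
#include <cwchar>

// Hypothetical helpers (not fmtlib API). std::fwide(f, mode) tries to set the
// orientation of a stream that has none yet, then reports the orientation the
// stream actually has: > 0 for wide, < 0 for byte, 0 for none.

// True if `f` is wide-oriented after the call.
inline bool accepts_wide(std::FILE* f) { return std::fwide(f, 1) > 0; }

// True if `f` is byte-oriented after the call.
inline bool accepts_narrow(std::FILE* f) { return std::fwide(f, -1) < 0; }
```

Under that assumption, once `accepts_wide(f)` has succeeded on a fresh stream, a later `accepts_narrow(f)` on the same stream reports false, because the orientation is fixed until the stream is closed or reopened.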
Thanks! |
You're welcome ^^ |
The Windows console host will not correctly present wchar_t character streams unless the file mode is set to _O_WTEXT. These changes modify the various users of fwrite_fully to temporarily set the character mode with the aid of a RAII helper class.
I agree that my contributions are licensed under the {fmt} license, and agree to future changes to the licensing.
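For illustration only, a RAII helper of the kind this description mentions might look roughly like the sketch below. The class name `scoped_wide_mode` and its members are assumptions made for this example and are not taken from the PR's actual code; outside Windows the guard does nothing.

```cpp
#include <cstdio>
#ifdef _WIN32
#  include <fcntl.h>  // _O_WTEXT
#  include <io.h>     // _setmode
#endif

// Hypothetical RAII guard (not the PR's class): switches a stream to
// _O_WTEXT for the guard's lifetime and restores the previous mode afterwards.
class scoped_wide_mode {
 public:
  explicit scoped_wide_mode(std::FILE* f) : file_(f) {
#ifdef _WIN32
    std::fflush(file_);  // flush before changing the translation mode
    old_mode_ = _setmode(_fileno(file_), _O_WTEXT);  // -1 on failure
#endif
  }

  ~scoped_wide_mode() {
#ifdef _WIN32
    if (old_mode_ != -1) {
      std::fflush(file_);
      _setmode(_fileno(file_), old_mode_);  // restore the saved mode
    }
#endif
  }

  scoped_wide_mode(const scoped_wide_mode&) = delete;
  scoped_wide_mode& operator=(const scoped_wide_mode&) = delete;

  // True if the mode was actually switched (never the case outside Windows).
  bool active() const { return old_mode_ != -1; }
  std::FILE* stream() const { return file_; }

 private:
  std::FILE* file_;
  int old_mode_ = -1;
};
```

A write path would construct the guard immediately before emitting wide characters, so the mode change is scoped to that single call. This per-write switching is the cost the maintainer's first reply objects to.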