Multi-byte characters not rendered correctly when output from a C or C++ program #223

Open
tfpf opened this issue Jul 23, 2024 · 0 comments

If a C or C++ program writes multi-byte characters to the console, they are not rendered correctly. The following shell script reproduces the problem.

#! /usr/bin/env sh

pacman -S --needed mingw-w64-ucrt-x86_64-gcc mingw-w64-ucrt-x86_64-python

printf '#include <stdio.h>\nint main(void) { puts("∈√≈≡⊥"); }\n' >msys2.c
gcc msys2.c
./a
echo $(./a)

printf '#include <cstdio>\nint main(void) { std::puts("∈√≈≡⊥"); }\n' >msys2.cc
g++ msys2.cc
./a
echo $(./a)

printf 'import sys\n\nprint("∈√≈≡⊥")' >msys2.py
python msys2.py

echo "∈√≈≡⊥"

Here's the output.

warning: mingw-w64-ucrt-x86_64-gcc-14.1.0-3 is up to date -- skipping
warning: mingw-w64-ucrt-x86_64-python-3.11.9-1 is up to date -- skipping
 there is nothing to do
∈√≈≡⊥
∈√≈≡⊥
∈√≈≡⊥
∈√≈≡⊥
∈√≈≡⊥
∈√≈≡⊥

Key Observations

  • When a C or C++ program writes multi-byte characters directly to the console, the characters displayed appear unrelated to the intended ones, and can themselves be multi-byte.
  • If the output is captured with command substitution and then echoed (see echo $(./a) above), the characters are displayed correctly.
  • Writing multi-byte characters from Python or sh works as expected.
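
One possible explanation for the first observation, which I have not confirmed, is that the console decodes the program's UTF-8 bytes using a legacy code page. The sketch below shows what that would look like. It assumes code page 437, which is only a guess since I haven't checked the console's active code page, and misdecode is a hypothetical helper name, not part of any tool mentioned here.

```python
def misdecode(text, codepage="cp437"):
    """Encode text as UTF-8, then decode the bytes with a legacy code page.

    This mimics a console that interprets UTF-8 output byte by byte.
    The default code page (cp437) is an assumption, not a confirmed detail.
    """
    return text.encode("utf-8").decode(codepage)


if __name__ == "__main__":
    original = "∈√≈≡⊥"
    garbled = misdecode(original)
    # Each of the five characters is 3 bytes in UTF-8, so 15 characters come out.
    print(len(original), len(garbled))
    print(garbled)
```

If the garbled text printed here matches what the console shows for ./a, a code page mismatch is the likely cause. It would also be consistent with the second observation, because with command substitution the bytes go through a pipe and are written by the shell rather than by the C program.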

I am using 64-bit MSYS2 20230526. I didn't try this with the latest version because I didn't find any bug reports for this issue even after searching for a while.
