Description
This is a western Europe Windows 11 machine:
Python 3.11.3 (tags/v3.11.3:f3909b8, Apr 4 2023, 23:49:59) [MSC v.1934 64 bit (AMD64)] on win32
>>> import subprocess
>>> subprocess.run("echo ö", shell=True, text=True, stdout=subprocess.PIPE).stdout.strip("\n")
'”'
As you can see, there is codepage confusion. You don't get back what you wrote out.
Windows applies different codepage settings depending on context. The file encoding (also called the ANSI codepage) is not necessarily identical to the console encoding (also called the OEM codepage), see https://stackoverflow.com/a/43194047. The OEM codepage contains legacy graphical symbols like "╣" or "▒".
On my machine:
locale.getencoding()='cp1252'
ctypes.windll.kernel32.GetACP()=1252
ctypes.windll.kernel32.GetConsoleCP()=850
ctypes.windll.kernel32.GetConsoleOutputCP()=850
The character "ö" is encoded as the byte 0x94 in CP850 (see the CP850 table). In CP1252, that same byte maps to "”".
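The mismatch can be reproduced on any platform without a subprocess, using only Python's built-in codecs. This is a minimal sketch; the variable names are illustrative and not part of the original report.

```python
# Encode "ö" (U+00F6) the way a CP850 console program would write it.
raw = "\u00f6".encode("cp850")

# Decode the same byte with the ANSI codepage, as subprocess does by default
# on the machine described above, and with the console codepage.
garbled = raw.decode("cp1252")
correct = raw.decode("cp850")

print(raw)             # b'\x94'
print(ascii(garbled))  # '\u201d'  (right double quotation mark)
print(ascii(correct))  # '\xf6'    (the original character)
```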
The suggestion here is that subprocess-related code should not leave the choice of the default encoding to io.TextIOWrapper (which is documented to use locale.getencoding()), but should instead default to the value returned by GetConsoleCP().
This would be exactly the same as what the Go people decided to do.
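Until something like this is the default, a caller can pass the console codepage explicitly. The following is only a sketch of that workaround: console_encoding() and run_text() are hypothetical helper names, and it assumes Python ships a codec named "cp<N>" for the active codepage, which is not guaranteed for every codepage.

```python
import ctypes
import locale
import subprocess
import sys


def console_encoding() -> str:
    """Return a codec name for the console codepage, e.g. 'cp850'."""
    if sys.platform == "win32":
        cp = ctypes.windll.kernel32.GetConsoleCP()
        if cp:  # 0 means the process has no console attached
            return f"cp{cp}"
    # Fallback for other platforms or when no console is attached.
    return locale.getpreferredencoding(False)


def run_text(cmd: str) -> str:
    """Run a shell command and decode its stdout with the console codepage."""
    result = subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        encoding=console_encoding(),  # passing encoding implies text mode
    )
    return result.stdout
```

With this helper, run_text("echo ö") is expected to round-trip on a machine configured like the one above, because the output is decoded with cp850 instead of cp1252.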