Changing console's OutputEncoding on linux to unicode generates garbage #29735
Comments
On Ubuntu in WSL at least this does not repro
@danmosemsft it doesn't? Your output shows the garbage I'm referring to, caused by a mismatch between the console encoding used by .NET Core and the actual encoding of the terminal. It should print "привет" instead, like on Windows.
@JustArchi I misunderstood, you mean that writing through the Console object is messed up. I guess so, at least, my next string is double spaced:
Changing Console.OutputEncoding changes the encoding used to translate strings to the bytes written out to the underlying stream, whether that's to a terminal or redirected. There's no good way I'm aware of to programmatically tell the terminal to change the encoding it uses to decode written bytes, but we dutifully follow the request so that a terminal manually changed or redirected output gets the correct data. I'm not sure what else we could do here. What are you suggesting is the viable alternative? |
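To make the point above concrete, here is a minimal sketch of what "the encoding used to translate strings to bytes" means in practice. The class and method names (`EncodingBytes`, `Hex`) are made up for illustration; only `Encoding.GetBytes` and `BitConverter.ToString` are real APIs. It shows the bytes that would go to the underlying stream for the same string under two encodings, without touching the console itself.

```csharp
using System;
using System.Text;

public static class EncodingBytes
{
    // Hypothetical helper: returns the bytes an encoding produces for a
    // string, formatted as hex pairs separated by dashes.
    public static string Hex(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return BitConverter.ToString(bytes);
    }
}
```

For "привет", `EncodingBytes.Hex(text, Encoding.UTF8)` gives `D0-BF-D1-80-D0-B8-D0-B2-D0-B5-D1-82`, while `Encoding.Unicode` (UTF-16LE) gives `3F-04-40-04-38-04-32-04-35-04-42-04`. A terminal that keeps decoding as UTF-8 has no way to know the second sequence was meant as UTF-16.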
@stephentoub I thought of 3 approaches to this issue (all 3 apply when
The only question that remains is whether we can safely assume that the terminal encoding can't be changed (because we have the Unix spec and we're sure that we're dealing with a terminal). If that's the case, it should be addressed through one of the two ways above, as opposed to the existing logic that leaves the console encoding in a state that no consumer would want it to end up in. My current idea involves not doing anything when calling
The objective is to somehow improve on the current result of leaving the console encoding in a state that no consumer would want, without manually adding
Thanks!
Someone can manually change the encoding of their terminal, in which case they'd want to use Console.OutputEncoding to match. If we start ignoring that request or throwing, it breaks that use case. |
That's true, but when you can't change the encoding of the already-set terminal then runtime should have logic for detecting and applying to the encoding that was already set (whether it's utf8 or anything else), and since you can't do anything with |
To the best of my knowledge, you can't change the encoding of an already established terminal on Unix (as opposed to Windows), which is why the logical solution to me is for the runtime to detect that encoding (and adapt to it), while making all future attempts to set the output encoding a no-op.
How? |
That's a good question, I don't know, maybe you have some idea 😅. |
Majority of unix applications seem to depend on |
I'm not aware of any good way to reliably determine the encoding the terminal is using (if anyone knows of one, please share). And without that, I don't think this is actionable. |
Right |
Which is why I'm not really suggesting any particular solution, as I don't feel comfortable enough doing so. I'm just brainstorming potential approaches to the problem in order to determine whether there is anything we can do to improve the situation. If you feel there is nothing we can do for this use case, then I fully understand and the issue can be closed. I just thought that perhaps there is some possible improvement here that would avoid breaking the encoding for unaware users.
What is the use case? Why are you setting Personally, I think it's really confusing that |
@svick I'm changing the encoding on Windows to have consistent display of more obscure characters for my users; this involves things like Cyrillic characters displaying properly on non-Cyrillic OS languages (instead of a bunch of
Accidentally, this line regressed on Linux/OS X setups, since there it changed the output encoding without changing the terminal encoding, so I was forced to make my line above a conditional
The idea was that the runtime could handle it in a smart way on Linux/OS X instead of changing the encoding without the terminal, but I guess there is no good way to go about this. The end objective was to have a transparent command that works regardless of OS, instead of me sticking to the current way of
@JustArchi I think a good approach to do that is to use On Unix, that doesn't work, so you should stick with the default UTF-8: |
@svick This is actually what I've decided to go with, but I still have |
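For readers following along, the conditional approach discussed in the last few comments might look roughly like the sketch below. This is not the code posted in the thread (that code did not survive here); `ConsoleSetup`, `ChooseOutputEncoding` and `Apply` are hypothetical names, and using `Encoding.Unicode` on Windows is an assumption taken from the line in the original report.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class ConsoleSetup
{
    // Hypothetical policy: override the encoding only on Windows, where the
    // console follows the request; elsewhere keep whatever is already set
    // (UTF-8 by default on Unix, per the discussion above).
    public static Encoding ChooseOutputEncoding(bool isWindows, Encoding current)
    {
        return isWindows ? Encoding.Unicode : current;
    }

    // Applies the policy to the real console. Call once at startup, before
    // anything is written.
    public static void Apply()
    {
        bool isWindows = RuntimeInformation.IsOSPlatform(OSPlatform.Windows);
        Encoding current = Console.OutputEncoding;
        Encoding chosen = ChooseOutputEncoding(isWindows, current);

        // Skip the assignment entirely when nothing should change.
        if (!ReferenceEquals(chosen, current))
        {
            Console.OutputEncoding = chosen;
        }
    }
}
```

Keeping the decision in a pure function separate from the console call makes the policy easy to check without a real terminal. It doesn't address the underlying problem raised in this issue, it only avoids triggering it.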
It's mentioned in #52374 so I'm not sure if we need standalone issue, but the problem of course still applies and is not resolved at the time of posting. |
Hello.
I've experimented a bit in my cross-platform app by declaring
Console.OutputEncoding = Encoding.Unicode;
globally before the first console entry gets written.
On Windows, as expected, the output encoding is changed nicely and the console displays the whole range of symbols, including Cyrillic characters.
On Linux, the encoding is also changed, but the console generates garbage from this point onwards, where previously the cyrillic characters would also show properly (probably due to UTF-8 already being default there).
Judging by my own research based on this line, I'd expect that changing encoding on linux would truly be a no-op operation which doesn't affect anything, or at worst produces an exception to handle during runtime, but instead it broke display that worked previously.
I'm not sure if this is intended or not, I apologize in advance if it is but I couldn't find any issue that relates to my problem. Feel free to close it in this case.
Otherwise, feel free to check the issue yourself, it should be enough to launch code similar to below on any linux machine:
In my case, it prints ?@825B. It's important to test it with Cyrillic or something more obscure, as the 00 bytes in ASCII characters and similar will be written as NULLs on the terminal, thus not displayed.
As you can expect, this issue also affects OS X.
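The ?@825B output can be reproduced without a terminal. The sketch below is an approximation under a stated assumption: that the terminal silently drops control bytes (such as 0x00 and 0x04) and shows only printable ASCII. Real terminals may handle those bytes differently. `Utf16OnUtf8Terminal` and `PrintableAscii` are hypothetical names used only for this illustration.

```csharp
using System.Text;

public static class Utf16OnUtf8Terminal
{
    // Encodes text as UTF-16LE (Encoding.Unicode) and keeps only the bytes
    // that fall in the printable ASCII range, approximating what remains
    // visible when those bytes reach a terminal that is not decoding UTF-16.
    public static string PrintableAscii(string text)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(text);
        var visible = new StringBuilder();
        foreach (byte b in bytes)
        {
            if (b >= 0x20 && b <= 0x7E)
            {
                visible.Append((char)b);
            }
        }
        return visible.ToString();
    }
}
```

`PrintableAscii("привет")` returns `?@825B`: each Cyrillic letter becomes a printable low byte followed by 0x04. `PrintableAscii("hi")` returns `hi`, because the high bytes of ASCII characters are 0x00 and disappear, which is why plain ASCII text hides the problem.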
Thank you in advance for looking into this issue.