Problem
In short, when I try to have a conversation in Chinese, a Unicode encoding error occurs:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-6: character maps to <undefined>
(See the end for a detailed log.)
Versions
OS: Windows 11
Shell: Powershell 7.4.5
ShellGPT: v1.4.4
Some thoughts
According to the log, the command was sent to the OpenAI server correctly and the API call itself succeeded; in fact, I saw part of the response appear in the terminal. The error comes from a file being opened with encoding = 'locale', which resolved to cp1252: f = <_io.TextIOWrapper name='C:\Users\atp\AppData\Local\Temp\cache\2c4c3249b2b4c... mode='w' encoding='cp1252'>. CP1252 is a single-byte character encoding for Western European languages. It covers Latin-script text and cannot represent Chinese characters.
What is odd is that my system's default language is Chinese, and my terminal supports Chinese input and display. I expected the locale detection to yield UTF-8, and I have not found any other reports online of problems using Chinese with sgpt.
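A minimal sketch of the failure, independent of ShellGPT: it checks whether a sample Chinese string can be encoded with cp1252 and with UTF-8, then prints the encoding Python falls back to when a file is opened without an explicit one. The helper name `can_encode` and the sample string are illustrative only.

```python
import locale


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "你好，世界"  # illustrative Chinese input

print(can_encode(sample, "cp1252"))  # Western European code page
print(can_encode(sample, "utf-8"))
# The encoding Python falls back to when open() gets no explicit encoding.
print(locale.getpreferredencoding(False))
```

On Windows the last value follows the system locale (the "language for non-Unicode programs"), not the display language or the console code page, which would be consistent with the behaviour described here.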
Failed attempt
Although PowerShell uses UTF-8 by default, I tried chcp 65001 anyway. After restarting the terminal, the problem was still not solved.
Successful attempt
More or less at random, I opened Windows Settings, went to Time and Language, then Language and Region, and opened Manage Language Settings. In the window that popped up, I found this text: “Language for non-Unicode programs. This setting (System Locale) controls the language used when displaying text in programs that do not support Unicode.” The current setting was “French (France)”. I clicked Change system locale and checked Beta: Unicode UTF-8 for worldwide language support. I was then asked to restart the system, and after the restart the problem was solved.
Potential problem for international users
My system uses French as the language for non-Unicode programs because I live in France and Windows set it by default. I believe most people set the locale and time zone of their computer according to where they live, not according to every language they type in. Since the software takes its file encoding from that locale, anyone writing in a language outside the locale's code page can hit the error above. I have not tested whether the same error occurs in other operating systems and shells, but this remains a potential problem until it is officially resolved.
Suggestions
Modify the code that detects the locale configuration so that text is encoded and decoded correctly for the language actually in use.
When an encoding error occurs, the function could fall back to UTF-8 or another suitable encoding before giving up.
(Highly recommended) Add an option to choose the encoding explicitly, for example sgpt --encoding utf8.
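As a rough illustration of the first and third suggestions, the sketch below writes and reads a file with an explicit encoding that defaults to UTF-8, so the result no longer depends on the system locale. The function names, the JSON layout, and the `encoding` parameter are assumptions made for this example; they are not ShellGPT's actual cache code.

```python
import json
import tempfile
from pathlib import Path


def write_cache(path: Path, messages: list[str], encoding: str = "utf-8") -> None:
    """Write messages as JSON with an explicit encoding (hypothetical helper)."""
    with path.open("w", encoding=encoding) as f:
        # ensure_ascii=False keeps non-ASCII text readable in the file.
        json.dump(messages, f, ensure_ascii=False)


def read_cache(path: Path, encoding: str = "utf-8") -> list[str]:
    """Read messages back using the same explicit encoding."""
    with path.open("r", encoding=encoding) as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = Path(tmp) / "chat.json"
        write_cache(cache_file, ["你好"])
        print(read_cache(cache_file) == ["你好"])
```

A value taken from a command-line option such as the proposed --encoding could be passed straight through as the `encoding` argument. Separately, Python's UTF-8 mode (the PYTHONUTF8=1 environment variable, Python 3.7 and later) makes UTF-8 the default for files opened without an explicit encoding and may serve as a user-side workaround; that has not been verified against sgpt here.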
BTW
I think ShellGPT is an excellent project and a practical tool, but:
Using the first message of a dialogue as the title of its history file is very inconvenient, especially when the sentence is long. The command needed to continue the dialogue becomes very long, and the title can easily contain characters that are illegal in file names, so the file cannot be created. One potential solution is to treat all conversations without an explicitly specified chat name as the same chat until sgpt --new is entered, after which a new conversation begins.
Please provide a convenient function to delete specific and all conversation records.
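Separately from the sgpt --new idea above, the illegal file name problem could also be reduced by deriving a short, safe name from the first message. The sketch below is one possible approach and not how ShellGPT names its files; `chat_file_name` and `MAX_STEM` are hypothetical names chosen for this example.

```python
import hashlib
import re

MAX_STEM = 40  # illustrative cap on the readable part of the name


def chat_file_name(first_message: str) -> str:
    """Derive a short file name that avoids characters Windows forbids."""
    # Collapse forbidden characters, control characters, and whitespace.
    stem = re.sub(r'[<>:"/\\|?*\x00-\x1f\s]+', "_", first_message).strip("_.")
    # A short digest keeps names distinct after truncation.
    digest = hashlib.sha256(first_message.encode("utf-8")).hexdigest()[:8]
    return f"{stem[:MAX_STEM]}-{digest}" if stem else digest


if __name__ == "__main__":
    print(chat_file_name('What does "a/b" mean in C:\\path? ' * 5))
```

The readable prefix keeps the name recognisable, while the digest suffix distinguishes long messages that share the same first 40 characters.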
Full log