JSON API: Encoding? #2273
Comments
As this is quite a fundamental change, I guess it should get some testing in rc2. I guess we won't have problems if the JSON is written to a file; I'm not so sure about when it gets output on the console, or piped and processed badly by the other side. |
When you write text (strings) to stdout/stderr, then they are encoded to bytes using an encoding guessed by Python. That's independent of whether stdout is connected to a TTY/terminal or redirected to a file. |
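To make the point above concrete, here is a minimal sketch of how a program could sidestep Python's guessed encoding by writing already-encoded UTF-8 bytes to a binary stream. The helper name `dump_json_utf8` is hypothetical and is not taken from borg's code; it only illustrates the idea.

```python
import io
import json


def dump_json_utf8(obj, raw):
    """Serialize obj as JSON and write it as UTF-8 bytes to a binary stream.

    `raw` is any binary file-like object, e.g. sys.stdout.buffer.
    Writing bytes bypasses the text-layer encoding Python guessed
    for sys.stdout (which is exposed as sys.stdout.encoding).
    """
    data = json.dumps(obj, ensure_ascii=False).encode("utf-8")
    raw.write(data + b"\n")
    return data


# Illustration with an in-memory stream instead of the real stdout.
buf = io.BytesIO()
dump_json_utf8({"name": "Bäckup"}, buf)
```

With the real standard output one would pass `sys.stdout.buffer`. Whether a terminal then renders those bytes correctly still depends on the terminal, which is the concern raised in the next comment.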
Yes, but you suggested to always use utf-8. So how does e.g. a cygwin console or latin1/ascii console react when you output utf-8 on it? Or when doing |
Depends on #2925; Note:
The alternative is to make step one of using --json: "Replicate the way Python guesses encodings [which changes over Python releases]." i.e. "Use Python.". That's not acceptable. |
well - a completely ascii-safe way could be to do unicode-escape; then all unicode is escaped as |
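One way to get ASCII-only output along these lines, sketched here with the standard `json` module rather than the `unicode-escape` codec itself, is to keep `ensure_ascii=True` (the default of `json.dumps`). The helper name and the sample path are made up for illustration.

```python
import json


def dumps_ascii_safe(obj):
    """Return JSON text containing only ASCII characters.

    With ensure_ascii=True, every non-ASCII character is written
    as a \\uXXXX escape, so the result survives any ASCII-compatible
    terminal or pipe unchanged.
    """
    return json.dumps(obj, ensure_ascii=True)


text = dumps_ascii_safe({"path": "/home/user/Bäckup"})
text.encode("ascii")  # does not raise: the output is pure ASCII
restored = json.loads(text)  # any JSON parser turns the escapes back into text
```

The trade-off is readability: a human looking at the console sees escapes instead of the original characters, while a JSON consumer gets the same data either way.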
just as an idea - why not skip support for latin1 and other non-unicode terminals for now? latin1 symbols inside utf8 will look the same on a latin1 terminal, I suppose. Other symbols will not be readable anyway; there is no good solution for this. And there is |
I prepared an initial, functionally incomplete patch that I was completely dissatisfied with. I've been working to fix this for good by replacing most of these interactions (os.environ, input/yes, get_passphrase, ...) with an iosys-class that determines the encoding (from Python) and decodes stuff. But this is still incomplete and touches many of the more annoying parts of the code, so it may be reasonable to just go forward with rc2 and perhaps even 1.1.0 without having this resolved yet. On most (Linux/BSD) systems it will "mostly just work", because UTF-8 is a very widespread locale codeset and typically assumed. (OpenBSD has an especially good grip on things here for a Unix, because they only support UTF-8 and 7-bit ASCII.) In this case it may be best to add a short note in the docs to say that the encoding will be finalized to UTF-8 later. This will fall apart on Linux when no locale is configured (because Python will fall back to 7-bit ASCII), or when glibc thinks no locale is configured or considers the configuration invalid (e.g. partial or missing locale files). And of course with every locale that is not UTF-8. |
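For anyone wanting to check which case they are in, the following is a rough sketch of how a caller might test whether the active locale codeset is UTF-8. The function `is_utf8_codeset` is a hypothetical helper, not part of borg; it relies only on the standard `codecs` and `locale` modules.

```python
import codecs
import locale


def is_utf8_codeset(name):
    """Return True if the codeset name is an alias of UTF-8.

    codecs.lookup() normalizes aliases, e.g. "UTF-8" and "utf8"
    both map to "utf-8", while "ANSI_X3.4-1968" (what glibc reports
    for the C/POSIX locale) maps to "ascii".
    """
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        # Unknown codeset name: treat it as "not UTF-8".
        return False


# The codeset Python derives from the current locale settings.
current = locale.getpreferredencoding(False)
locale_is_utf8 = is_utf8_codeset(current)
```

Note that newer Python releases have changed how the C/POSIX locale is handled, so the result can differ between Python versions on the same system.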
OK, so let's have some docs now and the fix later. |
…gbackup#3009) (cherry picked from commit 133e847)
From https://docs.python.org/3.4/library/sys.html#sys.stdin / sys.stdout / sys.stderr:
More recent docs:
|
The docs were added on Sep 9, 2017 in #3019 (document utf-8 locale requirement for json mode). It looks like you forgot to remove the "documentation" label. |
Thanks for the hint, I removed the documentation label. |
For stdin/stdout/stderr and JSON emitted on stdout (see frontends.rst), I guess we could extend #3019 and just point there from the docs, so users invoking borg can adjust their environment variables if they do not already use a locale with utf-8 encoding. https://docs.python.org/3.8/library/sys.html#sys.stdin reads: "Under all platforms, you can override the character encoding by setting the PYTHONIOENCODING environment variable before starting Python or by using the new -X utf8 command line option and PYTHONUTF8 environment variable. However, for the Windows console, this only applies when PYTHONLEGACYWINDOWSSTDIO is also set." |
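As a small illustration of the environment-variable approach quoted above, this sketch starts a child Python interpreter with `PYTHONIOENCODING` set and reports the stdout encoding the child ends up with. The helper name `child_stdout_encoding` is made up; a frontend launching borg could pass a similar `env` dict, but that part is an assumption and not shown here.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and return the
    encoding the child reports for its sys.stdout."""
    env = dict(os.environ)
    # Drop PYTHONUTF8 so only PYTHONIOENCODING influences the result.
    env.pop("PYTHONUTF8", None)
    env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


encoding = child_stdout_encoding("utf-8")
```

Setting the variable only for the child process keeps the change local to that one invocation instead of altering the whole session.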
Hmm, did I miss something, or can this "fix" just be recommending to use …? PYTHONUTF8 seems way too intrusive and influences how a lot of stuff works; this could break stuff that worked before. |
document another way to get UTF-8 encoding on stdin/stdout/stderr, fixes #2273
From #2249