The text encoding is different when fetching links #2152

Closed
a84r7a3rga76fg opened this issue Jan 1, 2022 · 6 comments

Comments

@a84r7a3rga76fg

a84r7a3rga76fg commented Jan 1, 2022

The text encoding is UCS-2 LE BOM when I fetch links. Is there a special command or option I can use in the conf to make it UTF-8?

@mikf
Owner

mikf commented Jan 3, 2022

when I fetch links

You mean writing the results of -g to a file?
There is no option to set the output encoding in gallery-dl, but specifying an encoding using PYTHONIOENCODING might work.

@a84r7a3rga76fg
Author

Yes, I forgot to clarify that. With the -g option the output is UCS-2 LE BOM, but with any other command it is UTF-8.

@mikf
Owner

mikf commented Jan 6, 2022

I did a bit of testing. Even on Windows it uses UTF-8 as output encoding when writing to a terminal, but Python switches to some other encoding when redirecting stdout to a file (gallery-dl -g ... > file).

Setting the PYTHONIOENCODING environment variable lets you force Python's output encoding in all cases, even when doing the redirect. So just do set PYTHONIOENCODING=utf-8 before running gallery-dl or set it globally.
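
To check this locally, here is a small sketch. It is not part of gallery-dl, and the helper name child_stdout_encoding is made up for illustration. It starts a child Python whose stdout is a pipe rather than a terminal and reports which encoding that child picks, with and without PYTHONIOENCODING:

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(forced=None):
    """Return the stdout encoding a child Python reports when its output is piped.

    `forced` is the value to put into PYTHONIOENCODING, or None to leave it unset.
    """
    env = dict(os.environ)
    # Drop any inherited value so that only `forced` decides the outcome.
    env.pop("PYTHONIOENCODING", None)
    if forced is not None:
        env["PYTHONIOENCODING"] = forced

    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,  # a pipe, so the child's stdout is not a terminal
        check=True,
    )
    # Encoding names are ASCII; this assumes the child's encoding is ASCII-compatible.
    name = result.stdout.decode("ascii").strip()
    # Normalise the spelling, e.g. "UTF-8" and "utf8" both become "utf-8".
    return codecs.lookup(name).name


if __name__ == "__main__":
    print("default when piped:         ", child_stdout_encoding())
    print("with PYTHONIOENCODING=utf-8:", child_stdout_encoding("utf-8"))
```

The first line depends on the platform and locale; the second should report utf-8 regardless.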

@Hrxn
Contributor

Hrxn commented Jan 7, 2022

Maybe a bit shorter: You can also create an environment variable by using good old CMD:

SetX PYTHONIOENCODING utf-8

Adds it to the current user (HKCU in the Registry)

SetX PYTHONIOENCODING utf-8 /m

To set it machine-wide (HKLM)

You can also use PowerShell, of course.

User:

[Environment]::SetEnvironmentVariable('PYTHONIOENCODING', 'utf-8', 'User')

Machine:

[Environment]::SetEnvironmentVariable('PYTHONIOENCODING', 'utf-8', 'Machine')

Setting it system-wide (HKLM) requires administrative privileges, of course. Setting it for the user works in PowerShell, and should work with SetX as well.
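
One thing to keep in mind: a variable stored with SetX is only picked up by processes started afterwards, not by a console window that is already open. A quick way to see what a given Python process actually received is a sketch like the following. The function name pythonioencoding_status is hypothetical and only used for illustration:

```python
import os


def pythonioencoding_status(environ=None):
    """Describe PYTHONIOENCODING as seen in the given environment mapping.

    Defaults to the environment of the current process.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("PYTHONIOENCODING")
    if not value:
        return "PYTHONIOENCODING is not set in this process"
    # The variable has the form "encodingname:errorhandler"; both parts are optional.
    encoding, _, errors = value.partition(":")
    return "encoding={!r}, errors={!r}".format(encoding or None, errors or None)


if __name__ == "__main__":
    print(pythonioencoding_status())
```

If this reports that the variable is not set, open a new console window and run it again before concluding that the setting had no effect.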

@a84r7a3rga76fg
Author

I tried those but it didn't work

mikf added a commit that referenced this issue Feb 26, 2023
(#1621, #2152, #2529)

Allow setting custom input/output encodings and options
without having to rely on Python's defaults.
@mikf
Owner

mikf commented Apr 19, 2023

Since commit e480a93 (v1.25.0), it is possible to force a specific encoding with the output.stdout option and its counterparts for the other standard streams.
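
To illustrate what forcing an encoding on a stream means in plain Python, here is a minimal sketch using the standard library's TextIOWrapper.reconfigure() (Python 3.7+). It is not taken from gallery-dl's source and makes no claim about how the option is implemented. The helper names force_encoding and demo are assumptions made for this example, and an in-memory buffer stands in for a redirected stdout:

```python
import io


def force_encoding(stream, encoding, errors="replace"):
    """Switch an existing text stream to `encoding` (hypothetical helper)."""
    stream.reconfigure(encoding=encoding, errors=errors)
    return stream


def demo():
    """Write non-ASCII text through a stream that started out as cp1252."""
    raw = io.BytesIO()  # stands in for a file that stdout was redirected to
    stream = io.TextIOWrapper(raw, encoding="cp1252")
    force_encoding(stream, "utf-8")
    stream.write("café")
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    print(demo())
```

The same call works on sys.stdout itself, for example force_encoding(sys.stdout, "utf-8"), as long as sys.stdout has not been replaced by an object without reconfigure().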


On Windows, Python seems to use UTF-8 when writing to a terminal and a locale-specific encoding when its output is redirected to a file. From the Python documentation:

"On Windows, UTF-8 is used for the console device. Non-character devices such as disk files and pipes use the system locale encoding (i.e. the ANSI codepage). Non-console character devices such as NUL (i.e. where isatty() returns True) use the value of the console input and output codepages at startup, respectively for stdin and stdout/stderr. This defaults to the system locale encoding if the process is not initially attached to a console."

(https://docs.python.org/3/library/sys.html#sys.stdout)
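
The rule quoted above can be observed with a short diagnostic. This is only a sketch; describe_stream is a hypothetical helper name. It reports the encoding of a text stream, whether the stream is attached to a terminal, and the locale encoding that Python falls back to for files and pipes:

```python
import io
import locale
import sys


def describe_stream(stream):
    """Collect the facts that decide which encoding a text stream ends up with."""
    return {
        "encoding": stream.encoding,
        "isatty": stream.isatty(),
        # The locale encoding, queried without changing the current locale.
        "locale_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    # Report on stderr so the result stays visible when stdout is redirected.
    print("stdout:", describe_stream(sys.stdout), file=sys.stderr)
    # An in-memory stream for comparison; it is never a terminal.
    memory = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    print("memory:", describe_stream(memory), file=sys.stderr)
```

Running it once directly in a console and once with stdout redirected to a file should show the encoding change described in the quote.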

@mikf mikf closed this as completed Apr 19, 2023

3 participants