pipenv doesn't work with locale encoding different than UTF-8 on Linux #3131

Closed
vstinner opened this issue Oct 30, 2018 · 8 comments
Labels
Type: Bug 🐛 This issue is a bug.

@vstinner

Hi,

I wanted to try pipenv, but it doesn't work with the fr_FR locale.

Versions:

  • pipenv 2018.10.13
  • Python 3.6.6
  • Fedora 28

$ python3 -m venv env
$ env/bin/python -m pip install pipenv
$ LANG=fr_FR env/bin/pipenv install
Traceback (most recent call last):
  File "env/bin/pipenv", line 11, in <module>
    sys.exit(cli())
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/cli/command.py", line 249, in install
    editable_packages=state.installstate.editables,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 1975, in do_install
    skip_lock=skip_lock,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 1282, in do_init
    pypi_mirror=pypi_mirror,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 708, in do_install_dependencies
    bold=True,
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/utils.py", line 260, in echo
    file.write(message)
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026' in position 59: ordinal not in range(256)

It seems like pipenv likes non-ASCII characters for fancy output.
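The failure in the traceback can be reproduced without pipenv. The following is a minimal sketch, assuming Python 3; `write_to_stream` is a hypothetical helper, not part of pipenv or click. It writes the same kind of message (one containing U+2026, the horizontal ellipsis) to a text stream whose encoding is Latin-1, which is what `LANG=fr_FR` selects here.

```python
import io


def write_to_stream(text, encoding, errors="strict"):
    """Write text to an in-memory text stream and return the encoded bytes.

    Hypothetical helper: it stands in for sys.stdout under a given locale.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


message = "Installing dependencies\u2026"

try:
    # Latin-1 has no code point for U+2026, so a strict write fails.
    write_to_stream(message, "latin-1")
except UnicodeEncodeError as exc:
    print("latin-1 failed:", exc.reason)

# The same message is written without error to a UTF-8 stream.
print(write_to_stream(message, "utf-8"))
```

The default `errors="strict"` is what turns an unrepresentable character into an exception; any other error handler would let the write succeed with a degraded message.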

@serhiy-storchaka

Emojis are not the only cause. Pseudographics used in a progress bar will cause the same kind of errors.

@vstoykov

If you want a French locale (or any other locale) in 2018, you should be using a UTF-8-aware locale like fr_FR.UTF-8. No one should use a locale that is not UTF-8 aware. Even when you want the C locale, you should be using C.UTF-8.

@techalchemy
Member

This is unfortunate but you have to talk to Kenneth about it :( We really ought to just ignore them if they can’t be printed, but click also demands utf8 locale settings.

In the interim you can set PIPENV_HIDE_EMOJIS=1 to make things play nicely. Let me know if that helps, and thanks for taking the time to contribute.

I have to admit I’m curious to see what exactly you’re working on :)

@vstinner
Author

> If you want a French locale (or any other locale) in 2018, you should be using a UTF-8-aware locale like fr_FR.UTF-8.

Well, it's just an example of a locale. There are other locales which don't use UTF-8, but ASCII, ShiftJIS or anything else.

> This is unfortunate but you have to talk to Kenneth about it :( We really ought to just ignore them if they can’t be printed, but click also demands utf8 locale settings.

Oh.

@techalchemy
Member

@vstinner I did just rewrite some of our output encoding to use translation maps and native encodings, so I’m wondering how I can solve this most helpfully. You certainly would know more on the subject so if you have any insight I’d be curious. Currently we basically use locale.getpreferredencoding() with some small hacks to avoid setting it on Linux.

To write output we essentially use this approach:

UNICODE_TO_ASCII_TRANSLATION_MAP = {
    8230: u"...",
    8211: u"-"
}


def decode_output(output):
    if not isinstance(output, six.string_types):
        return output
    try:
        output = output.encode(DEFAULT_ENCODING)
    except (AttributeError, UnicodeDecodeError):
        if six.PY2:
            output = unicode.translate(vistir.misc.to_text(output),
                                            UNICODE_TO_ASCII_TRANSLATION_MAP)
        else:
            output = output.translate(UNICODE_TO_ASCII_TRANSLATION_MAP)
    output = output.decode(DEFAULT_ENCODING)
    return output

I’m wondering if I shouldn’t just use an ignore-errors re-encoding approach there.
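As an illustration of what an "ignore errors" re-encoding could look like, here is a minimal Python 3 sketch. It is not pipenv's code: `to_printable` and `TRANSLATION_MAP` are hypothetical names, and the map only repeats the two code points quoted above (8230 is U+2026, 8211 is U+2013). Known characters are translated first, and whatever the target encoding still cannot represent is dropped.

```python
# Same two mappings as the map quoted above:
# U+2026 (horizontal ellipsis) and U+2013 (en dash).
TRANSLATION_MAP = {0x2026: "...", 0x2013: "-"}


def to_printable(text, encoding):
    """Return text reduced to what the target encoding can represent.

    Hypothetical helper, assuming `text` is a Python 3 str.
    """
    try:
        # Fast path: nothing to do if the text already encodes cleanly.
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        pass
    # Substitute the characters we have an ASCII stand-in for ...
    translated = text.translate(TRANSLATION_MAP)
    # ... then drop anything else the encoding cannot represent.
    return translated.encode(encoding, errors="ignore").decode(encoding)


print(to_printable("Installing\u2026 \U0001F40D", "latin-1"))
```

One trade-off of `errors="ignore"` is that characters vanish silently; `errors="replace"` would substitute `?` instead, which keeps the loss visible in the output.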

@hroncok
Contributor

hroncok commented Oct 30, 2018

If the locale is C, it will be coerced to C.utf-8 by Python.

If the locale is not C and not utf-8 based, something is wrong (at least on Linux) and maybe the user just needs to be told (i.e. do what click does: abort).
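The "tell the user and abort" option could be sketched roughly as follows. This is an assumption-laden illustration, not click's actual check: `is_utf8_encoding` and `check_locale_or_abort` are hypothetical names, and the only library calls used are `codecs.lookup` and `locale.getpreferredencoding`.

```python
import codecs
import locale


def is_utf8_encoding(name):
    """Return True if the encoding name resolves to Python's UTF-8 codec."""
    try:
        # codecs.lookup normalizes aliases such as "UTF8" or "utf_8".
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        # Unknown encoding names are treated as "not UTF-8".
        return False


def check_locale_or_abort():
    """Exit with an explanatory message when the locale is not UTF-8 based."""
    encoding = locale.getpreferredencoding(False)
    if not is_utf8_encoding(encoding):
        raise SystemExit(
            "The locale encoding is %r, not UTF-8; "
            "try a locale such as fr_FR.UTF-8 or C.UTF-8." % encoding
        )
```

A tool would call `check_locale_or_abort()` once at startup, before printing anything. Whether aborting is friendlier than degrading the output is exactly the design question discussed in this thread.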

@vstoykov

vstoykov commented Oct 30, 2018

@techalchemy looking at your code for fail-safe decoding/encoding of text, it reminds me of a converter that I wrote in the past, which uses Python's codecs.register_error. You can look at my convert-encoding.py.

The only downside with this approach is that you are registering this converter globally and if this technique is used by different modules then there is a chance of name collision if names are not carefully chosen.
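For reference, a minimal sketch of the `codecs.register_error` technique might look like this. It is not the convert-encoding.py script mentioned above (whose contents are not shown in this thread); `FALLBACKS`, `_translate_or_drop`, and the handler name `example_pkg.translate_or_drop` are all hypothetical. The package-style prefix in the handler name is one way to reduce the risk of the global name collision described above.

```python
import codecs

# Hypothetical stand-ins for characters that legacy encodings often lack.
FALLBACKS = {"\u2026": "...", "\u2013": "-"}


def _translate_or_drop(error):
    """Encode-error handler: substitute known characters, drop the rest."""
    if not isinstance(error, UnicodeEncodeError):
        # This handler only makes sense for encoding, so re-raise otherwise.
        raise error
    bad = error.object[error.start:error.end]
    replacement = "".join(FALLBACKS.get(ch, "") for ch in bad)
    # Return the replacement text and the position where encoding resumes.
    return replacement, error.end


# Handlers live in one process-wide registry, hence the namespaced name.
codecs.register_error("example_pkg.translate_or_drop", _translate_or_drop)

encoded = "Locking\u2026 \u2713".encode(
    "latin-1", errors="example_pkg.translate_or_drop"
)
print(encoded)
```

Because the handler is selected by name through the `errors` argument, it also works with `open()` and `io.TextIOWrapper`, so the policy can be attached to the output stream once rather than applied at every call site.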

techalchemy added a commit that referenced this issue Oct 30, 2018
- Drops any unmapped non-ascii characters on non-utf8 systems
- Fixes #3131

Signed-off-by: Dan Ryan <dan@danryan.co>
@techalchemy techalchemy added the Type: Bug 🐛 This issue is a bug. label Oct 30, 2018
@techalchemy techalchemy self-assigned this Oct 30, 2018

@vstinner
Author

Thank you ;-)
