Skip to content

Removing LANG env var from 3.10-rc and templates #570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 13, 2022

Conversation

henryh9n
Copy link
Contributor

@henryh9n henryh9n commented Jan 8, 2021

Since ENV LANG C.UTF-8 is apparently no longer necessary, removed those from 3.10-rc and templates not to cause confusion.

@tianon
Copy link
Member

tianon commented Jan 8, 2021

Unfortunately this won't work as-is (post-merge we'll have @docker-library-bot immediately apply the template change across all image variants/versions), but this is really interesting!

From my reading of that issue, it sounds like it's been unnecessary since Python 3.7? 😲

(It's too bad 3.6 isn't quite EOL yet or this would be really easy for us to remove! 😄)

@henryh9n
Copy link
Contributor Author

henryh9n commented Jan 9, 2021

Unfortunately this won't work as-is (post-merge we'll have @docker-library-bot immediately apply the template change across all image variants/versions), but this is really interesting!

From my reading of that issue, it sounds like it's been unnecessary since Python 3.7? 😲

(It's too bad 3.6 isn't quite EOL yet or this would be really easy for us to remove! 😄)

Yeah as far as I can see it is obsolete for at least version 3.7 (there are some discussions for earlier versions, but I'm not sure :)) So what should we do? Will changing only for 3.10-rc work? (ofc later removing it in templates) or should we wait for 3.6's end of the life?

@edmorley
Copy link
Contributor

Hi! Now that Python 3.6 is EOL, is this worth reconsidering? :-)

@henryh9n
Copy link
Contributor Author

@edmorley Thanks for the reminder. Done!
cc: @tianon

@henryh9n
Copy link
Contributor Author

@edmorley @tianon Checks look good. Should we merge?

@tianon
Copy link
Member

tianon commented Apr 13, 2022

Yes, thank you! Was waiting for the checks but hadn't gotten back to it yet. 🙏

@tianon tianon merged commit 0b9aee9 into docker-library:master Apr 13, 2022
docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request Apr 13, 2022
Changes:

- docker-library/python@0b9aee9: Removing LANG env var (docker-library/python#570)
- docker-library/python@d298086: Merge pull request docker-library/python#716 from edmorley/remove-with-system-ffi
- docker-library/python@028012a: Remove redundant `--with-system-ffi` configure option
- docker-library/python@1999778: Restore /usr/local/bin/python as a relative symlink to /usr/local/bin/python3, rather than an absolute one (docker-library/python#714)
@henryh9n henryh9n deleted the lang_env_removed branch April 13, 2022 21:34
@gritvald
Copy link

gritvald commented Apr 15, 2022

We are using python:3.9-slim-buster in our projects and after switched to python:3.9-slim-buster@sha256:9fd7e4aaa6aae1d390d09400cad6c14508f5a2cc543987f162c9d3da753e2879 (image that has removed ENV LANG C.UTF-8 from Dockerfile) our apps was generating a lots of UnicodeEncodeErrors.

After switching back to python:3.9-slim-buster@sha256:59356005f0cd210ab8f81f08f2aa78528c6049a500cd241f06e209b0dff09515 problems gone.

@henryh9n Let me know if you need some more info.

@sagata1999
Copy link

sagata1999 commented Apr 15, 2022

python:3.8 either falls apart to UnicodeEncodeErrors in cause of 'ascii' not knowing about utf-8 characters (anyway I'm thinking so). But, when we settled up ENV LANG=C.UTF-8 in our Dockerfile issue seems to disappear...

@henryh9n

@tianon
Copy link
Member

tianon commented Apr 15, 2022 via email

@mauriziomocci
Copy link

We are using python:3.9-slim-buster in our project. After a "friday afternoon deploy" before holidays our app started generating lots of UnicodeEncodeErrors. We've lost 3 hours investigating the issue :(

I attach a useful image that describes the problem:

1650045897665

@grapo
Copy link

grapo commented Apr 15, 2022

In our case there is a problem with uWSGI when using latest python:3.9-slim-buster
Uwsgi sets encoding of stderr and stdout stream (https://github.com/unbit/uwsgi/blob/master/plugins/python/python_plugin.c#L598) based on locale.getpreferredencoding()

We place print(locale.getlocale(locale.LC_CTYPE), locale.getpreferredencoding(), sys.flags.utf8_mode) in code and run uwsgi:

$ uwsgi --ini /uwsgi.ini
...
(None, None) ANSI_X3.4-1968 0
...

$ LANG=C.UTF-8 uwsgi --ini /uwsgi.ini
...
('en_US', 'UTF-8') UTF-8 0
...

If encoding is set to ANSI_X3.4-1968 then any string with nonascii character send to logger causes UnicodeEncodeError

Currently I have no idea why uwsgi don't use system locale

@lambda
Copy link

lambda commented Apr 15, 2022

We ran into issues after updating as well; scripts which use Click will refuse to run without a Unicode locale.

Changing the locale like this has some pretty wide-ranging effects, I think it would be better to revert this change and leave the locale as C.UTF-8. This is a breaking change for a number of use cases, and anyone who needed a different locale would have already overridden this in their use case; it seems like the change causes more problems than it solves.

@lambda
Copy link

lambda commented Apr 15, 2022

Huh. An attempt to create a minimal example of the failure using Click is not showing the issue. I may need to dig a bit to figure out a minimal set of steps to reproduce.

@drzraf
Copy link

drzraf commented Apr 17, 2022

This update broke all my CI, not only because it changed the basic behavior of python:alpine but of all available images making very hard to track it down and to downgrade to a working version.

Ex:
docker run -ti python:alpine3.14 sh -c "pip3 install --user babel && python -c 'from babel.core import default_locale, Locale; print(Locale.parse(default_locale(\"LC_NUMERIC\")))'"

  • Before: C.UTF-8
  • Now: None

And consequently:
docker run -ti python:alpine3.14 sh -c "pip3 install --user babel && python3 -c 'from babel.numbers import parse_decimal; parse_decimal(\"32.2\");'"

  • Before: works
  • Now:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/.local/lib/python3.10/site-packages/babel/numbers.py", line 729, in parse_decimal
    group_symbol = get_group_symbol(locale)
  File "/root/.local/lib/python3.10/site-packages/babel/numbers.py", line 332, in get_group_symbol
    return Locale.parse(locale).number_symbols.get('group', u',')
AttributeError: 'NoneType' object has no attribute 'number_symbols'

I highly suggest to roll this back thinking about all the current containers breaking in strange ways without any clear and easily identifiable reason.

@martin-thoma
Copy link

This also broke my CI. I don't completely grasp it, but opening files with German umlauts in the path doesn't work anymore:

with open("Ausführung.pdf", "rb") as fp:  # shortened a bit
    data = fp.read()

suddenly gives

'ascii' codec can't encode character '\xfc' in position 44: ordinal not in range(128)

@tianon
Copy link
Member

tianon commented Apr 19, 2022

Reverted via #719 😞 🙈

docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request Apr 19, 2022
Changes:

- docker-library/python@1cf43e7: Merge pull request docker-library/python#719 from docker-library/revert-570-lang_env_removed
- docker-library/python@f871d04: Revert "Removing LANG env var (docker-library/python#570)"
@edmorley
Copy link
Contributor

This also broke my CI. I don't completely grasp it, but opening files with German umlauts in the path doesn't work anymore:

@MT-Cash I don't suppose you have some more details? (Just to help anyone in the future trying to work out when it's finally safe to remove LANG)

I can't reproduce your exception using any of:

$ docker run -ti python:3.10 sh -c 'unset LANG && touch "Ausführung.pdf" && python -c "f = open(\"Ausführung.pdf\", \"rb\"); f.read()"'
$ docker run -ti python:3.8 sh -c 'unset LANG && touch "Ausführung.pdf" && python -c "f = open(\"Ausführung.pdf\", \"rb\"); f.read()"'
$ docker run -ti python:alpine3.14 sh -c 'unset LANG && touch "Ausführung.pdf" && python -c "f = open(\"Ausführung.pdf\", \"rb\"); f.read()"'

@edmorley
Copy link
Contributor

This update broke all my CI, not only because it changed the basic behavior of python:alpine

@drzraf I can reproduce the failure in your post.

However, some observations:

  • The expected result in your first example seems to be incorrect? It says Before: C.UTF-8 but I got en_US_POSIX for it.
  • The particular issue you saw with babel only seems to affect alpine (unlike the other issues which were reported on other distros too). For example it works on Debian on multiple Python version and Debian version variants, eg:
    docker run -ti python:3.10 sh -c "unset LANG && pip3 install --user babel && python3 -c 'from babel.numbers import parse_decimal; parse_decimal(\"32.2\");'"
    docker run -ti python:3.7-buster sh -c "unset LANG && pip3 install --user babel && python3 -c 'from babel.numbers import parse_decimal; parse_decimal(\"32.2\");'"
    

@edmorley
Copy link
Contributor

Whilst #719 resolved this issue (by reverting the LANG removal), the code comment explaining why LANG is needed still references the already-fixed Python bug:

# http://bugs.python.org/issue19846
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
ENV LANG C.UTF-8

It feels like figuring out what broke here, reporting it upstream if appropriate (given the bugs PEPs I've seen seem to suggest this should be resolved already?), and updating the code comment to indicate why LANG is still needed might help prevent future similar breakages (plus allow for cleanup of LANG once appropriate).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants