-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix UnicodeEncodeError #578
Conversation
if fail to encode to utf8, unencoded charactor is written.
Can you please provide a reproducible example which shows when this patch is needed? |
This code throw UnicodeEncodeError in out.write(jp) # -*- coding: utf-8 -*-
from cStringIO import StringIO
out = StringIO()
out.write("test\n")
jp="#ああああ"
try:
jp=jp.encode('utf8')
except UnicodeDecodeError:
pass
jp=jp.decode('utf8','replace')
out.write(jp)
print (out.getvalue()) |
I tried to reproduce your problem with a simple package which outputs unicode characters. You can find the test repo at https://github.com/dirk-thomas/catkin_test_unicode But when invoking CMake on it the output is perfectly fine:
Can you please provide a complete example with a catkin package where you have the problem? |
I execute "./src/catkin/bin/catkin_make_isolated --install".
I add print like below in run_command out = StringIO() if quiet else sys.stdout
if capture:
print(cmd)
while True:
line = proc.stdout.readline()
print(line)
try:
# in case the input is already unicode
line = line.encode('utf8')
except (AttributeError, UnicodeDecodeError):
pass
line = line.decode('utf8', 'replace') Result is below
My OS is Xubuntu13.10. |
When setting:
I can reproduce it with even with catkin_make and my example package. |
Since your patch would fail with Python 3 I have created a modified version of it on the unicode branch (https://github.com/ros/catkin/tree/unicode). Can you please verify that this works for you too? I am not yet sure if that is the best way to address it but we will see... |
I checked your fix.
|
I tried several more variants but could not find any approach which worked with Python 3 and japanese LANG set. Until a patch supports both Python 2 and 3 I can't merge it. @wjwwood Any idea how the code for that should look like? |
@dirk-thomas It may be a problem with linux or your version of CMake? It works perfectly fine on my machine:
|
A more complete example, showing
|
Have you tried it with Python 3 too? |
How can I tell catkin to use Python3 to invoke things? |
I think it will require |
Ok, same thing as before, but invoking
|
Btw, I just did |
I updated the test package a couple of times. When I invoke the following command in a workspace with
The output is ok with
|
I have the same bug on my french Debian.
Now, when I change the language to English.
Then it works fine. |
I now just tried the patch. |
I'll try to look at this again today, I was unable to reproduce it on OS X before, but I may have been missing something. |
I cannot reproduce this with a simple example on OS X. I did exactly this:
I put the output from my tests and the files I used in the test are in this gist: https://gist.github.com/wjwwood/6efb3ff4ab42dc6b10fb Please try and reproduce what I've done here and see if you still get the problem. If you have the problem with my above example, then we might need to consider that this bug is realted to Linux and Python's unicode support. |
Just for completeness, I also ran it with |
The problem is still there - we had the case before that OS X behaves differently. |
So, @dirk-thomas that means you have the problem when you do exactly what I did above? The first thing I was trying to figure out is whether or not I was missing the problem because of how I was testing for it. |
I did not reproduce your exact sequence. We tested that some time ago extensively and nothing has changed since then since we couldn't find a way to make it work with Python 2 as well as 3 with english and japanese LANG environment. |
There was some confusion, you had changed your |
Which you removed here: I can try to reproduce it with you current repository. But unless we have a good reason as to why it happens on Linux and not BSD (my mac) then I'm not sure we need to change how we are doing things, or at least it is not clear to me that we are addressing the correct problem. |
If you want to look into it please try it on a Ubuntu machine first with the 2x2 cases described above. That failed for me always for one of the cases - I could not come up with Python code which made all four work. |
I cannot reproduce it with your repository either. I'll try to download linux later when I have better internet, but it seems to me that Linux is not handling this correctly. |
cStringIO (which is imported on python2) does not support unicode. Using vanilla StringIO fixes this for me. Try https://github.com/saljam/catkin/commit/db1fb3451c6004e7c872db4b45be27e9b7599c3f with:
and
|
I have extended the example significantly to allow a clear way to run the tests: https://github.com/dirk-thomas/catkin_test_unicode When I run the
The result is the same with the current code of catkin as well as when I simplify the builder.py code (https://github.com/ros/catkin/blob/indigo-devel/python/catkin/builder.py#L193) to:
The most likely reason why the behavior on OS X is different is a different value for the stdout encoding. @wjwwood Can you please validate that on OS X and post the summarized result? |
@saljam Changing the |
Right, it sounds like your machine's Japanese locale is missing or broken (on Ubuntu the encoding should be UTF8, not ANSI_X3.4-1968). Try 'locale -a' to see what's on, and locale-gen to generate the missing ones. On my machine, locale -a yields
and foo.py passes in all 4 cases. But that's unrelated to the StringIO issue. :) Just Python3 defaulting to ASCII when the system encoding is missing. And being strict about it. |
Something I discovered during my research on missing locale's with python: http://stackoverflow.com/questions/14547631/python-locale-error-unsupported-locale-setting
I expect the first line in the above snippet to fail on the machines reporting this error. |
@josegaert Can you please run |
Closed in favor of #595 since this PR does not work with all combinations of Python versions and LANG values. |
Hey, root@anoesjka:~# locale -a Greetings |
@josegaert Can you please check if PR #595 will work for you, too? |
fix unicode error with japanese LANG (fix #578)
if fail to encode to utf8, unencoded charactor is written.