Add ability for skipping undecodable filenames #1038

Merged: 2 commits, Dec 2, 2014

Conversation

kyleknap
Contributor

@kyleknap kyleknap commented Dec 2, 2014

Fixes #1028

On POSIX platforms, you run into an issue when a file name was encoded with a different encoding than the one the system is set to use: the listdir() call we make cannot decode the name properly.

Before, hitting this problem made the whole process fail, and it did so silently. Now the CLI warns the user with an appropriate message and skips just that file.
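For illustration, here is a minimal sketch of the warn-and-skip idea. It is not the code in this PR: it is written for Python 3 rather than the Python 2 shown in the example, and the helper names `split_decodable` and `list_directory`, as well as the warning wording, are hypothetical.

```python
import os
import sys


def split_decodable(raw_names, encoding):
    """Split raw byte file names into (decoded, skipped) for `encoding`.

    Hypothetical helper for illustration; not part of awscli.
    """
    decoded, skipped = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # The name is not valid in the expected encoding: skip it
            # instead of letting the whole listing fail.
            skipped.append(raw)
    return decoded, skipped


def list_directory(directory, encoding=None):
    """List `directory`, warning about and skipping undecodable names."""
    encoding = encoding or sys.getfilesystemencoding()
    # Passing a bytes path makes os.listdir return the raw, undecoded names.
    raw_names = os.listdir(os.fsencode(directory))
    decoded, skipped = split_decodable(raw_names, encoding)
    for raw in skipped:
        sys.stderr.write(
            "warning: Skipping file %r: could not decode it as %s\n"
            % (raw, encoding)
        )
    return decoded
```

The decoding step is kept separate from the directory listing so it can be exercised without creating oddly named files on disk. A caller could also use the skipped names to return a non-zero exit status.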

Here is an example that reproduces the original issue on Amazon Linux and shows what it looks like with this PR:

$ ls
hello.py  zz.py

$ python
Python 2.6.9 (unknown, Sep 13 2014, 00:25:11) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> with open(u'\u00C0'.encode('latin1'), 'wb') as f:
...       f.write('foo')

$ aws s3 cp . s3://mybucketfoo --recursive 
warning: Skipping file '\xc0'. There was an error trying to decode the the file '\xc0' in directory "/home/ec2-user/temp/". 
Please check your locale settings.  The filename was decoded as: UTF-8
On posix platforms, check the LC_CTYPE environment variable.
upload: ./zz.py to s3://mybucketfoo/zz.py     
upload: ./hello.py to s3://mybucketfoo/hello.py 

$ echo $?
2

cc @jamesls @danielgtaylor

@jamesls
Member

jamesls commented Dec 2, 2014

:shipit: Looks good.

kyleknap added a commit that referenced this pull request Dec 2, 2014
Add ability for skipping undecodable filenames
@kyleknap kyleknap merged commit 36db6a6 into aws:develop Dec 2, 2014
@kyleknap kyleknap deleted the decoding-errors branch December 2, 2014 01:32

Successfully merging this pull request may close these issues.

problems with utf8 / iso-8859-1 encoded filenames