OSError when reading file with accents in file path #15086
Comments
Just my pennies worth. Quickly tried it out on Mac OSX and Ubuntu with no issues. Could this be an environment/platform problem? I noticed that the [...] Note: I just set up a virtualenv with python3.6 and installed pandas 0.19.2 using pip.
>>> import pandas as pd
>>> pd.read_csv('test_é.txt')
   a  b  c
0  1  2  3
1  4  5  6
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
pandas: 0.19.2 |
I believe 3.6 switches the file system encoding on Windows to UTF-8 (from the ANSI code page, "mbcs"). Apart from that, we don't have testing enabled yet on Windows for 3.6 (as some of the required packages are just now becoming available). |
I just added build support on AppVeyor (Windows) for 3.6, so it would be great if you'd push up your tests to see if it works. |
I also faced the same problem: the program stopped at pd.read_csv(file_path). It started after I upgraded my Python to 3.6 (I'm not sure exactly which version I had installed before, maybe 3.5...). |
@jreback what is the next step towards a fix here? While I do not use Windows, I could try to help (I just got a VM to debug a piece of my code that apparently does not work on Windows). BTW, a workaround: pass a file handle instead of a name. |
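The file-handle workaround mentioned above can be sketched as follows. This is a minimal illustration, not the eventual pandas fix: the helper name `read_csv_via_handle` and the UTF-8 default for the file contents are assumptions made for this example. The idea is that Python's built-in `open` resolves the path, so pandas only ever sees an already-open file object.

```python
import pandas as pd


def read_csv_via_handle(path, encoding="utf-8", **kwargs):
    """Open the file ourselves and hand pandas the handle, not the path.

    Hypothetical helper for illustration; `encoding` describes the file
    contents (assumed UTF-8 here), not the file name.
    """
    # newline="" leaves line-ending handling to the CSV parser.
    with open(path, encoding=encoding, newline="") as handle:
        return pd.read_csv(handle, **kwargs)
```

With this in place, `read_csv_via_handle('test_é.txt')` would stand in for `pd.read_csv('test_é.txt')`; extra keyword arguments are passed through to `pd.read_csv` unchanged.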
@tpietruszka see comments on the PR: #15092 (it got removed from a private fork, was pretty much there). You basically need to encode the paths differently on py3.6 (vs other Pythons) on Windows, i.e. implement: https://docs.python.org/3/whatsnew/3.6.html#pep-529-change-windows-filesystem-encoding-to-utf-8 |
my old code (can't run):
new code (successful):
I think this bug is a filename problem. |
If anyone comes here like me because he/she hit the same problem, here is a solution until pandas is fixed to work with PEP 529 (basically, any non-ASCII chars in your path or filename will result in errors). Insert the following two lines at the beginning of your code to revert to the old way of handling paths on Windows:
|
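The two lines referred to in the comment above did not survive in this copy of the thread. As a hedged sketch of what such a revert could look like: PEP 529 documents `sys._enablelegacywindowsfsencoding()` as a way to switch back to the pre-3.6 behaviour at runtime. Whether that is exactly what the comment contained is an assumption. The wrapper name `use_legacy_windows_fs_encoding` is hypothetical, and the guard is there because the function only exists on Windows builds of CPython.

```python
import sys


def use_legacy_windows_fs_encoding():
    """Revert to the pre-PEP 529 path encoding on Windows, if possible.

    Returns True when the switch was made, False on other platforms or
    when the interpreter does not provide the function.
    """
    enable = getattr(sys, "_enablelegacywindowsfsencoding", None)
    if sys.platform == "win32" and enable is not None:
        enable()  # filesystem encoding becomes "mbcs" again
        return True
    return False


# Call this before any code touches file paths.
use_legacy_windows_fs_encoding()
```

PEP 529 also describes the `PYTHONLEGACYWINDOWSFSENCODING` environment variable, which has the same effect when set before the interpreter starts.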
I use the solution above and it works. Thanks very much @fotisj ! |
Just pinging this - I have the same issue, I'm using a workaround but it would be great if that was not required. |
this needs a community patch |
I am encountering this issue. I want to try and contribute a patch. Any pointers on how to start fixing this? |
I think none of the maintainers have access to a system that can reproduce this. Perhaps some of the others in this issue can help put together a solution. |
Python 3.6+ changes the default encoding to UTF8 (PEP 529), which conflicts with the encoding of Windows (MBCS). This fix checks if we're using Python 3.6+ and on Windows, after which we force the encoding to "mbcs". Closes gh-15086.
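A rough sketch of the check this commit message describes, under the assumption that the patch picks an encoding for path names based on interpreter version and platform. The function name `path_encoding` is hypothetical and is not taken from the pandas source.

```python
import sys


def path_encoding():
    """Pick the encoding to use when turning a path string into bytes.

    Hypothetical helper: on Windows with Python 3.6+ it forces "mbcs",
    as the commit message describes; elsewhere it defers to Python.
    """
    on_windows = sys.platform == "win32"
    if on_windows and sys.version_info >= (3, 6):
        return "mbcs"
    return sys.getfilesystemencoding()
```

The platform check matters because the "mbcs" codec is only available on Windows; asking for it elsewhere raises `LookupError`.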
* Fix gh-15086 properly instead of making a workaround
* fix code style
* Make sure test_filename_with_special_chars properly tests combinations of chars
  Updated whatsnew
* Address comments by @jreback
* Parametrize test_filename_with_special_chars
  Use CP-1252 and CP-1251 filenames separately, skip the test on Windows on < 3.6 as it won't pass
Hi, I have this problem on pandas |
Move the file out of the folder if it is stored in a folder with the same name as the file. |
@pranjulknit If I understand correctly, you suggest moving the file to a folder without these problematic characters in the path. This is not always possible. If you are suggesting that folder names and file names should be different, that is not the issue described here; I never had problems with that. |
Actually, I have this problem while reading a CSV file from a Jupyter notebook. |
Code Sample, a copy-pastable example if possible
test.txt and test_é.txt are the same file; only the name changes:
Problem description
Pandas raises OSError when trying to read a file with accents in the file path.
The problem is new (since I upgraded to Python 3.6 and Pandas 0.19.2).
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: fr
LOCALE: None.None
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 32.3.1
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.9.3
boto: None
pandas_datareader: None