Description
Code Sample, a copy-pastable example if possible
>>> import locale
>>> locale.getpreferredencoding()
'US-ASCII'
>>> open('/usr/local/lib/python3.4/site-packages/pandas/tests/io/data/spam.html').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 21552: ordinal not in range(128)
>>>
Problem description
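Three tests, test_string_io, test_string, and test_file_like, all open spam.html without specifying an encoding and then attempt to read it. When the locale's preferred encoding is US-ASCII, as shown above, open() decodes with the ascii codec and read() raises UnicodeDecodeError, so each test errors out before it reaches the code under test.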
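A minimal sketch of one possible fix: pass an explicit encoding when opening the file so the result no longer depends on the locale. This assumes spam.html is UTF-8 encoded, which the 0xc2 byte suggests but does not prove. The helper name read_html_text and the temporary sample file are hypothetical stand-ins and are not part of the pandas test suite.

```python
import io
import os
import tempfile

# Hypothetical stand-in for spam.html: a small HTML document containing a
# non-breaking space, which UTF-8 encodes as the bytes 0xc2 0xa0.
sample = u"<html><body><p>spam\u00a0eggs</p></body></html>"

fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))


def read_html_text(path, encoding="utf-8"):
    """Read a file as text with an explicit encoding (assumed UTF-8 here)."""
    with io.open(path, encoding=encoding) as f:
        return f.read()


# Forcing the ascii codec reproduces the failure regardless of the locale.
try:
    read_html_text(path, encoding="ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# With an explicit encoding the read succeeds under any locale.
text = read_html_text(path)

os.remove(path)
```

In the tests themselves, the equivalent change would be to add an encoding argument to each open() call on spam.html, provided the file is in fact UTF-8.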
Expected Output
All three tests should pass since the code under test is not responsible for determining the file encoding.
Output of pd.show_versions()
pandas: 0.20.1
pytest: 3.1.0
pip: None
setuptools: 32.1.0
Cython: None
numpy: 1.11.2
scipy: 0.19.0
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.0.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: None
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.6
lxml: 3.6.0
bs4: 4.5.1
html5lib: 0.9999999
sqlalchemy: 1.1.10
pymysql: 0.7.11.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None