Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

url = ("https://iridl.ldeo.columbia.edu/"
       "SOURCES/.UCSB/.CHIRPS/.v2p0/.monthly/"
       ".global/.T/last/subgrid/0./add/T/"
       "table%3A/1/%3Atable/.csv")
pd.read_csv(url)
Issue Description
With Python 3.10, reading the CHIRPS rainfall CSV file from the URL in the example above fails with the following error:
Traceback (most recent call last):
File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/usr/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/usr/lib/python3.10/http/client.py", line 1454, in connect
self.sock = self._context.wrap_socket(self.sock,
File "/usr/lib/python3.10/ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "/usr/lib/python3.10/ssl.py", line 1070, in _create
self.do_handshake()
File "/usr/lib/python3.10/ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/turnerm/sync/pa-aa-toolbox/run_chirps.py", line 21, in <module>
df = pd.read_csv(url)
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/util/_decorators.py", line 317, in wrapper
return func(*args, **kwargs)
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 927, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 582, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1421, in __init__
self._engine = self._make_engine(f, self.engine)
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1707, in _make_engine
self.handles = get_handle( # type: ignore[call-overload]
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 672, in get_handle
ioargs = _get_filepath_or_buffer(
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 336, in _get_filepath_or_buffer
with urlopen(req_info) as req:
File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 239, in urlopen
return urllib.request.urlopen(*args, **kwargs)
File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.10/urllib/request.py", line 519, in open
response = self._open(req, data)
File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/usr/lib/python3.10/urllib/request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:997)>
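This error does not occur with Python 3.6-3.9. I suspect it is caused by the stricter default TLS settings introduced in Python 3.10. A workaround, based on this Stack Overflow post, is to open the URL with an SSL context that uses OpenSSL's "DEFAULT" cipher list and pass the response to pandas: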
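import ssl
from urllib.request import urlopen

import pandas as pd

url = ("https://iridl.ldeo.columbia.edu/"
       "SOURCES/.UCSB/.CHIRPS/.v2p0/.monthly/"
       ".global/.T/last/subgrid/0./add/T/"
       "table%3A/1/%3Atable/.csv")

# Start from the default (certificate-verifying) context, then switch to
# OpenSSL's "DEFAULT" cipher list instead of Python 3.10's stricter selection.
context = ssl.create_default_context()
context.set_ciphers("DEFAULT")

# Open the URL with the relaxed context and hand the response to pandas.
with urlopen(url, context=context) as result:
    df = pd.read_csv(result)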
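One way to probe the cipher hypothesis without touching the network is to compare the cipher suites enabled on a default context with those enabled after the workaround's `set_ciphers("DEFAULT")` call. The following is a minimal sketch that uses only the standard library. The helper name `cipher_names` is hypothetical, and the result depends on the local Python and OpenSSL builds, so the list may be empty on some systems. It also does not show which suite the server actually negotiates.

```python
import ssl


def cipher_names(context):
    """Return the set of cipher suite names enabled on an SSLContext."""
    return {cipher["name"] for cipher in context.get_ciphers()}


# Context with Python's default settings (assumed to match what urllib,
# and therefore pd.read_csv, uses when no context is supplied).
strict = ssl.create_default_context()

# The same kind of context after applying the workaround.
relaxed = ssl.create_default_context()
relaxed.set_ciphers("DEFAULT")

# Suites that only become available once the cipher list is relaxed.
only_in_relaxed = sorted(cipher_names(relaxed) - cipher_names(strict))

print(f"{len(only_in_relaxed)} cipher suite(s) enabled only after set_ciphers('DEFAULT'):")
for name in only_in_relaxed:
    print("  ", name)
```

If the server only supports suites that appear in this list, that would be consistent with the handshake failing on Python 3.10 and succeeding with the workaround.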
Expected Behavior
The CSV should be read correctly into a DataFrame that looks like:
Time
0 Apr 2022
(Note that this dataset is not completely static: the date may eventually change, but the output should have a similar format.)
Installed Versions
INSTALLED VERSIONS
commit : 3bf2cb1
python : 3.10.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-41-generic
Version : #46~20.04.1-Ubuntu SMP Wed Apr 20 13:16:21 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.0.dev0+849.g3bf2cb1b2
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 58.1.0
pip : 22.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.5.0
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : 0.8.9
xarray : 2022.3.0
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None