BUG: SSL handshake error with Python 3.10 and Pandas read_csv for URLs #47189

Closed
@turnerm

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
url = ("https://iridl.ldeo.columbia.edu/"
       "SOURCES/.UCSB/.CHIRPS/.v2p0/.monthly/"
       ".global/.T/last/subgrid/0./add/T/"
       "table%3A/1/%3Atable/.csv")
pd.read_csv(url)

Issue Description

With Python 3.10, reading the CHIRPS rainfall data CSV file from the URL in the example above fails with the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/usr/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/usr/lib/python3.10/ssl.py", line 512, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.10/ssl.py", line 1070, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1341, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/turnerm/sync/pa-aa-toolbox/run_chirps.py", line 21, in <module>
    df = pd.read_csv(url)
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/util/_decorators.py", line 317, in wrapper
    return func(*args, **kwargs)
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 927, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 582, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1421, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1707, in _make_engine
    self.handles = get_handle(  # type: ignore[call-overload]
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 672, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 336, in _get_filepath_or_buffer
    with urlopen(req_info) as req:
  File "/home/turnerm/sync/pa-aa-toolbox/venv/lib/python3.10/site-packages/pandas/io/common.py", line 239, in urlopen
    return urllib.request.urlopen(*args, **kwargs)
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:997)>

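This error does not occur with Python 3.6-3.9. I suspect it is caused by the stricter default TLS settings in Python 3.10. A workaround, based on this SO post, is to build an SSL context that uses OpenSSL's default cipher list and open the URL with it before handing the response to pandas: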

import ssl
from urllib.request import urlopen

import pandas as pd

url = ("https://iridl.ldeo.columbia.edu/"
       "SOURCES/.UCSB/.CHIRPS/.v2p0/.monthly/"
       ".global/.T/last/subgrid/0./add/T/"
       "table%3A/1/%3Atable/.csv")

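# Build a default SSL context, then switch its cipher list to OpenSSL's "DEFAULT" set
context = ssl.create_default_context()
context.set_ciphers("DEFAULT")
# Open the URL with that context and pass the response object to pandas
result = urlopen(url, context=context)
df = pd.read_csv(result)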
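
To check the suspicion that the cipher list is what differs, the following sketch compares the ciphers enabled by `ssl.create_default_context()` with those enabled after `set_ciphers("DEFAULT")`. It uses only the standard library and makes no network request. The helper name `cipher_names` is made up for this illustration, and the actual difference depends on the local Python and OpenSSL builds, so no particular output is assumed.

```python
import ssl


def cipher_names(cipher_string=None):
    """Return the names of the ciphers enabled on a fresh default context.

    `cipher_names` is a hypothetical helper for this illustration.
    If `cipher_string` is given, it is applied with set_ciphers() first.
    """
    ctx = ssl.create_default_context()
    if cipher_string is not None:
        ctx.set_ciphers(cipher_string)
    # get_ciphers() returns a list of dicts; each has a "name" key
    return {cipher["name"] for cipher in ctx.get_ciphers()}


strict = cipher_names()            # Python's own default cipher selection
relaxed = cipher_names("DEFAULT")  # OpenSSL's "DEFAULT" cipher list

# Ciphers that only become available with the workaround applied
for name in sorted(relaxed - strict):
    print(name)
```

If the server only accepts ciphers that show up in this difference, that would be consistent with the handshake failing on 3.10 and succeeding with the workaround.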

Expected Behavior

The CSV should be read correctly into a DataFrame, and should look like:

       Time
0  Apr 2022

(Note that this dataset is not completely static: the date may eventually change, but the output should have a similar format.)

Installed Versions

INSTALLED VERSIONS

commit : 3bf2cb1
python : 3.10.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-41-generic
Version : #46~20.04.1-Ubuntu SMP Wed Apr 20 13:16:21 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.0.dev0+849.g3bf2cb1b2
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 58.1.0
pip : 22.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.5.0
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : 0.8.9
xarray : 2022.3.0
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None
