REGR: 1.4.0rc0: pd.read_sql not compatible with pymysql #45416

Closed
2 of 3 tasks
auderson opened this issue Jan 17, 2022 · 8 comments · Fixed by #45496
Labels
Bug; IO SQL (to_sql, read_sql, read_sql_query); Regression (functionality that used to work in a prior pandas version)
Milestone

Comments

@auderson
Contributor

auderson commented Jan 17, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

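import pymysql as sql
import pandas as pd

# Connection parameters (host, user, password, ...) omitted
mysql_account = {}

con = sql.connect(**mysql_account)
pd.read_sql('', con)  # raises ValueError on 1.4.0rc0 (query string omitted)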

Issue Description

The above code raises ValueError: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connection

Expected Behavior

Is pymysql not supported anymore?

Installed Versions

INSTALLED VERSIONS

commit : d023ba7
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.8.0-63-generic
Version : #71-Ubuntu SMP Tue Jul 13 15:59:12 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0rc0
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.24
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.2
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.29.0
pandas_datareader: 0.9.0
bs4 : None
bottleneck : None
fsspec : 2021.11.1
fastparquet : None
gcsfs : None
matplotlib : 3.5.0
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 6.0.1
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.27
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : 0.54.1
zstandard : None

@auderson added the Bug and Needs Triage labels on Jan 17, 2022
@simonjayhawkins
Member

The ValueError raised was added in #42546 cc @fangchenli

@simonjayhawkins added the IO SQL and Regression labels and removed the Needs Triage label on Jan 17, 2022
@simonjayhawkins added this to the 1.4 milestone on Jan 17, 2022
@fangchenli
Member

We currently only support pymysql through sqlalchemy.
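
For readers who want the supported route, here is a minimal sketch of reading through SQLAlchemy with pymysql as the driver. The helper names `mysql_url` and `read_query` are made up for illustration, and the host, port, and credentials are placeholders; `mysql+pymysql` is SQLAlchemy's dialect+driver name for this combination.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy URL that selects pymysql as the MySQL driver.

    Hypothetical helper: user and password are percent-encoded so that
    characters such as '@' do not break the URL.
    """
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def read_query(query, url):
    """Run a SQL query through a SQLAlchemy connection and return a DataFrame.

    Hypothetical helper. The third-party imports are local so that the URL
    helper above can be used without pandas or SQLAlchemy installed.
    """
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return pd.read_sql(text(query), conn)


# Placeholder credentials, for illustration only.
url = mysql_url("user", "p@ss", "localhost", "mydb")
```

Calling `read_query("SELECT 1", url)` would need a reachable MySQL server plus pandas, SQLAlchemy, and pymysql installed, so it is not executed here.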

@asishm
Contributor

asishm commented Jan 17, 2022

Confirming this is breaking in 1.4.0rc0.

It used to work earlier (prior to the linked PR) because the con object was passed into SQLiteDatabase, which looks generic enough to accept any DBAPI2 connection object.
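
To illustrate why a query-only read can be generic across drivers, here is a rough sketch of what such a read needs from a DBAPI2 (PEP 249) connection: a cursor, `execute`, `description`, and `fetchall`. This is not pandas's actual code; `read_query` is a hypothetical helper, and sqlite3 stands in for pymysql so the example runs without a server.

```python
import sqlite3


def read_query(sql, con, params=None):
    """Run a query on any DBAPI2 connection; return (column names, rows).

    Hypothetical helper, not part of pandas.
    """
    cur = con.cursor()
    try:
        if params is None:
            cur.execute(sql)
        else:
            cur.execute(sql, params)
        # Per PEP 249, description holds one sequence per result column,
        # with the column name as its first item.
        columns = [desc[0] for desc in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    return columns, rows


# sqlite3 as a stand-in for any other DBAPI2 driver.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (2, "y")])

columns, rows = read_query("SELECT a, b FROM t ORDER BY a", con)
```

Nothing in `read_query` is sqlite3-specific, which is consistent with the observation that query reads happened to work with other drivers. The resulting rows and column names could then be handed to `pd.DataFrame.from_records(rows, columns=columns)`.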

@Jefffish09

Jefffish09 commented Jan 18, 2022

Same error when using pymssql in 1.4.0rc0


@jorisvandenbossche
Member

We have documented that we only support sqlite3 for plain DBAPI2 connection objects:

pandas/pandas/io/sql.py

Lines 462 to 464 in 21b7daf

con : SQLAlchemy connectable, str, or sqlite3 connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. If a DBAPI2 object, only sqlite3 is supported. The user is responsible

But it is true that any connection has implicitly worked up to now as long as you are just reading a query (not reading a table or writing). Given that this has worked for so long, we should probably not just change it without warning.
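
One way to avoid changing behaviour without warning, sketched under assumptions: accept any DBAPI2 connection but emit a warning when it is not sqlite3. `check_connection` is a hypothetical helper, not pandas's actual code, and it ignores SQLAlchemy connectables and URI strings for brevity.

```python
import sqlite3
import warnings


def check_connection(con):
    """Return con unchanged, warning if it is not a sqlite3 connection.

    Hypothetical helper illustrating "warn instead of raise".
    """
    if isinstance(con, sqlite3.Connection):
        return con
    warnings.warn(
        "Only sqlite3 is tested for plain DBAPI2 connections; "
        "consider using SQLAlchemy for other databases.",
        UserWarning,
        stacklevel=2,
    )
    return con
```

With this shape, existing code that passes a pymysql or pymssql connection keeps running and users still learn that the path is untested.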

@asishm
Contributor

asishm commented Jan 18, 2022

Thanks @jorisvandenbossche
2 clarifying points:

  1. Reading a table didn't work for sqlite3 connections (checked on 1.3.4) or any other DBAPI2 connection (this may deserve a new issue), i.e. pd.read_sql('table_name', sqlite3_conn) fails with DatabaseError: Execution failed on sql 'table_name': near "table_name": syntax error. The decision whether to dispatch to read_sql_table only seems to happen for SQLAlchemy connections.
  2. Writing to a table never worked with a non-sqlite3 DBAPI2 connection, as it would try to write/create to a sqlite3 instance (which likely would not exist).
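
The first point can be reproduced at the driver level without pandas. In this small sketch a bare table name is sent to sqlite3 as if it were SQL, which is what happens when the string is not dispatched to read_sql_table; pandas then surfaces the driver error as the DatabaseError quoted above. Spelling out the SELECT is the workaround for plain DBAPI2 connections.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_name (a INTEGER)")
con.execute("INSERT INTO table_name VALUES (1)")

# A bare table name is not a valid SQL statement, so the driver rejects it.
error_message = None
try:
    con.execute("table_name")
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# An explicit query works on the same connection.
rows = con.execute("SELECT * FROM table_name").fetchall()
```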

@simonjayhawkins changed the title from "BUG: 1.4.0rc0: pd.read_sql not compatible with pymysql" to "REGR: 1.4.0rc0: pd.read_sql not compatible with pymysql" on Jan 20, 2022
@simonjayhawkins
Member

Given that this has worked for so long, we should probably not just change it without warning.

@jorisvandenbossche Is this severe enough to block 1.4.0?

@simonjayhawkins
Member

we could maybe revert #42546 for now

6 participants