Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REGR: ImportError on other dbapi2 connections when sqlalchemy is not installed #45660

Closed
3 tasks done
vincent3717 opened this issue Jan 27, 2022 · 25 comments
Closed
3 tasks done
Labels
Bug IO SQL to_sql, read_sql, read_sql_query Regression Functionality that used to work in a prior pandas version
Milestone

Comments

@vincent3717
Copy link

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import psycopg2 #version 2.9.3
import pandas as pd #version 1.4.0
conn = psycopg2.connect(
        host=XXX,
        dbname=XXX,
        user=XXX,
        password=XXX,
    )
df = pd.read_sql("select 1", conn)

ImportError: Missing optional dependency 'SQLAlchemy'.  Use pip or conda to install SQLAlchemy.

Issue Description

There was no SQLAlchemy dependency in the previous version when using a psycopg2 connection object in read_sql pandas function

Even after installing SQLAlchemy, there is a unexpected warning :

UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy

Expected Behavior

No error neither warning using a psycopg2 connection object in read_sql pandas function

Installed Versions

INSTALLED VERSIONS

commit : bb1f651
python : 3.9.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-27-generic
Version : #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.0
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 20.0.2
setuptools : 44.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.9.3
jinja2 : None
IPython : 7.31.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.31
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None

@vincent3717 vincent3717 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 27, 2022
@phofl
Copy link
Member

phofl commented Jan 27, 2022

Can you try on master, this was (maybe) partly adressed in #45416

@abowden1989
Copy link
Contributor

abowden1989 commented Jan 27, 2022

I don't think this is addressed by the change in 3418679. When trying to use a different connection object when sqlalchemy is not installed, the code will fail on the line

    sqlalchemy = import_optional_dependency("sqlalchemy")

The change I would propose would be to pass errors="ignore" to the import_optional_dependency call, and then only do the isinstance checks relating to sqlalchemy if the return from import_optional_dependency is not None

This wouldn't address the issue that a warning is shown (which seems reasonable to me), but would mean that sqlalchemy doesn't need to be installed to use other dbapi2 connections (some of which may be compatible with sqlite3)

@jbrockmendel
Copy link
Member

pls give the issue an informative title

@sknutsonsf
Copy link

I am getting the same warning messages after installing SQLAlchemy

@simonjayhawkins simonjayhawkins added this to the 1.4.1 milestone Feb 1, 2022
@simonjayhawkins simonjayhawkins changed the title BUG: REGR: ImportError on other dbapi2 connections when sqlalchemy is not installed Feb 1, 2022
@simonjayhawkins simonjayhawkins added IO SQL to_sql, read_sql, read_sql_query Regression Functionality that used to work in a prior pandas version and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 1, 2022
@simonjayhawkins
Copy link
Member

Even after installing SQLAlchemy, there is a unexpected warning :

UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy

@fangchenli this is the intended behavior? this issue is closed off by #45679?

@fangchenli
Copy link
Member

Even after installing SQLAlchemy, there is a unexpected warning :
UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy

@fangchenli this is the intended behavior? this issue is closed off by #45679?

Yes, this is the intended behavior. See #45416. We don't officially support raw dbapi2 objects other than sqlite3. But if people think we should support it, we can keep this open for discussion.

@simonjayhawkins
Copy link
Member

But if people think we should support it, we can keep this open for discussion.

I think maybe could be a separate discussion so closing this one. The ImportError issue is fixed.

@tuttoaposto
Copy link

Sorry I'm a late add here. I got the same warning with cx_Oracle.connect and pyodbc.connect. That didn't happen before until I reinstalled all my python packages. Current using pandas 1.4.1. Also installed SQLAlchemy-1.4.31.

Anyway to include these connections? What should I do to make the warning go away?

@david-waterworth
Copy link

david-waterworth commented Mar 16, 2022

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I'm suddenly seeing the same message using psycopg2 on code that I've been running fine for 3+ years?

Also, the message itself has grammatical errors

only support should be only supports
ordatabase should be or database

@fangchenli
Copy link
Member

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I'm suddenly seeing the same message using psycopg2 on code that I've been running fine for 3+ years?

Also, the message itself has grammatical errors

only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

And apologize for the grammatical errors lol...

@Sieboldianus
Copy link

Sieboldianus commented Mar 24, 2022

Same for me: psycopg2.connect() has worked flawlessly for 3 years and I am ignoring this warning for now:

import psycopg2
import warnings
import pandas as pd


def db_query(sql_query: str, db_conn: psycopg2.extensions.connection) -> pd.DataFrame:
    """Execute SQL Query and get Dataframe with pandas"""
    with warnings.catch_warnings():
        # ignore warning for non-SQLAlchemy Connecton
        # see github.com/pandas-dev/pandas/issues/45660
        warnings.simplefilter('ignore', UserWarning)
        # create pandas DataFrame from database query
        df = pd.read_sql_query(sql_query, db_conn)
    return df

db_conn= psycopg2.connect(
        host=DB_HOST,
        port=DB_PORT ,
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASS
)

df_test = db_query("SELECT * FROM public.test LIMIT 1", db_conn)

@jodur
Copy link

jodur commented Mar 27, 2022

Same for me.
I use the same for firebird with fbd without warnings for years, until i switched my development to newer python versions and latest panda release.

def ReadLandenTabelSyntess():
    SELECT="Select at_land.gc_id AT_LAND_GC_ID,at_land.gc_code SYN_LAND_CODE,at_land.gc_omschrijving SYN_LAND from at_land"
    try:
      conn = fdb.connect(**DATABASE)
      at_land=pd.read_sql_query(SELECT,con=conn,coerce_float=False)
    except:
      logger.exception('**STOP** Error accessing Atrium database')
      sys.exit(1)    
    conn.close()
    return at_land

@gilesmca
Copy link

gilesmca commented Apr 1, 2022

It was asked above how to make the warning go away, for psycopg2 connections (and I suspect most of the others) I found a fairly painless way of quickly building an sqlalchemy engine from an existing connection which should then stop those annoying warnings:

import pandas as pd
import psycopg2
import sqlalchemy
from sqlalchemy.pool import StaticPool
…
engine = sqlalchemy.create_engine('postgresql+psycopg2://', poolclass=StaticPool, creator= lambda: {psycopg2connection})
data = pd.read_sql_query("select something”, con=engine)

I hope this helps.

@l736k
Copy link

l736k commented Apr 11, 2022

I'm facing this problem too with a pyodbc connection to a SQL Server database

class db:
	def __init__(self):
		self._conn = pyodbc.connect(
			'DRIVER={ODBC Driver 17 for SQL Server};'
			'SERVER=' + CONFIGS['db']['server'] + ';'
			'DATABSE=' + CONFIGS['db']['database'] + ';'
			'UID=' + CONFIGS['db']['uid'] + ';'
			'PWD=' + CONFIGS['db']['pwd'] + ';'
		)
		self._cursor = self._conn.cursor()

	[...]

	def query(self, sql):
		return pandas.read_sql_query(sql, self._conn)

@arani-mohammad
Copy link

I found this link: https://stackoverflow.com/questions/71082494/getting-a-warning-when-using-a-pyodbc-connection-object-with-pandas

from sqlalchemy.engine import URL
connection_string = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dagger;DATABASE=test;UID=user;PWD=password"
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})

engine = create_engine(connection_url)

df = pd.sql_read(query, engine)

@eabase
Copy link

eabase commented May 4, 2022

@arani-mohammad Wrong imports, that snippet does not work with p3.10.3 + MS SQL, and the panda doesn't have an sql_read(), but a read_sql().

@sat-ch
Copy link

sat-ch commented May 6, 2022

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I'm suddenly seeing the same message using psycopg2 on code that I've been running fine for 3+ years?
Also, the message itself has grammatical errors
only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

And apologize for the grammatical errors lol...

@fangchenli
Could you please give us a tentative date on when this would be fixed?
Thanks

@sat-ch
Copy link

sat-ch commented May 6, 2022

UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
Did anyone solve the warning issue for oracle database?
thanks

@lonnylundsten
Copy link

@fangchenli so just to confirm, are you saying you never supported other connection types, but this warning has been added recently? I'm suddenly seeing the same message using psycopg2 on code that I've been running fine for 3+ years?
Also, the message itself has grammatical errors
only support should be only supports ordatabase should be or database

That is correct. Before the recent changes, all non-sqlalchemy connections were passed to a fallback class that was only tested against sqlite3. Surprisingly it has worked well with many different connection types for many years. We probably will address this issue and officially support other raw SQL connections soon.

I think supporting various database connections is the way to go. I've tested pretty much all connectors and they did not report an error until very recently. Even with an error reported, my code works fine. The error is just annoying.

In testing I've found that the DBAPI2 connectors are much, much faster than others, at least when connecting to SQL Server. CTDS (https://zillow.github.io/ctds/) is actually the fastest connector library that I've found for connecting to SQL Server, however, pandas does still give an error when using this library. It would be great if this and other connectors could get 'official' support.

@rkechols
Copy link

Does anyone know a way to simply suppress the warning without installing sqlalchemy?

@nsmcan
Copy link

nsmcan commented Jun 24, 2022

Does anyone know a way to simply suppress the warning without installing sqlalchemy?

Approach of @Sieboldianus works fine

@holmpa3
Copy link

holmpa3 commented Nov 4, 2022

I am connecting to a MySQL database in the following code:

 mydb = pymysql.connect(
        host="localhost",
        database=db_name,
        user="root",
        password="password",
        autocommit=True

    )
    mycursor = mydb.cursor()
    engine = create_engine('mysql+pymysql://root:password@localhost/%s' % db_name )  

And I get the same warning... I am using SQLAlchemy to create the engine, is the issue that I am using pymysql to create the connection?

Are there plans to support pymysql connections?

@cdcadman
Copy link
Contributor

cdcadman commented Feb 9, 2023

@holmpa3, you can pass engine instead of mydb to avoid the warning. Pandas will create a sqlalchemy connection from the engine, so you don't need to create mydb for pandas to work.

@holmpa3
Copy link

holmpa3 commented Feb 13, 2023

@cdcadman I think I was passing mydb so I could manually commit changes. I seem to remember trying to get by only using engine but had issues with changes not being commited. It is just a warning anyway so I think I should be fine, any thoughts?

@cdcadman
Copy link
Contributor

@holmpa3 if you are passing engine to DataFrame.to_sql and the inserted records are not being committed, this could be a bug in pandas - please send a code sample if this is the case. If you are trying to commit a statement that you are passing to read_sql_query, then it might not be committed when you pass an Engine. You could instead pass a sqlalchemy Connection and handle the transaction yourself:

engine = create_engine('mysql+pymysql://root:password@localhost/%s' % db_name ) 
with engine.connect() as conn:
    with conn.begin():
        dtf = pd.read_sql_query(statements, conn)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO SQL to_sql, read_sql, read_sql_query Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

No branches or pull requests