Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB Connection pool throws error when executing raw sql in Celery job #15255

Closed
2 of 3 tasks
Animesh3193 opened this issue Jun 18, 2021 · 0 comments
Closed
2 of 3 tasks
Labels
#bug Bug report

Comments

@Animesh3193
Copy link

Animesh3193 commented Jun 18, 2021

When running Superset version 0.36 coupled with postgres to schedule jobs that runs raw SQL postgres queries in kubernetes environemnt. The query execution fails as follow :-

psycopg2.DatabaseError: error with status PGRES_TUPLES_OK and no message from the libpq

Expected results

  • When Raw SQL is run in the db session object using the execute method
  • By using fetchall to fetch all row from the postgres database, result is expected to be a list of tuples containing value of columns

Actual results

  • When Raw SQL is run in the db session object using the execute method
  • Then using fetchall to fetch all row we encounter below error
  • result that we get is the Postgres error as psycopg2.DatabaseError: error with status PGRES_TUPLES_OK and no message from the libpq

Screenshots

image

How to reproduce the bug

  • In a celery job initialize db session
  • DB session is initialized as db.session() or db.create_scoped_session()
  • raw sql is used as sqlalchemy text e.g. sqlalchemy.text('Select * from Employees')
  • raw sql is executed in the db object to give result as list of tuples
  • The code being used is db.create_scoped_session().execute(sqlalchemy.text('Select * from Employees')).fetchall()
  • The code fails as psycopg2.DatabaseError: error with status PGRES_TUPLES_OK and no message from the libpq.

Environment

(please complete the following information):

  • superset version: 0.36
  • python version: 3.7.6

We use postgres version 9.2 with the celery running in kubernetes within docker container.
Celery has redis configured as Backend. Below are few python module with their respective version

celery==4.4.2
SQLAlchemy==1.3.17
Flask==1.1.2
Flask-AppBuilder==2.3.4
kombu==4.6.8
psycopg2==2.8.5
redis == 3.5.0

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

  1. For the superset version we are using is 0.36 release which we do not have any plans to upgrade as of now.
  2. The main use is to fetch details from postgres table that are not currently part of superset model architecture and process the same every 5min to 1 hour which takes around 20 minutes to get completed, for which trying to create a celery job.
  3. Already went through the PR request :- [GitHub]fix(celery): Reset DB connection pools for forked worker processes #13350, which could be a solution, but unsure how to approach with the changes as version cannot be upgarded to 1.1
@Animesh3193 Animesh3193 added the #bug Bug report label Jun 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report
Projects
None yet
Development

No branches or pull requests

1 participant