fix(database): limit the number of open database connections (reanahub#437)

Up until now, to avoid keeping idle database connections open, the
connection pool was disposed of after every request. However,
connections that are checked out at the moment the pool is disposed of
stay open, and because disposal replaces the pool with a fresh one, they
no longer count towards the maximum number of connections set by
`SQLALCHEMY_POOL_SIZE` and `SQLALCHEMY_MAX_OVERFLOW`.

This resulted in many connections being open at the same time,
saturating the capacity of the database.

Instead of destroying the pool at each request, connections are closed
each time they are put back in the pool. The end result is the same, as
no idle connection is kept open for long periods of time. At the same
time, the pool enforces that the number of open connections is limited,
so that the database is not overwhelmed.
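
The difference between the two strategies can be sketched with a toy model in plain Python. This is not SQLAlchemy's pool and not the code of this commit: `ToyServer`, `ToyConnection`, `ToyPool` and the two `simulate_*` functions are hypothetical names made up for illustration, and the toy pool raises immediately when full, whereas a real pool would typically wait for a free slot first.

```python
class ToyServer:
    """Stands in for the database: counts the connections currently open."""

    def __init__(self):
        self.open_connections = 0


class ToyConnection:
    def __init__(self, server):
        self.server = server
        self.is_open = True
        server.open_connections += 1

    def close(self):
        if self.is_open:
            self.is_open = False
            self.server.open_connections -= 1


class ToyPool:
    """Hands out at most `limit` connections at a time (hypothetical model)."""

    def __init__(self, server, limit, close_on_checkin=False):
        self.server = server
        self.limit = limit
        self.close_on_checkin = close_on_checkin
        self.checked_out = 0
        self.idle = []

    def checkout(self):
        if self.checked_out >= self.limit:
            raise RuntimeError("pool limit reached")
        self.checked_out += 1
        return self.idle.pop() if self.idle else ToyConnection(self.server)

    def checkin(self, conn):
        self.checked_out -= 1
        if self.close_on_checkin:
            conn.close()  # new behaviour: never keep an idle connection open
        else:
            self.idle.append(conn)

    def dispose(self):
        # Only idle connections are closed; checked-out ones stay open.
        for conn in self.idle:
            conn.close()
        self.idle.clear()


def simulate_dispose_per_request(requests, limit):
    """Old strategy: throw the pool away after every request."""
    server = ToyServer()
    pool = ToyPool(server, limit)
    in_flight = []
    for _ in range(requests):
        in_flight.append(pool.checkout())  # still in use when the request ends
        pool.dispose()
        pool = ToyPool(server, limit)  # the fresh pool forgets checked-out connections
    return server.open_connections


def simulate_close_on_checkin(requests, limit):
    """New strategy: keep one pool and close connections as they are returned."""
    server = ToyServer()
    pool = ToyPool(server, limit, close_on_checkin=True)
    in_flight = []
    peak = 0
    for _ in range(requests):
        try:
            in_flight.append(pool.checkout())
        except RuntimeError:
            # Pool is full: the oldest user finishes and returns its connection.
            pool.checkin(in_flight.pop(0))
            in_flight.append(pool.checkout())
        peak = max(peak, server.open_connections)
    for conn in in_flight:
        pool.checkin(conn)
    return peak, server.open_connections
```

With a limit of 2 and 5 overlapping requests, `simulate_dispose_per_request(5, 2)` ends with 5 open connections, while `simulate_close_on_checkin(5, 2)` peaks at 2 and ends with 0.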
mdonadoni committed Feb 22, 2024
1 parent b9f8364 commit 980f749
Showing 2 changed files with 24 additions and 9 deletions.
27 changes: 22 additions & 5 deletions reana_job_controller/factory.py
@@ -14,14 +14,34 @@
 from flask import Flask
 from reana_commons.config import REANA_LOG_FORMAT, REANA_LOG_LEVEL
 from reana_db.database import Session, engine as db_engine
+from sqlalchemy import event

 from reana_job_controller import config
 from reana_job_controller.spec import build_openapi_spec


+@event.listens_for(db_engine, "checkin")
+def receive_checkin(dbapi_connection, connection_record):
+    """Close all the connections before returning them to the connection pool."""
+    # Given the current architecture of REANA, job-controller needs to connect to the
+    # database in order to, among other things, update the details of jobs. However,
+    # it can happen that for long periods of time job-controller does not need to access
+    # the database, for example when waiting for long-lasting jobs to finish. For this
+    # reason, each connection is closed before being returned to the connection pool, so
+    # that job-controller does not unnecessarily use one or more of the available
+    # connection slots of PostgreSQL. Keeping one connection open for the whole
+    # duration of the workflow is not possible, as that would limit the number of
+    # workflows that can be run in parallel.
+    #
+    # To improve scalability, we should consider refactoring job-controller to avoid
+    # accessing the database, or at least consider using external connection pooling
+    # mechanisms such as pgBouncer.
+    connection_record.close()
+
+
 def shutdown_session(response_or_exc):
-    """Close session and remove all DB connections."""
-    db_engine.dispose()
+    """Close session at the end of each request."""
+    Session.close()


 def create_app(config_mapping=None):
@@ -43,9 +63,6 @@ def create_app(config_mapping=None):
     app.register_blueprint(blueprint, url_prefix="/")

     # Close session after each request
     app.teardown_request(shutdown_session)

-    # Close session on app teardown
-    app.teardown_appcontext(shutdown_session)
-
     return app
6 changes: 2 additions & 4 deletions reana_job_controller/job_manager.py
@@ -11,9 +11,8 @@
 import json

 from reana_commons.utils import calculate_file_access_time
-from reana_db.database import Session, engine as db_engine
-from reana_db.models import Job as JobTable
-from reana_db.models import JobCache, JobStatus, Workflow
+from reana_db.database import Session
+from reana_db.models import Job as JobTable, JobCache, JobStatus, Workflow

 from reana_job_controller.config import CACHE_ENABLED

@@ -65,7 +64,6 @@ def wrapper(inst, *args, **kwargs):
             inst.create_job_in_db(backend_job_id)
             if CACHE_ENABLED:
                 inst.cache_job()
-            db_engine.dispose()
             return backend_job_id

         return wrapper