
fix: fab deserialize issue #62153

Merged
vincbeck merged 1 commit into apache:main from
cruseakshay:fix/fab-pending-rollback-session-cleanup
Feb 19, 2026

cruseakshay (Contributor) commented Feb 19, 2026

Problem

When using FAB auth manager, a database connection drop (e.g. PostgreSQL's
idle_in_transaction_session_timeout) causes the API server to return HTTP 500
on every subsequent request until it is restarted.

The cascade happens in the JWT auth path hit on every authenticated request:

JWTRefreshMiddleware → resolve_user_from_token → deserialize_user

deserialize_user uses FAB's scoped session (self.appbuilder.session). When a
connection dies, SQLAlchemy raises OperationalError on the first request and
leaves the session in an invalid state. All following requests reuse the same
poisoned thread-local session and raise PendingRollbackError.
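The failure mode described above can be illustrated with a toy model. This is only a sketch: `ToySession`, `ToyScopedSession`, and `run_requests` are hypothetical names, not SQLAlchemy or FAB code, and the built-in `ConnectionError` and `RuntimeError` merely play the roles of `OperationalError` and `PendingRollbackError`. It shows why one failed request keeps failing while the same thread-local session is reused, and how discarding that session lets later requests succeed.

```python
import threading


class ToySession:
    """Toy session: once a statement fails, it refuses work until discarded."""

    def __init__(self):
        self.poisoned = False

    def execute(self, connection_alive=True):
        if self.poisoned:
            # Plays the role of PendingRollbackError.
            raise RuntimeError("pending rollback")
        if not connection_alive:
            self.poisoned = True
            # Plays the role of OperationalError.
            raise ConnectionError("connection dropped")
        return "ok"


class ToyScopedSession:
    """Toy thread-local registry: one session per thread until remove() is called."""

    def __init__(self):
        self._local = threading.local()

    def __call__(self):
        if not hasattr(self._local, "session"):
            self._local.session = ToySession()
        return self._local.session

    def remove(self):
        if hasattr(self._local, "session"):
            del self._local.session


def run_requests(cleanup):
    """Simulate three requests on one thread; the connection is dead only for the first."""
    scoped = ToyScopedSession()
    outcomes = []
    for alive in (False, True, True):
        try:
            outcomes.append(scoped().execute(connection_alive=alive))
        except Exception as exc:
            outcomes.append(type(exc).__name__)
            if cleanup:
                scoped.remove()  # discard the poisoned session
    return outcomes


print(run_requests(cleanup=False))  # every request fails
print(run_requests(cleanup=True))   # only the first request fails
```

Without cleanup, the second and third requests fail even though the connection is healthy again, because they reuse the poisoned session.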

This is distinct from the WSGI Flask-view path fixed in #61480 and the
load_user path fixed in #61943 — those do not cover the JWT token
deserialization path.

Solution

Catch SQLAlchemyError in deserialize_user, call session.remove() to
discard the poisoned scoped session, and re-raise the original exception.
The next request gets a fresh connection from the pool and succeeds.

session.remove() is wrapped in contextlib.suppress(Exception) so a failure
during cleanup can never mask the original database error.

  • First request after a drop: unavoidable 500 (the dead connection must be
    discovered) — behaviour is unchanged.
  • All subsequent requests: recover automatically — no restart needed.
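A minimal sketch of the catch, clean up, and re-raise pattern described above. It is not the merged code: `StandInDBError` stands in for `sqlalchemy.exc.SQLAlchemyError` so the example runs without SQLAlchemy installed, and `deserialize_user_with_cleanup` and its `load_user` argument are hypothetical names chosen for illustration.

```python
import contextlib
from unittest import mock


class StandInDBError(Exception):
    """Stand-in for sqlalchemy.exc.SQLAlchemyError (assumption, keeps the sketch dependency-free)."""


def deserialize_user_with_cleanup(session, load_user, token):
    """Load the user for `token`; on a DB error, discard the scoped session and re-raise."""
    try:
        return load_user(session, token)
    except StandInDBError:
        # A failure during cleanup must never mask the original database error.
        with contextlib.suppress(Exception):
            session.remove()
        # Bare raise re-raises the original StandInDBError.
        raise


# Usage: the session is a mock whose remove() itself fails.
session = mock.Mock()
session.remove.side_effect = RuntimeError("cleanup failed")
failing_loader = mock.Mock(side_effect=StandInDBError("connection dropped"))

try:
    deserialize_user_with_cleanup(session, failing_loader, {"sub": "1"})
except StandInDBError as exc:
    print("propagated:", exc)
```

The bare `raise` sits after the `with` block, so the caller sees the original database error even when `session.remove()` throws.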

Testing

  • test_db_error_calls_session_remove — parametrized over OperationalError
    and PendingRollbackError: verifies session.remove() is called on each.
  • test_db_error_propagates_when_session_remove_raises — verifies the original
    SQLAlchemyError is always what propagates, even when session.remove() itself
    throws.
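The two tests listed above might look roughly like the following. This is a sketch under stated assumptions, not the tests from the PR: the stand-in exception classes replace `sqlalchemy.exc.OperationalError` and `PendingRollbackError`, `load_with_cleanup` is a hypothetical unit under test with the same shape as the fix, and `unittest` sub-tests take the place of whatever parametrization the real tests use.

```python
import contextlib
import unittest
from unittest import mock


class StandInOperationalError(Exception):
    """Stand-in for sqlalchemy.exc.OperationalError (assumption)."""


class StandInPendingRollbackError(Exception):
    """Stand-in for sqlalchemy.exc.PendingRollbackError (assumption)."""


DB_ERRORS = (StandInOperationalError, StandInPendingRollbackError)


def load_with_cleanup(session, query):
    """Hypothetical unit under test: catch a DB error, remove the session, re-raise."""
    try:
        return query()
    except DB_ERRORS:
        with contextlib.suppress(Exception):
            session.remove()
        raise


class CleanupTests(unittest.TestCase):
    def test_db_error_calls_session_remove(self):
        # One sub-test per error type, mirroring the parametrized test.
        for error_cls in DB_ERRORS:
            with self.subTest(error=error_cls.__name__):
                session = mock.Mock()
                query = mock.Mock(side_effect=error_cls("db down"))
                with self.assertRaises(error_cls):
                    load_with_cleanup(session, query)
                session.remove.assert_called_once_with()

    def test_db_error_propagates_when_session_remove_raises(self):
        session = mock.Mock()
        session.remove.side_effect = RuntimeError("cleanup failed")
        query = mock.Mock(side_effect=StandInOperationalError("db down"))
        # The original DB error, not the cleanup error, must reach the caller.
        with self.assertRaises(StandInOperationalError):
            load_with_cleanup(session, query)
```

Saved as a module, these can be run with `python -m unittest`.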

Fixes #61761 | #61518
Related to #61480, #61943

@cruseakshay cruseakshay force-pushed the fix/fab-pending-rollback-session-cleanup branch 2 times, most recently from c2912c7 to 45bb04d Compare February 19, 2026 06:23
@cruseakshay cruseakshay marked this pull request as ready for review February 19, 2026 06:59
@cruseakshay cruseakshay force-pushed the fix/fab-pending-rollback-session-cleanup branch from 157560c to c650489 Compare February 19, 2026 17:57
Co-authored-by: Cursor <cursoragent@cursor.com>
@cruseakshay cruseakshay force-pushed the fix/fab-pending-rollback-session-cleanup branch from c650489 to 2b18fa6 Compare February 19, 2026 18:27
@vincbeck vincbeck merged commit 99ae6d7 into apache:main Feb 19, 2026
86 checks passed
gavrik commented Feb 20, 2026

This fix is not working for me.

apache-airflow = 3.1.7
apache-airflow-providers=3.3.0

Downgrading apache-airflow-providers-fab to version 3.1.2 fixes the problem.

tschroeder-zendesk (Contributor) commented

> This fix is not working for me.
>
> apache-airflow = 3.1.7
> apache-airflow-providers=3.3.0
>
> Downgrading apache-airflow-providers-fab to version 3.1.2 fixes the problem.

I don't think there is a release with this fix yet. I did a separate fix related to 3.3.0, and both this PR and that one are needed for a full fix of the FAB issues in 3.3.0, so we need a new release before either can be used, afaik.
