Internal Server Error in Airflow API server with Keycloak provider when token is not active #59359

@dabla

Description

Apache Airflow Provider(s)

keycloak

Versions of Apache Airflow Providers

We are using the latest Airflow providers pinned by the Airflow 3.1.4 constraints.

The issue arose with Airflow 3.1.4 and apache-airflow-providers-keycloak 0.3.0.

Apache Airflow version

3.1.4

Operating System

Linux

Deployment

Other 3rd-party Helm chart

Deployment details

No response

What happened

When we keep a browser tab open for a long time (overnight, for example) and then refresh the Airflow DAGs page, the Airflow API server fails with the following exception:

INFO:     172.31.52.95:0 - "GET /favicon.ico HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
  + Exception Group Traceback (most recent call last):
  |   File "/usr/local/lib/python3.13/site-packages/starlette/_utils.py", line 79, in collapse_excgroups
  |     yield
  |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/base.py", line 183, in __call__
  |     async with anyio.create_task_group() as task_group:
  |                ~~~~~~~~~~~~~~~~~~~~~~~^^
  |   File "/usr/local/lib/python3.13/site-packages/anyio/_backends/_asyncio.py", line 783, in __aexit__
  |     raise BaseExceptionGroup(
  |         "unhandled errors in a TaskGroup", self._exceptions
  |     ) from None
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/usr/local/lib/python3.13/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
    |     result = await app(  # type: ignore[func-returns-value]
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |         self.scope, self.receive, self.send
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |     )
    |     ^
    |   File "/usr/local/lib/python3.13/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    |     return await self.app(scope, receive, send)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/usr/local/lib/python3.13/site-packages/fastapi/applications.py", line 1082, in __call__
    |     await super().__call__(scope, receive, send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/applications.py", line 113, in __call__
    |     await self.middleware_stack(scope, receive, send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/errors.py", line 186, in __call__
    |     raise exc
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
    |     await self.app(scope, receive, _send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 29, in __call__
    |     await responder(scope, receive, send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 130, in __call__
    |     await super().__call__(scope, receive, send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 46, in __call__
    |     await self.app(scope, receive, self.send_with_compression)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/cors.py", line 85, in __call__
    |     await self.app(scope, receive, send)
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/base.py", line 182, in __call__
    |     with recv_stream, send_stream, collapse_excgroups():
    |                                    ~~~~~~~~~~~~~~~~~~^^
    |   File "/usr/lib64/python3.13/contextlib.py", line 162, in __exit__
    |     self.gen.throw(value)
    |     ~~~~~~~~~~~~~~^^^^^^^
    |   File "/usr/local/lib/python3.13/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
    |     raise exc
    |   File "/usr/local/lib/python3.13/site-packages/starlette/middleware/base.py", line 184, in __call__
    |     response = await self.dispatch_func(request, call_next)
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/usr/local/lib/python3.13/site-packages/airflow/api_fastapi/auth/middlewares/refresh_token.py", line 45, in dispatch
    |     new_user = await self._refresh_user(current_token)
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/usr/local/lib/python3.13/site-packages/airflow/api_fastapi/auth/middlewares/refresh_token.py", line 68, in _refresh_user
    |     return get_auth_manager().refresh_user(user=user)
    |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
    |   File "/usr/local/lib/python3.13/site-packages/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py", line 121, in refresh_user
    |     tokens = client.refresh_token(user.refresh_token)
    |   File "/usr/local/lib/python3.13/site-packages/keycloak/keycloak_openid.py", line 410, in refresh_token
    |     return raise_error_from_response(data_raw, KeycloakPostError)
    |   File "/usr/local/lib/python3.13/site-packages/keycloak/exceptions.py", line 195, in raise_error_from_response
    |     raise error(
    |     ...<3 lines>...
    |     )
    | keycloak.exceptions.KeycloakPostError: 400: b'{"error":"invalid_grant","error_description":"Token is not active"}'
    +------------------------------------

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        self.scope, self.receive, self.send
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/fastapi/applications.py", line 1082, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.13/site-packages/starlette/applications.py", line 113, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 29, in __call__
    await responder(scope, receive, send)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 130, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/gzip.py", line 46, in __call__
    await self.app(scope, receive, self.send_with_compression)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/base.py", line 182, in __call__
    with recv_stream, send_stream, collapse_excgroups():
                                   ~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/contextlib.py", line 162, in __exit__
    self.gen.throw(value)
    ~~~~~~~~~~~~~~^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
    raise exc
  File "/usr/local/lib/python3.13/site-packages/starlette/middleware/base.py", line 184, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/airflow/api_fastapi/auth/middlewares/refresh_token.py", line 45, in dispatch
    new_user = await self._refresh_user(current_token)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/airflow/api_fastapi/auth/middlewares/refresh_token.py", line 68, in _refresh_user
    return get_auth_manager().refresh_user(user=user)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py", line 121, in refresh_user
    tokens = client.refresh_token(user.refresh_token)
  File "/usr/local/lib/python3.13/site-packages/keycloak/keycloak_openid.py", line 410, in refresh_token
    return raise_error_from_response(data_raw, KeycloakPostError)
  File "/usr/local/lib/python3.13/site-packages/keycloak/exceptions.py", line 195, in raise_error_from_response
    raise error(
    ...<3 lines>...
    )
keycloak.exceptions.KeycloakPostError: 400: b'{"error":"invalid_grant","error_description":"Token is not active"}'

What you think should happen instead

When KeycloakAuthManager refreshes the user, it calls the refresh_token method of the Keycloak client. If that call fails with a KeycloakPostError, the auth manager should catch the exception and return None instead of propagating it, because the unhandled exception makes the API server respond with HTTP 500 Internal Server Error.

Without that patch, the only way to recover is to clear the cookies for the API server and then refresh the page in the browser.

Either the fix above or clearing the cookies as a workaround forces a new authentication against Keycloak, which yields a fresh token instead of relying on the expired token stored in the cookie.

How to reproduce

Keep the Airflow DAGs page open until the token expires on the Keycloak side, then refresh it. The Airflow API server responds with an Internal Server Error page.

Anything else

Before the fix

When the API server received any request (e.g., GET /dags), Airflow’s refresh-token middleware tried to refresh the Keycloak token:

tokens = client.refresh_token(user.refresh_token)

If Keycloak returned invalid_grant (expired/invalid token), the python-keycloak client raised:

KeycloakPostError: 400: {"error":"invalid_grant"...}

This exception bubbled up unhandled, breaking the ASGI task group, causing:

500 Internal Server Error
ExceptionGroup: unhandled errors in a TaskGroup

So instead of prompting re-authentication, the API server answered the request with a 500.

After the fix

The fix wraps client.refresh_token(...) in a try/except KeycloakPostError that does the following:

  • Catches the error
  • Logs a warning
  • Returns None instead of raising

Because of that:

✔ The middleware no longer throws
✔ The API request is allowed to proceed
✔ A new login will be triggered on the next request (because None means “user is not authenticated anymore”)
✔ No more 500 errors from Uvicorn/Starlette

In short:

The fix prevents an unhandled Keycloak refresh error from crashing the API server. Instead, it logs the issue and gracefully forces re-authentication.
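
A minimal sketch of that behaviour, under stated assumptions: the User dataclass, the two fake clients and this standalone refresh_user function are hypothetical stand-ins written for illustration, and KeycloakPostError is redefined locally so the example runs without python-keycloak. It is not the provider's actual code, only the catch, log, and return-None pattern described above.

```python
import logging
from dataclasses import dataclass, replace

log = logging.getLogger(__name__)


class KeycloakPostError(Exception):
    """Stand-in for keycloak.exceptions.KeycloakPostError (assumption: avoids the dependency)."""


@dataclass(frozen=True)
class User:
    """Hypothetical user record; only the two token fields matter here."""

    access_token: str
    refresh_token: str


class ExpiredSessionClient:
    """Fake client that rejects every refresh, as Keycloak does for an inactive token."""

    def refresh_token(self, refresh_token):
        raise KeycloakPostError('400: {"error":"invalid_grant","error_description":"Token is not active"}')


class ActiveSessionClient:
    """Fake client that returns a fresh token pair."""

    def refresh_token(self, refresh_token):
        return {"access_token": "new-access", "refresh_token": "new-refresh"}


def refresh_user(client, user):
    """Return a user with fresh tokens, or None if the refresh is rejected."""
    try:
        tokens = client.refresh_token(user.refresh_token)
    except KeycloakPostError as exc:
        # Swallow the rejection so the caller can fall back to re-authentication.
        log.warning("Keycloak refused to refresh the token: %s", exc)
        return None
    return replace(
        user,
        access_token=tokens["access_token"],
        refresh_token=tokens["refresh_token"],
    )
```

With a stale session, refresh_user(ExpiredSessionClient(), user) returns None and logs a warning. With an active session it returns a copy of the user carrying the new tokens.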

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
