@ashb
Member

@ashb ashb commented Feb 21, 2025

Since requests to the Task Execution API can originate from outside the "cluster", so to
speak, this PR re-works the JWTSigner class so that it is possible to use
public/private keys (rather than just a simple pre-shared secret). This is useful in
many cloud environments, where companies often have security requirements that
ingress gateways must validate JWTs on the way in, and the only way
of doing this is with public keys.

So that we don't have two different ways of generating JWTs, I have
totally replaced the old "JWTSigner" class (which, it turns out, didn't have any
unit tests of its own; it was only tested indirectly through test_serve_logs
etc.).

This PR itself lands the change to the JWT generation/validation code in Airflow,
but to keep things small(ish!) it doesn't actually use it inside the Execution API yet.
That will be in a future PR.

As part of this change I have also changed the JWT generated by the
SimpleAuthManager and the AwsAuthManager (the only two we have that use JWT)
to use the official sub (subject) claim to hold the user identifier rather
than a custom claim name.
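To illustrate the claim change, here is a minimal standard-library sketch of a token that carries the user identifier in the registered sub claim. It is not the code from this PR: the helper names (encode_hs256, decode_hs256, claims_for_user) are hypothetical, and it signs with HS256 and a shared secret only because the standard library has no asymmetric signing, whereas the point of this PR is to allow public/private keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url removed before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def claims_for_user(user_id: str, ttl_seconds: int = 3600) -> dict:
    # Hypothetical helper: the user identifier goes in the registered
    # "sub" claim instead of a custom claim name.
    now = int(time.time())
    return {"sub": user_id, "iat": now, "exp": now + ttl_seconds}


def encode_hs256(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_hs256(token: str, secret: bytes) -> dict:
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("Signature verification failed")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("Token has expired")
    return claims
```

A consumer would then read the user from claims["sub"] after decode_hs256 succeeds, rather than from an auth-manager-specific key.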

And although it might seem slightly strange at first, I have made the
JWTValidator an async class internally. (Hence avalidated_claims: the a
prefix signifies async, much like aclose or aread on HTTPX async
responses.) This allows us to periodically refresh the JWK document in the
background, if configured, and using asgiref's async_to_sync means we only
have one version of the code rather than separate sync and async ones.

Conveniently, this is also the same tech that FastAPI uses, which means that when
this is called from within a FastAPI app, rather than creating a new/nested
event loop it will "bubble out" to the main async loop that FastAPI is using.
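The async-first shape described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the JWTValidator from this PR: CachedKeyValidator, fetch_static_keys, fetch_count and the sync validated_claims wrapper are hypothetical names, keys are refreshed lazily on use rather than in the background, signature checking is omitted, and asyncio.run stands in for asgiref's async_to_sync (it does not reproduce asgiref's behaviour inside an already running loop).

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def fetch_static_keys() -> dict:
    # Hypothetical key source; a real one might fetch a JWK document over HTTP.
    return {"key-1": "public-key-material"}


class CachedKeyValidator:
    """Hypothetical async-first validator with one thin sync wrapper."""

    def __init__(
        self,
        fetch_keys: Callable[[], Awaitable[dict]],
        refresh_interval: float = 300.0,
    ) -> None:
        self._fetch_keys = fetch_keys
        self._refresh_interval = refresh_interval
        self._keys: dict = {}
        self._fetched_at: Optional[float] = None
        self.fetch_count = 0  # exposed only so the caching is observable

    async def _current_keys(self) -> dict:
        # Re-fetch the key set only when the cached copy is missing or stale.
        now = time.monotonic()
        if self._fetched_at is None or now - self._fetched_at >= self._refresh_interval:
            self._keys = await self._fetch_keys()
            self._fetched_at = now
            self.fetch_count += 1
        return self._keys

    async def avalidated_claims(self, key_id: str, claims: dict) -> dict:
        # The "a" prefix marks the coroutine version, as in the description.
        keys = await self._current_keys()
        if key_id not in keys:
            raise ValueError(f"Unknown key id: {key_id}")
        # Signature verification against keys[key_id] is omitted in this sketch.
        return claims

    def validated_claims(self, key_id: str, claims: dict) -> dict:
        # Sync entry point that reuses the async implementation, so the logic
        # exists only once. asyncio.run is a stand-in for async_to_sync.
        return asyncio.run(self.avalidated_claims(key_id, claims))
```

Synchronous callers would use validated_claims, while code already running in an event loop would await avalidated_claims directly.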

Precursor to #45107



@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:logging provider:amazon AWS/Amazon - related issues provider:edge Edge Executor / Worker (AIP-69) / edge3 provider:fab labels Feb 21, 2025

Member

@pierrejeambrun pierrejeambrun left a comment


Nice, that's a big one!

@ashb ashb force-pushed the generate-task-jwt-tokens branch 2 times, most recently from f376eb1 to 1c9548c Compare March 6, 2025 21:35
@ashb ashb changed the title 🚧 Generate and validate JWT tokens from the Execution API 🚧 Re-work JWT Validation and Generation to use public/private key and official claims Mar 11, 2025
@ashb ashb force-pushed the generate-task-jwt-tokens branch from 1c9548c to 9ea089e Compare March 11, 2025 18:22
@ashb ashb changed the title 🚧 Re-work JWT Validation and Generation to use public/private key and official claims Re-work JWT Validation and Generation to use public/private key and official claims Mar 11, 2025
@ashb ashb force-pushed the generate-task-jwt-tokens branch from 9ea089e to c8dfbdd Compare March 11, 2025 18:33
@ashb ashb force-pushed the generate-task-jwt-tokens branch 3 times, most recently from 97a4e5d to cf06c57 Compare March 12, 2025 11:42
@ashb ashb force-pushed the generate-task-jwt-tokens branch from cf06c57 to 3170f6e Compare March 12, 2025 15:25
@bugraoz93
Contributor

bugraoz93 commented Mar 12, 2025

Do we need a different expiration period for CLI and non-CLI use? Is there any practical difference in the token?

I didn't want to unresolve the conversation, but I wanted to clarify that it was not strictly needed and there isn't any practical difference in the token. It is just a layer of flexibility for the CLI, since the CLI is mostly used in automation, which could require longer expiration periods than regular API calls coming from users. In the end, both are the same token and are used in both places, but we could prevent that in the next steps.

@ashb ashb force-pushed the generate-task-jwt-tokens branch 2 times, most recently from f8901d5 to c57b9ee Compare March 12, 2025 23:16
@ashb ashb force-pushed the generate-task-jwt-tokens branch 4 times, most recently from 17ed712 to 97723fe Compare March 13, 2025 11:42
@ashb ashb force-pushed the generate-task-jwt-tokens branch 4 times, most recently from 7c56213 to 30d4aec Compare March 13, 2025 18:28
ashb and others added 4 commits March 14, 2025 08:23
…I to use

@ashb ashb force-pushed the generate-task-jwt-tokens branch from 30d4aec to 6721349 Compare March 14, 2025 08:30
@Lee-W Lee-W self-requested a review March 14, 2025 08:45
@ashb ashb force-pushed the generate-task-jwt-tokens branch from 6721349 to 1eaa7c1 Compare March 14, 2025 08:59
@ashb ashb merged commit 74f4860 into main Mar 14, 2025
148 checks passed
@ashb ashb deleted the generate-task-jwt-tokens branch March 14, 2025 21:18
@vincbeck
Contributor

I think this PR introduces a bug in token validation.

To reproduce:

  • Do not set [api_auth] jwt_secret in config
  • Login (with any auth manager)
  • Any call then results in a 403 because token validation fails (ERROR - JWT token is not valid: Signature verification failed), which ends up creating an infinite loop in the browser

After a very short investigation (sorry, I could not put more time into it; I'll continue on Wednesday if it is not solved by then), it looks like the secret generated in get_signing_key in airflow/api_fastapi/auth/tokens.py is generated multiple times. The conf.set( does not work. I do not quite know why.
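The failure mode described here can be reproduced in isolation with a small standard-library sketch. It is not the code in tokens.py and not the fix that was eventually applied: fresh_signing_key, stable_signing_key, sign and verify are hypothetical names, and an HMAC over raw bytes stands in for JWT signing. It only shows that a secret regenerated on every call cannot verify what an earlier call signed, while a secret generated once can (within a single process).

```python
import functools
import hashlib
import hmac
import secrets


def fresh_signing_key() -> bytes:
    # Problematic pattern: every call yields a different random secret.
    return secrets.token_bytes(32)


@functools.cache
def stable_signing_key() -> bytes:
    # Generated once per process, then reused by every later call.
    return secrets.token_bytes(32)


def sign(message: bytes, key: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, signature: bytes, key: bytes) -> bool:
    return hmac.compare_digest(sign(message, key), signature)


message = b"header.payload"

# Signed with one generated secret, checked with another: verification fails.
mismatched = verify(message, sign(message, fresh_signing_key()), fresh_signing_key())

# Signed and checked with the same cached secret: verification succeeds.
matched = verify(message, sign(message, stable_signing_key()), stable_signing_key())
```

A per-process cache like this would not help when signing and validation happen in different processes; in that case the secret has to come from shared configuration.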

@vincbeck
Contributor

Never mind, it has been fixed

agupta01 pushed a commit to agupta01/airflow that referenced this pull request Mar 21, 2025
…fficial claims (apache#46981)

Co-authored-by: Jens Scheffler <jscheffl@apache.org>
nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Apr 4, 2025
…fficial claims (apache#46981)

Co-authored-by: Jens Scheffler <jscheffl@apache.org>