-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Ensure AWS ORM initialization and improve session configuration handling #54582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add `_ensure_db_session` to initialize orm when engine or session are None - Add `test_ensure_db_session_initializes_orm` to test ORM initialization logic
Move config creation inside get_session for better lazy evaluation to avoid session issue
providers/amazon/src/airflow/providers/amazon/aws/utils/eks_get_token.py
Outdated
Show resolved
Hide resolved
jason810496
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for the fix.
providers/amazon/src/airflow/providers/amazon/aws/hooks/base_aws.py
Outdated
Show resolved
Hide resolved
providers/amazon/src/airflow/providers/amazon/aws/utils/eks_get_token.py
Outdated
Show resolved
Hide resolved
providers/amazon/src/airflow/providers/amazon/aws/utils/eks_get_token.py
Outdated
Show resolved
Hide resolved
- Rename _ensure_db_session to _ensure_orm_configured - Simplify ORM check (remove session part) - add docstring with inline comment - Update corresponding test function name and calls
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand: where eks_get_token is going to be used an in what circumstances?
Airflow Providers in Airflow 3 are supposed to not use Airflow ORM at all. So it sounds strange that there is a tool in provider that expects to initialize it.
What's the purpose and scenario of running this tool ?
|
The In this case, |
|
Hi @potiuk, I just checked the usage of The airflow/providers/amazon/src/airflow/providers/amazon/aws/operators/eks.py Lines 1067 to 1075 in 9b46d76
The airflow/providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py Lines 540 to 617 in 9b46d76
This PR is more of a quick fix to resolve #53578. To fully address the issue without directly accessing the ORM, we would need to introduce a new Additionally, @viiccwen, it would be great if you could test this fix with a real EKS setup, thanks! |
task.sdk already supports retrieving credentials stored in airflow configuration and this is THE way to retrieve it. I assume this command is run internally in the worker, not maually by the user at some point in time, and in this case just using Connection.get() from |
|
Simply: unconditional ORM configuration in provider's code run in the worker is just plain wrong. It would only be acceptable if it is run by a human running this command manually after logging in to one of the containers that have database access - scheduler, triggerer, dag file processor. When your code is run in the worker on Airflow 3, you have no access to database credentials and ORM initialization will fail. |
|
I understand your point, and I agree that in Airflow 3 we should not have access to the database. If I am not mistaken about the purpose of the
The sh -c output=$({python_executable} -m airflow.providers.amazon.aws.utils.eks_get_token --cluster-name {eks_cluster_name} {args} 2>&1) ...Therefore, if we truly want to eliminate database access when retrieving connections in the context of |
Well. from what I see this command is executed on the pod within EKS cluster - and it would mean that EKS cluster will have to have DB credentials in order to initialize ORM - and worker does not have those credentials anyway (in Airflow 3). Initialising ORM in this place is just not going to work - unless of course EKS cluster has the credentials which is obviously wrong according to the Airflow security model. I think though there is a way better solution - that is also "proper" the credentials of AWS connection should be uploaded to the EKS cluster (temporarily) as secret and the PODs run on EKS should use that secret (and the secret should be deleted after the pod completes. You cannot reallly get the EKS cluster that is used there to communicate with TaskSDK because there will normally be no firewalls opened for the cluster worker to communicate with API server (not mentioning that similarly if you want to run ORM/DB access you will have to have the capability of the EKS cluster to communicate back with Airflow DB. In either case - I think initializing the ORM does not solve the problem at all |
jason810496
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless of course EKS cluster has the credentials which is obviously wrong according to the Airflow security model.
I see and agree with your point.
I overlooked the key issue that the EKS cluster would need to have database credentials in order to initialize the ORM. Thank you for highlighting this.
I think though there is a way better solution - that is also "proper" the credentials of AWS connection should be uploaded to the EKS cluster (temporarily) as secret and the PODs run on EKS should use that secret (and the secret should be deleted after the pod completes. You cannot reallly get the EKS cluster that is used there to communicate with TaskSDK because there will normally be no firewalls opened for the cluster worker to communicate with API server (not mentioning that similarly if you want to run ORM/DB access you will have to have the capability of the EKS cluster to communicate back with Airflow DB.
Yes, setting the token with the credentials from the AWS connection is a much more appropriate solution in this case, and it also aligns with the behavior of other providers.
So, @viiccwen, it would be helpful to refactor the get_eks_token logic to use the Airflow Connection extra field (perhaps named eks_sa_token). Thank you!
The implementation could follow the pattern used in athena_sql for retrieving the driver config from the extra field:
airflow/providers/amazon/src/airflow/providers/amazon/aws/hooks/athena_sql.py
Lines 146 to 151 in 9b46d76
| return dict( | |
| driver=self.conn.extra_dejson.get("driver", "rest"), | |
| schema_name=self.conn.schema, | |
| region_name=self.conn.region_name, | |
| aws_domain=self.conn.extra_dejson.get("aws_domain", "amazonaws.com"), | |
| ) |
Additionally, we should document this breaking change for EksHook, and provide guidance for users on how to set up credentials using Airflow Connection. For example, users should create a Service Account in EKS first, then set <SERVICE-ACCOUNT-TOKEN> in the Airflow Connection as eks_sa_token in the extra field.
cc @o-nikolas @vincbeck
Do you have any comments or feedback on this?
|
hello @jason810496 @potiuk, |
|
Since #54083 was merged all the EKS tests on our system test dashboard have been broken: https://aws-mwaa.github.io/#/open-source/system-tests/dashboard.html From what I can tell this PR is solving the same issue (or at least may solve the same issue). @viiccwen Have you made any progress on this task? I'd like to help expedite this one if possible! |
Sorry for lately reply, I didn't see this comment until you opened the PR. 🥲 |
Summary
This PR introduces two related fixes to stabilize ORM initialization and session handling in AWS-related utilities.
Changes
Add
_ensure_db_sessionineks_get_token_ensure_db_session()to explicitly initialize the ORM whenengineorSessionareNone.eks_get_tokenas a standalone CLI properly initializes Airflow settings before creating theEksHook.test_ensure_db_session_initializes_ormintest_eks_get_token.pyto validate this behavior.Refactor config initialization in
base_awsget_sessionmethod.Sessionis not yet available at import time.Files Changed
airflow/providers/amazon/aws/utils/eks_get_token.pyairflow/providers/amazon/aws/utils/tests/test_eks_get_token.pyairflow/providers/amazon/aws/hooks/base_aws.pyMotivation
Previously, invoking
eks_get_tokendirectly from CLI could fail because Airflow’s ORM (settings.engine,settings.Session) was not yet initialized. This PR ensures ORM initialization is explicit and reliable.Additionally, deferring configuration creation inside
get_sessionremoves the risk of creating invalid or stale sessions during import, aligning with lazy evaluation best practices.Testing
_ensure_db_sessionto validate ORM initialization whenengineandSessionareNone.Related Issues
resolves #53578
Reproduced Step
in
mainbranch:in
fix-aws-orm-initialization: