Implemented Basic EKS Integration #16571
Conversation
Not done looking at all the test results yet, but it looks like most/all failures were caused by either a single misnamed param that slipped through in an example DAG, or by whitespace issues in the docs. Correcting.
Can you explain why you had to change the label-when-reviewed action in this PR?
Just a first pass -- haven't reviewed in depth at all yet.
kwargs["client_type"] = self.client_type
super().__init__(*args, **kwargs)

def create_cluster(self, name: str, roleArn: str, resourcesVpcConfig: Dict, **kwargs) -> str:
Interesting conundrum here. The Python naming style says these args should be role_arn and resources_vpc_config -- but maybe it makes sense to mirror the Boto/AWS API names here. What do others think, and what do we do in other AWS hooks?
My thinking, for what it is worth, was that the Hooks should mirror the API as closely as possible. That would improve the learning curve for those coming to Airflow with some boto experience, and also make it easier for those new to the boto API and looking at their docs to see what is going on since the names match up.
Existing AWS hooks are a mixed bag, but do lean towards snake_case.
I think it is worth following the conventions used by boto to avoid any inaccuracies or mistakes.
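If the hooks do end up exposing snake_case names while calling through to boto3, the mapping between the two conventions is mechanical. Here is a hypothetical helper (not part of this PR; the function name is made up) that converts snake_case keyword arguments into the camelCase names the boto3 EKS client expects:

```python
def to_boto_params(**snake_kwargs):
    """Convert snake_case kwargs to the camelCase names boto3 expects.

    Hypothetical illustration only -- the PR itself debates whether to
    mirror boto names directly instead of translating them.
    """
    def camel(name: str) -> str:
        head, *rest = name.split("_")
        return head + "".join(part.capitalize() for part in rest)

    return {camel(key): value for key, value in snake_kwargs.items()}
```

With this approach a hook could accept `role_arn` and `resources_vpc_config` and still hand boto3 the `roleArn` and `resourcesVpcConfig` keys it requires.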
Force-pushed from 4c89c12 to ab5e1c0
@dimberman @mik-laj @vikramkoka - All CI tests are currently passing, ready for a proper review at your convenience.
    eks_cluster_name,
]
if role_arn:
    cli_args.extend(["--role-arn", role_arn])
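For context, the diff fragment above is assembling an AWS CLI invocation. A self-contained sketch of that logic might look like the following (the function name is hypothetical; the `aws eks get-token` subcommand and its flags are real):

```python
from typing import List, Optional


def build_get_token_args(eks_cluster_name: str, role_arn: Optional[str] = None) -> List[str]:
    """Assemble an `aws eks get-token` command line.

    Hypothetical reconstruction of the surrounding hook code: the
    --role-arn flag is appended only when a role ARN is provided.
    """
    cli_args = ["aws", "eks", "get-token", "--cluster-name", eks_cluster_name]
    if role_arn:
        cli_args.extend(["--role-arn", role_arn])
    return cli_args
```

The list would then be passed to something like `subprocess.run`, which is the external-tool dependency being questioned below.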
Why do we need to use an external tool?
Is it impossible to generate this token using STS?
import base64
import boto3
import re
from botocore.signers import RequestSigner
STS_TOKEN_EXPIRES_IN = 60
def get_bearer_token(session, cluster_id, region):
client = session.client('sts', region_name=region)
service_id = client.meta.service_model.service_id
signer = RequestSigner(
service_id,
region,
'sts',
'v4',
session.get_credentials(),
session.events
)
params = {
'method': 'GET',
'url': 'https://sts.{}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'.format(region),
'body': {},
'headers': {
'x-k8s-aws-id': cluster_id
},
'context': {}
}
signed_url = signer.generate_presigned_url(
params,
region_name=region,
expires_in=STS_TOKEN_EXPIRES_IN,
operation_name=''
)
base64_url = base64.urlsafe_b64encode(signed_url.encode('utf-8')).decode('utf-8')
# remove any base64 encoding padding:
return 'k8s-aws-v1.' + re.sub(r'=*', '', base64_url)
# If making a HTTP request you would create the authorization headers as follows:
session = boto3.session.Session()
headers = {'Authorization': 'Bearer ' + get_bearer_token(session, 'my_cluster', 'us-east-1')}
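Since the snippet above strips the base64 padding before prepending the `k8s-aws-v1.` prefix, the encoding is still reversible, which can be handy when debugging what a generated token actually contains. A small sketch (helper name is made up) that recovers the presigned STS URL from such a token:

```python
import base64

K8S_AWS_PREFIX = "k8s-aws-v1."


def decode_bearer_token(token: str) -> str:
    """Recover the presigned STS URL embedded in a k8s-aws-v1 bearer token.

    Hypothetical debugging helper: the generator strips '=' padding, so we
    re-add it before base64-decoding.
    """
    assert token.startswith(K8S_AWS_PREFIX)
    b64 = token[len(K8S_AWS_PREFIX):]
    padded = b64 + "=" * (-len(b64) % 4)
    return base64.urlsafe_b64decode(padded.encode("utf-8")).decode("utf-8")
```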
In the case of Google, we should be using normal Google tokens at some point.
hook.get_credentials().token
should be enough to generate a valid access token, but it is not clearly documented in the public docs.
Here is a real-world example of using a gcloud access token to access a Kubernetes cluster:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/using_gke_with_terraform#using-the-kubernetes-and-helm-providers
Using the AWS CLI was not our preference either, but we had to settle for it because the other options did not work correctly. The EKS service does not support providing the token directly at all. Searching online turned up a few workarounds, but the lesser evil was assuming that someone using an AWS service would be able to install an AWS tool. There is also precedent for it in Google's Kubernetes operators, which use the local gcloud script to accomplish the same thing.
I'll test your provided example to see if that works and report back. Dropping that dependency would be great.
That is asking for a problem later, IMHO.
I agree that this could bring unexpected problems in the future, but requiring users to install the AWS CLI, and integrating Airflow with the AWS CLI itself, can be very difficult in edge cases too. We do not have a perfect solution here. The authorization method rarely changes compared to the product API, so I believe it will be a stable enough approach.
Especially since it is described in the AWS documentation:
Amazon EKS uses IAM to provide authentication to your Kubernetes cluster through the AWS IAM authenticator for Kubernetes.
https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html
or
This can be used as an alternative to the aws-iam-authenticator.
https://docs.aws.amazon.com/cli/latest/reference/eks/get-token.html
In the aws-iam-authenticator documentation, we have the code snippet I cited above.
https://github.com/kubernetes-sigs/aws-iam-authenticator#api-authorization-from-outside-a-cluster
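If the CLI route is kept, the `aws eks get-token` reference linked above documents that the command prints a `client.authentication.k8s.io` ExecCredential JSON document, with the bearer token under `status.token`. A minimal sketch of parsing that output (the function name is made up):

```python
import json


def extract_exec_credential_token(get_token_stdout: str) -> str:
    """Pull the bearer token out of `aws eks get-token` output.

    Hypothetical helper: the CLI emits an ExecCredential document, e.g.
    {"kind": "ExecCredential", ..., "status": {"token": "k8s-aws-v1...."}}
    """
    doc = json.loads(get_token_stdout)
    return doc["status"]["token"]
```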
If you are concerned about maintaining this code, why not add it to the AWS SDK for Python? It could be helpful for other people as well. Have you thought about creating an internal ticket on this topic? We may not solve this problem now, but we can hope to improve it in the future.
Early tests with the proposed get_bearer_token (#16571 (comment)) are promising. I'm going to play with some edge cases and see how it holds up, but I like the direction that is taking.
In general, are we happy with the security practices used there? Any concerns we need to pay attention to or address with that approach?
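One security-relevant detail in the proposed snippet is that the presigned URL expires after STS_TOKEN_EXPIRES_IN (60 seconds), so the token cannot be generated once and reused indefinitely. A hypothetical sketch (class name and margin value are assumptions, not part of the PR) of refreshing the token before the window closes:

```python
import time


class BearerTokenCache:
    """Regenerate a short-lived bearer token before it expires.

    Hypothetical sketch: keeps a safety margin so a token is never
    handed out with only a sliver of its 60-second lifetime left.
    """

    def __init__(self, generate, ttl_seconds: float = 60.0, margin_seconds: float = 10.0):
        self._generate = generate          # zero-arg callable returning a fresh token
        self._ttl = ttl_seconds
        self._margin = margin_seconds
        self._token = None
        self._issued_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now - self._issued_at > self._ttl - self._margin:
            self._token = self._generate()
            self._issued_at = now
        return self._token
```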
This should be addressed in a6ee2c3
Should Address: apache#16571 (comment)
- Added Jinja template fields
- Refactored fields to snake_case since they are now exposed
- Removed a couple of straggling pylint instructions; pylint is no longer used
- Refactored using IDE magic:
  - any field named `self.conn_id` is now `self.aws_conn_id`
  - any param named `conn_id` is now `aws_conn_id`
  - any constant named `CONN_ID` is now `DEFAULT_CONN_ID`
- Sensors were missing template fields; added those.
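For anyone following along, "added template fields" refers to Airflow's Jinja templating: any attribute listed in a class-level `template_fields` tuple is rendered through Jinja before `execute()` runs. A simplified, hypothetical sketch (the class and field names here are illustrative, not the PR's actual classes):

```python
class EksClusterSensorSketch:
    """Simplified illustration of declaring Jinja template fields.

    Airflow renders every attribute named in template_fields through
    Jinja at runtime, so users can pass macros like "{{ ds }}".
    """

    template_fields = ("cluster_name", "target_state")

    def __init__(self, cluster_name: str, target_state: str, aws_conn_id: str = "aws_default"):
        self.cluster_name = cluster_name
        self.target_state = target_state
        self.aws_conn_id = aws_conn_id
```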
Code/features look good now; the integration logo appears to be empty, though?
Very strange. It looks fine locally, but you are right, it appears to be empty here on the PR. Just overwrote it with a new file; let's see if that fixes it. It is a 300x300 PNG file, same as the others AFAICT.
[EDIT: Looks like no change on the PR side with the new file. Was there something specific I am supposed to do to process/include the image?]
[EDIT 2: It displays fine on my fork. Are we sure they normally display in a PR? Seems odd that they wouldn't, but before I go chasing smoke I'd like to verify there is actually an issue.]
[EDIT 3: If you click "view file" in the PR here it does show correctly (now?), it just doesn't display in the file diff view.]
Yup, image looks good now. Previously it was a 98B file, now it's 11KiB :)
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR and push it with --force-with-lease.
Added hooks and operators for initial EKS support
Includes hooks and operators for the following EKS endpoints, along with tests and documentation:
- Describe Cluster
- Describe Nodegroup
- List Clusters
- List Nodegroups

Increases the required version of the moto package from 2.0.8 to 2.0.10 in order to ingest the new EKS support.

closes: #8544