Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to specify which connection, variable or config are being looked up in the backend using *_lookup_pattern parameters #29580

Merged
merged 6 commits into from
Mar 1, 2023

Conversation

vincbeck
Copy link
Contributor

@vincbeck vincbeck commented Feb 16, 2023

Many Airflow customers use AWS Secret Manager as their backend and they run into their AWS Secrets Manager / Systems Manager cost ballooning as Airflow has a fixed search path and each connection/variable/config looks up first in Secrets Backend. This happen on DAG execution, environment setup and can happen in DAG parsing if the user has defined things in the top-level python code. Adding *_lookup_pattern parameters to allow the user to specify which connection/variable/config they want to be looked in Secrets Manager and/or Systems manager.

Which customers are impacted?

  • Only customers who use AWS Secret Manager or AWS Systems Manager as secret backend specified through the config [secret]
  • Among AWS secrets backend customers, only the customers who choose to define the lookup-pattern will see different behavior which will be based on their definition. There will be no change in the default behavior

Why this change is needed?

Some customers want to fallback quickly on the Airflow metastore or env variables instead of looking up first in the secret backend. One example is a user having one connection defined in AWS Secret Manager or AWS Systems Manager and the rest in Airflow metastore. The user does not want to make a request for each connection to the secret backend but only for the specific one.

Related: #19251

@Taragolis you might be interested to look into this one.

AWS folks: @ferruzzi, @vandonr-amz, @syedahsn

…d up in the backend using *_lookup_pattern parameters
@Taragolis
Copy link
Contributor

for each variable and connection access and it is done every time the Scheduler parses DAGs

It happen every time if users not follow Best Practices about Top Level Python Code 🤣 . Or some of the Operators call establish connections in constructor, I know only about SSHOperator from Community Providers with this behaviour

@eladkal
Copy link
Contributor

eladkal commented Feb 16, 2023

I agree with @Taragolis the problem as described in this PR is not the same as described in #19251 in the issue dag parsing is not in the problem domain.

@vincbeck Please share DAG code example that this PR attempts to solve

@shubham22
Copy link

shubham22 commented Feb 16, 2023

Customers don't always write the most optimized DAGs. Conns, Configs and variables are always searched in secrets backend before env variables and meta store. This change will allow customers to only lookup secrets backend when their variables, et al. follow a pattern defined by them. Based on MWAA data, I can confirm that customers are spending hundreds of dollars on their secrets backed, just to use it with Ariflow.

@potiuk
Copy link
Member

potiuk commented Feb 16, 2023

Maybe it's time resurrect this one from @Taragolis ? #23560

I have a feeling that adding lookup filter is quite a bit more complex (in terms of users understanding what to do there properly and in terms of usage protection). What happens if you change the pattern ? Will you remember to update it or you will be surprised by high involce next month?

@shubham22
Copy link

shubham22 commented Feb 16, 2023

in terms of users understanding what to do there properly and in terms of usage protection

This is a fair argument and agree that this can be misused or can lead to issues which are hard to track. We did consider adding only prefix, say "aws-conn-* instead of general regex, but Airflow customers are primed toward flexibility. This is why we went with regex to make it suitable for different needs. With documentation and, likely, a blogpost we will try to share with customers best practices around this. Like many other things, if customers really want to break, there are many ways to break it.

#23560
Oh, I didn't know this was tried before. We did consider making search order path configurable as well, but decided against it as search path order can lead to lot of side-effects, compared to providing customers a configurable lookup pattern option.

@eladkal
Copy link
Contributor

eladkal commented Feb 16, 2023

I'm sorry but I don't follow which problem are we tackling here.
The PR description talks about DAG parsing but @shubham22 comment is not about that.

@shubham22
Copy link

The PR description talks about DAG parsing but @shubham22 comment is not about that.

Sorry for the confusion, Elad. @vincbeck will soon update the description to make the scope and impact clear.

@Taragolis
Copy link
Contributor

Customers don't always write the most optimized DAGs. Conns, Configs and variables are always searched in secrets backend before env variables and meta store. This change will allow customers to only lookup secrets backend when their variables, et al. follow a pattern defined by them. Based on MWAA data, I can confirm that customers are spending hundreds of dollars on their secrets backed, just to use it with Ariflow.

That actually just postpone the problem, and might lead to problem with Aurora instances (AFAIK it uses in MWAA as Postgres backend), yeah it would be cheaper for end users, because in this case they would not pay for IOPS on Aurora in case of MWAA, but still can be an issue with backend itself.

Is MWAA still support only Secrets Manager by default? Because I guess SSM Parameter store much cheaper rather than AWS SM. You don't have to pay 0.4 cents secrets/per month, and no cost for API call for Standard Parameter with Standard Throughput, I know because personally use SSM since Secrets Backends introduced in Airflow 1.10.10

I do not have any concern about this changes, just want to mention that the main problem, that users do not follow Best Practice, and as result they have a huge or not bills. The huge bill might happen if user turn on DEBUG level in production and write everything to logs in this case CloudWatch pricing might be, well... pretty surprising.

Maybe it's time resurrect this one from @Taragolis ? #23560

Ooooh... My first PR to Airflow repo 🤣 That still in my list, actually in two list, TODO and make wide discussion about Configurations and consistency with Secrets Backend, still no idea how to resolve the problem, that Airflow DB use both Secrets Backend capability and own implementation for obtain Variables and Connections in CLI/API/Webserver

@vincbeck
Copy link
Contributor Author

Sorry for the bad description guys! I definitely did not expect so much confusion so sorry for that again. I updated the description which, hopefully, make the PR more clear.

I understand we should push people to write DAGs which follow guidelines but I am definitely not targeting specifically these users when doing this change. See the example in the description. And I feel there are a lot of other use cases which this feature can be a good add-on.

Copy link
Contributor

@dimberman dimberman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! A few small nits then glad to merge.

airflow/providers/amazon/aws/secrets/secrets_manager.py Outdated Show resolved Hide resolved
Comment on lines +91 to +94
:param connections_lookup_pattern: Specifies a pattern the connection ID needs to match to be looked up in
AWS Secrets Manager. Applies only if `connections_prefix` is not None.
If set to None (null value in the configuration), all connections will be looked up first in
AWS Secrets Manager.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this idea. Cool feature add on!

@@ -264,14 +286,14 @@ def get_conn_uri(self, conn_id: str) -> str | None:

def get_variable(self, key: str) -> str | None:
"""
Get Airflow Variable from Environment Variable
Get Airflow Variable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some more description here. If we're getting this variable from multiple places maybe we should be clear about the options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the context of this class, we are actually getting it from only one location: AWS Secrets Manager. If this function returns None, then it is fetched from other location: Environment variable then metastore but this is done outside of this class. See documentation here. Let me know if you still think I should update the documentation

airflow/providers/amazon/aws/secrets/secrets_manager.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/secrets/systems_manager.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/secrets/systems_manager.py Outdated Show resolved Hide resolved
@potiuk
Copy link
Member

potiuk commented Feb 20, 2023

Yep. I thouthg about it and this seems like non-obvious great way of solving the "cost" problem connected with secret backends.

@vincbeck vincbeck requested review from dimberman and removed request for eladkal and o-nikolas February 21, 2023 19:02
@vincbeck
Copy link
Contributor Author

Thanks @dimberman for the feedbacks. I addressed them, feel free to review when you get a chance :)

@eladkal
Copy link
Contributor

eladkal commented Feb 23, 2023

Should we modify a bit the description of https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/security/secrets/secrets-backend/index.html#search-path ?

If you enable an alternative secrets backend, it will be searched first, followed by environment variables, then metastore. This search ordering is not configurable.

When merging this PR the statement becomes not very accurate

@vincbeck
Copy link
Contributor Author

Should we modify a bit the description of https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/security/secrets/secrets-backend/index.html#search-path ?

If you enable an alternative secrets backend, it will be searched first, followed by environment variables, then metastore. This search ordering is not configurable.

When merging this PR the statement becomes not very accurate

Good point. I added a sentence saying that, depending on the secret backend used, there might be an option to filter which connection/variable/config is searched in the secret backend. Let me know if it does look good to you

@eladkal
Copy link
Contributor

eladkal commented Mar 1, 2023

@pierrejeambrun this needs to go to 2.5.2
from core point of view this PR contains only doc so this is a doc-change only.
The core logic/changes is in the Amazon provider so this will be a feature release for the provider as part of providers release cycle.

Airflow core -> minor release 2.5.2
Amazon provider -> feature release 7.3.0

@eladkal eladkal dismissed dimberman’s stale review March 1, 2023 19:32

All points are addressed

@eladkal
Copy link
Contributor

eladkal commented Mar 1, 2023

@dimberman I'm going to merge this PR so we can start working on providers + core release. If something else is needed we can address it in followup PR.

@eladkal eladkal merged commit 5de4791 into apache:main Mar 1, 2023
@eladkal eladkal modified the milestones: Airflow 2.6.0, Airflow 2.5.2 Mar 1, 2023
@eladkal eladkal added the type:doc-only Changelog: Doc Only label Mar 1, 2023
@pierrejeambrun
Copy link
Member

Thanks @eladkal for the details, I completely missed that!

@pierrejeambrun
Copy link
Member

pierrejeambrun commented Mar 23, 2023

Marked for 2.6, conflicting and requires 2 providers only change to merge the 1 line core doc change. Raise a specific PR to v2-5 branch to avoid cherry picking them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants