Parse DAGs with no connections (offline) #489

Merged
merged 8 commits into from
Aug 25, 2023

jlaneve
Collaborator

@jlaneve jlaneve commented Aug 24, 2023

Description

Cosmos requires connections to be defined when using the dbt_ls parsing method. This is not ideal for two reasons:

  1. Requiring connections to parse DAGs is generally bad Airflow practice, because every parse has to reach out to a secrets backend or the Airflow metadata DB
  2. Connections aren't set up in all environments, e.g. CI/CD

This PR fixes the issue by generating a mock profile when Cosmos is parsing (not executing) the DAG. This works for everything except automatic_profile_mapping, which requires a live connection.
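
To illustrate the idea of a mock profile at parse time, here is a minimal sketch. It is not the Cosmos implementation: the class `SketchProfileMapping`, its methods, the `fetch_connection` callable, and the Postgres-style field names are all hypothetical and chosen only for illustration. The point it shows is that the parsing path never touches the connection source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class SketchProfileMapping:
    """Hypothetical stand-in for a profile mapping (not the Cosmos class)."""

    # Callable that fetches real credentials, e.g. from a secrets backend.
    fetch_connection: Callable[[], Dict[str, Any]]
    # Assumed set of fields a Postgres-style profile needs.
    required_fields: Tuple[str, ...] = (
        "host", "user", "password", "port", "dbname", "schema",
    )

    def mock_profile(self) -> Dict[str, Any]:
        """Placeholder values for parsing; never calls fetch_connection."""
        profile: Dict[str, Any] = {name: "mock_value" for name in self.required_fields}
        profile["port"] = 5432  # keep the numeric field numeric
        profile["type"] = "postgres"
        return profile

    def real_profile(self) -> Dict[str, Any]:
        """Real values for execution; this is the only path that needs a connection."""
        return {"type": "postgres", **self.fetch_connection()}

    def profile(self, parsing: bool) -> Dict[str, Any]:
        """Pick the mock profile while parsing, the real one while executing."""
        return self.mock_profile() if parsing else self.real_profile()
```

With this shape, an environment without connections (such as CI/CD) can still build the profile needed for parsing, because `fetch_connection` is only invoked when `parsing` is false.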

Related Issue(s)

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@netlify

netlify bot commented Aug 24, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit bc4deec
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/64e7fb19fd8f2f00099b69c1

@jlaneve jlaneve temporarily deployed to internal August 24, 2023 15:26 — with GitHub Actions Inactive
@pre-commit-ci pre-commit-ci bot temporarily deployed to internal August 24, 2023 15:27 Inactive
@jlaneve jlaneve temporarily deployed to internal August 24, 2023 15:34 — with GitHub Actions Inactive
@harels harels marked this pull request as ready for review August 24, 2023 19:47
@harels harels requested a review from a team as a code owner August 24, 2023 19:47
@harels harels requested a review from a team August 24, 2023 19:47
@harels harels temporarily deployed to internal August 24, 2023 19:47 — with GitHub Actions Inactive
@harels harels temporarily deployed to internal August 24, 2023 20:15 — with GitHub Actions Inactive
@harels harels temporarily deployed to internal August 24, 2023 20:29 — with GitHub Actions Inactive
@harels harels temporarily deployed to internal August 25, 2023 00:51 — with GitHub Actions Inactive
Contributor

@harels harels left a comment

made comments, applied changes and fixed tests. approving pending all tests passing

@codecov

codecov bot commented Aug 25, 2023

Codecov Report

Patch coverage: 85.07% and project coverage change: -0.23% ⚠️

Comparison is base (6498123) 91.78% compared to head (bc4deec) 91.56%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #489      +/-   ##
==========================================
- Coverage   91.78%   91.56%   -0.23%     
==========================================
  Files          50       50              
  Lines        1790     1849      +59     
==========================================
+ Hits         1643     1693      +50     
- Misses        147      156       +9     
| Files Changed | Coverage Δ |
|---|---|
| cosmos/profiles/redshift/user_pass.py | 88.23% <60.00%> (-11.77%) ⬇️ |
| cosmos/profiles/bigquery/oauth.py | 87.50% <66.66%> (-12.50%) ⬇️ |
| cosmos/profiles/bigquery/service_account_file.py | 87.50% <66.66%> (-12.50%) ⬇️ |
| .../profiles/bigquery/service_account_keyfile_dict.py | 94.59% <66.66%> (-5.41%) ⬇️ |
| cosmos/config.py | 92.77% <83.33%> (+0.27%) ⬆️ |
| cosmos/profiles/base.py | 95.57% <95.45%> (-0.22%) ⬇️ |
| cosmos/dbt/graph.py | 100.00% <100.00%> (ø) |
| cosmos/profiles/databricks/token.py | 100.00% <100.00%> (ø) |
| cosmos/profiles/exasol/user_pass.py | 100.00% <100.00%> (ø) |
| cosmos/profiles/postgres/user_pass.py | 100.00% <100.00%> (ø) |
| ... and 7 more | |

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@harels harels merged commit ccfde1e into main Aug 25, 2023
@harels harels deleted the parse-dags-no-conn branch August 25, 2023 12:18
@tatiana tatiana added this to the 1.1.0 milestone Aug 31, 2023
@tatiana tatiana mentioned this pull request Sep 6, 2023
tatiana added a commit that referenced this pull request Sep 6, 2023
**Features**

* Support dbt global flags (via dbt_cmd_global_flags in operator_args)
by @tatiana in #469
* Support parsing DAGs when there are no connections by @jlaneve in #489

**Enhancements**

* Hide sensitive field when using BigQuery keyfile_dict profile mapping
by @jbandoro in #471
* Consistent Airflow Dataset URIs, inlets and outlets with `Openlineage
package <https://pypi.org/project/openlineage-integration-common/>`_ by
@tatiana in #485. `Read more
<https://astronomer.github.io/astronomer-cosmos/configuration/lineage.html>`_.
* Refactor ``LoadMode.DBT_LS`` to run from a temporary directory with
symbolic links by @tatiana in #488
* Run ``dbt deps`` when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in
#481
* Update Cosmos log color to purple by @harels in #494
* Change operators to log ``dbt`` commands output as opposed to
recording to XCom by @tatiana in #513

**Bug fixes**

* Fix bug on select node add exclude selector subset ids logic by
@jensenity in #463
* Refactor dbt ls to run from a temporary directory, to avoid Read-only
file system errors during DAG parsing, by @tatiana in #414
* Fix profile_config arg in DbtKubernetesBaseOperator by @david-mag in
#505
* Fix SnowflakePrivateKeyPemProfileMapping private_key reference by
@nacpacheco in #501
* Fix incorrect temporary directory creation in VirtualenvOperator init
by @tatiana in #500
* Fix log propagation issue by @tatiana in #498
* Fix PostgresUserPasswordProfileMapping to retrieve port from
connection by @jlaneve in #511

**Others**

* Docs: Fix RenderConfig load argument by @jbandoro in #466
* Enable CI integration tests from external forks by @tatiana in #458
* Improve CI tests runtime by @tatiana in #457
* Change CI to run coverage after tests pass by @tatiana in #461
* Fix forks code revision in code coverage by @tatiana in #472
* [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #467
* Drop support for Python 3.7 in the CI test matrix by @harels in #490
* Add Airflow 2.7 to the CI test matrix by @tatiana in #487
* Add MyPy type checks to CI since we exceeded pre-commit disk quota
usage by @tatiana in #510
tatiana added a commit that referenced this pull request Oct 25, 2023
…fileMapping` is used (#625)

Since #489 was merged, the behavior of `LoadMode.AUTOMATIC` changed to
generate a `profiles.yml` file if the file didn't exist. However, we
forgot to remove the previously necessary condition for being able to
run `LoadMode.DBT_LS` (having the `profiles.yml` file).

This leads to inconsistent behaviour in Cosmos when using
`LoadMode.AUTOMATIC` and the `manifest.json` was not available:
1. If the user used a `ProfileConfig` with `profiles_yml_filepath`, it
would use `LoadMode.DBT_LS`
2. If the user used a `ProfileConfig` with a ProfileMapping class, it
would unnecessarily use `LoadMode.CUSTOM`

This PR fixes the behaviour to attempt to use `LoadMode.DBT_LS`
regardless of how the `ProfileConfig` was set.
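
The change in load-mode selection described above can be sketched as follows. This is an illustrative model only, not the Cosmos source: the local `LoadMode` enum reuses the member names from the commit message, `DBT_MANIFEST` is an assumed name for the manifest-based mode, and `resolve_before` / `resolve_after` are hypothetical helpers.

```python
from enum import Enum
from typing import Optional


class LoadMode(Enum):
    """Local stand-in enum; DBT_MANIFEST is an assumed member name."""

    DBT_MANIFEST = "dbt_manifest"
    DBT_LS = "dbt_ls"
    CUSTOM = "custom"


def resolve_before(manifest_available: bool,
                   profiles_yml_filepath: Optional[str]) -> LoadMode:
    """Old behaviour: DBT_LS was only chosen when a profiles.yml path was given."""
    if manifest_available:
        return LoadMode.DBT_MANIFEST
    if profiles_yml_filepath is not None:
        return LoadMode.DBT_LS
    # A ProfileMapping-based config fell through to the custom parser.
    return LoadMode.CUSTOM


def resolve_after(manifest_available: bool) -> LoadMode:
    """New behaviour: attempt DBT_LS no matter how the profile was configured."""
    if manifest_available:
        return LoadMode.DBT_MANIFEST
    return LoadMode.DBT_LS
```

Under these assumptions, a configuration without a `profiles.yml` path resolves to `LoadMode.CUSTOM` in the old logic and to `LoadMode.DBT_LS` in the new one, which is the inconsistency the commit message describes.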
tatiana added a commit that referenced this pull request Oct 25, 2023
…fileMapping` is used (#625)

(cherry picked from commit ad7dcf0)