
Add default source nodes rendering #661

Closed
wants to merge 223 commits

Conversation

arojasb3
Contributor

@arojasb3 arojasb3 commented Nov 9, 2023

Description

I'm aiming to give a default behaviour to source nodes for checking their source freshness and their tests.
One of the main limitations I found while using the custom_callback functions on source nodes to check freshness is that those nodes were being created for 100% of sources, but not all of them required freshness checks, which made workers waste compute time.

I'm adding a new variable to the DbtNode class called has_freshness, which will be True for sources with freshness checks and False for sources without them and for any other resource type.

All sources with has_freshness == False will be rendered as EmptyOperators, to keep dbt's behavior of showing sources, as suggested in issue #630.
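The idea can be sketched as follows. This is a simplified stand-in, not the actual Cosmos class (the real `DbtNode` lives in `cosmos/dbt/graph.py` and differs); it only illustrates deriving `has_freshness` from a source's freshness configuration:

```python
from dataclasses import dataclass, field


@dataclass
class DbtNode:
    """Simplified stand-in for Cosmos's DbtNode (illustrative only)."""

    unique_id: str
    resource_type: str  # "source", "model", "seed", ...
    config: dict = field(default_factory=dict)

    @property
    def has_freshness(self) -> bool:
        # True only for sources that declare a freshness block; False for
        # sources without one and for every other resource type.
        if self.resource_type != "source":
            return False
        return bool(self.config.get("freshness"))


fresh = DbtNode(
    "source.jaffle_shop.raw_orders",
    "source",
    {"freshness": {"warn_after": {"count": 12, "period": "hour"}}},
)
stale = DbtNode("source.jaffle_shop.raw_stores", "source")
model = DbtNode("model.jaffle_shop.orders", "model")
```

With this flag, the renderer can branch: sources where `has_freshness` is True get a real freshness-check task, everything else gets an EmptyOperator.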

Related Issue(s)

#630

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works


netlify bot commented Nov 9, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit 99d4be7
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/654c55a5e83e2700089f7f86

```python
task_full_name = node.unique_id[len("source.") :]
task_id = f"{task_full_name}_source"
args["select"] = f"source:{node.unique_id[len('source.'):]}"
args["models"] = None
```
Contributor Author

@arojasb3 arojasb3 Nov 9, 2023


@tatiana I have a feeling that these lines could be written somewhere else.
My dbt testing showed that `dbt source freshness --models [any]` fails, while using `--select` works. I ended up manually setting the `models` argument to `None`.
Any thoughts?

```python
if node.has_freshness is False:
    return TaskMetadata(
        id=task_id,
        # arguments=args,
```
Contributor Author

@arojasb3 arojasb3 Nov 9, 2023


@tatiana Returning an EmptyOperator just seemed "grayish" in the DAG UI. Looking at all the cool cosmos-related operators ("DbtLocal...", "Dbt...", etc.), having an EmptyOperator didn't seem to fit the "cosmos pattern". Should we create a dummy source operator that inherits from EmptyOperator, just for the sake of having all operators follow the same naming pattern?
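Something like the following hypothetical sketch. The subclass name is an assumption (no such operator exists in Cosmos at this point), and `EmptyOperator` here is a small stand-in for `airflow.operators.empty.EmptyOperator` so the snippet runs without an Airflow installation:

```python
class EmptyOperator:
    """Stand-in for airflow.operators.empty.EmptyOperator: a task that executes nothing."""

    def __init__(self, task_id: str, **kwargs) -> None:
        self.task_id = task_id


class DbtSourceEmptyOperator(EmptyOperator):
    """Hypothetical: renders a dbt source without freshness checks or tests.

    Behaves exactly like EmptyOperator; it exists only so the task's class
    name follows the Dbt* naming pattern in the Airflow UI.
    """


task = DbtSourceEmptyOperator(task_id="raw_orders_source")
```

Since it adds no behavior, the only trade-off is one extra class in exchange for consistent operator names in the UI.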

```diff
@@ -742,3 +742,15 @@ def __init__(self, **kwargs: str) -> None:
     raise DeprecationWarning(
         "The DbtDepsOperator has been deprecated. " "Please use the `install_deps` flag in dbt_args instead."
     )


+class DbtSourceLocalOperator(DbtLocalBaseOperator):
```
Contributor Author


@tatiana

Since #655 also creates a pretty similar Operator, I can wait until that is merged to follow the same naming pattern.

For now, I used the DbtSourceLocalOperator name to get this working.

@tatiana tatiana added the area:rendering Related to rendering, like Jinja, Airflow tasks, etc label Nov 9, 2023
@tatiana tatiana added the status:awaiting-reviewer The issue/PR is awaiting for a reviewer input label Dec 14, 2023
@tatiana tatiana added this to the 1.6.0 milestone Apr 30, 2024
@pankajastro
Contributor

Hi @arojasb3, it looks like you've made great progress on this feature! Would you be able to rebase it so it can be reviewed and merged? Thanks!

pre-commit-ci bot and others added 16 commits July 13, 2024 21:02
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.3 →
v0.1.4](astral-sh/ruff-pre-commit@v0.1.3...v0.1.4)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bug fixes

* Support ProjectConfig.dbt_project_path = None & different paths for Rendering and Execution by @MrBones757 in astronomer#634
* Fix adding test nodes to DAGs built using LoadMethod.DBT_MANIFEST and LoadMethod.CUSTOM by @edgga in astronomer#615

Others

* Add pre-commit hook for McCabe max complexity check and fix errors by @jbandoro in astronomer#629
* Update contributing docs for running integration tests by @jbandoro in astronomer#638
* Fix CI issue running integration tests by @tatiana in astronomer#640 and astronomer#644
* pre-commit updates in astronomer#637

(cherry picked from commit fa0620a)
…ronomer#649)

Adds a snowflake mapping for encrypted private key using an environment variable

Closes: astronomer#632

Breaking Change?
This does rename the previous SnowflakeEncryptedPrivateKeyPemProfileMapping to SnowflakeEncryptedPrivateKeyFilePemProfileMapping, but this makes it clearer now that a new SnowflakeEncryptedPrivateKeyPemProfileMapping is added which supports the env variable. It was also only released as a pre-release change.
This allows you to fully refresh a model from the console.
Full-refresh/backfill is a common task. Using Airflow parameters makes
this easy. Without this, you'd have to trigger an entire deployment. In our
setup, company analysts manage their models without modifying
the DAG code. This empowers such users.

Example of usage:
```python
with DAG(
        dag_id="jaffle",
        params={"full_refresh": Param(default=False, type="boolean")},
        render_template_as_native_obj=True
):
    task = DbtTaskGroup(
        operator_args={"full_refresh": "{{ params.get('full_refresh') }}", "install_deps": True},
    )

```

Closes: astronomer#151
…opagation if desired (astronomer#648)

Add Airflow config check for cosmos/propagate_logs to allow override of
default propagation behavior.
Expose entry-point so that Airflow can theoretically detect
configuration default.

Closes astronomer#639 

## Breaking Change?

This is backward-compatible as it falls back to default behavior if the
`cosmos` section or `propagate_logs` option don't exist.

## Checklist

- [X] I have made corresponding changes to the documentation (if
required)
- [X] I have added tests that prove my fix is effective or that my
feature works

---------

Co-authored-by: Andrew Greenburg <agreenburg@vergeventures.net>
If execution_config was reused, Cosmos 1.2.2 would raise:

```
astronomer-cosmos/dags/basic_cosmos_task_group.py
Traceback (most recent call last):
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/venv-38/lib/python3.8/site-packages/airflow/models/dagbag.py", line 343, in parse
    loader.exec_module(new_module)
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/dags/basic_cosmos_task_group.py", line 74, in <module>
    basic_cosmos_task_group()
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/venv-38/lib/python3.8/site-packages/airflow/models/dag.py", line 3817, in factory
    f(**f_kwargs)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/dags/basic_cosmos_task_group.py", line 54, in basic_cosmos_task_group
    orders = DbtTaskGroup(
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/airflow/task_group.py", line 26, in __init__
    DbtToAirflowConverter.__init__(self, *args, **specific_kwargs(**kwargs))
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/converter.py", line 113, in __init__
    raise CosmosValueError(
cosmos.exceptions.CosmosValueError: ProjectConfig.dbt_project_path is mutually exclusive with RenderConfig.dbt_project_path and ExecutionConfig.dbt_project_path.If using RenderConfig.dbt_project_path or ExecutionConfig.dbt_project_path, ProjectConfig.dbt_project_path should be None
```

This has been raised by an Astro customer and our field engineer, who
tried to run: https://github.com/astronomer/cosmos-demo
Adds the `aws_session_token` argument to Athena, which was added to
dbt-athena 1.6.4 in dbt-labs/dbt-athena#459

Closes: astronomer#609

Also addresses this comment:
astronomer#578 (comment)
…AL` (astronomer#659)

Extends the local operator to run `dbt deps` with the provided profile flags.

This makes the logic consistent between DAG parsing and task running as
referenced below

https://github.com/astronomer/astronomer-cosmos/blob/8e2d5908ce89aa98813af6dfd112239e124bd69a/cosmos/dbt/graph.py#L247-L266

Closes: astronomer#658
Since Cosmos 1.2.2, users who used `ExecutionMode.DBT_LS` (directly or
via `ExecutionMode.AUTOMATIC`) and set
`ExecutionConfig.dbt_executable_path` (most, if not all, Astro CLI
users), like:

```
execution_config = ExecutionConfig(
    dbt_executable_path = f"{os.environ['AIRFLOW_HOME']}/dbt_venv/bin/dbt",
)
```

started facing the issue:
```
Broken DAG: [/usr/local/airflow/dags/my_example.py] Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/cosmos/dbt/graph.py", line 178, in load
    self.load_via_dbt_ls()
  File "/usr/local/lib/python3.11/site-packages/cosmos/dbt/graph.py", line 233, in load_via_dbt_ls
    raise CosmosLoadDbtException(f"Unable to find the dbt executable: {self.dbt_cmd}")
cosmos.dbt.graph.CosmosLoadDbtException: Unable to find the dbt executable: dbt
```

This issue was initially reported in the Airflow #airflow-astronomer
Slack channel:
https://apache-airflow.slack.com/archives/C03T0AVNA6A/p1699584315506629

The workaround to avoid this error in Cosmos 1.2.2 and 1.2.3 is to set
the `dbt_executable_path` in the `RenderConfig`:
```
render_config=RenderConfig(dbt_executable_path = f"{os.environ['AIRFLOW_HOME']}/dbt_venv/bin/dbt",),
```

This PR solves the bug from Cosmos 1.2.4 onwards.
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.4 →
v0.1.5](astral-sh/ruff-pre-commit@v0.1.4...v0.1.5)
- [github.com/psf/black: 23.10.1 →
23.11.0](psf/black@23.10.1...23.11.0)
- [github.com/pre-commit/mirrors-mypy: v1.6.1 →
v1.7.0](pre-commit/mirrors-mypy@v1.6.1...v1.7.0)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
astronomer#660)

This PR refactors the `create_symlinks` function that was previously
used in load via dbt ls so that it can be used in
`DbtLocalBaseOperator.run_command` instead of copying the entire
directory.

Closes: astronomer#614
…stronomer#671)

Update `DbtLocalBaseOperator` code to store `compiled_sql` prior to
exception handling so that when a task fails, the `compiled_sql` can still
be reviewed.

In the process found and fixed a related bug where `compiled_sql` was
being dropped on some operations due to the way that the `full_refresh`
field was being added to the `template_fields`.

Closes astronomer#369

Fixes bug introduced in
astronomer#623 where
compiled_sql was being lost in `DbtSeedLocalOperator` and
`DbtRunLocalOperator`

Co-authored-by: Andrew Greenburg <agreenburg@vergeventures.net>
dwreeves and others added 20 commits July 13, 2024 21:05
…stronomer#1054)

Using the Airflow metadata database connection as an example connection
is misleading.

The mismatch in the environment variable value used in the Cosmos
integration tests, particularly with sqlite as the Airflow metadata
database, is an issue that can hide other underlying problems.

This PR decouples the test connection used by Cosmos example DAGs from
the Airflow metadata Database connection.

Since this change affects the Github action configuration, it will only
work for the branch-triggered GH action runs, such as:
https://github.com/astronomer/astronomer-cosmos/actions/runs/9596066209

Because this is a breaking change to the CI script itself, all the
`pull_request_target` tests are expected to fail during the PR and will
pass once this is merged to `main`.

This improvement was originally part of astronomer#1014

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
As part of the CI build, we create a Python virtual environment with the
dependencies necessary to run the tests.

Currently, we recreate this environment every time a Github Action job
is run.
This PR caches the hatch folder, which stores the Python virtualenv.

It seems to have helped reduce times a bit, although the jobs are still
very slow:
- Unit tests execution from
~[2:40](https://github.com/astronomer/astronomer-cosmos/actions/runs/9550554350/job/26322778438)
to
[~2:25](https://github.com/astronomer/astronomer-cosmos/actions/runs/9598977261/job/26471650029)
- Integration tests execution from
[~11:07](https://github.com/astronomer/astronomer-cosmos/actions/runs/9550554350/job/26322894839)
to
[~10:27](https://github.com/astronomer/astronomer-cosmos/actions/runs/9598977261/job/26471677561)
## Description
~shutil.copy includes permission copying via chmod.
If the user lacks permission to run chmod, a PermissionError occurs.
To avoid this, we split the operation into two steps:
first, copy the file contents; then, copy metadata if feasible without
raising exceptions.
Step 1: Copy file contents (no metadata)
Step 2: Copy file metadata (permission bits and other metadata) without
raising exception~

use shutil.copyfile(...) instead of shutil.copy(...) to avoid running
chmod
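The difference the fix relies on: `shutil.copy()` copies file contents *and* permission bits (it runs `os.chmod` on the destination), while `shutil.copyfile()` copies contents only, so it cannot fail on `chmod` in a restricted environment. A minimal illustration:

```python
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "profiles.yml")
dst = os.path.join(tmp, "profiles_copy.yml")

with open(src, "w") as f:
    f.write("jaffle_shop:\n  target: dev\n")
os.chmod(src, 0o640)

# copyfile transfers the bytes only; unlike shutil.copy, it never calls
# os.chmod on dst, so it works even where chmod is not permitted.
shutil.copyfile(src, dst)

with open(dst) as f:
    copied = f.read()
```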

## Related Issue(s)

closes: astronomer#1008

## Breaking Change?

No

## Checklist

- [ ] I have made corresponding changes to the documentation (if
required)
- [ ] I have added tests that prove my fix is effective or that my
feature works
Add the node's attributes (config, tags, etc.) to the TaskInstance
context for retrieval by callback functions in Airflow, using
`pre_execute` to store these attributes in the task's context.

As [this
PR](https://github.com/astronomer/astronomer-cosmos/pull/700/files)
seems to be closed, and I have a use case for this feature, I attempt to
recreate the needed feature.

We leverage the `context_merge` utility function from Airflow to merge
the extra context into the `Context` object of a `TaskInstance`.

Closes astronomer#698
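The mechanism can be sketched roughly as below. This is an assumed shape, not the actual Cosmos code: a plain dict stands in for Airflow's `Context`, and the local `context_merge` mimics `airflow.utils.context.context_merge` so the snippet runs without Airflow:

```python
def context_merge(context: dict, **kwargs) -> None:
    """Mimics Airflow's context_merge: merge extra keys into the Context."""
    context.update(kwargs)


class DbtNodeContextMixin:
    """Stores the dbt node's attributes in the task context before execution."""

    extra_context = {
        "dbt_node_config": {"tags": ["finance"], "materialized": "table"},
    }

    def pre_execute(self, context: dict) -> None:
        # Runs before execute(); whatever is merged here is visible to
        # on_success/on_failure callbacks via the TaskInstance context.
        context_merge(context, **self.extra_context)


context = {"ds": "2024-07-14"}
DbtNodeContextMixin().pre_execute(context)
```

After `pre_execute` runs, a callback can read `context["dbt_node_config"]` alongside the usual Airflow context keys.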
…#1063)

Add ability to specify `host`/`port` for Snowflake connection.

At LocalStack, we have recently started building a Snowflake emulator that allows running SF queries entirely on the local machine:
https://blog.localstack.cloud/2024-05-22-introducing-localstack-for-snowflake/

As part of a sample application we're building, we have an Apache
Airflow DAG that uses Cosmos (and DBT) to connect to the local Snowflake
emulator running on `localhost`. Here is a link to the sample app:
localstack-samples/localstack-snowflake-samples#12

Currently, we're hardcoding this integration in the user DAG file
itself, [see
here](https://github.com/localstack-samples/localstack-snowflake-samples/pull/12/files#diff-559d4f883ad589522b8a9d33f87fe95b0da72ac29b775e98b273a8eb3ede9924R10-R19):
```
...
from cosmos.profiles.snowflake.user_pass import SnowflakeUserPasswordProfileMapping
...
SnowflakeUserPasswordProfileMapping.airflow_param_mapping["host"] = "extra.host"
SnowflakeUserPasswordProfileMapping.airflow_param_mapping["port"] = "extra.port"
...
```
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.9 →
v0.4.10](astral-sh/ruff-pre-commit@v0.4.9...v0.4.10)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…le (astronomer#1014)

Significantly improve the `LoadMode.DBT_LS` performance. The example
DAGs tested reduced the task queueing time significantly (from ~30s to
~0.5s) and the total DAG run time for Jaffle Shop from 1 min 25s to 40s
(by more than 50%). Some users[ reported improvements of
84%](astronomer#1014 (comment))
in the DAG run time when trying out these changes. This difference can
be even more significant on larger dbt projects.

The improvement was accomplished by caching the dbt ls output as an
Airflow Variable. This is an alternative to astronomer#992, when we cached the
pickled DAG/TaskGroup into a local file in the Airflow node. Unlike
astronomer#992, this approach works well for distributed deployments of Airflow.

As with any caching solution, this strategy does not guarantee optimal
performance on every run—whenever the cache is regenerated, the
scheduler or DAG processor will experience a delay. It was also observed
that the key value could change across platforms (e.g., `Darwin` and
`Linux`). Therefore, if using a deployment with heterogeneous OS, the
key may be regenerated often.

Closes: astronomer#990
Closes: astronomer#1061

**Enabling/disabling this feature**

This feature is enabled by default.
Users can disable it by setting the environment variable
`AIRFLOW__COSMOS__ENABLE_CACHE_DBT_LS=0`.

**How the cache is refreshed**

Users can purge or delete the cache via Airflow UI by identifying and
deleting the cache key.

The cache will be automatically refreshed in case any files of the dbt
project change. Changes are calculated using the SHA256 of all the files
in the directory. Initially, this feature was implemented using the
files' modified timestamp, but this did not work well for some Airflow
deployments (e.g., `astro --dags` since the timestamp was changed during
deployments).
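The content-based change detection described above can be sketched like this (the exact Cosmos implementation differs; names are illustrative). Every file's bytes are folded, in a deterministic order, into a single SHA-256 digest:

```python
import hashlib
import tempfile
from pathlib import Path


def dbt_project_hash(project_dir: str) -> str:
    digest = hashlib.sha256()
    for path in sorted(Path(project_dir).rglob("*")):
        if path.is_file():
            # Include the relative path so renames also change the digest.
            digest.update(str(path.relative_to(project_dir)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


tmp = tempfile.mkdtemp()
Path(tmp, "dbt_project.yml").write_text("name: jaffle_shop\n")
before = dbt_project_hash(tmp)
unchanged = dbt_project_hash(tmp)  # no edits -> same digest
Path(tmp, "dbt_project.yml").write_text("name: jaffle_shop\nversion: 2\n")
after = dbt_project_hash(tmp)  # content changed -> new digest
```

Unlike modified timestamps, this digest is stable across deployments that rewrite file metadata, which is exactly the `astro --dags` case mentioned above.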

Additionally, if any of the following DAG configurations are changed,
we'll automatically purge the cache of the DAGs that use that specific
configuration:
* `ProjectConfig.dbt_vars`
* `ProjectConfig.env_vars`
* `ProjectConfig.partial_parse`
* `RenderConfig.env_vars`
* `RenderConfig.exclude`
* `RenderConfig.select`
* `RenderConfig.selector`

The following argument was introduced in case users would like to define
Airflow variables that could be used to refresh the cache (it expects a
list with Airflow variable names):
* `RenderConfig.airflow_vars_to_purge_cache`

Example:
```
RenderConfig(
    airflow_vars_to_purge_cache=["refresh_cache"]
)
```

**Cache key**

The Airflow variables that represent the dbt ls cache are prefixed by
`cosmos_cache`. When using `DbtDag`, the keys use the DAG name. When
using `DbtTaskGroup`, they consider the TaskGroup and parent task groups
and DAG.

Examples:
1. The `DbtDag` "cosmos_dag" will have the cache represented by
`"cosmos_cache__basic_cosmos_dag"`.
2. The `DbtTaskGroup` "customers" declared inside the DAG
"basic_cosmos_task_group" will have the cache key
`"cosmos_cache__basic_cosmos_task_group__customers"`.

**Cache value**

The cache values contain a few properties:
- `last_modified` timestamp, represented using the ISO 8601 format.
- `version` is a hash that represents the version of the dbt project and
arguments used to run dbt ls by the time the cache was created
- `dbt_ls_compressed` represents the dbt ls output compressed using zlib
and encoded to base64 to be recorded as a string to the Airflow metadata
database.

Steps used to compress:
```
compressed_data = zlib.compress(dbt_ls_output.encode("utf-8"))
encoded_data = base64.b64encode(compressed_data)
dbt_ls_compressed = encoded_data.decode("utf-8")
```
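The read path simply reverses those steps (base64 decode, then zlib decompress). A self-contained round trip, with a made-up `dbt_ls_output` standing in for the real dbt ls JSON:

```python
import base64
import zlib

# Stand-in for a real dbt ls output (one JSON line per node).
dbt_ls_output = '{"unique_id": "model.jaffle_shop.customers"}\n' * 100

# Write path: compress, then encode as a string for the Airflow Variable.
compressed_data = zlib.compress(dbt_ls_output.encode("utf-8"))
dbt_ls_compressed = base64.b64encode(compressed_data).decode("utf-8")

# Read path: decode the Variable back, then decompress.
restored = zlib.decompress(base64.b64decode(dbt_ls_compressed)).decode("utf-8")
```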

We are compressing this value because it will be significant for larger
dbt projects, depending on the selectors used, and we wanted this
approach to be safe and not clutter the Airflow metadata database.

Some numbers on the compression
* A dbt project with 100 models can lead to a dbt ls output of 257k
characters when using JSON. Zlib could compress it by 20x.
* Another [real-life dbt
project](https://gitlab.com/gitlab-data/analytics/-/tree/master/transform/snowflake-dbt?ref_type=heads)
with 9,285 models led to a dbt ls output of 8.4 MB, uncompressed. It
reduces to 489 KB after being compressed using `zlib` and encoded using
`base64` - to 6% of the original size.
* Maximum cell size in Postgres: 20MB

The latency used to compress is in the order of milliseconds, not
interfering in the performance of this solution.

**Future work**

* How this will affect the Airflow db in the long term
* How does this performance compare to `ObjectStorage`?

**Example of results before and after this change**

Task queue times in Astro before the change:
<img width="1488" alt="Screenshot 2024-06-03 at 11 15 26"
src="https://github.com/astronomer/astronomer-cosmos/assets/272048/20f6ae8f-02e0-4974-b445-740925ab1b3c">

Task queue times in Astro after the change on the second run of the DAG:
<img width="1624" alt="Screenshot 2024-06-03 at 11 15 44"
src="https://github.com/astronomer/astronomer-cosmos/assets/272048/c7b8a821-8751-4d2c-8feb-1d0c9bbba97e">

This feature will be available in `astronomer-cosmos==1.5.0a8`.
## Description

It appears there was an accident resolving conflicts in the changelog,
which resulted in 1.4.2 and 1.4.1 (with the content for 1.4.3) being
listed twice.

## Related Issue(s)

N/A

## Breaking Change?

No

## Checklist

- [ ] I have made corresponding changes to the documentation (if
required)
- [ ] I have added tests that prove my fix is effective or that my
feature works
It looks like rendering for the conf `enable_cache_dbt_ls` is broken in the docs.

**Before change**
<img width="834" alt="Screenshot 2024-06-27 at 1 36 27 AM"
src="https://github.com/astronomer/astronomer-cosmos/assets/98807258/38565e3c-0b23-4764-936a-be40c53c0a00">

**After change**

<img width="815" alt="Screenshot 2024-06-27 at 1 37 09 AM"
src="https://github.com/astronomer/astronomer-cosmos/assets/98807258/1c301d6a-c233-440d-801f-f9475435fc69">
Add dbt profile caching mechanism.

1. Introduced env `enable_cache_profile` to enable or disable profile
caching. This will be enabled only if global `enable_cache` is enabled.
2. Users can set the env `profile_cache_dir_name`. This will be the name
of a sub-dir inside `cache_dir` where cached profiles will be stored.
This is optional, and the default name is `profile`.
3. Example Path for versioned profile:
`{cache_dir}/{profile_cache_dir}/592906f650558ce1dadb75fcce84a2ec09e444441e6af6069f19204d59fe428b/profiles.yml`
4. Implemented profile mapping hashing: first, the profile is serialized
using pickle. Then, the profile_name and target_name are appended before
hashing the data using the SHA-256 algorithm
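Step 4 can be sketched as follows (assumed shape; the function name is illustrative, not the actual Cosmos API): pickle the profile mapping, append the profile and target names, and SHA-256 the result to get the versioned directory name:

```python
import hashlib
import pickle


def profile_version(profile: dict, profile_name: str, target_name: str) -> str:
    # Serialize the profile, append the identifying names, hash with SHA-256.
    data = pickle.dumps(profile) + profile_name.encode() + target_name.encode()
    return hashlib.sha256(data).hexdigest()


version = profile_version(
    {"type": "postgres", "host": "0.0.0.0", "port": 5432},
    "jaffle_shop",
    "dev",
)
# The cached profile would then live at a path like:
#   {cache_dir}/profile/<version>/profiles.yml
```

Any change to the profile contents, profile name, or target name produces a different hash, so a new `profiles.yml` is generated instead of reusing a stale cache entry.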

**Perf test result:**
In local dev env with command
```
AIRFLOW_HOME=`pwd` AIRFLOW_CONN_EXAMPLE_CONN="postgres://postgres:postgres@0.0.0.0:5432/postgres"  AIRFLOW_HOME=`pwd` AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=20000 AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT=20000 hatch run tests.py3.10-2.8:test-performance
```

NUM_MODELS=100
- TIME=167.45248413085938 (with profile cache enabled)
- TIME=173.94845390319824 (with profile cache disabled)

NUM_MODELS=200
- TIME=376.2585120201111 (with profile cache enabled)
- TIME=418.14210200309753 (with profile cache disabled)

Closes: astronomer#925
Closes: astronomer#647
Partial parsing support was introduced in astronomer#800 and improved in astronomer#904
(caching). However, when the caching layer was introduced, we removed
support for using partial parsing if the cache was disabled.

This PR solves the issue.

Fix: astronomer#1041
New Features

* Speed up ``LoadMode.DBT_LS`` by caching dbt ls output in Airflow
Variable by @tatiana in astronomer#1014
* Support to cache profiles created via ``ProfileMapping`` by
@pankajastro in astronomer#1046
* Support for running dbt tasks in AWS EKS in astronomer#944 by @VolkerSchiewe
* Add Clickhouse profile mapping by @roadan and @pankajastro in astronomer#353 and
astronomer#1016
* Add node config to TaskInstance Context by @linchun3 in astronomer#1044

Bug fixes

* Support partial parsing when cache is disabled by @tatiana in astronomer#1070
* Fix disk permission error in restricted env by @pankajastro in astronomer#1051
* Add CSP header to iframe contents by @dwreeves in astronomer#1055
* Stop attaching log adaptors to root logger to reduce logging costs by
@glebkrapivin in astronomer#1047

Enhancements

* Support ``static_index.html`` docs by @dwreeves in astronomer#999
* Support deep linking dbt docs via Airflow UI by @dwreeves in astronomer#1038
* Add ability to specify host/port for Snowflake connection by @whummer
in astronomer#1063

Docs

* Fix rendering for env ``enable_cache_dbt_ls`` by @pankajastro in astronomer#1069

Others

* Update documentation for DbtDocs generator by @arjunanan6 in astronomer#1043
* Use uv in CI by @dwreeves in astronomer#1013
* Cache hatch folder in the CI by @tatiana in astronomer#1056
* Change example DAGs to use ``example_conn`` as opposed to
``airflow_db`` by @tatiana in astronomer#1054
* Mark plugin integration tests as integration by @tatiana in astronomer#1057
* Ensure compliance with linting rule D300 by using triple quotes for
docstrings by @pankajastro in astronomer#1049
* Pre-commit hook updates in astronomer#1039, astronomer#1050, astronomer#1064
* Remove duplicates in changelog by @jedcunningham in astronomer#1068
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.10 →
v0.5.0](astral-sh/ruff-pre-commit@v0.4.10...v0.5.0)
- [github.com/asottile/blacken-docs: 1.16.0 →
1.18.0](adamchainz/blacken-docs@1.16.0...1.18.0)
- [github.com/pre-commit/mirrors-mypy: v1.10.0 →
v1.10.1](pre-commit/mirrors-mypy@v1.10.0...v1.10.1)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Teradata has
[Provider](https://airflow.apache.org/docs/apache-airflow-providers-teradata/stable/index.html)
in airflow and [adapter](https://github.com/Teradata/dbt-teradata) in
dbt. Cosmos didn't have a Teradata profile configuration with mapping
support. This PR addresses that issue.

Closes: astronomer#1053
When Airflow gets temporary AWS credentials by assuming a role with
`role_arn` as the only `Connection` parameter, the task fails due to
missing credentials. This is due to the latest changes related to
profile caching: `env_vars` is accessed before `profile`, which, in
this case, means the required values are not populated yet.
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.5.0 →
v0.5.1](astral-sh/ruff-pre-commit@v0.5.0...v0.5.1)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ofile", target "dev" invalid: Runtime Error Must specify `schema` in Teradata profile (astronomer#1088)

`TeradataUserPassword` profile mapping throws below error for mock
profile

```
Credentials in profile "generated_profile", target "dev" invalid:
Runtime Error Must specify the schema in Teradata profile
```

Closes astronomer#1087

netlify bot commented Jul 14, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit c0e280e
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/6693328cee35e50008d391be

@arojasb3
Contributor Author

Hi @arojasb3, it looks like you've made great progress on this feature! Would you be able to rebase it so it can be reviewed and merged? Thanks!

I messed up the rebase, I'll just open a new PR

@arojasb3 arojasb3 closed this Jul 14, 2024
@pankajastro
Contributor

I messed up the rebase, I'll just open a new PR

No worries! Please go ahead and open a new PR; I’d be happy to review and test it.

@arojasb3 arojasb3 mentioned this pull request Jul 19, 2024
2 tasks
tatiana pushed a commit that referenced this pull request Aug 14, 2024
Re-Opening of PR #661 

This PR features a new way of rendering source nodes:
- Check freshness for sources with freshness checks
- Source tests
- Empty operators for nodes without tests or freshness.

One of the main limitations I found while using the `custom_callback`
functions on source nodes to check freshness is that nodes were being
created for 100% of sources, but not all of them required freshness
checks, which made workers waste compute time.

I'm adding a new variable into the DbtNode class called has_freshness
which would be True for sources with freshness checks and False for any
other resource type.

If this feature is enabled with the option `ALL`, all sources with
has_freshness == False will be rendered as Empty Operators, to keep
dbt's behavior of showing sources, as suggested in issue #630.

A new rendered template field is included too: `freshness`, which is the
sources.json generated by dbt when running `dbt source freshness`.
This adds a new node type (source), which changes the behavior of some tests.
This PR also updates the dev dbt project jaffle_shop to include source
nodes when enabled.

![image](https://github.com/user-attachments/assets/e972ac58-8741-4c13-9905-e78775f9cc80)

As seen in the image, source nodes with freshness checks are rendered
with a blue color, while the ones rendered as EmptyOperator show a
white/light green color

## Related Issue(s)
Closes: #630
Closes: #572 
Closes: #875

## Breaking Change?
This won't be a breaking change since the default behavior will still be
ignoring this new feature. That can be changed with the new RenderConfig
variable called `source_rendering_behavior`.

Co-authored-by: Pankaj <pankaj.singh@astronomer.io>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>