Release 1.3.2 #817

Merged
merged 12 commits into from
Jan 26, 2024
Conversation

Collaborator

@tatiana tatiana commented Jan 26, 2024

pre-commit-ci bot and others added 12 commits January 26, 2024 23:29
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 →
v0.1.13](astral-sh/ruff-pre-commit@v0.1.11...v0.1.13)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
(cherry picked from commit 8691c38)
…methods (#803)

This resolves #798: when using `LoadMode.DBT_LS_FILE`,
`DbtGraph.update_node_dependency` was not called, so filtered
nodes did not have `DbtNode.has_test` set as expected.
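
To illustrate the idea behind this fix, here is a minimal sketch with simplified stand-in classes. `Node`, `Graph`, the two `_load_from_*` methods and `make_nodes` are hypothetical and are not the Cosmos implementation. It only shows why running the dependency update in one shared place keeps `has_test` consistent whichever load method is used.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Simplified stand-in for a dbt node (hypothetical, not the Cosmos class)."""
    unique_id: str
    resource_type: str  # "model" or "test"
    depends_on: List[str] = field(default_factory=list)
    has_test: bool = False


class Graph:
    """Toy graph with two hypothetical load methods."""

    def __init__(self, source_nodes: Dict[str, Node]):
        self.source_nodes = source_nodes
        self.filtered_nodes: Dict[str, Node] = {}

    def _load_from_file(self) -> None:
        # Pretend to parse a saved `dbt ls` output file.
        self.filtered_nodes = dict(self.source_nodes)

    def _load_from_command(self) -> None:
        # Pretend to run `dbt ls` and parse its output.
        self.filtered_nodes = dict(self.source_nodes)

    def update_node_dependency(self) -> None:
        # Mark every filtered node that is referenced by a test node.
        for node in self.filtered_nodes.values():
            if node.resource_type != "test":
                continue
            for parent_id in node.depends_on:
                parent = self.filtered_nodes.get(parent_id)
                if parent is not None:
                    parent.has_test = True

    def load(self, method: str) -> None:
        loaders = {"file": self._load_from_file, "command": self._load_from_command}
        loaders[method]()
        # Called once here, so no individual load method can skip it.
        self.update_node_dependency()


def make_nodes() -> Dict[str, Node]:
    """Build a fresh toy project: model.a has a test, model.b does not."""
    return {
        "model.a": Node("model.a", "model"),
        "model.b": Node("model.b", "model"),
        "test.a": Node("test.a", "test", depends_on=["model.a"]),
    }


graph = Graph(make_nodes())
graph.load("file")
```

With this layout, adding another load method only means registering it in `loaders`; the `has_test` bookkeeping does not need to be repeated.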

Closes: #798
(cherry picked from commit b313544)
Once Airflow 2.8 was released, Cosmos tests started failing.

There were two main issues: conflicting `pendulum` version and the installation of `apache-airflow-providers-common-io`.

# Details on `pendulum`:

```
_________________ ERROR collecting tests/airflow/test_graph.py _________________
tests/airflow/test_graph.py:6: in <module>
    from airflow import __version__ as airflow_version
../../../.local/share/hatch/env/virtual/astronomer-cosmos/Za_bFbg4/tests.py3.8-2.4/lib/python3.8/site-packages/airflow/__init__.py:34: in <module>
    from airflow import settings
../../../.local/share/hatch/env/virtual/astronomer-cosmos/Za_bFbg4/tests.py3.8-2.4/lib/python3.8/site-packages/airflow/settings.py:49: in <module>
    TIMEZONE = pendulum.tz.timezone('UTC')
E   TypeError: 'module' object is not callable
```
[Example
here](https://github.com/astronomer/astronomer-cosmos/actions/runs/7590233614/job/20676384033).
I think this is because Airflow v2.8.1, [released
today](https://github.com/apache/airflow/releases/tag/2.8.1), now
targets Pendulum 3.0.0, which has the breaking API changes
seen above. I think any pip install of `apache-airflow<2.8.1` now
installs `pendulum==3.0.0`, because the pendulum constraint is only
applied if you install Airflow [with a constraint
file](https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html).

I don't think hatch dependencies allow constraint file referencing, so
this attempt pins `pendulum` directly, kind of like what is already done
for pydantic.
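
The `TypeError` in the traceback above is the generic error Python raises when code calls a name that resolves to a module instead of a function. Presumably that is what `pendulum.tz.timezone` resolves to once Pendulum 3 is installed. The sketch below reproduces the same message without installing Airflow or Pendulum; `fake_tz` and `try_timezone` are made-up stand-ins for illustration only.

```python
import types

# Stand-in namespace where `timezone` is a submodule rather than a function.
fake_tz = types.ModuleType("fake_tz")
fake_tz.timezone = types.ModuleType("fake_tz.timezone")


def try_timezone(tz_namespace, name: str):
    """Call `tz_namespace.timezone(name)` and report a TypeError as text."""
    try:
        return tz_namespace.timezone(name)
    except TypeError as exc:
        return f"TypeError: {exc}"


result = try_timezone(fake_tz, "UTC")
print(result)
```

Pinning the dependency avoids the problem at install time rather than working around it in code, which is why the change touches the test dependencies and not Cosmos itself.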

# Details on `apache-airflow-providers-common-io`:

When building an environment, the first step Hatch does is to install the project dependencies.
It does not consider tool.hatch.envs.tests.overrides when first doing this.

So, for every entry in our Airflow test matrix, Hatch first installs Airflow 2.8. As part of this, it installs `apache-airflow-providers-common-io==1.2.0`. This new Airflow dependency conflicts with earlier versions of Airflow. When Hatch downgrades Airflow, it does not uninstall `apache-airflow-providers-common-io`.

Therefore, tests running for versions of Airflow before 2.8 were failing because of apache-airflow-providers-common-io with:

```
FAILED tests/operators/test_local.py::test_run_test_operator_with_callback - sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: task_instance
[SQL: SELECT task_instance.try_number, task_instance.task_id, task_instance.dag_id, task_instance.run_id, task_instance.map_index, task_instance.start_date, task_instance.end_date, task_instance.duration, task_instance.state, task_instance.max_tries, task_instance.hostname, task_instance.unixname, task_instance.job_id, task_instance.pool, task_instance.pool_slots, task_instance.queue, task_instance.priority_weight, task_instance.operator, task_instance.custom_operator_name, task_instance.queued_dttm, task_instance.queued_by_job_id, task_instance.pid, task_instance.executor_config, task_instance.updated_at, task_instance.external_executor_id, task_instance.trigger_id, task_instance.trigger_timeout, task_instance.next_method, task_instance.next_kwargs, dag_run_1.state AS state_1, dag_run_1.id, dag_run_1.dag_id AS dag_id_1, dag_run_1.queued_at, dag_run_1.execution_date, dag_run_1.start_date AS start_date_1, dag_run_1.end_date AS end_date_1, dag_run_1.run_id AS run_id_1, dag_run_1.creating_job_id, dag_run_1.external_trigger, dag_run_1.run_type, dag_run_1.conf, dag_run_1.data_interval_start, dag_run_1.data_interval_end, dag_run_1.last_scheduling_decision, dag_run_1.dag_hash, dag_run_1.log_template_id, dag_run_1.updated_at AS updated_at_1
FROM task_instance JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id JOIN dag_run AS dag_run_1 ON dag_run_1.dag_id = task_instance.dag_id AND dag_run_1.run_id = task_instance.run_id
WHERE task_instance.dag_id = ? AND task_instance.task_id IN (?, ?) AND dag_run.execution_date >= ? AND dag_run.execution_date <= ? AND task_instance.operator = ?]
[parameters: ('test-id-2', 'run', 'test', '2024-01-22 23:11:55.593478', '2024-01-22 23:11:55.593478', 'ExternalTaskMarker')]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
```

We did a workaround to uninstall `apache-airflow-providers-common-io` for all Airflow versions and only install it for Airflow 2.8. It is ugly, but seems to work. Once the tests pass, I'll merge our PR - so the CI can be back to green. We can go ahead and revisit the approach in the future.

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
(cherry picked from commit f953cae)
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.13 →
v0.1.14](astral-sh/ruff-pre-commit@v0.1.13...v0.1.14)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
(cherry picked from commit 522f64a)
Remove incorrect docstring from `DbtLocalBaseOperator` (related to
#796)

---------

Co-authored-by: Justin Bandoro <79104794+jbandoro@users.noreply.github.com>
Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
(cherry picked from commit ef2c7bb)
…ion base subclasses (#805)

This fixes an issue reported in #804 after the refactor done in
#774 where the
`execute` methods for `DbtLocalBaseOperator`, `DbtDockerBaseOperator`,
and `DbtKubernetesBaseOperator` were different.

This PR refactors the `execute` method to the `AbstractDbtBaseOperator`
so it's the same for all of the local, docker and kubernetes inherited
operators, and adds `build_and_run_cmd` as an abstract method since the
implementation is different across the 3 different execution modes.
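
The shape of this refactor can be sketched as follows. This is a simplified illustration, not the Cosmos code: `AbstractBaseOperator`, `LocalOperator` and `DockerOperator` are hypothetical stand-ins, and the real operators take more arguments. The point is that `execute` lives once on the base class while each execution mode supplies its own `build_and_run_cmd`.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class AbstractBaseOperator(ABC):
    """Hypothetical base class: one shared execute, one abstract hook."""

    def __init__(self, base_cmd: List[str]):
        self.base_cmd = base_cmd

    @abstractmethod
    def build_and_run_cmd(self, context: Dict[str, Any]) -> str:
        """Build and run the dbt command; differs per execution mode."""

    def execute(self, context: Dict[str, Any]) -> str:
        # Identical for every subclass, so behaviour cannot drift apart.
        return self.build_and_run_cmd(context=context)


class LocalOperator(AbstractBaseOperator):
    def build_and_run_cmd(self, context: Dict[str, Any]) -> str:
        # A real implementation would spawn a subprocess here.
        return "local: dbt " + " ".join(self.base_cmd)


class DockerOperator(AbstractBaseOperator):
    def build_and_run_cmd(self, context: Dict[str, Any]) -> str:
        # A real implementation would start a container here.
        return "docker: dbt " + " ".join(self.base_cmd)


local_result = LocalOperator(["run"]).execute({})
docker_result = DockerOperator(["run"]).execute({})
```

Because `build_and_run_cmd` is abstract, a subclass that forgets to implement it fails at instantiation time instead of at task run time.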

Closes #804

(cherry picked from commit 9c090a4)
A user (ZD case: 38982) reported being unable to see the test node for
one specific model in a particular DAG when using
RenderConfig(select=tags). They were using the custom Cosmos selector.
It seems the selector worked as expected for other models and for the
same model/test in a different DAG.

This CR adds more logs for troubleshooting when using the custom selector.

(cherry picked from commit 6444465)
A user reported the following issue while using Cosmos 1.3.1:

```
    def _should_include_node(self, node_id: str, node: DbtNode) -> bool:
        "Checks if a single node should be included. Only runs once per node with caching."
        if node_id in self.visited_nodes:
            return node_id in self.selected_nodes

        self.visited_nodes.add(node_id)

        if node.resource_type == DbtResourceType.TEST:
>           node.tags = getattr(self.nodes.get(node.depends_on[0]), "tags", [])
E           IndexError: list index out of range

cosmos/dbt/selector.py:298: IndexError
```

In order to reproduce this issue, it was necessary to add a tag-based
select statement.

Based on the error, it seems their dbt project has a test without
`depends_on`. This CR adds support for this use case.
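
One possible guard for the failing line is sketched below. `Node` and `inherited_tags` are hypothetical names, and falling back to the test's own tags when it has no parent is an assumption made for illustration; the actual fix in Cosmos may behave differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Simplified stand-in for a dbt node (hypothetical, not the Cosmos class)."""
    unique_id: str
    resource_type: str
    depends_on: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


def inherited_tags(test_node: Node, nodes: Dict[str, Node]) -> List[str]:
    """Return the tags a test node should use for tag-based selection."""
    if not test_node.depends_on:
        # No parent to inherit from: keep the test's own tags (assumption).
        return list(test_node.tags)
    parent = nodes.get(test_node.depends_on[0])
    # A missing parent yields an empty list, matching the getattr default
    # in the traceback above.
    return list(getattr(parent, "tags", []))


nodes = {"model.a": Node("model.a", "model", tags=["nightly"])}
attached_test = Node("test.a", "test", depends_on=["model.a"])
orphan_test = Node("test.x", "test")  # no depends_on values

attached_tags = inherited_tags(attached_test, nodes)
orphan_tags = inherited_tags(orphan_test, nodes)
```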

Closes: #813
(cherry picked from commit e842d82)
…ALL` (#816)

Before, when using `TestBehavior.AFTER_ALL`, it did not consider the
`select` and `exclude` settings defined in the rendering configuration.
This was an error since it would run all the dbt project tests, even if
they were outside of the scope of the Cosmos-defined DAG.

Now we take into account `select`, `selector` and `exclude` when using
this test behaviour.
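
As a rough sketch of what forwarding these settings means, the hypothetical helper below builds the argument list for a single `dbt test` invocation. `build_test_args` is not a Cosmos function; only the `--select`, `--exclude` and `--selector` flags are taken from the dbt CLI.

```python
from typing import List, Optional


def build_test_args(
    select: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
    selector: Optional[str] = None,
) -> List[str]:
    """Build dbt CLI arguments for one test run covering the whole DAG."""
    args = ["test"]
    if select:
        args += ["--select", *select]
    if exclude:
        args += ["--exclude", *exclude]
    if selector:
        args += ["--selector", selector]
    return args


scoped_args = build_test_args(select=["tag:nightly"], exclude=["model_b"])
unscoped_args = build_test_args()
```

Without the forwarded flags the command reduces to a bare `dbt test`, which is the previous behaviour that ran every test in the project.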

Closes: #643
(cherry picked from commit 3040887)
[Cosmos docs](https://astronomer.github.io/astronomer-cosmos/configuration/lineage.html)
stated that users didn't have to install any dependency to use
OpenLineage with Cosmos.

However, for inlets and outlets to be emitted, Airflow 2.7 users must
install `apache-airflow-providers-openlineage` or
`astronomer-cosmos[openlineage]`.

Closes: #796
(cherry picked from commit fe01237)
This PR installs apache-airflow in the hatch `pre-install-commands`
section so Airflow related dependency conflicts do not need to be
managed in the override section that has been removed.

Installing from the Airflow constraint file for versions <=2.6 fails
because PyYAML is pinned to 6.0.0 in the constraint file for those
versions, and [a workaround](yaml/pyyaml#736) is
required because older versions of PyYAML can no longer be installed
from unmodified source or sdist since the release of Cython 3.

Closes: #811
(cherry picked from commit 928ba83)
Bug fixes

* Fix: ensure DbtGraph.update_node_dependency is called for all load methods by @jbandoro in #803
* Fix: ensure operator execute method is consistent across all execution base subclasses by @jbandoro in #805
* Fix custom selector when test node has no depends_on values by @tatiana in #814
* Fix forwarding selectors to test task when using TestBehavior.AFTER_ALL by @tatiana in #816

Others

* Docs: Remove incorrect docstring from DbtLocalBaseOperator by @jakob-hvitnov-telia in #797
* Add more logs to troubleshoot custom selector by @tatiana in #809
* Fix OpenLineage integration documentation by @tatiana in #810
* Fix test dependencies after Airflow 2.8 release by @jbandoro and @tatiana in #806
* Use Airflow constraint file for test environment setup by @jbandoro in #812
* pre-commit updates in #799, #807
@tatiana tatiana requested a review from a team as a code owner January 26, 2024 23:36
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. area:selector Related to selector, like DAG selector, DBT selector, etc dbt:test Primarily related to dbt test command or functionality execution:docker Related to Docker execution environment parsing:custom Related to custom parsing, like custom DAG parsing, custom DBT parsing, etc labels Jan 26, 2024
@tatiana tatiana requested a review from jbandoro January 26, 2024 23:37
Collaborator Author

tatiana commented Jan 26, 2024

We are cutting this release out of 1.3.1 because there are some new features merged into the main branch that will be part of 1.4.0a1:

1.4.0a1 (coming soon)
--------------------

Features

* Add dbt profile config variables to mapped profile by @ykuc in #794
* Add dbt build operators by @dylanharper-qz in #795
(more)

@tatiana tatiana added this to the 1.3.2 milestone Jan 26, 2024
Collaborator

@jbandoro jbandoro left a comment

🚀 thank you!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 26, 2024
@tatiana tatiana merged commit e16aa7f into release-1.3 Jan 26, 2024
1 check passed
@tatiana tatiana deleted the release-1.3.2 branch January 26, 2024 23:50
@tatiana tatiana restored the release-1.3.2 branch January 26, 2024 23:52