Raise exception when attempting to run a model that dbt can't find #682

tatiana · 2023-11-16T13:06:27Z

Context

Cosmos 1.2 considers a model run task successful even if dbt finds no models for that execution, as illustrated in:
#662 (comment)

The Airflow task succeeds, but the logs say:

[2023-11-16T12:32:37.420+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:32:37  The selection criterion 'customers_USA_v1' does not match any nodes

I'd expect us to be able to reproduce this by instantiating a DbtRunLocalOperator (https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/operators/local.py#L433C7-L433C26) with a select statement that doesn't match any dbt models.

Proposal

(I would love thoughts from the community!)

We should raise an exception if dbt's output reports that the selection criterion does not match any nodes (for example, "The selection criterion 'customers_USA_v1' does not match any nodes").
We could allow users to opt out of this behaviour via some configuration.


dosubot · 2023-11-16T13:06:48Z

🤖

Hello again, @tatiana! It's good to see you contributing to our project. I appreciate the detailed issue you've raised. Rest assured, I'm going to take a close look at it and get back to you with a comprehensive response soon.

dosubot · 2023-11-16T13:17:25Z

🤖

Thank you for bringing this to our attention, Tatiana. I agree that it would be beneficial for the DbtRunLocalOperator to raise an exception when dbt can't find a model that matches the select statement.

Currently, the DbtRunLocalOperator does not specifically handle this case. It executes the dbt command and logs the output, but it does not parse the output to check for a message indicating that no models matched the select statement.

To implement your proposal, we could modify the run_command method in the DbtRunLocalOperator to parse the output of the dbt command and raise an exception if it contains the message "The selection criterion '...' does not match any nodes". Here is a rough sketch of how this could be done:

from typing import List
from airflow.exceptions import AirflowException

def run_command(self, command: List[str]) -> None:
    result = run_dbt_command(command, self.env)
    # Log first so the dbt output is still visible when the task fails
    self.log.info(result.output)
    if "The selection criterion" in result.output and "does not match any nodes" in result.output:
        raise AirflowException("The selection criterion does not match any dbt models")

This would cause the Airflow task to fail if dbt can't find a model that matches the select statement.

As for the configuration option to opt out of this behavior, we could add a new parameter to the DbtRunLocalOperator called raise_on_no_match. If raise_on_no_match is set to True, the operator would raise an exception when no models match the select statement. If raise_on_no_match is set to False, the operator would log a warning message but not raise an exception. Here is how this could be implemented:

def run_command(self, command: List[str]) -> None:
    result = run_dbt_command(command, self.env)
    # Log first so the dbt output is still visible when the task fails
    self.log.info(result.output)
    if "The selection criterion" in result.output and "does not match any nodes" in result.output:
        if self.raise_on_no_match:
            raise AirflowException("The selection criterion does not match any dbt models")
        # Opted out: surface the problem as a warning without failing the task
        self.log.warning("The selection criterion does not match any dbt models")

Please note that this is a rough sketch and the actual implementation might differ. The changes would need to be thoroughly tested to ensure they work as expected and do not introduce any regressions.
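A slightly more targeted variant of the substring check above could extract the selector name from the dbt output, so the error message says which selection matched nothing. The following is only a sketch under assumptions: the pattern is based on the single log line quoted in this issue and other dbt versions may word it differently, and find_unmatched_selectors, check_dbt_output and NoNodesSelectedError are hypothetical names, not part of Cosmos or dbt.

```python
import re
from typing import List

# Pattern derived from the log line quoted in this issue (an assumption:
# other dbt versions may phrase the message differently).
NO_MATCH_PATTERN = re.compile(r"The selection criterion '([^']*)' does not match any nodes")


class NoNodesSelectedError(Exception):
    """Hypothetical exception; an Airflow operator would likely raise AirflowException."""


def find_unmatched_selectors(dbt_output: str) -> List[str]:
    """Return every selector that dbt reported as matching no nodes."""
    return NO_MATCH_PATTERN.findall(dbt_output)


def check_dbt_output(dbt_output: str, raise_on_no_match: bool = True) -> List[str]:
    """Raise if any selector matched nothing, unless the caller opted out."""
    unmatched = find_unmatched_selectors(dbt_output)
    if unmatched and raise_on_no_match:
        raise NoNodesSelectedError(f"dbt selected no nodes for: {', '.join(unmatched)}")
    return unmatched
```

With raise_on_no_match=False the function only returns the unmatched selectors, which leaves it to the caller to log a warning instead of failing the task.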

I hope this helps! If you have any further questions or suggestions, please let me know.


MrBones757 · 2023-11-16T14:28:35Z

What does dbt do when the selection doesn't match anything?

If dbt fails in this situation, I would say we should fail by default and allow the user to turn it off. Likewise, if dbt doesn't fail, we should follow suit and add a toggle to enable failing.

tatiana · 2023-11-16T21:47:58Z

@MrBones757 I really like your approach to this - but I feel in this particular case, things are not as clear.

dbt ls, for instance, doesn't return a non-zero exit code even when there are errors (such as missing dependencies!) - it merely prints the error to standard output.

Since dbt Core users work on the command line, they always see the output, regardless of the OS return code or stderr, so the error is visible. I don't have access to dbt Cloud, but I'd expect it to surface in the UI when a model that was expected to run did not run.

In the case of Cosmos/Airflow, we need to decide which scenarios represent a task failure, since we cannot rely on dbt Core return codes or stderr.
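To make that decision concrete, here is a minimal sketch of a failure rule that looks at both the exit code and the captured output. It is an illustration only: DbtResult and is_task_failure are hypothetical names, and the marker string is taken from the log line quoted earlier in this issue, so it is an assumption about dbt's wording rather than a stable interface.

```python
from dataclasses import dataclass

# Marker taken from the log line quoted in this issue (an assumption,
# not a documented dbt contract).
NO_MATCH_MARKER = "does not match any nodes"


@dataclass
class DbtResult:
    """Hypothetical container for a finished dbt invocation."""

    returncode: int
    stdout: str


def is_task_failure(result: DbtResult, fail_on_no_match: bool = True) -> bool:
    """Decide whether the Airflow task should fail for this dbt result."""
    # A non-zero exit code is always treated as a failure.
    if result.returncode != 0:
        return True
    # dbt exited cleanly, so fall back to inspecting what it printed.
    return fail_on_no_match and NO_MATCH_MARKER in result.stdout
```

The fail_on_no_match flag is where the default discussed in this thread would live: True makes an empty selection fail the task, False keeps the current behaviour.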

dosubot · 2024-02-15T16:01:47Z

Hi, @tatiana,

I'm helping the Cosmos team manage their backlog and am marking this issue as stale. From what I understand, the issue you raised proposed raising an exception when attempting to run a model that dbt cannot find, with the ability for users to opt out of this behavior through configuration. The issue has been resolved by modifying the DbtRunLocalOperator to parse the output of the dbt command and raise an exception if no models are found. Additionally, a configuration option has been added to allow users to opt out of this behavior. The default behavior is now to fail if dbt fails to find a model, with an option to enable failing if dbt does not fail. This resolution addresses the concerns raised by you and ensures consistency in error visibility across different platforms.

Could you please confirm if this issue is still relevant to the latest version of the Cosmos repository? If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your contribution! If you have any further questions or need assistance with anything else, feel free to reach out.

tatiana mentioned this issue Nov 16, 2023

Fix running models that use alias while supporting dbt versions #662

Merged

2 tasks

dosubot bot added the stale label Feb 15, 2024

dosubot bot closed this as not planned Feb 22, 2024

dosubot bot removed the stale label Feb 22, 2024
