[Feature] Dbt retry as airflow retry mechanism #1402
Comments
hi @tuantran0910, it seems like a great idea. I am unsure whether it would be trivial to replace Airflow's retry mechanism directly with astronomer-cosmos/cosmos/operators/local.py (line 199, at commit 9820935).
Your interest in contributing this is highly appreciated; please let us know if you have more thoughts or would like to discuss anything further here.
Thanks @pankajkoti for replying. I have read the code in the file, and I also have one more proposal. For example:

```python
import logging

from airflow.decorators import dag
from airflow.operators.python import PythonOperator
from airflow.utils.dates import days_ago


def check_retry_state(**kwargs):
    # Get the TaskInstance from the context
    task_instance = kwargs["ti"]
    # Check the current try number against the configured retries
    current_try_number = task_instance.try_number
    max_retries = task_instance.task.retries
    logging.info(f"Task is on attempt {current_try_number}/{max_retries + 1}")
    if current_try_number > 1:
        logging.info("Task has entered retry progress.")
        # This is where dbt retry happens
    else:
        logging.info("This is the first attempt.")


@dag(schedule_interval=None, start_date=days_ago(1), catchup=False)
def retry_check_dag():
    check_retry = PythonOperator(
        task_id="check_retry_state",
        python_callable=check_retry_state,
    )


retry_check_dag()
```

These are my thoughts about dbt retry. Looking forward to hearing your replies.
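Building on the proposal above, the attempt number could drive which dbt command gets executed. The sketch below is a minimal illustration of that selection; the helper name and the plain `dbt run` / `dbt retry` commands are assumptions for illustration, not Cosmos's actual API:

```python
import logging


def build_dbt_command(try_number: int) -> list:
    """Pick the dbt command for the current Airflow attempt.

    First attempt: run the full command. Later attempts: fall back to
    `dbt retry`, which re-executes only from the point of failure.
    """
    if try_number > 1:
        logging.info("Attempt %s: retrying with `dbt retry`.", try_number)
        return ["dbt", "retry"]
    logging.info("First attempt: running the full dbt command.")
    return ["dbt", "run"]
```

The returned argument list could then be handed to whatever subprocess hook the operator already uses to invoke dbt.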
hi @tuantran0910, yes, you're right. However, I am still not sure how trivial it would be to intercept try_number from the TaskInstance, and, even if we intercept it, to handle the successive steps needed to update the Airflow metadata database with the retries. If you're trying something that could work, please definitely open a PR and we would be happy to seek further reviews (maybe from Airflow devs too, if needed).
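As a toy illustration of the retry-from-failure semantics under discussion: dbt retry re-runs only the nodes that did not succeed in the previous invocation, using the state recorded in run_results.json. The simulation below is hypothetical and not dbt's internals:

```python
def nodes_to_retry(run_results: dict) -> list:
    """Simulate dbt retry's node selection: re-run anything that did not
    succeed in the previous invocation (dbt reads this state from
    run_results.json in the target directory)."""
    return [
        r["unique_id"]
        for r in run_results.get("results", [])
        if r["status"] != "success"
    ]


# If every node succeeded, there is nothing to re-run (retry is a no-op):
nodes_to_retry({"results": [{"unique_id": "model.a", "status": "success"}]})  # -> []
```

This is also why a plain `dbt retry` after a fully successful run finishes immediately.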
Description
dbt-core version 1.9.0 was released a few months ago. One of its new features is the incremental strategy microbatch, which splits the data to process into multiple batches and processes them sequentially or in parallel. If some of those batches fail, we can use the dbt retry command to process only the failed batches.
The dbt retry command re-executes the last dbt command from the node point of failure. If the previously executed dbt command was successful, retry finishes as a no-op. For more details, visit here.
I wonder whether we can replace the Airflow retry mechanism with dbt retry. I think it would be useful for the incremental strategy microbatch (available in dbt-core from 1.9.0), as it only needs to retry the failed batches.
Use case/motivation
No response
Related issues
No response
Are you willing to submit a PR?