@ephraimbuddy (Contributor) commented Sep 19, 2025

When upgrading from Airflow 2, existing deferred triggers can reference TaskInstances without a dag_version_id and DagRuns with conf=None. This caused errors when the triggerer tried to start those triggers and when workers consumed ti_run responses.

This change:

  1. Skips starting triggers whose TaskInstance lacks dag_version_id, logging a warning instead of erroring
  2. Coerces DagRun.conf from None to {} in the ti_run response for compatibility with Airflow 2-era data
  3. Adds unit tests covering both behaviors

This prevents triggerer crashes and makes deferred tasks resume reliably after migration.
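
The two behaviors described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `TaskInstanceRow`, `TriggerRow`, `triggers_to_start`, and `normalize_conf` are hypothetical names invented for illustration, and plain dataclasses stand in for the real ORM models.

```python
import logging
from dataclasses import dataclass
from typing import Any, Optional

log = logging.getLogger(__name__)


@dataclass
class TaskInstanceRow:
    """Hypothetical stand-in for a TaskInstance row (not Airflow's model)."""

    task_id: str
    dag_version_id: Optional[str]  # None for rows created under Airflow 2


@dataclass
class TriggerRow:
    """Hypothetical stand-in for a trigger row linked to a task instance."""

    id: int
    task_instance: Optional[TaskInstanceRow]


def triggers_to_start(triggers: list[TriggerRow]) -> list[TriggerRow]:
    """Return only the triggers that are safe to start.

    A trigger whose task instance has no dag_version_id is skipped with
    a warning instead of raising an error.
    """
    startable = []
    for trigger in triggers:
        ti = trigger.task_instance
        if ti is not None and ti.dag_version_id is None:
            log.warning(
                "Skipping trigger %s: task instance %s has no dag_version_id",
                trigger.id,
                ti.task_id,
            )
            continue
        startable.append(trigger)
    return startable


def normalize_conf(conf: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Coerce a missing (None) conf to an empty dict; leave others as-is."""
    return {} if conf is None else conf
```

With one migrated row and one Airflow 3 row, only the latter would be started, and a `None` conf would reach consumers as `{}`.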

closes: #55713

How to test:

Set a Fernet key, e.g.:
export AIRFLOW__CORE__FERNET_KEY='8janSoQD86ALy_tnJjR-hcxNweHnUxhfDV61TBntr_4='
in both init.sh and environment_variables.env in the Airflow source tree.
Breeze reads init.sh on AF2, whereas AF3 uses environment_variables.env.

Switch to Airflow 2: git switch apache/v2-11-stable
Add this DAG, with target_time adjusted to about 30 minutes in the future:

from datetime import datetime

from airflow import DAG
from airflow.sensors.date_time import DateTimeSensorAsync

with DAG(
    "async_trigger_sleep",
    start_date=datetime(2025, 9, 17),
    tags=["async_migration"],
) as dag:
    # Defers to the triggerer until target_time; set it ~30 minutes ahead.
    DateTimeSensorAsync(
        task_id="wait_for_start",
        target_time=datetime(2025, 9, 19, 11, 26),
    )
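
Instead of hard-coding target_time, a value about 30 minutes ahead can be computed when the DAG file is written. The helper below is a hypothetical convenience, not part of Airflow or this PR; it returns a timezone-aware UTC datetime rounded down to the minute, on the assumption that the sensor accepts aware datetimes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def target_time_in(minutes: int, now: Optional[datetime] = None) -> datetime:
    """Return a UTC datetime `minutes` from now, truncated to the minute.

    `now` can be passed explicitly to make the result reproducible.
    """
    base = now if now is not None else datetime.now(timezone.utc)
    return (base + timedelta(minutes=minutes)).replace(second=0, microsecond=0)
```

Evaluate it once and paste the result into the DAG, rather than calling it inside the DAG file: a value recomputed on every parse would keep moving forward.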

Start breeze: breeze start-airflow --backend postgres --executor CeleryExecutor
Trigger the DAG above and stop breeze once the task is deferred.
Switch to Airflow 3: git switch main
Start breeze: breeze start-airflow --backend postgres --executor CeleryExecutor

Once the target time is reached, the DAG run should complete successfully.

@pdellarciprete

We are hitting exactly the same issue, and it is crashing the scheduler. Is there a plan to release this soon?
Is there a workaround to fix the values in the database?

@kaxil kaxil modified the milestones: Airflow 3.1.1, Airflow 3.1.2 Oct 21, 2025
@ephraimbuddy ephraimbuddy force-pushed the fix-failing-triggers-after-migration branch from 45b6da9 to 9047086 Compare October 22, 2025 11:42
@ephraimbuddy ephraimbuddy force-pushed the fix-failing-triggers-after-migration branch from 9047086 to 43bca86 Compare October 22, 2025 17:08
@ephraimbuddy ephraimbuddy force-pushed the fix-failing-triggers-after-migration branch from 43bca86 to fef8897 Compare October 22, 2025 19:19
@ephraimbuddy ephraimbuddy merged commit 7cec2a7 into apache:main Oct 22, 2025
60 checks passed
@ephraimbuddy ephraimbuddy deleted the fix-failing-triggers-after-migration branch October 22, 2025 22:23
kaxil pushed a commit that referenced this pull request Oct 31, 2025
* Fix triggerer errors after Airflow 2 to 3 migration

* Remove config check as that has been addressed in a different PR

* Add comment on why we added this

* Remove null conf test

(cherry picked from commit 7cec2a7)
Labels: area:API, area:Triggerer

Linked issue: Triggerer crash when migrating Airflow 2 to 3 with async dagrun already in deferred state