Description
Apache Airflow version
2.11.0
If "Other Airflow 2/3 version" selected, which one?
No response
What happened?
Recently I accidentally applied DB migrations 28 through 68 while attempting to update from Airflow 2.10.5 to 2.11.0. After cleaning up the mess by downgrading the DB I noticed that I'm no longer able to clear tasks in order to re-try them.
Attempting to clear a task produces only a lengthy exception in the webserver logs. The relevant part is:
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) null value in column "id" of relation "task_instance_history" violates not-null constraint
DETAIL: Failing row contains (nosify.notify_successes, cert_env, manual__2025-11-02T23:00:04.866429+00:00, 0, 1, 2025-11-02 23:04:02.810671+00, 2025-11-02 23:04:05.093361+00, 2.28269, success, 0, airflow-worker-0.airflow-worker.airflow.svc.cluster.local, airflow, default_pool, 1, default, 1, _PythonDecoratedOperator, null, 2025-11-02 23:04:01.165684+00, 720119, 68415, null, \x80057d942e, 2025-11-02 23:04:05.146802+00, null, a96e4277-0d0e-4b53-868d-48e72e2dd9bb, null, null, null, null, nosify.notify_successes, null, 721182).
[SQL: INSERT INTO task_instance_history (task_id, dag_id, run_id, map_index, try_number, start_date, end_date, duration, state, max_tries, hostname, unixname, job_id, pool, pool_slots, queue, priority_weight, operator, custom_operator_name, queued_dttm, queued_by_job_id, pid, executor, executor_config, updated_at, rendered_map_index, external_executor_id, trigger_id, trigger_timeout, next_method, next_kwargs, task_display_name) VALUES (%(task_id)s, %(dag_id)s, %(run_id)s, %(map_index)s, %(try_number)s, %(start_date)s, %(end_date)s, %(duration)s, %(state)s, %(max_tries)s, %(hostname)s, %(unixname)s, %(job_id)s, %(pool)s, %(pool_slots)s, %(queue)s, %(priority_weight)s, %(operator)s, %(custom_operator_name)s, %(queued_dttm)s, %(queued_by_job_id)s, %(pid)s, %(executor)s, %(executor_config)s, %(updated_at)s, %(rendered_map_index)s, %(external_executor_id)s, %(trigger_id)s, %(trigger_timeout)s, %(next_method)s, %(next_kwargs)s, %(task_display_name)s) RETURNING task_instance_history.id]
[parameters: {'task_id': 'nosify.notify_successes', 'dag_id': 'cert_env', 'run_id': 'manual__2025-11-02T23:00:04.866429+00:00', 'map_index': 0, 'try_number': 1, 'start_date': datetime.datetime(2025, 11, 2, 23, 4, 2, 810671, tzinfo=Timezone('UTC')), 'end_date': datetime.datetime(2025, 11, 2, 23, 4, 5, 93361, tzinfo=Timezone('UTC')), 'duration': 2.28269, 'state': 'success', 'max_tries': 0, 'hostname': 'airflow-worker-0.airflow-worker.airflow.svc.cluster.local', 'unixname': 'airflow', 'job_id': 721182, 'pool': 'default_pool', 'pool_slots': 1, 'queue': 'default', 'priority_weight': 1, 'operator': '_PythonDecoratedOperator', 'custom_operator_name': None, 'queued_dttm': datetime.datetime(2025, 11, 2, 23, 4, 1, 165684, tzinfo=Timezone('UTC')), 'queued_by_job_id': 720119, 'pid': 68415, 'executor': None, 'executor_config': <psycopg2.extensions.Binary object at 0x7fd2dc58b4b0>, 'updated_at': datetime.datetime(2025, 11, 2, 23, 4, 5, 146802, tzinfo=Timezone('UTC')), 'rendered_map_index': None, 'external_executor_id': 'a96e4277-0d0e-4b53-868d-48e72e2dd9bb', 'trigger_id': None, 'trigger_timeout': None, 'next_method': None, 'next_kwargs': 'null', 'task_display_name': 'nosify.notify_successes'}]
(Background on this error at: https://sqlalche.me/e/14/gkpj)
What you think should happen instead?
The INSERT statement does not supply a value for the id column, yet the column cannot be NULL. The only consistent conclusion is that id is supposed to auto-increment.
The task_instance_history table was added in airflow-core/src/airflow/migrations/versions/0021_2_10_0_add_task_instance_history.py like this:
def upgrade():
    """Add task_instance_history table."""
    op.create_table(
        "task_instance_history",
        sa.Column("id", sa.Integer(), primary_key=True, autoincrement=True),
        ...
In airflow-core/src/airflow/migrations/versions/0060_3_0_0_add_try_id_to_ti_and_tih.py the id column is dropped during upgrade() and reinstated in downgrade() as follows:
def downgrade():
    """Unapply Add try_id to TI and TIH."""
    dialect_name = op.get_bind().dialect.name
    with op.batch_alter_table("task_instance_history", schema=None) as batch_op:
        ...
        batch_op.add_column(sa.Column("id", sa.INTEGER, nullable=True))
        ...
    ...
    with op.batch_alter_table("task_instance_history", schema=None) as batch_op:
        batch_op.alter_column("id", nullable=False, existing_type=sa.INTEGER)
        ...
So the reinstated id column loses its auto-increment property and is no longer the primary key.
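To illustrate the failure mode in isolation, here is a minimal sketch using Python's built-in sqlite3 module. It is not Airflow code and does not run against PostgreSQL; the table name `tih`, the single `task_id` column, and the helper `try_insert` are made up for the example. It only contrasts an id column that is an auto-incrementing primary key with one that is merely NOT NULL, which is the shape the downgrade appears to leave behind.

```python
import sqlite3


def try_insert(create_sql: str) -> str:
    """Create a throwaway in-memory table and insert a row without supplying id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(create_sql)
        try:
            cur = conn.execute("INSERT INTO tih (task_id) VALUES ('t1')")
        except sqlite3.IntegrityError as exc:
            # The database has no way to produce a value for id.
            return f"IntegrityError: {exc}"
        return f"ok, generated id={cur.lastrowid}"
    finally:
        conn.close()


# Analogue of the original migration: id is the primary key and is auto-generated.
original = "CREATE TABLE tih (id INTEGER PRIMARY KEY AUTOINCREMENT, task_id TEXT)"
# Analogue of the downgraded table: id is NOT NULL but has no key and no default.
downgraded = "CREATE TABLE tih (id INTEGER NOT NULL, task_id TEXT)"

result_original = try_insert(original)
result_downgraded = try_insert(downgraded)
print(result_original)
print(result_downgraded)
```

The first insert succeeds because the database generates the id. The second raises an integrity error for the same reason as the PostgreSQL NotNullViolation above: nothing supplies a value for a column that must not be NULL.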
How to reproduce
Upgrade the DB to revision 7645189f3479, then downgrade to e00344393f31.
Operating System
Host: CentOS 7, worker: Debian 12
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct