Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrading from version 2.1.4 to 2.2.0 fails during postgres db upgrade with error The task_instance table has 1392 rows without a corresponding dag_run row. You must manually correct this problem (possibly by deleting the problem rows) #18965

Closed
1 of 2 tasks
gbonazzoli opened this issue Oct 14, 2021 · 2 comments
Labels
area:core duplicate Issue that is duplicated kind:bug This is a clearly a bug

Comments

@gbonazzoli
Copy link

Apache Airflow version

2.2.0 (latest released)

Operating System

Ubuntu 20.04.3 LTS

Versions of Apache Airflow Providers

apache-airflow-providers-oracle==2.0.1
apache-airflow-providers-ssh==2.1.1

Deployment

Virtualenv installation

Deployment details

No response

What happened

During the the command airflow db upgrade we got this error:

[2021-10-12 06:25:10,721] {db.py:815} ERROR - Automatic migration is not available
[2021-10-12 06:25:10,721] {db.py:817} ERROR - The task_instance table has 1392 rows without a corresponding dag_run row. You must manually correct this problem (possibly by deleting the problem rows).

What you expected to happen

It would be nice if the output of airflow db upgrade is the list of objects' ID that needs to be deleted, in order to preserve future's version consistency.

Or even better the SQL statement to manually remove them.

I tried to identify those rows but it is not a trivial task.

How to reproduce

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@gbonazzoli gbonazzoli added area:core kind:bug This is a clearly a bug labels Oct 14, 2021
@gbonazzoli
Copy link
Author

I found the solution to delete the offending records:

delete FROM task_instance where (execution_date, task_id, dag_id) in (
SELECT task_instance.execution_date,
       task_instance.task_id,
       task_instance.dag_id
FROM task_instance
LEFT JOIN dag_run ON task_instance.dag_id = dag_run.dag_id
AND task_instance.execution_date = dag_run.execution_date
WHERE dag_run.run_id IS NULL
);
commit;

And I successfully migrated to 2.2.0

@potiuk
Copy link
Member

potiuk commented Oct 14, 2021

Duplicate of #18894

@potiuk potiuk marked this as a duplicate of #18894 Oct 14, 2021
@potiuk potiuk closed this as completed Oct 14, 2021
@potiuk potiuk added the duplicate Issue that is duplicated label Oct 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core duplicate Issue that is duplicated kind:bug This is a clearly a bug
Projects
None yet
Development

No branches or pull requests

2 participants