Serialize Dags before making TI.dag_version_id non-nullable #53820
It seemed simpler to reserialize the DAGs and update the task instances (TIs) directly, so that's what I did here. I'm slightly unsure whether this could lead to failures during reserialization or cause performance issues. An alternative would have been to manually create entries for serialized_dag, dag_version, and dag_code before updating the TIs, but that felt more complex. The issue is that upgrades from Airflow 2 (AF2) fail because the existing TIs are not associated with any dag_version. This mainly affects users upgrading from Airflow 2, since in Airflow 3 the dag_version table is already populated for all DAGs.
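For illustration, the "update the task instances" step could look roughly like the sketch below. It is not the migration code from this PR: it uses an in-memory SQLite database with a deliberately simplified, assumed schema (integer ids, only the columns needed here), and the function names `build_demo_db`, `backfill_dag_version_ids`, and `count_unversioned` are hypothetical. It assumes a dag_version row already exists for each reserialized DAG and points every TI with a NULL dag_version_id at the newest version of its DAG.

```python
import sqlite3


def build_demo_db():
    """Create a simplified, assumed schema with a few AF2-style rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag_version (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            version_number INTEGER NOT NULL
        );
        CREATE TABLE task_instance (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            task_id TEXT NOT NULL,
            dag_version_id INTEGER  -- still nullable before the migration
        );
        INSERT INTO dag_version VALUES (1, 'etl', 1), (2, 'etl', 2), (3, 'report', 1);
        INSERT INTO task_instance VALUES
            (1, 'etl', 'extract', NULL),
            (2, 'etl', 'load', NULL),
            (3, 'report', 'render', NULL),
            (4, 'orphan', 'noop', NULL);  -- DAG that was never reserialized
        """
    )
    return conn


def backfill_dag_version_ids(conn):
    """Point unversioned TIs at the newest dag_version of their DAG.

    TIs whose DAG has no dag_version row are left untouched.
    Returns the number of rows that were backfilled.
    """
    cur = conn.execute(
        """
        UPDATE task_instance
        SET dag_version_id = (
            SELECT dv.id FROM dag_version dv
            WHERE dv.dag_id = task_instance.dag_id
            ORDER BY dv.version_number DESC
            LIMIT 1
        )
        WHERE dag_version_id IS NULL
          AND EXISTS (
            SELECT 1 FROM dag_version dv
            WHERE dv.dag_id = task_instance.dag_id
          )
        """
    )
    conn.commit()
    return cur.rowcount


def count_unversioned(conn):
    """Count TIs that would still violate a NOT NULL constraint."""
    row = conn.execute(
        "SELECT COUNT(*) FROM task_instance WHERE dag_version_id IS NULL"
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    print("backfilled:", backfill_dag_version_ids(conn))
    print("still unversioned:", count_unversioned(conn))
```

In this sketch, any TI left over after the backfill (here the `orphan` DAG) is exactly the kind of row that would make a subsequent NOT NULL constraint fail, which is why the DAGs need a dag_version before the column is tightened.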
Hmm, we merged a change to just blow away the old serialization. Shouldn't that have made this a non-issue?
This one is about |
Do we need to reserialise during migration? I feel it might be enough to just delete the serialised data. I believe Airflow should automatically reserialise missing DAGs once the migration finishes and the scheduler is restarted.
We initially deleted it in https://github.com/apache/airflow/pull/43700/files when migrating from AF2, but realized we could lose true histories and reverted it. Deleting the serdag might have been the better choice back then, but I think it is too late to do it at this point, as we could lose AF3+ histories.
What do you mean by "we could lose true histories"? In Airflow 2, we don't have serdag history.
This is an alternative to apache#53820. Here we make TI.dag_version_id nullable at the database level; it is still enforced in code.
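As a rough illustration of "nullable at the database level, enforced in code", the sketch below keeps the column nullable so that pre-upgrade rows stay valid, while a write helper refuses to create new rows without a version. This is not Airflow's implementation: the simplified schema, `make_connection`, `add_task_instance`, and `MissingDagVersionError` are all hypothetical names chosen for the example.

```python
import sqlite3


class MissingDagVersionError(ValueError):
    """Raised when new code tries to create a TI without a dag_version_id."""


def make_connection():
    """Create an assumed, simplified task_instance table."""
    conn = sqlite3.connect(":memory:")
    # dag_version_id has no NOT NULL constraint, so legacy rows remain valid.
    conn.execute(
        "CREATE TABLE task_instance ("
        "id INTEGER PRIMARY KEY, "
        "dag_id TEXT NOT NULL, "
        "task_id TEXT NOT NULL, "
        "dag_version_id INTEGER)"
    )
    return conn


def add_task_instance(conn, dag_id, task_id, dag_version_id):
    """Insert a TI, enforcing the version requirement in application code."""
    if dag_version_id is None:
        raise MissingDagVersionError(
            f"dag_version_id is required for {dag_id}.{task_id}"
        )
    cur = conn.execute(
        "INSERT INTO task_instance (dag_id, task_id, dag_version_id) "
        "VALUES (?, ?, ?)",
        (dag_id, task_id, dag_version_id),
    )
    return cur.lastrowid
```

The trade-off in this sketch is that the database no longer guarantees the invariant, so every code path that writes TIs has to go through a check like the one in `add_task_instance`.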
Closing in preference to #54366.