[v3-0-test] Allow downgrading to 2.11 from 3.x (#54371) #54399
Merged
eladkal
approved these changes
Aug 12, 2025
e966d72 to
2fbd357
* Allow downgrading to 2.11 from 3.x

  There were two things blocking this:

  1) The revision heads map didn't have any 2.11.x versions in it, and the previous implementation of `_get_version_revision` only looked within the same `<major.minor>` patch series, so no revision was found for a 2.11 target.

     We change it to rely on the fact that our pre-commit checks ensure this map is ordered: we iterate over the dictionary in reverse and use the first entry lower than the target (an exact match is already handled above).

  2) The `ab_*` tables not existing were blocking the migration. Part of this is now fixable manually with apache#54227, but since FAB was required and the only option in 2.x, I have decided to just create the tables if they are missing.

     In order to try and cope with possible future changes I create the tables at the latest version and then downgrade to the oldest known revision.

     This is all handled in a `reset_to_2_x()` method on the FABDBManager, with a fallback to just blindly create the tables from the ORM for versions of the provider that don't yet have that function.

* Remove `downgrade` from the RunDBManager interface

  This never made sense, and wasn't actually called as part of the `airflow db downgrade` CLI calls.

  The reason it doesn't make sense is that the version you pass is either the Airflow version (but external DB managers are installed and versioned separately) or the migration revision ID for the Airflow Core meta db.

  For FAB specifically there is the `airflow fab-db` CLI command to manage things, so "checking RunDBManager doesn't run Fab migrations" doesn't make sense as a test now (as the code that _could_ do it is removed), so I've removed the test too.

(cherry picked from commit 1d04f09)
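The reversed lookup described in the first item can be sketched as follows. This is a minimal illustration, not the actual `_get_version_revision` code: the map contents, the revision IDs, and the helper names `parse_version` and `find_revision` are all hypothetical, and it assumes plain dotted numeric versions.

```python
from typing import Dict, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.10.3' into (2, 10, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical map of release version -> migration revision, kept in
# ascending order (the source says pre-commit checks enforce the ordering).
REVISION_HEADS: Dict[str, str] = {
    "2.10.0": "rev_2_10_0",
    "2.10.3": "rev_2_10_3",
    "3.0.0": "rev_3_0_0",
    "3.0.3": "rev_3_0_3",
}


def find_revision(target: str, heads: Dict[str, str]) -> Optional[str]:
    """Return the revision for `target`, or for the closest lower version."""
    # An exact match is handled first.
    if target in heads:
        return heads[target]
    wanted = parse_version(target)
    # Walk from newest to oldest; the first lower version is the closest one.
    for version, revision in reversed(heads.items()):
        if parse_version(version) < wanted:
            return revision
    return None


print(find_revision("2.11.0", REVISION_HEADS))
```

Because the map has no 2.11.x entry, a lookup restricted to the `2.11` series would find nothing, while the reversed walk falls back to the newest entry below the target.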
2fbd357 to
706aa8c
kaxil
pushed a commit
that referenced
this pull request
Aug 14, 2025
Fixes critical blocking issues when downgrading Airflow from 3.x to 2.x across PostgreSQL, MySQL, and SQLite databases. These issues were discovered after [PR #54399](#54399) unblocked the downgrade process.

### 1. PostgreSQL - NOT NULL Violation in `task_reschedule` Table

**Error:**
```
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) column "try_number" of relation "task_reschedule" contains null values
```

**Root Cause:** Migration [`0068_3_0_0_ti_table_id_unique_per_try.py:L99`](https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/migrations/versions/0068_3_0_0_ti_table_id_unique_per_try.py#L99) uses `default="1"` instead of `server_default="1"`, causing existing NULL values to remain NULL when the column is made NOT NULL.

**Solution:** Replace `default="1"` with `server_default="1"` to ensure database-level default value assignment.

### 2. PostgreSQL - NOT NULL Violation in `task_instance_history` Table

**Error:**
```
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) column "task_instance_id" of relation "task_instance_history" contains null values
```

**Root Cause:** The `task_instance_id` column is made NOT NULL during downgrade but contains NULL values, even though this column gets dropped entirely in the same migration.

**Solution:** Make `task_instance_id` column nullable since it's immediately dropped and not needed in 2.x schema.

### 3. MySQL - Invalid NULL Value During Row Numbering

**Error:**
```
sqlalchemy.exc.OperationalError: (MySQLdb.OperationalError) (1138, 'Invalid use of NULL value')
```

**Root Cause:** MySQL query attempts to JOIN on NULL `id` values in `task_instance_history`:
```sql
UPDATE task_instance_history tih
JOIN (
    SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS row_num
    FROM task_instance_history
) AS temp ON tih.id = temp.id
SET tih.id = temp.row_num;
```
The `id` column was just re-added as nullable with all NULL values, making the JOIN fail.
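The NULL comparison behind this root cause can be illustrated in isolation. The sketch below uses Python's built-in `sqlite3` rather than MySQL, so it does not reproduce error 1138; it only shows, on a made-up three-row table, that a join condition on an all-NULL `id` column matches no rows, because `NULL = NULL` is never true in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up miniature of the table right after `id` is re-added as nullable:
# every row has id = NULL.
conn.execute("CREATE TABLE task_instance_history (id INTEGER, try_id TEXT)")
conn.executemany(
    "INSERT INTO task_instance_history (id, try_id) VALUES (NULL, ?)",
    [("a",), ("b",), ("c",)],
)

total_rows = conn.execute(
    "SELECT COUNT(*) FROM task_instance_history"
).fetchone()[0]

# Self-join on id, mirroring the shape of the failing UPDATE ... JOIN.
# The comparison tih.id = temp.id evaluates to NULL for every pair of rows,
# so the join produces nothing to update.
matched_rows = conn.execute(
    """
    SELECT COUNT(*)
    FROM task_instance_history AS tih
    JOIN task_instance_history AS temp ON tih.id = temp.id
    """
).fetchone()[0]

print(f"rows in table: {total_rows}, rows matched by the join: {matched_rows}")
```

Any renumbering that keys on `id` therefore has nothing to work with until the column holds real values.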
**Solution:** Replace with MySQL variable-based sequential numbering:
```sql
SET @row_number = 0;
UPDATE task_instance_history
SET id = (@row_number := @row_number + 1)
ORDER BY try_id;
```

### 4. SQLite - Foreign Key Constraint During Batch Operations

**Error:**
```
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) FOREIGN KEY constraint failed
```

**Root Cause:** SQLite's strict foreign key checking prevents dropping the `dag` table during `batch_alter_table` operations when other tables reference it.

**Solution:** Add SQLite-specific handling to temporarily disable foreign key constraints:
```python
if dialect_name == "sqlite":
    conn.execute(text("PRAGMA foreign_keys=OFF"))
    try:
        ...  # batch operations
    finally:
        conn.execute(text("PRAGMA foreign_keys=ON"))
```

## Testing

Verified successful downgrade from Airflow 3.0.5rc1 to 2.11 on:
- [x] PostgreSQL
- [x] MySQL
- [x] SQLite
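The SQLite behaviour from section 4 can be reproduced outside Alembic. The following is a minimal sketch using only the standard-library `sqlite3` module; the two-table schema, the `dag_tag` child table, and the `foreign_keys_disabled` helper are made up for illustration and are not the migration's actual code. The connection runs in autocommit mode because `PRAGMA foreign_keys` has no effect inside an open transaction.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def foreign_keys_disabled(conn):
    """Hypothetical helper: turn FK enforcement off, always turn it back on."""
    conn.execute("PRAGMA foreign_keys=OFF")
    try:
        yield conn
    finally:
        conn.execute("PRAGMA foreign_keys=ON")


# isolation_level=None puts the connection in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys=ON")
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE dag_tag (name TEXT, dag_id TEXT REFERENCES dag(dag_id))")
conn.execute("INSERT INTO dag VALUES ('example')")
conn.execute("INSERT INTO dag_tag VALUES ('team-a', 'example')")

# With enforcement on, dropping the referenced table deletes its rows first,
# which violates the child table's foreign key.
try:
    conn.execute("DROP TABLE dag")
    drop_failed = False
except sqlite3.IntegrityError:
    drop_failed = True

# With enforcement temporarily off, the same statement goes through.
with foreign_keys_disabled(conn):
    conn.execute("DROP TABLE dag")

fk_state = conn.execute("PRAGMA foreign_keys").fetchone()[0]
remaining = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(drop_failed, fk_state, remaining)
```

Wrapping the re-enable step in `finally` mirrors the snippet above: enforcement comes back on even if the batch operation raises.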
kaxil
pushed a commit
that referenced
this pull request
Aug 14, 2025
Fixes critical blocking issues when downgrading Airflow from 3.x to 2.x across PostgreSQL, MySQL, and SQLite databases. These issues were discovered after [PR #54399](#54399) unblocked the downgrade process.

(cherry picked from commit bc18493)
kosteev
pushed a commit
to GoogleCloudPlatform/composer-airflow
that referenced
this pull request
Sep 25, 2025
Fixes critical blocking issues when downgrading Airflow from 3.x to 2.x across PostgreSQL, MySQL, and SQLite databases. These issues were discovered after [PR #54399](apache/airflow#54399) unblocked the downgrade process.

(cherry picked from commit bc18493d64edda6ecc3aebd9013a7d1355fa63d6)

GitOrigin-RevId: b41a5d51b62e86dd2e1c4fb6e378e3c79ff81328
kosteev
pushed a commit
to GoogleCloudPlatform/composer-airflow
that referenced
this pull request
Oct 24, 2025
Fixes critical blocking issues when downgrading Airflow from 3.x to 2.x across PostgreSQL, MySQL, and SQLite databases. These issues were discovered after [PR #54399](apache/airflow#54399) unblocked the downgrade process.

GitOrigin-RevId: bc18493d64edda6ecc3aebd9013a7d1355fa63d6
There were two things blocking this:

1) The revision heads map didn't have any 2.11.x versions in it, and the previous implementation of `_get_version_revision` only looked within the same `<major.minor>` patch series, so no revision was found for a 2.11 target.

   We change it to rely on the fact that our pre-commit checks ensure this map is ordered: we iterate over the dictionary in reverse and use the first entry lower than the target (an exact match is already handled above).

2) The `ab_*` tables not existing were blocking the migration. Part of this is now fixable manually with #54227 (Create FAB's user/role tables on migration, not only on initdb), but since FAB was required and the only option in 2.x, I have decided to just create the tables if they are missing.

   In order to try and cope with possible future changes I create the tables at the latest version and then downgrade to the oldest known revision.

   This is all handled in a `reset_to_2_x()` method on the FABDBManager, with a fallback to just blindly create the tables from the ORM for versions of the provider that don't yet have that function.
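The fallback described in the second item amounts to a capability check on the manager object. Below is a minimal sketch of that pattern under stated assumptions: the two manager classes are stand-ins that only record which method ran, `ensure_fab_tables` and its `tables_exist` flag are hypothetical, and `create_db_from_orm` is used here merely as a placeholder name for "create the tables from the ORM". Only `reset_to_2_x()` is named in the description above.

```python
class LegacyFabManager:
    """Stand-in for a provider version that predates reset_to_2_x()."""

    def __init__(self):
        self.calls = []

    def create_db_from_orm(self):
        # Placeholder: blindly create the ab_* tables from the ORM models.
        self.calls.append("create_db_from_orm")


class CurrentFabManager(LegacyFabManager):
    """Stand-in for a provider version that ships reset_to_2_x()."""

    def reset_to_2_x(self):
        # Placeholder: create tables at the latest revision, then downgrade
        # to the oldest known revision.
        self.calls.append("reset_to_2_x")


def ensure_fab_tables(manager, tables_exist: bool) -> None:
    """Create the ab_* tables only when missing, preferring reset_to_2_x()."""
    if tables_exist:
        return
    reset = getattr(manager, "reset_to_2_x", None)
    if callable(reset):
        reset()
    else:
        manager.create_db_from_orm()


current, legacy = CurrentFabManager(), LegacyFabManager()
ensure_fab_tables(current, tables_exist=False)
ensure_fab_tables(legacy, tables_exist=False)
print(current.calls, legacy.calls)
```

Checking for the method at call time, rather than pinning a provider version, keeps the downgrade path working with whichever provider release happens to be installed.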
Remove `downgrade` from the RunDBManager interface

This never made sense, and wasn't actually called as part of the `airflow db downgrade` CLI calls.

The reason it doesn't make sense is that the version you pass is either the Airflow version (but external DB managers are installed and versioned separately) or the migration revision ID for the Airflow Core meta db.

For FAB specifically there is the `airflow fab-db` CLI command to manage things, so "checking RunDBManager doesn't run Fab migrations" doesn't make sense as a test now (as the code that _could_ do it is removed), so I've removed the test too.

(cherry picked from commit 1d04f09)