Reinstate tih.id as pkey and auto-increment on downgrade #58149

Chais · 2025-11-10T16:22:29Z

Reinstate task_instance_history.id as the table's primary key, as it was before the upgrade.
Use Alembic's autoincrement instead of filling in the values manually. This ensures that future entries get a key, as the code for those versions expects.

I verified that adding an auto-increment column (SERIAL for PostgreSQL, INT NOT NULL AUTO_INCREMENT for MySQL) to a non-empty table backfills the existing rows, just as the manual approach did.

boring-cyborg · 2025-11-10T16:22:33Z

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
In case of a new feature, add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
Be sure to read the Airflow Coding style.
Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

ephraimbuddy · 2025-12-02T11:00:34Z

airflow-core/src/airflow/migrations/versions/0060_3_0_0_add_try_id_to_ti_and_tih.py

    with op.batch_alter_table("task_instance_history", schema=None) as batch_op:
        batch_op.drop_constraint(batch_op.f("task_instance_history_pkey"), type_="primary")
-        batch_op.add_column(sa.Column("id", sa.INTEGER, nullable=True))
+        batch_op.add_column(sa.Column("id", sa.INTEGER, primary_key=True, autoincrement=True))


This won't work when the table already contains data, because id would be NULL for the existing rows.
That's why we had to run the now-deleted statements to populate the id column before making it the primary key.

Chais · 2026-01-15

I'm not convinced this is the case. The downgrade SQL generated for psql looks like this:

    BEGIN;

    -- Running downgrade 7645189f3479 -> e00344393f31

    ALTER TABLE task_instance_history DROP CONSTRAINT task_instance_history_pkey;

    ALTER TABLE task_instance_history ADD COLUMN id SERIAL NOT NULL;

    ALTER TABLE task_instance_history DROP COLUMN task_instance_id;

    ALTER TABLE task_instance_history DROP COLUMN try_id;

    UPDATE alembic_version SET version_num='e00344393f31' WHERE alembic_version.version_num = '7645189f3479';

    COMMIT;

The relevant line is:

ALTER TABLE task_instance_history ADD COLUMN id SERIAL NOT NULL;

If I now create a test table and fill it with some data:

    CREATE TABLE test (
        column_a text,
        column_b integer
    );
    INSERT INTO test (column_a, column_b) VALUES ('some text', 3);
    INSERT INTO test (column_a, column_b) VALUES ('some other text', 17);

I see it as expected:

    SELECT * FROM test;
        column_a     | column_b
    -----------------+----------
     some text       |        3
     some other text |       17
    (2 rows)

If I now apply the relevant change to this table with its existing data like this:

ALTER TABLE test ADD COLUMN id SERIAL NOT NULL;

I can see the values being inserted:

    SELECT * FROM test;
        column_a     | column_b | id
    -----------------+----------+----
     some text       |        3 |  1
     some other text |       17 |  2
    (2 rows)

So this seems to work as intended on PostgreSQL. I tested it on MySQL when I wrote this PR, with a similar result.

ephraimbuddy · 2026-01-15

Can you test with PostgreSQL? According to the Alembic documentation, autoincrement is only understood by MySQL: https://alembic.sqlalchemy.org/en/latest/ops.html

Chais · 2026-01-16

I had hoped this would make it sufficiently clear that I did:

> The downgrade SQL generated for psql looks like this:
> […]
> So this seems to work as intended on PostgreSQL.

Technically, the generated code doesn't use AUTO_INCREMENT:

ALTER TABLE test ADD COLUMN id SERIAL NOT NULL;

But as we can see in the PostgreSQL documentation, the SERIAL type is an autoincrementing integer, which is just what we need: https://www.postgresql.org/docs/current/datatype-numeric.html
Maybe Alembic translates sa.INTEGER, autoincrement=True into SERIAL for Postgres.
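
The translation guessed at above can be checked without a database by compiling the column definition against each dialect. The following is a minimal sketch, assuming SQLAlchemy is installed. The table name tih_demo and the note column are hypothetical, and the sketch compiles a CREATE TABLE statement rather than the ADD COLUMN statement from the migration, so it only illustrates how each dialect renders an auto-incrementing integer primary key.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql, postgresql
from sqlalchemy.schema import CreateTable

metadata = sa.MetaData()

# Hypothetical table: only the "id" definition mirrors the thread.
demo = sa.Table(
    "tih_demo",
    metadata,
    sa.Column("id", sa.Integer(), primary_key=True, autoincrement=True),
    sa.Column("note", sa.Text()),
)

# Compile the same definition for both dialects; no connection is needed.
pg_ddl = str(CreateTable(demo).compile(dialect=postgresql.dialect()))
mysql_ddl = str(CreateTable(demo).compile(dialect=mysql.dialect()))

print(pg_ddl)
print(mysql_ddl)
```

With current SQLAlchemy releases the PostgreSQL output should declare id as SERIAL and the MySQL output should declare it with AUTO_INCREMENT, which suggests the dialect compiler, not a MySQL-only code path, decides how autoincrement is rendered.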

Also, it's not like I invented this definition, either. task_instance_history was introduced in revision d482b7261ff9 (number 21, by the new count) with the id column defined like this:

sa.Column("id", sa.Integer(), primary_key=True, autoincrement=True),

So clearly, this worked before (and for all supported DBs, I might add) and I see no reason why it shouldn't work again.
And the fact remains that the current code leaves the DB broken post-downgrade. While it restores the id column and refills it in what may be the most cumbersome way possible, it neglects the autoincrement that the codebase expects at the relevant version, which leads to failing queries. Please see the issue I linked in the original post for details.
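
The failure mode described above can be illustrated with a small stand-in. This is a sketch under stated assumptions, not a reproduction of the linked issue: it uses an in-memory SQLite database from the Python standard library instead of PostgreSQL or MySQL, and the table names tih_plain and tih_auto are hypothetical. It only shows why an insert that omits id fails when the column is NOT NULL but has no auto-increment behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in: id was restored as NOT NULL, but nothing generates it.
conn.execute("CREATE TABLE tih_plain (id INTEGER NOT NULL, note TEXT)")

# Hypothetical stand-in: id is an auto-incrementing primary key.
conn.execute(
    "CREATE TABLE tih_auto (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT)"
)


def insert_without_id(table: str) -> bool:
    """Insert a row that omits id, as code expecting auto-increment would."""
    try:
        conn.execute(f"INSERT INTO {table} (note) VALUES ('new row')")
        return True
    except sqlite3.IntegrityError:
        # NOT NULL constraint violated: no value was generated for id.
        return False


plain_ok = insert_without_id("tih_plain")
auto_ok = insert_without_id("tih_auto")
auto_id = conn.execute("SELECT id FROM tih_auto").fetchone()[0]

print(plain_ok, auto_ok, auto_id)
```

The insert into tih_plain is rejected, while the insert into tih_auto succeeds and receives a generated id.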

Reinstate tih.id as pkey and auto-increment on downgrade

0071913

Chais requested a review from ephraimbuddy as a code owner November 10, 2025 16:22

boring-cyborg bot added the area:db-migrations PRs with DB migration label Nov 10, 2025

Merge branch 'main' into 7645189f3479-downgrade-fix

6cb636f

potiuk added this to the Airflow 3.1.4 milestone Dec 2, 2025

ephraimbuddy reviewed Dec 2, 2025

ephraimbuddy modified the milestones: Airflow 3.1.4, Airflow 3.1.5 Dec 8, 2025

potiuk modified the milestones: Airflow 3.1.5, Airflow 3.1.6 Dec 14, 2025

ephraimbuddy modified the milestones: Airflow 3.1.6, Airflow 3.1.7 Jan 7, 2026
