TaskInstance id format wrong while migrating to 3.0.4 and using MySQL as database #54554

@vinitpayal

Description

Apache Airflow version

3.0.4

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Getting this error:

Input should be a valid UUID, invalid group length in group 4: expected 12, found 8 [type=uuid_parsing, input_value='0198ad8b-87b4-bb14-70de-b05a87b2', input_type=str]

It started after migrating from 2.10 to 3.0.4 using MySQL as the metadata database.

    num_queued_tis = self._do_scheduling(session)
  File "/usr/local/lib/python3.10/site-packages/airflow/jobs/scheduler_job_runner.py", line 1435, in _do_scheduling
    num_queued_tis = self._critical_section_enqueue_task_instances(session=session)
  File "/usr/local/lib/python3.10/site-packages/airflow/jobs/scheduler_job_runner.py", line 774, in _critical_section_enqueue_task_instances
    self._enqueue_task_instances_with_queued_state(queued_tis_per_executor, executor, session=session)
  File "/usr/local/lib/python3.10/site-packages/airflow/jobs/scheduler_job_runner.py", line 705, in _enqueue_task_instances_with_queued_state
    workload = workloads.ExecuteTask.make(ti, generator=executor.jwt_generator)
  File "/usr/local/lib/python3.10/site-packages/airflow/executors/workloads.py", line 115, in make
    ser_ti = TaskInstance.model_validate(ti, from_attributes=True)
  File "/usr/local/lib/python3.10/site-packages/pydantic/main.py", line 705, in model_validate
    return cls.__pydantic_validator__.validate_python(
pydantic_core._pydantic_core.ValidationError: 1 validation error for TaskInstance
id
  Input should be a valid UUID, invalid group length in group 4: expected 12, found 8 [type=uuid_parsing, input_value='0198ad8b-87b4-bb14-70de-b05a87b2', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/uuid_parsing
[2025-08-15T16:03:26.305+0000] {local_executor.py:223} INFO - Shutting down LocalExecutor; waiting for running tasks to finish.  Signal again if you don't want to wait.
[2025-08-15T16:03:26.305+0000] {scheduler_job_runner.py:1038} INFO - Exited execute loop
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/airflow/__main__.py", line 55, in main
    args.func(args)
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/cli_config.py", line 48, in command
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/cli.py", line 112, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/commands/scheduler_command.py", line 52, in scheduler
    run_command_with_daemon_option(
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/commands/daemon_utils.py", line 86, in run_command_with_daemon_option
    callback()
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/commands/scheduler_command.py", line 55, in <lambda>
    callback=lambda: _run_scheduler_job(args),
  File "/usr/local/lib/python3.10/site-packages/airflow/cli/commands/scheduler_command.py", line 43, in _run_scheduler_job
    run_job(job=job_runner.job, execute_callable=job_runner._execute)
  File "/usr/local/lib/python3.10/site-packages/airflow/utils/session.py", line 101, in wrapper
    return func(*args, session=session, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/airflow/jobs/job.py", line 347, in run_job

Did a bit of debugging and traced the issue back to the migration file 0042_3_0_0_add_uuid_primary_key_to_task_instance_.py for MySQL.

On MySQL it generates UUIDs like 0198ad8b-87b4-bb14-70de-b05a87b2, where the last group has only 8 hex characters instead of 12. That matches neither what the Postgres path generates nor the canonical UUID format.
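
For illustration, here is a minimal sketch (Python standard library only; the value is copied verbatim from the scheduler error above) of why such an id is rejected when the scheduler serializes the TaskInstance:

import uuid

# Value taken from the error above: the last group has 8 hex characters instead
# of 12, so the string holds 28 hex digits where a canonical UUID needs 32.
bad_id = "0198ad8b-87b4-bb14-70de-b05a87b2"

try:
    uuid.UUID(bad_id)
except ValueError as exc:
    # pydantic's UUID field on the serialized TaskInstance fails the same way,
    # which is what crashes the scheduler loop.
    print(f"Rejected: {exc}")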

What you think should happen instead?

I think the migration should be corrected to use the following function definition:

CREATE FUNCTION uuid_generate_v7(p_timestamp DATETIME(3))
    RETURNS CHAR(36)
    DETERMINISTIC
BEGIN
    DECLARE unix_time_ms BIGINT;
    DECLARE time_hex CHAR(12);
    DECLARE rand_hex CHAR(20);
    DECLARE uuid CHAR(36);

    -- Convert timestamp to milliseconds since epoch
    SET unix_time_ms = UNIX_TIMESTAMP(p_timestamp) * 1000;
    SET time_hex = LPAD(HEX(unix_time_ms), 12, '0');

    -- Generate 10 random bytes (20 hex chars)
    SET rand_hex = CONCAT(
            LPAD(HEX(FLOOR(RAND() * POW(2, 32))), 8, '0'),
            LPAD(HEX(FLOOR(RAND() * POW(2, 32))), 8, '0'),
            LPAD(HEX(FLOOR(RAND() * POW(2, 16))), 4, '0')
                   );

    -- Inject version (let's use version 9 here, hex '9')
    SET time_hex = CONCAT(LEFT(time_hex, 8), '9', SUBSTRING(time_hex, 10, 3));

    -- Inject variant (10xx... → '8', '9', 'a', or 'b')
    SET rand_hex = CONCAT(LEFT(rand_hex, 4), 'b', SUBSTRING(rand_hex, 6));

    -- Assemble UUID: 8-4-4-4-12 hex chars
    SET uuid = LOWER(CONCAT(
            SUBSTRING(time_hex, 1, 8), '-',
            SUBSTRING(time_hex, 9, 4), '-',
            SUBSTRING(rand_hex, 1, 4), '-',
            SUBSTRING(rand_hex, 5, 4), '-',
            SUBSTRING(rand_hex, 9, 12)
                     ));

    RETURN uuid;
END$$

The key change is the last part: it should be SUBSTRING(rand_hex, 9, 12) instead of SUBSTRING(rand_hex, 9), so that the last group always contains 12 hex characters and the generated values follow the standard 8-4-4-4-12 UUID layout (xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx).
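
It may also be worth checking how many task_instance rows already received malformed ids from the broken migration before applying any fix. A rough sketch, not an official tool: it assumes SQLAlchemy with a MySQL driver installed, and the connection URL is a placeholder.

import re
from sqlalchemy import create_engine, text

# Canonical 8-4-4-4-12 UUID layout.
UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)

# Placeholder URL: point this at the Airflow metadata database.
engine = create_engine("mysql+mysqldb://user:password@mysql-host:3306/airflow")

with engine.connect() as conn:
    ids = [row[0] for row in conn.execute(text("SELECT id FROM task_instance"))]

bad = [i for i in ids if not UUID_RE.match(i)]
print(f"{len(bad)} of {len(ids)} task_instance ids are malformed")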

How to reproduce

  • Use MySQL as the metadata database.
  • The task_instance table should contain some records before migrating.
  • Upgrade from 2.10 to 3.0.4 using airflow db migrate

Operating System

Ubuntu 20.04

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==9.11.0
apache-airflow-providers-cncf-kubernetes==10.7.0
apache-airflow-providers-common-compat==1.7.3
apache-airflow-providers-common-io==1.6.2
apache-airflow-providers-common-sql==1.27.4
apache-airflow-providers-fab==2.3.1
apache-airflow-providers-http==5.3.3
apache-airflow-providers-mysql==6.3.3
apache-airflow-providers-slack==9.1.3
apache-airflow-providers-smtp==2.1.2
apache-airflow-providers-standard==1.5.0

Deployment

Other Docker-based deployment

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
