Fix trigger_kwargs encryption/decryption on db migration #38876

Closed
wants to merge 5 commits into from

Conversation

hussein-awala
Member

@hussein-awala hussein-awala commented Apr 9, 2024

closes: #38836

This PR updates the trigger kwargs encryption condition to from_revision < trigger_kwargs_encryption_version <= current_version and the trigger kwargs decryption condition to to_revision < trigger_kwargs_encryption_version <= from_revision:

  • encryption: the previous revision should be lower than the revision that encrypts the kwargs and the new revision should be equal to or greater than this revision.
  • decryption: the new revision should be lower than the revision that encrypts the kwargs and the previous revision should be equal to or greater than this revision.
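
The two conditions above can be sketched as follows. This is only an illustration, assuming revisions can be modeled as integers in migration order; the names `ENCRYPTION_REVISION`, `should_encrypt`, and `should_decrypt` are hypothetical and are not the helpers used in airflow/utils/db.py, which compares Alembic revision ids.

```python
# Hypothetical model: revisions are integers in migration order.
# ENCRYPTION_REVISION stands in for the revision that encrypts trigger kwargs.
ENCRYPTION_REVISION = 5


def should_encrypt(from_revision: int, to_revision: int) -> bool:
    """Upgrade crosses the encryption revision: from < encryption <= to."""
    return from_revision < ENCRYPTION_REVISION <= to_revision


def should_decrypt(from_revision: int, to_revision: int) -> bool:
    """Downgrade crosses the encryption revision: to < encryption <= from."""
    return to_revision < ENCRYPTION_REVISION <= from_revision
```

Under this model, an upgrade that starts at or after the encryption revision does not encrypt again, and a downgrade that never goes below it does not decrypt.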

@hussein-awala hussein-awala added the type:bug-fix Changelog: Bug Fixes label Apr 9, 2024
@hussein-awala hussein-awala added this to the Airflow 2.9.1 milestone Apr 9, 2024
@hussein-awala hussein-awala added the area:db-migrations PRs with DB migration label Apr 9, 2024
@ephraimbuddy
Contributor

Hi @hussein-awala, can you add more information to the commit message? I'm finding it difficult to understand what you worked on.

@hussein-awala
Member Author

Hi @hussein-awala, can you add more information to the commit message? I'm finding it difficult to understand what you worked on.

I updated the PR description, could you take a look?

mock_alembic_upgrade.assert_called_once_with(mock.ANY, f"{from_revision}:{to_revision}", sql=True)

@pytest.mark.skipif(
    conf.get_mandatory_value("database", "sql_alchemy_conn").lower().startswith("sqlite"),
    reason="Offline migration not supported for SQLite.",
Contributor

Why do we now have this?

Member Author

The tests are not valid for SQLite; they should be skipped because of

raise AirflowException("Offline migration not supported for SQLite.")

But previously, _get_current_revision was not called when show_sql_only=True.

I can refactor it if needed to revert the tests

hussein-awala and others added 2 commits April 18, 2024 01:19
Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
@Lee-W Lee-W self-requested a review April 24, 2024 00:59
Comment on lines +1670 to +1677
current_version = _from_revision
trigger_kwargs_encryption_revision = _REVISION_HEADS_MAP["2.9.0"]
if (
    _from_revision != trigger_kwargs_encryption_revision
    and _revision_greater(config, trigger_kwargs_encryption_revision, _from_revision)
    and _revision_greater(config, current_version, trigger_kwargs_encryption_revision)
):
    # _from_revision < trigger_kwargs_encryption_version <= current_version
Member

I'm a bit confused here. If current_version = _from_revision, then when will _from_revision < trigger_kwargs_encryption_version <= current_version happen?

Member

Same here, this comment looks quite wrong here.

Member

Is there confusion between from_revision I wonder? Not sure which is intended here.
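
To make the concern in this thread concrete, here is a minimal sketch, assuming revisions are modeled as integers and that the comparison helper means "at or after". The function `revision_greater` is a hypothetical stand-in; the real semantics of `_revision_greater` are not shown in this conversation.

```python
def revision_greater(this_rev: int, base_rev: int) -> bool:
    # Hypothetical stand-in: assumed to mean "this_rev is at or after base_rev".
    return this_rev >= base_rev


def condition_from_diff(from_revision: int, encryption_revision: int) -> bool:
    # Mirrors the assignment in the reviewed diff: current_version is from_revision.
    current_version = from_revision
    return (
        from_revision != encryption_revision
        and revision_greater(encryption_revision, from_revision)
        and revision_greater(current_version, encryption_revision)
    )


# Check every pair in a small range: with current_version equal to
# from_revision, the three clauses contradict each other.
any_true = any(condition_from_diff(f, e) for f in range(10) for e in range(10))
```

In this model `any_true` is False: the second and third clauses together force the two revisions to be equal, which the first clause forbids. The same holds if the stand-in is changed to a strict comparison.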

Comment on lines +165 to +168
@pytest.mark.skipif(
    conf.get_mandatory_value("database", "sql_alchemy_conn").lower().startswith("sqlite"),
    reason="Offline migration not supported for SQLite.",
)
Contributor

Maybe mark it for supported backend?

Suggested change
- @pytest.mark.skipif(
-     conf.get_mandatory_value("database", "sql_alchemy_conn").lower().startswith("sqlite"),
-     reason="Offline migration not supported for SQLite.",
- )
+ @pytest.mark.backend("postgres", "mysql")

@dstandish
Contributor

Should we consider yanking 2.9.0 once 2.9.1 is out with this fix?

@eladkal
Contributor

eladkal commented Apr 24, 2024

Should we consider yanking 2.9.0 once 2.9.1 is out with this fix?

I don't think so.
In my view, if we wanted to yank, we should have merged this quickly and cut 2.9.1 immediately as a critical bug fix. I asked about this when the issue was discovered, but others thought we should wait for more bug fixes, as this was not categorized as critical.

@jedcunningham
Member

Ignoring that the conditional is wrong, I don't follow how this migration approach works for offline migration in the first place. Am I missing functionality somewhere that solves that scenario?

@jedcunningham
Member

Just confirmed my suspicions, it doesn't work for offline migrations:

[2024-04-24T23:08:17.337+0000] {triggerer_job_runner.py:341} ERROR - Exception when executing TriggererJobRunner._run_trigger_loop
Traceback (most recent call last):                                                                                                
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 339, in _execute            
    self._run_trigger_loop()                                                                                                      
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 362, in _run_trigger_loop   
    self.load_triggers()                                                                                                          
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 377, in load_triggers       
    self.trigger_runner.update_triggers(set(ids))                                                                                 
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 676, in update_triggers     
    new_trigger_instance = trigger_class(**new_trigger_orm.kwargs)                                                                
                                           ^^^^^^^^^^^^^^^^^^^^^^                                                                 
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/trigger.py", line 93, in kwargs                          
    return self._decrypt_kwargs(self.encrypted_kwargs)                                                                            
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                            
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/trigger.py", line 119, in _decrypt_kwargs                
    decrypted_kwargs = json.loads(get_fernet().decrypt(encrypted_kwargs.encode("utf-8")).decode("utf-8"))                         
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                          
  File "/home/airflow/.local/lib/python3.12/site-packages/cryptography/fernet.py", line 211, in decrypt                           
    raise InvalidToken                                                                                                            
cryptography.fernet.InvalidToken                                                                                                  

This was all that was spit out for that migration:

-- Running upgrade ee1467d4aa35 -> 1949afb29106

ALTER TABLE trigger ALTER COLUMN kwargs TYPE TEXT;

UPDATE alembic_version SET version_num='1949afb29106' WHERE alembic_version.version_num = 'ee1467d4aa35';

@dstandish
Contributor

dstandish commented Apr 25, 2024

Should we consider yanking 2.9.0 once 2.9.1 is out with this fix?

I don't think so. In my view, if we wanted to yank, we should have merged this quickly and cut 2.9.1 immediately as a critical bug fix. I asked about this when the issue was discovered, but others thought we should wait for more bug fixes, as this was not categorized as critical.

To me, I’m not sure whether it meets the criteria for critical, but it’s pretty bad. If the user is using the Helm chart, it will run the migration pod a lot, and if there are triggers in flight, this could happen and sorta bork the cluster, IIUC.

@jedcunningham
Member

Currently downgrade also doesn't work when you have triggers - alembic tries to switch the column back to json before decrypting happens.

@jedcunningham
Member

I've opened #39246 as an alternative solution to this.

@jedcunningham
Member

Closing this as #39246 has been merged. Thanks for kicking off fixing this bug @hussein-awala!

Development

Successfully merging this pull request may close these issues.

DB migrate throws na error on encrypt_trigger_kwargs
8 participants