-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix decryption of trigger kwargs when downgrading. #38743
Fix decryption of trigger kwargs when downgrading. #38743
Conversation
This failed because the query for Triggers after the downgrade lazy loads the taskinstance table(ORM) which doesn't have the task_display_name column at that downloaded point. The fix was to query specifically on the encrypted_kwargs column.
Figuring out why this was not detected in our tests... |
🤔 |
Because we are running up-down migration with empty tables I think. We should likely add some steps in our migration up/down to generate some data -> i.e. run a bunch of example dags - both before migration up and after migration up and before migration down. |
Created an issue for that #38744 |
I reproduced it in breeze without having data on the tables |
Strange. It definitely runs in main migration :
Here:
|
Are you sure when you reproduced it your Trigger table was empty? looking at the fix, the whole for loop would be skipped if it was empty. Maybe you run it with |
I reconfirmed that it was empty. I reproduced it. The error happens on the query itself not on the decryption. The session.query(Triggers) causes the task instance table to be loaded and because ORM already has task_display_name but this no longer exist in db after downgrade, the error occurs. Below is the stacktrace:
This can only be reproduced when you downgrade half-way to 2.8.4, that's why we missed it. In the CI, we downgrade from latest to 2.0.0 then backup |
Hmm. Still don't get it - why it does not fail when we downgrade to 2.0.0 ? I thought alembic was executing the downgrades step-by-step and it goes 2.9.0 -> 2.8.4 -> 2.8.3 and so on... So why it did not fail in the 2.9.0 -> 2.8.4 step? |
🤔 |
aaaaa if _revision_greater(
config,
_REVISION_HEADS_MAP["2.9.0"],
to_revision,
):
decrypt_trigger_kwargs(session=session) |
But this condition is likely broken ? (why would it run when migrating to 2.8.4 and not to 2.0.0 ?) |
Sorry, I see now. It's not running it because the triggered table can't be found when 2.0.0 is specified ( Line 991 in 901c3a6
|
Interesting, I did not realise we do that - I was under the impression that each step of downgrade (or upgrade) is done separately - and in isolation - no matter what the target version is. So I guess (cc: @hussein-awala ) this is a kind of optimisation (we do not neeed to decrypt if we know the table will be dropped)? Was that the purpose of it ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for now. I do not thing it should block rc2, but we likely have to get some improvements as a follow up
I updated the description of #38744 to reflect that "follow up" need. |
It's necessary because we execute the method each time we run the downgrade command, so later when we want to downgrade 2.9.1+ to 2.9.0+, we should skip the decryption |
Yeah. condition to run decrypt is is right - but as @ephraimbuddy it's not the condition to run the method that is wrong, it's the |
I think the code is right. If the table has been dropped, the decrypt is not run. Because in 2.8.4, we still have the trigger table, the decryption ran and failed. The code path is only evaluated at the end of migrations so if user is going from 2.9.0 to 2.0.0, it will only be run when the migration completes and DB in 2.0.0 state |
Exactly, I added the condition to fix a failing test. |
* Fix decryption of trigger kwargs when downgrading. This failed because the query for Triggers after the downgrade lazy loads the taskinstance table(ORM) which doesn't have the task_display_name column at that downloaded point. The fix was to query specifically on the encrypted_kwargs column. * Properly positon the decryption after downgrade and not in offline migration (cherry picked from commit 567246f)
Ah. OK then our migration to 2.0.0 and back is not really testing all the code paths. Maybe we should consider doing migration tests also to the previous latest MINOR version. I think we rarely change past migration scripts - so if we add just migration down one MINOR version, we should catch all such cases (eventually over time we will know 2.9 -> 2.8 works - because we tested it now, 2.10 -> 2.9 works because we will test it in the future and so on). |
This failed because the query for Triggers after the downgrade lazy loads the taskinstance table(ORM) which doesn't have the task_display_name column at that downloaded point.
The fix was to query specifically on the encrypted_kwargs column.