Try to move "dangling" rows in upgradedb #18953
Instead of failing loudly on invalid records (which happens far too often), this attempts to move the offending data to another table and carry on with the migration if possible. The table for dangling data is created with CREATE TABLE ... AS SELECT ... and may lack the indexes and constraints of the original, but it is only meant for temporary storage, so this is probably not a big deal. If the copy succeeds, the dangling rows are automatically deleted from the original table so the migration can continue. Additionally, this commit removes the upgrade check on TaskFail and adds one on TaskReschedule. This is because TaskFail is not actually being migrated in 2.2, while TaskReschedule is, and we concluded that this was likely a typo during implementation rather than an intentional choice.
LGTM -- what else is there to do on this @uranusjr ?
I just added some code to show the alert in the web UI. This should be ready.
Looks great!
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
I tested this locally as well, looks good.
Also, I somehow forgot to push the fix to the db query count test before :( Waiting for CI to pass...
(cherry picked from commit f967ca9)
In Airflow 2.2.2 we introduced a fix in apache#18953 that moves corrupted data to a separate table. However, some of our users (rightly) might not have the context. We have never done anything like this before, so users who treat the Airflow DB as a black box might be confused about what the error means and what they should do in this case. apache#19440 (converted into discussion apache#19444) and apache#19421 indicate that the message is a bit unclear for users. This PR attempts to improve that: it adds an `upgrading` section to our documentation and makes the message link to it, so that rather than asking questions in the issues, users can find context and answers about what they should do in our docs. It also guides users who treat the Airflow DB as a black box on how they can use their own tools and airflow db shell to fix the problem.
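For readers who want a feel for what "using their own tools" could look like, here is a minimal sketch of inspecting leftover tables and dropping one after review. It runs against an in-memory SQLite database rather than a real Airflow metadata DB. The `demo_moved_` prefix, the table names, and both helper functions are hypothetical and chosen only for illustration; the actual table names and the recommended procedure are whatever the `upgrading` documentation describes.

```python
import sqlite3

# Hypothetical prefix for tables holding moved rows (an assumption for this
# sketch, not the name Airflow actually uses).
MOVED_PREFIX = "demo_moved_"


def summarize_moved_tables(conn, prefix=MOVED_PREFIX):
    """Return {table_name: row_count} for every table whose name starts with prefix."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        # Filter in Python: "_" is a wildcard in SQL LIKE patterns.
        if row[0].startswith(prefix)
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def drop_moved_table(conn, name, prefix=MOVED_PREFIX):
    """Drop one leftover table, refusing anything that lacks the expected prefix."""
    if not name.startswith(prefix):
        raise ValueError(f"refusing to drop {name!r}: missing prefix {prefix!r}")
    with conn:  # commits on success, rolls back on error
        conn.execute(f'DROP TABLE "{name}"')


# Toy database: one regular table and one table of moved rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE demo_task_instance (id INTEGER);
    CREATE TABLE demo_moved_task_instance (id INTEGER);
    INSERT INTO demo_moved_task_instance VALUES (1), (2);
    """
)
summary = summarize_moved_tables(conn)
print(summary)
```

The prefix guard in `drop_moved_table` is a design choice for this sketch: it keeps a mistyped name from dropping a live table while cleaning up.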
Improve message and documentation around moved data (cherry picked from commit de43fb3)
Fix #18894.
Instead of failing loudly on invalid records (which happens far too often), this attempts to move the offending data to another table and carry on with the migration if possible. The table for dangling data is created with `CREATE TABLE ... AS SELECT ...` and may lack the indexes and constraints of the original, but it is only meant for temporary storage, so this is probably not a big deal. If the copy succeeds, the dangling rows are automatically deleted from the original table so the migration can continue.

This also removes the upgrade check on TaskFail and adds one on TaskReschedule. This is because TaskFail is not actually being migrated in 2.2, while TaskReschedule is, and we concluded that this was likely a typo during implementation rather than an intentional choice.
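The copy-then-delete idea described above can be sketched in a few lines. This is not the code from this PR: it uses Python's standard `sqlite3` module with toy tables (`demo_dag_run`, `demo_task_reschedule`) and a hypothetical helper `move_dangling_rows`, all invented for illustration, and it treats a row as "dangling" when its `run_id` has no matching parent row, which is a simplifying assumption about the real check.

```python
import sqlite3


def move_dangling_rows(conn, source, parent, fk_column, parent_key, target):
    """Copy rows of source with no matching parent row into target, then delete them.

    Returns the number of rows moved. All identifiers are trusted inputs here;
    this sketch does no quoting or validation.
    """
    # Rows of source whose fk_column matches nothing in parent (NULLs included).
    dangling = (
        f"SELECT s.* FROM {source} AS s "
        f"LEFT JOIN {parent} AS p ON s.{fk_column} = p.{parent_key} "
        f"WHERE p.{parent_key} IS NULL"
    )
    count = conn.execute(f"SELECT COUNT(*) FROM ({dangling})").fetchone()[0]
    if count == 0:
        return 0  # nothing to move, so no extra table is created

    with conn:  # commits on success, rolls back on error
        # The copy carries column data only, not indexes or constraints.
        conn.execute(f"CREATE TABLE {target} AS {dangling}")
        # Remove the same rows from the original table so a migration could proceed.
        conn.execute(
            f"DELETE FROM {source} WHERE NOT EXISTS ("
            f"SELECT 1 FROM {parent} AS p "
            f"WHERE p.{parent_key} = {source}.{fk_column})"
        )
    return count


# Toy schema (hypothetical, not Airflow's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE demo_dag_run (run_id TEXT PRIMARY KEY);
    CREATE TABLE demo_task_reschedule (id INTEGER PRIMARY KEY, run_id TEXT);
    INSERT INTO demo_dag_run VALUES ('run_a');
    INSERT INTO demo_task_reschedule VALUES (1, 'run_a'), (2, 'run_missing'), (3, NULL);
    """
)
moved = move_dangling_rows(
    conn,
    source="demo_task_reschedule",
    parent="demo_dag_run",
    fk_column="run_id",
    parent_key="run_id",
    target="demo_moved_task_reschedule",
)
print(f"moved {moved} dangling row(s)")
```

Deleting only after the copy statement has succeeded mirrors the "if copying went well" condition in the description: a failed copy raises before any row is removed from the original table.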