Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to move "dangling" rows in upgradedb #18953

Merged
merged 5 commits into from
Oct 14, 2021

Conversation

uranusjr
Copy link
Member

Fix #18894.

Instead of failing loudly for invalid records (which happens way too often), this attempts to move those offending data to another table and carry on with the migration if possible. This table for dangling data are copied with CREATE TABLE ... AS SELECT ... and could miss some indexing and stuff, but this is only meant for temporary storage, so this is probably not a big deal. If copying went well, the dangling data are automatically deleted so we can carry on with migration.

This also removed the upgrade check on TaskFail and added TaskReschedule. This is because TaskFail is not actually being migrated in 2.2, while TaskReschedule is, and we concluded this is likely a typo during implementation and not an intentional choice.

@uranusjr uranusjr requested a review from ashb October 13, 2021 16:52
@uranusjr uranusjr force-pushed the dagrun-pk-migrate-to-temporary-table branch from cf3d9a0 to 277abdc Compare October 13, 2021 16:53
@kaxil kaxil added this to the Airflow 2.2.1 milestone Oct 13, 2021
@uranusjr uranusjr force-pushed the dagrun-pk-migrate-to-temporary-table branch from 4e2e6aa to dda518a Compare October 14, 2021 08:43
Instead of failing loudly for invalid records (which happens way too
often), this attempts to move those offending data to another table and
carry on with the migration if possible. This table for dangling data
are copied with CREATE TABLE ... AS SELECT ... and could miss some
indexing and stuff, but this is only meant for temporary storage, so
this is probably not a big deal. If copying went well, the dangling
data are automatically deleted so we can carry on with migration.

Additionally, this commit removes the upgrade check on TaskFail, and
added check on TaskReschedule. This is because TaskFail is not actually
being migrated in 2.2, while TaskReschedule is, and we concluded this
is likely a typo during implementation and not an intentional choice.
@uranusjr uranusjr force-pushed the dagrun-pk-migrate-to-temporary-table branch from dda518a to f73720a Compare October 14, 2021 08:47
@uranusjr uranusjr marked this pull request as ready for review October 14, 2021 09:12
@uranusjr uranusjr marked this pull request as draft October 14, 2021 09:12
Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- what else is there to do on this @uranusjr ?

@uranusjr uranusjr marked this pull request as ready for review October 14, 2021 13:36
@uranusjr
Copy link
Member Author

I just added some code to show the alert in the web UI. This should be ready.

@uranusjr uranusjr requested a review from ashb October 14, 2021 13:37
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Oct 14, 2021
Copy link
Member

@jedcunningham jedcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally as well, looks good.

airflow/www/templates/airflow/dags.html Outdated Show resolved Hide resolved
@uranusjr
Copy link
Member Author

Also I somehow forgot to push the fix to the db query count test before :( Waiting for CI to pass...

@uranusjr uranusjr merged commit f967ca9 into apache:main Oct 14, 2021
@uranusjr uranusjr deleted the dagrun-pk-migrate-to-temporary-table branch October 14, 2021 17:09
jedcunningham pushed a commit that referenced this pull request Oct 14, 2021
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Oct 27, 2021
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Oct 28, 2021
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 10, 2021
In Airflow 2.2.2 we introduced a fix in apache#18953 where the corrupted
data was moved to a separate table. However some of our users
(rightly) might not have the context. We've never had anything
like that before, so the users who treat Airflow DB as
black-boxes might get confused on what the error means and what
they should do in this case.

You can see it in apache#19440 converted into discussion apache#19444 and apache#19421
indicate that the message is a bit unclear for users. This PR attempts to
improve that it adds `upgrading` section to our documentation and have the
message link to it so that rather than asking questions in the issues,
users can find context and answers what they should do in our docs.

It also guides the users who treat Airflow DB as "black-box" on how they
can use their tools and airflow db shell to fix the problem.
potiuk added a commit that referenced this pull request Nov 10, 2021
* Improve message and documentation around moved data

In Airflow 2.2.2 we introduced a fix in #18953 where the corrupted
data was moved to a separate table. However some of our users
(rightly) might not have the context. We've never had anything
like that before, so the users who treat Airflow DB as
black-boxes might get confused on what the error means and what
they should do in this case.

You can see it in #19440 converted into discussion #19444 and #19421
indicate that the message is a bit unclear for users. This PR attempts to
improve that it adds `upgrading` section to our documentation and have the
message link to it so that rather than asking questions in the issues,
users can find context and answers what they should do in our docs.

It also guides the users who treat Airflow DB as "black-box" on how they
can use their tools and airflow db shell to fix the problem.
kaxil pushed a commit that referenced this pull request Nov 11, 2021
* Improve message and documentation around moved data

In Airflow 2.2.2 we introduced a fix in #18953 where the corrupted
data was moved to a separate table. However some of our users
(rightly) might not have the context. We've never had anything
like that before, so the users who treat Airflow DB as
black-boxes might get confused on what the error means and what
they should do in this case.

You can see it in #19440 converted into discussion #19444 and #19421
indicate that the message is a bit unclear for users. This PR attempts to
improve that it adds `upgrading` section to our documentation and have the
message link to it so that rather than asking questions in the issues,
users can find context and answers what they should do in our docs.

It also guides the users who treat Airflow DB as "black-box" on how they
can use their tools and airflow db shell to fix the problem.

(cherry picked from commit de43fb3)
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate from 2.1.4 to 2.2.0
5 participants