PITR: Undo of ALTER TABLE #7135
Labels
area/docdb
YugabyteDB core features
Comments
spolitov added a commit that referenced this issue Apr 1, 2021
Summary: This diff adds logic to restore table schema. After this, we should be able to undo an ALTER TABLE operation!

There are two important changes as part of this diff:
1) Restoring master side sys_catalog metadata.
2) Sending the restored version of the schema from the master to the TS, as part of the explicit command to restore the TS.

As part of applying the restore operation on the master, we add new state tracking, which can do the diff between current sys_catalog state vs the state at the time at which we want to restore. This is done by restoring the corresponding sys_catalog snapshot into a temporary directory, with the HybridTime filter applied, for the restore_at time. We then load the relevant TABLE and TABLET data into memory and overwrite the existing rocksdb data directly in memory.

This is safe to do because:
- It is done as part of the apply step of a raft operation, so it is already persisted and will be replayed accordingly at bootstrap, in case of a restart.
- It is done on both leader and follower.

Once the master state is rolled back, we then run the TS side of the restore operation. The master now sends over the restored schema information, as part of the Restore request. On the TS side, we update our tablet schema information on disk accordingly.

Note: In between the master state being rolled back and all the TS processing their respective restores, there is a time window in which the master can receive heartbeats from a TS, with newer schema information than what the master has persisted. Currently, that seems to only lead to some log spew, but will be investigated later, as part of fault tolerance testing.

Test Plan: ybd --gtest_filter SnapshotScheduleTest.RestoreSchema

Reviewers: amitanand, bogdan

Reviewed By: bogdan

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D11013
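To make the master-side step in the summary above more concrete, here is a rough Python sketch of the idea: read the catalog as of restore_at (the HybridTime filter), compare it with the current state, and work out which entries need to be overwritten. This is only an illustration, not the actual YugabyteDB code. `CatalogWrite`, `state_as_of` and `restore_plan` are hypothetical names, hybrid times are modeled as plain integers, and the handling of entries that exist only in the current state is an assumption the commit message does not describe.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CatalogWrite:
    """Hypothetical model of one versioned sys_catalog write."""
    key: str
    hybrid_time: int          # simplified: a plain integer timestamp
    value: Optional[str]      # None marks a delete (tombstone)


def state_as_of(writes: List[CatalogWrite], read_time: int) -> Dict[str, str]:
    """Return the visible key -> value map, ignoring writes after read_time."""
    latest: Dict[str, CatalogWrite] = {}
    for w in writes:
        if w.hybrid_time > read_time:
            continue  # filtered out, like the HybridTime filter on the snapshot
        prev = latest.get(w.key)
        if prev is None or w.hybrid_time >= prev.hybrid_time:
            latest[w.key] = w
    return {k: w.value for k, w in latest.items() if w.value is not None}


def restore_plan(
    current: Dict[str, str], restored: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Diff current state against the restored state.

    Returns (overwrites, only_in_current). Reporting keys that exist only
    in the current state is an assumption of this sketch.
    """
    overwrites = {k: v for k, v in restored.items() if current.get(k) != v}
    only_in_current = sorted(k for k in current if k not in restored)
    return overwrites, only_in_current


# Example: table t1 was altered at time 30; we restore to time 20.
writes = [
    CatalogWrite("table/t1", 10, "schema v1: (id, name)"),
    CatalogWrite("table/t1", 30, "schema v2: (id, name, email)"),  # ALTER TABLE
    CatalogWrite("table/t2", 15, "schema v1: (k, v)"),
]
current = state_as_of(writes, read_time=40)
restored = state_as_of(writes, read_time=20)
overwrites, only_in_current = restore_plan(current, restored)
print(overwrites)
print(only_in_current)
```

In this toy model only `table/t1` ends up in the overwrite set, since `table/t2` is unchanged between the two times. In the real system the restored schema would then also be sent to the TS as part of the Restore request, which the sketch leaves out.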
YintongMa pushed a commit to YintongMa/yugabyte-db that referenced this issue May 26, 2021
Some options we've discussed internally