-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DBZ-7050 Add automatic retry for snapshots #163
Conversation
@jpechane Can you review this when you get the chance? Thank you! |
@@ -464,28 +467,34 @@ public List<String> getShard() { | |||
return getConfig().getStrings(SHARD, CSV_DELIMITER); | |||
} | |||
|
|||
public List<String> getGtid() { | |||
public String getGtid() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the method name called getVGtid
?
|
||
public static final Field GTID = Field.create(VITESS_CONFIG_GROUP_PREFIX + "gtid") | ||
.withDisplayName("gtid") | ||
public static final Field VGTID = Field.create(VITESS_CONFIG_GROUP_PREFIX + "vgtid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it cause any regression with the config name changes from gtid
to vgtid
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will. It is necessary to deprecate the existing setting. So both should be available and when the deprecated one is used then WARN should be written to the log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twthorn Thanks for the PR. I left few minor comments. One question for you. I supposed that incomplete snapshots would be resumed automaically so I'd expect additional data being writtent to or read from offsets. Is is the case?
Also for testing there should be test when connector is started. Then it is stopped in the middle of snapshot and started again. There should be no lost records or duplicates.
pom.xml
Outdated
@@ -245,6 +245,11 @@ | |||
<type>test-jar</type> | |||
<scope>test</scope> | |||
</dependency> | |||
<dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please place this dependency version into dependencyManagement
and configure it via property?
Also I'd recommend to align the version with the one pulled by grpc client.
|
||
import binlogdata.Binlogdata; | ||
|
||
public class TablePrimaryKeys { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add JavaDoc to this class?
@@ -37,8 +40,7 @@ | |||
*/ | |||
public class VitessConnectorConfig extends RelationalDatabaseConnectorConfig { | |||
|
|||
public static final List<String> EMPTY_GTID_LIST = List.of(Vgtid.EMPTY_GTID); | |||
public static final List<String> DEFAULT_GTID_LIST = List.of(Vgtid.CURRENT_GTID); | |||
public static final String DEFAULT_GTID = Vgtid.CURRENT_GTID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved to Vgtid
class?
|
||
public static final Field GTID = Field.create(VITESS_CONFIG_GROUP_PREFIX + "gtid") | ||
.withDisplayName("gtid") | ||
public static final Field VGTID = Field.create(VITESS_CONFIG_GROUP_PREFIX + "vgtid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will. It is necessary to deprecate the existing setting. So both should be available and when the deprecated one is used then WARN should be written to the log.
Correct, there will be additional data read/written from offsets. There is now an additional field Also worth noting, for rolling forward, this works, and I added a test showing this (we can read offsets that lack the field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
||
@Test | ||
public void testSnapshotLargeTable() throws Exception { | ||
TestHelper.executeDDL("vitess_create_tables.ddl"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twthorn Could you please store all the records captured and assert at the end that all are present and there are neither gaps nor duplicates?
@twthorn LGTM. I left one comment requesting a hardening of a a test. Otherwise we are good to go. |
@jpechane Thanks for the changes! Please let me know if there's anything else I can do on the PR |
@twthorn Applied, thanks a lot |
Add automatic retry for snapshots.
Automatic Retry
As detailed in the table copy RFC the last PK value in the VGTID is used for resuming partially complete snapshots. We utilize this for our automatic retries and parse this additional information now being sent by Vitess in the latest version (i.e., including fixes like this).
One implementation decision I debated between was switching all of our VGTID parsing logic to instead use the provided protobuf json converters. This would simplify our codebase but also mean the Debezium logic is more tightly coupled with Vitesss/protobuf, so for the latter reason I decided against this.
GTID -> VGTID config
Previous configs had the option of
vitess.gtid
which was originally added whenvitess.shard
only allowed for a single shard. With the changes to support multiple shards back in #135, we made GTID into a CSV as well. In order to easily test the automatic retry snapshot behavior, it would be useful to also specify the last PK values in the config. Additionally, GTID csv adds unnecessary complexity to support another format of specifying GTID(s). So to resolve both of these issues, I changed this to simply be avitess.vgtid
config which is a JSON parseable string that we can use to initialize our VGTID for a VStream. Since VGTID is an array of keyspace/shard/gtid/lastpks it has far more versatility for the user to be able to arbitrarily specify the GTID(s) for any shard(s) subset and even PKs to resume snapshots on. This is what is used to test the snapshot resume behavior in the integration tests.