Binary column corruption ruby tests 157 #159

kolbitsch-lastline · 2020-03-11T00:56:52Z

Link: #157

shuhaowu · 2020-03-11T15:26:39Z

I see you made some changes into vendor/. Upstream is usually pretty receptive to PRs. Perhaps we can send a PR there?

kolbitsch-lastline · 2020-03-11T16:25:25Z

I see you made some changes into vendor/. Upstream is usually pretty receptive to PRs. Perhaps we can send a PR there?

yes, see my comment in the git commit message.

the point of the pull-request is to get your feedback on the change... if you disagree this is how we want to implement this, doing a PR to the upstream is a waste of time :-) . But yes, if you agree we want to change it as I propose, we clearly need to send the vendor'ized changes to upstream

shuhaowu · 2020-03-11T17:15:25Z

test/integration/types_test.rb

+
+    ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY)
+
+    ghostferry.on_status(Ghostferry::Status::ROW_COPY_COMPLETED) do


Can you use BINLOG_STREAMING_STARTED instead? Should be less racy that way.

hm, I actually think I tried that. Maybe I misunderstood what's the sequence of events.

When is binlog streaming started in this scenario - after the copy is complete or before that? I thought the former. Thus, I really want the test to happen after we are copying, because the bug occurs only during binlog streaming, not during the copy (as far as I can tell)

Binlog streaming and data copy begin and occur simultaneously.

when the ROW_COPY_COMPLETED event is sent, the binlog streamer is about to be terminated. I don't recall 100%, but I think there's a strong race possibility that the binlog is not picked up by Ghostferry.

When these events are sent and these ruby blocks are called, the Ghostferry code is blocked from proceeding until the ruby block returns. As such, a more fool-proof test would be something like this:

source_db.query("INSERT INTO ... (binary data...)") row_copy_called = false binlog_appled_called = false ghostferry.on_status(Ghostferry::Status::AFTER_ROW_COPY) do row_copy_called = true # select row from the target and then make sure the data with 0 padding is present # Perform the update query to set the data to `updated_data`. end ghostferry.on_status(Ghostferry::Status::AFTER_BINLOG_APPLY) do binlog_apply_called = true # select row from the target and make sure the data is on the updated_data end # run ghostferry, # assert table identical assert binlog_apply_called assert row_copy_called # if we want to be defensive, query the target and make sure the data is updated_data.

I tried your suggestion, but am running into two issues with this:

first, updating the source-DB from the context of AFTER_ROW_COPY fails. Do we hold some type of lock at the time this is invoked?

second, when checking in AFTER_BINLOG_APPLY, it seems binlog applying isn't fully done yet. Thus, it keep seeing the old data and the validation fails

I understand your proposed code, but seems that's equally racy or something else is off

Oh yeah you're right. Sorry for the faulty suggestion. In AFTER_BINLOG_APPLY there's indeed a lock due to SELECT FOR UPDATE. I misremembered when I made that suggestion. For now using BINLOG_STREAMING_STARTED should work slightly better. At some point in the future we can make the "breakpoints" a little bit more robust and less racy.

got it, thanks for the details. No problem for the misdirection - glad I wasn't doing anything wrong

to avoid a misunderstanding: so is there anything else that we should fix?

Other than using BINLOG_STREAMING_STARTED, not now.

ok, then there seems to have been a misunderstanding - that's why I was asking if streaming was started in parallel with the row-copy above (and you answered they are started in parallel...)

if we use BINLOG_STREAMING_STARTED , we race in the unit-test (and actually usually lose the race), meaning that callback is called before the row-copy has inserted the original data in the target.

shuhaowu · 2020-03-11T17:35:31Z

Let me know If I understand this issue correctly:

if there's a BINARY(N) column and if you insert a some data where its length is < N, mysql will automatically pad the data with \0.
Within the DataIterator: Ghostferry retrieves the binary column, it will return the data with trailing '\0'. Thus, data copied by Ghostferry will remain unaffected.
Within the the BinlogStreamer: We use siddontang/go-mysql's replication module, which strips the \0 out. This implies that the data in the mysql binlog stream is fine, just there's a bug in the upstream module.
The data with the stripped \0 reaches Ghostferry and we use it in the WHERE clause for UPDATE and DELETE, which matches no rows and thus causes silent data corruption within the BinlogStreamer only.

Thus, the fix here is:

If we detect a BINARY(N) column and we get data that's less than N, we right pad the data with \0 until the length of the data = N.
The patch made to the upstream library is to change the schema definition to add the BINARY and VARBINARY types. It's not to make it so the upstream library returns the correct data to ghostferry.

Questions:

What about VARBINARY? Do we need anymore handling for that type other than this patch? I'm not very familiar with this data type. The MySQL doc says "For VARBINARY, there is no padding for inserts and no bytes are stripped for retrievals", but I'm not certain how applicable this is here.

Additional comments:

You may have noted that CI has been failing. This maybe due to one of the race conditions within tests outlined in Race condition in tests causing intermittent failures #107. Basically the only advice I have is to retry the tests until the pass if you know the failing test is caused by one of the documented issues. Fixing them may also be easy and is also very welcome (in a separate PR).

kolbitsch-lastline · 2020-03-11T19:49:24Z

if there's a BINARY(N) column and if you insert a some data where its length is < N, mysql will automatically pad the data with \0.

correct:

mysql> create database test;
Query OK, 1 row affected (0.00 sec)

mysql> use test;
Database changed

mysql> CREATE TABLE test (id bigint(20) not null auto_increment, data BINARY(4), primary key(id));
Query OK, 0 rows affected (0.06 sec)

mysql> INSERT INTO test (id, data) values (1, 'AB');
Query OK, 1 row affected (0.00 sec)

mysql> select id, data, hex(data) from test where id = 1;
+----+------+-----------+
| id | data | hex(data) |
+----+------+-----------+
|  1 | AB   | 41420000  |
+----+------+-----------+
1 row in set (0.00 sec)

as you can see, I inserted AB but the DB stores AB\x00\x00. The binlog will contain AB though.

same can be seen if I do a simple query later on:

mysql> select id, data, hex(data) from test where id = 1 and data = 'AB';
Empty set (0.00 sec)

As you can see, I cannot SELECT the data that I INSERTed before. Selecting on the padded string, however, works:

mysql> select id, data, hex(data) from test where id = 1 and hex(data) = '41420000';
+----+------+-----------+
| id | data | hex(data) |
+----+------+-----------+
|  1 | AB   | 41420000  |
+----+------+-----------+
1 row in set (0.00 sec)

(I'm using the hex stuff to avoid having to print the \x00 here)

Within the DataIterator: Ghostferry retrieves the binary column, it will return the data with trailing '\0'. Thus, data copied by Ghostferry will remain unaffected.

I have not experimented yet with the copy, but that's my next step. But, yes, I agree it should not be affected

Within the the BinlogStreamer: We use siddontang/go-mysql's replication module, which strips the \0 out. This implies that the data in the mysql binlog stream is fine, just there's a bug in the upstream module.

not quite: it does not strip it, it never ends up in the binlog. So, it's the mysql server's "fault", not the replication module's.

Essentially ghostferry receives what the user inserted, but not the 0-padding added by mysql on the insert.

NOTE: I was not able to find this behavior in the documentation. I'm inferring this from experiments, but I have done a lot of it, so I'm fairly confident what I'm writing here reflects reality. But there could be a different cause for all this that matches the same symptoms.

The data with the stripped \0 reaches Ghostferry and we use it in the WHERE clause for UPDATE and DELETE, which matches no rows and thus causes silent data corruption within the BinlogStreamer only.

correct.

Thus, the fix here is:

If we detect a BINARY(N) column and we get data that's less than N, we right pad the data with \0 until the length of the data = N.

correct.

The patch made to the upstream library is to change the schema definition to add the BINARY and VARBINARY types. It's not to make it so the upstream library returns the correct data to ghostferry.

correct.

Questions:

What about VARBINARY? Do we need anymore handling for that type other than this patch? I'm not very familiar with this data type. The MySQL doc says "For VARBINARY, there is no padding for inserts and no bytes are stripped for retrievals", but I'm not certain how applicable this is here.

I believe VAR* types are not affected, as there is no padding. IMO it should only happen for BINARY and CHAR - that is, any column that supports fixed size and where the actual length in the column is not stored as separate attribute on the row-column

Additional comments:

You may have noted that CI has been failing. This maybe due to one of the race conditions within tests outlined in #107. Basically the only advice I have is to retry the tests until the pass if you know the failing test is caused by one of the documented issues.

oh, thanks for pointing this out! I was already puzzled by this. My local tests seem to work fine. So, I can confirm it works. Is there a way to retrigger QA without updating the PR?

Fixing them may also be easy and is also very welcome (in a separate PR).

hehe, I think I'll prioritize fixing the other issues I have brought up :-D But, yes, if I happen to see what is causing this, happy to send a PR ;-)

shuhaowu · 2020-03-11T19:51:55Z

oh, thanks for pointing this out! I was already puzzled by this. My local tests seem to work fine. So, I can confirm it works. Is there a way to retrigger QA without updating the PR?

There should be a rebuild button on CI. You have to be logged into travis via your Github account tho. You might be able to just click rerun on the GH interface as well, but I've never tried.

kolbitsch-lastline · 2020-03-11T19:59:22Z

I'm logged in, but I can't retrigger the build - I think it's a permission issue. According to the post here:

https://stackoverflow.com/questions/17606874/trigger-a-travis-ci-rebuild-without-pushing-a-commit

this is only possible if the account has write permissions, which I don't. Do you mind retriggering it? Assuming these are perma-links , you should be able to retrigger it here:

https://travis-ci.com/github/Shopify/ghostferry/builds/152736013

(for me, there is no `restart build" button as shown on the stackoverflow post)

shuhaowu · 2020-03-11T20:06:32Z

Triggered. Sorry about the messy state of CI. You can also push some empty commits to retrigger builds.

Back to the topic at hand, I think you're right about this being a mysql "bug". I just did my own tests:

mysql> create table a (d BINARY(5));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into a values (_binary'a');
Query OK, 1 row affected (0.01 sec)

mysql> select hex(d) from a;
+------------+
| hex(d)     |
+------------+
| 6100000000 |
+------------+
1 row in set (0.00 sec)

Using mysqlbinlog -vvv, I see:

# at 540
#200311 19:58:14 server id 1  end_log_pos 611 CRC32 0xd85350b1  Query   thread_id=3     exec_time=0     error_code=0
SET TIMESTAMP=1583956694/*!*/;
BEGIN
/*!*/;
# at 611
#200311 19:58:14 server id 1  end_log_pos 656 CRC32 0x742aeeee  Table_map: `abc`.`a` mapped to number 108
# at 656
#200311 19:58:14 server id 1  end_log_pos 694 CRC32 0xdd1079b9  Write_rows: table id 108 flags: STMT_END_F

BINLOG '
1kJpXhMBAAAALQAAAJACAAAAAGwAAAAAAAEAA2FiYwABYQAB/gL+BQHu7ip0
1kJpXh4BAAAAJgAAALYCAAAAAGwAAAAAAAEAAgAB//4BYbl5EN0=
'/*!*/;
### INSERT INTO `abc`.`a`
### SET
###   @1='a' /* STRING(5) meta=65029 nullable=1 is_null=0 */
# at 694
#200311 19:58:14 server id 1  end_log_pos 725 CRC32 0xfb77b9e0  Xid = 15
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file

Specifically, the line says: ### @1='a' /* STRING(5) meta=65029 nullable=1 is_null=0 */ suggests to me that the MySQL binlog doesn't contain the \0 at the end. Maybe mysqlbinlog is stripping the /0, we'd have to confirm with mysql source code I'm afraid...

shuhaowu · 2020-03-11T20:17:30Z

Since you're looking a little more closely at go-mysql, here are some links of how some other code are handling BINARY in replication. ~~They also remind me that this code may affect CHAR.~~

https://github.com/shyiko/mysql-binlog-connector-java/blob/f698427b151aadacba8e6ec305dbc6b5c9d61aee/src/main/java/com/github/shyiko/mysql/binlog/event/deserialization/AbstractRowsEventDataDeserializer.java#L363-L371

Edit: doesn't seem to affect CHAR:

mysql> create table b (d char(5));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into b values ('a');
Query OK, 1 row affected (0.00 sec)

mysql> select hex(d) from b;
+--------+
| hex(d) |
+--------+
| 61     |
+--------+
1 row in set (0.00 sec)

kolbitsch-lastline · 2020-03-11T20:24:48Z

oh, interesting. Essentially the java-equivalent of what I suggested.

I think we agree on how to move forward then, correct? I'll go and try to get the patch into the upstream of go-mysql so we can use it here...

shuhaowu · 2020-03-11T20:27:33Z

After some further investigation on related projects, it seems that others are running into the same issue. Specifically, shyiko/mysql-binlog-connector-java#169 is like siddontang/go-mysql, which is unable to fix this issue it seems. Debezium is a consumer of that library much like Ghostferry is, and they documented the issue here: https://issues.redhat.com/browse/DBZ-254. And their solution sounds very similar to yours and is implemented at debezium/debezium#237. I'm going to review that PR to see if we're doing anything differently here.

kolbitsch-lastline · 2020-03-11T20:30:38Z

haha, I can't believe how similar the examples between

https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html

and my post above are - what a coincidence :-)

But, glad to see that we have a concrete page to refer to . Working on a PR to go-mysql now

shuhaowu · 2020-03-11T20:31:55Z

Just reviewed the Debezium PR. For future reference, Debezium's fix is here: https://github.com/debezium/debezium/blob/2c56997/debezium-connector-mysql/src/main/java/io/debezium/connector/mysql/MySqlValueConverters.java#L611-L614

This approach is exactly the same as the ones proposed here.

shuhaowu

The approach looks good to me. We need to merge the upstream changes but idk if we are ready to update go-mysql, so maybe a vendor patch here is fine.

shuhaowu · 2020-03-11T20:36:22Z

dml_events.go

@@ -406,17 +410,21 @@ func Int64Value(value interface{}) (int64, bool) {
 //
 // ref: https://github.com/mysql/mysql-server/blob/mysql-5.7.5/mysys/charset.c#L963-L1038
 // ref: https://github.com/go-sql-driver/mysql/blob/9181e3a86a19bacd63e68d43ae8b7b36320d8092/utils.go#L717-L758
-func appendEscapedString(buffer []byte, value string) []byte {
+func appendEscapedString(buffer []byte, value string, padToLength uint) []byte {


Can you rename this to rightPadLengthForBinaryColumn to make it more verbose? Can you also leave some comments in the code to link to this PR so people in the future can figure out why this is done?

shuhaowu · 2020-03-11T20:38:46Z

dml_events.go

 		c := value[i]
 		if c == '\'' {
 			buffer = append(buffer, '\'', '\'')
 		} else {
 			buffer = append(buffer, c)
 		}
 	}
+	for ; i < int(padToLength); i++ {
+		buffer = append(buffer, 0)


Question: Appending 0 is OK here? Don't need something like \0? I ask because this buffer is eventually going to be turned into a string.

buffer is a []byte . As such, 0 seems the right thing to me

Yes, but buffer eventually gets turned into a string, see https://github.com/Shopify/ghostferry/blob/e3f52d4/dml_events.go#L286. I'm not certain if that's the right encoding for the query as I'd assume MySQL wants you to specify \0 in a statement, especially since we're not using prepared statements here.

sorry, I saw your comment too late and I just pushed the new patch-set. Let me check a bit more - I thought the binary conversion would end up being the same thing

admittedly: I'm coming from a c/c++ and Python background, so I don't know if it really makes a difference

in golang, 0 is a int while '\x00' is a int32 but they should be the same value when encoding. But I have no problem using '\x00' if that's cleaner

shuhaowu · 2020-03-11T20:39:15Z

dml_events.go

 		c := value[i]
 		if c == '\'' {
 			buffer = append(buffer, '\'', '\'')
 		} else {
 			buffer = append(buffer, c)
 		}
 	}
+	for ; i < int(padToLength); i++ {


This feels a bit clever such that it might confuse future readers, but may be OK.

shuhaowu · 2020-03-11T20:48:32Z

test/integration/types_test.rb

+
+    ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY)
+
+    ghostferry.on_status(Ghostferry::Status::ROW_COPY_COMPLETED) do


when the ROW_COPY_COMPLETED event is sent, the binlog streamer is about to be terminated. I don't recall 100%, but I think there's a strong race possibility that the binlog is not picked up by Ghostferry.

When these events are sent and these ruby blocks are called, the Ghostferry code is blocked from proceeding until the ruby block returns. As such, a more fool-proof test would be something like this:

source_db.query("INSERT INTO ... (binary data...)") row_copy_called = false binlog_appled_called = false ghostferry.on_status(Ghostferry::Status::AFTER_ROW_COPY) do row_copy_called = true # select row from the target and then make sure the data with 0 padding is present # Perform the update query to set the data to `updated_data`. end ghostferry.on_status(Ghostferry::Status::AFTER_BINLOG_APPLY) do binlog_apply_called = true # select row from the target and make sure the data is on the updated_data end # run ghostferry, # assert table identical assert binlog_apply_called assert row_copy_called # if we want to be defensive, query the target and make sure the data is updated_data.

kolbitsch-lastline · 2020-03-11T21:30:29Z

The approach looks good to me.

cool, working on a final PS

We need to merge the upstream changes but idk if we are ready to update go-mysql, so maybe a vendor patch here is fine.

ok, then let's start with a vendor-patch and - in parallel - work on getting our changes into upstream

rodrigosaito · 2020-03-12T17:19:23Z

test/integration/types_test.rb

+    inserted_data = "ABC\x00"
+    updated_data = "EFGH"
+
+    source_db.query("INSERT INTO #{DEFAULT_FULL_TABLE_NAME} (id, data) VALUES (1, _binary'#{inserted_data}')")


should this be inserting a value that is shorter than the column definition(eg: "ABC") and let mysql add the trailing 0s to reproduce the exact same issue reported here ?

interestingly mysql seems to even strip the trailing whitespace if done like this. But your suggestion is perfectly valid and maybe a slightly better test-case

fjordan

Thanks for the fix! A few comments

fjordan · 2020-03-12T21:44:53Z

test/integration/types_test.rb

+    # update/delete statements, as the WHERE clause would not match existing
+    # rows in the target DB
+    inserted_data = "ABC"
+    expected_inserted_data = inserted_data + "\x00"


A nitpick, but in ruby this is preferably expressed as:

Suggested change

expected_inserted_data = inserted_data + "\x00"

expected_inserted_data = "#{inserted_data}\x00"

fjordan · 2020-03-12T21:49:31Z

test/integration/types_test.rb

+
+    [source_db, target_db].each do |db|
+      db.query("CREATE DATABASE IF NOT EXISTS #{DEFAULT_DB}")
+      db.query("CREATE TABLE IF NOT EXISTS #{DEFAULT_FULL_TABLE_NAME} (id bigint(20) not null auto_increment, data BINARY(4), primary key(id))")


Can we add a test for a case when the length is 0 (ex: create table a (d binary);) and for the decimal type above?

Can we add a test for a case when the length is 0 (ex: create table a (d binary);)

sure

and for the decimal type above?

Just to avoid an unnecessary round-trip: what do you mean by that?

In the decimal.Decimal case, the function is called with a length of 0 - appendEscapedString(buffer, v.String(), 0). I meant a test case of length 0 would cover both the decimal.Decimal case as well as if the length of the column was 0 (ie: create table a (d binary);).

got it. I thought we already had a test-case for decimals, but I'll check.

fjordan · 2020-03-12T22:08:39Z

dml_events.go

+//
+// We need to support right-padding of the generated string using 0-bytes to
+// mimic what a MySQL server would do for BINARY columns (with fixed length).
+//
+// ref: https://github.com/Shopify/ghostferry/pull/159
+//
+// This is specifically mentioned in the the below link:
+//
+//    When BINARY values are stored, they are right-padded with the pad value
+//    to the specified length. The pad value is 0x00 (the zero byte). Values
+//    are right-padded with 0x00 for inserts, and no trailing bytes are removed
+//    for retrievals.
+//
+// ref: https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html
+func appendEscapedString(buffer []byte, value string, rightPadLengthForBinaryColumn uint) []byte {


I think moving the 0-byte padding into a separate function would be cleaner, as the padding doesn't really have anything to do with properly escaping strings. That way, we only need to call it if column.Type == schema.TYPE_BINARY and don't need to modify the case of decimal.Decimal

Reusing the naming, we can make a function like:

func rightPadLengthForBinaryColumn(buffer []byte, columnLength uint) []byte { ... }

and just move the logic there. It can be called directly after appendEscapedString in line 357.

true - seems cleaner and avoids the other invocation of the method with the awkward 0 parameter

actually, let me take that back. I agree it's cleaner, but the method currently does the quoting (as well as escaping the ' ) and knows about the length of the string

sure, all of that can be broken apart, but does that make it more readable?

You're right. I think handling the trailing '\'' that is appended to the buffer in a separate method would make this less readable.

Another option would be to pass the column to the appendEscapedString method and conditionally handle the padding in that method:

// We need to support right-padding of the generated string using 0-bytes to // mimic what a MySQL server would do for BINARY columns (with fixed length). // // ref: https://github.com/Shopify/ghostferry/pull/159 // // This is specifically mentioned in the the below link: // // When BINARY values are stored, they are right-padded with the pad value // to the specified length. The pad value is 0x00 (the zero byte). Values // are right-padded with 0x00 for inserts, and no trailing bytes are removed // for retrievals. // // ref: https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html func rightPadLengthForBinaryColumn(buffer []byte, padLength int) []byte { // continue 0-padding up to the desired length as provided by the // caller for i := 0; i < int(padLength); i++ { buffer = append(buffer, '\x00') } return buffer } // appendEscapedString replaces single quotes with quote-escaped single quotes. // When the NO_BACKSLASH_ESCAPES mode is on, this is the extent of escaping // necessary for strings. // // ref: https://github.com/mysql/mysql-server/blob/mysql-5.7.5/mysys/charset.c#L963-L1038 // ref: https://github.com/go-sql-driver/mysql/blob/9181e3a86a19bacd63e68d43ae8b7b36320d8092/utils.go#L717-L758 func appendEscapedString(buffer []byte, value string, column schema.TableColumn) []byte { buffer = append(buffer, '\'') for i := 0; i < len(value); i++ { c := value[i] if c == '\'' { buffer = append(buffer, '\'', '\'') } else { buffer = append(buffer, c) } } if column.Type == schema.TYPE_BINARY { padLength := int(column.FixedSize) - len(value) buffer = rightPadLengthForBinaryColumn(buffer, padLength) } return append(buffer, '\'') }

IMHO appendEscapedString is a pure string/conversion method and should not have to worry about columns or types. I have refactored out the trivial method for 0-padding now, but I feel making the former method aware of columns or types seems going the wrong direction (making the method less useful in other scenarios)

it's a nit and I'm fine either way, but I feel passing the column is wrong here

I don't feel strongly - just that it seemed a bit awkward to be forcing a param for all cases when not all applied. Either way can work 👍

shuhaowu · 2020-03-13T12:59:23Z

dml_events.go

+	// continue 0-padding up to the desired length as provided by the
+	// caller
+	for ; i < int(rightPadLengthForBinaryColumn); i++ {
+		buffer = append(buffer, '\x00')


I meant to say that we're supposed to generate MySQL queries here. Is having a \0valid in a query? Or should it be "\0"? The difference is that the latter is the literal ascii character '\' and '0' while the former is actually just 0, if that makes sense.

heh, I agree :-) My original code was injecting literal 0 (that is, the byte, not the character), and then I switched it to be \x00 because it was suggested as improvement.

Still, please note that in neither case are we inserting a 2-character string of \0 (that is, a \ and a 0).

The only difference I can see is that one is an int and the other an int32. However: compared to you guys, I'm probably a golang noob and may simply be wrong here.

But, yes, we are building SQL statements, and we want to have literal 0-bytes (not 0 strings or \0 strings) be injected.

This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. NOTE: This commit requires changes to one of the vendor'ed modules in github.com/siddontang/go-mysql that we patch directly in the local repo. We are working on getting these changes into the upstream module and will need to merge these changes once we can use the latest upstream module version. Change-Id: Ib9c1b7308e8198f1fd38439c37f444d9a8154e6a

This was manually picked from #159. I made some minor adjustments to the comments and the function names to be more clear. Co-authored-by: Clemens Kolbitsch <kolbitsch@lastline.com> Co-authored-by: Shuhao Wu <shuhao@shuhaowu.com>

This commit addresses a data corruption bug in the binlog streaming phase, where the binlog writer incorrectly propagates values in fixed-length BINARY columns that have trailing 0s in the original value. These trailing 0s are removed from the binlog by the SQL master and therefore do not show up in the WHERE clause for update/delete statements executed by the binlog writer. This commit is manually cherry-picked from #159.

shuhaowu · 2021-09-22T17:16:18Z

This was cherry-picked into a new PR and merged to master.

shuhaowu reviewed Mar 11, 2020

View reviewed changes

shuhaowu requested review from fjordan, rodrigosaito and jeremycole March 11, 2020 20:52

kolbitsch-lastline force-pushed the binary-column-corruption-ruby-tests-157 branch 2 times, most recently from 1a9ea62 to e7fb1ba Compare March 11, 2020 22:28

rodrigosaito reviewed Mar 12, 2020

View reviewed changes

kolbitsch-lastline force-pushed the binary-column-corruption-ruby-tests-157 branch from e7fb1ba to 825bd63 Compare March 12, 2020 18:57

fjordan reviewed Mar 12, 2020

View reviewed changes

kolbitsch-lastline force-pushed the binary-column-corruption-ruby-tests-157 branch from 825bd63 to e64ef4e Compare March 12, 2020 23:21

shuhaowu reviewed Mar 13, 2020

View reviewed changes

kolbitsch-lastline force-pushed the binary-column-corruption-ruby-tests-157 branch from e64ef4e to 7c3890d Compare March 13, 2020 16:54

kolbitsch-lastline force-pushed the binary-column-corruption-ruby-tests-157 branch from 7c3890d to 37f2188 Compare March 13, 2020 17:54

shuhaowu mentioned this pull request Apr 9, 2020

Unit-test for binary-column data corruption (#157) #158

Closed

jeremycole removed their request for review June 23, 2021 18:05

shuhaowu mentioned this pull request Sep 21, 2021

Update go mysql to address a very rare data corruption problem #307

Merged

shuhaowu mentioned this pull request Sep 22, 2021

Fixes Ghostferry for BINARY columns #308

Merged

shuhaowu closed this in #308 Sep 22, 2021


		ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY)

		ghostferry.on_status(Ghostferry::Status::ROW_COPY_COMPLETED) do

	expected_inserted_data = inserted_data + "\x00"
	expected_inserted_data = "#{inserted_data}\x00"

Binary column corruption ruby tests 157 #159

Binary column corruption ruby tests 157 #159

Conversation

kolbitsch-lastline commented Mar 11, 2020 • edited by shuhaowu Loading

shuhaowu commented Mar 11, 2020

kolbitsch-lastline commented Mar 11, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shuhaowu commented Mar 11, 2020

kolbitsch-lastline commented Mar 11, 2020

shuhaowu commented Mar 11, 2020

kolbitsch-lastline commented Mar 11, 2020

shuhaowu commented Mar 11, 2020 • edited Loading

shuhaowu commented Mar 11, 2020 • edited Loading

kolbitsch-lastline commented Mar 11, 2020

shuhaowu commented Mar 11, 2020

kolbitsch-lastline commented Mar 11, 2020

shuhaowu commented Mar 11, 2020

shuhaowu left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kolbitsch-lastline commented Mar 11, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fjordan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fjordan Mar 13, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shuhaowu commented Sep 22, 2021

kolbitsch-lastline commented Mar 11, 2020 •

edited by shuhaowu

Loading

shuhaowu commented Mar 11, 2020 •

edited

Loading

shuhaowu commented Mar 11, 2020 •

edited

Loading

fjordan Mar 13, 2020 •

edited

Loading