Update go mysql to address a very rare data corruption problem #307
The siddontang version of go-mysql has been updated to https://github.com/go-mysql-org/go-mysql. I had to rename the import path everywhere.
Technically, we don't need this to fix the BINARY column problem (basically I need to rebase #159), as the version we currently pin already contains the fix we need from the upstream library, but I thought this was a good chance to do the update, since it is a decent chunk of work anyway.
Changelog: go-mysql-org/go-mysql@803944a...v1.3.0. Highlights:
Minor problem with the YEAR column
It seems that if a YEAR(4) column contains values of 0, Ghostferry can corrupt data, although the corruption will be caught by the inline verifier. The problem occurs when we UPDATE an existing YEAR column that has already been copied by the DataIterator, changing it from or to a value of 0, which is a possible value in non-strict mode. The full scale of this issue has not been investigated, as this is a rarely used data type.
I'm not sure the upstream fix is sufficient, as I have doubts about how it works (simply adding 1900 to the value seems wrong). More investigation is likely required.
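As a rough sketch of why adding 1900 seems suspect (this is only an illustration and assumes YEAR is carried in the binlog as a one-byte offset from 1900, with a raw 0 also used for the literal value 0000 that non-strict mode allows):

```go
// Sketch only: shows why unconditionally adding 1900 to the raw byte of a
// YEAR column looks wrong. Assumes a one-byte offset-from-1900 encoding in
// which a raw 0 can also mean the value 0000.
package main

import "fmt"

// decodeYear mimics the "add 1900" approach: every raw byte maps to 1900+raw.
func decodeYear(raw byte) int {
	return 1900 + int(raw)
}

func main() {
	fmt.Println(decodeYear(0))  // prints 1900, but the column may actually hold 0000
	fmt.Println(decodeYear(21)) // prints 1921
}
```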
Rare data loss due to binlog position comparison bug when binlog file numbering reaches 1M+
There's a rare bug in the upstream go-mysql library, fixed in this upstream PR, that can result in data loss if Ghostferry is currently streaming binlog file 999999 and sets a target position in file 1000000. The mysql.Position.Compare method performed a string comparison of the binlog filenames and thus incorrectly considers binlog file 999999 to be larger than 1000000, which results in early termination of the binlog streamer and can cause data loss. The relevant BinlogStreamer code is here. This is confirmed with a small sample program comparing the two positions: when executed with the vendored go-mysql library currently in the repo, the comparison is wrong; with 1.3.0 as updated in this PR, it is correct.
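As a rough sketch of the failure mode (an illustration, not the exact sample program; it assumes the mysql.Position API from go-mysql and typical mysql-bin.NNNNNN binlog filenames):

```go
// Minimal sketch showing why a lexicographic comparison of binlog filenames
// misorders files 999999 and 1000000.
package main

import (
	"fmt"
	"strings"

	"github.com/go-mysql-org/go-mysql/mysql"
)

func main() {
	current := mysql.Position{Name: "mysql-bin.999999", Pos: 4}
	target := mysql.Position{Name: "mysql-bin.1000000", Pos: 4}

	// A plain string comparison orders "999999" after "1000000", because
	// '9' > '1' at the first differing character.
	fmt.Println(strings.Compare(current.Name, target.Name)) // 1, i.e. incorrectly "greater"

	// Position.Compare in the vendored library used this string comparison;
	// with v1.3.0 the numeric suffixes are compared, so this should return -1.
	fmt.Println(current.Compare(target))
}
```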
This should be an extremely rare condition, but a serious one, as no online verifier would be able to catch this kind of data loss.