online DDL affect Binlog #9304

july2993 · 2019-02-14T08:09:40Z

About the issue

this pr #9207 fix the issue TOOL-881 (internel only)
the issue is like we may get the following sequence Binlog after sorting by commit TS

...
create database test1; (SchemaVersion = 591
insert into test1.test1.... (SchemaVersion = 592  
create table test1.test1 (SchemaVersion = 592
....

the insert into test1.test will fail to execute at downstream, because we having not execute the create table DDL at downstream yet.

New issue introduced

#9207
fix this issue, but introduce another problem. That is we may receive the old SchemaVersion DML after we have execute the DDL at downstream(with multi TiDB instances).
like:

DDL on table a  (SchemaVersion = 500
DML1  (later DML on table a SchemaVersion may be 500 or 499)
DML2  
...

this means we must handle the case with one older SchemaVersion DML
for different DDL.

How to handle for different DDL

drop table/db
public -> write-only -> delete-only - None
ok - will not get old SchemaVersion DML (the table/db is not visible at delete-only schema state
create table/db
None -> Public
ok - will not get old SchemaVersion DML (having not this table/db yet)
add index/column （index take no effect， just consider column）
None -> delete-only -> write-only -> public
TiDB should make sure the old schema version DML will write the added column value at write-only(or should be write-reorganization) schema state, so we can get this column value while syncing to downstream.
drop index/column （index take no effect， just consider column）
public -> write-only -> delete-only -> None
ok - the old SchemaVersion at delete-only can or can not write the drop column to Binlog, if it writes to Binlog, we can just ignore this column.
modify column
currently no intermediate schema state, just change meta info.
the older schema version DML can always write to the table with new schema version
(may be change if we support change column type later)
truncate table
no intermediate schema state,TiDB will use a new table id and GC the old table id data after DDL job is synced。
After truncate table at downstream we may receive the old schema version DML(with old table id), we must discard this DML.
rename table
no intermediate schema state
For example: RENAME TABLE a TO b, we must change name from a to b before sync to downstream.
…

conclusion

TiDB should make sure the old schema version DML will write the added column value at write-only(or should be write-reorganization) schema state.
If we support modify column type later, we must consider the effect on Binlog.

The text was updated successfully, but these errors were encountered:

IANTHEREAL · 2019-02-14T08:29:40Z

Why do we have to populate the value of the new column in tidb？ Version compatibility decide it.

think

if we call function of a package in tidb to populate default values, the function used by binlog is not same with source TiDB (different tidb versions, we had changed it implemention), it may cause data inconsistent.
if we want to synchronize data to mysql， implemention of populating default values is not same, it may also cause data inconsistent.

Unless we can guarantee that they are exactly the same

crazycs520 · 2019-02-14T08:50:09Z

TiDB will fill the column default value if the column is write_only or not specify by user. So I think after binlog syncer, binlog server no need (and shouldn't) to fill the default value.

IANTHEREAL · 2019-02-14T08:53:01Z

nice, what about drop column, would tidb ignore the delete-only column?

crazycs520 · 2019-02-14T08:58:10Z

TiDB will fill default value if column in public or write_only, otherwise(in delete only or none), TiDB won't fill default value. So in drop column processing, TiDB will fill default value in public and write_only state.

IANTHEREAL · 2019-02-14T09:26:26Z

so we don't need change anything, and do you have any problem with alove solution How to handle for different DDL?

see issue pingcap/tidb#9304

IANTHEREAL · 2019-02-18T07:19:09Z

how about now？ @crazycs520

crazycs520 · 2019-02-18T13:00:26Z

I have considered all conceivable DDL that I can think of. I think there is no problem in the issue described.

Ignore DML if the table_id was already dropped.
Binlog server does not backfill the column default value by itself, just read from the key value.
@july2993 @GregoryIan

july2993 · 2019-02-21T06:45:16Z

According to my test, there's two problem if we get data from DML Binlog and use the newest Schema receive by DDL Binlog

TiDB will not always write the column value into Binlog

if the column's default value is null, and the value equals to that, this column will not be be write into Binlog;https://github.com/pingcap/tidb/blob/master/table/tables/tables.go#L1015
so when modify a column from default NULL to other value, we can't not get the value from Binlog, and will get the wrong default value if using the newest Schema

DDL1 on table to change column `c1` from defalut null to default 1 (SchemaVersion = 500
DML1  (DML with older SchemaVersion, `c1` is NULL, and will not be wirte into Binlog, if we set default value as the newest schema to be 1, then data will inconsistent)
DML2  (DML with new schemVersion after change default value is ok
...

may miss one column value from Binlog when drop column

this issue exist even before change the way to write DDL Binlog at #9207

as a example, for drop column c1
DML0 means a DML will older schema version at deltete-only state, it will not wirte the column to Binlog(the column is invisible)

DML0   (Binlog data without column c1, but the column has not being dropped at downstream)
DDL1     (we drop the column c1 at this point)
DML0   (Binlog data without column c1, it 's ok because the column is also dropped at downstream)
...

it's tricky for DML0 before the DDL, for table with primary key or unique index, it's ok to use default value as the column value, because the column will be dropped later, and we will not use the column as
where condition( where c1 = ? ), but if using the column as a where condition( where c1 = ? ) , then we must get the right value from TiDB.

For simpleness, we decide to fill the column value in such way.

if the column can be null, use null
if the the column has default value, use default value.
use zero value for that column type

see issue pingcap/tidb#9304 we drop the truncated table dml data.

see issue pingcap/tidb#9304 skip the truncated table DML data. refactor some translator code in ./drainer add some ddl test case change tests/run.sh to be easy start multi instance TiDB for test

fzhedu · 2020-08-25T11:14:41Z

@zimulala @crazycs520 Does #9207 fix this issue? if yes, close this issue, Pls.

IANTHEREAL assigned winkyao Feb 14, 2019

july2993 added a commit to pingcap/tidb-binlog that referenced this issue Feb 15, 2019

Fix affect of online DDL

4a87b16

see issue pingcap/tidb#9304

july2993 added a commit to pingcap/tidb-binlog that referenced this issue Feb 24, 2019

Fix affect of online DDL

6b29d3b

see issue pingcap/tidb#9304 we drop the truncated table dml data.

july2993 mentioned this issue Feb 25, 2019

ddl: write binlog when ddl job state is done. #9207

Merged

ghost added the type/bug The issue is confirmed as a bug. label Jul 23, 2020

ghost added the sig/sql-infra SIG: SQL Infra label Jul 28, 2020

july2993 closed this as completed Aug 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

online DDL affect Binlog #9304

online DDL affect Binlog #9304

july2993 commented Feb 14, 2019

IANTHEREAL commented Feb 14, 2019

crazycs520 commented Feb 14, 2019

IANTHEREAL commented Feb 14, 2019

crazycs520 commented Feb 14, 2019

IANTHEREAL commented Feb 14, 2019

IANTHEREAL commented Feb 18, 2019

crazycs520 commented Feb 18, 2019

july2993 commented Feb 21, 2019 •

edited

Loading

fzhedu commented Aug 25, 2020

online DDL affect Binlog #9304

online DDL affect Binlog #9304

Comments

july2993 commented Feb 14, 2019

About the issue

New issue introduced

How to handle for different DDL

conclusion

IANTHEREAL commented Feb 14, 2019

crazycs520 commented Feb 14, 2019

IANTHEREAL commented Feb 14, 2019

crazycs520 commented Feb 14, 2019

IANTHEREAL commented Feb 14, 2019

IANTHEREAL commented Feb 18, 2019

crazycs520 commented Feb 18, 2019

july2993 commented Feb 21, 2019 • edited Loading

fzhedu commented Aug 25, 2020

july2993 commented Feb 21, 2019 •

edited

Loading