-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
schema fingerprint verifier #292
base: main
Are you sure you want to change the base?
Conversation
4399e6d
to
acf14fa
Compare
fix go tests remove pry changes changes fix tests
aaa1229
to
d83ec42
Compare
changes
8192736
to
031bd4c
Compare
You imported some svg and png files. Is this intended? |
Yeah, these files depict the performance graphs for offcpu and and cpu profiles in case anyone wants to take a look. Will delete it once the PR is ready to merge. |
You can just paste it directly in either a comment here or the PR description, would make it easier for people to see. |
ferry.go
Outdated
@@ -686,6 +716,13 @@ func (f *Ferry) Run() { | |||
}() | |||
} | |||
|
|||
schemaFingerVerifierPrintWg := &sync.WaitGroup{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could probably just reuse the supportingServicesWg
instead of having another one here.
ferry.go
Outdated
var ( | ||
binlogVerifyStore *BinlogVerifyStore = nil | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for my own knowledge: what prompted this change? A linting error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seems like it. Will change it back.
schema_fingerprint_verifier.go
Outdated
|
||
type SchemaFingerPrintVerifier struct { | ||
SourceDB *sql.DB | ||
TableRewrites map[string]string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need DatabaseRewrites as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a bit confused as to what the purpose of TableRewrites
is. Actually I only DatabaseRewrites
.
schema_fingerprint_verifier.go
Outdated
return schemaFingerPrints, err | ||
} | ||
|
||
_, isIgnored := table.IgnoredColumnsForVerification[string(rowData[3].([]byte))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait. We ignore certain columns for verification for a very specific reason (see below). We shouldn't ignore it here as we can corrupt data if that column definition changed.
Lines 688 to 699 in e605f11
// This config is also for inline verification for the same special case of | |
// Ghostferry as documented with the CompressedColumnsForVerification option: | |
// | |
// - If you're copying a table where the data is partially already on the | |
// the target through some other means. | |
// - A difference in a particular column could be acceptable. | |
// - An example would be a table with a data field and a created_at field. | |
// Maybe the created_at field is not important for data integrity as long | |
// as the data field is correct. | |
// - Putting the column in this config will cause the InlineVerifier to skip | |
// this column for verification. | |
IgnoredColumnsForVerification ColumnIgnoreConfig |
|
||
schemaData := [][]interface{}{} | ||
for rows.Next() { | ||
rowData, err := ScanGenericRow(rows, 21) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you write a comment about this magic 21? I think this is the number of columns in information_schema.columns
, but if you have a comment here guiding the reader, then the reader doesn't have to go look on their own.
schema_fingerprint_verifier.go
Outdated
dbSet[table.Schema] = struct{}{} | ||
|
||
query := fmt.Sprintf("SELECT * FROM information_schema.columns WHERE table_schema = '%s' ORDER BY table_name, column_name", table.Schema) | ||
rows, err := sf.SourceDB.Query(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we do this for the target DB too, if we want to be generic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct, we should check both source and target.
schema_fingerprint_verifier.go
Outdated
} | ||
|
||
hash := md5.Sum([]byte(schemaDataInBytes)) | ||
schemaFingerPrints[table.Schema] = hex.EncodeToString(hash[:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we store a single hash, instead of an array of hash? This map is stored into the state dump. If we have a lot of tables, it can be pretty big. A single hash is most likely good enough and save a lot of space.
ferry.go
Outdated
fingerPrint := map[string]string{} | ||
if f.StateToResumeFrom != nil && f.StateToResumeFrom.SchemaFingerPrint != nil { | ||
fingerPrint = f.StateToResumeFrom.SchemaFingerPrint | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this code gives me a bit of pause. There's currently a race condition I think:
- Ghostferry starts copying rows
- within 1 second, the table schema changed
- The schema finger print verifier reads the schema and thinks the schema is unchanged.
Did I miss something and this situation is not possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for clarity currently the following happens when you run the migration with copydb
- ghostferry creates table and dbs on the target before running the ferry - https://github.com/Shopify/ghostferry/blob/master/copydb/cmd/main.go#L127
- the ferry starts running and that's when the schema_fingerprint is started in background.
Also just to avoid confusion by table schema
I mean the information we retrieve from information_schema.columns
table from mysql and not the TableSchema
struct we use internally in ghostferry.
I am not sure I understand your point about after ghostferry starts copying rows why would the table schema would have changed. Copying rows shouldn't alter the schema structure of the table. Here is what an example output is when we try to retrieve the schema of a table.
mysql> SELECT * FROM information_schema.columns WHERE table_schema = 'sakila' ORDER BY table_name, column_name LIMIT 1\G
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: sakila
TABLE_NAME: actor
COLUMN_NAME: actor_id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: smallint
CHARACTER_MAXIMUM_LENGTH: NULL
CHARACTER_OCTET_LENGTH: NULL
NUMERIC_PRECISION: 5
NUMERIC_SCALE: 0
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: smallint unsigned
COLUMN_KEY: PRI
EXTRA: auto_increment
PRIVILEGES: select,insert,update,references
COLUMN_COMMENT:
GENERATION_EXPRESSION:
SRS_ID: NULL
1 row in set (0.00 sec)
Have posted the CPU profile, but since the off CPU profile (basically flamegraph) is in svg format, I had to attach it as a file. |
What I usually do is either convert it into a rasterized format like png, or take a screenshot of the SVG and paste it in. |
fixes more fixes
7fa52dc
to
2afb0c3
Compare
Introduction
Introducing Schema Verifier
This componnent currently collects source and target schema fingerprints periodically (1 minute by default) and verifies that the schema has not been changed on the source and target while the migration is still going on.
Technical details
I don't think there is anything special about the code written, most if it is self-explanotary.
Performace
Collecting Schema fingerprint every 1 minute should not affect performance in any sort for the migration as the interval is just too high, although I did some performance testing by setting the
PeriodicallyVerifySchemaFingerprintInterval = 50ms
, just for curiosity on how would the performance affect.I collected for 30s cpuprofile (using
pprof
) and offcpuprofile using (offcputime-bpfcc
), but didn't notice any performance issues at all.offcpu profile
cpu profile