-
Notifications
You must be signed in to change notification settings - Fork 66
restore: check row value count to avoid unexpected encode result #528
Conversation
c95b6d3
to
9b63146
Compare
/run-all-tests |
1 similar comment
/run-all-tests |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what bug does this PR fix?
We met a case last week that the source file columns are more than the table schema, and this caused tidb backend panic at tidb-lightning/lightning/backend/tidb.go Line 251 in c2b6dde
And for local/importer backend, whether it can result in correct result depends on the implementation of tidb table encoder, thus it is undefined behavior. |
maybe easier to check or maybe do it inside
pretty sure they will just be ignored. we also need to support "ignore_extra_columns" in Fast Bulk Import i.e. it is legal to have a CSV in the form
with the column "ignored" skipped. |
ok, we could delay resolve this issue to the implementation of "ignore_extra_columns" since this is an ergent thing. |
With a CSV without header, or it's an invalid csv with the fields in some line more than in header, maybe we still should check it in the tidb encoder |
i mean we should still check it, but we should do it |
got. I'll move it into the tidb encoder since tableKvEncoder can properly handle it. |
…to check-read-row
@kennytm I have moved the check into tidb encoder, and fix an extra bug related to |
b6e33cc
to
e104101
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest LGTM (and fix the test)
lightning/backend/tidb.go
Outdated
} else { | ||
return cols[index] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} else { | |
return cols[index] | |
} | |
return cols[index] |
strange that it is not linted
lightning/backend/tidb.go
Outdated
// See: tests/generated_columns/data/gencol.various_types.0.sql this sql has no columns, so encodeLoop will fill the | ||
// column permutation with default, thus enc.columnCnt > len(row). | ||
if len(row) > enc.columnCnt { | ||
log.L().Error("column count mismatch", zap.Ints("column_permutation", columnPermutation), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.L().Error("column count mismatch", zap.Ints("column_permutation", columnPermutation), | |
logger.Error("column count mismatch", zap.Ints("column_permutation", columnPermutation), |
@lance6716 PTAL again, since it has changed a log. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest LGTM
588e5b6
to
551ce93
Compare
/run-all-tests
|
Still need to resolve the integration test |
(LGTM, bot has already count my lgtm) |
@kennytm PTAL again |
/lgtm |
* check row value count to avoid unexpected encode result * check the '_tidb_row_id' field * resolve comments * fix issue related to '_tidb_rowid' and move column count to tidb encoder * add tidb_opt_write_row_id session var * fix test * resolve comments * update tidb to apply tidb#22062 * fix test
What problem does this PR solve?
Check row field count before encodes, if row value count is bigger than table field count, directly return an error._tidb_rowid
ingetColumnNames
and tidb encoderWhat is changed and how it works?
Check List
Tests
Side effects
Related changes
Release Note