Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: do change column data when binary to binary with different length #24681

Merged
merged 4 commits into from
Jul 22, 2021

Conversation

LENSHOOD
Copy link
Contributor

What problem does this PR solve?

Issue Number: close #3644

Problem Summary:

  1. Column change with binary to binary should be disallowed when tidb_enable_change_column_type = 0
  2. When modify column from binary to binary with flag tidb_enable_change_column_type = 1, then no matter the new binary length is great or less than the original one, change column data should be performed.

What is changed and how it works?

What's Changed:

  1. Need reorg when alter binary - binary with different length.
  2. Change column data when alter binary - binary with different length.

Check List

Tests

  • Integration test

Release note

  • Change column data when alter binary to binary with different length

@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 17, 2021
@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label May 17, 2021
@LENSHOOD
Copy link
Contributor Author

/cc @AilinKid

@ti-chi-bot ti-chi-bot requested a review from AilinKid May 17, 2021 07:16
@LENSHOOD
Copy link
Contributor Author

@ti-srebot /run-all-tests

@ti-srebot
Copy link
Contributor

/run-all-tests

Copy link
Member

@wjhuang2016 wjhuang2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we plan to remove tidb_enable_change_column_type, after the column type change is GA, we can support it directly.

@LENSHOOD
Copy link
Contributor Author

we plan to remove tidb_enable_change_column_type, after the column type change is GA, we can support it directly.

right, then the only difference would be need to do reorg when binary expand.

@wjhuang2016
Copy link
Member

/cc @AilinKid

ddl/column.go Show resolved Hide resolved
ddl/ddl_api.go Outdated Show resolved Hide resolved
mb := getModifyColumn(c, s.s.(sessionctx.Context), "test", "tt", "a", false)
c.Assert(mb.FieldType.Flen, Equals, 10)
tk.MustQuery("select * from tt").Check(testkit.Rows("111\x00\x00\x00\x00\x00\x00\x00", "10000\x00\x00\x00\x00\x00"))
tk.MustGetErrMsg("alter table tt change a a binary(4);", "[types:1406]Data Too Long, field len 4, data len 10")
Copy link
Contributor

@AilinKid AilinKid May 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can add more tests, for example, char type which also has a binary opt.

Copy link
Contributor

@AilinKid AilinKid May 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mysql> alter table t modify b char(33) binary;
Query OK, 0 rows affected, 1 warning (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 1

However, in MySQL, the length won't change during alter binary(22) to char(33) binary

I think TiDB may have some incompatibility here.

Copy link
Contributor Author

@LENSHOOD LENSHOOD May 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior of MySQL is reasonable.

char(33) binary defines the collation to "_bin", the ordering/comparison behavior may changed, but char always right-padded with spaces to the specified length according to here. So length is not influenced by collation. (BTW, char(33) binary become ambiguous since MySQL 8.0 provide two _bin collations to utf8mb4)

For the char type, the retrieved value would keep the trailing spaces only if SQL_MODE contains PAD_CHAR_TO_FULL_LENGTH, which is disable at default. That's why Length() result not changed. (On the contrary, binary padding with 0x00 and always keep the padding bytes at retrieve, so the 0x00 can be count at Length())

If we add PAD_CHAR_TO_FULL_LENGTH to sql_mode, execute alter table t modify b char(33) binary;, we can get length = 33.


But when it comes to TiDB, column change to different charset still unsupported now. So I think maybe it's better to add such tests along with the feature development.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every Impressive, thanks LENSHOOD

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 1, 2021
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 2, 2021
@@ -708,6 +708,11 @@ func needChangeColumnData(oldCol, newCol *model.ColumnInfo) bool {
return isElemsChangedToModifyColumn(oldCol.Elems, newCol.Elems)
case mysql.TypeTiny, mysql.TypeShort, mysql.TypeInt24, mysql.TypeLong, mysql.TypeLonglong:
return toUnsigned != originUnsigned
case mysql.TypeString:
// Due to the behavior of padding \x00 at binary type, always change column data when binary length changed
if types.IsBinaryStr(&oldCol.FieldType) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

varbinary doesn't have padding \x00

@wjhuang2016
Copy link
Member

@LENSHOOD Could you update this PR?

@LENSHOOD
Copy link
Contributor Author

@LENSHOOD Could you update this PR?

I've missed your last comment about "varbinary" !
Sorry about that.

Anyway, varbinary doesn't have to add padding, indeed.

But due to the varbinary 's FieldType.Tp == 15 (TypeVarchar) and binary's FieldType.Tp == 0xfe (TypeString), so check varbinary will never goes into the if types.IsBinaryStr(&oldCol.FieldType) {... logic.

(am I understand you comment correctly?)

@wjhuang2016
Copy link
Member

@LENSHOOD Could you update this PR?

I've missed your last comment about "varbinary" !
Sorry about that.

Anyway, varbinary doesn't have to add padding, indeed.

But due to the varbinary 's FieldType.Tp == 15 (TypeVarchar) and binary's FieldType.Tp == 0xfe (TypeString), so check varbinary will never goes into the if types.IsBinaryStr(&oldCol.FieldType) {... logic.

(am I understand you comment correctly?)

Okay, I check it and it's expected.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 21, 2021
Copy link
Contributor

@AilinKid AilinKid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AilinKid
  • wjhuang2016

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 22, 2021
@wjhuang2016
Copy link
Member

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: f4b49f5

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 22, 2021
@AilinKid
Copy link
Contributor

Since it's not a critical bug, or user requirements, we will not cherry-pick it release 5.1

@ti-chi-bot ti-chi-bot merged commit 11bc9c1 into pingcap:master Jul 22, 2021
@LENSHOOD
Copy link
Contributor Author

@AilinKid @wjhuang2016
Thanks for review!

@zhouqiang-cl
Copy link
Contributor

This pr break integration tests https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_common_test/detail/tidb_ghpr_common_test/5633/

"run test [alter_table1] err: sql:SELECT HEX(s) FROM t1;: failed to run query \n\"SELECT HEX(s) FROM t1;\" \n around line 169, \nwe need(51):\nSELECT HEX(s) FROM t1;\nHEX(s)\n7465737400000000\nSELE\nbut got(51):\nSELECT HEX(s) FROM t1;\nHEX(s)\n74657374000000000000\n\n"

@LENSHOOD
Copy link
Contributor Author

This pr break integration tests https://ci.pingcap.net/blue/organizations/jenkins/tidb_ghpr_common_test/detail/tidb_ghpr_common_test/5633/

"run test [alter_table1] err: sql:SELECT HEX(s) FROM t1;: failed to run query \n\"SELECT HEX(s) FROM t1;\" \n around line 169, \nwe need(51):\nSELECT HEX(s) FROM t1;\nHEX(s)\n7465737400000000\nSELE\nbut got(51):\nSELECT HEX(s) FROM t1;\nHEX(s)\n74657374000000000000\n\n"

Sorry, I cannot access the repo tidb_ghpr_common_test, so I may not able to find out what's going on with the error.😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

length result is wrong after modify column length
6 participants