ddl: allow more charset/collation modifications for database/table #10958

bb7133 · 2019-06-27T04:04:28Z

What problem does this PR solve?

Allow modifying collations of databases/tables when their charsets are utf8/utf8mb4.

For example, some TiDB users want to do the following things:

tidb> create table t(a int);
Query OK, 0 rows affected (0.01 sec)

tidb> show create table t;
+-------+-----------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                              |
+-------+-----------------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+-----------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

tidb> alter table t default charset utf8mb4 collate utf8mb4_unicode_ci;

Before this PR, an error is returned:

ERROR 1105 (HY000): unsupported modify collate from utf8mb4_bin to utf8mb4_unicode_ci

This PR fixes this error.

What is changed and how it works?

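Some restriction checks are loosened: the collation of a database or table can now be changed while its charset stays utf8 or utf8mb4.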

Check List

Tests

Integration test

Code changes

Has exported function/method change

Related changes

Need to cherry-pick to the release branch
Need to update the documentation

winkyao

LGTM

codecov · 2019-06-27T04:09:23Z

Codecov Report

Merging #10958 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #10958   +/-   ##
===========================================
  Coverage   81.0421%   81.0421%           
===========================================
  Files           419        419           
  Lines         89662      89662           
===========================================
  Hits          72664      72664           
  Misses        11750      11750           
  Partials       5248       5248

coocood · 2019-06-27T06:00:24Z

But we don't support case-insensitive collations.

Deardrops · 2019-06-27T06:46:13Z

ddl/ddl_api.go

-	if toCharset == charset.CharsetUTF8MB4 && origCharset == charset.CharsetUTF8 {
+	if (origCharset == charset.CharsetUTF8 && toCharset == charset.CharsetUTF8MB4) ||
+		(origCharset == charset.CharsetUTF8 && toCharset == charset.CharsetUTF8) ||
+		(origCharset == charset.CharsetUTF8MB4 && toCharset == charset.CharsetUTF8MB4) {


This change seems unnecessary, because L2353 already checks the case where toCharset is the same as origCharset.

bb7133: Hi @Deardrops, L2353 is used to report the error message. If the check (L2346 ~ L2348) passes, nil is returned, and this change is what allows changing the collation when the charset is not changed.

Deardrops: Sorry, I pointed to the wrong place before. It is not L2353 but L2361, where there is also a return nil for the case where toCharset is the same as origCharset.

bb7133: The intent of the code is to allow changing the collation while keeping the charset unchanged, if the charset is utf8 or utf8mb4.

Please check the test case https://github.com/pingcap/tidb/pull/10958/files/e23ba9e480618ed994f176cd4f7fdbb2f02b850d#diff-703ae6b7872b425273d1832c198598c8R1751 for this logic. @Deardrops
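For readers following the thread, the relaxed condition from the diff can be sketched in isolation. This is a minimal illustration, not the code in ddl/ddl_api.go: the helper name canModifyCharset and the string constants are hypothetical stand-ins for charset.CharsetUTF8 and charset.CharsetUTF8MB4, and the surrounding checks discussed above (the L2346 ~ L2361 region) are not modeled.

```go
package main

import "fmt"

// Hypothetical stand-ins for charset.CharsetUTF8 and charset.CharsetUTF8MB4.
const (
	charsetUTF8    = "utf8"
	charsetUTF8MB4 = "utf8mb4"
)

// canModifyCharset is a hypothetical helper that mirrors only the boolean
// condition added in the diff. It reports whether the (orig, to) charset
// pair is one of the three accepted transitions.
func canModifyCharset(origCharset, toCharset string) bool {
	return (origCharset == charsetUTF8 && toCharset == charsetUTF8MB4) ||
		(origCharset == charsetUTF8 && toCharset == charsetUTF8) ||
		(origCharset == charsetUTF8MB4 && toCharset == charsetUTF8MB4)
}

func main() {
	// Same charset, so only the collation can differ: accepted.
	fmt.Println(canModifyCharset("utf8mb4", "utf8mb4")) // true
	fmt.Println(canModifyCharset("utf8", "utf8"))       // true
	// The upgrade that was already accepted before the change.
	fmt.Println(canModifyCharset("utf8", "utf8mb4")) // true
	// Not covered by this condition.
	fmt.Println(canModifyCharset("utf8mb4", "utf8")) // false
}
```

The two same-charset branches are what let a statement such as the ALTER TABLE in the PR description pass this particular condition; whether other checks in the real function handle further cases is outside this sketch.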

bb7133 · 2019-06-27T07:16:01Z

> But we don't support case-insensitive collations.

Some users complained that in TiDB, creating a table with some collations like utf8mb4_unicode_ci is supported but the collation cannot be altered.

crazycs520

LGTM

tangenta · 2019-07-02T16:05:52Z

> > But we don't support case-insensitive collations.
>
> Some users complained that in TiDB, creating a table with some collations like utf8mb4_unicode_ci is supported but the collation cannot be altered.

I am not familiar with TiDB's charset handling. Since TiDB does not support case-insensitive collations, would it be better to disallow creating utf8mb4_unicode_ci tables?

bb7133 · 2019-07-03T07:35:29Z

> > > But we don't support case-insensitive collations.
> >
> > Some users complained that in TiDB, creating a table with some collations like utf8mb4_unicode_ci is supported but the collation cannot be altered.
>
> I am not familiar with TiDB's charset handling. Since TiDB does not support case-insensitive collations, would it be better to disallow creating utf8mb4_unicode_ci tables?

Some TiDB users need this syntax, but they don't really care whether the collation is actually case-insensitive.
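To make the distinction in this exchange concrete, here is a small sketch of binary versus case-insensitive string equality. It is only an illustration: the helper names are hypothetical, nothing here is TiDB code, and Go's strings.EqualFold (Unicode simple case folding) is an approximation of a _ci collation rather than the actual utf8mb4_unicode_ci rules.

```go
package main

import (
	"fmt"
	"strings"
)

// equalBinary compares two strings byte for byte, which is how a binary
// collation such as utf8mb4_bin treats equality.
func equalBinary(a, b string) bool {
	return a == b
}

// equalCaseInsensitive approximates a case-insensitive collation using
// Unicode simple case folding. Real _ci collations follow their own rules.
func equalCaseInsensitive(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(equalBinary("abc", "ABC"))          // false
	fmt.Println(equalCaseInsensitive("abc", "ABC")) // true
}
```

If a collation name is accepted as metadata while comparisons stay binary, as the comments above suggest, the two strings would still compare as different.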

zimulala · 2019-07-03T08:00:02Z

ddl/ddl_api.go

-	if toCharset == charset.CharsetUTF8MB4 && origCharset == charset.CharsetUTF8 {
+	if (origCharset == charset.CharsetUTF8 && toCharset == charset.CharsetUTF8MB4) ||
+		(origCharset == charset.CharsetUTF8 && toCharset == charset.CharsetUTF8) ||
+		(origCharset == charset.CharsetUTF8MB4 && toCharset == charset.CharsetUTF8MB4) {
 		// TiDB only allow utf8 to be changed to utf8mb4.


Do we need to update this comment?

bb7133: Updated, PTAL @zimulala

tangenta

LGTM

bb7133 · 2019-07-04T08:43:58Z

/rebuild

bb7133 · 2019-07-04T08:52:05Z

/run-all-tests

bb7133 requested a review from crazycs520 June 27, 2019 04:04

bb7133 added the component/DDL-need-LGT3 label Jun 27, 2019

winkyao reviewed Jun 27, 2019

Deardrops reviewed Jun 27, 2019

crazycs520 reviewed Jun 27, 2019

zimulala added the status/LGT2 Indicates that a PR has LGTM 2. label Jun 27, 2019

zimulala reviewed Jul 3, 2019

tangenta reviewed Jul 3, 2019

bb7133 added 2 commits July 3, 2019 19:05

ddl: allow more charset/collation modifications for database/table

36743d0

address comments

fc638a2

bb7133 force-pushed the bb7133/alter_charset branch from e23ba9e to fc638a2 Compare July 3, 2019 11:05

zimulala approved these changes Jul 4, 2019

Merge branch 'master' into bb7133/alter_charset

79f4300

bb7133 added 2 commits July 4, 2019 16:44

Merge branch 'master' into bb7133/alter_charset

56f6e98

Merge branch 'master' into bb7133/alter_charset

7ee8302

bb7133 merged commit 247a07f into pingcap:master Jul 4, 2019

bb7133 mentioned this pull request Jul 4, 2019

ddl: allow more charset/collation modifications for database/table (#10958) #11085

Merged

bb7133 added a commit to bb7133/tidb that referenced this pull request Jul 4, 2019

ddl: allow more charset/collation modifications for database/table (p…

a8b5ef7

…ingcap#10958)

bb7133 mentioned this pull request Jul 4, 2019

ddl: allow more charset/collation modifications for database/table (#10958) #11086

Merged

bb7133 added a commit to bb7133/tidb that referenced this pull request Jul 5, 2019

ddl: allow more charset/collation modifications for database/table (p…

4a8d083

…ingcap#10958)

you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020

bb7133 deleted the bb7133/alter_charset branch December 29, 2023 18:43
