Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different partition locations for the same key when SQL mode is differnt #54271

Closed
lcwangchao opened this issue Jun 27, 2024 · 2 comments · Fixed by #54101
Closed

Different partition locations for the same key when SQL mode is differnt #54271

lcwangchao opened this issue Jun 27, 2024 · 2 comments · Fixed by #54101
Labels
affects-8.1 This bug affects the 8.1.x(LTS) versions. severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.

Comments

@lcwangchao
Copy link
Collaborator

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

> set @@sql_mode='allow_invalid_dates';
> create table t(a datetime primary key) partition by range columns(a) (
    partition p0 values less than ('2020-02-31 00:00:00'),
    partition p1 values less than (MAXVALUE)
)
> insert into t values('2020-03-01 00:00:00')
> set @@sql_mode='';
> insert into t values('2020-03-01 00:00:00')

2. What did you expect to see? (Required)

The second insert should report an error because a is an primary key.

3. What did you see instead (Required)

The second insert succeeds. And we can see both of them can be found:

TiDB root@127.0.0.1:test> select * from t;
+---------------------+
| a                   |
+---------------------+
| 2020-03-01 00:00:00 |
| 2020-03-01 00:00:00 |
+---------------------+
2 rows in set
Time: 0.010s
TiDB root@127.0.0.1:test> select * from t partition(p0);
+---------------------+
| a                   |
+---------------------+
| 2020-03-01 00:00:00 |
+---------------------+
1 row in set
Time: 0.005s
TiDB root@127.0.0.1:test> select * from t partition(p1);
+---------------------+
| a                   |
+---------------------+
| 2020-03-01 00:00:00 |
+---------------------+
1 row in set
Time: 0.007s

The located in different partitions

4. What is your TiDB version? (Required)

Master. But release version may also have this issue.

@lcwangchao
Copy link
Collaborator Author

lcwangchao commented Jun 27, 2024

The root cause of the issue seems a little complex. First of all, the expressions in the partition table will be parsed as an expression.Expression object which will be stored as an global cache in memory:

partitionExpr, err := newPartitionExpr(tblInfo, pi.Type, pi.Expr, pi.Columns, pi.Definitions)

func newPartitionExpr(tblInfo *model.TableInfo, tp model.PartitionType, expr string, partCols []model.CIStr, defs []model.PartitionDefinition) (*PartitionExpr, error) {
// a partitioned table cannot rely on session context/sql modes, so use a default one!
ctx := mock.NewContext()
dbName := model.NewCIStr(ctx.GetSessionVars().CurrentDB)
columns, names, err := expression.ColumnInfos2ColumnsAndNames(ctx, dbName, tblInfo.Name, tblInfo.Cols(), tblInfo)

In this case, the upper bound of partition p0 is lt(t.a, cast(2021-02-31 00:00:00, datetime)). In most cases, the second argument of lt will be folded to a datetime constant and the cast will dispear. However, in this case, the fold operation will fail because 2021-02-31 00:00:00 is not a valid datetime when ALLOW_INVALID_DATES is not set in context (this context is a mock context and is not related with the current environment). Then the parsed expression will keep cast in it:

value, err := x.Eval(ctx.GetEvalCtx(), chunk.Row{})
retType := x.RetType.Clone()
if !hasNullArg {
// set right not null flag for constant value
switch value.Kind() {
case types.KindNull:
retType.DelFlag(mysql.NotNullFlag)
default:
retType.AddFlag(mysql.NotNullFlag)
}
}
if err != nil {
logutil.BgLogger().Debug("fold expression to constant", zap.String("expression", x.ExplainInfo(ctx.GetEvalCtx())), zap.Error(err))
return expr, isDeferredConst
}

After the expression finished building, we started to insert 2020-03-01 00:00:00 into this table. However, the result of lt(t.a, cast(2021-02-31 00:00:00, datetime)) depends on the SQL mode:

  • If SQL mode is "ALLOW_INVALID_DATES", the above expression will return false.
  • If SQL mode is empty "", the above expression will return NULL because cast(2021-02-31 00:00:00, datetime) returns NULL.

When the compare returns NULL, it will be recognized as true actually:

if isNull {
// If the column value used to determine the partition is NULL, the row is inserted into the lowest partition.
// See https://dev.mysql.com/doc/mysql-partitioning-excerpt/5.7/en/partitioning-handling-nulls.html
return true // Always less than any other value (NULL cannot be in the partition definition VALUE LESS THAN).
}

That makes the same row located to the different partitions.

@lcwangchao
Copy link
Collaborator Author

I found this bug when I surveyed the test failure for PR #54101. The quick fix for this issue is to allow invalid dates in context to make the fold operation pass. But I'm not sure whether we have a better way to fix it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-8.1 This bug affects the 8.1.x(LTS) versions. severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant