Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The result of Replace into statement is different from mysql #24627

Closed
SilvaXiang opened this issue May 13, 2021 · 12 comments
Closed

The result of Replace into statement is different from mysql #24627

SilvaXiang opened this issue May 13, 2021 · 12 comments
Labels
severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.

Comments

@SilvaXiang
Copy link
Contributor

SilvaXiang commented May 13, 2021

Bug Report

1. Minimal reproduce step (Required)

1.Create a table test whose primary key is AUTO_INCREMENT
create table test(id float primary key AUTO_INCREMENT, col1 int);

2.use replace into statement to insert the data
replace into test(col1) values(1);
replace into test(col1) values(2);

  1. Query the data from table test
    select * from test;

2. What did you expect to see? (Required)

I expect to see the following result which is the same as the result from MySQL 8.0.20 and MySQL 5.7:
image

3. What did you see instead (Required)

But the TiDB return the different result:
image

4. What is your TiDB version? (Required)

Release Version: v5.0.0 Edition: Community Git Commit Hash: bdac0885cd11bdf571aad9353bfc24e13554b91c Git Branch: heads/refs/tags/v5.0.0 UTC Build Time: 2021-04-06 16:36:29 GoVersion: go1.13 Race Enabled: false TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306 Check Table Before Drop: false |

4. 5月24日 Update

When i set Primary key double and use the insert statement, The same thing will also happen that the increment is not 1.
create table test2(id double primary key AUTO_INCREMENT, col1 int);
replace into test2(col1) values(1);
insert into test2(col1) values(1);
replace into test2(col1) values(1);
insert into test2(col1) values(1);
replace into test2(col1) values(1);
replace into test2(col1) values(1);

The result in Mysql
image

The result in TiDB
image

But the increment setting in TiDB is as follows:
image

@SilvaXiang SilvaXiang added the type/bug The issue is confirmed as a bug. label May 13, 2021
@dveeden
Copy link
Contributor

dveeden commented May 14, 2021

When testing this try all 3 values for innodb_autoinc_lock_mode on MySQL. Maybe also try other storage engines (MyISAM, MEMORY, ...)

@SilvaXiang
Copy link
Contributor Author

When testing this try all 3 values for innodb_autoinc_lock_mode on MySQL. Maybe also try other storage engines (MyISAM, MEMORY, ...)

I'm sorry I didn't quite understand your meaning, but isn't it a mistake?

@dveeden
Copy link
Contributor

dveeden commented May 18, 2021

When testing this try all 3 values for innodb_autoinc_lock_mode on MySQL. Maybe also try other storage engines (MyISAM, MEMORY, ...)

I'm sorry I didn't quite understand your meaning, but isn't it a mistake?

You are comparing MySQL behavior to TiDB behavior. The behavior of MySQL is dependent on the innodb_autoinc_lock_mode, so that could influence the comparison.

However I don't think this will make any real difference in this case.

@XuHuaiyu XuHuaiyu added sig/sql-infra SIG: SQL Infra and removed sig/execution SIG execution labels May 24, 2021
@guo-shaoge guo-shaoge added sig/sql-infra SIG: SQL Infra and removed sig/execution SIG execution labels May 25, 2021
@guo-shaoge
Copy link
Collaborator

Looks like is realated to auto_increament. I am not sure if this behavior is expected.

@AilinKid
Copy link
Contributor

AilinKid commented May 25, 2021

did you turn on the switch alterPrimaryKey? it's because the table has implicit rowid sharing the allocation of increment column.

@tangenta
Copy link
Contributor

tangenta commented May 25, 2021

This is expected behavior. Because there is no integer primary key in the table, TiDB uses an implicit row_id to identify each row:

mysql> select *, _tidb_rowid from test;
+----+------+-------------+
| id | col1 | _tidb_rowid |
+----+------+-------------+
|  1 |    1 |           2 |
|  3 |    2 |           4 |
+----+------+-------------+
2 rows in set (0.00 sec)

The current implementation is that auto_increment ID and _tidb_rowid share the same allocator. You can see the first row consumes 1 for auto_increment ID and 2 for _tidb_rowid. The documentation also mentions this problem.

This behavior is counter-intuitive and there is an ancient issue: #982. I try to separate them in #20708, but it seems that we still need to consider the compatibility problems.

@SilvaXiang
Copy link
Contributor Author

SilvaXiang commented May 25, 2021

This is expected behavior. Because there is no integer primary key in the table, TiDB uses an implicit row_id to identify each row:

mysql> select *, _tidb_rowid from test;
+----+------+-------------+
| id | col1 | _tidb_rowid |
+----+------+-------------+
|  1 |    1 |           2 |
|  3 |    2 |           4 |
+----+------+-------------+
2 rows in set (0.00 sec)

The current implementation is that auto_increment ID and _tidb_rowid share the same allocator. You can see the first row consumes 1 for auto_increment ID and 2 for _tidb_rowid. The documentation also mentions this problem.

This behavior is counter-intuitive and there is an ancient issue: #982. I try to separate them in #20708, but it seems that we still need to consider the compatibility problems.

But this behavior is very counter-intuitive which will make some aggregate operations involving primary keys wrong, right?

@tangenta
Copy link
Contributor

What does "some aggregate operations" mean? Could you show an example?

Anyway, the application devs should not assume the values of the auto_increment column to be sequential or incremental.

@SilvaXiang
Copy link
Contributor Author

What does "some aggregate operations" mean? Could you show an example?

Anyway, the application devs should not assume the values of the auto_increment column to be sequential or incremental.

Some aggregate function such as Max(), Min(), Avg(), etc. For example ,this query result is counter-intuitive in this case:
select sum(id), avg(col1) from test
In many analytic workloads, aggregate functions involving primary keys may be common.(I guess.....)

@tangenta
Copy link
Contributor

I don't think accumulating an ID is meaningful. Why would an application depend on a distributed, implicit-allocated value? There are so many substitutions like sequence, ID generator at the application level...

@bb7133
Copy link
Member

bb7133 commented Jul 22, 2021

I agree with @tangenta that the allocation of auto_increment in TiDB is different with MySQL, this is by-design.

This issue is duplicated with #982.

@bb7133 bb7133 closed this as completed Jul 22, 2021
@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

9 participants