Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

infer functional dependencies from filters / join conditions and use them for only_full_group_by check #25089

Closed
eurekaka opened this issue Jun 3, 2021 · 0 comments · Fixed by #33567
Labels
feature/developing the related feature is in development sig/planner SIG: Planner type/compatibility type/enhancement The issue or PR belongs to an enhancement.

Comments

@eurekaka
Copy link
Contributor

eurekaka commented Jun 3, 2021

Development Task

Below is the output from mysql 5.7:

MySQL [test]> create table t1(a int not null);
Query OK, 0 rows affected (0.09 sec)

MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> create table t2(a int not null);
Query OK, 0 rows affected (0.02 sec)

MySQL [test]> insert into t2 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> select t1.a from t1 join t2 on t1.a = t2.a group by t2.a;
+---+
| a |
+---+
| 1 |
+---+
1 row in set (0.01 sec)

MySQL [test]> select t1.a from t1 left join t2 on t1.a = t2.a group by t2.a;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.a' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

MySQL [test]> select t1.a from t1 join t2 on t1.a = 1 and 2 = t2.a group by t2.a;
Empty set (0.00 sec)

MySQL [test]> select t1.a from t1 join t2 on t1.a = 1 and 2 < t2.a group by t2.a;
Empty set (0.00 sec)

MySQL [test]> select t1.a from t1 join t2 on t1.a > 1 and 2 = t2.a group by t2.a;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.a' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

MySQL [test]> select t1.a from t1 join t2 using(a) group by t2.a;
+---+
| a |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

MySQL [test]> select t1.a from t1 left join t2 using(a) group by t2.a;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.a' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

We can see that, MySQL infers functional dependencies from filters and join conditions, and uses them for only_full_group_by check. Compared with MySQL, TiDB uses a pretty naive approach for only_full_group_by check, and TiDB would report error for all the above SELECT statements.

PostgreSQL(12 devel) has same loose check as TiDB.

postgres=# create table t1(a int not null);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# create table t2(a int not null);
CREATE TABLE
postgres=# insert into t2 values(1);
INSERT 0 1
postgres=# select t1.a from t1 join t2 on t1.a = t2.a group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 join t2 on t1.a = t2.a group by t2.a;
               ^
postgres=# select t1.a from t1 left join t2 on t1.a = t2.a group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 left join t2 on t1.a = t2.a group by t2....
               ^
postgres=# select t1.a from t1 join t2 on t1.a = 1 and 2 = t2.a group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 join t2 on t1.a = 1 and 2 = t2.a group b...
               ^
postgres=# select t1.a from t1 join t2 on t1.a = 1 and 2 < t2.a group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 join t2 on t1.a = 1 and 2 < t2.a group b...
               ^
postgres=# select t1.a from t1 join t2 on t1.a > 1 and 2 = t2.a group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 join t2 on t1.a > 1 and 2 = t2.a group b...
               ^
postgres=# select t1.a from t1 join t2 using(a) group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 join t2 using(a) group by t2.a;
               ^
postgres=# select t1.a from t1 left join t2 using(a) group by t2.a;
ERROR:  column "t1.a" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select t1.a from t1 left join t2 using(a) group by t2.a;
               ^
postgres=# select version();
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/developing the related feature is in development sig/planner SIG: Planner type/compatibility type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants