Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect result from SELECT DISTINCT ON a distributed table (version 12.1-1 ) #7684

Open
scooreman opened this issue Sep 6, 2024 · 6 comments

Comments

@scooreman
Copy link

Query no longer returns the correct result after distributing table:

sandbox=# CREATE TABLE test (
attribute1 varchar(255),
attribute2 varchar(255),
attribute3 varchar(255)
);
CREATE TABLE

sandbox=# INSERT INTO test (attribute1, attribute2, attribute3)
VALUES ('Phone', 'John', 'A'),
('Phone', 'Eric', 'A'),
('Tablet','Eric', 'B');
INSERT 0 3

sandbox=# SELECT DISTINCT ON (T.attribute1, T.attribute2)
T.attribute1 as attribute1,
T.attribute3 as attribute2
FROM test T;
attribute1 | attribute2
------------+------------
Phone | A
Phone | A
Tablet | B
(3 rows)

sandbox=# SELECT create_distributed_table('test', 'attribute1');
NOTICE: Copying data from local table...
NOTICE: copying the data has completed
DETAIL: The local data in the table is no longer visible, but is still on disk.
HINT: To remove the local data, run: SELECT truncate_local_data_after_distributing_table($$"ford-dev".test$$)
create_distributed_table

(1 row)

sandbox=# SELECT DISTINCT ON (T.attribute1, T.attribute2)
T.attribute1 as attribute1,
T.attribute3 as attribute2
FROM test T;
attribute1 | attribute2
------------+------------
Phone | A
Tablet | B
(2 rows)

@scooreman scooreman changed the title Incorrect result from SELECT DISTINCT ON a distributed table Incorrect result from SELECT DISTINCT ON a distributed table (version 12.1-1 ) Sep 6, 2024
@c2main
Copy link
Contributor

c2main commented Sep 12, 2024

Confirmed, the problem really is with this part T.attribute3 as attribute2 (apparently confuse citus).

@c2main
Copy link
Contributor

c2main commented Sep 12, 2024

@scooreman if you can patch/compile citus, I think this patch is ok:

diff --git a/src/backend/distributed/deparser/ruleutils_16.c b/src/backend/distributed/deparser/ruleutils_16.c
index cc0da165b..fc76a1257 100644
--- a/src/backend/distributed/deparser/ruleutils_16.c
+++ b/src/backend/distributed/deparser/ruleutils_16.c
@@ -2568,9 +2568,21 @@ get_basic_select_query(Query *query, deparse_context *context,
                        {
                                SortGroupClause *srt = (SortGroupClause *) lfirst(l);
 
-                               appendStringInfoString(buf, sep);
-                               get_rule_sortgroupclause(srt->tleSortGroupRef, query->targetList,
-                                                                                false, context);
+                               /* Ensure the correct target list is used without alias swapping */
+                               TargetEntry *tle = get_sortgroupclause_tle(srt, query->targetList);
+
+                               /* Ensure the attribute names are preserved */
+                               if (tle && tle->resname != NULL)
+                               {
+                                       appendStringInfoString(buf, sep);
+                                       appendStringInfoString(buf, quote_identifier(tle->resname));
+                               }
+                               else
+                               {
+                                       /* Fallback to the default behavior */
+                                       get_rule_sortgroupclause(srt->tleSortGroupRef, query->targetList, false, context);
+                               }
+
                                sep = ", ";
                        }
                        appendStringInfoChar(buf, ')');

I suppose you have a larger SQL code base to test, so if you can give it a try ?

@scooreman
Copy link
Author

@c2main I can confirm that the fix works on our data set.
Amazing turnaround. Many thanks!

@c2main
Copy link
Contributor

c2main commented Sep 24, 2024

@scooreman nice to know it works!

@onurctirtir or others , I didn't made a PR for that yet, can you reopen ?

@emelsimsek emelsimsek reopened this Sep 24, 2024
c2main added a commit to c2main/citus that referenced this issue Sep 25, 2024
Similar to other bugs affected by citus usage of ruletuils: the provided
tree as input is distinct from the tree expected in ruleutils in
PostgreSQL.
As a consequence, it is required in some places to re-order the tree
structure to make it compliant or "back" to the parser tree (before it's
rewritten and reordered by PostgreSQL to optimize execution). Or to
lookup the target list and ensure it's the one "we expect".
Fox this bug, the `get_rule_sortgroupclause()` is used to check if
target list entry `resname` is defined, and use it directly if it
exists.

No benchmark where run, it's not expected to impact a lot.
c2main added a commit to c2main/citus that referenced this issue Sep 25, 2024
Similar to other bugs affected by citus usage of ruletuils: the provided
tree as input is distinct from the tree expected in ruleutils in
PostgreSQL.
As a consequence, it is required in some places to re-order the tree
structure to make it compliant or "back" to the parser tree (before it's
rewritten and reordered by PostgreSQL to optimize execution). Or to
lookup the target list and ensure it's the one "we expect".
Fox this bug, the `get_rule_sortgroupclause()` is used to check if
target list entry `resname` is defined, and use it directly if it
exists.

No benchmark where run, it's not expected to impact a lot.
@c2main
Copy link
Contributor

c2main commented Sep 25, 2024

@scooreman be careful, the provided patch is probably not the good fix, it's more subtle than that.

Citus is building a query like the following, note the worker_column_3 added, and the absence of table qualified names.

SELECT DISTINCT ON (attribute1, attribute2) 
  attribute1
, attribute3 AS attribute2
, attribute2 AS worker_column_3 

FROM test_pg t WHERE true;
------------+------------+-----------------
 Phone      | A          | John
 Tablet     | B          | Eric

And the results for this query are correct ! See PostgreSQL DISTINCT and ORDER BY documentation in [1] and [2].

The problem is only related to absence of table qualified name in the built query by citus. So my patch is good only by accident...

[1] https://www.postgresql.org/docs/current/sql-select.html#SQL-DISTINCT
[2] https://www.postgresql.org/docs/current/sql-select.html#SQL-ORDERBY

@c2main
Copy link
Contributor

c2main commented Sep 25, 2024

For reference, there is a patch from August 2024 in PostgreSQL touching this area:
postgres/postgres@a7eb633563c

It's really not clear what's the best fix for Citus, and having to manage PostgreSQL 14 and 15 may not be that easy or free. Maybe better to backpatch more of PostgreSQL 17 ruleutils code into ruleutils 14, 15 and 16...
@naisila given your work on PostgreSQL 17, is it something you explored already ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants