Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BugReport: Issues with EXCEPT function #19477

Open
LLuopeiqi opened this issue Nov 21, 2024 · 0 comments
Open

BugReport: Issues with EXCEPT function #19477

LLuopeiqi opened this issue Nov 21, 2024 · 0 comments

Comments

@LLuopeiqi
Copy link

BugReport: Issues with EXCEPT function

version

8.3.0

Original sql

SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode NOT IN (
    SELECT l_commitdate
    FROM lineitem
) and l_shipmode is not null
and l_commitdate is not null
;

return 0 row

Rewritten sql

SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL
EXCEPT
SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IN (
    SELECT l_commitdate
    FROM lineitem
)AND l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL;

return 5905 row

Analysis

These two queries are logically equivalent, although they are written differently.

The original query filters rows using the NOT IN subquery, selecting rows where l_shipmode is not in the set of l_commitdate values from the lineitem table, and where both l_shipmode and l_commitdate are not NULL (using the IS NOT NULL condition). The result is all rows where l_shipmode is not in the l_commitdate set and both columns are not NULL.

In the rewritten query, the same logic is achieved using the EXCEPT operator. EXCEPT returns the results from the first query, but excludes rows that are present in the second query. Here, the first query returns all records where l_shipmode and l_commitdate are not NULL, and the second query excludes rows where l_shipmode is in the set of l_commitdate. The final result is the same as the original query.

The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.

How to repeat

The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant