
DataFusion HashJoin LeftAnti doesn't support null aware anti join #10583

Open
@viirya

Describe the bug

While working on apache/datafusion-comet#437, a few Spark join tests failed when delegating to DataFusion HashJoin.

This is because the DataFusion HashJoin LeftAnti join returns incorrect results when the join is a null-aware anti join.

To Reproduce

Added a test to join.slt:

statement ok
CREATE TABLE IF NOT EXISTS test_table(c1 INT, c2 INT) AS VALUES
(1, 1),
(2, 2),
(3, 3),
(4, null),
(null, 0);

query II
SELECT * FROM test_table t1 WHERE (c1 NOT IN (SELECT c2 FROM test_table)) = true
----
4 NULL
NULL 0

Expected behavior

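The query above should return an empty relation. The subquery result contains a NULL (from the row (4, null)), so under SQL three-valued logic c1 NOT IN (...) evaluates to FALSE for rows whose c1 matches a value in the subquery and to NULL for every other row. It is never TRUE, so no row passes the filter.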
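To make the difference concrete, here is a minimal Python sketch that models the two semantics on the rows from the reproduction. It is only an illustration, not DataFusion's implementation: the function names `plain_left_anti` and `null_aware_left_anti` are hypothetical, and `None` stands in for SQL NULL. The plain equality-based anti join happens to produce the same two rows as the output reported above, while the null-aware version produces the expected empty result.

```python
# Rows of test_table from the reproduction; None stands in for SQL NULL.
ROWS = [(1, 1), (2, 2), (3, 3), (4, None), (None, 0)]


def plain_left_anti(left, right_keys):
    """Hypothetical model of an equality-only anti join on the first column.

    NULL never compares equal to anything, so a left row with a NULL key
    never finds a match and is always kept.
    """
    matchable = {k for k in right_keys if k is not None}
    return [row for row in left if row[0] is None or row[0] not in matchable]


def null_aware_left_anti(left, right_keys):
    """Hypothetical model of `key NOT IN (right_keys)` under three-valued logic.

    A row is kept only when the predicate is TRUE, not FALSE and not NULL.
    """
    right_keys = list(right_keys)
    if not right_keys:
        # NOT IN over an empty set is TRUE for every row, even for NULL keys.
        return list(left)
    if any(k is None for k in right_keys):
        # A NULL on the right makes the predicate FALSE or NULL for every row.
        return []
    matchable = set(right_keys)
    # A NULL key on the left makes the predicate NULL, so that row is dropped.
    return [row for row in left if row[0] is not None and row[0] not in matchable]


subquery = [c2 for _, c2 in ROWS]  # SELECT c2 FROM test_table
print(plain_left_anti(ROWS, subquery))       # [(4, None), (None, 0)]
print(null_aware_left_anti(ROWS, subquery))  # []
```

The sketch treats the three cases separately (empty right side, a NULL on the right side, and a NULL key on the left side) because those are the cases where NOT IN differs from a plain anti join.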

Additional context

No response

Labels

bug (Something isn't working)
