Right semi join returns incorrect result #3626

Closed
xiaoxmeng opened this issue Jan 3, 2023 · 2 comments
Labels: bug, fuzzer-found

Comments

xiaoxmeng (Contributor) commented Jan 3, 2023

Description

Run the join fuzzer test with the following parameters: --logtostderr --duration_sec 120 --batch_size=5 --seed 2199528297

I20230103 13:56:15.425753 6804843 JoinFuzzer.cpp:693] ==============================> Started iteration 167 (seed: 897056553)
I20230103 13:56:15.426928 6804843 JoinFuzzer.cpp:257] Executing query plan: 
-- HashJoin[LEFT SEMI (PROJECT) t4=u4 AND t0=u0 AND t1=u1 AND t3=u3 AND t2=u2] -> t0:BIGINT, t2:INTEGER, t1:VARBINARY, t3:INTEGER, tp5:ROW<f0:INTEGER,f1:REAL>, match:BOOLEAN
  -- Values[25 rows in 5 vectors] -> t0:BIGINT, t1:VARBINARY, t2:INTEGER, t3:INTEGER, t4:VARCHAR, tp5:ROW<f0:INTEGER,f1:REAL>
  -- Values[5 rows in 5 vectors] -> u0:BIGINT, u1:VARBINARY, u2:INTEGER, u3:INTEGER, u4:VARCHAR, bp5:ARRAY<TINYINT>, bp6:ROW<f0:REAL>, bp7:ROW<f0:REAL,f1:TIMESTAMP>
I20230103 13:56:15.428032 6804843 JoinFuzzer.cpp:271] Results: [ROW ROW<t0:BIGINT,t2:INTEGER,t1:VARBINARY,t3:INTEGER,tp5:ROW<f0:INTEGER,f1:REAL>,match:BOOLEAN>: 25 elements, no nulls]
I20230103 13:56:15.428201 6804843 JoinFuzzer.cpp:660] Testing plan #0
I20230103 13:56:15.428210 6804843 JoinFuzzer.cpp:257] Executing query plan: 
-- HashJoin[LEFT SEMI (PROJECT) t4=u4 AND t0=u0 AND t1=u1 AND t3=u3 AND t2=u2] -> t0:BIGINT, t2:INTEGER, t1:VARBINARY, t3:INTEGER, tp5:ROW<f0:INTEGER,f1:REAL>, match:BOOLEAN
  -- Values[25 rows in 5 vectors] -> t0:BIGINT, t1:VARBINARY, t2:INTEGER, t3:INTEGER, t4:VARCHAR, tp5:ROW<f0:INTEGER,f1:REAL>
  -- Values[5 rows in 5 vectors] -> u0:BIGINT, u1:VARBINARY, u2:INTEGER, u3:INTEGER, u4:VARCHAR, bp5:ARRAY<TINYINT>, bp6:ROW<f0:REAL>, bp7:ROW<f0:REAL,f1:TIMESTAMP>
I20230103 13:56:15.429280 6804843 JoinFuzzer.cpp:271] Results: [ROW ROW<t0:BIGINT,t2:INTEGER,t1:VARBINARY,t3:INTEGER,tp5:ROW<f0:INTEGER,f1:REAL>,match:BOOLEAN>: 25 elements, no nulls]
I20230103 13:56:15.429558 6804843 JoinFuzzer.cpp:660] Testing plan #1
I20230103 13:56:15.429566 6804843 JoinFuzzer.cpp:257] Executing query plan: 
-- HashJoin[RIGHT SEMI (PROJECT) u4=t4 AND u0=t0 AND u1=t1 AND u3=t3 AND u2=t2] -> t0:BIGINT, t2:INTEGER, t1:VARBINARY, t3:INTEGER, tp5:ROW<f0:INTEGER,f1:REAL>, match:BOOLEAN
  -- Values[5 rows in 5 vectors] -> u0:BIGINT, u1:VARBINARY, u2:INTEGER, u3:INTEGER, u4:VARCHAR, bp5:ARRAY<TINYINT>, bp6:ROW<f0:REAL>, bp7:ROW<f0:REAL,f1:TIMESTAMP>
  -- Values[25 rows in 5 vectors] -> t0:BIGINT, t1:VARBINARY, t2:INTEGER, t3:INTEGER, t4:VARCHAR, tp5:ROW<f0:INTEGER,f1:REAL>
I20230103 13:56:15.430617 6804843 JoinFuzzer.cpp:271] Results: [ROW ROW<t0:BIGINT,t2:INTEGER,t1:VARBINARY,t3:INTEGER,tp5:ROW<f0:INTEGER,f1:REAL>,match:BOOLEAN>: 25 elements, no nulls]
E20230103 13:56:15.431311 6804843 Exceptions.h:68] Line: /Users/xiaoxmeng/code/presto_cpp/velox/velox/exec/tests/JoinFuzzer.cpp:664, Function:verify, Expression: assertEqualResults({expected}, {actual}) Logically equivalent plans produced different results, Source: RUNTIME, ErrorCode: INVALID_STATE
libc++abi: terminating with uncaught exception of type facebook::velox::VeloxRuntimeError: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Logically equivalent plans produced different results
Retriable: False
Expression: assertEqualResults({expected}, {actual})
Function: verify
File: /Users/xiaoxmeng/code/presto_cpp/velox/velox/exec/tests/JoinFuzzer.cpp
Line: 664
Stack trace:
...

/Users/xiaoxmeng/code/presto_cpp/velox/velox/exec/tests/utils/QueryAssertions.cpp:992: Failure
Value of: false
  Actual: false
Expected: true
Expected 25, got 25
25 extra rows, 25 missing rows
10 of extra rows:
	null | 218092068 | "x`/m-?10dS@i9r|:S|8Ob.*y2[;u@NR7Y{O<Yy|AA'08pDE5S2A!q" | null | null | null
	null | 333520695 | null | 379296071 | [null,0.6685807704925537] | null
	null | 1351162162 | null | 379296071 | [1570896782,null] | null
	null | 1430253304 | "N'ioyH#;UYZ~lvlD*_nrO|,09]Uh1WK2sykl$wXmXdC0eSQ]C\\`>r~Z<[*5s'z@2Ba7(j=&b+KP\"Jsr&;yzY*L!2h2Q1" | null | [875193945,0.7851024866104126] | null
	253982179219790907 | 566640434 | null | null | [1540924563,0.2357754409313202] | null
	1364886058125287610 | 2130063901 | "dA_$PR.d.x|(B=Yq|!Y7^Dn\"|,$R\\+%bXC-+1|wP2QY.C1tZwav\\x'\"ac)7)mRcm`1NDY')U]2$x;e+=WJ0-]~~/MVi" | 2094742274 | [2142371382,0.6333921551704407] | null
	1699526393999299079 | null | null | 1143322133 | [773893561,0.801523745059967] | null
	1699526393999299079 | 1143936907 | "N'ioyH#;UYZ~lvlD*_nrO|,09]Uh1WK2sykl$wXmXdC0eSQ]C\\`>r~Z<[*5s'z@2Ba7(j=&b+KP\"Jsr&;yzY*L!2h2Q1" | 312860059 | [773893561,0.801523745059967] | null
	1706895570413328037 | 566640434 | null | null | [null,0.7522648572921753] | null
	3300686975662340930 | 664141492 | null | 379296071 | [null,null] | null

10 of missing rows:
	null | 218092068 | "x`/m-?10dS@i9r|:S|8Ob.*y2[;u@NR7Y{O<Yy|AA'08pDE5S2A!q" | null | null | false
	null | 333520695 | null | 379296071 | [null,0.6685807704925537] | false
	null | 1351162162 | null | 379296071 | [1570896782,null] | false
	null | 1430253304 | "N'ioyH#;UYZ~lvlD*_nrO|,09]Uh1WK2sykl$wXmXdC0eSQ]C\\`>r~Z<[*5s'z@2Ba7(j=&b+KP\"Jsr&;yzY*L!2h2Q1" | null | [875193945,0.7851024866104126] | false
	253982179219790907 | 566640434 | null | null | [1540924563,0.2357754409313202] | false
	1364886058125287610 | 2130063901 | "dA_$PR.d.x|(B=Yq|!Y7^Dn\"|,$R\\+%bXC-+1|wP2QY.C1tZwav\\x'\"ac)7)mRcm`1NDY')U]2$x;e+=WJ0-]~~/MVi" | 2094742274 | [2142371382,0.6333921551704407] | false
	1699526393999299079 | null | null | 1143322133 | [773893561,0.801523745059967] | false
	1699526393999299079 | 1143936907 | "N'ioyH#;UYZ~lvlD*_nrO|,09]Uh1WK2sykl$wXmXdC0eSQ]C\\`>r~Z<[*5s'z@2Ba7(j=&b+KP\"Jsr&;yzY*L!2h2Q1" | 312860059 | [773893561,0.801523745059967] | false
	1706895570413328037 | 566640434 | null | null | [null,0.7522648572921753] | false
	3300686975662340930 | 664141492 | null | 379296071 | [null,null] | false

Error Reproduction

Run the join fuzzer test with the following parameters: --logtostderr --duration_sec 120 --batch_size=5 --seed 2199528297

Relevant logs

No response

xiaoxmeng added the bug, fuzzer, and fuzzer-found labels Jan 3, 2023
xiaoxmeng self-assigned this Jan 3, 2023

xiaoxmeng (Contributor, Author) commented

cc @mbasmanova

mbasmanova removed the fuzzer label Jan 3, 2023

mbasmanova (Contributor) commented

It turns out that the bug is in the left semi project join. When the build side is not empty but all of its rows have nulls in the join keys, the 'match' column should be set to NULL, but it is currently set to FALSE.
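
To make the expected behavior concrete, here is a minimal sketch of the three-valued 'match' value, modeled on SQL `x IN (subquery)` semantics. It is an illustration only, not the Velox implementation: it assumes a single join key, uses Python's None to stand in for SQL NULL, and the function name `semi_project_match` is hypothetical.

```python
from typing import Optional, Sequence


def semi_project_match(
    probe_key: Optional[int], build_keys: Sequence[Optional[int]]
) -> Optional[bool]:
    """Three-valued result of `probe_key IN build_keys` (None models NULL).

    Hypothetical helper for illustration; assumes a single join key.
    """
    # Empty build side: nothing can match, so the answer is FALSE
    # even when the probe key itself is NULL.
    if len(build_keys) == 0:
        return False

    # A non-NULL probe key that equals some non-NULL build key is a match.
    if probe_key is not None and any(
        k == probe_key for k in build_keys if k is not None
    ):
        return True

    # No definite match. If a NULL is involved on either side, the
    # comparison is unknown, so the result is NULL rather than FALSE.
    if probe_key is None or any(k is None for k in build_keys):
        return None

    # All keys are non-NULL and none of them matched.
    return False
```

Under these assumptions, `semi_project_match(1, [None, None])` yields None, which corresponds to the case described above: the build side is not empty, but every build row has a NULL join key. Returning False there would reproduce the reported mismatch.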
