Intermittent limit.slt failure #9450
Comments
I think the issue is that the partitioning of the test is not deterministic:
# generate BIGINT data from 1 to 1000 in multiple partitions
statement ok
CREATE TABLE t1000 (i BIGINT) AS
WITH t AS (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0))
SELECT ROW_NUMBER() OVER (PARTITION BY t1.column1) FROM t t1, t t2, t t3;
# verify that there are multiple partitions in the input (i.e. MemoryExec says
# there are 4 partitions) so that this tests multi-partition limit.
query TT
EXPLAIN SELECT DISTINCT i FROM t1000;
----
logical_plan
Aggregate: groupBy=[[t1000.i]], aggr=[[]]
--TableScan: t1000 projection=[i]
physical_plan
AggregateExec: mode=FinalPartitioned, gby=[i@0 as i], aggr=[]
--CoalesceBatchesExec: target_batch_size=8192
----RepartitionExec: partitioning=Hash([i@0], 4), input_partitions=4
------AggregateExec: mode=Partial, gby=[i@0 as i], aggr=[]
--------MemoryExec: partitions=4, partition_sizes=[1, 2, 1, 1]
I think we can fix this by changing the test to use different values.
So I think the problem is that the input is hash partitioned into 4 partitions, but somehow one of the partitions gets two batches, and which partition gets the two batches is non-deterministic.
Another way to fix the issue might be to add a configuration option, something like:
set datafusion.explain.show_sizes = false;
Then the MemoryExec output would be generated without partition_sizes.
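To make the proposal concrete, here is a minimal sketch of how such a flag could gate the sizes in the plan text. Everything in it is an assumption for illustration: `ExplainOptions`, its `show_sizes` field, and `format_memory_exec` are hypothetical names, not DataFusion's actual API, and the second run's sizes are made up to stand in for a different batch placement.

```rust
/// Hypothetical stand-in for the proposed `datafusion.explain.show_sizes` option.
struct ExplainOptions {
    show_sizes: bool,
}

/// Renders a MemoryExec-style plan line (illustrative, not DataFusion's real code).
/// `partition_sizes[i]` is the number of batches held by partition `i`.
fn format_memory_exec(partition_sizes: &[usize], opts: &ExplainOptions) -> String {
    let mut line = format!("MemoryExec: partitions={}", partition_sizes.len());
    if opts.show_sizes {
        // This is the part of the output that varies from run to run.
        line.push_str(&format!(", partition_sizes={:?}", partition_sizes));
    }
    line
}

fn main() {
    // Sizes from the plan in this issue, plus a hypothetical second run in
    // which a different partition received the extra batch.
    let run_a: [usize; 4] = [1, 2, 1, 1];
    let run_b: [usize; 4] = [2, 1, 1, 1];

    let with_sizes = ExplainOptions { show_sizes: true };
    let without_sizes = ExplainOptions { show_sizes: false };

    // With sizes shown, the two runs render differently, so an exact-match
    // expected output can only agree with one of them.
    println!("{}", format_memory_exec(&run_a, &with_sizes));
    println!("{}", format_memory_exec(&run_b, &with_sizes));

    // With sizes hidden, both runs render the same line.
    println!("{}", format_memory_exec(&run_a, &without_sizes));
    println!("{}", format_memory_exec(&run_b, &without_sizes));
}
```

With the flag off, the test would still verify that the input has 4 partitions, which is what the comment in limit.slt says it is checking for.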
This test was recently changed in #9444
The second option seems like the most robust solution. Since in the current setup of the test, each time
Describe the bug
When I run limit.slt locally it fails like this:
cargo test --test sqllogictests -- limit
@huaxingao also saw this failure on #9411 (comment)
To Reproduce
Here is an example failure on CI showing the same failure mode: https://github.com/apache/arrow-datafusion/actions/runs/8133918181/job/22226135558?pr=9411
Expected behavior
No response
Additional context
No response