I want to run the benchmark query q1.sql in distributed mode. I noticed that from_proto.rs handles PhysicalPlanType::ParquetScan, where ParquetExec::try_from_files() can be used to create several partitions.
However, the benchmark tests do not call this method; they use read_csv() directly. Why is that, and how can I use ParquetScan?
I also tried to apply DataFusion's repartition() in register_table():
    let rr_repartition = Partitioning::RoundRobinBatch(3);
    let roundtrip_plan = LogicalPlan::Repartition {
        input: Arc::from(table.to_logical_plan()),
        partitioning_scheme: rr_repartition,
    };
    @state
        .tables
        .insert(name.to_owned(), roundtrip_plan);
but I get the error: General("Invalid LogicalPlan::TableScan")
Can you help me resolve this? My goal is to run the benchmark query q1.sql in distributed mode. I have several data files with the lineitem schema.
Thanks.
Under the data path, each schema has only one data file. As you said, one file ends up in one partition, and one partition is executed by only one executor.
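To make the file-to-partition point concrete, here is a minimal standalone sketch of how a list of data files might be spread over partitions in round-robin order. It is not Ballista's or DataFusion's actual code: the function name `assign_files_to_partitions` and the file names mentioned afterwards are hypothetical and used only for illustration.

```rust
/// Hypothetical helper (NOT part of DataFusion or Ballista): assigns data
/// files to at most `max_partitions` partitions in round-robin order.
fn assign_files_to_partitions(files: &[&str], max_partitions: usize) -> Vec<Vec<String>> {
    // A partition needs at least one file, so never create more
    // partitions than there are files.
    let n = max_partitions.min(files.len());
    if n == 0 {
        return Vec::new();
    }

    let mut partitions: Vec<Vec<String>> = vec![Vec::new(); n];
    for (i, file) in files.iter().enumerate() {
        // File i goes to partition i mod n.
        partitions[i % n].push(file.to_string());
    }
    partitions
}
```

Under this assumption, a single file such as `lineitem.tbl` always yields one partition, no matter how many partitions are requested, while four files spread over three partitions give partition sizes of 2, 1 and 1. That is why splitting the table into several files matters before the work can be spread across executors.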
So are there distributed examples already?
The benchmark crate in the repo can be used for executing fully distributed queries against partitioned data and the README in there explains how to do this.