Support qualified columns in queries #55

Merged Jun 22, 2021 (38 commits)
Commits
8ecc215
support qualified columns in queries
houqp Apr 19, 2021
696f8e0
handle coalesced hash join partition in HashJoinStream
houqp Apr 25, 2021
cdc5fb7
implement Into<Column> for &str
houqp Apr 25, 2021
723ee5d
add todo for ARROW-10971
houqp Apr 25, 2021
2ae97c4
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Apr 26, 2021
9cf494f
fix cross join handling in projection push down optimizer
houqp Apr 27, 2021
fff2e1d
maintain field order during plan optimization using projections
houqp May 10, 2021
202b87e
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp May 10, 2021
eaf1edc
change TableScan name from Option<String> to String
houqp May 15, 2021
6c674cb
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp May 15, 2021
7f253c7
WIP: fix ballista
houqp May 16, 2021
5c413dd
separate logical and physical expressions in proto, fix ballista build
houqp May 24, 2021
841159f
fix join schema handling in projection push down optimizer
houqp May 24, 2021
9ab4711
tpch 7 & 8 are now passing!
houqp May 24, 2021
babb252
fix roundtrip_join test
houqp May 24, 2021
d18065b
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp May 24, 2021
bbde69a
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 1, 2021
6aaa148
fix clippy warnings
houqp Jun 2, 2021
040d28c
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 2, 2021
2ef668d
fix sql planner test error checking with matches
houqp Jun 5, 2021
9513c19
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 5, 2021
7b70f04
address FIXMEs
houqp Jun 6, 2021
fd3005f
honor datafusion field name semantic
houqp Jun 12, 2021
0782ee2
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 12, 2021
a9ba4c6
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 12, 2021
071f86b
add more comment
houqp Jun 12, 2021
80a5168
enable more queries in benchmark/run.sh
houqp Jun 12, 2021
713fbe1
use unzip to avoid unnecessary iterators
houqp Jun 12, 2021
e4677b9
reduce diff by discarding style related changes
houqp Jun 13, 2021
6f6ecdf
simplify hash_join tests
houqp Jun 13, 2021
1643617
reduce diff for easier review
houqp Jun 13, 2021
660af1e
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 14, 2021
cad0d5e
fix unnecessary reference clippy error
houqp Jun 14, 2021
7b72c9c
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 15, 2021
593785a
incorporate code review feedback
houqp Jun 15, 2021
071d8ac
Merge remote-tracking branch 'upstream/master' into HEAD
houqp Jun 20, 2021
d26b54c
fix window schema handling in projection pushdown optimizer
houqp Jun 20, 2021
9c1e94d
Merge remote-tracking branch 'upstream/master' into qp_qualified
houqp Jun 21, 2021
13 changes: 9 additions & 4 deletions datafusion/src/dataframe.rs
@@ -20,7 +20,7 @@
 use crate::arrow::record_batch::RecordBatch;
 use crate::error::Result;
 use crate::logical_plan::{
-    DFSchema, Expr, FunctionRegistry, JoinType, LogicalPlan, Partitioning,
+    Column, DFSchema, Expr, FunctionRegistry, JoinType, LogicalPlan, Partitioning,
 };
 use std::sync::Arc;

@@ -175,7 +175,12 @@ pub trait DataFrame: Send + Sync {
     /// col("a").alias("a2"),
     /// col("b").alias("b2"),
     /// col("c").alias("c2")])?;
-    /// let join = left.join(right, JoinType::Inner, &["a", "b"], &["a2", "b2"])?;
+    /// let join = left.join(
+    ///     right,
+    ///     JoinType::Inner,
+    ///     vec![Column::from_name("a".to_string()), Column::from_name("b".to_string())],
+    ///     vec![Column::from_name("a2".to_string()), Column::from_name("b2".to_string())],
+    /// )?;
     /// let batches = join.collect().await?;
     /// # Ok(())
     /// # }
@@ -184,8 +189,8 @@ pub trait DataFrame: Send + Sync {
         &self,
         right: Arc<dyn DataFrame>,
         join_type: JoinType,
-        left_cols: &[&str],
-        right_cols: &[&str],
+        left_cols: Vec<Column>,
+        right_cols: Vec<Column>,
     ) -> Result<Arc<dyn DataFrame>>;

     /// Repartition a DataFrame based on a logical partitioning scheme.
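
The `join` signature above now takes `Vec<Column>` instead of `&[&str]`, and one of the commits adds `Into<Column>` for `&str`. As a rough, self-contained illustration of what a qualified column reference could look like, here is a minimal sketch. It is a simplified stand-in and not DataFusion's actual `Column` definition: the `relation` and `name` fields, the `flat_name` method, the split-on-first-dot parsing rule, and the `join_keys` helper are all assumptions made for this example.

```rust
/// Simplified stand-in for a qualified column reference.
/// Hypothetical: this is NOT DataFusion's actual `Column` type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Column {
    /// Optional table qualifier, e.g. `t1` in `t1.a`.
    pub relation: Option<String>,
    /// Bare column name, e.g. `a`.
    pub name: String,
}

impl Column {
    /// Build an unqualified column from a bare name.
    pub fn from_name(name: impl Into<String>) -> Self {
        Self {
            relation: None,
            name: name.into(),
        }
    }

    /// Render as `relation.name` when qualified, otherwise just `name`.
    pub fn flat_name(&self) -> String {
        match &self.relation {
            Some(relation) => format!("{}.{}", relation, self.name),
            None => self.name.clone(),
        }
    }
}

/// Assumed parsing rule: text before the first `.` is the qualifier.
impl From<&str> for Column {
    fn from(s: &str) -> Self {
        match s.split_once('.') {
            Some((relation, name)) => Self {
                relation: Some(relation.to_string()),
                name: name.to_string(),
            },
            None => Self::from_name(s),
        }
    }
}

/// Hypothetical helper: turn string keys into the `Vec<Column>` shape
/// that a `join`-style API would accept.
pub fn join_keys(names: &[&str]) -> Vec<Column> {
    names.iter().map(|name| Column::from(*name)).collect()
}
```

With a conversion like this, a caller migrating from the old string-slice arguments could write `join_keys(&["a", "b"])` for unqualified keys, or pass `"t2.b"` to disambiguate a column that exists on both sides of the join.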
12 changes: 10 additions & 2 deletions datafusion/src/datasource/csv.rs
@@ -33,7 +33,7 @@
 //! let schema = csvdata.schema();
 //! ```

-use arrow::datatypes::SchemaRef;
+use arrow::datatypes::{Schema, SchemaRef};
 use std::any::Any;
 use std::string::String;
 use std::sync::Arc;
@@ -123,10 +123,18 @@ impl TableProvider for CsvFile {
         _filters: &[Expr],
         limit: Option<usize>,
     ) -> Result<Arc<dyn ExecutionPlan>> {
+        let fields = self
+            .schema
+            .fields()
+            .iter()
+            .map(|f| f.clone())
+            .collect::<Vec<_>>();
+        let schema = Schema::new(fields);
+
         Ok(Arc::new(CsvExec::try_new(
             &self.path,
             CsvReadOptions::new()
-                .schema(&self.schema)
+                .schema(&schema)
                 .has_header(self.has_header)
                 .delimiter(self.delimiter)
                 .file_extension(self.file_extension.as_str()),