12.0.0 (2022-09-12)
Breaking changes:
- Pass `return_type` to `AccumulatorFunctionImplementation` for user defined aggregates #3428 (alamb)
- Use `usize` rather than `Option<usize>` to represent `Limit::skip` and `Limit::offset` #3374 [sql] (HaoYang670)
- Deprecate legacy datafusion::logical_plan module #3338 (andygrove)
- Update signature for Expr.name so that schema is no longer required #3336 (andygrove)
- MINOR: rename optimizer rule to ScalarSubqueryToJoin #3306 (kmitchener)
- Add top-level `Like`, `ILike`, `SimilarTo` expressions in logical plan #3298 [sql] (andygrove)
- Upgrade to sqlparser 0.22 #3278 [sql] (andygrove)
- `Expr` variants for boolean operations #3275 [sql] (sarahyurick)
- Upgrade to sqlparser 0.21 #3200 [sql] (andygrove)
- Add SQL planner support for `Like`, `ILike` and `SimilarTo`, with optional escape character #3101 [sql] (andygrove)
Implemented enhancements:
- support `cast` inside `values` #3446
- update TPCH test schemas to use Decimal128 from Float #3435
- Include Bitwise operators in the documentation #3434
- How to read excel file with datafusion? #3433
- Pass return type to the accumulator state factory in aggregates #3427
- Support bitwise XOR operator (`#`) #3420
- support InList with datatype Date32 #3412
- add simplification for `between` expression during logical plan optimization #3402
- Replace From trait with TryFrom trait for datafusion-proto crate #3401
- update TPC-H benchmark to Decimal types from Float #3392
- Use `usize` to represent `Limit::skip` #3369
- Avoid copying in `LogicalPlan::expressions` #3368
- Upgrade to Arrow 22 #3362
- Eliminate `OFFSET 0` in the logical plan optimization #3355
- Add ability to get unoptimized logical plan from DataFrame #3340
- Allow IDEs to recognize generated code #3332
- `CAST` should not change the name of an expression #3326
- add SQL support for unsigned integers #3325
- Review use of panic in `datafusion-proto` crate #3318
- Review use of panic in `datafusion-sql` crate #3315
- Review use of panic in `datafusion-optimizer` crate #3314
- Review use of panic in `datafusion-expr` crate #3312
- Support registration of custom TableProviders through SQL #3310
- Support binary data in sha hash functions #3308
- add SQL support for tinyint and unsigned versions of all INTs #3307
- Support binary types in InList expression #3300
- Physical planner should map `IsTrue` and similar expressions to `IsDistinctFrom` #3288
- Introduce physical plan version of `Operator` enum #3269
- Introduce `Expr` variants for `IS [NOT] TRUE / FALSE / UNKNOWN` #3268
- Add support for non-correlated subqueries #3266 [sql]
- (Re-)add support for glob patterns in ListingTableUrl #3261
- `PreCastLitInComparisonExpressions` should use ExprRewriter and support nested expressions #3259
- implement `DROP VIEW` #3251
- Upgrade to Arrow 21 #3224
- Add TypeCoercion optimizer rule #3221
- Create bench for approx_percentile_cont aggregate #3217
- Add SQL query planner support for `DISTRIBUTED BY` #3207
- Support "IS [NOT] UNKNOWN" syntax #3195
- sqlparser 0.21 upgrade #3192
- Re-implement parsing/planning for SHOW TABLES due to sqlparser changes #3188
- Support `SUM`, `AVG`, `MIN`, `MAX` on `Time` columns. #3166
- Support "IS TRUE/FALSE" syntax #3159
- Support number of histogram bins in approx_percentile_cont #3145
- Support create ApproxPercentileAccumulator with TDigest max_size #3142
- Remove support for `array` function and only support `array[]` style postgres syntax #3115
- Allow inline column aliases for create view #3108 [sql]
- Add support for Postgres `SIMILAR TO` and `ILIKE` syntax #3099 [sql]
- Update SQL reference in user guide to cover all supported syntax #3091
- DataFusion prelude should import all logical expression functions #3068
- Proposal: Add similar to operator #3016 [sql]
- Release DataFusion 11.0.0 #3012
- Implement "SHOW CREATE TABLE" for external tables #2848
- Change java package names in protobuf files #2513
- When creating `DFField` from `Expr` we should provide input plan not input schema #2456
- Support "IS NOT TRUE/FALSE" syntax #2265
- RFC: Spill-To-Disk Object Storage Download #2205
- Support for BitwiseAnd `&`, BitOr `|` binary operators #1619
- [Question] Usage of async object store APIs in consuming code #1313
- Allow User Defined Aggregates to return multiple values / structs #600
- Implement vectorized hashing for dictionary types #331
Fixed bugs:
- Intermittent build error when changing selected features #3366
- `sql::timestamp::timestamp_add_interval_months` failing since September 1st #3327
- `sql::timestamp::timestamp_add_interval_months` test fails #3322
- test case `timestamp_add_interval_months` failed on master branch #3321
- datafusion-proto does not support untyped null scalar values #3302
- `ConfigOptions` creation is slow #3295
- FilterPushDown optimization through UNION ALL results in SchemaError #3281
- Execute LogicalPlans after building for TPCH Benchmarks #3273
- `CREATE TABLE` should return empty DataFrame #3265 [sql]
- `CREATE EXTERNAL TABLE` from CSV creates a table with no columns if there is just a header row #3263
- View TableProvider ignores projections, resulting in invalid plans #3240
- CREATE VIEW should return an empty dataframe on success #3236
- `DISTRIBUTE BY` expressions get removed during optimization #3234
- datafusion cannot recognize Chinese characters. #3203
- Panicked at 'byte index 1 is out of bounds on invalid query #3190
- `like_nlike_with_null_lt` fails with latest sqlparser code #3187
- Interval Literal output inconsistent date_type #3180
- `array` function allows different data types #3123
- eq operator doesn't work on binary data #3117
- incorrect `where` clause comparison while using table alias #3073
- Some functions are incorrectly declared as unary #3069
- once now() is called in a statement, it forever returns the same value #3057
- single_distinct_to_groupby panic when group by expr is a binaryExpr #2994
- Cannot have `order by` expression that references complex `group by` expression #2360
- Fix some bugs in TypeCoercion rule #3407 (andygrove)
- MINOR: Stop ignoring `AggregateFunction::distinct` in protobuf serde code #3250 (andygrove)
- Add assertion for invariant in `create_physical_expression` and fix ViewTable projection #3242 (andygrove)
- Fix bug where optimizer was removing `Partitioning::DistributeBy` expressions #3229 (andygrove)
Documentation updates:
Closed issues:
Merged pull requests:
- minor: fix some typo. #3453 (jackwener)
- Update criterion requirement from 0.3 to 0.4 #3452 (dependabot[bot])
- Update object_store requirement from 0.4.0 to 0.5.0 #3451 (dependabot[bot])
- add `cast` support inside `values` #3447 [sql] (kmitchener)
- Use hash repartitioning for aggregates on dictionaries #3445 (isidentical)
- Review `unwrap` and `panic` from the `aggregate` directory of `datafusion-physical-expr` #3443 (iajoiner)
- MINOR: Implement protobuf serde for all binary operators #3441 (andygrove)
- MINOR: Add accessor methods to DateTimeIntervalExpr #3440 (andygrove)
- update TPCH-mimicking tests to Decimal data type from Float, matching the benchmark #3438 (kmitchener)
- Include Bitwise operators in the documentation #3436 (askoa)
- minor: make sql number parsing slightly more efficient + functional #3432 [sql] (alamb)
- Implement bitwise XOR operator (`#`) #3430 [sql] (askoa)
- Replace From trait with TryFrom trait for datafusion-proto crate #3401 #3429 (comphead)
- Tests showing user defined aggregate returning a struct #3425 (alamb)
- MINOR: update optimizer rule names to be consistent style as the rest #3415 (kmitchener)
- Support date32 and date 64 in inlist node #3413 (Ted-Jiang)
- Update sqlparser requirement from 0.22 to 0.23 #3411 [sql] (dependabot[bot])
- simplify the `between` expr during logical plan optimization #3404 (kmitchener)
- MINOR: Improve optimizer error #3403 (andygrove)
- Review panics in the sql crate #3397 [sql] (HaoYang670)
- changed TPC-H benchmark to use Decimal types #3393 (kmitchener)
- minor: remove redundant code. #3389 (jackwener)
- Add dictionary cases to merge bench #3384 (tustvold)
- Implement Eq trait for Expr and nested types #3381 (jdye64)
- Minor: Improvements to type coercion rule #3379 (alamb)
- MINOR: Note that most communication happens on github #3375 (alamb)
- minor fix: clean data type for negative operation #3370 (liukun4515)
- Fix code generation for json feature #3367 (avantgardnerio)
- Review use of panic in datafusion-proto crate #3365 (comphead)
- Upgrade to arrow 22 #3363 [sql] (avantgardnerio)
- return empty dataframe on create table, remove a duplicate optimize call #3361 (kmitchener)
- Add SQL support for `tinyint`, `smallint`, and `unsigned int variants` #3359 [sql] (kmitchener)
- Minor: add hint in README of example #3358 (jackwener)
- Collect to `HashSet` directly in `in_list` #3356 (HaoYang670)
- MINOR: Add comments about rewrite_disjunctive_predicate #3351 (alamb)
- [MINOR] Add debug logging to plan teardown #3350 (alamb)
- MINOR: add df.to_unoptimized_plan() to docs, remove erroneous comment #3348 (kmitchener)
- Replace `unwrap` in `convert_to_ordered_float` and add `downcast_value` #3347 (iajoiner)
- Remove panics from `common_subexpr_eliminate` #3346 (andygrove)
- Remove Result.unwrap from single_distinct_to_groupby #3345 (andygrove)
- Add to_unoptimized_plan #3344 (iajoiner)
- Remove panics from simplify_expressions optimizer rule #3343 (andygrove)
- Remove `unreachable!` from filter push down rule #3342 (andygrove)
- Replace panic in `datafusion-expr` crate #3341 (iajoiner)
- Re-implement ExprIdentifierVisitor::desc_expr to use Expr::Display #3339 (andygrove)
- Fix the test `timestamp_add_interval_months` #3337 (HaoYang670)
- Bump lz4-sys from 1.9.3 to 1.9.4 in /datafusion-cli #3335 (dependabot[bot])
- Make binary operator formatting consistent between logical and physical plans #3331 (andygrove)
- Fix build: Ignore failing test #3329 (andygrove)
- Add `InList` support for binary type. #3324 (HaoYang670)
- MINOR: add github action trigger #3323 (waynexia)
- add explain sql test for optimizer rule PreCastLitInComparisonExpressions #3320 (liukun4515)
- Custom / Dynamic table provider factories #3311 [sql] (avantgardnerio)
- fix: alias group_by exprs in single_distinct_to_groupby optimizer #3305 (waynexia)
- Add support for serializing null scalar values #3303 (andygrove)
- Finish integrating `Expr::Is[Not]True` and similar expressions #3301 [sql] (andygrove)
- MINOR: Remove `unwrap` calls from `single_distinct_to_groupby optimizer` rule #3299 (andygrove)
- docs: update the Python library repository #3297 (haoxins)
- fix: speed up `ConfigOptions` creation #3296 (crepererum)
- Execute LogicalPlans after building for TPCH Benchmarks #3290 (DaltonModlin)
- support for non-correlated subqueries #3287 (kmitchener)
- Add `Aggregate::try_new` with validation checks #3286 (andygrove)
- Fix SchemaError in FilterPushDown optimization with UNION ALL #3282 (jonmmease)
- Allow sorting by aggregated groups #3280 (isidentical)
- Add show external tables #3279 [sql] (psvri)
- Return from task execution if send fails as there is nothing more to do (faster cancel / limit) #3276 (nvartolomei)
- Let prelude import all expression functions #3274 (sadilet)
- Fix no schema when CSV is only header #3272 (comphead)
- support inlist for pre cast literal expression #3270 (liukun4515)
- implement `drop view` #3267 [sql] (kmitchener)
- Use `ExprRewriter` in `pre_cast_lit_in_comparison` #3260 (andygrove)
- Add type coercion for UDFs in logical plan #3254 (andygrove)
- Support "IS NOT TRUE/FALSE" syntax #3252 [sql] (sarahyurick)
- Implement `IS UNKNOWN` / `IS NOT UNKNOWN` operators #3246 [sql] (isidentical)
- support decimal data type for the optimizer rule of PreCastLitInComparisonExpressions #3245 (liukun4515)
- chore: update cranelifts to 0.87.0 #3243 (yjshen)
- Moved nullif out of unary functions #3241 (comphead)
- MINOR: documentation updates #3239 (kmitchener)
- MINOR: Add bounds check to Column physical expression #3238 (andygrove)
- CREATE VIEW should return empty dataframe #3237 (kmitchener)
- Support "IS TRUE/FALSE" syntax (redo) #3235 [sql] (sarahyurick)
- Fix propagation of optimized predicates on nested projections #3228 (isidentical)
- Add more trim test cases #3226 (ayushdg)
- Upgrade to arrow 21 #3225 [sql] (avantgardnerio)
- Add optimizer rule for type coercion (binary operations only) #3222 (andygrove)
- [Improve] Use arrow::compute::sort in approx_percentile_cont #3219 (Ted-Jiang)
- [minor] fix bench aggregate_query_sql meta #3218 (Ted-Jiang)
- minor: refactor simplify negate #3213 (jackwener)
- MINOR: update cargo.lock and rust-version for datafusion-cli #3212 (kmitchener)
- fix issue with now() returning same value across statements #3210 (kmitchener)
- Add support for inline column alias in CREATE VIEW #3209 [sql] (DaltonModlin)
- Add SQL query planner support for `DISTRIBUTE BY` #3208 [sql] (andygrove)
- minor: remove test code that's in the arrow library now #3206 (kmitchener)
- Use .get() to avoid panic #3201 [sql] (jklamer)
- [Minor] Reduce code duplication creating ScalarValue::List #3197 [sql] (alamb)
- Clean up CI workflows by removing "matrix" strategy, simplifying names #3196 (alamb)
- optimizer: add framework for the rule of pre-add cast to the literal in comparison binary #3185 (liukun4515)
- Fix clippy #3182 (alamb)
- MINOR: Add notes on writing release blog posts #3179 (andygrove)
- add min/max for time #3178 (waitingkuo)
- Recursively apply remove filter rule if filter is a true scalar value #3175 (byteink)
- Update `ahash` requirement from 0.7 to 0.8 #3161 [sql] (alamb)
- Support number of centroids in approx_percentile_cont #3146 (Ted-Jiang)
- Introduce `\i` command to execute from a file #3136 (turbo1912)
- impl binary ops between binary arrays and scalars #3124 (ozgrakkurt)