Propagation of column boundary changes in subexpressions #5

Open
wants to merge 39 commits into base: master

isidentical
Owner

Proof of concept.

@alamb alamb left a comment

Thank you for this code and proposal @isidentical

Comment on lines 178 to 181
      let predicate_selectivity = self
          .predicate
    -     .boundaries(&analysis_ctx)
    +     .boundaries(&mut analysis_ctx)
          .and_then(|bounds| bounds.selectivity);

What if this looked something like:

    let predicate_selectivity = self
        .predicate
        .boundaries(analysis_ctx)
        .and_then(|analysis_ctx| analysis_ctx.bounds().selectivity);

Namely that boundaries took in a context and returned a potentially modified one?

The reason I really worry about the `&mut analysis_ctx` approach is that whenever an operator removes the intermediate column boundaries (maybe CASE, or some user-defined function call), each such place will have to remember to clear any intermediate boundaries built up in the context.

The idea of tracking intermediate column boundaries looks very cool
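
To make the suggestion above concrete, here is a minimal sketch of the "take a context, return a possibly modified one" shape. Everything in it is hypothetical and is not the PR's or DataFusion's actual API: `AnalysisContext`, `Bounds`, the toy `Expr` enum, and the uniform-distribution selectivity estimate are all assumptions made for illustration only.

```rust
use std::collections::HashMap;

/// Hypothetical result of analysing one expression.
#[derive(Debug, Clone, PartialEq)]
struct Bounds {
    selectivity: Option<f64>,
}

/// Hypothetical context: known (min, max) per column, plus the bounds of
/// the most recently analysed expression.
#[derive(Debug, Clone, Default)]
struct AnalysisContext {
    column_bounds: HashMap<String, (i64, i64)>,
    result: Option<Bounds>,
}

impl AnalysisContext {
    fn bounds(&self) -> Option<&Bounds> {
        self.result.as_ref()
    }
}

/// Toy expression type (an assumption, not DataFusion's `PhysicalExpr`).
enum Expr {
    /// `column < value` over an integer column.
    LessThan { column: String, value: i64 },
    /// Stand-in for CASE or a user-defined function whose effect on
    /// column boundaries is unknown.
    Opaque,
}

impl Expr {
    /// Consumes the incoming context and returns the context that holds
    /// *after* this expression, or `None` if nothing can be said.
    fn boundaries(&self, mut ctx: AnalysisContext) -> Option<AnalysisContext> {
        match self {
            Expr::LessThan { column, value } => {
                let (min, max) = *ctx.column_bounds.get(column)?;
                // Narrow the upper bound to the largest value that passes.
                let new_max = max.min(value.saturating_sub(1));
                // Assumes uniformly distributed integer values.
                let total = (max - min + 1) as f64;
                let kept = (new_max - min + 1).max(0) as f64;
                ctx.column_bounds.insert(column.clone(), (min, new_max));
                ctx.result = Some(Bounds { selectivity: Some(kept / total) });
                Some(ctx)
            }
            // An opaque operator simply returns a fresh context, so no
            // caller has to remember to clear stale intermediate state.
            Expr::Opaque => Some(AnalysisContext::default()),
        }
    }
}
```

With this shape, the caller chains `expr.boundaries(ctx).and_then(|ctx| ctx.bounds().and_then(|b| b.selectivity))`. Because the old context is moved into the call, an operator that invalidates intermediate boundaries cannot accidentally leave them visible to later expressions.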

Owner Author

> Namely that boundaries took in a context and returned a potentially modified one?

Thinking about it, that sounds really intriguing. Will give it a try, thank you for your suggestion and feedback on this @alamb!

@isidentical isidentical changed the base branch from gh-3845-phase-2 to master November 19, 2022 15:46
@isidentical isidentical changed the base branch from master to master2 December 4, 2022 20:48
@isidentical isidentical changed the base branch from master2 to master December 4, 2022 20:48
Ted-Jiang and others added 12 commits December 4, 2022 19:33
* Add window func related logic plan to proto ability.

Signed-off-by: yangjiang <yangjiang@ebay.com>

* add test.

Signed-off-by: yangjiang <yangjiang@ebay.com>

* more functional

Signed-off-by: yangjiang <yangjiang@ebay.com>

Signed-off-by: yangjiang <yangjiang@ebay.com>
Signed-off-by: yangjiang <yangjiang@ebay.com>

Signed-off-by: yangjiang <yangjiang@ebay.com>
* Remove interior mutability of MemTable

* remove insert_batches
* improve error handling and add some more types

* refactor booleanarray
* window frame none, non empty orderby handling logic

* window frame none, empty orderby handling logic

* Remove window frame option

* Minor changes

* Use ScalarValue::Null for unbounded

* Resolve proto errors

* combine functions under new
* Minor: add some comments to aggregate code

* Fix typos

Co-authored-by: jakevin <jakevingoo@gmail.com>

* fix typo

Co-authored-by: jakevin <jakevingoo@gmail.com>

Co-authored-by: jakevin <jakevingoo@gmail.com>
* remove project_with_alias.

refactor: cleanup code of `subqueryAlias` and `expr-alias`.

* separate project and subquery_alias

* expr alias

* replace with `.alias()`

* fix comment
* minor: remove redundant `unwrap()`

* more

* Expand median tests + fix floats

* Remove rust tests

* simplify csv_query_rollup_avg test
* sqllogictest: A logging and command line filter

* Reduce some println to info

* Be compatible with Rust test runner

* Fix typo, cargo fmt

* Add a note about substring matching

* Unify most `SessionConfig` settings into `ConfigOptions`

* Update set target_partitions in show_variable test

* Normalize setting in docs

* fix clippy
Ted-Jiang and others added 25 commits December 6, 2022 16:03
…ec (apache#4508)

* fix conflict

Signed-off-by: yangjiang <yangjiang@ebay.com>

# Conflicts:
#	datafusion/core/tests/sql/window.rs

* fix ut

Signed-off-by: yangjiang <yangjiang@ebay.com>

Signed-off-by: yangjiang <yangjiang@ebay.com>
* support seconds fraction in date_part

* support seconds fraction in date_part

* to be in sync with pgsql

* fix merge

* fix bench

* fix bench
* Add tests for names with period

* adjust docstrings

* Improve docstrings

* Add tests coverage

* Update datafusion/common/src/table_reference.rs

Co-authored-by: Nga Tran <nga-tran@live.com>

* Add tests for creating tables with three periods

Co-authored-by: Nga Tran <nga-tran@live.com>
* Support Time32 and Time64 for Type Coercion

* Revert "Support Time32 and Time64 for Type Coercion"

This reverts commit a46b97e.

* Implement Time32 and Time64 in hash_join and hash_util

* Add review comments

* Qualify TimeUnits

* Changes in proto to provide full support for Time32 and Time64

* Add test to ensure Time32 and Time64 are fully supported

* Add support for type coercion for pair (Timestamp, Utf8) and add corresponding test cases

* Implement Time32 and Time64 in hash_join and hash_util

* Add review comments

* Changes in proto to provide full support for Time32 and Time64

* Add test to ensure Time32 and Time64 are fully supported

* Add support for type coercion for pair (Timestamp, Utf8) and add corresponding test cases

* Revert to_proto.rs

* Revert to_proto.rs

* Revert timestamp.rs

* Revert mod.rs

* Revert select.rs

* Revert group_by.rs

* Revert aggregates.rs

* Delete generated file
Arrow now offers `size` methods for `DataType` and `Field`, so these
TODOs can be fixed.
* feat: support prepare statement

* fix: typo

* chore: address preliminary review comments

* fix: put Placeholder last to have the expression comparison to work as expected

* test: add more tests and start passing param_data_types to expression to get data types of the params $1, $2, ...

* test: one more test and a bit of refactor while waiting for the CTEs/PlannerContext PR

* feat: use prepare stmt's param data types in the planner context

* chore: cleanup

* refactor: address review comments

* chore: cleanup

* test: more prepare statement tests

* chore: cleanup

* chore: fix typos and add tests into the sqllogicaltests

* docs: add docstring

* chore: update test panic message due to recent change to have clearer message per review comment

* chore: add a test and a doc string per review comments

* fix: output of a test after master merge
* Add tests for coercion of timestamps to strings

* Update datafusion/core/tests/sqllogictests/test_files/timestamps.slt
* MINOR: move sqllogictest to dev-dependencies

* Update datafusion-cli Cargo.lock

* fix toml
* extract to support functions

* extract support functions
* remove

Signed-off-by: remzi <13716567376yh@gmail.com>

* clean

Signed-off-by: remzi <13716567376yh@gmail.com>

Signed-off-by: remzi <13716567376yh@gmail.com>
…e#4488)

* Fix panic in median "AggregateState is not a scalar aggregate"

* Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
* bump sqllogictest to 0.9.0

* cargo update datafusion-cli

* fix clippy

* fix Cargo.lock

* fix test case
…in to inner join (apache#4443)

* Support non-column join key in eliminating cross join to inner join

* Add comment

* Make clippy  happy

* Add tests

* Add alias for cast expr join keys

* Add tests

* Add relative issue comment

* Improve test

* Improve use declarations
* Port create_drop.rs tests to sqllogictest framework

* Add in expected error messages