Feature: functional dependency optimization #7438

leiysky · 2022-09-01T12:16:10Z

A functional dependency is a relationship between two attributes (or two sets of attributes) in the same relation.

If an attribute determines another attribute, that is, any two rows that agree on the first attribute also agree on the second, we call it a determinant, and the determined attribute is functionally dependent on the determinant.

We will denote the relationship with $X \rightarrow Y$ if $X$ is a determinant of $Y$, where both $X$ and $Y$ can be sets of attributes.

For example, given the SQL query:

select a, a+1 as b from t;

Here the attribute a is a determinant of b because b is derived from a. Thus we can get a functional dependency $(a) \rightarrow (b)$.
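To make the definition concrete, here is a minimal Python sketch that checks whether a dependency holds on a concrete set of rows. The helper name `fd_holds` and the dict-based row representation are assumptions made for illustration and are not part of Databend; an optimizer would normally derive dependencies from the plan rather than by scanning data.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds on `rows`.

    `rows` is a list of dicts; `determinant` and `dependent` are lists of
    column names. Hypothetical helper, for illustration only.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # Two rows that agree on the determinant must agree on the dependent.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows shaped like the output of `select a, a+1 as b from t`.
rows = [{"a": a, "b": a + 1} for a in (1, 2, 2, 5)]
```

With these rows, `fd_holds(rows, ["a"], ["b"])` is true because every occurrence of a given `a` carries the same `b`.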

Functional dependencies are helpful in query optimization: with them we can, for example, eliminate outer joins, remove redundant DISTINCT, and eliminate cross joins.
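As one possible illustration of the DISTINCT case, the sketch below computes the attribute closure of the output columns and treats DISTINCT as redundant when that closure covers a unique key of the input. The function names, the `(lhs, rhs)` encoding of dependencies, and the table `t(id, name, email)` are all hypothetical, and the sketch ignores NULL handling.

```python
def closure(attrs, fds):
    """Compute the set of attributes determined by `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of frozensets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def distinct_is_redundant(output_cols, unique_key, fds):
    """DISTINCT adds nothing if the output columns determine a unique key."""
    return set(unique_key) <= closure(output_cols, fds)


# Hypothetical table t(id, name, email) whose unique key is `id`.
fds = [(frozenset({"id"}), frozenset({"name", "email"}))]
```

Under these assumptions, `SELECT DISTINCT id, name FROM t` needs no deduplication, while `SELECT DISTINCT name FROM t` still does.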

See the wiki and the thesis for more details.

Functional dependencies can be derived from relational operators, so it's possible to encapsulate them in RelationalProperty.

AngleNet · 2022-09-08T03:17:02Z

@leiysky Hi. I would like to take this issue. Is there anyone working on this? I swear I will finish this on my own ~~~ :(

leiysky · 2022-09-08T03:33:43Z

> @leiysky Hi. I would like to take this issue. Is there anyone working on this? I swear I will finish this on my own ~~~ :(

We have a plan to implement this after #6547 is done.

You are still welcome to take the issue, but I suggest you make a proposal first so we can review it and give some advice, since this is an important piece of infrastructure for the optimizer.

AngleNet · 2022-09-08T03:47:40Z

Great. I will write a proposal on this.

BohuTANG · 2022-09-08T04:50:31Z

Can this function be optimized?
#7468

AngleNet · 2022-09-08T07:48:50Z

> Can this function be optimized?
> #7468

I think so. I will consider it inside the proposal.

andylokandy · 2022-09-12T11:24:35Z

> #7468

This is about CSE optimization. Is it included in functional dependency optimization?

AngleNet · 2022-09-13T00:13:54Z

> #7468
>
> This is about CSE optimization. Is it included in functional dependency optimization?

Hi. I think we could do this via functional dependency. What do you mean by CSE optimization?

andylokandy · 2022-09-13T06:59:18Z

CSE is short for Common Subexpression Elimination. Let's explain with an example: given the query SELECT max(a+1), (a+1)/2 FROM t1, the CSE optimizer can rewrite it into SELECT max(b), b/2 FROM (SELECT a+1 AS b FROM t1). The actual implementation usually adds a column for the sub-expression and then hides it in the top projection, without constructing such a sub-query.
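The rewrite described above can be sketched in a few lines of Python. This is only an illustration under assumptions: expressions are modeled as nested tuples `(op, arg, ...)`, and the names `subexprs`, `eliminate_common`, and the generated `_cse0` column are invented here, not taken from any real optimizer.

```python
from collections import Counter


def subexprs(expr):
    """Yield every non-leaf subexpression of a nested-tuple expression."""
    if isinstance(expr, tuple):
        yield expr
        for arg in expr[1:]:
            yield from subexprs(arg)


def eliminate_common(exprs):
    """Replace subexpressions that occur more than once with hidden columns.

    Returns (hidden, rewritten): `hidden` maps a generated column name to the
    shared subexpression; `rewritten` is the projection list using those names.
    """
    counts = Counter(s for e in exprs for s in subexprs(e))
    hidden, names = {}, {}

    def rewrite(e):
        if not isinstance(e, tuple):
            return e  # column name or constant
        if counts[e] > 1:
            if e not in names:
                names[e] = f"_cse{len(names)}"
                hidden[names[e]] = e
            return names[e]
        return (e[0],) + tuple(rewrite(a) for a in e[1:])

    return hidden, [rewrite(e) for e in exprs]


# SELECT max(a+1), (a+1)/2 FROM t1
projection = [("max", ("+", "a", 1)), ("/", ("+", "a", 1), 2)]
hidden, rewritten = eliminate_common(projection)
```

Here `hidden` holds the extra column for `a+1`, and `rewritten` is the top projection that refers to it by name, mirroring the "add a column, then hide it" approach.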

I'm not familiar with functional dependency, but from the limited knowledge I've picked up from the wiki, a functional dependency is discovered from the relationship between the data in different columns and can be broken by updates to the data.

AngleNet · 2022-09-14T03:58:41Z

@andylokandy @BohuTANG Sorry. After some investigation, I think I was wrong about doing CSE optimization (the duplicate json_parse elimination) via functional dependency analysis. It's basically a different problem to solve. For CSE optimization, I think we could build a common sub-expression collector and rewrite the project list or predicate conditions in a select-from-where block using the discovered CSEs. The basic idea of the CSE collector is to rewrite each expression into a canonical form that uniquely identifies a set of semantically equal expressions, and to search for CSEs while visiting it. If a leaf node of an expression is a column, the canonical form rewriting replaces it with its equivalent column with the smallest column id, according to an equality inferrer that can infer whether two columns are equal.
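A rough Python sketch of the canonical-form idea, under stated assumptions: the equality inferrer is modeled as a union-find whose representative is the smallest column id, columns are `("col", id)` tuples, and commutative operators have their arguments sorted. The class and function names, the tuple encoding, and the `COMMUTATIVE` set are all hypothetical.

```python
class EqualityInferrer:
    """Union-find over column ids; the representative is the smallest id."""

    def __init__(self):
        self.parent = {}

    def find(self, col):
        self.parent.setdefault(col, col)
        while self.parent[col] != col:
            col = self.parent[col]
        return col

    def add_equal(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Always keep the smaller column id as the representative.
            self.parent[max(ra, rb)] = min(ra, rb)


COMMUTATIVE = {"+", "*", "=", "and", "or"}


def canonical(expr, eq):
    """Rewrite `expr` so that semantically equal expressions compare equal.

    Columns are ("col", id) tuples, other leaves are constants, and inner
    nodes are (op, arg, ...) tuples.
    """
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "col":
        return ("col", eq.find(expr[1]))
    args = [canonical(a, eq) for a in expr[1:]]
    if expr[0] in COMMUTATIVE:
        args.sort(key=repr)  # fixed argument order for commutative operators
    return (expr[0], *args)
```

Because the canonical forms are hashable tuples, a collector could use them as dictionary keys and count occurrences while visiting each expression.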

leiysky · 2022-09-15T03:17:27Z

@AngleNet I've read your proposal, but would you mind putting the document into a GitHub issue so we can review it without logging in to a Tencent account?

AngleNet · 2022-09-15T04:31:55Z

> @AngleNet I've read your proposal, but would you mind putting the document into a GitHub issue so we can review it without logging in to a Tencent account?

Good. I will put it in an issue later.

AngleNet · 2022-09-18T03:32:05Z

> > @AngleNet I've read your proposal, but would you mind putting the document into a GitHub issue so we can review it without logging in to a Tencent account?
>
> Good. I will put it in an issue later.

The draft doc is #7693

leiysky added the C-feature (Category: feature) and A-planner (Area: planner/optimizer) labels Sep 1, 2022

leiysky added this to Databend Query Planner Sep 1, 2022

leiysky moved this to 📒Backlog in Databend Query Planner Sep 1, 2022

xudong963 assigned AngleNet Sep 8, 2022

BohuTANG mentioned this issue Jan 3, 2023: Roadmap 2023 #9448 (Open, 9 tasks)

leiysky mentioned this issue Jan 19, 2023: feat(query): add optimize rule rule_eliminate_groupby #9678 (Closed)

leiysky mentioned this issue Apr 17, 2023: Feature: optimizer should push down order by ... limit ... clause to subquery when possible #11087 (Open)

xudong963 assigned Dousir9 and unassigned AngleNet Jul 11, 2023
