Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove filter plan node in pipeline #341

Closed
ygf11 opened this issue Oct 26, 2022 · 5 comments · Fixed by #1126
Closed

Remove filter plan node in pipeline #341

ygf11 opened this issue Oct 26, 2022 · 5 comments · Fixed by #1126
Assignees
Labels
feature New feature or request

Comments

@ygf11
Copy link
Contributor

ygf11 commented Oct 26, 2022

Describe This Problem

For filter pushdown, we only pushdown partial filters to table scan now, that means the filter plan node is still preserved. For example:

Query:

SELECT  * FROM system.public.tables where table_name='1';

Plan:

| logical_plan  | Projection: #system.public.tables.timestamp, #system.public.tables.catalog, #system.public.tables.schema, #system.public.tables.table_name, #system.public.tables.table_id, #system.public.tables.engine
  Filter: #system.public.tables.table_name = Utf8("1")
    TableScan: system.public.tables projection=[timestamp, catalog, schema, table_name, table_id, engine], partial_filters=[#system.public.tables.table_name = Utf8("1")] |
| physical_plan | ProjectionExec: expr=[timestamp@0 as timestamp, catalog@1 as catalog, schema@2 as schema, table_name@3 as table_name, table_id@4 as table_id, engine@5 as engine]
  CoalesceBatchesExec: target_batch_size=4096
    FilterExec: table_name@3 = 1
      ScanTable: table=tables, parallelism=8, order=None,                                                                                                                         

Proposal

We have three derives of Table trait:
https://github.com/CeresDB/ceresdb/blob/28be2ed754a8dc404499b01797d0e45a412b3f65/analytic_engine/src/table/mod.rs#L84

https://github.com/CeresDB/ceresdb/blob/28be2ed754a8dc404499b01797d0e45a412b3f65/system_catalog/src/lib.rs#L91

https://github.com/CeresDB/ceresdb/blob/28be2ed754a8dc404499b01797d0e45a412b3f65/table_engine/src/memory.rs#L78

In #326, we will support filters pushdown fully in TableImpl. If we want to remove filter plan node in pipeline, we need add same filter logic to the other two derives.

Additional Context

No response

@ygf11 ygf11 added the feature New feature or request label Oct 26, 2022
@ygf11 ygf11 changed the title Remove filter pushdown plan node in pipeline Remove filter plan node in pipeline Oct 26, 2022
@ShiKaiWi
Copy link
Member

I remember that the predicate used by the filter plan node can be controlled by the implementation of TableProvider:
https://github.com/CeresDB/ceresdb/blob/a8fb2c92edb979051fbfbd339d72f65f94628f2c/table_engine/src/provider.rs#L143

That is to say, for different Table implementations, we can use different TableProviderFilterPushDown, and there is no need to add the same filter logic for the other two Table implementations: SystemTableAdapter & MemoryTable.

@ygf11
Copy link
Contributor Author

ygf11 commented Oct 27, 2022

That is to say, for different Table implementations, we can use different TableProviderFilterPushDown, and there is no need to add the same filter logic for the other two Table implementations: SystemTableAdapter & MemoryTable.

Yes, it is better to use different TableProviderFilterPushDown.

As you mentioned in #326, we can move the control of TableProviderFilterPushDown to the implementations of tables. like:

pub trait Table: std::fmt::Debug {
    fn supports_filter_pushdown(&self, _filter: &Expr) -> Result<TableProviderFilterPushDown>;
}

impl TableProvider for TableProviderAdapter {
    fn supports_filter_pushdown(&self, _filter: &Expr) -> Result<TableProviderFilterPushDown> {
        self.table.support_filter_pushdown()
    }
}

@ShiKaiWi
Copy link
Member

ShiKaiWi commented Oct 27, 2022

That is to say, for different Table implementations, we can use different TableProviderFilterPushDown, and there is no need to add the same filter logic for the other two Table implementations: SystemTableAdapter & MemoryTable.

Yes, it is better to use different TableProviderFilterPushDown.

As you mentioned in #326, we can move the control of TableProviderFilterPushDown to the implementations of tables. like:

pub trait Table: std::fmt::Debug {
    fn supports_filter_pushdown(&self, _filter: &Expr) -> Result<TableProviderFilterPushDown>;
}

impl TableProvider for TableProviderAdapter {
    fn supports_filter_pushdown(&self, _filter: &Expr) -> Result<TableProviderFilterPushDown> {
        self.table.support_filter_pushdown()
    }
}

👍 That sounds great! However, I guess it will be better to avoid using TableProviderFilterPushDown as the returned type of supports_filter_pushdown of our Table trait because Table trait has no any dependency on datafusion.

@jiacai2050
Copy link
Contributor

@dust1
Copy link
Contributor

dust1 commented May 15, 2023

assigne me!

jiacai2050 added a commit that referenced this issue Aug 9, 2023
## Rationale
close #341 

## Detailed Changes
I just change `supports_filter_pushdown` in `TableProvider` to return
`TableProviderFilterPushDown::Exact` to fix this

## Test Plan
pass

---------

Co-authored-by: jiacai2050 <dev@liujiacai.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants