Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter push down need consider alias columns #2725

Closed
Ted-Jiang opened this issue Jun 13, 2022 · 4 comments · Fixed by #2729
Closed

Filter push down need consider alias columns #2725

Ted-Jiang opened this issue Jun 13, 2022 · 4 comments · Fixed by #2729
Labels
bug Something isn't working

Comments

@Ted-Jiang
Copy link
Member

Describe the bug
Filter push down not consider alias columns

To Reproduce

Logical plan

Received plan for execution: Limit: 50000
  Projection: #LINEORDER.LO_SHIPMODE, #TEST1
    Projection: #LO_SHIPMODE AS LINEORDER.LO_SHIPMODE, #testBITMAPCOUNTDISTINCT(_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_) AS TEST1
      Aggregate: groupBy=[[#LO_SHIPMODE]], aggr=[[testBITMAPCOUNTDISTINCT(#_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_)]]
        Projection: #LO_SHIPMODE, #dummy_LINEORDER_LO_SUPPKEY, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
          Filter: #LO_SHIPMODE IN ([CAST(Int32(1) AS Utf8), CAST(Int32(2) AS Utf8), CAST(Int32(3) AS Utf8), CAST(Int32(4) AS Utf8), CAST(Int32(5) AS Utf8)])
            Projection: #LO_ORDERKEY, #LO_LINENUMBER, #LO_CUSTKEY, #LO_PARTKEY, #dummy_LINEORDER_LO_SUPPKEY, #LO_ORDERDATE, #LO_ORDERPRIORITY, #LO_SHIPPRIORITY, #LO_QUANTITY, #LO_EXTENDEDPRICE, #LO_ORDTOTALPRICE, #LO_DISCOUNT, #LO_TAX, #LO_COMMITDATE, #LO_SHIPMODE, #_KY_COUNT__, #_KY_COUNT_DISTINCT_LINEORDER_LO_ORDERKEY_, #_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_, #_KY_PERCENTILE_APPROX_LINEORDER_LO_EXTENDEDPRICE_, #_KY_APPROX_TOPN_SUM_LINEORDER_LO_EXTENDEDPRICE_, #_KY_APPROX_BITOPN_SUM_LINEORDER_LO_EXTENDEDPRICE_, #_KY_APPROX_SUM_TOPN_LINEORDER_LO_EXTENDEDPRICE_, #_KY_APPROX_SUM_BITOPN_LINEORDER_LO_EXTENDEDPRICE_, #dummy_LINEORDER__KY_SUM_LINEORDER_LO_EXTENDEDPRICE_, #dummy_LINEORDER__KY_MAX_LINEORDER_LO_EXTENDEDPRICE_
              Projection: #ssb@test_udaf_cube_update@17179869183.32 AS LO_ORDERKEY, #ssb@test_udaf_cube_update@17179869183.31 AS LO_LINENUMBER, #ssb@test_udaf_cube_update@17179869183.30 AS LO_CUSTKEY, #ssb@test_udaf_cube_update@17179869183.29 AS LO_PARTKEY, Utf8("1") AS dummy_LINEORDER_LO_SUPPKEY, #ssb@test_udaf_cube_update@17179869183.33 AS LO_ORDERDATE, #ssb@test_udaf_cube_update@17179869183.28 AS LO_ORDERPRIORITY, #ssb@test_udaf_cube_update@17179869183.27 AS LO_SHIPPRIORITY, #ssb@test_udaf_cube_update@17179869183.26 AS LO_QUANTITY, #ssb@test_udaf_cube_update@17179869183.44 AS LO_EXTENDEDPRICE, #ssb@test_udaf_cube_update@17179869183.25 AS LO_ORDTOTALPRICE, #ssb@test_udaf_cube_update@17179869183.24 AS LO_DISCOUNT, #ssb@test_udaf_cube_update@17179869183.23 AS LO_TAX, #ssb@test_udaf_cube_update@17179869183.22 AS LO_COMMITDATE, #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.39 AS _KY_COUNT__, #ssb@test_udaf_cube_update@17179869183.40 AS _KY_COUNT_DISTINCT_LINEORDER_LO_ORDERKEY_, #ssb@test_udaf_cube_update@17179869183.41 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_, #ssb@test_udaf_cube_update@17179869183.42 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_, #ssb@test_udaf_cube_update@17179869183.43 AS _KY_PERCENTILE_APPROX_LINEORDER_LO_EXTENDEDPRICE_, #ssb@test_udaf_cube_update@17179869183.49 AS _KY_APPROX_TOPN_SUM_LINEORDER_LO_EXTENDEDPRICE_, #ssb@test_udaf_cube_update@17179869183.46 AS _KY_APPROX_BITOPN_SUM_LINEORDER_LO_EXTENDEDPRICE_, #ssb@test_udaf_cube_update@17179869183.47 AS _KY_APPROX_SUM_TOPN_LINEORDER_LO_EXTENDEDPRICE_, #ssb@test_udaf_cube_update@17179869183.48 AS _KY_APPROX_SUM_BITOPN_LINEORDER_LO_EXTENDEDPRICE_, Utf8("1") AS dummy_LINEORDER__KY_SUM_LINEORDER_LO_EXTENDEDPRICE_, Utf8("1") AS dummy_LINEORDER__KY_MAX_LINEORDER_LO_EXTENDEDPRICE_
                TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44])

logical plan optimize

Calculated optimized plan: Limit: 50000
  Projection: #LINEORDER.LO_SHIPMODE, #TEST1
    Projection: #LO_SHIPMODE AS LINEORDER.LO_SHIPMODE, #testBITMAPCOUNTDISTINCT(_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_) AS TEST1
      Aggregate: groupBy=[[#LO_SHIPMODE]], aggr=[[testBITMAPCOUNTDISTINCT(#_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_)]]
        Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
          Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
            Projection: #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.42 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
              Filter: #LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])
                TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([12, 37]), partial_filters=[#LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])]

physical plan optimize

                
create_physical_plan optimized plan: Limit: 50000
  Projection: #LINEORDER.LO_SHIPMODE, #TEST1
    Projection: #LO_SHIPMODE AS LINEORDER.LO_SHIPMODE, #testBITMAPCOUNTDISTINCT(_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_) AS TEST1
      Aggregate: groupBy=[[#LO_SHIPMODE]], aggr=[[testBITMAPCOUNTDISTINCT(#_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_)]]
        Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
          Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
            Projection: #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.42 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
              Filter: #LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])
                TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([12, 37]), partial_filters=[#LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])]

in logical plan Filter: #LO_SHIPMODE IN is after alias Projection: #LO_ORDERKEY
After optimize Filter is before Projection.
cause :

ERROR  Could not create physical plan: Error during planning: No field named '<unqualified>.LO_SHIPMODE'. Valid fields are 'ssb@test_udaf_cube_update@17179869183.21', 'ssb@test_udaf_cube_update@17179869183.42'.

Expected behavior
could find alias col

Additional context
Add any other context about the problem here.

@Ted-Jiang Ted-Jiang added the bug Something isn't working label Jun 13, 2022
@Ted-Jiang
Copy link
Member Author

@alamb PTAL I consider change in_list col alias name to origin name in optimize? Is this solution ok ? I need your opinion 😊

@Ted-Jiang
Copy link
Member Author

Debug log

After apply projection_push_down rule:

Optimized logical plan:
Limit: 50000
  Projection: #LINEORDER.LO_SHIPMODE, #ASS
    Projection: #LO_SHIPMODE AS LINEORDER.LO_SHIPMODE, #APPROXCOUNTDISTINCT(_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_,UInt8(10)) AS ASS
      Aggregate: groupBy=[[#LO_SHIPMODE]], aggr=[[testAPPROXCOUNTDISTINCT(#_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_, UInt8(10))]]
        Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
          Filter: #LO_SHIPMODE IN ([Utf8("SHIP"), Utf8("Rail"), Utf8("2321"), Utf8("MAIL")])
            Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
              Projection: #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.41 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
                TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([12, 36])

After apply filter_push_down rule:

Optimized logical plan:
Limit: 50000
  Projection: #LINEORDER.LO_SHIPMODE, #ASS
    Projection: #LO_SHIPMODE AS LINEORDER.LO_SHIPMODE, #APPROXCOUNTDISTINCT(_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_,UInt8(10)) AS ASS
      Aggregate: groupBy=[[#LO_SHIPMODE]], aggr=[[testAPPROXCOUNTDISTINCT(#_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_, UInt8(10))]]
        Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
          Projection: #LO_SHIPMODE, #_KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
            Projection: #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.41 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SHIPPRIORITY_LINEORDER_LO_SUPPKEY_
              Filter: #LO_SHIPMODE IN ([Utf8("SHIP"), Utf8("Rail"), Utf8("2321"), Utf8("MAIL")])
                TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([12, 36]), partial_filters=[#LO_SHIPMODE IN ([Utf8("SHIP"), Utf8("Rail"), Utf8("2321"), Utf8("MAIL")])]

After apply limit_push_down rule:

@alamb
Copy link
Contributor

alamb commented Jun 13, 2022

Hi @Ted-Jiang

If I read your conclusion corrctly, I think you are saying that the filter added by the filter pushdown rule is remapping the names somehow. Specifically, you are proposing:

Projection: #ssb@test_udaf_cube_update@17179869183.21 AS LO_SHIPMODE, #ssb@test_udaf_cube_update@17179869183.42 AS _KY_COUNT_DISTINCT_LINEORDER_LO_SUPPKEY_
  Filter: #LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])  <-- **** This filter should be `#ssb@test_udaf_cube_update@17179869183.21 IN ....`
    TableScan: ssb@test_udaf_cube_update@17179869183 projection=Some([12, 37]), partial_filters=[#LO_SHIPMODE IN ([Utf8("1"), Utf8("2"), Utf8("3"), Utf8("4"), Utf8("5")])]

If so I agree that sounds like a bug that should be corrected

@Ted-Jiang
Copy link
Member Author

If I read your conclusion corrctly, I think you are saying that the filter added by the filter pushdown rule is remapping the names somehow. Specifically, you are proposing:

You are right! Same idea as you, i am working on it 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants