Skip to content

[Bug] Where condition filtering for row_number results in incorrect results #56504

@yx-keith

Description

@yx-keith

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.1.8

What's Wrong?

-- 数据库原始数据:

select a.*
from hive_catalog.ecology_ods.ods_cus_fielddata as a
where a.pt = TO_DATE(DAYS_ADD(NOW(), -1))
  and `scope` = 'HrmCustomFieldByInfoType'
  and scopeid = 5
  and id = 23;

+----------+--------------------------+---------+------+--------+--------+------------+--------+--------+--------+--------+--------+--------+--------+---------+---------+---------+---------+---------+---------+---------+------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------------------+-----------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------------+
| seqorder | scope | scopeid | id | field0 | field1 | field2 | field3 | field4 | field5 | field6 | field7 | field8 | field9 | field10 | field11 | field12 | field13 | field14 | field15 | field16 | field17 | field18 | field19 | field20 | field21 | field22 | field23 | field24 | field25 | field26 | field27 | field28 | field29 | field30 | field31 | field32 | field33 | field34 | field35 | field36 | field37 | field38 | field39 | field40 | field41 | field42 | field43 | field44 | field45 | field46 | field47 | field48 | field49 | field50 | field51 | field52 | field53 | field54 | field55 | field56 | field57 | field58 | field59 | field60 | field61 | field62 | field63 | field64 | field65 | field66 | field67 | field68 | field69 | field70 | field71 | field72 | field73 | field74 | field75 | field76 | field77 | field78 | field79 | field80 | field81 | pt |
+----------+--------------------------+---------+------+--------+--------+------------+--------+--------+--------+--------+--------+--------+--------+---------+---------+---------+---------+---------+---------+---------+------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------------------+-----------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------------+
| 2549 | HrmCustomFieldByInfoType | 5 | 23 | NULL | NULL | 2013-07-20 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2018-02-15 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 蜂网投资有限公司 | 办公室主任 | | | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2025-09-24 |
| 2550 | HrmCustomFieldByInfoType | 5 | 23 | NULL | NULL | 2013-03-13 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2013-06-13 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 上海意邦建材 | 客户经理 | | | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2025-09-24 |
| 2551 | HrmCustomFieldByInfoType | 5 | 23 | NULL | NULL | 2009-09-01 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2012-11-07 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 上海熊猫机械集团 | 客户部主管 | | | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 2025-09-24 |
+----------+--------------------------+---------+------+--------+--------+------------+--------+--------+--------+--------+--------+--------+--------+---------+---------+---------+---------+---------+---------+---------+------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------------------------+-----------------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------------+
3 rows in set (0.02 sec)

-- 不带 where 条件:
with cus as (
select a.*
from hive_catalog.ecology_ods.ods_cus_fielddata as a
where a.pt = TO_DATE(DAYS_ADD(NOW(), -1))
and scope = 'HrmCustomFieldByInfoType'
and scopeid = 5
and id = 23
)
select a.id
, a.workex as last_work
, a.rn as before_experience
, descrn
from (
select a.id
, a.field2
, concat_ws(',', a.field52, a.field53, a.field2, a.field17) as workex
, row_number() over (partition by a.id order by a.field2 desc) as descrn
, row_number() over (partition by a.id order by a.field2) as rn
from cus a
where a.scopeid = 5
) a

+------+----------------------------------------------------------------+-------------------+--------+
| id | last_work | before_experience | descrn |
+------+----------------------------------------------------------------+-------------------+--------+
| 23 | xx投资有限公司,办公室主任,2013-07-20,2018-02-15 | 3 | 1 |
| 23 | xx建材,客户经理,2013-03-13,2013-06-13 | 2 | 2 |
| 23 | xx机械集团,客户部主管,2009-09-01,2012-11-07 | 1 | 3 |
+------+----------------------------------------------------------------+-------------------+--------+
3 rows in set (0.02 sec)

加上最后的where 条件
with cus as (
select a.*
from hive_catalog.ecology_ods.ods_cus_fielddata as a
where a.pt = TO_DATE(DAYS_ADD(NOW(), -1))
and scope = 'HrmCustomFieldByInfoType'
and scopeid = 5
and id = 23
)
select a.id
, a.workex as last_work
, a.rn as before_experience
, descrn
from (
select a.id
, a.field2
, concat_ws(',', a.field52, a.field53, a.field2, a.field17) as workex
, row_number() over (partition by a.id order by a.field2 desc) as descrn
, row_number() over (partition by a.id order by a.field2) as rn
from cus a
where a.scopeid = 5
) a
where descrn = 1;
+------+----------------------------------------------------------------+-------------------+--------+
| id | last_work | before_experience | descrn |
+------+----------------------------------------------------------------+-------------------+--------+
| 23 | xx投资有限公司,办公室主任,2013-07-20,2018-02-15 | 1 | 1 |
+------+----------------------------------------------------------------+-------------------+--------+
1 row in set (0.02 sec)

期望的正确结果应该是:
+------+----------------------------------------------------------------+-------------------+--------+
| id | last_work | before_experience | descrn |
+------+----------------------------------------------------------------+-------------------+--------+
| 23 | xx投资有限公司,办公室主任,2013-07-20,2018-02-15 | 3 | 1 |
+------+----------------------------------------------------------------+-------------------+--------+
1 row in set (0.02 sec)

执行计划如下:
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| id[#102] |
| last_work[#103] |
| before_experience[#104] |
| descrn[#105] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| MYSQL_PROTOCAL |
| |
| 7:VEXCHANGE |
| offset: 0 |
| distribute expr lists: id[#102] |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: id[#90] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 07 |
| UNPARTITIONED |
| |
| 6:VANALYTIC(450) |
| | functions: [row_number()] |
| | partition by: id[#97] |
| | order by: field2[#98] ASC NULLS FIRST |
| | window: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
| | predicates: (descrn[#100] = 1) -- 这里 谓词下推 导致的 |
| | final projections: id[#97], workex[#99], rn[#101], descrn[#100] |
| | final project output tuple id: 7 |
| | distribute expr lists: id[#97] |
| | |
| 5:VSORT(445) |
| | order by: id[#97] ASC, field2[#98] ASC |
| | offset: 0 |
| | distribute expr lists: id[#93] |
| | |
| 4:VANALYTIC(440) |
| | functions: [row_number()] |
| | partition by: id[#93] |
| | order by: field2[#94] DESC NULLS LAST |
| | window: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
| | distribute expr lists: id[#93] |
| | |
| 3:VSORT(435) |
| | order by: id[#93] ASC, field2[#94] DESC |
| | offset: 0 |
| | distribute expr lists: id[#90] |
| | |
| 2:VEXCHANGE |
| offset: 0 |
| distribute expr lists: |
| |
| PLAN FRAGMENT 2 |
| |
| PARTITION: RANDOM |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| HASH_PARTITIONED: id[#90] |
| |
| 1:VPartitionTopN(425) |
| | functions: row_number |
| | partition by: id[#87] |
| | order by: field2[#91] DESC |
| | has global limit: false |
| | partition limit: 1 |
| | partition topn phase: TWO_PHASE_LOCAL_PTOPN |
| | distribute expr lists: |
| | |
| 0:VHIVE_SCAN_NODE(410) |
| table: ods_cus_fielddata |
| predicates: (scopeid[#2] = 5), (CAST(pt[#86] AS datetime) = '2025-09-24 00:00:00'), (scope[#1] = 'HrmCustomFieldByInfoType'), (id[#3] = 23) |
| inputSplitNum=1, totalFileSize=441451, scanRanges=1 |
| partition=1/288 |
| cardinality=130668, numNodes=5 |
| pushdown agg=NONE |
| final projections: id[#3], field2[#6], concat_ws(',', field52[#56], field53[#57], field2[#6], field17[#21]) |
| final project output tuple id: 1 |
| |
| |
| |
| ========== STATISTICS ========== |
| planed with unknown column statistics |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
91 rows in set (0.10 sec)

What You Expected?

希望得到正确的结果

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions