@suxiaogang223
Contributor

What problem does this PR solve?

This pull request introduces support for "virtual tables" in the Hudi JNI scanner, allowing the system to handle cases where only partition columns (and no data fields) are required from a query. The implementation ensures correct handling of empty field lists throughout the scanner and vector table logic, and adds regression tests for this scenario.
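The "virtual table" case described above can be illustrated with a minimal sketch. It is not taken from the Doris code base: `Batch`, `parse_required_fields`, and `scan` are hypothetical names, and the comma-separated field spec is an assumption made for illustration. The point it shows is that a scan with an empty field list should still report row counts, so that partition column values can be filled in for every row.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Batch:
    """A columnar batch: column name -> values, plus an explicit row count."""
    num_rows: int
    columns: Dict[str, List[object]] = field(default_factory=dict)


def parse_required_fields(spec: str) -> List[str]:
    """Split a comma-separated field list (hypothetical format)."""
    # "".split(",") yields [""], so blanks are filtered out to get a
    # truly empty list when no data fields are requested.
    return [name for name in spec.split(",") if name]


def scan(rows: List[Dict[str, object]], required_fields: List[str],
         batch_size: int = 2) -> Iterator[Batch]:
    """Yield batches; with no required fields, only row counts are produced."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        if not required_fields:
            # Virtual-table case: nothing to materialize, but the row count
            # must still be reported to the caller.
            yield Batch(num_rows=len(chunk))
        else:
            yield Batch(
                num_rows=len(chunk),
                columns={f: [r[f] for r in chunk] for f in required_fields},
            )
```

Keeping the row count separate from the column data is the design choice that matters here: a batch with zero columns can otherwise look identical to a batch with zero rows.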

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@suxiaogang223 changed the title from "[fix](hudi) Handle cases where only partition columns (and no data fields) are required from a hudi jni query" to "[fix](hudi) Fix querying hudi jni table where only partition columns (and no data fields) are required" on Aug 29, 2025
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34511 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f64be866876b1177dee5f635a3a59ed68bd5fc30, data reload: false

------ Round 1 ----------------------------------
q1	17625	5467	5242	5242
q2	2014	336	214	214
q3	10318	1349	771	771
q4	10240	1035	530	530
q5	7557	2399	2383	2383
q6	186	174	142	142
q7	932	783	620	620
q8	9340	1380	1126	1126
q9	6983	5118	5116	5116
q10	6940	2369	1977	1977
q11	490	338	274	274
q12	356	382	235	235
q13	17771	3731	3075	3075
q14	235	248	229	229
q15	563	500	482	482
q16	428	428	403	403
q17	595	867	361	361
q18	7590	7117	7134	7117
q19	1141	982	597	597
q20	335	347	241	241
q21	3854	3246	2387	2387
q22	1080	1031	989	989
Total cold run time: 106573 ms
Total hot run time: 34511 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5269	5186	5193	5186
q2	256	349	230	230
q3	2173	2737	2257	2257
q4	1353	1769	1343	1343
q5	4200	4548	4540	4540
q6	231	176	140	140
q7	2026	1959	1897	1897
q8	2679	2485	2531	2485
q9	7385	7342	7281	7281
q10	3124	3350	2949	2949
q11	580	535	503	503
q12	732	769	657	657
q13	3571	3942	3271	3271
q14	300	304	289	289
q15	523	486	499	486
q16	662	515	441	441
q17	1219	1629	1343	1343
q18	7821	7732	7565	7565
q19	887	897	981	897
q20	2063	2033	1941	1941
q21	4899	4414	4347	4347
q22	1069	1029	1025	1025
Total cold run time: 53022 ms
Total hot run time: 51073 ms

@doris-robot

TPC-DS: Total hot run time: 183761 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f64be866876b1177dee5f635a3a59ed68bd5fc30, data reload: false

query1	1050	469	411	411
query2	6588	1753	1756	1753
query3	6748	231	222	222
query4	25755	23417	22924	22924
query5	4442	669	516	516
query6	359	254	241	241
query7	4671	544	300	300
query8	312	264	260	260
query9	8646	2915	2887	2887
query10	518	369	302	302
query11	15464	15026	14822	14822
query12	181	129	128	128
query13	1690	590	453	453
query14	9444	5809	5849	5809
query15	217	192	177	177
query16	7697	686	507	507
query17	1327	780	644	644
query18	2082	448	393	393
query19	207	203	178	178
query20	139	130	124	124
query21	221	134	112	112
query22	4187	4024	4042	4024
query23	33910	32825	32820	32820
query24	8254	2365	2366	2365
query25	569	525	442	442
query26	1234	287	169	169
query27	2660	521	350	350
query28	4439	2270	2242	2242
query29	771	591	498	498
query30	293	225	199	199
query31	916	832	740	740
query32	90	81	88	81
query33	564	388	370	370
query34	793	856	525	525
query35	851	818	765	765
query36	969	1022	937	937
query37	131	112	94	94
query38	4051	4050	4001	4001
query39	1491	1462	1428	1428
query40	262	135	133	133
query41	74	66	62	62
query42	138	114	117	114
query43	529	533	471	471
query44	1365	877	869	869
query45	181	179	168	168
query46	881	1015	653	653
query47	1796	1852	1770	1770
query48	401	423	358	358
query49	724	516	424	424
query50	678	685	413	413
query51	4090	4238	4015	4015
query52	122	118	106	106
query53	259	281	213	213
query54	625	612	558	558
query55	101	91	96	91
query56	352	341	333	333
query57	1196	1202	1143	1143
query58	296	289	284	284
query59	2717	2710	2517	2517
query60	373	363	356	356
query61	170	163	168	163
query62	802	755	671	671
query63	233	196	207	196
query64	4468	1168	882	882
query65	4387	4233	4199	4199
query66	1111	444	352	352
query67	15382	15305	15046	15046
query68	9487	937	584	584
query69	493	332	292	292
query70	1201	1167	1113	1113
query71	468	353	320	320
query72	5401	2650	5303	2650
query73	815	768	359	359
query74	9037	9067	8745	8745
query75	4262	3081	2646	2646
query76	4858	1156	740	740
query77	1042	417	354	354
query78	9512	9880	8773	8773
query79	1332	861	586	586
query80	761	602	521	521
query81	560	258	236	236
query82	198	146	115	115
query83	293	274	317	274
query84	297	122	89	89
query85	860	476	440	440
query86	350	340	297	297
query87	4308	4253	4174	4174
query88	2891	2267	2263	2263
query89	414	344	293	293
query90	2089	243	234	234
query91	165	175	138	138
query92	87	78	77	77
query93	1134	996	675	675
query94	703	410	329	329
query95	408	338	332	332
query96	484	609	288	288
query97	2682	2709	2551	2551
query98	250	221	219	219
query99	1460	1405	1281	1281
Total cold run time: 275946 ms
Total hot run time: 183761 ms

@doris-robot

ClickBench: Total hot run time: 33.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f64be866876b1177dee5f635a3a59ed68bd5fc30, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.06
query3	0.26	0.08	0.08
query4	1.60	0.12	0.12
query5	0.45	0.43	0.43
query6	1.18	0.65	0.67
query7	0.03	0.03	0.03
query8	0.06	0.05	0.05
query9	0.59	0.53	0.53
query10	0.60	0.59	0.58
query11	0.17	0.12	0.11
query12	0.17	0.13	0.13
query13	0.63	0.63	0.62
query14	0.81	0.85	0.84
query15	0.89	0.86	0.89
query16	0.39	0.41	0.38
query17	1.06	1.06	1.04
query18	0.22	0.20	0.21
query19	1.92	1.85	1.82
query20	0.02	0.01	0.01
query21	15.40	1.00	0.59
query22	0.78	1.05	0.75
query23	14.94	1.36	0.63
query24	7.27	1.06	0.94
query25	0.51	0.22	0.11
query26	0.65	0.17	0.15
query27	0.05	0.05	0.05
query28	9.80	0.93	0.44
query29	12.57	3.92	3.26
query30	3.14	3.05	2.95
query31	2.82	0.58	0.39
query32	3.24	0.56	0.48
query33	3.14	3.09	3.15
query34	15.98	5.48	4.88
query35	4.91	4.93	4.94
query36	0.72	0.52	0.51
query37	0.10	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 107.65 s
Total hot run time: 33.1 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage `` 🎉
Increment coverage report
Complete coverage report

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34164 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 15f001b0c610bcb467980b09086a2d9306241d2a, data reload: false

------ Round 1 ----------------------------------
q1	17656	5234	5117	5117
q2	1989	350	210	210
q3	10243	1391	746	746
q4	10226	1031	565	565
q5	7532	2403	2325	2325
q6	185	168	137	137
q7	942	781	636	636
q8	9340	1343	1163	1163
q9	7005	5119	5080	5080
q10	6878	2403	1998	1998
q11	485	308	272	272
q12	348	354	227	227
q13	17781	3678	3034	3034
q14	233	255	220	220
q15	542	501	487	487
q16	454	432	368	368
q17	589	857	361	361
q18	7535	7274	7064	7064
q19	1226	972	572	572
q20	340	337	231	231
q21	3918	3236	2384	2384
q22	1082	1044	967	967
Total cold run time: 106529 ms
Total hot run time: 34164 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5476	5136	5133	5133
q2	282	329	232	232
q3	2163	2701	2306	2306
q4	1359	1831	1370	1370
q5	4243	4486	4545	4486
q6	222	178	145	145
q7	2123	1994	1820	1820
q8	2661	2532	2548	2532
q9	7568	7510	7145	7145
q10	3096	3310	2975	2975
q11	570	529	500	500
q12	722	843	676	676
q13	3537	3955	3242	3242
q14	313	300	274	274
q15	524	486	487	486
q16	441	508	433	433
q17	1178	1625	1379	1379
q18	7940	7730	7487	7487
q19	846	920	981	920
q20	2033	2073	1911	1911
q21	5092	4698	4345	4345
q22	1068	1043	1006	1006
Total cold run time: 53457 ms
Total hot run time: 50803 ms

@doris-robot

TPC-DS: Total hot run time: 187191 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 15f001b0c610bcb467980b09086a2d9306241d2a, data reload: false

query1	1047	493	424	424
query2	6573	1763	1778	1763
query3	6755	228	224	224
query4	26654	23884	22812	22812
query5	4731	660	526	526
query6	392	243	215	215
query7	4692	517	299	299
query8	306	260	265	260
query9	8630	2962	2952	2952
query10	510	370	313	313
query11	15767	15066	14829	14829
query12	175	121	118	118
query13	1670	567	460	460
query14	9479	6006	5887	5887
query15	206	189	170	170
query16	7432	664	479	479
query17	1199	761	629	629
query18	2044	433	343	343
query19	206	201	185	185
query20	135	124	122	122
query21	227	132	118	118
query22	4027	4318	4030	4030
query23	34163	33042	33090	33042
query24	8318	2384	2391	2384
query25	634	521	430	430
query26	1250	277	171	171
query27	2732	523	363	363
query28	4378	2343	2333	2333
query29	764	600	488	488
query30	291	221	201	201
query31	927	805	716	716
query32	95	87	83	83
query33	583	410	388	388
query34	828	861	527	527
query35	828	836	742	742
query36	976	1046	924	924
query37	143	124	99	99
query38	4136	4068	4103	4068
query39	1537	1439	1452	1439
query40	232	146	139	139
query41	74	70	70	70
query42	140	121	122	121
query43	530	513	505	505
query44	1421	901	912	901
query45	189	177	181	177
query46	886	1037	662	662
query47	1784	1831	1735	1735
query48	417	460	324	324
query49	775	529	437	437
query50	657	738	423	423
query51	4099	4303	4121	4121
query52	128	111	106	106
query53	255	274	200	200
query54	618	612	544	544
query55	103	97	98	97
query56	353	344	386	344
query57	1208	1225	1139	1139
query58	297	285	290	285
query59	2598	2780	2672	2672
query60	363	354	339	339
query61	161	157	156	156
query62	816	713	661	661
query63	239	202	194	194
query64	4448	1148	843	843
query65	4293	4258	4280	4258
query66	1155	442	342	342
query67	15673	15253	15088	15088
query68	9085	946	594	594
query69	476	332	305	305
query70	1234	1120	1148	1120
query71	587	328	374	328
query72	6012	5016	5184	5016
query73	776	685	367	367
query74	8883	9127	8922	8922
query75	4304	3098	2662	2662
query76	3705	1155	748	748
query77	821	437	337	337
query78	9618	9796	8874	8874
query79	2043	882	613	613
query80	739	581	535	535
query81	508	263	229	229
query82	241	140	113	113
query83	301	269	246	246
query84	308	112	100	100
query85	851	460	428	428
query86	358	331	308	308
query87	4319	4366	4220	4220
query88	2884	2232	2229	2229
query89	417	332	288	288
query90	2077	237	237	237
query91	163	163	135	135
query92	95	81	76	76
query93	1427	1020	654	654
query94	685	406	327	327
query95	420	330	336	330
query96	487	607	289	289
query97	2681	2732	2606	2606
query98	241	225	220	220
query99	1449	1451	1296	1296
Total cold run time: 277547 ms
Total hot run time: 187191 ms

@doris-robot

ClickBench: Total hot run time: 32.75 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 15f001b0c610bcb467980b09086a2d9306241d2a, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.05	0.06
query3	0.26	0.08	0.09
query4	1.61	0.11	0.11
query5	0.44	0.41	0.41
query6	1.17	0.66	0.64
query7	0.04	0.03	0.04
query8	0.06	0.04	0.04
query9	0.61	0.52	0.53
query10	0.58	0.58	0.57
query11	0.17	0.12	0.11
query12	0.16	0.12	0.12
query13	0.63	0.63	0.63
query14	0.84	0.82	0.86
query15	0.86	0.86	0.87
query16	0.39	0.44	0.39
query17	1.07	1.06	1.07
query18	0.22	0.20	0.20
query19	1.94	1.87	1.87
query20	0.02	0.02	0.01
query21	15.40	0.94	0.59
query22	0.80	1.38	1.01
query23	14.67	1.38	0.68
query24	6.80	1.42	0.47
query25	0.47	0.18	0.06
query26	0.54	0.16	0.13
query27	0.06	0.06	0.06
query28	9.64	0.96	0.43
query29	12.55	3.90	3.26
query30	3.16	3.04	3.00
query31	2.83	0.57	0.38
query32	3.26	0.56	0.47
query33	3.03	3.09	3.20
query34	15.79	5.45	4.80
query35	4.92	4.90	4.90
query36	0.69	0.51	0.50
query37	0.11	0.08	0.07
query38	0.07	0.05	0.04
query39	0.04	0.03	0.04
query40	0.19	0.14	0.13
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 106.42 s
Total hot run time: 32.75 s

@suxiaogang223
Contributor Author

run p0

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Sep 4, 2025
@github-actions
Contributor

github-actions bot commented Sep 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Sep 4, 2025

PR approved by anyone and no changes requested.

@morningman merged commit 3b64ae3 into apache:master on Sep 4, 2025
29 of 30 checks passed
github-actions bot pushed a commit that referenced this pull request Sep 4, 2025
…(and no data fields) are required (#55466)
github-actions bot pushed a commit that referenced this pull request Sep 4, 2025
…(and no data fields) are required (#55466)
wenzhenghu pushed a commit to wenzhenghu/doris that referenced this pull request Sep 8, 2025
…(and no data fields) are required (apache#55466)
morrySnow pushed a commit that referenced this pull request Sep 8, 2025
…ion columns (and no data fields) are required #55466 (#55662)

Cherry-picked from #55466

Co-authored-by: Socrates <suyiteng@selectdb.com>
@morrySnow morrySnow mentioned this pull request Sep 22, 2025
@suxiaogang223 suxiaogang223 deleted the fix_hudi_jni branch September 23, 2025 03:19
yiguolei pushed a commit to yiguolei/incubator-doris that referenced this pull request Sep 28, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.x, dev/3.1.1-merged, reviewed
