Skip to content

Conversation

@kaijchen
Copy link
Member

@kaijchen kaijchen commented Jun 20, 2025

What problem does this PR solve?

This commit fixes an Undefined Behavior Sanitizer error triggered when printing PUniqueId directly to an ostream.
Among the three types of unique IDs, UniqueId, PUniqueId (protobuf), and TUniqueId (thrift),
only UniqueId has an overloaded operator<<().
Attempting to stream the other two types leads to undefined behavior, making the current approach error-prone.

Introduces a unified print_id() function for safely printing all unique ID types (UniqueId, PUniqueId, TUniqueId).
It is recommended to use print_id() instead of streaming directly.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

This commit fixes an Undefined Behavior Sanitizer error triggered when printing PUniqueId directly to an ostream.
Among the three types of unique IDs, `UniqueId`, `PUniqueId` (protobuf), and `TUniqueId` (thrift),
only `UniqueId` has an overloaded `operator<<`.
Attempting to stream the other two types leads to undefined behavior, making the current approach error-prone.

Introduces a unified `print_id()` function for safely printing all unique ID types (UniqueId, PUniqueId, TUniqueId).
It is recommended to use `print_id()` instead of streaming directly.
@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaijchen
Copy link
Member Author

run buildall

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 25.00% (1/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.37% (15055/26706)
Line Coverage 45.16% (134731/298358)
Region Coverage 44.30% (67734/152914)
Branch Coverage 38.92% (34784/89374)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 25.00% (1/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 61.18% (16077/26278)
Line Coverage 50.72% (151250/298199)
Region Coverage 48.12% (86559/179881)
Branch Coverage 41.56% (42469/102180)

1 similar comment
@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 25.00% (1/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 61.18% (16077/26278)
Line Coverage 50.72% (151250/298199)
Region Coverage 48.12% (86559/179881)
Branch Coverage 41.56% (42469/102180)

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 23, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@doris-robot
Copy link

TPC-H: Total hot run time: 34378 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c6988ea4e889630da0269702b65ea5cb49b69c29, data reload: false

------ Round 1 ----------------------------------
q1	17597	5175	5109	5109
q2	1946	310	195	195
q3	10270	1313	775	775
q4	10226	1051	551	551
q5	7544	2401	2434	2401
q6	196	178	139	139
q7	944	799	611	611
q8	9337	1340	1126	1126
q9	6767	5145	5106	5106
q10	6945	2398	2035	2035
q11	478	310	280	280
q12	355	363	218	218
q13	17815	3702	3126	3126
q14	234	249	223	223
q15	576	492	494	492
q16	444	446	378	378
q17	611	897	357	357
q18	7761	7216	7024	7024
q19	1676	957	597	597
q20	332	360	229	229
q21	3855	3305	2442	2442
q22	1029	1043	964	964
Total cold run time: 106938 ms
Total hot run time: 34378 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5229	5128	5160	5128
q2	250	325	224	224
q3	2190	2727	2313	2313
q4	1379	1844	1353	1353
q5	4261	4270	4404	4270
q6	226	171	139	139
q7	2015	1889	1747	1747
q8	2663	2689	2628	2628
q9	7062	7005	7154	7005
q10	3111	3285	2836	2836
q11	569	508	502	502
q12	724	737	640	640
q13	3592	4044	3388	3388
q14	282	310	282	282
q15	529	488	473	473
q16	466	504	473	473
q17	1225	1601	1383	1383
q18	7378	7334	7044	7044
q19	823	807	818	807
q20	1941	1984	1808	1808
q21	4821	4485	4391	4391
q22	1057	1052	991	991
Total cold run time: 51793 ms
Total hot run time: 49825 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 186554 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c6988ea4e889630da0269702b65ea5cb49b69c29, data reload: false

query1	987	403	403	403
query2	6556	1924	1903	1903
query3	6760	227	224	224
query4	26311	23529	23560	23529
query5	4324	631	470	470
query6	316	207	203	203
query7	4636	499	295	295
query8	279	235	213	213
query9	8616	2633	2642	2633
query10	488	326	275	275
query11	15824	15013	14863	14863
query12	171	112	111	111
query13	1655	559	429	429
query14	9688	6329	6397	6329
query15	195	185	184	184
query16	7577	628	500	500
query17	1174	710	555	555
query18	2015	399	302	302
query19	194	201	157	157
query20	121	116	114	114
query21	215	126	109	109
query22	4001	4033	4047	4033
query23	33991	33069	32874	32874
query24	8511	2372	2409	2372
query25	547	451	421	421
query26	1251	274	152	152
query27	2772	514	342	342
query28	4307	2137	2107	2107
query29	727	547	446	446
query30	292	217	193	193
query31	945	851	769	769
query32	76	68	66	66
query33	557	412	307	307
query34	819	875	560	560
query35	816	844	726	726
query36	952	994	901	901
query37	117	105	76	76
query38	4121	4098	4009	4009
query39	1470	1406	1414	1406
query40	214	128	110	110
query41	63	60	59	59
query42	155	115	112	112
query43	495	509	493	493
query44	1389	842	841	841
query45	189	173	166	166
query46	899	1023	647	647
query47	1728	1843	1700	1700
query48	384	439	312	312
query49	760	494	405	405
query50	686	699	418	418
query51	4184	4250	4162	4162
query52	112	117	102	102
query53	230	268	192	192
query54	579	571	517	517
query55	90	88	88	88
query56	324	305	302	302
query57	1188	1186	1131	1131
query58	270	271	267	267
query59	2690	2654	2555	2555
query60	331	335	322	322
query61	156	148	151	148
query62	817	754	665	665
query63	235	191	200	191
query64	4343	1128	771	771
query65	4271	4162	4141	4141
query66	1084	411	319	319
query67	15988	15459	15562	15459
query68	7844	903	540	540
query69	464	306	285	285
query70	1211	1099	1116	1099
query71	482	325	315	315
query72	5758	4764	4894	4764
query73	714	666	364	364
query74	8864	9072	8670	8670
query75	3886	3235	2692	2692
query76	3708	1217	765	765
query77	782	397	341	341
query78	10033	10142	9310	9310
query79	2225	819	607	607
query80	602	529	477	477
query81	491	259	231	231
query82	456	131	99	99
query83	258	262	236	236
query84	256	100	84	84
query85	834	365	314	314
query86	393	310	301	301
query87	4474	4407	4334	4334
query88	3849	2321	2328	2321
query89	386	313	297	297
query90	1874	226	217	217
query91	145	146	118	118
query92	77	64	60	60
query93	1832	954	592	592
query94	673	430	287	287
query95	377	305	295	295
query96	497	576	288	288
query97	2689	2716	2632	2632
query98	240	204	205	204
query99	1411	1438	1259	1259
Total cold run time: 276013 ms
Total hot run time: 186554 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 29.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c6988ea4e889630da0269702b65ea5cb49b69c29, data reload: false

query1	0.03	0.04	0.03
query2	0.07	0.04	0.04
query3	0.24	0.07	0.07
query4	1.60	0.10	0.11
query5	0.43	0.41	0.42
query6	1.19	0.65	0.68
query7	0.02	0.02	0.01
query8	0.05	0.04	0.03
query9	0.58	0.54	0.52
query10	0.57	0.58	0.56
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.63	0.62	0.61
query14	0.81	0.82	0.82
query15	0.91	0.88	0.88
query16	0.39	0.39	0.40
query17	1.08	1.06	1.05
query18	0.23	0.21	0.21
query19	1.96	1.80	1.91
query20	0.01	0.02	0.01
query21	15.44	0.88	0.54
query22	0.75	1.28	0.69
query23	14.87	1.41	0.60
query24	8.42	1.05	0.40
query25	0.45	0.12	0.08
query26	0.68	0.17	0.14
query27	0.07	0.06	0.05
query28	9.17	0.86	0.44
query29	12.55	4.06	3.33
query30	0.25	0.10	0.07
query31	2.81	0.63	0.41
query32	3.23	0.55	0.48
query33	3.07	3.13	3.08
query34	15.84	5.39	4.79
query35	4.85	4.83	4.85
query36	0.71	0.52	0.51
query37	0.09	0.07	0.07
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.19	0.14	0.14
query41	0.08	0.02	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.78 s
Total hot run time: 29.2 s

@yiguolei yiguolei merged commit 1552f2c into apache:master Jun 23, 2025
27 of 30 checks passed
github-actions bot pushed a commit that referenced this pull request Jun 25, 2025
### What problem does this PR solve?

This commit fixes an Undefined Behavior Sanitizer error triggered when
printing `PUniqueId` directly to an ostream.
Among the three types of unique IDs, `UniqueId`, `PUniqueId` (protobuf),
and `TUniqueId` (thrift),
only `UniqueId` has an overloaded `operator<<()`.
Attempting to stream the other two types leads to undefined behavior,
making the current approach error-prone.

Introduces a unified `print_id()` function for safely printing all
unique ID types (`UniqueId`, `PUniqueId`, `TUniqueId`).
It is recommended to use `print_id()` instead of streaming directly.
github-actions bot pushed a commit that referenced this pull request Jun 25, 2025
### What problem does this PR solve?

This commit fixes an Undefined Behavior Sanitizer error triggered when
printing `PUniqueId` directly to an ostream.
Among the three types of unique IDs, `UniqueId`, `PUniqueId` (protobuf),
and `TUniqueId` (thrift),
only `UniqueId` has an overloaded `operator<<()`.
Attempting to stream the other two types leads to undefined behavior,
making the current approach error-prone.

Introduces a unified `print_id()` function for safely printing all
unique ID types (`UniqueId`, `PUniqueId`, `TUniqueId`).
It is recommended to use `print_id()` instead of streaming directly.
morrySnow pushed a commit that referenced this pull request Jun 25, 2025
…52042 (#52260)

Cherry-picked from #52042

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
dataroaring pushed a commit that referenced this pull request Jun 26, 2025
…52042 (#52261)

Cherry-picked from #52042

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
hello-stephen pushed a commit that referenced this pull request Jun 27, 2025
### What problem does this PR solve?

Followup #52042 
Modifying `hdr.load_id()` is unsafe and may cause undefined behavior.
Also passing load_id by reference in constructors.
github-actions bot pushed a commit that referenced this pull request Jun 27, 2025
### What problem does this PR solve?

Followup #52042 
Modifying `hdr.load_id()` is unsafe and may cause undefined behavior.
Also passing load_id by reference in constructors.
github-actions bot pushed a commit that referenced this pull request Jun 27, 2025
### What problem does this PR solve?

Followup #52042 
Modifying `hdr.load_id()` is unsafe and may cause undefined behavior.
Also passing load_id by reference in constructors.
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Jun 30, 2025
Followup apache#52042
Modifying `hdr.load_id()` is unsafe and may cause undefined behavior.
Also passing load_id by reference in constructors.
koarz pushed a commit to koarz/doris that referenced this pull request Jul 3, 2025
…pache#52042 (apache#52261)

Cherry-picked from apache#52042

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/3.0.7-merged dev/3.1.0-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants