@qidaye qidaye commented Aug 8, 2024

Proposed changes

Elasticsearch has no dedicated array type; any of its fields can contain zero or more values.
When such a field holds a single value and is mapped as an array type in Doris, parsing it causes a segmentation fault.
This PR adds a check before the JSON value is parsed as an array.

Issue Number: close #39102
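
For readers unfamiliar with the failure mode, here is a minimal sketch of the kind of guard described above. It is not the actual Doris backend code: the function name `parse_es_array_field` is hypothetical, and raising an error on the non-array path is an assumption, since the description does not say what the real check does in that case.

```python
import json


def parse_es_array_field(raw_json: str) -> list:
    """Parse an Elasticsearch field value that Doris maps as an array type.

    Hypothetical helper for illustration only.
    """
    value = json.loads(raw_json)
    # Guard: Elasticsearch may return a bare scalar for a field that is
    # mapped as an array, so verify the shape before iterating over it.
    if not isinstance(value, list):
        # Assumption: fail cleanly instead of treating the scalar as an array.
        raise ValueError(f"expected a JSON array, got {type(value).__name__}")
    return value
```

Under these assumptions, `parse_es_array_field('[1, 2]')` returns the list unchanged, while a single value such as `'"a"'` is rejected with a clear error rather than crashing the process.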

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

qidaye commented Aug 8, 2024

run buildall

github-actions bot commented Aug 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 39315 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2dc8f7588ac37ffdd51c204959a643aa516125a0, data reload: false

------ Round 1 ----------------------------------
q1	17732	4478	4379	4379
q2	2560	185	188	185
q3	11283	1149	1094	1094
q4	10221	728	766	728
q5	7593	2581	2587	2581
q6	227	147	144	144
q7	1084	609	611	609
q8	9583	1918	2033	1918
q9	8826	6530	6513	6513
q10	7058	2202	2159	2159
q11	467	248	245	245
q12	390	219	213	213
q13	17767	2960	2950	2950
q14	289	242	231	231
q15	525	476	491	476
q16	513	388	392	388
q17	959	692	730	692
q18	8102	7338	7478	7338
q19	3827	1088	969	969
q20	696	327	336	327
q21	5792	4520	4174	4174
q22	1173	1037	1002	1002
Total cold run time: 116667 ms
Total hot run time: 39315 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4480	4238	4255	4238
q2	369	287	282	282
q3	2810	2637	2584	2584
q4	1855	1627	1624	1624
q5	5270	5274	5295	5274
q6	221	132	132	132
q7	2083	1672	1706	1672
q8	3142	3328	3328	3328
q9	8406	8404	8359	8359
q10	3361	3144	3153	3144
q11	594	512	511	511
q12	771	602	571	571
q13	17488	2971	3025	2971
q14	295	279	268	268
q15	521	474	472	472
q16	473	413	411	411
q17	1816	1493	1445	1445
q18	7699	7674	7312	7312
q19	6851	1693	1525	1525
q20	1981	1772	1745	1745
q21	5348	4981	5051	4981
q22	1115	1014	1014	1014
Total cold run time: 76949 ms
Total hot run time: 53863 ms

@doris-robot

TPC-H: Total hot run time: 39382 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2dc8f7588ac37ffdd51c204959a643aa516125a0, data reload: false

------ Round 1 ----------------------------------
q1	17633	4400	4283	4283
q2	2022	175	175	175
q3	10494	1144	1032	1032
q4	10148	817	697	697
q5	7483	2481	2457	2457
q6	223	139	136	136
q7	965	598	597	597
q8	9209	1897	1918	1897
q9	8640	6592	6552	6552
q10	7041	2216	2137	2137
q11	453	241	240	240
q12	497	223	218	218
q13	17760	2991	2948	2948
q14	271	232	233	232
q15	514	480	478	478
q16	501	378	393	378
q17	981	655	690	655
q18	7968	7378	7403	7378
q19	6413	1028	1056	1028
q20	685	330	335	330
q21	5397	4597	4529	4529
q22	1106	1022	1005	1005
Total cold run time: 116404 ms
Total hot run time: 39382 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4481	4248	4327	4248
q2	366	270	263	263
q3	2845	2605	2747	2605
q4	1987	1706	1687	1687
q5	5644	5473	5445	5445
q6	244	134	131	131
q7	2220	1755	1786	1755
q8	3280	3460	3433	3433
q9	8785	8737	8825	8737
q10	3489	3309	3217	3217
q11	579	491	489	489
q12	787	635	625	625
q13	15891	3186	3155	3155
q14	307	288	295	288
q15	531	485	490	485
q16	482	423	451	423
q17	1845	1507	1470	1470
q18	8102	8081	7749	7749
q19	1755	1623	1554	1554
q20	2161	1882	1897	1882
q21	9741	5413	5224	5224
q22	1122	1038	1002	1002
Total cold run time: 76644 ms
Total hot run time: 55867 ms

@doris-robot

TPC-DS: Total hot run time: 203896 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2dc8f7588ac37ffdd51c204959a643aa516125a0, data reload: false

query1	951	407	401	401
query2	6431	2076	1925	1925
query3	6626	209	217	209
query4	34195	23189	23032	23032
query5	3666	501	499	499
query6	279	182	195	182
query7	4579	294	289	289
query8	240	198	195	195
query9	8665	2382	2365	2365
query10	892	883	875	875
query11	17895	15009	14932	14932
query12	133	97	100	97
query13	1662	389	373	373
query14	10485	7906	7978	7906
query15	396	371	350	350
query16	7577	492	499	492
query17	1307	588	568	568
query18	2049	408	391	391
query19	292	233	194	194
query20	118	107	106	106
query21	208	112	105	105
query22	4698	4382	4383	4382
query23	34449	33660	33492	33492
query24	10512	3014	2919	2919
query25	558	362	361	361
query26	698	150	147	147
query27	2185	288	280	280
query28	5750	2015	2002	2002
query29	773	401	400	400
query30	254	150	150	150
query31	963	781	748	748
query32	97	54	55	54
query33	633	299	281	281
query34	870	482	497	482
query35	948	831	838	831
query36	1086	914	907	907
query37	132	79	82	79
query38	4258	4172	4173	4172
query39	1431	1410	1378	1378
query40	235	121	116	116
query41	46	43	48	43
query42	117	93	91	91
query43	515	496	478	478
query44	1091	730	739	730
query45	410	397	385	385
query46	1131	764	772	764
query47	1823	1768	1759	1759
query48	386	311	297	297
query49	843	415	452	415
query50	810	411	413	411
query51	6775	6714	6715	6714
query52	105	90	91	90
query53	249	180	184	180
query54	873	447	461	447
query55	76	73	77	73
query56	271	242	244	242
query57	1150	1074	1093	1074
query58	233	229	223	223
query59	3076	2772	2771	2771
query60	279	265	253	253
query61	136	101	94	94
query62	782	640	634	634
query63	211	184	178	178
query64	9259	2433	1882	1882
query65	3218	3170	3197	3170
query66	745	335	329	329
query67	15235	14955	14760	14760
query68	4507	547	546	546
query69	410	393	418	393
query70	1117	1061	1087	1061
query71	413	287	275	275
query72	18230	17310	16753	16753
query73	761	325	324	324
query74	9167	8743	8760	8743
query75	3381	2675	2672	2672
query76	2168	1011	957	957
query77	420	301	317	301
query78	9669	8983	9018	8983
query79	2182	522	520	520
query80	2412	493	486	486
query81	616	227	224	224
query82	791	140	133	133
query83	296	155	149	149
query84	261	82	81	81
query85	1998	308	303	303
query86	474	276	286	276
query87	4723	4542	4577	4542
query88	4326	2495	2495	2495
query89	404	284	292	284
query90	1844	202	205	202
query91	155	140	140	140
query92	62	52	52	52
query93	2321	534	536	534
query94	910	301	311	301
query95	360	271	274	271
query96	610	285	278	278
query97	3230	3119	3036	3036
query98	220	201	200	200
query99	1502	1269	1227	1227
Total cold run time: 308961 ms
Total hot run time: 203896 ms

@doris-robot

ClickBench: Total hot run time: 31.06 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2dc8f7588ac37ffdd51c204959a643aa516125a0, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.04	0.04
query4	1.68	0.07	0.07
query5	0.50	0.48	0.49
query6	1.13	0.72	0.73
query7	0.01	0.02	0.01
query8	0.05	0.04	0.05
query9	0.55	0.49	0.50
query10	0.54	0.55	0.54
query11	0.15	0.12	0.11
query12	0.15	0.12	0.12
query13	0.60	0.62	0.60
query14	0.76	0.77	0.77
query15	0.90	0.82	0.82
query16	0.34	0.40	0.37
query17	0.94	1.05	0.97
query18	0.23	0.22	0.22
query19	1.82	1.69	1.79
query20	0.01	0.02	0.01
query21	15.39	0.75	0.65
query22	4.03	7.13	2.38
query23	18.24	1.38	1.26
query24	2.12	0.22	0.21
query25	0.15	0.07	0.08
query26	0.28	0.21	0.21
query27	0.45	0.22	0.22
query28	13.29	1.02	0.99
query29	12.64	3.33	3.34
query30	0.24	0.05	0.05
query31	2.93	0.39	0.39
query32	3.25	0.48	0.46
query33	2.86	2.94	2.87
query34	16.88	4.34	4.36
query35	4.42	4.44	4.50
query36	0.67	0.49	0.46
query37	0.19	0.15	0.17
query38	0.16	0.14	0.15
query39	0.04	0.04	0.04
query40	0.15	0.13	0.12
query41	0.10	0.05	0.05
query42	0.05	0.04	0.04
query43	0.05	0.04	0.05
Total cold run time: 109.29 s
Total hot run time: 31.06 s

@wm1581066 added the usercase label (Important user case type label) Aug 8, 2024
morningman previously approved these changes Aug 8, 2024
@morningman morningman left a comment

LGTM

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Aug 8, 2024
github-actions bot commented Aug 8, 2024

PR approved by at least one committer and no changes requested.

github-actions bot commented Aug 8, 2024

PR approved by anyone and no changes requested.

@xiaokang xiaokang left a comment

pls add a testcase

@github-actions bot removed the approved label (Indicates a PR has been approved by one committer.) Aug 12, 2024
qidaye commented Aug 12, 2024

run buildall

@qidaye qidaye force-pushed the fix_es_catalog_core branch from f3dee95 to cc42cd6 Compare August 12, 2024 07:58
qidaye commented Aug 12, 2024

run buildall

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 39465 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cc42cd61422e511afba9accee0ae6ee715b1c930, data reload: false

------ Round 1 ----------------------------------
q1	18310	4444	4464	4444
q2	2680	176	179	176
q3	11506	1189	1083	1083
q4	10237	696	810	696
q5	7634	2548	2572	2548
q6	225	141	140	140
q7	989	617	592	592
q8	9211	1912	1888	1888
q9	8784	6592	6548	6548
q10	7012	2225	2230	2225
q11	450	245	243	243
q12	400	216	212	212
q13	17760	2971	2978	2971
q14	286	225	225	225
q15	515	482	475	475
q16	482	388	388	388
q17	980	695	722	695
q18	8084	7488	7437	7437
q19	1385	986	926	926
q20	694	321	341	321
q21	5374	4237	4616	4237
q22	1099	995	996	995
Total cold run time: 114097 ms
Total hot run time: 39465 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4431	4243	4301	4243
q2	368	271	269	269
q3	2816	2638	2575	2575
q4	1885	1637	1607	1607
q5	5236	5235	5237	5235
q6	218	129	131	129
q7	2039	1645	1644	1644
q8	3168	3288	3341	3288
q9	8370	8356	8356	8356
q10	3369	3121	3137	3121
q11	593	496	500	496
q12	786	606	581	581
q13	17596	3003	2932	2932
q14	305	271	277	271
q15	518	481	485	481
q16	470	417	410	410
q17	1777	1484	1486	1484
q18	7641	7324	7396	7324
q19	1760	1668	1594	1594
q20	1986	1794	1769	1769
q21	5268	5068	5121	5068
q22	1074	1006	1022	1006
Total cold run time: 71674 ms
Total hot run time: 53883 ms

@doris-robot

TPC-DS: Total hot run time: 200441 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cc42cd61422e511afba9accee0ae6ee715b1c930, data reload: false

query1	925	360	350	350
query2	6440	1988	1944	1944
query3	6651	210	211	210
query4	34439	23215	22997	22997
query5	4244	477	481	477
query6	271	160	159	159
query7	4575	295	292	292
query8	251	203	199	199
query9	8734	2498	2466	2466
query10	571	472	452	452
query11	16892	14870	15174	14870
query12	143	99	96	96
query13	1633	387	357	357
query14	10138	7810	7552	7552
query15	280	215	230	215
query16	7841	499	488	488
query17	1729	586	554	554
query18	2007	283	280	280
query19	201	147	154	147
query20	112	103	103	103
query21	203	102	103	102
query22	4473	3954	3896	3896
query23	34015	33265	33047	33047
query24	11863	2631	2482	2482
query25	668	393	386	386
query26	1796	152	152	152
query27	2887	282	289	282
query28	7742	2067	2051	2051
query29	1074	420	433	420
query30	308	148	145	145
query31	984	740	742	740
query32	94	58	57	57
query33	738	302	283	283
query34	906	461	459	459
query35	968	802	796	796
query36	1099	922	951	922
query37	175	81	79	79
query38	4275	4158	4131	4131
query39	1424	1356	1380	1356
query40	273	117	116	116
query41	47	46	46	46
query42	111	96	99	96
query43	500	472	471	471
query44	1205	729	749	729
query45	239	195	209	195
query46	1083	758	726	726
query47	1859	1758	1744	1744
query48	362	296	301	296
query49	1207	422	435	422
query50	804	407	415	407
query51	6830	6776	6684	6684
query52	96	97	90	90
query53	257	186	181	181
query54	921	451	453	451
query55	79	74	77	74
query56	287	245	249	245
query57	1174	1075	1062	1062
query58	242	231	235	231
query59	3070	2778	2806	2778
query60	304	262	263	262
query61	114	122	227	122
query62	840	642	648	642
query63	217	181	181	181
query64	10587	2266	1769	1769
query65	3224	3152	3142	3142
query66	1390	330	323	323
query67	15538	14812	14701	14701
query68	8612	553	580	553
query69	447	426	412	412
query70	1124	1129	1124	1124
query71	537	274	266	266
query72	20384	16396	16434	16396
query73	1111	329	321	321
query74	9052	8759	8724	8724
query75	5098	2682	2643	2643
query76	5013	1037	1037	1037
query77	820	302	307	302
query78	9703	9045	8879	8879
query79	10154	541	543	541
query80	1021	504	498	498
query81	596	225	225	225
query82	786	135	131	131
query83	319	141	143	141
query84	273	78	73	73
query85	1349	315	267	267
query86	388	296	288	288
query87	4712	4462	4483	4462
query88	5376	2501	2507	2501
query89	525	293	284	284
query90	2057	198	200	198
query91	125	92	93	92
query92	62	48	47	47
query93	7360	553	540	540
query94	1005	283	291	283
query95	368	265	262	262
query96	631	276	279	276
query97	3176	3034	3025	3025
query98	223	209	196	196
query99	1707	1274	1287	1274
Total cold run time: 340743 ms
Total hot run time: 200441 ms

@doris-robot

ClickBench: Total hot run time: 30.05 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cc42cd61422e511afba9accee0ae6ee715b1c930, data reload: false

query1	0.04	0.04	0.04
query2	0.09	0.04	0.04
query3	0.22	0.05	0.05
query4	1.67	0.08	0.08
query5	0.48	0.48	0.47
query6	1.13	0.72	0.73
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.56	0.50	0.49
query10	0.54	0.55	0.54
query11	0.15	0.12	0.12
query12	0.15	0.12	0.12
query13	0.59	0.60	0.58
query14	0.76	0.79	0.77
query15	0.86	0.80	0.82
query16	0.34	0.35	0.38
query17	0.99	0.94	0.99
query18	0.23	0.22	0.22
query19	1.77	1.73	1.68
query20	0.03	0.01	0.01
query21	15.42	0.72	0.65
query22	4.22	8.18	1.56
query23	18.29	1.36	1.26
query24	2.07	0.21	0.21
query25	0.15	0.08	0.07
query26	0.29	0.22	0.21
query27	0.45	0.23	0.22
query28	13.33	1.00	0.98
query29	12.62	3.32	3.28
query30	0.24	0.05	0.04
query31	2.89	0.39	0.39
query32	3.27	0.48	0.47
query33	2.90	2.94	2.90
query34	17.09	4.31	4.34
query35	4.47	4.45	4.41
query36	0.65	0.46	0.46
query37	0.20	0.15	0.16
query38	0.15	0.15	0.14
query39	0.06	0.03	0.04
query40	0.15	0.12	0.13
query41	0.10	0.04	0.05
query42	0.06	0.05	0.04
query43	0.05	0.04	0.04
Total cold run time: 109.79 s
Total hot run time: 30.05 s

@xiaokang xiaokang left a comment

LGTM

@github-actions

PR approved by at least one committer and no changes requested.

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Aug 12, 2024
@eldenmoon eldenmoon left a comment

LGTM

@qidaye qidaye merged commit 2b46aaa into apache:master Aug 13, 2024
@qidaye qidaye deleted the fix_es_catalog_core branch August 13, 2024 02:29
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Aug 13, 2024
## Proposed changes

Elasticsearch has no dedicated array type; any of its fields can contain
[zero or more
values](https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html).
When such a field holds a single value and is mapped as an array type in
Doris, parsing it causes a segmentation fault.
This change adds a check before the JSON value is parsed as an array.

Issue Number: close apache#39102
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Aug 13, 2024
yiguolei pushed a commit that referenced this pull request Aug 13, 2024
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Aug 14, 2024
dataroaring pushed a commit that referenced this pull request Aug 17, 2024
@xiaokang xiaokang removed the doing label Sep 1, 2024
qidaye added a commit that referenced this pull request Sep 11, 2024
Follow-up to #39104: when a field holds a single value and is mapped as an
array type in Doris, the single value is parsed into a single-element array
so that it remains queryable.

close #40406
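
The follow-up behaviour can be illustrated with a short sketch. It is written in Python for brevity and is not the Doris implementation; `coerce_to_array` is a hypothetical name, and handling of JSON `null` is not covered by the commit message, so it is left out here.

```python
import json


def coerce_to_array(raw_json: str) -> list:
    """Return the field value as a list, wrapping a lone value if needed.

    Hypothetical helper for illustration only.
    """
    value = json.loads(raw_json)
    if isinstance(value, list):
        # Already an array: pass it through unchanged.
        return value
    # A single value becomes a single-element array, so a column mapped
    # as an array type can still be queried.
    return [value]
```

With this approach `coerce_to_array('"a"')` yields `["a"]`, and an existing array such as `'[1, 2]'` is returned as is.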
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Sep 11, 2024
…40614)
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Sep 11, 2024
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Sep 11, 2024
qidaye added a commit to qidaye/incubator-doris that referenced this pull request Sep 11, 2024
dataroaring pushed a commit that referenced this pull request Oct 9, 2024
@gavinchou gavinchou mentioned this pull request Oct 13, 2024
mongo360 pushed a commit to mongo360/doris that referenced this pull request Dec 11, 2024
Labels

approved (Indicates a PR has been approved by one committer.), dev/2.0.15-merged, dev/2.1.6-merged, dev/3.0.2-merged, reviewed, usercase (Important user case type label)

Development

Successfully merging this pull request may close these issues.

[Bug] Query ES catalog coredump

8 participants