@github-actions github-actions bot commented Nov 2, 2024

PR Body: Fix the Paimon catalog's inability to access OSS-HDFS.

The Paimon catalog has two problems:

  1. Doris FE cannot list Paimon tables.
    This happens because we pass three properties (fs.oss.endpoint, fs.oss.accessKeyId, and fs.oss.accessKeySecret) to the PaimonCatalog. When PaimonCatalog receives these properties, it uses OSSLoader rather than HadoopFileIOLoader.

  2. Doris BE does not use libhdfs to access OSS-HDFS.
    This happens because the tmpLocation in LocationPath does not contain oss-dls.aliyuncs. We should use the endpoint to determine whether the user wants to access OSS-HDFS.
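
A minimal sketch of the two ideas in the list, written in Python for brevity (Doris FE itself is Java). The function names and the example endpoint values are hypothetical, and dropping the fs.oss.* keys is only one plausible way to keep PaimonCatalog from selecting OSSLoader; the actual change in #42585 may differ.

```python
# Illustrative sketch only: these names are hypothetical, not Doris or Paimon APIs.
OSS_HDFS_MARKER = "oss-dls.aliyuncs"  # substring named in the PR description
OSS_KEYS = ("fs.oss.endpoint", "fs.oss.accessKeyId", "fs.oss.accessKeySecret")


def is_oss_hdfs(endpoint: str) -> bool:
    """Decide from the endpoint (not the table location) whether the target is OSS-HDFS."""
    return OSS_HDFS_MARKER in (endpoint or "")


def paimon_catalog_options(props: dict) -> dict:
    """Return the options to hand to the Paimon catalog.

    For an OSS-HDFS endpoint, the three fs.oss.* keys are left out (assumption:
    they would be supplied through the Hadoop configuration instead).
    """
    if not is_oss_hdfs(props.get("fs.oss.endpoint", "")):
        return dict(props)
    return {k: v for k, v in props.items() if k not in OSS_KEYS}
```

With an example OSS-HDFS endpoint such as `cn-beijing.oss-dls.aliyuncs.com`, `is_oss_hdfs` returns `True` and the fs.oss.* keys are filtered out; with a plain OSS endpoint the properties pass through unchanged.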

In addition, to access OSS-HDFS with PaimonCatalog, you should:

  1. Download the Jindo SDK: https://github.com/aliyun/alibabacloud-jindodata/blob/latest/docs/user/zh/jindosdk/jindosdk_download.md
  2. Copy jindo-core.jar and jindo-sdk.jar to the ${DORIS_HOME}/fe/lib and ${DORIS_HOME}/be/lib/java_extensions/preload-extensions directories.
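
The copy step could be scripted roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, the jar names are taken literally from the description (released Jindo SDK files may carry version suffixes), and both target directories are expected to exist already.

```python
import shutil
from pathlib import Path

# Jar names as written in the PR description; real files may include a version suffix.
JINDO_JARS = ("jindo-core.jar", "jindo-sdk.jar")


def install_jindo_jars(jar_dir: str, doris_home: str) -> list:
    """Copy the Jindo jars into the FE and BE directories and return the copied paths."""
    targets = [
        Path(doris_home) / "fe" / "lib",
        Path(doris_home) / "be" / "lib" / "java_extensions" / "preload-extensions",
    ]
    copied = []
    for target in targets:
        # Fail early instead of silently creating directories under a wrong DORIS_HOME.
        if not target.is_dir():
            raise FileNotFoundError(f"target directory not found: {target}")
        for name in JINDO_JARS:
            dest = target / name
            shutil.copy2(Path(jar_dir) / name, dest)
            copied.append(dest)
    return copied
```

Call it with the directory that holds the extracted jars and the value of DORIS_HOME, on every FE and BE node.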

Cherry-picked from #42585

…ss to OSS-HDFS (#42585)


github-actions bot commented Nov 2, 2024

run buildall

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@doris-robot

run buildall

@doris-robot

TPC-H: Total hot run time: 40215 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 40f43c4c132ca56d48409e26a405aaed23bd93ac, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17566	7365	7257	7257
q2	2049	154	146	146
q3	10729	1092	1153	1092
q4	10538	712	733	712
q5	7719	2821	2809	2809
q6	231	147	145	145
q7	1055	615	600	600
q8	9586	1916	2009	1916
q9	6930	6335	6422	6335
q10	6951	2295	2311	2295
q11	453	256	256	256
q12	398	209	213	209
q13	17798	3007	2959	2959
q14	250	228	210	210
q15	554	506	523	506
q16	655	614	605	605
q17	972	584	552	552
q18	7217	6699	6508	6508
q19	2841	1064	1073	1064
q20	490	196	192	192
q21	3867	3004	2876	2876
q22	1103	971	979	971
Total cold run time: 109952 ms
Total hot run time: 40215 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7316	7204	7213	7204
q2	315	230	234	230
q3	2918	2883	2931	2883
q4	2025	1851	1712	1712
q5	5651	5653	5677	5653
q6	217	140	138	138
q7	2152	1752	1759	1752
q8	3260	3550	3430	3430
q9	8799	8858	8787	8787
q10	3539	3496	3488	3488
q11	601	502	483	483
q12	780	641	634	634
q13	16444	3108	3154	3108
q14	295	283	273	273
q15	551	515	518	515
q16	714	664	667	664
q17	1832	1603	1589	1589
q18	8210	7798	7465	7465
q19	6977	1543	1501	1501
q20	2063	1890	1833	1833
q21	5324	5305	5254	5254
q22	1158	1035	1049	1035
Total cold run time: 81141 ms
Total hot run time: 59631 ms
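
The reported totals appear to follow directly from the per-query rows: the cold total is the sum of the first timing column, and the hot total is the sum of the smaller of the two following timings (which is what the last column repeats). This reading is inferred from the numbers, not from the benchmark scripts. A short check using the Round 1 rows:

```python
# (query, cold, hot1, hot2) in ms, copied from the Round 1 rows above.
ROUND_1 = [
    ("q1", 17566, 7365, 7257), ("q2", 2049, 154, 146),
    ("q3", 10729, 1092, 1153), ("q4", 10538, 712, 733),
    ("q5", 7719, 2821, 2809), ("q6", 231, 147, 145),
    ("q7", 1055, 615, 600), ("q8", 9586, 1916, 2009),
    ("q9", 6930, 6335, 6422), ("q10", 6951, 2295, 2311),
    ("q11", 453, 256, 256), ("q12", 398, 209, 213),
    ("q13", 17798, 3007, 2959), ("q14", 250, 228, 210),
    ("q15", 554, 506, 523), ("q16", 655, 614, 605),
    ("q17", 972, 584, 552), ("q18", 7217, 6699, 6508),
    ("q19", 2841, 1064, 1073), ("q20", 490, 196, 192),
    ("q21", 3867, 3004, 2876), ("q22", 1103, 971, 979),
]


def totals(rows):
    """Return (total cold ms, total hot ms), taking the best hot run per query."""
    cold = sum(cold_ms for _, cold_ms, _, _ in rows)
    hot = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold, hot


print(totals(ROUND_1))  # (109952, 40215)
```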

@doris-robot

TPC-DS: Total hot run time: 194147 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 40f43c4c132ca56d48409e26a405aaed23bd93ac, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1225	940	936	936
query2	6249	2105	2016	2016
query3	10840	3865	3763	3763
query4	69272	29783	23499	23499
query5	5643	444	433	433
query6	473	172	171	171
query7	6126	309	311	309
query8	329	233	232	232
query9	9238	2634	2644	2634
query10	505	262	246	246
query11	18036	15419	15686	15419
query12	161	112	107	107
query13	1550	446	429	429
query14	11433	7314	7026	7026
query15	220	175	173	173
query16	7159	494	450	450
query17	1091	562	564	562
query18	1837	295	291	291
query19	193	143	149	143
query20	111	113	110	110
query21	208	100	102	100
query22	4614	4405	4277	4277
query23	34828	34324	33613	33613
query24	5698	2817	2914	2817
query25	512	405	413	405
query26	662	168	163	163
query27	1690	293	305	293
query28	4145	2502	2484	2484
query29	716	434	432	432
query30	230	153	147	147
query31	985	782	782	782
query32	62	57	56	56
query33	456	274	268	268
query34	895	502	495	495
query35	840	708	742	708
query36	1078	946	955	946
query37	118	76	75	75
query38	3987	3927	3783	3783
query39	1486	1407	1440	1407
query40	197	98	97	97
query41	55	51	48	48
query42	114	100	105	100
query43	522	492	478	478
query44	1151	779	771	771
query45	181	167	162	162
query46	1107	710	698	698
query47	1909	1836	1796	1796
query48	454	368	368	368
query49	748	399	395	395
query50	822	403	400	400
query51	7083	7257	7033	7033
query52	97	86	94	86
query53	257	178	180	178
query54	552	441	436	436
query55	74	75	75	75
query56	241	229	248	229
query57	1169	1099	1108	1099
query58	202	202	208	202
query59	3075	2908	2846	2846
query60	272	248	236	236
query61	103	101	98	98
query62	769	674	644	644
query63	206	183	182	182
query64	1615	631	572	572
query65	3275	3194	3166	3166
query66	712	303	300	300
query67	15611	15337	15371	15337
query68	4522	551	553	551
query69	412	251	251	251
query70	1119	1120	1114	1114
query71	395	260	256	256
query72	6180	3933	3853	3853
query73	754	340	335	335
query74	10201	8885	9009	8885
query75	3315	2624	2614	2614
query76	2302	1040	892	892
query77	480	263	254	254
query78	10657	9978	9533	9533
query79	8081	604	588	588
query80	1753	411	413	411
query81	557	241	240	240
query82	1156	118	110	110
query83	248	135	133	133
query84	282	74	71	71
query85	1829	291	275	275
query86	488	296	295	295
query87	4393	4300	4304	4300
query88	5507	2374	2398	2374
query89	472	293	292	292
query90	2021	179	179	179
query91	178	139	142	139
query92	64	47	47	47
query93	6116	541	537	537
query94	908	279	272	272
query95	345	252	248	248
query96	630	278	277	277
query97	3406	3139	3114	3114
query98	214	193	195	193
query99	2082	1282	1305	1282
Total cold run time: 338541 ms
Total hot run time: 194147 ms

@dataroaring dataroaring closed this Nov 7, 2024
@dataroaring dataroaring deleted the auto-pick-42585-branch-3.0 branch December 27, 2024 07:14