[fix](arrow-flight-sql) Arrow flight server supports data forwarding when BE uses public vip #43281

Merged: 2 commits merged into apache:master on Nov 13, 2024

Conversation

xinyiZzz
Contributor

@xinyiZzz xinyiZzz commented Nov 5, 2024

Motivation

Consider a Doris cluster whose FE node is reachable from the external network, while all of its BE nodes are reachable only from the internal network.

This works fine with the MySQL client or JDBC, because the query results are returned to the client by the Doris FE node.

With Arrow Flight SQL, however, queries cannot be executed: the ADBC client has to connect to a Doris BE node to pull the query results, and the BE nodes are not reachable from the external network.

In a production environment it is often inconvenient to expose Doris BE nodes to the external network. Instead, a reverse proxy (such as nginx) can be placed in front of all Doris BE nodes, so that an external client connecting to nginx is routed to one of the BE nodes at random.

The result of an Arrow Flight SQL query is stored on an arbitrary Doris BE node. If that node is not the BE node that nginx routed the client to, the data has to be forwarded between the BE nodes.

Implementation

  1. The Ticket that the Doris FE Arrow Flight Server returns to the ADBC client contains the IP and Brpc port of the Doris BE node that holds the query result.

  2. When the Doris BE Arrow Flight Server receives a request to pull data, it compares the IP:BrpcPort in the Ticket with its own address. If they differ, it pulls the query result Blocks from the BE node at IP:BrpcPort, converts them to Arrow Batches, and returns them to the ADBC client. If they match, the behavior is the same as before.
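
The routing decision in step 2 can be sketched roughly as follows. This is an illustration in Python, not the BE's actual C++ code: the `Ticket` and `BackendAddress` types and the `read_local`, `fetch_remote`, and `to_arrow_batch` callables are hypothetical stand-ins, and the sketch converts Blocks in both branches for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Ticket:
    """Hypothetical stand-in for the Ticket fields named in the description."""
    query_id: str
    be_ip: str       # IP of the BE node that holds the query result
    brpc_port: int   # Brpc port of that BE node


@dataclass(frozen=True)
class BackendAddress:
    """Hypothetical address of the BE node that received the request."""
    ip: str
    brpc_port: int


def serve_do_get(
    ticket: Ticket,
    self_addr: BackendAddress,
    read_local: Callable[[str], Iterable[List[int]]],
    fetch_remote: Callable[[str, int, str], Iterable[List[int]]],
    to_arrow_batch: Callable[[List[int]], dict],
) -> Iterator[dict]:
    """Yield converted batches, forwarding when the result lives on another BE."""
    if (ticket.be_ip, ticket.brpc_port) == (self_addr.ip, self_addr.brpc_port):
        # The result is on this node: read it locally, as before.
        blocks = read_local(ticket.query_id)
    else:
        # The result is on another BE: pull its Blocks over RPC.
        blocks = fetch_remote(ticket.be_ip, ticket.brpc_port, ticket.query_id)
    for block in blocks:
        yield to_arrow_batch(block)
```

Passing the local reader, remote fetcher, and converter as callables keeps the branch logic testable without a running cluster.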

TODO

  1. When the data is not on the current BE node, pull it from the other BE node asynchronously and cache at least one Block locally on the current BE node. This would reduce the time spent on serialization, deserialization, and RPC.
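
One possible shape for this TODO is a background prefetcher with a small bounded buffer. The sketch below is only an illustration in Python under assumed names: `prefetch` and `fetch_next` are hypothetical, and `fetch_next` is assumed to return the next Block, or `None` once the result is exhausted.

```python
import queue
import threading
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the stream


def prefetch(fetch_next: Callable[[], Optional[T]], depth: int = 1) -> Iterator[T]:
    """Yield items from fetch_next() while a worker thread fetches ahead.

    At most `depth` fetched items wait in the buffer, so depth=1 caches
    one Block locally while the previous one is being consumed.
    """
    buf: "queue.Queue[object]" = queue.Queue(maxsize=depth)

    def worker() -> None:
        try:
            while True:
                item = fetch_next()
                if item is None:
                    break
                buf.put(item)  # blocks while the buffer is full
        except BaseException as exc:
            buf.put(exc)  # hand the failure over to the consumer
            return
        buf.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        if isinstance(item, BaseException):
            raise item
        yield item
```

With this shape the RPC for the next Block overlaps with the conversion and sending of the current one. A real implementation would also need a way to cancel the worker when the client disconnects early, which this sketch leaves out.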

How to test

  1. Create a Doris cluster with 1 FE and 2 BEs, and set arrow_flight_sql_port in fe.conf and be.conf.

  2. As root, run systemctl status nginx to check whether nginx is installed. If it is not, installing it with yum is recommended.

  3. Edit /etc/nginx/nginx.conf (for example with vim) and add underscores_in_headers on;

  4. Create /etc/nginx/conf.d/arrowflight.conf (for example with touch), then edit it and add:

upstream arrowflight {
    server {BE1_ip}:{BE1_arrow_flight_sql_port};
    server {BE2_IP}:{BE2_arrow_flight_sql_port};
}

server {
    listen {nginx port} http2;
    listen [::]:{nginx port} http2;
    server_name doris.arrowflight.com;

    #ssl_certificate   /etc/nginx/cert/myCA.pem;
    #ssl_certificate_key /etc/nginx/cert/myCA.key;

    location / {
        grpc_pass grpc://arrowflight;
        grpc_set_header X-Real-IP $remote_addr;
        grpc_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        grpc_set_header X-Forwarded-Proto $scheme;

        grpc_read_timeout 60s;
        grpc_send_timeout 60s;

        #proxy_http_version 1.1;
        #proxy_set_header Connection "";
    }
}

Here {BE1_ip}:{BE1_arrow_flight_sql_port} is the IP of BE 1 and the arrow_flight_sql_port from its be.conf, and likewise for {BE2_IP}:{BE2_arrow_flight_sql_port}. {nginx port} is any available port.

  5. Add the following to be.conf on every BE:
public_access_ip={nginx ip}
public_access_port={nginx port}
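
When the BE list changes often, the upstream block and the be.conf entries above can be generated instead of edited by hand. The helper below is a small sketch in Python: the function names are hypothetical, and the addresses in the usage note are made-up examples.

```python
from typing import List, Tuple


def render_upstream(backends: List[Tuple[str, int]], name: str = "arrowflight") -> str:
    """Render the nginx upstream block for (ip, arrow_flight_sql_port) pairs."""
    if not backends:
        raise ValueError("at least one BE is required")
    lines = [f"upstream {name} {{"]
    lines += [f"    server {ip}:{port};" for ip, port in backends]
    lines.append("}")
    return "\n".join(lines)


def render_be_conf(nginx_ip: str, nginx_port: int) -> str:
    """Render the two be.conf entries that point every BE at the proxy."""
    return f"public_access_ip={nginx_ip}\npublic_access_port={nginx_port}"
```

For example, `render_upstream([("10.0.0.1", 8050), ("10.0.0.2", 8050)])` produces an upstream block with two server lines, and `render_be_conf("192.0.2.10", 8051)` produces the two lines to append to each be.conf.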

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 5, 2024

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@xinyiZzz xinyiZzz force-pushed the 20241025_fix_arrowflight_vip branch 2 times, most recently from 7a34651 to cfade62, on November 5, 2024 17:37

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 5, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.84% (9836/25995)
Line Coverage: 29.01% (81811/282010)
Region Coverage: 28.25% (42156/149228)
Branch Coverage: 24.84% (21395/86120)
Coverage Report: http://coverage.selectdb-in.cc/coverage/cfade62d90ad379a9087f8bad9a9408185a399de_cfade62d90ad379a9087f8bad9a9408185a399de/report/index.html

@doris-robot

TPC-H: Total hot run time: 41795 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cfade62d90ad379a9087f8bad9a9408185a399de, data reload: false

------ Round 1 ----------------------------------
q1	17586	8247	7332	7332
q2	2044	166	172	166
q3	10554	1149	1158	1149
q4	10228	977	888	888
q5	7755	3141	3063	3063
q6	241	149	154	149
q7	1020	618	627	618
q8	9356	2008	2062	2008
q9	6621	6450	6502	6450
q10	7064	2430	2467	2430
q11	464	265	257	257
q12	410	228	224	224
q13	17769	3028	2976	2976
q14	240	210	217	210
q15	576	530	539	530
q16	645	579	578	578
q17	986	671	602	602
q18	7621	6722	6812	6722
q19	1353	972	1092	972
q20	492	192	183	183
q21	4021	3304	3290	3290
q22	1130	1009	998	998
Total cold run time: 108176 ms
Total hot run time: 41795 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7288	7267	7279	7267
q2	328	234	229	229
q3	2950	2796	2864	2796
q4	1961	1736	1732	1732
q5	5457	5477	5500	5477
q6	218	143	147	143
q7	2147	1756	1726	1726
q8	3300	3410	3430	3410
q9	8578	8615	8580	8580
q10	3516	3485	3462	3462
q11	593	500	517	500
q12	811	573	610	573
q13	9466	3023	3014	3014
q14	295	275	259	259
q15	573	516	503	503
q16	692	644	640	640
q17	1866	1593	1571	1571
q18	7894	7585	7458	7458
q19	1673	1633	1424	1424
q20	2068	1842	1845	1842
q21	5377	5179	5312	5179
q22	1103	1008	971	971
Total cold run time: 68154 ms
Total hot run time: 58756 ms

@doris-robot

TPC-DS: Total hot run time: 191804 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cfade62d90ad379a9087f8bad9a9408185a399de, data reload: false

query1	974	371	379	371
query2	6517	2175	2088	2088
query3	6786	216	209	209
query4	33573	23693	23629	23629
query5	4280	462	455	455
query6	280	178	173	173
query7	4617	294	307	294
query8	295	232	229	229
query9	9489	2641	2637	2637
query10	466	260	256	256
query11	17995	15398	15262	15262
query12	156	103	102	102
query13	1682	452	415	415
query14	10161	7178	7207	7178
query15	252	186	184	184
query16	8153	457	448	448
query17	1650	582	559	559
query18	2150	299	298	298
query19	378	150	149	149
query20	114	110	111	110
query21	206	103	99	99
query22	4569	4126	4176	4126
query23	35052	34180	34507	34180
query24	11149	2763	2797	2763
query25	669	406	410	406
query26	1425	160	163	160
query27	2839	280	285	280
query28	8108	2436	2410	2410
query29	917	424	434	424
query30	326	153	154	153
query31	1030	796	830	796
query32	99	59	60	59
query33	776	282	276	276
query34	981	528	545	528
query35	914	736	748	736
query36	1106	993	972	972
query37	135	76	73	73
query38	4399	4222	4330	4222
query39	1507	1436	1422	1422
query40	281	103	104	103
query41	49	48	46	46
query42	114	98	101	98
query43	549	485	487	485
query44	1306	809	811	809
query45	188	165	168	165
query46	1141	688	696	688
query47	1934	1837	1804	1804
query48	434	331	315	315
query49	1170	396	403	396
query50	814	386	404	386
query51	7239	7111	7248	7111
query52	98	90	87	87
query53	259	182	182	182
query54	1279	419	413	413
query55	81	77	80	77
query56	258	237	269	237
query57	1280	1174	1133	1133
query58	234	209	204	204
query59	3203	3022	2946	2946
query60	265	236	241	236
query61	118	106	107	106
query62	883	684	693	684
query63	212	191	189	189
query64	5299	634	611	611
query65	3295	3197	3198	3197
query66	1224	310	310	310
query67	16199	15773	15600	15600
query68	4993	544	569	544
query69	431	250	257	250
query70	1200	1137	1154	1137
query71	431	250	249	249
query72	6503	4055	3970	3970
query73	764	356	357	356
query74	10339	8980	8970	8970
query75	3434	2672	2654	2654
query76	3086	1080	1072	1072
query77	388	276	268	268
query78	10264	9425	9468	9425
query79	1684	582	602	582
query80	1083	432	419	419
query81	547	239	238	238
query82	922	115	118	115
query83	209	134	138	134
query84	230	69	71	69
query85	1319	321	297	297
query86	445	311	293	293
query87	4854	4684	4747	4684
query88	3316	2196	2178	2178
query89	392	284	302	284
query90	1985	192	189	189
query91	138	103	98	98
query92	68	50	51	50
query93	1738	548	542	542
query94	928	301	300	300
query95	353	252	253	252
query96	605	279	279	279
query97	2846	2673	2803	2673
query98	209	194	191	191
query99	1522	1313	1293	1293
Total cold run time: 302807 ms
Total hot run time: 191804 ms

@doris-robot

ClickBench: Total hot run time: 32.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cfade62d90ad379a9087f8bad9a9408185a399de, data reload: false

query1	0.04	0.04	0.03
query2	0.06	0.04	0.03
query3	0.23	0.07	0.07
query4	1.64	0.10	0.10
query5	0.42	0.41	0.42
query6	1.15	0.64	0.64
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.57	0.50	0.51
query10	0.56	0.57	0.59
query11	0.14	0.11	0.11
query12	0.14	0.12	0.11
query13	0.60	0.60	0.59
query14	2.74	2.73	2.75
query15	0.90	0.83	0.83
query16	0.42	0.38	0.39
query17	1.01	1.04	1.08
query18	0.24	0.22	0.22
query19	1.96	1.76	1.96
query20	0.02	0.01	0.01
query21	15.37	0.58	0.59
query22	2.74	2.89	1.55
query23	16.99	0.85	0.72
query24	3.02	1.80	0.59
query25	0.20	0.19	0.21
query26	0.38	0.14	0.14
query27	0.04	0.04	0.04
query28	10.60	1.10	1.07
query29	12.58	3.29	3.27
query30	0.24	0.06	0.06
query31	2.86	0.38	0.39
query32	3.28	0.46	0.45
query33	3.08	3.13	3.15
query34	16.90	4.49	4.43
query35	4.51	4.53	4.53
query36	0.66	0.50	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.04
query39	0.02	0.02	0.02
query40	0.17	0.13	0.13
query41	0.07	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.81 s
Total hot run time: 32.1 s

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 6, 2024

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/src/service/arrow_flight/arrow_flight_batch_reader.cpp (outdated, resolved)

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 6, 2024

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/src/service/arrow_flight/arrow_flight_batch_reader.cpp (outdated, resolved)

@doris-robot

TPC-H: Total hot run time: 41471 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dee2f9c6c31bb2f84a59a98473c521846587a583, data reload: false

------ Round 1 ----------------------------------
q1	17771	7502	7299	7299
q2	2068	198	173	173
q3	10574	1195	1229	1195
q4	10265	847	822	822
q5	7777	3159	3098	3098
q6	239	144	148	144
q7	1007	618	602	602
q8	9376	2003	2040	2003
q9	6683	6381	6460	6381
q10	7040	2405	2439	2405
q11	460	261	267	261
q12	405	208	210	208
q13	17822	3025	3058	3025
q14	251	204	215	204
q15	582	545	529	529
q16	649	590	593	590
q17	997	534	652	534
q18	7672	6706	6791	6706
q19	1340	1059	913	913
q20	456	185	179	179
q21	3972	3285	3184	3184
q22	1112	1016	1016	1016
Total cold run time: 108518 ms
Total hot run time: 41471 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7307	7309	7306	7306
q2	347	257	252	252
q3	2953	2757	2862	2757
q4	1998	1707	1724	1707
q5	5489	5508	5550	5508
q6	217	137	138	137
q7	2146	1723	1742	1723
q8	3310	3488	3455	3455
q9	8682	8648	8686	8648
q10	3535	3468	3471	3468
q11	586	505	511	505
q12	797	556	637	556
q13	10173	3022	3087	3022
q14	286	263	273	263
q15	589	539	540	539
q16	686	643	641	641
q17	1834	1655	1571	1571
q18	7916	7500	7371	7371
q19	1684	1601	1590	1590
q20	2081	1789	1808	1789
q21	5268	5238	5236	5236
q22	1131	982	1000	982
Total cold run time: 69015 ms
Total hot run time: 59026 ms

@doris-robot

TPC-DS: Total hot run time: 193497 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dee2f9c6c31bb2f84a59a98473c521846587a583, data reload: false

query1	985	370	370	370
query2	6527	2191	2108	2108
query3	6793	217	215	215
query4	34233	23771	23611	23611
query5	4376	467	442	442
query6	277	197	192	192
query7	4598	301	292	292
query8	283	227	227	227
query9	9630	2735	2724	2724
query10	470	248	249	248
query11	18420	15304	15359	15304
query12	169	103	104	103
query13	1682	441	414	414
query14	9904	7104	6616	6616
query15	256	180	178	178
query16	8066	486	492	486
query17	1617	564	560	560
query18	2475	599	616	599
query19	403	191	178	178
query20	118	108	114	108
query21	216	105	103	103
query22	4520	4333	4301	4301
query23	34996	34887	34610	34610
query24	12332	3277	3358	3277
query25	608	400	408	400
query26	1239	188	187	187
query27	2800	289	287	287
query28	8292	2475	2450	2450
query29	706	440	437	437
query30	471	331	316	316
query31	1069	805	787	787
query32	96	58	64	58
query33	790	291	276	276
query34	986	513	539	513
query35	905	750	744	744
query36	1122	951	953	951
query37	202	77	75	75
query38	4473	4237	4291	4237
query39	1454	1453	1430	1430
query40	298	105	101	101
query41	51	47	48	47
query42	108	101	103	101
query43	545	502	499	499
query44	1297	826	823	823
query45	192	183	169	169
query46	1174	718	735	718
query47	1944	1801	1828	1801
query48	423	351	336	336
query49	1281	418	400	400
query50	804	399	398	398
query51	7381	7162	7160	7160
query52	104	94	90	90
query53	256	188	181	181
query54	1175	437	422	422
query55	77	81	79	79
query56	288	268	252	252
query57	1325	1153	1160	1153
query58	241	210	230	210
query59	3246	3063	3085	3063
query60	275	248	246	246
query61	108	105	110	105
query62	870	664	674	664
query63	213	203	183	183
query64	5461	693	659	659
query65	3336	3281	3210	3210
query66	1302	310	342	310
query67	16254	15640	15547	15547
query68	5009	577	584	577
query69	443	266	249	249
query70	1187	1049	1116	1049
query71	431	256	243	243
query72	6085	4113	4030	4030
query73	796	352	368	352
query74	10618	9036	8999	8999
query75	3501	2665	2659	2659
query76	2921	1093	1128	1093
query77	391	265	273	265
query78	11043	9661	9425	9425
query79	1562	610	605	605
query80	870	451	430	430
query81	547	237	243	237
query82	962	117	115	115
query83	239	180	159	159
query84	241	71	70	70
query85	984	304	297	297
query86	366	306	283	283
query87	4711	4809	4686	4686
query88	3396	2195	2162	2162
query89	405	296	295	295
query90	1967	192	191	191
query91	139	105	107	105
query92	59	50	50	50
query93	1159	555	553	553
query94	877	302	292	292
query95	357	249	249	249
query96	619	283	280	280
query97	2836	2716	2680	2680
query98	210	208	197	197
query99	1510	1279	1293	1279
Total cold run time: 305029 ms
Total hot run time: 193497 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.86% (9848/26014)
Line Coverage: 29.02% (81886/282171)
Region Coverage: 28.24% (42169/149303)
Branch Coverage: 24.83% (21395/86154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/dee2f9c6c31bb2f84a59a98473c521846587a583_dee2f9c6c31bb2f84a59a98473c521846587a583/report/index.html

@doris-robot

ClickBench: Total hot run time: 32.82 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dee2f9c6c31bb2f84a59a98473c521846587a583, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.22	0.07	0.07
query4	1.64	0.10	0.10
query5	0.40	0.41	0.41
query6	1.15	0.65	0.64
query7	0.02	0.04	0.01
query8	0.04	0.03	0.03
query9	0.59	0.48	0.50
query10	0.56	0.55	0.57
query11	0.14	0.11	0.11
query12	0.13	0.11	0.11
query13	0.60	0.59	0.60
query14	2.81	2.85	2.85
query15	0.92	0.82	0.83
query16	0.38	0.40	0.37
query17	1.01	1.05	1.05
query18	0.20	0.19	0.20
query19	1.97	1.93	1.94
query20	0.01	0.01	0.01
query21	15.39	0.57	0.59
query22	3.23	1.84	2.12
query23	17.01	0.87	0.76
query24	3.01	0.96	1.29
query25	0.25	0.22	0.13
query26	0.39	0.13	0.14
query27	0.05	0.04	0.04
query28	10.76	1.09	1.07
query29	12.53	3.28	3.28
query30	0.25	0.06	0.07
query31	2.88	0.39	0.39
query32	3.24	0.46	0.46
query33	2.98	3.01	3.02
query34	16.94	4.47	4.45
query35	4.50	4.50	4.51
query36	0.66	0.50	0.49
query37	0.09	0.06	0.06
query38	0.04	0.03	0.04
query39	0.03	0.03	0.02
query40	0.16	0.12	0.13
query41	0.08	0.02	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.44 s
Total hot run time: 32.82 s

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 7, 2024

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/src/service/arrow_flight/arrow_flight_batch_reader.cpp (outdated, resolved)

@doris-robot

TPC-H: Total hot run time: 41440 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2368f58165d7619d2739d0f4fb1b0cdbd37cd3c3, data reload: false

------ Round 1 ----------------------------------
q1	17764	7441	7259	7259
q2	2090	182	175	175
q3	10600	1091	1187	1091
q4	10549	875	852	852
q5	7762	3086	3089	3086
q6	234	146	146	146
q7	1008	629	611	611
q8	9401	2004	2072	2004
q9	6523	6451	6473	6451
q10	7031	2456	2428	2428
q11	464	265	266	265
q12	415	214	210	210
q13	17794	3004	3031	3004
q14	243	220	213	213
q15	571	527	515	515
q16	646	579	582	579
q17	986	575	601	575
q18	7511	6709	6587	6587
q19	1337	988	1022	988
q20	490	183	181	181
q21	4072	3230	3253	3230
q22	1127	1016	990	990
Total cold run time: 108618 ms
Total hot run time: 41440 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7271	7245	7202	7202
q2	344	250	249	249
q3	3122	2977	2950	2950
q4	2196	1884	1843	1843
q5	5765	5818	5863	5818
q6	230	140	143	140
q7	2321	1867	1837	1837
q8	3460	3627	3504	3504
q9	8996	8977	8981	8977
q10	3605	3575	3563	3563
q11	613	506	503	503
q12	794	620	607	607
q13	11160	3204	3219	3204
q14	293	284	287	284
q15	617	547	548	547
q16	702	638	643	638
q17	1857	1634	1639	1634
q18	8400	7811	7604	7604
q19	1728	1546	1583	1546
q20	2155	1927	1880	1880
q21	5729	5585	5604	5585
q22	1150	1055	1099	1055
Total cold run time: 72508 ms
Total hot run time: 61170 ms

@doris-robot

TPC-DS: Total hot run time: 194769 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2368f58165d7619d2739d0f4fb1b0cdbd37cd3c3, data reload: false

query1	2452	2135	2272	2135
query2	6256	2059	2028	2028
query3	15116	11461	235	235
query4	33114	23694	23510	23510
query5	3310	452	445	445
query6	281	198	194	194
query7	3971	292	295	292
query8	281	218	228	218
query9	9445	2666	2663	2663
query10	469	280	257	257
query11	18148	15146	15384	15146
query12	152	103	97	97
query13	1587	421	413	413
query14	9910	7899	7263	7263
query15	256	180	185	180
query16	8187	472	512	472
query17	1591	592	588	588
query18	2477	605	603	603
query19	388	192	196	192
query20	131	121	122	121
query21	210	105	105	105
query22	4979	4256	4377	4256
query23	35053	34167	34139	34139
query24	11569	3336	3321	3321
query25	589	402	396	396
query26	737	188	178	178
query27	1932	285	292	285
query28	6751	2430	2422	2422
query29	760	417	426	417
query30	406	310	304	304
query31	1032	793	821	793
query32	96	55	57	55
query33	779	289	280	280
query34	905	509	518	509
query35	896	736	731	731
query36	1102	935	945	935
query37	121	78	72	72
query38	4402	4215	4247	4215
query39	1463	1437	1580	1437
query40	199	101	102	101
query41	48	45	47	45
query42	106	100	99	99
query43	539	487	498	487
query44	1345	809	824	809
query45	180	168	168	168
query46	1143	693	709	693
query47	1929	1810	1861	1810
query48	416	342	331	331
query49	887	410	399	399
query50	796	391	397	391
query51	7319	7109	7078	7078
query52	100	88	90	88
query53	266	185	182	182
query54	1242	459	422	422
query55	78	76	80	76
query56	292	247	245	245
query57	1288	1152	1147	1147
query58	220	209	210	209
query59	3181	2987	3017	2987
query60	274	251	251	251
query61	110	114	108	108
query62	860	688	669	669
query63	220	185	184	184
query64	3857	643	641	641
query65	3285	3239	3224	3224
query66	837	302	295	295
query67	16041	15651	15587	15587
query68	4492	596	575	575
query69	457	254	252	252
query70	1158	1150	1149	1149
query71	392	250	253	250
query72	6293	3981	3940	3940
query73	775	370	360	360
query74	10267	8996	9044	8996
query75	3447	2661	2695	2661
query76	2679	1150	1050	1050
query77	374	279	266	266
query78	10428	9477	9413	9413
query79	1127	622	601	601
query80	792	429	441	429
query81	543	240	243	240
query82	1327	124	114	114
query83	243	159	156	156
query84	237	74	75	74
query85	1042	319	289	289
query86	338	294	267	267
query87	4773	4697	4768	4697
query88	3439	2239	2203	2203
query89	414	301	287	287
query90	2108	190	188	188
query91	137	106	105	105
query92	59	48	49	48
query93	1222	543	552	543
query94	771	284	307	284
query95	352	244	260	244
query96	631	281	283	281
query97	2860	2666	2691	2666
query98	212	199	199	199
query99	1563	1326	1314	1314
Total cold run time: 303314 ms
Total hot run time: 194769 ms

@doris-robot

ClickBench: Total hot run time: 33.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2368f58165d7619d2739d0f4fb1b0cdbd37cd3c3, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.05	0.04
query3	0.23	0.06	0.07
query4	1.65	0.10	0.09
query5	0.43	0.41	0.40
query6	1.14	0.68	0.66
query7	0.02	0.01	0.02
query8	0.04	0.05	0.05
query9	0.56	0.50	0.50
query10	0.56	0.56	0.57
query11	0.18	0.14	0.13
query12	0.16	0.13	0.14
query13	0.61	0.60	0.60
query14	2.84	2.73	2.79
query15	0.90	0.84	0.85
query16	0.37	0.38	0.38
query17	1.06	1.07	1.00
query18	0.18	0.18	0.19
query19	1.98	1.80	2.05
query20	0.01	0.01	0.01
query21	15.36	0.68	0.68
query22	4.30	6.60	2.40
query23	18.29	1.42	1.40
query24	2.13	0.24	0.22
query25	0.14	0.09	0.08
query26	0.26	0.18	0.19
query27	0.08	0.08	0.08
query28	13.25	1.16	1.14
query29	12.56	3.32	3.31
query30	0.25	0.06	0.07
query31	2.87	0.40	0.40
query32	3.23	0.48	0.48
query33	2.99	3.07	3.08
query34	17.01	4.49	4.50
query35	4.53	4.49	4.54
query36	0.66	0.47	0.50
query37	0.19	0.15	0.15
query38	0.16	0.14	0.15
query39	0.04	0.04	0.04
query40	0.16	0.13	0.13
query41	0.10	0.05	0.05
query42	0.06	0.05	0.04
query43	0.04	0.05	0.04
Total cold run time: 111.7 s
Total hot run time: 33.74 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.85% (9849/26019)
Line Coverage: 29.03% (81965/282316)
Region Coverage: 28.25% (42205/149389)
Branch Coverage: 24.83% (21410/86218)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2368f58165d7619d2739d0f4fb1b0cdbd37cd3c3_2368f58165d7619d2739d0f4fb1b0cdbd37cd3c3/report/index.html

@xinyiZzz
Contributor Author

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/src/runtime/buffer_control_block.cpp (resolved)

yiguolei previously approved these changes Nov 13, 2024
@github-actions github-actions bot added the approved label Nov 13, 2024

PR approved by at least one committer and no changes requested.

@xinyiZzz
Contributor Author

run buildall

@github-actions github-actions bot removed the approved label Nov 13, 2024
@github-actions github-actions bot added the approved label Nov 13, 2024

PR approved by at least one committer and no changes requested.

@wangbo wangbo left a comment

LGTM

@xinyiZzz xinyiZzz merged commit 6c2c36d into apache:master Nov 13, 2024
27 of 30 checks passed
yiguolei pushed a commit that referenced this pull request Nov 16, 2024
xinyiZzz added a commit to xinyiZzz/incubator-doris that referenced this pull request Nov 18, 2024
xinyiZzz added a commit to xinyiZzz/incubator-doris that referenced this pull request Nov 19, 2024
…when BE uses public vip (apache#43281)

Consider a Doris cluster whose FE node is reachable from the external
network, while all of its BE nodes are reachable only from the internal
network.

This works when the MySQL client or JDBC is used to connect to Doris and
execute queries, because the query results are returned to the client by
the Doris FE node.

Queries over Arrow Flight SQL fail in this setup, however, because the
ADBC client must connect to a Doris BE node to pull the query results,
and the BE nodes are not reachable from the external network.

In a production environment it is often impractical to expose Doris BE
nodes to the external network. Instead, a reverse proxy (such as nginx)
can be placed in front of all Doris BE nodes, so that an external client
connecting to nginx is routed to a randomly chosen Doris BE node.

The result of an Arrow Flight SQL query is stored on one Doris BE node,
also chosen at random. If that node is not the one nginx routed the
client to, the data has to be forwarded between the Doris BE nodes.

1. The Ticket that the Doris FE Arrow Flight Server returns to the ADBC
client contains the IP and Brpc Port of the Doris BE node that holds the
query result.

2. The Doris BE Arrow Flight Server receives a request to pull data. If
the IP:BrpcPort in the Ticket is not its own, it pulls the query result
Block from the Doris BE node identified by IP:BrpcPort, converts it to
an Arrow Batch, and returns it to the ADBC client. If the IP:BrpcPort in
the Ticket is its own, it behaves as before.
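
The decision in step 2 can be sketched as follows. This is a minimal
illustration, not the Doris implementation: the `Ticket` class, its field
names, and the `route` function are hypothetical, and the real ticket
encoding is not described in this PR text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    """Hypothetical ticket fields; the real Doris ticket layout may differ."""
    query_id: str
    be_ip: str
    brpc_port: int


def route(ticket: Ticket, self_ip: str, self_brpc_port: int) -> str:
    """Return 'local' if this BE holds the result, otherwise 'forward'."""
    if (ticket.be_ip, ticket.brpc_port) == (self_ip, self_brpc_port):
        # The result lives on this BE: serve it directly, as before.
        return "local"
    # The result lives on another BE: pull Blocks from ticket.be_ip:brpc_port,
    # convert them to Arrow batches, and return them to the client.
    return "forward"
```

A BE that receives a ticket naming a different IP:BrpcPort takes the
"forward" branch; comparing the pair rather than the IP alone also covers
several BE processes sharing one host.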

1. If the data is not on the current BE node, it can be pulled from the
other BE node asynchronously, with at least one Block cached locally on
the current BE node. This reduces the time spent on serialization,
deserialization, and RPC.
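
One way to picture this idea is a bounded producer/consumer queue. The
sketch below is only an illustration in Python, not Doris code:
`prefetch_blocks`, `fetch_block`, and `cache_size` are hypothetical names,
and `fetch_block` stands in for the RPC that pulls one Block from the
remote BE (returning `None` when the result is exhausted).

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Optional


async def prefetch_blocks(
    fetch_block: Callable[[], Awaitable[Optional[bytes]]],
    cache_size: int = 1,
) -> AsyncIterator[bytes]:
    """Yield blocks while a background task keeps up to cache_size buffered."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=cache_size)

    async def producer() -> None:
        # Fetch ahead of the consumer; put() waits while the cache is full.
        while True:
            block = await fetch_block()
            await queue.put(block)
            if block is None:  # end-of-stream marker
                return

    task = asyncio.create_task(producer())
    try:
        while True:
            block = await queue.get()
            if block is None:
                break
            yield block
    finally:
        task.cancel()  # no effect if the producer has already finished


def collect(blocks: list) -> list:
    """Usage example: drain a fake remote source through the prefetcher."""
    remaining = list(blocks)

    async def fetch_block() -> Optional[bytes]:
        await asyncio.sleep(0)  # stands in for network latency
        return remaining.pop(0) if remaining else None

    async def run() -> list:
        return [b async for b in prefetch_blocks(fetch_block)]

    return asyncio.run(run())
```

While the consumer is converting one Block, the producer is already
waiting on the next RPC, which is where the latency saving would come from.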

1. Create a Doris cluster with 1 FE and 2 BEs, and set
`arrow_flight_sql_port` in `fe.conf` and `be.conf`.

2. As root, run `systemctl status nginx` to check whether Nginx is
installed. If it is not, installing it with yum is recommended.

3. Edit `/etc/nginx/nginx.conf` (for example with `vim`) and add
`underscores_in_headers on;`.

4. Create `/etc/nginx/conf.d/arrowflight.conf` (for example with
`touch`), then edit it and add:

```
upstream arrowflight {
    server {BE1_ip}:{BE1_arrow_flight_sql_port};
    server {BE2_IP}:{BE2_arrow_flight_sql_port};
}

server {
    listen {nginx port} http2;
    listen [::]:{nginx port} http2;
    server_name doris.arrowflight.com;

    #ssl_certificate   /etc/nginx/cert/myCA.pem;
    #ssl_certificate_key /etc/nginx/cert/myCA.key;

    location / {
        grpc_pass grpc://arrowflight;
        grpc_set_header X-Real-IP $remote_addr;
        grpc_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        grpc_set_header X-Forwarded-Proto $scheme;

        grpc_read_timeout 60s;
        grpc_send_timeout 60s;

        #proxy_http_version 1.1;
        #proxy_set_header Connection "";
    }
}
```

Here `{BE1_ip}` is the IP of BE 1 and `{BE1_arrow_flight_sql_port}` is
the `arrow_flight_sql_port` set in its be.conf; likewise
`{BE2_IP}:{BE2_arrow_flight_sql_port}` for BE 2. `{nginx port}` is any
available port.
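
For clusters with more BEs, the `upstream` block can be generated rather
than typed by hand. The helper below is a small sketch; `render_upstream`
is a hypothetical name, and the addresses and ports in the usage note are
made-up example values, not Doris defaults.

```python
def render_upstream(backends: list[tuple[str, int]], name: str = "arrowflight") -> str:
    """Render an nginx upstream block with one server line per BE."""
    if not backends:
        raise ValueError("at least one BE is required")
    lines = [f"upstream {name} {{"]
    for ip, port in backends:
        # Each entry is a BE IP and the arrow_flight_sql_port from its be.conf.
        lines.append(f"    server {ip}:{port};")
    lines.append("}")
    return "\n".join(lines)
```

Calling `render_upstream([("10.0.0.1", 8050), ("10.0.0.2", 8050)])`
returns text with the same shape as the `upstream arrowflight` block in
step 4.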

6. Add the following to the be.conf of every BE:

```
public_access_ip={nginx ip}
public_access_port={nginx port}
```
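
Because both settings must be present on every BE, a quick check over each
be.conf can catch a node that was missed. This is a hedged sketch: it
assumes be.conf consists of simple `key=value` lines with `#` comments,
and `parse_conf` and `missing_public_access_keys` are hypothetical helpers,
not part of Doris.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    conf: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def missing_public_access_keys(text: str) -> list[str]:
    """Return the public-access keys that are absent or empty in be.conf text."""
    conf = parse_conf(text)
    required = ("public_access_ip", "public_access_port")
    return [key for key in required if not conf.get(key)]
```

An empty list means the file sets both `public_access_ip` and
`public_access_port`.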

xinyiZzz added a commit to xinyiZzz/incubator-doris that referenced this pull request Nov 20, 2024
…when BE uses public vip (apache#43281)
