Skip to content

Conversation

@mrhhsg
Copy link
Member

@mrhhsg mrhhsg commented Mar 27, 2024

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@mrhhsg
Copy link
Member Author

mrhhsg commented Mar 27, 2024

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.25% (8740/24796)
Line Coverage: 27.02% (71548/264780)
Region Coverage: 26.28% (37137/141327)
Branch Coverage: 23.17% (18981/81936)
Coverage Report: http://coverage.selectdb-in.cc/coverage/30c13a508d97a0848f86fe991e92df4110b830d0_30c13a508d97a0848f86fe991e92df4110b830d0/report/index.html

@doris-robot
Copy link

TPC-H: Total hot run time: 38074 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 30c13a508d97a0848f86fe991e92df4110b830d0, data reload: false

------ Round 1 ----------------------------------
q1	17719	4471	4213	4213
q2	2459	158	167	158
q3	11421	1112	1201	1112
q4	10333	765	818	765
q5	7536	3106	3008	3008
q6	208	128	123	123
q7	1055	619	585	585
q8	9952	2034	2019	2019
q9	7307	6628	6556	6556
q10	8473	3396	3548	3396
q11	438	217	217	217
q12	367	195	192	192
q13	17805	2843	2851	2843
q14	229	197	203	197
q15	505	462	460	460
q16	499	372	372	372
q17	931	543	587	543
q18	7172	6559	6485	6485
q19	1856	1433	1463	1433
q20	560	261	258	258
q21	3550	2890	2837	2837
q22	345	305	302	302
Total cold run time: 110720 ms
Total hot run time: 38074 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4149	4049	4092	4049
q2	325	238	226	226
q3	2972	2852	2865	2852
q4	1813	1558	1554	1554
q5	5271	5305	5385	5305
q6	194	115	116	115
q7	2253	1854	1882	1854
q8	3190	3297	3257	3257
q9	8714	8647	8634	8634
q10	3816	3765	3750	3750
q11	541	441	443	441
q12	709	524	523	523
q13	16903	2828	2859	2828
q14	278	253	256	253
q15	498	458	462	458
q16	454	428	423	423
q17	1721	1488	1456	1456
q18	7356	7176	7037	7037
q19	1614	1536	1514	1514
q20	1924	1735	1718	1718
q21	4782	4518	4652	4518
q22	496	443	440	440
Total cold run time: 69973 ms
Total hot run time: 53205 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 182180 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 30c13a508d97a0848f86fe991e92df4110b830d0, data reload: false

query1	929	370	346	346
query2	7144	1989	1883	1883
query3	6702	209	208	208
query4	32125	21373	21407	21373
query5	4299	404	403	403
query6	271	176	187	176
query7	4630	291	287	287
query8	244	168	192	168
query9	9723	2350	2320	2320
query10	566	254	260	254
query11	17374	14344	14276	14276
query12	142	95	88	88
query13	1631	421	427	421
query14	9469	8244	7617	7617
query15	241	203	199	199
query16	8234	259	275	259
query17	1975	581	542	542
query18	2109	289	278	278
query19	350	155	150	150
query20	95	84	88	84
query21	201	129	135	129
query22	5029	4813	4868	4813
query23	33459	32851	32854	32851
query24	11647	2881	2848	2848
query25	669	384	392	384
query26	1840	158	157	157
query27	2994	358	364	358
query28	7735	1909	1886	1886
query29	1054	643	633	633
query30	306	150	152	150
query31	961	721	750	721
query32	98	58	53	53
query33	778	250	255	250
query34	1084	489	481	481
query35	842	617	634	617
query36	1058	910	844	844
query37	181	67	67	67
query38	3580	3471	3473	3471
query39	1481	1507	1430	1430
query40	289	113	113	113
query41	51	47	48	47
query42	102	97	97	97
query43	511	438	448	438
query44	1246	738	735	735
query45	278	266	258	258
query46	1106	700	697	697
query47	1924	1862	1821	1821
query48	447	368	366	366
query49	1238	352	347	347
query50	769	369	374	369
query51	6776	6656	6611	6611
query52	107	87	92	87
query53	341	274	289	274
query54	335	264	245	245
query55	82	81	80	80
query56	254	233	234	233
query57	1206	1140	1174	1140
query58	236	209	215	209
query59	2847	2536	2638	2536
query60	291	250	245	245
query61	116	114	114	114
query62	660	449	444	444
query63	309	275	283	275
query64	6889	4088	3939	3939
query65	3111	3009	3052	3009
query66	1471	383	355	355
query67	15531	15141	15337	15141
query68	9210	523	532	523
query69	643	376	381	376
query70	1416	1151	1137	1137
query71	511	263	259	259
query72	6546	2732	2576	2576
query73	1592	321	320	320
query74	8423	6373	6485	6373
query75	3887	2223	2203	2203
query76	5485	876	863	863
query77	629	256	247	247
query78	11096	10201	10212	10201
query79	11136	524	520	520
query80	2020	369	365	365
query81	516	212	215	212
query82	527	87	83	83
query83	213	150	150	150
query84	291	76	76	76
query85	1386	317	311	311
query86	401	323	274	274
query87	3810	3536	3533	3533
query88	4513	2344	2299	2299
query89	496	373	366	366
query90	2011	176	173	173
query91	174	138	138	138
query92	60	46	46	46
query93	7341	518	506	506
query94	1372	179	177	177
query95	433	333	331	331
query96	603	268	271	268
query97	2630	2490	2496	2490
query98	234	210	210	210
query99	1208	874	932	874
Total cold run time: 322586 ms
Total hot run time: 182180 ms

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 30c13a508d97a0848f86fe991e92df4110b830d0 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       13.9 seconds inserted 10000000 Rows, about 719K ops/s

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Mar 28, 2024
@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 5ebaa4f into apache:master Mar 28, 2024
Jibing-Li added a commit that referenced this pull request Mar 29, 2024
* [fix](merge cloud) Fix cloud be set be tag map (#32864)

* [chore] Add gavinchou to collaborators (#32881)

* [chore](show) support statement to show views from table (#32358)

MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)

* [Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538)

Disable some permission operations when Ranger or LDAP are enabled.

* [chore](ci) exclude unstable trino_connector case (#32892)

Co-authored-by: stephen <hello-stephen@qq.com>

* [fix](Nereids) NPE when create table with implicit index type (#32893)

* [improvement](mtmv) Support more join types for query rewriting by materialized view (#32685)

This pattern of rewriting is supported for multi-table joins and supported join types is as following:

INNER JOIN
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
LEFT SEMI JOIN
RIGHT SEMI JOIN
LEFT ANTI JOIN
RIGHT ANTI JOIN

* [Serde](Variant) support arrow serialization for varint type (#32780)

* [fix](multicatalog) fix no data error when read hive table on cosn (#32815)

Currently, when reading a hive on cosn table, doris return empty result, but the table has data.
iceberg on cosn is ok.
The reason is misuse of cosn's file sytem. according to cosn's doc, its fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem

* [fix](nereids)EliminateGroupByConstant should replace agg's output after removing constant group by keys (#32878)

* [Fix](executor)Fix regression test for test_active_queries/test_backend_active_tasks #32899

* [fix](iceberg) fix iceberg catalog bug and p2 test cases (#32898)

1. Fix iceberg catalog bug

    This PR #30198 change the logic of `IcebergHMSExternalCatalog.java`,
    to get locationUrl by calling hive metastore's `getCatalog()` method.
    But this method only exists in hive 3+. So it will fail if we using hive 2.x.

    I temporary remove this logic, because this logic is only used from iceberg table writing.
    Which is still under development. We will rethink this logic later.

2. Fix test cases

    Some of P2 test cases missed `order_qt`. And because the output format of the floating point
    type is changed, some result in `out` files need to be regenerated.

* [revert](jni) revert part of #32455 (#32904)

* [fix](spill) Avoid releasing resources while spill tasks are executing (#32783)

* [chore](log) print query id before logging profile in be.INFO (#32922)

* [fix](grace-exit) Stop incorrectly of reportwork cause heap use after free #32929

* [improvement](decommission be) decommission check replica num (#32748)

* [fix](arrow-flight) Fix reach limit of connections error (#32911)

Fix Reach limit of connections error
in fe.conf , arrow_flight_token_cache_size is mandatory less than qe_max_connection/2. arrow flight sql is a stateless protocol, connection is usually not actively disconnected, bearer token is evict from the cache will unregister ConnectContext.

Fix ConnectContext.command not be reset to COM_SLEEP in time, this will result in frequent kill connection after query timeout.

Fix bearer token evict log and exception.

TODO: use arrow flight session: https://mail.google.com/mail/u/0/#inbox/FMfcgzGxRdxBLQLTcvvtRpqsvmhrHpdH

* [bugfix](cloud) few variable not initialized (#32868)

../../cloud/src/recycler/meta_checker.cpp
can cause uninitialised memory read.

* [fix](arrow-flight) Fix arrow flight sql compatible with JDK 17 and upgrade arrow 15.0.2 (#32796)

--add-opens=java.base/java.nio=ALL-UNNAMED, see: https://arrow.apache.org/docs/java/install.html#java-compatibility
groovy use flight sql connection to execute query SUM(MAX(c1) OVER (PARTITION BY)) report error: AGGREGATE clause must not contain analytic expressions, but no problem in Java execute it with jdbc::arrow-flight-sql.
groovy not support print arrow array type, throw IndexOutOfBoundsException.
"arrow_flight_sql" not support two phase read
./run-regression-test.sh --run --clean -g arrow_flight_sql

* [fix](spill) SpillStream's writer maybe may not have been finalized (#32931)

* [improvement](spill) Disable DistinctStreamingAgg when spill is enabled (#32932)

* [Improve](inverted_index) update clucene and improve array inverted index writer  (#32436)

* [Performance](exec) replace SipHash in function by XXHash (#32919)

* [feature](agg) add aggregate function sum0 (#32541)

* [improvement](mtmv) Support to get tables in materialized view when collecting table in plan (#32797)

Support to get tables in materialized view when collecting table in plan

table scehma as fllowing:

create materialized view mv1
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 1 
PROPERTIES ('replication_num' = '1')
 as 
select 
  t1.c1, 
  t3.c2 
from 
  table1 t1 
  inner join table3 t3 on t1.c1 = t3.c2

if get table from the plan as follwoing, we can get [table1, table3, table2], the mv1 is expanded to get base tables;

SELECT 
  mv1.*, 
  uuid() 
FROM 
  mv1 LEFT SEMI 
  JOIN table2 ON mv1.c1 = table2.c1 
WHERE 
  mv1.c1 IN (
    SELECT 
      c1 
    FROM 
      table2
  ) 
  OR mv1.c1 < 10

* [enhance](mtmv)support olap table partition column is null (#32698)

* [enhancement](cloud) add table version to cloud (#32738)

Add table version to cloud.

In Fe:
Get: If Fe is cloud mode, get table version from meta service.
Update: Op drop/replace temp partition, commit transaction.

In meta service:
Add: create Index. init value is 1.
Remove: by recycler.
Update: commit/drop partition rpc, commit txn rpc. Atomic++.

* [fix](cloud) schema change from not null to null (#32913)

1. Use equals instead of == for type comparing
2. null bitmap size is reisze by size of ref column.

* [feature](Nereids): add ColumnPruningPostProcessor. (#32800)

* [case](rowpolicy)fix row policy has been exist (#32880)

* [fix](pipeline) fix use error row desc when origin block clear (#32803)

* [fix](Nereids) support variant column with index when create table (#32948)

* [opt](Nereids) support create table with variant type (#32953)

* [test](insert-overwrite) Add insert overwrite auto detect concurrency cases (#32935)

* [fix](compile) fe cannot compile in idea (#32955)

* [enhancement](plsql) Support select * from routines (#32866)

Support show of plsql procedure using select * from routines.

* [fix](trino-connector) fix `NoClassDefFoundError` of hudi `Utils` class (#32846)

Due to the change of this PR #32455 , the `trino-connector-scanner` package cannot access the `hudi_scanner` package, so the exception NoclassDeffounderror will appear.

We need to write a separate Utils class.

* [exec](column) change some complex column move to noexcept (#32954)

* [Enhancement](data skew) extends show data skew (#32732)

* [chore](test) let suite compatible with Nereids (#32964)

* Support identical column name in different index. (#32792)

* Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470)

* [fix](merge-iterator) fix NOT_IMPLEMENTED_ERROR when read next block view (#32961)

* [improvement](executor)Add tag property for workload group #32874

* [fix](auth)unified workload and resource permission logic (#32907)

- `Grant resource` can no longer grant global `usage_priv`
-  `grant resource %` instead of `grant resource *`

before change:
```
grant usage_priv on resource * to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: Usage_priv 
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: NULL
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 
```
after change
```
grant usage_priv on resource '%' to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: NULL
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: %: Usage_priv 
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 

```

---------

Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Gavin Chou <gavineaglechou@gmail.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: seawinde <149132972+seawinde@users.noreply.github.com>
Co-authored-by: lihangyu <15605149486@163.com>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: Vallish Pai <vallishpai@gmail.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jensen <czjourney@163.com>
Co-authored-by: zhangdong <493738387@qq.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants