
[improvement][performance] improve lru cache resize performance and memory usage #9521

Merged
merged 1 commit into apache:master on May 19, 2022

Conversation

gaodayue
Contributor

Proposed changes

Problem Summary:

Previously we used a doubly linked list for LRUCache's HandleTable in order to support a lookup-free remove(const LRUHandle* h) function. However, it 1) slows down the resize operation, because we need to malloc/free a dummy LRUHandle node for each bucket, and 2) increases the overall memory usage.

It turns out that we can achieve a fast remove(const LRUHandle* h) using just a singly linked list.
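
For illustration, here is a minimal, self-contained sketch of the idea (simplified; not the exact code in this PR): each bucket keeps a singly linked chain, and remove(const LRUHandle* h) walks the chain with a pointer-to-pointer so the predecessor's link can be rewritten without storing a back pointer or a dummy head node.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for Doris' LRUHandle; only the fields needed here.
struct LRUHandle {
    uint32_t hash = 0;
    LRUHandle* next_hash = nullptr;  // singly linked chain within a bucket
};

class HandleTable {
public:
    explicit HandleTable(size_t length) : _length(length), _list(length, nullptr) {}

    // Remove by node identity: walk the bucket with a pointer-to-pointer so the
    // predecessor's link can be rewritten without keeping a prev pointer.
    bool remove(const LRUHandle* h) {
        LRUHandle** ptr = &_list[h->hash & (_length - 1)];
        while (*ptr != nullptr && *ptr != h) {
            ptr = &(*ptr)->next_hash;  // pointer comparison only, no key comparison
        }
        if (*ptr == nullptr) {
            return false;              // not found
        }
        *ptr = (*ptr)->next_hash;      // splice the node out of the chain
        return true;
    }

private:
    size_t _length;                    // number of buckets, a power of two
    std::vector<LRUHandle*> _list;     // one head pointer per bucket, no dummy nodes
};
```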

Checklist(Required)

  1. Does it affect the original behavior: (No)
  2. Has unit tests been added: (Yes)
  3. Has document been added or modified: (No Need)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@gaodayue gaodayue self-assigned this May 12, 2022
@xinyiZzz
Contributor

Hi @gaodayue
Please provide test results to demonstrate the performance and memory usage benefits of the singly linked list, especially under high concurrency.

The LRU cache is used globally in Doris, and resize is a very low-frequency operation; it only happens frequently when the BE is cold-started.

Comment on lines +114 to +120
bool HandleTable::remove(const LRUHandle* h) {
    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
    while (*ptr != nullptr && *ptr != h) {
        ptr = &(*ptr)->next_hash;
    }

    LRUHandle* result = *ptr;
Contributor


So, I think faster remove speed is more important than faster resize.

Contributor Author


This PR will not slow down remove because

  1. the expected length of each bucket's linked list is 1 given a good hash function, so we won't spend much time traversing the list
  2. only a pointer comparison is needed in each iteration, with no time-consuming key comparison

As a result, CacheTest.SimpleBenchmark changed from 1576ms to 1570ms over 160000 iterations, and from 161392ms to 158165ms over 16000000 iterations.

Contributor

@xinyiZzz xinyiZzz May 17, 2022


You're right~ I see you optimized the code logic at the same time.

@morningman morningman added kind/fix Categorizes issue or PR as related to a bug. kind/improvement dev/1.0.1-deprecated should be merged into dev-1.0.1 branch and removed kind/fix Categorizes issue or PR as related to a bug. labels May 14, 2022
@morningman morningman added this to the v1.1 milestone May 15, 2022
@gaodayue
Contributor Author

@xinyiZzz thanks for your reply, I will answer inline

The LRU cache is used globally in Doris, and resize is a very low-frequency operation; it only happens frequently when the BE is cold-started.

A bit of context for this PR: we observed several latency spikes after restarting a BE and found that they were caused by the resize operation (see the graph below), so we'd like to improve the resize operation to avoid query timeouts.

[image: query latency graph showing spikes after BE restart caused by cache resize]

The cluster uses 200G of page cache per BE, and resizing a single shard to 8 million entries took >2 seconds.

Please provide test results to demonstrate the performance and memory usage benefits of the singly linked list, especially under high concurrency.

I wrote a simple program to test the resize speed. From the results, we can see that for large resizes, this PR reduces the resize duration by about 60%. This PR also reduces memory usage, but that's not our main concern.

-- before the PR
resize to 16 takes 5 us
resize to 32 takes 6 us
resize to 64 takes 6 us
resize to 128 takes 14 us
resize to 256 takes 23 us
resize to 512 takes 49 us
resize to 1024 takes 102 us
resize to 2048 takes 193 us
resize to 4096 takes 444 us
resize to 8192 takes 813 us
resize to 16384 takes 1708 us
resize to 32768 takes 3391 us
resize to 65536 takes 7104 us
resize to 131072 takes 16235 us
resize to 262144 takes 38272 us
resize to 524288 takes 88278 us
resize to 1048576 takes 184775 us
resize to 2097152 takes 380625 us
resize to 4194304 takes 774974 us
resize to 8388608 takes 1573816 us
resize to 16777216 takes 3170542 us

-- after the PR
resize to 16 takes 5 us
resize to 32 takes 2 us
resize to 64 takes 2 us
resize to 128 takes 2 us
resize to 256 takes 3 us
resize to 512 takes 4 us
resize to 1024 takes 8 us
resize to 2048 takes 18 us
resize to 4096 takes 36 us
resize to 8192 takes 82 us
resize to 16384 takes 170 us
resize to 32768 takes 406 us
resize to 65536 takes 753 us
resize to 131072 takes 1754 us
resize to 262144 takes 6720 us
resize to 524288 takes 24822 us
resize to 1048576 takes 58967 us
resize to 2097152 takes 132347 us
resize to 4194304 takes 278078 us
resize to 8388608 takes 587134 us
resize to 16777216 takes 1212146 us
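
For context, a hedged sketch of what a dummy-node-free rehash over singly linked buckets can look like (illustrative only; not necessarily the exact code measured above). Each existing node is simply relinked into the doubled bucket array, so resize does no per-bucket allocation at all:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct LRUHandle {
    uint32_t hash = 0;
    LRUHandle* next_hash = nullptr;
};

// Rehash every entry into a bucket array twice as large. No dummy nodes are
// allocated; existing nodes are relinked (push-front) into their new buckets.
std::vector<LRUHandle*> resize(const std::vector<LRUHandle*>& old_list) {
    const size_t new_length = old_list.size() * 2;
    std::vector<LRUHandle*> new_list(new_length, nullptr);
    for (LRUHandle* head : old_list) {
        LRUHandle* h = head;
        while (h != nullptr) {
            LRUHandle* next = h->next_hash;
            LRUHandle** bucket = &new_list[h->hash & (new_length - 1)];
            h->next_hash = *bucket;  // push-front into the new bucket
            *bucket = h;
            h = next;
        }
    }
    return new_list;
}
```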

@gaodayue
Contributor Author

The failed checks seem unrelated to this PR. Can anyone help to trigger the rerun?

@xinyiZzz
Contributor

A bit of context for this PR: we observed several latency spikes after restarting a BE and found that they were caused by the resize operation (see the graph below), so we'd like to improve the resize operation to avoid query timeouts.

The cluster uses 200G of page cache per BE, and resizing a single shard to 8 million entries took >2 seconds.

I wrote a simple program to test the resize speed. From the results, we can see that for large resizes, this PR reduces the resize duration by about 60%. This PR also reduces memory usage, but that's not our main concern.

Convincing test results, nice job~

@xinyiZzz
Contributor

The failed checks seem unrelated to this PR. Can anyone help to trigger the rerun?

Try git rebase and then push again.

be/src/olap/lru_cache.cpp (outdated review comment, resolved)
@xinyiZzz
Contributor

LGTM

@gaodayue
Contributor Author

Error from the Code Quality Analysis check, not related to this PR

[INFO] Doris FE Project Parent POM ........................ FAILURE [11:44 min]
[INFO] fe-common .......................................... SUCCESS [ 21.655 s]
[INFO] spark-dpp .......................................... SUCCESS [ 11.115 s]
[INFO] fe-core ............................................ SUCCESS [01:07 min]
[INFO] hive-udf ........................................... SUCCESS [ 19.401 s]
[INFO] java-udf ........................................... SUCCESS [ 46.357 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  14:33 min
[INFO] Finished at: 2022-05-18T04:06:40Z
[INFO] ------------------------------------------------------------------------
Error:  Failed to execute goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.9.1.2184:sonar (default-cli) on project fe: Project not found. Please check the ‘sonar.projectKey’ and ‘sonar.organization’ properties, the ‘SONAR_TOKEN’ environment variable, or contact the project administrator -> [Help 1]

@yiguolei yiguolei added the approved Indicates a PR has been approved by one committer. label May 19, 2022
@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label May 19, 2022
Contributor

@yiguolei yiguolei left a comment


LGTM

@yiguolei yiguolei added approved Indicates a PR has been approved by one committer. and removed dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels May 19, 2022
@github-actions
Contributor

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit c098586 into apache:master May 19, 2022
englefly pushed a commit to englefly/incubator-doris that referenced this pull request May 23, 2022
Lchangliang added a commit to Lchangliang/incubator-doris that referenced this pull request Jun 8, 2022
* [refactor] delete OLAP_LOG_WARNING related macro definition (apache#9484)

Co-authored-by: BePPPower <fangtiewei@selectdb.com>

* [refactor][style] Use clang-format to sort includes (apache#9483)

* [feature] show create materialized view (apache#9391)

* [feature](mysql-table) support utf8mb4 for mysql external table (apache#9402)

This patch supports utf8mb4 for mysql external tables.

Previously only the utf8 charset was supported, even if someone needs a mysql external table with the utf8mb4 charset.

When creating a mysql external table, an optional property "charset" can be added to set the character set of the mysql connection;
the default value is "utf8". You can set it to "utf8mb4" instead of "utf8" when needed.

* [regression] add regression test for compaction (apache#9437)

Trigger compaction via REST API in this case.

* [refactor](backend) Refactor the logic of selecting Backend in FE. (apache#9478)

There are many places in FE where a group of BE nodes needs to be selected according to certain requirements, for example:
1. When creating replicas for a tablet.
2. When selecting a BE to execute Insert.
3. When Stream Load forwards http requests to BE nodes.

These operations all have the same logic. So this CL mainly changes:
1. Create a new `BeSelectionPolicy` class to describe the set of conditions for selecting BE.
2. The logic of selecting BE nodes in `SystemInfoService` has been refactored, and the following two methods are used uniformly:
    1. `selectBackendIdsByPolicy`: Select the required number of BE nodes according to the `BeSelectionPolicy`.
    2. `selectBackendIdsForReplicaCreation`: Select the BE node for the replica creation operation.

Note that there are some changes here:
For the replica creation operation, the round-robin method was used to select BE nodes before,
but now it is changed to `random` selection for the following reasons:
1. Although the previous logic is round-robin, it is actually random.
2. The final diff of the random algorithm will not be greater than 5%, so it can be considered that the random algorithm
     can distribute the data evenly.

* [fix](http) Hardening Recommendations Disable TRACE/TRAC methods (apache#9479)

* [refactor](Nereids): cascades refactor (apache#9470)

Describe the overview of changes.

- rename GroupExpression
- use `HashSet<GroupExpression> groupExpressions` in `memo`
- add label of `Nereids` for CI
- remove `GroupExpr` from Plan

* [doc] update fe checkstyle doc (apache#9373)

* [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (apache#9462)

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fixbug]fix bug for OLAP_SUCCESS with Status (apache#9427)

* [feature] support row policy filter (apache#9206)

* [chore](fe code style)add suppressions to fe check style (apache#9429)

Currently FE checkstyle checks all files, but some rules should only be applied to production files.
Add suppressions to disable those rules on test files.

* [fix](broker-load) can't load parquet file with column name case sensitive with Doris column (apache#9358)

* [fix](binlog-load) binlog load fails because txn exceeds the default value (apache#9471)

Binlog load: because the txn count exceeds the default value, resume fails. Give the user
a friendly prompt message, instead of reporting success (as before) and still failing
after a while, which leaves the user puzzled.
Issue Number: close apache#9468

* [refactor]Cleanup unused empty files (apache#9497)

* [refactor] Check status precise_code instead of construct OLAPInternalError (apache#9514)

* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status

* [fix](storage) fix core for string predicate in storage layer (apache#9500)

Co-authored-by: Wang Bo <wangbo36@meituan.com>

* [Bug] (load) Broker load kerberos auth fail (apache#9494)

* Incorrect sequence numbers in revision documents. (apache#9496)

Co-authored-by: smallhibiscus <844981280>

* [regression test]add the regression test for json load (apache#9517)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [style](java) format fe code with some check rules (apache#9460)

Issue Number: close apache#9403

Set the below rules' severity to error and format the code according to the check info.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found

* [bugfix](load) fix coredump in ordinal index flush (apache#9518)

Commit apache#9123 introduced the bug: the bitshuffle page returns an error when the
page is full, so the scalar column writer cannot switch to the next page, which leaves
the ordinal index null when flushing.

All page builders should return OK when the page is full, and the column writer procedure
should be: append_data, check is_page_full, switch to the next page.

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fix][vectorized-storage] did not check column writer's write status

* Clean the version.sh file before build, otherwise the version information in the binary package produced by this compilation is still the commit id of the last time. (apache#9534)

Co-authored-by: stephen <hello-stephen@qq.com>

* [doc]Add ARM architecture compilation tutorial content (apache#9535)

Co-authored-by: manyi <fop@freeoneplus.com>

* [feature-wip](array-type) array_contains support more nested data types (apache#9170)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor (apache#9491)

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor

* [doc] [Improved] The flink connector documentation is perfect (apache#9528)

Co-authored-by: 王磊 <lei.wang@unidt.com>

* [feature] add vectorized vjson_scanner (apache#9311)

This PR adds the vectorized vjson_scanner, which supports vectorized json import in the stream load flow.

* [fix](Function) fix case when function return null with abs function (apache#9493)

* [fix](lateral-view) Error view includes lateral view (apache#9530)

Fixed apache#9529

When a lateral view is based on an inline view which belongs to a view,
Doris could not resolve the columns of the lateral view in the query.
When a query uses a view, it mainly relies on the string representation of the view.
That is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view lacked handling of the lateral view,
which led to query errors when using such views.
This PR mainly fixes the string representation of inline views.

* [refactor](es) Clean es tcp scannode and related thrift definitions (apache#9553)

PaloExternalSourcesService was designed for es_scan_node using the TCP protocol.
But the ES TCP protocol requires deploying a TCP jar into the ES code. Both the ES version and the Lucene version have been upgraded,
and the TCP jar is not maintained any more.

So I removed all the related code and thrift definitions.

* [bugfix](vectorized) vectorized write: invalid memory access caused by podarray resize (apache#9556)

* ADD: supplement the IDEA development docs and add steps for generating help-resource.zip (apache#9561)

* [doc]fix doc typo in data-model and date data type (apache#9571)

* [Doc]Add show tables help documentation (apache#9568)

* [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (apache#9566)

* [fix](function) fix last_value get wrong result when have order by clause (apache#9247)

* [Feature](Nereids) Data structure of comparison predicate (apache#9506)

1. The data structure of the comparison expression
2. Refactored the inheritance and implementation relationship of tree node

```
        +-- ---- ---- ---+- ---- ---- ---- ---+- ---- ----- ---- ----TreeNode-----------------+
        |                |                    |                                               |
                                                                                              |
        |                |                    |                                               |
                                                                                              v
        v                v                    v                                           Abstract Tree Node
    Leaf Node        Unary Node          Binary Node                              +--------          ---------+
        |                |                    |                                   |        (children)         |
                                                                                  |                           |
        v                v                    v                                   v                           v
Leaf Expression   Unary Expression      Binary Expression              +------Expression----+           Plan Node
        |                |                    |                        |                    |
                                                                       |                    |
        |                |                    |                        v                    v
        |                |                    +- ---- ---- -----> Comparison Predicate     Named Expr
                                                                                       +----   -------+
        |                |                                                             v              v
        |                +- -- --- --- --- --- --- --- --- --- --- --- --- --- ---> Alias Expr      Slot
                                                                                                      ^
        |                                                                                             |
        |                                                                                             |
        +---- --- ---- ------ ---- ------- ------ ------- --- ------ ------ ----- ---- ----- ----- ---+
```

* [fix](planner)VecNotImplException thrown when query need rewrite and some slot cannot changed to nullable (apache#9589)

* [chore] Fix compilation errors reported by clang (apache#9584)

* [docs]Modifide flink-doris-connector.md (apache#9595)

* [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (apache#9433)

* [feature-wip](hudi) Step1: Support create hudi external table (apache#9559)

support create hudi table
support show create table for hudi table

1. create hudi table without schema (recommended)
```sql
    CREATE [EXTERNAL] TABLE table_name
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```

2. create hudi table with schema
```sql
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```
When creating a hudi table with schema, the columns must exist in the corresponding table in the hive metastore.

* [fix](storage-vectorized) fix VMergeIterator core dump (apache#9564)

It can be reproduced on a rowset with many segments, i.e. with segment overlap. It may not be easy to reproduce.

* [Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization (apache#9547)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [Bug][Vectorized] Fix insert bimmap column with nullable column (apache#9408)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [doc]add largeint doc (apache#9609)

add largeint doc

* [doc]modified the spark-load doc (apache#9605)

* [code format]Upgrade clang-format in BE Code Formatter from 8 to 13 (apache#9602)

* [feature] group_concat support distinct (apache#9576)

* [feature] Add StoragePolicyResource for Remote Storage (apache#9554)

Add StoragePolicyResource for Remote Storage

* [fix] fix bug that replica can not be repaired duo to DECOMMISSION state (apache#9424)

Reset the state of replicas that are in DECOMMISSION state after scheduling is finished.

* [config] Remove some old config and session variable (apache#9495)

1. Remove session variable "enable_lateral_view"
2. Remove Fe config: enable_materialized_view
3. Remove Fe config: enable_create_sync_job
4. Fe config dynamic_partition_enable is now only used to disable the dynamic partition scheduler.

* [Improvement] reduce string size in serialization (apache#9550)

* [Improvement][ASAN] make BE can exit normally and ASAN memory leak checking work (apache#9620)

* [clang build]fix clang compile error (apache#9615)

* [regression test] add some case for json load regression test (apache#9614)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [BUG] fix information_schema.columns results not correctly on vec engine (apache#9612)

* VSchemaScanNode get_next bugfix

* add regression-test case for VSchemaScanNode

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [bug] (init) Java version check fail (apache#9607)

* [improment](planner) push down predicate past two phase aggregate (apache#9498)

The "push down predicate past aggregate" rule cannot push a predicate past a two-phase aggregate.

The original plan is like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
should be optimized to
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```

* [fixbug](vec-load) fix core of segment_writer while it is not thread-safe (apache#9569)

Introduced in stream-load-vec apache#9280: BetaRowset enables multi-threaded
memtable flush, and memtable flush calls rowset_writer.add_block, which
uses the member variable _segment_writer to write, so multiple threads
may operate on the same segment_writer concurrently.

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fix](storage) low_cardinality_optimize core dump when is null predicate (apache#9586)

Issue Number: close apache#9555
Make the last value of the dictionary null; when ColumnDict inserts a null value,
add the encoding corresponding to the last value of the dictionary.

* [regression test] Add compaction regression test case for different data models (apache#9660)

* [fix](planner) unnecessary cast will be added on children in CaseExpr sometimes (apache#9600)

An unnecessary cast is added on children in CaseExpr because symbolized equality is used to compare `Expr` types.
This leads to expression comparison mistakes and then to expression substitution failures when using `ExprSubstitutionMap`.

* [website] fix doris website with no link to the Privacy Policy. (apache#9665)

All websites must link to the Privacy Policy

* [doc] Fixed a error in the Bitmap Index section of the document (apache#9679)

* [refactor][regressiontest] reorder license header and import statement (apache#9672)

* [FeConfig](Project) Project optimization is enabled by default (apache#9667)

* [doc]update streamload 2pc doc (apache#9651)

Co-authored-by: wudi <>

* [BUG] fix bug for vectorized compaction and some storage vectorization bug (apache#9610)

* [style](fe) code correct rules and name rules (apache#9670)

* [style](fe) code correct rules and name rules

* revert some change according to comments

* [enhancement] Improve debugging experience. (apache#9677)

* [Feature] cancel load support state (apache#9537)

* Fix some typos for docs. (apache#9680)

* Fix some typos in be/. (apache#9681)

* [Bug] Fix timestamp_diff issue when timeunit is year and month (apache#9574)

* [fix] fix Code Quality Analysis failed (apache#9685)

* [improvement][performance] improve lru cache resize performance and memory usage (apache#9521)

* [Bug][Vectorized] fix schema change add varchar type column default value get wrong result (apache#9523)

* [Enhance] Add host info to heartbeat error msg (apache#9499)

* [Enhancement]  improve parquet reader via arrow's prefetch and multi thread (apache#9472)

* add ArrowReaderProperties to parquet::arrow::FileReader

* support perfecth batch

* [fix](sparkload): fix min_value will be negative number when `maxGlobalDictValue`  exceeds integer range (apache#9436)

* [refactor][rowset]move rowset writer to a single place (apache#9368)

* [feature](nereids): add join rules base code (apache#9598)

* [docs] Fix error command of meta tool docs (apache#9590)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [improvement](stream-load) adjust read unit of http to optimize stream load (apache#9154)

* [fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (apache#9190)

Hive and trino/presto automatically trim trailing spaces but Doris doesn't.
This could cause query results to differ from Hive.

Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, when reading csv from the broker scan node, the trailing spaces of the columns will be trimmed.

* [Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (apache#9440)

* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (apache#9666)

* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug of vjson scanner not supporting `range_from_file_path`
2. fix bug of vjson/vbroker scanner core dump when src/dest slot nullability differs
3. fix bug of vparquet filter_block when the reference count of a column is not 1
4. refactor the code to simplify it

It only changes vectorized load, not the original row-based load.

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [code style] minor update for code style (apache#9695)

* [enhancement](community): enhance java style (apache#9693)

Enhance Java style.

Now the checkstyle rules about code order are on this page: Class and Interface Declarations.

This PR lets IDEA automatically rearrange code.

* [Refactor] add vpre_filter_expr for vectorized to improve performance (apache#9508)

* [doc]Add insert best practices (apache#9723)

Add insert best practices

* [deps] libhdfs3 build enable kerberos support (apache#9524)

Currently, the libhdfs3 library integrated into the Doris BE does not support accessing clusters with kerberos authentication
enabled, because the kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3.

So this PR enables kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:

- gsasl version: 1.8.0
- krb5 version: 1.19

* [Refactor] simplify some code in routine load (apache#9532)

* [refactor](load) add tablet errors when close_wait return error (apache#9619)

* [fix] NullPredicate should implement evaluate_vec (apache#9689)

select column from table where column is null

* [doc] Fix typos in documentation (apache#9692)

* [config](checksum) Disable consistency checker by default (apache#9699)

Disable by default because current checksum logic has some bugs.
And it will also bring some overhead.

* [doc] Add trim_tailing_spaces_for_external_table_query variable to the docs. (apache#9701)

* [improvement](planner) Backfill the original predicate pushdown code (apache#9703)

Due to the current architecture, predicate derivation at rewrite time cannot cover all cases,
because rewrite processes `on` first and then `where`, and when there are subqueries not all cases can be derived.
So keep the predicate pushdown method here.

e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;

InferFiltersRule can't infer t2 = 1, because that is out of its specification.

The expression (t2 = 1) can actually be deduced and pushed down to the scan node.

* [doc] update docs for FE UT (apache#9718)

* [doc] Update dev image (apache#9721)

* [typo] Fix typos in comments (apache#9710)

* Fix some typos in fe/. (apache#9682)

* [Bug-Fix][Vectorized] Full join return error result (apache#9690)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [doc]Add SQL Select usage help documentation (apache#9729)

Add SQL Select usage help documentation

* [vec][opt] opt hash join build resize hash table before insert data (apache#9735)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [bugfix]fix column reader compress codec unsafe problem (apache#9741)

by moving codec from shared reader to unshared iterator

* [bugfix]teach BufferedBlockMgr2 track memory right (apache#9722)

The problem was introduced by e2d3d01.

* [Enhancement](Vectorized)build hash table with new thread, as non-vec… (apache#9290)

* [Enhancement][Vectorized]build hash table with new thread, as non-vectorized past do

edit after comments

* format code with clang format

Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>

* [Enhancement](Nereids)refactor plan node into plan + operator (apache#9755)

Close apache#9623

Summary:
This pr refactor plan node into plan + operator.

In the previous version in nereids, a plan node consists of children and relational algebra, e.g.
```java
class LogicalJoin extends LogicalBinary {
  private Plan left, right;
}
```
This structure above is easy to understand, but it is difficult to optimize `Memo.copyIn`: rules generate a complete sub-plan,
and Memo must compare the complete sub-plan to deduplicate GroupExpressions, which hurts performance.

First, we need to change the rules to generate a partial sub-plan and replace some children plans with a placeholder, e.g. LeafOp in the Columbia optimizer. Then mark some children in the sub-plan as unchanged and bind the related group, so we don't have to compare and copy some sub-plans if the related group already exists.

Second, we need to separate the original `Plan` into `Plan` and `Operator`: a Plan contains children and an Operator, and an Operator just denotes the relational algebra (no children/input field). This design makes operator and children independent of each other, so the plan-group binder can generate a placeholder plan (containing the related group) for the sub-query and does not have to generate the current plan node case by case, because the plan is immutable (meaning children are replaced by generating a new plan). And rule implementers can reuse the placeholder to generate a partial sub-plan.

Operator and Plan have the similar inheritance structure like below. XxxPlan contains XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
          TreeNode
             │
             │
     ┌───────┴────────┐                                                   Operator
     │                │                                                       │
     │                │                                                       │
     │                │                                                       │
     ▼                ▼                                                       ▼
Expression          Plan                                                PlanOperator
                      │                                                       │
                      │                                                       │
          ┌───────────┴─────────┐                                             │
          │                     │                                 ┌───────────┴──────────────────┐
          │                     │                                 │                              │
          │                     │                                 │                              │
          ▼                     ▼                                 ▼                              ▼
     LogicalPlan          PhysicalPlan                   LogicalPlanOperator           PhysicalPlanOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalLeaf      ├──►PhysicalLeaf                  ├──► LogicalLeafOperator       ├───►PhysicalLeafOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalUnary     ├──►PhysicalUnary                 ├──► LogicalUnaryOperator      ├───►PhysicalUnaryOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          └───►LogicalBinary    └──►PhysicalBinary                └──► LogicalBinaryOperator     └───►PhysicalBinaryOperator
```

The concrete operator extends the XxxNaryOperator, e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalProject extends PhysicalUnaryOperator;
class LogicalRelation extends LogicalLeafOperator;
```

So the first example change to this:
```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
  private Plan left, right;
  private LogicalBinaryOperator operator;
}

class LogicalJoin extends LogicalBinaryOperator {}
```

Under such changes, a Rule must build the plan and operator as needed, not only the plan as before.
For example, the JoinCommutative rule:
```java
public Rule<Plan> build() {
  // the plan override function can automatic build plan, according to the Operator's type,
  // so return a LogicalBinary(LogicalJoin, Plan, Plan)
  return innerLogicalJoin().then(join -> plan(
    // operator
    new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
    // children
    join.right(),
    join.left()
  )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```

* [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (apache#9661)

1. Fix Lru Cache MemTracker consumption value being negative.
2. Fix compaction Cache MemTracker not tracking anything.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.

* [fix] group by with two NULL rows after left join (apache#9688)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [doc] Add manual for Array data type and functions (apache#9700)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [security] update canal version to fix fastjson security issue (apache#9763)

* [fix] disable transfer data large than 2GB by brpc (apache#9770)

Because brpc and protobuf cannot transfer data larger than 2GB (anything larger will overflow), add a check before sending.

* [Improvement] fix typo (apache#9743)

* [stream-load-vec]: memtable flush only if necessary after aggregated (apache#9459)

Co-authored-by: weixiang <weixiang06@meituan.com>

* [feature-wip][array-type] Support more sub types. (apache#9466)

Please refer to apache#9465

* [fix](resource-tag) Consider resource tags when assigning tasks for broker & routine load (apache#9492)

This CL mainly changes:
1. Broker Load
    When assigning backends, use the user-level resource tag to find available backends.
    If the user-level resource tag is not set, a broker load task can be assigned to any BE node;
    otherwise, the task can only be assigned to BE nodes which match the user-level tags.

2. Routine Load
    The current routine load job does not have user info, so it can not get user level tag when assigning tasks.
    So there are 2 ways:
    1. For old routine load job, use tags of replica allocation info to select BE nodes.
    2. For new routine load job, the user info will be added and persisted in routine load job.

* [fix](help) fix bug of help command (apache#9761)

This bug was introduced by apache#9306: users had to execute
"help stream-load" to show the help doc.
But actually, it should be "help stream load".

* merge master

Co-authored-by: BePPPower <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
Co-authored-by: Stalary <452024236@qq.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
Co-authored-by: jakevin <30525741+jackwener@users.noreply.github.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: yixiutt <102007456+yixiutt@users.noreply.github.com>
Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: pengxiangyu <diablowcg@163.com>
Co-authored-by: deardeng <565620795@qq.com>
Co-authored-by: hongbin <xlwh@users.noreply.github.com>
Co-authored-by: plat1ko <36853835+platoneko@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Wang Bo <wangbo36@meituan.com>
Co-authored-by: Hui Tian <827677355@qq.com>
Co-authored-by: smallhibiscus <844981280@qq.com>
Co-authored-by: carlvinhust2012 <huchenghappy@126.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
Co-authored-by: yinzhijian <373141588@qq.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: FreeOnePlus <54164178+FreeOnePlus@users.noreply.github.com>
Co-authored-by: manyi <fop@freeoneplus.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: LOVEGISER <wangleigis@163.com>
Co-authored-by: 王磊 <lei.wang@unidt.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: EmmyMiao87 <522274284@qq.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: jacktengg <18241664+jacktengg@users.noreply.github.com>
Co-authored-by: dataalive <99398130+dataalive@users.noreply.github.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: zy-kkk <815574403@qq.com>
Co-authored-by: dujl <dujlmail@gmail.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Co-authored-by: Stalary <stalary@163.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: ZenoYang <cookie.yz@qq.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wudi <676366545@qq.com>
Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
Co-authored-by: huangzhaowei <carlmartinmax@gmail.com>
Co-authored-by: Dayue Gao <gaodayue@meituan.com>
Co-authored-by: leo65535 <leo65535@163.com>
Co-authored-by: spaces-x <weixiao5220@gmail.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: xiepengcheng01 <100340096+xiepengcheng01@users.noreply.github.com>
Co-authored-by: gtchaos <gsls1817@gmail.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: zxealous <xealous0729@gmail.com>
Co-authored-by: zhengshiJ <32082872+zhengshiJ@users.noreply.github.com>
Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: 924060929 <924060929@qq.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: weixiang <weixiang06@meituan.com>
Lchangliang added a commit to Lchangliang/incubator-doris that referenced this pull request Jun 9, 2022
* [refactor] delete OLAP_LOG_WARNING related macro definition (apache#9484)

Co-authored-by: BePPPower <fangtiewei@selectdb.com>

* [refactor][style] Use clang-format to sort includes (apache#9483)

* [feature] show create materialized view (apache#9391)

* [feature](mysql-table) support utf8mb4 for mysql external table (apache#9402)

This patch supports utf8mb4 for mysql external table.

if someone needs a mysql external table with utf8mb4 charset, but only support charset utf8 right now.

When create mysql external table, it can add an optional propertiy "charset" which can set character fom mysql connection,
default value is "utf8". You can set "utf8mb4" instead of "utf8" when you need.

* [regression] add regression test for compaction (apache#9437)

Trigger compaction via REST API in this case.

* [refactor](backend) Refactor the logic of selecting Backend in FE. (apache#9478)

There are many places in FE where a group of BE nodes needs to be selected according to certain requirements. for example:
1. When creating replicas for a tablet.
2. When selecting a BE to execute Insert.
3. When Stream Load forwards http requests to BE nodes.

These operations all have the same logic. So this CL mainly changes:
1. Create a new `BeSelectionPolicy` class to describe the set of conditions for selecting BE.
2. The logic of selecting BE nodes in `SystemInfoService` has been refactored, and the following two methods are used uniformly:
    1. `selectBackendIdsByPolicy`: Select the required number of BE nodes according to the `BeSelectionPolicy`.
    2. `selectBackendIdsForReplicaCreation`: Select the BE node for the replica creation operation.

Note that there are some changes here:
For the replica creation operation, the round-robin method was used to select BE nodes before,
but now it is changed to `random` selection for the following reasons:
1. Although the previous logic is round-robin, it is actually random.
2. The final diff of the random algorithm will not be greater than 5%, so it can be considered that the random algorithm
     can distribute the data evenly.

* [fix](http) Hardening Recommendations Disable TRACE/TRAC methods (apache#9479)

* [refactor](Nereids): cascades refactor (apache#9470)

Describe the overview of changes.

- rename GroupExpression
- use `HashSet<GroupExpression> groupExpressions` in `memo`
- add label of `Nereids` for CI
- remove `GroupExpr` from Plan

* [doc] update fe checkstyle doc (apache#9373)

* [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (apache#9462)

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fixbug]fix bug for OLAP_SUCCESS with Status (apache#9427)

* [feature] support row policy filter (apache#9206)

* [chore](fe code style)add suppressions to fe check style (apache#9429)

Current fe check style check all files. But some rules should be only applied on production files.
Add suppressions to suppress some rules on test files.

* [fix](broker-load) can't load parquet file with column name case sensitive with Doris column (apache#9358)

* [fix](binlog-load) binlog load fails because txn exceeds the default value (apache#9471)

binlog load Because txn exceeds the default value, resume is a failure,
and a friendly prompt message is given to the user, instead of prompting success now,
it still fails after a while, and the user will feel inexplicable
Issue Number: close apache#9468

* [refactor]Cleanup unused empty files (apache#9497)

* [refactor] Check status precise_code instead of construct OLAPInternalError (apache#9514)

* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status

* [fix](storage) fix core for string predicate in storage layer (apache#9500)

Co-authored-by: Wang Bo <wangbo36@meituan.com>

* [Bug] (load) Broker load kerberos auth fail (apache#9494)

* Incorrect sequence numbers in revision documents. (apache#9496)

Co-authored-by: smallhibiscus <844981280>

* [regression test]add the regression test for json load (apache#9517)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [style](java) format fe code with some check rules (apache#9460)

Issue Number: close apache#9403

set below rules' severity to error and format code according check info.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found

* [bugfix](load) fix coredump in ordinal index flush (apache#9518)

commit apache#9123 introduce the bug. bitshuffle page return error when
page is full, so scalar column write cannot switch to next page, which make
ordinal index is null when flush.

All page builder should return ok when page full, and column writer procedure
shoud be append_data, check is_page_full, switch to next page

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fix][vectorized-storage] did not check column writer's write status

* Clean the version.sh file before build, otherwise the version information in the binary package produced by this compilation is still the commit id of the last time. (apache#9534)

Co-authored-by: stephen <hello-stephen@qq.com>

* [doc]Add ARM architecture compilation tutorial content (apache#9535)

Co-authored-by: manyi <fop@freeoneplus.com>

* [feature-wip](array-type) array_contains support more nested data types (apache#9170)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor (apache#9491)

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor

* [doc] [Improved] The flink connector documentation is perfect (apache#9528)

Co-authored-by: 王磊 <lei.wang@unidt.com>

* [feature] add vectorized vjson_scanner (apache#9311)

This pr is used to add the vectorized vjson_scanner, which can support vectorized json import in stream load flow.

* [fix](Function) fix case when function return null with abs function (apache#9493)

* [fix](lateral-view) Error view includes lateral view (apache#9530)

Fixed apache#9529

When the lateral view based on a inline view which belongs to a view,
Doris could not resolve the column of lateral view in query.
When a query uses a view, it mainly refers to the string representation of the view.
That is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view lacks the handling of the lateral view.
This leads to query errors when using such views.
This PR mainly fixes the string representation of inline views.

* [refactor](es) Clean es tcp scannode and related thrift definitions (apache#9553)

PaloExternalSourcesService is designed for es_scan_node using tcp protocol.
But es tcp protocol need deploy a tcp jar into es code. Both es version and lucene version are upgraded,
and the tcp jar is not maintained any more.

So that I remove all the related code and thrift definitions.

* [bugfix](vectorized) vectorized write: invalid memory access caused by podarray resize (apache#9556)

* ADD: 补充idea开发文档,添加help-resource.zip的生成步骤 (apache#9561)

* [doc]fix doc typo in data-model and date data type (apache#9571)

* [Doc]Add show tables help documentation (apache#9568)

* [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (apache#9566)

* [fix](function) fix last_value get wrong result when have order by clause (apache#9247)

* [Feature](Nereids) Data structure of comparison predicate (apache#9506)

1. The data structure of the comparison expression
2. Refactored the inheritance and implementation relationship of tree node

```
        +-- ---- ---- ---+- ---- ---- ---- ---+- ---- ----- ---- ----TreeNode-----------------+
        |                |                    |                                               |
                                                                                              |
        |                |                    |                                               |
                                                                                              v
        v                v                    v                                           Abstract Tree Node
    Leaf Node        Unary Node          Binary Node                              +--------          ---------+
        |                |                    |                                   |        (children)         |
                                                                                  |                           |
        v                v                    v                                   v                           v
Leaf Expression   Unary Expression      Binary Expression              +------Expression----+           Plan Node
        |                |                    |                        |                    |
                                                                       |                    |
        |                |                    |                        v                    v
        |                |                    +- ---- ---- -----> Comparison Predicate     Named Expr
                                                                                       +----   -------+
        |                |                                                             v              v
        |                +- -- --- --- --- --- --- --- --- --- --- --- --- --- ---> Alias Expr      Slot
                                                                                                      ^
        |                                                                                             |
        |                                                                                             |
        +---- --- ---- ------ ---- ------- ------ ------- --- ------ ------ ----- ---- ----- ----- ---+
```

* [fix](planner)VecNotImplException thrown when query need rewrite and some slot cannot changed to nullable (apache#9589)

* [chore] Fix compilation errors reported by clang (apache#9584)

* [docs]Modifide flink-doris-connector.md (apache#9595)

* [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (apache#9433)

* [feature-wip](hudi) Step1: Support create hudi external table (apache#9559)

support create hudi table
support show create table for hudi table

1. create hudi table without schema(recommanded)
```sql
    CREATE [EXTERNAL] TABLE table_name
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```

2. create hudi table with schema
```sql
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```
When create hudi table with schema, the columns must exist in corresponding table in hive metastore.

* [fix](storage-vectorized) fix VMergeIterator core dump (apache#9564)

It could be re appeared on rowset with many segment, it means segment overlap. Maybe could not reappear it easily.

* [Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization (apache#9547)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [Bug][Vectorized] Fix insert bimmap column with nullable column (apache#9408)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [doc]add largeint doc (apache#9609)

add largeint doc

* [doc]modified the spark-load doc (apache#9605)

* [code format]Upgrade clang-format in BE Code Formatter from 8 to 13 (apache#9602)

* [feature] group_concat support distinct (apache#9576)

* [feature] Add StoragePolicyResource for Remote Storage (apache#9554)

Add StoragePolicyResource for Remote Storage

* [fix] fix bug that replica can not be repaired duo to DECOMMISSION state (apache#9424)

Reset state of replica which state are in DECOMMISSION after finished scheduling.

* [config] Remove some old config and session variable (apache#9495)

1. Remove session variable "enable_lateral_view"
2. Remove Fe config: enable_materialized_view
3. Remove Fe config: enable_create_sync_job
4. Fe config dynamic_partition_enable is only used for disable dynamic partition scheduler.

* [Improvement] reduce string size in serialization (apache#9550)

* [Improvement][ASAN] make BE can exit normally and ASAN memory leak checking work (apache#9620)

* [clang build]fix clang compile error (apache#9615)

* [regression test] add some case for json load regression test (apache#9614)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [BUG] fix information_schema.columns results not correctly on vec engine (apache#9612)

* VSchemaScanNode get_next bugfix

* add regression-test case for VSchemaScanNode

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [bug] (init) Java version check fail (apache#9607)

* [improment](planner) push down predicate past two phase aggregate (apache#9498)

Push down predicate past aggregate cannot push down predicate past 2 phase aggregate.

origin plan is like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
should be optimized to
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```

* [fixbug](vec-load) fix core of segment_writer while it is not thread-safe (apache#9569)

introduce in stream-load-vec apache#9280, it will cause multi-thread
operate to same segment_write cause BetaRowset enable multi-thread
of memtable flush, memtable flush call rowset_writer.add_block, it
use member variable _segment_writer to write, so it will cause
multi-thread in segment write.

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fix](storage) low_cardinality_optimize core dump when is null predicate (apache#9586)

Issue Number: close apache#9555
Make the last value of the dictionary null, when ColumnDict inserts a null value,
add the encoding corresponding to the last value of the dictionary·

* [regression test] Add compaction regression test case for different data models (apache#9660)

* [fix](planner) unnecessary cast will be added on children in CaseExpr sometimes (apache#9600)

unnecessary cast will be added on children in CaseExpr because use symbolized equal to compare to `Expr`'s type.
it will lead to expression compare mistake and then lead to expression substitute failed when use `ExprSubstitutionMap`

* [website] fix doris website with no link to the Privacy Policy. (apache#9665)

All websites must link to the Privacy Policy

* [doc] Fixed a error in the Bitmap Index section of the document (apache#9679)

* [refactor][regressiontest] reorder license header and import statement (apache#9672)

* [FeConfig](Project) Project optimization is enabled by default (apache#9667)

* [doc]update streamload 2pc doc (apache#9651)

Co-authored-by: wudi <>

* [BUG] fix bug for vectorized compaction and some storage vectorization bug (apache#9610)

* [style](fe) code correct rules and name rules (apache#9670)

* [style](fe) code correct rules and name rules

* revert some change according to comments

* [enhancement] Improve debugging experience. (apache#9677)

* [Feature] cancel load support state (apache#9537)

* Fix some typos for docs. (apache#9680)

* Fix some typos in be/. (apache#9681)

* [Bug] Fix timestamp_diff issue when timeunit is year and month (apache#9574)

* [fix] fix Code Quality Analysis failed (apache#9685)

* [improvement][performance] improve lru cache resize performance and memory usage (apache#9521)

* [Bug][Vectorized] fix schema change add varchar type column default value get wrong result (apache#9523)

* [Enhance] Add host info to heartbeat error msg (apache#9499)

* [Enhancement]  improve parquet reader via arrow's prefetch and multi thread (apache#9472)

* add ArrowReaderProperties to parquet::arrow::FileReader

* support prefetch batch

* [fix](sparkload): fix min_value will be negative number when `maxGlobalDictValue`  exceeds integer range (apache#9436)

* [refactor][rowset]move rowset writer to a single place (apache#9368)

* [feature](nereids): add join rules base code (apache#9598)

* [docs] Fix error command of meta tool docs (apache#9590)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [improvement](stream-load) adjust read unit of http to optimize stream load (apache#9154)

* [fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (apache#9190)

Hive and trino/presto automatically trim trailing spaces, but Doris doesn't.
This can produce query results that differ from Hive.

Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, trailing spaces of columns are trimmed when reading CSV from the broker scan node (see the hedged example below).
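
Session variables in Doris are normally toggled with SET; the snippet below is a minimal sketch, assuming this variable behaves like an ordinary boolean session variable:

```sql
-- hedged example: enable trimming of trailing spaces for external table queries
SET trim_tailing_spaces_for_external_table_query = true;
```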

* [Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (apache#9440)

* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (apache#9666)

* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug that the vjson scanner does not support `range_from_file_path`
2. fix core dump in the vjson/vbroker scanner when src/dest slot nullability differs
3. fix bug in vparquet filter_block when the column reference count is not 1
4. refactor to simplify the code

It only changes vectorized load, not the original row-based load.

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [code style] minor update for code style (apache#9695)

* [enhancement](community): enhance java style (apache#9693)

Enhance Java style.

The checkstyle rules about code order now follow the "Class and Interface Declarations" section.

This PR lets IDEA automatically rearrange code accordingly.

* [Refactor] add vpre_filter_expr for vectorized to improve performance (apache#9508)

* [doc]Add insert best practices (apache#9723)

Add insert best practices

* [deps] libhdfs3 build enable kerberos support (apache#9524)

Currently, the libhdfs3 library integrated into the Doris BE does not support accessing clusters with Kerberos
authentication enabled, because the Kerberos-related dependencies (gsasl and krb5) were not included when libhdfs3 was built.

So this PR enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:

- gsasl version: 1.8.0
- krb5 version: 1.19

* [Refactor] simplify some code in routine load (apache#9532)

* [refactor](load) add tablet errors when close_wait return error (apache#9619)

* [fix] NullPredicate should implement evaluate_vec (apache#9689)

select column from table where column is null

* [doc] Fix typos in documentation (apache#9692)

* [config](checksum) Disable consistency checker by default (apache#9699)

Disable by default because current checksum logic has some bugs.
And it will also bring some overhead.

* [doc] Add trim_tailing_spaces_for_external_table_query variable to the docs. (apache#9701)

* [improvement](planner) Backfill the original predicate pushdown code (apache#9703)

Due to the current architecture, predicate derivation at rewrite time cannot cover all cases,
because the rewrite processes the ON clause first and then the WHERE clause, and when there are subqueries not all predicates can be derived.
So the original predicate pushdown method is kept here.

e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;

InferFiltersRule can't infer t2 = 1, because this is out of its specification.

The expression (t2 = 1) can actually be deduced and pushed down to the scan node.

* [doc] update docs for FE UT (apache#9718)

* [doc] Update dev image (apache#9721)

* [typo] Fix typos in comments (apache#9710)

* Fix some typos in fe/. (apache#9682)

* [Bug-Fix][Vectorized] Full join return error result (apache#9690)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [doc]Add SQL Select usage help documentation (apache#9729)

Add SQL Select usage help documentation

* [vec][opt] opt hash join build resize hash table before insert data (apache#9735)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [bugfix]fix thread-unsafety of the column reader's compression codec (apache#9741)

by moving the codec from the shared reader to the unshared iterator

* [bugfix]make BufferedBlockMgr2 track memory correctly (apache#9722)

The problem was introduced by e2d3d01.

* [Enhancement](Vectorized)build hash table with new thread, as non-vec… (apache#9290)

* [Enhancement][Vectorized]build hash table with a new thread, as the non-vectorized code path does

edit after comments

* format code with clang format

Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>

* [Enhancement](Nereids)refactor plan node into plan + operator (apache#9755)

Close apache#9623

Summary:
This PR refactors the plan node into plan + operator.

In the previous version of Nereids, a plan node consists of its children and the relational algebra, e.g.
```java
class LogicalJoin extends LogicalBinary {
  private Plan left, right;
}
```
The structure above is easy to understand, but it makes `Memo.copyIn` difficult to optimize: a rule generates a complete sub-plan,
and Memo must compare the complete sub-plan to deduplicate GroupExpressions, which hurts performance.

First, we need to change rules to generate partial sub-plans, replacing some child plans with a placeholder, e.g. LeafOp in the Columbia optimizer. We then mark some children in the sub-plan as unchanged and bind the related group, so we don't have to compare and copy those sub-plans when the related group already exists.

Second, we need to separate the original `Plan` into `Plan` and `Operator`, where a Plan contains children and an Operator, and the Operator just denotes the relational algebra (no children / input field). This design keeps operators and children from affecting each other. The plan-group binder can then generate a placeholder plan (containing the related group) for the sub-query, instead of generating the current plan node case by case, because the plan is immutable (replacing children generates a new plan). Rule implementers can reuse the placeholder to generate partial sub-plans.

Operator and Plan have similar inheritance structures, as shown below. XxxPlan contains XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
          TreeNode
             │
             │
     ┌───────┴────────┐                                                   Operator
     │                │                                                       │
     │                │                                                       │
     │                │                                                       │
     ▼                ▼                                                       ▼
Expression          Plan                                                PlanOperator
                      │                                                       │
                      │                                                       │
          ┌───────────┴─────────┐                                             │
          │                     │                                 ┌───────────┴──────────────────┐
          │                     │                                 │                              │
          │                     │                                 │                              │
          ▼                     ▼                                 ▼                              ▼
     LogicalPlan          PhysicalPlan                   LogicalPlanOperator           PhysicalPlanOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalLeaf      ├──►PhysicalLeaf                  ├──► LogicalLeafOperator       ├───►PhysicalLeafOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalUnary     ├──►PhysicalUnary                 ├──► LogicalUnaryOperator      ├───►PhysicalUnaryOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          └───►LogicalBinary    └──►PhysicalBinary                └──► LogicalBinaryOperator     └───►PhysicalBinaryOperator
```

The concrete operator extends the XxxNaryOperator, e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalProject extends PhysicalUnaryOperator;
class LogicalRelation extends LogicalLeafOperator;
```

So the first example changes to this:
```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
  private Plan left, right;
  private LogicalBinaryOperator operator;
}

class LogicalJoin extends LogicalBinaryOperator {}
```

With these changes, a Rule must build both the plan and the operator as needed, not only the plan as before.
For example, the JoinCommutative rule:
```java
public Rule<Plan> build() {
  // the overloaded plan() function builds the plan automatically according to the Operator's type,
  // so this returns a LogicalBinary(LogicalJoin, Plan, Plan)
  return innerLogicalJoin().then(join -> plan(
    // operator
    new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
    // children
    join.right(),
    join.left()
  )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```

* [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (apache#9661)

1. Fix LRU cache MemTracker reporting a negative consumption value.
2. Fix compaction cache MemTracker not tracking anything.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.

* [fix] group by with two NULL rows after left join (apache#9688)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [doc] Add manual for Array data type and functions (apache#9700)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [security] update canal version to fix fastjson security issue (apache#9763)

* [fix] disable transferring data larger than 2GB via brpc (apache#9770)

Because brpc and protobuf cannot transfer data larger than 2GB (larger payloads overflow), add a check before sending.

* [Improvement] fix typo (apache#9743)

* [stream-load-vec]: memtable flush only if necessary after aggregated (apache#9459)

Co-authored-by: weixiang <weixiang06@meituan.com>

* [feature-wip][array-type] Support more sub types. (apache#9466)

Please refer to apache#9465

* [fix](resource-tag) Consider resource tags when assigning tasks for broker & routine load (apache#9492)

This CL mainly changes:
1. Broker Load
    When assigning backends, use the user-level resource tag to find available backends
    (see the hedged example after this list). If the user-level resource tag is not set,
    the broker load task can be assigned to any BE node; otherwise, tasks can only be
    assigned to BE nodes that match the user-level tags.

2. Routine Load
    The current routine load job does not have user info, so it can not get user level tag when assigning tasks.
    So there are 2 ways:
    1. For old routine load job, use tags of replica allocation info to select BE nodes.
    2. For new routine load job, the user info will be added and persisted in routine load job.
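
User-level resource tags are usually attached as a user property; the statement below is a minimal sketch (the user name and tag value are made up, and the exact property key may differ by version):

```sql
-- hedged example: bind user 'jack' to BE nodes tagged with location 'group_a'
SET PROPERTY FOR 'jack' 'resource_tags.location' = 'group_a';
```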

* [fix](help) fix bug of help command (apache#9761)

This bug was introduced by apache#9306: users had to execute
"help stream-load" to show the help doc,
but it should be "help stream load".

* merge master

Co-authored-by: BePPPower <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
Co-authored-by: Stalary <452024236@qq.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
Co-authored-by: jakevin <30525741+jackwener@users.noreply.github.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: yixiutt <102007456+yixiutt@users.noreply.github.com>
Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: pengxiangyu <diablowcg@163.com>
Co-authored-by: deardeng <565620795@qq.com>
Co-authored-by: hongbin <xlwh@users.noreply.github.com>
Co-authored-by: plat1ko <36853835+platoneko@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Wang Bo <wangbo36@meituan.com>
Co-authored-by: Hui Tian <827677355@qq.com>
Co-authored-by: smallhibiscus <844981280@qq.com>
Co-authored-by: carlvinhust2012 <huchenghappy@126.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
Co-authored-by: yinzhijian <373141588@qq.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: FreeOnePlus <54164178+FreeOnePlus@users.noreply.github.com>
Co-authored-by: manyi <fop@freeoneplus.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: LOVEGISER <wangleigis@163.com>
Co-authored-by: 王磊 <lei.wang@unidt.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: EmmyMiao87 <522274284@qq.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: jacktengg <18241664+jacktengg@users.noreply.github.com>
Co-authored-by: dataalive <99398130+dataalive@users.noreply.github.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: zy-kkk <815574403@qq.com>
Co-authored-by: dujl <dujlmail@gmail.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Co-authored-by: Stalary <stalary@163.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: ZenoYang <cookie.yz@qq.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wudi <676366545@qq.com>
Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
Co-authored-by: huangzhaowei <carlmartinmax@gmail.com>
Co-authored-by: Dayue Gao <gaodayue@meituan.com>
Co-authored-by: leo65535 <leo65535@163.com>
Co-authored-by: spaces-x <weixiao5220@gmail.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: xiepengcheng01 <100340096+xiepengcheng01@users.noreply.github.com>
Co-authored-by: gtchaos <gsls1817@gmail.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: zxealous <xealous0729@gmail.com>
Co-authored-by: zhengshiJ <32082872+zhengshiJ@users.noreply.github.com>
Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: 924060929 <924060929@qq.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: weixiang <weixiang06@meituan.com>
Lchangliang added a commit to Lchangliang/incubator-doris that referenced this pull request Jun 9, 2022
* [refactor] delete OLAP_LOG_WARNING related macro definition (apache#9484)

Co-authored-by: BePPPower <fangtiewei@selectdb.com>

* [refactor][style] Use clang-format to sort includes (apache#9483)

* [feature] show create materialized view (apache#9391)

* [feature](mysql-table) support utf8mb4 for mysql external table (apache#9402)

This patch supports utf8mb4 for MySQL external tables.

Previously only the utf8 charset was supported, but some users need a MySQL external table with the utf8mb4 charset.

When creating a MySQL external table, an optional property "charset" can now be added to set the character set of the MySQL connection;
the default value is "utf8". You can set "utf8mb4" instead of "utf8" when needed (a hedged example follows).

* [regression] add regression test for compaction (apache#9437)

Trigger compaction via REST API in this case.

* [refactor](backend) Refactor the logic of selecting Backend in FE. (apache#9478)

There are many places in FE where a group of BE nodes needs to be selected according to certain requirements. for example:
1. When creating replicas for a tablet.
2. When selecting a BE to execute Insert.
3. When Stream Load forwards http requests to BE nodes.

These operations all have the same logic. So this CL mainly changes:
1. Create a new `BeSelectionPolicy` class to describe the set of conditions for selecting BE.
2. The logic of selecting BE nodes in `SystemInfoService` has been refactored, and the following two methods are used uniformly:
    1. `selectBackendIdsByPolicy`: Select the required number of BE nodes according to the `BeSelectionPolicy`.
    2. `selectBackendIdsForReplicaCreation`: Select the BE node for the replica creation operation.

Note that there are some changes here:
For the replica creation operation, the round-robin method was used to select BE nodes before,
but now it is changed to `random` selection for the following reasons:
1. Although the previous logic is round-robin, it is actually random.
2. The final diff of the random algorithm will not be greater than 5%, so it can be considered that the random algorithm
     can distribute the data evenly.

* [fix](http) Hardening Recommendations Disable TRACE/TRAC methods (apache#9479)

* [refactor](Nereids): cascades refactor (apache#9470)

Describe the overview of changes.

- rename GroupExpression
- use `HashSet<GroupExpression> groupExpressions` in `memo`
- add label of `Nereids` for CI
- remove `GroupExpr` from Plan

* [doc] update fe checkstyle doc (apache#9373)

* [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (apache#9462)

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fixbug]fix bug for OLAP_SUCCESS with Status (apache#9427)

* [feature] support row policy filter (apache#9206)
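
Row policies are typically declared with a statement of roughly the shape sketched below; this is a hedged sketch only (the policy, table and user names are made up, and the exact keywords may differ in this version):

```sql
-- hedged sketch: restrict user jack to rows of example_tbl where c1 = 1
CREATE ROW POLICY example_policy ON example_db.example_tbl
AS RESTRICTIVE TO jack USING (c1 = 1);
```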

* [chore](fe code style)add suppressions to fe check style (apache#9429)

Currently the FE checkstyle checks all files, but some rules should only be applied to production files.
Add suppressions to disable those rules on test files.

* [fix](broker-load) can't load parquet file with column name case sensitive with Doris column (apache#9358)

* [fix](binlog-load) binlog load fails because txn exceeds the default value (apache#9471)

When the binlog load transaction count exceeds the default limit, resume fails. A friendly
prompt message is now given to the user, instead of reporting success and then failing
again after a while, which left users confused.
Issue Number: close apache#9468

* [refactor]Cleanup unused empty files (apache#9497)

* [refactor] Check status precise_code instead of construct OLAPInternalError (apache#9514)

* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status

* [fix](storage) fix core for string predicate in storage layer (apache#9500)

Co-authored-by: Wang Bo <wangbo36@meituan.com>

* [Bug] (load) Broker load kerberos auth fail (apache#9494)

* Incorrect sequence numbers in revision documents. (apache#9496)

Co-authored-by: smallhibiscus <844981280>

* [regression test]add the regression test for json load (apache#9517)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [style](java) format fe code with some check rules (apache#9460)

Issue Number: close apache#9403

set below rules' severity to error and format code according check info.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found

* [bugfix](load) fix coredump in ordinal index flush (apache#9518)

Commit apache#9123 introduced the bug: the bitshuffle page returns an error when the
page is full, so the scalar column writer cannot switch to the next page, which makes the
ordinal index null at flush time.

All page builders should return OK when a page is full, and the column writer procedure
should be: append_data, check is_page_full, switch to the next page.

Co-authored-by: yixiutt <yixiu@selectdb.com>

* [fix][vectorized-storage] did not check column writer's write status

* Clean the version.sh file before building, otherwise the version information in the binary package produced by this compilation is still the commit id from the previous build. (apache#9534)

Co-authored-by: stephen <hello-stephen@qq.com>

* [doc]Add ARM architecture compilation tutorial content (apache#9535)

Co-authored-by: manyi <fop@freeoneplus.com>

* [feature-wip](array-type) array_contains support more nested data types (apache#9170)

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
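
For illustration, array_contains checks whether an array column contains a given value; the query below is a minimal sketch (the table and ARRAY column names are made up, and enabling the still work-in-progress array type is assumed):

```sql
-- hedged example: returns whether the ARRAY column arr_col contains the value 2
SELECT array_contains(arr_col, 2) FROM example_tbl;
```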

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor (apache#9491)

* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor

* [doc] [Improved] Improve the flink connector documentation (apache#9528)

Co-authored-by: 王磊 <lei.wang@unidt.com>

* [feature] add vectorized vjson_scanner (apache#9311)

This PR adds the vectorized vjson_scanner, which supports vectorized JSON import in the stream load flow.

* [fix](Function) fix case when function return null with abs function (apache#9493)

* [fix](lateral-view) Error view includes lateral view (apache#9530)

Fixed apache#9529

When a lateral view is based on an inline view that belongs to a view,
Doris could not resolve the lateral view's columns in the query.
When a query uses a view, it mainly relies on the view's string representation;
that is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view lacked handling of the lateral view,
which led to query errors when using such views.
This PR mainly fixes the string representation of inline views.
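
A minimal sketch of the kind of view that triggers the problem (the table, columns and delimiter are made up, and explode_split is used only as an example of a table function in a lateral view):

```sql
-- hedged example: a view whose inline view contains a lateral view
CREATE VIEW example_db.v1 AS
SELECT t.k1, tmp.e1
FROM (SELECT k1, tags FROM example_db.src_tbl) t
LATERAL VIEW explode_split(t.tags, ',') tmp AS e1;
```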

* [refactor](es) Clean es tcp scannode and related thrift definitions (apache#9553)

PaloExternalSourcesService was designed for es_scan_node using the TCP protocol.
But the ES TCP protocol requires deploying a TCP jar into the ES code. Both the ES and Lucene versions have been upgraded,
and the TCP jar is no longer maintained.

So all the related code and thrift definitions are removed.

* [bugfix](vectorized) vectorized write: invalid memory access caused by podarray resize (apache#9556)

* ADD: Supplement the IDEA development docs and add the steps for generating help-resource.zip (apache#9561)

* [doc]fix doc typo in data-model and date data type (apache#9571)

* [Doc]Add show tables help documentation (apache#9568)

* [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (apache#9566)

* [fix](function) fix last_value getting wrong results when an order by clause is present (apache#9247)
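
For illustration, the fix concerns last_value used as an analytic function with an ORDER BY in its window; a minimal sketch (table and column names are made up):

```sql
-- hedged example: last_value over an ordered window within each partition
SELECT k1,
       last_value(v1) OVER (PARTITION BY k1 ORDER BY dt) AS last_v
FROM example_tbl;
```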

* [Feature](Nereids) Data structure of comparison predicate (apache#9506)

1. The data structure of the comparison expression
2. Refactored the inheritance and implementation relationship of tree node

```
        +-- ---- ---- ---+- ---- ---- ---- ---+- ---- ----- ---- ----TreeNode-----------------+
        |                |                    |                                               |
                                                                                              |
        |                |                    |                                               |
                                                                                              v
        v                v                    v                                           Abstract Tree Node
    Leaf Node        Unary Node          Binary Node                              +--------          ---------+
        |                |                    |                                   |        (children)         |
                                                                                  |                           |
        v                v                    v                                   v                           v
Leaf Expression   Unary Expression      Binary Expression              +------Expression----+           Plan Node
        |                |                    |                        |                    |
                                                                       |                    |
        |                |                    |                        v                    v
        |                |                    +- ---- ---- -----> Comparison Predicate     Named Expr
                                                                                       +----   -------+
        |                |                                                             v              v
        |                +- -- --- --- --- --- --- --- --- --- --- --- --- --- ---> Alias Expr      Slot
                                                                                                      ^
        |                                                                                             |
        |                                                                                             |
        +---- --- ---- ------ ---- ------- ------ ------- --- ------ ------ ----- ---- ----- ----- ---+
```

* [fix](planner)VecNotImplException thrown when a query needs rewrite and some slot cannot be changed to nullable (apache#9589)

* [chore] Fix compilation errors reported by clang (apache#9584)

* [docs]Modifide flink-doris-connector.md (apache#9595)

* [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (apache#9433)

* [feature-wip](hudi) Step1: Support create hudi external table (apache#9559)

support create hudi table
support show create table for hudi table

1. create hudi table without schema (recommended)
```sql
    CREATE [EXTERNAL] TABLE table_name
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```

2. create hudi table with schema
```sql
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```
When creating a hudi table with a schema, the columns must exist in the corresponding table in the hive metastore.

* [fix](storage-vectorized) fix VMergeIterator core dump (apache#9564)

It can be reproduced on a rowset with many segments, i.e. with segment overlap; it may not be easy to reproduce otherwise.

* [Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization (apache#9547)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [Bug][Vectorized] Fix insert bitmap column with nullable column (apache#9408)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>

* [doc]add largeint doc (apache#9609)

add largeint doc

* [doc]modified the spark-load doc (apache#9605)

* [code format]Upgrade clang-format in BE Code Formatter from 8 to 13 (apache#9602)

* [feature] group_concat support distinct (apache#9576)
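
For illustration, the feature lets group_concat deduplicate values before concatenating them; a minimal sketch (table and column names are made up):

```sql
-- hedged example: concatenate the distinct values of v1 within each group
SELECT k1, group_concat(DISTINCT v1) FROM example_tbl GROUP BY k1;
```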

* [feature] Add StoragePolicyResource for Remote Storage (apache#9554)

Add StoragePolicyResource for Remote Storage

* [fix] fix bug that replicas cannot be repaired due to the DECOMMISSION state (apache#9424)

Reset the state of replicas that are in DECOMMISSION state after scheduling finishes.
