@yangzhg yangzhg commented Nov 26, 2020

Changed the merge method of the unique table:
merge the cumulative version data first, and then merge the result with the base version.
For data with only one base version, read it directly without merging.

Proposed changes

Optimize the read performance of AGG and UNIQUE tables when they have too many versions; the benchmarks are below, and a sketch of the new read path follows the rename list.
By the way, in most cases the meaning of "merge" in our code is merge sort, so to avoid ambiguity I renamed some functions and variables:
MergeHeap -> MergeSortHeap
MergeIterator -> MergeSortIterator
new_merge_iterator -> new_sort_iterator
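
To make the change concrete, here is a minimal, self-contained C++ sketch of the read strategy described above. This is not the actual Doris iterator code: `Rowset`, `merge_sort_runs`, and `read` are illustrative stand-ins, the runs are plain sorted vectors, and the REPLACE/deduplication semantics of unique keys are omitted; the real implementation drives this with the heap-based MergeSortIterator rather than pairwise std::merge calls.

```cpp
#include <algorithm>
#include <cstdio>
#include <iterator>
#include <vector>

// Stand-in for a sorted rowset: one ordered run of keys.
using Rowset = std::vector<int>;

// Merge several sorted runs into one sorted run. Done pairwise here for
// brevity; the PR's MergeSortIterator uses a k-way heap instead.
Rowset merge_sort_runs(const std::vector<Rowset>& runs) {
    Rowset out;
    for (const Rowset& r : runs) {
        Rowset tmp;
        std::merge(out.begin(), out.end(), r.begin(), r.end(),
                   std::back_inserter(tmp));
        out.swap(tmp);
    }
    return out;
}

// The read path this PR describes: with only a base version, read it
// directly with no merge; otherwise merge the cumulative rowsets first,
// then do one final two-way merge with the (usually much larger) base.
Rowset read(const Rowset& base, const std::vector<Rowset>& cumulatives) {
    if (cumulatives.empty()) {
        return base;  // single base version: direct read, no merging
    }
    Rowset merged_cumulative = merge_sort_runs(cumulatives);
    Rowset result;
    std::merge(base.begin(), base.end(),
               merged_cumulative.begin(), merged_cumulative.end(),
               std::back_inserter(result));
    return result;
}

int main() {
    Rowset base = {1, 3, 5, 7, 9};
    std::vector<Rowset> cumulatives = {{2, 6}, {4, 8}};
    Rowset merged = read(base, cumulatives);
    std::printf("%zu rows after merge\n", merged.size());  // prints 9
    return 0;
}
```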

Test Data

This test data set is the catalog_sales table from the TPC-DS 10 GB data set. The data is divided into two parts: the complete data (corresponding to the big table), 3 GB with 28,802,522 rows in total (14,401,261 after unique deduplication), and a 200 MB sample (corresponding to the test table) with 1,000,000 rows. Only one partition is used, and the complete data is divided into 10 buckets.

All tables are in segment V2 format

+-------+--------------+------+-------+---------+---------+
| Field | Type         | Null | Key   | Default | Extra   |
+-------+--------------+------+-------+---------+---------+
| k1    | BIGINT       | No   | true  | NULL    |         |
| k2    | BIGINT       | No   | true  | NULL    |         |
| k3    | BIGINT       | Yes  | true  | NULL    |         |
| k4    | BIGINT       | Yes  | true  | NULL    |         |
| k5    | BIGINT       | Yes  | true  | NULL    |         |
| k6    | BIGINT       | Yes  | true  | NULL    |         |
| k7    | BIGINT       | Yes  | true  | NULL    |         |
| k8    | BIGINT       | Yes  | true  | NULL    |         |
| k9    | BIGINT       | Yes  | true  | NULL    |         |
| k10   | BIGINT       | Yes  | true  | NULL    |         |
| v1    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v2    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v3    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v4    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v5    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v6    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v7    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v8    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v9    | BIGINT       | Yes  | false | NULL    | REPLACE |
| v10   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v11   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v12   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v13   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v14   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v15   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v16   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v17   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v18   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v19   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v20   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v21   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v22   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v23   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
| v24   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
+-------+--------------+------+-------+---------+---------+

This test mainly measures read performance, especially when a large number of small versions need to be merged, so the test query is select count(*) from (select k1,k2,k3,k4,k5,k6,k7,k8,k9,k10 from table_name) a;

UNIQUE_KEY and DUPLICATE_KEY comparison

First, this test compares the read performance of a UNIQUE_KEY table and a DUPLICATE_KEY table. Each table has two versions, the first of which is empty.

| Table | Data size | Number of partitions | Number of buckets | Number of rows | Number of versions | Query time (s) |
|---|---|---|---|---|---|---|
| test_uniq | 93.585 MB | 1 | 1 | 1,000,000 | 2 | 9 |
| test_uniq_big | 1.327 GB | 10 | 10 | 14,401,261 | 2 | 13.5 |
| test_dup | 93.724 MB | 1 | 1 | 1,000,000 | 2 | 4.8 |
| test_dup_big | 2.571 GB | 10 | 10 | 28,802,522 | 2 | 7 |

As the numbers show, the duplicate table reads about twice as fast as the unique table.

Results after optimization

| Table | Data size | Number of partitions | Number of buckets | Number of rows | Number of versions | Query time (s) |
|---|---|---|---|---|---|---|
| test_uniq | 93.585 MB | 1 | 1 | 1,000,000 | 2 | 4.8 |
| test_uniq_big | 1.327 GB | 10 | 10 | 14,401,261 | 2 | 7.4 |
| test_dup | 93.724 MB | 1 | 1 | 1,000,000 | 2 | 4.7 |
| test_dup_big | 2.571 GB | 10 | 10 | 28,802,522 | 2 | 7 |

After optimization, when there are few versions beyond the base, the query performance of the unique and duplicate tables is almost the same.

UNIQUE_KEY multi-version reading optimization comparison

Since the data imported in the multi-version runs is random, the row counts of the non-full-version tables differ slightly between runs. The test query is select count(*) from table_name.

| Table | Data size | Number of partitions | Number of buckets | Number of rows | Number of versions | Query time (s) |
|---|---|---|---|---|---|---|
| test_uniq (before) | 136.288 MB | 1 | 1 | 1008592 | 10000 | 11.7 |
| test_uniq (after) | 136.266 MB | 1 | 1 | 1008635 | 10000 | 7.1 |
| test_uniq_big (before) | 1.368 GB | 1 | 10 | 14401261 | 10000 | 15.5 |
| test_uniq_big (after) | 1.368 GB | 1 | 10 | 14401261 | 10000 | 12 |
| test_uniq_base (before) | 94.252 MB | 1 | 1 | 1000000 | 1 | 4 |
| test_uniq_base (after) | 94.252 MB | 1 | 1 | 1000000 | 1 | 3.75 |
| test_uniq_big_base (before) | 1.327 GB | 1 | 10 | 14401261 | 1 | 6 |
| test_uniq_big_base (after) | 1.327 GB | 1 | 10 | 14401261 | 1 | 5.6 |
| test_uniq (before) | 96.348 MB | 1 | 1 | 1008592 | 500 | 7.1 |
| test_uniq (after) | 96.349 MB | 1 | 1 | 1008635 | 500 | 4.5 |
| test_uniq_big (before) | 1.329 GB | 1 | 10 | 14401261 | 500 | 8.8 |
| test_uniq_big (after) | 1.327 GB | 1 | 10 | 14401261 | 500 | 6.7 |

When a table has many versions, a lot of time is spent sorting and merging rowsets.
In our scenario, this sorting is really a merge sort of multiple ordered runs, and because of compaction we usually have one relatively large base rowset and several small rowsets. We can therefore merge-sort the small rowsets first and then merge the result with the large base rowset to optimize read performance.
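
As a rough, illustrative cost estimate (my own arithmetic, not from the benchmark): a flat heap merge over one base rowset of N rows plus k small rowsets of m rows each pops N + k·m rows from a heap of k + 1 iterators, roughly (N + k·m)·log2(k + 1) key comparisons in total. Pre-merging the small rowsets costs about k·m·log2(k) comparisons, after which the final pass is a plain two-way merge at about one comparison per row. When N >> k·m, as is typical after compaction, the comparisons charged to each base row drop from log2(k + 1) (about 9 for the k = 500 case above) to about 1, which matches the direction, if not the full magnitude, of the speedups in the tables, since key comparisons are only part of total read cost.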

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • [ ] Bugfix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Documentation Update (if none of the other choices apply)
  • [ ] Code refactor (Modify the code structure, format the code, etc...)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@morningman morningman added the kind/improvement and area/storage (Issues or PRs related to storage engine) labels Nov 27, 2020
@morningman morningman changed the title from "Optimized the read performance of the table when have multi versions," to "Optimized the read performance of the table when have multi versions" Nov 30, 2020

@morningman morningman left a comment

LGTM

@morningman morningman added the approved Indicates a PR has been approved by one committer. label Nov 30, 2020
@yangzhg yangzhg merged commit df1f06e into apache:master Dec 1, 2020
@yangzhg yangzhg deleted the optimize_unique branch December 22, 2020 07:13
@yangzhg yangzhg mentioned this pull request Feb 9, 2021
Hastyshell pushed a commit to Hastyshell/doris that referenced this pull request Nov 12, 2025