Fine grained partition writer optimization #6173

yibin87 · 2022-10-20T07:33:42Z

What problem does this PR solve?

Issue Number: ref #6157

Problem Summary:

Currently, FineGrainedShuffleWriter allocates new memory for the scattered columns every time batchWrite is triggered.
"Long tail problem": cached blocks below the threshold are not written until readSuffix, which is executed in a single thread.

This PR aims to reuse memory for the scattered columns and to fix the "long tail problem".
Referenced PR: #3787
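
To illustrate the memory-reuse idea from the problem summary, here is a minimal sketch. It is not the code from this PR: `Column`, `scatterTo` and `clearKeepCapacity` are hypothetical stand-ins that model a column as a plain `std::vector`. The point is only that the destination buckets are allocated once and then refilled for every batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column: a plain vector of 64-bit values.
using Column = std::vector<int64_t>;

// Scatter `src` into caller-owned `buckets`; selector[i] is the destination
// bucket of row i. Rows are appended, so no new bucket is allocated here.
void scatterTo(const Column & src, const std::vector<uint32_t> & selector, std::vector<Column> & buckets)
{
    assert(src.size() == selector.size());
    for (size_t i = 0; i < src.size(); ++i)
    {
        assert(selector[i] < buckets.size());
        buckets[selector[i]].push_back(src[i]);
    }
}

// Empty every bucket but keep its capacity, so the next batch can reuse the
// already allocated memory.
void clearKeepCapacity(std::vector<Column> & buckets)
{
    for (auto & bucket : buckets)
        bucket.clear();
}
```

After a batch has been sent, calling `clearKeepCapacity` and scattering again avoids a fresh allocation as long as the next batch is not larger than the previous one.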

What is changed and how it works?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Signed-off-by: yibin <huyibin@pingcap.com>

ti-chi-bot · 2022-10-20T07:33:44Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

SeaRise
guo-shaoge

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Signed-off-by: yibin <huyibin@pingcap.com>

dbms/src/Flash/Coprocessor/StreamingDAGResponseWriter.cpp

Signed-off-by: yibin <huyibin@pingcap.com>

dbms/src/DataStreams/ExchangeSenderBlockInputStream.cpp

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

Signed-off-by: yibin <huyibin@pingcap.com>

yibin87 · 2022-10-28T02:37:32Z

/run-unit-tests

Signed-off-by: yibin <huyibin@pingcap.com>

…sh into fine_grained_partition_opt

yibin87 · 2022-10-31T02:27:01Z

/run-unit-tests

SeaRise

LGTM

dbms/src/Columns/tests/gtest_column_scatterTo.cpp

guo-shaoge · 2022-10-31T02:43:09Z

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

    size_t rows = block.rows();
    rows_in_blocks += rows;
    if (rows > 0)
    {
        blocks.push_back(block);
    }

-    if (static_cast<UInt64>(rows_in_blocks) >= fine_grained_shuffle_batch_size)
+    if (blocks.size() == fine_grained_shuffle_stream_count || static_cast<UInt64>(rows_in_blocks) >= batch_send_row_limit)


So we changed the meaning of fine_grained_shuffle_batch_size; we may need to talk to the PM and also update the docs.

Just a question: why trigger sending data when blocks.size() equals fine_grained_shuffle_stream_count?

Good suggestion and question. For 1, I'll follow up. For 2, think of it this way:
Previously, without fine_grained_shuffle_stream and with N partitions, one block was split into N blocks.
With fine_grained_shuffle_stream, N partitions and M = fine_grained_shuffle_stream_count, one block is split into N * M blocks. Triggering the send when blocks.size() == stream_count therefore keeps the final block size close to the previous block size.

Got it. So compared to the original behavior, batchWrite will be triggered more frequently when blocks are small. I guess this is the reason why the long tail problem can be fixed?

The long tail problem is actually fixed by the "flush" interface, which is invoked concurrently in different threads. Previously, the remaining cached blocks that were below the threshold were flushed in the "finishWrite" interface, which is invoked sequentially in a single thread.
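
The thread above can be summarized with a small sketch. It is a simplified model, not the actual writer: `Block`, `BatchingWriter` and `batches_sent` are hypothetical names, and sending is reduced to a counter. It shows the two send triggers from the diff plus a separate `flush` that each writer can call for its own leftover blocks.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a block: only the row count matters here.
struct Block
{
    size_t rows;
};

class BatchingWriter
{
public:
    // Assumes stream_count_ > 0.
    BatchingWriter(size_t stream_count_, uint64_t batch_send_row_limit_)
        : stream_count(stream_count_)
        , batch_send_row_limit(batch_send_row_limit_)
    {}

    // Cache non-empty blocks and send once either threshold is reached.
    void write(const Block & block)
    {
        rows_in_blocks += block.rows;
        if (block.rows > 0)
            blocks.push_back(block);
        if (blocks.size() == stream_count || rows_in_blocks >= batch_send_row_limit)
            batchWrite();
    }

    // Send whatever is still cached. Because every writer flushes its own
    // cache, the leftovers do not pile up in one final single-threaded step.
    void flush()
    {
        if (rows_in_blocks > 0)
            batchWrite();
    }

    size_t batches_sent = 0;

private:
    // Stand-in for the real send path: count the batch and reset the cache.
    void batchWrite()
    {
        ++batches_sent;
        blocks.clear();
        rows_in_blocks = 0;
    }

    size_t stream_count;
    uint64_t batch_send_row_limit;
    std::vector<Block> blocks;
    uint64_t rows_in_blocks = 0;
};
```

With a stream count of 2 and a row limit of 100, two small blocks trigger a send through the block-count condition, while a single 150-row block triggers it through the row limit.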

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

Co-authored-by: SeaRise <hhssearise@foxmail.com>

Signed-off-by: yibin <huyibin@pingcap.com>

windtalker · 2022-10-31T05:16:10Z

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

+{
+    /// Materialize sample_block so that header and reserved scatterColumns are full columns
+    /// Because ser/der don't support constant columns now
+    header = sample_block.cloneEmpty();


The column in ColumnWithTypeAndName can sometimes be nullptr, so I'm a little worried that HashBaseWriterHelper::materializeBlock may hit a null pointer problem. How about creating the scatter columns using the type in ColumnWithTypeAndName?
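
A rough sketch of this suggestion follows, under the assumption that each header entry always carries a valid type even when its column pointer is null. All types below are heavily simplified stand-ins written for illustration, and `createScatterColumns` is a hypothetical helper, not the function added in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-ins for the column and type interfaces.
struct IColumn
{
    virtual ~IColumn() = default;
    virtual size_t size() const = 0;
};

struct ColumnInt64 : IColumn
{
    std::vector<long long> data;
    size_t size() const override { return data.size(); }
};

struct IDataType
{
    virtual ~IDataType() = default;
    // Always returns a new, empty, full (non-constant) column.
    virtual std::unique_ptr<IColumn> createColumn() const = 0;
};

struct DataTypeInt64 : IDataType
{
    std::unique_ptr<IColumn> createColumn() const override { return std::make_unique<ColumnInt64>(); }
};

struct ColumnWithTypeAndName
{
    std::shared_ptr<IColumn> column; // may be nullptr
    std::shared_ptr<IDataType> type; // assumed to be always set
    std::string name;
};

// Build empty scatter columns from the types alone, never touching
// `column`, so a null column pointer cannot be dereferenced.
std::vector<std::unique_ptr<IColumn>> createScatterColumns(const std::vector<ColumnWithTypeAndName> & header)
{
    std::vector<std::unique_ptr<IColumn>> columns;
    columns.reserve(header.size());
    for (const auto & elem : header)
    {
        assert(elem.type != nullptr);
        columns.push_back(elem.type->createColumn());
    }
    return columns;
}
```

Creating the columns from the type also yields full columns directly, so no separate materialization of constant columns is needed in this sketch.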

windtalker · 2022-10-31T05:18:33Z

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

+    // fine_grained_shuffle_stream_count is in (0, 1024], and partition_num is uint16_t, so will not overflow.
+    num_bucket = partition_num * fine_grained_shuffle_stream_count;
+    partition_key_containers_for_reuse.resize(collators.size());
+    resetScatterColumns();


It looks like resetScatterColumns is only used in prepare, so it's more like initScatterColumns?

Renamed to initScatterColumns
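
The overflow remark in the diff comment above can be checked with a little arithmetic. The sketch below assumes a 32-bit bucket count; `max_stream_count`, `numBuckets` and `bucketIndex` are hypothetical names, and the bucket layout shown is only one possible choice, not necessarily the one used in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Upper bound on the stream count stated in the diff comment: (0, 1024].
constexpr uint32_t max_stream_count = 1024;

// The largest possible product still fits in 32 bits:
// 65535 * 1024 = 67,107,840, far below 4,294,967,295.
static_assert(
    static_cast<uint64_t>(std::numeric_limits<uint16_t>::max()) * max_stream_count
        <= std::numeric_limits<uint32_t>::max(),
    "partition_num * stream_count must fit in uint32_t");

// Total number of buckets: one per (partition, stream) pair.
uint32_t numBuckets(uint16_t partition_num, uint32_t stream_count)
{
    assert(stream_count > 0 && stream_count <= max_stream_count);
    return static_cast<uint32_t>(partition_num) * stream_count;
}

// Assumed layout: the streams of one partition occupy contiguous buckets.
uint32_t bucketIndex(uint32_t partition_id, uint32_t stream_id, uint32_t stream_count)
{
    assert(stream_id < stream_count);
    return partition_id * stream_count + stream_id;
}
```

For example, 3 partitions with 4 streams each give 12 buckets, and under the assumed layout stream 1 of partition 2 maps to bucket 9.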

Signed-off-by: yibin <huyibin@pingcap.com>

windtalker

LGTM

yibin87 · 2022-10-31T06:28:06Z

/merge

ti-chi-bot · 2022-10-31T06:28:08Z

@yibin87: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot · 2022-10-31T06:28:10Z

This pull request has been accepted and is ready to merge.

Commit hash: 6ef1faf

ti-chi-bot · 2022-10-31T06:28:23Z

@yibin87: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

close pingcap#6157 Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>

yibin87 added 4 commits October 19, 2022 11:23

Add reuse memory scatterTo api for IColumn

9e4d3cf

Signed-off-by: yibin <huyibin@pingcap.com>

Implement set,nothing,const,tuple columns scatterTo, and add gtest

0ff883a

Signed-off-by: yibin <huyibin@pingcap.com>

Partition Opt

b06df19

Signed-off-by: yibin <huyibin@pingcap.com>

Apply FineGrainedShuffleWriter optimization

9588d49

Signed-off-by: yibin <huyibin@pingcap.com>

ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 20, 2022

Fix gtest compilation failures

12b7c41

Signed-off-by: yibin <huyibin@pingcap.com>

ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 20, 2022

yibin87 requested review from windtalker, SeaRise, fuzhe1989 and guo-shaoge October 20, 2022 07:51

yibin87 added 3 commits October 20, 2022 15:54

Little refact

e62857e

Signed-off-by: yibin <huyibin@pingcap.com>

Little refact

a83c7af

Signed-off-by: yibin <huyibin@pingcap.com>

Little refact

545bbb0

Signed-off-by: yibin <huyibin@pingcap.com>

SeaRise reviewed Oct 20, 2022

dbms/src/Flash/Coprocessor/StreamingDAGResponseWriter.cpp (outdated)

SeaRise self-requested a review October 20, 2022 10:40

yibin87 added 2 commits October 21, 2022 09:53

Extract flush method from write for DAGResponseWriter interface

f833565

Signed-off-by: yibin <huyibin@pingcap.com>

Fix after refactor

0f87acd

Signed-off-by: yibin <huyibin@pingcap.com>

SeaRise reviewed Oct 25, 2022

Update to comments and fix format issue

af4b9b5

Signed-off-by: yibin <huyibin@pingcap.com>

ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2022

yibin87 requested a review from SeaRise October 28, 2022 02:17

Merge branch 'master' into fine_grained_partition_opt

2b682af

ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2022

Merge branch 'master' into fine_grained_partition_opt

a593f6c

Merge branch 'master' into fine_grained_partition_opt

a2cb6f7

yibin87 added 4 commits October 31, 2022 10:09

Delete useless file

8f6072b

Signed-off-by: yibin <huyibin@pingcap.com>

Merge branch 'master' into fine_grained_partition_opt

a5f8c66

Convert constant columns for sample block

8b6d17e

Signed-off-by: yibin <huyibin@pingcap.com>

Merge branch 'fine_grained_partition_opt' of github.com:yibin87/tifla…

e5c6f63

…sh into fine_grained_partition_opt

SeaRise approved these changes Oct 31, 2022

dbms/src/Columns/tests/gtest_column_scatterTo.cpp (outdated)

ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Oct 31, 2022

guo-shaoge reviewed Oct 31, 2022

SeaRise reviewed Oct 31, 2022

dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp (outdated)

yibin87 and others added 4 commits October 31, 2022 10:46

Update dbms/src/Flash/Mpp/FineGrainedShuffleWriter.cpp

9dbec38

Co-authored-by: SeaRise <hhssearise@foxmail.com>

Update dbms/src/Columns/tests/gtest_column_scatterTo.cpp

7d636ad

Co-authored-by: SeaRise <hhssearise@foxmail.com>

Fix compilation issue

8f54189

Signed-off-by: yibin <huyibin@pingcap.com>

Fix compilation issue

35c1357

Signed-off-by: yibin <huyibin@pingcap.com>

guo-shaoge approved these changes Oct 31, 2022

ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Oct 31, 2022

windtalker reviewed Oct 31, 2022

Handle potential null column issue

6ef1faf

Signed-off-by: yibin <huyibin@pingcap.com>

yibin87 requested a review from windtalker October 31, 2022 05:51

windtalker approved these changes Oct 31, 2022

ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Oct 31, 2022

Merge branch 'master' into fine_grained_partition_opt

b774dd0

ti-chi-bot merged commit c73daf6 into pingcap:master Oct 31, 2022

yibin87 mentioned this pull request Nov 1, 2022

Join & Aggregation Fine Grained Partition Optimization #6157

Closed

6 tasks

CalvinNeo pushed a commit to CalvinNeo/tiflash that referenced this pull request Nov 4, 2022

Fine grained partition writer optimization (pingcap#6173)

0992a27

close pingcap#6157 Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>

yibin87 mentioned this pull request Aug 3, 2023

Expand the functionality of Local Runtime Filter #7891

Open

6 tasks
