Refine the lifecycle of memory_tracker for MPP query #6020

Merged
merged 10 commits into from
Sep 27, 2022

Conversation

windtalker
Contributor

Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>

What problem does this PR solve?

Issue Number: ref #5609

Problem Summary:
This PR refines #5610.

What is changed and how it works?

  1. Create the processlist/memory_tracker in MPPTask::prepare, so that a memory_tracker already exists when MPPTunnel/ExchangeReceiver is created.
  2. Save the memory tracker in TunnelSender, so each TrackedMPPPacket does not need to hold a shared_ptr to the memory tracker.
  3. Remove some workaround code.
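
The ownership change in item 2 can be sketched roughly as follows. This is a minimal illustration, not the TiFlash code: `SimpleTracker`, `TrackedPacket`, and `Sender` are hypothetical stand-ins, and the tracker here is only a byte counter. The sender is the single owner of the tracker, and each packet keeps a non-owning pointer to it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory tracker: it only counts bytes.
struct SimpleTracker
{
    long long tracked = 0;
};

// A packet charges its payload to the tracker through a non-owning pointer.
class TrackedPacket
{
public:
    TrackedPacket(std::string data_, SimpleTracker * tracker_)
        : data(std::move(data_))
        , tracker(tracker_)
    {
        tracker->tracked += static_cast<long long>(data.size());
    }
    ~TrackedPacket() { tracker->tracked -= static_cast<long long>(data.size()); }
    TrackedPacket(const TrackedPacket &) = delete;
    TrackedPacket & operator=(const TrackedPacket &) = delete;

private:
    std::string data;
    SimpleTracker * tracker; // non-owning: the sender must outlive the packet
};

// The sender holds the only shared_ptr to the tracker.
class Sender
{
public:
    explicit Sender(std::shared_ptr<SimpleTracker> tracker_)
        : tracker(std::move(tracker_))
    {}

    void push(std::string data)
    {
        queue.push_back(std::make_unique<TrackedPacket>(std::move(data), tracker.get()));
    }
    void popAll() { queue.clear(); }

    long long trackedBytes() const { return tracker->tracked; }
    long ownerCount() const { return tracker.use_count(); }

private:
    // Declared before `queue`, so it is destroyed after the queued packets.
    std::shared_ptr<SimpleTracker> tracker;
    std::vector<std::unique_ptr<TrackedPacket>> queue;
};
```

In this sketch the reference count stays at one no matter how many packets are queued. The trade-off is that a packet must never outlive the sender that owns its tracker.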

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    • run random failpoint tests for more than 12 hours
    • run mpp fail tests for more than 12 hours
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot
Member

ti-chi-bot commented Sep 26, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • SeaRise
  • bestwoody

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 26, 2022
++read_packet_index;
packets[read_packet_index - 1]->recomputeTrackedMem();
Contributor

recomputeTrackedMem may throw an error; we need to handle the exception properly.

Contributor Author

Yes, I moved the try/catch into recomputeTrackedMem.
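
One way to read "moving the try/catch into recomputeTrackedMem" is sketched below. It is an illustration under assumptions, not the actual TiFlash implementation: `LimitedTracker`, `Packet`, and the `has_error`/`error_message` fields are hypothetical. The point is that the accounting call catches the tracker's exception itself and records it, so the caller never has to wrap the call.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical tracker that throws when a limit would be exceeded.
struct LimitedTracker
{
    explicit LimitedTracker(std::size_t limit_)
        : limit(limit_)
    {}

    void alloc(std::size_t n)
    {
        if (used + n > limit)
            throw std::runtime_error("memory limit exceeded");
        used += n;
    }

    std::size_t limit;
    std::size_t used = 0;
};

// Hypothetical packet: records a failure instead of letting it propagate.
struct Packet
{
    std::size_t payload_size = 0;
    std::size_t tracked_size = 0;
    bool has_error = false;
    std::string error_message;

    // The try/catch lives here, so callers do not need their own.
    void recomputeTrackedMem(LimitedTracker & tracker)
    {
        try
        {
            if (payload_size > tracked_size)
            {
                tracker.alloc(payload_size - tracked_size);
                tracked_size = payload_size;
            }
        }
        catch (const std::exception & e)
        {
            has_error = true;
            error_message = e.what();
        }
    }
};
```

A caller can then check `has_error` after the call and fail the query through its normal error path.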

@SeaRise SeaRise self-requested a review September 26, 2022 09:11
/// because after `write`, `async_tunnel_sender` can be destroyed at any time,
/// so there is a risk that `res` is destructed after `async_tunnel_sender`
/// is destructed, which may cause the memory tracker in `res` to become invalid
res->switchMemTracker(nullptr);
Contributor

It may make the MemTracker inaccurate... How about switching to RootMemTracker?

Contributor Author

@windtalker windtalker Sep 26, 2022

Why? It just releases the memory. Before this change, the memory was also released after write().

Contributor

@bestwoody bestwoody Sep 26, 2022

I'm worried about an extreme case where many threads are running EstablishCallData::trySendOneMsg() and the early free() in the MemTracker makes real memory usage much larger than the tracked usage.

Contributor Author

But before this PR, it also had the early-free issue, since the async tunnel's write does not actually wait until the write finishes.
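
The two options discussed in this thread (detach to nullptr, or re-home to a root-level tracker) can be compared with a small sketch. Everything here is a simplification: `CountingTracker` and `Packet` are hypothetical, and this `switchMemTracker` only mimics the idea of the real method, not its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical tracker: a plain byte counter, not TiFlash's MemoryTracker.
struct CountingTracker
{
    long long tracked = 0;
};

// Simplified packet whose bytes are charged to at most one tracker at a time.
class Packet
{
public:
    Packet(std::size_t bytes_, CountingTracker * tracker_)
        : bytes(static_cast<long long>(bytes_))
        , tracker(nullptr)
    {
        switchMemTracker(tracker_);
    }
    ~Packet() { switchMemTracker(nullptr); }
    Packet(const Packet &) = delete;
    Packet & operator=(const Packet &) = delete;

    // Uncharge the old tracker (if any), then charge the new one (if any).
    void switchMemTracker(CountingTracker * new_tracker)
    {
        if (tracker)
            tracker->tracked -= bytes;
        tracker = new_tracker;
        if (tracker)
            tracker->tracked += bytes;
    }

private:
    long long bytes;
    CountingTracker * tracker;
};

// Detach first, then destroy the owner: the packet never touches a dead tracker.
inline long long detachThenDestroyOwner(CountingTracker & root)
{
    auto owner = std::make_unique<CountingTracker>();
    Packet packet(1024, owner.get());
    packet.switchMemTracker(&root); // the PR passes nullptr here instead
    owner.reset();                  // safe: packet no longer points at owner
    return root.tracked;            // the packet is still alive at this point
}
```

With nullptr, the bytes are charged to no tracker until the packet is freed, which is the accounting gap raised above. Re-homing to a longer-lived tracker keeps them visible, at the cost of needing a tracker that is guaranteed to outlive the packet.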

@@ -336,6 +348,7 @@ void MPPTask::preprocess()
void MPPTask::runImpl()
{
    CPUAffinityManager::getInstance().bindSelfQueryThread();
    current_memory_tracker = process_list_entry->get().getMemoryTrackerPtr().get();
Contributor

Is it necessary? newThreadManager()->scheduleThenDetach will propagate the memTracker into runImpl.

Contributor Author

Oh, yes, it is not necessary.

Contributor Author

Maybe change it to RUNTIME_ASSERT

Contributor Author

Done
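
The idea behind replacing the assignment with an assertion can be sketched as follows. This assumes a scheduler that copies the caller's thread-local tracker into the new thread. `Tracker`, `current_tracker`, `scheduleWithTracker`, and `runTask` are hypothetical names for illustration; they are not the TiFlash thread manager API.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical tracker type; only its address matters in this sketch.
struct Tracker
{
};

// Each thread sees its own "current" tracker pointer.
thread_local Tracker * current_tracker = nullptr;

// Hypothetical scheduler: captures the caller's tracker and installs it
// in the new thread before running the job.
inline std::thread scheduleWithTracker(std::function<void()> job)
{
    Tracker * propagated = current_tracker;
    return std::thread([propagated, job = std::move(job)] {
        current_tracker = propagated;
        job();
    });
}

// Returns true if the worker thread already saw `expected` as its tracker,
// which is the condition an assertion in the task body would check.
inline bool runTask(Tracker * expected)
{
    bool matched = false;
    current_tracker = expected; // set once by the scheduling side
    std::thread worker = scheduleWithTracker([&matched, expected] {
        matched = (current_tracker == expected);
    });
    worker.join();
    current_tracker = nullptr;
    return matched;
}
```

If propagation is guaranteed, assigning the tracker again inside the task is redundant, and an assertion documents the invariant while catching any scheduling path that skips the propagation.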

{
    // note: no exception should be thrown rudely, since it's called by a GRPC poller.
    for (size_t i = 0; i < read_packet_index; ++i)
    {
        auto & packet = packets[i];
        // We shouldn't throw error directly, since the caller works in a standalone thread.
        try
Contributor

@bestwoody bestwoody Sep 26, 2022

It seems pushPacket can throw exceptions from places other than recomputeTrackedMem, such as fiu_do_on(FailPoints::random_receiver_async_msg_push_failure_failpoint, push_succeed = false;);

Contributor Author

This failpoint just sets the return value (push_succeed) to false; it does not throw an exception.

Contributor

@bestwoody bestwoody left a comment

lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 26, 2022
@SeaRise SeaRise self-requested a review September 27, 2022 01:07
Contributor

@SeaRise SeaRise left a comment

The rest LGTM

Comment on lines 23 to 30
#include <Flash/Coprocessor/DAGQuerySource.h>
#include <Flash/Coprocessor/DAGUtils.h>
#include <Flash/Mpp/ExchangeReceiver.h>
#include <Flash/Mpp/GRPCReceiverContext.h>
#include <Flash/Mpp/MPPTask.h>
#include <Flash/Mpp/MPPTunnelSet.h>
#include <Flash/Mpp/Utils.h>
#include <Flash/Planner/PlanQuerySource.h>
Contributor

useless #include

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 27, 2022
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@windtalker
Contributor Author

/merge

@ti-chi-bot
Member

@windtalker: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 90ad917

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
@windtalker
Contributor Author

/merge

@ti-chi-bot
Member

@windtalker: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 6f05860

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed status/can-merge Indicates a PR has been approved by a committer. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 27, 2022
@windtalker
Contributor Author

/merge

@ti-chi-bot
Member

@windtalker: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: bf77528

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
@windtalker
Contributor Author

/merge

@ti-chi-bot
Member

@windtalker: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 045c6dc

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 27, 2022
@sre-bot
Collaborator

sre-bot commented Sep 27, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Common/MemoryTracker.h                               25                 7    72.00%          16                 6    62.50%          33                 9    72.73%           6                 1    83.33%
Flash/Coprocessor/DAGContext.cpp                     97                36    62.89%          26                 8    69.23%         175                50    71.43%          66                35    46.97%
Flash/Coprocessor/DAGContext.h                       39                 8    79.49%          30                 7    76.67%          87                22    74.71%          12                 4    66.67%
Flash/Coprocessor/DAGQuerySource.cpp                 20                 3    85.00%           5                 1    80.00%          39                 6    84.62%          14                 6    57.14%
Flash/Coprocessor/DAGStorageInterpreter.cpp         447               447     0.00%          38                38     0.00%         900               900     0.00%         276               276     0.00%
Flash/EstablishCall.cpp                              88                 9    89.77%          14                 1    92.86%         149                20    86.58%          44                10    77.27%
Flash/Mpp/ExchangeReceiver.cpp                      423               128    69.74%          36                 4    88.89%         587               133    77.34%         210                77    63.33%
Flash/Mpp/ExchangeReceiver.h                         14                 2    85.71%          12                 2    83.33%          26                 4    84.62%           2                 0   100.00%
Flash/Mpp/MPPHandler.cpp                             81                49    39.51%           2                 1    50.00%          68                38    44.12%          20                14    30.00%
Flash/Mpp/MPPReceiverSet.cpp                         17                 5    70.59%           5                 1    80.00%          25                 6    76.00%          12                 4    66.67%
Flash/Mpp/MPPReceiverSet.h                            1                 0   100.00%           1                 0   100.00%           1                 0   100.00%           0                 0         -
Flash/Mpp/MPPTask.cpp                               494               116    76.52%          24                 1    95.83%         438                85    80.59%         182                79    56.59%
Flash/Mpp/MPPTask.h                                   4                 1    75.00%           4                 1    75.00%           6                 1    83.33%           0                 0         -
Flash/Mpp/MPPTunnel.cpp                             350                60    82.86%          20                 0   100.00%         298                39    86.91%         138                43    68.84%
Flash/Mpp/MPPTunnel.h                                37                 3    91.89%          31                 3    90.32%          69                 5    92.75%           4                 0   100.00%
Flash/Mpp/MPPTunnelSet.cpp                           47                27    42.55%          10                 3    70.00%          87                48    44.83%          34                21    38.24%
Flash/Mpp/MPPTunnelSet.h                              4                 0   100.00%           4                 0   100.00%           6                 0   100.00%           0                 0         -
Flash/Mpp/TrackedMppDataPacket.h                     49                 6    87.76%          25                 3    88.00%         113                16    85.84%          20                 4    80.00%
Flash/Planner/PlanQuerySource.cpp                     4                 1    75.00%           4                 1    75.00%          10                 3    70.00%           0                 0         -
Interpreters/Context.cpp                            538               290    46.10%         174                71    59.20%        1136               590    48.06%         290               202    30.34%
Interpreters/Context.h                               11                 5    54.55%          11                 5    54.55%          11                 5    54.55%           0                 0         -
Interpreters/ProcessList.h                           26                10    61.54%          20                 7    65.00%          73                27    63.01%           6                 3    50.00%
Interpreters/executeQuery.cpp                       213               129    39.44%          15                 8    46.67%         428               216    49.53%         126                97    23.02%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                              3029              1342    55.69%         527               172    67.36%        4765              2223    53.35%        1462               876    40.08%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18358      7513             59.08%    214865  77286        64.03%

full coverage report (for internal network access only)

@sre-bot
Collaborator

sre-bot commented Sep 27, 2022

Coverage for changed files

Filename                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Common/MemoryTracker.h                               25                 7    72.00%          16                 6    62.50%          33                 9    72.73%           6                 1    83.33%
Flash/Coprocessor/DAGContext.cpp                     97                36    62.89%          26                 8    69.23%         175                50    71.43%          66                35    46.97%
Flash/Coprocessor/DAGContext.h                       39                 8    79.49%          30                 7    76.67%          87                22    74.71%          12                 4    66.67%
Flash/Coprocessor/DAGQuerySource.cpp                 20                 3    85.00%           5                 1    80.00%          39                 6    84.62%          14                 6    57.14%
Flash/Coprocessor/DAGStorageInterpreter.cpp         447               447     0.00%          38                38     0.00%         900               900     0.00%         276               276     0.00%
Flash/EstablishCall.cpp                              88                 9    89.77%          14                 1    92.86%         149                20    86.58%          44                 9    79.55%
Flash/Mpp/ExchangeReceiver.cpp                      423               128    69.74%          36                 4    88.89%         587               133    77.34%         210                77    63.33%
Flash/Mpp/ExchangeReceiver.h                         14                 2    85.71%          12                 2    83.33%          26                 4    84.62%           2                 0   100.00%
Flash/Mpp/MPPHandler.cpp                             81                49    39.51%           2                 1    50.00%          68                38    44.12%          20                14    30.00%
Flash/Mpp/MPPReceiverSet.cpp                         17                 5    70.59%           5                 1    80.00%          25                 6    76.00%          12                 4    66.67%
Flash/Mpp/MPPReceiverSet.h                            1                 0   100.00%           1                 0   100.00%           1                 0   100.00%           0                 0         -
Flash/Mpp/MPPTask.cpp                               494               116    76.52%          24                 1    95.83%         438                85    80.59%         182                79    56.59%
Flash/Mpp/MPPTask.h                                   4                 1    75.00%           4                 1    75.00%           6                 1    83.33%           0                 0         -
Flash/Mpp/MPPTunnel.cpp                             350                60    82.86%          20                 0   100.00%         298                39    86.91%         138                42    69.57%
Flash/Mpp/MPPTunnel.h                                37                 3    91.89%          31                 3    90.32%          69                 5    92.75%           4                 0   100.00%
Flash/Mpp/MPPTunnelSet.cpp                           47                27    42.55%          10                 3    70.00%          87                48    44.83%          34                21    38.24%
Flash/Mpp/MPPTunnelSet.h                              4                 0   100.00%           4                 0   100.00%           6                 0   100.00%           0                 0         -
Flash/Mpp/TrackedMppDataPacket.h                     49                 6    87.76%          25                 3    88.00%         113                16    85.84%          20                 4    80.00%
Flash/Planner/PlanQuerySource.cpp                     4                 1    75.00%           4                 1    75.00%          10                 3    70.00%           0                 0         -
Interpreters/Context.cpp                            538               290    46.10%         174                71    59.20%        1136               590    48.06%         290               201    30.69%
Interpreters/Context.h                               11                 5    54.55%          11                 5    54.55%          11                 5    54.55%           0                 0         -
Interpreters/ProcessList.h                           26                10    61.54%          20                 7    65.00%          73                27    63.01%           6                 3    50.00%
Interpreters/executeQuery.cpp                       213               129    39.44%          15                 8    46.67%         432               217    49.77%         126                97    23.02%
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                              3029              1342    55.69%         527               172    67.36%        4769              2224    53.37%        1462               873    40.29%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18358      7513             59.08%    214869  77253        64.05%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot merged commit 3548744 into pingcap:master Sep 27, 2022
@windtalker windtalker deleted the memory_tracker branch May 6, 2023 06:11
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
5 participants