
drainer/: Change unreasonable buff size and reduce memory usage #735

Merged: 11 commits into pingcap:release-2.1 from hjh/var on Sep 9, 2019

Conversation

@july2993 (Contributor) commented Aug 30, 2019

What problem does this PR solve?

The maxBinlogItemCount variable affects the job buffer size in an unreasonable way.

syncer.sync works roughly like this:

// worker-count goroutines each run this loop
for {
    job := <-jobChan // jobChan chan *job
    if reachSomeCondition(job) {
        // execute the pending jobs (run the SQL);
        // jobChan is not consumed while this runs
        executePendingJobs()
    }
}

Note that executing the SQL may take a long time, during which the job channel is not consumed. So the upstream may block when adding a job to the channel, and consequently cannot add jobs to the other syncers' job channels either.
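
Below is a minimal runnable sketch of this blocking behavior (the names job, jobChan, executePendingJobs, and the buffer size are illustrative, not the actual drainer code):

package main

import "time"

type job struct{ sql string }

func main() {
	// A small buffer stands in for the per-syncer job channel.
	jobChan := make(chan *job, 2)

	// Worker: consumes slowly because "executing SQL" takes time.
	go func() {
		for j := range jobChan {
			_ = j
			time.Sleep(time.Second) // slow SQL execution stalls consumption
		}
	}()

	// Upstream producer: blocks as soon as the buffer is full while the
	// worker is busy, and so cannot feed any other syncers' job channels.
	for i := 0; i < 10; i++ {
		jobChan <- &job{sql: "INSERT ..."}
	}
}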

What is changed and how it works?

Avoid buffering at the upstream, which prevents excessive memory usage without decreasing performance, and make the buffer size of the job channel no longer depend on maxBinlogItemCount.
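
Conceptually, the sizing change looks like the following sketch (the new buffer size shown is illustrative; the point is only the removal of the maxBinlogItemCount dependency):

package main

type job struct{}

const maxBinlogItemCount = 64 * 1024 // the previous default, per the description above

func main() {
	// Before: the job channel buffer was tied to maxBinlogItemCount, so
	// the 64k default could pin a large amount of memory per syncer.
	oldChan := make(chan *job, maxBinlogItemCount)

	// After (sketch): a small fixed buffer that no longer depends on
	// maxBinlogItemCount.
	newChan := make(chan *job, 1024)

	_, _ = oldChan, newChan
}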

[Screenshot: Screen Shot 2019-08-30 at 2.43.43 PM]

In the following, v1 means f9e4589, and v2 means v1 plus one more commit, 3074b3b.

  • before 10:53: version v2.1.16, maxBinlogItemCount = 16
  • starting at 10:53: version v1, maxBinlogItemCount = 16
  • starting at 11:17: version v2.1.16, maxBinlogItemCount = 64k (the previous default value)
  • starting at 11:47: version v1, maxBinlogItemCount = 0

All runs use worker-count = 256 and txn-batch = 20. Note that when the drainer restarts, it runs in safe mode for the first 5 minutes; the event metrics count the changed events.

Another test:

The upstream data contains one binlog of nearly 60M.

  • starting at 19:25: version v1
  • starting at 19:45: version v2
[Screenshot: Screen Shot 2019-08-30 at 7.56.23 PM]
[Screenshot: Screen Shot 2019-08-30 at 7.56.14 PM]

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

@july2993 (Contributor, Author)

/run-all-tests

@july2993 (Contributor, Author)

/run-all-tests

july2993 changed the title from "drainer/: Change unreasonable buff size" to "drainer/: Change unreasonable buff size and reduce memory usage" on Aug 30, 2019
@july2993 (Contributor, Author)

/run-all-tests

@ericsyh (Contributor) commented Aug 30, 2019

/run-all-tests

@IANTHEREAL (Collaborator)

lgtm

@WangXiangUSTC (Contributor)

/run-all-tests

@zier-one (Contributor) commented Aug 30, 2019

Why did CI fail?

@july2993 (Contributor, Author)

> Why did CI fail?

I don't know. It failed at the status test, but that's not related to this PR. (It runs OK locally, and CI also failed at another PR against 2.1.)

@zier-one (Contributor)

LGTM

zier-one previously approved these changes Aug 30, 2019
@lichunzhu (Contributor)

/run-integration-test

@lichunzhu (Contributor) commented Sep 3, 2019

IMO, CI fails because the drainer hasn't consumed pump 8250's stored binlogs within 15 seconds of closing the pump. This problem can be reproduced on my server.

@july2993 (Contributor, Author) commented Sep 3, 2019

> IMO, CI fails because the drainer hasn't consumed pump 8250's stored binlogs within 15 seconds of closing the pump. This problem can be reproduced on my server.

Is it related to this PR's change? Can you fix this?

@lichunzhu (Contributor)

> IMO, CI fails because the drainer hasn't consumed pump 8250's stored binlogs within 15 seconds of closing the pump. This problem can be reproduced on my server.
>
> Is it related to this PR's change? Can you fix this?

I'm not sure yet. I'm working on it now and will reach a conclusion later.

@july2993 (Contributor, Author) commented Sep 3, 2019

/run-all-tests

@lichunzhu (Contributor)

/run-all-tests

@july2993 (Contributor, Author) commented Sep 5, 2019

/run-all-tests

@lichunzhu (Contributor)

There is still a 4s gap between the drainer and the pump after 15s. I guess that when the drainer has a cache, it can buffer some binlogs from the pump, so the drainer can still sort and execute the cached binlogs even though the pump is paused. Maybe binlogChanSize shouldn't be set to 0. Or we can make the check time in check_status bigger.
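
A small sketch of the caching effect described above (binlogChanSize is the config name from the comment; the channel and values here are illustrative): with a buffered channel, the drainer can keep draining binlogs that were cached before the pump paused, while a size of 0 leaves nothing to consume.

package main

import "fmt"

func main() {
	// With binlogChanSize > 0, some binlogs are already buffered when
	// the pump pauses, so the drainer can still sort and execute them.
	cached := make(chan int, 4)
	for i := 0; i < 4; i++ {
		cached <- i // the pump fills the buffer, then pauses
	}
	close(cached) // stands in for the paused pump

	for b := range cached {
		fmt.Println("still consuming cached binlog", b)
	}

	// With binlogChanSize = 0, nothing is buffered: once the pump
	// pauses, the drainer has no cached binlogs left to apply.
}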

@july2993 (Contributor, Author) commented Sep 8, 2019

/run-all-tests

@july2993 (Contributor, Author) commented Sep 8, 2019

/run-all-tests

@july2993 (Contributor, Author) commented Sep 8, 2019

/run-all-tests

@july2993 (Contributor, Author) commented Sep 9, 2019

/run-all-tests

@july2993 (Contributor, Author) commented Sep 9, 2019

@lichunzhu PTAL. The root cause is that when it runs the status test, it starts with commit-ts = 0 and fetches data from the oldest point.

@lichunzhu (Contributor) left a review comment:

LGTM

july2993 merged commit 9d7fb86 into pingcap:release-2.1 on Sep 9, 2019
july2993 deleted the hjh/var branch on September 9, 2019 at 03:07
lichunzhu added a commit that referenced this pull request Sep 25, 2019
* Avoid any buffering at upstream to prevent excessive memory usage without decreasing performance.
* Add txnManager in loader to manage cached txn memory usage. Without txnManager, the performance of the loader would be affected.