This repository has been archived by the owner on Dec 8, 2021. It is now read-only.
…g/csv Conflicts: lightning/mydump/csv_parser_generated.go
This is expected to avoid about 3.5% of alloc_objects.
alloc_objects:
Total: 773496750 773873722 (flat, cum) 7.18%
  177          .          .   parser.fieldIndexes = parser.fieldIndexes[:0]
  178          .          .
  179          .          .   isEmptyLine := true
  ...
  225  386621314  386621314   str := string(parser.recordBuffer) // Convert to string once to batch allocations
  226  386875436  386875436   dst := make([]string, len(parser.fieldIndexes))
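A minimal sketch of the slice-reuse idea behind this commit message, not the code from the PR. Only `recordBuffer` and `fieldIndexes` appear in the profile above; the `recordParser` type, its `fields` member, and the `appendField`, `reset`, and `record` helpers are hypothetical names introduced for illustration. The per-record `make([]string, ...)` is replaced by a slice whose backing array is kept between records.

```go
package main

import "fmt"

// recordParser is a hypothetical stand-in for the CSV parser. Only the
// names recordBuffer and fieldIndexes come from the profile; the rest
// is assumed for this sketch.
type recordParser struct {
	recordBuffer []byte   // bytes of all fields of the current record, concatenated
	fieldIndexes []int    // end offset of each field inside recordBuffer
	fields       []string // reused across records instead of a fresh make() per record
}

// appendField (hypothetical) stores one parsed field of the current record.
func (p *recordParser) appendField(b []byte) {
	p.recordBuffer = append(p.recordBuffer, b...)
	p.fieldIndexes = append(p.fieldIndexes, len(p.recordBuffer))
}

// reset truncates the buffers but keeps their backing arrays.
func (p *recordParser) reset() {
	p.recordBuffer = p.recordBuffer[:0]
	p.fieldIndexes = p.fieldIndexes[:0]
}

// record returns the fields of the current record. The returned slice is
// overwritten by the next call, so callers must copy it if they keep it.
func (p *recordParser) record() []string {
	str := string(p.recordBuffer) // convert to string once to batch allocations
	if cap(p.fields) < len(p.fieldIndexes) {
		p.fields = make([]string, len(p.fieldIndexes)) // grow only when needed
	}
	p.fields = p.fields[:len(p.fieldIndexes)]
	start := 0
	for i, end := range p.fieldIndexes {
		p.fields[i] = str[start:end]
		start = end
	}
	return p.fields
}

func main() {
	p := &recordParser{}
	p.appendField([]byte("1"))
	p.appendField([]byte("order"))
	fmt.Println(p.record())
	p.reset()
}
```

The trade-off is aliasing: the slice returned for one record is overwritten by the next, so this only works when each record is fully consumed before the parser advances.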
This takes the most allocations in WriteRows:
ROUTINE ======================== github.com/pingcap/tidb-lightning/lightning/backend.(*importer).WriteRows in /Users/huangjiahao/go/src/github.com/pingcap/tidb-lightning/lightning/backend/importer.go
 797370418  980241246 (flat, cum) 9.09% of Total
         .          .    155: kvs := rows.(kvPairs)
         ... ...
         .          .    192: for i, pair := range kvs {
 772641868  772641868    193:     mutations[i] = &kv.Mutation{
         .          .    194:         Op:    kv.Mutation_Put,
         .          .    195:         Key:   pair.Key,
         .          .    196:         Value: pair.Val,
         .          .    197:     }
         .          .    198: }
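The profile above attributes most of the allocations in WriteRows to one `&kv.Mutation{...}` per key-value pair. One way such per-item allocations can be reduced is a `sync.Pool`; the sketch below shows that technique in isolation and is not the code from this PR. `Mutation`, `kvPair`, `buildMutations`, and `releaseMutations` are hypothetical stand-ins, and the sketch assumes the mutations are no longer referenced once the write has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// Mutation is a hypothetical stand-in for kv.Mutation.
type Mutation struct {
	Key   []byte
	Value []byte
}

// kvPair is a hypothetical stand-in for the element type of kvPairs.
type kvPair struct {
	Key []byte
	Val []byte
}

// mutationPool hands out *Mutation values that were released earlier,
// falling back to a fresh allocation when the pool is empty.
var mutationPool = sync.Pool{
	New: func() interface{} { return new(Mutation) },
}

// buildMutations fills one pooled Mutation per key-value pair.
func buildMutations(kvs []kvPair) []*Mutation {
	mutations := make([]*Mutation, len(kvs))
	for i, pair := range kvs {
		m := mutationPool.Get().(*Mutation)
		m.Key, m.Value = pair.Key, pair.Val
		mutations[i] = m
	}
	return mutations
}

// releaseMutations returns the objects to the pool. It must only be called
// after nothing else holds a reference to them (assumption of this sketch).
func releaseMutations(mutations []*Mutation) {
	for i, m := range mutations {
		m.Key, m.Value = nil, nil // drop references so the pool does not pin row data
		mutationPool.Put(m)
		mutations[i] = nil
	}
}

func main() {
	kvs := []kvPair{{Key: []byte("k1"), Val: []byte("v1")}}
	ms := buildMutations(kvs)
	fmt.Printf("%s=%s\n", ms[0].Key, ms[0].Value)
	releaseMutations(ms)
}
```

A pool only helps when objects are released promptly and reused often; the runtime may drop pooled objects at any garbage collection, so reuse is an optimization and never a guarantee.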
Lightning allocates too many transient objects and the heap is small, so garbage collection happens too frequently and a lot of time is spent in the GC. In a test loading the table `order_line.csv` of 14k TPCC, the time needed for the `encode kv data and write` step dropped from 52m4s to 37m30s when GOGC was changed from 100 to 500, and the total time dropped by nearly 15m as well. The cost is that Lightning's runtime memory grows from about 200M to about 700M, which is acceptable. So we set the GC percentage to 500 by default, instead of 100, to reduce the GC frequency.
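For reference, the GC percentage can be set programmatically with `debug.SetGCPercent` from the standard `runtime/debug` package, which has the same effect as the `GOGC` environment variable and returns the previous setting. The sketch below only illustrates that call; where and how Lightning applies it is not shown in this conversation, and `defaultGCPercent` and `applyGCPercent` are hypothetical names. With a value of 500, a collection is triggered roughly when newly allocated data reaches 500% of the live heap left after the previous collection, which is consistent with fewer collections at the cost of a larger heap.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// defaultGCPercent is a hypothetical constant name; the value 500 is the
// default chosen in the commit message above.
const defaultGCPercent = 500

// applyGCPercent sets the garbage collection target percentage and
// returns the previous setting.
func applyGCPercent(percent int) (previous int) {
	return debug.SetGCPercent(percent)
}

func main() {
	previous := applyGCPercent(defaultGCPercent)
	fmt.Printf("GC percent changed from %d to %d\n", previous, defaultGCPercent)
}
```

Calling this once at startup, before the bulk of the allocations, is usually enough; the returned previous value makes it easy to restore the old setting in tests.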
has been moved to the Importer part
For pingcap/tidb@495f8b7 disable UpdateDeltaForTable if TxnCtx is nil
july2993 force-pushed the xhy/refine_encode branch from 97d8711 to 0181a3f on March 16, 2020 11:47
july2993 added the status/PTAL label (This PR is ready for review. Add this label back after committing new changes) on Mar 16, 2020
/run-all-tests
kennytm reviewed on Mar 16, 2020
@kennytm PTAL
kennytm reviewed on Mar 16, 2020
LGTM
kennytm added the status/LGT1 label (One reviewer already commented LGTM) and removed the status/PTAL label on Mar 16, 2020
3pointer reviewed on Mar 16, 2020
LGTM
kennytm approved these changes on Mar 16, 2020
What problem does this PR solve?
Optimize the performance of Lightning.
The main performance improvement comes from setting `SetGCPercent` to `500` by default, compared with "Use pool for mutation" and "Reuse slice of record".
"disable UpdateDeltaForTable if TxnCtx is nil" should reduce the time spent in `Encode` by 14%, according to kennytm's earlier work in the comment.
What is changed and how it works?
Set the GC percentage to 500 by default
Lightning allocates too many transient objects and the heap is small, so garbage collection happens too frequently and a lot of time is spent in the GC.
In a test loading the table `order_line.csv` of 14k TPCC, the time needed for the `encode kv data and write` step dropped from 52m4s to 37m30s when GOGC was changed from 100 to 500, and the total time needed to restore the table dropped by nearly 15m as well.
The cost is that Lightning's runtime memory grows from about 200M to about 700M, which is acceptable.
So we set the GC percentage to 500 by default, instead of 100, to reduce the GC frequency.
See commit cce3ea6.
Reuse slice of record
See commit 6f21d82.
Update dependency version of tidb
Commit: 0181a3f
For pingcap/tidb@495f8b7: disable UpdateDeltaForTable if TxnCtx is nil
Check List
Tests
Side effects
Related changes