Issue3373 storage exit crash #3553
Merged: critical27 merged 4 commits into vesoft-inc:master from cangfengzhs:issue3373-storage-exit-crash on Jan 7, 2022
cangfengzhs force-pushed the issue3373-storage-exit-crash branch from 46dcfa5 to cbc618f on December 29, 2021 02:17
critical27 reviewed on Dec 29, 2021
Good job.
- I think localCacheLock_ is useless now, maybe we could remove it.
- move killedPlans_ and killedPlans_ to the same rcu
critical27 requested review from liuyu85cn, CPWstatic, wenhaocs and yixinglu on December 29, 2021 04:48
cangfengzhs force-pushed the issue3373-storage-exit-crash branch from 1f53617 to a5f3f65 on December 30, 2021 11:05
critical27 requested changes on Dec 31, 2021
Just for test
cangfengzhs force-pushed the issue3373-storage-exit-crash branch from bf32da3 to 7081904 on January 6, 2022 07:44
critical27 approved these changes on Jan 7, 2022
A long story... Good job~~ LGTM
CPWstatic approved these changes on Jan 7, 2022
Sophie-Xie pushed a commit that referenced this pull request on Jan 10, 2022
* use rcu replace thread local fix storage exit crash format address some comment
* fix bug
* fix bug
critical27 added a commit that referenced this pull request on Jan 10, 2022
* Fix typos (#3615) Co-authored-by: kyle.cao <kyle.cao@vesoft.com> * fix fetch edges tostring (#3613) Co-authored-by: Sophie <84560950+Sophie-Xie@users.noreply.github.com> Co-authored-by: Yichen Wang <18348405+Aiee@users.noreply.github.com> * fix create space assign offline host (#3583) * fix create space * fix test case Co-authored-by: Harris.Chu <1726587+HarrisChu@users.noreply.github.com> * Disable ARM version docker image since related third party not ready (#3618) * Unify raft error code (#3620) * Meta upgrader v3 (#3540) * Replace group when create space * Support white list * fix test case * support zone operations * fix * Support meta upgrade v3 * add more check about parse host result (#3628) * Ut fix (#3611) * Enable ut and fix chaindelete * Add mock server default worker * fix service crash (#3616) * Cleanup branch param in package script (#3622) * fix crash when the expression exceed the depth (#3606) * Enhance login password check (#3629) * fix_batch_insert_problem (#3627) * filter data before batch insert * add test cases * add more testcase * add notifyStop() for metaClient (#3621) * add notifyStop() for metaClient * do clean * Fix removeSession() (#3651) Co-authored-by: Yee <2520865+yixinglu@users.noreply.github.com> * Issue3373 storage exit crash (#3553) * use rcu replace thread local fix storage exit crash format address some comment * fix bug * fix bug * Fix coalesce bug (#3653) * fix coalesce * fix test * add test * add tck * fix * fix * fix * delete double check agg in where clause (#3647) Co-authored-by: Yee <2520865+yixinglu@users.noreply.github.com> Co-authored-by: cpw <13495049+CPWstatic@users.noreply.github.com> * fix meta crash after create space (#3660) Co-authored-by: Yichen Wang <18348405+Aiee@users.noreply.github.com> Co-authored-by: Yichen Wang <18348405+Aiee@users.noreply.github.com> Co-authored-by: kyle.cao <kyle.cao@vesoft.com> Co-authored-by: jimingquan <mingquan.ji@vesoft.com> Co-authored-by: yaphet 
<4414314+darionyaphet@users.noreply.github.com> Co-authored-by: Harris.Chu <1726587+HarrisChu@users.noreply.github.com> Co-authored-by: Yee <2520865+yixinglu@users.noreply.github.com> Co-authored-by: Doodle <13706157+critical27@users.noreply.github.com> Co-authored-by: Alex Xing <90179377+SuperYoko@users.noreply.github.com> Co-authored-by: endy.li <25311962+heroicNeZha@users.noreply.github.com> Co-authored-by: lionel.liu@vesoft.com <52276794+liuyu85cn@users.noreply.github.com> Co-authored-by: hs.zhang <22708345+cangfengzhs@users.noreply.github.com> Co-authored-by: jakevin <30525741+jackwener@users.noreply.github.com> Co-authored-by: cpw <13495049+CPWstatic@users.noreply.github.com>
yixinglu pushed a commit to yixinglu/nebula that referenced this pull request on Mar 21, 2022
* use rcu replace thread local fix storage exit crash format address some comment * fix bug * fix bug fix bug fix bug Co-authored-by: hs.zhang <22708345+cangfengzhs@users.noreply.github.com>
What type of PR is this?
What does this PR do?
Use RCU to replace ThreadLocal in MetaClient
Which issue(s)/PR(s) this PR relates to?
#3373
#3497
Special notes for your reviewer, ex. impact of this fix, etc:
Initially, we found that storage would crash after running for a long time under a large number of concurrent insert edge operations. Storage's memory usage was also very high, so we suspected a memory leak that ended in a system OOM. That turned out not to be the problem: even without an OOM, storage crashes when it is stopped. Every coredump stack shows the destruction of a static thread-local variable at thread exit. The variable is of type folly::SingletonThreadLocal, which was introduced in MetaClient.
In another scenario, if compaction is triggered while storage is starting, storage crashes immediately, with the same coredump stack as on stop.
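For context on the stacks described above: C++ destroys objects with thread storage duration when the owning thread exits. The sketch below uses plain standard C++ rather than folly, and the names `PerThreadCache` and `runWorkerAndCountDestructions` are hypothetical. It only shows when such a destructor runs; it does not reproduce the crash or explain its cause.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Counts how many PerThreadCache objects have been destroyed.
std::atomic<int> g_destroyed{0};

// Hypothetical per-thread object, standing in for a thread-local cache.
struct PerThreadCache {
  int hits = 0;
  ~PerThreadCache() { g_destroyed.fetch_add(1); }
};

// Starts one worker that touches its thread_local cache, waits for it to
// exit, and returns how many destructors ran as a result.
int runWorkerAndCountDestructions() {
  g_destroyed.store(0);
  std::thread worker([] {
    // Constructed on first use in this thread, destroyed at thread exit.
    thread_local PerThreadCache cache;
    ++cache.hits;
  });
  worker.join();  // the worker's thread-local destructors have run by now
  return g_destroyed.load();
}
```

Because the destructor runs during thread teardown, anything it touches must still be alive at that point, which is why exit-time stacks like the ones above tend to be hard to reason about.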
After a long investigation, we did not find the specific cause of this problem, but it only appeared after folly::SingletonThreadLocal was introduced, so we chose to drop folly::SingletonThreadLocal and replace it with RCU.
After switching to RCU, I have not seen the crash again. I am not sure whether it is really fixed, or whether the crash has only become less likely and I have not hit it.
In addition, RCU should perform better than ThreadLocal here, because there is no read-write lock, so readers do not block.
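The replacement code is not shown in this description. As a rough sketch of the read-copy-update idea only, the example below uses the standard library's atomic `std::shared_ptr` functions as a stand-in for a real RCU implementation. `MetaSnapshot`, `SnapshotHolder`, `addSpace` and `oldSnapshotSurvivesUpdate` are hypothetical names, and a single writer is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the metadata a client caches.
struct MetaSnapshot {
  std::unordered_map<int, std::string> spaceNames;
};

// Readers load an immutable snapshot; the writer copies the current
// snapshot, modifies the copy, and publishes it with one atomic store.
class SnapshotHolder {
 public:
  SnapshotHolder() : current_(std::make_shared<const MetaSnapshot>()) {}

  // Reader side: no read-write lock, only an atomic load of the pointer.
  std::shared_ptr<const MetaSnapshot> read() const {
    return std::atomic_load(&current_);
  }

  // Writer side: assumes only one thread ever calls this (assumption).
  void addSpace(int id, const std::string& name) {
    auto next = std::make_shared<MetaSnapshot>(*read());  // copy
    next->spaceNames[id] = name;                          // update the copy
    std::atomic_store(&current_,
                      std::shared_ptr<const MetaSnapshot>(std::move(next)));
  }

 private:
  std::shared_ptr<const MetaSnapshot> current_;
};

// Returns true if a snapshot taken before an update is left untouched
// while later readers see the new data.
bool oldSnapshotSurvivesUpdate() {
  SnapshotHolder holder;
  holder.addSpace(1, "first");
  auto before = holder.read();   // a reader holds the old snapshot
  holder.addSpace(2, "second");  // the writer publishes a new one
  auto after = holder.read();
  return before->spaceNames.size() == 1 && after->spaceNames.size() == 2;
}
```

Here the reference count plays the role that a grace period plays in real RCU: the old snapshot is freed only after the last reader drops it. The free functions `std::atomic_load` and `std::atomic_store` for `std::shared_ptr` are deprecated since C++20 in favour of `std::atomic<std::shared_ptr<T>>`, and an implementation may use internal locking for them, so this sketch shows the pattern rather than the performance characteristics of a dedicated RCU library.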
Additional context/ Design document:
Checklist:
Release notes:
Please confirm whether to be reflected in release notes and how to describe: