Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed a bug where the storage service might crash (#583) #595

Merged
merged 5 commits into from
Jul 19, 2019

Conversation

monadbobo
Copy link
Contributor

summary:

When the storage service deamon is started, the nebula::meta::MetaClient
will use the MetaServerBasedPartManager object(MetaClient::registerListener),
but the MetaServerBasedPartManager object will be released before
nebula::meta::MetaClient(through KVStore object).

Added the MetaClient::unRegisterListener method to stop the background thread
and reset the listener_ object, And the unRegisterListener is called
when the MetaServerBasedPartManager object was released.

summary:

  When the storage service deamon is started, the nebula::meta::MetaClient
  will use the MetaServerBasedPartManager object(MetaClient::registerListener),
  but the MetaServerBasedPartManager object will be released before
  nebula::meta::MetaClient(through KVStore object).

  Added the MetaClient::unRegisterListener method to stop the background thread
  and reset the listener_ object, And the unRegisterListener is called
  when the MetaServerBasedPartManager object was released.
@monadbobo monadbobo added the ready-for-testing PR: ready for the CI test label Jul 8, 2019
@nebula-community-bot
Copy link
Member

Unit testing passed.

@critical27
Copy link
Contributor

critical27 commented Jul 9, 2019

I am a little confused. Since PartManager has a MetaClient, MetaClient has a listener, and PartManager object will be released before MetaClient, why don't we just set the MetaClient in PartManager to nullptr?
The resource in MetaClient will be released in ~MetaClient. I think it's ok.

@monadbobo
Copy link
Contributor Author

I am a little confused. Since PartManager has a MetaClient, MetaClient has a listener, and PartManager object will be released before MetaClient, why don't we just set the MetaClient in PartManager to nullptr?
The resource in MetaClient will be released in ~MetaClient. I think it's ok.

That's because two threads will be started in metaclient, and MetaServerBasedPartManager will be used in these two threads.

listener_ = listener;
}

void unRegisterListener() {
deInit();
Copy link
Contributor

@critical27 critical27 Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood now. Maybe you should not call deInit here? Because other's may still use the bgThread in MetaClient, e.g. SchemaManager.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we need to stop the background thread first, otherwise it will have race conditions in listener_.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd better protect listener_ in MetaClient instead of stop bgThread directly. Because we don't want meta client is controlled by kvstore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my understanding.
1 metaclient itself is dependent on the listener_ object.
2 If listener_ is released, both two threads of metaclient will not work, then there is no need for metaclient at this time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. Listener_ is not necessary for meta client.

Use a read-write lock in metaclient to protect the listener_.
@nebula-community-bot
Copy link
Member

Unit testing passed.

Copy link
Contributor

@dangleptr dangleptr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally, the pr lgtm. Does the crash problem fixed after unRegisterListener?

@@ -69,9 +69,16 @@ class MetaClient {
void init();

void registerListener(MetaChangedListener* listener) {
CHECK(listener_ == nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the check under the lock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally, the pr lgtm. Does the crash problem fixed after unRegisterListener?

Yes, this PR will fix the crash.

@nebula-community-bot
Copy link
Member

Unit testing passed.

Copy link
Contributor

@dangleptr dangleptr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done. LGTM. Thanks for taking care of it.

@nebula-community-bot
Copy link
Member

Unit testing passed.

@dutor dutor merged commit 9214d99 into vesoft-inc:master Jul 19, 2019
tong-hao pushed a commit to tong-hao/nebula that referenced this pull request Jun 1, 2021
…esoft-inc#595)

* Fixed a bug where the storage service might crash (vesoft-inc#583)

summary:

  When the storage service deamon is started, the nebula::meta::MetaClient
  will use the MetaServerBasedPartManager object(MetaClient::registerListener),
  but the MetaServerBasedPartManager object will be released before
  nebula::meta::MetaClient(through KVStore object).

  Added the MetaClient::unRegisterListener method to stop the background thread
  and reset the listener_ object, And the unRegisterListener is called
  when the MetaServerBasedPartManager object was released.

* Fixed a bug where the storage service might crash (vesoft-inc#583)

Use a read-write lock in metaclient to protect the listener_.

*         Fixed a bug where the storage service might crash (vesoft-inc#583)

        Fixed possible thread competition for checking listener_ .
yixinglu pushed a commit to yixinglu/nebula that referenced this pull request Mar 21, 2022
* uuid

* uuid

Co-authored-by: jakevin <30525741+jackwener@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-testing PR: ready for the CI test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants