Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle zkCache failure: invalidate cache and zk-getData failure #377

Merged
merged 6 commits into from
Apr 25, 2017

Conversation

rdhabalia
Copy link
Contributor

@rdhabalia rdhabalia commented Apr 22, 2017

Motivation

Due to some reason if broker lose Global-Zk session then ZooKeeperCache should handle failure while getting data from lost zkSession.

Modifications

  • Handle null data from ZooKeeperCache to prevent NPE.
  • It should invalidate failed future from the cache so, next time when call comes then ZKCache can load fresh data from reconnected ZKSession.
  • It should handle unexpected exception thrown while accessing disconnected zkSession.
  • Introduce dedicated executor for zkCacheCallback to avoid bottleneck on pulsar's executor

Result

It helps ZooKeeperCache to reload zk-node data if it fails earlier with zksession issue.

@rdhabalia rdhabalia added the type/bug The PR fixed a bug or issue reported a bug label Apr 22, 2017
@rdhabalia rdhabalia added this to the 1.18 milestone Apr 22, 2017
@rdhabalia rdhabalia self-assigned this Apr 22, 2017
@@ -166,6 +166,23 @@ private void invalidateExists(String path) {
existsCache.invalidate(path);
}

public void asyncInvalidate(String path) {
if (scheduledExecutor != null) {
Copy link
Contributor Author

@rdhabalia rdhabalia Apr 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@merlimat

Why the executors could be null?

because in case of ZookeeperCacheLoader we depend on ForkJoinPool.
Also, it can also possible if someone pass null executor in constructor: ZooKeeperCache(ZooKeeper zkSession, OrderedSafeExecutor executor, ScheduledExecutorService scheduledExecutor)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to check this in the constructor then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we have multiple tests which passes null for executor to test specific scenario. Also,

  • we do handle null executor in ZkCache at multiple places
  • so, if caller may pass null executor then I think it should be handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.. we can add assertion in constructor for non null executor .

@rdhabalia rdhabalia force-pushed the handle_zk branch 4 times, most recently from 3969dc7 to 38779db Compare April 23, 2017 08:58
@@ -95,6 +95,8 @@
private LocalZooKeeperConnectionService localZooKeeperConnectionProvider;
private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(20,
new DefaultThreadFactory("pulsar"));
private final ScheduledExecutorService cacheExecutor = Executors.newScheduledThreadPool(10,
new DefaultThreadFactory("cache-callback"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to be specicif? zk-cache-callback instead of cache-callback?

@@ -166,6 +166,23 @@ private void invalidateExists(String path) {
existsCache.invalidate(path);
}

public void asyncInvalidate(String path) {
if (scheduledExecutor != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to check this in the constructor then?

@@ -95,6 +95,8 @@
private LocalZooKeeperConnectionService localZooKeeperConnectionProvider;
private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(20,
new DefaultThreadFactory("pulsar"));
private final ScheduledExecutorService cacheExecutor = Executors.newScheduledThreadPool(10,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a thought - can we make the pools size configurable and put all constants in one place - for all pools cacheExecutor, scheduledExecutorScheduler and orderedExecutor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may configure number of netty-io and executor threads but I think we can create an issue and should address into different PR.

Copy link
Contributor

@saandrews saandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rdhabalia
Copy link
Contributor Author

@merlimat updated with testcases. can you please review when you get a chance.

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@merlimat merlimat merged commit 09b6bb0 into apache:master Apr 25, 2017
rdhabalia added a commit that referenced this pull request Apr 25, 2017
* handle zkCache failure: invalidate cache and zk-getData failure

* introduce separate executor to serve zkcache callback

* add executor to discovery service

* remove testing npe

* add assetion on zkCache executor parameter and update tests

* update executor thread in testcase for intermittent failure
@rdhabalia rdhabalia deleted the handle_zk branch June 21, 2017 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants