[Issue 210] - allow unexpected child shards to be ignored #240

Merged

zerth (Contributor) commented Oct 11, 2017

This PR, which currently lacks tests, addresses issue #210.

Now, instead of always failing an assertion when a child shard has an
open parent, the worker configuration is consulted first. If the worker
is configured to ignore such shards, no leases are created for them
during shard sync. This is intended to mitigate failing worker init when
processing DynamoDB streams with many thousands of shards (which can
happen for tables with thousands of partitions).

This new behavior can be enabled by adding the following to a
configuration/properties file:

```
ignoreUnexpectedChildShards = true
```
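
As a rough illustration of the behavior described above, here is a minimal sketch of the decision made during shard sync. It is not the KCL's actual code: the `Shard` class, the `shardsToLease` method, and the use of `IllegalStateException` are hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChildShardCheckSketch {

    /** Hypothetical stand-in for a shard: an id, an optional parent id, and an open flag. */
    static final class Shard {
        final String id;
        final String parentId; // null when the shard has no parent
        final boolean open;

        Shard(String id, String parentId, boolean open) {
            this.id = id;
            this.parentId = parentId;
            this.open = open;
        }
    }

    /**
     * Returns the ids of shards that should get leases. A child whose parent is
     * still open is "unexpected": it is either skipped (flag set) or causes a
     * failure (flag unset, the previous behavior).
     */
    static List<String> shardsToLease(List<Shard> shards, boolean ignoreUnexpectedChildShards) {
        Set<String> openIds = new HashSet<>();
        for (Shard shard : shards) {
            if (shard.open) {
                openIds.add(shard.id);
            }
        }

        List<String> leasable = new ArrayList<>();
        List<String> inconsistent = new ArrayList<>();
        for (Shard shard : shards) {
            if (shard.parentId != null && openIds.contains(shard.parentId)) {
                inconsistent.add(shard.id); // child shard with an open parent
            } else {
                leasable.add(shard.id);
            }
        }

        if (!inconsistent.isEmpty() && !ignoreUnexpectedChildShards) {
            // Assumption: the real code throws a library-specific exception here.
            throw new IllegalStateException(
                    inconsistent.size() + " open child shards are inconsistent: " + inconsistent);
        }
        return leasable;
    }
}
```

With the flag set, the unexpected child is simply left out of the returned list; with it unset, the call fails as before.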
pfifer (Contributor) commented Oct 25, 2017

Thanks for submitting this, just a couple of requests:

Can you update this with the new changes, and remove the white space changes? I would also like to see some tests to verify that it doesn't break the existing behavior, and tests for the new behavior.

zerth (Contributor, Author) commented Oct 30, 2017

Will do.

zerth added 4 commits November 6, 2017 15:47
instead of adding the `ignoreUnexpectedChildShards` field to various
objects and passing it as an explicit parameter, refrain from adding
the field except where needed and obtain the value from the
already-passed `KinesisClientLibConfiguration` parameter.
zerth (Contributor, Author) commented Nov 7, 2017

@pfifer: Branch has been updated and unrelated whitespace changes removed, and two basic testcases added which exercise the new behavior. Looking at the existing tests, it seems to me they adequately cover the default behavior.

pfifer (Contributor) left a comment

Minor issues that would break existing code. Everything else looks good.

@@ -351,6 +359,7 @@ public KinesisClientLibConfiguration(String applicationName,
long parentShardPollIntervalMillis,
long shardSyncIntervalMillis,
boolean cleanupTerminatedShardsBeforeExpiry,
boolean ignoreUnexpectedChildShards,

You're changing a public constructor and adding a parameter in the middle of it. This will break anyone who uses the constructor. I would prefer to add a new constructor with the parameter. The other option is not to have the parameter, and use the with/set operations to configure the feature.

At some point we will look into switching to a builder pattern for the configuration.
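
The two options mentioned in this comment could look roughly like the sketch below. `ExampleConfig` and its members are hypothetical and much smaller than the real `KinesisClientLibConfiguration`; the point is only that the existing constructor signature stays untouched while the new flag is reachable through an added overload or a `with`-style method.

```java
class ExampleConfig {
    private final String applicationName;
    private final long shardSyncIntervalMillis;
    private boolean ignoreUnexpectedChildShards;

    /** Existing constructor: signature unchanged, so current callers keep compiling. */
    ExampleConfig(String applicationName, long shardSyncIntervalMillis) {
        this(applicationName, shardSyncIntervalMillis, false);
    }

    /** Option 1: a new overload that takes the extra parameter at the end. */
    ExampleConfig(String applicationName,
                  long shardSyncIntervalMillis,
                  boolean ignoreUnexpectedChildShards) {
        this.applicationName = applicationName;
        this.shardSyncIntervalMillis = shardSyncIntervalMillis;
        this.ignoreUnexpectedChildShards = ignoreUnexpectedChildShards;
    }

    /** Option 2: no constructor change at all; configure the flag fluently. */
    ExampleConfig withIgnoreUnexpectedChildShards(boolean ignoreUnexpectedChildShards) {
        this.ignoreUnexpectedChildShards = ignoreUnexpectedChildShards;
        return this;
    }

    boolean shouldIgnoreUnexpectedChildShards() {
        return ignoreUnexpectedChildShards;
    }

    String getApplicationName() {
        return applicationName;
    }

    long getShardSyncIntervalMillis() {
        return shardSyncIntervalMillis;
    }
}
```

Either way, code written against the old two-argument constructor behaves exactly as before, with the flag defaulting to `false`.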

for (String id : inconsistentShardIds) {
ids += " " + id;
}
throw new KinesisClientLibIOException(String.valueOf(inconsistentShardIds.size()) + " open child shards (" + ids + ") are inconsistent."

This might be cleaner using String.format.
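
A sketch of what the `String.format` suggestion might amount to, assuming the message text ends where the visible part of the diff line does. The class and method names are hypothetical, and the ids string is passed in already assembled.

```java
class InconsistentShardMessage {

    /**
     * Builds the error message with a format string instead of chained
     * concatenation. %d takes the shard count, %s the pre-built id list.
     */
    static String format(int count, String ids) {
        return String.format("%d open child shards (%s) are inconsistent.", count, ids);
    }
}
```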

zerth (Contributor, Author) commented Dec 12, 2017

@pfifer: most recent comments addressed. I removed the config constructor changes to avoid polluting that module (and since my own usage involves only .properties files).

pfifer (Contributor) left a comment

Minor issues you can fix if you want to.

I'm going to start the process to merge the changes on my side.

private static void assertAllParentShardsAreClosed(Set<String> inconsistentShardIds)
throws KinesisClientLibIOException {
if (!inconsistentShardIds.isEmpty()) {
String ids = "";

You can use StringUtils#join here instead of manually building the string.
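
For illustration, joining the ids without a manual loop might look like the sketch below. To stay dependency-free it swaps in the JDK's `String.join` rather than the `StringUtils#join` named in the review, so it shows the idea, not the merged change. The class and method names are hypothetical.

```java
import java.util.Set;

class ShardIdJoinSketch {

    /**
     * Joins the shard ids with a single space between them. Unlike the manual
     * loop, this leaves no leading separator before the first id.
     */
    static String joinIds(Set<String> inconsistentShardIds) {
        return String.join(" ", inconsistentShardIds);
    }
}
```

Iteration order follows the set that is passed in, so a sorted or insertion-ordered set gives a stable message.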

Contributor Author

Updated. I don't normally use Java as has no doubt been apparent. :-)

pfifer (Contributor) commented Dec 21, 2017

Please confirm that we can use, modify, copy, and redistribute this contribution. Thanks.

zerth (Contributor, Author) commented Dec 22, 2017

> Please confirm that we can use, modify, copy, and redistribute this contribution.

I confirm this to be true.

@pfifer pfifer merged commit 9074864 into awslabs:master Jan 4, 2018
@pfifer pfifer added this to the v1.8.9 milestone Jan 12, 2018
pfifer added a commit to pfifer/amazon-kinesis-client that referenced this pull request Jan 15, 2018
* Allow disabling check for the case where a child shard has an open parent shard.
  There is a race condition where it's possible for a parent shard
  to appear open while having child shards. This check can now be
  disabled by setting ignoreUnexpectedChildShards in the
  KinesisClientLibConfiguration to true.
  * PR awslabs#240
  * Issue awslabs#210
* Upgraded the AWS SDK for Java to 1.11.261
  * PR awslabs#281
pfifer added a commit that referenced this pull request Jan 15, 2018
ivoanjo commented Mar 28, 2019

Hey there, and sorry for necro'ing 😆

I got here because we were getting bitten by this exact same issue, and AWS support pointed us at #210 which did solve our issue.

But I'm really curious -- @pfifer can you comment on why this setting is not the default? E.g. when would it be relevant for me to care about this exception?

Thanks a lot!
