Run multiple instance of scheduler on one JVM #395

Merged
merged 11 commits into from
Oct 10, 2018

Conversation

xiaoyu-meng-mxy
Contributor

Description of changes:
Enable KCL V2 to run multiple instances of the Scheduler on a single JVM.
Multiple KCL schedulers should be able to run with different event source mappings of different customers in the same JVM.

Why?
The Lambda event bridge application handles a large number of customer event source mappings. Running multiple KCL schedulers on the same instance and in the same JVM reduces the number of hosts required to process the streams.

Solution:
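KCL contains static synchronized methods, which are not suitable for this scenario: a static synchronized method locks on the class, so every scheduler in the JVM contends for the same lock. These methods have to be changed to synchronize per event source mapping (per instance) rather than per class.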
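
To illustrate the locking difference described in the solution, here is a minimal sketch. The class names `PerClassSyncer` and `PerInstanceSyncer` are hypothetical and are not KCL classes. A `static synchronized` method acquires the monitor of the `Class` object, which all callers in the JVM share, while an instance `synchronized` method acquires only the monitor of its receiver.

```java
// Hypothetical classes for illustration only; they are not part of KCL.
final class PerClassSyncer {
    // Acquires the monitor of PerClassSyncer.class: one lock for the whole JVM.
    static synchronized boolean holdsClassLock() {
        return Thread.holdsLock(PerClassSyncer.class);
    }
}

final class PerInstanceSyncer {
    // Acquires only the monitor of this instance.
    synchronized boolean holdsOwnLock() {
        return Thread.holdsLock(this);
    }

    // While holding this instance's monitor, report whether the current
    // thread also holds the monitor of a different instance.
    synchronized boolean holdsLockOf(PerInstanceSyncer other) {
        return Thread.holdsLock(other);
    }
}
```

Under this sketch, two schedulers that each own a `PerInstanceSyncer` never block each other, whereas any two callers of `PerClassSyncer.holdsClassLock()` are serialized.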

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

pfifer added the v2.x label (Issues related to the 2.x version) on Sep 7, 2018
Contributor

pfifer left a comment

A couple of small changes

* It will create new leases/activities when it discovers new Kinesis shards (bootstrap/resharding).
* It deletes leases for shards that have been trimmed from Kinesis, or if we've completed processing it
* and begun processing it's child shards.
*/
- @NoArgsConstructor(access = AccessLevel.PRIVATE)
+ @NoArgsConstructor(access = AccessLevel.PUBLIC)
Contributor

You shouldn't need to add this.

Contributor Author

Removed

public ShardSyncTaskManager(ShardDetector shardDetector, LeaseRefresher leaseRefresher, InitialPositionInStreamExtended initialPositionInStream,
boolean cleanupLeasesUponShardCompletion, boolean ignoreUnexpectedChildShards, long shardSyncIdleTimeMillis,
ExecutorService executorService, MetricsFactory metricsFactory) {
this(shardDetector, leaseRefresher, initialPositionInStream, cleanupLeasesUponShardCompletion, ignoreUnexpectedChildShards, shardSyncIdleTimeMillis, executorService, metricsFactory, new ShardSyncer());
Contributor

You should not depend on the ShardSyncTaskManager creating the ShardSyncer, instead handle the creation in DynamoDBLeaseManagementFactory.

Contributor Author

Modified.
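
The suggestion in this thread (let the factory own the syncer and pass it in, instead of having the task manager call `new ShardSyncer()` itself) can be sketched roughly as follows. All class names here are hypothetical, simplified stand-ins; the real KCL constructors take many more parameters, and whether the factory creates the syncer itself or receives it from configuration is an assumption of this sketch.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins; not the real KCL classes.
final class SyncerSketch { }

final class TaskManagerSketch {
    private final SyncerSketch syncer;

    // The collaborator is injected; this class never calls `new SyncerSketch()`.
    TaskManagerSketch(SyncerSketch syncer) {
        this.syncer = Objects.requireNonNull(syncer, "syncer");
    }

    SyncerSketch syncer() {
        return syncer;
    }
}

final class LeaseFactorySketch {
    private final SyncerSketch syncer;

    // Assumption: the factory is handed (or creates) one syncer per scheduler.
    LeaseFactorySketch(SyncerSketch syncer) {
        this.syncer = Objects.requireNonNull(syncer, "syncer");
    }

    // Every task manager built by this factory shares the factory's syncer.
    TaskManagerSketch createTaskManager() {
        return new TaskManagerSketch(syncer);
    }
}
```

With this arrangement, each scheduler's factory carries its own syncer instance, and tests can pass in a substitute without touching the task manager.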

@@ -220,6 +220,15 @@ public LeaseManagementConfig metricsFactory(final MetricsFactory metricsFactory)
}
};

private ShardSyncer shardSyncer;
Contributor

No need to create the getter

private ShardSyncer shardSyncer = new ShardSyncer();

This should work.

Contributor Author

Yes, you are right.

@NonNull
private final ShardSyncer shardSyncer;

public ShardSyncTaskManager(ShardDetector shardDetector, LeaseRefresher leaseRefresher, InitialPositionInStreamExtended initialPositionInStream,
Contributor

Could you swap the order of MetricsFactory and ShardSyncer? We have kept MetricsFactory as the last parameter of the constructors.

Contributor Author

xiaoyu-meng-mxy Sep 18, 2018

Modified

@@ -71,7 +68,7 @@
* @throws KinesisClientLibIOException
*/
// CHECKSTYLE:OFF CyclomaticComplexity
- public static synchronized void checkAndCreateLeasesForNewShards(@NonNull final ShardDetector shardDetector,
+ public synchronized void checkAndCreateLeasesForNewShards(@NonNull final ShardDetector shardDetector,
Contributor

This seems like a breaking change. A better way would be to deprecate the existing method and create a new non-static method.

Contributor Author

You are right, modified.
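
A rough sketch of the non-breaking approach suggested in this thread: keep the old static method, mark it `@Deprecated`, and add the new instance method alongside it. The class and method names below are hypothetical, the lease logic is a toy stand-in, and having the deprecated method delegate to a shared instance is an assumption of this sketch rather than something the thread confirms.

```java
import java.util.List;
import java.util.Set;

// Hypothetical new class: synchronization is per instance.
final class InstanceSyncerSketch {
    // Toy logic: add a lease for every shard that does not have one yet
    // and return how many leases were created.
    synchronized int createMissingLeases(List<String> shardIds, Set<String> leasedShardIds) {
        int created = 0;
        for (String shardId : shardIds) {
            if (leasedShardIds.add(shardId)) {
                created++;
            }
        }
        return created;
    }
}

// Hypothetical legacy class: the static entry point stays for existing callers.
final class StaticSyncerSketch {
    private static final InstanceSyncerSketch SHARED = new InstanceSyncerSketch();

    /** @deprecated use {@link InstanceSyncerSketch#createMissingLeases} instead. */
    @Deprecated
    static synchronized int createMissingLeases(List<String> shardIds, Set<String> leasedShardIds) {
        return SHARED.createMissingLeases(shardIds, leasedShardIds);
    }
}
```

Existing callers keep compiling against the static method (with a deprecation warning), while new code can create one syncer per scheduler.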

* worker and checks if the second worker can start syncing shards.
*/
@Test(timeout = 30000L)
public final void testWorkersCanSyncShardsInParallel() {
Contributor

Avoid multithreaded testing in unit tests. Testing a single instance should be fine for unit testing.

Contributor Author

OK, will remove this unit test.

Meng added 2 commits September 21, 2018 16:28
…e the order for metricsFactory and HierarchichalShardSyncer in ShardConsumerArgument
Contributor

sahilpalvia left a comment

Looking good, just a few more minor changes

@@ -236,7 +239,8 @@ private void initialize() {
if (!skipShardSyncAtWorkerInitializationIfLeasesExist || leaseRefresher.isLeaseTableEmpty()) {
log.info("Syncing Kinesis shard info");
ShardSyncTask shardSyncTask = new ShardSyncTask(shardDetector, leaseRefresher, initialPosition,
- cleanupLeasesUponShardCompletion, ignoreUnexpetedChildShards, 0L, metricsFactory);
+ cleanupLeasesUponShardCompletion, ignoreUnexpetedChildShards, 0L,
+ hierarchichalShardSyncer, metricsFactory);
Contributor

Misspelled Hierarchical.

Contributor Author

Yes, fixed.

private final MetricsFactory metricsFactory;

public ShardSyncTaskManager(ShardDetector shardDetector, LeaseRefresher leaseRefresher, InitialPositionInStreamExtended initialPositionInStream,
Contributor

Could you also keep the old constructor as a deprecated one, defaulting to a HierarchicalShardSyncer?

Contributor Author

Done
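
The request in this thread (keep the old constructor, mark it deprecated, and have it default the syncer) might look roughly like the sketch below. The names are hypothetical and the parameter list is heavily simplified; only the pattern of constructor chaining, with the metrics parameter kept last, is taken from the review comments.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins; not the real KCL classes.
final class DefaultSyncerSketch { }

final class MetricsSketch { }

final class ManagerSketch {
    private final long idleTimeMillis;
    private final DefaultSyncerSketch syncer;
    private final MetricsSketch metrics;

    /** @deprecated kept for existing callers; supplies a default syncer. */
    @Deprecated
    ManagerSketch(long idleTimeMillis, MetricsSketch metrics) {
        this(idleTimeMillis, new DefaultSyncerSketch(), metrics);
    }

    // New constructor: the syncer is injected and metrics stays last.
    ManagerSketch(long idleTimeMillis, DefaultSyncerSketch syncer, MetricsSketch metrics) {
        this.idleTimeMillis = idleTimeMillis;
        this.syncer = Objects.requireNonNull(syncer, "syncer");
        this.metrics = Objects.requireNonNull(metrics, "metrics");
    }

    DefaultSyncerSketch syncer() {
        return syncer;
    }
}
```

Callers of the old two-argument form continue to work unchanged, while callers that need a per-scheduler syncer use the three-argument form.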

@@ -81,7 +97,7 @@ private synchronized boolean checkAndSubmitNextTask() {
initialPositionInStream,
cleanupLeasesUponShardCompletion,
ignoreUnexpectedChildShards,
- shardSyncIdleTimeMillis,
+ shardSyncIdleTimeMillis, hierarchichalShardSyncer,
Contributor

Could you follow the formatting style of the file and put the parameter on a new line?

Contributor Author

Fixed

@@ -54,6 +55,7 @@
InitialPositionInStreamExtended.newInitialPosition(InitialPositionInStream.TRIM_HORIZON);
private static final ShutdownReason TERMINATE_SHUTDOWN_REASON = ShutdownReason.SHARD_END;
private static final MetricsFactory NULL_METRICS_FACTORY = new NullMetricsFactory();
private static final HierarchichalShardSyncer SHARD_SYNCER = new HierarchichalShardSyncer();
Contributor

Does it need to be an instance of the ShardSyncer? Or can it be mocked?

Contributor Author

Done

@@ -109,7 +110,8 @@ public void setup() {
taskBackoffTimeMillis, skipShardSyncAtWorkerInitializationIfLeasesExist,
listShardsBackoffTimeInMillis, maxListShardsRetryAttempts,
shouldCallProcessRecordsEvenForEmptyRecordList, idleTimeInMillis, INITIAL_POSITION_IN_STREAM,
- cleanupLeasesOfCompletedShards, ignoreUnexpectedChildShards, shardDetector, metricsFactory, new AggregatorUtil());
+ cleanupLeasesOfCompletedShards, ignoreUnexpectedChildShards, shardDetector, new AggregatorUtil(),
+ new HierarchichalShardSyncer(), metricsFactory);
Contributor

Does it need to be a new instance of the ShardSyncer? Or can it be a mock object?

Contributor Author

Before this change, the code called static methods on ShardSyncer, which were never mocked. That's why I thought we needed to create a new instance.

Contributor Author

Changed to use mock
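
Once the syncer is an injected instance rather than a set of static methods, a test can substitute a double for it. The sketch below uses a hand-written recording stub instead of a mocking library so that it stays self-contained; the PR itself says only that the test was changed to use a mock. All names are hypothetical and simplified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator; non-final so a test double can subclass it.
class SyncerUnderTestSketch {
    void syncShards(String streamName) {
        // Stand-in for work that would need real Kinesis and DynamoDB access.
        throw new UnsupportedOperationException("needs real services");
    }
}

// Hand-written test double that records calls instead of doing real work.
final class RecordingSyncerStub extends SyncerUnderTestSketch {
    final List<String> calls = new ArrayList<>();

    @Override
    void syncShards(String streamName) {
        calls.add(streamName);
    }
}

// Hypothetical class under test: it only needs some syncer to delegate to.
final class ConsumerSketch {
    private final SyncerUnderTestSketch syncer;

    ConsumerSketch(SyncerUnderTestSketch syncer) {
        this.syncer = syncer;
    }

    void initialize(String streamName) {
        syncer.syncShards(streamName);
    }
}
```

A test can then assert on `calls` to check that the consumer invoked the syncer, without exercising the syncer's own logic.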

* @param initialLeaseTableWriteCapacity
* @param hierarchichalShardSyncer
*/
public DynamoDBLeaseManagementFactory(final KinesisAsyncClient kinesisClient, final String streamName,
Contributor

sahilpalvia Sep 24, 2018

Could you run the formatter on this change?

Contributor Author

Done

Contributor

sahilpalvia left a comment

Looking good. Thanks for providing this.

Contributor

pfifer commented Oct 9, 2018

Thanks for the change, and sorry about the delay.

One more thing: can you please pull, resolve the conflicts, and update the PR?

pfifer added this to the v2.0.4 milestone on Oct 9, 2018
sahilpalvia merged commit 14c6829 into awslabs:master on Oct 10, 2018
Labels
v2.x Issues related to the 2.x version
3 participants