[CCR] Add more unit tests for shard follow task #32121
Conversation
The added tests are based on specific scenarios described in the test plan. Before this change, ShardFollowNodeTaskTests contained more random-style tests; these have been removed, and in a follow-up PR better randomized tests will be added in a new test class, as described in the test plan.
Pinging @elastic/es-distributed
Thanks @martijnvg for adding many great tests. I left some comments.
task.coordinateReads();
assertThat(shardChangesRequests.size(), equalTo(8));
assertThat(shardChangesRequests.get(0)[0], equalTo(0L));
assertThat(shardChangesRequests.get(0)[1], equalTo(8L));
Can we fold all these assertions into a single one? I think this should cover enough.
assertThat(shardChangesRequests, contains(new long[][]{
{0L, 8L}, {9L, 8L}, {18L, 8L}, {27L, 8L}, {36L, 8L}, {45L, 8L}, {54L, 8L}, {63L, 8L}
}));
Moreover, the leader should not return more operations than the requested batch size. Here, we request 8 operations, but it returns 9 operations.
That is not the end offset but the maxBatchOperationCount: https://github.com/elastic/elasticsearch/pull/32121/files#diff-e143edbe5e6171625fe27ec29039aabdR622
Since that is always the same, perhaps it is not worth asserting. Should we just assert the from offset?
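For example, something along these lines (an untested sketch; it assumes shardChangesRequests is a List of long[] as in the current test class):

long[] expectedFromOffsets = {0L, 9L, 18L, 27L, 36L, 45L, 54L, 63L};
// only assert the from offsets; the batch size is always the same
assertThat(shardChangesRequests.size(), equalTo(expectedFromOffsets.length));
for (int i = 0; i < expectedFromOffsets.length; i++) {
    assertThat(shardChangesRequests.get(i)[0], equalTo(expectedFromOffsets[i]));
}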
assertThat(status.getLeaderGlobalCheckpoint(), equalTo(128L));
}

public void testCoordinateReads_maxConcurrentReads() {
Can we avoid using camel_case here? I think testMaxConcurrentReads should be good.
@@ -61,13 +61,14 @@
     private final BiConsumer<TimeValue, Runnable> scheduler;

     private volatile long lastRequestedSeqno;
-    private volatile long leaderGlobalCheckpoint;
+    // package-protected visibility for testing only:
+    volatile long leaderGlobalCheckpoint;
I am not sure that we should increase the visibility just to allow changing this volatile variable.
To be honest, I'm not happy with it either, but otherwise testing the isolated steps is much trickier.
isn't this exposed in the status object? can't we use that maybe? just an idea
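For instance, something like this could be used for the assertions instead of touching the field (a sketch only; it assumes the status can be obtained via task.getStatus() and carries the leader global checkpoint, as the existing getLeaderGlobalCheckpoint() assertion suggests):

// read the checkpoint through the status object instead of the package-private field
assertThat(task.getStatus().getLeaderGlobalCheckpoint(), equalTo(128L));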
This will make testing easier without changing the leaderGlobalCheckpoint field's visibility; it is based on a from value + maxOperationCount, with a - 1 applied.
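In other words, roughly (a hedged sketch; the variable names are only illustrative):

// both ends of a batch are inclusive, hence the - 1
long toSeqNo = fromSeqNo + maxOperationCount - 1;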
I left some points to discuss.
void startTask(ShardFollowNodeTask task, long leaderGlobalCheckpoint, long followerGlobalCheckpoint) {
    task.start(followerGlobalCheckpoint);
    // Shortcut to just set leaderGlobalCheckpoint, calling for example handleReadResponse() has side effects that
    // complicates testing in isolation.
I think we can have a ShardFollowNodeTask#start method that accepts two params: leaderGCP and followerGCP. Then we can remove updateLeaderGlobalCheckpoint and this helper method startTask. WDYT?
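Roughly something like this (just a sketch of the idea, not the actual implementation; the body is assumed from the surrounding discussion):

void start(long leaderGlobalCheckpoint, long followerGlobalCheckpoint) {
    // set the leader checkpoint up front, so updateLeaderGlobalCheckpoint and the startTask helper can go away
    this.leaderGlobalCheckpoint = leaderGlobalCheckpoint;
    // ... followed by the existing start(followerGlobalCheckpoint) logic ...
}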
Right, I think that in the case of this test this would be better. The only downside is that in production code we will always set the leaderGCP and followerGCP to the same value (the ShardFollowTasksExecutor only fetches the followerGCP). But I think this is better than separating out updating the leaderGCP into a separate method.
shardChangesRequests.clear();
synchronized (task) {
    task.updateLeaderGlobalCheckpoint(128L);
We can set up other scenarios to test the cancellation if we remove updateLeaderGlobalCheckpoint. For example, make the read limits reached, then cancel, then verify that we won't issue any read request.
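Something along these lines, for example (an untested sketch; using markAsCompleted() as the cancellation trigger is an assumption, and the helpers mirror the existing test class):

// with the read limits already saturated, cancel the task and verify no new read goes out
shardChangesRequests.clear();
task.markAsCompleted(); // assumed cancellation trigger
task.coordinateReads();
assertThat(shardChangesRequests.size(), equalTo(0));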
shardChangesRequests.clear();
// Also invokes the coordinatesReads() method:
task.innerHandleReadResponse(0L, 63L, generateShardChangesResponse(0, 63, 0L, 128L));
assertThat(shardChangesRequests.size(), equalTo(0)); // no more reads, because write buffer is full
I am wondering if we should add the buffer (size or operations) to the Status object? We can do it in a follow-up if you are okay.
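A follow-up could then assert on it directly, something like (the getter name is hypothetical, only to illustrate the idea):

// hypothetical getter exposing the number of buffered write operations in the status
assertThat(task.getStatus().getNumberOfQueuedWrites(), greaterThan(0));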
I think we should add this too, but in a follow-up. When Jason works on the CCR stats, this is a stat that is very likely to be added.
Yes, indeed.
…per method" This reverts commit 47bc37c.
…od for testing purposes
Thanks @martijnvg. LGTM