Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CCR] Add more unit tests for shard follow task #32121

Merged
merged 10 commits into from
Jul 20, 2018

Conversation

martijnvg
Copy link
Member

The added tests are based on specific scenarios as described in the test plan.
Before this change the ShardFollowNodeTaskTests contained more random like tests,
but these have been removed and in a followup pr better random tests will
be added in a new test class as is described in the test plan.

The added tests are based on specific scenarios as described in the test plan.
Before this change the ShardFollowNodeTaskTests contained more random like tests,
but these have been removed and in a followup pr better random tests will
be added in a new test class as is described in the test plan.
@martijnvg martijnvg added review :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features labels Jul 17, 2018
@martijnvg martijnvg requested review from bleskes and jasontedor July 17, 2018 10:08
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@bleskes bleskes requested a review from dnhatn July 17, 2018 10:12
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg for adding many great tests. I left some comments.

task.coordinateReads();
assertThat(shardChangesRequests.size(), equalTo(8));
assertThat(shardChangesRequests.get(0)[0], equalTo(0L));
assertThat(shardChangesRequests.get(0)[1], equalTo(8L));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we fold all these assertions into a single one? I think this should cover enough.

   assertThat(shardChangesRequests, contains(new long[][]{
            {0L, 8L}, {9L, 8L}, {18L, 8L}, {27L, 8L}, {36L, 8L}, {45L, 8L}, {54L, 8L}, {63L, 8L}
        }));

Moreover, the leader should not return more than the requesting batch size. Here, we request 8 operations, but it returns 9 operations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not the end offset but the maxBatchOperationCount: https://github.com/elastic/elasticsearch/pull/32121/files#diff-e143edbe5e6171625fe27ec29039aabdR622

Since that is always the same perhaps it not worth asserting it. Should we just assert the from offset?

assertThat(status.getLeaderGlobalCheckpoint(), equalTo(128L));
}

public void testCoordinateReads_maxConcurrentReads() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid using camel_case here? I think testMaxConcurrentReads should be good.

@@ -61,13 +61,14 @@
private final BiConsumer<TimeValue, Runnable> scheduler;

private volatile long lastRequestedSeqno;
private volatile long leaderGlobalCheckpoint;
// package-protected visibility for testing only:
volatile long leaderGlobalCheckpoint;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we should increase the visibility to allow changing this volatile variable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm not happy with it too. But otherwise testing the isolated steps is much trickier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnhatn What do you think about this: 47bc37c ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this exposed in the status object? can't we use that maybe? just an idea

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes The test sets the leaderGlobalCheckpoint. I just changed how this test now sets it via this: 47bc37c

this will make testing easier with out changing leaderGlobalCheckpoint field's visibility
based on a from value + maxOperationCount, - 1 need be applied
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some points to discuss.

void startTask(ShardFollowNodeTask task, long leaderGlobalCheckpoint, long followerGlobalCheckpoint) {
task.start(followerGlobalCheckpoint);
// Shortcut to just set leaderGlobalCheckpoint, calling for example handleReadResponse() has side effects that
// complicates testing in isolation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can have a ShardFollowNodeTask#start method that accepts two params: leaderGCP and followerGCP. Then we can remove updateLeaderGlobalCheckpoint and this helper method startTask. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think that in the case of this test that this would be better. The only downside is that in production code we will always set the leaderGCP and followerGCP to the same value. ( The ShardFollowTasksExecutor only fetches the followerGCP). But I think this better then separating out updating the leaderGCP in a separate method.


shardChangesRequests.clear();
synchronized (task) {
task.updateLeaderGlobalCheckpoint(128L);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can setup other scenarios to test the cancellation if we remove updateLeaderGlobalCheckpoint. For example, make the read limits reached, then cancel, then verify that we won't issue any read request.

shardChangesRequests.clear();
// Also invokes the coordinatesReads() method:
task.innerHandleReadResponse(0L, 63L, generateShardChangesResponse(0, 63, 0L, 128L));
assertThat(shardChangesRequests.size(), equalTo(0)); // no more reads, because write buffer is full
Copy link
Member

@dnhatn dnhatn Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should add buffer (size or operations) to the Status object? We can do it in a follow up if you are okay.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add this too, but in a follow up. I think when Jason is going to work on the ccr stats then this is a stat that is very likely to be added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg. LGTM

@martijnvg martijnvg merged commit a6b7497 into elastic:ccr Jul 20, 2018
martijnvg added a commit that referenced this pull request Jul 20, 2018
The added tests are based on specific scenarios as described in the test plan.
Before this change the ShardFollowNodeTaskTests contained more random like tests,
but these have been removed and in a followup pr better random tests will
be added in a new test class as is described in the test plan.
dnhatn added a commit that referenced this pull request Jul 20, 2018
We modified the way we calculate to_seqno in #32121 but did not adjust
this test accordingly. If min_seqno equals to max_seqno, the size should be
one instead of zero.

Relates #32121
dnhatn added a commit that referenced this pull request Jul 22, 2018
We modified the way we calculate to_seqno in #32121 but did not adjust
this test accordingly. If min_seqno equals to max_seqno, the size should be
one instead of zero.

Relates #32121
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants